File Integrity with Checksums

This section describes how file integrity can be improved by using checksums to detect compromised files.

Each time a file is checked out, FCS computes a checksum for that file and compares it with the checksum that was recorded when the file was last checked in.

How FCS Uses Checksums

You can improve your system's file integrity by turning on the use of checksums. When a file is checked out, FCS generates a checksum at runtime for the file to be checked out and compares it with the checksum sent on the ticket for that file. If the checksums differ, the file has been modified or corrupted since it was last checked in. The file checkout fails, and FCS returns an error. The user should alert a system administrator who can then resolve the problem.

Data inconsistency is detected when a user retrieves the file at checkout, not at the moment the data becomes corrupted.

The following steps describe how the checksum integrity check process works.

  1. When a file is checked in, FCS computes a checksum for it. The checksum is sent back to the MCS as part of the checkin receipt and stored for future reference.
  2. When the file is checked out, the MCS sends the recorded checksum as part of the checkout ticket.
  3. When FCS receives the ticket, it computes a checksum of the target file and compares it with the recorded checksum from the MCS. If the two checksums differ, the file has been modified since its last checkin; FCS returns an error and the checkout fails.
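The three steps above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the FCS implementation: the class ChecksumFlowSketch, its methods, and the in-memory map that stands in for the checksum recorded by the MCS are all hypothetical names. MD5 is used because it is the algorithm shown in the examples later on this page.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names belong to the FCS or MCS API.
final class ChecksumFlowSketch {
    // Stands in for the checksum that the MCS stores from the checkin receipt.
    private final Map<String, String> recorded = new HashMap<>();

    // Computes the MD5 digest of the data as a lowercase hex string.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // Step 1: compute the checksum at checkin and record it.
    void checkin(String fileName, byte[] content) {
        recorded.put(fileName, md5Hex(content));
    }

    // Steps 2 and 3: recompute the checksum at checkout and compare it
    // with the recorded value; a mismatch makes the checkout fail.
    byte[] checkout(String fileName, byte[] contentOnDisk) {
        String expected = recorded.get(fileName);
        String actual = md5Hex(contentOnDisk);
        if (expected != null && !expected.equals(actual)) {
            throw new IllegalStateException("File Checksum Error - db checksum is "
                    + expected + ", runtime checksum is " + actual
                    + ", for file " + fileName);
        }
        return contentOnDisk;
    }
}
```

In this sketch, a file with no recorded checksum is returned without verification, which mirrors the statement that only files checked in with checksums turned on are verified.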

This figure shows where the checksums are generated, sent, and stored during the checkin process.



This figure shows where the checksums are generated and sent during the checkout process.



During file synchronization or a file copy, FCS automatically copies the checksums for files.

You can generate checksums for migrated data without having to check in the individual files. See Data Migration.

Checksum Activation

To activate or deactivate the usage of checksums, use the following MQL commands:

mod store TSTStore checksum on;
mod store TSTStore checksumwarnonly on;

mod store TSTStore checksum off;
mod store TSTStore checksumwarnonly off;

To activate or deactivate checksums for locations:

mod location TSTLoc checksum on;
mod location TSTLoc checksumwarnonly on;

mod location TSTLoc checksum off;
mod location TSTLoc checksumwarnonly off;

When checksums are activated, FCS will compute and send a checksum to the MCS when a file is checked in, and perform the checksum integrity check when a file is checked out. The default value for both is off.

Also, when checksums are activated, you can set the checkout process to simply warn the user instead of failing with an error. If the checksum integrity check detects a corrupted file, the file checkout succeeds, the user is warned, and the invalid checksum is logged.

add store STORE_NAME checksumwarnonly [ on | off ];

modify store STORE_NAME checksumwarnonly [ on | off ];

The default value for both is off.
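The difference between the default behavior and warn-only mode can be sketched as follows. This is a hypothetical illustration, assuming the integrity check reduces to comparing two checksum strings; the class WarnOnlySketch, the Outcome enum, and the in-memory log are invented names and do not reflect how FCS logs or reports warnings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the FCS API.
final class WarnOnlySketch {
    enum Outcome { OK, WARNED }

    // Stands in for wherever invalid checksums are logged.
    final List<String> log = new ArrayList<>();

    Outcome verify(String recorded, String runtime, boolean warnOnly) {
        if (recorded.equals(runtime)) {
            return Outcome.OK;
        }
        String message = "db checksum is " + recorded
                + ", runtime checksum is " + runtime;
        if (warnOnly) {
            // Warn-only: the checkout succeeds, and the mismatch is logged.
            log.add(message);
            return Outcome.WARNED;
        }
        // Default: the checkout fails with an error.
        throw new IllegalStateException("File Checksum Error - " + message);
    }
}
```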

You can show the current value of the checksum options with the following MQL commands:

print store STORE_NAME checksum;

print store STORE_NAME checksumwarnonly; 

Checksum Activation at Upgrade Time

Checksums are activated on new stores by default. This activation does not affect old stores. Here is what you see before the upgrade:

print store STORE_NAME checksum;
store TSTStore
checksum = FALSE

Here is what you see after the upgrade:

print store STORE_NAME checksum;
store TSTStore
checksum = TRUE

After that change, newly checked-in files on migrated captured stores get an MD5 checksum.

print bus BO_NAME select format.file.checksum;

Where BO_NAME is one of the following:

  TYPE_NAME NAME REVISION [in VAULT]
  ID

Newly checked-in file MD5 checksum example:

print bus Document Doc-0000002 0 select format.file.checksum;
business object  Document Doc-0000002 0
format.file.checksum = {MD5}997a13852cd7e376eca82130e1db3a67
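A script that reads this output may need to separate the algorithm tag from the digest. The sketch below assumes the value always has the form {ALGORITHM} followed by hex digits, which is inferred only from the example above and is not a documented format guarantee. The class ChecksumValueSketch and its parse method are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the value format is assumed from the example output.
final class ChecksumValueSketch {
    private static final Pattern FORMAT =
            Pattern.compile("\\{([A-Za-z0-9-]+)\\}([0-9a-fA-F]+)");

    // Returns a two-element array: the algorithm name and the lowercase hex digest.
    static String[] parse(String value) {
        Matcher m = FORMAT.matcher(value.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized checksum value: " + value);
        }
        return new String[] { m.group(1), m.group(2).toLowerCase(Locale.ROOT) };
    }
}
```

For the example value, parse returns the algorithm name MD5 and the 32-character hex digest.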

Corruption Detection

If a data inconsistency is detected during checkout, the system throws an FcsException error and displays an error message to the user, and the checkout process fails. The user should alert the system administrator.

throw new FcsException("HttpOutputHandler: File Checksum Error - db checksum is "+checksum+", runtime checksum is "+rtChecksum+", for file "+hashname);

Data Correction

The system administrator should correct the problem manually by checking in a non-corrupted copy of the file. This makes all old copies obsolete, corrupted or not, and starts again with the newly checked-in copy.

Data Migration

Normally, a checksum is created for a file when it is checked in. For migrated data that doesn't go through the checkin process, you can create checksums with MQL commands. Note that this migration method assumes the files are not already corrupted. You can run the validate command first to help detect any problems before creating the checksums. See Validation.

To calculate the checksum for all files owned by a business object, use:

rechecksum businessobject TNR;

To calculate checksums for a list of business objects, committing a transaction for every N business objects (the default is 10), use:

rechecksum businessobjectlist QUERY [commit N] [continue];

The continue option allows the next transaction to carry on when an error occurs; the default value is to quit.

To calculate checksums for all the files in a store, committing a transaction for every N files (the default is 10), use:

rechecksum store STORE_NAME [commit N] [continue]; 

The continue option allows the next transaction to carry on when an error occurs; the default value is to quit.

A checksum will be calculated for a file only if it has not been previously calculated. The checksum will be based on an up-to-date copy of the file at an arbitrary location. The checksum value will then be propagated to all non-obsolete copies of the file.
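The interaction of the commit N and continue options with the skip-if-already-calculated rule can be sketched as follows. This is an illustration under assumptions, not the behavior of the rechecksum command itself: the class RechecksumBatchSketch is hypothetical, a map stands in for committed checksums, and the handling of a failed transaction (discarding its uncommitted work and starting a new one with the next file) is an assumption about what "continue" means.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; not the rechecksum implementation.
final class RechecksumBatchSketch {
    // Returns the number of commits performed.
    static int run(List<String> files, Map<String, String> committed,
                   Function<String, String> computeChecksum,
                   int commitEvery, boolean continueOnError) {
        int commits = 0;
        int inBatch = 0;
        Map<String, String> pending = new HashMap<>();
        for (String file : files) {
            if (committed.containsKey(file)) {
                continue; // a checksum was previously calculated: skip the file
            }
            try {
                pending.put(file, computeChecksum.apply(file));
            } catch (RuntimeException e) {
                if (!continueOnError) {
                    throw e; // default behavior: quit on the first error
                }
                pending.clear(); // assumed: the failed transaction is discarded
                inBatch = 0;
                continue;
            }
            inBatch++;
            if (inBatch == commitEvery) {
                committed.putAll(pending); // commit the transaction
                pending.clear();
                inBatch = 0;
                commits++;
            }
        }
        if (inBatch > 0) {
            committed.putAll(pending); // commit the final partial batch
            commits++;
        }
        return commits;
    }
}
```

A smaller commit interval limits how much work is lost when a transaction fails, at the cost of more commits.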

The keyword rechecksum can be included in the MQL commands modify businessobject and modify businessobjectlist to recompute a checksum without checking the file in again.

Print Checksum

To show the current checksum value of a file, use:

print businessobject TNR select format.file.checksum;

Inventory Store

You can add the checksum value field to the inventory result. This is optional.

inventory store STORE_NAME [fcsdbchecksum];

Validation

To calculate the current checksum values of the files in a business object and compare them with the recorded values, use:

validate businessobject TNR fcsdbchecksum; 

To validate the checksum for a list of business objects, use:

validate businessobjectlist QUERY fcsdbchecksum;  

Note that the commands for validating checksums are expensive operations. For each file to be validated, a compute checksum request is sent to the corresponding FCS and the file is scanned to compute the current checksum.

Also, the validate checksum commands are read-only operations. If a new checksum is different, it will be reported to the output file, but not stored.
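The read-only nature of validation can be sketched as a comparison that reports mismatches and leaves the recorded values untouched. This is a hypothetical illustration: the class ValidateReportSketch is an invented name, the two maps stand in for the recorded checksums and the checksums freshly computed by FCS, and the report line format is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; the report format is invented.
final class ValidateReportSketch {
    // Compares recorded and current checksums by file name and returns
    // one report line per mismatch. Neither map is modified.
    static List<String> validate(Map<String, String> recorded,
                                 Map<String, String> current) {
        List<String> report = new ArrayList<>();
        // A sorted copy gives a stable report order.
        for (Map.Entry<String, String> e : new TreeMap<>(recorded).entrySet()) {
            String now = current.get(e.getKey());
            if (!e.getValue().equals(now)) {
                report.add(e.getKey() + ": recorded " + e.getValue()
                        + ", current " + now);
            }
        }
        return report;
    }
}
```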

Performance

The checksum computation is based on streamed data, which increases FCS checkin and FCS checkout time slightly, but the increase is a reasonable tradeoff for the increased data integrity.
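Computing a checksum on streamed data means the digest is updated as the bytes pass through, so the file does not need a second read. The sketch below shows the general technique with the standard java.security.DigestInputStream class. It is not the FCS code; the class StreamedChecksumSketch and the method copyWithMd5 are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration of checksumming while streaming.
final class StreamedChecksumSketch {
    // Copies in to out and returns the MD5 of the copied bytes as lowercase hex.
    static String copyWithMd5(InputStream in, OutputStream out) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            // Every byte read through the wrapper also updates the digest.
            try (DigestInputStream din = new DigestInputStream(in, md)) {
                byte[] buffer = new byte[8192];
                int n;
                while ((n = din.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The extra cost per transfer is the digest update on each buffer, which is consistent with the slight increase in checkin and checkout time described above.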

Using the rechecksum command during migration can take a long time. For large stores and locations, running the rechecksum store command may be impractical. In this case, you can either use rechecksum buslist to migrate only active objects, or skip rechecksum entirely. If you skip it, only files that are newly checked in (with the checksum option on) have their checksums verified on checkout.