Scenario 1—EXALEAD Indexing Server Down

In this scenario, one of the indexing servers in the high-availability deployment fails. The MQL process and the consolidation server continue to push new index data to the working indexing server. The Search client detects (using the checkpoint) that the index slices corresponding to the failed indexing server are out of date and directs searches to the up-to-date index slices. To resume, restart the failed indexing server, correcting whatever failure the log file indicates. The next partial index detects (using the checkpoint) that the two indexing servers are not identical, so the MQL session runs two different queries to determine the objects to index for this partial indexing. Once that partial is complete, the two indexing servers are identical again and searches detect (using the checkpoint) that they can be sent across both servers.

Scenario 2—Other Indexing/Search Components Down

In this scenario, the build index, search index, or distributed index is down for any one build group. The indexing server for that build group is still up and running, so new jobs are queued for this build group. The other build group continues indexing normally. The Search client detects (using the checkpoint) that the down index components have left the index out of date and directs searches to the up-to-date index slices. To resume, restart the failed index component, correcting whatever failure the log file indicates.
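The checkpoint-based routing used in both scenarios can be pictured with a small sketch. Everything in it (the IndexSlice type, the checkpoint field, the slices_to_search helper) is an illustrative assumption rather than an EXALEAD or MQL API; it only shows the decision the Search client is described as making: fan searches out to the slices whose checkpoint matches the most recent one, and skip slices that have fallen behind.

```python
from dataclasses import dataclass

# Illustrative sketch only: the types and names below are assumptions,
# not part of the EXALEAD or MQL interfaces.

@dataclass
class IndexSlice:
    server: str        # host serving this index slice
    checkpoint: int    # last index generation this slice has applied
    reachable: bool    # whether the server answered a health probe

def slices_to_search(slices: list[IndexSlice]) -> list[IndexSlice]:
    """Return the slices a search should fan out to.

    A slice is searched only if its server is reachable and its
    checkpoint matches the newest checkpoint among reachable slices;
    stale slices are skipped until a partial index catches them up.
    """
    live = [s for s in slices if s.reachable]
    if not live:
        return []
    latest = max(s.checkpoint for s in live)
    return [s for s in live if s.checkpoint == latest]

# Example: indexer-b is down and behind, so only indexer-a is searched.
slices = [
    IndexSlice("indexer-a", checkpoint=42, reachable=True),
    IndexSlice("indexer-b", checkpoint=40, reachable=False),
]
print([s.server for s in slices_to_search(slices)])  # ['indexer-a']
```

Once the restarted server's slices catch up, their checkpoint matches the latest one again and they rejoin the candidate list, which corresponds to searches being sent across both servers as described above.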
Scenario 3—EXALEAD Consolidation Server Down

In this scenario, if the failure is on the backup aggregator, there is no impact; fix the problem and restart it. If the failure is on the main aggregator, MQL detects this and starts using the backup aggregator for new index requests. To resume, fix the main aggregator problem and restart it. MQL detects that the server is back up and starts sending new index requests to that module. The queue that built up in the backup aggregator while the main one was down continues to be processed.

There is an impact on the index in this scenario: any jobs that were queued in the consolidation server when it went down are lost. These jobs can be identified through tracing on the consolidation server, or by stamping index data and querying for the stamps. The "stamping" approach means that you put a marker on each piece of index data when it is submitted, so that data missing the expected marker can be found by query.
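The stamping idea can be sketched as follows. The job stamps, the submit helper, and the query callback below are hypothetical and do not correspond to real EXALEAD or MQL calls; the sketch only illustrates the principle that every index request carries a stamp recorded at submission time, so stamps that never show up in the index identify the jobs lost along with the consolidation server's queue.

```python
# Hypothetical sketch of the "stamping" approach: every index request is
# tagged with a job stamp before it is handed to the consolidation server,
# and the index is later queried for those stamps to see which jobs never
# arrived. None of these names correspond to real EXALEAD or MQL calls.

submitted_stamps: set[str] = set()

def submit_for_indexing(object_id: str, job_stamp: str, send) -> None:
    """Record the stamp locally, then hand the request to the indexing pipeline."""
    submitted_stamps.add(job_stamp)
    send({"id": object_id, "stamp": job_stamp})

def find_lost_jobs(query_stamps_in_index) -> set[str]:
    """Stamps that were submitted but never appear in the index mark lost jobs."""
    indexed = set(query_stamps_in_index(submitted_stamps))
    return submitted_stamps - indexed

# Minimal usage with an in-memory stand-in for the index. The second request
# is deliberately dropped, standing in for a job lost with the queue.
fake_index: list[dict] = []
submit_for_indexing("part-001", "job-20240601-001", fake_index.append)
submit_for_indexing("part-002", "job-20240601-002", lambda request: None)
lost = find_lost_jobs(lambda stamps: {r["stamp"] for r in fake_index} & set(stamps))
print(lost)  # {'job-20240601-002'}
```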
Scenario 4—Full EXALEAD Machine Down

In this scenario, one of the machines in the high-availability deployment fails. MQL keeps pushing new index data to the working machine. The Search client detects the down server and directs searches to the running machine. To resume, restore the failed machine, either by recovering it or by restoring it from backup. Either way, the next partial index detects (using the checkpoint) that the two indexing servers are not identical, so the MQL session runs two different queries to determine the objects to index for this partial. Once that partial is complete, the two indexing servers are identical and searches detect (using the checkpoint) that they can be sent across both servers. If the machine is recovered rather than restored, the same problems identified in Scenario 3—EXALEAD Consolidation Server Down apply here: any jobs that were queued in the consolidation server are lost. Follow Scenario 3—EXALEAD Consolidation Server Down for recommendations on how to deal with that.

Scenario 5—EXALEAD Data Lost or Corrupted

In this scenario, data in the index is lost (for example, accidentally deleted) or corrupted. Partial recovery is not possible once data is destroyed; the index must be restored from backup. Restoring from backup would involve:
The resume step here is identical to Scenario 1—EXALEAD Indexing Server Down. All indexing and searching must go through the other machine while the restore takes place.

Scenario 6—Pause for EXALEAD Data Backup

This scenario interrupts indexing and searching on each build group as it is frozen/unfrozen. Indexing is paused for a nightly backup. The recovery procedure is as follows:
Scenario 7—Do Not Pause for EXALEAD Data Backup

This scenario interrupts searching on each build group as it is frozen/unfrozen. In this scenario, the backup is taken without pausing the MQL process.
To freeze or unfreeze a build group:
The recovery for this procedure is identical to that described in Scenario 1—EXALEAD Indexing Server Down, where the two build groups are not identical. The next partial indexing after this process indexes the right data, and both build groups are back in sync.

Scenario 8—Planned EXALEAD Server Restart

This scenario is similar to Scenario 6—Pause for EXALEAD Data Backup, except that instead of freezing the build groups, you stop the servers. Once the servers are stopped, reboot the machine.