Best Practices

Crash resistance

To test the connector's crash resistance, you can:

  • Stop the source server during indexing to simulate a source server crash.

  • Unplug the network cable to simulate a network error.

  • Restart the Push API server while the connector is indexing to simulate a Push API server crash.

All of these tests should pass without losing any documents.

Log management

Exalead CloudView uses log4j for logging. To obtain a logger, you can use either:

  • the getLogger() static method of the Logger class, or

  • the getLogger() method of the Connector object.

The global log level of the product is managed in the Logs menu of the Administration Console. You can:

  • Display the exception stack for each message.

  • Log the URIs of documents sent to the index in trace mode.

  • Log the plugin version number at the beginning of the scan method.

    Note: You can also configure log levels more precisely by editing the <DATADIR>/config/Logging.xml file.

Test plan & monitoring

Here are a few tests that you can run against your connector:

  • Index 1 million documents in a single indexing phase without a crash.

  • Measure the time required for an incremental indexing run just after a full scan, without any modification on the source server. This gives you an idea of the minimum time required for incremental indexing.

  • Launch several incremental indexing runs and monitor memory consumption. Note that the connector process memory is shared by all connectors.

  • If you encounter java.lang.OutOfMemoryError: Java heap space or java.lang.OutOfMemoryError: PermGen space errors in a specific process, the memory setting for this process may be too low.

    Edit DeploymentInternal.xml, and change the corresponding <ProcessInternalConfig> node value(s):

    • Change the -Xmx value for heap space issues. For example: <StringValue value="-Xmx1024m"/>

    • Change the -XX:MaxPermSize value for PermGen space issues. For example: <StringValue value="-XX:MaxPermSize=1024m"/>

      Do not forget to rebuild the configuration (for example with <DATADIR>/bin/buildgct master).
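
The timing and memory checks in the test plan above can be approximated from inside the JVM. The following is a minimal sketch, not part of the CloudView API: the IndexingProbe class, its Result type, and the placeholder scan are hypothetical names introduced for illustration. It relies only on the standard System.nanoTime() and Runtime calls. Because the connector process memory is shared by all connectors, the heap figures describe the whole process, not a single connector.

```java
import java.time.Duration;

/**
 * Hypothetical helper (not part of the CloudView API): times one indexing
 * run and samples the JVM heap before and after it.
 */
final class IndexingProbe {

    /** Measurements for a single run. */
    static final class Result {
        final Duration elapsed;
        final long usedHeapBeforeBytes;
        final long usedHeapAfterBytes;

        Result(Duration elapsed, long usedHeapBeforeBytes, long usedHeapAfterBytes) {
            this.elapsed = elapsed;
            this.usedHeapBeforeBytes = usedHeapBeforeBytes;
            this.usedHeapAfterBytes = usedHeapAfterBytes;
        }
    }

    /** Heap currently in use by the whole JVM, in bytes. */
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    /** Runs the given task once and records elapsed time and heap usage. */
    static Result measure(Runnable indexingRun) {
        long heapBefore = usedHeapBytes();
        long start = System.nanoTime();
        indexingRun.run();
        Duration elapsed = Duration.ofNanos(System.nanoTime() - start);
        return new Result(elapsed, heapBefore, usedHeapBytes());
    }

    public static void main(String[] args) {
        // Placeholder for a real incremental scan (assumption): replace the
        // empty body with a call into your own connector.
        Runnable incrementalScan = () -> { };

        // Several consecutive runs make a steadily growing heap easier to spot.
        for (int run = 1; run <= 3; run++) {
            Result result = measure(incrementalScan);
            System.out.printf("run %d: %d ms, heap %d KiB -> %d KiB%n",
                    run,
                    result.elapsed.toMillis(),
                    result.usedHeapBeforeBytes / 1024,
                    result.usedHeapAfterBytes / 1024);
        }
    }
}
```

Heap usage that keeps rising across runs with no change on the source server is a hint worth investigating before raising the -Xmx value.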

Package the connector

Do not forget to update the plugin version number for each new release.

Aggregate Documents

Sometimes, building a PAPI document is a complex task, especially when you need to rebuild it entirely for an incremental update. For example, let’s say that a connector indexing emails must create a single PAPI document for each email thread, aggregating all the emails of that thread. When a new email arrives in a thread, the connector must rebuild the entire document by aggregating all the emails once again.

For this kind of situation, we recommend using the Consolidation Server. See the Exalead CloudView Consolidation Server Guide.

Other best practices

  • Index raw documents without connector aggregation. If you want to perform aggregation, use the Consolidation Server. See the Exalead CloudView Consolidation Server Guide.

  • Do not store anything on the hard drive; everything must be stored in Exalead CloudView.

  • Build document URIs in a hierarchical way, for example, /ROOT/FolderA/FolderB/DocumentA, so that you can delete the entire content of a folder with a single call to the deleteDocumentsRootPath() method.

  • If the indexing is multi-threaded, the number of threads must be configurable in the connector UI so that the server load can be adjusted.

  • To send documents as batches to the indexing server, you can select the Buffer operations option in the Administration Console > Connectors > Deployment > Push API section. You do not need to develop your own buffering strategy; just rely on this option.
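
The hierarchical URI recommendation above can be sketched as follows. This is a minimal illustration, not CloudView code: the DocumentUris class and its build() and isUnder() methods are hypothetical names. The isUnder() method only imitates, as an assumption, the idea that a root path covers every URI below it; it does not reproduce the actual matching rules of deleteDocumentsRootPath(). Segments are assumed not to contain the "/" character.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical helper (not part of the CloudView API) that builds
 * hierarchical document URIs such as /ROOT/FolderA/FolderB/DocumentA.
 */
final class DocumentUris {

    /** Joins the root, the folder chain, and the document name with '/'. */
    static String build(String root, List<String> folders, String documentName) {
        StringBuilder uri = new StringBuilder("/").append(root);
        for (String folder : folders) {
            uri.append('/').append(folder);
        }
        return uri.append('/').append(documentName).toString();
    }

    /**
     * Local illustration only (assumption): true when the URI lies below the
     * given root path. The trailing '/' prevents /ROOT/FolderA from also
     * matching a sibling such as /ROOT/FolderAB.
     */
    static boolean isUnder(String uri, String rootPath) {
        String prefix = rootPath.endsWith("/") ? rootPath : rootPath + "/";
        return uri.startsWith(prefix);
    }

    public static void main(String[] args) {
        String uri = build("ROOT", Arrays.asList("FolderA", "FolderB"), "DocumentA");
        System.out.println(uri);
        // Every document built below /ROOT/FolderA shares that root path, so a
        // single root-path deletion can address the whole folder.
        System.out.println(isUnder(uri, "/ROOT/FolderA"));
    }
}
```

Keeping URI construction in one place makes it harder for two code paths to produce differently shaped URIs for the same folder.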