Document Processing in the Consolidation Server

The following diagram gives a detailed view of document processing in the Consolidation Server.

See Also
How the Consolidation Server Fits into Exalead CloudView
Processor Action Context
Control the Processing


At the top level, connectors send documents to the Consolidation Server. The PushAPI Server receives them and first passes them to the Consolidated Document Identifier Holder (CDIH), which assigns them unique IDs.

Note: If you send a delete order for a document that the CDIH does not know, the order stops there and never reaches the transformation processors. This is the case for the document depicted in black in the diagram.
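The ID assignment and the early drop of unknown delete orders can be sketched as follows. This is a minimal illustration, not the CloudView API: the class IdentifierHolder, its handle method, and the "add"/"delete" order names are hypothetical, and what happens to an ID after a successful delete is not modelled.

```python
from itertools import count


class IdentifierHolder:
    """Hypothetical stand-in for the CDIH: maps document URIs to unique IDs."""

    def __init__(self):
        self._ids = {}
        self._next_id = count(1)

    def handle(self, order, uri):
        """Return (order, uri, doc_id) to forward, or None to drop the order."""
        if order == "add":
            # Assign an ID the first time a URI is seen; reuse it afterwards.
            if uri not in self._ids:
                self._ids[uri] = next(self._next_id)
            return (order, uri, self._ids[uri])
        if order == "delete":
            # A delete order for an unknown document goes no further.
            if uri not in self._ids:
                return None
            return (order, uri, self._ids[uri])
        raise ValueError(f"unknown order: {order}")


holder = IdentifierHolder()
forwarded = [
    holder.handle("add", "doc-a"),     # known from now on
    holder.handle("delete", "doc-b"),  # never pushed, so dropped
    holder.handle("delete", "doc-a"),  # known, so forwarded
]
```

Here only the first and third orders would continue to the transformation threads; the second one is dropped by the holder.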

The PushAPI Server then dispatches the documents to a list of transformation threads. In the processing chain of one transformation thread, a document is offered to every defined processor in turn (4 in the diagram: TPi, 1 <= i <= 4). We say "offered" because, as described later, you can associate processor code with a particular document type hierarchy. As a result, depending on the document type, some processors are skipped (colored in orange) and others are selected (colored in green). For more information, see Processor Type Inheritance and Runtime Selection.
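To make the selection step concrete, here is a small sketch of type-based processor selection. It assumes that a processor declared for a type also applies to the types that inherit from it; the type names, the TYPE_PARENTS table, and the select_processors helper are all hypothetical and only illustrate the idea.

```python
# Hypothetical type hierarchy: child type -> parent type (None for the root).
TYPE_PARENTS = {
    "document": None,
    "person": "document",
    "employee": "person",
    "product": "document",
}


def ancestors(doc_type):
    """Yield doc_type, then each of its parents up to the root."""
    while doc_type is not None:
        yield doc_type
        doc_type = TYPE_PARENTS[doc_type]


def select_processors(processors, doc_type):
    """Keep, in declaration order, the processors whose target type
    is the document's type or one of its ancestors."""
    lineage = set(ancestors(doc_type))
    return [name for name, target in processors if target in lineage]


# Four processors, each declared for one (assumed) document type.
PROCESSORS = [
    ("TP1", "document"),
    ("TP2", "product"),
    ("TP3", "person"),
    ("TP4", "employee"),
]

selected_for_employee = select_processors(PROCESSORS, "employee")
selected_for_product = select_processors(PROCESSORS, "product")
```

Under these assumptions, an "employee" document runs through TP1, TP3, and TP4 and skips TP2, while a "product" document runs through TP1 and TP2 only.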

At execution time, once a document has been handed to a transformation processor, it is automatically passed to the next available and valid processor, unless the processor says otherwise with a discard call. This is the case, for example, for the processor highlighted in red, where the document is not transmitted to the next stage (the next processor or, here, the Consolidation Store). A document discarded in this way does not participate in the aggregation phase.
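The effect of a discard call on the chain can be sketched like this. The Context class, its discard method, and the run_chain function are hypothetical names chosen for the illustration, and the store is reduced to a plain list.

```python
class Context:
    """Hypothetical per-document context handed to each processor."""

    def __init__(self):
        self.discarded = False

    def discard(self):
        # Ask the chain to stop after the current processor.
        self.discarded = True


def run_chain(processors, document, store):
    """Run processors in order; return the names of those that ran."""
    ctx = Context()
    visited = []
    for name, process in processors:
        process(ctx, document)
        visited.append(name)
        if ctx.discarded:
            # The document reaches neither the next processor nor the store.
            return visited
    store.append(document)
    return visited


def tag(ctx, doc):
    doc["tagged"] = True


def drop_drafts(ctx, doc):
    if doc.get("status") == "draft":
        ctx.discard()


def enrich(ctx, doc):
    doc["enriched"] = True


CHAIN = [("tag", tag), ("drop_drafts", drop_drafts), ("enrich", enrich)]
store = []
draft_path = run_chain(CHAIN, {"status": "draft"}, store)
final_path = run_chain(CHAIN, {"status": "final"}, store)
```

In this sketch the draft document stops at drop_drafts and is never stored, so it would be invisible to any later aggregation step; the final document goes through all three processors and lands in the store.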

The Consolidation Store stores all the documents pushed to it as well as the potential relationships created at the transformation phase.

Once some documents are available in the Store, the aggregation phase can start, independently of the transformation phase, so the two phases run in parallel. Like the transformation phase, aggregation runs concurrently on a number of threads defined at configuration time. The selection and processing logic mirrors the one described for transformation. The difference is that in this phase:

  1. Aggregation processors are executed (4 in the diagram: APi, 1 <= i <= 4).
  2. Documents are then passed to the forward rules handler.
  3. The forward rules handler ultimately routes (or does not route) consolidated documents to the Indexing Servers or to other Consolidation Servers.
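The three steps above can be sketched as a single function. This is only an illustration under simplifying assumptions: forward rules are modelled as hypothetical (predicate, target) pairs, the target names are invented, type-based processor selection is left out, and threading is ignored.

```python
def consolidate(document, aggregation_processors, forward_rules):
    """Run every aggregation processor, then return the targets
    selected by the forward rules (possibly none)."""
    for process in aggregation_processors:
        process(document)
    return [target for matches, target in forward_rules if matches(document)]


def count_children(document):
    # Example aggregation: summarize related documents on the parent.
    document["child_count"] = len(document.get("children", []))


AGGREGATION_PROCESSORS = [count_children]

# Hypothetical rules: each pairs a predicate with a routing target.
FORWARD_RULES = [
    (lambda doc: doc["type"] == "product", "indexing-server"),
    (lambda doc: doc["child_count"] > 1, "other-consolidation-server"),
]

product = {"type": "product", "children": ["review-1", "review-2"]}
log_entry = {"type": "log"}

product_targets = consolidate(product, AGGREGATION_PROCESSORS, FORWARD_RULES)
log_targets = consolidate(log_entry, AGGREGATION_PROCESSORS, FORWARD_RULES)
```

With these rules the product document is routed to both targets, while the log entry matches no rule and is not routed anywhere, which corresponds to the "or does not route" case in step 3.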