Manage Documents Explicitly

You can set aside the subtleties of lifecycle management for documents created during the transformation and aggregation phases if you create child documents rather than documents with custom URIs (that is, documents with URIs that share nothing in common with the document that created them).

Creating child documents from a document managed by a connector ensures that, when the connector pushes the deletion of that document, the Consolidation Server automatically registers all of its child documents for deletion. This behavior applies to both the transformation and aggregation phases.

Note: In this case, the deletion occurs whether or not the document is attached to another one. The deletion criterion is URI-based.

As a result, this is the preferred way to create documents if you do not want to manage the lifecycle of these objects yourself. If you choose this method, you are unlikely ever to need to write a processor in the delete action context.
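The URI-based deletion described above can be pictured with a small toy model. This is only a sketch and not the Consolidation Server API: the `#` separator, the `child_uri` helper, and the `delete_with_children` function are hypothetical names chosen for illustration, and the real child URI format may differ.

```python
# Toy model of URI-based child deletion. NOT the product API.
# Assumption: a child URI is the parent URI plus a separator and a suffix.
CHILD_SEPARATOR = "#"  # hypothetical; the real child URI format may differ


def child_uri(parent_uri, name):
    """Build a child URI that shares its prefix with the parent URI."""
    return parent_uri + CHILD_SEPARATOR + name


def delete_with_children(store, parent_uri):
    """Remove parent_uri and every URI derived from it; return the removed URIs."""
    prefix = parent_uri + CHILD_SEPARATOR
    removed = {uri for uri in store if uri == parent_uri or uri.startswith(prefix)}
    store.difference_update(removed)
    return removed


store = {
    "employee/bob",
    child_uri("employee/bob", "summary"),
    child_uri("employee/bob", "badge"),
    "employee/alice",
    "custom/unrelated-uri",  # custom URI: shares nothing with its creator
}

removed = delete_with_children(store, "employee/bob")
```

In this model, the delete order for `employee/bob` removes both child documents because the match is made on the URI alone, with no look at edges. The document with the custom URI is left untouched, which is why such documents need explicit lifecycle handling.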

In the Transformation Phase

In the Company Hierarchy example code, we pushed the creation and updates of employees to the Consolidation Store. Along with them, we may have manually created new documents representing the services they belong to.

Note: Connectors do not manage service documents. In our use case, the CDIH would refuse delete orders for a service document, as it is created afterward.

Deleting the documents created during the transformation phase is the responsibility of the developer who writes the processor logic. If you create managed documents, as we did with the call to createDocument, the document is removed from the Store automatically once no other documents are attached to it. If the connectors send delete orders for the 3ds company and for all of its employees, the service documents become orphaned and are garbage collected automatically.

What would happen if you sent an order to delete all the employees of a given service? In that case, all the employees would ultimately be deleted from the Store, and with them all the edges that pointed to the associated service. However, because services would still point to the company, the service documents would stay in the Store. Remember that managed documents are garbage collected only when no edges are attached to them at the end of the transformation phase.
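The garbage collection rule can be checked with a minimal in-memory model. This is a sketch under stated assumptions, not the Consolidation Store implementation: the `Store` class, its methods, and the URIs are hypothetical, and the model assumes that deleting a document also removes every edge that touches it.

```python
# Toy model of managed-document garbage collection. NOT the product API.
class Store:
    def __init__(self):
        self.managed = {}   # uri -> True if created by a processor as a managed document
        self.edges = set()  # (source_uri, target_uri) pairs

    def add(self, uri, managed=False):
        self.managed[uri] = managed

    def link(self, source, target):
        self.edges.add((source, target))

    def delete(self, uris):
        """Apply delete orders, then collect managed documents left without edges."""
        for uri in uris:
            self.managed.pop(uri, None)
        # Assumption: an edge disappears as soon as one of its ends is deleted.
        self.edges = {
            (s, t) for s, t in self.edges if s in self.managed and t in self.managed
        }
        attached = {uri for edge in self.edges for uri in edge}
        orphans = [u for u, is_managed in self.managed.items()
                   if is_managed and u not in attached]
        for uri in orphans:
            del self.managed[uri]


def build():
    store = Store()
    store.add("company/3ds")                                # pushed by a connector
    store.add("employee/bob")                               # pushed by a connector
    store.add("employee/alice")                             # pushed by a connector
    store.add("service/research-development", managed=True)  # created by a processor
    store.link("employee/bob", "service/research-development")
    store.link("employee/alice", "service/research-development")
    store.link("service/research-development", "company/3ds")
    return store


# Case 1: delete orders for the company and all its employees.
everything = build()
everything.delete(["employee/bob", "employee/alice", "company/3ds"])

# Case 2: delete orders for the employees only.
employees_only = build()
employees_only.delete(["employee/bob", "employee/alice"])
remaining = sorted(employees_only.managed)
```

In the first case the service loses all its edges and is collected, so the store ends up empty. In the second case the edge from the service to the company survives, so both the company and the service remain.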

Consider that we have the following graph in the Consolidation Store.

Company's Hierarchy in the Consolidation Store

Connectors push the blue documents. The Consolidation Server creates the purple ones (as described in the previous section). If connectors send delete orders for all employees and companies, all nodes and edges are properly deleted from the Store. In the following graphic, transparency means that the documents have disappeared from the Store.



But if we send delete orders for the employees only, we end up in the following case, in which the colored nodes and edges stay in the Consolidation Store.



What about the delete processor?

If we take the case of the Company's Hierarchy and send delete orders for all employees, we can safely write a delete processor that, for each employee, calls a deleteDocument method on the service the employee belongs to. For example, when we delete Bob and Alice sequentially, the processor sends the same delete order for the Research-Development service twice. This is acceptable, because the second order becomes a no-operation.
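The reason the duplicate order is harmless is that deletion is idempotent. The sketch below illustrates this with a plain Python set standing in for the Store. The `delete_document` function and the `service_of` mapping are hypothetical stand-ins for illustration, not the product's deleteDocument method.

```python
# Toy illustration of an idempotent delete. NOT the product API.
def delete_document(store, uri):
    """Delete uri if present. Return True if removed, False if it was a no-op."""
    if uri in store:
        store.remove(uri)
        return True
    return False


# Hypothetical lookup from an employee to the service it belongs to.
service_of = {
    "employee/bob": "service/research-development",
    "employee/alice": "service/research-development",
}

store = {"employee/bob", "employee/alice", "service/research-development"}

service_results = []
for employee in ("employee/bob", "employee/alice"):
    delete_document(store, employee)
    # Same service URI on both iterations: the second call finds nothing to remove.
    service_results.append(delete_document(store, service_of[employee]))
```

After the loop, `service_results` is `[True, False]`: the first order removes the service and the second changes nothing. Note that this pattern is only safe when all employees of the service are deleted together.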

What if the delete orders for the employees are incremental?

Can we be sure that the employee delete operation is always global? If it is not, we must not send a delete order for the service that an employee belongs to. If you send a delete order for Bob, you cannot delete the Research-Development service, because it still has an employee (Alice) attached to it. To find out whether a service still has employees, we would need to traverse the graph during the transformation phase, but that operation is only allowed in the aggregation phase. In such a case, writing a custom delete processor is not a viable solution. Instead, keep the default delete processor, which deletes only the employee visited.

In the Aggregation Phase

Every document manually created in the aggregation processors is pushed as is to the Indexing Server once it has passed the forward rules phase.

If you want to associate that manually created document with the lifecycle of the "master" document, use the createChildDocument method.

When a master document is deleted, the Consolidation Server does not automatically send delete orders for its existing child documents, if any. This is because they are not in the storage and the Consolidation Server cannot determine their types. To delete child documents automatically, you must create a delete processor.

If you create a document manually, you have to handle its deletion yourself. To do so, you can:

  • Send delete orders to the PushAPI server of the Indexing Server directly.

  • Write a custom aggregation delete processor that sends delete orders only for the documents (URIs) that you know about and manage.
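The second option requires the processor to know which URIs it created for each master document. One way to picture that bookkeeping is the sketch below. It is an illustrative assumption, not the product API: the `CreatedDocumentRegistry` class, its methods, and the URIs are hypothetical, and a real processor would need state that survives restarts, or a deterministic naming scheme from which the URIs can be recomputed.

```python
# Toy bookkeeping for manually created documents. NOT the product API.
from collections import defaultdict


class CreatedDocumentRegistry:
    """Remembers which document URIs were created for each master document."""

    def __init__(self):
        self._by_master = defaultdict(set)

    def record(self, master_uri, created_uri):
        """Call when an aggregation processor creates a document manually."""
        self._by_master[master_uri].add(created_uri)

    def delete_orders_for(self, master_uri):
        """Return the URIs to delete for this master document and forget them."""
        return sorted(self._by_master.pop(master_uri, set()))


registry = CreatedDocumentRegistry()
registry.record("employee/bob", "report/bob-2023")
registry.record("employee/bob", "report/bob-2024")
registry.record("employee/alice", "report/alice-2024")

# When the master document is deleted, send one delete order per recorded URI.
first_call = registry.delete_orders_for("employee/bob")
second_call = registry.delete_orders_for("employee/bob")
```

Here `first_call` holds the two URIs recorded for `employee/bob`, and `second_call` is empty, so a repeated delete of the same master document sends no further orders. Documents recorded for other masters are unaffected.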