void updateDocument(Document document, string[] fields) and void updateDocumentList(Document[] documentList, string[][] fieldsList)

There are two update methods in the PushAPI: updateDocument(Document doc, String[] fields) and updateDocumentList(Document[] docList, String[][] fieldsList).

The first updates a single document; the second updates several documents at once. The fields/fieldsList parameters are not handled yet, so they currently have no effect.

To update a document, you have to call one of these methods with a new document which:

  • has the same URI as the one you want to update,

  • and contains the updated parts/metas.

The parts/metas that do not need updating are fetched from the document cache, so there is no need to include them in the document used for the update.
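The merge behavior described above can be pictured with a small simulation. This is only a sketch under stated assumptions: UpdateMergeSketch, SimpleDoc, and merge are hypothetical names that are not part of the Push API, and the real merge is performed by the server against the document cache.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UpdateMergeSketch {

    // Hypothetical stand-in for a document: a URI plus named metas.
    // Parts are left out for brevity.
    static final class SimpleDoc {
        final String uri;
        final Map<String, List<String>> metas;

        SimpleDoc(String uri, Map<String, List<String>> metas) {
            this.uri = uri;
            this.metas = metas;
        }
    }

    // Simulates the update: metas present in the update document replace the
    // cached ones; metas absent from the update are taken from the cache.
    static SimpleDoc merge(SimpleDoc cached, SimpleDoc update) {
        if (!cached.uri.equals(update.uri)) {
            throw new IllegalArgumentException("The update must reuse the URI of the cached document");
        }
        Map<String, List<String>> merged = new LinkedHashMap<>(cached.metas);
        merged.putAll(update.metas);
        return new SimpleDoc(cached.uri, merged);
    }

    public static void main(String[] args) {
        SimpleDoc cached = new SimpleDoc("doc://1", Map.of(
                "title", List.of("Old title"),
                "author", List.of("Alice")));
        // Only the changed meta is sent; "author" comes from the cache.
        SimpleDoc update = new SimpleDoc("doc://1", Map.of("title", List.of("New title")));
        SimpleDoc result = merge(cached, update);
        System.out.println("title  = " + result.metas.get("title"));
        System.out.println("author = " + result.metas.get("author"));
    }
}
```

In this simulation the update document carries only the "title" meta, yet the merged result still has the cached "author" value.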

Constraints

  • For the update feature to work, you must either enable the Build Group document cache or target another Consolidation Server. For more information, see Using Document Cache in the Exalead CloudView Connectors Guide.

  • Only documents added after the document cache was enabled can be updated.

Notes

  • The old values of multivalued metas are dropped. To update a multivalued meta by adding values, you must also include the old values you want to keep in the document used for the update.

  • Remember that parts ≠ fields and metas ≠ fields. The index fields that will be updated depend on the part/meta field mappings, not on the part/meta names. For example, if you want to update the “text” field, you probably want to put an updated “master” part in the document used for the update, and not a “text” meta.

  • The document in the document cache is updated too, so subsequent updates of a document do not need to be cumulative.

  • It is a good idea to perform batches of updates instead of single updates.
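Two of the notes above lend themselves to small helpers. The sketch below is an illustration, not Push API code: valuesForUpdate and toBatches are hypothetical helpers, the batch size is arbitrary, and skipping duplicate values is a choice made for this example only.

```java
import java.util.ArrayList;
import java.util.List;

public class UpdateNotesSketch {

    // Because the old values of a multivalued meta are dropped on update,
    // the update document must carry the old values to keep plus the new ones.
    static List<String> valuesForUpdate(List<String> oldValuesToKeep, List<String> newValues) {
        List<String> all = new ArrayList<>(oldValuesToKeep);
        for (String value : newValues) {
            if (!all.contains(value)) { // skip duplicates (example-only choice)
                all.add(value);
            }
        }
        return all;
    }

    // Splits update documents into batches; each batch could then be passed
    // to a single updateDocumentList call instead of one call per document.
    static <T> List<List<T>> toBatches(List<T> documents, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < documents.size(); start += batchSize) {
            int end = Math.min(start + batchSize, documents.size());
            batches.add(new ArrayList<>(documents.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // prints [red, green, blue]
        System.out.println(valuesForUpdate(List.of("red", "green"), List.of("blue")));
        // prints [[d1, d2], [d3, d4], [d5]]
        System.out.println(toBatches(List.of("d1", "d2", "d3", "d4", "d5"), 2));
    }
}
```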

Document data types

When you implement the updateDocument method you must send one or more documents to be updated to the index. The Document object should contain:

| Type | Description |
|---|---|
| uri | A URI, which is an opaque string that uniquely identifies the document from the connector point of view. See also URI. |
| stamp | An optional Stamp, which is an opaque string that the connector may use to track document changes. Document stamps may be retrieved through the getDocumentStatus method. See also Stamps. |
| MetaContainer | The MetaContainer of the document. Metadata are open name-value pairs. For a complete list of metadata understood by the API, see Metadata Examples. |
| PartContainer | The PartContainer of the document. The Connector sends raw bytes containing the document content. Exalead CloudView conversion services will translate and extract the textual content of the document before indexing. The Part contains a DirectiveContainer. |
| DirectiveContainer | The DirectiveContainer of the document (different from the directive associated to a Part). |

Implement the part object

The Part object must provide accessors for the following predefined directives:

  • encoding

  • filename

  • mimeHint

  • certifiedMime

To set a custom directive, the Part object must also provide a method, for example:

public void setCustomDirective(string name, string value)
public void setCustomDirective(Directive directive)
public void setCustomDirective(string name, string[] values)
public void addCustomDirective(string name, string value)

Implement the document object

The Document object must provide accessors for these predefined directives:

  • forcedSlice

  • sameSlice

And a method to set a custom directive:

public void setCustomDirective(string name, string value)
public void setCustomDirective(Directive directive)
public void setCustomDirective(string name, string[] values)
public void addCustomDirective(string name, string value)

HTTP parameters

The update_documents parameters are described in the table below.

Note: These parameters must be repeated (with a different id) for every document you want to send.

For better performance, we recommend using multipart/form-data instead of application/x-www-form-urlencoded.

| Parameter | Location | Description |
|---|---|---|
| PAPI_<id>:uri | [URL/FORM] | The uri parameter is the string of the document URI. |
| PAPI_<id>:stamp | [URL/FORM] | The optional stamp parameter is the string representing the document's Stamp. |
| PAPI_<id>:meta:<meta_name> | [URL/FORM] | The meta_* parameter is a string containing the value of the metadata referenced by meta_name. Multiple values may exist for the same parameter. You must generate as many parameters as there are values. |
| PAPI_<id>:directive:<directive_name> | [URL/FORM] | The list of optional supported directives (at the document level): forcedSlice (advanced feature). |
| PAPI_<id>:directive:fields | [URL/FORM] | Not supported for the moment. |
| PAPI_<id>:part_bytes:<part_name> | [URL/FORM] | The part_bytes parameter is the content of the document's part that is identified by part_name. |
| PAPI_<id>:part_directive:<part_name>:<directive_name> | [URL/FORM] | The list of optional supported directives (at the part level): filename (the document filename), mimeHint (the hint mime parameter), mime (the forced mime; use very carefully), encoding (the document encoding). |
| PAPI_session | [URL] | The optional parameter that retrieves the session given by a previous call to get_current_session_id. Action: if there is a session mismatch, the Push API server refuses the command and returns an exception. |
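The naming scheme of the update_documents parameters can be sketched as follows. This is an illustration under assumptions: DocParams and toParameters are hypothetical names, and using a zero-based integer as the <id> is a guess; the only stated requirement is that each document uses a different id. The sketch builds name/value pairs only and leaves the actual HTTP encoding (multipart/form-data or otherwise) aside.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UpdateParamsSketch {

    // Hypothetical holder for what one document contributes to the request.
    static final class DocParams {
        final String uri;
        final String stamp; // may be null: the stamp is optional
        final Map<String, List<String>> metas = new LinkedHashMap<>();

        DocParams(String uri, String stamp) {
            this.uri = uri;
            this.stamp = stamp;
        }
    }

    // Builds the ordered name/value pairs, repeating the parameter set with a
    // different id for every document.
    static List<Map.Entry<String, String>> toParameters(List<DocParams> docs) {
        List<Map.Entry<String, String>> params = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) { // assumed: integer ids
            DocParams doc = docs.get(id);
            String prefix = "PAPI_" + id + ":";
            params.add(Map.entry(prefix + "uri", doc.uri));
            if (doc.stamp != null) {
                params.add(Map.entry(prefix + "stamp", doc.stamp));
            }
            // One parameter per value for multivalued metas.
            for (Map.Entry<String, List<String>> meta : doc.metas.entrySet()) {
                for (String value : meta.getValue()) {
                    params.add(Map.entry(prefix + "meta:" + meta.getKey(), value));
                }
            }
        }
        return params;
    }

    public static void main(String[] args) {
        DocParams first = new DocParams("doc://1", null);
        first.metas.put("tag", List.of("red", "blue"));
        DocParams second = new DocParams("doc://2", "v2");
        for (Map.Entry<String, String> p : toParameters(List.of(first, second))) {
            System.out.println(p.getKey() + "=" + p.getValue());
        }
    }
}
```

Part bytes and directives would follow the same pattern, using the part_bytes, part_directive, and directive name prefixes.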