Indexing your first PAPI document

Run the sample program

This section explains how to run the sample program to index your first document.

Java Code

import com.exalead.papi.helper.Document;
import com.exalead.papi.helper.Meta;
import com.exalead.papi.helper.Part;

// [...]

final PushAPI papi = createConnection(...);

// new document (URI, stamp)
final Document doc = new Document("doc1", "2014-03-15");

// create the metas
doc.addMeta(new Meta("title", "My document's title"));
doc.addMeta(new Meta("date", "2014-03-20"));
doc.addMeta(new Meta("size", "5493"));
doc.addMeta(new Meta("approved", "false"));

// master part
final byte[] bytes = "the text to index...".getBytes("UTF-8");

// if you don't specify a part name, the part is treated as the master part
final Part masterPart = new Part(bytes);
masterPart.setEncoding("UTF-8");
masterPart.setFileName("filename.txt");
doc.addPart(masterPart);

// another part
final Part part = new Part("Second part", bytes);
part.setEncoding("UTF-8");
part.setExtension("txt");
doc.addPart(part);

// push the document
papi.addDocument(doc);

C# Code

This code snippet demonstrates how to send the document.

// How to send a document.
void IndexDocument()
{
    Document doc = new Document("doc1");

    // the stamp associated to the document
    doc.Stamp = "2014-03-15";

    // create the metas
    MetaContainer metaContainer = new MetaContainer();
    metaContainer.AddMeta(new Meta("title", "My document's title"));
    metaContainer.AddMeta(new Meta("date", "2014-03-20"));
    metaContainer.AddMeta(new Meta("size", "5493"));
    metaContainer.AddMeta(new Meta("approved", "false"));
    doc.MetaContainer = metaContainer;

    PartContainer partContainer = new PartContainer();
    // master part
    byte[] bytes = new UTF8Encoding().GetBytes("the text to index...");
    Part masterPart = new Part(bytes);
    masterPart.Encoding = "UTF-8";
    masterPart.Filename = "foo.txt";

    partContainer.AddPart(masterPart);

    // another part
    Part part = new Part(bytes);
    part.Encoding = "UTF-8";
    part.Filename = "foo.txt";

    partContainer.AddPart(part);
    doc.PartContainer = partContainer;

    // push the document
    papi.AddDocument(doc);
}

How to force the indexing of pending operations

To force indexing, call the following two methods in order.

Java Code

// This forces a flush to disk
papi.sync();

// This triggers the indexing of committed documents.
// In V6R2014 and higher, the task queue is optional (no task queue by default).
// If there is no task queue, the following method may commit an indexing job if a
// document analysis has been started. Unlike the sync method, this method does not
// block the PAPI.
papi.triggerIndexingJob();

C# Code

// This forces a flush to disk
papi.Sync();

// This triggers the indexing of committed documents.
// In V6R2014 and higher, the task queue is optional (no task queue by default).
// If there is no task queue, the following method may commit an indexing job if a
// document analysis has been started. Unlike the Sync method, this method does not
// block the PAPI.
papi.TriggerIndexingJob();

Important: In V6R2014 and higher versions, the triggerIndexingJob() method may commit an indexing job if a document analysis has been started. Unlike the sync() method, this method does not block the PAPI.

Important: In Exalead CloudView V6, the connector should not call the sync() method during standard indexing. Indexing is controlled instead by the Force Indexing after scan option in the Administration Console > Connectors > Deployment > Push API section. When this option is selected, Exalead CloudView automatically triggers the indexing job after each scan. Reserve the sync() method for very specific use cases, for example when you need to compute a diff between the documents indexed in Exalead CloudView and the documents in the source. In that case, you must: push the new documents, call sync() to trigger the indexing job, then enumerate the synced entries and compare them with your source.
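
The comparison step of that diff workflow can be sketched in plain Java. This is only an illustration under stated assumptions: the synced entries and the source documents are assumed to have already been collected into URI-to-stamp maps, and the IndexDiff class, its fields, and its compute method are hypothetical names that are not part of the Push API. How the synced entries are enumerated is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical helper, NOT part of the Push API: compares two URI -> stamp maps.
final class IndexDiff {
    // URIs that are new in the source or whose stamp differs from the indexed one
    final List<String> toPush = new ArrayList<>();
    // URIs that are indexed but no longer present in the source
    final List<String> toDelete = new ArrayList<>();

    static IndexDiff compute(Map<String, String> source, Map<String, String> indexed) {
        final IndexDiff diff = new IndexDiff();
        // TreeMap copies give a stable, sorted iteration order
        for (Map.Entry<String, String> entry : new TreeMap<>(source).entrySet()) {
            final String indexedStamp = indexed.get(entry.getKey());
            if (!Objects.equals(entry.getValue(), indexedStamp)) {
                diff.toPush.add(entry.getKey());
            }
        }
        for (String uri : new TreeMap<>(indexed).keySet()) {
            if (!source.containsKey(uri)) {
                diff.toDelete.add(uri);
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        // Sample data invented for the illustration
        final Map<String, String> source =
            Map.of("doc1", "2014-03-15", "doc2", "2014-03-20");
        final Map<String, String> indexed =
            Map.of("doc1", "2014-03-15", "doc2", "2014-03-01", "doc3", "2014-02-10");
        final IndexDiff diff = compute(source, indexed);
        // prints: push: [doc2], delete: [doc3]
        System.out.println("push: " + diff.toPush + ", delete: " + diff.toDelete);
    }
}
```

In a real connector, the two maps would come from your source system and from the enumeration of synced entries, and the resulting lists would drive the add and delete calls.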

Check the document status

You can use getDocumentStatus (GetDocumentStatus in C#) to retrieve the status of a specific document from its URI.

Java Code

void getDocumentStatus() throws PushAPIException {
    final String uri = "doc1";
    final DocumentStatus ds = papi.getDocumentStatus(uri);
    if (ds.isExist()) {
        System.out.println("EXISTS! Stamp = " + ds.getStamp());
    } else {
        System.out.println("MISSING!!!");
    }
}

C# Code

public void GetDocumentStatus()
{
    string uri = "doc1";
    DocumentStatus ds = papi.GetDocumentStatus(uri);
    if (ds.Exist)
        Console.WriteLine("EXISTS! Stamp = " + (ds.Stamp ?? "(null)"));
    else
        Console.WriteLine("MISSING!!!");
}
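
One possible use of the returned status is to decide whether a document needs to be pushed again. The sketch below assumes that the connector treats the stamp as a change marker, so a document is re-pushed when it is missing or when its indexed stamp differs from the stamp computed from the source. StampCheck and needsPush are hypothetical names, and the method takes plain values rather than a DocumentStatus so that it can run without the Push API library.

```java
import java.util.Objects;

// Hypothetical helper, NOT part of the Push API.
final class StampCheck {
    // exists and indexedStamp stand for the values read from the document status;
    // sourceStamp is the stamp the connector computes for the source document.
    static boolean needsPush(boolean exists, String indexedStamp, String sourceStamp) {
        // Push when the document is missing or when its stamp has changed
        return !exists || !Objects.equals(indexedStamp, sourceStamp);
    }

    public static void main(String[] args) {
        System.out.println(needsPush(false, null, "2014-03-15"));        // prints: true
        System.out.println(needsPush(true, "2014-03-15", "2014-03-15")); // prints: false
        System.out.println(needsPush(true, "2014-03-01", "2014-03-15")); // prints: true
    }
}
```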