Writing Custom Tokenizers and Semantic Processors

Exalead CloudView ships with a large set of semantic processors that can alter documents in analysis pipelines. You can perform most analysis tasks by assembling these processors. For advanced or custom operations, however, it may be more convenient to write your own semantic processors.

You can:

Replace the analysis pipeline tokenizer with a custom one written in Java.
Add a custom semantic processor at the end of the analysis pipeline.

Both are implemented as custom document processors, so make sure you know how to develop and deploy custom document processors on an Exalead CloudView instance.

For information on how to build and deploy with the Eclipse plugin, see "Develop and deploy components using the Eclipse plugin".

In this section:

About Tokens and Annotations
Write a Java Custom Tokenizer
Write a Java Custom Semantic Processor