Index

This section lists the elements you can use to configure the Index.

AnalysisConfig

  • com.exalead.indexing.analysis.v10.AnalysisConfig
  • AnalysisConfig represents a self-contained module for Document Analysis. AnalysisConfig is referenced by a BuildGroup. An analysis module defines a set of pipelines that are applied in this module.
  • Attributes:
    Name Type Default value Description
    name string Name of the analysis module. Must be unique.
    linguistic boolean True Extracts linguistic data for the dictionary, such as word counts. This impacts the ability to compute related terms and use word counts for ranking.
  • Nested elements:
    Name Type Description
    AnalysisPipeline com.exalead.indexing.analysis.v10.AnalysisPipeline*

AnalysisPipeline

  • com.exalead.indexing.analysis.v10.AnalysisPipeline
  • A document analysis pipeline. Each pipeline has an associated accept condition, which is tested for each input document. If a document matches the condition, it is processed by this pipeline. If not, the condition of the next pipeline in the list of pipelines defined in a DocumentAnalysis object is tested. A document refused by all pipelines is neither processed nor indexed. Pipeline processing consists of several stages:
    • Document Processing Stage - performed by a list of DocumentProcessor elements, which process each Document sequentially. Document Processors manipulate the 'DocumentParts' (binary data pushed through the PAPI) and the 'DocumentChunks' (textual data obtained from PAPI meta, from processing a DocumentPart, or from processing pre-existing DocumentChunks). Each DocumentChunk has a textual content, a ContextName, a language, and a score, and may belong to a DocumentPart. A DocumentChunk belonging to no DocumentPart is called a root DocumentChunk.
    • Semantic Processing Stage - involves a list of SemanticProcessor elements, which process each DocumentChunk of each Document sequentially (except those for which Semantic Processing is disabled in the mapping). Semantic Processing segments text into 'tokens' and then processes the text as a flow of tokens. SemanticAnnotations are produced on each token.
    • Mapping - maps DocumentChunks and SemanticAnnotations to index fields.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisConfig (as AnalysisConfig)
  • Attributes:
    Name Type Default value Description
    name string
    errorAction string continue Specifies the action to take if a document error occurs during processing:
    • "discard": Discards the document from the job. If the document was already in the index, it is not removed.
    • "delete": Discards the document from the job and deletes it from the index.
    • "continue": Keeps processing the document. The document will probably be incomplete in the index.
    reportDocumentErrors boolean True Reports the document errors in the global reporting store, for further analysis.
    globalLogDocumentErrors boolean Logs errors and exceptions reported by the processors in the global log (without stack trace).
    autoBlacklistDocuments boolean True Tries to automatically blacklist the documents that trigger serious failures. This option helps prevent failure loops, that is, cases where the same documents trigger the same analysis process failures again and again.
    tokenizationConfig string Reference to the TokenizationConfig object to use for tokenization during Semantic Processing Stage.
    autoconfigureFromDataModel boolean True
    documentProcessorsProfiling boolean Logs the CPU time spent for each document processor and for the main indexing phase. The total time spent for each processor is dumped in the analyzer log at the end of the job.
    semanticPipeTimeout int CPU-time limit for the processing of a text chunk by the semantic pipe, in seconds.
    slowDocumentWarningTimeUS long 5000000 If the processing of a document takes longer than this time, a message is printed in the analyzer log. A value of 0 disables the warning feature.
    semanticProcessorsProfiling boolean Logs the CPU time spent for each semantic processor. The total time spent for each processor is dumped in the analyzer log at the end of the job. Warning: This feature strongly impacts performance; enable it only if required.
  • Nested elements:
    Name Type Description
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition
    DocumentProcessor com.exalead.indexing.analysis.v10.DocumentProcessor*
    FilteringConfiguration com.exalead.indexing.analysis.v10.FilteringConfiguration
    LanguageConfiguration com.exalead.indexing.analysis.v10.LanguageConfiguration*
    MappingConfiguration com.exalead.indexing.analysis.v10.MappingConfiguration
    SemanticProcessor com.exalead.indexing.analysis.v10.SemanticProcessor*
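
The pipeline selection described above (test each pipeline's accept condition in order, use the first match, skip the document if none matches) can be sketched as follows. This is only an illustration in Python, not the product's implementation: the Pipeline class, the select_pipeline function, and the dict-based document are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical stand-ins: a document is a plain dict, and an accept
# condition is any callable returning True when the document matches.
Document = dict
Condition = Callable[[Document], bool]


@dataclass
class Pipeline:
    name: str
    accept: Condition


def select_pipeline(pipelines: Sequence[Pipeline], doc: Document) -> Optional[Pipeline]:
    """Return the first pipeline whose accept condition matches, else None."""
    for pipeline in pipelines:
        if pipeline.accept(doc):
            return pipeline
    # Refused by every pipeline: the document is neither processed nor indexed.
    return None


pipelines = [
    Pipeline("web", lambda d: d.get("source") == "web"),
    Pipeline("files", lambda d: d.get("source") == "files"),
]
```

With these two pipelines, a document whose source is "files" is routed to the second pipeline, and a document with any other source is refused by both.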

AndCondition

  • com.exalead.indexing.analysis.v10.AndCondition
  • AndCondition matches if all child AcceptCondition elements match.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
    • com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
    • com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
    • com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
    • com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
    • com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
    • com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
    • com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
    • com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
    • com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
    • com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
    • com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
    • com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
    • com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
    • com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
    • com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
    • com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
    • com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
    • com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
    • com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
    • com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
    • com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
    • com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
    • com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
    • com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
    • com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
    • com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
    • com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
    • com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
    • com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
    • com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
    • com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
    • com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
    • com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
    • com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
    • com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
    • com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
    • com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
    • com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
    • com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
    • com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
    • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
    • com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
    • com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
    • com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
    • com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
    • com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
    • com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
    • com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
    • com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
    • com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
    • com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
    • com.exalead.indexing.analysis.v10.StringHash (as StringHash)
    • com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
    • com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
    • com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
    • com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
    • com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
    • com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
    • com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
    • com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
    • com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
    • com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
    • com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
    • com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
    • com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
    • com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
  • Nested elements:
    Name Type Description
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition*
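
As a rough sketch of the AndCondition semantics, the following Python models a condition as a callable on a dict-based document. The and_condition helper and the sample conditions are hypothetical and only illustrate the "all children must match" rule. The reference does not say what happens when there are no children; the sketch simply inherits Python's all() behavior, which is an assumption.

```python
from typing import Callable, Sequence

# Hypothetical model: a condition is a callable on a dict-based document.
Document = dict
Condition = Callable[[Document], bool]


def and_condition(children: Sequence[Condition]) -> Condition:
    """Build a condition that matches only if every child matches."""
    def matches(doc: Document) -> bool:
        # Assumption: with no children, all() returns True; the reference
        # does not specify this case.
        return all(child(doc) for child in children)
    return matches


# Sample child conditions, invented for the example.
from_web = lambda doc: doc.get("source") == "web"
in_english = lambda doc: doc.get("language") == "en"

web_and_english = and_condition([from_web, in_english])
```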

OrCondition

  • com.exalead.indexing.analysis.v10.OrCondition
  • OrCondition matches if at least one child AcceptCondition matches.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
    • com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
    • com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
    • com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
    • com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
    • com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
    • com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
    • com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
    • com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
    • com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
    • com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
    • com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
    • com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
    • com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
    • com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
    • com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
    • com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
    • com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
    • com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
    • com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
    • com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
    • com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
    • com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
    • com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
    • com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
    • com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
    • com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
    • com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
    • com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
    • com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
    • com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
    • com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
    • com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
    • com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
    • com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
    • com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
    • com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
    • com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
    • com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
    • com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
    • com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
    • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
    • com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
    • com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
    • com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
    • com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
    • com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
    • com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
    • com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
    • com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
    • com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
    • com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
    • com.exalead.indexing.analysis.v10.StringHash (as StringHash)
    • com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
    • com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
    • com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
    • com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
    • com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
    • com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
    • com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
    • com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
    • com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
    • com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
    • com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
    • com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
    • com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
    • com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
  • Nested elements:
    Name Type Description
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition*
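
The OrCondition semantics can be sketched in the same spirit. The or_condition helper and the dict-based document are hypothetical names for illustration only. The behavior with no children is not stated in this reference; here it follows Python's any(), which is an assumption.

```python
from typing import Callable, Sequence

# Hypothetical model: a condition is a callable on a dict-based document.
Document = dict
Condition = Callable[[Document], bool]


def or_condition(children: Sequence[Condition]) -> Condition:
    """Build a condition that matches as soon as one child matches."""
    def matches(doc: Document) -> bool:
        # any() stops at the first child that matches.
        # Assumption: with no children, any() returns False.
        return any(child(doc) for child in children)
    return matches


# Sample child conditions, invented for the example.
from_web = lambda doc: doc.get("source") == "web"
from_files = lambda doc: doc.get("source") == "files"

web_or_files = or_condition([from_web, from_files])
```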

NotCondition

  • com.exalead.indexing.analysis.v10.NotCondition
  • Matches if the child condition does not match. If there is no child condition (null), this condition never matches.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
    • com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
    • com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
    • com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
    • com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
    • com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
    • com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
    • com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
    • com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
    • com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
    • com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
    • com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
    • com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
    • com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
    • com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
    • com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
    • com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
    • com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
    • com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
    • com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
    • com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
    • com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
    • com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
    • com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
    • com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
    • com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
    • com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
    • com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
    • com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
    • com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
    • com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
    • com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
    • com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
    • com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
    • com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
    • com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
    • com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
    • com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
    • com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
    • com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
    • com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
    • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
    • com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
    • com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
    • com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
    • com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
    • com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
    • com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
    • com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
    • com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
    • com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
    • com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
    • com.exalead.indexing.analysis.v10.StringHash (as StringHash)
    • com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
    • com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
    • com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
    • com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
    • com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
    • com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
    • com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
    • com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
    • com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
    • com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
    • com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
    • com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
    • com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
    • com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
  • Nested elements:
    Name Type Description
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition
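
The two rules stated for NotCondition (negate the child, and never match when there is no child) can be illustrated with a small Python sketch. The not_condition helper and the dict-based document are hypothetical and do not reflect the product's actual API.

```python
from typing import Callable, Optional

# Hypothetical model: a condition is a callable on a dict-based document.
Document = dict
Condition = Callable[[Document], bool]


def not_condition(child: Optional[Condition]) -> Condition:
    """Build a condition that matches when the child does not match."""
    def matches(doc: Document) -> bool:
        if child is None:
            # No child condition (null): this condition never matches.
            return False
        return not child(doc)
    return matches


# Sample child condition, invented for the example.
from_web = lambda doc: doc.get("source") == "web"

not_web = not_condition(from_web)
never = not_condition(None)
```

Note that a missing child does not behave like "not false": the condition built from None rejects every document.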

SourceCondition

  • com.exalead.indexing.analysis.v10.SourceCondition
  • SourceCondition matches if the source of the document matches 'source'.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
    • com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
    • com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
    • com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
    • com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
    • com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
    • com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
    • com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
    • com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
    • com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
    • com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
    • com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
    • com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
    • com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
    • com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
    • com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
    • com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
    • com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
    • com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
    • com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
    • com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
    • com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
    • com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
    • com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
    • com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
    • com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
    • com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
    • com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
    • com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
    • com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
    • com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
    • com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
    • com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
    • com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
    • com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
    • com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
    • com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
    • com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
    • com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
    • com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
    • com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
    • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
    • com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
    • com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
    • com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
    • com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
    • com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
    • com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
    • com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
    • com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
    • com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
    • com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
    • com.exalead.indexing.analysis.v10.StringHash (as StringHash)
    • com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
    • com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
    • com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
    • com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
    • com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
    • com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
    • com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
    • com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
    • com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
    • com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
    • com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
    • com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
    • com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
    • com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
  • Attributes:
    Name Type Default value Description
    source string Value to check against the 'source' of the document.

BuildGroupCondition

  • com.exalead.indexing.analysis.v10.BuildGroupCondition
  • BuildGroupCondition matches if the current buildgroup matches 'name'.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
    • com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
    • com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
    • com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
    • com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
    • com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
    • com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
    • com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
    • com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
    • com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
    • com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
    • com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
    • com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
    • com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
    • com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
    • com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
    • com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
    • com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
    • com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
    • com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
    • com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
    • com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
    • com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
    • com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
    • com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
    • com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
    • com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
    • com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
    • com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
    • com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
    • com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
    • com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
    • com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
    • com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
    • com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
    • com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
    • com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
    • com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
    • com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
    • com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
    • com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
    • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
    • com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
    • com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
    • com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
    • com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
    • com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
    • com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
    • com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
    • com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
    • com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
    • com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
    • com.exalead.indexing.analysis.v10.StringHash (as StringHash)
    • com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
    • com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
    • com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
    • com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
    • com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
    • com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
    • com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
    • com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
    • com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
    • com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
    • com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
    • com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
    • com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
    • com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
  • Attributes:
    Name Type Default value Description
    name string Value to check against the "buildgroup" of the document.

MetaCondition

  • com.exalead.indexing.analysis.v10.MetaCondition
  • MetaCondition matches if the Document contains a DocumentChunk whose meta name and value match the specified condition.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
    • com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
    • com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
    • com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
    • com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
    • com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
    • com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
    • com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
    • com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
    • com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
    • com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
    • com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
    • com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
    • com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
    • com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
    • com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
    • com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
    • com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
    • com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
    • com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
    • com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
    • com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
    • com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
    • com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
    • com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
    • com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
    • com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
    • com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
    • com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
    • com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
    • com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
    • com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
    • com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
    • com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
    • com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
    • com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
    • com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
    • com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
    • com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
    • com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
    • com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
    • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
    • com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
    • com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
    • com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
    • com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
    • com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
    • com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
    • com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
    • com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
    • com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
    • com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
    • com.exalead.indexing.analysis.v10.StringHash (as StringHash)
    • com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
    • com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
    • com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
    • com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
    • com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
    • com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
    • com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
    • com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
    • com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
    • com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
    • com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
    • com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
    • com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
    • com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
  • Attributes:
    Name Type Default value Description
    name string Name of the meta against which to check.
    nameMode enum(equals, matches) equals Meta name test mode:
    • "equals": Evaluates the DocumentChunk with a name equal to the specified one.
    • "matches": Evaluates the DocumentChunk with a name matching the specified regular expression. The match is case insensitive.
    valueMode enum(equals, contains, exists, matches) exists Value test mode:
    • "exists": Matches if a DocumentChunk passes the name condition.
    • "equals": Matches if a DocumentChunk passes the name condition and its textual content is equal to the 'value' attribute.
    • "contains": Matches if a DocumentChunk passes the name condition and its textual content contains 'value' (pure string matching, without tokenization).
    • "matches": Matches if a DocumentChunk passes the name condition and its textual content matches the regular expression specified by the 'value' attribute. The match is case insensitive.
    value string The string to check against the value of DocumentChunks.
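
The interaction between 'nameMode' and 'valueMode' can be sketched in Python. This is an illustration of the semantics described above, not the product's implementation: the Chunk class and the meta_condition function are hypothetical names, Python's re module stands in for the product's regexp engine, and the sketch assumes that "matches" may match anywhere in the string and that "equals" and "contains" are exact, case-sensitive comparisons (the reference does not state either point).

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a DocumentChunk: a meta name and its text."""
    name: str
    text: str


def meta_condition(chunks, name, name_mode="equals", value_mode="exists", value=""):
    """Return True if at least one chunk passes the name test and the value test."""

    def name_ok(chunk):
        if name_mode == "equals":
            return chunk.name == name
        # "matches": case-insensitive regular expression on the meta name.
        # Assumption: the pattern may match anywhere in the name.
        return re.search(name, chunk.name, re.IGNORECASE) is not None

    def value_ok(chunk):
        if value_mode == "exists":
            return True  # passing the name test is enough
        if value_mode == "equals":
            return chunk.text == value
        if value_mode == "contains":
            return value in chunk.text  # plain substring test, no tokenization
        # "matches": case-insensitive regular expression on the textual content
        return re.search(value, chunk.text, re.IGNORECASE) is not None

    return any(name_ok(c) and value_ok(c) for c in chunks)


chunks = [Chunk("title", "Annual Report 2020"), Chunk("author", "Jane")]
print(meta_condition(chunks, "title"))
print(meta_condition(chunks, "title", value_mode="contains", value="Report"))
print(meta_condition(chunks, "tit.*", name_mode="matches",
                     value_mode="matches", value="annual"))
```

With the default 'valueMode' ("exists"), the 'value' attribute plays no role: the condition only checks that a DocumentChunk with a suitable name is present.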

MimeCondition

  • com.exalead.indexing.analysis.v10.MimeCondition
  • A condition that matches if the MIME type of the FIRST document part is in the list. Note: Conditions work on documents, but MIME types are set per document part. MimeCondition only tests the MIME type of the first part, if present.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
    • com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
    • com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
    • com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
    • com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
    • com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
    • com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
    • com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
    • com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
    • com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
    • com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
    • com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
    • com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
    • com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
    • com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
    • com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
    • com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
    • com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
    • com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
    • com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
    • com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
    • com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
    • com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
    • com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
    • com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
    • com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
    • com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
    • com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
    • com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
    • com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
    • com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
    • com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
    • com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
    • com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
    • com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
    • com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
    • com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
    • com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
    • com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
    • com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
    • com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
    • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
    • com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
    • com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
    • com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
    • com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
    • com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
    • com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
    • com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
    • com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
    • com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
    • com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
    • com.exalead.indexing.analysis.v10.StringHash (as StringHash)
    • com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
    • com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
    • com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
    • com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
    • com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
    • com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
    • com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
    • com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
    • com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
    • com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
    • com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
    • com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
    • com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
    • com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
  • Nested elements:
    Name Type Description
    mimes exa.bee.StringValue*
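
The "first part only" rule can be sketched in Python. The mime_condition function and the representation of a document as an ordered list of MIME types are hypothetical. The sketch also assumes that a document with no part does not match, which the reference does not state.

```python
def mime_condition(part_mimes, accepted_mimes):
    """Test only the MIME type of the first document part.

    part_mimes: MIME types of the document parts, in order (hypothetical model).
    accepted_mimes: values listed in the 'mimes' nested element.
    """
    if not part_mimes:
        # Assumption: with no part there is nothing to test, so no match.
        return False
    return part_mimes[0] in accepted_mimes


# A PDF attached as the second part is ignored: only "text/html" is tested.
print(mime_condition(["text/html", "application/pdf"], ["application/pdf"]))
print(mime_condition(["text/html", "application/pdf"], ["text/html"]))
```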

URLMatchCondition

  • com.exalead.indexing.analysis.v10.URLMatchCondition
  • A condition that matches if the URI matches the regexp.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
    • com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
    • com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
    • com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
    • com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
    • com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
    • com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
    • com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
    • com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
    • com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
    • com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
    • com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
    • com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
    • com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
    • com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
    • com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
    • com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
    • com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
    • com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
    • com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
    • com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
    • com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
    • com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
    • com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
    • com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
    • com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
    • com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
    • com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
    • com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
    • com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
    • com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
    • com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
    • com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
    • com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
    • com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
    • com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
    • com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
    • com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
    • com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
    • com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
    • com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
    • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
    • com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
    • com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
    • com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
    • com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
    • com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
    • com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
    • com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
    • com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
    • com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
    • com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
    • com.exalead.indexing.analysis.v10.StringHash (as StringHash)
    • com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
    • com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
    • com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
    • com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
    • com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
    • com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
    • com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
    • com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
    • com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
    • com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
    • com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
    • com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
    • com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
    • com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
  • Attributes:
    Name Type Default value Description
    regexp string The regexp. Note: It is not anchored by default; for example, use '.*\.asp' to match .asp URIs.
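
The effect of anchoring on a pattern such as '.*\.asp' can be illustrated with Python's re module. This is only a stand-in for the product's regexp engine, and the helper names and the example URI are hypothetical. The sketch does not show which behavior the product applies. It only shows why padding the pattern with '.*' is the safe choice: the padded pattern matches in both cases.

```python
import re


def anchored(pattern, uri):
    """The whole URI must be covered by the pattern."""
    return re.fullmatch(pattern, uri) is not None


def unanchored(pattern, uri):
    """The pattern may match anywhere inside the URI."""
    return re.search(pattern, uri) is not None


uri = "http://example.com/page.asp"  # hypothetical URI

# The padded pattern matches whether or not the engine anchors it.
print(anchored(r".*\.asp", uri), unanchored(r".*\.asp", uri))

# The bare pattern matches only when the engine does not anchor it.
print(anchored(r"\.asp", uri), unanchored(r"\.asp", uri))
```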

FilenameMatchCondition

  • com.exalead.indexing.analysis.v10.FilenameMatchCondition
  • A condition that matches if the filename of the FIRST document part matches the regexp. Note: Conditions work on documents, but filenames are set per document part. FilenameMatchCondition only tests the filename of the first part, if present.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
    • com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
    • com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
    • com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
    • com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
    • com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
    • com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
    • com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
    • com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
    • com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
    • com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
    • com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
    • com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
    • com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
    • com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
    • com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
    • com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
    • com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
    • com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
    • com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
    • com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
    • com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
    • com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
    • com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
    • com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
    • com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
    • com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
    • com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
    • com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
    • com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
    • com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
    • com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
    • com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
    • com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
    • com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
    • com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
    • com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
    • com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
    • com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
    • com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
    • com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
    • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
    • com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
    • com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
    • com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
    • com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
    • com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
    • com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
    • com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
    • com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
    • com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
    • com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
    • com.exalead.indexing.analysis.v10.StringHash (as StringHash)
    • com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
    • com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
    • com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
    • com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
    • com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
    • com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
    • com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
    • com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
    • com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
    • com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
    • com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
    • com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
    • com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
    • com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
  • Attributes:
    Name Type Default value Description
    regexp string The regexp. Note: It is not anchored by default; for example, use '.*\.doc' to match .doc files.

BinaryContentCondition

  • com.exalead.indexing.analysis.v10.BinaryContentCondition
  • A condition that matches if the binary content of the FIRST document part matches the binary string. Note: Conditions apply to the whole document, but content is set per document part. BinaryContentCondition only tests the binary content of the first part, if present.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
    • com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
    • com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
    • com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
    • com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
    • com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
    • com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
    • com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
    • com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
    • com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
    • com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
    • com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
    • com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
    • com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
    • com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
    • com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
    • com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
    • com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
    • com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
    • com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
    • com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
    • com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
    • com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
    • com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
    • com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
    • com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
    • com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
    • com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
    • com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
    • com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
    • com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
    • com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
    • com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
    • com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
    • com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
    • com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
    • com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
    • com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
    • com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
    • com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
    • com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
    • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
    • com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
    • com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
    • com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
    • com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
    • com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
    • com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
    • com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
    • com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
    • com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
    • com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
    • com.exalead.indexing.analysis.v10.StringHash (as StringHash)
    • com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
    • com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
    • com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
    • com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
    • com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
    • com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
    • com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
    • com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
    • com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
    • com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
    • com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
    • com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
    • com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
    • com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
  • Attributes:
    Name Type Default value Description
    offset int Offset, in bytes, of the binary data to be compared (0 for the beginning of the file). Negative values are taken as offsets from the end of the file (-1 for the last byte).
    match string Binary string to be compared. The string may contain any ASCII (7-bit) character, or the following '\' escape sequences:
    • \xNN A hexadecimal-encoded character (each N is one of '0'..'9' or 'A'..'F')
    • \NNN An octal-encoded character (each N is one of '0'..'7')
    • \n Character 10
    • \r Character 13
    • \\ Character '\'
    • \" Character '"'
    • \? Any character
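
The escape sequences and the offset rule above can be illustrated with a short Python sketch. It is not the product's implementation: the helper names parse_match and matches are hypothetical, the sketch assumes that octal escapes use the digits 0-7, and it also accepts lowercase hexadecimal digits.

```python
import re

# One escape sequence: \xNN, \NNN (octal), \n, \r, \\, \" or \?
_ESCAPE = re.compile(r'\\(x[0-9A-Fa-f]{2}|[0-7]{3}|[nr\\"?])')

def parse_match(spec):
    """Turn a 'match' string into a list of byte values.

    None stands for the wildcard \\? (any character).
    """
    pattern = []
    pos = 0
    while pos < len(spec):
        m = _ESCAPE.match(spec, pos)
        if m is None:
            # Plain ASCII character, taken literally.
            pattern.append(ord(spec[pos]))
            pos += 1
            continue
        token = m.group(1)
        if token[0] == "x":
            pattern.append(int(token[1:], 16))
        elif token[0].isdigit():
            pattern.append(int(token, 8))
        elif token == "n":
            pattern.append(10)
        elif token == "r":
            pattern.append(13)
        elif token == "?":
            pattern.append(None)
        else:
            # Escaped backslash or double quote.
            pattern.append(ord(token))
        pos = m.end()
    return pattern

def matches(data, offset, spec):
    """Compare 'spec' with 'data' at 'offset'; negative offsets count from the end."""
    pattern = parse_match(spec)
    start = offset if offset >= 0 else len(data) + offset
    if start < 0 or start + len(pattern) > len(data):
        return False
    return all(p is None or p == b for p, b in zip(pattern, data[start:]))
```

For instance, under these assumptions, matches(data, 0, "%PDF") tests whether the first part starts with the four bytes of a PDF signature, and matches(data, -1, r"\n") tests whether its last byte is a line feed.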

DataModelClassCondition

  • com.exalead.indexing.analysis.v10.DataModelClassCondition
  • A condition that matches if the document belongs to the specified DataModel class.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
    • com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
    • com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
    • com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
    • com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
    • com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
    • com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
    • com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
    • com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
    • com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
    • com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
    • com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
    • com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
    • com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
    • com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
    • com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
    • com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
    • com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
    • com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
    • com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
    • com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
    • com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
    • com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
    • com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
    • com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
    • com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
    • com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
    • com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
    • com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
    • com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
    • com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
    • com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
    • com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
    • com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
    • com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
    • com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
    • com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
    • com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
    • com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
    • com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
    • com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
    • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
    • com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
    • com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
    • com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
    • com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
    • com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
    • com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
    • com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
    • com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
    • com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
    • com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
    • com.exalead.indexing.analysis.v10.StringHash (as StringHash)
    • com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
    • com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
    • com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
    • com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
    • com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
    • com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
    • com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
    • com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
    • com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
    • com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
    • com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
    • com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
    • com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
    • com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
  • Attributes:
    Name Type Default value Description
    className string The restricted DataModel class

CustomDirectiveCondition

  • com.exalead.indexing.analysis.v10.CustomDirectiveCondition
  • A condition that matches if the document has the specified directive name, with an optional specific value.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
    • com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
    • com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
    • com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
    • com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
    • com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
    • com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
    • com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
    • com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
    • com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
    • com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
    • com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
    • com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
    • com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
    • com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
    • com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
    • com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
    • com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
    • com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
    • com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
    • com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
    • com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
    • com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
    • com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
    • com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
    • com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
    • com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
    • com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
    • com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
    • com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
    • com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
    • com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
    • com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
    • com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
    • com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
    • com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
    • com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
    • com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
    • com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
    • com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
    • com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
    • com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
    • com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
    • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
    • com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
    • com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
    • com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
    • com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
    • com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
    • com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
    • com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
    • com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
    • com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
    • com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
    • com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
    • com.exalead.indexing.analysis.v10.StringHash (as StringHash)
    • com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
    • com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
    • com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
    • com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
    • com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
    • com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
    • com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
    • com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
    • com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
    • com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
    • com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
    • com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
    • com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
    • com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
  • Attributes:
    Name Type Default value Description
    directiveName string The expected directive name
    directiveValue string An optional expected value for the given directive
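
As an illustration of the matching rule described above, here is a minimal Python sketch. It assumes that the directives of a document can be modeled as a name-to-value dictionary, which is an assumption of this example and not something this reference states; the function name is hypothetical.

```python
def directive_condition(directives, directive_name, directive_value=None):
    """Return True if the directive is present and, when a value is given, equal to it."""
    if directive_name not in directives:
        return False
    # No expected value: the presence of the directive is enough.
    return directive_value is None or directives[directive_name] == directive_value
```

With directives = {"myDirective": "on"}, the condition holds for directiveName "myDirective" alone and for directiveValue "on", but not for directiveValue "off".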

LanguageDetector

  • com.exalead.indexing.analysis.v10.LanguageDetector
  • Language detection is performed on the text of all DocumentChunks associated with the specified input ContextNames whose language was not already detected or specified. A statistical algorithm takes the whole text of these DocumentChunks into account to detect the language, which is then set as the language of all specified chunks. The language attribute of a DocumentChunk is used, for example, by semantic processing. A language is represented by its ISO 639-1 code, for example fr or en.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor.
    languageContext string If this is not null and if there is a DocumentChunk with a ContextName matching 'languageContext':
    • no automatic detection will be performed,
    • the language specified will be used as the language of the DocumentChunks associated with the ContextNames specified as input.
    languagesToDetect string If not null, restricts the language detector to a set of languages. If you only have a small set of languages to detect, restricting the language detector to this set improves precision. The list is comma-separated, for example "en,fr".
    defaultLanguage string If not null, 'defaultLanguage' will be used as the default language when automatic detection fails.
    exclude boolean If true, "inputContexts" is an exclude list instead of an include list. Language detection is then performed on all DocumentChunks except those whose ContextName appears in 'inputContexts'.
    outputContext string ContextName of the DocumentChunk to create. It will contain the language detected in the processed DocumentChunks as defined in ISO 639-1.
    minLangPercentage int 33 Minimum percentage (0 to 100) that a language must reach to be detected (0 = always keeps a detected language).
    languagesToKeep int Keeps the n most represented languages in the document. A value of 0 lets minLangPercentage select the languages.
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
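
The interaction between minLangPercentage, languagesToKeep, and defaultLanguage can be sketched as follows. This is only one plausible reading of the attribute descriptions above, not the detector's actual algorithm: the function name and the input (a dictionary mapping each candidate language to its percentage of the text) are hypothetical, and treating "nothing kept" as a detection failure is an assumption.

```python
def select_languages(shares, min_lang_percentage=33, languages_to_keep=0,
                     default_language=None):
    """Pick languages from 'shares', a dict of ISO 639-1 code -> percentage (0-100)."""
    # Rank languages from most to least represented.
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    if languages_to_keep > 0:
        # Keep the n most represented languages.
        kept = [lang for lang, _ in ranked[:languages_to_keep]]
    else:
        # A value of 0 lets the percentage threshold select the languages.
        kept = [lang for lang, pct in ranked if pct >= min_lang_percentage]
    if not kept and default_language is not None:
        # Assumed fallback when nothing was selected.
        kept = [default_language]
    return kept
```

With shares of 60 for en, 35 for fr, and 5 for de, the default threshold of 33 keeps en and fr, whereas languages_to_keep=1 keeps only en.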

LanguageSetter

  • com.exalead.indexing.analysis.v10.LanguageSetter
  • Sets the specified language as the language of all DocumentChunks associated with the specified input ContextNames. The language attribute of a DocumentChunk is used, for example, by semantic processing. The language is represented by its ISO 639-1 code, for example fr or en.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor.
    language iso code Language specified by ISO 639-1 code.
    outputContext string ContextName of the DocumentChunk to create. It will contain the language name as defined in ISO 639-1.
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

ContentCleanup

  • com.exalead.indexing.analysis.v10.ContentCleanup
  • Analyzes each DocumentChunk and removes whitespace, where whitespace is defined by the Unicode specification and includes ' ', '\r', and '\n'. Input: all DocumentChunks associated with the ContextNames specified in 'inputContexts'. Output: same as input.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    duplicateWhitespaces boolean Removes duplicate whitespaces ('  ' -> ' ').
    leading boolean Removes leading whitespaces.
    trailing boolean Removes trailing whitespaces.
    spaces boolean Removes *all* whitespaces.
    stripHTML boolean Strips HTML tags.
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
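
The whitespace options above can be illustrated with a short Python sketch. This is not the product's implementation: the function name, the order in which the flags are applied, and the choice to collapse a run of whitespace into a single space are all assumptions made for illustration. The stripHTML option is left out.

```python
import re

def content_cleanup(text, duplicate_whitespaces=False, leading=False,
                    trailing=False, spaces=False):
    """Hypothetical model of the ContentCleanup whitespace flags."""
    if spaces:
        # 'spaces' removes every whitespace character, so other flags are moot.
        return re.sub(r"\s+", "", text)
    if duplicate_whitespaces:
        # Assumption: a run of two or more whitespace characters becomes one space.
        text = re.sub(r"\s{2,}", " ", text)
    if leading:
        text = text.lstrip()  # str.lstrip() strips Unicode whitespace
    if trailing:
        text = text.rstrip()
    return text

cleaned = content_cleanup("  a  b \n", duplicate_whitespaces=True,
                          leading=True, trailing=True)
print(repr(cleaned))
```

With no flag set, the text is returned unchanged, which mirrors a processor whose options are all disabled.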

ValueSelector

  • com.exalead.indexing.analysis.v10.ValueSelector
  • Takes the input contexts in the specified order, and as soon as one is found, it copies the content to the output context and stops.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string ContextName to be associated with the DocumentChunk created for each selection.
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
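
The selection rule described for ValueSelector (try the input contexts in order, copy the first one found, then stop) can be sketched as follows. The representation of chunks as (ContextName, text) pairs and the function name are assumptions for illustration only.

```python
def select_value(chunks, input_contexts, output_context):
    """Hypothetical model of ValueSelector.

    chunks is a list of (context_name, text) pairs. Returns a new list with
    at most one extra chunk named output_context.
    """
    for context in input_contexts:          # contexts are tried in the given order
        for name, text in chunks:
            if name == context:
                # First match wins: copy its content and stop.
                return chunks + [(output_context, text)]
    return list(chunks)                     # no input context was present

doc = [("filename", "report.pdf"), ("title", "Annual report")]
print(select_value(doc, ["title", "filename"], "display_title")[-1])
```

Listing "title" before "filename" makes the title the preferred value, with the file name used only as a fallback.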

UTF8Checker

  • com.exalead.indexing.analysis.v10.UTF8Checker
  • Checks that the text passing through is valid UTF-8. Emits a warning with the document URI and the context name if input is malformed. Optionally deletes invalid chunks.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    deleteInvalidChunks boolean Removes invalid chunks from documents.
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
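
As an illustration of the check described for UTF8Checker, the sketch below validates raw bytes with Python's strict UTF-8 decoder. The chunk representation, the function name, and the wording of the warning are assumptions, not the product's behavior.

```python
import logging

def check_utf8(uri, chunks, delete_invalid_chunks=False):
    """Hypothetical model of UTF8Checker.

    chunks is a list of (context_name, raw_bytes) pairs.
    """
    kept = []
    for context, raw in chunks:
        try:
            raw.decode("utf-8")             # strict decoding rejects malformed input
            valid = True
        except UnicodeDecodeError:
            valid = False
            logging.warning("Malformed UTF-8 in %s (context %s)", uri, context)
        if valid or not delete_invalid_chunks:
            kept.append((context, raw))     # invalid chunks are kept unless deletion is on
    return kept

chunks = [("title", "é".encode("utf-8")), ("text", b"\xff\xfe")]
print(len(check_utf8("doc://1", chunks, delete_invalid_chunks=True)))
```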

ConcatValues

  • com.exalead.indexing.analysis.v10.ConcatValues
  • Concatenates all textual content of DocumentChunks where ContextName matches 'inputContexts', and joins them with the 'join' string. A single DocumentChunk with ContextName 'outputContext' is created as an output.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string ContextName to be associated with the DocumentChunk created for each concatenated value.
    join string Optional string inserted between concatenated values.
    strict boolean True If true, all the input contexts must be found for the concatenation to be generated.
    allowDuplicates boolean True If true, and if there are multiple DocumentChunks with the same ContextName, it concatenates them all. If false, only the first DocumentChunk among all those with the same ContextName is kept.
    cartesianProduct boolean If there are multiple DocumentChunks with the same ContextName, it generates the cartesian product between all values.
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
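
The interplay of 'join', 'allowDuplicates', and 'cartesianProduct' can be pictured with the Python sketch below. It is a simplified model under stated assumptions: values are grouped in the order of 'inputContexts', 'cartesianProduct' takes precedence over 'allowDuplicates', and the 'strict' option is not modeled. The function name and chunk representation are hypothetical.

```python
from itertools import product

def concat_values(chunks, input_contexts, join="",
                  allow_duplicates=True, cartesian_product=False):
    """Hypothetical model of ConcatValues; returns the list of output strings."""
    per_context = []
    for context in input_contexts:
        values = [text for name, text in chunks if name == context]
        if values:
            per_context.append(values)
    if not per_context:
        return []
    if cartesian_product:
        # One output per combination of one value from each context.
        return [join.join(combo) for combo in product(*per_context)]
    if not allow_duplicates:
        # Keep only the first chunk of each ContextName.
        per_context = [values[:1] for values in per_context]
    flat = [value for values in per_context for value in values]
    return [join.join(flat)]

doc = [("first", "Ada"), ("last", "Lovelace"), ("last", "King")]
print(concat_values(doc, ["first", "last"], join=" "))
print(concat_values(doc, ["first", "last"], join=" ", cartesian_product=True))
```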

RemoveContexts

  • com.exalead.indexing.analysis.v10.RemoveContexts
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

MultiContextCSVEncoder

  • com.exalead.indexing.analysis.v10.MultiContextCSVEncoder
  • Creates a DocumentChunk containing the ContextName and the textual value of the DocumentChunks matching 'inputContexts'. This processor can be used, for instance, to store arbitrary (key,value) pairs into one single index field. Note that this storage method is inefficient and should be used with caution.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string The ContextName used for newly created chunks.
    processUnmappedContexts boolean All DocumentChunks with an unmapped ContextName in the document will be used for input. This can be used to emulate the 'default meta' and 'content' field feature of CloudView 4.6.
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

StringHash

  • com.exalead.indexing.analysis.v10.StringHash
  • The StringHash processor computes a signed hash of the textual input value. For example, this value can be used in a field used for grouping.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    nbBits int 64 The size of the hash, in bits, including the sign bit. The hash values will be in [-2^(nbBits-1); 2^(nbBits-1) - 1].
    outputContext string The ContextName used for the newly created chunk.
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
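
The range given for 'nbBits' follows from reading an nbBits-wide value as a two's complement signed integer. The sketch below shows that arithmetic only. The hash function actually used by the processor is not documented here, so SHA-256 is a stand-in chosen for illustration, and the function name is hypothetical.

```python
import hashlib

def signed_hash(text, nb_bits=64):
    """Hypothetical signed hash on nb_bits bits (SHA-256 is an assumption)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Keep the low nb_bits bits of the digest as an unsigned integer.
    unsigned = int.from_bytes(digest, "big") & ((1 << nb_bits) - 1)
    # Two's complement: values with the top bit set are negative.
    if unsigned >= 1 << (nb_bits - 1):
        return unsigned - (1 << nb_bits)
    return unsigned

for bits in (32, 64):
    value = signed_hash("same group key", bits)
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    print(bits, low <= value <= high)
```

Equal input strings always give equal hash values, which is the property that makes such a value usable for grouping.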

StringHash64

  • com.exalead.indexing.analysis.v10.StringHash64
  • The StringHash64 processor computes a signed hash of the textual input value on 64 bits. For example, this value can be used in a field used for grouping.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string The ContextName used for the newly created chunk.
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

StringHash32

  • com.exalead.indexing.analysis.v10.StringHash32
  • The StringHash32 processor computes a signed hash of the textual input value on 32 bits. For example, this value can be used in a field used for grouping.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string The ContextName used for the newly created chunk.
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

NumericalFormatter

  • com.exalead.indexing.analysis.v10.NumericalFormatter
  • The Numerical Formatter processor creates valid numerical chunks from various number formats.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string The ContextName used for the newly created chunk. If null, it uses the same name as the input.
    precision int Number of digits relevant in the decimal part.
    round int Rounds the integer part with this range.
    removeTrailingZeros boolean True Removes the trailing zeros in the decimal part.
    groupSeparator string String used as the group separator.
    decimalSeparator string . String used as the decimal separator.
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
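
To make the separator and precision attributes more concrete, here is a minimal Python sketch of one plausible reading: the separators describe the input text, and the output uses '.' as the decimal mark. That reading, the half-up rounding mode, and the function name are assumptions. The 'round' attribute is not modeled.

```python
from decimal import Decimal, ROUND_HALF_UP

def format_number(text, group_separator="", decimal_separator=".",
                  precision=None, remove_trailing_zeros=True):
    """Hypothetical model of NumericalFormatter."""
    cleaned = text.strip()
    if group_separator:
        cleaned = cleaned.replace(group_separator, "")   # drop group separators first
    if decimal_separator != ".":
        cleaned = cleaned.replace(decimal_separator, ".")
    value = Decimal(cleaned)
    if precision is not None:
        # Keep 'precision' digits in the decimal part (rounding mode is an assumption).
        value = value.quantize(Decimal(1).scaleb(-precision), rounding=ROUND_HALF_UP)
    result = format(value, "f")
    if remove_trailing_zeros and "." in result:
        result = result.rstrip("0").rstrip(".")
    return result

print(format_number("1 234,50", group_separator=" ", decimal_separator=","))
```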

CoordinatesFormatter

  • com.exalead.indexing.analysis.v10.CoordinatesFormatter
  • The Coordinates Formatter processor creates a normalized chunk for the latitude and longitude.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string The ContextName used for the newly created chunk.
    latitudeContext string The ContextName used as input for the latitude.
    latitudeFormat enum(DMS, Decimal) The input format for the latitude. Value can be one of:
    • DMS
    • Decimal
    longitudeContext string The ContextName used as input for the longitude.
    longitudeFormat enum(DMS, Decimal) The input format for the longitude. Value can be one of:
    • DMS
    • Decimal
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
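
The difference between the DMS and Decimal input formats comes down to the standard conversion degrees + minutes/60 + seconds/3600, negated for the southern and western hemispheres. The sketch below applies it. The accepted DMS spelling (for example 48°51'24"N), the "latitude,longitude" output layout with six decimals, and the function names are assumptions for illustration.

```python
import re

def dms_to_decimal(text):
    """Convert a degrees/minutes/seconds string to decimal degrees."""
    numbers = re.findall(r"\d+(?:\.\d+)?", text)
    if len(numbers) != 3:
        raise ValueError(f"expected degrees, minutes and seconds in {text!r}")
    degrees, minutes, seconds = (float(n) for n in numbers)
    value = degrees + minutes / 60 + seconds / 3600
    # Assumption: a trailing S or W marks a negative coordinate.
    if text.strip()[-1].upper() in ("S", "W"):
        value = -value
    return value

def format_coordinates(latitude, latitude_format, longitude, longitude_format):
    """Hypothetical normalized chunk: 'lat,lon' in decimal degrees."""
    lat = dms_to_decimal(latitude) if latitude_format == "DMS" else float(latitude)
    lon = dms_to_decimal(longitude) if longitude_format == "DMS" else float(longitude)
    return f"{lat:.6f},{lon:.6f}"

print(format_coordinates("48°51'24\"N", "DMS", "2.350833", "Decimal"))
```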

DebugProcessor

  • com.exalead.indexing.analysis.v10.DebugProcessor
  • Dumps all the DocumentChunks whose ContextName is listed in 'inputContexts' to standard output. This provides a log of the 'Analysis' process.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    dump boolean True
    outputContext string The ContextName used for the newly created chunk.
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

RemoteMOTAPIDocumentProcessor

  • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor
  • The processing of each input context is handled by the targeted remote API. Parameters: targetBuildGroups, the list of build groups that should be used to handle processing; remoteMOTAPIConfigName, the name of the RemoteMOTAPIConfig object as it appears in the RemoteMOTAPIConfig.xml high-level configuration file.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    remoteMOTAPIConfigName string
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    targetInstances exa.bee.StringValue*
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

StringTransform

  • com.exalead.indexing.analysis.v10.StringTransform
  • Applies textual transformations on chunks from several contexts:
    • trims blanks at the beginning and end of chunks
    • reduces sequences of blanks to just one
    • changes text to uppercase/lowercase/normalized/capitalized
    Outputs replace inputs.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    spaces string How to handle spaces: "trim" or "normalize-spaces". By default, spaces are left unchanged.
    form string Which transformation to apply: "lowercase", "uppercase", "normalized", or "capitalized". By default, no transformation is applied.
  • Nested elements:
    Name Type Description
    inputContexts exa.bee.StringValue* The processor will only be applied to DocumentChunks with a ContextName specified in this list.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
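
The 'spaces' and 'form' options can be sketched in Python as below. Several details are assumptions: "normalize-spaces" is modeled as collapsing each run of blanks to one space without trimming, "capitalized" is modeled as an uppercase first letter followed by lowercase, and the "normalized" form is not modeled because its exact meaning is not described here. The function name is hypothetical.

```python
import re

def string_transform(text, spaces=None, form=None):
    """Hypothetical model of StringTransform; the output replaces the input."""
    if spaces == "trim":
        text = text.strip()
    elif spaces == "normalize-spaces":
        text = re.sub(r"\s+", " ", text)    # each run of blanks becomes one space
    if form == "lowercase":
        text = text.lower()
    elif form == "uppercase":
        text = text.upper()
    elif form == "capitalized":
        text = text.capitalize()            # assumption about what 'capitalized' means
    elif form is not None:
        raise ValueError(f"form not modeled in this sketch: {form}")
    return text

print(repr(string_transform("  Hello   World ", spaces="trim", form="uppercase")))
```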

ReplaceValues

  • com.exalead.indexing.analysis.v10.ReplaceValues
  • The ReplaceValues processor compares all DocumentChunks for a given inputContext with the specified KeyValue map. When the DocumentChunk value is an exact match, it is replaced by the specified string. This processor can be used, for instance, to normalize different spellings in document metadata.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
    KeyValue exa.bee.KeyValue*
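
The exact-match replacement described above can be sketched in Python as follows. This is only an illustration of the behavior, not the product's implementation: the function name replace_values and the representation of chunks as (ContextName, value) tuples are assumptions made for the example.

```python
def replace_values(chunks, input_context, key_values):
    """Replace chunk values that exactly match a key of key_values.

    chunks is a list of (context_name, value) tuples (hypothetical model).
    Only chunks whose context equals input_context are considered.
    """
    result = []
    for context, value in chunks:
        if context == input_context and value in key_values:
            value = key_values[value]  # exact match: substitute mapped string
        result.append((context, value))
    return result


chunks = [("country", "USA"), ("country", "U.S.A."), ("title", "USA")]
mapping = {"USA": "United States", "U.S.A.": "United States"}
normalized = replace_values(chunks, "country", mapping)
```

The "title" chunk is left untouched because its ContextName differs from the inputContext.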

PublicUrlProcessor

  • com.exalead.indexing.analysis.v10.PublicUrlProcessor
  • For each input DocumentChunk associated with the 'inputContext' ContextName, 4 DocumentChunks are created, each associated with a different ContextName:
    • 'treeOutputContext'
    • 'leafOutputContext'
    • 'urlOutputContext'
    • 'urlCategoryOutputContext'
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    treeOutputContext string The ContextName for the DocumentChunk created from the category path encoding the web site tree.
    leafOutputContext string The ContextName for the DocumentChunks created from the complete, normalized, URL.
    urlOutputContext string The ContextName for the DocumentChunk created from the complete, normalized URL.
    urlPathOutputContext string The ContextName for the DocumentChunk created from the normalized URL.
    maxPathDepth int 4 Maximum depth of the URL path.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

DateFormatter

  • com.exalead.indexing.analysis.v10.DateFormatter
  • If a document chunk matches either:
    • a custom input format defined with UNIX date syntax (for example, %Y/%m/%d-%H:%M:%S)
    • one of the automatically recognized date formats
    the Date Formatter generates three additional document chunks, each with its own context name, using the following naming convention:
    • $inputContext$dateTimeOutputContext (Default format: %Y/%m/%d-%H:%M:%S)
    • $inputContext$dateOutputContext (Default format: %Y/%m/%d)
    • $inputContext$timeOutputContext (Default format: %H:%M:%S)
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    dateTimeOutputContext string Suffix for the name of the DocumentChunk containing the date as defined by dateTimeOutputFormat (default YYYY/MM/DD-HH:MM:SS). The original ContextName of the input DocumentChunk and this suffix are concatenated ($orig$dateTimeOutputContext) to produce the ContextName actually used. This DocumentChunk is usually used for date display.
    dateTimeOutputFormat string A date and time output format compliant with libc's strftime.
    dateOutputContext string Suffix for the name of the DocumentChunk containing the date as defined by dateOutputFormat (default YYYY/MM/DD). The original ContextName of the input DocumentChunk and this suffix are concatenated ($orig$dateOutputContext) to produce the ContextName actually used. This DocumentChunk is usually remapped to a category for navigation.
    dateOutputFormat string A date output format compliant with libc's strftime.
    timeOutputContext string Suffix for the name of the DocumentChunk containing the time as defined by timeOutputFormat (default HH:MM:SS). The original ContextName of the input DocumentChunk and this suffix are concatenated ($orig$timeOutputContext) to produce the ContextName actually used.
    timeOutputFormat string A time output format compliant with libc's strftime.
    inputFormat string An optional date input format, compliant with libc's strptime() format. If such a format is provided, the automatic date format heuristic is disabled, and the provided date format is used exclusively. Accepted format specifiers (day and month literals are recognized in English only):
    • Day
      • %a: weekday abbreviated ("Mon", ...)
      • %A: weekday full ("Monday", ...)
      • %d: day of the month, zero filled [01-31]
      • %e: Equivalent to %d [1-31]
      • %j: day of the year, zero filled [001-366]
      • %u: day of the week as a decimal number [1-7], with 1 representing Monday and 7 representing Sunday
      • %w: day of week as a decimal number [0,6], with 0 representing Sunday
    • Week
      • %U: week number of the year (Sunday as first day of the week) as a decimal number [00,53]
      • %W: week number of the year (Monday as the first day of the week) as a decimal number [00,53]
      • %V: week of the year [01-53]
    • Month
      • %m: the month number [01-12]
      • %b: month locale abbreviated ("Aug", ...)
      • %h: equivalent to %b
      • %B: locale's full month, variable length ("August")
    • Year
      • %y: The year within the century. For two-digit dates, [69,99] is mapped to [1969,1999] and [00,68] is mapped to [2000,2068]
      • %Y: The year, including the century (for example, 2014)
      • %g: last two digits of year of ISO week number (see %G)
      • %G: year of ISO week number (see %V), for example, 2014; normally useful only with %V
    • Century
      • %C: The century number [00,99]
    • Date
      • %D: Equivalent to mm/dd/yy (08/20/14)
      • %x: locale's date representation (mm/dd/yy), 08/20/2014
      • %F: %Y-%m-%d (2014-08-20)
    • Hours
      • %l: hour (12-hour clock), for example, [1-12]
      • %I: hour (12-hour clock) zero filled, [01-12]
      • %k: hour (24 hour), for example, 17
      • %H: hour (24 hour) zero padded, 17
      • %p: locale's upper case AM or PM (blank in many locales), for example, PM
      • %P: locale's lower case am or pm, for example, pm
    • Minutes
      • %M: The minute [00-59]
    • Seconds
      • %s: seconds since 00:00:00 1970-01-01 UTC (Unix epoch), for example, 1345483096
      • %S: seconds [00-60], (The 60 is necessary to accommodate a leap second)
    • Time
      • %r: hours, minutes, seconds (12-hour clock), for example, 05:18:16 PM
      • %R: hours, minutes (24-hour clock), for example, 17:18
      • %T: hours, minutes, seconds (24-hour clock), for example, 17:18:16
      • %X: locale's time representation, for example, 11:07:26 AM
      • %p: AM or PM
    • Date and Time
      • %c: locale's date and time, for example, Sat Nov 04 12:02:33 EST 1989
    • Others
      • %n: Any white space
      • %t: Any white space
      • %%: Replaced by %
    removeOriginalChunk boolean True Removes the original input chunk.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
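
As a rough illustration of the naming convention and default output formats described above, the following Python sketch parses a value with a custom input format and produces the three derived chunks. Python's datetime.strptime and strftime are used as stand-ins for libc's functions (they accept a similar, but not identical, set of specifiers). The function name format_date_chunk and the suffix values "_datetime", "_date" and "_time" are hypothetical.

```python
from datetime import datetime

# Default output formats as listed in the processor description.
DEFAULT_FORMATS = ("%Y/%m/%d-%H:%M:%S", "%Y/%m/%d", "%H:%M:%S")


def format_date_chunk(context, value, input_format,
                      suffixes=("_datetime", "_date", "_time")):
    """Return {output ContextName: formatted value} for one input chunk.

    Each output ContextName is the input ContextName concatenated with
    a suffix (hypothetical suffix values are used here).
    """
    parsed = datetime.strptime(value, input_format)
    return {context + suffix: parsed.strftime(fmt)
            for suffix, fmt in zip(suffixes, DEFAULT_FORMATS)}


derived = format_date_chunk("lastmodified", "2014-08-20 17:18:16",
                            "%Y-%m-%d %H:%M:%S")
```

With these assumptions, the input chunk "lastmodified" yields chunks named "lastmodified_datetime", "lastmodified_date" and "lastmodified_time".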

FormatCheckerDate

  • com.exalead.indexing.analysis.v10.FormatCheckerDate
  • The FormatCheckerDate processor checks that the chunk matches either:
    • a custom input format defined with UNIX date syntax (for example, %Y/%m/%d-%H:%M:%S)
    • one of the automatically recognized date formats
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    inputFormat string An optional date input format, compliant with libc's strptime() format. If such a format is provided, the automatic date format heuristic is disabled, and the provided date format is used exclusively. Accepted format specifiers (day and month literals are recognized in English only):
    • %a: The day of the week ("Monday", ...)
    • %A: Equivalent to %a
    • %b: The month ("January", ...)
    • %B: Equivalent to %b
    • %c: Equivalent to %a %b %e %H:%M:%S %Y
    • %C: The century number [00,99]
    • %d: The day of the month [01,31]
    • %D: Equivalent to %m/%d/%y
    • %e: Equivalent to %d
    • %h: Equivalent to %b
    • %H: The hour (24-hour clock) [00,23]
    • %I: The hour (12-hour clock) [01,12]
    • %j: The day number of the year [001,366]
    • %m: The month number [01,12]
    • %M: The minute [00,59]
    • %n: Any white space
    • %p: AM or PM
    • %r: Equivalent to %I:%M:%S %p
    • %R: Equivalent to %H:%M
    • %S: The seconds [00,60]
    • %t: Any white space
    • %T: Equivalent to %H:%M:%S
    • %U: The week number of the year (Sunday as the first day of the week) as a decimal number [00,53]
    • %w: The weekday as a decimal number [0,6], with 0 representing Sunday
    • %W: The week number of the year (Monday as the first day of the week) as a decimal number [00,53]
    • %x: Equivalent to %m/%d/%y
    • %X: Equivalent to %H:%M:%S
    • %y: The year within century. (for two-digit dates, [69,99] is mapped to [1969,1999] and [00,68] is mapped to [2000,2068])
    • %Y: The year, including the century (for example, 1988)
    • %%: Replaced by %
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
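
A format check of this kind can be approximated in Python as shown below. This is a sketch under assumptions: Python's datetime.strptime stands in for libc's strptime(), the helper name matches_format is hypothetical, and what the processor does with a chunk that fails the check is not covered here.

```python
from datetime import datetime


def matches_format(value, input_format):
    """Return True if value can be parsed with the given strptime format."""
    try:
        datetime.strptime(value, input_format)
    except ValueError:
        # strptime raises ValueError on any mismatch or out-of-range field.
        return False
    return True


ok = matches_format("2014/08/20-17:18:16", "%Y/%m/%d-%H:%M:%S")
bad_month = matches_format("2014/13/20-17:18:16", "%Y/%m/%d-%H:%M:%S")
wrong_layout = matches_format("20 August 2014", "%Y/%m/%d-%H:%M:%S")
```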

SplitValues

  • com.exalead.indexing.analysis.v10.SplitValues
  • Splits the content of all DocumentChunks associated with the ContextName 'inputContext' using 'separator' as a separator regular expression. A new DocumentChunk is created for each segment, with 'outputContext' as the ContextName.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string ContextName to be associated with the DocumentChunk created for each split segment.
    separator string Separator around which to split. ASTL library is used to perform regular expression matching. The regular expression language supported is Perl 5, WITHOUT support for:
    • assertions like \b, \B, ?=, ?!, ?<=, ?<!
    • backreferences \1, \2, ...
    • UNICODE escaping like \u0020 or \p{name}
    • non-greedy (lazy) repeat operators like ??, *?, +?
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
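
The splitting behavior can be sketched in Python as follows. Python's re module is used here only as a stand-in for the ASTL regular expression engine, so the example deliberately uses a separator that avoids assertions, backreferences, Unicode escapes and lazy operators. The function name split_values and the tuple representation of chunks are assumptions, and whether the original chunk or empty segments are kept is not addressed.

```python
import re


def split_values(chunks, input_context, output_context, separator):
    """Create one (output_context, segment) chunk per split segment."""
    pattern = re.compile(separator)
    created = []
    for context, value in chunks:
        if context == input_context:
            created.extend((output_context, segment)
                           for segment in pattern.split(value))
    return created


authors = split_values(
    [("authors", "Ada Lovelace; Alan Turing,Grace Hopper")],
    "authors", "author", r"\s*[,;]\s*")
```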

RenameContext

  • com.exalead.indexing.analysis.v10.RenameContext
  • Each DocumentChunk with ContextName matching 'inputContext' is renamed with a ContextName 'outputContext'.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string The new ContextName for DocumentChunks with ContextName matching 'inputContext'.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

CopyContext

  • com.exalead.indexing.analysis.v10.CopyContext
  • Copies all DocumentChunks with 'inputContext' as ContextName, and creates new DocumentChunks with the same score, language and part but with 'outputContext' as ContextName.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string The ContextName used for newly created chunks.
    requiredAnnotation string The name of the required annotation the chunk must have to be copied. If null, no special handling is done on annotations.
    restrictValues string A regular expression that the chunk value must match for the chunk to be copied to the output context. Values that do not match the regular expression are not copied.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
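
The copy with a restrictValues filter can be sketched in Python as follows. The dictionary representation of a chunk and the function name copy_context are assumptions, and the sketch assumes that the regular expression must match the whole value, which is not stated above.

```python
import re


def copy_context(chunks, input_context, output_context, restrict_values=None):
    """Return the original chunks plus copies renamed to output_context.

    Copies keep every other field (score, language, ...) of the source chunk.
    Assumption: restrict_values must match the entire chunk value.
    """
    pattern = re.compile(restrict_values) if restrict_values else None
    copies = []
    for chunk in chunks:
        if chunk["context"] != input_context:
            continue
        if pattern and not pattern.fullmatch(chunk["value"]):
            continue  # value rejected by the filter: no copy is made
        copies.append({**chunk, "context": output_context})
    return chunks + copies


source = [
    {"context": "ref", "value": "AB-123", "language": "en", "score": 1},
    {"context": "ref", "value": "n/a", "language": "en", "score": 1},
]
copied = copy_context(source, "ref", "ref_facet", r"[A-Z]{2}-\d+")
```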

FixedRangeNumericalPartitioning

  • com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning
  • Matches numerical values in a range. It transforms a numerical value into a matching range, based on a fixed range size. For example, with rangeSize = 100,
    • 101 -> 100_199
    • 234 -> 200_299
    It also works for negative numbers:
    • -20 -> -100_-1
    • 0 -> 0_99
    This helps to create categories (for navigation) from numerical values.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string The ContextName used for newly created chunks.
    separator string _ The range separator.
    rangeSize long 1 The size of the range to consider.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
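
The examples given in the description are consistent with a floor division by rangeSize, as in the Python sketch below. The function name fixed_range is hypothetical and the actual implementation may differ.

```python
def fixed_range(value, range_size=1, separator="_"):
    """Map an integer to the label of the fixed-size range containing it."""
    # Floor division rounds toward negative infinity, so -20 // 100 == -1
    # and negative values fall into the range just below zero.
    low = (value // range_size) * range_size
    high = low + range_size - 1
    return f"{low}{separator}{high}"


labels = [fixed_range(v, range_size=100) for v in (101, 234, -20, 0)]
```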

ForcedRangeNumericalPartitioning

  • com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning
  • Transforms a numerical value into the text value associated to its matching range from a set of predetermined ranges specified in 'NumericalRange'.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string The ContextName used for newly created chunks.
    separator string _ The separator between the beginning and the end of the range. This parameter is deprecated.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
    NumericalRange com.exalead.indexing.analysis.v10.NumericalRange* The forced ranges.

NumericalRange

  • com.exalead.indexing.analysis.v10.NumericalRange
  • Associates text with a numerical range. The range includes all values >= beg and <= end (beg <= x <= end). A range corresponding to a unique value with beg = end is allowed.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
  • Attributes:
    Name Type Default value Description
    beg long The lower bound.
    end long The upper bound.
    text string The associated text.
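
The lookup performed by ForcedRangeNumericalPartitioning over its NumericalRange elements can be sketched in Python as follows. The function name forced_range, the tuple representation (beg, end, text), the first-match-wins rule and the None result for unmatched values are all assumptions made for the example.

```python
def forced_range(value, ranges):
    """Return the text of the first range with beg <= value <= end."""
    for beg, end, text in ranges:
        if beg <= value <= end:  # both bounds are inclusive
            return text
    return None  # assumption: no output when no range matches


price_ranges = [
    (0, 99, "budget"),
    (100, 499, "mid-range"),
    (500, 500, "exactly 500"),  # single-value range: beg == end
]
label = forced_range(100, price_ranges)
```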

TextToNum

  • com.exalead.indexing.analysis.v10.TextToNum
  • Processor that provides an approximate sort on a text field. It implements a surjection from the set of strings to the set of integers [0..N], with N close to but less than or equal to 18,446,744,073,709,551,615. The user defines an ordered alphabet. A first surjection is applied from the set of all strings to the set of finite sequences of symbols taken from this alphabet (symbols outside the alphabet are stripped from the string). An order relation is inferred on the latter set from the alphabet (lexicographical order). For cardinality reasons (one set is infinite, the other is not), the second surjection cannot preserve this order for all strings. The idea is to preserve the relation between shorter strings, AND the relation between shorter strings and longer strings, such that:
    • if STRING2ULONG('shortstring1') <= STRING2ULONG('shortstring2') then 'shortstring1' <= 'shortstring2'
    • STRING2ULONG('longstring1') <= STRING2ULONG('longstring2') does NOT ensure 'longstring1' <= 'longstring2'
    • if STRING2ULONG('shortstring1') <= STRING2ULONG('longstring2') then 'shortstring1' <= 'longstring2'
    The size of the prefix for which the order is preserved depends on the size of the alphabet.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    alphabet string 0123456789abcdefghijklmnopqrstuvwxyz The ordered alphabet.
    outputContext string The ContextName used for the newly created chunk.
    nbBits int 63 Number of bits of unsigned field used for sorting.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
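
One possible way to build such a mapping is sketched below in Python. This is not the product's algorithm, only an encoding chosen for illustration: symbols outside the alphabet are stripped, the longest prefix that fits in nbBits bits is kept, and the prefix is read as a number in base len(alphabet) + 1, with digit 0 reserved for padding so that a short string sorts before its extensions. The function name text_to_num is hypothetical.

```python
def text_to_num(text, alphabet="0123456789abcdefghijklmnopqrstuvwxyz",
                nb_bits=63):
    """Map text to an integer that preserves order on short prefixes."""
    base = len(alphabet) + 1  # digit 0 means "end of string"
    rank = {symbol: index + 1 for index, symbol in enumerate(alphabet)}

    # Longest prefix length whose encoding fits in nb_bits bits.
    prefix_len = 0
    while base ** (prefix_len + 1) <= 2 ** nb_bits:
        prefix_len += 1

    # Strip symbols outside the alphabet, truncate, then pad with zeros.
    digits = [rank[c] for c in text if c in rank][:prefix_len]
    digits += [0] * (prefix_len - len(digits))

    number = 0
    for digit in digits:
        number = number * base + digit
    return number


short_first = text_to_num("app") < text_to_num("apple") < text_to_num("banana")
```

With the default 36-symbol alphabet and 63 bits, this encoding keeps a prefix of 12 symbols, so two strings that share their first 12 retained symbols map to the same number. This matches the point made above that order cannot be guaranteed between long strings.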

DoubleToLong

  • com.exalead.indexing.analysis.v10.DoubleToLong
  • Using this processor you can store floating point values into signed fields that can then be queried with the DoublePrefixHandler.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    precision int 1000 The multiplier. Each value is multiplied by this factor.
    outputContext string The ContextName used for the newly created chunk.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
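
The effect of the precision multiplier can be illustrated with the Python sketch below. The function name double_to_long is hypothetical, and rounding to the nearest integer is an assumption: whether the processor rounds or truncates is not stated above.

```python
def double_to_long(value, precision=1000):
    """Scale a floating point value by precision and return an integer."""
    # Assumption: the scaled value is rounded to the nearest integer.
    return round(float(value) * precision)


stored = double_to_long("3.14159")
```

With the default precision of 1000, three decimal places survive the conversion; any further digits are lost.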

GeoBBoxProcessor

  • com.exalead.indexing.analysis.v10.GeoBBoxProcessor
  • The Geo BBox processor converts the input geometry from WKT to WKB and computes its bounding box. Both the WKB and the bounding box are returned as chunks.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    precision int 6 The number of decimals that will be used in geometrical representations and computations.
    bboxMetaName string
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
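To make the bounding-box step concrete, here is a minimal Python sketch. It is an illustration only, not the product's implementation: the function name `wkt_bbox` is hypothetical, the sketch handles only plain 2D coordinates without exponent notation, it assumes the bounding box is the min/max envelope, and it does not show the WKT-to-WKB conversion. The `precision` argument mirrors the attribute of the same name.

```python
import re

def wkt_bbox(wkt: str, precision: int = 6):
    """Return (min_x, min_y, max_x, max_y) for a simple 2D WKT geometry.

    Hypothetical helper: assumes the bounding box is the min/max envelope
    of all coordinates, rounded to `precision` decimals.
    """
    # Pull every "x y" coordinate pair out of the WKT text.
    pairs = re.findall(r"(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)", wkt)
    if not pairs:
        raise ValueError("no coordinates found in WKT input")
    xs = [float(x) for x, _ in pairs]
    ys = [float(y) for _, y in pairs]
    return tuple(round(v, precision) for v in (min(xs), min(ys), max(xs), max(ys)))

print(wkt_bbox("POLYGON((0 0, 4.5 0, 4.5 3.25, 0 3.25, 0 0))"))
```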

JavaProcessor

  • com.exalead.indexing.analysis.v10.JavaProcessor
  • (Deprecated)
  • Allows documents to be sent to a Java process for analysis.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    id string
    target string
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

ReplaceRegexp

  • com.exalead.indexing.analysis.v10.ReplaceRegexp
  • Substitutes the content substring of all DocumentChunks having the ContextName 'inputContext', using:
    • 'pattern' as the matching substring regular expression
    • and 'value' as the replacement value.
    The value may use the sed output format, with references to captures \0 through \9. A new DocumentChunk is created with the substitutions.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string ContextName to be associated with the DocumentChunk created for each new context.
    pattern string Pattern used to match the substrings to replace. ASTL library is used to perform regular expression matching. The regular expression language supported is Perl 5, WITHOUT support for:
    • lazy (non-greedy) quantifiers like *?, +?, ??, {n}?, {n,}?, {n,m}?
    • possessive quantifiers like *+, ++, ?+, {n}+, {n,}+, {n,m}+
    • assertions like \b, \B, \A, \z, \Z, \G
    • look-around assertions (?=pattern), (?!pattern), (?<=pattern), (?<!pattern)
    • named captures (?'name'pattern), (?<name>pattern)
    • numeric and named backreferences like \1, \g1, \g{-1}, \g{name}, \k<name>, \k'name'
    • named Unicode character \N{name}
    • all operators related to Perl code inlining like (?{ code })
    • all operators related to backtracking algorithm control like independent subexpression (?>pattern)
    • \C matching a single C char (octet)
    • of the pattern-match modifiers (?pimsx-imsx) only (?i:pattern) and (?i) are supported (no negative form)
    value string The replacement value (sed-like output format).
    replaceAll boolean True Replaces all occurrences of the pattern rather than only the first one.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
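The substitution described for ReplaceRegexp can be sketched in Python as follows. This is only an approximation under stated assumptions: Python's `re` engine stands in for the ASTL library (so it accepts constructs that ASTL rejects), the function name `replace_regexp` is hypothetical, and the reading of `replaceAll` as "all matches versus first match only" is an assumption.

```python
import re

def replace_regexp(content: str, pattern: str, value: str, replace_all: bool = True) -> str:
    """Apply a sed-like replacement where \\0..\\9 refer to captures."""
    def expand(match: re.Match) -> str:
        # Replace each \N in the value with capture N (\0 is the whole match).
        # An unmatched optional group expands to the empty string.
        return re.sub(r"\\(\d)", lambda ref: match.group(int(ref.group(1))) or "", value)

    # count=0 means "replace every match" in re.sub; count=1 stops after the first.
    count = 0 if replace_all else 1
    return re.sub(pattern, expand, content, count=count)

print(replace_regexp("2024-01-15", r"(\d+)-(\d+)-(\d+)", r"\3/\2/\1"))  # 15/01/2024
```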

URLCodec

  • com.exalead.indexing.analysis.v10.URLCodec
  • URL-encodes or URL-decodes chunk content, using the UTF-8 charset only.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string Stores URL encoded form in outputContext. If outputContext = inputContext, it removes the original chunk.
    encodeURIComponent boolean True If true (default), it encodes the following characters: ',' '/' '?' ':' '@' '&' '=' '+' '$' '#'
    mode enum(encode, decode) encode mode = "encode" or "decode"
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
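The effect of `mode` and `encodeURIComponent` can be illustrated with Python's standard `urllib.parse` module. This is a sketch, assuming the processor behaves like percent-encoding with a configurable set of characters left untouched; the function name `url_codec` is hypothetical, and the real processor may treat characters outside the documented list differently.

```python
from urllib.parse import quote, unquote

# Characters listed in the encodeURIComponent description.
RESERVED = ",/?:@&=+$#"

def url_codec(text: str, mode: str = "encode", encode_uri_component: bool = True) -> str:
    """Percent-encode or percent-decode text using UTF-8."""
    if mode == "decode":
        return unquote(text, encoding="utf-8")
    # With encode_uri_component=True the reserved characters are encoded too;
    # otherwise they are passed through unchanged.
    safe = "" if encode_uri_component else RESERVED
    return quote(text, safe=safe, encoding="utf-8")

print(url_codec("a b/c?d=é"))                              # a%20b%2Fc%3Fd%3D%C3%A9
print(url_codec("a b/c?d=é", encode_uri_component=False))  # a%20b/c?d=%C3%A9
```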

WildcardIndexing

  • com.exalead.indexing.analysis.v10.WildcardIndexing
  • Computes substrings of the input chunk to enable efficient prefix, substring, and suffix search.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string Stores exact/prefix/substring/suffix in outputContext. If outputContext = inputContext, it removes the original chunk.
    exactScore int 4 Specifies the score for an exact match.
    prefixSearch boolean True Enables the prefix search.
    prefixScore int 3 Specifies the score for a prefix match.
    suffixSearch boolean True Enables the suffix search.
    suffixScore int 2 Specifies the score for a suffix match.
    substringSearch boolean True Enables the substring search.
    substringScore int 1 Specifies the score for a substring match.
    maxStringSize int 100 Specifies the max string size for which this processor will be applied.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
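The default scores suggest that an exact match outranks a prefix match, which outranks a suffix match, which outranks a plain substring match. The Python sketch below illustrates that ordering only; it does not reproduce how the processor builds its index. The function name `wildcard_score` is hypothetical, and the rule that the highest applicable score wins is an assumption.

```python
from typing import Optional

def wildcard_score(value: str, fragment: str, *, exact: int = 4, prefix: int = 3,
                   suffix: int = 2, substring: int = 1,
                   max_string_size: int = 100) -> Optional[int]:
    """Return the score of the best match type of `fragment` in `value`."""
    # Assumption: values longer than maxStringSize are not processed at all.
    if len(value) > max_string_size:
        return None
    if value == fragment:
        return exact
    if value.startswith(fragment):
        return prefix
    if value.endswith(fragment):
        return suffix
    if fragment in value:
        return substring
    return None

for fragment in ("indexing", "index", "xing", "dex", "foo"):
    print(fragment, wildcard_score("indexing", fragment))
```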

URLTransformer

  • com.exalead.indexing.analysis.v10.URLTransformer
  • Parses a context string as a regular URL (RFC 2396, "Uniform Resource Identifier") and transforms it according to the given URL pattern. A new DocumentChunk is created with the substitution. Pattern used to transform the URL (in the form <scheme>://<authority><path>?<query>#<fragment>):
    • Characters other than '$' or '\' are kept as-is
    • The '$' character and the '\' character must be escaped with a leading \
    • The ${expression} form lets you compute a string expression based on URL components (see "Expression" below)
    Expression used inside the enclosing ${}:
    • url: Original URL
    • scheme: Scheme name ("http", "https", "file", ...)
    • authority: Authority (host:port or host) (may be empty)
    • host: Hostname part of the authority (may be empty)
    • port: Port number part of the authority (may be empty)
    • userInfo: username:password field of the authority (may be empty)
    • file: File starting with / and query string, if any
    • pathurl: Normalized absolute path starting with /
    • path: Normalized absolute path (may start with C:\ on Windows)
    • query: Normalized query part starting with ? (may be empty)
    • args: Query part without the leading ? (may be empty)
    • fragment: Fragment part starting with # (may be empty)
    • reference: Reference part, i.e., fragment without the leading # (may be empty)
    • arg:name: Query part argument identified by its name, unescaped (you must re-escape it using "urlencode:" when necessary)
    • str:string: The final argument is not a variable name, but a string (only useful for clarity purpose)
    • tolower:expression: Transform into lowercase (ONLY A-Z)
    • toupper:expression: Transform into uppercase (ONLY a-z)
    • urlencode:expression: URL encoding (%NN or +)
    • urlpathencode:expression: URL encoding outside / fragments
    • urldecode:expression: URL decoding
    • pathslash:expression: Convert \ into /
    • pathantislash:expression: Convert / into \
    Notes:
    • Unreserved characters are unescaped during URL processing (i.e., never '%' or '\')
    • The tolower prefix and other similar prefixes accept recursion (i.e., the expression "${urlpathencode:pathantislash:toupper:path}" is valid)
    • Both "file://C:\path" and "file:///C:\path" will produce path="/C:\path"
    Examples:
    • With the input context value "http://www.example.com/bar/foo?bar=42"
      • "hello, world" => "hello, world"
      • "the scheme is ${scheme}" => "the scheme is http"
      • "the scheme is \${scheme}" => "the scheme is \${scheme}"
      • "http://myserver${path}${query}" => "http://myserver/bar/foo?bar=42"
      • "http://myserver/applet?f=${urlpathencode:path}&t=${arg:bar}" => "http://myserver/applet?f=/bar/foo&t=42"
      • "http://myserver/applet?f=${urlencode:path}&t=${arg:bar}" => "http://myserver/applet?f=%2Fbar%2Ffoo&t=42"
      • "http://myserver/applet?f=${urlpathencode:pathantislash:toupper:path}" => "http://myserver/applet?f=%5CBAR%5CFOO"
    • With the input context value "file:///C:/My%20Documents/Document.doc"
      • "${pathantislash:urldecode:path}" => "C:\My Documents\Document.doc"
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string ContextName to be associated with the DocumentChunk created for each new context.
    urlPattern string Pattern used to transform the URL.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
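The urlPattern mini-language can be approximated with a short Python evaluator built on `urllib.parse`. This is a sketch of a subset, not the product's parser: the names `transform_url` and `_evaluate` are hypothetical, only some variables are supported, path normalization and Windows drive handling are omitted, and escaped characters are copied through together with their backslash, mirroring the documented `\${scheme}` example (an assumption about the intended behavior).

```python
import string
from urllib.parse import urlsplit, parse_qs, quote, unquote

_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)
_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)

# Prefix filters; each one is applied to the result of the rest of the expression.
_FILTERS = {
    "tolower": lambda s: s.translate(_LOWER),        # ASCII letters only
    "toupper": lambda s: s.translate(_UPPER),        # ASCII letters only
    "urlencode": lambda s: quote(s, safe=""),
    "urlpathencode": lambda s: quote(s, safe="/"),   # leaves / separators intact
    "urldecode": unquote,
    "pathslash": lambda s: s.replace("\\", "/"),
    "pathantislash": lambda s: s.replace("/", "\\"),
}

def _evaluate(expr: str, url: str) -> str:
    parts = urlsplit(url)
    head, _, rest = expr.partition(":")
    if head in _FILTERS:
        return _FILTERS[head](_evaluate(rest, url))  # prefixes nest recursively
    if head == "str":
        return rest
    if head == "arg":
        return parse_qs(parts.query).get(rest, [""])[0]
    variables = {
        "url": url,
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "host": parts.hostname or "",
        "port": "" if parts.port is None else str(parts.port),
        "path": parts.path,
        "query": "?" + parts.query if parts.query else "",
        "args": parts.query,
        "fragment": "#" + parts.fragment if parts.fragment else "",
        "reference": parts.fragment,
    }
    return variables[head]

def transform_url(url: str, pattern: str) -> str:
    out, i = [], 0
    while i < len(pattern):
        if pattern[i] == "\\" and i + 1 < len(pattern):
            out.append(pattern[i:i + 2])   # keep the escape sequence verbatim (assumption)
            i += 2
        elif pattern.startswith("${", i):
            end = pattern.index("}", i)
            out.append(_evaluate(pattern[i + 2:end], url))
            i = end + 1
        else:
            out.append(pattern[i])
            i += 1
    return "".join(out)

print(transform_url("http://www.example.com/bar/foo?bar=42",
                    "http://myserver/applet?f=${urlencode:path}&t=${arg:bar}"))
```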

GeoCategorizer

  • com.exalead.indexing.analysis.v10.GeoCategorizer
  • A processor that categorizes geographic points given their inclusion in a GeoDomain.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    inputContext string The processor will only be applied to DocumentChunks with this ContextName.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string ContextName of the chunk to create.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
    GeoDomain com.exalead.search.v30.GeoDomain*

DiskDomain

  • com.exalead.search.v30.DiskDomain
  • No documentation for this element.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
  • Attributes:
    Name Type Default value Description
    title string
    id int Unique identifier of this domain. If id=0 (its default value) the category path will be the set of vertices. Otherwise, it will be the id value.
    radius double Disk radius in meters
    x double First coordinate of the center for the DiskDomain. If the point type is XY, it will be interpreted as the X coordinate (integer units). For geographic points (GPS), it will be interpreted as the latitude coordinate.
    y double Second coordinate of the center for the DiskDomain. If the point type is XY, it will be interpreted as the Y coordinate (integer units). For geographic points (GPS), it will be interpreted as the longitude coordinate.
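For geographic points, testing whether a point falls inside a DiskDomain amounts to comparing a great-circle distance with the radius. The Python sketch below uses the haversine formula as one plausible way to do this; the product's actual distance computation is not documented here, the function name `in_disk` is hypothetical, and the Earth radius constant is an approximation.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters (approximation)

def in_disk(lat: float, lon: float, center_lat: float, center_lon: float,
            radius_m: float) -> bool:
    """Return True if (lat, lon) lies within radius_m meters of the center.

    Following the attribute descriptions, x is read as latitude and
    y as longitude for GPS points.
    """
    phi1, phi2 = math.radians(lat), math.radians(center_lat)
    dphi = phi2 - phi1
    dlmb = math.radians(center_lon - lon)
    # Haversine formula for the great-circle distance.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
    return distance <= radius_m

# Two points in Paris roughly 4 km apart.
print(in_disk(48.8584, 2.2945, 48.8566, 2.3522, 5000))
```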

PolygonDomain

  • com.exalead.search.v30.PolygonDomain
  • No documentation for this element.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
  • Attributes:
    Name Type Default value Description
    title string
    id int Unique identifier of this domain. If id=0 (its default value) the category path will be the set of vertices. Otherwise, it will be the id value.
    vertices string Polygon vertices, as a list of (x,y) coordinates. For example: "0.0,0.0;1.1,0.1;1.1,1.1"
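As an illustration of the `vertices` format, the following Python sketch parses the string into coordinate pairs and tests point inclusion with the standard ray-casting algorithm. The function names are hypothetical, and ray casting is a swapped-in technique chosen for brevity; the algorithm the product uses is not stated in this reference.

```python
def parse_vertices(vertices: str):
    """Parse "x1,y1;x2,y2;..." into a list of (x, y) float tuples."""
    return [tuple(float(c) for c in pair.split(",")) for pair in vertices.split(";")]

def point_in_polygon(x: float, y: float, polygon) -> bool:
    """Ray casting: count how many edges a horizontal ray from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        # The edge straddles the ray's height only if its endpoints lie on
        # opposite sides of y; this also guarantees y1 != y2 below.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

triangle = parse_vertices("0.0,0.0;1.1,0.1;1.1,1.1")
print(point_in_polygon(1.0, 0.5, triangle))  # True
print(point_in_polygon(0.2, 0.9, triangle))  # False
```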

KMLDomain

  • com.exalead.search.v30.KMLDomain
  • Definition of a geographic domain using a KML or KMZ resource
  • Parent elements:
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
  • Attributes:
    Name Type Default value Description
    title string
    id int Unique identifier of this domain. If id=0 (its default value) the category path will be the set of vertices. Otherwise, it will be the id value.
    resource string
    KMZ boolean Is this resource a KMZ resource?

SHPDomain

  • com.exalead.search.v30.SHPDomain
  • No documentation for this element.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
  • Attributes:
    Name Type Default value Description
    title string
    id int Unique identifier of this domain. If id=0 (its default value) the category path will be the set of vertices. Otherwise, it will be the id value.
    shpResource string
    shxResource string
    dbfResource string

MimeTypeSetter

  • com.exalead.indexing.analysis.v10.MimeTypeSetter
  • Manually sets the mime type
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    value string New mime type
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

MetaFinder

  • com.exalead.indexing.analysis.v10.MetaFinder
  • Keeps track of all document metas
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

JavaDocumentProcessor

  • com.exalead.indexing.analysis.v10.JavaDocumentProcessor
  • Takes Java code either inline or from a file, and executes it on-the-fly. For production mode, we recommend packaging your custom code as a Java Plugin (CVPlugin) and using the Custom Document Processor to call it. Plugins allow better packaging and source code maintenance.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    path string User defined path to a Java file containing the processor code
    priority int Defines which path to use (0: user defined path, 1: resource managed path (inlined Java))
    sourceCode string Inline Java code
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

JavaScriptProcessor

  • com.exalead.indexing.analysis.v10.JavaScriptProcessor
  • (Deprecated)
  • This document processor is deprecated. Use the Java document processor instead. The JavaScript Processor takes a JS script and executes it.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    path string User defined path to a JS file containing the processor code
    priority int Defines which path to use (0: user defined path, 1: resource managed path (inlined JS))
    script string Inline script
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

StorageServiceDocumentProcessor

  • com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor
  • Queries the storage for any meta to attach to the document. Multi-valued pairs are pushed as multi-valued metas. For example:
    • The storage key "nb_comment" will be attached as "nb_comment" meta on the document.
    • The storage key "tags[]" will be attached as "tags" multi-valued meta on the document.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    instance string Storage service instance
    metaIdentifier string Defines an optional meta name that will be used as storage Identifier instead of the document Uri.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
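The key naming convention described for the storage service (a trailing `[]` marks a multi-valued key) can be sketched in Python as follows. The function name `storage_to_metas` and the representation of storage entries as `(key, value)` pairs are assumptions made for illustration; the actual storage API is not shown in this reference.

```python
def storage_to_metas(pairs):
    """Turn (key, value) storage pairs into a dict of document metas.

    Keys ending in "[]" become multi-valued metas (lists) named without
    the suffix; other keys become single-valued metas.
    """
    metas = {}
    for key, value in pairs:
        if key.endswith("[]"):
            metas.setdefault(key[:-2], []).append(value)
        else:
            metas[key] = value
    return metas

print(storage_to_metas([("nb_comment", "3"), ("tags[]", "news"), ("tags[]", "sport")]))
# {'nb_comment': '3', 'tags': ['news', 'sport']}
```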

MathDocumentProcessor

  • com.exalead.indexing.analysis.v10.MathDocumentProcessor
  • Performs mathematical operations on a numerical field. Expressions must be prefaced by a $. For example, the expression `$ht_price * 1.196` finds the first chunk in the `ht_price` context, and replaces all occurrences of `ht_price` with the mathematical expression. The result will be a new text chunk, either in the Output context (if specified), or in the original `ht_price` context.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    expression string Arithmetic expression to evaluate. For example: "$file_size + 42"
    outputContext string ContextName of the chunk to create.
    floatingPoint boolean Outputs a floating-point number instead of the default integer.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
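
The following Python sketch illustrates one reading of the MathDocumentProcessor description: each `$name` reference is replaced by the first chunk of that context, and the resulting arithmetic is evaluated. It is not the product's implementation. The function names, the supported operators (+, -, *, /), and the truncation used for integer output are assumptions made for illustration.

```python
import ast
import operator
import re

# Arithmetic operators supported by this sketch (an assumption, not the product's list).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval(node):
    """Evaluate a parsed arithmetic expression made of numbers and + - * /."""
    if isinstance(node, ast.Expression):
        return _eval(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError("unsupported expression")


def evaluate(expression, chunks, floating_point=False):
    """Substitute $context references, then evaluate the expression.

    chunks maps a context name to its list of chunk texts; only the first
    chunk of each context is used.
    """
    def first_chunk(match):
        return "(" + chunks[match.group(1)][0] + ")"

    substituted = re.sub(r"\$(\w+)", first_chunk, expression)
    result = _eval(ast.parse(substituted, mode="eval"))
    # Integer output is the default; truncating with int() is an assumption of this sketch.
    return str(float(result)) if floating_point else str(int(result))
```

With these assumptions, `evaluate("$file_size + 42", {"file_size": ["8"]})` yields the text chunk `"50"`, which would then be written to the output context.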

PrecomputedThumbnailsDocumentProcessor

  • com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor
  • The Precomputed Thumbnails Document Processor precomputes thumbnails of the first DocumentPart.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    convertAddresses string Semicolon separated list of convert instance names or urls to use.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

RealTimeAlerting

  • com.exalead.indexing.analysis.v10.RealTimeAlerting
  • The Real-time alerting document processor matches queries defined by end-users and alerts them as soon as a new matching document is indexed. To be used only when not in task queue mode.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    alertGroups com.exalead.indexing.analysis.v10.AlertGroup* List of alert groups handled by this processor, empty means ALL groups
    customPublishers com.exalead.indexing.analysis.v10.CustomPublisher*
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

AlertGroup

  • com.exalead.indexing.analysis.v10.AlertGroup
  • No documentation for this element.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.RealTimeAlerting (as alertGroups)
  • Attributes:
    Name Type Default value Description
    name string

CustomPublisher

  • com.exalead.indexing.analysis.v10.CustomPublisher
  • Custom publisher configuration
  • Parent elements:
    • com.exalead.indexing.analysis.v10.RealTimeAlerting (as customPublishers)
  • Attributes:
    Name Type Default value Description
    classId string Custom publisher type
  • Nested elements:
    Name Type Description
    config exa.bee.KeyValue*

MIMEDetector

  • com.exalead.indexing.analysis.v10.MIMEDetector
  • The MIME detector operates on each DocumentPart for which a MIME-type is not available. The MIME-type can be specified for each DocumentPart in the PAPI. For each such DocumentPart, the 'bytes' and the 'filename' are used to guess the real MIME-type and charset. The guessed MIME-type and charset are then set as attributes of the DocumentPart. Input: The DocumentParts of the document. Output: 'mime' and 'encodingToUse' attributes of DocumentParts. This document processor does not create any document chunks.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    defaultValue string Default mime to use if not detected.
    defaultCharset string On text or HTML files, the MIME detector tries to detect charset encoding automatically. If the encoding cannot be detected, this 'defaultCharset' is used.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
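
The fallback behavior of 'defaultValue' and 'defaultCharset' can be sketched in Python as follows. This is only an illustration: the dictionary keys, the function name, the filename-only guess through the standard `mimetypes` module, and the "valid UTF-8" charset check are stand-ins chosen for the example, not the detector's actual heuristics.

```python
import mimetypes


def detect(part, default_value=None, default_charset=None):
    """Fill in 'mime' and 'encodingToUse' on a part represented as a dict.

    part has the hypothetical keys 'bytes', 'filename' and, optionally, 'mime'.
    """
    if part.get("mime"):
        # A MIME-type is already available: the detector leaves the part alone.
        return part
    # Stand-in for real detection: guess from the filename only.
    guessed, _ = mimetypes.guess_type(part.get("filename") or "")
    part["mime"] = guessed or default_value
    if part["mime"] in ("text/plain", "text/html"):
        try:
            # Stand-in for charset detection: accept the bytes if they are valid UTF-8.
            part["bytes"].decode("utf-8")
            part["encodingToUse"] = "utf-8"
        except UnicodeDecodeError:
            # Encoding could not be detected: fall back to defaultCharset.
            part["encodingToUse"] = default_charset
    return part
```

The point of the sketch is the order of decisions: a part that already carries a MIME-type is skipped, 'defaultValue' applies only when nothing is guessed, and 'defaultCharset' applies only to text or HTML content whose encoding cannot be determined.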

NativeTextExtractor

  • com.exalead.indexing.analysis.v10.NativeTextExtractor
  • Extraction is performed for the following data types:
    • text/plain for Text files.
    • text/html for HTML Files.
    • application/x-exalead-document for CloudView 4.6 document format (com.exalead.document)
    • application/x-exalead-ndoc for CloudView 5 internal document format, binary.
    • application/x-exalead-ndoc-v10+xml for CloudView internal document format, XML.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    annotateHTML boolean Adds HTML structure and style annotations to DocumentChunks (for HTML files only):
    • html:p for DocumentChunks generated from <p>
    • html:row for DocumentChunks generated from <tr>
    • html:column for DocumentChunks generated from <td> or <th>
    • html:table for DocumentChunks generated from <table>
    • html:h1 for DocumentChunks generated from <h1>
    • html:h2 for DocumentChunks generated from <h2>
    • html:h3 for DocumentChunks generated from <h3>
    • html:h4 for DocumentChunks generated from <h4>
    • html:h5 for DocumentChunks generated from <h5>
    • html:h6 for DocumentChunks generated from <h6>
    • html:link for DocumentChunks generated from <a>, <iframe> or <frame>
      • html:link:rel if the link has a "rel" attribute
      • html:link:name if the link has a "name" attribute
    • html:list for DocumentChunks generated from <ul>, <ol> or <dl>
    • html:item for DocumentChunks generated from <li>
    • html:bold for DocumentChunks generated from <b> or <strong>
    • html:italic for DocumentChunks generated from <i> or <em>
    • html:underline for DocumentChunks generated from <u>
    • html:strike for DocumentChunks generated from <s> or <strike>
    • html:pre for DocumentChunks generated from <pre>
    • html:invisible for DocumentChunks containing invisible text (display: none, white on white)
    • html:class for DocumentChunks taken in a CSS class
    • html:id for DocumentChunks taken in a CSS id
    • html:img:src for DocumentChunks created from a <img>
    It also creates specific HTML DocumentChunks with the following contexts:
    • html:lang when parsing a <html> containing the "lang" attribute
    • html:xml:lang when parsing a <html> containing the "xml:lang" attribute
    • html:title when parsing a <title>
    • html:title:other when parsing a second <title>
    • html:base:href when parsing a <base>
    • html:link when parsing a <link> containing the "src" attribute and annotated by:
      • html:link:rel if the link has a "rel" attribute
      • html:link:type if the link has a "type" attribute
    • html:http-equiv:NAME when parsing a http-equiv meta
    • html:meta:NAME when parsing a meta named "NAME"
    skipInvisibleHTMLText boolean Skips the invisible text. For example, white fonts on white backgrounds (for HTML files only).
    extractJs boolean Tries to parse JavaScript and then extract links.
    extractHTMLTables boolean Adds annotations on table, tr, td, th
    extractHTMLStyles boolean Adds annotations on style attributes.
    extractHTMLForms boolean Adds annotations on form and select elements.
    maxHTMLAnnotationDepth int 20 Prevents new annotations from being created beyond maxHTMLAnnotationDepth levels of HTML nesting.
    disableAutomaticHTMLDTDFix boolean Disables automatic DTD fix on HTML documents.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

ConvertTextExtractor

  • com.exalead.indexing.analysis.v10.ConvertTextExtractor
  • This processor performs text content extraction for all MIME-types (300+ file formats are currently handled). See the "Supported Formats" technical note for more information. Text, HTML, and built-in data types must be processed by the 'NativeTextExtractor' rather than this processor. Make sure to place a 'NativeTextExtractor' before the ConvertTextExtractor in your pipeline.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    looseTextDetection boolean True Loosens text detection to detect more text files, including suspicious ones (not *.txt or *.html) ("true", "false")
    forceContent boolean Forces acceptance of the content, even if the MIME type does not seem to be a known or supported MIME type.
    minInputSizeKB long -1 Minimum document size accepted, in kilobytes.
    maxInputSizeKB long -1 Maximum document size accepted, in kilobytes.
    maxRecursionDepth int -1 Maximum recursion depth.
    maxRecursionDocuments int -1 Maximum number of documents that can be converted in one directory level.
    maxRecursionDocumentsTotal int -1 Maximum number of documents that can be converted over all levels.
    strictSizeCheck boolean Strict size validation mode (even for partial reads).
    retryIO string Uses regular I/O when mmap fails. ("true", "false")
    filter string Native filter identifier list to be used specifically. The list is a comma-separated (,) list of filter identifiers with optional ending argument(s) separated by semi-colons (;). If the filter identifier is prefixed by '!', the corresponding filter will be explicitly excluded. The special filter identifier '*' stands for "all other filters". First match wins: "*,!doc" is identical to "*". For example: filter="!jpeg,*" will accept all filters but the jpeg filter.
    timeoutMs long -1 Conversion timeout value, in milliseconds. If the conversion process takes longer, the remote side attempts to abort the conversion process.
    priority string Worker thread(s) priority to be used for the processing ("normal", "lowest", "very low", "low", "normal", "high", "very high")
    embedded string Includes embedded images ("true", "false", "optional")
    attachments string Includes embedded attachments ("true", "false", "optional")
    styles string Attempts to extract more text styles for HTML conversion ("true", "false", "optional")
    forceConversion boolean Attempts to generate an empty document upon conversion error (may be ignored)
    startPage long -1 Starts conversion from this page number (page number starts at 1). This parameter is only taken into account for image processing and may be ignored.
    maxPages long -1 Maximum number of pages to process for xml conversion (may be ignored).
    maxOutputSizeKB long -1 Maximum output size on the remote side, in kilobytes. If the generated output exceeds this value, the document may be truncated or invalid.
    allowUnicode32 boolean Allows the use of 32-bit unicode points.
    allowDocumentChars boolean Allows the use of Unicode private range characters (E0XX) for separators (keyword, sentence, paragraph separators, ...)
    outsideIn string This feature is no longer supported. ("true", "false", "optional")
    outsideInFallback string This feature is no longer supported. ("true", "false", "optional")
    outsideInOnly string This feature is no longer supported. ("true", "false", "optional")
    outsideInForPreview string This feature is no longer supported. ("true", "false", "optional")
    outsideInSimpleXHTMLFallback string This feature is no longer supported. ("true", "false", "optional")
    ocr string Converts using OCR ("true", "false", "optional")
    ocrFallback string Fallback to OCR if heuristics deem it necessary ("true", "false", "optional")
    ocrDetect string Detects documents requiring OCR (and rejects them) ("true", "false")
    ocrQuality string OCR quality ("fast", "normal", "best")
    ocrLang string OCR language(s) ("en" for English, "en;fr" for French and English, etc.)
    ocrTimeoutMs long -1 OCR conversion timeout value, in milliseconds. If the OCR process takes longer, the remote side attempts to abort the conversion process. This value overrides the timeout value if the processing involves an OCR operation.
    ocrMaxPages int -1 Maximum number of pages to process for OCR.
    ocrPriority string Worker thread(s) priority to be used for the OCR processing ("normal", "lowest", "very low", "low", "normal", "high", "very high")
    httpProxyUrl string Optional HTTP proxy URL. The URL can embed credentials if required.
    disablePlugins boolean Disables external plugins.
    overrideAddresses string
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
    KeyValue exa.bee.KeyValue*
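
The "first match wins" rule of the 'filter' attribute can be illustrated with a short Python sketch. The function name is hypothetical, and rejecting an identifier that matches no entry is an assumption, since the description above does not cover that case.

```python
def filter_accepts(filter_spec, filter_id):
    """Return True if filter_id is enabled by a ConvertTextExtractor-style filter list."""
    for entry in filter_spec.split(","):
        # Drop the optional arguments that follow the identifier after ';'.
        ident = entry.split(";", 1)[0].strip()
        excluded = ident.startswith("!")
        if excluded:
            ident = ident[1:]
        if ident == filter_id or ident == "*":
            # First match wins: later entries are never consulted.
            return not excluded
    # Assumption: an identifier matched by no entry is not used.
    return False
```

For example, `filter_accepts("!jpeg,*", "jpeg")` is false because the exclusion is reached first, whereas `filter_accepts("*,!doc", "doc")` is true because `*` matches before `!doc` is ever examined.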

RemoteHTTPTransformer

  • com.exalead.indexing.analysis.v10.RemoteHTTPTransformer
  • The processor posts part bytes to the remote HTTP service, and gets the typed resource as a result. The remote service may return a Document.MIME_V10 document, or any other document that can later be processed in the pipeline. If the remote service returns a non "OK" HTTP status (!= 200 error code), the corresponding error is passed as a regular error. The service may also advertise a filename, using the standard Content-Disposition's 'filename' attribute.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    remoteUrl string Remote URL
    timeoutMs int Remote processor timeout, in milliseconds.
    httpIdleTimeoutMs int Cached HTTP connection idle timeout. This is an advanced setting. For efficiency, the RemoteHTTPTransformer maintains a pool of open connections to the remote HTTP service. This defines the timeout for connections which are no longer used. Default is 10000.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    argMapping com.exalead.indexing.analysis.v10.RemoteHTTPTransformerRemoteArgMapping* Argument(s) mapping, if any. See RemoteHTTPTransformerRemoteArgMapping.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

RemoteHTTPTransformerRemoteArgMapping

  • com.exalead.indexing.analysis.v10.RemoteHTTPTransformerRemoteArgMapping
  • Argument mapping for the RemoteHTTPTransformer.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as argMapping)
  • Attributes:
    Name Type Default value Description
    key string URL key to map. This key name will be used as remote HTTP argument name.
    value string Value to use. If null, the defaultValue value will be used. The following value names are reserved:
    • $docname: the document name or URI
    • $msg.uri: see com.exalead.mercury.papi.PAPIMessage
    • $msg.source: see com.exalead.mercury.papi.PAPIMessage
    • $part.name: see com.exalead.indexing.DocPart
    • $part.filename: see com.exalead.indexing.DocPart
    • $part.encoding: see com.exalead.indexing.DocPart
    • $part.forcedMime: see com.exalead.indexing.DocPart
    • $part.mimeHint: see com.exalead.indexing.DocPart
    • $part.mime: see com.exalead.indexing.DocPart
    • $part.encodingToUse: see com.exalead.indexing.DocPart
    • $part.bytes.length: see com.exalead.indexing.DocPart
    • $part.customDirectives.*: see com.exalead.indexing.DocPart
    • $$$foo: escaping for $foo
    defaultValue string Value to use if value is null. If defaultValue is also null, the empty string will be used.
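
The resolution order of 'value' and 'defaultValue' can be sketched in Python as below. The function name and the `reserved` dictionary (standing in for the document and part properties) are hypothetical. Falling back to 'defaultValue' when a reserved name has no value is an assumption, since the description only states the fallback for a null value.

```python
def resolve_arg(value, default_value, reserved):
    """Resolve one argument mapping to the string sent as the HTTP argument."""
    if value is not None and value.startswith("$$$"):
        # "$$$foo" is the escaped form of the literal "$foo".
        resolved = value[2:]
    elif value is not None and value.startswith("$"):
        # Reserved name such as "$part.filename"; missing names resolve to None here.
        resolved = reserved.get(value)
    else:
        resolved = value
    if resolved is None:
        resolved = default_value
    # Last resort: the empty string.
    return "" if resolved is None else resolved
```

For instance, with `reserved = {"$part.filename": "a.pdf"}`, a mapping whose value is `"$part.filename"` resolves to `"a.pdf"`, and a mapping with neither a value nor a default resolves to the empty string.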

StandardPartsMerger

  • com.exalead.indexing.analysis.v10.StandardPartsMerger
  • This processor does nothing if there are no DocumentParts (only root DocumentChunks). This processor needs one DocumentPart called the 'Master Part'. If there is only one part, this part is the 'Master Part'. If there are multiple parts, the part named after the 'masterPart' attribute is the 'Master Part'.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    masterPart string Name of the master part. This name should be "master" to follow the convention used by connectors that send documents composed of multiple parts (e.g. mails with attachments).
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    partSpecificContexts exa.bee.StringValue* The ContextNames of the DocumentChunks from the non-master parts that should be copied to the root document.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
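
The selection of the 'Master Part' described above can be summarized with a small Python sketch. The function name and the representation of parts as a list of names are hypothetical, and returning nothing when no part carries the configured name is an assumption.

```python
def select_master_part(parts, master_part="master"):
    """Pick the 'Master Part' from a list of part names, or None if there is nothing to do."""
    if not parts:
        # Only root DocumentChunks: the processor does nothing.
        return None
    if len(parts) == 1:
        # A single part is always the master, whatever its name.
        return parts[0]
    # Several parts: the one named after the masterPart attribute.
    # Returning None when no part has that name is an assumption of this sketch.
    return master_part if master_part in parts else None
```

A mail with attachments sent as `["master", "attachment1"]` would therefore select `"master"`, while a single-part document selects its only part regardless of the 'masterPart' setting.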

SemanticPipeDocumentProcessor

  • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor
  • Instantiates a semantic pipe and creates chunks out of resulting annotations. It can be used to instantiate classification processors, and perform document level operations from their output.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    pipeline string Analysis pipeline on which semantic processors will be used.
    annotations string Comma-separated list of annotation names. A chunk is created for each annotation whose name is in the list.
    topLevelAnnotationsOnly boolean Considers top level annotations only. For example, results from the QueryMatcher or Fast Rules.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
    SemanticProcessor com.exalead.indexing.analysis.v10.SemanticProcessor* List of semantic processors to use

Anchorer

  • com.exalead.indexing.analysis.v10.Anchorer
  • Adds an annotation on the first and last tokens of either a processed sequence (first/last) or a range defined by an annotation a (first_a/last_a)
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    tagsToAnchor string List of comma-separated tags on which to work
    finalAnnotationOnNextToken boolean If true, sets final annotation on the token after the last token of annotation a
    finalCannotBeSepSpace boolean If true, the final token cannot be a space; the "last" annotation may then be set on the next non-blank token.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized".

CompoundWordSplitter

  • com.exalead.indexing.analysis.v10.CompoundWordSplitter
  • Annotates compound words that use CamelCase (like SearchServer) or underscores (like my_variable) to separate the root words. This allows users to search for the root words individually. Annotations generated:
    • "compound": for example, compound="search server"
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    tokenizeAnnotations boolean True Subtokenizes "SearchServer" into "Search" "Server" automatically, and keeps original annotations.
    doCamelCase boolean True Separates compound words before each capital letter. For example, the annotation for "CamelCase" is compound="camel case".
    doUnderscore boolean True Separates multi-word strings wherever there is an underscore. For example, the annotation for "under_score" is compound="under score".
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
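
The splitting behavior described by doCamelCase and doUnderscore can be illustrated with a short Python sketch. This is not the product's implementation: the function name compound_annotation and its parameters are hypothetical, and the sketch assumes that the "compound" value is the lowercased root words joined by single spaces, as in the examples above.

```python
import re


def compound_annotation(token, do_camel_case=True, do_underscore=True):
    """Return the space-separated, lowercased root words of a compound token.

    Returns None when the token is not split into at least two root words.
    Hypothetical helper: it only mimics the documented examples.
    """
    parts = [token]
    if do_underscore:
        # Split wherever there is an underscore, dropping empty pieces.
        parts = [p for part in parts for p in part.split("_") if p]
    if do_camel_case:
        # Insert a break before each capital letter that is not the first character.
        parts = [
            p
            for part in parts
            for p in re.sub(r"(?<!^)(?=[A-Z])", " ", part).split()
        ]
    if len(parts) < 2:
        return None
    return " ".join(p.lower() for p in parts)
```

With this sketch, compound_annotation("SearchServer") gives "search server" and compound_annotation("under_score") gives "under score". Because it breaks before every capital letter, a run of capitals such as "HTTPServer" is split letter by letter; whether the product treats such runs differently is not stated here.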

OntologyMatcher

  • com.exalead.indexing.analysis.v10.OntologyMatcher
  • An OntologyMatcher detects concepts defined in an ontology in the textual content of the Document Chunks. Typically, an ontology contains a list of business terms to be detected. Resulting Annotations are mapped to enable navigation by business concepts. Annotations generated:
    • Depends on the resource (See Pkg).
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    enableApproxMatching boolean Enables approximate matching in the ontology. Approximate matching uses the Damerau-Levenshtein edit distance.
    minWordSizeForDist1 int 3 Minimum number of chars in token to enable the Damerau-Levenshtein distance of 1.
    minWordSizeForDist2 int 8 Minimum number of chars in token to enable the Damerau-Levenshtein distance of 2.
    resourceDir string URL for the directory containing the ontology (data://, file:// or resource://).
    restrictLanguage boolean True Keeps only the expression added with language == Language.XX or with the document language. For example, if the Ontology contains an expression added with language=En, it will be extracted only for an English document if restrictLanguage is set to true.
    keepLongestMatch boolean True Keeps only the longest match. For example, if you have 5 tokens ('a', 'b', 'c', 'd', 'e') and 4 annotations 'a', 'a-c', 'b-c-d' and 'd-e', this option will only keep 'b-c-d' and remove all other annotations.
    keepLongestMatchInterTag boolean Keeps only the longest match (tag independent). For example, if you have 5 tokens ('a', 'b', 'c', 'd', 'e') and 4 annotations 'a', 'a-c', 'b-c-d' and 'd-e', this option will only keep 'b-c-d' and remove all other annotations.
    tokenizeAnnotations boolean If you have multi-token annotations (like a "super market" annotation on the token "supermarket"), this option automatically subtokenizes "supermarket" into "super" and "market" and keeps the original annotations. If you enable this option, keepLongestMatch and keepLongestMatchInterTag are set to true.
    annotationsToIgnore string Sets the list of annotations to be ignored (comma-separated). This feature allows you to define a list of words/expressions to ignore in the recognition of this ontology. For example, if you add:
    • the expressions "of" and "the" with the tag "toIgnore" in ontology A,
    • and the expression "website embassy" in ontology B with tagsToIgnore="toIgnore",
    ... you will be able to match "website of the embassy", "website of embassy" and "website embassy".
    ignoreSpaces boolean If your ontology was compiled with matchOnSeparators=false, this allows 'lemonde' to retrieve 'le monde' or 'le monde' to retrieve 'lemonde'. If your ontology was compiled with matchOnSeparators=true, this allows 'le monde' to retrieve 'le monde'.
    annotationPrefix string A prefix to add to each annotation tag. For example, if the package of the entry matched in the ontology is "exalead.location.country" and the annotationPrefix is "myOntology_", an annotation will be added with the tag "myOntology_exalead.location.country".
    trustLevelBasedDedup boolean Keeps only the annotation with the highest trust level when several entries from a package match the same text chunk.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
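
To make the interplay of enableApproxMatching, minWordSizeForDist1 and minWordSizeForDist2 concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the matcher's actual code: it uses the optimal string alignment variant of the Damerau-Levenshtein distance, it assumes the size thresholds are inclusive and apply to the token found in the text, and the function names are hypothetical.

```python
def osa_distance(a, b):
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Counts insertions, deletions, substitutions and transpositions of
    adjacent characters.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def max_allowed_distance(token, min_size_dist1=3, min_size_dist2=8):
    """Largest edit distance tolerated for a token of this length (assumed rule)."""
    if len(token) >= min_size_dist2:
        return 2
    if len(token) >= min_size_dist1:
        return 1
    return 0


def approx_match(token, entry, min_size_dist1=3, min_size_dist2=8):
    """True if the token is close enough to the ontology entry."""
    allowed = max_allowed_distance(token, min_size_dist1, min_size_dist2)
    return osa_distance(token, entry) <= allowed
```

Under these assumptions, the 7-character token "embasys" matches the entry "embassy" (one transposition), while a 2-character token only matches exactly.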

NamedEntitiesMatcher

  • com.exalead.indexing.analysis.v10.NamedEntitiesMatcher
  • The Named Entities Matcher detects named entities such as people, organizations, or places, in the textual content of the document. It generates annotations like NE.person or NE.organization, using ontology-based matching and/or rule-based matching.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    resourceDir string URL for the resource (data://, file:// or resource://).
    rules string ne Defines which entities will be extracted:
    • The default value, ne triggers the extraction of people, organizations, locations and events.
    • The value ne-all triggers the extraction of all types of entities.
    prefix string NE Prefix to add in front of each annotation generated by the named entity matcher.
    language string Languages for which the processor is activated; if no language is specified, the processor is activated for all languages.
    partOfSpeechFiltering boolean True Discards annotations for parts of text made of a name followed by a verb or an adverb with the first letter in uppercase. This filter is useful if your documents contain a lot of titles with several capitalized words (known as 'Title Case'). It applies to NE.person, NE.place and NE.organization.
    useKnownWordsForDisambiguisation boolean True Uses a resource of known words to disambiguate named entity candidates. Works only for English and French.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor

Classifier

  • com.exalead.indexing.analysis.v10.Classifier
  • A Classifier classifies a whole document according to the existing annotations on selected Document Chunks. The annotations are matched against a learning resource.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    resourceDir string URL for the vocabulary resource (data://, file:// or resource://).
    annotationName string Name of the annotation to add.
    language iso code Language for which the vocabulary classifier is activated.
    excludedLanguages string Comma-separated list of languages for which the vocabulary classifier is deactivated (works only if language=xx).
    addAnnotationsOnKeywords boolean If true, it adds annotations to all matching tokens.
    maxAnnotations int -1 Maximum number of annotations per document.
    minTrustLevel int The minimum trust level of categories to keep.
    maxKeywords int -1 The maximum number of keywords to keep.
    minKeywords int 1 The minimum number of keywords per class.
    collapseToken boolean If true, all identical tokens are collapsed.
    extraPrefixAnnotations string The optional list of prefix annotations to keep (comma-separated).
    extraAnnotationsMinTrustLevel int 100 The minimum trust level to keep an extra annotation.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor

HierarchicalClassifier

  • com.exalead.indexing.analysis.v10.HierarchicalClassifier
  • A Classifier classifies a whole document according to the existing annotations on selected Document Chunks. The annotations are matched against a learning resource.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    resourceDir string URL for the vocabulary resource (data://, file:// or resource://).
    annotationName string Name of the annotation to add.
    language iso code Language for which the vocabulary classifier is activated.
    excludedLanguages string Comma-separated list of languages for which the vocabulary classifier is deactivated (works only if language=xx).
    addAnnotationsOnKeywords boolean If true, it adds annotations to all matching tokens.
    maxAnnotations int -1 Maximum number of annotations per document.
    minTrustLevel int The minimum trust level of categories to keep.
    maxKeywords int -1 The maximum number of keywords to keep.
    minKeywords int 1 The minimum number of keywords per class.
    collapseToken boolean If true, all identical tokens are collapsed.
    extraPrefixAnnotations string The optional list of prefix annotations to keep (comma-separated).
    extraAnnotationsMinTrustLevel int 100 The minimum trust level to keep an extra annotation.
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor

RulesMatcher

  • com.exalead.indexing.analysis.v10.RulesMatcher
  • A RulesMatcher applies a rule engine to the textual content of the DocumentChunks. The rules are defined in a separate XML 'resourceFile' and combine regular expressions, word matching and boolean operators over content. Annotations generated:
    • The matching rule defined in the XML specifies the annotation to generate
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    resourceFile string URL for the resource (data://, file:// or resource://).
    language iso code Language for which this processor is activated.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor

RelatedTerms

  • com.exalead.indexing.analysis.v10.RelatedTerms
  • Extracts all possible related terms. Only one instance of this processor may exist per input context. Annotations generated:
    • "relatedTerm": RelatedTerm identifier (stored in the dictionary and in the index)
    • "relatedTermDisplay": display form of the RelatedTerm (stored in the dictionary)
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    relatedTermsMinSpan int 3 Minimum number of words (excluding stop words) in an automatically extracted term (not applicable to whitelist).
    relatedTermsMaxSpan int 6 Maximum number of words (excluding stop words) in an automatically extracted term (not applicable to whitelist).
    maxRelatedTermsPerDoc int 64 The maximum number of related terms per document.
    keepLongestMatch boolean True Keeps only the longest term when several overlap. For example, if you have 5 tokens ('a', 'b', 'c', 'd', 'e') and 4 related terms 'a', 'a-c', 'b-c-d' and 'd-e', this option will only keep 'b-c-d' and remove all other related terms.
    dictionaryName string Name of the dictionary populated by terms extracted by this processor. If null, use the default dictionary.
    preprocResourceDir string URL for the resource of the related terms preprocessor (data://, file:// or resource://). If null, the standard preprocessor of the product is used.
    whitelistResource string Path to a related terms whitelist resource.
    blacklistResource string Path to a related terms blacklist resource.
    withPartOfSpeech boolean True Adds a PartOfSpeechTagger to the list of processors automatically. Improves quality of automatically extracted terms.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor

PartOfSpeechTagger

  • com.exalead.indexing.analysis.v10.PartOfSpeechTagger
  • A PartOfSpeechTagger detects the part of speech for each word in the text of Document Chunks. It improves the quality of other processors, such as the named entity detector or the sentiment analyzer. Annotations generated:
    • "tagger"
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    resourceDir string URL for the resource (data://, file:// or resource://).
    language string Languages for which the processor is activated; if no language is specified, the processor is activated for all languages.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor

Phonetizer

  • com.exalead.indexing.analysis.v10.Phonetizer
  • Creates a phonetic form for each word. This processor is used:
    • as a helper for other processors (like Ontology Matcher, or Semantic Extractor), which need to perform phonetic matches.
    • to perform search-time phonetic analysis using the Phonetic expansion module (this creates the dictionary of phonetic forms that will be used by the expansion module at search-time).
    • to greatly improve the quality of spell checking.
    Annotations generated:
    • "phonetic"
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    resourceFile string URL for the resource (data://, file:// or resource://).
    language string Languages for which the processor is activated; if no language is specified, the processor is activated for all languages.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor

NGramsExtractor

  • com.exalead.indexing.analysis.v10.NGramsExtractor
  • Extracts normalized word-grams. N-grams are useful for spell checking and statistical processing. Annotations generated:
    • "ngram"
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    min int 2 Minimum ngram size
    max int 3 Maximum ngram size
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
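
The effect of the min and max attributes can be pictured with the following Python sketch of word n-gram extraction. It is a simplified illustration, not the processor's code: the function name word_ngrams is hypothetical, the input is assumed to be a list of already normalized tokens, and each n-gram is represented as its tokens joined by a space.

```python
def word_ngrams(tokens, min_n=2, max_n=3):
    """Return all word n-grams with min_n <= n <= max_n, shortest sizes first."""
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of n tokens over the token list.
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams
```

With the default values (min=2, max=3), the tokens "the", "quick", "brown", "fox" yield three 2-grams ("the quick", "quick brown", "brown fox") and two 3-grams ("the quick brown", "quick brown fox").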

Lemmatizer

  • com.exalead.indexing.analysis.v10.Lemmatizer
  • Creates a lemmatized form for each word (nouns and adjectives only). This processor is mostly used as a helper for other processors (like Ontology Matcher, or Semantic Extractor), which need to perform lemmatized matches. Annotations generated:
    • "lemma": normalized lemmatized form of the word (singular/masculine)
    • "lemma_lowercase": lemmatized form of the word (singular/masculine)
    • "fsingular": normalized singular form of the word
    • "fsingular_lowercase": singular form of the word
    • "masculine": if the token is a masculine word
    • "feminine": if the token is a feminine word
    • "neuter": if the token is neuter
    • "singular": if the word is singular
    • "plural": if the word is plural
    • "unnumbered": if the word is unnumbered
    • "pos": the static Part of Speech
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    resourceDir string URL for the resource (data://, file:// or resource://).
    language string Languages for which the processor is activated; if no language is specified, the processor is activated for all languages.
    lemmatizeNormalizedAnnotations boolean
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor

AcronymDetector

  • com.exalead.indexing.analysis.v10.AcronymDetector
  • Detects acronyms like 'o.n.u' and extracts 'onu'. '.', '-' and ' ' are the standard acronym separators. Custom alphanumeric separators can be added with the "separators" attribute. Annotations generated:
    • "acronym"
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    addNormalizerAnnotation boolean
    separators string Comma-separated list of allowed separators (alphanumeric only; for example, 'and' to handle '1 and 1').
    language string Languages for which the processor is activated; if no language is specified, the processor is activated for all languages.
    strict boolean True In strict mode, the only separator is dot.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
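
As a rough illustration of the detection described above, the Python sketch below finds sequences of single alphanumeric characters joined by separators and strips the separators. It is a sketch under assumptions, not the detector's actual logic: the function name find_acronyms is hypothetical, custom separators from the "separators" attribute are not modeled, and the regular expression is only one plausible reading of the rule.

```python
import re

# Standard separators named in the description: dot, hyphen and space.
_STANDARD_SEPARATORS = ".- "


def find_acronyms(text, strict=True):
    """Return the acronyms found in text, with separators removed.

    In strict mode the only accepted separator is the dot.
    """
    seps = "." if strict else _STANDARD_SEPARATORS
    sep_class = "[" + re.escape(seps) + "]"
    # A single alphanumeric character, then one or more (separator, character)
    # pairs, not glued to a longer word on either side.
    pattern = re.compile(
        r"(?<![A-Za-z0-9])[A-Za-z0-9](?:" + sep_class + r"[A-Za-z0-9])+(?![A-Za-z0-9])"
    )
    strip = re.compile(sep_class)
    return [strip.sub("", m.group()) for m in pattern.finditer(text)]
```

For example, find_acronyms("the o.n.u met") returns ["onu"]. With strict=False, "u-s-a" is also reduced to "usa"; note that in this sketch any run of single letters separated by spaces would then be treated as an acronym as well.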

Normalizer

  • com.exalead.indexing.analysis.v10.Normalizer
  • Normalizes all tags listed in the inputTags attribute. Annotations generated:
    • "NORMALIZE"
    • "LOWERCASE"
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    inputTags string Comma-separated list of the tags to normalize.
    trustLevel int 100
    transliteration boolean True When normalizing, converts some characters to their Latin equivalents.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized".
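
As a rough illustration of what lowercasing plus transliteration can look like, the Python sketch below lowercases a token and strips diacritics through Unicode decomposition. This is only an approximation under stated assumptions: the function name normalize_token is hypothetical, and the Normalizer's actual transliteration rules are not described in this reference and may cover more than accented Latin letters.

```python
import unicodedata


def normalize_token(token: str, transliteration: bool = True) -> str:
    """Lowercase a token and, optionally, map accented letters to base Latin letters.

    Hypothetical helper: it approximates transliteration by removing
    combining marks, which is an assumption, not the product's algorithm.
    """
    lowered = token.lower()
    if not transliteration:
        return lowered
    # NFKD splits "é" into "e" plus a combining accent; dropping the
    # combining marks leaves the base letter only.
    decomposed = unicodedata.normalize("NFKD", lowered)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

With transliteration disabled, the sketch only lowercases, which mirrors the way the 'transliteration' attribute is described as an optional step of normalization.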

FarTextAnnotator

  • com.exalead.indexing.analysis.v10.FarTextAnnotator
  • A FarTextAnnotator annotates alphanumeric tokens with 'annotation' if they are located beyond 'startOffset'.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    startOffset int 8192
    annotation string fartext
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized".
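
The behavior described for FarTextAnnotator can be pictured with the minimal Python sketch below. The Token class and the annotate_far_text function are hypothetical names, and the sketch assumes that 'startOffset' is a character offset within the chunk and that the comparison is strict; neither point is stated in this reference.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    text: str
    offset: int  # assumed: character offset of the token within its chunk
    annotations: list = field(default_factory=list)


def annotate_far_text(tokens, start_offset=8192, annotation="fartext"):
    """Tag alphanumeric tokens located beyond start_offset (defaults taken from the table)."""
    for tok in tokens:
        # Punctuation and separators are skipped: only alphanumeric tokens qualify.
        if tok.text.isalnum() and tok.offset > start_offset:
            tok.annotations.append(annotation)
    return tokens
```

A mapping could then treat tokens carrying the 'fartext' annotation differently from tokens near the beginning of the text.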

Chunker

  • com.exalead.indexing.analysis.v10.Chunker
  • A chunker detects noun groups and other syntactic groups. Annotations generated:
    • "gadv": adverbial group
    • "gadj": adjectival group
    • "gnoun": noun group
    • "gverb": verbal group
    • "gprep": prepositional group
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    resourceDir string URL for the resource (data://, file:// or resource://).
    language string Languages for which the processor is activated; if no language is specified, the processor is activated for all languages.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized".

SentimentAnalyzer

  • com.exalead.indexing.analysis.v10.SentimentAnalyzer
  • Analyzes the nouns and adjectives present in the text. It detects topics and annotates the document with:
    • a global rating of good, bad or neutral
    • a rating per topic
    • the adjective(s) used in the document
    Requires: Tokenizer, Lemmatizer, PartOfSpeechTagger, RelatedTermsPreprocessor, RelatedTermsExtractor, NamedEntitiesMatcher, Chunker.
    Annotations generated:
    • "sentiment": annotation on nouns with a modulated ("really", "quite", "not") appreciation
    • "document_sentiment": annotation on the document with either "good", "bad" or "neutral" and a confidence ratio
    Attribute defaults:
    • resourceDir: resource://sentiment/sentiment.bin
    • language: all supported languages
    • summarize: false
    • annotateGlobally: false
    • showPackage: false
    • packageCount: false
    • nounPackage (DEPRECATED): true
    • ignorePartOfSpeech: false
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    resourceDir string URL for the resource (data://, file:// or resource://).
    language iso code
    annotateGlobally boolean
    annotatePronouns boolean
    ignorePartOfSpeech boolean
    ignoreRelatedTerms boolean
    legacyAnnotations boolean
    notApplicableAnnotations boolean True
    normalizeTrustLevels boolean True
    nounPackage boolean True
    packageCount boolean
    showPackage boolean
    suggest boolean
    summarize boolean
    suggestOutput string
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized".

FastRulesMatcher

  • com.exalead.indexing.analysis.v10.FastRulesMatcher
  • Annotates a document using a set of XML rules, compiled for efficiency. The rules are described with the query language using the AND, OR and NOT operators, as well as 'context' matching operators. The rules can also match whole chunks (and not just words) using regular expressions. Annotations generated:
    • Depending on the resources (See FastRulesDefinition)
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    resourceDir string Directory containing the matcher resources. Must not be empty.
    allowsExprStartingBySeparators boolean If you have expressions starting with a separator (",", ";", "&", ...), then you must set this option to true.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized".

SnowballStemmer

  • com.exalead.indexing.analysis.v10.SnowballStemmer
  • Creates the stemmed form of each word. This uses the Snowball stemming algorithms. This processor is mostly used as a helper for other processors (like Ontology Matcher, or Semantic Extractor), which need to perform stemmed matches. Annotations generated:
    • "stem"
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized".

DebugSemanticProcessor

  • com.exalead.indexing.analysis.v10.DebugSemanticProcessor
  • Dumps all annotated tokens in the specified format to standard output (the log of the 'Analysis' process), or to 'outputFile' if it is specified.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    outputFile string
    format enum(html, xml) html Output format.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized".

SQI

  • com.exalead.indexing.analysis.v10.SQI
  • (Deprecated)
  • A SemanticProcessor applies semantic processing on the textual content of the DocumentChunks. A Semantic Processor creates SemanticAnnotations on tokens. These SemanticAnnotations can then be used in the Mapping.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    resourceDir string URL for the resource (data://, file:// or resource://)
    breakOnSentence boolean If true, there is at most one match per sentence, and no match across sentence boundaries. This option adds the SentenceFinder automatically.
    breakOnParagraph boolean True If true, there is at most one match per paragraph, and no match across paragraph boundaries.
    breakOnLine boolean If true, there is at most one match per line, and no match across line boundaries.
    matchAllRules boolean True If true, it returns the full list of matched rules. If false, it returns the first matched rule only.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized".

SemanticExtractor

  • com.exalead.indexing.analysis.v10.SemanticExtractor
  • The resource describes the features to extract, with their term, type and range for numerical values according to a set of rules. Annotations generated:
    • Depending on the resource (See SemanticExtractorConfig)
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    resourceDir string URL of the compiled semantic extractor file. Use the format data://, file:// or resource://.
    prefix string Output annotations prefix
    breakOnSentence boolean If true, there is at most one match per sentence, and no match across sentence boundaries. This option adds the SentenceFinder automatically.
    breakOnParagraph boolean True If true, there is at most one match per paragraph, and no match across paragraph boundaries.
    breakOnLine boolean If true, there is at most one match per line, and no match across line boundaries.
    matchAllRules boolean True If true, it returns the full list of matched rules. If false, it returns only the first matched rule.
    language iso code Language for which the extractor is activated. If null, all languages are activated.
    annotateUnusedTokensWith string Used in the context of query rewriting by the Semantic Query Analyzer.
    overlappingMatches boolean True If true, reports all matches even if their locations overlap. Only makes sense when matchAllRules is true.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized".

ProximityProcessor

  • com.exalead.indexing.analysis.v10.ProximityProcessor
  • A proximity processor detects and annotates pieces of text where several annotations occur within given distance constraints. Possible constraints (not mutually exclusive):
    • token window size
    • distance between annotations
    • sentence/paragraph scope
    Annotations generated:
    • Depending on the resource (See Proximity)
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    resourceFile string URL for the resource (data://, file:// or resource://)
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized".
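
To make the "distance between annotations" constraint more concrete, here is a small Python sketch that pairs up occurrences of two annotations when they are close enough. It is illustrative only: the function name pairs_within_distance is hypothetical, positions are assumed to be token indexes, and the real constraints are defined in the Proximity resource, whose format is not covered here.

```python
def pairs_within_distance(first_positions, second_positions, max_distance):
    """Return (i, j) token-index pairs whose distance is at most max_distance.

    first_positions and second_positions are the token indexes at which two
    different annotations were found (an assumed input representation).
    """
    return [
        (i, j)
        for i in first_positions
        for j in second_positions
        if abs(i - j) <= max_distance
    ]
```

Each returned pair marks a piece of text that a proximity rule of this kind could annotate; a token-window or sentence-scope constraint would add a further filter on the same pairs.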

AnnotationManager

  • com.exalead.indexing.analysis.v10.AnnotationManager
  • An annotation manager implements basic operations on annotations: copy/removal/selection according to a number of conditions like:
    • Removal of overlapping annotations
    • Selection of the most frequent annotations
    • Copy of an annotation unless blacklisted
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    resourceFile string URL for the resource (data://, file:// or resource://)
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized".

CustomSemanticProcessor

  • com.exalead.indexing.analysis.v10.CustomSemanticProcessor
  • A custom semantic processor allows you to plug in custom code in the semantic pipeline.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
  • Attributes:
    Name Type Default value Description
    name string Name of the Semantic Processor. This name is only used for tracing and debugging purposes.
    contexts string Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed.
    dataModelState string Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disables the DocumentProcessor
    classId string The specified class must implement the com.exalead.indexing.analysis.semantic.CustomSemanticProcessorInterface Exascript interface.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.SemanticProcessor If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    KeyValue exa.bee.KeyValue*

PrintfValues

  • com.exalead.indexing.analysis.v10.PrintfValues
  • Prints the textual content of DocumentChunks according to a formatting string. This string contains variables in one of the following 3 formats:
    • $(name): the name of a context. The output is the textual content of this context.
    • $/name:regexp/: the name of a context whose chunks must match the regexp. The output is the piece of text that matched.
    • $/name:regexp:format/: the name of a context whose chunks must match the regexp. The output is defined by a sed-like format referencing the regexp subexpressions.
    Warning: In the regexp and format parts, colons and slashes must be escaped with a backslash.
    For example: "$(firstname) $(lastname) : $/age:[0-9]+/ $/date:([0-9]{2})([0-9]{2})([0-9]{4}):day=\\1 month=\\2 year=\\3/"
    Warning: The contexts used in the formatting string cannot be produced by another processor. They should come from the connector.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    formattingString string This string contains variables in one of the following 3 formats: 1. $(name), the name of a context: output is the textual content of this context. 2. $/name:regexp/, the name of a context whose chunks must match the regexp: output is the piece of text that matched. 3. $/name:regexp:format/, the name of a context whose chunks must match the regexp: output is defined by a sed-like format referencing the regexp subexpressions. Warning: In the regexp and format parts, colons and slashes must be escaped with a backslash. For example: "$(firstname) $(lastname) : $/age:[0-9]+/ $/date:([0-9]{2})([0-9]{2})([0-9]{4}):day=\\1 month=\\2 year=\\3/"
    outputContext string ContextName to be associated with the DocumentChunk created for each generated value.
    strict boolean True If true, all the contexts referenced in the formatting string must be found for the value to be generated.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

RenameUnmappedContexts

  • com.exalead.indexing.analysis.v10.RenameUnmappedContexts
  • This Document Processor changes the ContextName for all DocumentChunks associated with a ContextName that does not have a Mapping Configuration. This avoids extensive renaming using RenameContext.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string The new ContextName for DocumentChunks with an unmapped ContextName.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
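
The effect of RenameUnmappedContexts can be summarized with the short Python sketch below. The function name, the (context name, text) tuple representation of chunks and the mapped_contexts set are all assumptions made for illustration; the processor itself works on DocumentChunks and on the Mapping Configuration.

```python
def rename_unmapped_contexts(chunks, mapped_contexts, output_context):
    """Return chunks with every unmapped context name replaced by output_context.

    chunks: list of (context_name, text) tuples (assumed representation).
    mapped_contexts: set of context names that have a mapping configuration.
    """
    return [
        (name if name in mapped_contexts else output_context, text)
        for name, text in chunks
    ]
```

A single catch-all rename of this kind replaces one RenameContext per unmapped context name, which is the saving the description refers to.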

NewChunk

  • com.exalead.indexing.analysis.v10.NewChunk
  • Creates a new DocumentChunk with 'outputContext' as ContextName, and textual content specified in 'value'.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string The ContextName used for newly created chunks.
    value string The value used for newly created chunks.
    partName string The part to which the chunk should belong. If nothing is specified here, the chunk will be handled as a global chunk.
    language iso code Language of the chunk, as an ISO 639 code.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

UniformRandomContextGenerator

  • com.exalead.indexing.analysis.v10.UniformRandomContextGenerator
  • Adds a new DocumentChunk for one document out of 'modulo' documents processed. The textual content of the DocumentChunk is picked out of the list specified in 'values', with a uniform distribution.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string The ContextName used for newly created chunks.
    modulo int Inverse probability of adding the new chunk. Must be a strictly positive integer.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this element contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    values exa.bee.StringValue* List of possible values.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
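
The sampling behavior described above can be sketched as follows. This is only an illustration under stated assumptions, not the product implementation: the class UniformChunkSampler and its methods are hypothetical names, and the sketch assumes that 'modulo' acts as an inverse probability applied independently to each document.

```java
import java.util.List;
import java.util.Optional;
import java.util.Random;

// Hypothetical helper for illustration only; not part of the product API.
final class UniformChunkSampler {
    private final List<String> values;
    private final int modulo;
    private final Random random;

    UniformChunkSampler(List<String> values, int modulo, Random random) {
        if (modulo <= 0) {
            throw new IllegalArgumentException("modulo must be a strictly positive integer");
        }
        if (values.isEmpty()) {
            throw new IllegalArgumentException("values must not be empty");
        }
        this.values = List.copyOf(values);
        this.modulo = modulo;
        this.random = random;
    }

    /** Returns a chunk value with probability 1/modulo, otherwise an empty Optional. */
    Optional<String> sample() {
        if (random.nextInt(modulo) != 0) {
            return Optional.empty(); // no chunk is added for this document
        }
        // Every entry of 'values' is equally likely (uniform distribution).
        return Optional.of(values.get(random.nextInt(values.size())));
    }
}
```

With modulo set to 1, every call to sample() returns a value. With larger values, roughly one call out of 'modulo' does.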

ZipfRandomContextGenerator

  • com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator
  • Adds a new DocumentChunk for one document out of 'modulo' documents processed. The textual content of the DocumentChunk is picked from the list specified in 'values', following a non-uniform discrete Zipf distribution.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    outputContext string The ContextName used for newly created chunks.
    modulo int Inverse probability of adding the new chunk. Must be a strictly positive integer.
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    parameter double The exponent characterizing the distribution.
  • Nested elements:
    Name Type Description
    values exa.bee.StringValue* List of possible values.
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this element contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
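
To make the role of the 'parameter' exponent concrete, here is a minimal sketch of a discrete Zipf pick, where the value at rank k is chosen with a probability proportional to 1 / k^parameter. The class ZipfPicker is a hypothetical name, and the sketch assumes that ranks follow the order of the 'values' list, which this reference does not state.

```java
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not part of the product API.
final class ZipfPicker {
    private final List<String> values;
    private final double[] cumulative; // cumulative[i] = P(rank <= i + 1)

    ZipfPicker(List<String> values, double exponent) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("values must not be empty");
        }
        this.values = List.copyOf(values);
        this.cumulative = new double[values.size()];
        double total = 0.0;
        for (int k = 1; k <= values.size(); k++) {
            total += 1.0 / Math.pow(k, exponent); // unnormalized Zipf weight of rank k
            cumulative[k - 1] = total;
        }
        for (int i = 0; i < cumulative.length; i++) {
            cumulative[i] /= total; // normalize so that the last entry is 1.0
        }
    }

    /** Probability of picking the value at the given zero-based index. */
    double probability(int index) {
        return index == 0 ? cumulative[0] : cumulative[index] - cumulative[index - 1];
    }

    /** Picks one value by inverting the cumulative distribution. */
    String pick(Random random) {
        double u = random.nextDouble();
        for (int i = 0; i < cumulative.length; i++) {
            if (u < cumulative[i]) {
                return values.get(i);
            }
        }
        return values.get(values.size() - 1); // guards against rounding errors
    }
}
```

With an exponent of 1 and three values, the weights are 1, 1/2, and 1/3, so the first value is picked with probability 6/11. A larger exponent concentrates the picks on the first entries.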

HTMLRelevantContentExtractor

  • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor
  • The HTMLRelevantContentExtractor extracts the most relevant parts of an HTML document. Generally, the relevant part of an HTML document is the article in the middle of the page. The header, the footer, and the menus are often the same on all pages and should not be indexed. The extraction can be tuned using the attributes below.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    relevantChunkContext string relevantcontent Relevant text chunks are copied to this context.
    newContextName string relevantcontent Deprecated, use 'relevantChunkContext'.
    irrelevantChunkContext string excludedcontent Irrelevant text chunks are copied to this context.
    retrieveFieldContext string htmlcontent Original text chunks are moved to this context.
    irrelevantChunkAnnotation string If set, the HTMLRelevantContentExtractor will annotate each irrelevant chunk with an annotation.
    minScore int 15 Internally, the HTMLRelevantContentExtractor assigns a score to each chunk of its input. Use 'minScore' to keep only the chunks whose score is greater than this value.
    minParagraphWords int 7 The minimum number of words a <p> chunk must have to be considered a paragraph and be boosted.
    minTitleWords int 3 The minimum number of words a title must have to be boosted.
    linkAllowedInTitle boolean True By default, links contained in a page title incur a score penalty. This can be disabled.
    paragraphBoost int 10 Each time a paragraph is detected, the score is increased by this value.
    maxWordInLinkRatio int 2 The maximum allowed ratio of words contained in links in a chunk of text.
    titleBoost int 5 Each time a title is detected, the score is increased by this value.
    classBoost int 10 Each time a CSS class included in 'idsAndClassesToKeep' is detected, the score is increased by this value.
    keepOnlyBestChunk boolean If true, the 'relevantcontent' context is composed only of the main article of the page.
    skipBlockquotes boolean If true, HTML blockquote tags are skipped.
    skipPre boolean If true, HTML pre tags are skipped.
    keepImages boolean If true, the HTML image annotations will be kept in the new context.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this element contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    idsAndClassesToIgnore exa.bee.StringValue* The list of CSS classes and HTML ids to ignore.
    idsAndClassesToKeep exa.bee.StringValue* The list of CSS classes and HTML ids to boost.
    annotationsToCopy exa.bee.StringValue* The list of annotations to keep in the new context.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

HTMLTableExtractor

  • com.exalead.indexing.analysis.v10.HTMLTableExtractor
  • Extracts all HTML tables with minColumnsRequired < number of columns < maxColumnsRequired and duplicates them in the context specified by 'newContextName'.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    newContextName string webtable The ContextName used for newly created chunks.
    minColumnsRequired int 2 The minimum number of columns a table must have to be extracted.
    maxColumnsRequired int 2147483647 The maximum number of columns a table can have to be extracted.
    concatenateRows boolean Concatenates all rows.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this element contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

DiscardDocument

  • com.exalead.indexing.analysis.v10.DiscardDocument
  • DEPRECATED. Discards documents from the pipeline. Note that this processor does not stop the processing of the document. To stop it, add a custom document processor with the following code:
    document.setProcessingFlag(Operation.DISCARD_AND_DELETE);
    ((AnalysisDocumentProcessingContext) context).stopProcessingAfterCurrentProcessor();
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    logDiscardedDocuments boolean If true, the URI of each discarded document is logged in the log file of each analysis process.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this element contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

ReplaceContextNames

  • com.exalead.indexing.analysis.v10.ReplaceContextNames
  • Replaces the first matching substring of context names with the given replacement. For example, inputSubstring="abc" and outputReplacement="bar" rename the context abcdef to bardef and the context somethingabcstuff to somethingbarstuff.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    inputSubstring string The piece of string to be replaced.
    outputReplacement string The replacement string.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this element contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
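
The renaming rule described above can be sketched in a few lines. This is an illustration only: ContextRenamer and replaceFirstSubstring are hypothetical names, and the sketch assumes that 'inputSubstring' is matched literally rather than as a regular expression.

```java
// Hypothetical helper for illustration only; not part of the product API.
final class ContextRenamer {
    /** Replaces the first occurrence of inputSubstring in contextName, if any. */
    static String replaceFirstSubstring(String contextName,
                                        String inputSubstring,
                                        String outputReplacement) {
        if (inputSubstring.isEmpty()) {
            return contextName; // nothing to look for
        }
        int at = contextName.indexOf(inputSubstring);
        if (at < 0) {
            return contextName; // no match: the name is left untouched
        }
        return contextName.substring(0, at)
                + outputReplacement
                + contextName.substring(at + inputSubstring.length());
    }
}
```

Only the first occurrence is replaced, so under these assumptions a context named abcabc would become barabc.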

HTMLCSSSelector

  • com.exalead.indexing.analysis.v10.HTMLCSSSelector
  • Deletes all text chunks that are not annotated with a class or an id specified in 'classes' or 'ids'.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this element contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
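    classes exa.bee.StringValue* List of classes used to determine whether a chunk must be kept.
    ids exa.bee.StringValue* List of ids used to determine whether a chunk must be kept.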
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

HTMLCSSExtractor

  • com.exalead.indexing.analysis.v10.HTMLCSSExtractor
  • Extracts all text chunks annotated with a class or an id specified in 'classes' or 'ids', and duplicates them in the context specified by 'outputContext'.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string ContextName to be associated with the DocumentChunk created for each new context.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this element contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    classes exa.bee.StringValue* List of classes used to determine whether a chunk must be duplicated.
    ids exa.bee.StringValue* List of ids used to determine whether a chunk must be duplicated.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

DataModelClassResolver

  • com.exalead.indexing.analysis.v10.DataModelClassResolver
  • This processor uses the value of the "datamodel_class" PAPI directive to determine the DataModelClass of the document. If this directive is not found, the document is assumed to belong to the default class. If the class is not the default class, all metas corresponding to an existing DataModelProperty are prefixed with the type of the class that declares the property (which may be a superclass of the document's class). In processors that follow this processor in the pipeline, refer to a Data Model property by prefixing it with its class name. In processors that precede it, use the meta name only (without prefix).
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this element contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

SetDefaultValue

  • com.exalead.indexing.analysis.v10.SetDefaultValue
  • This processor looks for the specified contexts. If a context is not present in the document, it is created with the configured value.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this element contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
    KeyValue exa.bee.KeyValue*
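
The "create only if absent" behavior can be sketched as follows. This is an illustration under assumptions, not the product implementation: the class DefaultValueSketch is hypothetical, the document is modeled as a plain map from context name to chunk values, and each KeyValue pair is assumed to map a context name (key) to its default value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the product API.
final class DefaultValueSketch {
    /**
     * For every (context, default value) pair, creates the context with the
     * default value unless the document already has a chunk in that context.
     */
    static void applyDefaults(Map<String, List<String>> chunksByContext,
                              Map<String, String> defaults) {
        for (Map.Entry<String, String> entry : defaults.entrySet()) {
            List<String> existing = chunksByContext.get(entry.getKey());
            if (existing == null || existing.isEmpty()) {
                List<String> created = new ArrayList<>();
                created.add(entry.getValue());
                chunksByContext.put(entry.getKey(), created);
            }
            // Contexts that are already present are left untouched.
        }
    }
}
```

A document that already carries a title keeps it, while a missing author context is created with the configured default.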

CustomDocumentProcessor

  • com.exalead.indexing.analysis.v10.CustomDocumentProcessor
  • A custom document processor allows you to plug custom code, packaged as a CVPlugin, into the document processing pipeline.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    classId string Class identifier. The specified class must implement the com.exalead.pdoc.analysis.CustomDocumentProcessor Java Interface.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this element contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
    KeyValue exa.bee.KeyValue*

InferFileExtension

  • com.exalead.indexing.analysis.v10.InferFileExtension
  • When the file_extension meta is not present, finds the file extension based on the file name or the mime meta (if one of these two is present).
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this element contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
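
As a rough illustration of this kind of inference, the sketch below derives an extension from the file name first and falls back to the MIME type. Everything in it is an assumption: the class FileExtensionSketch is hypothetical, the order of precedence between file name and MIME type is not stated in this reference, and the MIME table is a tiny made-up sample.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration only; not part of the product API.
final class FileExtensionSketch {
    // Small illustrative table; a real mapping would be much larger.
    private static final Map<String, String> MIME_TO_EXTENSION = Map.of(
            "text/html", "html",
            "application/xml", "xml",
            "application/pdf", "pdf");

    /** Infers an extension from the file name, then from the MIME type. */
    static Optional<String> infer(String fileName, String mime) {
        if (fileName != null) {
            int dot = fileName.lastIndexOf('.');
            // Ignore names without a dot, leading dots, and trailing dots.
            if (dot > 0 && dot < fileName.length() - 1) {
                return Optional.of(fileName.substring(dot + 1).toLowerCase(Locale.ROOT));
            }
        }
        if (mime != null) {
            return Optional.ofNullable(MIME_TO_EXTENSION.get(mime.toLowerCase(Locale.ROOT)));
        }
        return Optional.empty(); // neither source is usable
    }
}
```

When neither the file name nor the MIME type yields an extension, the sketch returns an empty result, which corresponds to leaving the file_extension meta unset.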

InsertCurrentDate

  • com.exalead.indexing.analysis.v10.InsertCurrentDate
  • Adds the current date to an output context.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    outputContext string The ContextName used for newly created chunks.
    format string Either "unixts" or a SimpleDateFormat specification
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this element contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
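
The two kinds of 'format' values can be illustrated with the standard java.text.SimpleDateFormat class. The sketch below is not the product implementation: CurrentDateSketch is a hypothetical name, and it assumes that "unixts" means a Unix timestamp expressed in seconds since the epoch, which this reference does not specify.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper for illustration only; not part of the product API.
final class CurrentDateSketch {
    /** Formats 'now' either as a Unix timestamp or with a SimpleDateFormat pattern. */
    static String format(Date now, String format) {
        if ("unixts".equals(format)) {
            // Assumption: seconds (not milliseconds) since 1970-01-01T00:00:00Z.
            return Long.toString(now.getTime() / 1000L);
        }
        // Any other value is treated as a SimpleDateFormat pattern,
        // for example "yyyy-MM-dd" or "yyyy/MM/dd HH:mm:ss".
        return new SimpleDateFormat(format).format(now);
    }
}
```

Calling CurrentDateSketch.format(new Date(), "yyyy-MM-dd") yields the current day in the default time zone of the JVM. An invalid pattern makes SimpleDateFormat throw an IllegalArgumentException.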

XpathExtractor

  • com.exalead.indexing.analysis.v10.XpathExtractor
  • Extraction is performed for the following data types:
    • text/html. HTML Files.
    • application/xml. XML Files.
    Warning: This processor must be placed before the NativeTextExtractor, because the NativeTextExtractor deletes the 'bytes' of each Document Binary Part. Limitations: This extractor handles XPath expressions that return node sets or strings, not numbers or booleans. You can use number or boolean functions inside an XPath expression, for example //img[starts-with(@src, "http://")], because this expression returns a set of nodes (<img>). However, count(//img) does not work because it returns a number.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    htmlParserToUse enum(htmlCleaner, tagSoup) htmlCleaner HTML parser to use preferentially.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", this element contains the original document processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
    XpathRule com.exalead.indexing.analysis.v10.XpathRule*
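
The difference between expressions that return a node set and expressions that return a number can be reproduced with the standard javax.xml.xpath API. The sketch below only illustrates XPath result types. It says nothing about how the XpathExtractor is implemented, and XpathNodeSetSketch is a hypothetical name.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of the product API.
final class XpathNodeSetSketch {
    /** Evaluates the expression as a node set and returns the number of matched nodes. */
    static int countMatches(String xml, String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            // Raised for invalid expressions, and expected for expressions whose
            // result is not a node set, such as count(//img).
            throw new IllegalArgumentException("Cannot evaluate as a node set: " + expression, e);
        }
    }
}
```

An expression such as //img[starts-with(@src, 'http://')] uses a boolean function inside a predicate but still selects nodes, so it fits this model. An expression such as count(//img) produces a number and does not.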

XpathRule

  • com.exalead.indexing.analysis.v10.XpathRule
  • No documentation for this element.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
  • Attributes:
    Name Type Default value Description
    metaName string
    xpath string
    concatMutiMatch boolean True Concatenates all results into a single value when the XPath expression returns several results. Otherwise, each match is added as a separate value of a multivalued meta. Set it to false if you want each node returned by the XPath expression in a separate value (for example, for a list of items).
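
The effect of the concatenation flag can be sketched with the standard javax.xml.xpath API. This is an illustration under assumptions, not the product implementation: XpathRuleSketch is a hypothetical name, and the single space used as separator between concatenated matches is a guess.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of the product API.
final class XpathRuleSketch {
    /** Returns one concatenated value, or one value per matched node. */
    static List<String> extract(String xml, String expression, boolean concatMultiMatch) {
        NodeList nodes;
        try {
            nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate: " + expression, e);
        }
        List<String> matches = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            matches.add(nodes.item(i).getTextContent());
        }
        if (concatMultiMatch && !matches.isEmpty()) {
            // Assumption: matches are joined with a single space.
            return List.of(String.join(" ", matches));
        }
        return matches; // one value per match, as in a multivalued meta
    }
}
```

For a list of items matched by //li, the concatenated mode produces a single value, while the other mode keeps one value per item.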

XpathFragmentExtractor

  • com.exalead.indexing.analysis.v10.XpathFragmentExtractor
  • Input: All DocumentChunks associated with the specified 'inputContext' ContextNames. The input can be an XML or HTML fragment. Output: DocumentChunks are created for each Xpath Fragment Rule. Each DocumentChunk is associated with the specified 'Meta name' ContextName. Warning: Place this processor before the NativeTextExtractor, because the NativeTextExtractor deletes the 'bytes' of each Document Binary Part.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    inputFragmentMeta string
    parserToUse enum(htmlCleaner, tagSoup, xmlParser) xmlParser Preferred parser.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
    XpathFragmentRule com.exalead.indexing.analysis.v10.XpathFragmentRule*

XpathFragmentRule

  • com.exalead.indexing.analysis.v10.XpathFragmentRule
  • No documentation for this element.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
  • Attributes:
    Name Type Default value Description
    metaName string
    xpath string

SimilarStringToPart

  • com.exalead.indexing.analysis.v10.SimilarStringToPart
  • Converts signatures stored in string format in a meta into a binary part.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    version int 1 Specifies the version.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    values exa.bee.StringValue* List of the names of the metas to parse and to transform to part.
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

DocumentProcessorGroup

  • com.exalead.indexing.analysis.v10.DocumentProcessorGroup
  • Contains a list of document processors that are executed only if the condition of this group matches. This avoids duplicating the condition on each processor, or creating distinct pipelines, when several processors share the same condition.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.
    DocumentProcessor com.exalead.indexing.analysis.v10.DocumentProcessor*
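
The grouping idea can be sketched in Python as follows. The classes `Processor` and `ProcessorGroup`, the dictionary-based document, and the `mime` key are hypothetical names chosen for illustration; they do not reflect the product's internal API. The sketch only shows how a shared condition is written once on the group instead of once per processor.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Document = Dict[str, str]


@dataclass
class Processor:
    name: str
    action: Callable[[Document], None]
    # Per-processor condition; accepts every document by default.
    accept: Callable[[Document], bool] = lambda doc: True

    def process(self, doc: Document) -> None:
        if self.accept(doc):
            self.action(doc)


@dataclass
class ProcessorGroup:
    name: str
    accept: Callable[[Document], bool]
    processors: List[Processor] = field(default_factory=list)

    def process(self, doc: Document) -> None:
        # The group condition is evaluated once for all nested processors.
        if not self.accept(doc):
            return
        for processor in self.processors:
            processor.process(doc)


def is_html(doc: Document) -> bool:
    return doc.get("mime") == "text/html"


group = ProcessorGroup("html-only", accept=is_html, processors=[
    Processor("tag", lambda d: d.update(tagged="yes")),
    Processor("lower-title", lambda d: d.update(title=d.get("title", "").lower())),
])

html_doc = {"mime": "text/html", "title": "Hello"}
pdf_doc = {"mime": "application/pdf", "title": "Hello"}
group.process(html_doc)
group.process(pdf_doc)
print(html_doc)
print(pdf_doc)
```

Only the HTML document is modified; the PDF document is rejected by the group condition and none of the nested processors run on it.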

UnitsOfMeasurementNormalizer

  • com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer
  • Detects and converts units of measurement.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    indexField string The index field in which the value will be stored.
    indexFieldUnitSymbol string The output unit symbol
    suffixName string _um Output suffix to create a new meta as output
    removeContext boolean Remove contexts after processing
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

DebugCrashProcessor

  • com.exalead.indexing.analysis.v10.DebugCrashProcessor
  • Causes crashes for debugging purposes.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    type string exception The crash type {@code enum(noop,exception,oom,infiniteloop,nullptr,abort,assert,segv,intdiv)}
    delay int Trigger delay in seconds.
    count int 3 Trigger document count.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

PLMExpandDocumentProcessor

  • com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor
  • Processes PLM metas to generate octrees and matrices for PLMExpand.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    metaMatrix string matrix Name of the meta containing the matrix data.
    fieldMatrix string matrix Name of the target matrix field.
    fieldInvMatrix string invmatrix Name of the target inverse matrix field.
    metaCGR string cgr Name of the meta containing the CGRs.
    fieldOctree string octree Name of target octree field.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

CGRDocumentProcessor

  • com.exalead.indexing.analysis.v10.CGRDocumentProcessor
  • Calls convert to generate octrees.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
    • com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
  • Attributes:
    Name Type Default value Description
    name string Name of this processor. The name of a processor is used only for tracing and debugging purposes.
    dataModelState string Is this document processor managed by a data model? @enum{null,auto,customized, error}.
    • If null, this document processor is not related to a data model.
    • If "auto", this document processor is auto-generated by a data model.
    • If "customized", this document processor was auto-generated by a data model and then customized.
    • If "error", there is a conflict between this document processor and the data model.
    dataModelClass string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor.
    dataModelProperty string If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor.
    disabled boolean Disable the DocumentProcessor
    partCGR string CGR Name of the part containing the CGR data (tessellation).
    partOctree string octree Name of the part used to store the resulting octree.
    docIdentifyer string majorid Name of the meta identifying the document.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.DocumentProcessor If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". @IgnoreForValueConstructor
    AcceptCondition com.exalead.indexing.analysis.v10.AcceptCondition Expresses the enablement condition of this DocumentProcessor.

FilteringConfiguration

  • com.exalead.indexing.analysis.v10.FilteringConfiguration
  • Filters applied to the words extracted by the semantic processors. Words that do not satisfy these conditions are not indexed. Filter values are expressed as numbers of Unicode characters.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
  • Attributes:
    Name Type Default value Description
    wordMaxLength int 100 Maximum length of a word.
    hexCharMax int Maximum number of hexadecimal characters that can appear in a word. This filter applies only to words longer than 'hexLengthMin'. 0 = no filter (default value)
    hexLengthMin int Minimum number of characters in a word for the hexadecimal filter to apply. 0 = no filter (default value)
    maxNumChars int Maximum number of characters in a word. 0 = no filter (default value)
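
A minimal Python sketch of how the length and hexadecimal filters could combine is given below. It is an interpretation of the attribute descriptions, not the actual implementation: the function name `keep_word` is hypothetical, the hexadecimal filter is assumed to be disabled when either hexCharMax or hexLengthMin is 0, and maxNumChars is left out because its exact semantics are not detailed here.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def keep_word(word, word_max_length=100, hex_char_max=0, hex_length_min=0):
    """Return True if the word passes the filters and may be indexed."""
    # len() on a Python str counts Unicode code points, which matches
    # "expressed as numbers of Unicode characters".
    if len(word) > word_max_length:
        return False
    # Assumption: the hexadecimal filter is off when either setting is 0.
    hex_filter_enabled = hex_char_max > 0 and hex_length_min > 0
    if hex_filter_enabled and len(word) > hex_length_min:
        hex_count = sum(1 for ch in word if ch in HEX_DIGITS)
        if hex_count > hex_char_max:
            return False
    return True


print(keep_word("deadbeefcafe1234", hex_char_max=8, hex_length_min=10))
print(keep_word("decade", hex_char_max=8, hex_length_min=10))
print(keep_word("a" * 101))
```

The first word is rejected because it is longer than 10 characters and contains more than 8 hexadecimal characters; the second is too short for the hexadecimal filter to apply; the third exceeds the default maximum word length.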

LanguageConfiguration

  • com.exalead.indexing.analysis.v10.LanguageConfiguration
  • Configuration of the linguistic extraction for a given language.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
  • Attributes:
    Name Type Default value Description
    language iso code ISO code of the language.
    generateWordDict boolean Extracts words for the global dictionary.
    wordDictModulo int 1 Word extraction modulo, by default extract all words.
    maxWordDictWordsPerDocument long -1 Maximum number of words extracted per document.
    maxExtractedWordLength int 64 Maximum length of a word for it to be extracted.
    spellCheckNGramMaxSize int 3 Maximum number of consecutive words for spellchecking. If the value is set to '-1', spellcheck data is not generated for this language. 0 and 1 values are illegal, default is 3.
    spellCheckNGramsDictModulo int 5 NGrams extraction modulo. It extracts 1 ngram out of 5 by default.
    maxSpellCheckNGramsPerDocument long -1 Maximum number of ngrams extracted per document.
    maxExtractedSpellCheckNGramLength int 256 Maximum length of an ngram for it to be extracted.
    relatedTermsDictModulo int 1 Submits 1 out of X documents for related terms generation. If the value is set to 0, related terms are not generated for this language.
    maxRelatedTermsDictContextsPerDocument long -1 Maximum number of related terms extracted per document.
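
The n-gram size limit and the "1 out of X" modulo settings can be pictured with the following Python sketch. It rests on several assumptions that the reference does not confirm: n-grams are taken as runs of 2 to spellCheckNGramMaxSize consecutive words, and modulo sampling keeps the items whose position is a multiple of the modulo. The function names are hypothetical.

```python
def word_ngrams(words, max_size):
    """Return every run of 2..max_size consecutive words, joined by spaces.

    Assumption: n-grams start at size 2, since the reference states that
    0 and 1 are illegal values for spellCheckNGramMaxSize.
    """
    ngrams = []
    for start in range(len(words)):
        for size in range(2, max_size + 1):
            if start + size <= len(words):
                ngrams.append(" ".join(words[start:start + size]))
    return ngrams


def sample_by_modulo(items, modulo):
    """Keep 1 item out of `modulo` (assumption: positions 0, modulo, 2*modulo...)."""
    if modulo <= 0:
        # Assumption: a non-positive modulo disables extraction.
        return []
    return [item for index, item in enumerate(items) if index % modulo == 0]


words = "the quick brown fox jumps".split()
ngrams = word_ngrams(words, max_size=3)
print(len(ngrams))
print(sample_by_modulo(ngrams, modulo=5))
```

With a modulo of 1 every item is kept, which corresponds to the default behavior described for wordDictModulo.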

MappingConfiguration

  • com.exalead.indexing.analysis.v10.MappingConfiguration
  • Specifies how DocumentChunks and their SemanticAnnotations populate the index and the dictionary.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
  • Nested elements:
    Name Type Description
    AnnotationMapping com.exalead.indexing.analysis.v10.AnnotationMapping* List of mappings from annotations to index targets, with associated parameters.
    ContextMapping com.exalead.indexing.analysis.v10.ContextMapping* List of mappings from contexts to index targets, with associated parameters.
    FieldIndexingLimit com.exalead.indexing.analysis.v10.FieldIndexingLimit* Word count limits to apply to texts mapped to index fields for search.
    FieldRetrievalLimit com.exalead.indexing.analysis.v10.FieldRetrievalLimit* Size limits (in bytes) to apply to texts mapped to the index for retrieval.
    GenerateAnnotationsForContext com.exalead.indexing.analysis.v10.GenerateAnnotationsForContext* List of contexts to process with a semantic pipeline before mapping.
    PartMapping com.exalead.indexing.analysis.v10.PartMapping* List of mappings from parts to index targets, with associated parameters.
    WordCountMapping com.exalead.indexing.analysis.v10.WordCountMapping* Specifies where to map the word count.

AnnotationMapping

  • com.exalead.indexing.analysis.v10.AnnotationMapping
  • Defines how SemanticAnnotations are used to populate index fields.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.MappingConfiguration (as MappingConfiguration)
  • Attributes:
    Name Type Default value Description
    name string Name of the SemanticAnnotation to map.
    context string Optional input context that restricts the mapping to annotations coming from a specific context. Incompatible with the patternMatch feature.
    patternMatch boolean Matches all annotations matching this pattern (must be a valid regular expression).
    dataModelState string Is this annotation mapping managed by a data model? @enum{null,auto,customized}. If null, this annotation mapping is not related to a data model. If "auto", this annotation mapping is auto-generated by a data model. If "customized", this annotation mapping was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", you will find here the name of the DataModelClass that generated this annotation mapping.
    dataModelProperty string If dataModelState is "auto" or "customized", you will find here the name of the DataModelProperty that generated this annotation mapping.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.AnnotationMapping If dataModelState is "customized", you will find here the original annotation mapping generated by the data model. Use this to easily show what reverting to "auto" from "customized" would imply
    AnnotationTarget com.exalead.indexing.analysis.v10.AnnotationTarget*

CategoryAnnotationTarget

  • com.exalead.indexing.analysis.v10.CategoryAnnotationTarget
  • CategoryAnnotationTarget is used to create a new category path inside an index category field, out of a SemanticAnnotation. The category path is built by the concatenation of the 'categoryRoot' and the selected 'form' of the annotation.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnnotationMapping (as AnnotationMapping)
  • Attributes:
    Name Type Default value Description
    indexField string
    forcedRank long
    rankBoost long
    form string normalized Which form of SemanticAnnotation value should we index? {@code enum(exact,normalized)}
    dataModelState string Is this annotation target managed by a data model? @enum{null,auto,customized}. If null, this annotation target is not related to a data model. If "auto", this annotation target is auto-generated by a data model. If "customized", this annotation target was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", you will find here the name of the DataModelClass that generated this AnnotationTarget.
    dataModelProperty string If dataModelState is "auto" or "customized", you will find here the name of the DataModelProperty that generated this AnnotationTarget.
    categoryRoot string Prefix used to build the CategoryPath.
    categoryAppend boolean True Builds the category path by concatenating the categoryRoot and the selected 'form' of the annotation. If false, only the category root will be used.
    appendAnnotationNameToRoot boolean Appends the annotation name between the root and the value.
    retrievable boolean If true, the category path is retrievable and can be used to create facets. If false, the category path is only searchable. (Advanced usage. langdate hacks)
    cleanupContent boolean True Removes trailing and leading spaces. Removes category paths that contain no alphanumeric character.
    detectTitle boolean Detects words placed after # in the path and uses them as the title.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.AnnotationTarget If dataModelState is "customized", you will find here the original annotation target generated by the data model. Use this to easily see what reverting to "auto" from "customized" would imply.
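
The way categoryRoot, categoryAppend and appendAnnotationNameToRoot combine can be sketched in Python as below. The function name `build_category_path` and the `/` separator are assumptions made for illustration; the sketch also assumes that when categoryAppend is false nothing at all is appended to the root.

```python
def build_category_path(category_root, annotation_name, value,
                        category_append=True,
                        append_annotation_name_to_root=False,
                        separator="/"):
    """Build a category path from a root and the selected annotation form."""
    if not category_append:
        # Only the category root is used.
        return category_root
    parts = [category_root]
    if append_annotation_name_to_root:
        # The annotation name sits between the root and the value.
        parts.append(annotation_name)
    parts.append(value)
    return separator.join(parts)


print(build_category_path("Top/Person", "person", "john smith"))
print(build_category_path("Top/Person", "person", "john smith",
                          append_annotation_name_to_root=True))
print(build_category_path("Top/Person", "person", "john smith",
                          category_append=False))
```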

StandardAnnotationTarget

  • com.exalead.indexing.analysis.v10.StandardAnnotationTarget
  • StandardAnnotationTarget is used to index the textual content of a SemanticAnnotation. The selected 'form' of the SemanticAnnotation is used to populate an index field.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnnotationMapping (as AnnotationMapping)
  • Attributes:
    Name Type Default value Description
    indexField string
    forcedRank long
    rankBoost long
    form string normalized Which form of SemanticAnnotation value should we index? {@code enum(exact,normalized)}
    dataModelState string Is this annotation target managed by a data model? @enum{null,auto,customized}. If null, this annotation target is not related to a data model. If "auto", this annotation target is auto-generated by a data model. If "customized", this annotation target was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", you will find here the name of the DataModelClass that generated this AnnotationTarget.
    dataModelProperty string If dataModelState is "auto" or "customized", you will find here the name of the DataModelProperty that generated this AnnotationTarget.
    searchable boolean If true, the SemanticAnnotation can be searched for.
    indexLevel string If searchable, index kind where data will be indexed. Can be "exact", "lowercase", "normalized" or "custom".
    customIndexKind int If indexLevel = "custom", this index kind will be used.
    retrievable boolean If true, the SemanticAnnotation can be retrieved.
    retrieveField string The field where the SemanticAnnotation is stored for retrieval, if 'retrievable' is set to true. If null, 'indexField' will be used to store the SemanticAnnotation for retrieval.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.AnnotationTarget If dataModelState is "customized", you will find here the original annotation target generated by the data model. Use this to easily see what reverting to "auto" from "customized" would imply.

EnumFacetAnnotationTarget

  • com.exalead.indexing.analysis.v10.EnumFacetAnnotationTarget
  • EnumFacetAnnotationTarget maps the annotations according to the specified EnumFacet.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.AnnotationMapping (as AnnotationMapping)
  • Attributes:
    Name Type Default value Description
    indexField string
    forcedRank long
    rankBoost long
    form string normalized Which form of SemanticAnnotation value should we index? {@code enum(exact,normalized)}
    dataModelState string Is this annotation target managed by a data model? @enum{null,auto,customized}. If null, this annotation target is not related to a data model. If "auto", this annotation target is auto-generated by a data model. If "customized", this annotation target was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", you will find here the name of the DataModelClass that generated this AnnotationTarget.
    dataModelProperty string If dataModelState is "auto" or "customized", you will find here the name of the DataModelProperty that generated this AnnotationTarget.
    enumFacetId string The id of the EnumFacet this target refers to.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.AnnotationTarget If dataModelState is "customized", you will find here the original annotation target generated by the data model. Use this to easily see what reverting to "auto" from "customized" would imply.

ContextMapping

  • com.exalead.indexing.analysis.v10.ContextMapping
  • ContextMapping specifies how DocumentChunks with a given ContextName are remapped to index fields and whether they are used to populate the dictionary.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.MappingConfiguration (as MappingConfiguration)
  • Attributes:
    Name Type Default value Description
    name string ContextName of the DocumentChunks to map.
    prefixMatch boolean Matches all contexts that start with this prefix.
    unprefix boolean Removes the prefix that was used to match.
    patternMatch boolean Matches all contexts matching this pattern (must be a valid regular expression).
    semantic boolean True Performs semantic processing on the DocumentChunks processed by this mapping. If false, the textual content of the DocumentChunks will not be tokenized before indexing. This can be used to index 'exact raw values'.
    resourceFreq int 1 To extract a resource, select the frequency to add. For example, if you have a 'firstname lastname' entry, you may want to simulate a frequency of 1000 to avoid spellcheck on this entry.
    tokenizationConfig string
    dataModelState string Is this context mapping managed by a data model? @enum{null,auto,customized}. If null, this context mapping is not related to a data model. If "auto", this context mapping is auto-generated by a data model. If "customized", this context mapping was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", you will find here the name of the DataModelClass that generated this context mapping.
    dataModelProperty string If dataModelState is "auto" or "customized", you will find here the name of the DataModelProperty that generated this context mapping.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.indexing.analysis.v10.ContextMapping If dataModelState is "customized", you will find here the original context mapping generated by the data model. Use this to easily show what reverting to "auto" from "customized" would imply.
    Target com.exalead.indexing.analysis.v10.Target*
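
The three matching modes of a ContextMapping (exact name, prefixMatch with optional unprefix, and patternMatch) can be compared with the Python sketch below. It is only an interpretation: the function name `match_context` is hypothetical, the pattern is assumed to match the whole context name, and the unprefixed name is assumed to be what is passed on to the targets.

```python
import re


def match_context(mapping_name, context_name,
                  prefix_match=False, unprefix=False, pattern_match=False):
    """Return the context name to pass on, or None if the mapping does not apply."""
    if pattern_match:
        # Assumption: the regular expression must match the whole name.
        return context_name if re.fullmatch(mapping_name, context_name) else None
    if prefix_match:
        if not context_name.startswith(mapping_name):
            return None
        # With unprefix, the matched prefix is removed from the name.
        return context_name[len(mapping_name):] if unprefix else context_name
    # Default: exact ContextName match.
    return context_name if context_name == mapping_name else None


print(match_context("title", "title"))
print(match_context("meta_", "meta_author", prefix_match=True, unprefix=True))
print(match_context(r"title_\w+", "title_fr", pattern_match=True))
print(match_context("title", "subtitle"))
```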

CategoryContentTarget

  • com.exalead.indexing.analysis.v10.CategoryContentTarget
  • CategoryContentTarget is used to map a DocumentChunk to a category. A Category Path is created for each DocumentChunk processed. The textual content of the DocumentChunk is used to build a Category Path. 'indexField' should be a category field (usually called 'categories' or 'security').
  • Parent elements:
    • com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
  • Attributes:
    Name Type Default value Description
    indexField string The indexField to populate with this content. If null, the contextName of the DocumentChunk will be used for the index field.
    forcedRank long Sets the ranking value for chunks in this mapping. -1 means that the chunk internal ranking value is kept.
    rankBoost long Offsets the chunk internal ranking value. Use it only when forcedRank = -1. For example, if forcedRank=-1, rankBoost=2, and the chunk internal ranking value is 4, the final rank will be 6.
    categoryRoot string Builds the category path.
    categoryAppend boolean True Appends the textual content of the DocumentChunk to the category root. If false, only the category root will be used.
    appendContextNameToRoot boolean Appends the context name between the root and the value.
    form string normalized The form of the word to be used to build the category path. {@code enum(exact,normalized)}
    retrievable boolean Stores the category path, which enables display and navigation by category path. If false, we only index the SemanticAnnotation (Advanced usage - langdate hacks).
    cleanupContent boolean True If true:
    • Removes trailing and leading unicode-spaces.
    • Replaces all sequences of unicode-space characters by a single 'space' character.
    • Does not map to the category in append mode if the DocumentChunk does not contain at least one unicode alpha-numerical character.
    detectTitle boolean Detects words placed after # in the path and uses them as the title.
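
The three cleanupContent steps listed above can be sketched in Python as follows. The function name `cleanup_content` is hypothetical, and returning `None` to signal "do not map this chunk" is an assumption made for the example.

```python
import re


def cleanup_content(text):
    """Apply the cleanup steps; return None when the chunk should not be mapped."""
    # str.strip() and \s both handle Unicode whitespace for Python str values.
    cleaned = re.sub(r"\s+", " ", text.strip())
    # Skip chunks that contain no Unicode alphanumeric character.
    if not any(ch.isalnum() for ch in cleaned):
        return None
    return cleaned


print(repr(cleanup_content("  Paris \u00a0\t France ")))
print(repr(cleanup_content(" --- ")))
```

The first call collapses the mixed whitespace (including a non-breaking space) into single spaces; the second returns `None` because the content has no alphanumeric character.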

DateCategoryContentTarget

  • com.exalead.indexing.analysis.v10.DateCategoryContentTarget
  • CategoryContentTarget specific to date.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
  • Attributes:
    Name Type Default value Description
    categoryRoot string Builds the category path.
    categoryAppend boolean True Appends the textual content of the DocumentChunk to the category root. If false, only the category root will be used.
    appendContextNameToRoot boolean Appends the context name between the root and the value.
    form string normalized The form of the word to be used to build the category path. {@code enum(exact,normalized)}
    retrievable boolean Stores the category path, which enables display and navigation by category path. If false, we only index the SemanticAnnotation (Advanced usage - langdate hacks).
    cleanupContent boolean True If true:
    • Removes trailing and leading unicode-spaces.
    • Replaces all sequences of unicode-space characters by a single 'space' character.
    • Does not map to the category in append mode if the DocumentChunk does not contain at least one unicode alpha-numerical character.
    detectTitle boolean Detects words placed after # in the path and uses them as the title.
    indexField string The indexField to populate with this content. If null, the contextName of the DocumentChunk will be used for the index field.
    forcedRank long Sets the ranking value for chunks in this mapping. -1 means that the chunk internal ranking value is kept.
    rankBoost long Offsets the chunk internal ranking value. Use it only when forcedRank = -1. For example, if forcedRank=-1, rankBoost=2, and the chunk internal ranking value is 4, the final rank will be 6.
    inputFormat string Specifies the input format of the date, in UNIX date format. Set to null for automatic detection of standard formats.
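
The interaction between forcedRank and rankBoost can be summarized as a short calculation. The sketch below is a minimal illustration, assuming that rankBoost is simply ignored whenever forcedRank is set to a value other than -1; the function name final_rank is hypothetical.

```python
KEEP_INTERNAL_RANK = -1  # documented meaning of forcedRank = -1


def final_rank(internal_rank: int, forced_rank: int, rank_boost: int) -> int:
    """Compute the rank of a chunk from the mapping attributes.

    Assumption: rank_boost only applies when forced_rank is -1.
    """
    if forced_rank != KEEP_INTERNAL_RANK:
        # The mapping overrides the chunk internal ranking value.
        return forced_rank
    # The chunk internal ranking value is kept and offset by the boost.
    return internal_rank + rank_boost
```

With the values from the rankBoost description (internal rank 4, forcedRank=-1, rankBoost=2), final_rank(4, -1, 2) gives 6.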

StandardContentTarget

  • com.exalead.indexing.analysis.v10.StandardContentTarget
  • A StandardContentTarget is used to populate a textual, numerical or date index field, with the content of a DocumentChunk.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
  • Attributes:
    Name Type Default value Description
    indexField string The indexField to populate with this content. If null, the contextName of the DocumentChunk will be used for the index field.
    forcedRank long Sets the ranking value for chunks in this mapping. -1 means that the chunk internal ranking value is kept.
    rankBoost long Offsets the chunk internal ranking value. Use it only when forcedRank = -1. For example, if forcedRank=-1, rankBoost=2, and the chunk internal ranking value is 4, the final rank will be 6.
    prefixWithContext boolean Enables prefixing of all words in inverted lists by 'contextName#'.
    addStartEnd boolean Enables the introduction of a word __start__ before chunk content and a word __end__ after chunk content. Only valid if the chunk is mapped with semantic=true. This option is compatible with prefixWithContext, in which case it produces contextName#__start__ and contextName#__end__.
    indexPrefixes boolean Enables the indexing of all prefixes for each word with a score = prefixesScore. The prefix can be mapped to a specific type if you add 'prefix' in formIndexingConfig.
    prefixesScore int 1 Score given to words' prefixes. The document relevance is determined by its score. The text matching score basically represents the "distance" between a search query and a document.
    maxPrefixLength int Maximum length of the extracted prefixes.
    indexSuffixes boolean Enables the indexing of all suffixes for each word with a score = suffixesScore. The suffix can be mapped to a specific kind if you add 'suffix' in formIndexingConfig.
    suffixesScore int 1 Score given to words' suffixes. The document relevance is determined by its score. The text matching score basically represents the "distance" between a search query and a document.
    maxSuffixLength int Maximum length of the extracted suffixes.
    indexSubstrings boolean Enables the indexing of all substrings for each word with a score = substringsScore. The substring can be mapped to a specific kind if you add 'substring' in formIndexingConfig.
    substringsScore int 1 Score given to extracted substrings. Document relevance is determined by its score. The text matching score basically represents the "distance" between a search query and a document.
    searchable boolean True Marks the content of the DocumentChunk as indexed and searchable.
    retrievable boolean True Enables the content of the DocumentChunk to be directly stored in the index, so that it can be retrieved. For numerical values, retrievability allows you to sort results by field.
    retrieveField string The index field in which the content will be stored. If null, the content will be put in 'indexField'.
    indexNormalized boolean True Enables the indexing of the normalized form of the word.
    indexLowercase boolean Enables the indexing of the lowercase (non-normalized) form of each token.
    indexExact boolean Enables the indexing of the exact (non-normalized) form of each token.
    indexSeparators boolean Enables the indexing of the index standard separators. Indexed standard separators are: paragraph, sentence and page. Standard separators indexing is required for the SPLIT operator to work with these separators.
    addBreakBetweenChunks boolean True Enables the introduction of a break between document chunks by the indexer. This forbids phrase matching across these chunks and has an impact on search when using double-quotes expressions or the 'NEXT' operator. For example, if a document has a "title" chunk containing "foo" and a "text" chunk containing "bar", and they are both remapped to the text field:
    • If addBreakBetweenChunks is false, the document matches the queries "foo bar" and foo NEXT bar.
    • If addBreakBetweenChunks is true, the document matches neither "foo bar" nor foo NEXT bar, but it matches the query foo AND bar.
  • Nested elements:
    Name Type Description
    DecreaseRankOnAnnotation com.exalead.indexing.analysis.v10.DecreaseRankOnAnnotation* List of DecreaseRankOnAnnotation
    IncreaseRankOnAnnotation com.exalead.indexing.analysis.v10.IncreaseRankOnAnnotation* List of IncreaseRankOnAnnotation
    RankOnAnnotation com.exalead.indexing.analysis.v10.RankOnAnnotation* List of RankOnAnnotation
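
To make the indexPrefixes, indexSuffixes and indexSubstrings options more concrete, the following Python sketch enumerates the forms that such options could add for one word. It is an illustration only: the function names are hypothetical, and it assumes that the word itself counts as a prefix or suffix when it fits within maxPrefixLength or maxSuffixLength, which the attribute descriptions do not state.

```python
from typing import List


def prefixes(word: str, max_length: int) -> List[str]:
    """All prefixes of word, up to max_length characters."""
    return [word[:n] for n in range(1, min(len(word), max_length) + 1)]


def suffixes(word: str, max_length: int) -> List[str]:
    """All suffixes of word, up to max_length characters."""
    return [word[-n:] for n in range(1, min(len(word), max_length) + 1)]


def substrings(word: str) -> List[str]:
    """All contiguous, non-empty substrings of word."""
    size = len(word)
    return [word[i:j] for i in range(size) for j in range(i + 1, size + 1)]
```

The number of substrings grows quadratically with the word length, which gives an idea of why these options can increase the index size noticeably.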

DateContentTarget

  • com.exalead.indexing.analysis.v10.DateContentTarget
  • DateContentTarget defines how a date is indexed.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
  • Attributes:
    Name Type Default value Description
    prefixWithContext boolean Enables prefixing of all words in inverted lists by 'contextName#'.
    addStartEnd boolean Enables the introduction of a word __start__ before chunk content and a word __end__ after chunk content. Only valid if the chunk is mapped with semantic=true. This option is compatible with prefixWithContext, in which case it produces contextName#__start__ and contextName#__end__.
    indexPrefixes boolean Enables the indexing of all prefixes for each word with a score = prefixesScore. The prefix can be mapped to a specific type if you add 'prefix' in formIndexingConfig.
    prefixesScore int 1 Score given to words' prefixes. The document relevance is determined by its score. The text matching score basically represents the "distance" between a search query and a document.
    maxPrefixLength int Maximum length of the extracted prefixes.
    indexSuffixes boolean Enables the indexing of all suffixes for each word with a score = suffixesScore. The suffix can be mapped to a specific kind if you add 'suffix' in formIndexingConfig.
    suffixesScore int 1 Score given to words' suffixes. The document relevance is determined by its score. The text matching score basically represents the "distance" between a search query and a document.
    maxSuffixLength int Maximum length of the extracted suffixes.
    indexSubstrings boolean Enables the indexing of all substrings for each word with a score = substringsScore. The substring can be mapped to a specific kind if you add 'substring' in formIndexingConfig.
    substringsScore int 1 Score given to extracted substrings. Document relevance is determined by its score. The text matching score basically represents the "distance" between a search query and a document.
    searchable boolean True Marks the content of the DocumentChunk as indexed and searchable.
    retrievable boolean True Enables the content of the DocumentChunk to be directly stored in the index, so that it can be retrieved. For numerical values, retrievability allows you to sort results by field.
    retrieveField string The index field in which the content will be stored. If null, the content will be put in 'indexField'.
    indexNormalized boolean True Enables the indexing of the normalized form of the word.
    indexLowercase boolean Enables the indexing of the lowercase (non-normalized) form of each token.
    indexExact boolean Enables the indexing of the exact (non-normalized) form of each token.
    indexSeparators boolean Enables the indexing of the index standard separators. Indexed standard separators are: paragraph, sentence and page. Standard separators indexing is required for the SPLIT operator to work with these separators.
    addBreakBetweenChunks boolean True Enables the introduction of a break between document chunks by the indexer. This forbids phrase matching across these chunks and has an impact on search when using double-quotes expressions or the 'NEXT' operator. For example, if a document has a "title" chunk containing "foo" and a "text" chunk containing "bar", and they are both remapped to the text field:
    • If addBreakBetweenChunks is false, the document matches the queries "foo bar" and foo NEXT bar.
    • If addBreakBetweenChunks is true, the document matches neither "foo bar" nor foo NEXT bar, but it matches the query foo AND bar.
    indexField string The indexField to populate with this content. If null, the contextName of the DocumentChunk will be used for the index field.
    forcedRank long Sets the ranking value for chunks in this mapping. -1 means that the chunk internal ranking value is kept.
    rankBoost long Offsets the chunk internal ranking value. Use it only when forcedRank = -1. For example, if forcedRank=-1, rankBoost=2, and the chunk internal ranking value is 4, the final rank will be 6.
    inputFormat string Specifies the input format of the date, in UNIX date format. Set to null for automatic detection of standard formats.
  • Nested elements:
    Name Type Description
    DecreaseRankOnAnnotation com.exalead.indexing.analysis.v10.DecreaseRankOnAnnotation* List of DecreaseRankOnAnnotation
    IncreaseRankOnAnnotation com.exalead.indexing.analysis.v10.IncreaseRankOnAnnotation* List of IncreaseRankOnAnnotation
    RankOnAnnotation com.exalead.indexing.analysis.v10.RankOnAnnotation* List of RankOnAnnotation

DecreaseRankOnAnnotation

  • com.exalead.indexing.analysis.v10.DecreaseRankOnAnnotation
  • Allows you to decrease the ranking when some words are flagged by an annotation (part of speech, ontology, ...).
  • Parent elements:
    • com.exalead.indexing.analysis.v10.DateContentTarget (as DateContentTarget)
    • com.exalead.indexing.analysis.v10.StandardContentTarget (as StandardContentTarget)
  • Attributes:
    Name Type Default value Description
    annotationName string Name of the targeted annotation.
    annotationValue string Value of the annotation that will trigger the decrease in ranking.
    value int Amount to subtract from the ranking when triggered.

IncreaseRankOnAnnotation

  • com.exalead.indexing.analysis.v10.IncreaseRankOnAnnotation
  • Allows you to increase the ranking when some words are flagged by an annotation (part of speech, ontology, ...).
  • Parent elements:
    • com.exalead.indexing.analysis.v10.DateContentTarget (as DateContentTarget)
    • com.exalead.indexing.analysis.v10.StandardContentTarget (as StandardContentTarget)
  • Attributes:
    Name Type Default value Description
    annotationName string Name of the targeted annotation.
    annotationValue string Value of the annotation that will trigger the increase in ranking.
    value int Amount to add to the ranking when triggered.

RankOnAnnotation

  • com.exalead.indexing.analysis.v10.RankOnAnnotation
  • Modifies ranking when some words are flagged by a given annotation.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.DateContentTarget (as DateContentTarget)
    • com.exalead.indexing.analysis.v10.StandardContentTarget (as StandardContentTarget)
  • Attributes:
    Name Type Default value Description
    annotationName string The annotation that triggers the ranking modification.
    annotationValue string The annotation value required to trigger the ranking modification.
    forcedRank int The new ranking.
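
The three annotation-based elements (DecreaseRankOnAnnotation, IncreaseRankOnAnnotation and RankOnAnnotation) can be pictured as rules applied to the rank of a flagged word. The Python sketch below is a simplified model under several assumptions that the reference does not confirm: each word carries at most one value per annotation name, rules are applied in list order, and no clamping is applied to the result. All class and function names mirror the element names for readability but are otherwise hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class IncreaseRankOnAnnotation:
    annotation_name: str
    annotation_value: str
    value: int  # amount added to the rank


@dataclass
class DecreaseRankOnAnnotation:
    annotation_name: str
    annotation_value: str
    value: int  # amount subtracted from the rank


@dataclass
class RankOnAnnotation:
    annotation_name: str
    annotation_value: str
    forced_rank: int  # replaces the rank


Rule = Union[IncreaseRankOnAnnotation, DecreaseRankOnAnnotation, RankOnAnnotation]


def apply_rank_rules(rank: int, annotations: Dict[str, str], rules: List[Rule]) -> int:
    """Apply each rule whose annotation name and value match the word."""
    for rule in rules:
        if annotations.get(rule.annotation_name) != rule.annotation_value:
            continue  # this rule is not triggered for this word
        if isinstance(rule, IncreaseRankOnAnnotation):
            rank += rule.value
        elif isinstance(rule, DecreaseRankOnAnnotation):
            rank -= rule.value
        else:
            rank = rule.forced_rank
    return rank
```

For instance, a word with rank 4 that is flagged with a hypothetical annotation pos=noun would get rank 6 under IncreaseRankOnAnnotation("pos", "noun", 2).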

CustomContentTarget

  • com.exalead.indexing.analysis.v10.CustomContentTarget
  • CustomContentTarget defines indexing by a custom 'Index Kind'.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
  • Attributes:
    Name Type Default value Description
    indexField string The indexField to populate with this content. If null, the contextName of the DocumentChunk will be used for the index field.
    forcedRank long Sets the ranking value for chunks in this mapping. -1 means that the chunk internal ranking value is kept.
    rankBoost long Offsets the chunk internal ranking value. Use it only when forcedRank = -1. For example, if forcedRank=-1, rankBoost=2, and the chunk internal ranking value is 4, the final rank will be 6.
    searchable boolean True If true, the content of the DocumentChunk will be indexed and searchable.
    retrieveField string The index field in which the content will be stored. If null, the content will be put in 'indexField'.
    retrievable boolean True Stores the content of the DocumentChunk directly in the index, so that it can be retrieved. For numerical values, retrievability allows you to sort results by field.
    indexKind int Index 'Kind' to use for indexing content.
    addBreakBetweenChunks boolean True If true, the indexer introduces a break between document chunks. This forbids phrase matching across these chunks and has an impact on search when using double-quotes expressions or the 'NEXT' operator. For example, if a document has a "title" chunk containing "foo" and a "text" chunk containing "bar", and they are both remapped to the text field:
    • If addBreakBetweenChunks is false, the document matches the queries "foo bar" and foo NEXT bar.
    • If addBreakBetweenChunks is true, the document matches neither "foo bar" nor foo NEXT bar, but it matches the query foo AND bar.
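
The effect of addBreakBetweenChunks on phrase queries can be reproduced with a toy model. In the Python sketch below, the break is modeled as a sentinel entry placed between the words of two chunks remapped to the same field. This is only an assumption made for illustration, since the reference does not describe how the indexer represents the break; all names are hypothetical.

```python
from typing import List, Optional

BREAK = None  # sentinel standing in for the break inserted by the indexer


def remap_chunks(chunks: List[List[str]], add_break_between_chunks: bool) -> List[Optional[str]]:
    """Concatenate the words of several chunks into one field stream."""
    stream: List[Optional[str]] = []
    for position, chunk in enumerate(chunks):
        if position > 0 and add_break_between_chunks:
            stream.append(BREAK)
        stream.extend(chunk)
    return stream


def matches_phrase(stream: List[Optional[str]], phrase: List[str]) -> bool:
    """True if the words of phrase appear consecutively (quoted query or NEXT)."""
    size = len(phrase)
    return any(stream[i:i + size] == phrase for i in range(len(stream) - size + 1))


def matches_and(stream: List[Optional[str]], words: List[str]) -> bool:
    """True if every word appears somewhere in the field (AND query)."""
    return all(word in stream for word in words)
```

With a "title" chunk ["foo"] and a "text" chunk ["bar"], the phrase ["foo", "bar"] matches only when the stream is built without a break, while the AND query matches in both cases.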

EnumFacetContentTarget

  • com.exalead.indexing.analysis.v10.EnumFacetContentTarget
  • EnumFacetContentTarget maps the content according to the specified EnumFacet.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
  • Attributes:
    Name Type Default value Description
    indexField string The indexField to populate with this content. If null, the contextName of the DocumentChunk will be used for the index field.
    forcedRank long Sets the ranking value for chunks in this mapping. -1 means that the chunk internal ranking value is kept.
    rankBoost long Offsets the chunk internal ranking value. Use it only when forcedRank = -1. For example, if forcedRank=-1, rankBoost=2, and the chunk internal ranking value is 4, the final rank will be 6.
    enumFacetId string The id of the EnumFacet this target refers to.
    form string normalized The form of the values for the facet stringValues. Possible values: exact, normalized.

DictionaryTarget

  • com.exalead.indexing.analysis.v10.DictionaryTarget
  • A DictionaryTarget specifies how a DocumentChunk or semantic annotation is processed to populate the dictionary.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
  • Attributes:
    Name Type Default value Description
    dictionaryName string
    words boolean True
    ngrams boolean
    rt boolean
    phonemes boolean

PartTarget

  • com.exalead.indexing.analysis.v10.PartTarget
  • A PartTarget specifies how a Part is processed to populate the index.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
  • Attributes:
    Name Type Default value Description
    indexField string The index field in which the content will be stored.

FieldIndexingLimit

  • com.exalead.indexing.analysis.v10.FieldIndexingLimit
  • Limits the number of words that can be indexed for a given field.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.MappingConfiguration (as MappingConfiguration)
  • Attributes:
    Name Type Default value Description
    fieldName string Field to limit.
    maxNbWords int Maximum number of words for this field.

FieldRetrievalLimit

  • com.exalead.indexing.analysis.v10.FieldRetrievalLimit
  • Limits the size of text that can be retrieved from a given field. In some standard configurations, a FieldRetrievalLimit on the 'text' field is set to "maxLength=4096". This limits the size of the index on disk.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.MappingConfiguration (as MappingConfiguration)
  • Attributes:
    Name Type Default value Description
    retrieveField string Field to limit.
    maxLength int Max text size in bytes. The text will be clipped to the nearest word. Text is stored in UTF-8.
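
The maxLength rule combines two constraints: the limit is expressed in bytes of UTF-8 text, and the cut must not fall inside a word. The following Python sketch shows one possible reading, assuming that the text is clipped backwards to the previous word boundary so that the stored text never exceeds maxLength bytes. The function name clip_to_max_length is hypothetical.

```python
def clip_to_max_length(text: str, max_length: int) -> str:
    """Clip text so that its UTF-8 encoding fits in max_length bytes.

    Assumption: the cut moves back to the previous whitespace when it
    would otherwise fall in the middle of a word.
    """
    data = text.encode("utf-8")
    if len(data) <= max_length:
        return text
    # Cut at the byte limit and drop a partially cut multi-byte character.
    clipped = data[:max_length].decode("utf-8", errors="ignore")
    # clipped is a strict prefix of text, so the next character exists.
    if not text[len(clipped)].isspace():
        # The cut may be inside a word: back up to the last whitespace.
        cut = len(clipped)
        while cut > 0 and not clipped[cut - 1].isspace():
            cut -= 1
        clipped = clipped[:cut]
    return clipped.rstrip()
```

Note that accented characters take more than one byte in UTF-8, so a limit of 4096 bytes can correspond to fewer than 4096 characters.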

GenerateAnnotationsForContext

  • com.exalead.indexing.analysis.v10.GenerateAnnotationsForContext
  • Forces a context to be processed by the SemanticProcessor pipeline and to process semantic annotations.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.MappingConfiguration (as MappingConfiguration)
  • Attributes:
    Name Type Default value Description
    name string ContextName of the DocumentChunks to map.
    prefixMatch boolean Matches any context starting with this prefix.
    patternMatch boolean Matches any context matching this regular expression.
    tokenizationConfig string If set, it forces the tokenization configuration to use.

PartMapping

  • com.exalead.indexing.analysis.v10.PartMapping
  • PartMapping specifies how parts are remapped to index fields.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.MappingConfiguration (as MappingConfiguration)
  • Attributes:
    Name Type Default value Description
    name string Name of the Part to map.
    prefixMatch boolean Matches all parts that start with this prefix.
    patternMatch boolean Matches all parts matching this pattern (must be a valid regular expression).
  • Nested elements:
    Name Type Description
    PartTarget com.exalead.indexing.analysis.v10.PartTarget*
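
The name, prefixMatch and patternMatch attributes (used by PartMapping and, in the same way, by GenerateAnnotationsForContext) describe three ways to select parts or contexts. The Python sketch below illustrates them under two assumptions that the reference does not spell out: a pattern must match the whole name, and patternMatch takes precedence if both flags are set. The function name name_matches is hypothetical, and Python regular expression syntax may differ from the syntax accepted by the product.

```python
import re


def name_matches(rule_name: str, candidate: str,
                 prefix_match: bool = False, pattern_match: bool = False) -> bool:
    """Decide whether a part or context name is selected by a mapping."""
    if pattern_match:
        # rule_name is interpreted as a regular expression.
        return re.fullmatch(rule_name, candidate) is not None
    if prefix_match:
        # rule_name is interpreted as a prefix.
        return candidate.startswith(rule_name)
    # Default: exact name comparison.
    return candidate == rule_name
```

For example, name_matches("meta_", "meta_author", prefix_match=True) selects the hypothetical part meta_author, whereas the exact comparison would not.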

WordCountMapping

  • com.exalead.indexing.analysis.v10.WordCountMapping
  • Specifies where to map the word count.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.MappingConfiguration (as MappingConfiguration)
  • Attributes:
    Name Type Default value Description
    fromName string Field whose word count is computed.
    toName string Field in which the word count is stored.

IndexSchema

  • com.exalead.mercury.mami.indexing.v10.IndexSchema
  • Configuration for an index schema. This defines the fields actually stored in an index. Most commonly, only one index schema is defined, and used by all build groups (for all slices). This configuration is referenced in the BuildGroup element in 'Deployment'.
  • Attributes:
    Name Type Default value Description
    name string
    allowIntensiveDiskAccess boolean Allows intensive operations like sorting or faceting to be performed on disk (SSD should be preferred).
  • Nested elements:
    Name Type Description
    AttributeGroupStore com.exalead.mercury.mami.indexing.v10.AttributeGroupStore*
    FieldConfig com.exalead.mercury.mami.indexing.v10.FieldConfig*

AttributeGroupStore

  • com.exalead.mercury.mami.indexing.v10.AttributeGroupStore
  • Configuration of an attribute group. An attribute group defines how attributes should be persisted on disk.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    id int A unique identifier for this attribute group.
    label string A human readable name for this attribute group.
    format enum(SimpleRowOrientedStore, ItemOrientedStore) ItemOrientedStore Specifies how to persist the data on disk for this attribute group.
    retrievableRoles string Specifies a comma-separated list of annotations to be handled in this attribute group store. Ex: @Facetable,@Sortable,@Display
    leafSize int 30720 If the format is SimpleRowOrientedStore, configures the leaf size (i.e., maximum IO size read per DID).

AlphanumFieldConfig

  • com.exalead.mercury.mami.indexing.v10.AlphanumFieldConfig
  • This field stores alphanumeric values (e.g., 'text', 'title').
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    ramBased boolean A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    fieldName string The name of the field. A field name can only contain lowercase letters, digits and underscores: [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Indicates whether this index field config is managed by a data model. Possible values: null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", you will find here the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", you will find here the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
    maxStoredWordPosition int Number of words, starting from the beginning of the document, for which word positions will be stored in the index. This enables proximity ranking and position searching (NEAR, NEXT, ...) up to this number of words in the document. '0' should be used to disable position storing.
    maxInlineWordPositions int 2 Advanced setting controlling how many positions are inlined in the main data file for each word of each document.
    useVariablePositionsEncoding boolean Advanced setting to choose which positions encoding algorithm should be used. Variable position encoding should be used to reduce index size when indexing big documents.
    storeTf boolean Stores the number of terms of each document. This information may be used by the ranking algorithm to normalize term frequencies (as "nbTerms"). This costs a few bytes of RAM per document.
    bloomFilter boolean Activates a Bloom filter per slot. This speeds up requests containing words that are not present in the field on a given slot. Disable this option if all words of the request for this field always match, and if you compact into big slots regularly. Enable this option if there are either many misses (e.g., on the "text" field) or small updates (e.g., with real-time indexing).
    gzip boolean True Activates content compression.
    implementation enum(strbtree, trie, fsm) fsm Advanced configuration. Internal structure used to store the field dictionary.
    nbWordsPerLeaf int 1000 Advanced configuration. If using the strbtree structure, it configures the number of words per leaf.
    optimizePatternSearch boolean True Adds extra information to the dictionaries for pattern search optimization. If false, optimizes data structures for size.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.
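
The fieldName constraint [a-z0-9_]+ is easy to check before a configuration is deployed. The following Python sketch shows such a check; the helper name is_valid_field_name is hypothetical and is not part of the product.

```python
import re

# Pattern taken from the fieldName description: lowercase letters,
# digits and underscores, at least one character.
FIELD_NAME_PATTERN = re.compile(r"[a-z0-9_]+")


def is_valid_field_name(name: str) -> bool:
    """True if name only contains characters allowed in a field name."""
    return FIELD_NAME_PATTERN.fullmatch(name) is not None
```

Names such as text or my_field_2 pass this check, while Title (uppercase letter) and my-field (hyphen) do not.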

RiceEncoderConfig

  • com.exalead.mercury.mami.indexing.v10.RiceEncoderConfig
  • No documentation for this element.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.AlphanumFieldConfig (as AlphanumFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.BinaryFieldConfig (as BinaryFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.CategoryFieldConfig (as CategoryFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.DateFieldConfig (as DateFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.DoubleFieldConfig (as DoubleFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.FieldConfig (as FieldConfig)
    • com.exalead.mercury.mami.indexing.v10.GeoFieldConfig (as GeoFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.HierarchyFieldConfig (as HierarchyFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.LegacySignedFieldConfig (as LegacySignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.LegacyUnsignedFieldConfig (as LegacyUnsignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.NumericalFieldConfig (as NumericalFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.PointFieldConfig (as PointFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.ReferenceFieldConfig (as ReferenceFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.SignedFieldConfig (as SignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.SortableFieldConfig (as SortableFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.StandardFieldConfig (as StandardFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.TextFieldConfig (as TextFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.TimeFieldConfig (as TimeFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.UidFieldConfig (as UidFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.UnsignedFieldConfig (as UnsignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.ValueFieldConfig (as ValueFieldConfig)
  • Attributes:
    Name Type Default value Description
    bytesPerBlock int 1024
    positionsRiceCodingParam int 1024
    dataFilesPrefetchPages int 2
    extFilesPrefetchPages int 2
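
Since this element has no description, the sketch below only illustrates the general technique that the name refers to, Rice coding, as it is usually defined: a value is split by a parameter k into a quotient written in unary and a remainder written on k bits. How the product lays out its inverted lists, and how positionsRiceCodingParam relates to k, is not documented here, so the code is a generic textbook illustration with hypothetical names, using strings of '0' and '1' for readability.

```python
def rice_encode(value: int, k: int) -> str:
    """Encode a non-negative integer with Rice parameter k (divisor 2**k)."""
    if value < 0 or k < 0:
        raise ValueError("value and k must be non-negative")
    quotient = value >> k
    remainder = value & ((1 << k) - 1)
    # Quotient in unary: that many '1' bits followed by a terminating '0'.
    unary = "1" * quotient + "0"
    # Remainder in plain binary on exactly k bits (nothing when k is 0).
    binary = format(remainder, "0{}b".format(k)) if k else ""
    return unary + binary


def rice_decode(bits: str, k: int) -> int:
    """Decode one value produced by rice_encode with the same k."""
    quotient = bits.index("0")  # number of leading '1' bits
    start = quotient + 1
    remainder = int(bits[start:start + k], 2) if k else 0
    return (quotient << k) | remainder
```

Small values produce short codes, which is why this family of codes is commonly applied to the gaps between sorted identifiers rather than to the identifiers themselves.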

VarIntEncoderConfig

  • com.exalead.mercury.mami.indexing.v10.VarIntEncoderConfig
  • Stores each integer in varint encoding.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.AlphanumFieldConfig (as AlphanumFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.BinaryFieldConfig (as BinaryFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.CategoryFieldConfig (as CategoryFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.DateFieldConfig (as DateFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.DoubleFieldConfig (as DoubleFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.FieldConfig (as FieldConfig)
    • com.exalead.mercury.mami.indexing.v10.GeoFieldConfig (as GeoFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.HierarchyFieldConfig (as HierarchyFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.LegacySignedFieldConfig (as LegacySignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.LegacyUnsignedFieldConfig (as LegacyUnsignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.NumericalFieldConfig (as NumericalFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.PointFieldConfig (as PointFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.ReferenceFieldConfig (as ReferenceFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.SignedFieldConfig (as SignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.SortableFieldConfig (as SortableFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.StandardFieldConfig (as StandardFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.TextFieldConfig (as TextFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.TimeFieldConfig (as TimeFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.UidFieldConfig (as UidFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.UnsignedFieldConfig (as UnsignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.ValueFieldConfig (as ValueFieldConfig)
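
As background for the term varint, the following Python sketch shows a common variable-length integer scheme (7 data bits per byte, with the high bit marking that more bytes follow). It is an assumption that the product uses this exact byte layout; the sketch only illustrates the principle, and the function names are hypothetical.

```python
def varint_encode(value: int) -> bytes:
    """Encode a non-negative integer, least significant 7-bit group first."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)  # last byte: high bit cleared
            return bytes(out)


def varint_decode(data: bytes) -> int:
    """Decode the first varint found at the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")
```

Values below 128 fit in a single byte, so this kind of encoding is compact when most stored integers are small.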

Apollo11EncoderConfig

  • com.exalead.mercury.mami.indexing.v10.Apollo11EncoderConfig
  • Stores each integer in Apollo11 encoding.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.AlphanumFieldConfig (as AlphanumFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.BinaryFieldConfig (as BinaryFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.CategoryFieldConfig (as CategoryFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.DateFieldConfig (as DateFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.DoubleFieldConfig (as DoubleFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.FieldConfig (as FieldConfig)
    • com.exalead.mercury.mami.indexing.v10.GeoFieldConfig (as GeoFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.HierarchyFieldConfig (as HierarchyFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.LegacySignedFieldConfig (as LegacySignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.LegacyUnsignedFieldConfig (as LegacyUnsignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.NumericalFieldConfig (as NumericalFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.PointFieldConfig (as PointFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.ReferenceFieldConfig (as ReferenceFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.SignedFieldConfig (as SignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.SortableFieldConfig (as SortableFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.StandardFieldConfig (as StandardFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.TextFieldConfig (as TextFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.TimeFieldConfig (as TimeFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.UidFieldConfig (as UidFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.UnsignedFieldConfig (as UnsignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.ValueFieldConfig (as ValueFieldConfig)

NoOpEncoderConfig

  • com.exalead.mercury.mami.indexing.v10.NoOpEncoderConfig
  • Trivial encoder. For debugging purposes only.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.AlphanumFieldConfig (as AlphanumFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.BinaryFieldConfig (as BinaryFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.CategoryFieldConfig (as CategoryFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.DateFieldConfig (as DateFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.DoubleFieldConfig (as DoubleFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.FieldConfig (as FieldConfig)
    • com.exalead.mercury.mami.indexing.v10.GeoFieldConfig (as GeoFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.HierarchyFieldConfig (as HierarchyFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.LegacySignedFieldConfig (as LegacySignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.LegacyUnsignedFieldConfig (as LegacyUnsignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.NumericalFieldConfig (as NumericalFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.PointFieldConfig (as PointFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.ReferenceFieldConfig (as ReferenceFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.SignedFieldConfig (as SignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.SortableFieldConfig (as SortableFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.StandardFieldConfig (as StandardFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.TextFieldConfig (as TextFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.TimeFieldConfig (as TimeFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.UidFieldConfig (as UidFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.UnsignedFieldConfig (as UnsignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.ValueFieldConfig (as ValueFieldConfig)

FastNoPosEncoderConfig

  • com.exalead.mercury.mami.indexing.v10.FastNoPosEncoderConfig
  • An encoder that only stores docids, not ranks or positions.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.AlphanumFieldConfig (as AlphanumFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.BinaryFieldConfig (as BinaryFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.CategoryFieldConfig (as CategoryFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.DateFieldConfig (as DateFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.DoubleFieldConfig (as DoubleFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.FieldConfig (as FieldConfig)
    • com.exalead.mercury.mami.indexing.v10.GeoFieldConfig (as GeoFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.HierarchyFieldConfig (as HierarchyFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.LegacySignedFieldConfig (as LegacySignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.LegacyUnsignedFieldConfig (as LegacyUnsignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.NumericalFieldConfig (as NumericalFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.PointFieldConfig (as PointFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.ReferenceFieldConfig (as ReferenceFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.SignedFieldConfig (as SignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.SortableFieldConfig (as SortableFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.StandardFieldConfig (as StandardFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.TextFieldConfig (as TextFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.TimeFieldConfig (as TimeFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.UidFieldConfig (as UidFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.UnsignedFieldConfig (as UnsignedFieldConfig)
    • com.exalead.mercury.mami.indexing.v10.ValueFieldConfig (as ValueFieldConfig)
  • Attributes:
    Name Type Default value Description
    didsPerBlock int 256
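
To illustrate what an encoder that keeps only docids discards, the following minimal sketch groups sorted document ids into fixed-size blocks and stores no rank or position data. The block layout (first docid plus gaps) and the function names are assumptions made for illustration; this reference does not describe the actual storage format, and only the default block size of 256 comes from the table above.

```python
from typing import Iterable, List, Tuple

DIDS_PER_BLOCK = 256  # default of the didsPerBlock attribute

# Hypothetical block layout: (first docid, gaps to the following docids).
Block = Tuple[int, List[int]]


def encode_docids(docids: Iterable[int],
                  dids_per_block: int = DIDS_PER_BLOCK) -> List[Block]:
    """Group sorted, distinct docids into blocks; no ranks or positions are kept."""
    ordered = sorted(set(docids))
    blocks: List[Block] = []
    for start in range(0, len(ordered), dids_per_block):
        chunk = ordered[start:start + dids_per_block]
        gaps = [b - a for a, b in zip(chunk, chunk[1:])]
        blocks.append((chunk[0], gaps))
    return blocks


def decode_docids(blocks: List[Block]) -> List[int]:
    """Rebuild the sorted docid list from the blocks."""
    out: List[int] = []
    for first, gaps in blocks:
        current = first
        out.append(current)
        for gap in gaps:
            current += gap
            out.append(current)
    return out
```

With the default block size, 334 docids end up in two blocks (256 and 78 entries), and decoding returns the original sorted list.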

LegacyUnsignedFieldConfig

  • com.exalead.mercury.mami.indexing.v10.LegacyUnsignedFieldConfig
  • No documentation for this element.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    ramBased boolean A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    fieldName string The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Indicates whether this index field config is managed by a data model. One of null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
    bitsForValue int 32 Number of bits used to store numerical values.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.
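
The fieldName constraint can be checked before a configuration is submitted. The sketch below applies the pattern quoted in the attribute description; the helper name is hypothetical and not part of the product.

```python
import re

# Pattern quoted in the fieldName description: [a-z0-9_]+
FIELD_NAME_RE = re.compile(r"[a-z0-9_]+")


def is_valid_field_name(name: str) -> bool:
    """Return True if the whole name matches [a-z0-9_]+."""
    return FIELD_NAME_RE.fullmatch(name) is not None
```

For example, `file_size_2` passes, while `FileSize`, `file-size`, and the empty string are rejected.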

LegacySignedFieldConfig

  • com.exalead.mercury.mami.indexing.v10.LegacySignedFieldConfig
  • No documentation for this element.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    ramBased boolean A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    fieldName string The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Indicates whether this index field config is managed by a data model. One of null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.
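
The ListsEncoderConfig description names a Rice encoder as the default. As background, here is a minimal sketch of textbook Golomb-Rice coding applied to a non-negative integer such as a gap between two docids. The parameter `k` and the bit-string output are illustrative assumptions; this is not the product's encoder.

```python
def rice_encode(value: int, k: int) -> str:
    """Encode a non-negative integer as a unary quotient followed by a k-bit remainder."""
    if value < 0 or k < 0:
        raise ValueError("value and k must be non-negative")
    quotient = value >> k
    remainder = value & ((1 << k) - 1)
    unary = "1" * quotient + "0"          # quotient in unary, terminated by 0
    binary = format(remainder, "b").zfill(k) if k else ""
    return unary + binary


def rice_decode(bits: str, k: int) -> int:
    """Decode a single Rice code word produced by rice_encode."""
    quotient = bits.index("0")            # number of leading 1s
    remainder = int(bits[quotient + 1:quotient + 1 + k], 2) if k else 0
    return (quotient << k) | remainder
```

With `k = 2`, the value 9 is encoded as `11001` (quotient 2 in unary, remainder 1 on two bits). Small gaps yield short codes, which is why this family of codes suits inverted lists.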

PointFieldConfig

  • com.exalead.mercury.mami.indexing.v10.PointFieldConfig
  • This type of field is used to store geographical points using either GPS coordinates (WGS84) or planar X,Y coordinates (Meter).
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    ramBased boolean A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    fieldName string The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Indicates whether this index field config is managed by a data model. One of null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
    geoType enum(WGS84, Meter) WGS84 Value can be one of WGS84 or Meter.
    blockSize int 8192
    exact boolean True
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.

GeoFieldConfig

  • com.exalead.mercury.mami.indexing.v10.GeoFieldConfig
  • This type of field is used to store 2D geometries using planar X,Y coordinates (Meter).
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    ramBased boolean A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    fieldName string The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Indicates whether this index field config is managed by a data model. One of null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
    geoType enum(Meter) Meter The only possible value is Meter.
    maxBlockSize int 24
    precision int 6
    bboxFieldName string
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.

UidFieldConfig

  • com.exalead.mercury.mami.indexing.v10.UidFieldConfig
  • This field stores a unique value in order to facilitate search.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    ramBased boolean A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    fieldName string The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Indicates whether this index field config is managed by a data model. One of null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
    dictStorage enum(strbtree, trie, fsm) fsm Associative array implementation.
    bitsetThreshold int 10000 Number of requested documents before switching from a dynamic array to a bitset representation.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.
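
The bitsetThreshold attribute describes a switch between two representations of a set of requested documents. The sketch below shows the general idea under stated assumptions: the function names are hypothetical, and whether the threshold is inclusive is not specified by this reference (the sketch switches once the count reaches the threshold).

```python
from typing import List, Union

BITSET_THRESHOLD = 10000  # default of the bitsetThreshold attribute

DocSet = Union[List[int], bytearray]


def build_doc_set(docids: List[int], max_docid: int,
                  threshold: int = BITSET_THRESHOLD) -> DocSet:
    """Return a sorted list for small requests and a bitset for large ones."""
    if len(docids) < threshold:
        return sorted(docids)                 # dynamic array representation
    bits = bytearray((max_docid >> 3) + 1)    # one bit per possible docid
    for docid in docids:
        bits[docid >> 3] |= 1 << (docid & 7)
    return bits


def contains(doc_set: DocSet, docid: int) -> bool:
    """Membership test that works for both representations."""
    if isinstance(doc_set, bytearray):
        byte_index = docid >> 3
        if byte_index >= len(doc_set):
            return False
        return bool(doc_set[byte_index] & (1 << (docid & 7)))
    return docid in doc_set
```

A sorted array costs memory proportional to the number of requested documents, whereas a bitset costs one bit per possible docid regardless of how many are set, so the bitset becomes the cheaper option for large requests.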

ValueFieldConfig

  • com.exalead.mercury.mami.indexing.v10.ValueFieldConfig
  • Stores alphanumerical content with an internal ordinal mapping, which makes it suitable for efficient faceting. Each term is limited to 1024 bytes.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    deltaRefEncodeMultivaluedValues boolean True Delta ref encode multivalued values.
    sortMultivaluedValues boolean True Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions.
    ramBased boolean A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    fieldName string The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Indicates whether this index field config is managed by a data model. One of null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
    ignorePresentBit boolean Uses and loads the present bit.
    minMemberNbBits int 5 Minimum number of bits for the attr part of the value field.
    bloomFilter boolean Activates a Bloom filter per slot. This speeds up requests containing words that are not present in the field on a given slot. Disable this option if all words of the request for this field always match and you regularly compact into big slots. Enable this option if there are many misses (e.g. on the "text" field) or if you have small updates (e.g. with real-time indexing).
    hashThreshold int 128 Stores a hash value in field dictionary instead of the original data if value length is greater than this threshold.
    implementation enum(strbtree, fsm) fsm Advanced configuration. Internal structure used to store the field dictionary.
    optimizeListsForPatternSearch boolean Speeds up pattern search by reducing the number of opened inverted lists, at the expense of indexing time and disk space.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.
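
The hashThreshold rule (store a hash instead of the original data once the value is longer than the threshold) can be sketched as follows. Measuring length in UTF-8 bytes and using SHA-1 are assumptions made for the example; this reference does not say which unit or hash function is used.

```python
import hashlib

HASH_THRESHOLD = 128  # default of the hashThreshold attribute


def dictionary_key(value: str, threshold: int = HASH_THRESHOLD) -> bytes:
    """Return the raw bytes, or a fixed-size digest when the value exceeds the threshold."""
    raw = value.encode("utf-8")
    if len(raw) > threshold:
        # SHA-1 is an arbitrary stand-in; the real hash function is not documented here.
        return hashlib.sha1(raw).digest()
    return raw
```

A value of exactly 128 bytes is kept as-is, while a 129-byte value is replaced by a 20-byte digest, which bounds the size of dictionary entries for long values.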

TextFieldConfig

  • com.exalead.mercury.mami.indexing.v10.TextFieldConfig
  • Stores alphanumerical content with an internal ordinal mapping, which makes it suitable for efficient faceting. Each term is limited to 1024 bytes.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    ramBased boolean True A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    retrievable boolean True
    ignorePresentBit boolean Uses and loads the present bit.
    minMemberNbBits int 5 Minimum number of bits for the attr part of the value field.
    bloomFilter boolean Activates a Bloom filter per slot. This speeds up requests containing words that are not present in the field on a given slot. Disable this option if all words of the request for this field always match and you regularly compact into big slots. Enable this option if there are many misses (e.g. on the "text" field) or if you have small updates (e.g. with real-time indexing).
    hashThreshold int 128 Stores a hash value in field dictionary instead of the original data if value length is greater than this threshold.
    implementation enum(strbtree, fsm) fsm Advanced configuration. Internal structure used to store the field dictionary.
    optimizeListsForPatternSearch boolean Speeds up pattern search by reducing the number of opened inverted lists, at the expense of indexing time and disk space.
    deltaRefEncodeMultivaluedValues boolean True Delta ref encode multivalued values.
    sortMultivaluedValues boolean True Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions.
    fieldName string The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    dataModelState string Indicates whether this index field config is managed by a data model. One of null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
    storePositions boolean True Store positions for seq nodes and proximity scoring.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.
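
The bloomFilter attribute relies on a property of Bloom filters: a negative answer is always correct, so a slot whose filter rejects a word can be skipped without opening its lists. The class below is a generic, minimal Bloom filter written for illustration; its sizes, hash construction, and method names are assumptions and do not reflect the product's implementation.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, word: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting the word with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, word: str) -> None:
        for pos in self._positions(word):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def might_contain(self, word: str) -> bool:
        """False means the word is definitely absent; True means it may be present."""
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(word))
```

This matches the guidance in the attribute description: the filter pays off when many queried words are absent from a slot, and adds little when every queried word is present.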

ReferenceFieldConfig

  • com.exalead.mercury.mami.indexing.v10.ReferenceFieldConfig
  • Stores alphanumerical content with an internal ordinal mapping, which makes it suitable for efficient faceting. Each term is limited to 1024 bytes.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    ramBased boolean True A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    retrievable boolean True
    ignorePresentBit boolean Uses and loads the present bit.
    minMemberNbBits int 5 Minimum number of bits for the attr part of the value field.
    bloomFilter boolean Activates a Bloom filter per slot. This speeds up requests containing words that are not present in the field on a given slot. Disable this option if all words of the request for this field always match and you regularly compact into big slots. Enable this option if there are many misses (e.g. on the "text" field) or if you have small updates (e.g. with real-time indexing).
    hashThreshold int 128 Stores a hash value in field dictionary instead of the original data if value length is greater than this threshold.
    implementation enum(strbtree, fsm) fsm Advanced configuration. Internal structure used to store the field dictionary.
    optimizeListsForPatternSearch boolean Speeds up pattern search by reducing the number of opened inverted lists, at the expense of indexing time and disk space.
    deltaRefEncodeMultivaluedValues boolean True Delta ref encode multivalued values.
    sortMultivaluedValues boolean True Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions.
    fieldName string The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    dataModelState string Indicates whether this index field config is managed by a data model. One of null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.
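
The sortMultivaluedValues description states that storing multivalued values in increasing order consumes less RAM. A plausible reason, sketched below, is that sorted values can be stored as small differences. This is generic delta encoding written for illustration; the exact scheme behind deltaRefEncodeMultivaluedValues is not documented here, and the bit count is only a rough proxy for storage cost.

```python
from typing import List


def delta_encode(values: List[int]) -> List[int]:
    """Sort the values, then keep the first value and the successive differences."""
    ordered = sorted(values)
    return ordered[:1] + [b - a for a, b in zip(ordered, ordered[1:])]


def delta_decode(deltas: List[int]) -> List[int]:
    """Rebuild the sorted values by accumulating the differences."""
    out: List[int] = []
    total = 0
    for delta in deltas:
        total += delta
        out.append(total)
    return out


def total_bits(numbers: List[int]) -> int:
    """Rough cost proxy: sum of the minimal bit lengths of the numbers."""
    return sum(n.bit_length() for n in numbers)
```

For the values 1040, 1000, 1025, 1003, the encoded form is 1000, 3, 22, 15, which needs 21 bits by this proxy instead of 42. Decoding yields the values in increasing order, so the original order of the values is not recoverable once sorting is enabled.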

UnsignedFieldConfig

  • com.exalead.mercury.mami.indexing.v10.UnsignedFieldConfig
  • No documentation for this element.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    bitsForValue int 63 Number of bits used to store numerical values. For unsigned numerical fields, the possible values are [0, 2^N - 1], and the field values are stored on N bits. For signed fields (signed integer and double), the possible values are [-2^N, 2^N - 1], and the field values are stored on (N+1) bits.
    blockSize int 8192
    deltaRefEncodeMultivaluedValues boolean True Delta ref encode multivalued values.
    sortMultivaluedValues boolean True Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions.
    ramBased boolean A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    fieldName string The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Indicates whether this index field config is managed by a data model. One of null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.
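
The ranges given in the bitsForValue description can be computed directly. The helper below is a hypothetical convenience function that applies the two formulas from the table.

```python
from typing import Tuple


def value_range(bits_for_value: int, signed: bool) -> Tuple[int, int]:
    """Return (min, max) as stated in the bitsForValue description."""
    n = bits_for_value
    if signed:
        return -(2 ** n), 2 ** n - 1   # stored on n + 1 bits
    return 0, 2 ** n - 1               # stored on n bits
```

For instance, `value_range(8, signed=False)` gives `(0, 255)`, and the default of 63 bits for an unsigned field gives a maximum of 9223372036854775807.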

SignedFieldConfig

  • com.exalead.mercury.mami.indexing.v10.SignedFieldConfig
  • No documentation for this element.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    bitsForValue int 63 Number of bits used to store numerical values. For unsigned numerical fields, the possible values are [0, 2^N - 1], and the field values are stored on N bits. For signed fields (signed integer and double), the possible values are [-2^N, 2^N - 1], and the field values are stored on (N+1) bits.
    blockSize int 8192
    deltaRefEncodeMultivaluedValues boolean True Delta ref encode multivalued values.
    sortMultivaluedValues boolean True Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions.
    ramBased boolean A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    fieldName string The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Indicates whether this index field config is managed by a data model. One of null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.
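
When the expected range of a signed field is known, the signed formula in the bitsForValue description can be inverted to find the smallest suitable N. The function below is a hypothetical sizing aid and not part of the product; it assumes the stated range [-2^N, 2^N - 1] is exact.

```python
def min_bits_for_signed(lowest: int, highest: int) -> int:
    """Smallest N such that [-2^N, 2^N - 1] contains [lowest, highest]."""
    if lowest > highest:
        raise ValueError("lowest must not exceed highest")
    n = 0
    while -(2 ** n) > lowest or 2 ** n - 1 < highest:
        n += 1
    return n
```

A field holding values between -1000 and 1000 needs N = 10 by this formula, which according to the description means the values are stored on 11 bits.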

DoubleFieldConfig

  • com.exalead.mercury.mami.indexing.v10.DoubleFieldConfig
  • Configuration of a double precision floating point number field.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    bitsForValue int 63 Number of bits used to store numerical values. For unsigned numerical fields, the possible values are [0, 2^N - 1], and the field values are stored on N bits. For signed fields (signed integer and double), the possible values are [-2^N, 2^N - 1], and the field values are stored on (N+1) bits.
    blockSize int 8192
    deltaRefEncodeMultivaluedValues boolean True Delta ref encode multivalued values.
    sortMultivaluedValues boolean True Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions.
    ramBased boolean A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    fieldName string The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Indicates whether this index field config is managed by a data model. One of null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
    precision int 4 Number of relevant digits in the decimal part.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.
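
The precision attribute limits the number of digits kept after the decimal point. The sketch below shows the effect with Python's decimal module. Whether the engine rounds or truncates is not stated in this reference, so the rounding mode used here (half up) is an assumption, as is the helper name.

```python
from decimal import Decimal, ROUND_HALF_UP


def apply_precision(value: str, precision: int = 4) -> Decimal:
    """Keep `precision` digits after the decimal point (rounding mode is an assumption)."""
    quantum = Decimal(1).scaleb(-precision)   # e.g. 0.0001 for precision=4
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)
```

With the default precision of 4, the input `3.14159265` is kept as `3.1416`; digits beyond the fourth decimal place are lost.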

TimeFieldConfig

  • com.exalead.mercury.mami.indexing.v10.TimeFieldConfig
  • No documentation for this element.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    deltaRefEncodeMultivaluedValues boolean True Delta ref encode multivalued values.
    sortMultivaluedValues boolean True Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions.
    ramBased boolean A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    fieldName string The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Indicates whether this index field config is managed by a data model. One of null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.
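
The relationship between dataModelState and the nested fromDataModel element can be modeled as a small state machine: customizing an auto-generated config keeps a copy of the original, and reverting restores it. The classes and functions below are a hypothetical model written for illustration; they are not the product's API, and the real configuration objects carry many more attributes.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FieldConfig:
    field_name: str
    searchable: bool = False
    data_model_state: Optional[str] = None          # None, "auto" or "customized"
    from_data_model: Optional["FieldConfig"] = None  # original generated config


def customize(config: FieldConfig, **changes) -> FieldConfig:
    """Apply changes to an auto-generated config and remember the original."""
    if config.data_model_state != "auto":
        raise ValueError("only an auto-generated config can be customized")
    return replace(config, data_model_state="customized",
                   from_data_model=config, **changes)


def revert(config: FieldConfig) -> FieldConfig:
    """Return to the "auto" state using the stored original."""
    if config.data_model_state != "customized" or config.from_data_model is None:
        raise ValueError("config is not in the customized state")
    return config.from_data_model
```

Customizing a config named `created` to make it searchable yields a config in the "customized" state, and calling `revert` on it returns the original "auto" config unchanged.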

DateFieldConfig

  • com.exalead.mercury.mami.indexing.v10.DateFieldConfig
  • No documentation for this element.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    deltaRefEncodeMultivaluedValues boolean True Delta ref encode multivalued values.
    sortMultivaluedValues boolean True Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions.
    ramBased boolean A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    fieldName string The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Indicates whether this index field config is managed by a data model. One of null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.
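
The fieldName constraint quoted in the attribute tables ([a-z0-9_]+) can be checked before a configuration is submitted. The following is a minimal sketch in Python; the helper name is_valid_field_name is hypothetical and is not part of the product.

```python
import re

# Pattern quoted in the fieldName description:
# lower-case letters, digits and underscore only, at least one character.
FIELD_NAME_PATTERN = re.compile(r"[a-z0-9_]+")


def is_valid_field_name(name: str) -> bool:
    """Return True when the whole name matches [a-z0-9_]+ (hypothetical helper)."""
    return FIELD_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("publication_date", "PublicationDate", "date-2024"):
        print(candidate, is_valid_field_name(candidate))
```

fullmatch is used rather than match so that a name with a valid prefix followed by invalid characters is rejected.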

BinaryFieldConfig

  • com.exalead.mercury.mami.indexing.v10.BinaryFieldConfig
  • No documentation for this element.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    ramBased boolean A value field must be RAM-based to perform synthesis efficiently.
    multiContext boolean
    fieldName string The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Is this index field config managed by a data model? @enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", you will find here the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", you will find here the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
    gzip boolean Activates content compression
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.

CategoryFieldConfig

  • com.exalead.mercury.mami.indexing.v10.CategoryFieldConfig
  • Stores hierarchy content. Each term is limited to 1024 bytes.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    fieldName string The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Is this index field config managed by a data model? @enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", you will find here the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", you will find here the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
    ramBased boolean True A value field must be RAM-based to perform synthesis efficiently.
    implementation enum(strbtree, fsm) strbtree Advanced configuration. Internal structure used to store the field dictionary.
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.

HierarchyFieldConfig

  • com.exalead.mercury.mami.indexing.v10.HierarchyFieldConfig
  • Stores hierarchy content. Each term is limited to 1024 bytes.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
  • Attributes:
    Name Type Default value Description
    ramBased boolean True A value field must be RAM-based to perform synthesis efficiently.
    implementation enum(strbtree, fsm) strbtree Advanced configuration. Internal structure used to store the field dictionary.
    fieldName string The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+
    searchable boolean Allows users to query on this field (using a prefix handler).
    retrievable boolean Allows the content of this field to be retrieved at query time and displayed in the search results.
    dataModelState string Is this index field config managed by a data model? @enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized.
    dataModelClass string If dataModelState is "auto" or "customized", you will find here the name of the DataModelClass that generated this field config.
    dataModelProperty string If dataModelState is "auto" or "customized", you will find here the name of the DataModelProperty that generated this field config.
    multivalued boolean
    version int
  • Nested elements:
    Name Type Description
    fromDataModel com.exalead.mercury.mami.indexing.v10.FieldConfig If dataModelState is "customized", you will find here the original object generated by the data model. Use this to easily revert to "auto" state from "customized".
    ListsEncoderConfig com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used.

IndexingConfig

  • com.exalead.mercury.mami.indexing.v10.IndexingConfig
  • No documentation for this element.
  • Attributes:
    Name Type Default value Description
    name string
  • Nested elements:
    Name Type Description
    AnalysisPolicy com.exalead.mercury.mami.indexing.v10.AnalysisPolicy
    CommitTriggerCondition com.exalead.mercury.mami.indexing.v10.CommitTriggerCondition*
    ImportPolicy com.exalead.mercury.mami.indexing.v10.ImportPolicy
    IndexManagementPolicy com.exalead.mercury.mami.indexing.v10.IndexManagementPolicy
    WriteAttributeSlotConfig com.exalead.mercury.mami.indexing.v10.WriteAttributeSlotConfig*
    WriteSlotConfig com.exalead.mercury.mami.indexing.v10.WriteSlotConfig

FixedThreadsAnalysisPolicy

  • com.exalead.mercury.mami.indexing.v10.FixedThreadsAnalysisPolicy
  • Instantiates a fixed number of analysis threads. Dispatches documents according to their DIDs (Document IDs) and slice.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
  • Attributes:
    Name Type Default value Description
    maxRAMConsumptionThreshold enum(disabled, enabled, auto) enabled When reaching the RAM value specified, analysis is stopped and analyzed documents are imported to the index. Then analysis starts again.
    • Enabled: Commits when the RAM size reaches the Threshold value specified (by default, 2048 MB).
    • Auto: Commits when the RAM size reaches 2048 MB.
    maxRAMConsumptionMB int 2048 The maximum of non-java RAM the analyzer can allocate. Reaching this limit triggers a commit.
    nbThreads int 4 Number of threads to allocate.

PerSliceAnalysisPolicy

  • com.exalead.mercury.mami.indexing.v10.PerSliceAnalysisPolicy
  • Instantiates an analysis thread for each slice. Dispatches documents according to their slice. Consumes less RAM than the 'FixedThreadsAnalysisPolicy'.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
  • Attributes:
    Name Type Default value Description
    maxRAMConsumptionThreshold enum(disabled, enabled, auto) enabled When reaching the RAM value specified, analysis is stopped and analyzed documents are imported to the index. Then analysis starts again.
    • Enabled: Commits when the RAM size reaches the Threshold value specified (by default, 2048 MB).
    • Auto: Commits when the RAM size reaches 2048 MB.
    maxRAMConsumptionMB int 2048 The maximum of non-java RAM the analyzer can allocate. Reaching this limit triggers a commit.
    nbThreads int 1 Uses N threads per slice.

SameThreadAnalysisPolicy

  • com.exalead.mercury.mami.indexing.v10.SameThreadAnalysisPolicy
  • Instantiates an analysis thread for each incoming PAPI thread. Each PAPI thread analyzes its tasks synchronously.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
  • Attributes:
    Name Type Default value Description
    maxRAMConsumptionThreshold enum(disabled, enabled, auto) enabled When reaching the RAM value specified, analysis is stopped and analyzed documents are imported to the index. Then analysis starts again.
    • Enabled: Commits when the RAM size reaches the Threshold value specified (by default, 2048 MB).
    • Auto: Commits when the RAM size reaches 2048 MB.
    maxRAMConsumptionMB int 2048 The maximum of non-java RAM the analyzer can allocate. Reaching this limit triggers a commit.

AutomaticAnalysisPolicy

  • com.exalead.mercury.mami.indexing.v10.AutomaticAnalysisPolicy
  • Depending on the number of threads specified, CloudView automatically chooses the most efficient analysis policy. Changes made in Analyze require a restart of CloudView, or at least of the indexing server process, to be taken into account.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
  • Attributes:
    Name Type Default value Description
    maxRAMConsumptionThreshold enum(disabled, enabled, auto) enabled When reaching the RAM value specified, analysis is stopped and analyzed documents are imported to the index. Then analysis starts again.
    • Enabled: Commits when the RAM size reaches the Threshold value specified (by default, 2048 MB).
    • Auto: Commits when the RAM size reaches 2048 MB.
    maxRAMConsumptionMB int 2048 The maximum of non-java RAM the analyzer can allocate. Reaching this limit triggers a commit.
    nbThreads int If not set or set with a multiple of 'nbSlices', it uses the 'PerSliceAnalysisPolicy'. Otherwise, it uses 'FixedThreadsAnalysisPolicy'.
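
The selection rule described for nbThreads can be sketched as follows. This is an illustration in Python, not the product's implementation; the function name and the representation of "not set" as None are assumptions.

```python
from typing import Optional


def choose_analysis_policy(nb_threads: Optional[int], nb_slices: int) -> str:
    """Return the name of the policy the rule above would select.

    nb_threads is None when the attribute is not set (assumption of this sketch).
    """
    if nb_slices <= 0:
        raise ValueError("nb_slices must be strictly positive")
    # Not set, or a multiple of the number of slices: one analysis thread
    # (or N threads) per slice.
    if nb_threads is None or nb_threads % nb_slices == 0:
        return "PerSliceAnalysisPolicy"
    # Any other value: a fixed pool of threads shared by all slices.
    return "FixedThreadsAnalysisPolicy"


if __name__ == "__main__":
    print(choose_analysis_policy(None, 4))
    print(choose_analysis_policy(8, 4))
    print(choose_analysis_policy(6, 4))
```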

NumberOfTasksBasedCommitTriggerCondition

  • com.exalead.mercury.mami.indexing.v10.NumberOfTasksBasedCommitTriggerCondition
  • Triggers a commit after the specified number of tasks has been processed. To avoid performance penalties, the task count is only evaluated each time a batch of documents is received. You might therefore have slightly more than the specified number of tasks analyzed.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
  • Attributes:
    Name Type Default value Description
    nbTasks int The number of tasks
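
Because the task count is evaluated once per received batch, a commit can cover more tasks than nbTasks. The Python sketch below illustrates this effect; the function name, the batch sizes and the use of "greater than or equal" as the comparison are assumptions made for illustration.

```python
from typing import Iterable, List


def tasks_at_each_commit(batch_sizes: Iterable[int], nb_tasks: int) -> List[int]:
    """Return the number of analyzed tasks covered by each triggered commit."""
    commits: List[int] = []
    pending = 0
    for size in batch_sizes:
        pending += size          # the whole batch is accepted first
        if pending >= nb_tasks:  # the count is only checked once per batch
            commits.append(pending)
            pending = 0
    return commits


if __name__ == "__main__":
    # Four batches of 400 tasks with nbTasks=1000: the commit fires after
    # the third batch and covers 1200 tasks.
    print(tasks_at_each_commit([400, 400, 400, 400], 1000))
```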

SizeBasedCommitTriggerCondition

  • com.exalead.mercury.mami.indexing.v10.SizeBasedCommitTriggerCondition
  • Triggers a commit when the Max size (MB) is reached.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
  • Attributes:
    Name Type Default value Description
    maxSizeMB int Max size threshold in MB

RAMUsageCommitTriggerCondition

  • com.exalead.mercury.mami.indexing.v10.RAMUsageCommitTriggerCondition
  • Triggers a commit when RAM usage reaches the limit.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
  • Attributes:
    Name Type Default value Description
    maxRAMUsageInMB int Max RAM usage in MB

PeriodicCommitTriggerCondition

  • com.exalead.mercury.mami.indexing.v10.PeriodicCommitTriggerCondition
  • Commits every N seconds after the first push order done after the last commit.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
  • Attributes:
    Name Type Default value Description
    delayS long Time in seconds between two commits.

InactivityCommitTriggerCondition

  • com.exalead.mercury.mami.indexing.v10.InactivityCommitTriggerCondition
  • Inactivity-based condition. This condition is triggered when:
    • there is no new data for the specified time period
    • AND at least the specified number of tasks have been analyzed.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
  • Attributes:
    Name Type Default value Description
    numberOfTasks int Minimum number of tasks to trigger a commit.
    inactivityTimeS long The index is considered inactive after N seconds without indexing activity.
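
The two parts of the inactivity condition combine as a logical AND. A minimal Python sketch, assuming the elapsed time and the analyzed task count are tracked by the caller (both parameter names are hypothetical):

```python
def should_commit_on_inactivity(
    seconds_since_last_data: float,
    tasks_analyzed: int,
    inactivity_time_s: int,
    number_of_tasks: int,
) -> bool:
    """Return True when both parts of the inactivity condition hold."""
    # No new data for at least inactivityTimeS seconds.
    inactive = seconds_since_last_data >= inactivity_time_s
    # At least numberOfTasks tasks analyzed.
    enough_tasks = tasks_analyzed >= number_of_tasks
    return inactive and enough_tasks


if __name__ == "__main__":
    print(should_commit_on_inactivity(120.0, 50, inactivity_time_s=60, number_of_tasks=10))
    print(should_commit_on_inactivity(120.0, 5, inactivity_time_s=60, number_of_tasks=10))
```

An idle indexing process that has analyzed too few tasks therefore does not commit, whatever the idle time.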

ParallelImportPolicy

  • com.exalead.mercury.mami.indexing.v10.ParallelImportPolicy
  • One generation is created for each analysis buffer. Analysis buffers are imported in parallel.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
  • Attributes:
    Name Type Default value Description
    nbThreads int 8 The number of parallel imports.

MergedImportPolicy

  • com.exalead.mercury.mami.indexing.v10.MergedImportPolicy
  • All analysis buffers are merged into a single buffer, which is imported as a single generation.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)

StandardIndexManagementPolicy

  • com.exalead.mercury.mami.indexing.v10.StandardIndexManagementPolicy
  • Default index (service + build) runtime configuration
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
  • Attributes:
    Name Type Default value Description
    gcEveryS int 15 Trigger a GC every N seconds.
  • Nested elements:
    Name Type Description
    CommitPolicy com.exalead.mercury.mami.indexing.v10.CommitPolicy The commit policy used to configure how the index persists its files to disk.
    CompactPolicies com.exalead.mercury.mami.indexing.v10.CompactPolicies The compact policies used to trigger slots compaction.
    UploadPolicy com.exalead.mercury.mami.indexing.v10.UploadPolicy The upload policy used to replicate new slots to replicas.

StandardCommitPolicy

  • com.exalead.mercury.mami.indexing.v10.StandardCommitPolicy
  • Default commit policy
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.StandardIndexManagementPolicy (as StandardIndexManagementPolicy)

CompactPolicies

  • com.exalead.mercury.mami.indexing.v10.CompactPolicies
  • No documentation for this element.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.StandardIndexManagementPolicy (as StandardIndexManagementPolicy)
  • Attributes:
    Name Type Default value Description
    synchronous boolean By default, compaction jobs are asynchronous. If set, compacts will be done synchronously just after imports.
    maxParallelFullCompacts int Limits the number of full compacts running in parallel. This can be useful when disk space is limited. 0 means no limit.
    type enum(mmap, pagecache) mmap Specifies which I/O mode is used while compacting. Value can be null or one of
    • mmap
    • pagecache
    maxPageCacheSizeMB int 32 If the policy uses the PageCache mode, it specifies the max cache size.
    pageCachePageSizeKB int 8 If the policy uses the PageCache mode, it specifies the page size.
    priorityCompactThreshold int 48 When compacting a slot gen0-gen1, the compact is treated as a priority compact if gen1-gen0 < priorityCompactThreshold (0: disabled).
    lowPriorityCompactNbThreads int 2 Number of threads to use for a compact having low priority (0: all available threads).
    highPriorityCompactNbThreads int Number of threads to use for a compact having high priority (0: all available threads).
  • Nested elements:
    Name Type Description
    AutoCompactPolicy com.exalead.mercury.mami.indexing.v10.AutoCompactPolicy* Specifies the auto-compact policies.

NumberOfSlotsBasedCompactPolicy

  • com.exalead.mercury.mami.indexing.v10.NumberOfSlotsBasedCompactPolicy
  • Compaction policy based on a fixed number of slots for a given number of generations.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.CompactPolicies (as CompactPolicies)
  • Attributes:
    Name Type Default value Description
    component string
    arity int 4 Specifies the number of slots of the same length required to compact.
    maxSlotSizeMb long 5000 If a slot reaches this size, it is never used by subsequent automatic compactions.
  • Nested elements:
    Name Type Description
    FullCompactPolicy com.exalead.mercury.mami.indexing.v10.FullCompactPolicy

MaxSizeFullCompactPolicy

  • com.exalead.mercury.mami.indexing.v10.MaxSizeFullCompactPolicy
  • A FullCompactPolicy that compacts all slots into one whenever the "tail" of small slots exceeds a certain ratio of the large first slot. This policy is appropriate when auto-compacts are restricted to slots under a certain size for performance reasons. In this case, a full optimization can occasionally be triggered to purge the deletes. Otherwise, the deletes occurring in later slots would never be purged, incurring performance costs at query time and extra disk space consumption.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.AutoCompactPolicy (as AutoCompactPolicy)
    • com.exalead.mercury.mami.indexing.v10.LowLatencyCompactPolicy (as LowLatencyCompactPolicy)
    • com.exalead.mercury.mami.indexing.v10.NoCompactPolicy (as NoCompactPolicy)
    • com.exalead.mercury.mami.indexing.v10.NumberOfSlotsBasedCompactPolicy (as NumberOfSlotsBasedCompactPolicy)
    • com.exalead.mercury.mami.indexing.v10.SlotsLogSizeBasedCompactPolicy (as SlotsLogSizeBasedCompactPolicy)
    • com.exalead.mercury.mami.indexing.v10.SlotsSizeBasedCompactPolicy (as SlotsSizeBasedCompactPolicy)
  • Attributes:
    Name Type Default value Description
    percentage int 100 Minimum percentage to launch a full compaction. Compacts all slots into one whenever the "tail" of small slots exceeds a certain percentage of the large first slot. For example, with percentage=100, a full compact is triggered when the cumulated size of all slots except the biggest is higher than the size of the biggest slot.
    minSlots int 2 Minimum number of slots before triggering a full compact.
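
The percentage rule can be illustrated with a short Python sketch. It follows the example given for percentage=100 (tail strictly larger than the biggest slot); the function name, the use of the biggest slot as the "large first slot", and the reading of minSlots as a lower bound on the slot count are assumptions.

```python
from typing import Sequence


def needs_full_compact(
    slot_sizes_mb: Sequence[float],
    percentage: int = 100,
    min_slots: int = 2,
) -> bool:
    """Return True when the tail of smaller slots exceeds the given
    percentage of the biggest slot (illustrative reading of the rule)."""
    if not slot_sizes_mb or len(slot_sizes_mb) < min_slots:
        return False
    biggest = max(slot_sizes_mb)
    tail = sum(slot_sizes_mb) - biggest  # cumulated size of all other slots
    # Compare tail / biggest with percentage / 100 without dividing.
    return tail * 100 > biggest * percentage


if __name__ == "__main__":
    print(needs_full_compact([1000, 300, 300, 300]))  # tail 900 vs biggest 1000
    print(needs_full_compact([1000, 400, 400, 400]))  # tail 1200 vs biggest 1000
```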

ArityBasedFullCompactPolicy

  • com.exalead.mercury.mami.indexing.v10.ArityBasedFullCompactPolicy
  • A FullCompactPolicy that compacts all slots into one whenever the "tail" of slots with smaller arities together exceeds a certain arity. The arity-based policy guarantees occasional full compaction, but the time interval between full compactions increases exponentially. This add-on policy caps the increase at a certain arity and schedules full compacts at regular intervals afterwards. This policy is appropriate when auto-compacts are managed per generation-arity. In this case, a full optimization can occasionally be triggered to purge the deletes. Otherwise, the deletes occurring in later slots would never be purged, incurring performance costs at query time and extra disk space consumption.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.AutoCompactPolicy (as AutoCompactPolicy)
    • com.exalead.mercury.mami.indexing.v10.LowLatencyCompactPolicy (as LowLatencyCompactPolicy)
    • com.exalead.mercury.mami.indexing.v10.NoCompactPolicy (as NoCompactPolicy)
    • com.exalead.mercury.mami.indexing.v10.NumberOfSlotsBasedCompactPolicy (as NumberOfSlotsBasedCompactPolicy)
    • com.exalead.mercury.mami.indexing.v10.SlotsLogSizeBasedCompactPolicy (as SlotsLogSizeBasedCompactPolicy)
    • com.exalead.mercury.mami.indexing.v10.SlotsSizeBasedCompactPolicy (as SlotsSizeBasedCompactPolicy)
  • Attributes:
    Name Type Default value Description
    maxArity int 256 Whenever the long tail total arity reaches maxArity, a full compact is scheduled. The "long tail" consists of the slots whose span has an arity lower than this parameter. This is generally a multiple of the arity parameter of the arity-based auto-compact policy.
    minSize long Slots below this size are considered negligible.

SlotsSizeBasedCompactPolicy

  • com.exalead.mercury.mami.indexing.v10.SlotsSizeBasedCompactPolicy
  • Compaction policy based on size that produces slots with similar size. When N consecutive slots have a size below targetSizeForCompactionMB, it performs a compaction if:
    • N is at least minArity AND
      • The N+1 slot makes the size above targetSizeForCompactionMB OR
      • The size is above minSizeForCompactionMB
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.CompactPolicies (as CompactPolicies)
  • Attributes:
    Name Type Default value Description
    component string
    targetSizeForCompactionMB int 200 Targeted size for a compacted slot.
    minSizeForCompactionMB int 50 Minimum size required to compact.
    minArity int 2 Minimum number of slots required to compact.
  • Nested elements:
    Name Type Description
    FullCompactPolicy com.exalead.mercury.mami.indexing.v10.FullCompactPolicy
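
The compaction condition listed above can be read as the following Python sketch. It is one possible interpretation, not the product's implementation: "size" is taken to be the cumulated size of the N consecutive slots, and the function and parameter names are hypothetical.

```python
from typing import Optional, Sequence


def should_compact_run(
    run_sizes_mb: Sequence[float],
    next_slot_size_mb: Optional[float] = None,
    target_size_mb: int = 200,
    min_size_mb: int = 50,
    min_arity: int = 2,
) -> bool:
    """Decide whether a run of N consecutive slots should be compacted."""
    total = sum(run_sizes_mb)
    # The rule only applies to runs whose cumulated size is below the target.
    if total >= target_size_mb:
        return False
    # N must be at least minArity.
    if len(run_sizes_mb) < min_arity:
        return False
    # Either adding slot N+1 would exceed the target...
    next_overflows = (
        next_slot_size_mb is not None
        and total + next_slot_size_mb > target_size_mb
    )
    # ...or the run is already above the minimum size.
    return next_overflows or total > min_size_mb


if __name__ == "__main__":
    print(should_compact_run([30, 40]))                         # above minimum size
    print(should_compact_run([10, 10]))                         # too small, no next slot
    print(should_compact_run([10, 10], next_slot_size_mb=195))  # next slot overflows
```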

SlotsLogSizeBasedCompactPolicy

  • com.exalead.mercury.mami.indexing.v10.SlotsLogSizeBasedCompactPolicy
  • A CompactPolicy that tries to compact slots into levels of exponentially increasing size, where each level has fewer slots than the value of the compact factor. Whenever extra slots (beyond the compact factor upper bound) are encountered, all slots within the level are compacted.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.CompactPolicies (as CompactPolicies)
  • Attributes:
    Name Type Default value Description
    component string
    compactFactor int 10 Determines how often slots are compacted. With smaller values, less RAM is used while indexing, and searches on unoptimized indices are faster, but indexing speed is slower. With larger values, more RAM is used during indexing, and while searches on unoptimized indices are slower, indexing is faster. Thus larger values (greater than 10) are best for batch index creation, and smaller values (lower than 10) for indices that are interactively maintained.
    minSize long 1048576 A size setting type which sets the minimum size for the lowest level slots. Slots below this size are considered to be on the same level (even if they vary drastically in size) and will be merged whenever there are compactFactor of them. This effectively truncates the "long tail" of small slots that would otherwise be created into a single level. If you set this too large, it can greatly increase the merging cost during indexing (if you flush many small slots).
    maxSize long 9223372036854775807 A size setting type which sets the largest slot that may be merged with other slots.
  • Nested elements:
    Name Type Description
    FullCompactPolicy com.exalead.mercury.mami.indexing.v10.FullCompactPolicy

LowLatencyCompactPolicy

  • com.exalead.mercury.mami.indexing.v10.LowLatencyCompactPolicy
  • Compacts when the size of all small slots is above the average large slot size, or when the number of slots is above nbLargeSlots + maxNbSmallSlots.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.CompactPolicies (as CompactPolicies)
  • Attributes:
    Name Type Default value Description
    component string
    nbLargeSlots int 8 The number of large slots to keep.
    maxNbSmallSlots int 8 Maximum number of small slots allowed. As soon as this limit is reached, small slots are compacted together.
    gatherSmallsAtTheEnd boolean True
    contiguousCompact boolean
  • Nested elements:
    Name Type Description
    FullCompactPolicy com.exalead.mercury.mami.indexing.v10.FullCompactPolicy
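
The two triggers of this policy can be sketched in Python as follows. The reference does not state how slots are classified as large or small; this sketch assumes that the nbLargeSlots biggest slots are the large ones, and the function name is hypothetical.

```python
from typing import Sequence


def should_compact_small_slots(
    slot_sizes_mb: Sequence[float],
    nb_large_slots: int = 8,
    max_nb_small_slots: int = 8,
) -> bool:
    """Return True when either trigger described for the policy fires."""
    # Trigger 1: more slots than nbLargeSlots + maxNbSmallSlots.
    if len(slot_sizes_mb) > nb_large_slots + max_nb_small_slots:
        return True
    # Assumption: the biggest nbLargeSlots slots are the "large" ones.
    ordered = sorted(slot_sizes_mb, reverse=True)
    large, small = ordered[:nb_large_slots], ordered[nb_large_slots:]
    if not large or not small:
        return False
    # Trigger 2: cumulated small-slot size above the average large-slot size.
    average_large = sum(large) / len(large)
    return sum(small) > average_large


if __name__ == "__main__":
    print(should_compact_small_slots([100] * 8 + [10, 10]))
    print(should_compact_small_slots([100] * 8 + [40, 40, 40]))
```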

NoCompactPolicy

  • com.exalead.mercury.mami.indexing.v10.NoCompactPolicy
  • Compact policy that does not perform any compact.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.CompactPolicies (as CompactPolicies)
  • Attributes:
    Name Type Default value Description
    component string
  • Nested elements:
    Name Type Description
    FullCompactPolicy com.exalead.mercury.mami.indexing.v10.FullCompactPolicy

StandardUploadPolicy

  • com.exalead.mercury.mami.indexing.v10.StandardUploadPolicy
  • Default upload policy
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.StandardIndexManagementPolicy (as StandardIndexManagementPolicy)
  • Attributes:
    Name Type Default value Description
    waitBetweenSwitchesS int If strictly positive, all slices switch to a generation sequentially, and we wait this time in seconds between two slices. This spreads the temporary memory consumption to avoid large memory spikes and swap out.

WriteAttributeSlotConfig

  • com.exalead.mercury.mami.indexing.v10.WriteAttributeSlotConfig
  • Write attribute slot configuration
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
  • Attributes:
    Name Type Default value Description
    type enum(directio, sequential) directio Access type for writing the new slots. Value can be null or one of
    • directio
    • sequential
    groupId int Specifies which attribute group store this access configuration applies to.

WriteSlotConfig

  • com.exalead.mercury.mami.indexing.v10.WriteSlotConfig
  • Write slot configuration
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
  • Attributes:
    Name Type Default value Description
    type enum(directio, sequential) sequential Access type for writing the new slots. Value can be null or one of
    • directio
    • sequential

IndexRuntimeConfigList

  • com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfigList
  • Lists all index runtime configurations.
  • Attributes:
    Name Type Default value Description
    version long
  • Nested elements:
    Name Type Description
    CacheConfig com.exalead.mercury.mami.indexing.v10.CacheConfig* Lists PageCache configurations
    IndexRuntimeConfig com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfig* Lists runtime configurations

CacheConfig

  • com.exalead.mercury.mami.indexing.v10.CacheConfig
  • PageCache configuration. Warning: The index page cache is limited to 32000 files in the index directory. If you get an error like "FileRAM: too many cached files (c_max_files=32767)", it means that the limit has been crossed and you should set a more aggressive compact policy.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfigList (as IndexRuntimeConfigList)
  • Attributes:
    Name Type Default value Description
    name string The cache ID.
    cacheSizeMB int 256 Maximum cache size in MB.
    pageSizeKB int 8 Page size in KB.
    maxSimultaneousIOOperations int 32 Specifies the max number of simultaneous I/O.

IndexRuntimeConfig

  • com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfig
  • Index runtime configuration for an instance of an index slice. Use key-value arguments to provide custom configuration keys.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfigList (as IndexRuntimeConfigList)
  • Attributes:
    Name Type Default value Description
    name string
    newGenerationBandwidthLimitKB int
    compactBandwidthLimitKB int
    ramBasedAttrGroupLoadPolicy enum(rebuild, copyAndPatch) copyAndPatch Value can be one of
    • rebuild
    • copyAndPatch
  • Nested elements:
    Name Type Description
    AttributeGroupAccess com.exalead.mercury.mami.indexing.v10.AttributeGroupAccess*
    FieldRuntimeConfig com.exalead.mercury.mami.indexing.v10.FieldRuntimeConfig*
    QueryAutocacheConfig com.exalead.mercury.mami.indexing.v10.QueryAutocacheConfig
    ReplicationConfig com.exalead.mercury.mami.indexing.v10.ReplicationConfig
    WarmupConfig com.exalead.mercury.mami.indexing.v10.WarmupConfig

AttributeGroupAccess

  • com.exalead.mercury.mami.indexing.v10.AttributeGroupAccess
  • Configuration specifying how to access the attribute group at runtime.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfig (as IndexRuntimeConfig)
  • Attributes:
    Name Type Default value Description
    groupId string Specifies which attribute group store this access configuration applies to.
    runType enum(mmap, pagecache, direct, RAMRow, RAMColumnDense) mmap Specifies how the attribute group should be accessed at runtime.
    preload boolean For RAM-based access type, specifies if the attribute group should be loaded in RAM at startup instead of at access time.
    mlock boolean For RAM-based access type, specifies if the attribute group should be locked in RAM, preventing it from being moved to the swap area.
    cacheId string For pagecache I/O type, specifies the cache ID.

FieldRuntimeConfig

  • com.exalead.mercury.mami.indexing.v10.FieldRuntimeConfig
  • Configuration specifying the index field at runtime.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfig (as IndexRuntimeConfig)
  • Attributes:
    Name Type Default value Description
    name string The index field name.
    dictType enum(mmap, pagecache) mmap Specifies the I/O mode used to load the dictionary part of an index field. Value can be one of
    • mmap
    • pagecache
    type enum(mmap, pagecache) mmap Specifies the I/O mode used to load the component. Value can be one of
    • mmap
    • pagecache
    preload boolean Should the field be preloaded? This will force the field to be loaded in RAM at startup.
    mlock boolean Should the field be locked in RAM?
    cacheId string If PageCache is used, it specifies the cache ID.

QueryAutocacheConfig

  • com.exalead.mercury.mami.indexing.v10.QueryAutocacheConfig
  • Query #autocache configuration.
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfig (as IndexRuntimeConfig)
  • Attributes:
    Name Type Default value Description
    totalCacheSizeMB int 20 Maximum cache size in MB (cross queries).
    queryCacheSizeMB int 5 Maximum cached query size.
    maxCachedQueries int 20 Number of queries cached.

ReplicationConfig

  • com.exalead.mercury.mami.indexing.v10.ReplicationConfig
  • Slice replication configuration
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfig (as IndexRuntimeConfig)
  • Nested elements:
    Name Type Description
    AttributeReplicationConfig com.exalead.mercury.mami.indexing.v10.AttributeReplicationConfig* Configures the direction usage in attribute replication.
    FieldReplicationConfig com.exalead.mercury.mami.indexing.v10.FieldReplicationConfig* Configures the direction usage in field replication.

AttributeReplicationConfig

  • com.exalead.mercury.mami.indexing.v10.AttributeReplicationConfig
  • Attribute's replication configuration
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.ReplicationConfig (as ReplicationConfig)
  • Attributes:
    Name Type Default value Description
    groupId string Group id of the attribute to configure
    type enum(directio, sequential) directio Access type. Value can be null or one of
    • directio
    • sequential

FieldReplicationConfig

  • com.exalead.mercury.mami.indexing.v10.FieldReplicationConfig
  • Index field replication configuration
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.ReplicationConfig (as ReplicationConfig)
  • Attributes:
    Name Type Default value Description
    name string Name of the field to configure.
    type enum(directio, sequential) directio Access type. Value can be null or one of
    • directio
    • sequential
    dictType enum(directio, sequential) directio Access type for the dictionary. Value can be null or one of
    • directio
    • sequential

WarmupConfig

  • com.exalead.mercury.mami.indexing.v10.WarmupConfig
  • Index warmup configuration
  • Parent elements:
    • com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfig (as IndexRuntimeConfig)
  • Attributes:
    Name Type Default value Description
    warmupQueryFile string File containing the list of single queries to run during warmup.
    maxWarmupDurationS int 5 Maximum time for warmup. Once this time has elapsed, the index is opened and a warning is printed indicating which line number has been reached.
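
The time-bounded behavior described for maxWarmupDurationS can be illustrated with a minimal Python sketch. It is not the product's implementation: the function name run_warmup, the run_query callback, and the injectable clock are hypothetical, and the sketch assumes the query file holds one query per line.

```python
import time
from typing import Callable, Iterable, Optional


def run_warmup(queries: Iterable[str],
               run_query: Callable[[str], None],
               max_duration_s: float = 5.0,
               clock: Callable[[], float] = time.monotonic) -> Optional[int]:
    """Replay warmup queries until they are exhausted or the time budget is spent.

    Returns None if every query ran, otherwise the 1-based line number
    that had been reached when the budget expired.
    """
    deadline = clock() + max_duration_s
    for line_no, query in enumerate(queries, start=1):
        if clock() >= deadline:
            # Budget spent: stop warming up and report the position reached.
            print(f"warning: warmup stopped at line {line_no}")
            return line_no
        run_query(query)
    return None
```

A caller would open the index for searching as soon as run_warmup returns, whether or not all queries were replayed.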

BuildGroupConfig

  • com.exalead.mercury.mami.deploy.v10.BuildGroupConfig
  • Configuration of a build group. A "Build Group" is defined by references to sub-configurations defined in other MAMI:
    • Analysis (how documents are processed).
    • Index Builder (how indexing jobs are scheduled and managed).
    • Index Schema (schema of the index slices being built).
    • Task Queue (how input document processing tasks are queued before jobs).
    • Similar Document (optional)
    Several build groups may share some or all of their sub-configurations. In most configurations, all build groups share the same index schema configuration. When built with the same schema, index slices built by different build groups can be queried together (see the Search MAMI).
  • Attributes:
    Name Type Default value Description
    buildGroup string Name of the build group. This name should be unique.
    dataModel string Name of the data model.
    indexingConfig string Name of an indexing configuration (IndexingConfig element in Indexing MAMI).
  • Nested elements:
    Name Type Description
    DIHConfig com.exalead.mercury.mami.deploy.v10.DIHConfig
    DidAllocationPolicy com.exalead.mercury.mami.deploy.v10.DidAllocationPolicy
    DocumentCacheConfig com.exalead.mercury.mami.deploy.v10.DocumentCacheConfig
    PrecomputedThumbnailsConfig com.exalead.mercury.mami.deploy.v10.PrecomputedThumbnailsConfig
    ScratchHook com.exalead.mercury.mami.deploy.v10.ScratchHook*
    SlicePartioningPolicy com.exalead.mercury.mami.deploy.v10.SlicePartioningPolicy

DIHConfig

  • com.exalead.mercury.mami.deploy.v10.DIHConfig
  • A DIHConfig is a set of parameters for a DIH.
  • Parent elements:
    • com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
  • Attributes:
    Name Type Default value Description
    compactArity int 4 Number of consecutive slots to trigger a compact.
    nbBloomBitsPerElement int 20 Number of bits per element in the DIH's StrBTree's bloom filter.
    nbElementsInLeaf int 100 Number of entries in each of the DIH's StrBTree's leaves.
    readMode enum(auto, direct, mmap, mmap_mlock, mmap_mload, pagecache, random, sequential) mmap Read mode of the DIH's StrBTree, except for enumeration. Value can be null or one of
    • auto
    • direct
    • mmap
    • mmap_mlock
    • mmap_mload
    • pagecache
    • random
    • sequential
    enumMode enum(auto, direct, mmap, mmap_mlock, mmap_mload, pagecache, random, sequential) mmap Read mode of the DIH's StrBTree, for enumeration. Value can be null or one of
    • auto
    • direct
    • mmap
    • mmap_mlock
    • mmap_mload
    • pagecache
    • random
    • sequential
    compactMode enum(auto, direct, mmap, mmap_mlock, mmap_mload, pagecache, random, sequential) mmap Read mode of the DIH's StrBTree, for compact. Value can be null or one of
    • auto
    • direct
    • mmap
    • mmap_mlock
    • mmap_mload
    • pagecache
    • random
    • sequential
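
To give a feel for what nbBloomBitsPerElement trades off, the following Python sketch computes the textbook false-positive rate of a Bloom filter for a given number of bits per element. It assumes the filter uses the optimal number of hash functions, which this reference does not state for the StrBTree; the function name is hypothetical.

```python
import math


def bloom_false_positive_rate(bits_per_element: float) -> float:
    """Approximate Bloom filter false-positive rate.

    Assumes the optimal number of hash functions k = (m/n) * ln 2,
    for which the rate is (1/2) ** k.
    """
    if bits_per_element <= 0:
        raise ValueError("bits_per_element must be positive")
    k = bits_per_element * math.log(2)  # optimal (real-valued) hash count
    return 0.5 ** k


# 20 is the DIHConfig default; 10 is the DocumentCacheConfig default.
for bits in (20, 10):
    print(bits, bloom_false_positive_rate(bits))
```

Under that assumption, 20 bits per element gives a rate below 0.01%, and 10 bits per element gives a rate a little under 1%, at half the memory cost.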

ContiguousDidAllocationPolicy

  • com.exalead.mercury.mami.deploy.v10.ContiguousDidAllocationPolicy
  • Base class specifying how DIDs (Document IDs) are assigned to the documents.
  • Parent elements:
    • com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
  • Attributes:
    Name Type Default value Description
    increasing boolean True Assign DIDs in an increasing order.
    startingPoint int Start point of the allocation. By default, the first DID will have value '1'.
    endingPoint nullableint End point of the allocation. By default, it will be Integer.MAX_VALUE if increasing or 1 if decreasing.
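
The interplay of increasing, startingPoint and endingPoint can be sketched as a small Python generator. This is an illustration only: the name allocate_dids is hypothetical, and treating both end points as inclusive is an assumption, since the reference does not say.

```python
from typing import Iterator, Optional

INT_MAX = 2**31 - 1  # value of Java's Integer.MAX_VALUE


def allocate_dids(increasing: bool = True,
                  starting_point: int = 1,
                  ending_point: Optional[int] = None) -> Iterator[int]:
    """Yield DIDs contiguously from starting_point to ending_point.

    Both bounds are treated as inclusive (an assumption).
    """
    if ending_point is None:
        # Documented defaults: Integer.MAX_VALUE if increasing, 1 if decreasing.
        ending_point = INT_MAX if increasing else 1
    step = 1 if increasing else -1
    did = starting_point
    while (did <= ending_point) if increasing else (did >= ending_point):
        yield did
        did += step
```

With the defaults, the generator yields 1, 2, 3, and so on. With increasing set to false and a starting point of 5, it yields 5 down to 1.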

DocumentCacheConfig

  • com.exalead.mercury.mami.deploy.v10.DocumentCacheConfig
  • Configuration for the document cache.
  • Parent elements:
    • com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
  • Attributes:
    Name Type Default value Description
    path string Location of the document cache on the filesystem. Unless otherwise specified, the document cache is located in the "cache" subdirectory of the build group.
    compactArity int 4 Number of consecutive slots to trigger a compact.
    nbBloomBitsPerElement int 10 Number of bits per element in the document cache StrBTree bloom filter.
    nbElementsInLeaf int 20 Number of entries in each of the document cache StrBTree leaves.
    readMode enum(auto, direct, mmap, mmap_mlock, mmap_mload, pagecache, random, sequential) auto Read mode of the document cache StrBTree, except for enumeration. Value can be null or one of
    • auto
    • direct
    • mmap
    • mmap_mlock
    • mmap_mload
    • pagecache
    • random
    • sequential
    enumMode enum(auto, direct, mmap, mmap_mlock, mmap_mload, pagecache, random, sequential) auto Read mode of the document cache StrBTree, for enumeration. Value can be null or one of
    • auto
    • direct
    • mmap
    • mmap_mlock
    • mmap_mload
    • pagecache
    • random
    • sequential
    compactMode enum(auto, direct, mmap, mmap_mlock, mmap_mload, pagecache, random, sequential) auto Read mode of the document cache StrBTree, for compact. Value can be null or one of
    • auto
    • direct
    • mmap
    • mmap_mlock
    • mmap_mload
    • pagecache
    • random
    • sequential
    diskCompressionAlgorithm enum(none, fastlz, gzip, lcs, lz4) fastlz Algorithm to compress the document cache on disk. Value can be null or one of
    • none
    • fastlz
    • gzip
    • lcs
    • lz4
    temporaryFilesCompressionAlgorithm enum(none, fastlz, gzip, lz4) fastlz Algorithm to compress the temporary files on disk. Value can be null or one of
    • none
    • fastlz
    • gzip
    • lz4

PrecomputedThumbnailsConfig

  • com.exalead.mercury.mami.deploy.v10.PrecomputedThumbnailsConfig
  • No documentation for this element.
  • Parent elements:
    • com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
  • Attributes:
    Name Type Default value Description
    computeThreads int 4

FSPrecomputedThumbnailsConfig

  • com.exalead.mercury.mami.deploy.v10.FSPrecomputedThumbnailsConfig
  • (Deprecated)
  • No documentation for this element.
  • Parent elements:
    • com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
  • Attributes:
    Name Type Default value Description
    computeThreads int 4

GDSPrecomputedThumbnailsConfig

  • com.exalead.mercury.mami.deploy.v10.GDSPrecomputedThumbnailsConfig
  • (Deprecated)
  • No documentation for this element.
  • Parent elements:
    • com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
  • Attributes:
    Name Type Default value Description
    computeThreads int 4
    ramBufferSizeMB long 16
    readMode enum(normal, direct) direct Value can be null or one of
    • normal
    • direct

ScratchHook

  • com.exalead.mercury.mami.deploy.v10.ScratchHook
  • A hook to plug in custom exa code that runs on BuildGroup scratch.
  • Parent elements:
    • com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
  • Attributes:
    Name Type Default value Description
    classId string The specified class must implement the com.exalead.mercury.indexing.CustomScratchHook Exascript interface.
  • Nested elements:
    Name Type Description
    KeyValue exa.bee.KeyValue*

BasicSlicePartioningPolicy

  • com.exalead.mercury.mami.deploy.v10.BasicSlicePartioningPolicy
  • Basic partitioning function based on a URL hash and a '%' (modulo).
  • Parent elements:
    • com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
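
The hash-and-modulo idea behind this policy can be sketched in Python as follows. The hash function actually used by the product is not documented here, so SHA-256 is an assumption chosen only because it is stable across runs; slice_for_url and nb_slices are hypothetical names.

```python
import hashlib


def slice_for_url(url: str, nb_slices: int) -> int:
    """Map a document URL to a slice index in [0, nb_slices) by hash modulo."""
    if nb_slices <= 0:
        raise ValueError("nb_slices must be positive")
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    url_hash = int.from_bytes(digest[:8], "big")  # 64-bit unsigned hash of the URL
    return url_hash % nb_slices
```

Because the slice depends only on the URL and the slice count, a given document always lands in the same slice, while changing the number of slices remaps most documents.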

StringValue

  • exa.bee.StringValue
  • No documentation for this element.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as annotationsToCopy)
    • com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as classes)
    • com.exalead.indexing.analysis.v10.HTMLCSSSelector (as classes)
    • com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as ids)
    • com.exalead.indexing.analysis.v10.HTMLCSSSelector (as ids)
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as idsAndClassesToIgnore)
    • com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as idsAndClassesToKeep)
    • com.exalead.indexing.analysis.v10.ConcatValues (as inputContexts)
    • com.exalead.indexing.analysis.v10.ContentCleanup (as inputContexts)
    • com.exalead.indexing.analysis.v10.CoordinatesFormatter (as inputContexts)
    • com.exalead.indexing.analysis.v10.DebugProcessor (as inputContexts)
    • com.exalead.indexing.analysis.v10.LanguageDetector (as inputContexts)
    • com.exalead.indexing.analysis.v10.LanguageSetter (as inputContexts)
    • com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as inputContexts)
    • com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as inputContexts)
    • com.exalead.indexing.analysis.v10.NumericalFormatter (as inputContexts)
    • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as inputContexts)
    • com.exalead.indexing.analysis.v10.RemoveContexts (as inputContexts)
    • com.exalead.indexing.analysis.v10.StringHash (as inputContexts)
    • com.exalead.indexing.analysis.v10.StringHash32 (as inputContexts)
    • com.exalead.indexing.analysis.v10.StringHash64 (as inputContexts)
    • com.exalead.indexing.analysis.v10.StringTransform (as inputContexts)
    • com.exalead.indexing.analysis.v10.UTF8Checker (as inputContexts)
    • com.exalead.indexing.analysis.v10.ValueSelector (as inputContexts)
    • com.exalead.indexing.analysis.v10.MimeCondition (as mimes)
    • com.exalead.indexing.analysis.v10.StandardPartsMerger (as partSpecificContexts)
    • com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as targetInstances)
    • com.exalead.indexing.analysis.v10.SimilarStringToPart (as values)
    • com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as values)
    • com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as values)
  • Attributes:
    Name Type Default value Description
    value string

KeyValue

  • exa.bee.KeyValue
  • No documentation for this element.
  • Parent elements:
    • com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
    • com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
    • com.exalead.indexing.analysis.v10.CustomSemanticProcessor (as CustomSemanticProcessor)
    • exa.bee.KeyValue (as KeyValue)
    • com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
    • com.exalead.mercury.mami.deploy.v10.ScratchHook (as ScratchHook)
    • com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
    • com.exalead.indexing.analysis.v10.CustomPublisher (as config)
  • Attributes:
    Name Type Default value Description
    key string The name of the key
    value string
    type string
    description string
  • Nested elements:
    Name Type Description
    KeyValue exa.bee.KeyValue*