AnalysisConfig
com.exalead.indexing.analysis.v10.AnalysisConfig
- AnalysisConfig represents a self-contained module for Document Analysis and is referenced by a BuildGroup. An analysis module defines the set of pipelines that are applied within it.
- Attributes:
Name | Type | Default value | Description |
name | string | | Name of the analysis module. Must be unique. |
linguistic | boolean | True | Extracts linguistic data for the dictionary, such as word counts. This impacts the ability to compute related terms and use word counts for ranking. |
- Nested elements:
Name | Type | Description |
AnalysisPipeline | com.exalead.indexing.analysis.v10.AnalysisPipeline* | |
AnalysisPipeline
com.exalead.indexing.analysis.v10.AnalysisPipeline
- A document analysis pipeline. Each pipeline has an associated accept condition, which is tested against each input document. If a document matches the condition, it is processed by this pipeline. If not, the condition of the next pipeline in the list of pipelines defined in a DocumentAnalysis object is tested. A document refused by all pipelines is neither processed nor indexed. Pipeline processing consists of several stages:
- Document Processing Stage - performed by a list of DocumentProcessors, which process each Document sequentially. Document Processors manipulate the 'DocumentParts' (binary data pushed through the PAPI) and the 'DocumentChunks' (textual data obtained from PAPI meta, from processing a DocumentPart, or from processing pre-existing DocumentChunks). Each DocumentChunk has a textual content, a ContextName, a language, and a score, and may belong to a DocumentPart. A DocumentChunk belonging to no DocumentPart is called a root DocumentChunk.
- Semantic Processing Stage - involves a list of SemanticProcessors, which process each DocumentChunk of each Document sequentially (except those for which Semantic Processing is disabled in the mapping). Semantic Processing segments text into 'tokens' and then processes the text as a flow of tokens. SemanticAnnotations are produced on each token.
- Mapping - maps DocumentChunks and SemanticAnnotations to index fields.
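The pipeline selection rule described above (first matching accept condition wins, and a document refused by every pipeline is dropped) can be illustrated with a minimal sketch. This is not the product's implementation: the `Pipeline` class, the `select_pipeline` function, and the dict-based document are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical stand-ins: a document is a plain dict and an accept
# condition is any callable returning True or False.
Document = dict
AcceptCondition = Callable[[Document], bool]


@dataclass
class Pipeline:
    name: str
    accept: AcceptCondition


def select_pipeline(pipelines: Sequence[Pipeline], doc: Document) -> Optional[Pipeline]:
    """Return the first pipeline whose accept condition matches, else None."""
    for pipeline in pipelines:
        if pipeline.accept(doc):
            return pipeline
    # Refused by every pipeline: the document is neither processed nor indexed.
    return None


# Pipelines are tested in list order, so a catch-all pipeline goes last.
pipelines = [
    Pipeline("web", lambda d: d.get("source") == "web"),
    Pipeline("default", lambda d: True),
]
```

Because the conditions are tested in list order, placing a catch-all pipeline first would prevent any later pipeline from ever receiving a document.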
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisConfig (as AnalysisConfig)
- Attributes:
Name | Type | Default value | Description |
name | string | | |
errorAction | string | continue | Specifies the action to take if a document error occurs during processing. "discard": discards the document from the job; if the document was already in the index, the indexed version is not removed. "delete": discards the document from the job and deletes it from the index. "continue": keeps processing the document; the document will probably be incomplete in the index. |
reportDocumentErrors | boolean | True | Reports document errors in the global reporting store for further analysis. |
globalLogDocumentErrors | boolean | | Logs errors and exceptions reported by the processors in the global log (without stack trace). |
autoBlacklistDocuments | boolean | True | Tries to automatically blacklist documents that trigger serious failures. This option helps prevent failure loops, where the same documents keep triggering the same analysis process failures. |
tokenizationConfig | string | | Reference to the TokenizationConfig object to use for tokenization during the Semantic Processing Stage. |
autoconfigureFromDataModel | boolean | True | |
documentProcessorsProfiling | boolean | | Logs the CPU time spent in each document processor and in the main indexing phase. The total time spent in each processor is dumped in the analyzer log at the end of the job. |
semanticPipeTimeout | int | | CPU-time limit, in seconds, for the processing of a text chunk by the semantic pipe. |
slowDocumentWarningTimeUS | long | 5000000 | If processing a document takes longer than this time, a message is printed in the analyzer log. A value of 0 disables the warning. |
semanticProcessorsProfiling | boolean | | Logs the CPU time spent in each semantic processor. The total time spent in each processor is dumped in the analyzer log at the end of the job. Warning: This feature strongly impacts performance; enable it only if required. |
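As a rough illustration of the three errorAction values, the sketch below models the job and the index as plain sets of document identifiers. The `ErrorAction` enum, the `apply_error_action` function, and the set-based model are assumptions made for this example and do not reflect the product's internal API.

```python
from enum import Enum


class ErrorAction(Enum):
    DISCARD = "discard"
    DELETE = "delete"
    CONTINUE = "continue"


def apply_error_action(action: ErrorAction, doc_id: str, job: set, index: set) -> None:
    """Mutate the hypothetical job and index sets after a document error."""
    if action is ErrorAction.DISCARD:
        job.discard(doc_id)    # dropped from the job; the index is left untouched
    elif action is ErrorAction.DELETE:
        job.discard(doc_id)    # dropped from the job
        index.discard(doc_id)  # and removed from the index
    elif action is ErrorAction.CONTINUE:
        pass                   # stays in the job; may end up incomplete in the index


# Example: "discard" keeps a previously indexed version of the document.
job, index = {"doc1"}, {"doc1"}
apply_error_action(ErrorAction.DISCARD, "doc1", job, index)
```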
- Nested elements:
Name | Type | Description |
AcceptCondition | com.exalead.indexing.analysis.v10.AcceptCondition | |
DocumentProcessor | com.exalead.indexing.analysis.v10.DocumentProcessor* | |
FilteringConfiguration | com.exalead.indexing.analysis.v10.FilteringConfiguration | |
LanguageConfiguration | com.exalead.indexing.analysis.v10.LanguageConfiguration* | |
MappingConfiguration | com.exalead.indexing.analysis.v10.MappingConfiguration | |
SemanticProcessor | com.exalead.indexing.analysis.v10.SemanticProcessor* | |
AndCondition
com.exalead.indexing.analysis.v10.AndCondition
- AndCondition matches if all child AcceptConditions match.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
com.exalead.indexing.analysis.v10.StringHash (as StringHash)
com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
- Nested elements:
Name | Type | Description |
AcceptCondition | com.exalead.indexing.analysis.v10.AcceptCondition* | |
OrCondition
com.exalead.indexing.analysis.v10.OrCondition
- OrCondition matches if at least one child AcceptCondition matches.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
com.exalead.indexing.analysis.v10.StringHash (as StringHash)
com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
- Nested elements:
Name | Type | Description |
AcceptCondition | com.exalead.indexing.analysis.v10.AcceptCondition* | |
NotCondition
com.exalead.indexing.analysis.v10.NotCondition
- Matches if the child condition does not match. If there is no child condition (null), this condition never matches.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
com.exalead.indexing.analysis.v10.StringHash (as StringHash)
com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
- Nested elements:
Name | Type | Description |
AcceptCondition | com.exalead.indexing.analysis.v10.AcceptCondition | |
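The AndCondition, OrCondition, and NotCondition semantics described above can be sketched as composable predicates. This is an illustration only: the function names and the dict-based document are hypothetical. The source does not state how AndCondition or OrCondition behave with no children, so the sketch simply follows Python's `all` and `any` (true and false respectively on an empty list), which is an assumption.

```python
from typing import Callable, Optional, Sequence

# Hypothetical model: a document is a dict, a condition is a predicate on it.
Document = dict
Condition = Callable[[Document], bool]


def and_condition(children: Sequence[Condition]) -> Condition:
    """Matches if all child conditions match."""
    return lambda doc: all(child(doc) for child in children)


def or_condition(children: Sequence[Condition]) -> Condition:
    """Matches if at least one child condition matches."""
    return lambda doc: any(child(doc) for child in children)


def not_condition(child: Optional[Condition]) -> Condition:
    """Matches if the child does not match; never matches without a child."""
    if child is None:
        return lambda doc: False
    return lambda doc: not child(doc)


def is_pdf(doc: Document) -> bool:
    return doc.get("mime") == "application/pdf"


def from_web(doc: Document) -> bool:
    return doc.get("source") == "web"


# Accept PDF documents that do not come from the "web" source.
pdf_not_from_web = and_condition([is_pdf, not_condition(from_web)])
```

Note that a NotCondition without a child never matches, so it is not equivalent to negating an always-false condition.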
SourceCondition
com.exalead.indexing.analysis.v10.SourceCondition
- SourceCondition matches if the source of the document matches 'source'.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
com.exalead.indexing.analysis.v10.StringHash (as StringHash)
com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
- Attributes:
Name | Type | Default value | Description |
source | string | | Value of the 'source' for the document against which to check. |
BuildGroupCondition
com.exalead.indexing.analysis.v10.BuildGroupCondition
- BuildGroupCondition matches if the current buildgroup matches 'name'.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
com.exalead.indexing.analysis.v10.StringHash (as StringHash)
com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Value of the "buildgroup" for the document against which to check. |
MetaCondition
com.exalead.indexing.analysis.v10.MetaCondition
- MetaCondition matches if the Document contains a DocumentChunk whose meta name and value match the specified condition.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
com.exalead.indexing.analysis.v10.StringHash (as StringHash)
com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the meta against which to check. |
nameMode |
enum(equals, matches) |
equals |
Meta name test mode:
- "equals": Evaluates the DocumentChunk with a name equal to the specified one.
- "matches": Evaluates the DocumentChunk with a name matching the specified regular expression. The match is case insensitive.
|
valueMode |
enum(equals, contains, exists, matches) |
exists |
Value test mode:
- "exists": Matches if a DocumentChunk passes the name condition.
- "equals": Matches if a DocumentChunk passes the name condition and its textual content is equal to the 'value' attribute.
- "contains": Matches if a DocumentChunk passes the name condition and its textual content contains 'value' (pure string matching is performed, without tokenization).
- "matches": Matches if a DocumentChunk passes the name condition and its textual content matches the regular expression specified by the 'value' attribute. The match is case insensitive.
|
value |
string |
|
The string to check against the value of DocumentChunks. |
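The interaction between 'nameMode' and 'valueMode' can be illustrated with a minimal Python sketch. This is not the product's implementation: the `Chunk` class and `meta_condition` function are hypothetical names, "equals" is assumed to be case sensitive, and "matches" is assumed to be an unanchored search, none of which is stated above.

```python
import re
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Chunk:
    """Hypothetical stand-in for a DocumentChunk: a meta name and its text."""
    name: str
    text: str


def meta_condition(chunks: Iterable[Chunk], name: str,
                   name_mode: str = "equals",
                   value_mode: str = "exists",
                   value: str = "") -> bool:
    """Return True if at least one chunk passes the name test and the value test."""
    if name_mode not in ("equals", "matches"):
        raise ValueError("unknown nameMode: " + name_mode)
    if value_mode not in ("exists", "equals", "contains", "matches"):
        raise ValueError("unknown valueMode: " + value_mode)

    for chunk in chunks:
        # Name test. Assumption: "equals" is case sensitive and
        # "matches" is an unanchored, case-insensitive search.
        if name_mode == "equals":
            name_ok = chunk.name == name
        else:
            name_ok = re.search(name, chunk.name, re.IGNORECASE) is not None
        if not name_ok:
            continue

        # Value test, applied only to chunks that passed the name test.
        if value_mode == "exists":
            return True
        if value_mode == "equals" and chunk.text == value:
            return True
        if value_mode == "contains" and value in chunk.text:
            return True  # plain substring test, no tokenization
        if value_mode == "matches" and re.search(value, chunk.text, re.IGNORECASE):
            return True
    return False


chunks = [Chunk("author", "Jane Doe"), Chunk("title", "Annual Report 2020")]
print(meta_condition(chunks, "author"))  # default valueMode is "exists"
print(meta_condition(chunks, "title", value_mode="contains", value="Report"))
```

With the default 'valueMode' of "exists", the 'value' attribute plays no role; it only matters for "equals", "contains", and "matches".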
MimeCondition
com.exalead.indexing.analysis.v10.MimeCondition
- A condition that matches if the MIME type of the FIRST document part is in the list. Note: Conditions work on documents, but MIME types are set per document part. The MimeCondition only tests the MIME type of the first part, if present.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
com.exalead.indexing.analysis.v10.StringHash (as StringHash)
com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
- Nested elements:
Name |
Type |
Description |
mimes |
exa.bee.StringValue* |
|
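The "first part only" rule can be sketched as follows. The `Part` class and `mime_condition` function are hypothetical names used for illustration, and returning False for a document with no parts is an assumption, since the behavior in that case is not specified above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Part:
    """Hypothetical stand-in for a document part carrying a MIME type."""
    mime: Optional[str] = None


def mime_condition(parts: List[Part], mimes: List[str]) -> bool:
    """Test only the first part's MIME type against the configured list."""
    if not parts:
        # Assumption: a document without parts does not match.
        return False
    return parts[0].mime in mimes


doc = [Part("text/html"), Part("application/pdf")]
print(mime_condition(doc, ["text/html"]))        # first part is tested
print(mime_condition(doc, ["application/pdf"]))  # second part is ignored
```

Under this reading, a document whose second part is a PDF is not selected by a condition listing only "application/pdf".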
URLMatchCondition
com.exalead.indexing.analysis.v10.URLMatchCondition
- A condition that matches if the URI matches the regexp.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
com.exalead.indexing.analysis.v10.StringHash (as StringHash)
com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
- Attributes:
Name |
Type |
Default value |
Description |
regexp |
string |
|
The regexp. Note: It is not anchored by default; for example, use '.*\.asp' to match .asp URIs. |
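The effect of anchoring on a pattern such as '.*\.asp' can be checked with a small Python sketch. The regexp engine used by the product is not specified here, so Python's `re` module serves only as a stand-in, and both helper functions are hypothetical.

```python
import re


def url_match_unanchored(regexp: str, uri: str) -> bool:
    """Match if the pattern is found anywhere in the URI."""
    return re.search(regexp, uri) is not None


def url_match_anchored(regexp: str, uri: str) -> bool:
    """Match only if the pattern covers the whole URI."""
    return re.fullmatch(regexp, uri) is not None


uri = "http://example.com/page.asp"

# The documented pattern matches under both interpretations.
print(url_match_unanchored(r".*\.asp", uri), url_match_anchored(r".*\.asp", uri))

# Without the leading '.*', only the unanchored interpretation matches.
print(url_match_unanchored(r"\.asp", uri), url_match_anchored(r"\.asp", uri))
```

In an unanchored search, '.*\.asp' also matches a URI ending in '.aspx'. If only '.asp' URIs are wanted, an explicit end anchor such as '\.asp$' is the safer choice, assuming the engine supports '$'.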
FilenameMatchCondition
com.exalead.indexing.analysis.v10.FilenameMatchCondition
- A condition that matches if the filename of the FIRST document part matches the regexp. Note: Conditions work on documents, but filenames are set per document part. FilenameMatchCondition only tests the filename of the first part, if present.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
com.exalead.indexing.analysis.v10.StringHash (as StringHash)
com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
- Attributes:
Name |
Type |
Default value |
Description |
regexp |
string |
|
The regexp. Note: It is not anchored by default; for example, use '.*\.doc' to match .doc files. |
BinaryContentCondition
com.exalead.indexing.analysis.v10.BinaryContentCondition
- A condition that matches if the binary content of the FIRST document part matches the binary string. Note: Conditions work on documents, but content is set per document part. BinaryContentCondition only tests the binary content of the first part, if present.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
com.exalead.indexing.analysis.v10.StringHash (as StringHash)
com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
- Attributes:
Name |
Type |
Default value |
Description |
offset |
int |
|
Offset, in bytes, of the binary data to be compared (0 for the beginning of the file). Negative values are taken as an offset from the end of the file (-1 for the last byte). |
match |
string |
|
Binary string to be compared. The string may contain any ASCII (7-bit) character, or the following '\' escape sequences:
- \xNN A hexadecimal-encoded character (each N in '0'..'9' or 'A'..'F')
- \NNN An octal-encoded character (each N in '0'..'9')
- \n Character 10
- \r Character 13
- \\ Character '\'
- \" Character '"'
- \? Any character
|
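The offset and escape rules above can be sketched in Python as follows. This is a minimal illustration, not the product's implementation: the helper names `parse_match` and `matches` are hypothetical, and the sketch assumes that '\?' matches exactly one byte and that octal escapes only use the digits 0 to 7.

```python
import re

# Single-character escapes listed in the table; None stands for the '\?' wildcard.
SIMPLE_ESCAPES = {"n": 10, "r": 13, "\\": 92, '"': 34, "?": None}


def parse_match(pattern):
    """Turn a 'match' string into a list of byte values (None = any byte)."""
    expected = []
    i = 0
    while i < len(pattern):
        if pattern[i] != "\\":
            expected.append(ord(pattern[i]))  # plain ASCII character
            i += 1
            continue
        rest = pattern[i + 1:]
        hex_m = re.match(r"x([0-9A-F]{2})", rest)
        oct_m = re.match(r"[0-7]{3}", rest)  # assumption: true octal digits only
        if hex_m:
            expected.append(int(hex_m.group(1), 16))
            i += 4  # backslash, 'x', two hex digits
        elif oct_m:
            value = int(oct_m.group(0), 8)
            if value > 255:
                raise ValueError(f"octal escape out of byte range at position {i}")
            expected.append(value)
            i += 4  # backslash, three octal digits
        elif rest[:1] in SIMPLE_ESCAPES:
            expected.append(SIMPLE_ESCAPES[rest[0]])
            i += 2  # backslash, one character
        else:
            raise ValueError(f"unsupported escape at position {i}")
    return expected


def matches(data, offset, pattern):
    """Compare 'pattern' with 'data' at 'offset' (negative = from the end)."""
    expected = parse_match(pattern)
    start = offset if offset >= 0 else len(data) + offset
    if start < 0 or start + len(expected) > len(data):
        return False  # the compared window falls outside the data
    window = data[start:start + len(expected)]
    return all(e is None or e == b for e, b in zip(expected, window))
```

With these assumptions, `matches(b"%PDF-1.4", 0, "%PDF")` tests a signature at the start of the content, and `matches(data, -1, "\\n")` tests whether the last byte is character 10.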
DataModelClassCondition
com.exalead.indexing.analysis.v10.DataModelClassCondition
- A condition that matches if the document has the corresponding DataModel.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
com.exalead.indexing.analysis.v10.StringHash (as StringHash)
com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
- Attributes:
Name |
Type |
Default value |
Description |
className |
string |
|
The restricted DataModel class |
CustomDirectiveCondition
com.exalead.indexing.analysis.v10.CustomDirectiveCondition
- A condition that matches if the document has the specified directive name, with an optional specific value.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.AndCondition (as AndCondition)
com.exalead.indexing.analysis.v10.CGRDocumentProcessor (as CGRDocumentProcessor)
com.exalead.indexing.analysis.v10.ConcatValues (as ConcatValues)
com.exalead.indexing.analysis.v10.ContentCleanup (as ContentCleanup)
com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
com.exalead.indexing.analysis.v10.CoordinatesFormatter (as CoordinatesFormatter)
com.exalead.indexing.analysis.v10.CopyContext (as CopyContext)
com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
com.exalead.indexing.analysis.v10.DataModelClassResolver (as DataModelClassResolver)
com.exalead.indexing.analysis.v10.DateFormatter (as DateFormatter)
com.exalead.indexing.analysis.v10.DebugCrashProcessor (as DebugCrashProcessor)
com.exalead.indexing.analysis.v10.DebugProcessor (as DebugProcessor)
com.exalead.indexing.analysis.v10.DiscardDocument (as DiscardDocument)
com.exalead.indexing.analysis.v10.DocumentProcessor (as DocumentProcessor)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
com.exalead.indexing.analysis.v10.DoubleToLong (as DoubleToLong)
com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning (as FixedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
com.exalead.indexing.analysis.v10.FormatCheckerDate (as FormatCheckerDate)
com.exalead.indexing.analysis.v10.GeoBBoxProcessor (as GeoBBoxProcessor)
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as HTMLCSSExtractor)
com.exalead.indexing.analysis.v10.HTMLCSSSelector (as HTMLCSSSelector)
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as HTMLRelevantContentExtractor)
com.exalead.indexing.analysis.v10.HTMLTableExtractor (as HTMLTableExtractor)
com.exalead.indexing.analysis.v10.InferFileExtension (as InferFileExtension)
com.exalead.indexing.analysis.v10.InsertCurrentDate (as InsertCurrentDate)
com.exalead.indexing.analysis.v10.JavaDocumentProcessor (as JavaDocumentProcessor)
com.exalead.indexing.analysis.v10.JavaProcessor (as JavaProcessor)
com.exalead.indexing.analysis.v10.JavaScriptProcessor (as JavaScriptProcessor)
com.exalead.indexing.analysis.v10.LanguageDetector (as LanguageDetector)
com.exalead.indexing.analysis.v10.LanguageSetter (as LanguageSetter)
com.exalead.indexing.analysis.v10.MIMEDetector (as MIMEDetector)
com.exalead.indexing.analysis.v10.MathDocumentProcessor (as MathDocumentProcessor)
com.exalead.indexing.analysis.v10.MetaFinder (as MetaFinder)
com.exalead.indexing.analysis.v10.MimeTypeSetter (as MimeTypeSetter)
com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as MultiContextCSVEncoder)
com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as MultiContextDocumentProcessor)
com.exalead.indexing.analysis.v10.NativeTextExtractor (as NativeTextExtractor)
com.exalead.indexing.analysis.v10.NewChunk (as NewChunk)
com.exalead.indexing.analysis.v10.NotCondition (as NotCondition)
com.exalead.indexing.analysis.v10.NumericalFormatter (as NumericalFormatter)
com.exalead.indexing.analysis.v10.OrCondition (as OrCondition)
com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor (as PLMExpandDocumentProcessor)
com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor (as PrecomputedThumbnailsDocumentProcessor)
com.exalead.indexing.analysis.v10.PrintfValues (as PrintfValues)
com.exalead.indexing.analysis.v10.PublicUrlProcessor (as PublicUrlProcessor)
com.exalead.indexing.analysis.v10.RealTimeAlerting (as RealTimeAlerting)
com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as RemoteHTTPTransformer)
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as RemoteMOTAPIDocumentProcessor)
com.exalead.indexing.analysis.v10.RemoveContexts (as RemoveContexts)
com.exalead.indexing.analysis.v10.RenameContext (as RenameContext)
com.exalead.indexing.analysis.v10.RenameUnmappedContexts (as RenameUnmappedContexts)
com.exalead.indexing.analysis.v10.ReplaceContextNames (as ReplaceContextNames)
com.exalead.indexing.analysis.v10.ReplaceRegexp (as ReplaceRegexp)
com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
com.exalead.indexing.analysis.v10.SimilarStringToPart (as SimilarStringToPart)
com.exalead.indexing.analysis.v10.SingleContextDocumentProcessor (as SingleContextDocumentProcessor)
com.exalead.indexing.analysis.v10.SplitValues (as SplitValues)
com.exalead.indexing.analysis.v10.StandardPartsMerger (as StandardPartsMerger)
com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor (as StorageServiceDocumentProcessor)
com.exalead.indexing.analysis.v10.StringHash (as StringHash)
com.exalead.indexing.analysis.v10.StringHash32 (as StringHash32)
com.exalead.indexing.analysis.v10.StringHash64 (as StringHash64)
com.exalead.indexing.analysis.v10.StringTransform (as StringTransform)
com.exalead.indexing.analysis.v10.TextToNum (as TextToNum)
com.exalead.indexing.analysis.v10.URLCodec (as URLCodec)
com.exalead.indexing.analysis.v10.URLTransformer (as URLTransformer)
com.exalead.indexing.analysis.v10.UTF8Checker (as UTF8Checker)
com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as UniformRandomContextGenerator)
com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer (as UnitsOfMeasurementNormalizer)
com.exalead.indexing.analysis.v10.ValueSelector (as ValueSelector)
com.exalead.indexing.analysis.v10.WildcardIndexing (as WildcardIndexing)
com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as ZipfRandomContextGenerator)
- Attributes:
Name |
Type |
Default value |
Description |
directiveName |
string |
|
The expected directive name |
directiveValue |
string |
|
An optional expected value for the given directive |
LanguageDetector
com.exalead.indexing.analysis.v10.LanguageDetector
- Language detection is performed on the text of all the DocumentChunks associated with the specified input ContextNames for which the language was not already detected or specified. A statistical algorithm takes the whole text of these DocumentChunks into account to detect the language, which is then set as the language of all the specified chunks. The language attribute of a DocumentChunk is used, for example, by semantic processing. A language is represented by its ISO 639-1 code, for example fr or en.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
languageContext |
string |
|
If this is not null and if there is a DocumentChunk with a ContextName matching 'languageContext':
- no automatic detection will be performed,
- the language specified will be used as the language of the DocumentChunks associated with the ContextNames specified as input.
|
languagesToDetect |
string |
|
If not null, restricts the language detector to a set of languages. If you only have a small set of languages to detect, restricting the detector to this set improves precision. The list is comma-separated, for example "en,fr". |
defaultLanguage |
string |
|
If not null, 'defaultLanguage' will be used as the default language when automatic detection fails. |
exclude |
boolean |
|
If true, "inputContexts" is an exclude list instead of an include list. Language detection is then performed on all DocumentChunks except those whose ContextName appears in 'inputContexts'. |
outputContext |
string |
|
ContextName of the DocumentChunk to create. It will contain the language detected in the processed DocumentChunks as defined in ISO 639-1. |
minLangPercentage |
int |
33 |
Minimum percentage ([0-100]) that a language must reach to be detected (0 = always keeps a detected language). |
languagesToKeep |
int |
|
Keeps the n most represented languages in the document. A value of 0 lets the minLangPercentage select the languages. |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", contains the original document processor generated by the data model. Use this to revert easily from the "customized" state to "auto". |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
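The interplay between 'minLangPercentage', 'languagesToKeep' and 'defaultLanguage' can be illustrated with a short Python sketch. It is only one reading of the attribute descriptions above, not the actual detector: the function `select_languages` is hypothetical, the per-language percentages are assumed to be already computed by the statistical algorithm, and the sketch assumes that a non-zero 'languagesToKeep' takes precedence over 'minLangPercentage'.

```python
def select_languages(shares, min_lang_percentage=33, languages_to_keep=0,
                     default_language=None):
    """Pick the languages to keep from detected shares.

    shares: dict mapping an ISO 639-1 code to its percentage (0-100)
            of the analyzed text (assumed input, not part of the API).
    """
    # Most represented languages first.
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    if languages_to_keep > 0:
        # Keep the n most represented languages.
        kept = [lang for lang, _ in ranked[:languages_to_keep]]
    else:
        # 0 lets minLangPercentage select the languages.
        kept = [lang for lang, pct in ranked if pct >= min_lang_percentage]
    if not kept and default_language is not None:
        # Detection failed: fall back to the default language.
        kept = [default_language]
    return kept
```

Under these assumptions, a document that is 60% English, 35% French and 5% German keeps English and French with the default threshold of 33, and only English when 'languagesToKeep' is 1.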
LanguageSetter
com.exalead.indexing.analysis.v10.LanguageSetter
- Sets the given language as the language of all the DocumentChunks associated with the specified input ContextNames. The language attribute of a DocumentChunk is used, for example, by semantic processing. The language is represented by its ISO 639-1 code, for example fr or en.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
language |
iso code |
|
Language specified by ISO 639-1 code. |
outputContext |
string |
|
ContextName of the DocumentChunk to create. It will contain the language name as defined in ISO 639-1. |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", contains the original document processor generated by the data model. Use this to revert easily from the "customized" state to "auto". |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
ContentCleanup
com.exalead.indexing.analysis.v10.ContentCleanup
- Analyzes each DocumentChunk and removes whitespace, as defined by the Unicode specification. This includes ' ', '\r' and '\n'. Input: all DocumentChunks associated with the specified 'inputContexts' ContextNames. Output: same as input.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
duplicateWhitespaces |
boolean |
|
Removes duplicate whitespaces ('  ' -> ' '). |
leading |
boolean |
|
Removes the leading whitespaces |
trailing |
boolean |
|
Removes the trailing whitespaces |
spaces |
boolean |
|
Removes *all* whitespaces. |
stripHTML |
boolean |
|
Strips HTML tags |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", contains the original document processor generated by the data model. Use this to revert easily from the "customized" state to "auto". |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
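The effect of the ContentCleanup options can be approximated in Python as shown below. This is a sketch under stated assumptions, not the processor's code: the function `content_cleanup` is hypothetical, duplicate whitespace is assumed to collapse into a single space, and HTML stripping is reduced to a naive tag-removal regular expression.

```python
import re


def content_cleanup(text, duplicate_whitespaces=False, leading=False,
                    trailing=False, spaces=False, strip_html=False):
    """Apply the whitespace options to the text of one chunk."""
    if strip_html:
        # Naive assumption: anything between '<' and '>' is a tag.
        text = re.sub(r"<[^>]+>", "", text)
    if spaces:
        # Removes *all* whitespace, so the other options become moot.
        return re.sub(r"\s+", "", text)
    if duplicate_whitespaces:
        # Assumption: a run of whitespace collapses into one space.
        text = re.sub(r"\s+", " ", text)
    if leading:
        text = text.lstrip()
    if trailing:
        text = text.rstrip()
    return text
```

In Python, `\s` on a `str` pattern and the argument-less `lstrip()`/`rstrip()` both follow the Unicode notion of whitespace, which is why they are used here.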
ValueSelector
com.exalead.indexing.analysis.v10.ValueSelector
- Takes the input contexts in the specified order, and as soon as one is found, it copies the content to the output context and stops.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
ContextName to be associated with the DocumentChunk created for each selection. |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", contains the original document processor generated by the data model. Use this to revert easily from the "customized" state to "auto". |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
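The "first context found wins" behavior of ValueSelector can be sketched in Python. The function `value_selector` and the representation of chunks as (ContextName, text) tuples are illustrative assumptions; the sketch also assumes that only the first chunk of the first matching context is copied.

```python
def value_selector(chunks, input_contexts, output_context):
    """Copy the first available input context to the output context.

    chunks: list of (context_name, text) tuples (assumed representation).
    Returns a new list; the input list is left untouched.
    """
    for context in input_contexts:          # respect the configured order
        for name, text in chunks:
            if name == context:
                # Found: copy the content and stop looking further.
                return chunks + [(output_context, text)]
    return list(chunks)                     # nothing found, nothing created
```

A typical use is a fallback chain, for example selecting a display title from 'title', then 'subject', then 'filename', whichever exists first.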
UTF8Checker
com.exalead.indexing.analysis.v10.UTF8Checker
- Checks that the text passing through is valid UTF-8. Emits a warning with the document URI and the context name if input is malformed. Optionally deletes invalid chunks.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Indicates whether this document processor is managed by a data model. Possible values: null, auto, customized, error.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
deleteInvalidChunks |
boolean |
|
Removes invalid chunks from documents. |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", contains the original document processor generated by the data model. Use this to revert easily from the "customized" state to "auto". |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
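The check performed by UTF8Checker can be pictured with the following Python sketch. The function `utf8_check`, the chunk representation as (ContextName, raw bytes) tuples and the warning text are all assumptions made for illustration; only the general behavior (warn with URI and context name, optionally delete) comes from the description above.

```python
def utf8_check(uri, chunks, delete_invalid_chunks=False):
    """Validate chunk contents as UTF-8.

    chunks: list of (context_name, raw_bytes) tuples (assumed representation).
    Returns (kept_chunks, warnings).
    """
    kept, warnings = [], []
    for name, raw in chunks:
        try:
            raw.decode("utf-8")  # strict decoding raises on malformed input
        except UnicodeDecodeError:
            warnings.append(f"invalid UTF-8 in document {uri}, context {name}")
            if delete_invalid_chunks:
                continue         # drop the malformed chunk
        kept.append((name, raw))
    return kept, warnings
```

For example, the bytes `b"caf\xc3\xa9"` are valid UTF-8, whereas `b"caf\xe9"` (Latin-1 encoded) is reported as malformed.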
ConcatValues
com.exalead.indexing.analysis.v10.ConcatValues
- Concatenates all textual content of DocumentChunks where ContextName matches 'inputContexts', and joins them with the 'join' string. A single DocumentChunk with ContextName 'outputContext' is created as an output.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
ContextName to be associated with the DocumentChunk created for each concatenated value. |
join |
string |
|
Optional string inserted between concatenated values. |
strict |
boolean |
True |
Requires all the input contexts to be found in the document in order to generate the concatenation. |
allowDuplicates |
boolean |
True |
If true, and if there are multiple DocumentChunks with the same ContextName, it concatenates them all. If false, only the first DocumentChunk among all those with the same ContextName is kept. |
cartesianProduct |
boolean |
|
If there are multiple DocumentChunks with the same ContextName, it generates the cartesian product between all values. |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
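The interaction between 'join', 'strict', 'allowDuplicates' and 'cartesianProduct' can be sketched as follows. This is a minimal illustration in Python, not the product's implementation: the function name, the representation of DocumentChunks as (ContextName, text) pairs, and the reading of 'strict' as "every input context must be present" are assumptions.

```python
from itertools import product

def concat_values(chunks, input_contexts, join="", strict=True,
                  allow_duplicates=True, cartesian_product=False):
    """chunks: list of (context_name, text) pairs; returns the output texts."""
    groups = []
    for ctx in input_contexts:
        values = [text for name, text in chunks if name == ctx]
        if not values:
            if strict:
                return []  # assumed meaning of 'strict': a missing context cancels the output
            continue
        groups.append(values)
    if not groups:
        return []
    if cartesian_product:
        # one output per combination of values taken across contexts
        return [join.join(combo) for combo in product(*groups)]
    if not allow_duplicates:
        # keep only the first chunk of each context
        groups = [values[:1] for values in groups]
    return [join.join(v for values in groups for v in values)]

chunks = [("first", "Ada"), ("last", "Lovelace"), ("last", "King")]
print(concat_values(chunks, ["first", "last"], join=" "))
print(concat_values(chunks, ["first", "last"], join=" ", cartesian_product=True))
```

Each returned string would correspond to one DocumentChunk carrying the 'outputContext' ContextName.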
RemoveContexts
com.exalead.indexing.analysis.v10.RemoveContexts
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
MultiContextCSVEncoder
com.exalead.indexing.analysis.v10.MultiContextCSVEncoder
- Creates a DocumentChunk containing the ContextName and the textual value of the DocumentChunks matching 'inputContexts'. This processor can be used, for instance, to store arbitrary (key,value) pairs into one single index field. Note that this storing method is inefficient and should be used with caution.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
The ContextName used for newly created chunks. |
processUnmappedContexts |
boolean |
|
All DocumentChunks with an unmapped ContextName in the document will be used for input. This can be used to emulate the 'default meta' and 'content' field feature of CloudView 4.6. |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
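As an illustration of storing (key,value) pairs in a single field, the sketch below encodes matching chunks as CSV rows with Python's standard csv module. The exact encoding used by MultiContextCSVEncoder is not specified here, so the row layout, the function name and the (ContextName, text) pair representation are assumptions.

```python
import csv
import io

def encode_pairs(chunks, input_contexts):
    """Encode (context_name, text) pairs whose context is selected as CSV rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for name, text in chunks:
        if name in input_contexts:
            writer.writerow([name, text])  # values containing commas get quoted
    return buffer.getvalue()

chunks = [("color", "red"), ("note", "a, b"), ("skip", "x")]
print(encode_pairs(chunks, {"color", "note"}))
```

Decoding such a field requires parsing every row at query or display time, which is one reason this storage method is best kept for small sets of pairs.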
StringHash
com.exalead.indexing.analysis.v10.StringHash
- The StringHash processor computes a signed hash of the textual input value. For example, this value can be used in a field used for grouping.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
nbBits |
int |
64 |
The size of the hash, in bits, including the sign bit. The hash values will be in [-2^(nbBits-1); 2^(nbBits-1) - 1]. |
outputContext |
string |
|
The ContextName used for the newly created chunk. |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
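The range given for 'nbBits' can be checked with a short Python sketch. The hash function used by StringHash is not documented here; SHA-256 is used purely as a stand-in, and the function names are hypothetical.

```python
import hashlib

def signed_range(nb_bits):
    """Bounds of a signed nb_bits-wide integer: [-2^(nb_bits-1), 2^(nb_bits-1) - 1]."""
    return -(1 << (nb_bits - 1)), (1 << (nb_bits - 1)) - 1

def string_hash(text, nb_bits=64):
    """Stand-in hash: keep the low nb_bits of SHA-256, read as a signed integer."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big") & ((1 << nb_bits) - 1)
    if value >= 1 << (nb_bits - 1):
        value -= 1 << nb_bits  # two's complement reinterpretation
    return value

print(signed_range(64))
print(signed_range(32))
print(string_hash("example", nb_bits=32))
```

Smaller 'nbBits' values shrink the field but raise the chance that two different input values collide on the same hash, which matters when the hash is used for grouping.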
StringHash64
com.exalead.indexing.analysis.v10.StringHash64
- The StringHash64 processor computes a signed 64-bit hash of the textual input value. For example, this value can be used in a field used for grouping.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
The ContextName used for the newly created chunk. |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
StringHash32
com.exalead.indexing.analysis.v10.StringHash32
- The StringHash32 processor computes a signed 32-bit hash of the textual input value. For example, this value can be used in a field used for grouping.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
The ContextName used for the newly created chunk. |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
NumericalFormatter
com.exalead.indexing.analysis.v10.NumericalFormatter
- The Numerical Formatter processor creates valid numerical chunks from various number formats.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
The ContextName used for the newly created chunk. If null, it uses the same name as the input. |
precision |
int |
|
Number of digits relevant in the decimal part. |
round |
int |
|
Rounds the integer part with this range. |
removeTrailingZeros |
boolean |
True |
Removes the trailing zeros in the decimal part. |
groupSeparator |
string |
|
Group separator. |
decimalSeparator |
string |
. |
Decimal separator. |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
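A rough Python sketch of how 'precision', 'removeTrailingZeros', 'groupSeparator' and 'decimalSeparator' might combine is shown below. It assumes the separators describe the input text and that the output always uses "." as decimal separator; neither point is confirmed by this reference, the rounding mode is arbitrary, and the 'round' attribute is not modeled.

```python
from decimal import Decimal, ROUND_HALF_UP

def format_number(text, precision=2, remove_trailing_zeros=True,
                  group_separator=",", decimal_separator="."):
    """Parse a localized number and emit a plain decimal string."""
    cleaned = text.strip()
    if group_separator:
        cleaned = cleaned.replace(group_separator, "")
    if decimal_separator != ".":
        cleaned = cleaned.replace(decimal_separator, ".")
    value = Decimal(cleaned)
    quantum = Decimal(1).scaleb(-precision)  # 0.01 for precision=2
    result = format(value.quantize(quantum, rounding=ROUND_HALF_UP), "f")
    if remove_trailing_zeros and "." in result:
        result = result.rstrip("0").rstrip(".")
    return result

print(format_number("1,234.500"))
print(format_number("1.234,5", group_separator=".", decimal_separator=","))
```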
CoordinatesFormatter
com.exalead.indexing.analysis.v10.CoordinatesFormatter
- The Coordinates Formatter processor creates a normalized chunk for the latitude and longitude.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
The ContextName used for the newly created chunk. |
latitudeContext |
string |
|
The ContextName used as input for the latitude |
latitudeFormat |
enum(DMS, Decimal) |
|
The input format for the latitude. Value can be one of: DMS, Decimal. |
longitudeContext |
string |
|
The ContextName used as input for the longitude |
longitudeFormat |
enum(DMS, Decimal) |
|
The input format for the longitude. Value can be one of: DMS, Decimal. |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
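The difference between the DMS and Decimal input formats comes down to the standard conversion degrees + minutes/60 + seconds/3600, negated for the southern and western hemispheres. The Python sketch below illustrates it; the DMS syntax accepted by the regular expression and the function name are assumptions, and the layout of the normalized output chunk is not shown because it is not documented here.

```python
import re

# Assumed DMS syntax: degrees, minutes, seconds, then a hemisphere letter.
DMS_PATTERN = re.compile(
    r"""\s*(\d+)[°\s]+(\d+)['\s]+(\d+(?:\.\d+)?)"?\s*([NSEW])\s*"""
)

def dms_to_decimal(text):
    """Convert a degrees/minutes/seconds string to signed decimal degrees."""
    match = DMS_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a DMS coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    return -value if hemisphere in "SW" else value

print(dms_to_decimal('48°51\'24"N'))
print(dms_to_decimal('74°0\'21"W'))
```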
DebugProcessor
com.exalead.indexing.analysis.v10.DebugProcessor
- Dumps all the DocumentChunks named after 'inputContexts' on Standard Output. This provides a log of the 'Analysis' process.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
dump |
boolean |
True |
|
outputContext |
string |
|
The ContextName used for the newly created chunk. |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
RemoteMOTAPIDocumentProcessor
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor
- The processing of each input context is handled by the targeted remote API.
- targetBuildGroups: list of build groups that should be used to handle processing.
- remoteMOTAPIConfigName: name of the RemoteMOTAPIConfig object, as defined in the RemoteMOTAPIConfig.xml high-level configuration file.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
remoteMOTAPIConfigName |
string |
|
|
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
targetInstances |
exa.bee.StringValue* |
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
StringTransform
com.exalead.indexing.analysis.v10.StringTransform
- Applies textual transformations on chunks from several contexts:
- trims blanks at the beginning and end of chunks
- reduces sequences of blanks to just one
- changes text to uppercase/lowercase/normalized/capitalized
Outputs replace inputs.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
spaces |
string |
|
What to do with spaces ("trim" or "normalize-spaces", default set to nothing) |
form |
string |
|
What transformation to apply ("lowercase", "uppercase", "normalized", "capitalized", default set to nothing) |
- Nested elements:
Name |
Type |
Description |
inputContexts |
exa.bee.StringValue* |
The processor will only be applied to DocumentChunks with a ContextName specified in this list. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
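The effect of the 'spaces' and 'form' attributes can be approximated in Python as follows. This is only a sketch: it assumes "normalize-spaces" also trims the ends, it leaves out the "normalized" form because its exact behavior is not described here, and the function name is hypothetical.

```python
def string_transform(text, spaces=None, form=None):
    """Apply the space handling first, then the case transformation."""
    if spaces == "trim":
        text = text.strip()
    elif spaces == "normalize-spaces":
        text = " ".join(text.split())  # collapses runs of blanks, trims both ends
    if form == "lowercase":
        text = text.lower()
    elif form == "uppercase":
        text = text.upper()
    elif form == "capitalized":
        text = text.capitalize()
    return text

print(string_transform("  Hello   World ", spaces="normalize-spaces", form="uppercase"))
```

Because outputs replace inputs, the original chunk text is no longer available to later processors once the transformation has run.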
ReplaceValues
com.exalead.indexing.analysis.v10.ReplaceValues
- The ReplaceValues processor compares all DocumentChunks for a given inputContext with the specified KeyValue map. When the DocumentChunk value is an exact match for a key, it is replaced by the associated string. This processor can be used, for instance, to normalize different spellings in document metadata.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
KeyValue |
exa.bee.KeyValue* |
|
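The exact-match replacement described for ReplaceValues behaves like a dictionary lookup with a fallback to the original value. A minimal Python sketch, assuming chunks are represented as (ContextName, text) pairs and the KeyValue entries as a dict (both hypothetical representations):

```python
def replace_values(chunks, input_context, key_values):
    """Replace the text of chunks in input_context when it exactly matches a key."""
    return [
        (name, key_values.get(text, text) if name == input_context else text)
        for name, text in chunks
    ]

chunks = [("country", "UK"), ("country", "U.K."), ("title", "UK")]
mapping = {"UK": "United Kingdom", "U.K.": "United Kingdom"}
print(replace_values(chunks, "country", mapping))
```

Since the match is exact, variants that differ only by case or surrounding blanks each need their own key, or a prior normalization step.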
PublicUrlProcessor
com.exalead.indexing.analysis.v10.PublicUrlProcessor
- For each input DocumentChunk associated with the 'inputContext' ContextName, 4 DocumentChunks are created, each associated with a different ContextName:
- 'treeOutputContext'
- 'leafOutputContext'
- 'urlOutputContext'
- 'urlPathOutputContext'
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
treeOutputContext |
string |
|
The ContextName for the DocumentChunk created from the category path encoding the web site tree. |
leafOutputContext |
string |
|
The ContextName for the DocumentChunks created from the complete, normalized URL. |
urlOutputContext |
string |
|
The ContextName for the DocumentChunk created from the complete, normalized URL. |
urlPathOutputContext |
string |
|
The ContextName for the DocumentChunk created from the normalized URL. |
maxPathDepth |
int |
4 |
Maximum depth of the URL path. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
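One plausible reading of 'maxPathDepth' is that the category path encoding the site tree keeps the host plus at most that many path segments. The Python sketch below illustrates that reading only; the actual path layout and normalization rules of PublicUrlProcessor are not specified here, and the function name is hypothetical.

```python
from urllib.parse import urlsplit

def site_tree_path(url, max_path_depth=4):
    """Build a host/segment/... path truncated to max_path_depth segments."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    return "/".join([parts.netloc.lower()] + segments[:max_path_depth])

print(site_tree_path("http://Example.com/a/b/c/d/e/index.html"))
```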
DateFormatter
com.exalead.indexing.analysis.v10.DateFormatter
- If a document chunk matches either:
- a custom input format defined with UNIX date syntax (for example, %Y/%m/%d-%H:%M:%S)
- one of the automatically recognized date formats
the Date Formatter generates three additional document chunks, each with its own context name, using the following naming convention:
- $inputContext$dateTimeOutputContext (Default format: %Y/%m/%d-%H:%M:%S)
- $inputContext$dateOutputContext (Default format: %Y/%m/%d)
- $inputContext$timeOutputContext (Default format: %H:%M:%S)
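The naming convention and default formats above can be sketched with Python's datetime module, which accepts the same strptime/strftime directives for these fields. The suffix values ("_datetime", "_date", "_time") and the function name are hypothetical placeholders for the configured dateTimeOutputContext, dateOutputContext and timeOutputContext values.

```python
from datetime import datetime

DEFAULT_FORMATS = ("%Y/%m/%d-%H:%M:%S", "%Y/%m/%d", "%H:%M:%S")

def date_formatter(value, input_context, input_format,
                   suffixes=("_datetime", "_date", "_time")):
    """Parse value with input_format and emit the three derived chunks."""
    parsed = datetime.strptime(value, input_format)
    return {
        input_context + suffix: parsed.strftime(fmt)
        for suffix, fmt in zip(suffixes, DEFAULT_FORMATS)
    }

print(date_formatter("20/08/2014 13:45:10", "lastmod", "%d/%m/%Y %H:%M:%S"))
```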
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
dateTimeOutputContext |
string |
|
Suffix for the name of the DocumentChunk containing the date as defined by dateTimeOutputFormat (default YYYY/MM/DD-HH:MM:SS). The original ContextName of the input DocumentChunk and this suffix are concatenated ($orig$dateTimeOutputContext) to produce the ContextName actually used. This DocumentChunk is usually used for date display. |
dateTimeOutputFormat |
string |
|
A date and time output format compliant with libc's strftime. |
dateOutputContext |
string |
|
Suffix for the name of the DocumentChunk containing the date as defined by dateOutputFormat (default YYYY/MM/DD). The original name of the input DocumentChunk and this suffix are concatenated ($orig$dateOutputContext) to produce the name actually used. This DocumentChunk is usually remapped to a category for navigation. |
dateOutputFormat |
string |
|
A date output format compliant with libc's strftime. |
timeOutputContext |
string |
|
Suffix for the name of the DocumentChunk containing the time as defined by timeOutputFormat (default HH:MM:SS). The original name of the input DocumentChunk and this suffix are concatenated ($orig$timeOutputContext) to produce the name actually used. |
timeOutputFormat |
string |
|
A time output format compliant with libc's strftime. |
inputFormat |
string |
|
An optional date input format, compliant with libc's strptime() format. If such a format is provided, the automatic date format heuristic is disabled, and the provided date format is used exclusively. Documentation of accepted formats: (days and month literals are only recognized in English)
- Day
- %a: weekday abbreviated ("Mon", ...)
- %A: weekday full ("Monday", ...)
- %d: day of the month, zero filled [01-31]
- %e: Equivalent to %d [1-31]
- %j: day of the year, zero filled [001-366]
- %u: day of the week as a decimal number [1-7], with 1 representing Monday and 7 representing Sunday
- %w: day of week as a decimal number [0,6], with 0 representing Sunday
- Week
- %U: week number of the year (Sunday as first day of the week) as a decimal number [00,53]
- %W: week number of the year (Monday as the first day of the week) as a decimal number [00,53]
- %V: week of the year [01-53]
- Month
- %m: the month number [01-12]
- %b: month locale abbreviated ("Aug", ...)
- %h: equivalent to %b
- %B: locale's full month, variable length ("August")
- Year
- %y: The year within the century. For two-digit dates, [69,99] is mapped to [1969,1999] and [00,68] is mapped to [2000,2068]
- %Y: The year, including the century (for example, 2014)
- %g: last two digits of year of ISO week number (see %G)
- %G: year of ISO week number (see %V), for example, 2014; normally useful only with %V
- Century
- %C: The century number [00,99]
- Date
- %D: Equivalent to mm/dd/yy (08/20/14)
- %x: locale's date representation (mm/dd/yy), 08/20/2014
- %F: %Y-%m-%d (2014-08-20)
- Hours
- %l: hour (12-hour clock), for example, [1-12]
- %I: hour (12-hour clock) zero filled, [01-12]
- %k: hour (24 hour), for example, 17
- %H: hour (24 hour) zero padded, 17
- %p: locale's upper case AM or PM (blank in many locales), for example, PM
- %P: locale's lower case am or pm, for example, pm
- Minutes
- %M: minute [00-59]
- Seconds
- %s: seconds since 00:00:00 1970-01-01 UTC (Unix epoch), for example, 1345483096
- %S: seconds [00-60], (The 60 is necessary to accommodate a leap second)
- Time
- %r: hours, minutes, seconds (12-hour clock), for example, 05:18:16 PM
- %R: hours, minutes (24-hour clock), for example, 17:18
- %T: hours, minutes, seconds (24-hour clock), for example, 17:18:16
- %X: locale's time representation, for example, 11:07:26 AM
- %p: AM or PM
- Date and Time
- %c: locale's date and time, for example, Sat Nov 04 12:02:33 EST 1989
- Others
- %n: Any white space
- %t: Any white space
- %%: Replaced by %
|
removeOriginalChunk |
boolean |
True |
Removes the original input chunk. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
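The relationship between the input ContextName, the three output suffixes, and the output formats can be sketched as follows. This is only an illustration: the function name, the dictionary representation of chunks, and the suffix values are hypothetical, Python's strptime/strftime stand in for libc's, and the strftime patterns are assumed equivalents of the documented defaults.

```python
from datetime import datetime

def date_output_chunks(orig_context, value, input_format,
                       date_time_suffix="_datetime",
                       date_suffix="_date",
                       time_suffix="_time"):
    """Return {ContextName: text} for the three derived chunks (illustrative only)."""
    parsed = datetime.strptime(value, input_format)
    return {
        # Each output ContextName is the original name followed by the suffix.
        orig_context + date_time_suffix: parsed.strftime("%Y/%m/%d-%H:%M:%S"),
        orig_context + date_suffix: parsed.strftime("%Y/%m/%d"),
        orig_context + time_suffix: parsed.strftime("%H:%M:%S"),
    }

chunks = date_output_chunks("lastmodified", "2014-08-20 17:18:16",
                            "%Y-%m-%d %H:%M:%S")
```

In this sketch the date-only chunk is the one that would typically feed a navigation category, while the date-and-time chunk is the one suited to display.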
FormatCheckerDate
com.exalead.indexing.analysis.v10.FormatCheckerDate
- The FormatCheckerDate processor checks that the chunk matches either:
- a custom input format defined with UNIX date syntax (for example, %Y/%m/%d-%H:%M:%S)
- one of the automatically recognized date formats
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
inputFormat |
string |
|
An optional date input format, compliant with libc's strptime() format. If such a format is provided, the automatic date format heuristic is disabled, and the provided date format is used exclusively. Documentation of accepted formats: (days and month literals are only recognized in English)
- %a: The day of the week ("Monday", ...)
- %A: Equivalent to %a
- %b: The month ("January", ...)
- %B: Equivalent to %b
- %c: Equivalent to %a %b %e %H:%M:%S %Y
- %C: The century number [00,99]
- %d: The day of the month [01,31]
- %D: Equivalent to %m/%d/%y
- %e: Equivalent to %d
- %h: Equivalent to %b
- %H: The hour (24-hour clock) [00,23]
- %I: The hour (12-hour clock) [01,12]
- %j: The day number of the year [001,366]
- %m: The month number [01,12]
- %M: The minute [00,59]
- %n: Any white space
- %p: AM or PM
- %r: Equivalent to %I:%M:%S %p
- %R: Equivalent to %H:%M
- %S: The seconds [00,60]
- %t: Any white space
- %T: Equivalent to %H:%M:%S
- %U: The week number of the year (Sunday as the first day of the week) as a decimal number [00,53]
- %w: The weekday as a decimal number [0,6], with 0 representing Sunday
- %W: The week number of the year (Monday as the first day of the week) as a decimal number [00,53]
- %x: Equivalent to %m/%d/%y
- %X: Equivalent to %H:%M:%S
- %y: The year within century. (for two-digit dates, [69,99] is mapped to [1969,1999] and [00,68] is mapped to [2000,2068])
- %Y: The year, including the century (for example, 1988)
- %%: Replaced by %
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
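A check against a custom inputFormat can be approximated as below. The helper name is hypothetical, and Python's datetime.strptime is used as a stand-in for libc's strptime, so a few specifiers may behave differently from the processor itself. The automatic format heuristic is not modeled.

```python
from datetime import datetime

def matches_input_format(text, input_format):
    """Return True if the whole text parses with the given strptime-style format."""
    try:
        datetime.strptime(text, input_format)
        return True
    except ValueError:
        # Raised when the text does not match or contains leftover characters.
        return False

ok = matches_input_format("2014/08/20-17:18:16", "%Y/%m/%d-%H:%M:%S")
bad = matches_input_format("20/08/2014", "%Y/%m/%d-%H:%M:%S")
```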
SplitValues
com.exalead.indexing.analysis.v10.SplitValues
- Splits the content of all DocumentChunks associated with the ContextName 'inputContext' using 'separator' as a separator regular expression. A new DocumentChunk is created for each segment, with 'outputContext' as the ContextName.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
ContextName to be associated with the DocumentChunk created for each split segment. |
separator |
string |
|
Separator around which to split. ASTL library is used to perform regular expression matching. The regular expression language supported is Perl 5, WITHOUT support for:
- assertions like \b, \B, ?=, ?!, ?<=, ?<!
- backreferences \1, \2, ...
- UNICODE escaping like \u0020 or \p{name}
- non-greedy (lazy) repeat operators like ??, *?, +?
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
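The splitting behavior can be illustrated with a small sketch. Python's re module stands in for the ASTL library here, so the separator below deliberately avoids the unsupported constructs listed above. The function name is hypothetical, and dropping empty segments is an assumption rather than documented behavior.

```python
import re

def split_values(content, separator):
    """Split content around a separator regex; each segment would become a new chunk."""
    # Empty segments are dropped in this sketch (an assumption).
    return [segment for segment in re.split(separator, content) if segment]

segments = split_values("red; green;blue", r";\s*")
```

Each element of `segments` corresponds to one DocumentChunk that would be created with 'outputContext' as its ContextName.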
RenameContext
com.exalead.indexing.analysis.v10.RenameContext
- Each DocumentChunk with ContextName matching 'inputContext' is renamed with a ContextName 'outputContext'.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
The new ContextName for DocumentChunks with ContextName matching 'inputContext'. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
CopyContext
com.exalead.indexing.analysis.v10.CopyContext
- Copies all DocumentChunks with 'inputContext' as ContextName, and creates new DocumentChunks with the same score, language and part but with
'outputContext' as ContextName.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
The ContextName used for newly created chunks. |
requiredAnnotation |
string |
|
The name of the required annotation the chunk must have to be copied. If null, no special handling is done on annotations. |
restrictValues |
string |
|
A regexp which values of the chunk must match to be copied to the output context. Values that don't match the regexp will not be copied. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
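The copy semantics, including the restrictValues filter, might be pictured as follows. The Chunk class and function are hypothetical models, not product types. Whether the regexp must match the whole value or only part of it is not stated above, so this sketch assumes a full match. The requiredAnnotation filter is not modeled.

```python
import re
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for a DocumentChunk."""
    context: str
    content: str
    language: str = "en"
    score: int = 0

def copy_context(chunks, input_context, output_context, restrict_values=None):
    """Return the original chunks plus copies renamed to output_context."""
    copies = []
    for chunk in chunks:
        if chunk.context != input_context:
            continue
        # Assumption: the regexp must match the entire chunk value.
        if restrict_values is not None and not re.fullmatch(restrict_values, chunk.content):
            continue
        # Same content, score, and language; only the ContextName changes.
        copies.append(replace(chunk, context=output_context))
    return chunks + copies

source = [Chunk("author", "alice"), Chunk("author", "bob42"), Chunk("title", "x")]
result = copy_context(source, "author", "author_facet", r"[a-z]+")
```

Unlike a rename, the original chunks remain in place; only matching values gain a copy under the new ContextName.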
FixedRangeNumericalPartitioning
com.exalead.indexing.analysis.v10.FixedRangeNumericalPartitioning
- Matches numerical values in a range. It transforms a numerical value into a matching range, based on a fixed range size. For example, with rangeSize = 100,
- 101 -> 100_199
- 234 -> 200_299
It also works for negative numbers.
This helps to create categories (for navigation) from numerical values.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
The ContextName used for newly created chunks. |
separator |
string |
_ |
The range separator. |
rangeSize |
long |
1 |
The size of the range to consider. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
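The mapping from a value to its range label can be sketched as below for non-negative integers. The function name is hypothetical. The label format for negative values is not shown in this reference, so the sketch rejects them rather than guess.

```python
def fixed_range_label(value, range_size=1, separator="_"):
    """Map a non-negative integer to its 'low<separator>high' range label."""
    if value < 0:
        raise ValueError("negative values are outside the scope of this sketch")
    low = (value // range_size) * range_size  # lower bound of the enclosing range
    high = low + range_size - 1               # inclusive upper bound
    return f"{low}{separator}{high}"

label_101 = fixed_range_label(101, range_size=100)
label_234 = fixed_range_label(234, range_size=100)
```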
ForcedRangeNumericalPartitioning
com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning
- Transforms a numerical value into the text value associated to its matching range from a set of predetermined ranges specified in 'NumericalRange'.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
The ContextName used for newly created chunks. |
separator |
string |
_ |
The separator between the beginning and the end of the range. This parameter is deprecated. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
NumericalRange |
com.exalead.indexing.analysis.v10.NumericalRange* |
The forced ranges. |
NumericalRange
com.exalead.indexing.analysis.v10.NumericalRange
- Associates text with a numerical range. The range includes all values >= beg and <= end (beg <= x <= end). A range corresponding to a unique value with beg = end is allowed.
- Parent elements:
com.exalead.indexing.analysis.v10.ForcedRangeNumericalPartitioning (as ForcedRangeNumericalPartitioning)
- Attributes:
Name |
Type |
Default value |
Description |
beg |
long |
|
The lower bound. |
end |
long |
|
The upper bound. |
text |
string |
|
The associated text. |
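A lookup over forced ranges can be sketched as follows. The class mirrors the beg, end, and text attributes above, but the function is hypothetical. Returning the first matching range, and returning None when no range matches, are assumptions.

```python
from typing import NamedTuple

class NumericalRange(NamedTuple):
    beg: int   # inclusive lower bound
    end: int   # inclusive upper bound
    text: str  # label associated with the range

def forced_range_text(value, ranges):
    """Return the text of the first range with beg <= value <= end, else None."""
    for r in ranges:
        if r.beg <= value <= r.end:
            return r.text
    return None

price_ranges = [
    NumericalRange(0, 99, "cheap"),
    NumericalRange(100, 999, "medium"),
    NumericalRange(1000, 1000, "exactly one thousand"),  # beg = end is allowed
]
label = forced_range_text(250, price_ranges)
```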
TextToNum
com.exalead.indexing.analysis.v10.TextToNum
- Processor that approximates a sort on a text field. It implements a surjection from the set of strings to the set of integers [0..N], with N close to but less than or equal to 18,446,744,073,709,551,615. The user defines an ordered alphabet. A first surjection maps every string to a finite sequence of symbols taken from this alphabet (symbols outside the alphabet are stripped from the string). The alphabet induces a partial order relation on these sequences (lexicographical order). For cardinality reasons (one set is infinite, the other is not), the second surjection, from sequences to integers, cannot preserve this order for all strings. The idea is to preserve the relation between shorter strings, AND between shorter strings and longer strings, such that:
- if STRING2ULONG('shortstring1') <= STRING2ULONG('shortstring2') then 'shortstring1' <= 'shortstring2'
- STRING2ULONG('longstring1') <= STRING2ULONG('longstring2') does NOT ensure 'longstring1' <= 'longstring2'
- if STRING2ULONG('shortstring1') <= STRING2ULONG('longstring2') then 'shortstring1' <= 'longstring2'
The size of the prefix depends on the size of the alphabet.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
alphabet |
string |
0123456789abcdefghijklmnopqrstuvwxyz |
The ordered alphabet. |
outputContext |
string |
|
The ContextName used for the newly created chunk. |
nbBits |
int |
63 |
Number of bits of unsigned field used for sorting. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
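One way to obtain such a mapping is to encode a bounded prefix of the filtered string as a number in base (alphabet size + 1). The sketch below is an assumed scheme for illustration only, not necessarily the encoding the processor uses. The function name and the padding convention are hypothetical.

```python
def text_to_num(text, alphabet="0123456789abcdefghijklmnopqrstuvwxyz", nb_bits=63):
    """Encode a prefix of text as an integer that sorts like the text (assumed scheme)."""
    base = len(alphabet) + 1  # digit 0 is reserved for "no symbol" padding
    # Longest prefix whose encoding still fits in nb_bits.
    prefix_len = 0
    while base ** (prefix_len + 1) <= 2 ** nb_bits:
        prefix_len += 1
    rank = {symbol: i + 1 for i, symbol in enumerate(alphabet)}
    # Strip symbols outside the alphabet, then keep only the prefix.
    digits = [rank[c] for c in text if c in rank][:prefix_len]
    digits += [0] * (prefix_len - len(digits))  # shorter strings sort first
    value = 0
    for digit in digits:
        value = value * base + digit
    return value
```

With the default alphabet and 63 bits, this scheme keeps a 12-symbol prefix: strings that differ only after that prefix receive the same number, which is why ordering is only guaranteed for short strings.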
DoubleToLong
com.exalead.indexing.analysis.v10.DoubleToLong
- Using this processor you can store floating point values into signed fields that can then be queried with the DoublePrefixHandler.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
precision |
int |
1000 |
The multiplier. Each value will be multiplied by this factor. |
outputContext |
string |
|
The ContextName used for the newly created chunk. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
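The effect of the precision factor can be illustrated as follows. The function name is hypothetical, and rounding to the nearest integer (rather than truncating) is an assumption, since the rounding rule is not stated above.

```python
def double_to_long(text, precision=1000):
    """Scale a floating point value by precision and store it as a signed integer."""
    # Assumption: the scaled value is rounded to the nearest integer.
    return int(round(float(text) * precision))

stored = double_to_long("3.14159")
```

With the default precision of 1000, three decimal places survive the conversion; a query-side handler would need to apply the same factor to the values it compares against.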
GeoBBoxProcessor
com.exalead.indexing.analysis.v10.GeoBBoxProcessor
- The Geo BBox processor converts the input geometry from WKT to WKB and computes its bounding box. Both the WKB and the bounding box are returned as chunks.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
precision |
int |
6 |
The number of decimals that will be used in geometrical representations and computations. |
bboxMetaName |
string |
|
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
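The bounding-box half of this processing can be sketched with a simplified parser. This is an illustration under stated assumptions: the function name is hypothetical, only 2D WKT is handled, coordinates are extracted with a regular expression instead of a real WKT parser, and the WKB conversion is not shown.

```python
import re

def wkt_bounding_box(wkt, precision=6):
    """Return (min_x, min_y, max_x, max_y) for the 2D coordinates found in a WKT string."""
    numbers = [float(n) for n in
               re.findall(r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?", wkt)]
    if not numbers or len(numbers) % 2:
        raise ValueError("expected 2D coordinate pairs")
    xs, ys = numbers[0::2], numbers[1::2]  # alternate x and y values
    # Round to the configured number of decimals.
    return tuple(round(v, precision) for v in (min(xs), min(ys), max(xs), max(ys)))

bbox = wkt_bounding_box("POLYGON((0 0, 4 0, 4 3.5, 0 3.5, 0 0))")
```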
JavaProcessor
com.exalead.indexing.analysis.v10.JavaProcessor
- (Deprecated)
- Allows documents to be sent to a java process for analysis.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
id |
string |
|
|
target |
string |
|
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
ReplaceRegexp
com.exalead.indexing.analysis.v10.ReplaceRegexp
- Substitutes the content substring of all DocumentChunks having the ContextName 'inputContext', using:
- 'pattern' as the matching substring regular expression
- and 'value' as the replacement value.
This value may have the form of sed output format using references to captures \0 through \9. A new DocumentChunk is created with the substitutions.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
ContextName to be associated with the DocumentChunk created for each new context. |
pattern |
string |
|
Pattern used to match the substrings to replace. ASTL library is used to perform regular expression matching. The regular expression language supported is Perl 5, WITHOUT support for:
- lazy (non-greedy) quantifiers like *?, +?, ??, {n}?, {n,}?, {n,m}?
- possessive quantifiers like *+, ++, ?+, {n}+, {n,}+, {n,m}+
- assertions like \b, \B, \A, \z, \Z, \G
- look-around assertions (?=pattern), (?!pattern), (?<=pattern), (?<!pattern)
- named captures (?'name'pattern), (?<name>pattern)
- numeric and named backreferences like \1, \g1, \g{-1}, \g{name}, \k<name>, \k'name'
- named Unicode character \N{name}
- all operators related to Perl code inlining like (?{ code })
- all operators related to backtracking algorithm control like independent subexpression (?>pattern)
- \C matching a single C char (octet)
- of the pattern-match modifiers (?pimsx-imsx) only (?i:pattern) and (?i) are supported (no negative form)
|
value |
string |
|
The replacement value (sed-like output format). |
replaceAll |
boolean |
True |
Replaces all occurrences of the pattern rather than only the first one. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
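A substitution with sed-like capture references can be approximated as below. Python's re module stands in for the ASTL library, and the function name is hypothetical. Because Python writes group references as \g<N>, the sketch first converts the sed-style \0 through \9 references. Treating replaceAll=false as "replace only the first occurrence" is an assumption.

```python
import re

def replace_regexp(content, pattern, value, replace_all=True):
    """Substitute pattern matches in content using a sed-like replacement value."""
    # Convert sed-style \0..\9 references to Python's \g<N> form.
    template = re.sub(r"\\(\d)", r"\\g<\1>", value)
    # count=0 means "replace every match" in re.sub.
    return re.sub(pattern, template, content, count=0 if replace_all else 1)

swapped = replace_regexp("Doe, John", r"(\w+), (\w+)", r"\2 \1")
```

The result would be stored in a new DocumentChunk under 'outputContext'; the input chunk itself is not modified in this sketch.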
URLCodec
com.exalead.indexing.analysis.v10.URLCodec
- URL encode/decode with UTF-8 charset only
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
Stores URL encoded form in outputContext. If outputContext = inputContext, it removes the original chunk. |
encodeURIComponent |
boolean |
True |
If true (default), it encodes the following characters:
',' '/' '?' ':' '@' '&' '=' '+' '$' '#' |
mode |
enum(encode, decode) |
encode |
mode = "encode" or "decode" |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
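As an illustration of the encodeURIComponent and mode attributes of URLCodec, here is a minimal sketch built on Python's `urllib.parse`. The function names are hypothetical, and the handling of characters outside the list given above is an assumption; the processor's exact behavior may differ.

```python
from urllib.parse import quote, unquote

# Characters listed in the encodeURIComponent description above.
RESERVED = ",/?:@&=+$#"

def url_encode(text, encode_uri_component=True):
    """Percent-encode text as UTF-8; keep RESERVED intact when the flag is off."""
    safe = "" if encode_uri_component else RESERVED
    return quote(text, safe=safe, encoding="utf-8")

def url_decode(text):
    """Reverse of url_encode (mode = "decode"), UTF-8 only."""
    return unquote(text, encoding="utf-8")

print(url_encode("a b/c?d=é"))         # a%20b%2Fc%3Fd%3D%C3%A9
print(url_encode("a b/c?d=é", False))  # a%20b/c?d=%C3%A9
```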
WildcardIndexing
com.exalead.indexing.analysis.v10.WildcardIndexing
- Computes substrings of the input chunk to enable efficient prefix/substring/suffix search
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
Stores exact/prefix/substring/suffix in outputContext. If outputContext = inputContext, it removes the original chunk. |
exactScore |
int |
4 |
Specifies the score for an exact match. |
prefixSearch |
boolean |
True |
Enables the prefix search. |
prefixScore |
int |
3 |
Specifies the score for a prefix match. |
suffixSearch |
boolean |
True |
Enables the suffix search. |
suffixScore |
int |
2 |
Specifies the score for a suffix match. |
substringSearch |
boolean |
True |
Enables the substring search. |
substringScore |
int |
1 |
Specifies the score for a substring match. |
maxStringSize |
int |
100 |
Specifies the max string size for which this processor will be applied. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
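The scoring attributes of WildcardIndexing can be pictured with the sketch below. It is only a model under stated assumptions: the function name, the dictionary output, and the rule that a fragment keeps its highest score are all hypothetical, and the enable flags (prefixSearch, suffixSearch, substringSearch) are left out for brevity.

```python
def wildcard_entries(text, exact=4, prefix=3, suffix=2, substring=1, max_size=100):
    """Map each searchable fragment of text to the best score it earns."""
    if len(text) > max_size:
        return {}  # assumption: longer strings are skipped (maxStringSize)
    entries = {}

    def offer(fragment, score):
        # Keep the highest score when a fragment qualifies in several ways.
        if score > entries.get(fragment, 0):
            entries[fragment] = score

    n = len(text)
    offer(text, exact)
    for i in range(1, n):
        offer(text[:i], prefix)   # proper prefixes
        offer(text[i:], suffix)   # proper suffixes
    for start in range(1, n):
        for end in range(start + 1, n):
            offer(text[start:end], substring)  # strictly inner substrings
    return entries

print(wildcard_entries("abc"))
```

With the default scores, an exact hit outranks a prefix hit, which outranks a suffix hit, which outranks a plain substring hit.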
URLTransformer
com.exalead.indexing.analysis.v10.URLTransformer
- Parses a context string as a regular URL (RFC 2396, "Uniform Resource Identifier") and transforms it according to the given URL pattern. A new DocumentChunk is created with the substitution. Pattern used to transform the URL (in the form <scheme>://<authority><path>?<query>#<fragment>):
- Characters other than '$' or '\' are kept as-is
- The '$' character and the '\' character must be escaped with a leading \
- The ${expression} form computes a string expression based on URL components (see "Expression" below)
Expression used inside the enclosing ${}:
- url: Original URL
- scheme: Scheme name ("http", "https", "file", ...)
- authority: Authority (host:port or host) (may be empty)
- host: Hostname part of the authority (may be empty)
- port: Port number part of the authority (may be empty)
- userInfo: username:password field of the authority (may be empty)
- file: File starting with / and query string, if any
- pathurl: Normalized absolute path starting with /
- path: Normalized absolute path (may start with C:\ on Windows)
- query: Normalized query part starting with ? (may be empty)
- args: Query part without the leading ? (may be empty)
- fragment: Fragment part starting with # (may be empty)
- reference: Reference part, i.e., fragment without the leading # (may be empty)
- arg:name: Query part argument identified by its name, unescaped (you must re-escape it using "urlencode:" when necessary)
- str:string: The final argument is not a variable name, but a string (only useful for clarity purpose)
- tolower:expression: Transform into lowercase (ONLY A-Z)
- toupper:expression: Transform into uppercase (ONLY a-z)
- urlencode:expression: URL encoding (%NN or +)
- urlpathencode:expression: URL encoding outside / fragments
- urldecode:expression: URL decoding
- pathslash:expression: Convert \ into /
- pathantislash:expression: Convert / into \
Notes:
- Unreserved characters are unescaped during URL processing (i.e., never '%' or '\')
- The tolower prefix and other similar prefixes accept recursion (i.e., the expression "${urlpathencode:pathantislash:toupper:path}" is valid)
- Both "file://C:\path" and "file:///C:\path" will produce path="/C:\path"
Examples:
- With the input context value "http://www.example.com/bar/foo?bar=42"
- "hello, world" => "hello, world"
- "the scheme is ${scheme}" => "the scheme is http"
- "the scheme is \${scheme}" => "the scheme is \${scheme}"
- "http://myserver${path}${query}" => "http://myserver/bar/foo?bar=42"
- "http://myserver/applet?f=${urlpathencode:path}&t=${arg:bar}" => "http://myserver/applet?f=/bar/foo&t=42"
- "http://myserver/applet?f=${urlencode:path}&t=${arg:bar}" => "http://myserver/applet?f=%2Fbar%2Ffoo&t=42"
- "http://myserver/applet?f=${urlpathencode:pathantislash:toupper:path}" => "http://myserver/applet?f=%5CBAR%5CFOO"
- With the input context value "file:///C:/My%20Documents/Document.doc"
- "${pathantislash:urldecode:path}" => "C:\My Documents\Document.doc"
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
ContextName to be associated with the DocumentChunk created for each new context. |
urlPattern |
string |
|
Pattern used to transform the URL. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
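The urlPattern language of URLTransformer can be approximated with a small evaluator. The sketch below is not the product's implementation: the function names are hypothetical, only a subset of the variables is covered, and Windows path normalization is ignored. Escaped sequences are copied through verbatim, as in the third http example of the URLTransformer description.

```python
import re
from urllib.parse import urlsplit, parse_qs, quote, quote_plus, unquote

# Prefix filters; each one applies to the result of the rest of the expression.
FILTERS = {
    "tolower": str.lower,
    "toupper": str.upper,
    "urlencode": lambda s: quote_plus(s, safe=""),
    "urlpathencode": lambda s: quote(s, safe="/"),
    "urldecode": unquote,
    "pathslash": lambda s: s.replace("\\", "/"),
    "pathantislash": lambda s: s.replace("/", "\\"),
}

def evaluate(expr, url):
    """Evaluate one expression found inside ${...} against url."""
    parts = urlsplit(url)
    head, _, rest = expr.partition(":")
    if head in FILTERS:                       # recursive prefixes
        return FILTERS[head](evaluate(rest, url))
    if head == "str":                         # literal string
        return rest
    if head == "arg":                         # first value of a query argument
        return parse_qs(parts.query).get(rest, [""])[0]
    variables = {
        "url": url,
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "host": parts.hostname or "",
        "path": parts.path,
        "query": "?" + parts.query if parts.query else "",
        "args": parts.query,
        "fragment": "#" + parts.fragment if parts.fragment else "",
        "reference": parts.fragment,
    }
    return variables[expr]

# Either an escaped '$' or '\' (kept as-is) or a ${expression} to substitute.
_TOKEN = re.compile(r"\\[$\\]|\$\{([^}]*)\}")

def transform(pattern, url):
    def substitute(match):
        if match.group(1) is None:
            return match.group(0)
        return evaluate(match.group(1), url)
    return _TOKEN.sub(substitute, pattern)

URL = "http://www.example.com/bar/foo?bar=42"
print(transform("http://myserver/applet?f=${urlencode:path}&t=${arg:bar}", URL))
```

Running it on the http examples listed in the description reproduces the documented outputs, which is a quick way to check a pattern before deploying it.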
GeoCategorizer
com.exalead.indexing.analysis.v10.GeoCategorizer
- A processor that categorizes geographic points given their inclusion in a GeoDomain.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
inputContext |
string |
|
The processor will only be applied to DocumentChunks with this ContextName. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
ContextName of the chunk to create. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
GeoDomain |
com.exalead.search.v30.GeoDomain* |
|
DiskDomain
com.exalead.search.v30.DiskDomain
- No documentation for this element.
- Parent elements:
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
- Attributes:
Name |
Type |
Default value |
Description |
title |
string |
|
|
id |
int |
|
Unique identifier of this domain. If id=0 (its default value) the category path will be the set of vertices. Otherwise, it will be the id value. |
radius |
double |
|
Disk radius in meters |
x |
double |
|
First coordinate of the center for the DiskDomain. If the point type is XY, it will be interpreted as the X coordinate (integer units). For geographic points (GPS), it will be interpreted as the latitude coordinate. |
y |
double |
|
Second coordinate of the center for the DiskDomain. If the point type is XY, it will be interpreted as the Y coordinate (integer units). For geographic points (GPS), it will be interpreted as the longitude coordinate. |
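For GPS points, testing membership in a DiskDomain amounts to comparing a great-circle distance with the radius in meters. The sketch below uses the haversine formula on a spherical Earth; the function names and the Earth radius constant are assumptions, and the distance model actually used by the product is not documented here.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (latitude, longitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def in_disk(lat, lon, center_lat, center_lon, radius_m):
    """True when the point lies within radius_m meters of the disk center (x, y)."""
    return haversine_m(lat, lon, center_lat, center_lon) <= radius_m

# One degree of latitude is roughly 111 km on this model.
print(in_disk(1.0, 0.0, 0.0, 0.0, 120_000))  # True
print(in_disk(1.0, 0.0, 0.0, 0.0, 100_000))  # False
```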
PolygonDomain
com.exalead.search.v30.PolygonDomain
- No documentation for this element.
- Parent elements:
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
- Attributes:
Name |
Type |
Default value |
Description |
title |
string |
|
|
id |
int |
|
Unique identifier of this domain. If id=0 (its default value) the category path will be the set of vertices. Otherwise, it will be the id value. |
vertices |
string |
|
Polygon vertices, as a list of (x,y) coordinates. For example: "0.0,0.0;1.1,0.1;1.1,1.1" |
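A vertices string in the format shown above can be parsed and tested with a classic ray-casting check. This is a sketch on planar (x, y) coordinates with hypothetical function names; how the product treats points on an edge, or geographic coordinates, is not specified here.

```python
def parse_vertices(vertices):
    """Parse "x1,y1;x2,y2;..." into a list of (x, y) float pairs."""
    return [tuple(float(v) for v in pair.split(",")) for pair in vertices.split(";")]

def contains(polygon, x, y):
    """Ray casting: count polygon edges crossed by a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge spans the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

triangle = parse_vertices("0.0,0.0;1.1,0.1;1.1,1.1")
print(contains(triangle, 1.0, 0.5))  # True
print(contains(triangle, 0.5, 0.9))  # False
```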
KMLDomain
com.exalead.search.v30.KMLDomain
- Definition of a geographic domain using a KML or KMZ resource
- Parent elements:
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
- Attributes:
Name |
Type |
Default value |
Description |
title |
string |
|
|
id |
int |
|
Unique identifier of this domain. If id=0 (its default value) the category path will be the set of vertices. Otherwise, it will be the id value. |
resource |
string |
|
|
KMZ |
boolean |
|
Is this resource a KMZ resource? |
SHPDomain
com.exalead.search.v30.SHPDomain
- No documentation for this element.
- Parent elements:
com.exalead.indexing.analysis.v10.GeoCategorizer (as GeoCategorizer)
- Attributes:
Name |
Type |
Default value |
Description |
title |
string |
|
|
id |
int |
|
Unique identifier of this domain. If id=0 (its default value) the category path will be the set of vertices. Otherwise, it will be the id value. |
shpResource |
string |
|
|
shxResource |
string |
|
|
dbfResource |
string |
|
|
MimeTypeSetter
com.exalead.indexing.analysis.v10.MimeTypeSetter
- Manually sets the mime type
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
value |
string |
|
New mime type |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
MetaFinder
com.exalead.indexing.analysis.v10.MetaFinder
- Keeps track of all document metas
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
JavaDocumentProcessor
com.exalead.indexing.analysis.v10.JavaDocumentProcessor
- Takes Java code either inline or from a file, and executes it on-the-fly. For production mode, we recommend packaging your custom code as a Java Plugin (CVPlugin) and using the Custom Document Processor to call it. Plugins allow better packaging and source code maintenance.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
path |
string |
|
User defined path to a Java file containing the processor code |
priority |
int |
|
Defines which path to use (0: user defined path, 1: resource managed path (inlined Java)) |
sourceCode |
string |
|
Inline Java code |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
JavaScriptProcessor
com.exalead.indexing.analysis.v10.JavaScriptProcessor
- (Deprecated)
- This document processor is deprecated. Use the Java document processor instead. The JavaScript Processor takes a JS script and executes it.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
path |
string |
|
User defined path to a JS file containing the processor code |
priority |
int |
|
Defines which path to use (0: user defined path, 1: resource managed path (inlined JS)) |
script |
string |
|
Inline script |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
StorageServiceDocumentProcessor
com.exalead.indexing.analysis.v10.StorageServiceDocumentProcessor
- Queries the storage for any meta to attach to the document. Multi-valued pairs are pushed as multi-valued metas. For example:
- The storage key "nb_comment" will be attached as "nb_comment" meta on the document.
- The storage key "tags[]" will be attached as "tags" multi-valued meta on the document.
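The key naming convention above can be sketched as follows. The function name and the list-of-pairs input are hypothetical; the sketch only shows how a trailing "[]" in a storage key could map to a multi-valued meta.

```python
def storage_to_metas(pairs):
    """Group storage (key, value) pairs into metas; "name[]" keys feed meta "name"."""
    metas = {}
    for key, value in pairs:
        name = key[:-2] if key.endswith("[]") else key
        metas.setdefault(name, []).append(value)
    return metas

print(storage_to_metas([("nb_comment", "3"), ("tags[]", "news"), ("tags[]", "sport")]))
# {'nb_comment': ['3'], 'tags': ['news', 'sport']}
```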
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
instance |
string |
|
Storage service instance |
metaIdentifier |
string |
|
Defines an optional meta name that will be used as storage Identifier instead of the document Uri. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
MathDocumentProcessor
com.exalead.indexing.analysis.v10.MathDocumentProcessor
- Performs mathematical operations on a numerical field. Context names used in an expression must be prefixed with a $. For example, the expression `$ht_price * 1.196` finds the first chunk in the `ht_price` context,
and replaces all occurrences of `$ht_price` with its value before evaluating the mathematical expression. The result will be a new text chunk, either in the Output context (if specified), or in the original `ht_price` context.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
expression |
string |
|
Arithmetic expression to evaluate. For example: "$file_size + 42" |
outputContext |
string |
|
ContextName of the chunk to create. |
floatingPoint |
boolean |
|
Output: A floating point number instead of the default integer one. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
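The evaluation performed by MathDocumentProcessor can be modelled as a substitution followed by arithmetic. The sketch below is an assumption-laden model: the function names, the supported operators (+, -, *, /), and the truncation used for the integer output are hypothetical choices, not documented behavior.

```python
import ast
import operator
import re

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def _eval(node):
    """Evaluate a parsed arithmetic expression limited to numbers and + - * /."""
    if isinstance(node, ast.Expression):
        return _eval(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError("unsupported expression")

def evaluate(expression, chunks, floating_point=False):
    """Replace each $context with the first chunk of that context, then compute."""
    def first_chunk(match):
        return "(" + repr(float(chunks[match.group(1)][0])) + ")"
    substituted = re.sub(r"\$(\w+)", first_chunk, expression)
    result = _eval(ast.parse(substituted, mode="eval"))
    # Assumption: the integer output truncates; the product may round instead.
    return float(result) if floating_point else int(result)

print(evaluate("$file_size + 42", {"file_size": ["8"]}))  # 50
print(evaluate("$ht_price * 1.196", {"ht_price": ["100"]}, floating_point=True))
```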
PrecomputedThumbnailsDocumentProcessor
com.exalead.indexing.analysis.v10.PrecomputedThumbnailsDocumentProcessor
- The Precomputed Thumbnails Document Processor precomputes thumbnails of the first DocumentPart.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
convertAddresses |
string |
|
Semicolon-separated list of convert instance names or URLs to use. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
RealTimeAlerting
com.exalead.indexing.analysis.v10.RealTimeAlerting
- The Real-time alerting document processor matches queries defined by end-users and alerts them as soon as a new matching document is indexed. To be used only when not in task queue mode.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
@IgnoreForValueConstructor |
alertGroups |
com.exalead.indexing.analysis.v10.AlertGroup* |
List of alert groups handled by this processor, empty means ALL groups |
customPublishers |
com.exalead.indexing.analysis.v10.CustomPublisher* |
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
AlertGroup
com.exalead.indexing.analysis.v10.AlertGroup
- No documentation for this element.
- Parent elements:
com.exalead.indexing.analysis.v10.RealTimeAlerting (as alertGroups)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
|
CustomPublisher
com.exalead.indexing.analysis.v10.CustomPublisher
- Custom publisher configuration
- Parent elements:
com.exalead.indexing.analysis.v10.RealTimeAlerting (as customPublishers)
- Attributes:
Name |
Type |
Default value |
Description |
classId |
string |
|
Custom publisher type |
- Nested elements:
Name |
Type |
Description |
config |
exa.bee.KeyValue* |
|
MIMEDetector
com.exalead.indexing.analysis.v10.MIMEDetector
- The MIME detector operates on each DocumentPart for which a MIME-type is not available. The MIME-type can be specified for each DocumentPart in the PAPI. For each such DocumentPart, the 'bytes' and the 'filename' are used to guess the real MIME-type and charset. The guessed MIME-type and the charset are then set as attributes of the DocumentPart. Input: The DocumentPart of the document. Output: 'mime' and 'encodingToUse' attributes of DocumentParts. This document processor does not create any document chunks.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
defaultValue |
string |
|
Default mime to use if not detected. |
defaultCharset |
string |
|
On text or HTML files, the MIME detector tries to detect charset encoding automatically. If the encoding cannot be detected, this 'defaultCharset' is used. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
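The fallback order described for the MIMEDetector can be sketched as follows. This is a minimal illustration, not the product's detection logic: the `Part` class, the `detect_mime` function, the single PDF magic-number check, and the UTF-8 probe are all assumptions made for the example. Only the roles of 'mime', 'encodingToUse', defaultValue and defaultCharset come from the description above.

```python
import mimetypes
from dataclasses import dataclass
from typing import Optional


@dataclass
class Part:
    """Hypothetical stand-in for a DocumentPart (not the real class)."""
    filename: str
    data: bytes
    mime: Optional[str] = None
    encoding_to_use: Optional[str] = None


def detect_mime(part: Part, default_value: str, default_charset: str) -> Part:
    # Parts that already carry a MIME type are left untouched.
    if part.mime is not None:
        return part
    # Guess from the bytes first (one illustrative magic number),
    # then from the filename extension.
    if part.data.startswith(b"%PDF-"):
        guessed = "application/pdf"
    else:
        guessed, _ = mimetypes.guess_type(part.filename)
    # defaultValue is used when nothing could be detected.
    part.mime = guessed or default_value
    # On text content, try to detect the charset (here: a plain UTF-8 probe)
    # and fall back to defaultCharset when detection fails.
    if part.mime.startswith("text/"):
        try:
            part.data.decode("utf-8")
            part.encoding_to_use = "utf-8"
        except UnicodeDecodeError:
            part.encoding_to_use = default_charset
    return part
```

In this sketch the processor only sets attributes on the part; it creates no chunks, which mirrors the behavior stated for the MIMEDetector.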
RemoteHTTPTransformer
com.exalead.indexing.analysis.v10.RemoteHTTPTransformer
- The processor posts part bytes to the remote HTTP service, and gets the typed resource as a result. The remote service may return a Document.MIME_V10 document, or any other document that can later be processed in the pipeline. If the remote service returns an HTTP status other than 200 ("OK"), the corresponding error is passed as a regular error. The service may also advertise a filename, using the 'filename' parameter of the standard Content-Disposition header.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
remoteUrl |
string |
|
Remote URL |
timeoutMs |
int |
|
Remote processor timeout, in milliseconds. |
httpIdleTimeoutMs |
int |
|
Cached HTTP connection idle timeout. This is an advanced setting. For efficiency, the RemoteHTTPTransformer maintains a pool of open connections to the remote HTTP service. This defines the timeout for connections which are no longer used. Default is 10000. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". |
argMapping |
com.exalead.indexing.analysis.v10.RemoteHTTPTransformerRemoteArgMapping* |
Argument mappings, if any. See RemoteHTTPTransformerRemoteArgMapping. |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
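The response handling described for the RemoteHTTPTransformer (a non-200 status treated as an error, an optional filename read from Content-Disposition) can be sketched as below. The function name `interpret_response`, the exception class, and the header dictionary are hypothetical. The sketch only interprets a response that has already been received; it performs no HTTP call and does not claim to reproduce the product's implementation.

```python
from email.message import Message
from typing import Dict, Optional, Tuple


class RemoteTransformError(Exception):
    """Hypothetical error type for a non-200 answer from the remote service."""


def interpret_response(
    status: int, headers: Dict[str, str], body: bytes
) -> Tuple[bytes, Optional[str], Optional[str]]:
    """Return (resource bytes, content type, advertised filename)."""
    # Any status other than 200 is surfaced as a regular error.
    if status != 200:
        raise RemoteTransformError(f"remote service returned HTTP {status}")
    # Reuse the standard library's header parsing.
    msg = Message()
    for key, value in headers.items():
        msg[key] = value
    # get_filename() reads the 'filename' parameter of Content-Disposition.
    filename = msg.get_filename()
    content_type = msg.get_content_type() if "Content-Type" in msg else None
    return body, content_type, filename
```

The returned content type is what a later processor in the pipeline would need in order to decide how to handle the transformed resource.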
RemoteHTTPTransformerRemoteArgMapping
com.exalead.indexing.analysis.v10.RemoteHTTPTransformerRemoteArgMapping
- Argument mapping for the RemoteHTTPTransformer.
- Parent elements:
com.exalead.indexing.analysis.v10.RemoteHTTPTransformer (as argMapping)
- Attributes:
Name |
Type |
Default value |
Description |
key |
string |
|
URL key to map. This key name will be used as remote HTTP argument name. |
value |
string |
|
Value to use. If null, the defaultValue value is used. The following value names are reserved:
- $docname: the document name or URI
- $msg.uri: see com.exalead.mercury.papi.PAPIMessage
- $msg.source: see com.exalead.mercury.papi.PAPIMessage
- $part.name: see com.exalead.indexing.DocPart
- $part.filename: see com.exalead.indexing.DocPart
- $part.encoding: see com.exalead.indexing.DocPart
- $part.forcedMime: see com.exalead.indexing.DocPart
- $part.mimeHint: see com.exalead.indexing.DocPart
- $part.mime: see com.exalead.indexing.DocPart
- $part.encodingToUse: see com.exalead.indexing.DocPart
- $part.bytes.length: see com.exalead.indexing.DocPart
- $part.customDirectives.*: see com.exalead.indexing.DocPart
- $$$foo: escaping for $foo
|
defaultValue |
string |
|
Value to use if value is null. If this value is also null, the empty string is used. |
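The key/value/defaultValue rules above can be illustrated with a short sketch. It rests on several assumptions that the reference does not state: "$$$foo" is taken to produce the literal text "$foo", a reserved name with no available value falls back to defaultValue, and the resolved pairs are sent as URL query arguments. The names `resolve_arg` and `build_query` and the `reserved` dictionary are hypothetical.

```python
from typing import List, Mapping, Optional, Tuple
from urllib.parse import urlencode


def resolve_arg(
    value: Optional[str], default_value: Optional[str], reserved: Mapping[str, str]
) -> str:
    # A null value falls back to defaultValue; a null defaultValue becomes "".
    if value is None:
        return default_value if default_value is not None else ""
    # Assumed escaping rule: "$$$foo" yields the literal text "$foo".
    if value.startswith("$$$"):
        return "$" + value[3:]
    # Reserved names such as $docname or $part.mime are looked up in the
    # per-document context (assumption: missing ones use the default).
    if value.startswith("$"):
        resolved = reserved.get(value)
        if resolved is not None:
            return resolved
        return default_value if default_value is not None else ""
    # Anything else is a literal value.
    return value


def build_query(
    mappings: List[Tuple[str, Optional[str], Optional[str]]],
    reserved: Mapping[str, str],
) -> str:
    # Each mapping is (key, value, defaultValue); key becomes the argument name.
    pairs = [(key, resolve_arg(value, default, reserved)) for key, value, default in mappings]
    return urlencode(pairs)
```

For example, a mapping with key "name" and value "$docname" would send the document name under the remote argument "name".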
StandardPartsMerger
com.exalead.indexing.analysis.v10.StandardPartsMerger
- This processor does nothing if there are no DocumentParts (only root DocumentChunks). This processor needs one DocumentPart called the 'Master Part'. If there is only one part, this part is the 'Master Part'. If there are multiple parts, the part named after the 'masterPart' attribute is the 'Master Part'.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
masterPart |
string |
|
Name of the master part. This name should be "master" to follow the convention used by connectors that send documents composed of multiple parts (e.g. mails with attachments). |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". |
partSpecificContexts |
exa.bee.StringValue* |
The ContextNames of the DocumentChunks from non-master parts that should be copied to the root document. |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
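The 'Master Part' selection rule stated for the StandardPartsMerger can be written down as a small function. The function name is hypothetical, and returning no master part when several parts exist but none carries the configured name is an assumption: the reference does not say what happens in that case.

```python
from typing import Optional, Sequence


def select_master_part(part_names: Sequence[str], master_part: str = "master") -> Optional[str]:
    # No DocumentParts (only root DocumentChunks): the processor does nothing.
    if not part_names:
        return None
    # A single part is always the Master Part, whatever its name.
    if len(part_names) == 1:
        return part_names[0]
    # With multiple parts, the part named after 'masterPart' is the Master Part.
    # Assumption: no master part is selected if that name is absent.
    return master_part if master_part in part_names else None
```

With parts named "master" and "attachment1", as a mail connector following the naming convention might send, the function returns "master".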
SemanticPipeDocumentProcessor
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor
- Instantiates a semantic pipe and creates chunks out of resulting annotations. It can be used to instantiate classification processors, and perform document level operations from their output.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
pipeline |
string |
|
Analysis pipeline on which semantic processors will be used. |
annotations |
string |
|
Comma-separated list of annotation names. A chunk is created for each annotation whose name is in the list. |
topLevelAnnotationsOnly |
boolean |
|
Considers top level annotations only. For example, results from the QueryMatcher or Fast Rules. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the original document processor generated by the data model. Use this to easily revert to "auto" state from "customized". |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
SemanticProcessor |
com.exalead.indexing.analysis.v10.SemanticProcessor* |
List of semantic processors to use |
Anchorer
com.exalead.indexing.analysis.v10.Anchorer
- Adds an annotation on the first and last tokens of either a processed sequence (first/last) or a range defined by an annotation a (first_a/last_a)
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
tagsToAnchor |
string |
|
List of comma-separated tags on which to work |
finalAnnotationOnNextToken |
boolean |
|
If true, sets final annotation on the token after the last token of annotation a |
finalCannotBeSepSpace |
boolean |
|
If true, the final annotation cannot be set on a space separator; the 'last' annotation may instead be set on the next non-blank token. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". |
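A rough sketch of the Anchorer behavior described above: 'first'/'last' on a whole token sequence, or 'first_a'/'last_a' on ranges carrying an annotation a. The function signature, the representation of ranges as inclusive (start, end) token indexes, and the clamping of the shifted 'last' position to the final token are assumptions made for illustration.

```python
from typing import Dict, List, Mapping, Sequence, Tuple


def anchor(
    tokens: Sequence[str],
    ranges: Mapping[str, List[Tuple[int, int]]],
    tags_to_anchor: Sequence[str] = (),
    final_on_next_token: bool = False,
) -> Dict[int, List[str]]:
    """Return {token index: [anchor annotations]}."""
    marks: Dict[int, List[str]] = {}
    if not tokens:
        return marks
    last_index = len(tokens) - 1
    # Without tags, anchor the processed sequence itself.
    if not tags_to_anchor:
        marks.setdefault(0, []).append("first")
        marks.setdefault(last_index, []).append("last")
        return marks
    # With tags, anchor every range covered by one of those annotations.
    for tag in tags_to_anchor:
        for start, end in ranges.get(tag, []):
            if final_on_next_token:
                # finalAnnotationOnNextToken: shift to the token after the range
                # (assumption: clamped to the last token of the sequence).
                end = min(end + 1, last_index)
            marks.setdefault(start, []).append(f"first_{tag}")
            marks.setdefault(end, []).append(f"last_{tag}")
    return marks
```

For the tokens "Call", "John", "Smith", "now" with a "person" annotation over tokens 1 to 2, anchoring "person" marks token 1 with first_person and token 2 with last_person.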
CompoundWordSplitter
com.exalead.indexing.analysis.v10.CompoundWordSplitter
- Annotates compound words that use CamelCase (like SearchServer) or underscores (like my_variable) to separate the root words. This allows users to search for the root words individually. Annotations generated:
- "compound": for example, compound="search server"
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
tokenizeAnnotations |
boolean |
True |
Subtokenizes "SearchServer" into "Search" "Server" automatically, and keeps original annotations. |
doCamelCase |
boolean |
True |
Separates compound words before each capital letter. For example, the annotation for "CamelCase" is compound="camel case". |
doUnderscore |
boolean |
True |
Separates multi-word strings wherever there is an underscore. For example, the annotation for "under_score" is compound="under score". |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". |
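The doCamelCase and doUnderscore rules of the CompoundWordSplitter can be approximated with a regular expression. This is a sketch only: the function name is hypothetical, and it follows the wording above literally (a split before each capital letter), so an acronym such as "HTTP" would be split letter by letter, which may differ from the real processor.

```python
import re
from typing import Optional

# Zero-width split point before every capital letter except the first character.
_BEFORE_CAPITAL = re.compile(r"(?<!^)(?=[A-Z])")


def compound_annotation(
    token: str, do_camel_case: bool = True, do_underscore: bool = True
) -> Optional[str]:
    """Return the value of the 'compound' annotation, or None if nothing was split."""
    parts = [token]
    if do_underscore:
        # Separate wherever there is an underscore.
        parts = [p for part in parts for p in part.split("_") if p]
    if do_camel_case:
        # Separate before each capital letter.
        parts = [p for part in parts for p in _BEFORE_CAPITAL.split(part) if p]
    if len(parts) < 2:
        return None
    # Root words are lowercased and joined with spaces.
    return " ".join(p.lower() for p in parts)
```

Calling `compound_annotation("SearchServer")` gives "search server", matching the annotation example in the description.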
OntologyMatcher
com.exalead.indexing.analysis.v10.OntologyMatcher
- An OntologyMatcher detects concepts defined in an ontology in the textual content of the Document Chunks. Typically, an ontology contains a list of business terms to be detected. Resulting Annotations are mapped to enable navigation by business concepts. Annotations generated:
- Depends on the resource (See Pkg).
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
enableApproxMatching |
boolean |
|
Enables approximate matching in the ontology. Approximate matching uses the Damerau-Levenshtein edit distance. |
minWordSizeForDist1 |
int |
3 |
Minimum number of chars in token to enable the Damerau-Levenshtein distance of 1. |
minWordSizeForDist2 |
int |
8 |
Minimum number of chars in token to enable the Damerau-Levenshtein distance of 2. |
resourceDir |
string |
|
URL for the directory containing the ontology (data://, file:// or resource://). |
restrictLanguage |
boolean |
True |
Keeps only the expression added with language == Language.XX or with the document language. For example, if the Ontology contains an expression added with language=En, it will be extracted only for an English document if restrictLanguage is set to true. |
keepLongestMatch |
boolean |
True |
Keeps only the longest match. For example, if you have 5 tokens ('a', 'b', 'c', 'd', 'e') and 4 annotations 'a', 'a-c', 'b-c-d' and 'd-e', this option will only keep 'b-c-d' and remove all other annotations. |
keepLongestMatchInterTag |
boolean |
|
Keeps only the longest match (tag independent). For example, if you have 5 tokens ('a', 'b', 'c', 'd', 'e') and 4 annotations 'a', 'a-c', 'b-c-d' and 'd-e', this option will only keep 'b-c-d' and remove all other annotations. |
tokenizeAnnotations |
boolean |
|
If you have multi-token annotations (like a "super market" annotation on the token "supermarket"), this option automatically subtokenizes "supermarket" into "super" "market" and keeps the original annotations. If you enable this option, keepLongestMatch and keepLongestMatchInterTag will be set to true. |
annotationsToIgnore |
string |
|
Sets the list of annotations to be ignored (comma-separated). This feature allows you to define a list of words/expressions to ignore in the recognition of this ontology. For example, if you add:
- the expressions "of" and "the" with the tag "toIgnore" in ontology A,
- and the expression "website embassy" in ontology B with annotationsToIgnore="toIgnore",
... you will be able to match "website of the embassy", "website of embassy" and "website embassy". |
ignoreSpaces |
boolean |
|
If your ontology was compiled with matchOnSeparators=false, this allows 'lemonde' to retrieve 'le monde' or 'le monde' to retrieve 'lemonde'. If your ontology was compiled with matchOnSeparators=true, this allows 'le monde' to retrieve 'le monde'. |
annotationPrefix |
string |
|
A prefix to add to each annotation tag. For example, if the package of the entry matched in the ontology is "exalead.location.country" and the annotationPrefix is "myOntology_", an annotation will be added with the tag "myOntology_exalead.location.country". |
trustLevelBasedDedup |
boolean |
|
Keeps only the annotation with the highest trust level when several entries from a package match the same text chunk. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". |
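The interaction between enableApproxMatching, minWordSizeForDist1 and minWordSizeForDist2 can be illustrated as follows. This sketch assumes the thresholds apply to the length of the input token and uses the optimal string alignment variant of the Damerau-Levenshtein distance; which variant the OntologyMatcher actually uses is not stated in this reference. All function names are hypothetical.

```python
def allowed_distance(token: str, min_size_dist1: int = 3, min_size_dist2: int = 8) -> int:
    # Longer tokens tolerate more edits; short tokens must match exactly.
    if len(token) >= min_size_dist2:
        return 2
    if len(token) >= min_size_dist1:
        return 1
    return 0


def osa_distance(a: str, b: str) -> int:
    """Damerau-Levenshtein distance, optimal string alignment variant."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def approx_match(token: str, entry: str) -> bool:
    # A token matches an ontology entry if it is within the allowed distance.
    return osa_distance(token, entry) <= allowed_distance(token)
```

With the default thresholds, "embasy" (6 characters) may match "embassy" at distance 1, while a 2-character token only matches exactly.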
NamedEntitiesMatcher
com.exalead.indexing.analysis.v10.NamedEntitiesMatcher
- The Named Entities Matcher detects named entities such as people, organizations, or places, in the textual content of the document. It generates annotations like NE.person or NE.organization, using ontology-based matching and/or rule-based matching.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
resourceDir |
string |
|
URL for the resource (data://, file:// or resource://). |
rules |
string |
ne |
Defines which entities will be extracted:
- The default value, ne, triggers the extraction of people, organizations, locations and events.
- The value ne-all triggers the extraction of all types of entities.
|
prefix |
string |
NE |
Prefix to add in front of each annotation generated by the named entity matcher. |
language |
string |
|
Languages for which the processor is activated. If no language is specified, the processor is activated for all languages. |
partOfSpeechFiltering |
boolean |
True |
Discards annotations for parts of text made of a name followed by a verb or an adverb with the first letter in uppercase. This filter is useful if your documents contain a lot of titles with several capitalized words (what is called 'Title Case'). It applies to NE.person, NE.place and NE.organization. |
useKnownWordsForDisambiguisation |
boolean |
True |
Uses a resource of known words to disambiguate named entity candidates. It works only for English and French. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". |
Classifier
com.exalead.indexing.analysis.v10.Classifier
- A Classifier classifies a whole document according to the existing annotations on selected Document Chunks. The annotations are matched against a learning resource.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
resourceDir |
string |
|
URL for the vocabulary resource (data://, file:// or resource://) |
annotationName |
string |
|
Name of the annotation to add. |
language |
iso code |
|
Language for which the vocabulary classifier is activated. |
excludedLanguages |
string |
|
Comma-separated list of languages for which the vocabulary classifier is deactivated (works only if language=xx). |
addAnnotationsOnKeywords |
boolean |
|
If true, it adds annotations to all matching tokens. |
maxAnnotations |
int |
-1 |
Maximum number of annotations per document. |
minTrustLevel |
int |
|
The minimum trust level of categories to keep. |
maxKeywords |
int |
-1 |
The maximum number of keywords to keep. |
minKeywords |
int |
1 |
The minimum number of keywords per class. |
collapseToken |
boolean |
|
If true, all identical tokens are collapsed. |
extraPrefixAnnotations |
string |
|
The optional list of prefix annotations to keep (comma-separated). |
extraAnnotationsMinTrustLevel |
int |
100 |
The minimum trust level to keep an extra annotation. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". |
HierarchicalClassifier
com.exalead.indexing.analysis.v10.HierarchicalClassifier
- A Classifier classifies a whole document according to the existing annotations on selected Document Chunks. The annotations are matched against a learning resource.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
resourceDir |
string |
|
URL for the vocabulary resource (data://, file:// or resource://) |
annotationName |
string |
|
Name of the annotation to add. |
language |
iso code |
|
Language for which the vocabulary classifier is activated. |
excludedLanguages |
string |
|
Comma-separated list of languages for which the vocabulary classifier is deactivated (works only if language=xx). |
addAnnotationsOnKeywords |
boolean |
|
If true, it adds annotations to all matching tokens. |
maxAnnotations |
int |
-1 |
Maximum number of annotations per document. |
minTrustLevel |
int |
|
The minimum trust level of categories to keep. |
maxKeywords |
int |
-1 |
The maximum number of keywords to keep. |
minKeywords |
int |
1 |
The minimum number of keywords per class. |
collapseToken |
boolean |
|
If true, all identical tokens are collapsed. |
extraPrefixAnnotations |
string |
|
The optional list of prefix annotations to keep (comma-separated). |
extraAnnotationsMinTrustLevel |
int |
100 |
The minimum trust level to keep an extra annotation. |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". |
RulesMatcher
com.exalead.indexing.analysis.v10.RulesMatcher
- A RulesMatcher applies a rule engine on the textual content of the DocumentChunks. The rules are defined in a separate XML 'resourceFile' and are a combination of regular expressions, word matching and boolean operators over content. Annotations generated:
- The matching rule defined in the XML specifies the annotation to generate
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
resourceFile |
string |
|
URL for the resource (data://, file:// or resource://). |
language |
iso code |
|
Language for which this processor is activated. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized". |
RelatedTerms
com.exalead.indexing.analysis.v10.RelatedTerms
- Extracts all possible related terms. Only one instance of this processor may exist per input context. Annotations generated:
- "relatedTerm": RelatedTerm identifier (stored in the dictionary and in the index)
- "relatedTermDisplay": display form of the RelatedTerm (stored in the dictionary)
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
relatedTermsMinSpan |
int |
3 |
Minimum number of words (excluding stop words) in an automatically extracted term (not applicable to whitelist). |
relatedTermsMaxSpan |
int |
6 |
Maximum number of words (excluding stop words) in an automatically extracted term (not applicable to whitelist). |
maxRelatedTermsPerDoc |
int |
64 |
The maximum number of related terms per document. |
keepLongestMatch |
boolean |
True |
Keeps only the longest term when several overlap. For example, if you have 5 tokens ('a', 'b', 'c', 'd', 'e') and 4 related terms 'a', 'a-c', 'b-c-d' and 'd-e', this option will only keep 'b-c-d' and remove all other related terms. |
dictionaryName |
string |
|
Name of the dictionary populated by terms extracted by this processor. If null, use the default dictionary. |
preprocResourceDir |
string |
|
URL for the resource of the related terms preprocessor (data://, file:// or resource://). If null, the standard preprocessor of the product is used. |
whitelistResource |
string |
|
Path to a related terms whitelist resource. |
blacklistResource |
string |
|
Path to a related terms blacklist resource. |
withPartOfSpeech |
boolean |
True |
Adds a PartOfSpeechTagger to the list of processors automatically. Improves quality of automatically extracted terms. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
PartOfSpeechTagger
com.exalead.indexing.analysis.v10.PartOfSpeechTagger
- A PartOfSpeechTagger detects the part of speech for each word in the text of Document Chunks. It improves the quality of other processors, such as the named entity detector or the sentiment analyzer. Annotations generated:
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
resourceDir |
string |
|
URL for the resource (data://, file:// or resource://). |
language |
string |
|
Languages for which the processor is activated;
if no language is specified, the processor is activated for all languages. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
Phonetizer
com.exalead.indexing.analysis.v10.Phonetizer
- Creates a phonetic form for each word. This processor is used:
- as a helper for other processors (like Ontology Matcher, or Semantic Extractor), which need to perform phonetic matches.
- to perform search-time phonetic analysis using the Phonetic expansion module (this creates the dictionary of phonetic forms that will be used by the expansion module at search-time).
- to greatly improve the quality of spell checking.
Annotations generated:
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
resourceFile |
string |
|
URL for the resource (data://, file:// or resource://). |
language |
string |
|
Languages for which the processor is activated;
if no language is specified, the processor is activated for all languages. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
Lemmatizer
com.exalead.indexing.analysis.v10.Lemmatizer
- Creates a lemmatized form for each word (nouns and adjectives only). This processor is mostly used as a helper for other processors (like Ontology Matcher, or Semantic Extractor), which need to perform lemmatized matches. Annotations generated:
- "lemma": normalized lemmatized form of the word (singular/masculine)
- "lemma_lowercase": lemmatized form of the word (singular/masculine)
- "fsingular": normalized singular form of the word
- "fsingular_lowercase": singular form of the word
- "masculine": if the token is a masculine word
- "feminine": if the token is a feminine word
- "neuter": if the token is neuter
- "singular": if the word is singular
- "plural": if the word is plural
- "unnumbered": if the word is unnumbered
- "pos": the static Part of Speech
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
resourceDir |
string |
|
URL for the resource (data://, file:// or resource://). |
language |
string |
|
Languages for which the processor is activated;
if no language is specified, the processor is activated for all languages. |
lemmatizeNormalizedAnnotations |
boolean |
|
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
AcronymDetector
com.exalead.indexing.analysis.v10.AcronymDetector
- Detects acronyms like 'o.n.u' and extracts 'onu'. The standard acronym separators are '.', '-' and ' '. Custom alphanumeric separators can be added with the "separators" attribute. Annotations generated:
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
addNormalizerAnnotation |
boolean |
|
|
separators |
string |
|
Comma-separated list of additional allowed separators (alphanumeric only; for example, 'and' to handle '1 and 1'). |
language |
string |
|
Languages for which the processor is activated;
if no language is specified, the processor is activated for all languages. |
strict |
boolean |
True |
In strict mode, the only separator is the dot ('.'). |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
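The acronym detection described for AcronymDetector can be illustrated with a minimal Python sketch. It is not the product's implementation: the function name `detect_acronyms`, the regular expression, and the rule that an acronym needs at least three single characters joined by one consistent separator are all assumptions made for illustration. The custom "separators" attribute is not modeled.

```python
import re

def detect_acronyms(text, strict=True):
    """Return (matched_text, collapsed_form) pairs, e.g. ('o.n.u', 'onu').

    Assumptions (hypothetical, for illustration only):
    - strict=True accepts only '.' as separator;
    - strict=False accepts '.', '-' and ' ';
    - the same separator must be used throughout one acronym.
    """
    sep = r"\." if strict else r"[.\- ]"
    # One character, a captured separator, then one or more
    # "same separator + one character" repetitions.
    pattern = re.compile(
        rf"\b[A-Za-z0-9]({sep})[A-Za-z0-9](?:\1[A-Za-z0-9])+\b"
    )
    results = []
    for match in pattern.finditer(text):
        matched = match.group(0)
        collapsed = re.sub(sep, "", matched)
        results.append((matched, collapsed))
    return results
```

Calling `detect_acronyms("the o.n.u met")` yields the pair `('o.n.u', 'onu')`, while `detect_acronyms("u-s-a")` finds nothing until `strict=False` is passed.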
Normalizer
com.exalead.indexing.analysis.v10.Normalizer
- Normalizes all tags listed in the inputTags attribute. Annotations generated:
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
inputTags |
string |
|
Comma-separated list of the tags to normalize. |
trustLevel |
int |
100 |
|
transliteration |
boolean |
True |
When normalizing, converts some characters to their Latin equivalent. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
FarTextAnnotator
com.exalead.indexing.analysis.v10.FarTextAnnotator
- A FarTextAnnotator annotates alphanumeric tokens with the value of 'annotation' if they are located farther than 'startOffset' in the text.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
startOffset |
int |
8192 |
|
annotation |
string |
fartext |
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
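As an illustration of the FarTextAnnotator behavior, here is a minimal Python sketch. It assumes that 'startOffset' is a character offset and that "farther than" means strictly greater; the reference does not state either point. The function name `annotate_far_text` and the tuple layout are hypothetical.

```python
import re

def annotate_far_text(text, start_offset=8192, annotation="fartext"):
    """Return (token, offset, annotations) triples for alphanumeric tokens.

    Assumption: offsets are counted in characters from the start of the
    chunk, and a token is annotated when its offset exceeds start_offset.
    """
    result = []
    for match in re.finditer(r"[A-Za-z0-9]+", text):
        tags = [annotation] if match.start() > start_offset else []
        result.append((match.group(0), match.start(), tags))
    return result
```

The defaults mirror the documented defaults of the startOffset (8192) and annotation (fartext) attributes.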
Chunker
com.exalead.indexing.analysis.v10.Chunker
- A chunker detects syntactic groups, such as noun groups. Annotations generated:
- "gadv": adverbial group
- "gadj": adjectival group
- "gnoun": noun group
- "gverb": verbal group
- "gprep": prepositional group
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
resourceDir |
string |
|
URL for the resource (data://, file:// or resource://). |
language |
string |
|
Languages for which the processor is activated;
if no language is specified, the processor is activated for all languages. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
SentimentAnalyzer
com.exalead.indexing.analysis.v10.SentimentAnalyzer
- Analyzes the nouns and adjectives present in the text. It detects topics and annotates the document with:
- a global rating of good, bad or neutral
- a rating per topic
- the adjective(s) used in the document
Required processors: Tokenizer, Lemmatizer, PartOfSpeechTagger, RelatedTermsPreprocessor, RelatedTermsExtractor, NamedEntitiesMatcher, Chunker
Annotations generated:
- "sentiment": annotation on nouns with a modulated ("really", "quite", "not") appreciation
- "document_sentiment": annotation on the document with either "good", "bad" or "neutral" and a confidence ratio
Attribute defaults:
- resourceDir: resource://sentiment/sentiment.bin
- language: all supported languages
- summarize: false
- annotateGlobally: false
- showPackage: false
- packageCount: false
- nounPackage (deprecated): true
- ignorePartOfSpeech: false
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
resourceDir |
string |
|
URL for the resource (data://, file:// or resource://). |
language |
iso code |
|
|
annotateGlobally |
boolean |
|
|
annotatePronouns |
boolean |
|
|
ignorePartOfSpeech |
boolean |
|
|
ignoreRelatedTerms |
boolean |
|
|
legacyAnnotations |
boolean |
|
|
notApplicableAnnotations |
boolean |
True |
|
normalizeTrustLevels |
boolean |
True |
|
nounPackage |
boolean |
True |
|
packageCount |
boolean |
|
|
showPackage |
boolean |
|
|
suggest |
boolean |
|
|
summarize |
boolean |
|
|
suggestOutput |
string |
|
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
FastRulesMatcher
com.exalead.indexing.analysis.v10.FastRulesMatcher
- Annotates a document using a set of XML rules, compiled for efficiency. The rules are described with the query language using the AND, OR and NOT operators, as well as 'context' matching operators. The rules can also match whole chunks (and not just words) using regular expressions. Annotations generated:
- Depending on the resources (See FastRulesDefinition)
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
resourceDir |
string |
|
Directory containing the matcher resources. Must not be empty. |
allowsExprStartingBySeparators |
boolean |
|
If you have expressions starting with a separator (",", ";", "&", ...), then you must set this option to true. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
SnowballStemmer
com.exalead.indexing.analysis.v10.SnowballStemmer
- Creates the stemmed form of each word. This uses the Snowball stemming algorithms. This processor is mostly used as a helper for other processors (like Ontology Matcher, or Semantic Extractor), which need to perform stemmed matches. Annotations generated:
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
DebugSemanticProcessor
com.exalead.indexing.analysis.v10.DebugSemanticProcessor
- Dumps all annotated tokens in the specified format to standard output (the log of the 'Analysis' process) or to outputFile.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
outputFile |
string |
|
|
format |
enum(html, xml) |
html |
Output format. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
SQI
com.exalead.indexing.analysis.v10.SQI
- (Deprecated)
- A SemanticProcessor applies semantic processing on the textual content of the DocumentChunks. A Semantic Processor creates SemanticAnnotations on tokens. These SemanticAnnotations can then be used in the Mapping.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
resourceDir |
string |
|
URL for the resource (data://, file:// or resource://) |
breakOnSentence |
boolean |
|
If true, there is at most one match per sentence, and no match spans several sentences. This option adds the SentenceFinder automatically. |
breakOnParagraph |
boolean |
True |
If true, there is at most one match per paragraph, and no match spans several paragraphs. |
breakOnLine |
boolean |
|
If true, there is at most one match per line, and no match spans several lines. |
matchAllRules |
boolean |
True |
If true, it returns the full list of matched rules. If false, it returns the first matched rule only. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
ProximityProcessor
com.exalead.indexing.analysis.v10.ProximityProcessor
- A proximity processor detects and annotates pieces of text where several annotations occur within given distance constraints. Possible constraints (not mutually exclusive):
- token window size
- distance between annotations
- sentence/paragraph scope
Annotations generated:
- Depending on the resource (See Proximity)
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
resourceFile |
string |
|
URL for the resource (data://, file:// or resource://) |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
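The token window constraint of the ProximityProcessor can be pictured with the following Python sketch. It is only an illustration under stated assumptions: the actual rules come from the Proximity resource, whose format is not described here, and the function name `find_proximity_spans`, the per-token sets of annotation names, and the "first complete window from each start token" policy are all hypothetical.

```python
def find_proximity_spans(token_annotations, required, window):
    """Return inclusive (start, end) token index pairs where every
    annotation in `required` occurs within `window` consecutive tokens.

    token_annotations: one set of annotation names per token (assumed model).
    """
    required = set(required)
    spans = []
    for start, tags in enumerate(token_annotations):
        # Only start a span on a token carrying a required annotation.
        if not (tags & required):
            continue
        seen = set()
        last = min(start + window, len(token_annotations))
        for end in range(start, last):
            seen |= token_annotations[end] & required
            if seen == required:
                spans.append((start, end))
                break
    return spans
```

With tokens annotated as `[{"person"}, set(), {"city"}, set(), set(), {"person"}]`, a window of 3 tokens reports only the span (0, 2), while a window of 4 tokens also reports (2, 5).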
AnnotationManager
com.exalead.indexing.analysis.v10.AnnotationManager
- An annotation manager implements basic operations on annotations (copy, removal, selection) according to a number of conditions, such as:
- Removal of overlapping annotations
- Selection of the most frequent annotations
- Copy of an annotation unless blacklisted
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
resourceFile |
string |
|
URL for the resource (data://, file:// or resource://) |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", this element holds the original semantic processor generated by the data model. Use it to revert easily from the "customized" state to the "auto" state. |
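Two of the operations listed for the AnnotationManager, removal of overlapping annotations and selection of the most frequent annotations, can be sketched in Python as follows. This is a hedged illustration, not the product's logic: the (start, end, tag) tuple model with an exclusive end, the "longest annotation wins" policy, and both function names are assumptions.

```python
from collections import Counter

def remove_overlapping(annotations):
    """Keep the longest annotations and drop any that overlap a kept one.

    annotations: list of (start, end, tag) tuples, end exclusive (assumed).
    """
    kept = []
    # Longest first; ties are broken by the earliest start position.
    for ann in sorted(annotations, key=lambda a: (-(a[1] - a[0]), a[0])):
        if all(ann[1] <= k[0] or ann[0] >= k[1] for k in kept):
            kept.append(ann)
    return sorted(kept)

def most_frequent(annotations, n):
    """Return the n most frequent annotation tags."""
    counts = Counter(tag for _, _, tag in annotations)
    return [tag for tag, _ in counts.most_common(n)]
```

For the input `[(0, 1, "a"), (0, 3, "b"), (2, 4, "c"), (5, 6, "a")]`, the first function keeps `(0, 3, "b")` and `(5, 6, "a")`, and `most_frequent(..., 1)` returns `["a"]`.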
CustomSemanticProcessor
com.exalead.indexing.analysis.v10.CustomSemanticProcessor
- A custom semantic processor allows you to plug in custom code in the semantic pipeline.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.SemanticPipeDocumentProcessor (as SemanticPipeDocumentProcessor)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Semantic Processor. This name is only used for tracing and debugging purposes. |
contexts |
string |
|
Comma-separated list of the ContextNames of the Document Chunks on which this processor should be applied. If this list is empty, all DocumentChunks are processed. |
dataModelState |
string |
|
Is this semantic processor managed by a data model? @enum{null,auto,customized, error}. If null, this semantic processor is not related to the data model. If "auto", this semantic processor is auto-generated by the data model. |
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disables the DocumentProcessor |
classId |
string |
|
The specified class must implement the com.exalead.indexing.analysis.semantic.CustomSemanticProcessorInterface Exascript interface. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.SemanticProcessor |
If dataModelState is "customized", you will find here the original semantic processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
KeyValue |
exa.bee.KeyValue* |
|
PrintfValues
com.exalead.indexing.analysis.v10.PrintfValues
- Prints the textual content of DocumentChunks according to a formatting string. This string contains variables in one of the following three formats:
1. $(name), where name is the name of a context: the output is the textual content of this context.
2. $/name:regexp/, where name is the name of a context whose chunks must match the regexp: the output is the piece of text that matched.
3. $/name:regexp:format/, where name is the name of a context whose chunks must match the regexp: the output is defined by a sed-like format referencing the regexp subexpressions.
Warning: In the regexp and format parts, colons and slashes must be escaped with a backslash. For example: "$(firstname) $(lastname) : $/age:[0-9]+/ $/date:([0-9]{2})([0-9]{2})([0-9]{4}):day=\\1 month=\\2 year=\\3/"
Warning: The contexts used in the formatting string cannot be produced by another processor. They should come from the connector.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
formattingString |
string |
|
This string contains variables in one of the following three formats:
1. $(name), where name is the name of a context: the output is the textual content of this context.
2. $/name:regexp/, where name is the name of a context whose chunks must match the regexp: the output is the piece of text that matched.
3. $/name:regexp:format/, where name is the name of a context whose chunks must match the regexp: the output is defined by a sed-like format referencing the regexp subexpressions.
Warning: Colons and slashes must be escaped with a backslash. For example: "$(firstname) $(lastname) : $/age:[0-9]+/ $/date:([0-9]{2})([0-9]{2})([0-9]{4}):day=\\1 month=\\2 year=\\3/"
|
outputContext |
string |
|
ContextName to be associated with the DocumentChunk created for each generated value. |
strict |
boolean |
True |
When true, processing requires that all the contexts referenced in the formatting string be found. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
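The variable syntax of the formatting string can be illustrated with a short Python sketch. This is not the product's implementation: `expand` and its helpers are hypothetical names, contexts are modeled as a plain dict from context name to text, and an unmatched variable is assumed to produce no text. The reference example writes subexpression references as "\\1"; the sketch uses a raw string, so they appear as "\1".

```python
import re

# Hypothetical helper, not part of the product. Matches $(name),
# $/name:regexp/ or $/name:regexp:format/ (backslash escapes allowed).
_VARIABLE = re.compile(
    r"\$\((?P<plain>[^)]+)\)"
    r"|\$/(?P<name>[^:/]+):(?P<body>(?:\\.|[^/\\])*)/"
)


def _split_on_unescaped_colon(body):
    """Split 'regexp[:format]' at the first colon not escaped by a backslash."""
    i = 0
    while i < len(body):
        if body[i] == "\\":
            i += 2  # skip the escaped character
        elif body[i] == ":":
            return body[:i], body[i + 1:]
        else:
            i += 1
    return body, None


def _unescape(text):
    """Restore the delimiters that had to be escaped: \\: and \\/."""
    return text.replace("\\:", ":").replace("\\/", "/")


def expand(formatting_string, contexts):
    """Expand variables using `contexts`, a dict of context name -> text."""
    def substitute(match):
        if match.group("plain") is not None:
            return contexts.get(match.group("plain"), "")
        regexp, fmt = _split_on_unescaped_colon(match.group("body"))
        found = re.search(_unescape(regexp), contexts.get(match.group("name"), ""))
        if found is None:
            return ""  # assumption: an unmatched variable yields no text
        return found.group(0) if fmt is None else found.expand(_unescape(fmt))
    return _VARIABLE.sub(substitute, formatting_string)


contexts = {"firstname": "Ada", "lastname": "Lovelace",
            "age": "36 years", "date": "10121815"}
pattern = (r"$(firstname) $(lastname) : $/age:[0-9]+/ "
           r"$/date:([0-9]{2})([0-9]{2})([0-9]{4}):day=\1 month=\2 year=\3/")
result = expand(pattern, contexts)
```

With these sample contexts, `result` is "Ada Lovelace : 36 day=10 month=12 year=1815".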
RenameUnmappedContexts
com.exalead.indexing.analysis.v10.RenameUnmappedContexts
- This Document Processor changes the ContextName for all DocumentChunks associated with a ContextName that does not have a Mapping Configuration. This avoids extensive renaming using RenameContext.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
The new ContextName for DocumentChunks with an unmapped ContextName. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
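The renaming rule of RenameUnmappedContexts can be pictured with a small Python sketch. It is only an illustration: chunks are modeled as (context name, text) tuples, and `rename_unmapped` and `mapped_contexts` are hypothetical names standing in for the processor and for the set of ContextNames that have a Mapping Configuration.

```python
def rename_unmapped(chunks, mapped_contexts, output_context):
    """Return chunks with every unmapped context name replaced by output_context."""
    return [
        (name if name in mapped_contexts else output_context, text)
        for name, text in chunks
    ]


chunks = [("title", "Annual report"), ("x_custom_1", "foo"), ("x_custom_2", "bar")]
renamed = rename_unmapped(chunks, mapped_contexts={"title"}, output_context="misc")
```

Here the two unmapped contexts both end up under "misc", without one RenameContext per source name.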
NewChunk
com.exalead.indexing.analysis.v10.NewChunk
- Creates a new DocumentChunk with 'outputContext' as ContextName, and textual content specified in 'value'.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
The ContextName used for newly created chunks. |
value |
string |
|
The value used for newly created chunks. |
partName |
string |
|
The part to which the chunk should belong. If nothing is specified here, the chunk is handled as a global chunk, that is, one that belongs to no DocumentPart. |
language |
iso code |
|
Language of the chunk, as an ISO 639 code. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
UniformRandomContextGenerator
com.exalead.indexing.analysis.v10.UniformRandomContextGenerator
- Adds a new DocumentChunk for one document out of 'modulo' documents processed. The textual content of the DocumentChunk is picked out of the list specified in 'values', with a uniform distribution.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
The ContextName used for newly created chunks. |
modulo |
int |
|
Inverse probability of adding the new chunk. Must be a strictly positive integer. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
values |
exa.bee.StringValue* |
List of possible values. |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
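As a rough illustration of the 'modulo' and 'values' semantics of UniformRandomContextGenerator, the following Python sketch adds a chunk with probability 1/modulo and picks its text uniformly. It assumes the selection is random per document rather than a strict every-Nth counter, which the reference does not specify; `maybe_add_uniform_chunk` is a hypothetical name.

```python
import random


def maybe_add_uniform_chunk(values, modulo, output_context, rng):
    """Return (output_context, value) with probability 1/modulo, else None."""
    if modulo <= 0:
        raise ValueError("modulo must be a strictly positive integer")
    if rng.randrange(modulo) != 0:
        return None  # no chunk added for this document
    return (output_context, rng.choice(values))


rng = random.Random(0)  # seeded only to make the sketch repeatable
colors = ["red", "green", "blue"]
results = [maybe_add_uniform_chunk(colors, 4, "color", rng) for _ in range(1000)]
added = [chunk for chunk in results if chunk is not None]
```

With modulo set to 1, every document receives a chunk.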
ZipfRandomContextGenerator
com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator
- Adds a new document chunk for one document out of 'modulo'. The textual content of the document chunk is picked out of the list specified in 'values', with a non-uniform discrete Zipf distribution.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
outputContext |
string |
|
The ContextName used for newly created chunks. |
modulo |
int |
|
Inverse probability of adding the new chunk. Must be a strictly positive integer. |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
parameter |
double |
|
The exponent characterizing the distribution. |
- Nested elements:
Name |
Type |
Description |
values |
exa.bee.StringValue* |
List of possible values. |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
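The effect of the 'parameter' exponent of ZipfRandomContextGenerator can be sketched in Python. This is an assumption-laden illustration, not the product's code: it assumes that the rank of a value is its position in 'values' (so the first value is the most frequent) and that the weight of rank k is 1/k^parameter. `zipf_weights` and `maybe_add_zipf_chunk` are hypothetical names.

```python
import random


def zipf_weights(count, exponent):
    """Unnormalized Zipf weights: rank k (1-based) gets weight 1 / k**exponent."""
    return [1.0 / (k ** exponent) for k in range(1, count + 1)]


def maybe_add_zipf_chunk(values, modulo, exponent, output_context, rng):
    """Return (output_context, value) with probability 1/modulo, else None."""
    if modulo <= 0:
        raise ValueError("modulo must be a strictly positive integer")
    if rng.randrange(modulo) != 0:
        return None  # no chunk added for this document
    weights = zipf_weights(len(values), exponent)
    return (output_context, rng.choices(values, weights=weights, k=1)[0])


weights = zipf_weights(3, 1.0)
chunk = maybe_add_zipf_chunk(["a"], 1, 1.0, "c", random.Random(0))
```

A larger exponent concentrates more of the probability on the first values of the list.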
DiscardDocument
ReplaceContextNames
com.exalead.indexing.analysis.v10.ReplaceContextNames
- Replaces the first matching substring of context names with the given replacement. For example, inputSubstring="abc" and outputReplacement="bar" will rename the context "abcdef" to "bardef" and "somethingabcstuff" to "somethingbarstuff".
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
inputSubstring |
string |
|
The piece of string to be replaced. |
outputReplacement |
string |
|
The replacement string. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
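The "first matching substring" behavior of ReplaceContextNames corresponds to a single, non-repeated substitution. A minimal Python sketch, using the hypothetical function name `replace_context_name`:

```python
def replace_context_name(name, input_substring, output_replacement):
    """Replace only the first occurrence of input_substring in a context name."""
    return name.replace(input_substring, output_replacement, 1)


renamed = [
    replace_context_name(name, "abc", "bar")
    for name in ["abcdef", "somethingabcstuff", "abcabc", "other"]
]
```

Only the first occurrence is touched, so "abcabc" becomes "barabc", and names without a match are left unchanged.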
HTMLCSSSelector
com.exalead.indexing.analysis.v10.HTMLCSSSelector
- Deletes all text chunks that are not annotated with a class or an id specified in 'classes' or 'ids'.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
classes |
exa.bee.StringValue* |
|
ids |
exa.bee.StringValue* |
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
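The filtering performed by HTMLCSSSelector can be pictured as follows. This Python sketch is purely illustrative: the way chunks carry their class and id annotations is an assumption (a dict with hypothetical "classes" and "id" keys), and `select_chunks` is a hypothetical name.

```python
def select_chunks(chunks, classes, ids):
    """Keep only chunks annotated with one of the given classes or ids."""
    wanted_classes, wanted_ids = set(classes), set(ids)
    return [
        chunk for chunk in chunks
        if wanted_classes & set(chunk.get("classes", ()))
        or chunk.get("id") in wanted_ids
    ]


chunks = [
    {"text": "Site menu", "classes": ["nav"], "id": None},
    {"text": "Article body", "classes": ["content", "wide"], "id": None},
    {"text": "Headline", "classes": [], "id": "title"},
]
kept = select_chunks(chunks, classes=["content"], ids=["title"])
```

The "Site menu" chunk is dropped because it matches neither list.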
DataModelClassResolver
com.exalead.indexing.analysis.v10.DataModelClassResolver
- This processor reads the value of the "datamodel_class" PAPI directive to determine the DataModelClass of the document. If this directive is not found, the document is assumed to belong to the default class. If the class is not the default class, all metas corresponding to an existing DataModelProperty are prefixed with the type of the class declaring the property (which may be a superclass of the document's class). Processors that follow this processor in the pipeline must refer to a Data Model property by prefixing it with its class name. Processors that precede it must use the meta name only (without prefix).
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
SetDefaultValue
com.exalead.indexing.analysis.v10.SetDefaultValue
- This processor looks for the specified contexts. If they are not present in the document, they are created with a configured value.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
KeyValue |
exa.bee.KeyValue* |
|
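The behavior of SetDefaultValue can be sketched in Python. The sketch assumes that each KeyValue entry pairs a context name (key) with its default text (value), which this reference does not state explicitly; chunks are modeled as (context name, text) tuples and `set_default_values` is a hypothetical name.

```python
def set_default_values(chunks, defaults):
    """Append a chunk for each default whose context is absent from the document."""
    present = {name for name, _ in chunks}
    missing = [(name, value) for name, value in defaults.items()
               if name not in present]
    return chunks + missing


chunks = [("title", "Annual report")]
completed = set_default_values(chunks, {"title": "Untitled", "author": "unknown"})
```

The existing "title" chunk is kept as-is, and only the missing "author" context is created.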
CustomDocumentProcessor
com.exalead.indexing.analysis.v10.CustomDocumentProcessor
- A custom document processor allows you to plug in custom code, packaged as a CVPlugin, into the document processing pipeline.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
classId |
string |
|
Class identifier. The specified class must implement the com.exalead.pdoc.analysis.CustomDocumentProcessor Java Interface. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
KeyValue |
exa.bee.KeyValue* |
|
InferFileExtension
com.exalead.indexing.analysis.v10.InferFileExtension
- When the file_extension meta is not present, finds the file extension based on the file name or the mime meta (if one of these two is present).
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
InsertCurrentDate
com.exalead.indexing.analysis.v10.InsertCurrentDate
- Adds the current date to an output context.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
outputContext |
string |
|
The ContextName used for newly created chunks. |
format |
string |
|
Either "unixts" (Unix timestamp) or a Java SimpleDateFormat pattern. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
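The two kinds of 'format' value accepted by InsertCurrentDate can be contrasted with a short Python sketch. Several points are assumptions: the sketch uses Python strftime directives rather than SimpleDateFormat patterns (for example, "%Y-%m-%d" plays the role of "yyyy-MM-dd"), it treats "unixts" as seconds since the epoch, and it works in UTC. `current_date_chunk` is a hypothetical name.

```python
from datetime import datetime, timezone


def current_date_chunk(output_context, fmt, now=None):
    """Build an (output_context, text) chunk holding the current date."""
    now = now or datetime.now(timezone.utc)
    if fmt == "unixts":
        text = str(int(now.timestamp()))  # assumption: seconds, not milliseconds
    else:
        text = now.strftime(fmt)  # strftime directive, not a SimpleDateFormat pattern
    return (output_context, text)


fixed = datetime(2020, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
as_timestamp = current_date_chunk("indexed_at", "unixts", now=fixed)
as_date = current_date_chunk("indexed_at", "%Y-%m-%d", now=fixed)
```

The optional `now` argument exists only to make the sketch repeatable.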
XpathRule
com.exalead.indexing.analysis.v10.XpathRule
- No documentation for this element.
- Parent elements:
com.exalead.indexing.analysis.v10.XpathExtractor (as XpathExtractor)
- Attributes:
Name |
Type |
Default value |
Description |
metaName |
string |
|
|
xpath |
string |
|
|
concatMutiMatch |
boolean |
True |
Concatenates all results into a single value when the XPath expression returns several results. Otherwise, each match is added as a separate value of a multivalued meta. Disable this option if you want each node returned by the XPath expression to be stored as a distinct value (for example, a list of items). |
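The difference between the two settings of concatMutiMatch can be shown with Python's standard xml.etree.ElementTree module, which supports a limited XPath subset. This is a sketch under assumptions, not the extractor's implementation: `apply_xpath_rule` is a hypothetical name, and the separator used when concatenating (a single space here) is not documented in this reference.

```python
import xml.etree.ElementTree as ET


def apply_xpath_rule(xml_text, meta_name, xpath, concat_multi_match=True,
                     separator=" "):
    """Return a list of (meta_name, value) pairs for the nodes matched by xpath."""
    root = ET.fromstring(xml_text)
    matches = [(node.text or "").strip() for node in root.findall(xpath)]
    if not matches:
        return []
    if concat_multi_match:
        return [(meta_name, separator.join(matches))]  # one concatenated value
    return [(meta_name, text) for text in matches]  # one value per matched node


xml_text = "<order><item>pen</item><item>ink</item></order>"
concatenated = apply_xpath_rule(xml_text, "items", ".//item", True)
multivalued = apply_xpath_rule(xml_text, "items", ".//item", False)
```

With concatenation enabled the meta holds one value, "pen ink". With it disabled the meta holds two values, "pen" and "ink".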
XpathFragmentRule
com.exalead.indexing.analysis.v10.XpathFragmentRule
- No documentation for this element.
- Parent elements:
com.exalead.indexing.analysis.v10.XpathFragmentExtractor (as XpathFragmentExtractor)
- Attributes:
Name |
Type |
Default value |
Description |
metaName |
string |
|
|
xpath |
string |
|
|
SimilarStringToPart
com.exalead.indexing.analysis.v10.SimilarStringToPart
- Converts signatures stored in string format in a meta into a binary part.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
version |
int |
1 |
Specifies the version. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
values |
exa.bee.StringValue* |
List of the names of the metas to parse and to transform to part. |
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
DocumentProcessorGroup
com.exalead.indexing.analysis.v10.DocumentProcessorGroup
- Contains a list of document processors that are executed only if the condition of this group matches. This avoids duplicating the condition or creating distinct pipelines when several processors share the same condition.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
DocumentProcessor |
com.exalead.indexing.analysis.v10.DocumentProcessor* |
|
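The grouping behavior described above can be pictured with a small Python model. This is only a sketch under stated assumptions, not the product's API: `Processor`, `ProcessorGroup`, the `accept` callable (standing in for AcceptCondition) and the dict-based document are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Document = Dict[str, str]  # hypothetical stand-in for a document's metas


@dataclass
class Processor:
    name: str
    apply: Callable[[Document], None]
    disabled: bool = False


@dataclass
class ProcessorGroup:
    name: str
    accept: Callable[[Document], bool]  # plays the role of AcceptCondition
    processors: List[Processor] = field(default_factory=list)

    def run(self, doc: Document) -> List[str]:
        """Run child processors in order, only if the group condition matches."""
        if not self.accept(doc):
            return []
        executed = []
        for proc in self.processors:
            if proc.disabled:
                continue  # a disabled processor is skipped
            proc.apply(doc)
            executed.append(proc.name)
        return executed


# One condition shared by two processors, written only once on the group.
group = ProcessorGroup(
    name="pdf-only",
    accept=lambda d: d.get("mime") == "application/pdf",
    processors=[
        Processor("tag", lambda d: d.update(tagged="yes")),
        Processor("skip-me", lambda d: d.update(skipped="no"), disabled=True),
    ],
)
```

Calling `group.run(doc)` on a document whose `mime` is not `application/pdf` executes none of the children, which is the point of sharing the condition at group level.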
UnitsOfMeasurementNormalizer
com.exalead.indexing.analysis.v10.UnitsOfMeasurementNormalizer
- Unit of measurement detector and converter.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
indexField |
string |
|
The index field in which the value will be stored. |
indexFieldUnitSymbol |
string |
|
The output unit symbol |
suffixName |
string |
_um |
Output suffix to create a new meta as output |
removeContext |
boolean |
|
Remove contexts after processing |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
DebugCrashProcessor
com.exalead.indexing.analysis.v10.DebugCrashProcessor
- Causes crashes for debugging purposes.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
type |
string |
exception |
The crash type
{@code enum(noop,exception,oom,infiniteloop,nullptr,abort,assert,segv,intdiv)} |
delay |
int |
|
Trigger delay in seconds. |
count |
int |
3 |
Trigger document count. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
PLMExpandDocumentProcessor
com.exalead.indexing.analysis.v10.PLMExpandDocumentProcessor
- Processes PLM metas to generate octrees and matrices for PLMExpand.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
metaMatrix |
string |
matrix |
Name of the meta containing the matrix data. |
fieldMatrix |
string |
matrix |
Name of the target matrix field. |
fieldInvMatrix |
string |
invmatrix |
Name of the target inverse matrix field. |
metaCGR |
string |
cgr |
Name of the meta containing the CGRs. |
fieldOctree |
string |
octree |
Name of target octree field. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
CGRDocumentProcessor
com.exalead.indexing.analysis.v10.CGRDocumentProcessor
- Calls convert to generate octrees.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
com.exalead.indexing.analysis.v10.DocumentProcessorGroup (as DocumentProcessorGroup)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of this processor. The name of a processor is used only for tracing and debugging purposes. |
dataModelState |
string |
|
Is this document processor managed by a data model? @enum{null,auto,customized, error}.
- If null, this document processor is not related to a data model.
- If "auto", this document processor is auto-generated by a data model.
- If "customized", this document processor was auto-generated by a data model
and then customized.
- If "error", there is a conflict between this document processor and the data model.
|
dataModelClass |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelClass that generated this DocumentProcessor. |
dataModelProperty |
string |
|
If dataModelState is either "auto" or "customized", you will find here the
name of the DataModelProperty that generated this DocumentProcessor. |
disabled |
boolean |
|
Disable the DocumentProcessor |
partCGR |
string |
CGR |
Name of the part containing the CGR data (tessellation). |
partOctree |
string |
octree |
Name of the part used to store the resulting octree. |
docIdentifyer |
string |
majorid |
Name of the meta identifying the document. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.DocumentProcessor |
If dataModelState is "customized", you will find here the
original document processor generated by the data model. Use this to easily revert to "auto" state from "customized".
|
AcceptCondition |
com.exalead.indexing.analysis.v10.AcceptCondition |
Expresses the enablement condition of this DocumentProcessor. |
FilteringConfiguration
com.exalead.indexing.analysis.v10.FilteringConfiguration
- Filters to apply to the words extracted by the semantic processors. Words that do not satisfy these conditions are not indexed. Filter values are expressed as a number of Unicode characters.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
- Attributes:
Name |
Type |
Default value |
Description |
wordMaxLength |
int |
100 |
Maximal length of a word.
100 is the default value. |
hexCharMax |
int |
|
Maximal number of hexadecimal characters that can appear in a word. This filter applies only to words longer than 'hexLengthMin'. 0 = no filter (default value) |
hexLengthMin |
int |
|
Minimal number of characters in a word for the hexadecimal filter to apply. 0 = no filter (default value) |
maxNumChars |
int |
|
Maximal number of characters in a word. 0 = no filter (default value) |
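As an illustration of how the length and hexadecimal filters could combine, here is a minimal Python sketch. It covers only wordMaxLength, hexCharMax and hexLengthMin; the function name `keep_word` and the exact comparison rules (strict "greater than", hex filter active only when both settings are non-zero) are assumptions, not confirmed behavior of the product.

```python
import string

HEX_DIGITS = set(string.hexdigits)  # 0-9, a-f, A-F


def keep_word(word, word_max_length=100, hex_char_max=0, hex_length_min=0):
    """Return True if the word passes the filters (assumed semantics)."""
    # len() counts Unicode code points, matching "number of Unicode characters".
    if len(word) > word_max_length:
        return False
    # Assumption: the hex filter is active only when both settings are non-zero,
    # and applies only to words longer than hex_length_min.
    if hex_char_max > 0 and hex_length_min > 0 and len(word) > hex_length_min:
        hex_count = sum(1 for ch in word if ch in HEX_DIGITS)
        if hex_count > hex_char_max:
            return False
    return True
```

With `hex_char_max=8` and `hex_length_min=10`, a 16-character hash-like token such as `deadbeefcafe1234` would be dropped, while a short word such as `decade` would be kept because it is not longer than `hex_length_min`.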
LanguageConfiguration
com.exalead.indexing.analysis.v10.LanguageConfiguration
- Configuration of the linguistic extraction for a given language.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
- Attributes:
Name |
Type |
Default value |
Description |
language |
iso code |
|
The language iso code |
generateWordDict |
boolean |
|
Extracts words for the global dictionary. |
wordDictModulo |
int |
1 |
Word extraction modulo. By default, all words are extracted. |
maxWordDictWordsPerDocument |
long |
-1 |
Maximum number of words extracted per document. |
maxExtractedWordLength |
int |
64 |
Maximum length a word can have and still be extracted. |
spellCheckNGramMaxSize |
int |
3 |
Maximum number of consecutive words for spellchecking. If the value is set to '-1', spellcheck data is not generated for this language. 0 and 1 values are illegal, default is 3. |
spellCheckNGramsDictModulo |
int |
5 |
NGrams extraction modulo. It extracts 1 ngram out of 5 by default. |
maxSpellCheckNGramsPerDocument |
long |
-1 |
Maximum number of ngrams extracted per document. |
maxExtractedSpellCheckNGramLength |
int |
256 |
Maximum length an ngram can have and still be extracted. |
relatedTermsDictModulo |
int |
1 |
Submits 1 out of X documents for related terms generation. If the value is set to 0, related terms are not generated for this language. |
maxRelatedTermsDictContextsPerDocument |
long |
-1 |
Maximum number of related terms extracted per document. |
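The modulo attributes above describe a "1 out of N" sampling. A minimal sketch of that idea in Python follows; the helper name `sample_by_modulo`, the choice of keeping the first item of each window, and the reading of `-1` as "no per-document limit" are assumptions made for illustration.

```python
def sample_by_modulo(items, modulo, limit=-1):
    """Keep one item out of every `modulo` items; modulo=1 keeps everything."""
    if modulo <= 0:
        # 0 disables extraction, as documented for relatedTermsDictModulo.
        # Treating negative values the same way is an assumption.
        return []
    kept = [item for index, item in enumerate(items) if index % modulo == 0]
    # Assumption: a negative limit means "no per-document cap".
    return kept if limit < 0 else kept[:limit]
```

For example, with ten candidate ngrams and the default spellCheckNGramsDictModulo of 5, this sketch keeps two of them; with a modulo of 1 it keeps all ten.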
MappingConfiguration
com.exalead.indexing.analysis.v10.MappingConfiguration
- Specifies how DocumentChunks and their SemanticAnnotations populate the index and the dictionary.
- Parent elements:
com.exalead.indexing.analysis.v10.AnalysisPipeline (as AnalysisPipeline)
- Nested elements:
Name |
Type |
Description |
AnnotationMapping |
com.exalead.indexing.analysis.v10.AnnotationMapping* |
List of mappings from annotations to index targets, with associated parameters. |
ContextMapping |
com.exalead.indexing.analysis.v10.ContextMapping* |
List of mappings from contexts to index targets, with associated parameters. |
FieldIndexingLimit |
com.exalead.indexing.analysis.v10.FieldIndexingLimit* |
Word count limits to apply to texts mapped to index fields for search. |
FieldRetrievalLimit |
com.exalead.indexing.analysis.v10.FieldRetrievalLimit* |
Size limits (in bytes) to apply to texts mapped to the index for retrieval. |
GenerateAnnotationsForContext |
com.exalead.indexing.analysis.v10.GenerateAnnotationsForContext* |
List of contexts to process with a semantic pipeline before mapping. |
PartMapping |
com.exalead.indexing.analysis.v10.PartMapping* |
List of mappings from parts to index targets, with associated parameters. |
WordCountMapping |
com.exalead.indexing.analysis.v10.WordCountMapping* |
Specifies where to map the word count. |
AnnotationMapping
com.exalead.indexing.analysis.v10.AnnotationMapping
- Defines how SemanticAnnotations are used to populate index fields.
- Parent elements:
com.exalead.indexing.analysis.v10.MappingConfiguration (as MappingConfiguration)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the SemanticAnnotation to map. |
context |
string |
|
Optional input context restricting the mapping to annotations coming from a specific context. Incompatible with the patternMatch feature. |
patternMatch |
boolean |
|
Matches all annotations matching this pattern (must be a valid regular expression). |
dataModelState |
string |
|
Is this annotation mapping managed by a data model?
@enum{null,auto,customized}. If null, this annotation mapping is not related to a data model. If "auto", this annotation mapping is auto-generated by a data model. If "customized", this annotation mapping was auto-generated by a data model
and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this annotation mapping. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this annotation mapping. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.AnnotationMapping |
If dataModelState is "customized", you will find here the
original annotation mapping generated by the data model. Use this to easily show what reverting to "auto" from "customized" would imply |
AnnotationTarget |
com.exalead.indexing.analysis.v10.AnnotationTarget* |
|
CategoryAnnotationTarget
com.exalead.indexing.analysis.v10.CategoryAnnotationTarget
- CategoryAnnotationTarget is used to create a new category path inside an index category field, out of a SemanticAnnotation. The category path is built by the concatenation of the 'categoryRoot' and the selected 'form' of the annotation.
- Parent elements:
com.exalead.indexing.analysis.v10.AnnotationMapping (as AnnotationMapping)
- Attributes:
Name |
Type |
Default value |
Description |
indexField |
string |
|
|
forcedRank |
long |
|
|
rankBoost |
long |
|
|
form |
string |
normalized |
Which form of SemanticAnnotation value should we index?
{@code enum(exact,normalized)} |
dataModelState |
string |
|
Is this annotation target managed by a data model? @enum{null,auto,customized}. If null, this annotation target is not related to a data model. If "auto", this annotation target is auto-generated by a data model. If "customized", this annotation target was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the name of the DataModelClass that generated this AnnotationTarget. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the name of the DataModelProperty that generated this AnnotationTarget. |
categoryRoot |
string |
|
Prefix used to build the CategoryPath. |
categoryAppend |
boolean |
True |
Builds the category path by concatenating the categoryRoot and the selected 'form' of the annotation. If false, only the category root will be used. |
appendAnnotationNameToRoot |
boolean |
|
Appends the annotation name between the root and the value. |
retrievable |
boolean |
|
If true, the category path is retrievable and can be used to create facets. If false, the category path is only searchable. (Advanced usage. langdate hacks) |
cleanupContent |
boolean |
True |
Removes trailing and leading spaces. Removes category paths that contain no alphanumeric character. |
detectTitle |
boolean |
|
Detect words set after # in path and use them as title |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.AnnotationTarget |
If dataModelState is "customized", you will find here the original annotation target generated by the data model. Use this to easily see what reverting to "auto" from "customized" would imply. |
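The way categoryRoot, categoryAppend and appendAnnotationNameToRoot combine into a category path can be sketched as follows. This is an illustrative Python model only: the function name, the `/` separator and the handling of an empty root are assumptions that the reference above does not specify.

```python
def build_category_path(category_root, annotation_name, value,
                        category_append=True,
                        append_annotation_name_to_root=False,
                        separator="/"):
    """Build a category path from a root and an annotation (assumed semantics)."""
    root = category_root.rstrip(separator)
    if not category_append:
        # Documented behavior: only the category root is used.
        return root
    parts = [root]
    if append_annotation_name_to_root:
        # The annotation name goes between the root and the value.
        parts.append(annotation_name)
    parts.append(value)  # the selected 'form' (exact or normalized) of the annotation
    return separator.join(part for part in parts if part)
```

For instance, a root of `Top/entities` with appendAnnotationNameToRoot enabled, an annotation named `person` and a normalized value `john smith` would give `Top/entities/person/john smith` in this sketch.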
StandardAnnotationTarget
com.exalead.indexing.analysis.v10.StandardAnnotationTarget
- StandardAnnotationTarget is used to index the textual content of a SemanticAnnotation. The selected 'form' of the SemanticAnnotation is used to populate an index field.
- Parent elements:
com.exalead.indexing.analysis.v10.AnnotationMapping (as AnnotationMapping)
- Attributes:
Name |
Type |
Default value |
Description |
indexField |
string |
|
|
forcedRank |
long |
|
|
rankBoost |
long |
|
|
form |
string |
normalized |
Which form of SemanticAnnotation value should we index?
{@code enum(exact,normalized)} |
dataModelState |
string |
|
Is this annotation target managed by a data model? @enum{null,auto,customized}. If null, this annotation target is not related to a data model. If "auto", this annotation target is auto-generated by a data model. If "customized", this annotation target was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the name of the DataModelClass that generated this AnnotationTarget. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the name of the DataModelProperty that generated this AnnotationTarget. |
searchable |
boolean |
|
If true, the SemanticAnnotation can be searched for. |
indexLevel |
string |
|
If searchable, index kind where data will be indexed. Can be "exact", "lowercase", "normalized" or "custom". |
customIndexKind |
int |
|
If indexLevel = "custom", this index kind will be used. |
retrievable |
boolean |
|
If true, the SemanticAnnotation can be retrieved. |
retrieveField |
string |
|
The field where the SemanticAnnotation is stored for retrieval, if 'retrievable' is set to true. If null, 'indexField' will be used to store the SemanticAnnotation for retrieval. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.AnnotationTarget |
If dataModelState is "customized", you will find here the original annotation target generated by the data model. Use this to easily see what reverting to "auto" from "customized" would imply. |
EnumFacetAnnotationTarget
com.exalead.indexing.analysis.v10.EnumFacetAnnotationTarget
- EnumFacetAnnotationTarget maps the annotations according to the specified EnumFacet.
- Parent elements:
com.exalead.indexing.analysis.v10.AnnotationMapping (as AnnotationMapping)
- Attributes:
Name |
Type |
Default value |
Description |
indexField |
string |
|
|
forcedRank |
long |
|
|
rankBoost |
long |
|
|
form |
string |
normalized |
Which form of SemanticAnnotation value should we index?
{@code enum(exact,normalized)} |
dataModelState |
string |
|
Is this annotation target managed by a data model? @enum{null,auto,customized}. If null, this annotation target is not related to a data model. If "auto", this annotation target is auto-generated by a data model. If "customized", this annotation target was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the name of the DataModelClass that generated this AnnotationTarget. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the name of the DataModelProperty that generated this AnnotationTarget. |
enumFacetId |
string |
|
The id of the EnumFacet this target refers to. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.AnnotationTarget |
If dataModelState is "customized", you will find here the original annotation target generated by the data model. Use this to easily see what reverting to "auto" from "customized" would imply. |
ContextMapping
com.exalead.indexing.analysis.v10.ContextMapping
- ContextMapping specifies how DocumentChunks with a given ContextName are remapped to index fields and whether they are used to populate the dictionary.
- Parent elements:
com.exalead.indexing.analysis.v10.MappingConfiguration (as MappingConfiguration)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
ContextName of the DocumentChunks to map. |
prefixMatch |
boolean |
|
Matches all contexts that start with this prefix. |
unprefix |
boolean |
|
Remove the prefix that was used to match. |
patternMatch |
boolean |
|
Matches all contexts matching this pattern (must be a valid regular expression). |
semantic |
boolean |
True |
Performs semantic processing on the DocumentChunks processed by this mapping. If false, the textual content of the DocumentChunks will not be tokenized before indexing. This can be used to index 'exact raw values'. |
resourceFreq |
int |
1 |
To extract a resource, select the frequency to add. For example, if you have a 'firstname lastname' entry, you may want to simulate a frequency of 1000 to avoid spellcheck on this entry. |
tokenizationConfig |
string |
|
|
dataModelState |
string |
|
Is this context mapping managed by a data model?
@enum{null,auto,customized}. If null, this context mapping is not related to a data model. If "auto", this context mapping is auto-generated by a data model. If "customized", this context mapping was auto-generated by a data model
and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this context mapping. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this ContextMapping. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.indexing.analysis.v10.ContextMapping |
If dataModelState is "customized", you will find here the
original context mapping generated by the data model. Use this to easily show what reverting to "auto" from "customized" would imply. |
Target |
com.exalead.indexing.analysis.v10.Target* |
|
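The three matching modes of a ContextMapping (exact name, prefixMatch with optional unprefix, and patternMatch) can be sketched in Python as below. The function `map_context` is hypothetical, the requirement that a pattern match the whole context name is an assumption, and Python's `re` syntax may differ from the regular expression flavor the product actually uses.

```python
import re
from typing import Optional


def map_context(context_name: str, name: str, prefix_match: bool = False,
                unprefix: bool = False, pattern_match: bool = False) -> Optional[str]:
    """Return the context name to use downstream, or None if there is no match."""
    if pattern_match:
        # Assumption: the pattern must match the whole context name.
        return context_name if re.fullmatch(name, context_name) else None
    if prefix_match:
        if not context_name.startswith(name):
            return None
        # unprefix removes the prefix that was used to match.
        return context_name[len(name):] if unprefix else context_name
    # Default mode: exact ContextName equality.
    return context_name if context_name == name else None
```

With `name="meta_"`, `prefix_match=True` and `unprefix=True`, a chunk whose context is `meta_author` would be handled as `author` in this model.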
CategoryContentTarget
com.exalead.indexing.analysis.v10.CategoryContentTarget
- CategoryContentTarget is used to map a DocumentChunk to a category. A Category Path is created for each DocumentChunk processed. The textual content of the DocumentChunk is used to build a Category Path. 'indexField' should be a category field (usually called 'categories' or 'security').
- Parent elements:
com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
- Attributes:
Name |
Type |
Default value |
Description |
indexField |
string |
|
The indexField to populate with this content. If null, the contextName of the DocumentChunk will be used for the index field. |
forcedRank |
long |
|
Sets the ranking value for chunks in this mapping. -1 means that the chunk internal ranking value is kept. |
rankBoost |
long |
|
Offsets the chunk internal ranking value. Use it only when forcedRank = -1. For example, if forcedRank=-1, rankBoost=2, and the chunk internal ranking value is 4, the final rank will be 6. |
categoryRoot |
string |
|
Root prefix used to build the category path. |
categoryAppend |
boolean |
True |
Appends the textual content of the DocumentChunk to the category root. If false, only the category root will be used. |
appendContextNameToRoot |
boolean |
|
Appends the context name between the root and the value. |
form |
string |
normalized |
The form of the word to be used to build the category path.
{@code enum(exact,normalized)} |
retrievable |
boolean |
|
Stores the category path, which enables display and navigation by category path. If false, we only index the SemanticAnnotation (Advanced usage - langdate hacks). |
cleanupContent |
boolean |
True |
If true:
- Removes trailing and leading unicode-spaces.
- Replaces all sequences of unicode-space characters by a single 'space' character.
- Does not map to the category in append mode if the DocumentChunk does not contain at least one unicode alpha-numerical character.
|
detectTitle |
boolean |
|
Detect words set after # in path and use them as title |
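The three cleanupContent rules listed above can be approximated with a few lines of Python. This is a sketch, not the product's implementation: `cleanup_content` is a hypothetical helper, and returning `None` to signal "do not map to the category" is a convention chosen for the example.

```python
import re
from typing import Optional


def cleanup_content(text: str) -> Optional[str]:
    """Apply the documented cleanup rules; None means 'do not map this chunk'."""
    # Collapse every run of Unicode whitespace into a single space,
    # then drop leading and trailing spaces.
    collapsed = re.sub(r"\s+", " ", text).strip()
    # Skip content without at least one Unicode alphanumeric character.
    if not any(ch.isalnum() for ch in collapsed):
        return None
    return collapsed
```

In Python 3, `\s` and `str.isalnum()` are Unicode-aware for `str` input, so non-breaking spaces and non-Latin letters are handled by the same rules.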
DateCategoryContentTarget
com.exalead.indexing.analysis.v10.DateCategoryContentTarget
- CategoryContentTarget specific to date.
- Parent elements:
com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
- Attributes:
Name |
Type |
Default value |
Description |
categoryRoot |
string |
|
Root prefix used to build the category path. |
categoryAppend |
boolean |
True |
Appends the textual content of the DocumentChunk to the category root. If false, only the category root will be used. |
appendContextNameToRoot |
boolean |
|
Appends the context name between the root and the value. |
form |
string |
normalized |
The form of the word to be used to build the category path.
{@code enum(exact,normalized)} |
retrievable |
boolean |
|
Stores the category path, which enables display and navigation by category path. If false, we only index the SemanticAnnotation (Advanced usage - langdate hacks). |
cleanupContent |
boolean |
True |
If true:
- Removes trailing and leading unicode-spaces.
- Replaces all sequences of unicode-space characters by a single 'space' character.
- Does not map to the category in append mode if the DocumentChunk does not contain at least one unicode alpha-numerical character.
|
detectTitle |
boolean |
|
Detect words set after # in path and use them as title |
indexField |
string |
|
The indexField to populate with this content. If null, the contextName of the DocumentChunk will be used for the index field. |
forcedRank |
long |
|
Sets the ranking value for chunks in this mapping. -1 means that the chunk internal ranking value is kept. |
rankBoost |
long |
|
Offsets the chunk internal ranking value. Use it only when forcedRank = -1. For example, if forcedRank=-1, rankBoost=2, and the chunk internal ranking value is 4, the final rank will be 6. |
inputFormat |
string |
|
Specifies the input format of the date, in UNIX date format. Set null value for automatic detection of standard formats. |
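The interaction between forcedRank and rankBoost described in the attributes above amounts to a small piece of arithmetic, sketched here in Python. The function name `final_rank` is hypothetical, and ignoring rankBoost whenever forcedRank is set is an assumption drawn from the note that rankBoost should only be used when forcedRank = -1.

```python
def final_rank(internal_rank: int, forced_rank: int = -1, rank_boost: int = 0) -> int:
    """Compute the rank of a chunk from its internal rank and the mapping settings."""
    if forced_rank != -1:
        # A forced rank replaces the chunk internal ranking value.
        # Assumption: rank_boost is ignored in this case.
        return forced_rank
    # forced_rank == -1 keeps the internal value, offset by rank_boost.
    return internal_rank + rank_boost


# The example from the rankBoost description: 4 + 2 = 6.
example = final_rank(4, forced_rank=-1, rank_boost=2)
```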
StandardContentTarget
com.exalead.indexing.analysis.v10.StandardContentTarget
- A StandardContentTarget is used to populate a textual, numerical or date index field, with the content of a DocumentChunk.
- Parent elements:
com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
- Attributes:
Name |
Type |
Default value |
Description |
indexField |
string |
|
The indexField to populate with this content. If null, the contextName of the DocumentChunk will be used for the index field. |
forcedRank |
long |
|
Sets the ranking value for chunks in this mapping. -1 means that the chunk internal ranking value is kept. |
rankBoost |
long |
|
Offsets the chunk internal ranking value. Use it only when forcedRank = -1. For example, if forcedRank=-1, rankBoost=2, and the chunk internal ranking value is 4, the final rank will be 6. |
prefixWithContext |
boolean |
|
Enables prefixing of all words in inverted lists by 'contextName#'. |
addStartEnd |
boolean |
|
Enables the introduction of a word __start__ before chunk content and a word __end__ after chunk content. Only valid if the chunk is mapped with semantic=true. This option is compatible with prefixWithContext, in which case it produces contextName#__start__ and contextName#__end__. |
indexPrefixes |
boolean |
|
Enables the indexing of all prefixes for each word with a score = prefixesScore. The prefix can be mapped to a specific kind if you add 'prefix' in formIndexingConfig. |
prefixesScore |
int |
1 |
Score given to words' prefixes. The document relevance is determined by its score. The text matching score basically represents the "distance" between a search query and a document. |
maxPrefixLength |
int |
|
Maximum length of the extracted prefixes. |
indexSuffixes |
boolean |
|
Enables the indexing of all suffixes for each word with a score = suffixesScore. The suffix can be mapped to a specific kind if you add 'suffix' in formIndexingConfig. |
suffixesScore |
int |
1 |
Score given to words' suffixes. The document relevance is determined by its score. The text matching score basically represents the "distance" between a search query and a document. |
maxSuffixLength |
int |
|
Maximum length of the extracted suffixes. |
indexSubstrings |
boolean |
|
Enables the indexing of all substrings for each word with a score = substringsScore. The substring can be mapped to a specific kind if you add 'substring' in formIndexingConfig. |
substringsScore |
int |
1 |
Score given to extracted substrings. Document relevance is determined by its score. The text matching score basically represents the "distance" between a search query and a document. |
searchable |
boolean |
True |
Marks the content of the DocumentChunk as indexed and searchable. |
retrievable |
boolean |
True |
Enables the content of the DocumentChunk to be directly stored in the index, so that it can be retrieved. For numerical values, retrievability allows you to sort results by field. |
retrieveField |
string |
|
The index field in which the content will be stored. If null, the content will be put in 'indexField'. |
indexNormalized |
boolean |
True |
Enables the indexing of the normalized form of the word. |
indexLowercase |
boolean |
|
Enables the indexing of the lowercase (non-normalized) form of each token. |
indexExact |
boolean |
|
Enables the indexing of the exact (non-normalized) form of each token. |
indexSeparators |
boolean |
|
Enables the indexing of the index standard separators. Indexed standard separators are: paragraph, sentence and page. Standard separators indexing is required for the SPLIT operator to work with these separators. |
addBreakBetweenChunks |
boolean |
True |
Enables the introduction of a break between document chunks by the indexer. This forbids phrase matching across these chunks and has an impact on search when using double-quoted expressions or the 'NEXT' operator. For example, suppose a document has a "title" chunk containing "foo" and a "text" chunk containing "bar", and both are remapped to the text field.
- If addBreakBetweenChunks is false, the document matches the queries
"foo bar" and foo NEXT bar.
- If addBreakBetweenChunks is true, the document matches neither
"foo bar" nor foo NEXT bar, but it still matches the query foo AND bar.
|
- Nested elements:
Name |
Type |
Description |
DecreaseRankOnAnnotation |
com.exalead.indexing.analysis.v10.DecreaseRankOnAnnotation* |
List of DecreaseRankOnAnnotation |
IncreaseRankOnAnnotation |
com.exalead.indexing.analysis.v10.IncreaseRankOnAnnotation* |
List of IncreaseRankOnAnnotation |
RankOnAnnotation |
com.exalead.indexing.analysis.v10.RankOnAnnotation* |
List of RankOnAnnotation |
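The effect of addBreakBetweenChunks on phrase matching can be illustrated with a toy positional index in Python. Everything here is a simplified model under stated assumptions: the helper names are hypothetical, tokens are split on whitespace, and the break is modeled as a one-position gap, which is not necessarily how the indexer represents it.

```python
from typing import Dict, List


def index_chunks(chunks: List[str], add_break: bool = True) -> Dict[str, List[int]]:
    """Build a toy positional index for chunks remapped to the same field."""
    positions: Dict[str, List[int]] = {}
    pos = 0
    for chunk in chunks:
        for token in chunk.split():
            positions.setdefault(token, []).append(pos)
            pos += 1
        if add_break:
            pos += 1  # hypothetical gap: the next chunk is no longer adjacent
    return positions


def matches_phrase(index: Dict[str, List[int]], first: str, second: str) -> bool:
    """True if `second` directly follows `first` (quoted phrase or NEXT)."""
    return any(p + 1 in index.get(second, []) for p in index.get(first, []))


def matches_and(index: Dict[str, List[int]], first: str, second: str) -> bool:
    """True if both words occur anywhere in the field."""
    return first in index and second in index


# "title" chunk containing foo, "text" chunk containing bar, same target field.
with_break = index_chunks(["foo", "bar"], add_break=True)
without_break = index_chunks(["foo", "bar"], add_break=False)
```

In this model, `matches_phrase(without_break, "foo", "bar")` holds while `matches_phrase(with_break, "foo", "bar")` does not, and `matches_and` holds in both cases, mirroring the two bullet points of the attribute description.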
DateContentTarget
com.exalead.indexing.analysis.v10.DateContentTarget
- DateContentTarget defines how a date is indexed.
- Parent elements:
com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
- Attributes:
Name |
Type |
Default value |
Description |
prefixWithContext |
boolean |
|
Enables prefixing of all words in inverted lists by 'contextName#'. |
addStartEnd |
boolean |
|
Enables the introduction of a word __start__ before chunk content and a word __end__ after chunk content. Only valid if the chunk is mapped with semantic=true. This option is compatible with prefixWithContext, in which case it produces contextName#__start__ and contextName#__end__. |
indexPrefixes |
boolean |
|
Enables the indexing of all prefixes for each word with a score = prefixesScore. The prefix can be mapped to a specific kind if you add 'prefix' in formIndexingConfig. |
prefixesScore |
int |
1 |
Score given to words' prefixes. The document relevance is determined by its score. The text matching score basically represents the "distance" between a search query and a document. |
maxPrefixLength |
int |
|
Maximum length of the extracted prefixes. |
indexSuffixes |
boolean |
|
Enables the indexing of all suffixes for each word with a score = suffixesScore. The suffix can be mapped to a specific kind if you add 'suffix' in formIndexingConfig. |
suffixesScore |
int |
1 |
Score given to words' suffixes. The document relevance is determined by its score. The text matching score basically represents the "distance" between a search query and a document. |
maxSuffixLength |
int |
|
Maximum length of the extracted suffixes. |
indexSubstrings |
boolean |
|
Enables the indexing of all substrings for each word with a score = substringsScore. The substring can be mapped to a specific kind if you add 'substring' in formIndexingConfig. |
substringsScore |
int |
1 |
Score given to extracted substrings. Document relevance is determined by its score. The text matching score basically represents the "distance" between a search query and a document. |
searchable |
boolean |
True |
Marks the content of the DocumentChunk as indexed and searchable. |
retrievable |
boolean |
True |
Enables the content of the DocumentChunk to be directly stored in the index, so that it can be retrieved. For numerical values, retrievability allows you to sort results by field. |
retrieveField |
string |
|
The index field in which the content will be stored. If null, the content will be put in 'indexField'. |
indexNormalized |
boolean |
True |
Enables the indexing of the normalized form of the word. |
indexLowercase |
boolean |
|
Enables the indexing of the lowercase (non-normalized) form of each token. |
indexExact |
boolean |
|
Enables the indexing of the exact (non-normalized) form of each token. |
indexSeparators |
boolean |
|
Enables the indexing of standard separators. The indexed standard separators are: paragraph, sentence, and page. Standard separators must be indexed for the SPLIT operator to work with them. |
addBreakBetweenChunks |
boolean |
True |
Enables the introduction of a break between document chunks by the indexer. This forbids phrase matching across these chunks and affects searches that use double-quoted expressions or the 'NEXT' operator. For example, suppose a document has a "title" chunk containing "foo" and a "text" chunk containing "bar", and both are remapped to the text field:
- If addBreakBetweenChunks is false, the document matches the queries "foo bar" and foo NEXT bar.
- If addBreakBetweenChunks is true, the document matches neither "foo bar" nor foo NEXT bar, but it does match foo AND bar.
|
indexField |
string |
|
The indexField to populate with this content. If null, the contextName of the DocumentChunk will be used for the index field. |
forcedRank |
long |
|
Sets the ranking value for chunks in this mapping. -1 means that the chunk internal ranking value is kept. |
rankBoost |
long |
|
Offsets the chunk internal ranking value. Use it only when forcedRank = -1. For example, if forcedRank=-1, rankBoost=2, and the chunk internal ranking value is 4, the final rank will be 6. |
inputFormat |
string |
|
Specifies the input format of the date, in UNIX date format. Leave it null to detect standard formats automatically. |
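As an illustration of what an explicit inputFormat might look like, the sketch below assumes that "UNIX date format" means the conversion specifiers of the UNIX date/strftime family (for example %Y, %m, %d). This reference does not list the accepted specifiers, so the format strings and the parse_date helper are hypothetical.

```python
from datetime import datetime


def parse_date(raw: str, input_format: str) -> datetime:
    """Parse a raw date string using a strftime-style format string.

    input_format plays the role of the inputFormat attribute (assumption:
    strftime-style specifiers are what the indexer expects).
    """
    return datetime.strptime(raw, input_format)


# Two hypothetical inputFormat values applied to the same calendar day.
iso_like = parse_date("2014-03-07 18:45:00", "%Y-%m-%d %H:%M:%S")
day_first = parse_date("07/03/2014", "%d/%m/%Y")
```

An explicit format removes ambiguity for inputs such as 07/03/2014, which automatic detection could read as either day-first or month-first.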
- Nested elements:
Name |
Type |
Description |
DecreaseRankOnAnnotation |
com.exalead.indexing.analysis.v10.DecreaseRankOnAnnotation* |
List of DecreaseRankOnAnnotation |
IncreaseRankOnAnnotation |
com.exalead.indexing.analysis.v10.IncreaseRankOnAnnotation* |
List of IncreaseRankOnAnnotation |
RankOnAnnotation |
com.exalead.indexing.analysis.v10.RankOnAnnotation* |
List of RankOnAnnotation |
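The indexPrefixes, indexSuffixes, and indexSubstrings attributes above each expand a word into additional index entries. The following is a minimal sketch of that expansion, not the indexer's actual behavior: it assumes the full word itself is not counted as a prefix or suffix and that maxPrefixLength and maxSuffixLength are plain upper bounds in characters. The function names are hypothetical.

```python
def prefixes(word: str, max_len: int) -> list:
    """Proper prefixes of word, at most max_len characters long."""
    limit = min(max_len, len(word) - 1)
    return [word[:i] for i in range(1, limit + 1)]


def suffixes(word: str, max_len: int) -> list:
    """Proper suffixes of word, at most max_len characters long."""
    limit = min(max_len, len(word) - 1)
    return [word[-i:] for i in range(1, limit + 1)]


def substrings(word: str) -> list:
    """All distinct non-empty substrings of word, sorted for stable output."""
    found = {word[i:j] for i in range(len(word)) for j in range(i + 1, len(word) + 1)}
    return sorted(found)


word_prefixes = prefixes("index", 3)   # ['i', 'in', 'ind']
word_suffixes = suffixes("index", 2)   # ['x', 'ex']
word_substrings = substrings("index")
```

The number of substrings grows quadratically with word length, which suggests that substring indexing enlarges the index far more than prefix or suffix indexing does.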
DecreaseRankOnAnnotation
com.exalead.indexing.analysis.v10.DecreaseRankOnAnnotation
- Allows you to decrease the ranking when some words are flagged by an annotation (part of speech, ontology, ...).
- Parent elements:
com.exalead.indexing.analysis.v10.DateContentTarget (as DateContentTarget)
com.exalead.indexing.analysis.v10.StandardContentTarget (as StandardContentTarget)
- Attributes:
Name |
Type |
Default value |
Description |
annotationName |
string |
|
Name of the targeted annotation. |
annotationValue |
string |
|
Value of the annotation that will trigger the decrease in ranking. |
value |
int |
|
Amount subtracted from the ranking when triggered. |
IncreaseRankOnAnnotation
com.exalead.indexing.analysis.v10.IncreaseRankOnAnnotation
- Allows you to increase the ranking when some words are flagged by an annotation (part of speech, ontology, ...).
- Parent elements:
com.exalead.indexing.analysis.v10.DateContentTarget (as DateContentTarget)
com.exalead.indexing.analysis.v10.StandardContentTarget (as StandardContentTarget)
- Attributes:
Name |
Type |
Default value |
Description |
annotationName |
string |
|
Name of the targeted annotation. |
annotationValue |
string |
|
Value of the annotation that will trigger the increase in ranking. |
value |
int |
|
Amount added to the ranking when triggered. |
RankOnAnnotation
com.exalead.indexing.analysis.v10.RankOnAnnotation
- Modifies ranking when some words are flagged by a given annotation.
- Parent elements:
com.exalead.indexing.analysis.v10.DateContentTarget (as DateContentTarget)
com.exalead.indexing.analysis.v10.StandardContentTarget (as StandardContentTarget)
- Attributes:
Name |
Type |
Default value |
Description |
annotationName |
string |
|
The annotation that triggers the ranking modification. |
annotationValue |
string |
|
The annotation value required to trigger the ranking modification. |
forcedRank |
int |
|
The new ranking. |
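The three annotation-based elements above (DecreaseRankOnAnnotation, IncreaseRankOnAnnotation, RankOnAnnotation) can be pictured as rules applied to a word's rank. The sketch below is only a mental model: the rule tuples, the adjust_rank function, and the order in which rules apply are assumptions, since this reference does not specify how several rules combine.

```python
def adjust_rank(rank: int, annotations: dict, rules: list) -> int:
    """Apply annotation-triggered rank rules to one word.

    annotations maps annotationName -> annotationValue for the word.
    Each rule is (kind, annotationName, annotationValue, amount), where kind
    is "increase", "decrease", or "force" (hypothetical labels for the three
    elements). Rules are applied in list order (assumption).
    """
    for kind, name, value, amount in rules:
        if annotations.get(name) != value:
            continue  # rule not triggered for this word
        if kind == "increase":
            rank += amount
        elif kind == "decrease":
            rank -= amount
        elif kind == "force":
            rank = amount  # RankOnAnnotation: forcedRank replaces the rank
    return rank


rules = [
    ("increase", "pos", "noun", 2),
    ("decrease", "pos", "stopword", 3),
    ("force", "ontology", "brand", 10),
]
noun_rank = adjust_rank(4, {"pos": "noun"}, rules)
brand_rank = adjust_rank(4, {"pos": "noun", "ontology": "brand"}, rules)
```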
CustomContentTarget
com.exalead.indexing.analysis.v10.CustomContentTarget
- CustomContentTarget defines indexing by a custom 'Index Kind'.
- Parent elements:
com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
- Attributes:
Name |
Type |
Default value |
Description |
indexField |
string |
|
The indexField to populate with this content. If null, the contextName of the DocumentChunk will be used for the index field. |
forcedRank |
long |
|
Sets the ranking value for chunks in this mapping. -1 means that the chunk internal ranking value is kept. |
rankBoost |
long |
|
Offsets the chunk internal ranking value. Use it only when forcedRank = -1. For example, if forcedRank=-1, rankBoost=2, and the chunk internal ranking value is 4, the final rank will be 6. |
searchable |
boolean |
True |
If true, the content of the DocumentChunk will be indexed and searchable. |
retrieveField |
string |
|
The index field in which the content will be stored. If null, the content will be put in 'indexField'. |
retrievable |
boolean |
True |
Stores the content of the DocumentChunk directly in the index, so that it can be retrieved. For numerical values, retrievability allows you to sort results by field. |
indexKind |
int |
|
Index 'Kind' to use for indexing content. |
addBreakBetweenChunks |
boolean |
True |
If true, the indexer introduces a break between document chunks. This forbids phrase matching across these chunks and affects searches that use double-quoted expressions or the 'NEXT' operator. For example, suppose a document has a "title" chunk containing "foo" and a "text" chunk containing "bar", and both are remapped to the text field:
- If addBreakBetweenChunks is false, the document matches the queries "foo bar" and foo NEXT bar.
- If addBreakBetweenChunks is true, the document matches neither "foo bar" nor foo NEXT bar, but it does match foo AND bar.
|
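The effect of addBreakBetweenChunks on the "foo"/"bar" example can be reproduced with a toy positional index. This sketch assumes the break is implemented as a gap in word positions, which is a common technique but is not confirmed by this reference. All function names are hypothetical.

```python
def index_positions(chunks: list, add_break: bool) -> dict:
    """Map each word to its positions in the field.

    When add_break is true, one position is skipped after every chunk, so
    the last word of a chunk is never adjacent to the first word of the next.
    """
    positions = {}
    pos = 0
    for chunk in chunks:
        for word in chunk.split():
            positions.setdefault(word, []).append(pos)
            pos += 1
        if add_break:
            pos += 1  # positional gap standing in for the chunk break
    return positions


def matches_next(positions: dict, first: str, second: str) -> bool:
    """True if `second` occurs immediately after `first` (phrase / NEXT)."""
    return any(p + 1 in positions.get(second, []) for p in positions.get(first, []))


def matches_and(positions: dict, first: str, second: str) -> bool:
    """True if both words occur anywhere in the field (AND)."""
    return first in positions and second in positions


chunks = ["foo", "bar"]  # "title" chunk, then "text" chunk, same field
without_break = index_positions(chunks, add_break=False)
with_break = index_positions(chunks, add_break=True)
```

With these inputs, matches_next succeeds only on without_break, while matches_and succeeds on both, mirroring the two cases described above.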
EnumFacetContentTarget
com.exalead.indexing.analysis.v10.EnumFacetContentTarget
- EnumFacetContentTarget maps the content according to the specified EnumFacet.
- Parent elements:
com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
- Attributes:
Name |
Type |
Default value |
Description |
indexField |
string |
|
The indexField to populate with this content. If null, the contextName of the DocumentChunk will be used for the index field. |
forcedRank |
long |
|
Sets the ranking value for chunks in this mapping. -1 means that the chunk internal ranking value is kept. |
rankBoost |
long |
|
Offsets the chunk internal ranking value. Use it only when forcedRank = -1. For example, if forcedRank=-1, rankBoost=2, and the chunk internal ranking value is 4, the final rank will be 6. |
enumFacetId |
string |
|
The id of the EnumFacet this target refers to. |
form |
string |
normalized |
The form of the values for the facet stringValues. Possible values: exact, normalized. |
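The interaction between forcedRank and rankBoost described in the attribute tables above can be written as a small function. This is a sketch of the documented example (internal rank 4, rankBoost 2, forcedRank -1 gives 6). The behavior when forcedRank is set together with a non-zero rankBoost is an assumption here: the forced value is taken to win.

```python
def final_rank(internal_rank: int, forced_rank: int = -1, rank_boost: int = 0) -> int:
    """Combine a chunk's internal rank with forcedRank and rankBoost."""
    if forced_rank != -1:
        # Assumption: a forced rank overrides both the internal rank and the boost.
        return forced_rank
    # forcedRank = -1 keeps the internal rank, offset by rankBoost.
    return internal_rank + rank_boost


documented_example = final_rank(4, forced_rank=-1, rank_boost=2)  # 6
forced_example = final_rank(4, forced_rank=10)
```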
DictionaryTarget
com.exalead.indexing.analysis.v10.DictionaryTarget
- A DictionaryTarget specifies how a DocumentChunk or semantic annotation is processed to populate the dictionary.
- Parent elements:
com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
- Attributes:
Name |
Type |
Default value |
Description |
dictionaryName |
string |
|
|
words |
boolean |
True |
|
ngrams |
boolean |
|
|
rt |
boolean |
|
|
phonemes |
boolean |
|
|
PartTarget
com.exalead.indexing.analysis.v10.PartTarget
- A PartTarget specifies how a Part is processed to populate the index.
- Parent elements:
com.exalead.indexing.analysis.v10.ContextMapping (as ContextMapping)
- Attributes:
Name |
Type |
Default value |
Description |
indexField |
string |
|
The index field in which the content will be stored. |
FieldIndexingLimit
com.exalead.indexing.analysis.v10.FieldIndexingLimit
- Limits the number of words that can be retrieved from a given field.
- Parent elements:
com.exalead.indexing.analysis.v10.MappingConfiguration (as MappingConfiguration)
- Attributes:
Name |
Type |
Default value |
Description |
fieldName |
string |
|
Field to limit. |
maxNbWords |
int |
|
Maximum number of words for this field. |
FieldRetrievalLimit
com.exalead.indexing.analysis.v10.FieldRetrievalLimit
- Limits the size of text that can be retrieved from a given field. In some standard configurations, a FieldRetrievalLimit on the 'text' field is set to "maxLength=4096". This limits the size of the index on disk.
- Parent elements:
com.exalead.indexing.analysis.v10.MappingConfiguration (as MappingConfiguration)
- Attributes:
Name |
Type |
Default value |
Description |
retrieveField |
string |
|
Field to limit. |
maxLength |
int |
|
Max text size in bytes. The text will be clipped to the nearest word. Text is stored in UTF-8. |
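Because maxLength is counted in bytes of UTF-8 rather than in characters, clipping has to avoid cutting a multi-byte character or a word in half. The sketch below shows one way to do this. It assumes "nearest word" means backing up to the previous whitespace so that the result never exceeds the limit. The clip_utf8 function is hypothetical.

```python
def clip_utf8(text: str, max_length: int) -> str:
    """Clip text to at most max_length UTF-8 bytes, ending on a word boundary."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_length:
        return text
    # Cut at the byte limit; errors="ignore" drops a trailing partial character.
    clipped = encoded[:max_length].decode("utf-8", errors="ignore")
    cut_inside_word = (
        clipped != ""
        and not clipped[-1].isspace()
        and not text[len(clipped)].isspace()
    )
    if cut_inside_word:
        # Drop the partial last word if an earlier word exists.
        parts = clipped.rsplit(None, 1)
        if len(parts) == 2:
            clipped = parts[0]
    return clipped.rstrip()


ascii_clip = clip_utf8("hello world", 8)      # cut falls inside "world"
accent_clip = clip_utf8("héllo wörld", 9)     # cut falls inside the bytes of "ö"
```

A text with accented or non-Latin characters reaches the byte limit sooner than its character count suggests, since each such character takes two or more bytes in UTF-8.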
GenerateAnnotationsForContext
com.exalead.indexing.analysis.v10.GenerateAnnotationsForContext
- Forces a context to be processed by the SemanticProcessor pipeline so that semantic annotations are generated for it.
- Parent elements:
com.exalead.indexing.analysis.v10.MappingConfiguration (as MappingConfiguration)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
ContextName of the DocumentChunks to map. |
prefixMatch |
boolean |
|
Matches any context starting with this prefix. |
patternMatch |
boolean |
|
Matches any context matching this regular expression. |
tokenizationConfig |
string |
|
If set, it forces the tokenization configuration to use. |
PartMapping
com.exalead.indexing.analysis.v10.PartMapping
- PartMapping specifies how parts are remapped to index fields.
- Parent elements:
com.exalead.indexing.analysis.v10.MappingConfiguration (as MappingConfiguration)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the Part to map. |
prefixMatch |
boolean |
|
Matches all parts that start with this prefix. |
patternMatch |
boolean |
|
Matches all parts matching this pattern (must be a valid regular expression). |
- Nested elements:
Name |
Type |
Description |
PartTarget |
com.exalead.indexing.analysis.v10.PartTarget* |
|
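Both GenerateAnnotationsForContext and PartMapping select their targets through name, prefixMatch, and patternMatch. The sketch below shows one plausible reading of how the three attributes combine. The precedence (pattern, then prefix, then exact) and the use of a full regular-expression match are assumptions, and mapping_matches is a hypothetical helper.

```python
import re


def mapping_matches(name: str, candidate: str,
                    prefix_match: bool = False,
                    pattern_match: bool = False) -> bool:
    """Decide whether a context or part name is selected by a mapping."""
    if pattern_match:
        # name is treated as a regular expression over the whole candidate.
        return re.fullmatch(name, candidate) is not None
    if prefix_match:
        # name is treated as a literal prefix.
        return candidate.startswith(name)
    # Default: exact name comparison.
    return candidate == name


exact = mapping_matches("title", "title")
by_prefix = mapping_matches("meta_", "meta_author", prefix_match=True)
by_pattern = mapping_matches(r"part_\d+", "part_42", pattern_match=True)
```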
WordCountMapping
com.exalead.indexing.analysis.v10.WordCountMapping
- Specifies where to map the word count.
- Parent elements:
com.exalead.indexing.analysis.v10.MappingConfiguration (as MappingConfiguration)
- Attributes:
Name |
Type |
Default value |
Description |
fromName |
string |
|
Compute the word count of this field. |
toName |
string |
|
Store the word count to this field. |
IndexSchema
com.exalead.mercury.mami.indexing.v10.IndexSchema
- Configuration for an index schema. This defines the fields actually stored in an index. Most commonly, only one index schema is defined, and used by all build groups (for all slices). This configuration is referenced in the BuildGroup element in 'Deployment'.
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
|
allowIntensiveDiskAccess |
boolean |
|
Allows intensive operations like sorting or faceting to be performed on disk (SSD should be preferred). |
- Nested elements:
Name |
Type |
Description |
AttributeGroupStore |
com.exalead.mercury.mami.indexing.v10.AttributeGroupStore* |
|
FieldConfig |
com.exalead.mercury.mami.indexing.v10.FieldConfig* |
|
AttributeGroupStore
com.exalead.mercury.mami.indexing.v10.AttributeGroupStore
- Configuration of an attribute group. An attribute group defines how attributes should be persisted on disk.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
id |
int |
|
A unique identifier for this attribute group. |
label |
string |
|
A human readable name for this attribute group. |
format |
enum(SimpleRowOrientedStore, ItemOrientedStore) |
ItemOrientedStore |
Specifies how to persist the data on disk for this attribute group. |
retrievableRoles |
string |
|
Specifies a comma-separated list of annotations to be handled in this attribute group store. Example: @Facetable,@Sortable,@Display |
leafSize |
int |
30720 |
If the format is SimpleRowOrientedStore, configures the leaf size (i.e., maximum IO size read per DID). |
AlphanumFieldConfig
com.exalead.mercury.mami.indexing.v10.AlphanumFieldConfig
- This field stores alphanumeric values (e.g., 'text', 'title').
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
ramBased |
boolean |
|
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
fieldName |
string |
|
The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Indicates whether this index field config is managed by a data model. Possible values: null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", this is the name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", this is the name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
maxStoredWordPosition |
int |
|
Number of words, starting from the beginning of the document, for which word positions will be stored in the index. This enables proximity ranking and position searching (NEAR, NEXT, ...) up to this number of words in the document. Use '0' to disable position storing. |
maxInlineWordPositions |
int |
2 |
Advanced setting controlling how many positions are inlined
in the main data file for each word of each document. |
useVariablePositionsEncoding |
boolean |
|
Advanced setting to choose which positions encoding algorithm should be used. Variable position encoding should be used to reduce index size when indexing big documents. |
storeTf |
boolean |
|
Stores the number of terms of each document. This information may be used by the ranking algorithm to normalize term frequencies (as "nbTerms"). This costs a few bytes of RAM per document. |
bloomFilter |
boolean |
|
Activates a Bloom filter per slot. This speeds up requests containing words that are not present in the field on a given slot. Disable this option if all words of the request for this field always match and you regularly compact into big slots. Enable this option if there are many misses (e.g. on the "text" field) or if you have small updates (e.g. with real-time indexing). |
gzip |
boolean |
True |
Activates content compression. |
implementation |
enum(strbtree, trie, fsm) |
fsm |
Advanced configuration. Internal structure used to store the field dictionary. |
nbWordsPerLeaf |
int |
1000 |
Advanced configuration. If using the strbtree structure, it configures the number of words per leaf. |
optimizePatternSearch |
boolean |
True |
Adds extra information to the dictionaries to optimize pattern search. If false, data structures are optimized for size. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
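The bloomFilter attribute above relies on a general property of Bloom filters: a negative answer is always correct, so a word that is absent from a slot can be rejected without reading the slot's dictionary. The sketch below is a generic textbook Bloom filter, not the product's implementation. The sizes, the hash construction, and the class name are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over strings (illustrative sizes only)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _indexes(self, word: str):
        # Derive num_hashes bit indexes by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, word: str) -> None:
        for i in self._indexes(word):
            self.bits[i // 8] |= 1 << (i % 8)

    def might_contain(self, word: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[i // 8] & (1 << (i % 8)) for i in self._indexes(word))


slot_filter = BloomFilter()
for indexed_word in ("search", "index", "slot"):
    slot_filter.add(indexed_word)
```

This matches the guidance above: the filter pays off when many lookups miss, and adds little when every queried word is present in the slot.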
RiceEncoderConfig
com.exalead.mercury.mami.indexing.v10.RiceEncoderConfig
- No documentation for this element.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.AlphanumFieldConfig (as AlphanumFieldConfig)
com.exalead.mercury.mami.indexing.v10.BinaryFieldConfig (as BinaryFieldConfig)
com.exalead.mercury.mami.indexing.v10.CategoryFieldConfig (as CategoryFieldConfig)
com.exalead.mercury.mami.indexing.v10.DateFieldConfig (as DateFieldConfig)
com.exalead.mercury.mami.indexing.v10.DoubleFieldConfig (as DoubleFieldConfig)
com.exalead.mercury.mami.indexing.v10.FieldConfig (as FieldConfig)
com.exalead.mercury.mami.indexing.v10.GeoFieldConfig (as GeoFieldConfig)
com.exalead.mercury.mami.indexing.v10.HierarchyFieldConfig (as HierarchyFieldConfig)
com.exalead.mercury.mami.indexing.v10.LegacySignedFieldConfig (as LegacySignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.LegacyUnsignedFieldConfig (as LegacyUnsignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.NumericalFieldConfig (as NumericalFieldConfig)
com.exalead.mercury.mami.indexing.v10.PointFieldConfig (as PointFieldConfig)
com.exalead.mercury.mami.indexing.v10.ReferenceFieldConfig (as ReferenceFieldConfig)
com.exalead.mercury.mami.indexing.v10.SignedFieldConfig (as SignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.SortableFieldConfig (as SortableFieldConfig)
com.exalead.mercury.mami.indexing.v10.StandardFieldConfig (as StandardFieldConfig)
com.exalead.mercury.mami.indexing.v10.TextFieldConfig (as TextFieldConfig)
com.exalead.mercury.mami.indexing.v10.TimeFieldConfig (as TimeFieldConfig)
com.exalead.mercury.mami.indexing.v10.UidFieldConfig (as UidFieldConfig)
com.exalead.mercury.mami.indexing.v10.UnsignedFieldConfig (as UnsignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.ValueFieldConfig (as ValueFieldConfig)
- Attributes:
Name |
Type |
Default value |
Description |
bytesPerBlock |
int |
1024 |
|
positionsRiceCodingParam |
int |
1024 |
|
dataFilesPrefetchPages |
int |
2 |
|
extFilesPrefetchPages |
int |
2 |
|
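RiceEncoderConfig carries no description in this reference. As general background only, Rice coding writes an integer as a unary quotient followed by a fixed-width binary remainder, which is compact for the small gaps typical of inverted lists. The sketch below shows the textbook scheme with a parameter k (divisor 2^k). How bytesPerBlock or positionsRiceCodingParam map onto this parameter is not stated here, and the function names are hypothetical.

```python
def rice_encode(value: int, k: int) -> str:
    """Encode a non-negative integer as a Rice code bit string."""
    if value < 0:
        raise ValueError("Rice coding is defined for non-negative integers")
    quotient = value >> k
    remainder = value & ((1 << k) - 1)
    remainder_bits = format(remainder, "b").zfill(k) if k else ""
    # Unary quotient (ones terminated by a zero), then k remainder bits.
    return "1" * quotient + "0" + remainder_bits


def rice_decode(bits: str, k: int) -> int:
    """Decode a single Rice code produced by rice_encode."""
    quotient = bits.index("0")
    remainder = int(bits[quotient + 1:quotient + 1 + k], 2) if k else 0
    return (quotient << k) | remainder


code = rice_encode(9, 2)        # quotient 2, remainder 1 -> "11001"
decoded = rice_decode(code, 2)
```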
VarIntEncoderConfig
com.exalead.mercury.mami.indexing.v10.VarIntEncoderConfig
- Stores each integer in varint encoding.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.AlphanumFieldConfig (as AlphanumFieldConfig)
com.exalead.mercury.mami.indexing.v10.BinaryFieldConfig (as BinaryFieldConfig)
com.exalead.mercury.mami.indexing.v10.CategoryFieldConfig (as CategoryFieldConfig)
com.exalead.mercury.mami.indexing.v10.DateFieldConfig (as DateFieldConfig)
com.exalead.mercury.mami.indexing.v10.DoubleFieldConfig (as DoubleFieldConfig)
com.exalead.mercury.mami.indexing.v10.FieldConfig (as FieldConfig)
com.exalead.mercury.mami.indexing.v10.GeoFieldConfig (as GeoFieldConfig)
com.exalead.mercury.mami.indexing.v10.HierarchyFieldConfig (as HierarchyFieldConfig)
com.exalead.mercury.mami.indexing.v10.LegacySignedFieldConfig (as LegacySignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.LegacyUnsignedFieldConfig (as LegacyUnsignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.NumericalFieldConfig (as NumericalFieldConfig)
com.exalead.mercury.mami.indexing.v10.PointFieldConfig (as PointFieldConfig)
com.exalead.mercury.mami.indexing.v10.ReferenceFieldConfig (as ReferenceFieldConfig)
com.exalead.mercury.mami.indexing.v10.SignedFieldConfig (as SignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.SortableFieldConfig (as SortableFieldConfig)
com.exalead.mercury.mami.indexing.v10.StandardFieldConfig (as StandardFieldConfig)
com.exalead.mercury.mami.indexing.v10.TextFieldConfig (as TextFieldConfig)
com.exalead.mercury.mami.indexing.v10.TimeFieldConfig (as TimeFieldConfig)
com.exalead.mercury.mami.indexing.v10.UidFieldConfig (as UidFieldConfig)
com.exalead.mercury.mami.indexing.v10.UnsignedFieldConfig (as UnsignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.ValueFieldConfig (as ValueFieldConfig)
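For readers unfamiliar with varint encoding, the sketch below shows the widely used little-endian base-128 variant, in which each byte carries 7 payload bits and the high bit signals that more bytes follow. Whether VarIntEncoderConfig uses exactly this byte layout is an assumption. The functions are hypothetical.

```python
def varint_encode(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("varint encoding here covers non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def varint_decode(data: bytes) -> int:
    """Decode the first varint found at the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


small = varint_encode(5)      # one byte
larger = varint_encode(300)   # two bytes: b"\xac\x02"
```

Small integers take a single byte, so this encoding suits lists dominated by small values.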
Apollo11EncoderConfig
com.exalead.mercury.mami.indexing.v10.Apollo11EncoderConfig
- Stores each integer in Apollo11 encoding.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.AlphanumFieldConfig (as AlphanumFieldConfig)
com.exalead.mercury.mami.indexing.v10.BinaryFieldConfig (as BinaryFieldConfig)
com.exalead.mercury.mami.indexing.v10.CategoryFieldConfig (as CategoryFieldConfig)
com.exalead.mercury.mami.indexing.v10.DateFieldConfig (as DateFieldConfig)
com.exalead.mercury.mami.indexing.v10.DoubleFieldConfig (as DoubleFieldConfig)
com.exalead.mercury.mami.indexing.v10.FieldConfig (as FieldConfig)
com.exalead.mercury.mami.indexing.v10.GeoFieldConfig (as GeoFieldConfig)
com.exalead.mercury.mami.indexing.v10.HierarchyFieldConfig (as HierarchyFieldConfig)
com.exalead.mercury.mami.indexing.v10.LegacySignedFieldConfig (as LegacySignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.LegacyUnsignedFieldConfig (as LegacyUnsignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.NumericalFieldConfig (as NumericalFieldConfig)
com.exalead.mercury.mami.indexing.v10.PointFieldConfig (as PointFieldConfig)
com.exalead.mercury.mami.indexing.v10.ReferenceFieldConfig (as ReferenceFieldConfig)
com.exalead.mercury.mami.indexing.v10.SignedFieldConfig (as SignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.SortableFieldConfig (as SortableFieldConfig)
com.exalead.mercury.mami.indexing.v10.StandardFieldConfig (as StandardFieldConfig)
com.exalead.mercury.mami.indexing.v10.TextFieldConfig (as TextFieldConfig)
com.exalead.mercury.mami.indexing.v10.TimeFieldConfig (as TimeFieldConfig)
com.exalead.mercury.mami.indexing.v10.UidFieldConfig (as UidFieldConfig)
com.exalead.mercury.mami.indexing.v10.UnsignedFieldConfig (as UnsignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.ValueFieldConfig (as ValueFieldConfig)
NoOpEncoderConfig
com.exalead.mercury.mami.indexing.v10.NoOpEncoderConfig
- Trivial encoder. For debugging purposes only.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.AlphanumFieldConfig (as AlphanumFieldConfig)
com.exalead.mercury.mami.indexing.v10.BinaryFieldConfig (as BinaryFieldConfig)
com.exalead.mercury.mami.indexing.v10.CategoryFieldConfig (as CategoryFieldConfig)
com.exalead.mercury.mami.indexing.v10.DateFieldConfig (as DateFieldConfig)
com.exalead.mercury.mami.indexing.v10.DoubleFieldConfig (as DoubleFieldConfig)
com.exalead.mercury.mami.indexing.v10.FieldConfig (as FieldConfig)
com.exalead.mercury.mami.indexing.v10.GeoFieldConfig (as GeoFieldConfig)
com.exalead.mercury.mami.indexing.v10.HierarchyFieldConfig (as HierarchyFieldConfig)
com.exalead.mercury.mami.indexing.v10.LegacySignedFieldConfig (as LegacySignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.LegacyUnsignedFieldConfig (as LegacyUnsignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.NumericalFieldConfig (as NumericalFieldConfig)
com.exalead.mercury.mami.indexing.v10.PointFieldConfig (as PointFieldConfig)
com.exalead.mercury.mami.indexing.v10.ReferenceFieldConfig (as ReferenceFieldConfig)
com.exalead.mercury.mami.indexing.v10.SignedFieldConfig (as SignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.SortableFieldConfig (as SortableFieldConfig)
com.exalead.mercury.mami.indexing.v10.StandardFieldConfig (as StandardFieldConfig)
com.exalead.mercury.mami.indexing.v10.TextFieldConfig (as TextFieldConfig)
com.exalead.mercury.mami.indexing.v10.TimeFieldConfig (as TimeFieldConfig)
com.exalead.mercury.mami.indexing.v10.UidFieldConfig (as UidFieldConfig)
com.exalead.mercury.mami.indexing.v10.UnsignedFieldConfig (as UnsignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.ValueFieldConfig (as ValueFieldConfig)
FastNoPosEncoderConfig
com.exalead.mercury.mami.indexing.v10.FastNoPosEncoderConfig
- An encoder that stores only docids, without ranks or positions.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.AlphanumFieldConfig (as AlphanumFieldConfig)
com.exalead.mercury.mami.indexing.v10.BinaryFieldConfig (as BinaryFieldConfig)
com.exalead.mercury.mami.indexing.v10.CategoryFieldConfig (as CategoryFieldConfig)
com.exalead.mercury.mami.indexing.v10.DateFieldConfig (as DateFieldConfig)
com.exalead.mercury.mami.indexing.v10.DoubleFieldConfig (as DoubleFieldConfig)
com.exalead.mercury.mami.indexing.v10.FieldConfig (as FieldConfig)
com.exalead.mercury.mami.indexing.v10.GeoFieldConfig (as GeoFieldConfig)
com.exalead.mercury.mami.indexing.v10.HierarchyFieldConfig (as HierarchyFieldConfig)
com.exalead.mercury.mami.indexing.v10.LegacySignedFieldConfig (as LegacySignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.LegacyUnsignedFieldConfig (as LegacyUnsignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.NumericalFieldConfig (as NumericalFieldConfig)
com.exalead.mercury.mami.indexing.v10.PointFieldConfig (as PointFieldConfig)
com.exalead.mercury.mami.indexing.v10.ReferenceFieldConfig (as ReferenceFieldConfig)
com.exalead.mercury.mami.indexing.v10.SignedFieldConfig (as SignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.SortableFieldConfig (as SortableFieldConfig)
com.exalead.mercury.mami.indexing.v10.StandardFieldConfig (as StandardFieldConfig)
com.exalead.mercury.mami.indexing.v10.TextFieldConfig (as TextFieldConfig)
com.exalead.mercury.mami.indexing.v10.TimeFieldConfig (as TimeFieldConfig)
com.exalead.mercury.mami.indexing.v10.UidFieldConfig (as UidFieldConfig)
com.exalead.mercury.mami.indexing.v10.UnsignedFieldConfig (as UnsignedFieldConfig)
com.exalead.mercury.mami.indexing.v10.ValueFieldConfig (as ValueFieldConfig)
- Attributes:
Name |
Type |
Default value |
Description |
didsPerBlock |
int |
256 |
|
LegacyUnsignedFieldConfig
com.exalead.mercury.mami.indexing.v10.LegacyUnsignedFieldConfig
- No documentation for this element.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
ramBased |
boolean |
|
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
fieldName |
string |
|
The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Indicates whether this index field config is managed by a data model. Possible values: null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", this is the name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", this is the name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
bitsForValue |
int |
32 |
Number of bits used to store numerical values. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
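The fieldName constraint stated in the field config tables ([a-z0-9_]+) can be checked before a schema is submitted. The following is a small sketch of such a check. The is_valid_field_name helper is hypothetical and only applies the character rule quoted above, not any other constraint the product may enforce.

```python
import re

# The pattern quoted in the fieldName description.
FIELD_NAME = re.compile(r"[a-z0-9_]+")


def is_valid_field_name(name: str) -> bool:
    """True if name consists only of lowercase letters, digits, and underscores."""
    return FIELD_NAME.fullmatch(name) is not None


candidates = ["text", "file_size2", "Title", "last-modified", ""]
valid_names = [name for name in candidates if is_valid_field_name(name)]
```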
LegacySignedFieldConfig
com.exalead.mercury.mami.indexing.v10.LegacySignedFieldConfig
- No documentation for this element.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
ramBased |
boolean |
|
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
fieldName |
string |
|
The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Indicates whether this index field config is managed by a data model. Possible values: null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", this is the name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", this is the name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
PointFieldConfig
com.exalead.mercury.mami.indexing.v10.PointFieldConfig
- This type of field is used to store geographical points using either GPS coordinates (WGS84) or planar X,Y coordinates (Meter).
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
ramBased |
boolean |
|
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
fieldName |
string |
|
The name of the field. A field name can only contain lowercase letters, digits, and underscores: [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Indicates whether this index field config is managed by a data model. Possible values: null, auto, customized. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", this is the name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
geoType |
enum(WGS84, Meter) |
WGS84 |
Value can be one of: WGS84, Meter. |
blockSize |
int |
8192 |
|
exact |
boolean |
True |
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
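- Example: the fieldName constraint listed in the attributes above ([a-z0-9_]+) can be checked before a schema is submitted. The following is a minimal sketch, not part of the product; the helper name is_valid_field_name is hypothetical.

```python
import re

# Pattern copied from the fieldName description: [a-z0-9_]+
FIELD_NAME_RE = re.compile(r"[a-z0-9_]+")


def is_valid_field_name(name: str) -> bool:
    """Return True if name contains only lower-case letters, digits and underscores."""
    # fullmatch requires the whole string to match, so "" and "my-field" are rejected.
    return FIELD_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("geo_point_1", "GeoPoint", "my-field"):
        print(candidate, is_valid_field_name(candidate))
```

The same check applies to every field config type in this section, since they all share the fieldName attribute.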
GeoFieldConfig
com.exalead.mercury.mami.indexing.v10.GeoFieldConfig
- This type of field is used to store 2D geometries using planar X,Y coordinates (Meter).
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
ramBased |
boolean |
|
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
fieldName |
string |
|
The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Is this index field config managed by a data model?
@enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
geoType |
enum(Meter) |
Meter |
Value can be one of: Meter. |
maxBlockSize |
int |
24 |
|
precision |
int |
6 |
|
bboxFieldName |
string |
|
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
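- Example: the ListsEncoderConfig description above mentions a Rice encoder as the default. The sketch below shows generic Rice coding of a non-negative integer with parameter k (quotient in unary, remainder on k bits). It illustrates the technique only; the parameter choice and bit layout actually used by the product are not documented here and are assumptions.

```python
def rice_encode(n: int, k: int) -> str:
    """Encode a non-negative integer as a Rice code, returned as a bit string."""
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    quotient = n >> k                   # n // 2^k, written in unary
    remainder = n & ((1 << k) - 1)      # n mod 2^k, written on k bits
    unary = "1" * quotient + "0"
    binary = format(remainder, "b").zfill(k) if k > 0 else ""
    return unary + binary


def rice_decode(bits: str, k: int) -> int:
    """Decode a single Rice code produced by rice_encode."""
    quotient = bits.index("0")          # number of leading 1s
    remainder = int(bits[quotient + 1: quotient + 1 + k], 2) if k > 0 else 0
    return (quotient << k) | remainder


if __name__ == "__main__":
    code = rice_encode(9, 2)
    print(code, rice_decode(code, 2))
```

Rice codes are short for small integers, which is why they are commonly applied to gaps between sorted document identifiers in inverted lists.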
UidFieldConfig
com.exalead.mercury.mami.indexing.v10.UidFieldConfig
- This field stores a unique value in order to facilitate search.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
ramBased |
boolean |
|
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
fieldName |
string |
|
The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Is this index field config managed by a data model?
@enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
dictStorage |
enum(strbtree, trie, fsm) |
fsm |
Associative array implementation. |
bitsetThreshold |
int |
10000 |
Number of requested documents before switching from a dynamic array to a bitset representation. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
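- Example: the bitsetThreshold attribute above describes a switch from a dynamic array to a bitset once enough documents are requested. The sketch below illustrates that general idea with a hypothetical DocSet class; whether the switch happens when the threshold is reached or exceeded, and how the product stores the bitset, are assumptions.

```python
class DocSet:
    """Holds document IDs in a list, then switches to a bitset past a threshold."""

    def __init__(self, bitset_threshold: int = 10000):
        self.bitset_threshold = bitset_threshold
        self.ids = []      # dynamic array representation
        self.bits = None   # bitset representation (a Python int used as a bit mask)

    def add(self, doc_id: int) -> None:
        if self.bits is not None:
            self.bits |= 1 << doc_id
            return
        self.ids.append(doc_id)
        if len(self.ids) > self.bitset_threshold:
            # Assumption: convert once the threshold is exceeded.
            self.bits = 0
            for d in self.ids:
                self.bits |= 1 << d
            self.ids = []

    def uses_bitset(self) -> bool:
        return self.bits is not None

    def __contains__(self, doc_id: int) -> bool:
        if self.bits is not None:
            return ((self.bits >> doc_id) & 1) == 1
        return doc_id in self.ids
```

A small array is cheaper for a handful of documents, while a bitset gives constant-time membership tests once the set grows large.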
ValueFieldConfig
com.exalead.mercury.mami.indexing.v10.ValueFieldConfig
- Stores alphanumerical content with an internal ordinal mapping, which makes it suitable for efficient facetting. Each term is limited to 1024 bytes.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
deltaRefEncodeMultivaluedValues |
boolean |
True |
Delta ref encode multivalued values. |
sortMultivaluedValues |
boolean |
True |
Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions. |
ramBased |
boolean |
|
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
fieldName |
string |
|
The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Is this index field config managed by a data model?
@enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
ignorePresentBit |
boolean |
|
Uses and loads the present bit. |
minMemberNbBits |
int |
5 |
Min number of bits for attr part for value field. |
bloomFilter |
boolean |
|
Activates a Bloom filter per slot. This speeds up requests containing words that are not present in the field on a given slot. Disable this option if all words of the request for this field always match and you regularly compact into big slots. Enable this option if there are either many misses (e.g. on the "text" field) or small updates (e.g. with real-time indexing). |
hashThreshold |
int |
128 |
Stores a hash value in field dictionary instead of the original data if value length is greater than this threshold. |
implementation |
enum(strbtree, fsm) |
fsm |
Advanced configuration. Internal structure used to store the field dictionary. |
optimizeListsForPatternSearch |
boolean |
|
Speeds up pattern search by reducing the number of opened inverted lists, at the expense of indexing time and disk space. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
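- Example: the bloomFilter attribute above speeds up lookups for words that are absent from a slot. The sketch below is a generic Bloom filter, shown only to illustrate why a miss can be answered without reading the dictionary. The sizes, the number of hash functions and the use of SHA-256 are assumptions, not the product's implementation.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, nb_bits: int = 1024, nb_hashes: int = 3):
        self.nb_bits = nb_bits
        self.nb_hashes = nb_hashes
        self.bits = 0  # Python int used as a bit array

    def _positions(self, word: str):
        # Derive nb_hashes bit positions by salting the word with an index.
        for i in range(self.nb_hashes):
            digest = hashlib.sha256(f"{i}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.nb_bits

    def add(self, word: str) -> None:
        for pos in self._positions(word):
            self.bits |= 1 << pos

    def might_contain(self, word: str) -> bool:
        # False means the word is definitely absent; True means "maybe present".
        return all((self.bits >> pos) & 1 for pos in self._positions(word))
```

When might_contain returns False, the slot can be skipped entirely, which is the case the attribute description refers to as a miss.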
TextFieldConfig
com.exalead.mercury.mami.indexing.v10.TextFieldConfig
- Stores alphanumerical content with an internal ordinal mapping, which makes it suitable for efficient facetting. Each term is limited to 1024 bytes.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
ramBased |
boolean |
True |
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
retrievable |
boolean |
True |
|
ignorePresentBit |
boolean |
|
Uses and loads the present bit. |
minMemberNbBits |
int |
5 |
Min number of bits for attr part for value field. |
bloomFilter |
boolean |
|
Activates a Bloom filter per slot. This speeds up requests containing words that are not present in the field on a given slot. Disable this option if all words of the request for this field always match and you regularly compact into big slots. Enable this option if there are either many misses (e.g. on the "text" field) or small updates (e.g. with real-time indexing). |
hashThreshold |
int |
128 |
Stores a hash value in field dictionary instead of the original data if value length is greater than this threshold. |
implementation |
enum(strbtree, fsm) |
fsm |
Advanced configuration. Internal structure used to store the field dictionary. |
optimizeListsForPatternSearch |
boolean |
|
Speeds up pattern search by reducing the number of opened inverted lists, at the expense of indexing time and disk space. |
deltaRefEncodeMultivaluedValues |
boolean |
True |
Delta ref encode multivalued values. |
sortMultivaluedValues |
boolean |
True |
Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions. |
fieldName |
string |
|
The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
dataModelState |
string |
|
Is this index field config managed by a data model?
@enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
storePositions |
boolean |
True |
Stores positions for seq nodes and proximity scoring. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
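- Example: the hashThreshold attribute above replaces long values with a hash in the field dictionary. The sketch below illustrates that rule with a hypothetical dictionary_key helper. The hash function is not documented here, so SHA-1 is only a placeholder, and measuring the length in bytes is also an assumption.

```python
import hashlib


def dictionary_key(value: bytes, hash_threshold: int = 128) -> bytes:
    """Return the key stored in the dictionary for a given field value."""
    if len(value) > hash_threshold:
        # Longer than the threshold: store a fixed-size hash instead of the data.
        return hashlib.sha1(value).digest()  # placeholder hash, 20 bytes
    return value


if __name__ == "__main__":
    print(len(dictionary_key(b"short value")))   # original length is kept
    print(len(dictionary_key(b"x" * 500)))       # fixed hash length
```

The trade-off is that a hashed entry has a bounded size in the dictionary, but the original value can no longer be read back from the key itself.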
ReferenceFieldConfig
com.exalead.mercury.mami.indexing.v10.ReferenceFieldConfig
- Stores alphanumerical content with an internal ordinal mapping, which makes it suitable for efficient facetting. Each term is limited to 1024 bytes.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
ramBased |
boolean |
True |
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
retrievable |
boolean |
True |
|
ignorePresentBit |
boolean |
|
Uses and loads the present bit. |
minMemberNbBits |
int |
5 |
Min number of bits for attr part for value field. |
bloomFilter |
boolean |
|
Activates a Bloom filter per slot. This speeds up requests containing words that are not present in the field on a given slot. Disable this option if all words of the request for this field always match and you regularly compact into big slots. Enable this option if there are either many misses (e.g. on the "text" field) or small updates (e.g. with real-time indexing). |
hashThreshold |
int |
128 |
Stores a hash value in field dictionary instead of the original data if value length is greater than this threshold. |
implementation |
enum(strbtree, fsm) |
fsm |
Advanced configuration. Internal structure used to store the field dictionary. |
optimizeListsForPatternSearch |
boolean |
|
Speeds up pattern search by reducing the number of opened inverted lists, at the expense of indexing time and disk space. |
deltaRefEncodeMultivaluedValues |
boolean |
True |
Delta ref encode multivalued values. |
sortMultivaluedValues |
boolean |
True |
Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions. |
fieldName |
string |
|
The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
dataModelState |
string |
|
Is this index field config managed by a data model?
@enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
UnsignedFieldConfig
com.exalead.mercury.mami.indexing.v10.UnsignedFieldConfig
- No documentation for this element.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
bitsForValue |
int |
63 |
Number of bits used to store numerical values. For unsigned numerical fields, the possible values are [0, 2^N - 1], and the field values are stored on N bits. For signed fields (signed integer and double), the possible values are [-2^N, 2^N - 1], and the field values are stored on (N+1) bits. |
blockSize |
int |
8192 |
|
deltaRefEncodeMultivaluedValues |
boolean |
True |
Delta ref encode multivalued values. |
sortMultivaluedValues |
boolean |
True |
Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions. |
ramBased |
boolean |
|
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
fieldName |
string |
|
The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Is this index field config managed by a data model?
@enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
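- Example: the bitsForValue description above gives the value ranges for unsigned and signed fields as a function of N. The sketch below simply evaluates those formulas; the function names are hypothetical.

```python
def unsigned_range(bits_for_value: int):
    """Range [0, 2^N - 1] for an unsigned field stored on N bits."""
    return 0, 2 ** bits_for_value - 1


def signed_range(bits_for_value: int):
    """Range [-2^N, 2^N - 1] for a signed field stored on N + 1 bits."""
    return -(2 ** bits_for_value), 2 ** bits_for_value - 1


def storage_bits(bits_for_value: int, signed: bool) -> int:
    """Number of bits actually used to store each value."""
    return bits_for_value + 1 if signed else bits_for_value


if __name__ == "__main__":
    print(unsigned_range(8), storage_bits(8, signed=False))
    print(signed_range(8), storage_bits(8, signed=True))
    print(unsigned_range(63))  # range for the default bitsForValue
```

With the default of 63, a signed field therefore occupies 64 bits per value, while lowering bitsForValue narrows the accepted range accordingly.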
SignedFieldConfig
com.exalead.mercury.mami.indexing.v10.SignedFieldConfig
- No documentation for this element.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
bitsForValue |
int |
63 |
Number of bits used to store numerical values. For unsigned numerical fields, the possible values are [0, 2^N - 1], and the field values are stored on N bits. For signed fields (signed integer and double), the possible values are [-2^N, 2^N - 1], and the field values are stored on (N+1) bits. |
blockSize |
int |
8192 |
|
deltaRefEncodeMultivaluedValues |
boolean |
True |
Delta ref encode multivalued values. |
sortMultivaluedValues |
boolean |
True |
Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions. |
ramBased |
boolean |
|
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
fieldName |
string |
|
The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Is this index field config managed by a data model?
@enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
DoubleFieldConfig
com.exalead.mercury.mami.indexing.v10.DoubleFieldConfig
- Configuration of a double precision floating point number field.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
bitsForValue |
int |
63 |
Number of bits used to store numerical values. For unsigned numerical fields, the possible values are [0, 2^N - 1], and the field values are stored on N bits. For signed fields (signed integer and double), the possible values are [-2^N, 2^N - 1], and the field values are stored on (N+1) bits. |
blockSize |
int |
8192 |
|
deltaRefEncodeMultivaluedValues |
boolean |
True |
Delta ref encode multivalued values. |
sortMultivaluedValues |
boolean |
True |
Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions. |
ramBased |
boolean |
|
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
fieldName |
string |
|
The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Is this index field config managed by a data model?
@enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
precision |
int |
4 |
Number of relevant digits in the decimal part. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
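- Example: the precision attribute above is the number of relevant digits kept in the decimal part. The sketch below shows what keeping 4 decimal digits means for a given value. Whether the product rounds or truncates is not stated here, so the rounding mode used is an assumption, and keep_decimal_digits is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def keep_decimal_digits(value: str, precision: int = 4) -> Decimal:
    """Keep `precision` digits after the decimal point (rounding is assumed)."""
    quantum = Decimal(10) ** -precision  # e.g. 0.0001 for precision=4
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)


if __name__ == "__main__":
    print(keep_decimal_digits("3.14159265"))       # 4 decimal digits kept
    print(keep_decimal_digits("3.14159265", 2))    # 2 decimal digits kept
```

Digits beyond the configured precision are not preserved, so the attribute should be raised for data that needs finer resolution.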
TimeFieldConfig
com.exalead.mercury.mami.indexing.v10.TimeFieldConfig
- No documentation for this element.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
deltaRefEncodeMultivaluedValues |
boolean |
True |
Delta ref encode multivalued values. |
sortMultivaluedValues |
boolean |
True |
Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions. |
ramBased |
boolean |
|
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
fieldName |
string |
|
The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Is this index field config managed by a data model?
@enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
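- Example: the sortMultivaluedValues and deltaRefEncodeMultivaluedValues attributes above relate storage order to RAM consumption. The sketch below uses plain delta encoding of a sorted list to illustrate why increasing order helps: the stored differences are small numbers. The exact "delta ref" scheme used by the product is not described here, so this is an illustration of the general idea only.

```python
def delta_encode(values):
    """Sort the values and store each one as the difference from the previous."""
    deltas, previous = [], 0
    for v in sorted(values):
        deltas.append(v - previous)
        previous = v
    return deltas


def delta_decode(deltas):
    """Rebuild the sorted values from their differences."""
    values, total = [], 0
    for d in deltas:
        total += d
        values.append(total)
    return values


if __name__ == "__main__":
    timestamps = [1005, 1000, 1002]
    encoded = delta_encode(timestamps)
    print(encoded, delta_decode(encoded))
```

Note that decoding returns the values in increasing order: the original insertion order of the multivalued field is lost once the values are sorted.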
DateFieldConfig
com.exalead.mercury.mami.indexing.v10.DateFieldConfig
- No documentation for this element.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
deltaRefEncodeMultivaluedValues |
boolean |
True |
Delta ref encode multivalued values. |
sortMultivaluedValues |
boolean |
True |
Storing multivalued RAM-based values in an increasing order consumes less RAM. This must be disabled to use some advanced multivalued virtual functions. |
ramBased |
boolean |
|
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
fieldName |
string |
|
The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Is this index field config managed by a data model?
@enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
BinaryFieldConfig
com.exalead.mercury.mami.indexing.v10.BinaryFieldConfig
- No documentation for this element.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
ramBased |
boolean |
|
A value field must be RAM-based to perform synthesis efficiently. |
multiContext |
boolean |
|
|
fieldName |
string |
|
The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Is this index field config managed by a data model?
@enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
gzip |
boolean |
|
Activates content compression. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
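- Example: the gzip attribute above activates compression of the binary content. The sketch below uses Python's standard gzip module to show the effect of such a flag on stored bytes. The store_binary and load_binary helpers are hypothetical and do not reflect the product's storage format.

```python
import gzip


def store_binary(content: bytes, use_gzip: bool) -> bytes:
    """Return the bytes to store, compressed when use_gzip is set."""
    return gzip.compress(content) if use_gzip else content


def load_binary(stored: bytes, use_gzip: bool) -> bytes:
    """Return the original content from the stored bytes."""
    return gzip.decompress(stored) if use_gzip else stored


if __name__ == "__main__":
    data = b"repetitive payload " * 1000
    stored = store_binary(data, use_gzip=True)
    print(len(data), len(stored))
    print(load_binary(stored, use_gzip=True) == data)
```

Compression pays off for repetitive or textual payloads; content that is already compressed (images, archives) usually gains little.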
CategoryFieldConfig
com.exalead.mercury.mami.indexing.v10.CategoryFieldConfig
- Stores hierarchy content. Each term is limited to 1024 bytes.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
fieldName |
string |
|
The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Is this index field config managed by a data model?
@enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
ramBased |
boolean |
True |
A value field must be RAM-based to perform synthesis efficiently. |
implementation |
enum(strbtree, fsm) |
strbtree |
Advanced configuration. Internal structure used to store the field dictionary. |
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
HierarchyFieldConfig
com.exalead.mercury.mami.indexing.v10.HierarchyFieldConfig
- Stores hierarchy content. Each term is limited to 1024 bytes.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexSchema (as IndexSchema)
- Attributes:
Name |
Type |
Default value |
Description |
ramBased |
boolean |
True |
A value field must be RAM-based to perform synthesis efficiently. |
implementation |
enum(strbtree, fsm) |
strbtree |
Advanced configuration. Internal structure used to store the field dictionary. |
fieldName |
string |
|
The name of the field. The name of a field can only contain lower-case characters, numbers and underscore. [a-z0-9_]+ |
searchable |
boolean |
|
Allows users to query on this field (using a prefix handler). |
retrievable |
boolean |
|
Allows the content of this field to be retrieved at query time and displayed in the search results. |
dataModelState |
string |
|
Is this index field config managed by a data model?
@enum{null,auto,customized}. If null, this is not related to a data model. If "auto", this is auto-generated by a data model. If "customized", this was auto-generated by a data model and then customized. |
dataModelClass |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelClass that generated this field config. |
dataModelProperty |
string |
|
If dataModelState is "auto" or "customized", you will find here the
name of the DataModelProperty that generated this field config. |
multivalued |
boolean |
|
|
version |
int |
|
|
- Nested elements:
Name |
Type |
Description |
fromDataModel |
com.exalead.mercury.mami.indexing.v10.FieldConfig |
If dataModelState is "customized", you will find here the
original object generated by the data model. Use this to easily revert to "auto" state from "customized". |
ListsEncoderConfig |
com.exalead.mercury.mami.indexing.v10.ListsEncoderConfig |
Configuration of the inverted lists encoder. If no configuration is specified, a Rice encoder is used. |
IndexingConfig
com.exalead.mercury.mami.indexing.v10.IndexingConfig
- No documentation for this element.
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
|
- Nested elements:
Name |
Type |
Description |
AnalysisPolicy |
com.exalead.mercury.mami.indexing.v10.AnalysisPolicy |
|
CommitTriggerCondition |
com.exalead.mercury.mami.indexing.v10.CommitTriggerCondition* |
|
ImportPolicy |
com.exalead.mercury.mami.indexing.v10.ImportPolicy |
|
IndexManagementPolicy |
com.exalead.mercury.mami.indexing.v10.IndexManagementPolicy |
|
WriteAttributeSlotConfig |
com.exalead.mercury.mami.indexing.v10.WriteAttributeSlotConfig* |
|
WriteSlotConfig |
com.exalead.mercury.mami.indexing.v10.WriteSlotConfig |
|
FixedThreadsAnalysisPolicy
com.exalead.mercury.mami.indexing.v10.FixedThreadsAnalysisPolicy
- Instantiates a fixed number of analysis threads. Dispatches documents according to their DIDs (Document IDs) and slice.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
- Attributes:
Name |
Type |
Default value |
Description |
maxRAMConsumptionThreshold |
enum(disabled, enabled, auto) |
enabled |
When the specified RAM value is reached, analysis stops and the analyzed documents are imported into the index. Analysis then starts again.
- Enabled: Commits when the RAM size reaches the specified threshold value (by default, 2048 MB).
- Auto: Commits when the RAM size reaches 2048 MB.
|
maxRAMConsumptionMB |
int |
2048 |
The maximum amount of non-Java RAM the analyzer can allocate. Reaching this limit triggers a commit. |
nbThreads |
int |
4 |
Number of threads to allocate. |
PerSliceAnalysisPolicy
com.exalead.mercury.mami.indexing.v10.PerSliceAnalysisPolicy
- Instantiates an analysis thread for each slice. Dispatches documents according to their slice. Consumes less RAM than the 'FixedThreadsAnalysisPolicy'.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
- Attributes:
Name |
Type |
Default value |
Description |
maxRAMConsumptionThreshold |
enum(disabled, enabled, auto) |
enabled |
When the specified RAM value is reached, analysis stops and the analyzed documents are imported into the index. Analysis then starts again.
- Enabled: Commits when the RAM size reaches the specified threshold value (by default, 2048 MB).
- Auto: Commits when the RAM size reaches 2048 MB.
|
maxRAMConsumptionMB |
int |
2048 |
The maximum amount of non-Java RAM the analyzer can allocate. Reaching this limit triggers a commit. |
nbThreads |
int |
1 |
Uses N threads per slice. |
SameThreadAnalysisPolicy
com.exalead.mercury.mami.indexing.v10.SameThreadAnalysisPolicy
- Instantiates an analysis thread for each incoming PAPI thread. Each PAPI thread analyzes its tasks synchronously.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
- Attributes:
Name |
Type |
Default value |
Description |
maxRAMConsumptionThreshold |
enum(disabled, enabled, auto) |
enabled |
When the specified RAM value is reached, analysis stops and the analyzed documents are imported into the index. Analysis then starts again.
- Enabled: Commits when the RAM size reaches the specified threshold value (by default, 2048 MB).
- Auto: Commits when the RAM size reaches 2048 MB.
|
maxRAMConsumptionMB |
int |
2048 |
The maximum amount of non-Java RAM the analyzer can allocate. Reaching this limit triggers a commit. |
AutomaticAnalysisPolicy
com.exalead.mercury.mami.indexing.v10.AutomaticAnalysisPolicy
- Depending on the number of threads specified, CloudView automatically chooses the most efficient analysis policy. Changes made in Analyze require a restart of CloudView, or at least of the indexing server process, to be taken into account.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
- Attributes:
Name |
Type |
Default value |
Description |
maxRAMConsumptionThreshold |
enum(disabled, enabled, auto) |
enabled |
When the specified RAM value is reached, analysis stops and the analyzed documents are imported into the index. Analysis then starts again.
- Enabled: Commits when the RAM size reaches the specified threshold value (by default, 2048 MB).
- Auto: Commits when the RAM size reaches 2048 MB.
|
maxRAMConsumptionMB |
int |
2048 |
The maximum amount of non-Java RAM the analyzer can allocate. Reaching this limit triggers a commit. |
nbThreads |
int |
|
If not set or set to a multiple of 'nbSlices', it uses the 'PerSliceAnalysisPolicy'. Otherwise, it uses 'FixedThreadsAnalysisPolicy'. |
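Example: the selection rule described for nbThreads can be sketched as follows. This is a minimal Python illustration of the documented rule, not CloudView code; the function name `choose_analysis_policy` and its `nb_slices` argument are hypothetical.

```python
from typing import Optional


def choose_analysis_policy(nb_threads: Optional[int], nb_slices: int) -> str:
    """Return the name of the policy that the documented rule would select."""
    if nb_slices <= 0:
        raise ValueError("nb_slices must be positive")
    # nbThreads not set, or set to a multiple of the number of slices.
    if nb_threads is None or nb_threads % nb_slices == 0:
        return "PerSliceAnalysisPolicy"
    return "FixedThreadsAnalysisPolicy"
```

With 4 slices, 8 threads would map to the per-slice policy, while 6 threads would fall back to the fixed-threads policy.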
NumberOfTasksBasedCommitTriggerCondition
com.exalead.mercury.mami.indexing.v10.NumberOfTasksBasedCommitTriggerCondition
- Triggers a commit after the specified number of tasks has been processed. To avoid performance penalties, the task count is only evaluated each time a batch of documents is received.
As a result, slightly more than the specified number of tasks
may be analyzed before the commit.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
- Attributes:
Name |
Type |
Default value |
Description |
nbTasks |
int |
|
The number of tasks |
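Example: the sketch below illustrates why a commit can cover more tasks than nbTasks when the count is only checked once per received batch. It is a simplified model; the function name `commit_points` is hypothetical, and resetting the counter at each commit is an assumption.

```python
from typing import Iterable, List


def commit_points(batch_sizes: Iterable[int], nb_tasks: int) -> List[int]:
    """Return the number of tasks covered by each commit.

    The counter is compared with nb_tasks only after a whole batch has been
    received, so a commit can cover more than nb_tasks tasks.
    """
    commits = []
    pending = 0
    for size in batch_sizes:
        pending += size
        if pending >= nb_tasks:
            commits.append(pending)
            pending = 0  # assumption: the counter restarts after a commit
    return commits
```

With batches of 300 tasks and nbTasks set to 1000, the first commit in this model happens after the fourth batch and covers 1200 tasks.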
SizeBasedCommitTriggerCondition
com.exalead.mercury.mami.indexing.v10.SizeBasedCommitTriggerCondition
- Triggers a commit when the Max size (MB) is reached.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
- Attributes:
Name |
Type |
Default value |
Description |
maxSizeMB |
int |
|
Max size threshold in MB |
RAMUsageCommitTriggerCondition
com.exalead.mercury.mami.indexing.v10.RAMUsageCommitTriggerCondition
- Triggers a commit when RAM usage reaches the limit.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
- Attributes:
Name |
Type |
Default value |
Description |
maxRAMUsageInMB |
int |
|
Max RAM usage in MB |
PeriodicCommitTriggerCondition
com.exalead.mercury.mami.indexing.v10.PeriodicCommitTriggerCondition
- Commits N seconds after the first push order that follows the last commit.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
- Attributes:
Name |
Type |
Default value |
Description |
delayS |
long |
|
Time in seconds between two commits. |
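Example: one possible reading of this condition is that a timer of delayS seconds is armed by the first push order received after a commit. The sketch below models that reading only; the function name `periodic_commit_times` is hypothetical and the exact timer semantics are an assumption.

```python
from typing import List


def periodic_commit_times(push_times: List[float], delay_s: float) -> List[float]:
    """Return the commit times produced by a timer armed on the first push
    that follows each commit (assumed semantics)."""
    commits = []
    deadline = None  # no timer is armed until a push order arrives
    for t in sorted(push_times):
        if deadline is not None and t >= deadline:
            commits.append(deadline)  # the timer fired before this push
            deadline = None
        if deadline is None:
            deadline = t + delay_s  # first push since the last commit
    if deadline is not None:
        commits.append(deadline)
    return commits
```

Under this reading, no commit is scheduled while nothing is pushed.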
InactivityCommitTriggerCondition
com.exalead.mercury.mami.indexing.v10.InactivityCommitTriggerCondition
- Inactivity-based condition. This condition is triggered when:
- there is no new data for the specified time period
- AND at least the specified number of tasks have been analyzed.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
- Attributes:
Name |
Type |
Default value |
Description |
numberOfTasks |
int |
|
Minimum number of tasks to trigger a commit. |
inactivityTimeS |
long |
|
Indexing is considered inactive after N seconds without indexing activity. |
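Example: the two parts of the inactivity condition combine as a simple conjunction. The predicate below is a minimal sketch; the function and argument names are hypothetical, and using "greater than or equal" for both comparisons is an assumption.

```python
def should_commit_on_inactivity(seconds_since_last_data: float,
                                analyzed_tasks: int,
                                inactivity_time_s: int,
                                number_of_tasks: int) -> bool:
    """True when both documented conditions hold."""
    inactive = seconds_since_last_data >= inactivity_time_s
    enough_tasks = analyzed_tasks >= number_of_tasks
    return inactive and enough_tasks
```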
ParallelImportPolicy
com.exalead.mercury.mami.indexing.v10.ParallelImportPolicy
- One generation is created for each analysis buffer. Analysis buffers are imported in parallel.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
- Attributes:
Name |
Type |
Default value |
Description |
nbThreads |
int |
8 |
The number of parallel imports. |
MergedImportPolicy
com.exalead.mercury.mami.indexing.v10.MergedImportPolicy
- All analysis buffers are merged into a single buffer, which is imported as one generation.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
StandardIndexManagementPolicy
com.exalead.mercury.mami.indexing.v10.StandardIndexManagementPolicy
- Default index (service + build) runtime configuration
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
- Attributes:
Name |
Type |
Default value |
Description |
gcEveryS |
int |
15 |
Trigger a GC every N seconds. |
- Nested elements:
Name |
Type |
Description |
CommitPolicy |
com.exalead.mercury.mami.indexing.v10.CommitPolicy |
The commit policy used to configure how the index persists its files to disk. |
CompactPolicies |
com.exalead.mercury.mami.indexing.v10.CompactPolicies |
The compact policies used to trigger slots compaction. |
UploadPolicy |
com.exalead.mercury.mami.indexing.v10.UploadPolicy |
The upload policy used to replicate new slots to replicas. |
StandardCommitPolicy
com.exalead.mercury.mami.indexing.v10.StandardCommitPolicy
- Default commit policy
- Parent elements:
com.exalead.mercury.mami.indexing.v10.StandardIndexManagementPolicy (as StandardIndexManagementPolicy)
CompactPolicies
com.exalead.mercury.mami.indexing.v10.CompactPolicies
- No documentation for this element.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.StandardIndexManagementPolicy (as StandardIndexManagementPolicy)
- Attributes:
Name |
Type |
Default value |
Description |
synchronous |
boolean |
|
By default, compaction jobs are asynchronous. If set, compactions are performed synchronously just after imports. |
maxParallelFullCompacts |
int |
|
Limits the number of full compacts running in parallel. This can be useful when available disk space is limited. 0 means no limit. |
type |
enum(mmap, pagecache) |
mmap |
Specifies which I/O mode is used while compacting. Value can be null or one of: mmap, pagecache. |
maxPageCacheSizeMB |
int |
32 |
If the policy uses the PageCache mode, it specifies the max cache size. |
pageCachePageSizeKB |
int |
8 |
If the policy uses the PageCache mode, it specifies the page size. |
priorityCompactThreshold |
int |
48 |
When compacting a slot spanning generations gen0 to gen1, the compact is considered a priority compact if gen1 - gen0 < priorityCompactThreshold. Default is 48. (0: disabled) |
lowPriorityCompactNbThreads |
int |
2 |
Number of threads to use for a compact having low priority (0: all available threads). |
highPriorityCompactNbThreads |
int |
|
Number of threads to use for a compact having high priority (0: all available threads). |
- Nested elements:
Name |
Type |
Description |
AutoCompactPolicy |
com.exalead.mercury.mami.indexing.v10.AutoCompactPolicy* |
Specifies the auto-compact policies. |
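Example: the priorityCompactThreshold test and the thread-count attributes can be sketched as follows. This is an illustration only. The function names are hypothetical, treating a "priority compact" as the "high priority" case is an assumption, and the value 0 used for the high-priority thread count is also an assumption because its default is not listed.

```python
def is_priority_compact(gen0: int, gen1: int,
                        priority_compact_threshold: int = 48) -> bool:
    """Apply the documented test: gen1 - gen0 < priorityCompactThreshold."""
    if priority_compact_threshold == 0:
        return False  # 0 disables priority compacts
    return (gen1 - gen0) < priority_compact_threshold


def compact_thread_count(gen0: int, gen1: int, available_threads: int,
                         priority_compact_threshold: int = 48,
                         low_priority_nb_threads: int = 2,
                         high_priority_nb_threads: int = 0) -> int:
    """Pick the thread count; 0 means all available threads."""
    if is_priority_compact(gen0, gen1, priority_compact_threshold):
        configured = high_priority_nb_threads  # assumed mapping
    else:
        configured = low_priority_nb_threads
    return available_threads if configured == 0 else configured
```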
NumberOfSlotsBasedCompactPolicy
com.exalead.mercury.mami.indexing.v10.NumberOfSlotsBasedCompactPolicy
- Compaction policy based on a fixed number of slots for a given number of generations.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.CompactPolicies (as CompactPolicies)
- Attributes:
Name |
Type |
Default value |
Description |
component |
string |
|
|
arity |
int |
4 |
Specifies the number of slots of the same length required to compact. |
maxSlotSizeMb |
long |
5000 |
If a slot reaches this size, it is never used by subsequent automatic compactions. |
- Nested elements:
Name |
Type |
Description |
FullCompactPolicy |
com.exalead.mercury.mami.indexing.v10.FullCompactPolicy |
|
MaxSizeFullCompactPolicy
com.exalead.mercury.mami.indexing.v10.MaxSizeFullCompactPolicy
- A FullCompactPolicy that compacts all slots into one whenever the "tail" of small slots
exceeds a certain ratio of the large first slot. This policy is appropriate when auto-compacts are restricted to slots under a certain size
for performance reasons. In this case, a full optimization can occasionally be triggered to purge the deletes. Otherwise, the deletes occurring in later slots would never be purged, incurring performance
costs at query time and extra disk space consumption.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.AutoCompactPolicy (as AutoCompactPolicy)
com.exalead.mercury.mami.indexing.v10.LowLatencyCompactPolicy (as LowLatencyCompactPolicy)
com.exalead.mercury.mami.indexing.v10.NoCompactPolicy (as NoCompactPolicy)
com.exalead.mercury.mami.indexing.v10.NumberOfSlotsBasedCompactPolicy (as NumberOfSlotsBasedCompactPolicy)
com.exalead.mercury.mami.indexing.v10.SlotsLogSizeBasedCompactPolicy (as SlotsLogSizeBasedCompactPolicy)
com.exalead.mercury.mami.indexing.v10.SlotsSizeBasedCompactPolicy (as SlotsSizeBasedCompactPolicy)
- Attributes:
Name |
Type |
Default value |
Description |
percentage |
int |
100 |
Minimum percentage to launch a full compaction. Compacts all slots into one whenever the "tail" of small slots
exceeds a certain percentage of the large first slot. For example, with percentage=100, a full compact is triggered when the cumulative size of all slots except the biggest exceeds the size of the biggest slot. |
minSlots |
int |
2 |
Minimum number of slots before triggering a full compact. |
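Example: the percentage and minSlots attributes can be read as the check below. It is a minimal sketch, not the product implementation; the function name `needs_full_compact` is hypothetical, and treating the biggest slot as the "large first slot" follows the percentage=100 example above.

```python
from typing import Sequence


def needs_full_compact(slot_sizes: Sequence[int],
                       percentage: int = 100,
                       min_slots: int = 2) -> bool:
    """True when the tail of smaller slots exceeds `percentage` percent
    of the biggest slot and at least `min_slots` slots exist."""
    if not slot_sizes or len(slot_sizes) < min_slots:
        return False
    biggest = max(slot_sizes)
    tail = sum(slot_sizes) - biggest
    # Integer comparison equivalent to tail / biggest > percentage / 100.
    return tail * 100 > biggest * percentage
```

With percentage=100, slots of sizes 1000, 400, 400 and 300 would trigger a full compact in this model because the tail (1100) exceeds the biggest slot (1000).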
ArityBasedFullCompactPolicy
com.exalead.mercury.mami.indexing.v10.ArityBasedFullCompactPolicy
- A FullCompactPolicy that compacts all slots into one whenever the "tail" of slots with smaller arities
together exceeds a certain arity. The arity-based policy guarantees occasional full compaction, but the time interval
between full compactions increases exponentially. This add-on policy caps the increase at a certain arity and schedules full compacts at regular intervals afterwards. This policy is appropriate when auto-compacts are managed per generation arity. In this case, a full optimization can occasionally be triggered to purge the deletes. Otherwise, the deletes occurring in later slots would never be purged, incurring performance
costs at query time and extra disk space consumption.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.AutoCompactPolicy (as AutoCompactPolicy)
com.exalead.mercury.mami.indexing.v10.LowLatencyCompactPolicy (as LowLatencyCompactPolicy)
com.exalead.mercury.mami.indexing.v10.NoCompactPolicy (as NoCompactPolicy)
com.exalead.mercury.mami.indexing.v10.NumberOfSlotsBasedCompactPolicy (as NumberOfSlotsBasedCompactPolicy)
com.exalead.mercury.mami.indexing.v10.SlotsLogSizeBasedCompactPolicy (as SlotsLogSizeBasedCompactPolicy)
com.exalead.mercury.mami.indexing.v10.SlotsSizeBasedCompactPolicy (as SlotsSizeBasedCompactPolicy)
- Attributes:
Name |
Type |
Default value |
Description |
maxArity |
int |
256 |
Whenever the total arity of the long tail reaches maxArity, a full compact is scheduled. The "long tail" consists of the slots whose span has an arity lower than this parameter. This is generally a multiple of the arity parameter of the arity-based auto-compact policy. |
minSize |
long |
|
Slots below this size are considered negligible. |
SlotsSizeBasedCompactPolicy
com.exalead.mercury.mami.indexing.v10.SlotsSizeBasedCompactPolicy
- Compaction policy based on size that produces slots of similar size. When N consecutive slots have a size below targetSizeForCompactionMB, it performs a compaction if:
- N is at least minArity AND
- Adding slot N+1 would bring the size above targetSizeForCompactionMB OR
- The size is above minSizeForCompactionMB
- Parent elements:
com.exalead.mercury.mami.indexing.v10.CompactPolicies (as CompactPolicies)
- Attributes:
Name |
Type |
Default value |
Description |
component |
string |
|
|
targetSizeForCompactionMB |
int |
200 |
Targeted size for a compacted slot. |
minSizeForCompactionMB |
int |
50 |
Minimum size required to compact. |
minArity |
int |
2 |
Minimum number of slots required to compact. |
- Nested elements:
Name |
Type |
Description |
FullCompactPolicy |
com.exalead.mercury.mami.indexing.v10.FullCompactPolicy |
|
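Example: the compaction rule of SlotsSizeBasedCompactPolicy can be sketched as a predicate over a run of consecutive slots. This is one possible reading, not the product implementation. The function name is hypothetical, sizes are taken as the cumulative size of the run, and the conditions are grouped as "minArity AND (next slot overflows OR size above minimum)", which is an assumption.

```python
from typing import Optional, Sequence


def should_compact_run(run_sizes_mb: Sequence[int],
                       next_slot_size_mb: Optional[int],
                       target_size_mb: int = 200,
                       min_size_mb: int = 50,
                       min_arity: int = 2) -> bool:
    """Decide whether a run of N consecutive slots should be compacted."""
    total = sum(run_sizes_mb)
    if total >= target_size_mb:
        return False  # the rule only applies to runs below the target size
    if len(run_sizes_mb) < min_arity:
        return False
    # Adding slot N+1 would push the run above the target size.
    next_overflows = (next_slot_size_mb is not None
                      and total + next_slot_size_mb > target_size_mb)
    return next_overflows or total > min_size_mb
```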
SlotsLogSizeBasedCompactPolicy
com.exalead.mercury.mami.indexing.v10.SlotsLogSizeBasedCompactPolicy
- A CompactPolicy that tries to compact slots into levels of exponentially increasing size,
where each level has fewer slots than the value of the compact factor. Whenever extra slots (beyond the compact factor upper bound) are encountered,
all slots within the level are compacted.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.CompactPolicies (as CompactPolicies)
- Attributes:
Name |
Type |
Default value |
Description |
component |
string |
|
|
compactFactor |
int |
10 |
Determines how often slots are compacted. With smaller values, less RAM is used while indexing,
and searches on unoptimized indices are faster, but indexing speed is slower. With larger values, more RAM is used during indexing, and while searches on unoptimized indices
are slower, indexing is faster. Thus larger values (greater than 10) are best for batch index creation, and smaller values
(lower than 10) for indices that are interactively maintained. |
minSize |
long |
1048576 |
Sets the minimum size for the lowest-level slots. Slots below this size are considered to be on the same level (even if they vary drastically in size)
and are merged whenever there are mergeFactor of them. This effectively truncates the "long tail" of small slots that would otherwise be created into a single level. Setting this value too large can greatly increase the merging cost during indexing (if you flush many
small slots). |
maxSize |
long |
9223372036854775807 |
Sets the size of the largest slot that may be merged with other slots. |
- Nested elements:
Name |
Type |
Description |
FullCompactPolicy |
com.exalead.mercury.mami.indexing.v10.FullCompactPolicy |
|
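Example: the idea of grouping slots into levels of exponentially increasing size can be sketched as follows. This is a simplified model under stated assumptions: levels are computed as the integer logarithm of the slot size in base compactFactor, a level is compacted once it holds compactFactor slots, and maxSize and slot contiguity are ignored. The function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def slot_level(size: int, compact_factor: int = 10,
               min_size: int = 1048576) -> int:
    """Integer log of the size in base compact_factor.

    Slots smaller than min_size are treated as if they had size min_size,
    so they all land on the same lowest level.
    """
    if compact_factor < 2:
        raise ValueError("compact_factor must be at least 2")
    size = max(size, min_size)
    level = 0
    while size >= compact_factor:
        size //= compact_factor
        level += 1
    return level


def levels_to_compact(slot_sizes: Sequence[int], compact_factor: int = 10,
                      min_size: int = 1048576) -> List[int]:
    """Levels holding at least compact_factor slots (assumed trigger)."""
    counts = Counter(slot_level(s, compact_factor, min_size)
                     for s in slot_sizes)
    return sorted(lvl for lvl, n in counts.items() if n >= compact_factor)
```

In this model a smaller compactFactor makes levels fill up sooner, which matches the documented trade-off of more frequent compaction.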
LowLatencyCompactPolicy
com.exalead.mercury.mami.indexing.v10.LowLatencyCompactPolicy
- Compacts when the size of all small slots is above the average large slot size,
or when the number of slots is above nbLargeSlots + maxNbSmallSlots.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.CompactPolicies (as CompactPolicies)
- Attributes:
Name |
Type |
Default value |
Description |
component |
string |
|
|
nbLargeSlots |
int |
8 |
The number of large slots to keep. |
maxNbSmallSlots |
int |
8 |
Maximum number of small slots allowed. As soon as this limit is reached, small slots are compacted together. |
gatherSmallsAtTheEnd |
boolean |
True |
|
contiguousCompact |
boolean |
|
|
- Nested elements:
Name |
Type |
Description |
FullCompactPolicy |
com.exalead.mercury.mami.indexing.v10.FullCompactPolicy |
|
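Example: the two triggers of LowLatencyCompactPolicy can be sketched as below. This is an illustration only; the function name is hypothetical, and taking the nbLargeSlots biggest slots as the "large" ones is an assumption.

```python
from typing import Sequence


def low_latency_should_compact(slot_sizes: Sequence[int],
                               nb_large_slots: int = 8,
                               max_nb_small_slots: int = 8) -> bool:
    """True when either documented trigger fires."""
    # Trigger 1: too many slots overall.
    if len(slot_sizes) > nb_large_slots + max_nb_small_slots:
        return True
    ordered = sorted(slot_sizes, reverse=True)
    large = ordered[:nb_large_slots]   # assumed: the biggest slots
    small = ordered[nb_large_slots:]
    if not large or not small:
        return False
    # Trigger 2: small slots together outweigh the average large slot.
    return sum(small) > sum(large) / len(large)
```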
NoCompactPolicy
com.exalead.mercury.mami.indexing.v10.NoCompactPolicy
- Compact policy that does not perform any compact.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.CompactPolicies (as CompactPolicies)
- Attributes:
Name |
Type |
Default value |
Description |
component |
string |
|
|
- Nested elements:
Name |
Type |
Description |
FullCompactPolicy |
com.exalead.mercury.mami.indexing.v10.FullCompactPolicy |
|
StandardUploadPolicy
com.exalead.mercury.mami.indexing.v10.StandardUploadPolicy
- Default upload policy
- Parent elements:
com.exalead.mercury.mami.indexing.v10.StandardIndexManagementPolicy (as StandardIndexManagementPolicy)
- Attributes:
Name |
Type |
Default value |
Description |
waitBetweenSwitchesS |
int |
|
If strictly positive, all slices switch to a generation sequentially, waiting this many seconds between two slices. This spreads the temporary memory consumption to avoid large memory spikes and swapping. |
WriteAttributeSlotConfig
com.exalead.mercury.mami.indexing.v10.WriteAttributeSlotConfig
- Write attribute slot configuration
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
- Attributes:
Name |
Type |
Default value |
Description |
type |
enum(directio, sequential) |
directio |
Access type for writing the new slots. Value can be null or one of: directio, sequential. |
groupId |
int |
|
Specifies which attribute group store this access configuration applies to. |
WriteSlotConfig
com.exalead.mercury.mami.indexing.v10.WriteSlotConfig
- Write slot configuration
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexingConfig (as IndexingConfig)
- Attributes:
Name |
Type |
Default value |
Description |
type |
enum(directio, sequential) |
sequential |
Access type for writing the new slots. Value can be null or one of: directio, sequential. |
IndexRuntimeConfigList
com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfigList
- Lists all index runtime configurations.
- Attributes:
Name |
Type |
Default value |
Description |
version |
long |
|
|
- Nested elements:
Name |
Type |
Description |
CacheConfig |
com.exalead.mercury.mami.indexing.v10.CacheConfig* |
Lists PageCache configurations |
IndexRuntimeConfig |
com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfig* |
Lists runtime configurations |
CacheConfig
com.exalead.mercury.mami.indexing.v10.CacheConfig
- PageCache configuration.
Warning: The index page cache is limited to 32000 files in the index directory. If you get an error like "FileRAM: too many cached files (c_max_files=32767)", it means that the limit has been crossed and you should set a more aggressive compact policy.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfigList (as IndexRuntimeConfigList)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
The cache ID. |
cacheSizeMB |
int |
256 |
Maximum cache size in MB. |
pageSizeKB |
int |
8 |
Page size in KB. |
maxSimultaneousIOOperations |
int |
32 |
Specifies the max number of simultaneous I/O. |
IndexRuntimeConfig
com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfig
- Index runtime configuration for an instance of an index slice. Use key-value arguments to provide custom configuration keys.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfigList (as IndexRuntimeConfigList)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
|
newGenerationBandwidthLimitKB |
int |
|
|
compactBandwidthLimitKB |
int |
|
|
ramBasedAttrGroupLoadPolicy |
enum(rebuild, copyAndPatch) |
copyAndPatch |
Value can be one of: rebuild, copyAndPatch. |
- Nested elements:
Name |
Type |
Description |
AttributeGroupAccess |
com.exalead.mercury.mami.indexing.v10.AttributeGroupAccess* |
|
FieldRuntimeConfig |
com.exalead.mercury.mami.indexing.v10.FieldRuntimeConfig* |
|
QueryAutocacheConfig |
com.exalead.mercury.mami.indexing.v10.QueryAutocacheConfig |
|
ReplicationConfig |
com.exalead.mercury.mami.indexing.v10.ReplicationConfig |
|
WarmupConfig |
com.exalead.mercury.mami.indexing.v10.WarmupConfig |
|
AttributeGroupAccess
com.exalead.mercury.mami.indexing.v10.AttributeGroupAccess
- Configuration specifying how to access the attribute group at runtime.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfig (as IndexRuntimeConfig)
- Attributes:
Name |
Type |
Default value |
Description |
groupId |
string |
|
Specifies which attribute group store this access configuration applies to. |
runType |
enum(mmap, pagecache, direct, RAMRow, RAMColumnDense) |
mmap |
Specifies how the attribute group should be accessed at runtime. |
preload |
boolean |
|
For RAM-based access type, specifies if the attribute group should be loaded in RAM at startup instead of at access time. |
mlock |
boolean |
|
For RAM-based access type, specifies if the attribute group should be locked in RAM, preventing it from being moved to the swap area. |
cacheId |
string |
|
For pagecache I/O type, specifies the cache ID. |
FieldRuntimeConfig
com.exalead.mercury.mami.indexing.v10.FieldRuntimeConfig
- Configuration specifying the index field at runtime.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfig (as IndexRuntimeConfig)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
The index field name. |
dictType |
enum(mmap, pagecache) |
mmap |
Specifies the I/O mode used to load the dictionary part of an index field. Value can be one of: mmap, pagecache. |
type |
enum(mmap, pagecache) |
mmap |
Specifies the I/O mode used to load the component. Value can be one of: mmap, pagecache. |
preload |
boolean |
|
Should the field be preloaded? This will force the field to be loaded in RAM at startup. |
mlock |
boolean |
|
Should the field be locked in RAM? |
cacheId |
string |
|
If PageCache is used, it specifies the cache ID. |
QueryAutocacheConfig
com.exalead.mercury.mami.indexing.v10.QueryAutocacheConfig
- Query #autocache configuration.
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfig (as IndexRuntimeConfig)
- Attributes:
Name |
Type |
Default value |
Description |
totalCacheSizeMB |
int |
20 |
Maximum cache size in MB (cross queries). |
queryCacheSizeMB |
int |
5 |
Maximum size of a cached query in MB. |
maxCachedQueries |
int |
20 |
Number of queries cached. |
ReplicationConfig
com.exalead.mercury.mami.indexing.v10.ReplicationConfig
- Slice replication configuration
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfig (as IndexRuntimeConfig)
- Nested elements:
Name |
Type |
Description |
AttributeReplicationConfig |
com.exalead.mercury.mami.indexing.v10.AttributeReplicationConfig* |
Configures the directio usage in attribute replication. |
FieldReplicationConfig |
com.exalead.mercury.mami.indexing.v10.FieldReplicationConfig* |
Configures the directio usage in field replication. |
AttributeReplicationConfig
com.exalead.mercury.mami.indexing.v10.AttributeReplicationConfig
- Attribute's replication configuration
- Parent elements:
com.exalead.mercury.mami.indexing.v10.ReplicationConfig (as ReplicationConfig)
- Attributes:
Name |
Type |
Default value |
Description |
groupId |
string |
|
Group id of the attribute to configure |
type |
enum(directio, sequential) |
directio |
Access type. Value can be null or one of: directio, sequential. |
FieldReplicationConfig
com.exalead.mercury.mami.indexing.v10.FieldReplicationConfig
- Index field replication configuration
- Parent elements:
com.exalead.mercury.mami.indexing.v10.ReplicationConfig (as ReplicationConfig)
- Attributes:
Name |
Type |
Default value |
Description |
name |
string |
|
Name of the field to configure. |
type |
enum(directio, sequential) |
directio |
Access type. Value can be null or one of: directio, sequential. |
dictType |
enum(directio, sequential) |
directio |
Access type for the dictionary. Value can be null or one of: directio, sequential. |
WarmupConfig
com.exalead.mercury.mami.indexing.v10.WarmupConfig
- Index warmup configuration
- Parent elements:
com.exalead.mercury.mami.indexing.v10.IndexRuntimeConfig (as IndexRuntimeConfig)
- Attributes:
Name |
Type |
Default value |
Description |
warmupQueryFile |
string |
|
File containing the list of single queries used for warmup. |
maxWarmupDurationS |
int |
5 |
Maximum warmup time in seconds. When it elapses, the index is opened and a warning is printed indicating which line number was reached. |
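Example: the time-bounded warmup described by maxWarmupDurationS can be sketched as a loop that stops once the budget is spent and reports how far it got. This is a minimal model, not the product implementation; `run_warmup`, `run_query` and the injectable `clock` are hypothetical names, and the queries are assumed to be already read from the warmup file, one per line.

```python
import time
from typing import Callable, Sequence


def run_warmup(queries: Sequence[str],
               run_query: Callable[[str], None],
               max_warmup_duration_s: float = 5,
               clock: Callable[[], float] = time.monotonic) -> int:
    """Run warmup queries until done or out of time.

    Returns the number of lines processed before the index is opened.
    """
    start = clock()
    for done, query in enumerate(queries):
        if clock() - start >= max_warmup_duration_s:
            # Out of time: report the position reached and stop.
            print(f"warning: warmup stopped after line {done} of {len(queries)}")
            return done
        run_query(query)
    return len(queries)
```

The clock is passed in so that the time limit can be exercised without waiting in real time.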
BuildGroupConfig
com.exalead.mercury.mami.deploy.v10.BuildGroupConfig
- Configuration of a build group. A "Build Group" is defined by references to sub-configurations defined
in other MAMI:
- Analysis (how documents are processed).
- Index Builder (how indexing jobs are scheduled and managed).
- Index Schema (schema of the index slices being built).
- Task Queue (how input document processing tasks are queued before jobs).
- Similar Document (optional)
Several build groups may share some or all of their sub-configurations. In most configurations, all build groups share the same index schema configuration. When built with the same schema, index slices built by different build groups can be queried
together (see the Search MAMI).
- Attributes:
Name |
Type |
Default value |
Description |
buildGroup |
string |
|
Name of the build group. This name should be unique. |
dataModel |
string |
|
Name of the data model. |
indexingConfig |
string |
|
Name of an indexing configuration (IndexingConfig element in Indexing MAMI). |
- Nested elements:
Name |
Type |
Description |
DIHConfig |
com.exalead.mercury.mami.deploy.v10.DIHConfig |
|
DidAllocationPolicy |
com.exalead.mercury.mami.deploy.v10.DidAllocationPolicy |
|
DocumentCacheConfig |
com.exalead.mercury.mami.deploy.v10.DocumentCacheConfig |
|
PrecomputedThumbnailsConfig |
com.exalead.mercury.mami.deploy.v10.PrecomputedThumbnailsConfig |
|
ScratchHook |
com.exalead.mercury.mami.deploy.v10.ScratchHook* |
|
SlicePartioningPolicy |
com.exalead.mercury.mami.deploy.v10.SlicePartioningPolicy |
|
DIHConfig
com.exalead.mercury.mami.deploy.v10.DIHConfig
- A DIHConfig is a set of parameters for a DIH.
- Parent elements:
com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
- Attributes:
Name |
Type |
Default value |
Description |
compactArity |
int |
4 |
Number of consecutive slots to trigger a compact. |
nbBloomBitsPerElement |
int |
20 |
Number of bits per element in the DIH's StrBTree's bloom filter. |
nbElementsInLeaf |
int |
100 |
Number of entries in each of the DIH's StrBTree's leaves. |
readMode |
enum(auto, direct, mmap, mmap_mlock, mmap_mload, pagecache, random, sequential) |
mmap |
Read mode of the DIH's StrBTree, except for enumeration. Value can be null or one of
- auto
- direct
- mmap
- mmap_mlock
- mmap_mload
- pagecache
- random
- sequential
|
enumMode |
enum(auto, direct, mmap, mmap_mlock, mmap_mload, pagecache, random, sequential) |
mmap |
Read mode of the DIH's StrBTree, for enumeration. Value can be null or one of
- auto
- direct
- mmap
- mmap_mlock
- mmap_mload
- pagecache
- random
- sequential
|
compactMode |
enum(auto, direct, mmap, mmap_mlock, mmap_mload, pagecache, random, sequential) |
mmap |
Read mode of the DIH's StrBTree, for compact. Value can be null or one of
- auto
- direct
- mmap
- mmap_mlock
- mmap_mload
- pagecache
- random
- sequential
|
ContiguousDidAllocationPolicy
com.exalead.mercury.mami.deploy.v10.ContiguousDidAllocationPolicy
- Base-class specifying how DIDs (Document IDs) are assigned to the documents.
- Parent elements:
com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
- Attributes:
Name |
Type |
Default value |
Description |
increasing |
boolean |
True |
Assign DIDs in an increasing order. |
startingPoint |
int |
|
Start point of the allocation. By default, the first DID will have value '1'. |
endingPoint |
nullableint |
|
End point of the allocation. By default, it will be Integer.MAX_VALUE if increasing or 1 if decreasing. |
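Example: contiguous DID allocation with the documented defaults can be sketched as a generator. This is an illustration only; the function name `allocate_dids` is hypothetical, and treating the end point as inclusive is an assumption.

```python
from typing import Iterator, Optional

INTEGER_MAX_VALUE = 2**31 - 1  # value of Java's Integer.MAX_VALUE


def allocate_dids(starting_point: int = 1,
                  increasing: bool = True,
                  ending_point: Optional[int] = None) -> Iterator[int]:
    """Yield contiguous DIDs from starting_point towards ending_point."""
    if ending_point is None:
        # Documented defaults: Integer.MAX_VALUE if increasing, 1 otherwise.
        ending_point = INTEGER_MAX_VALUE if increasing else 1
    step = 1 if increasing else -1
    did = starting_point
    # Continue while the end point has not been passed (inclusive bound).
    while (ending_point - did) * step >= 0:
        yield did
        did += step
```

For instance, `list(allocate_dids(5, increasing=False))` enumerates the DIDs from 5 down to the default end point of 1.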
DocumentCacheConfig
com.exalead.mercury.mami.deploy.v10.DocumentCacheConfig
- Configuration for the document cache.
- Parent elements:
com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
- Attributes:
Name |
Type |
Default value |
Description |
path |
string |
|
Location of the document cache on the filesystem. Unless otherwise specified, the document cache is located in the
"cache" subdirectory of the build group. |
compactArity |
int |
4 |
Number of consecutive slots to trigger a compact. |
nbBloomBitsPerElement |
int |
10 |
Number of bits per element in the document cache StrBTree bloom filter. |
nbElementsInLeaf |
int |
20 |
Number of entries in each of the document cache StrBTree leaves. |
readMode |
enum(auto, direct, mmap, mmap_mlock, mmap_mload, pagecache, random, sequential) |
auto |
Read mode of the document cache StrBTree, except for enumeration. Value can be null or one of
- auto
- direct
- mmap
- mmap_mlock
- mmap_mload
- pagecache
- random
- sequential
|
enumMode |
enum(auto, direct, mmap, mmap_mlock, mmap_mload, pagecache, random, sequential) |
auto |
Read mode of the document cache StrBTree, for enumeration. Value can be null or one of
- auto
- direct
- mmap
- mmap_mlock
- mmap_mload
- pagecache
- random
- sequential
|
compactMode |
enum(auto, direct, mmap, mmap_mlock, mmap_mload, pagecache, random, sequential) |
auto |
Read mode of the document cache StrBTree, for compact. Value can be null or one of
- auto
- direct
- mmap
- mmap_mlock
- mmap_mload
- pagecache
- random
- sequential
|
diskCompressionAlgorithm |
enum(none, fastlz, gzip, lcs, lz4) |
fastlz |
Algorithm to compress the document cache on disk. Value can be null or one of: none, fastlz, gzip, lcs, lz4. |
temporaryFilesCompressionAlgorithm |
enum(none, fastlz, gzip, lz4) |
fastlz |
Algorithm to compress the temporary files on disk. Value can be null or one of: none, fastlz, gzip, lz4. |
PrecomputedThumbnailsConfig
com.exalead.mercury.mami.deploy.v10.PrecomputedThumbnailsConfig
- No documentation for this element.
- Parent elements:
com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
- Attributes:
Name |
Type |
Default value |
Description |
computeThreads |
int |
4 |
|
FSPrecomputedThumbnailsConfig
com.exalead.mercury.mami.deploy.v10.FSPrecomputedThumbnailsConfig
- (Deprecated)
- No documentation for this element.
- Parent elements:
com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
- Attributes:
Name |
Type |
Default value |
Description |
computeThreads |
int |
4 |
|
GDSPrecomputedThumbnailsConfig
com.exalead.mercury.mami.deploy.v10.GDSPrecomputedThumbnailsConfig
- (Deprecated)
- No documentation for this element.
- Parent elements:
com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
- Attributes:
Name |
Type |
Default value |
Description |
computeThreads |
int |
4 |
|
ramBufferSizeMB |
long |
16 |
|
readMode |
enum(normal, direct) |
direct |
Value can be null or one of
- normal
- direct
|
ScratchHook
com.exalead.mercury.mami.deploy.v10.ScratchHook
- A hook for plugging custom exa code into the BuildGroup scratch operation.
- Parent elements:
com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
- Attributes:
Name |
Type |
Default value |
Description |
classId |
string |
|
The specified class must implement the com.exalead.mercury.indexing.CustomScratchHook Exascript interface. |
- Nested elements:
Name |
Type |
Description |
KeyValue |
exa.bee.KeyValue* |
|
BasicSlicePartioningPolicy
com.exalead.mercury.mami.deploy.v10.BasicSlicePartioningPolicy
- Basic partitioning function based on a URL hash and a '%' (modulo).
- Parent elements:
com.exalead.mercury.mami.deploy.v10.BuildGroupConfig (as BuildGroupConfig)
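The hash-and-modulo scheme described for BasicSlicePartioningPolicy can be sketched as follows. This is a minimal illustration, not the product's implementation: the hash function it uses is not specified in this reference, so CRC32 stands in for it, and the names slice_for_url and num_slices are hypothetical.

```python
import zlib


def slice_for_url(url: str, num_slices: int) -> int:
    """Map a document URL to a slice index in the range [0, num_slices).

    Assumption: CRC32 is a stand-in hash; the real policy may use another one.
    """
    if num_slices <= 0:
        raise ValueError("num_slices must be positive")
    # Hash the URL bytes to an unsigned 32-bit integer.
    url_hash = zlib.crc32(url.encode("utf-8"))
    # The '%' (modulo) step picks the slice.
    return url_hash % num_slices


if __name__ == "__main__":
    for url in ("http://example.com/a", "http://example.com/b"):
        print(url, "->", slice_for_url(url, 4))
```

Because the slice depends only on the URL and the slice count, a given document always lands in the same slice. Changing the number of slices remaps most URLs.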
StringValue
exa.bee.StringValue
- No documentation for this element.
- Parent elements:
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as annotationsToCopy)
com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as classes)
com.exalead.indexing.analysis.v10.HTMLCSSSelector (as classes)
com.exalead.indexing.analysis.v10.HTMLCSSExtractor (as ids)
com.exalead.indexing.analysis.v10.HTMLCSSSelector (as ids)
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as idsAndClassesToIgnore)
com.exalead.indexing.analysis.v10.HTMLRelevantContentExtractor (as idsAndClassesToKeep)
com.exalead.indexing.analysis.v10.ConcatValues (as inputContexts)
com.exalead.indexing.analysis.v10.ContentCleanup (as inputContexts)
com.exalead.indexing.analysis.v10.CoordinatesFormatter (as inputContexts)
com.exalead.indexing.analysis.v10.DebugProcessor (as inputContexts)
com.exalead.indexing.analysis.v10.LanguageDetector (as inputContexts)
com.exalead.indexing.analysis.v10.LanguageSetter (as inputContexts)
com.exalead.indexing.analysis.v10.MultiContextCSVEncoder (as inputContexts)
com.exalead.indexing.analysis.v10.MultiContextDocumentProcessor (as inputContexts)
com.exalead.indexing.analysis.v10.NumericalFormatter (as inputContexts)
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as inputContexts)
com.exalead.indexing.analysis.v10.RemoveContexts (as inputContexts)
com.exalead.indexing.analysis.v10.StringHash (as inputContexts)
com.exalead.indexing.analysis.v10.StringHash32 (as inputContexts)
com.exalead.indexing.analysis.v10.StringHash64 (as inputContexts)
com.exalead.indexing.analysis.v10.StringTransform (as inputContexts)
com.exalead.indexing.analysis.v10.UTF8Checker (as inputContexts)
com.exalead.indexing.analysis.v10.ValueSelector (as inputContexts)
com.exalead.indexing.analysis.v10.MimeCondition (as mimes)
com.exalead.indexing.analysis.v10.StandardPartsMerger (as partSpecificContexts)
com.exalead.indexing.analysis.v10.RemoteMOTAPIDocumentProcessor (as targetInstances)
com.exalead.indexing.analysis.v10.SimilarStringToPart (as values)
com.exalead.indexing.analysis.v10.UniformRandomContextGenerator (as values)
com.exalead.indexing.analysis.v10.ZipfRandomContextGenerator (as values)
- Attributes:
Name |
Type |
Default value |
Description |
value |
string |
|
|
KeyValue
exa.bee.KeyValue
- No documentation for this element.
- Parent elements:
com.exalead.indexing.analysis.v10.ConvertTextExtractor (as ConvertTextExtractor)
com.exalead.indexing.analysis.v10.CustomDocumentProcessor (as CustomDocumentProcessor)
com.exalead.indexing.analysis.v10.CustomSemanticProcessor (as CustomSemanticProcessor)
exa.bee.KeyValue (as KeyValue)
com.exalead.indexing.analysis.v10.ReplaceValues (as ReplaceValues)
com.exalead.mercury.mami.deploy.v10.ScratchHook (as ScratchHook)
com.exalead.indexing.analysis.v10.SetDefaultValue (as SetDefaultValue)
com.exalead.indexing.analysis.v10.CustomPublisher (as config)
- Attributes:
Name |
Type |
Default value |
Description |
key |
string |
|
The name of the key |
value |
string |
|
|
type |
string |
|
|
description |
string |
|
|
- Nested elements:
Name |
Type |
Description |
KeyValue |
exa.bee.KeyValue* |
|
|