Property Descriptions

Global Configuration Properties

The following global configuration properties are available for the XML simple and the XML advanced connectors.

The properties flagged with [R] are required.

Type

Property

Value

Filesystem

File extensions

Specifies the file extensions to include in the crawl and index processes.

Recursive

Indexes the subdirectories of the path.

Default is true.

Index names

[R] Indexes the names of files whose extensions do not match the File extensions list.

Default is false.

Max. input size

[R] Specifies the maximum size of files.

Default is 10 MB.
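
To illustrate how File extensions, Recursive, and Max. input size might combine during a crawl, here is a minimal Python sketch. It is not the connector's implementation: the function name `select_files`, the case-insensitive extension match, the inclusive size limit, and reading 10 MB as 10 * 1024 * 1024 bytes are all assumptions made for the example.

```python
import os

def select_files(root, extensions, recursive=True, max_size=10 * 1024 * 1024):
    """Return the files under root that pass the extension and size filters.

    extensions: set of lowercase extensions without the dot, e.g. {"xml"}.
    max_size:   size limit in bytes (assumed inclusive; 10 MB read as 10 MiB).
    """
    selected = []
    for dirpath, dirnames, filenames in os.walk(root):
        if not recursive:
            # Emptying dirnames in place stops os.walk from descending further.
            dirnames.clear()
        for name in filenames:
            path = os.path.join(dirpath, name)
            ext = os.path.splitext(name)[1].lstrip(".").lower()
            if ext in extensions and os.path.getsize(path) <= max_size:
                selected.append(path)
    return sorted(selected)
```

For example, `select_files("/data/in", {"xml"}, recursive=False)` would return only the .xml files located directly in the hypothetical folder /data/in.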

Push folders as documents

Pushes folders as documents.

Max. document queued

Specifies the maximum number of documents held in the queue.

Default is 1000.

Max. folder queued

Specifies the maximum number of folders held in the queue.

Default is 1000.

No. pipeline document thread

Specifies the number of threads handling documents.

Default is 4.

No. pipeline folder thread

Specifies the number of threads handling folders.

Default is 4.

XML

Normalize meta names

Normalizes meta names: names are converted to lowercase and spaces are replaced by underscores.

Default is false.
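
The normalization rule can be sketched in a few lines of Python. The function name `normalize_meta_name` is hypothetical, and the sketch assumes that every space is replaced individually and that no other characters are altered.

```python
def normalize_meta_name(name: str) -> str:
    """Lowercase a meta name and replace each space with an underscore."""
    return name.lower().replace(" ", "_")

print(normalize_meta_name("Publication Date"))  # publication_date
```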

Incremental

Pushes only new files from the filesystem.

Default is false.

Push XML as meta

(for XML simple only)

Pushes the XML document (or the chunk if the document is split) as a meta named xmlbody.

Default is false.

Verbose

Enables the verbose mode.

Default is false.

Entry point id

(for XML advanced only)

[R] Specifies the first processor of the chain.

The connector needs to know where the chain of processors begins, so you must supply an Entry point id.

Extraction Properties

The following extraction properties are available for the XML simple connector. Some properties are located in the global configuration section.

The properties flagged with [R] are required.

Property

Value

Root Paths

Specifies the list of files or folders to index.

You can prefix paths with:

  • data:// OR /data – relative to <DATADIR>
  • resource:// – relative to NGRESOURCEPATH (defined in <DATADIR>/bin/ngstart.env)
  • kit:// – relative to <INSTALLDIR>
  • file:// – absolute path, for example, file://temp/csv-data/
  • run:// – relative to NGRUNDIR (defined in <DATADIR>/bin/ngstart.env)
  • config:// – relative to NGCONFIGDIR (defined in <DATADIR>/bin/ngstart.env)
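
As a rough illustration of what "relative to" means for these prefixes, the following Python sketch joins a prefixed path to a base directory. The directory values in `BASES` are invented placeholders (the real ones depend on your installation and on ngstart.env), the function name is hypothetical, and file:// paths are deliberately passed through unchanged because their exact interpretation is not described here.

```python
from pathlib import PurePosixPath

# Placeholder base directories, for illustration only.
BASES = {
    "data://": "/opt/example/datadir",
    "resource://": "/opt/example/resource",
    "kit://": "/opt/example/install",
    "run://": "/opt/example/run",
    "config://": "/opt/example/config",
}

def resolve_root_path(path, bases):
    """Join a prefixed root path to the base directory of its prefix."""
    for prefix, base in bases.items():
        if path.startswith(prefix):
            # Strip leading slashes so the join stays under the base directory.
            relative = path[len(prefix):].lstrip("/")
            return str(PurePosixPath(base) / relative)
    # file:// and unprefixed paths are returned as given.
    return path

print(resolve_root_path("data://csv/in", BASES))  # /opt/example/datadir/csv/in
```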

Root path

Specifies either a file or a folder.

Split

Determines whether to consider every first-level child as a separate document.

Default is false.
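
The effect of Split can be pictured with Python's standard xml.etree.ElementTree module. This is a sketch of the idea only, with a hypothetical function name and sample data; how the connector serializes each chunk is not specified here.

```python
import xml.etree.ElementTree as ET

def split_first_level(xml_text):
    """Return each first-level child of the root as its own XML string."""
    root = ET.fromstring(xml_text)
    return [ET.tostring(child, encoding="unicode") for child in root]

sample = "<books><book><title>A</title></book><book><title>B</title></book></books>"
for chunk in split_first_level(sample):
    print(chunk)
# <book><title>A</title></book>
# <book><title>B</title></book>
```

With Split disabled, the whole <books> document would be handled as a single document instead.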

Stylesheet

Specifies the XSL Transformation file to use.

XPath after XSLT

Enables XPath extraction on the result of the XSL transformation.

Default is false.

Element extraction after XSLT

Enables element name extraction on the result of the XSL transformation.

Default is false.

Include elements

For extraction by element name.

Specifies a list of elements to include:

  • Element [R]: name of the XML node to include
  • Meta: name of the associated meta
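
A minimal Python sketch of extraction by element name follows. It assumes that the text of each matching element becomes a value of the associated meta, and that an omitted Meta falls back to the element name; both points are assumptions, and `extract_by_element` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

def extract_by_element(xml_text, includes):
    """Collect the text of included elements under their meta names.

    includes: list of (element, meta) pairs; meta may be None.
    """
    root = ET.fromstring(xml_text)
    metas = {}
    for element, meta in includes:
        meta_name = meta or element  # assumed fallback when Meta is omitted
        for node in root.iter(element):
            text = (node.text or "").strip()
            if text:
                metas.setdefault(meta_name, []).append(text)
    return metas

doc = "<order><id>42</id><item>pen</item><item>ink</item></order>"
print(extract_by_element(doc, [("id", "order_id"), ("item", None)]))
# {'order_id': ['42'], 'item': ['pen', 'ink']}
```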

XPaths

For extraction by XPath.

Specifies a list of XPath expressions. Each result of XPath extraction is associated with a meta:

  • XPath [R]: an XPath expression
  • Meta [R]: name of the associated meta
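
The pairing of XPath expressions with metas can be sketched as follows. The example uses Python's xml.etree.ElementTree, which supports only a subset of XPath, so it may not accept every expression the connector does; the function name and sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

def extract_metas(xml_text, rules):
    """Apply (xpath, meta) rules and collect the matched text per meta."""
    root = ET.fromstring(xml_text)
    metas = {}
    for xpath, meta in rules:
        for node in root.findall(xpath):
            text = (node.text or "").strip()
            if text:
                metas.setdefault(meta, []).append(text)
    return metas

doc = "<book><title>Dune</title><author><name>Frank Herbert</name></author></book>"
rules = [("./title", "title"), ("./author/name", "author_name")]
print(extract_metas(doc, rules))
# {'title': ['Dune'], 'author_name': ['Frank Herbert']}
```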

Exclude elements

Specifies a list of elements to exclude.