File extensions
|
This is the text version of the Configuration tab's Filename extensions section.
|
Recursive
|
Indexes sub-folders recursively. If unchecked, only the files
in the defined top root paths will be indexed. Enabled by default.
|
Enable ACL handling
|
Fetches the security tokens associated with files.
- On Unix, it fetches the user/group security mode and, if
available, POSIX ACLs.
- On Windows, it fetches security SIDs.
|
Keep local ACL
|
Only applies to Windows, and only if Enable ACL handling is enabled.
Fetches all security SIDs, including well-known local security
SIDs such as "Local System".
|
Skip directory symlinks
|
Only applies to Unix/Linux.
Skips symbolic links to directories (does not follow them) to
avoid possible infinite loops.
|
Default text encoding
|
If specified, defines a global default encoding for text files
on this connector. This encoding may be used to index raw text files whose
encoding is unknown.
|
Enable containers support
|
If enabled, files which are containers (e.g., ZIP files, TAR
files, PST files, EML files) are processed as if they were regular
folders.
|
Max. container depth
|
When containers support is enabled, sets the maximum recursion
depth inside containers.
Example:
- A level of 1 only allows scanning files within containers
in the filesystem source.
- A level of 2 also allows scanning containers inside
containers (a ZIP file in a ZIP file, for example) in the filesystem source.
- A level of 3 allows one further level of depth (for example, an
attachment inside a mail inside a PST file).
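As an illustration of how a depth limit of this kind might behave, here is a minimal Python sketch. The nested-dictionary model, the list_files function, and the file names are all hypothetical; they are not part of the connector.

```python
# Hypothetical model: a container is a dict mapping entry names to either
# None (a regular file) or another dict (a nested container).
def list_files(container, max_depth, depth=1, prefix=""):
    """Return paths of regular files reachable without exceeding max_depth."""
    if depth > max_depth:
        return []
    found = []
    for name, entry in container.items():
        path = prefix + name
        if isinstance(entry, dict):
            # Entering a nested container costs one extra level of depth.
            found.extend(list_files(entry, max_depth, depth + 1, path + "/"))
        else:
            found.append(path)
    return found


outer = {
    "a.doc": None,
    "inner.zip": {"b.doc": None, "deep.zip": {"c.doc": None}},
}

print(list_files(outer, 1))  # only files directly inside the outer container
print(list_files(outer, 2))  # also files inside inner.zip
print(list_files(outer, 3))  # also files inside deep.zip
```

In this model, raising the level by one exposes exactly one more layer of nesting, which mirrors the ZIP-in-ZIP case described above.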
|
Max. documents per container
|
When containers support is enabled, sets the maximum number of
files to be processed inside a single container (inside a ZIP file, for
example).
For example, consider the following structure:
foo.zip : a ZIP containing 80 files and 10 ZIP files:
    file1.doc
    file2.doc
    ...
    file80.doc
    archive1.zip : a ZIP containing 50 files
    archive2.zip : a ZIP containing 50 files
    ...
    archive10.zip : a ZIP containing 50 files
Setting this value to "100" allows indexing all 80 files within
foo.zip, all 50 files within archive1.zip, all 50 files within
archive2.zip, and so on. The total number of files indexed is 580
(80 files at the top level, plus 50 files for each of the 10 archives).
|
Max. documents per container total
|
When containers support is enabled, sets the maximum number of
files to be processed overall, across all nested container levels.
In the previous example, setting this value to "100" allows
indexing all 80 files within foo.zip, but indexing stops after
20 files within archive1.zip. The other archives are not
indexed at all.
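The arithmetic of the foo.zip example can be checked with a small simulation. This is only a sketch under stated assumptions: the dictionary layout and the count_indexed function are hypothetical, regular files are visited before nested archives, and nested archives themselves are not counted as documents. The connector's real traversal order and accounting may differ.

```python
def count_indexed(container, per_container=None, total=None):
    """Count files indexed under the two limits (None means no limit)."""
    indexed = 0

    def visit(node):
        nonlocal indexed
        in_container = 0
        for _ in node["files"]:
            if per_container is not None and in_container >= per_container:
                break  # this container is full; nested archives are still visited
            if total is not None and indexed >= total:
                return  # overall budget exhausted
            in_container += 1
            indexed += 1
        for child in node["archives"]:
            if total is not None and indexed >= total:
                return
            visit(child)

    visit(container)
    return indexed


# foo.zip: 80 files plus 10 nested archives of 50 files each.
foo = {
    "files": [f"file{i}.doc" for i in range(1, 81)],
    "archives": [
        {"files": [f"doc{j}.doc" for j in range(1, 51)], "archives": []}
        for _ in range(10)
    ],
}

print(count_indexed(foo, per_container=100))  # 580
print(count_indexed(foo, total=100))          # 100 (80 in foo.zip + 20 in archive1.zip)
```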
|
CPath stop MIME filter
|
Defines the MIME types of containers which are to be treated
as whole documents. For example, msg or eml mail files are containers
because they may themselves contain attachments or attached files.
Note:
If this parameter is empty, no restriction or exclusion is applied.
|
Container MIME filter
|
Select the MIME types of files which are to be considered as containers.
Note:
If this parameter is empty, no restriction or exclusion is applied.
|
Item MIME Filter
|
Select the MIME types of files to be scanned in a container.
Note:
If this parameter is empty, no restriction or exclusion is applied.
|
Item extensions
|
Define the extensions of files to be scanned in a container.
|
Index names
|
Pushes empty documents for all the files which have not been
accepted because of filters. This allows indexing the filenames of files whose
content should not be indexed.
|
Max. input size
|
Maximum file input size allowed.
Specify any SI byte unit (1000KB, 100MB, 1GB and so on). If no
unit is specified, the value is interpreted as bytes.
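To make the unit handling concrete, here is a hedged Python sketch of how such a size string could be parsed. The parse_size function is hypothetical, and the decimal multipliers (1 KB = 1000 bytes) are an assumption based on the "SI byte unit" wording; the product may interpret units differently.

```python
import re

# Assumption: decimal (SI) multipliers, as suggested by "SI byte unit".
_UNITS = {"": 1, "B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(text):
    """Convert a string such as '100MB' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]*)\s*", text)
    if not match:
        raise ValueError(f"invalid size: {text!r}")
    value, unit = match.groups()
    unit = unit.upper()
    if unit not in _UNITS:
        raise ValueError(f"unknown unit: {unit!r}")
    # A missing unit maps to the "" entry, i.e. plain bytes.
    return int(value) * _UNITS[unit]


print(parse_size("1000KB"))  # 1000000
print(parse_size("2048"))    # 2048
```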
|
Max. container fetch size
|
Maximum container size allowed for fetch (preview, data
fetch).
Specify any SI byte unit (1000KB, 100MB, 1GB and so on). If no
unit is specified, the value is interpreted as bytes.
|
Convert address
|
Address of an external Convert service. Leave empty to dispatch to
the default Converter.
|
Container timeout
|
When a container is opened using a remote Convert service,
defines the timeout for opening the file. For example, a large PST file may
take several minutes to open.
|
Container fetch timeout
|
When a container is opened using a remote Convert service,
defines the timeout for fetching a sub-item.
|
Truncate files pattern
|
When a file is larger than the size allowed by
Max. input size, truncates the file
rather than discarding it. This option only works with raw text or
HTML files (not with Office or PDF files, for example).
|
Push folders as documents
|
Push an empty document for all folders found. Disabled by
default.
|
Never send delete
|
Never send any delete remotely, even if the file is no longer
present locally. Disabled by default.
|
Delete document on error
|
Defines the strategy to adopt when a document cannot be
updated after a first indexing (if the file becomes unreadable or busy, or if
the access rights no longer allow access to it).
- Keep: keep the entry in the remote
index as it was before
- Delete: remove the entry from the remote
index
- Empty: create an empty file in the
remote index
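The three strategies can be pictured with the following Python sketch. It is purely illustrative: the OnError enum, the apply_on_error function, and the use of a dictionary as a stand-in for the remote index are hypothetical and do not reflect the connector's implementation.

```python
from enum import Enum


class OnError(Enum):
    KEEP = "keep"
    DELETE = "delete"
    EMPTY = "empty"


def apply_on_error(index, path, strategy):
    """Update a dict standing in for the remote index after a failed update."""
    if strategy is OnError.KEEP:
        pass  # leave the previously indexed entry untouched
    elif strategy is OnError.DELETE:
        index.pop(path, None)  # remove the entry if it exists
    elif strategy is OnError.EMPTY:
        index[path] = ""  # keep the entry, but with empty content
    return index


index = {"/data/report.doc": "old content"}
print(apply_on_error(dict(index), "/data/report.doc", OnError.KEEP))
print(apply_on_error(dict(index), "/data/report.doc", OnError.DELETE))
print(apply_on_error(dict(index), "/data/report.doc", OnError.EMPTY))
```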
|
Max. document queued
|
Maximum number of documents to be added in the document
processing queue (in memory).
|
Max. folder queued
|
Maximum number of folders to be added in the folder processing
queue.
|
No. pipeline document thread
|
Number of background threads processing the document queue,
that is, reading documents to be indexed and sending them to the remote server.
|
No. pipeline folder thread
|
Number of background threads processing the folder queue, that
is, scanning local folders to find all files and subfolders to be indexed.
|
Max. processing size
|
Limits the total amount of memory which can be used when
processing the document queue. If the limit is reached, other document threads
are blocked until memory is freed.
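One common way to implement this kind of limit is a shared byte budget guarded by a condition variable. The sketch below shows that general technique in Python; the MemoryBudget class is hypothetical and is not taken from the connector. Note that in this sketch a request larger than the whole budget would wait until its timeout expires.

```python
import threading


class MemoryBudget:
    """Shared byte budget: callers block while the budget is exhausted."""

    def __init__(self, limit_bytes):
        self._limit = limit_bytes
        self._used = 0
        self._cond = threading.Condition()

    def acquire(self, size, timeout=None):
        """Wait until `size` bytes fit in the budget; return False on timeout."""
        with self._cond:
            ok = self._cond.wait_for(
                lambda: self._used + size <= self._limit, timeout
            )
            if ok:
                self._used += size
            return ok

    def release(self, size):
        """Give bytes back and wake up any waiting threads."""
        with self._cond:
            self._used -= size
            self._cond.notify_all()


budget = MemoryBudget(100)
budget.acquire(60)                        # fits: 60 of 100 bytes used
print(budget.acquire(60, timeout=0.05))   # False: would exceed the limit
budget.release(60)
print(budget.acquire(60, timeout=0.05))   # True: memory was freed
```

A document thread would call acquire before loading a document and release once the document has been sent.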
|
Root Paths (N)
|
Text version of the Configuration tab's Filesystem paths section.
|
Filename include rules (N)
|
Text version of the Configuration tab's Include rules section.
|
Filename exclude rules (N)
|
Text version of the Configuration tab's Exclude rules section.
|
Main part MIME filters (N)
|
Used to aggregate and deduplicate items within a mail container. For
example, this allows indexing the HTML part of a mail and ignoring the text
part.
- Parent MIME filter: list of MIME
filters of mail containers
- Main part MIME filter: list of MIME
types of body part(s) inside a mail
- Main part dedup MIME filter: list of
equivalent MIME types to be deduplicated
- Main part dedup max. count: maximum
number of documents to be deduplicated
- Add child links: adds metadata linking
sub-children (such as attachments)
- Merge in parent: merges bodies in the
document
- Merge container metas: merges the
container's metadata in the main document
|
Filename MIME rules (N)
|
A set of rules that set the MIME type, and optionally
the encoding, of files matching the given extension/filename filter.
- Filter: the space-separated list of
filename extensions to match (or the regular expression, if the checkbox is
checked)
- Regular expression: if checked, the
filter is a regular expression matching the filename
- MIME type: the MIME type to set
- Encoding: the encoding to set,
optionally
- Hint only: if checked, the MIME type is
not forced
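The rule fields above could be modeled as in the following Python sketch. Everything here is hypothetical: the MimeRule class, the resolve function, the first-match-wins ordering, and the reading of "Hint only" as "a detected MIME type takes precedence over the rule" are assumptions made for illustration, not documented behavior.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class MimeRule:
    pattern: str                    # space-separated extensions, or a regex
    mime_type: str
    regex: bool = False
    encoding: Optional[str] = None
    hint_only: bool = False

    def matches(self, filename):
        if self.regex:
            return re.search(self.pattern, filename) is not None
        ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
        wanted = {e.lower().lstrip(".") for e in self.pattern.split()}
        return bool(ext) and ext in wanted


def resolve(filename, rules, detected_mime=None):
    """Return (mime_type, encoding) using the first matching rule."""
    for rule in rules:
        if rule.matches(filename):
            # Assumption: with "Hint only", a detected type wins over the rule.
            if rule.hint_only and detected_mime:
                return detected_mime, rule.encoding
            return rule.mime_type, rule.encoding
    return detected_mime, None


rules = [
    MimeRule("log txt", "text/plain", encoding="utf-8"),
    MimeRule(r"^README$", "text/plain", regex=True, hint_only=True),
]
print(resolve("server.LOG", rules))               # ('text/plain', 'utf-8')
print(resolve("README", rules, "text/markdown"))  # ('text/markdown', None)
```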
|
PushAPI filters (N)
|
The PushAPI pipeline configuration. Documents added to
the PushAPI pipeline go through the defined filters in order, from the first
filter to the last, before being sent to the PushAPI.
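The ordered application of filters can be sketched as a simple function chain. This is a generic Python illustration, not the PushAPI itself: run_pipeline and the two example filters are hypothetical, and the convention that a filter returning None drops the document is an assumption.

```python
def run_pipeline(document, filters):
    """Apply filters in order; a filter returning None drops the document."""
    for apply_filter in filters:
        document = apply_filter(document)
        if document is None:
            return None
    return document


# Hypothetical filters operating on a dict-based document.
def normalize_title(doc):
    return {**doc, "title": doc["title"].strip().lower()}


def drop_untitled(doc):
    return doc if doc["title"] else None


filters = [normalize_title, drop_untitled]
print(run_pipeline({"title": "  Report.DOC "}, filters))  # {'title': 'report.doc'}
print(run_pipeline({"title": "   "}, filters))            # None
```

Because the filters run in sequence, their order matters: here the title is normalized before the emptiness check, so a whitespace-only title is dropped.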
|