Configuring the CSV Connector

This section covers how to configure your CSV connector in Exalead CloudView.

This section assumes that the connector has already been added. See Creating a Standard Connector.

This task shows you how to:

  • Configure the Connector
  • Check the CSV Config

Configure the Connector

  1. On the Administration Console home page, under Connectors, select your CSV connector.
  2. Specify the filesystem paths to crawl.
  3. Configure the connector’s main parameters as follows:

    Parameter

    Description

    Encoding

    Specifies the file encoding to use. The default is UTF-8.

    Column delimiter

    Specifies the column delimiter to use. By default, a comma (,) is used.

    Escaping character

    Specifies the character used to escape special characters. You can use it before the quoting character, the column delimiter, and the escaping character itself. Use \0 if you do not want to escape characters.

    Quoting character

    Specifies the character used to begin and end a string. Use \0 if you do not want to index quoting characters.

    Treat first row as header

    Specifies whether the first row of your CSV file contains column headers. These column headers are used as meta names. If your CSV file has no headers, clear this option. Columns are then named 0, 1, 2, 3, and so on. You can customize these names using the Custom column names option.

    File extensions

    Specifies the file extensions to process. The default is csv. Separate extensions with spaces.

    Filesystem paths

    Enter the filesystem path of the CSV file or directory to crawl. You can prefix paths with:

    • data:// OR /data – relative to the <DATADIR>
    • resource:// – relative to the NGRESOURCEPATH (defined in <DATADIR>/bin/ngstart.env)
    • kit:// – relative to the <INSTALLDIR>
    • file:// – absolute path, for example, file://temp/csv-data/
    • run:// – relative to the NGRUNDIR (defined in <DATADIR>/bin/ngstart.env)
    • config:// – relative to the NGCONFIGDIR (defined in <DATADIR>/bin/ngstart.env)

    Click Add item for each path you want to crawl.

  4. Configure the optional parameters:

    Parameter

    Description

    Max. row

    Specifies the maximum number of rows to crawl.

    Prefix

    Specifies the prefix to add to each meta.

    Meta normalization

    Normalizes meta names by converting names to lowercase and spaces to underscores.

    Public security

    Pushes the Everybody security token with the document.

    Custom URI

    Specifies the separator and the list of columns to concatenate to build the URI. Note that for multivalued metas, only the first value is used.

    If empty, the URI is generated automatically as the file name plus the row number.

    Columns selection

    By default, the connector crawls all CSV file columns unless you specify a filter. You can specify a filter here to exclude or include columns, for example, column D.

    Note: If the Treat first row as header option is selected, you can click in the Columns field to get column name suggestions.

    Custom column names

    Renames the specified Column name to a Meta name that is used in the Exalead CloudView index.

    Group by

    The set of columns included in the document URI determines how the rows are accumulated. For more information, see Aggregate Column Values.

    The Distinct option ensures that if column values are identical, the value appears only once.

    Push API filter

    Create an entry for each filter to apply. For more information, see Push API filters in the Exalead CloudView Connector Programmer's Guide.
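
As a rough illustration of how the parsing options in steps 3 and 4 interact, the sketch below uses Python's standard csv module, not the connector itself. The function read_rows and its parameter names are hypothetical; they only mirror the Column delimiter, Quoting character, Escaping character, Treat first row as header, Prefix, and Meta normalization options. The order in which normalization and the prefix are applied is an assumption.

```python
import csv
import io


def read_rows(text, delimiter=",", quotechar='"', escapechar="\\",
              first_row_is_header=True, prefix="", normalize=False):
    """Parse CSV text into a list of {meta_name: value} dictionaries.

    Hypothetical helper: it imitates the documented options with the
    standard csv module and is not the connector's implementation.
    To read a file instead, open it with the configured encoding, e.g.
    open(path, encoding="utf-8", newline="").
    """
    reader = csv.reader(
        io.StringIO(text),
        delimiter=delimiter,    # Column delimiter
        quotechar=quotechar,    # Quoting character
        escapechar=escapechar,  # Escaping character
    )
    rows = list(reader)
    if not rows:
        return []

    if first_row_is_header:
        # The first row supplies the meta names.
        names, data = rows[0], rows[1:]
    else:
        # No header row: columns are named 0, 1, 2, 3, ...
        names, data = [str(i) for i in range(len(rows[0]))], rows

    if normalize:
        # Meta normalization: lowercase, spaces become underscores.
        names = [name.lower().replace(" ", "_") for name in names]

    # Prefix (assumed here to be applied after normalization).
    names = [prefix + name for name in names]

    return [dict(zip(names, row)) for row in data]
```

For example, with the header row Full Name,City, the prefix csv_, and normalization enabled, this sketch produces the meta names csv_full_name and csv_city.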

Check the CSV Config

  1. From the Exalead CloudView home page, click the connector name.
  2. Click Check config.
  3. Click Apply to save and apply the configuration changes.
    You are now ready to scan and index your documents.
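
Before scanning, it can help to preview how the Custom URI and Group by options described above might shape your documents. The sketch below is a minimal approximation under stated assumptions: build_uri and group_rows are hypothetical helpers, the "#" used to join the file name and row number in the automatic URI is an assumption (the exact format is not specified here), and rows that share a URI are assumed to be accumulated into one document with multivalued metas.

```python
def first_value(value):
    """Return the first item of a multivalued meta, or the value itself."""
    return value[0] if isinstance(value, list) else value


def build_uri(row, file_name, row_number, uri_columns=(), separator="/"):
    """Build a document URI from selected columns (hypothetical helper).

    With no columns selected, fall back to the file name plus the row
    number; the "#" joiner is an assumption made for this sketch.
    """
    if not uri_columns:
        return f"{file_name}#{row_number}"
    # Only the first value of a multivalued meta contributes to the URI.
    return separator.join(str(first_value(row[col])) for col in uri_columns)


def group_rows(rows, file_name, uri_columns, separator="/", distinct=True):
    """Accumulate rows that share a URI into one document per URI."""
    documents = {}
    for number, row in enumerate(rows, start=1):
        uri = build_uri(row, file_name, number, uri_columns, separator)
        doc = documents.setdefault(uri, {})
        for name, value in row.items():
            values = doc.setdefault(name, [])
            # Distinct: keep an identical value only once.
            if distinct and value in values:
                continue
            values.append(value)
    return documents
```

In this sketch, grouping on a single id column merges every row with the same id into one document, and the Distinct behavior keeps repeated values from appearing more than once in a meta.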