Extracting by XML Elements Method

You can use the XML Elements method to select nodes (or attributes) by name and push their values as metas. The connector selects nodes or attributes based on the include and exclude lists.

The following rules apply:

  • Children of included nodes are pushed as separate metas.

  • Children of excluded nodes, and their attributes, are ignored completely.
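The exclusion rule can be pictured as a tree walk that stops at any excluded node. The following is a minimal sketch, not the connector's implementation: the function name iter_visible_nodes and the sample document are hypothetical, and it assumes that an excluded node is skipped together with everything beneath it.

```python
import xml.etree.ElementTree as ET


def iter_visible_nodes(node, excluded):
    """Yield nodes in document order, skipping excluded nodes and their subtrees."""
    if node.tag in excluded:
        # Assumption: the excluded node itself is skipped, along with
        # its attributes and all of its descendants.
        return
    yield node
    for child in node:
        yield from iter_visible_nodes(child, excluded)


# Hypothetical sample document, for illustration only.
SAMPLE = (
    "<library>"
    "<book><title>Dune</title></book>"
    "<draft><title>WIP</title></draft>"
    "</library>"
)

root = ET.fromstring(SAMPLE)
visible = [n.tag for n in iter_visible_nodes(root, {"draft"})]
print(visible)
```

In this sketch, the title under draft never reaches the include step, because the walk does not descend into draft.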

You can configure the following parameters:

  • Include elements: list of element mappings.

    • Meta: name of the meta associated with the node content.

    • Element: name of the XML element to extract. To target an attribute instead, use the form node@attribute. For example: book@title.

  • Exclude elements: list of node names to skip.

When no Meta is specified in Include elements, the connector uses the name of the element or attribute to infer the meta name. For example:

  • If you set Element to book, the node text is pushed in the book meta.

  • If you set Element to book@title, the attribute text is pushed in the title meta.
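The naming rule above can be sketched in a few lines of Python. This is only an illustration of the behavior described here, not the connector's code; the helper names parse_element_spec and infer_meta_name are hypothetical.

```python
def parse_element_spec(spec):
    """Split 'node' or 'node@attribute' into (node, attribute or None)."""
    node, sep, attribute = spec.partition("@")
    return node, (attribute if sep else None)


def infer_meta_name(spec, meta=None):
    """Return the explicit Meta if one is set, else infer it from Element."""
    if meta:
        return meta
    node, attribute = parse_element_spec(spec)
    # An attribute spec yields the attribute name; a plain spec yields the node name.
    return attribute if attribute is not None else node


print(infer_meta_name("book"))
print(infer_meta_name("book@title"))
print(infer_meta_name("person@name", meta="title"))
```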

See Also
Property Descriptions

Context: This procedure describes how to extract metas by element name.

  1. In Configuration, keep the global configuration default settings.
  2. Expand Root Paths, and click Add item to add a path.
  3. In Root path, enter the file or folder path to index.

    For example, /data/<USER>/xml-sample-data/contacts

  4. Expand Include elements, click Add item, and enter the element mapping to include.
    1. In Element, enter the name of the XML node or attribute to extract. For example, person@name.
    2. In Meta, enter the meta name associated with the node content. For example, title.
  5. For each additional mapping, click Add item and repeat Step 4.
  6. In Exclude elements, enter the names of the nodes to skip.

    For this example, you can leave the settings as is.

  7. Click Apply to apply changes to the configuration.

    The connector configuration is complete. You can now scan and index the documents.
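To preview what a mapping such as person@name to title might produce, the following rough sketch simulates the extraction on a small document. It is not the connector itself: the extract_metas function, the sample contacts XML, and the (meta, value) output shape are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def extract_metas(xml_text, include, exclude=()):
    """Return (meta, value) pairs.

    include: list of (element_spec, meta_or_None) pairs, where element_spec
             is 'node' or 'node@attribute'.
    exclude: node names whose subtrees are skipped.
    """
    metas = []

    def visit(node):
        if node.tag in exclude:
            return  # skip the excluded node and everything beneath it
        for spec, meta in include:
            name, sep, attr = spec.partition("@")
            if name != node.tag:
                continue
            if sep:
                # Attribute mapping: push the attribute value if present.
                if attr in node.attrib:
                    metas.append((meta or attr, node.attrib[attr]))
            elif node.text and node.text.strip():
                # Node mapping: push the node text.
                metas.append((meta or name, node.text.strip()))
        for child in node:
            visit(child)

    visit(ET.fromstring(xml_text))
    return metas


# Hypothetical contents of a file under the root path.
CONTACTS = """<contacts>
  <person name="Ada Lovelace"><email>ada@example.com</email></person>
  <person name="Alan Turing"><email>alan@example.com</email></person>
</contacts>"""

metas = extract_metas(CONTACTS, include=[("person@name", "title")])
for meta, value in metas:
    print(meta, "=", value)
```

With Exclude elements left empty, as in the procedure, every person node is visited and its name attribute is pushed under the title meta.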