Configuring the XML Element Processor

This processor selects nodes (or attributes) by name and pushes their values as metas. The connector selects nodes and attributes according to an include list and an exclude list.

The following rules apply:

  • Children of included nodes are pushed as separate metas.

  • Children of excluded nodes, and their attributes, are ignored completely.
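
The two rules above can be sketched in a few lines of Python. This is only an illustration, not the connector's implementation: the function name select_metas is hypothetical, and the sketch assumes that each child of an included node is pushed under its own element name and that the exclude list takes precedence over the include list. Attributes are left out to keep the sketch short.

```python
import xml.etree.ElementTree as ET


def select_metas(xml_text, include, exclude):
    """Return (meta, value) pairs chosen by the include/exclude rules (sketch)."""
    metas = []

    def walk(node, inside_included):
        if node.tag in exclude:
            # Excluded node: skip it together with all of its children.
            return
        selected = inside_included or node.tag in include
        if selected:
            text = (node.text or "").strip()
            if text:
                # Each selected node is pushed as its own meta.
                metas.append((node.tag, text))
        for child in node:
            walk(child, selected)

    walk(ET.fromstring(xml_text), False)
    return metas


if __name__ == "__main__":
    sample = (
        "<book><title>Dune</title>"
        "<author><name>Herbert</name></author>"
        "<notes><draft>x</draft></notes>"
        "<price>9</price></book>"
    )
    print(select_metas(sample, include={"title", "author"}, exclude={"notes"}))
```

In this sample, name is pushed because its parent author is included, draft is skipped because its parent notes is excluded, and price is skipped because it appears in neither list.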

To configure the processor:

  1. Expand XML element processor and click Add item.
  2. Select Concatenate text element to prevent the text from being split when the XML code contains predefined character entities (such as &amp;, &gt;, and &quot;).

    For example, if you index the tag <THEME_FR>Global Procurement &amp; Supply chain</THEME_FR> without the Concatenate text element option, you get three separate metadata values instead of a single one (THEME_FR = Global Procurement &amp; Supply chain):

    • THEME_FR = Global Procurement
    • THEME_FR = &amp;
    • THEME_FR = Supply chain

  3. For Processor’s id, specify the identifier of the processor that must process documents. This id must be identical to the Entry point id.
  4. In Include elements, specify element mappings:
    1. Element: Name of the XML element to extract. To extract an attribute instead, use the form node@attribute. For example: book@title.
    2. Meta: Name of the meta that receives the node content. If you do not specify a meta, the connector infers the meta name from the element or attribute name. For example, if Element is set to book, the node text is pushed to the book meta; if Element is set to book@title, the attribute text is pushed to the title meta.
  5. Optionally, in Exclude elements, you can specify a list of node names that must be skipped by the processor.
  6. Click Apply.
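
As a rough illustration of steps 2 and 4, the following Python sketch uses the standard xml.sax module to apply element mappings and to concatenate text chunks. It is not the connector's code: parse_mapping, ElementMetaHandler, and extract_metas are hypothetical names, and the sketch assumes a mapping is a tuple of (element,) or (element, meta). An event-based parser may deliver the text of one element in several chunks, for example around an entity such as &amp;, which is why the chunks are joined before the meta is pushed.

```python
import xml.sax


def parse_mapping(element, meta=None):
    """Split 'node' or 'node@attribute'; infer the meta name when none is given."""
    node, _, attribute = element.partition("@")
    return node, attribute or None, meta or attribute or node


class ElementMetaHandler(xml.sax.ContentHandler):
    """Collects (meta, value) pairs for the mapped elements and attributes."""

    def __init__(self, mappings, concatenate=True):
        super().__init__()
        self.mappings = [parse_mapping(*m) for m in mappings]
        self.concatenate = concatenate
        self.metas = []
        self._capture = None  # (element name, meta name) while capturing text
        self._chunks = []

    def startElement(self, name, attrs):
        for node, attribute, meta in self.mappings:
            if node != name:
                continue
            if attribute is None:
                if self._capture is None:
                    self._capture, self._chunks = (name, meta), []
            elif attribute in attrs:
                self.metas.append((meta, attrs[attribute]))

    def characters(self, content):
        # The parser may call this several times for a single text node.
        if self._capture is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if self._capture is None or self._capture[0] != name:
            return
        meta = self._capture[1]
        if self.concatenate:
            values = ["".join(self._chunks).strip()]
        else:
            values = [chunk.strip() for chunk in self._chunks]
        self.metas.extend((meta, value) for value in values if value)
        self._capture = None


def extract_metas(xml_text, mappings, concatenate=True):
    handler = ElementMetaHandler(mappings, concatenate)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.metas


if __name__ == "__main__":
    sample = (
        '<catalog><book title="Dune">'
        "<THEME_FR>Global Procurement &amp; Supply chain</THEME_FR>"
        "</book></catalog>"
    )
    print(extract_metas(sample, [("book@title",), ("THEME_FR",)]))
```

With concatenate=False, the number of values pushed for THEME_FR depends on how the underlying parser chunks the text, so the sketch makes no promise about that case. For simplicity, text from nested child elements is folded into the captured element's value.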