Extracting Using the XSLT Method

This processor performs XSL transformations on XML documents and passes the result to the extraction methods.


Before you begin:

In the top pane, configure the Stylesheet parameter to define the stylesheet file path. The connector transforms XML files according to this stylesheet.

You can then use one of the following extraction methods:

  • PAPI document - if the transformation returns XML in the following PAPI_document format, the corresponding metas are pushed automatically:

    <PAPI_document>
      <PAPI_meta name="name">value</PAPI_meta> 
      <!-- ... -->
    </PAPI_document>

    For example:

    <PAPI_document>
      <PAPI_meta name="text">Sally M.</PAPI_meta>
      <PAPI_meta name="phone">555 1234</PAPI_meta>
    </PAPI_document>

  • XPath expressions - if XPath after XSLT is selected, extraction is performed on the XSLT result instead of the original document or chunk.

  • Element name - if Elements extraction after XSLT is selected, extraction by element name is performed on the XSLT result instead of the original document or chunk.
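
As an illustration of the PAPI document method, the following is a minimal sketch of a stylesheet that produces the PAPI_document format shown above. It assumes a hypothetical input in which each document or chunk has a contact root element with name and phone child elements. These element names are not taken from the product and must be adapted to your own XML.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Assumption: the input root element is <contact>, with <name> and
       <phone> children. Replace these names with those of your data. -->
  <xsl:template match="/contact">
    <PAPI_document>
      <!-- Each PAPI_meta element becomes one meta: name attribute plus value. -->
      <PAPI_meta name="text"><xsl:value-of select="name"/></PAPI_meta>
      <PAPI_meta name="phone"><xsl:value-of select="phone"/></PAPI_meta>
    </PAPI_document>
  </xsl:template>

</xsl:stylesheet>
```

With an input such as <contact><name>Sally M.</name><phone>555 1234</phone></contact>, a standard XSLT 1.0 processor would produce the PAPI_document of the earlier example, with the text and phone metas.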

See Also
Property Descriptions

Context: In this example, the connector transforms the XML file and then extracts metas from the result using the PAPI document method.

  1. In the global configuration pane, for Stylesheet, enter the stylesheet to use for transformation.

    For example, /data/<USER>/xml-sample-data/contacts.xsl

  2. Expand Root Paths, and click Add item to add a path.
  3. In Root path, enter the file or folder path to index.

    For example, /data/<USER>/xml-sample-data/contacts

  4. Click Add item to add more paths to the list of Root Paths.
  5. Click Apply.