UC-3: Consolidating Information on a View Document

When flattening data, it is also useful to build the most complete "View" document possible, so that a single document can answer global queries.

In the coffee sample, we might want to search for countries that are ICO members and have import trade records, and then keep only those whose global trade volume is above a specific threshold. We also want to add the coffee varieties sold by producing countries.

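This task shows you how to:

  1. Check existing data
  2. Add trade info on countries
  3. Scan the source connector and check what is indexed
  4. Add new categories on countries
  5. Rescan source connectors and check what is indexed
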
Before you begin: This use case assumes that you have completed the previous use cases.

Step 1 - Check Existing Data

A sample application is provided. To access its front page:

  1. Open the Mashup UI application: http://<HOSTNAME>:<BASEPORT>/mashup-ui/page/searchcountry_v1

    Countries are displayed with their ICO status, and "yes" flags show whether they have associated PDF files (UC-1).

  2. Click the see details link of a country to get a 360° view of all known data for that country.



Step 2 - Add Trade Info on Countries

This procedure describes how to calculate, for each country, the quantity of coffee imported in the most recent year and the average quantity of coffee imported over time.

  1. Add an aggregation processor:
    1. Select Groovy as format
    2. For Name, enter Countries_UC_3_1
    3. Click Accept
  2. Replace the default code with the following:

    // Process nodes having the "country" type
    process("country") {
     // Goal: be able to sort countries based on import trade activity
     def year = 0;        // most recent year with an import trade
     def lastVolume = 0;  // volume imported during that year
     def nbTrade = 0;     // number of import trades found
    
     // Running total of import volumes (BigInteger)
     def totalVolume = 0G;
     // Get import trade only, using the path label, i.e., "import"
     for (path in match(it, "import[trade]")) {
       // If a valid path is found, retrieve its last element
       def last = path.last();
       log.info "trade found: " + last.getUri();
    
       // Read the trade metas, treating missing values as 0
       def tradeYear = last.metas.getValue("year")?.toInteger() ?: 0;
       def volume = last.metas.getValue("volume")?.toInteger() ?: 0;
    
       // Keep the trade volume of the most recent year
       if (tradeYear > year) {
         year = tradeYear;
         lastVolume = volume;
       }
    
       // Add volume to calculate the total import trade volume
       totalVolume += volume;
       nbTrade++;
     }
    
     // Add metas to countries having import trade
     if (nbTrade != 0) {
       it.metas.import_lastvolume = lastVolume;
       it.metas.import_lastyear = year;
       // Average volume, rounded up to the next integer
       it.metas.import_averagevolume = Math.ceil(totalVolume / nbTrade).intValue();
     }
    }

  3. Save and apply the configuration.
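
To make the calculation in this step easier to follow, here is a minimal standalone sketch that runs outside the product. It is not the processor itself: the `trades` list and its values are hypothetical sample data, and each map stands in for the metas of one import trade node. It shows the intended result, that is, the volume of the most recent year and the average volume rounded up.

```groovy
// Hypothetical sample data: one map per import trade node (year and volume metas)
def trades = [
    [year: 2009, volume: 120],
    [year: 2011, volume: 200],
    [year: 2010, volume: 150],
]

int lastYear = 0      // most recent year seen so far
int lastVolume = 0    // volume that belongs to that year
BigInteger total = 0G // running total of all volumes

for (trade in trades) {
    // Keep the volume of the most recent year, whatever the iteration order
    if (trade.year > lastYear) {
        lastYear = trade.year
        lastVolume = trade.volume
    }
    total += trade.volume
}

// 470 / 3 is not a whole number, so the average is rounded up
int average = Math.ceil(total / trades.size()).intValue()

println "last year: ${lastYear}, last volume: ${lastVolume}, average: ${average}"
```

With this sample data, the most recent year is 2011 even though it is not the last element of the list, which is why the year and its volume are updated together.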

Step 3 - Scan the Source Connector and Check What Is Indexed

  1. Go to the Home page.
  2. Click Force aggregation, and enter country as type.
  3. Open the following Mashup UI application search page: http://<HOSTNAME>:<BASEPORT>/mashup-ui/page/searchcountry_v2
  4. Check that countries now have the following metas: Last import year, Last import volume, Average import volume.
  5. You can now use the average import volume as a search criterion. For example, sort by Avg import volume.



Step 4 - Add New Categories on Countries

Define the Connector for the Prices Source

  1. In the Administration Console, go to Index > Connectors and click Add connector.
    1. In Name, enter prices.
    2. For Type, select the JDBC connector.
    3. For Push to PAPI server, select the Consolidation server cbx0 instance.
    4. Click Accept.
  2. For Store documents in data model class, choose the price class.
  3. In Connection parameters:
    1. For Driver, enter org.sqlite.JDBC
    2. For Connection string, enter jdbc:sqlite://<INPUTDIR>/coffee.db
    3. Click Test connection. The database connector automatically connects to the database.
  4. In Query parameters:
    1. For Synchronization mode, select Full synchronization
    2. For Initial query, enter select country_id, coffee_type, year, price from price
  5. Click Retrieve fields.
  6. Define the coffee_type, country_id, and year fields as primary keys.
    1. Click the coffee_type field to expand it.
    2. Select Use as primary key.
    3. Repeat the operation for the country_id and year fields.
  7. Click Apply.
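
The reason for using three fields as primary keys can be illustrated with a small standalone sketch. The rows below are hypothetical and only assume that a country can have several price records, for different coffee types and years. Under that assumption, only the combination of coffee_type, country_id, and year identifies a record uniquely.

```groovy
// Hypothetical rows, shaped like the result of the initial query
def rows = [
    [country_id: "BR", coffee_type: "Arabica", year: 2009, price: 1.10],
    [country_id: "BR", coffee_type: "Arabica", year: 2010, price: 1.25],
    [country_id: "BR", coffee_type: "Robusta", year: 2010, price: 0.80],
    [country_id: "VN", coffee_type: "Robusta", year: 2010, price: 0.78],
]

// Build the key of a row from the given list of field names
def keyOf = { row, fields -> fields.collect { row[it] } }

def fullKeys = rows.collect { keyOf(it, ["coffee_type", "country_id", "year"]) }
def partialKeys = rows.collect { keyOf(it, ["country_id"]) }

// The composite key is distinct for every row; country_id alone is not
println "distinct composite keys: ${fullKeys.toSet().size()} for ${rows.size()} rows"
println "distinct country_id keys: ${partialKeys.toSet().size()} for ${rows.size()} rows"
```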

Configure the Transformation Processor

  1. Go to Index > Consolidation
  2. Add a new transformation processor:
    1. Select Groovy as format
    2. For Name, enter Prices
    3. Click Accept
  3. For Source connector, select prices
  4. Replace the default code with the following:

    // Process all nodes
    process("") {
     // Link price records to nodes having the "country" type
     it.addArcTo("producedBy", "country_id=" + it.metas.getValue("country_id") + "&");
    }
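
The second argument of the arc is a plain string built from the country_id meta. The following standalone sketch shows only that string concatenation. The `countryUri` helper and the sample values are hypothetical, and the sketch assumes that this string has to match the identifier of the target country node exactly, including the trailing `&`.

```groovy
// Hypothetical helper mirroring the concatenation used in the processor
String countryUri(Object countryId) {
    "country_id=" + countryId + "&"
}

// Hypothetical metas of one price record
def priceMetas = [country_id: "BR", coffee_type: "Arabica", year: "2010", price: "1.25"]

def target = countryUri(priceMetas.country_id)
println target

// A missing country_id does not fail: it produces a target that matches no country
println countryUri(null)
```

Because a missing meta yields the literal text `null` in the target, it may be worth checking that country_id is present before adding the arc.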

Configure the Aggregation Processor

  1. Add an aggregation processor:
    1. Select Groovy as format
    2. For Name, enter Countries_UC_3_2
    3. Click Accept
  2. Replace the default code with the following:

    // Process nodes having the "country" type
    process("country") {
     // Add all trade types on countries
     if (match(it, "import[trade]")) {
       it.metas.tradetype.add("import");
     }
    
     if (match(it, "export[trade]")) {
       it.metas.tradetype.add("export");
     }
    
     if (match(it, "reExport[trade]")) {
       it.metas.tradetype.add("reExport");
     }
    
     // Add all coffee types to producing countries
     it.metas.coffeetype +=
        // Get all paths to price nodes, then fetch the last node of each path
        match(it, "-producedBy[price]")*.last()
           // Retrieve the coffee_type meta value of each price node
           .collect{n-> n.metas.getValue("coffee_type") }
           .unique() // Dedup collected meta values
        // Or, if multi-valued: .collect{n-> n.metas.coffee_type}.flatten().unique()
    }

  3. Save and apply the configuration.
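
The coffee type collection relies on standard Groovy list operations: the spread operator (`*.`), `collect`, `flatten`, and `unique`. The standalone sketch below applies them to hypothetical data, where each path is a list of nodes and each node is a simple map. It only illustrates the list handling, not the product's `match` function.

```groovy
// Hypothetical stand-ins: each path goes from a country node to a price node
def paths = [
    [[type: "country"], [type: "price", metas: [coffee_type: "Arabica"]]],
    [[type: "country"], [type: "price", metas: [coffee_type: "Robusta"]]],
    [[type: "country"], [type: "price", metas: [coffee_type: "Arabica"]]],
]

// Last node of each path, then its coffee_type, then deduplication
def coffeeTypes = paths*.last()
    .collect { n -> n.metas.coffee_type }
    .unique()
println coffeeTypes

// Multi-valued variant: each node holds a list of values, so flatten before deduplicating
def multiValued = [
    [metas: [coffee_type: ["Arabica", "Robusta"]]],
    [metas: [coffee_type: ["Robusta"]]],
]
def flattened = multiValued
    .collect { n -> n.metas.coffee_type }
    .flatten()
    .unique()
println flattened
```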

Step 5 - Rescan Source Connectors and Check What Is Indexed

  1. Go to the Home page and, under the connectors list, click Scan for the country JDBC connector and for the prices JDBC connector.
  2. Open the Mashup UI application search page: http://<HOSTNAME>:<BASEPORT>/mashup-ui/page/searchcountry_v3
  3. Check that countries now have the following facets: Country Trade type and Country Coffee types (the metas created previously in the aggregation processor are mapped to these facets).