What is Exalead CloudView

Exalead CloudView allows you to exploit huge quantities of both structured and unstructured data coming from multiple data sources, and to present that data in an intuitive search interface.

The Exalead CloudView platform uses textual and semantic technologies to reconcile formats, structures and terminologies, and identify embedded meanings and relationships. All information sources are unified in a robust modular index that ensures continuous access and optimal use of server resources.

What Are the Main Parts of Exalead CloudView

At a high level, the product is made of the following parts:

  • Connectors to fetch data from data sources.

  • Indexing to process fetched documents and store them into the index.

  • Searching to define how data will be searchable and what will be displayed in the Search-Based Application.

  • Mashup UI or custom front-end applications based on search.

The following diagram summarizes the process to index and search with Exalead CloudView.

A simplified view of Exalead CloudView

#  Description

1  Connectors:

     • Access the data sources.
     • Fetch their files.
     • Convert the files into documents.
     • Send the documents to the Indexing Server through the Push API protocol.

2  During the analysis phase, the Indexing Server:

     • Receives documents.
     • Triggers their analysis sequentially, entirely in memory.
     • Runs the analyzers on each document in the analysis job to perform text extraction, semantic processing, custom operations, and mapping.

3  During the build phase, the Indexing Server analyzes pushed documents and creates a new generation of the index. It creates a set of files (tables, inverted lists, and other structures) to make the index efficient at search time.

4  Once the build phase is complete, the result is stored into the index:

     • The data computed for analysis is merged with the current version of the index.
     • Once done, the index is committed and updated. The new documents are available for search.

5  The Search Server interprets user queries and processes each one according to a specific search logic.

6  Users enter search queries and view search results either in the Mashup UI (the default search application) or in a custom search application relying on the Exalead CloudView Search API.
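The six steps can be pictured with a small, self-contained Python sketch. It is a toy model only, not Exalead CloudView code: the `Document` and `ToyIndex` classes and the `connector`, `analyze`, `build`, `commit`, and `search` functions are hypothetical names chosen for illustration. Analysis is reduced to lowercase tokenization, and the search logic is assumed to be "every query term must match".

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a document pushed by a connector."""
    uri: str
    text: str


@dataclass
class ToyIndex:
    """Toy index: one inverted list per token, plus a generation counter."""
    generation: int = 0
    postings: dict = field(default_factory=dict)  # token -> set of URIs


def connector(files: dict) -> list:
    """Step 1: fetch raw files and convert them into documents."""
    return [Document(uri=name, text=content) for name, content in files.items()]


def analyze(doc: Document) -> tuple:
    """Step 2: in-memory analysis, reduced here to lowercase tokenization."""
    return doc.uri, doc.text.lower().split()


def build(analyzed: list) -> dict:
    """Step 3: build the inverted lists for a new index generation."""
    postings = {}
    for uri, tokens in analyzed:
        for token in tokens:
            postings.setdefault(token, set()).add(uri)
    return postings


def commit(index: ToyIndex, new_postings: dict) -> None:
    """Step 4: merge the new data into the current index, then commit."""
    for token, uris in new_postings.items():
        index.postings.setdefault(token, set()).update(uris)
    index.generation += 1


def search(index: ToyIndex, query: str) -> list:
    """Step 5: assumed search logic where every query term must match."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = set.intersection(*(index.postings.get(t, set()) for t in terms))
    return sorted(hits)


index = ToyIndex()
docs = connector({"a.txt": "Quarterly sales report", "b.txt": "Sales forecast"})
commit(index, build([analyze(d) for d in docs]))

# Step 6: a front end would render these hits.
print(search(index, "sales"))         # ['a.txt', 'b.txt']
print(search(index, "sales report"))  # ['a.txt']
```

Documents only become searchable after `commit`, which mirrors the statement that new documents are available once the index is committed and updated.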

Exalead CloudView Software Interfaces

Main Software Interfaces

Exalead CloudView includes the following main software interfaces:

  • Administration Console

  • Mashup UI

  • Mashup Builder

  • Business Console

  • Monitoring Console

  • API Console

For more details on how to display these software interfaces, see Access the Configuration Interfaces.

Software Extensions

The following software interface extensions are available as options. Contact your NETVIBES representative for your licensing needs.

  • Mashup Builder Premium

  • Content Recommender extends the Business Console with a business logic recommendation engine that recommends results depending on the search query. For example, it can recommend home cinemas when the user searches for TV sets.

Add-ons

Add-ons are optional components extending the capabilities of Exalead CloudView. You must install them in the INSTALLDIR. They require additional installation steps and a product restart to be functional. Exalead CloudView add-ons include:

  • Extended Languages add-on extends semantic analysis for a wider variety of languages.

  • Russian Lemmatization add-on allows Exalead CloudView to lemmatize Russian. Because this language contains many inflections and therefore requires a large resource, the add-on is not shipped by default.

Exalead CloudView Terminology

  • Connectors provide access to your data sources (files, records), convert their content into Exalead CloudView documents, and then send these documents to Exalead CloudView for indexing. Connectors use the Push API (PAPI), a simple HTTP API, to feed the index with documents. Each connector relies on the data source's native protocol to connect to its information source.

  • Convert allows Exalead CloudView to read documents in various formats (such as PDF, XML, or Microsoft Word). It receives documents from connectors, extracts text and field information from them, and passes that information along for indexing and storage in the index.

  • Corpus refers to the collection of documents, coming from one or several data sources, that needs to be indexed.

  • Documents can be defined as all the objects to be indexed by Exalead CloudView, regardless of file or entity type in the data source. For example, HTML, JPG, or CSV files and database records are all considered documents within Exalead CloudView, since they are all converted into an Exalead CloudView-specific document format (also known as a PAPI document) after being scanned by a connector.

  • Document metas, not to be confused with hit metas, are pieces of text belonging to a document that have associated values, such as title or size. Document metas are stored either as an index field or as a category. Context is sometimes used as a synonym for document meta.

  • Dictionary is a structure, separate from the index, that stores all the words from indexed documents, plus their number of occurrences in the corpus. It is used for linguistic expansion mechanisms such as spell-checking or regular expression matching.

  • Facets are used to narrow search results. Use them to drill down into an area, such as language, author, or file type. They are typically used in dashboard analytics widgets, or in the Refinements panel for enterprise search.

  • Hit metas, not to be confused with document metas, are used to display one or more retrievable index fields in the hit content of search results.

  • Index is an efficient structure used by Exalead CloudView to store information about the items it has analyzed. When users issue search queries, Exalead CloudView quickly and easily finds the results within this structure.

  • The Exalead CloudView index is divided into fields:

    • Each field has a type: alphanumeric, numeric, hierarchical categories, geographic, and so forth.

    • Each field can be defined as:

      • Searchable, which means that user search queries can be applied to this field.

      • Retrievable, which means that the field can be displayed in the search results.

  • Queries are the search requests sent to the Exalead CloudView search engine and processed according to a specified search logic.

  • Thumbnails are small image previews for documents, which can be displayed in the search results. They are computed at search time and kept in the browser cache for one week.
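Several of the terms above (searchable and retrievable fields, facets, and the dictionary) can be illustrated with a short Python sketch. This is a toy model under stated assumptions, not the product's configuration schema or API: `FieldConfig`, `FIELDS`, `CORPUS`, and the helper functions are hypothetical names, and the dictionary is assumed here to be built from the searchable fields only.

```python
from collections import Counter
from dataclasses import dataclass
from difflib import get_close_matches


@dataclass(frozen=True)
class FieldConfig:
    """Hypothetical per-field settings (illustration only)."""
    searchable: bool   # user queries can be applied to this field
    retrievable: bool  # the field can be displayed in search results


FIELDS = {
    "title": FieldConfig(searchable=True, retrievable=True),
    "text": FieldConfig(searchable=True, retrievable=False),
    "language": FieldConfig(searchable=False, retrievable=True),
}

CORPUS = [
    {"title": "Engine manual", "text": "turbine maintenance schedule", "language": "en"},
    {"title": "Manuel moteur", "text": "entretien de la turbine", "language": "fr"},
    {"title": "Safety notice", "text": "turbine inspection", "language": "en"},
]


def tokens(value: str) -> list:
    return value.lower().split()


def search(term: str) -> list:
    """Match against searchable fields only; return retrievable fields only."""
    hits = []
    for doc in CORPUS:
        matched = any(
            term.lower() in tokens(doc[name])
            for name, cfg in FIELDS.items()
            if cfg.searchable
        )
        if matched:
            hits.append({n: doc[n] for n, cfg in FIELDS.items() if cfg.retrievable})
    return hits


def facet_counts(hits: list, field_name: str) -> Counter:
    """Facet: count hits per value so the user can drill down."""
    return Counter(hit[field_name] for hit in hits)


def refine(hits: list, field_name: str, value: str) -> list:
    """Narrow the results to one facet value."""
    return [hit for hit in hits if hit[field_name] == value]


# Dictionary: every word with its number of occurrences in the corpus.
DICTIONARY = Counter(
    word
    for doc in CORPUS
    for name, cfg in FIELDS.items()
    if cfg.searchable
    for word in tokens(doc[name])
)


def suggest(word: str) -> list:
    """Spell-check sketch: closest dictionary word to a mistyped term."""
    return get_close_matches(word.lower(), list(DICTIONARY), n=1)


hits = search("turbine")
print(facet_counts(hits, "language"))  # en: 2, fr: 1
print(refine(hits, "language", "fr"))  # the French hit, without its "text" field
print(search("en"))                    # [] because "language" is not searchable
print(suggest("turbnie"))              # ['turbine']
```

In this sketch, "text" can be queried but never appears in a hit, while "language" appears in hits and drives the facet but cannot be queried, which is the distinction the Searchable and Retrievable settings express.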