Feed Fetcher Connector

Introducing the Feed Fetcher Connector

This section assumes that the connector has already been added.

The General configuration is the same as for the Crawler connector. For more information, see Configuring the Crawler. In the following use case, we crawl two feeds and refresh them every hour.

Configure the Connector

In the General pane, use the default options.

In the URLs pane, define the 2 RSS feeds to crawl.


Parameter	Setting
Feed	Enter the URLs of the feeds. For example, we can crawl the following 2 feeds: `http://www.example.com/feed1` and `http://www.example.com/feed2`.
Refresh period(s)	Specify how often the feeds are refreshed. For example, 3600 seconds (one hour).
Actions	Choose how to crawl the feeds. Index feed and article content: indexes both what is displayed in the feed and the content of the article (by following the link). Index feed content only: indexes only what is displayed in the feed, not the content of the article. Index metas quickly: indexes metas first when crawling the feed, and then again when crawling the entire article. This can be useful when you want to index article titles and summaries very quickly. The drawbacks are that your index will contain both partial and complete articles, and that you may have to configure your Mashup UI to avoid displaying empty hit contents (showing titles and summaries only).
Extract image/video links	In addition to the media enclosures extracted by default, looks for links to images and links to YouTube or Dailymotion in the content of the feed items.
Priority	If you define several RSS feeds, you can sort their priority from the Priority select box. This changes the priority of URLs matching the pattern. See How Priorities Work.
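
To make the Actions and Extract image/video links settings more concrete, the following Python sketch shows what a feed item typically carries. It is an illustration only, not the connector's implementation: the sample feed, the `parse_feed_items` function, and the `IMG_SRC` pattern are hypothetical, and the sketch relies only on the standard RSS 2.0 elements `title`, `link`, `description`, and `enclosure`.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample feed, for illustration only.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First article</title>
      <link>http://www.example.com/articles/1</link>
      <description>Summary with &lt;img src="http://www.example.com/img/1.png"&gt; inside.</description>
      <enclosure url="http://www.example.com/media/1.mp3" type="audio/mpeg" length="1234"/>
    </item>
  </channel>
</rss>"""

# Simplified pattern for image links embedded in the item content (assumption).
IMG_SRC = re.compile(r'<img[^>]+src="([^"]+)"')


def parse_feed_items(feed_xml):
    """Return one dict per feed item with the data visible in the feed itself."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        description = item.findtext("description", default="")
        items.append({
            # Title and summary: what is displayed in the feed.
            "title": item.findtext("title", default=""),
            "summary": description,
            # Link: the URL to follow to reach the full article content.
            "link": item.findtext("link", default=""),
            # Media enclosures declared explicitly on the item.
            "enclosures": [e.get("url") for e in item.findall("enclosure")],
            # Extra image links found inside the item content.
            "image_links": IMG_SRC.findall(description),
        })
    return items


if __name__ == "__main__":
    for entry in parse_feed_items(SAMPLE_FEED):
        print(entry["title"], entry["link"], entry["enclosures"], entry["image_links"])
```

In this sketch, the title, summary, and enclosures correspond to what is displayed in the feed, while `link` is the URL that would be followed to index the article content.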

Click Apply.
On the Home page, click Start crawl.
The crawl of the 2 RSS feeds begins.
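
As a rough illustration of what a 3600-second refresh period means for the two example feeds, the sketch below decides which feeds are due for a new fetch. It is a minimal sketch under stated assumptions, not the connector's scheduler: the `feeds_due` function and the timestamp dictionary are hypothetical names introduced for this example.

```python
REFRESH_PERIOD_SECONDS = 3600  # one hour, the example value used in this section


def feeds_due(last_fetch_by_url, now, period=REFRESH_PERIOD_SECONDS):
    """Return the feed URLs whose last fetch is at least `period` seconds old.

    `last_fetch_by_url` maps each feed URL to the time of its last fetch,
    expressed in seconds on the same clock as `now` (hypothetical structure).
    """
    return [
        url
        for url, last_fetch in last_fetch_by_url.items()
        if now - last_fetch >= period
    ]


if __name__ == "__main__":
    last_fetch = {
        "http://www.example.com/feed1": 0,     # fetched at t = 0 s
        "http://www.example.com/feed2": 1800,  # fetched at t = 1800 s
    }
    # One hour after the first fetch, only feed1 has reached its refresh period.
    print(feeds_due(last_fetch, now=3600))
```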