The FeastieBot

What is the FeastieBot?

The FeastieBot is the web-crawling bot (also known as a "spider") for Feastie.com. Crawling food blogs and sites is how we find new recipe pages to add to our search index.

The FeastieBot starts with a list of webpage URLs for known food blogs and sites. When visiting each page, it finds links to other pages to crawl, and detects whether or not the page contains a recipe. If the page contains a recipe, it adds a link to the page to our search index along with a thumbnail image, title, and a set of recipe-specific keywords including the ingredients. The FeastieBot can detect recipes, ingredients, and other recipe-specific keywords without requiring any special formats or codes such as rich snippets or hRecipe.
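
The crawl loop described above can be sketched roughly as follows. This is only an illustration, not the FeastieBot's actual code: the `crawl` and `looks_like_recipe` functions, the `fetch` callback, and the keyword heuristic are all assumptions made for the example, and the real recipe detector is not described on this page.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def looks_like_recipe(html):
    # Hypothetical placeholder heuristic, NOT the real detector:
    # a page "contains a recipe" if it mentions both words below.
    text = html.lower()
    return "ingredients" in text and "directions" in text


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl from seed_urls.

    fetch(url) is a caller-supplied function (an assumption of this
    sketch) that returns the page HTML as a string, or None on failure.
    Returns the URLs of pages that look like recipes, in visit order.
    """
    queue = deque(seed_urls)
    seen = set(seed_urls)
    recipe_urls = []
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        html = fetch(url)
        visited += 1
        if html is None:
            continue
        if looks_like_recipe(html):
            recipe_urls.append(url)
        # Find links to other pages and queue the ones not seen yet.
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            absolute = urljoin(url, href)
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return recipe_urls
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library; a real crawler would also extract the thumbnail, title, and keywords for each recipe page.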


How your site appears in Feastie

Search result snippets from Feastie include:

  1. The title of the webpage or recipe.
  2. The name of the source.
  3. A thumbnail image.

When a user clicks on a snippet, we show the original webpage with our tools in a sidebar so that the user can add the ingredients to their shopping list. We make sure it is clear that the recipe comes from the original site owner, and that the sidebar is easy to remove so the user can view the original webpage without it. With or without the sidebar, the visit counts as traffic for the site owner.

Making sure your site is crawlable

While the FeastieBot works automatically for the vast majority of food blogs and sites, it has many of the same limitations that other search engine spiders have. Here are some guidelines for ensuring that the FeastieBot can index your site:

  1. Don't overuse JavaScript. The FeastieBot cannot read text that has been rendered by JavaScript. If you have a Blogspot blog, this includes Dynamic Views. It also includes the Jux platform.
  2. Don't put the text of your recipe in an image.
  3. Make sure your recipe pages are linked to from other pages and that the links are viewable with JavaScript disabled.
  4. Make sure you haven't inadvertently blocked the FeastieBot by using a script or plugin on your site such as the BadBehavior WordPress plugin.
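
One rough way to check the first guideline is to look at the raw HTML of a recipe page and see whether key phrases appear outside of script tags. The sketch below assumes you have already saved the page source as a string; the `missing_without_javascript` function and the sample phrases are made up for illustration and are not part of any Feastie tool.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text that is present in the HTML without running scripts."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore anything inside <script> or <style>.
        if self._skip_depth == 0:
            self.chunks.append(data)


def missing_without_javascript(html, phrases):
    """Return the phrases that a non-JavaScript crawler would not see."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse whitespace so phrases match across line breaks.
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [p for p in phrases if p.lower() not in text]
```

If the function returns a non-empty list for phrases you can see in your browser, that text is probably being rendered by JavaScript or embedded in an image.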

Preventing the FeastieBot from crawling your site

If you would prefer not to have the FeastieBot indexing your site, add the following lines to your robots.txt file:

User-agent: FeastieBot
Disallow: /

See www.robotstxt.org for more information on configuring robots.txt.
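
If you want to confirm that the rules behave as intended before publishing them, Python's standard `urllib.robotparser` module can evaluate them locally. This is a minimal sketch: the `is_allowed` helper and the example.com URLs are placeholders for illustration, and the standard-library parser may not match the FeastieBot's own interpretation in every edge case.

```python
from urllib.robotparser import RobotFileParser

# The two lines shown above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: FeastieBot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("FeastieBot", "SomeOtherBot"):
        allowed = is_allowed(ROBOTS_TXT, agent, "http://example.com/recipes/soup")
        print(agent, "allowed" if allowed else "blocked")
```

With these rules, only the FeastieBot is blocked; other crawlers are unaffected because no rule names them.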

If you have a WordPress blog, another option is to install the BadBehavior plugin and configure it to block the FeastieBot.

Want to know more about the FeastieBot?

Follow @FeastieBot on Twitter.