Diffbot offers a set of APIs built on a visual robot that tracks and analyzes Web pages for developers. The technology uses artificial intelligence, visual machine learning and natural language processing to understand Web pages much the way human beings do; it can then extract relevant objects from the page for developers to use in their software applications.
We first worked with Diffbot in August 2011, helping them announce themselves to the world. Since then, they’ve focused on refining the technology and hiring new people. Diffbot has figured out that there are roughly 18 types of pages on the Web. By and large, Web pages fall into one of these “page types,” which include articles, social networking profiles, product pages and recipes. In my humble opinion, this technology has a chance to seriously affect how everyone interacts with the Web.
They’ve announced APIs for two page types: Articles and Front Pages. In May 2012, they announced a $2M round of funding, which they’ll use to hire more awesome engineers and to teach Diffbot to understand more page types. The money came from some heavy-duty technologists, including Sky Dayton (founder of EarthLink and Boingo Wireless) and Andy Bechtolsheim (co-founder of Sun Microsystems), along with many other notables.