
The not-so-itsy-bitsy-spider: 80Legs

September 23rd, 2009 | 1 Comment | Posted in News

Wanted to drop a quick post in honor of today’s public launch of 80Legs at the DEMO09 conference.

As followers of TechCrunch, Web2.0, SemanticWeb, GigaOM, and DEMO09 already know, 80Legs is a web crawling and online content analysis service that gives users access to more than 50,000 computers, which together can crawl as many as 2 billion web pages per day.

Need to take home a slice of the Web?  80Legs makes it easy.  Point your browser to their portal, specify your seed list (and some crawl preferences), and away you go!  It’s really that simple.  (Here’s a screenshot of their dashboard for a job we ran earlier today.)

[Screenshot: 80Legs dashboard for a crawl job]

Sounds good?  It gets better.  80Legs makes all of this computational power affordable as well.  How affordable?  How about $2 (yes, really $2!) per million pages crawled and $0.03 per CPU-hr used.  Don’t be afraid to do the math:  a 5 million page crawl costs just a little more than $12 ($10 for the pages, plus the CPU time).
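For anyone who does want to do the math, here is a minimal sketch of the pricing arithmetic in Python. The two rates come straight from the paragraph above; the function name `crawl_cost` and the 70 CPU-hour figure are assumptions made up for illustration, since the post does not say how much CPU time a 5 million page crawl actually uses.

```python
# Rates quoted in the post.
PRICE_PER_MILLION_PAGES = 2.00  # dollars per million pages crawled
PRICE_PER_CPU_HOUR = 0.03       # dollars per CPU-hour used


def crawl_cost(pages: int, cpu_hours: float) -> float:
    """Estimate the cost in dollars of one crawl (hypothetical helper)."""
    page_cost = pages / 1_000_000 * PRICE_PER_MILLION_PAGES
    cpu_cost = cpu_hours * PRICE_PER_CPU_HOUR
    return round(page_cost + cpu_cost, 2)


# 70 CPU-hours is an assumed figure, not a number published by 80Legs.
print(crawl_cost(5_000_000, 70))
```

Under that assumption the pages account for $10 of the bill and the CPU time for the rest, which is roughly where the "little more than $12" figure lands.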

But wait, there’s more!  80Legs is so much more than a crawling platform.  It’s also the ideal foundation for a new generation of semantically-aware content processing apps.  80Legs makes it possible for users to upload small content apps (usually < 20 MB) that can be run on each downloaded page.

Here’s where Swingly comes in.  Want to limit your crawls to pages in French?  Or docs that mention relief pitchers?  Or pages that discuss how people feel about a particular product or service?  As Steve Jobs might say, there’s gonna be an app for that.
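To make the "run an app on each downloaded page" idea concrete, here is a toy per-page filter in Python. It is only a sketch: the function name `mentions_any` is hypothetical, it does not use the 80Legs app interface (which this post does not describe), and a plain phrase match like this is far simpler than the semantic apps discussed here.

```python
import re


def mentions_any(page_text: str, terms: list[str]) -> bool:
    """Return True if the page contains any term as a whole word or phrase.

    Hypothetical helper for illustration; not part of any 80Legs or Swingly API.
    """
    for term in terms:
        # re.escape keeps punctuation in a term from being read as regex syntax.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, page_text, flags=re.IGNORECASE):
            return True
    return False


# Keep only pages that mention relief pitchers.
page = "The relief pitcher closed out the ninth inning."
print(mentions_any(page, ["relief pitcher", "relief pitchers"]))
```

A crawl would call a function like this once per downloaded page and keep only the pages for which it returns True.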

Earlier this year, Swingly started building natural language processing apps that can run on top of an 80Legs crawl, using technology licensed from its parent company, Language Computer Corporation.

These include:

  • Language Detection: Don’t read Dutch?  Don’t worry.  We won’t let you crawl those pages.
  • Semantic Crawling:  Tired of keyword searches?  So are we.  With Swingly’s semantic crawling service, you specify a seed list of concepts, not just words.  We’ll run all the queries you need to get every last drop of relevant content, regardless of what keywords you tried out.
  • Named Entity Recognition: Want to find pages that include only certain kinds of names?  Swingly’s named entity recognition service makes it possible to track down names from more than 2500 different semantic categories, ranging from startup companies to bands to stock ticker symbols to financial institutions to lawn mowers.
  • Sentiment Analysis:  Want to know what people really think about people, products, or services?  Swingly’s sentiment analysis apps analyze crawled pages for reviews, opinions, and other kinds of subjective attitudes related to a certain category.  Unlike other sentiment apps, this app actually discovers the attributes associated with a category and tells you exactly what people liked (and didn’t like) about it!

These apps are currently running on the 80Legs platform.  In fact, the Swingly sentiment analysis app made its debut at today’s DEMO09 conference!  All of these apps (plus a couple more that we’ve got under development now) will be made available to the general public when 80Legs launches its App Store later this fall.

Can’t wait?  Want a sneak peek at one (or all) of the Swingly 80Apps?  Email me at andy@swingly.com!

Unexpectedly Clean Slate

August 4th, 2009 | 1 Comment | Posted in News

Oops. Corrupted my WordPress build this afternoon while trying to update to a new version.  Will be restoring my old posts in the days ahead.