
Recap: Future of Semantic Search Panel @ Web 3.0

January 31st, 2010 | 3 Comments | Posted in Blog

I had the good fortune on Thursday to be a part of a panel on semantic search at the Web 3.0 Conference. The panel was organized by Mark Johnson (Bing/Powerset) and featured the likes of Connie Kenneally (TextWise), Will Hunsinger (Evri), Tim Musgrove (TextDigger), and yours truly (LCC, Swingly, Extractiv, etc.).

Mark put on an absolutely great panel. In addition to being one of the most knowledgeable people in our industry, he’s a natural-born moderator and a talented discussion leader. He’s got great journalistic chops too: definitely not one to shy away from asking the tough questions.

Since I wasn’t able to capture video of the panel, I thought I’d try to recreate my side of the discussion. Here are some of the questions that Mark asked — and the gist of the answers I gave. (Or would have given.)

More after the jump…


Don’t Miss: “The Evolution of Semantic Search” @ Web 3.0

January 27th, 2010 | No Comments | Posted in Blog

The Evolution of Semantic Search

The potential for semantic search to take on the role of an all-purpose engine is dead. Building a search engine is just too expensive: a massive capital expenditure, a huge team, and a marketing campaign to hook users are beyond the reach of most companies, let alone a startup. And the big players are already integrating more and more semantic technology, as Microsoft’s acquisition of Powerset and Yahoo’s SearchMonkey initiative show. That being said, there are still many ways for semantic technology to provide value to smaller domains in search. It’s time we refined our notion of semantic search and discussed what’s next for semantic search startups.

ANDY HICKL
CEO
Swingly

WILL HUNSINGER
CEO
Evri

MARK JOHNSON (Moderator)
Senior Program Manager
Bing at Microsoft

CONNIE KENNEALLY
CEO
TextWise

I’m part of an excellent panel (organized by Mark Johnson of Powerset/Bing fame) this morning at the Web 3.0 Conference in Santa Clara.

We’re slated to tackle the question of “what’s next” for semantic search: a worthy topic, indeed!

But, I have the feeling that we’ll all be circling back to the more vexing problem of exactly how companies who have invested in semantic technologies can create real (sustainable, sexy, growing) markets for their products.

There’s no live feed, but I’ll get shakycam video up later this afternoon.



WWJS?

September 29th, 2009 | No Comments | Posted in Blog

google

That’s right.  Those in the know search Swingly.


More than a Feeling

September 24th, 2009 | 3 Comments | Posted in Blog

Really tickled with all of the much-deserved positive press that 80Legs has attracted after yesterday’s successful public launch at the DEMO09 conference.  (See here, here, and here.  And here.  And here.)  We couldn’t be happier for Shion and Brad and the rest of the 80Legs team.  They’ve got a great product, and they’re well-positioned to really dominate the crawling market.  (Oh, and they’re from Texas Rice, which is a good thing in my book.)  Congrats, guys!

Over the next couple of weeks, I’ll be talking a lot about the different kinds of semantic apps that Swingly (and its parent company, Language Computer) have built to run on the 80Legs platform. We’re psyched about combining Swingly’s broad-coverage semantic apps with the massive amounts of data that 80Legs provides.   It’s a pretty unbeatable combination:  80Legs helps you cast a broad net, while Swingly lets you know exactly what you caught.

While I don’t want to steal any of the 80Legs spotlight, I couldn’t resist telling you a little about the Swingly sentiment analysis app (code name:  Positively) that Shion used during his DEMO pitch yesterday.

Like a lot of other sentiment analysis services (such as those provided by ScoutLabs, Jodange, Evri, NStein, or Crimson Hexagon — just to name a few), Positively was designed to help users discover what people think about pretty much any person, product, organization, or service imaginable.

Want to know what people think about the Neill Blomkamp flick, District 9?  Lots of sentiment analysis apps can boil down an Internet’s worth of noise to a summary score like this:

[Image: summary sentiment score for District 9]

and a list of comments (usually tagged as positive or negative) like this:

  • The movie looks great to begin with and this trailer re-enforces we’ll likely get a solid, if not great film out of it. [1]
  • “District 9” seems an oddly misguided sci-fi movie. [2]
  • It definitely has the goods: an interesting concept, Blomkamp’s clever filmmaking (the movie begins as a faux-documentary and gradually shifts into a survival tale) and ambitions that far exceed the Hollywood norm. [3]

Positively is different from most sentiment analysis apps in two ways.

First, unlike many other services which rely on large amounts of preprocessed data, Positively runs “live” as part of an 80Legs crawl.  Instead of indexing data after it’s become stale, Positively analyzes the sentiments in pages as they’re downloaded.  No indexing, no large-scale distributed processing.  No headaches.  Oh, and you can’t get fresher semantic content.

Second, Positively knows that sometimes you need more than a number.   As it crawls, Positively automatically discovers attributes associated with each of the people, products, or services it’s investigating — and then figures out what people think about each of those attributes.  Interested in District 9?  You might be interested in its:

  • plot
  • actors
  • humor
  • cameos
  • visual effects

Want to track down information on AT&T cell phones?  You might want to know about their:

  • battery life
  • reception
  • chargers
  • apps
  • features
  • size
  • display

No, these attributes don’t come from some big, pre-cooked list of things that might (or might not) be relevant for each product category.   In order to discover why people feel the way they do, Positively hunts for each of the attributes associated with an item — and then discovers what people actually think about that attribute.  Here are a couple of  examples for District 9:

[Image: attribute-level sentiment examples for District 9]
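
For the curious, here is a toy sketch of what "sentiment per attribute" means in practice. To be clear, this is not how Positively works: the tiny word lists, the `attribute_sentiment` function, and the trick of treating any word that shares a sentence with an opinion word as a candidate attribute are all assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Tiny illustrative lexicons (assumed); a real system needs far more than this.
POSITIVE = {"great", "clever", "solid", "stunning"}
NEGATIVE = {"misguided", "weak", "dull", "boring"}
STOPWORDS = {"the", "a", "an", "is", "was", "are", "were",
             "and", "but", "it", "its", "of", "in"}


def attribute_sentiment(sentences):
    """Tally +1 or -1 for each content word that shares a sentence
    with opinion words, based on that sentence's overall polarity."""
    scores = defaultdict(int)
    for sentence in sentences:
        words = re.findall(r"[a-z]+", sentence.lower())
        # Net polarity: positive hits minus negative hits.
        polarity = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
        if polarity == 0:
            continue  # no clear opinion in this sentence
        for w in words:
            if w in POSITIVE or w in NEGATIVE or w in STOPWORDS:
                continue
            scores[w] += 1 if polarity > 0 else -1
    return dict(scores)


reviews = [
    "The plot is clever.",
    "The actors were great.",
    "The humor was dull.",
]
print(attribute_sentiment(reviews))
```

Nothing here comes from a pre-cooked attribute list: "plot", "actors", and "humor" surface only because they appear next to opinion words. A real system would need proper parsing and much better lexicons to get anywhere near useful.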

Despite its successful launch yesterday, Positively won’t be available to the general public until 80Legs goes live with its App Store later this Fall.  We are, however, giving sneak peeks.  Want one?  Email me at andy@swingly.com.

Oh, and there’s Boston after the jump!


The not-so-itsy-bitsy-spider: 80Legs

September 23rd, 2009 | 1 Comment | Posted in News

Wanted to drop a quick post in honor of today’s public launch of 80Legs at the DEMO09 conference.

As followers of TechCrunch, Web2.0, SemanticWeb, GigaOM, and DEMO09 already know, 80Legs is a web crawling and online content analysis service which offers users access to more than 50,000 computers which can crawl as many as 2 billion web pages per day.

Need to take home a slice of the Web?  80Legs makes it easy.  Point your browser to their portal, specify your seed list (and some crawl preferences), and away you go!  It’s really that simple.  (Here’s a screenshot of their dashboard for a job we ran earlier today.)

[Image: screenshot of the 80Legs dashboard]

Sounds good?  It gets better. 80Legs makes all of this computational power affordable as well.  How affordable?  How about $2 (yes, really $2!) per million pages  crawled and $0.03 per CPU-hr used.   Don’t be afraid to do the math:  a 5 million page crawl costs just a little more than $12.
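
If you'd like to do that math yourself, here is a small sketch. The two prices come straight from the paragraph above; the `crawl_cost` function and the 70 CPU-hour figure are assumptions for illustration, since the CPU time a 5 million page crawl actually uses isn't stated here.

```python
PRICE_PER_MILLION_PAGES = 2.00  # USD, as quoted above
PRICE_PER_CPU_HOUR = 0.03       # USD, as quoted above


def crawl_cost(pages: int, cpu_hours: float) -> float:
    """Estimated cost of a crawl in USD, rounded to cents."""
    page_cost = pages / 1_000_000 * PRICE_PER_MILLION_PAGES
    cpu_cost = cpu_hours * PRICE_PER_CPU_HOUR
    return round(page_cost + cpu_cost, 2)


# 70 CPU-hours is an assumed figure, not a number published by 80Legs.
print(crawl_cost(5_000_000, 70))
```

The pages alone come to $10, so the "little more than $12" total implies roughly $2 of CPU time, or somewhere around 70 CPU-hours at $0.03 each.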

But wait, there’s more!  80Legs is so much more than a crawling platform.  It’s also the ideal foundation for a new generation of semantically-aware content processing apps.  80Legs makes it possible for users to upload small content apps (usually < 20 MB) that can be run on each downloaded page.

Here’s where Swingly comes in.  Want to limit your crawls to pages in French?  Or docs that mention relief pitchers?  Or pages that discuss how people feel about a particular product or service?  As Steve Jobs might say, there’s gonna be an app for that.
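
To make the "app that runs on each downloaded page" idea concrete, here is a minimal sketch of a page filter. Everything in it is hypothetical: the `Page` record, `run_app`, and the crude `looks_french` heuristic are illustrations only, not the 80Legs API and not Swingly's language detector.

```python
from typing import Callable, Iterable, Iterator

Page = dict  # hypothetical page record: {"url": ..., "text": ...}

# A handful of common French function words (assumed, far from complete).
FRENCH_HINTS = {"le", "la", "les", "des", "et", "est", "une", "dans"}


def looks_french(page: Page, threshold: float = 0.2) -> bool:
    """Crude check: is a large enough share of the words a French hint word?"""
    words = page["text"].lower().split()
    if not words:
        return False
    hits = sum(w in FRENCH_HINTS for w in words)
    return hits / len(words) >= threshold


def run_app(pages: Iterable[Page], keep: Callable[[Page], bool]) -> Iterator[Page]:
    """Apply a per-page app to each page as it arrives; yield the ones it keeps."""
    for page in pages:
        if keep(page):
            yield page


crawl = [
    {"url": "http://example.com/fr", "text": "le chat est dans la maison"},
    {"url": "http://example.com/en", "text": "the cat is in the house"},
]
kept = [p["url"] for p in run_app(crawl, looks_french)]
print(kept)
```

Because `run_app` is a generator, each page is examined as it comes in rather than after the whole crawl has been stored, which is the appeal of running apps inside the crawl.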

Swingly started building natural language processing apps that can run on top of an 80Legs crawl earlier this year using technology licensed from its parent company, Language Computer Corporation.

These include:

  • Language Detection: Don’t read Dutch?  Don’t worry.  We won’t let you crawl those pages.
  • Semantic Crawling:  Tired of keyword searches?  So are we.  With Swingly’s semantic crawling service, you specify a seed list of concepts, not just words.  We’ll run all the queries you need to get every last drop of relevant content, regardless of what keywords you tried out.
  • Named Entity Recognition: Want to find pages that include only certain kinds of names?  Swingly’s named entity recognition service makes it possible to track down names from more than 2500 different semantic categories, ranging from startup companies to bands to stock ticker symbols to financial institutions to lawn mowers.
  • Sentiment Analysis:  Want to know what people really think about people, products, or services?  Swingly’s sentiment analysis apps analyze crawled pages for reviews, opinions, and other kinds of subjective attitudes related to a certain category. Unlike other sentiment apps, this app actually discovers the attributes associated with a category, and tells you exactly what people liked (and didn’t like) about it!

These apps are now running on the 80Legs platform. In fact, the Swingly sentiment analysis app made its debut at today’s DEMO09 conference!  All of these apps (plus a couple more that we’ve got under development now) will be made available to the general public when 80Legs launches its App Store later this Fall.

Can’t wait?  Want a sneak peek at one (or all) of the Swingly 80Apps?  Email me at andy@swingly.com!


Avoiding Search Overload

August 5th, 2009 | No Comments | Posted in Blog

Like you, we’ve heard a lot this summer about the challenges facing America:  the financial crisis, healthcare reform, and worst of all:  search overload.

Well, here at Swingly HQ, we’ve been doing our part.  We’ve been trying to find new ways to figure out what kinds of information are most relevant to a particular search topic.

While relevance modeling isn’t exactly new, it’s becoming an increasingly important problem for semantic search applications.   Information Extraction apps are rapidly increasing the amount of factual information that’s available from the Internet.  That’s good.  Unfortunately, instead of being buried under mountains of irrelevant information, we’re now being overwhelmed with gigabytes of factual information which may (or may not) be exactly what we’re looking for.  That’s bad.

So, what’s a new semantic search app to do?  Full details after the jump.
