<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>AndyHickl.com &#187; semantic web</title>
	<atom:link href="http://andyhickl.com/tag/semantic-web/feed/" rel="self" type="application/rss+xml" />
	<link>http://andyhickl.com</link>
	<description>building the next big thing down in big d</description>
	<lastBuildDate>Tue, 09 Mar 2010 20:36:04 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Quick Q&amp;A on Extractiv</title>
		<link>http://andyhickl.com/2010/01/31/292/</link>
		<comments>http://andyhickl.com/2010/01/31/292/#comments</comments>
		<pubDate>Mon, 01 Feb 2010 03:47:17 +0000</pubDate>
		<dc:creator>andy</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[extractiv]]></category>
		<category><![CDATA[Quick Extractiv Q&A]]></category>
		<category><![CDATA[semantic web]]></category>

		<guid isPermaLink="false">http://andyhickl.com/2010/01/31/292/</guid>
		<description><![CDATA[

I had so much fun writing up my answers to Mark Johnson&#8217;s panel questions that I thought I&#8217;d put together another &#8220;mock&#8221; interview &#8212; with myself.
This time, I&#8217;m going to be tackling some of the more popular questions we get regarding Extractiv. As a brand-new start-up (only about 8 weeks old), we&#8217;re still finding our [...]]]></description>
			<content:encoded><![CDATA[
<p style="text-align: center;"><a href="http://andyhickl.com/wp-content/uploads/2010/01/still.jpg"><img class="aligncenter size-full wp-image-294" title="still" src="http://andyhickl.com/wp-content/uploads/2010/01/still.jpg" alt="" /></a></p>
<p>I had so much fun writing up my answers to <strong>Mark Johnson</strong>&#8217;s panel questions that I thought I&#8217;d put together another &#8220;mock&#8221; interview &#8212; with myself.</p>
<p>This time, I&#8217;m going to be tackling some of the more popular questions we get regarding <strong>Extractiv</strong>. As a brand-new start-up (only about 8 weeks old), we&#8217;re still finding our strengths, but I thought it&#8217;d be safe to share a little more about who we are &#8212; and what we&#8217;re trying to do under the <strong>Extractiv</strong> name. Want to know more? Write us at <a href="mailto:support@extractiv.com">support@extractiv.com</a>; we&#8217;d be happy to answer any questions you might have (or to show you a demo)!</p>
<p>(As always, the views expressed on this blog are mine, and do not necessarily reflect the views of Language Computer or Extractiv or its subsidiaries or parent companies. Well, until we get the Extractiv Blog put together and start blogging there in earnest, that is.)</p>
<p><em>Interview after the jump&#8230;</em></p>
<p><span id="more-292"></span></p>
<p><strong>Andy Hickl: What is Extractiv?</strong></p>
<p><strong>Extractiv</strong> is a new content provisioning service that helps consumers &#8220;make sense&#8221; of large amounts of unstructured text. We use natural language processing &#8212; in conjunction with one of the world&#8217;s best distributed computing platforms &#8212; in order to turn text into structured data that can be used in a variety of apps, such as sentiment tracking or semantic search.</p>
<p><strong>AH: Why did you build Extractiv? Why now?</strong></p>
<p>We&#8217;re building Extractiv because we wanted to give consumers a better way to access all of the knowledge that&#8217;s available on the Web.</p>
<p><strong>AH: Okay, so you&#8217;re all about getting knowledge from the Web. Isn&#8217;t that what search engines do?</strong></p>
<p>Well, yes and no.</p>
<p>Search engines are great ways to get your hands on lots of relevant content related to a keyword query. Want 10 million pages on Labrador Retrievers? Or all the Tweets talking about the Grammy awards? We&#8217;d recommend you use a search engine.</p>
<p>But search engines can only take you so far. Let&#8217;s say you want a list of all of the men who have ever won a Grammy award. (That&#8217;s a pretty disparate group, mind you: one that includes <strong>Bill Clinton</strong> as well as <strong>George Clinton</strong>.) Sorry to say, but search &#8212; even semantic search &#8212; ain&#8217;t going to help you much here. If you speak SPARQL, you can try to pull the knowledge out of a pre-compiled, hand-vetted knowledge repository like <strong>NNDB</strong> or <strong>DBpedia</strong>. If you don&#8217;t? You&#8217;re left hoping that the Grammys compiled <a href="http://www2.grammy.com/GRAMMY_Awards/Winners/">a list that you can use</a>.</p>
<p>Most of the time, however, the knowledge you want won&#8217;t have been compiled into a single, handy-dandy list. What do you do if you want the list of people who have been killed at U.S. sporting events since 1925? Or the comprehensive list of people who have been killed by Somali pirates? Well, before Extractiv, you had to:</p>
<ol>
<li>Search the Web.</li>
<li>Download lots and lots of documents.</li>
<li>Start reading.</li>
</ol>
<p><strong>AH: Okay, that&#8217;s not much fun. But how does Extractiv help?</strong></p>
<p>Instead of simply searching the Web for pages that might (or might not) be relevant to your query, Extractiv goes one step further and actually <em>extracts</em> the exact piece of knowledge you&#8217;re looking for.</p>
<p>Simply put, we turn a bit of text like this:</p>
<blockquote style="margin-right: 0px;" dir="ltr"><p>An unlikely nominee, Clinton won his second consecutive nod for music&#8217;s top awards in the best spoken word album category for the recording of his best-selling autobiography &#8220;My Life.&#8221; <strong>Earlier this year, the former leader of the free world won a golden gramophone statuette for lending his voice to the spoken word recording of Russian folk tale of &#8220;Peter and the Wolf.&#8221;</strong></p></blockquote>
<p>into a structured record like this:</p>
<blockquote style="margin-right: 0px;" dir="ltr"><p>GRAMMY WINNER: Bill Clinton, 2004, spoken word, &#8220;Peter and the Wolf&#8221;</p></blockquote>
<p>where <strong>Bill Clinton</strong> refers to the name of the winner, <strong>2004</strong> refers to the year he won, and so on.</p>
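<p>To picture what a record like that could look like once it lands in an app, here is a minimal Python sketch. The class name <code>GrammyWin</code>, its field names, and the <code>parse_record</code> helper are all hypothetical, made up for illustration; this post does not describe Extractiv&#8217;s actual output format. The sketch only parses the one-line record shown above. It does not do any extraction from raw text.</p>

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class GrammyWin:
    """Hypothetical record type; the field names are illustrative assumptions."""
    winner: str    # name of the winner
    year: int      # year given in the record
    category: str  # award category
    work: str      # the recording that won


def parse_record(line):
    """Parse a line shaped like: GRAMMY WINNER: name, year, category, "work"."""
    label, _, payload = line.partition(":")
    if label.strip() != "GRAMMY WINNER":
        raise ValueError("unexpected record label: " + label.strip())
    # Split on the first three commas only, so a title containing commas stays whole.
    winner, year, category, work = [part.strip() for part in payload.split(",", 3)]
    return GrammyWin(winner, int(year), category, work.strip('"'))


record = parse_record('GRAMMY WINNER: Bill Clinton, 2004, spoken word, "Peter and the Wolf"')
print(asdict(record))
```

<p>Once the fields are typed and named, questions like &#8220;every winner in the spoken word category&#8221; become a simple filter over records instead of a reading assignment.</p>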
<p>But we don&#8217;t do that just for one bit of text: we do it for the millions of pages we encounter on a Web crawl. Extractiv&#8217;s unique distributed computing platform makes it possible for us to crawl &#8212; and extract content from &#8212; zillions of pages <em>at the same time</em>. (Our performance is pretty unbeatable, too: we&#8217;re currently able to download and extract content from 1 million pages in just under an hour.)</p>
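<p>For a rough sense of scale, here is the back-of-the-envelope arithmetic behind that figure. This is plain division on the number quoted above, not a measurement. Since the run takes just under an hour, the result is a lower bound on the average rate.</p>

```python
PAGES = 1_000_000           # figure quoted in the paragraph above
SECONDS_PER_HOUR = 60 * 60  # a full hour; the actual run is slightly shorter

# Average pages downloaded and processed per second, as a lower bound.
pages_per_second = PAGES / SECONDS_PER_HOUR
print(f"at least {pages_per_second:.1f} pages per second")
```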
<p><strong>AH: Whoa. But what kinds of content can I extract? I&#8217;m not exactly interested in male Grammy winners, you know.</strong></p>
<p>What, you&#8217;re not? That&#8217;s okay. We aren&#8217;t either.</p>
<p>Extractiv currently offers more content extractors than any other provider, including more than 10,000 different types of named entities, along with hundreds of facts, attributes, relationships, and events.</p>
<p>We also have the ability to create custom extractors for practically any content type imaginable. Want a list of all of the IED bombings in Iraq since 2008? We can do that. Want a list of sex scandals involving U.S. politicians? We can do that, too.</p>
<p><strong>AH: Who&#8217;s behind Extractiv?</strong></p>
<p>Extractiv&#8217;s a joint venture between two companies: <strong>80Legs</strong> and <strong>Language Computer</strong>. It&#8217;s really a great match. 80Legs offers the world&#8217;s first truly scalable web crawling platform, while Language Computer provides some of the world&#8217;s best &#8212; and most scalable &#8212; natural language processing tools.</p>
<p><strong>AH: Are you based in the Bay Area?</strong></p>
<p>No, we&#8217;re 100% Texan. (And darned proud of it.) <strong>Language Computer</strong> is based in Dallas. <strong>80Legs</strong> is out of Houston.</p>
<p><strong>AH: What products do you offer?</strong></p>
<p>We&#8217;re currently in alpha with two products: a content extraction service and a sentiment tracking service. Both are available for demos. Just shoot us an email at <a href="mailto:support@extractiv.com">support@extractiv.com</a>, and we&#8217;ll show you what we can do.</p>
]]></content:encoded>
			<wfw:commentRss>http://andyhickl.com/2010/01/31/292/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
