<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Describing &#8216;Red&#8217; To A Blind Man: The Dilemma Of Ontology</title>
	<atom:link href="http://www.dougv.com/2008/09/03/describing-red-to-a-blind-man-the-dilemma-of-ontology/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dougv.com/2008/09/03/describing-red-to-a-blind-man-the-dilemma-of-ontology/</link>
	<description>ASP.NET, PHP, XML, JavaScript, DOM, Web geekery</description>
	<lastBuildDate>Mon, 02 Jan 2012 21:22:08 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
	<item>
		<title>By: dougv.com &#124; The Web home of Doug Vanderweide &#187; Blog Archive &#187; The Difference Between Null, Empty And Zero-Length Data / Strings</title>
		<link>http://www.dougv.com/2008/09/03/describing-red-to-a-blind-man-the-dilemma-of-ontology/#comment-154</link>
		<dc:creator>dougv.com &#124; The Web home of Doug Vanderweide &#187; Blog Archive &#187; The Difference Between Null, Empty And Zero-Length Data / Strings</dc:creator>
		<pubDate>Wed, 22 Oct 2008 19:13:57 +0000</pubDate>
		<guid isPermaLink="false">http://www.dougv.com/blog/?p=875#comment-154</guid>
		<description>[...] because humans equate nothing with zero, thanks to our ontological capacities, almost all programming languages have some construct that allows us to equate null and empty. In [...]</description>
		<content:encoded><![CDATA[<p>[...] because humans equate nothing with zero, thanks to our ontological capacities, almost all programming languages have some construct that allows us to equate null and empty. In [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Doug Vanderweide</title>
		<link>http://www.dougv.com/2008/09/03/describing-red-to-a-blind-man-the-dilemma-of-ontology/#comment-153</link>
		<dc:creator>Doug Vanderweide</dc:creator>
		<pubDate>Thu, 04 Sep 2008 13:27:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.dougv.com/blog/?p=875#comment-153</guid>
		<description>Read the entry again. My intent was to explain the programming problem and expose why it&#039;s so difficult to do what you want to do; not to provide a solution. In that sense, your follow-ups repeat my points.</description>
		<content:encoded><![CDATA[<p>Read the entry again. My intent was to explain the programming problem and expose why it&#8217;s so difficult to do what you want to do; not to provide a solution. In that sense, your follow-ups repeat my points.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt</title>
		<link>http://www.dougv.com/2008/09/03/describing-red-to-a-blind-man-the-dilemma-of-ontology/#comment-152</link>
		<dc:creator>Matt</dc:creator>
		<pubDate>Thu, 04 Sep 2008 05:39:42 +0000</pubDate>
		<guid isPermaLink="false">http://www.dougv.com/blog/?p=875#comment-152</guid>
		<description>Whoops, I didn&#039;t mean keyword &quot;banking&quot; in that post, I meant &quot;IPOD&quot;. Not sure how I made that mistake. ;)

And what I meant to say is that after indexing the top 100 documents, you&#039;d scan the content of these documents to get your relevant keywords.</description>
		<content:encoded><![CDATA[<p>Whoops, I didn&#8217;t mean keyword &#8220;banking&#8221; in that post, I meant &#8220;IPOD&#8221;. Not sure how I made that mistake. <img src='http://www.dougv.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>And what I meant to say is that after indexing the top 100 documents, you&#8217;d scan the content of these documents to get your relevant keywords.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt</title>
		<link>http://www.dougv.com/2008/09/03/describing-red-to-a-blind-man-the-dilemma-of-ontology/#comment-151</link>
		<dc:creator>Matt</dc:creator>
		<pubDate>Thu, 04 Sep 2008 05:37:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.dougv.com/blog/?p=875#comment-151</guid>
		<description>Thanks for taking the time to give your view on this puzzling question.

I agree with what you are saying. But it seems like you have taken the question and turned it around. What you have explained is taking a category and mapping it to keywords. That is not really what I wanted. Sure, it&#039;ll be hard for the system to figure out that a colloid is in milk, Pepsi, Coca-Cola, and more. It seems like you&#039;ve taken it a bit too far.

What I was trying to do is figure out a way to take a keyword, and then find the most relevant category. And that IS possible, but it does require data. I should probably have asked: what kind of data that we find on the internet can we use in order to get the most accurate categories for the largest number of keywords? Or possibly, what collection of data?

My original idea worked, sort of. And it is what you sort of explain. The idea was to use data from search engines. Take a keyword such as &quot;banking&quot; and run it across the top 100 sites listed on Google. Since Google is a good source for relevant information, you can bank on the fact that they will provide you mainly with accurate top 100 results.

By indexing all those top 100 documents, you&#039;ll then need a separate technology that will get all the related keywords to your main keyword.

So let&#039;s say that you now got your top 10 related keywords which would be something like:

Music
Mp3
Portable Device
Computers
Apple

You&#039;d be sure to get the most frequently repeated keywords / keyword phrases (and that would require yet another technology) that do not contain the word &quot;IPOD&quot;.

Now you would go back to your list of, let&#039;s say, 500 categories. Each category would need to have related keywords. So the category &quot;Music&quot; would need to have keywords that would be &quot;music,mp3,rap,hiphop,pop,rock,etc&quot;.

My idea was to now run the related keywords that you found against the related keywords of the categories. The category that has the most matches would take the spot as being the most relevant category.
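
As a rough sketch of that matching step (not a finished implementation), assuming the related keywords have already been extracted from the search results: the function name best_category and the contents of the "Computers" category are made up for illustration, and only the "Music" keyword list comes from the example above.

```python
def best_category(related_keywords, category_keywords):
    """Return the category whose keyword list shares the most terms
    with the related keywords, or None if nothing matches at all."""
    found = {kw.lower() for kw in related_keywords}
    scores = {}
    for category, keywords in category_keywords.items():
        # Score = number of category keywords that also appear
        # among the related keywords (case-insensitive).
        scores[category] = len(found.intersection(kw.lower() for kw in keywords))
    category, score = max(scores.items(), key=lambda item: item[1])
    return category if score else None


# Related keywords found for "IPOD" (from the example list above).
related = ["Music", "Mp3", "Portable Device", "Computers", "Apple"]

# Hypothetical category table; a real one would hold around 500 entries.
categories = {
    "Music": ["music", "mp3", "rap", "hiphop", "pop", "rock"],
    "Computers": ["computers", "software", "hardware"],
}

print(best_category(related, categories))
```

With this toy data, "Music" matches two keywords and "Computers" matches one, so "Music" is chosen. Ties go to whichever category comes first in the table, which is an arbitrary choice in this sketch.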

The reason I liked this system was because if you have a new product that just came out, for instance let&#039;s say IPOD was just announced yesterday - you now know that Google will definitely have it listed.

And since the categories would contain terms that could be found in the English dictionary, then you don&#039;t need to worry about having category related terms of newly released products, therefore the categories don&#039;t need updates, or possibly only a few updates every few years.

The only service you&#039;d rely on is Google to get you the recent and relevant web pages for your initial search term.

But at the end of the day, this is too hard. And once again, I fall into a problem where I will be able to map the keyword &quot;Ipod&quot; or &quot;Ipod Devices&quot; to &quot;Music&quot; category. But, once again, how do I map something to the category &quot;Recreation&quot;? It&#039;s not the same ball game again. And, how can I map something like &quot;United States&quot; to category &quot;World&quot;? I mean, yes I can map it, but that would now require me to make specific rules for countries / cities / states to map to a &quot;LOCAL&quot; category.

The question is, how can we use the data that is out there to build such a system?</description>
		<content:encoded><![CDATA[<p>Thanks for taking the time to give your view on this puzzling question.</p>
<p>I agree with what you are saying. But it seems like you have taken the question and turned it around. What you have explained is taking a category and mapping it to keywords. That is not really what I wanted. Sure, it&#8217;ll be hard for the system to figure out that a colloid is in milk, Pepsi, Coca-Cola, and more. It seems like you&#8217;ve taken it a bit too far.</p>
<p>What I was trying to do is figure out a way to take a keyword, and then find the most relevant category. And that IS possible, but it does require data. I should probably have asked: what kind of data that we find on the internet can we use in order to get the most accurate categories for the largest number of keywords? Or possibly, what collection of data?</p>
<p>My original idea worked, sort of. And it is what you sort of explain. The idea was to use data from search engines. Take a keyword such as &#8220;banking&#8221; and run it across the top 100 sites listed on Google. Since Google is a good source for relevant information, you can bank on the fact that they will provide you mainly with accurate top 100 results.</p>
<p>By indexing all those top 100 documents, you&#8217;ll then need a separate technology that will get all the related keywords to your main keyword.</p>
<p>So let&#8217;s say that you now got your top 10 related keywords which would be something like:</p>
<p>Music<br />
Mp3<br />
Portable Device<br />
Computers<br />
Apple</p>
<p>You&#8217;d be sure to get the most frequently repeated keywords / keyword phrases (and that would require yet another technology) that do not contain the word &#8220;IPOD&#8221;.</p>
<p>Now you would go back to your list of, let&#8217;s say, 500 categories. Each category would need to have related keywords. So the category &#8220;Music&#8221; would need to have keywords that would be &#8220;music,mp3,rap,hiphop,pop,rock,etc&#8221;.</p>
<p>My idea was to now run the related keywords that you found against the related keywords of the categories. The category that has the most matches would take the spot as being the most relevant category.</p>
<p>The reason I liked this system was because if you have a new product that just came out, for instance let&#8217;s say IPOD was just announced yesterday &#8211; you now know that Google will definitely have it listed.</p>
<p>And since the categories would contain terms that could be found in the English dictionary, then you don&#8217;t need to worry about having category related terms of newly released products, therefore the categories don&#8217;t need updates, or possibly only a few updates every few years.</p>
<p>The only service you&#8217;d rely on is Google to get you the recent and relevant web pages for your initial search term.</p>
<p>But at the end of the day, this is too hard. And once again, I fall into a problem where I will be able to map the keyword &#8220;Ipod&#8221; or &#8220;Ipod Devices&#8221; to &#8220;Music&#8221; category. But, once again, how do I map something to the category &#8220;Recreation&#8221;? It&#8217;s not the same ball game again. And, how can I map something like &#8220;United States&#8221; to category &#8220;World&#8221;? I mean, yes I can map it, but that would now require me to make specific rules for countries / cities / states to map to a &#8220;LOCAL&#8221; category.</p>
<p>The question is, how can we use the data that is out there to build such a system?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

