Everyone wants to build a semantic Web. Most people have the wrong idea of how to do it.
You can't build a software program that adequately mimics and understands the rules of human grammar. Not even Google can do that. Look at its problems in translating one language to another.
But you don't have to. We talk about the "cloud" of the internet, an amorphous virtual computer that sits on the fringe of our local networks, waiting to do our work for us. The great thing about the cloud is that it's smart. It's smarter than any one individual.
The internet is not just a string of connected computers. It is also people. People create links, travel to some sites more than others, refer interesting things to friends, comment on things they've seen, chat with each other. By tapping into the patterns of human activity on the internet, we tap into human intelligence.
Google first did this with the PageRank algorithm. By studying links people made to web sites, it was able to make a good evaluation of the relevance of those sites. It tapped into the human intelligence lying within the internet.
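To make the idea concrete, here is a minimal sketch of link-based ranking in the spirit of the published PageRank idea. It is not Google's production code. The `pagerank` function, the damping value of 0.85, the iteration count, and the four-page `example_links` graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance from who links to whom.

    `links` maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with equal rank everywhere.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: three pages link to "c", and "c" links to "a".
example_links = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(example_links)
best_page = max(ranks, key=ranks.get)
```

In this toy graph, "c" comes out on top because the most people chose to link to it. No part of the code tries to understand what any page says.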
Google does not reveal what new algorithms it has created to determine relevancy. But you can be certain it has gone beyond PageRank. It can track our own traffic patterns: when we click on certain search results, Google gathers data about what's relevant to us. We leave a trail behind every time we surf the internet, and there are many trails Google can follow to determine what's relevant.
Amazon watches purchase patterns to determine what other products might be of interest to us. If we buy one thing, it can tell us what others buying that product have also bought. It taps into the human intelligence of the buying patterns.
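The "others who bought this also bought" idea can be sketched with nothing more than counting. This is an illustration only, not Amazon's actual system. The function names `also_bought` and `recommend` and the sample orders are made up for the example.

```python
from collections import Counter, defaultdict
from itertools import permutations


def also_bought(orders):
    """Count how often each pair of products appears in the same order."""
    co_counts = defaultdict(Counter)
    for order in orders:
        # set() ignores duplicate items within a single order.
        for item, other in permutations(set(order), 2):
            co_counts[item][other] += 1
    return co_counts


def recommend(co_counts, item, k=3):
    """Return up to k products most often bought together with `item`."""
    counts = co_counts.get(item, Counter())
    return [other for other, _ in counts.most_common(k)]


# Hypothetical purchase history.
orders = [
    ["tent", "sleeping bag", "lantern"],
    ["tent", "sleeping bag"],
    ["tent", "stove"],
    ["lantern", "batteries"],
]
co_counts = also_bought(orders)
top_for_tent = recommend(co_counts, "tent", k=1)
```

Here the top suggestion for a tent buyer is a sleeping bag, simply because two earlier tent buyers made that choice. The program knows nothing about camping.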
I did some consulting last year for a startup called Pluggd, which helps people find videos of interest to them. Aside from watching people's patterns of viewing videos, it also gathers data by crawling the text-based internet. That reveals what people are talking about at any given time, helping the Pluggd search engine figure out exactly what it is you intend to search on.
This technique is called "association rule mining," and it happens to be an area Sergey Brin experimented with for a while at Stanford.
The company told me that the associations it finds can be surprising. When people were searching on "injury" at one point last year, the search results started coming up heavily weighted toward the terms "strained rotator cuff," "torn ligament," "sprained knee," and "stress fracture." It was the height of the NFL season and fantasy football was a hot topic on the internet.
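A rough sketch of how such associations can fall out of the data: treat each crawled page as a bag of terms, then measure how often terms show up together. This is a simplified, pairs-only take on association rule mining and not a description of Pluggd's engine. The `mine_rules` function, the thresholds, and the sample `pages` are all invented for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Find "if a page mentions X, it tends to mention Y" rules.

    Returns tuples of (lhs, rhs, support, confidence, lift).
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        # Support: the fraction of pages that contain both terms.
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # Confidence: of the pages with lhs, how many also have rhs.
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                # Lift above 1 means the pair co-occurs more than chance.
                lift = confidence / (item_counts[rhs] / n)
                rules.append((lhs, rhs, support, confidence, lift))
    return sorted(rules, key=lambda rule: rule[3], reverse=True)


# Hypothetical terms extracted from six crawled pages.
pages = [
    {"injury", "torn ligament", "nfl"},
    {"injury", "sprained knee", "nfl"},
    {"injury", "nfl", "fantasy football"},
    {"injury", "workplace"},
    {"nfl", "fantasy football"},
    {"recipes", "pasta"},
]
rules = mine_rules(pages)
```

On this made-up sample, the rule linking "injury" to "nfl" clears both thresholds with a lift above 1. That is the kind of signal a search engine could use to weight "injury" results toward football during the season.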
You don't have to create a computer algorithm that tries to mimic the human brain's ability to understand grammar. By analyzing the human traffic in the cloud, you can make associations that tell you what people are trying to say.
The brain is a mysterious computer, impossible to mimic by any computer today. But human wisdom is already captured in the cloud. You simply have to come up with the right ways to tap into it.