Surfing The Mega Tsunami of Web 2010 Information

Your Input Information Stream is Wide Band

To move past the limit of our maximum effective information channel, we need the assistance of advanced tools. Current filters provide some aid, but they take our time to set up and regularly deploy. In addition, our filters may match on keywords or tags that are never actually mentioned in an extremely relevant message. The challenge for developers is to leverage existing algorithms and database techniques that help us automatically sift through the ever-increasing information wave.

Essential Information

Raw volumes of text or annotated audio/video data can be summarized by category-type descriptors we call tags. Semantic services are continually improving the accuracy with which they identify the best tags for a given message. If we run all potential incoming information through advanced semantic extraction algorithms, we can generate a table or database of tags and weights (confidence levels and frequencies supply the numbers). Information can then be clustered on one or more overlapping tags, using the attached weights, where each tag represents a dimension. Some information may be clustered into several topic areas, but this depends on the assignment algorithm.
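As a rough illustration of the tag table and clustering described above, the sketch below stores each message as a dictionary of tag weights, treats every tag as a dimension, and groups messages whose tag vectors point in a similar direction. The sample messages, the weights, the `cosine` and `cluster_by_tags` helpers, and the 0.5 threshold are all assumptions made for illustration; a real semantic extraction service would supply the tags and weights, and a production system would likely use a more careful assignment algorithm.

```python
import math

# Hypothetical output of a semantic extraction step:
# message id -> {tag: weight}. All values are made up for illustration.
tag_table = {
    "msg1": {"python": 0.9, "web": 0.2},
    "msg2": {"databases": 0.9, "sql": 0.5},
    "msg3": {"python": 0.7, "databases": 0.7},
    "msg4": {"surfing": 0.9, "tsunami": 0.7},
}


def cosine(a, b):
    """Cosine similarity between two sparse tag vectors (dicts)."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_by_tags(table, threshold=0.5):
    """Greedy single-pass clustering (an assumed, simplistic scheme).

    A message joins every cluster whose seed vector it resembles, so it
    can belong to several topic areas. If it resembles none, it seeds a
    new cluster.
    """
    clusters = []  # list of (seed_vector, [message ids])
    for msg_id, vec in table.items():
        joined = False
        for seed, members in clusters:
            if cosine(vec, seed) >= threshold:
                members.append(msg_id)
                joined = True
        if not joined:
            clusters.append((vec, [msg_id]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    for members in cluster_by_tags(tag_table):
        print(members)
```

With this sample data, "msg3" carries both the "python" and "databases" tags, so it lands in two clusters, which is the overlapping behavior mentioned above. Raising the threshold makes clusters tighter and overlap rarer.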

In addition to quickly isolating the messages, statuses, blog entries and articles on your desired topics, you can isolate anomalies or completely novel information. The filtering need not be done manually, since tags and cluster data of interest can be selected from user history. Advanced settings will of course allow users to customize this two-way search procedure.
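One way to picture this two-way search is sketched below: an interest profile is built from the tags of messages the user has already read, and each incoming message is then checked both for relevance to that profile and for novelty, meaning most of its tag weight sits on tags not seen before. The data formats, the function names (`interest_profile`, `relevance`, `novelty`, `triage`) and both thresholds are hypothetical choices for illustration, standing in for the user-adjustable advanced settings.

```python
from collections import Counter


def interest_profile(history):
    """Sum tag weights over messages the user read (assumed history format:
    a list of {tag: weight} dicts)."""
    profile = Counter()
    for tags in history:
        profile.update(tags)  # adds weights tag by tag
    return profile


def relevance(tags, profile):
    """Weighted overlap between a message's tags and the user's profile."""
    return sum(w * profile.get(t, 0.0) for t, w in tags.items())


def novelty(tags, seen_tags):
    """Fraction of a message's tag weight that falls on unseen tags."""
    total = sum(tags.values())
    if total == 0:
        return 0.0
    unseen = sum(w for t, w in tags.items() if t not in seen_tags)
    return unseen / total


def triage(incoming, history, seen_tags, min_relevance=0.5, min_novelty=0.8):
    """Return (relevant ids, novel ids). Thresholds are illustrative defaults."""
    profile = interest_profile(history)
    relevant, novel = [], []
    for msg_id, tags in incoming.items():
        if relevance(tags, profile) >= min_relevance:
            relevant.append(msg_id)
        elif novelty(tags, seen_tags) >= min_novelty:
            novel.append(msg_id)
    return relevant, novel


if __name__ == "__main__":
    # Made-up sample data.
    history = [{"python": 0.9, "web": 0.3}, {"python": 0.6, "databases": 0.8}]
    seen_tags = {"python", "web", "databases", "sports", "politics"}
    incoming = {
        "a": {"python": 0.8, "databases": 0.2},  # matches the user's interests
        "b": {"sports": 0.9},                    # known topic, not of interest
        "c": {"quantum": 0.9, "web": 0.1},       # mostly unseen tags
    }
    print(triage(incoming, history, seen_tags))
```

In this sample, message "a" is selected because it matches the user's history, "c" is flagged as novel, and "b" is filtered out without any manual rule being written.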

Using a system like this, anyone should be able to handle a far greater bandwidth of incoming information. Within moments of browsing a website, one can narrow a huge array of information down to a few messages that are easily skimmed and read.

GetSimple, The Simplest Content Management System Ever (mt-soft.com.ar)
All data are not the same (mndoci.com)
The $1M Netflix Prize has been won (glinden.blogspot.com) (algorithms may be applicable in pattern matching)
Data.gov Revealed (arnoldit.com)

Victus Spiritus