Search result clustering and an operating system's cluster are not the same thing. In search, clustering is the grouping of results from one domain under a single listing on a search engine results page. This prevents multiple pages from one site from appearing too many times in the results. There is often a 'More results from this site' link that shows the remaining pages.
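The grouping described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `cluster_by_domain`, the `max_per_domain` limit, and the sample URLs are hypothetical and do not describe how any particular search engine implements it.

```python
from collections import OrderedDict
from urllib.parse import urlparse


def cluster_by_domain(urls, max_per_domain=2):
    """Group ranked result URLs by host, keeping first-seen order.

    Returns a list of (host, shown, hidden_count) tuples. `shown` holds at
    most `max_per_domain` URLs; `hidden_count` is how many further results
    would sit behind a 'More results from this site' link.
    The limit of 2 is an assumption chosen for illustration.
    """
    groups = OrderedDict()
    for url in urls:
        host = urlparse(url).netloc.lower()
        groups.setdefault(host, []).append(url)
    return [
        (host, hits[:max_per_domain], max(0, len(hits) - max_per_domain))
        for host, hits in groups.items()
    ]


# Hypothetical ranked result list.
results = [
    "https://example.com/a",
    "https://example.org/x",
    "https://example.com/b",
    "https://example.com/c",
]
clustered = cluster_by_domain(results)
for host, shown, hidden in clustered:
    print(host, shown, f"(+{hidden} more)" if hidden else "")
```

Here the three example.com pages collapse into one listing that shows two of them, and the third is counted as hidden behind the 'More results' link.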
Most search engines use the Search Result Clustering (SRC) technique to present search results. SRC clusters a search engine's results on the fly into different groups and provides meaningful, readable names for these groups. It changes the traditional linear list of results into a non-linear representation, which makes browsing easier for the user.
Traditional clustering techniques do not work well for this problem because the documents are short, the cluster names must be readable, and the algorithm must be efficient enough for on-the-fly calculation. Our method approaches the problem differently and overcomes the difficulties of traditional clustering methods. Basically, we first identify salient topics by identifying distinct and independent keywords, and then classify the search results into these topics.
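The two steps just described (pick salient keywords, then classify results into them) might look roughly like the sketch below. The text does not say how keywords are scored, so everything specific here is an assumption made for illustration: the small stopword list, ranking by how many snippets contain a word, and treating two keywords as "independent" when the Jaccard overlap of their snippet sets is at most `max_overlap`. It is not the method itself.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption, not a real resource).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "to", "is", "on", "with"}


def tokenize(text):
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def salient_keywords(snippets, query_terms, k=3, max_overlap=0.5):
    # Map each candidate word to the set of snippet indices containing it.
    postings = defaultdict(set)
    for i, snippet in enumerate(snippets):
        for word in set(tokenize(snippet)):
            if word not in query_terms:
                postings[word].add(i)
    # Rank by snippet frequency; ties are broken alphabetically.
    ranked = sorted(postings, key=lambda w: (-len(postings[w]), w))
    chosen = []
    for word in ranked:
        if len(postings[word]) < 2 or len(chosen) == k:
            break  # remaining words are too rare, or we have enough topics
        # Keep the word only if it is sufficiently independent of earlier picks.
        if all(
            len(postings[word] & postings[c]) / len(postings[word] | postings[c])
            <= max_overlap
            for c in chosen
        ):
            chosen.append(word)
    return chosen, postings


def classify(snippets, keywords, postings):
    # Each keyword names a topic; a snippet joins every topic whose keyword it contains.
    clusters = {kw: sorted(postings[kw]) for kw in keywords}
    covered = set().union(*clusters.values())
    clusters["(other)"] = [i for i in range(len(snippets)) if i not in covered]
    return clusters


# Hypothetical snippets for the query "jaguar".
snippets = [
    "Jaguar car dealership and prices",
    "Jaguar the big cat in the rainforest",
    "New Jaguar car review",
    "Jaguar cat habitat and diet",
    "Jaguar operating system release",
]
keywords, postings = salient_keywords(snippets, query_terms={"jaguar"})
clusters = classify(snippets, keywords, postings)
print(keywords)
print(clusters)
```

Because the keywords themselves serve as cluster names, the labels stay readable, and the work is a single pass over short snippets plus a sort, which suits on-the-fly use.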