Abstract—This research proposes a novel framework for clustering search results that considers efficiency and effectiveness simultaneously. Search engines are an essential tool for finding information on the Internet. Because of communication gaps between information providers, information requesters, and search engines, some relevant results may not appear at the top of the result list. According to the cluster hypothesis, documents containing similar concepts tend to match the same search requests, so clustering techniques can reorganize search results and improve performance. Traditionally, search result clustering has focused mainly on document snippets because the user’s query requires a quick response. However, snippets carry limited semantic information, which can lead to poor effectiveness. On the other hand, full-text clustering is impractical because it is very time consuming. This research integrates a batch processing phase with a real-time processing phase: batch processing achieves greater effectiveness, and real-time processing returns the clustering results to users quickly. The experiments show that the proposed method achieves search efficiency and effectiveness at the same time.
Index Terms—Search result clustering, document organization, information retrieval, semantics indexing, web search.
C. Hung and Z.-B. Wang are with Chung Yuan Christian University, Taoyuan, Taiwan (e-mail: chihli@cycu.edu.tw, bang@cycu.org.tw). P.-F. Hu and C.-Y. Yen are with SYSCOM Computer Engineering Co., Taipei, Taiwan (e-mail: Pei-fen_Hu@SYSCOM.com.tw, Jan-Chang_Yan@SYSCOM.com.tw). T.-H. Lin and L.-H. Chiang are with the Institute for Information Industry, Taipei, Taiwan (e-mail: tsunghsilin@iii.org.tw, lihaochiang@iii.org.tw).
Cite: Chihli Hung, Zhen-Bang Wang, Pei-Fen Hu, Chen-Yu Yen, Tsung-Hsi Lin, and Li-Hao Chiang, "Reorganization of Search Results Based on Semantic Clustering," International Journal of Computer Theory and Engineering, vol. 10, no. 5, pp. 152-157, 2018.