Abstract—In a traditional information retrieval system, documents are ranked by their relevance to the search query, and that relevance is computed entirely from the text content of the document. On the web, however, the sheer number of pages means that a search returns a very large result set, and these pages must be ranked effectively in order of their relevance to the query. The link information of the pages plays an important role in ranking them. Several link analysis ranking (LAR) algorithms have been proposed for this purpose, including Kleinberg's HITS algorithm, Lempel and Moran's SALSA algorithm, the BFS algorithm, and many modified variants. All of these algorithms have limitations, which show that a ranking algorithm cannot rely solely on link information but must also examine the text content of the linked pages to avoid the difficulties observed with existing LAR algorithms. In this paper, we study the rank scores that different link analysis ranking algorithms compute for web pages and propose a new ranking approach that analyzes the content of the linked pages when computing the rank score of the target web page.
Index Terms—Backward links, Forward links, Information Retrieval, Link Structure Analysis, Web page ranking.
P.C. Saxena is with the School of Computer and Systems Sciences, JNU, New Delhi, India (e-mail: firstname.lastname@example.org).
J.P. Gupta is with the Institute of Information Technology, JIIT University, Noida, Uttar Pradesh, India (e-mail: email@example.com).
Namita Gupta is with the Maharaja Agrasen Institute of Technology, GGSIPU, New Delhi, India, IACSIT membership No. 80333020 (e-mail: firstname.lastname@example.org).
Cite: P. C. Saxena, J. P. Gupta, Namita Gupta, "Web Page Ranking Based on Text Content of Linked Pages," International Journal of Computer Theory and Engineering, vol. 2, no. 1, pp. 42-51, 2010.