Abstract—Geo-location searching is an important feature for any search engine, and research in this field is not new. The issue that remains is how a search engine knows whether a web page belongs to India or the USA. URLs ending with [.in] are the obvious indicator for India, but not all web sites from India end with [.in]. This paper describes a technology known as the address parser, which searches a web page for patterns that communicate address information. The address parser does not parse every page of a website to extract the address; it works only on those URLs where the probability of finding the website owner's address is highest, thereby eliminating false positives. A central knowledge base is built manually; it contains information such as the states of a country, followed by their city names, and other relevant information that may help the address parser perform precise local indexing. It was observed that the address parser not only recognized the address patterns in the web pages but also indexed them against city-specific information. As a result, when a person located in Gangtok, Sikkim, India searched for [universities], the searching module showed the link to [Sikkim Manipal University] first, followed by other links from India. This work also focuses on the importance of the terms contained in URLs for geography-based indexing and searching.
Index Terms—Address Parser, Geo-location Indexing, Information Retrieval, Localized Searching
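The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the knowledge-base entries, the URL terms in `ADDRESS_URL_TERMS`, and the function names `is_candidate_url`, `find_locations`, and `index_page` are all assumptions made for illustration. The sketch only shows the two steps the abstract names: restricting parsing to URLs likely to carry the owner's address, and matching page text against a manually built state/city knowledge base.

```python
import re

# Hypothetical knowledge base: country -> state -> cities.
# The entries are illustrative only; the paper's knowledge base is built manually.
KNOWLEDGE_BASE = {
    "India": {
        "Sikkim": ["Gangtok", "Rangpo", "Majitar"],
        "West Bengal": ["Kolkata", "Siliguri"],
    },
}

# Assumed URL terms suggesting that a page carries the site owner's address.
ADDRESS_URL_TERMS = ("contact", "about", "location")


def is_candidate_url(url):
    """Return True if the URL contains a term hinting at an address page."""
    lowered = url.lower()
    return any(term in lowered for term in ADDRESS_URL_TERMS)


def find_locations(text):
    """Return (city, state, country) tuples for every known city found in text."""
    matches = []
    for country, states in KNOWLEDGE_BASE.items():
        for state, cities in states.items():
            for city in cities:
                # Whole-word, case-insensitive match on the city name.
                pattern = r"\b" + re.escape(city) + r"\b"
                if re.search(pattern, text, re.IGNORECASE):
                    matches.append((city, state, country))
    return matches


def index_page(url, text):
    """Parse only candidate URLs; other pages are skipped and yield no locations."""
    if not is_candidate_url(url):
        return []
    return find_locations(text)


if __name__ == "__main__":
    page = "Example University, Gangtok, Sikkim, India"
    print(index_page("http://example.edu/contact-us", page))
    print(index_page("http://example.edu/news/2009", page))
```

In this sketch, a page is indexed against a city only when both conditions hold, so a city name mentioned in passing on an unrelated page does not affect the index.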
M. Shoaib Jameel was with the Department of Computer Science and Engineering, Sikkim Manipal Institute of Technology, Majitar, Rangpo, East Sikkim - 737132, India. He is now with the Department of Research and Development/Scientific Services, Tata Steel Limited, India (tel.: +919234502858).
Tejbanta Singh Chingtham is with the Department of Computer Science and Engineering, Sikkim Manipal Institute of Technology, Majitar, Rangpo, East Sikkim - 737132, India.
Cite: M. Shoaib Jameel and Tejbanta Singh Chingtham, "Compounded Uniqueness Level: Geo-Location Indexing Using Address Parser," International Journal of Computer Theory and Engineering, vol. 1, no. 1, pp. 27-34, 2009.