International Journal of Computer Theory and Engineering

Editor-In-Chief: Prof. Mehmet Sahinoglu
Frequency: Quarterly
ISSN: 1793-8201 (Print), 2972-4511 (Online)
Publisher: IACSIT Press
OPEN ACCESS
CiteScore: 4.1

IJCTE 2011 Vol.3(5): 623-627 ISSN: 1793-8201
DOI: 10.7763/IJCTE.2011.V3.381

A New Multi-Phase Algorithm for Stemming in Farsi Language Based on Morphology

Somayyeh Estahbanati, Reza Javidan, and Mehdi Nikkhah

Abstract—The main goal of stemming is to normalize words by reducing each word to its stem. This paper presents a new stemming algorithm for the Farsi (Persian) language. The stemmer works by removing suffixes and prefixes, and it stores exceptions in a database to reduce the error rate. The proposed method improves both the speed of the stemmer and its error rate. Evaluation on prototype document collections shows a significant improvement in precision and recall compared with other well-known methods.

Index Terms—Farsi, Persian, language, stemming.
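
The abstract describes the approach only at a high level: strip prefixes and suffixes, and consult a database of exceptions. A minimal Python sketch of that general idea might look like the following. It is not the authors' algorithm. The affix lists, the exception entries, the `MIN_STEM_LEN` guard, and the names `stem`, `PREFIXES`, `SUFFIXES`, and `EXCEPTIONS` are all assumptions chosen for illustration, and a plain dictionary stands in for the database.

```python
# Hypothetical affix lists for illustration; NOT the rule set from the paper.
PREFIXES = ["نمی", "می"]                      # common verbal prefixes
SUFFIXES = ["ترین", "های", "ها", "تر", "ان"]  # plural / comparative / superlative

# Stand-in for the exceptions database: words that merely look affixed.
EXCEPTIONS = {"تهران": "تهران", "ایران": "ایران"}

# Assumed guard: never strip an affix if fewer than 2 letters would remain.
MIN_STEM_LEN = 2


def stem(word: str) -> str:
    """Return an approximate stem by exception lookup, then affix stripping."""
    # Phase 0: exceptions take priority over every stripping rule.
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]

    # Phase 1: strip at most one prefix, trying longer prefixes first.
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break

    # Phase 2: strip at most one suffix, trying longer suffixes first.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[:-len(suffix)]
            break

    return word


if __name__ == "__main__":
    for w in ["کتابها", "درختان", "بزرگترین", "تهران", "میز"]:
        print(w, "->", stem(w))
```

In this sketch the plural "کتابها" reduces to "کتاب", while "تهران" is returned unchanged because the exception table is checked before the "ان" suffix rule can fire. The length guard keeps a short word such as "میز" intact even though it begins with the letters of a prefix.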

Somayyeh Estahbanati is with the Department of Computer Engineering, Islamic Azad University, Science and Research Branch, Ahvaz, Iran (email: s.estahbanati@gmail.com).
Reza Javidan and Mehdi Nikkhah are with the Department of Computer Engineering, Islamic Azad University, Beyza Branch, Beyza, Iran (email: reza.javidan@gmail.com; Nikkhah@biau.ac.ir).

Cite: Somayyeh Estahbanati, Reza Javidan, and Mehdi Nikkhah, "A New Multi-Phase Algorithm for Stemming in Farsi Language Based on Morphology," International Journal of Computer Theory and Engineering, vol. 3, no. 5, pp. 623-627, 2011.
