General Information
    • ISSN: 1793-8201 (Print), 2972-4511 (Online)
    • Abbreviated Title: Int. J. Comput. Theory Eng.
    • Frequency: Quarterly
    • DOI: 10.7763/IJCTE
    • Editor-in-Chief: Prof. Mehmet Sahinoglu
    • Associate Editors-in-Chief: Assoc. Prof. Alberto Arteta, Assoc. Prof. Engin Maşazade
    • Managing Editor: Ms. Mia Hu
    • Abstracting/Indexing: Scopus (since 2022), INSPEC (IET), CNKI, Google Scholar, EBSCO, etc.
    • Average Days from Submission to Acceptance: 192 days

IJCTE 2009 Vol.1(4): 394-397 ISSN: 1793-8201
DOI: 10.7763/IJCTE.2009.V1.62

Research on Correlative Techniques of Building Specific Topic Lexicon

Shouning QU, Jian Lu and Jing Li

Abstract—To support information extraction and topic classification, this paper extracts topic words from classified documents and builds a lexicon for each topic. Each document is first preprocessed, and its terms are then weighted with the TF-IDF formula. Topic words are selected from the terms in proportion to the size of their weights. After every document has been processed in turn, the extracted topic words are collected into the lexicon of the corresponding topic. Experiments indicate that topic words are extracted with good accuracy. The topic lexicon is easy to build, and it meets the word segmentation needs of documents of all kinds. The paper presents this as a new method for information extraction and text classification.

Index Terms—text classification, TF-IDF, topic lexicon, topic words.
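
The pipeline summarized in the abstract (preprocess each document, weight terms by TF-IDF, keep the highest-weighted terms, and group them by topic) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the whitespace tokenizer, the unsmoothed formula tf × log(N/df), the `ratio` parameter, and all function names are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed stand-in for the paper's preprocessing step:
    # lowercase the text and split it on whitespace.
    return text.lower().split()


def tfidf_weights(tokens, doc_freq, n_docs):
    # Classic unsmoothed TF-IDF (an assumption; the paper's exact
    # formula is not given in the abstract).
    counts = Counter(tokens)
    total = len(tokens)
    return {
        term: (count / total) * math.log(n_docs / doc_freq[term])
        for term, count in counts.items()
    }


def top_words(weights, ratio):
    # Keep the top `ratio` share of distinct terms by weight (at least one).
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    keep = max(1, math.ceil(len(ranked) * ratio))
    return [term for term, _ in ranked[:keep]]


def build_topic_lexicon(labeled_docs, ratio=0.5):
    # labeled_docs: iterable of (topic, text) pairs, i.e. classified documents.
    tokenized = [(topic, tokenize(text)) for topic, text in labeled_docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for _, tokens in tokenized:
        doc_freq.update(set(tokens))

    lexicon = defaultdict(set)
    for topic, tokens in tokenized:
        if not tokens:
            continue  # skip empty documents
        weights = tfidf_weights(tokens, doc_freq, n_docs)
        lexicon[topic].update(top_words(weights, ratio))
    return dict(lexicon)


if __name__ == "__main__":
    docs = [
        ("sports", "the team won the football match"),
        ("finance", "the bank raised the interest rate"),
        ("sports", "the football coach praised the team"),
    ]
    for topic, words in build_topic_lexicon(docs).items():
        print(topic, sorted(words))
```

A word such as "the", which occurs in every document, receives a weight of zero and is ranked last in each document, while words concentrated in a single document rank first. For Chinese text, the assumed `tokenize` function would need to be replaced by a proper word segmenter.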


Cite: Shouning QU, Jian Lu and Jing Li, "Research on Correlative Techniques of Building Specific Topic Lexicon," International Journal of Computer Theory and Engineering, vol. 1, no. 4, pp. 394-397, 2009.

Copyright © 2008-2024. International Association of Computer Science and Information Technology. All rights reserved.