
IJCTE 2012 Vol.4(5): 702-706 ISSN: 1793-8201
DOI: 10.7763/IJCTE.2012.V4.561

β-Thalassemia Knowledge Elicitation Using Data Engineering: PCA, Pearson’s Chi Square and Machine Learning

P. Paokanta

Abstract—Data engineering is one of several methods for knowledge elicitation and analysis. Within it, feature selection plays an important role in data mining, especially in classification tasks. Filtering is an important pre-processing step for any classification process: selecting appropriate variables not only reduces computational time and cost but also increases classification accuracy. In this paper, Thalassemia knowledge was elicited using data engineering techniques (PCA, Pearson’s Chi square and machine learning). This knowledge is presented as a comparison of the classification performance of machine learning techniques when Principal Components Analysis (PCA) and Pearson’s Chi square are used for screening the genotypes of β-Thalassemia patients. With PCA, the classification results show that the Multi-Layer Perceptron (MLP) is the best algorithm, reaching an accuracy of 86.61%, followed by K-Nearest Neighbors (KNN), Naive Bayes, Bayesian Networks (BNs) and Multinomial Logistic Regression with accuracies of 85.83%, 85.04%, 85.04% and 82.68%, respectively. These results were also compared with those obtained using Pearson’s Chi square, which showed that…. In future work, we will investigate other feature selection techniques, such as hybrid and filtering methods, in order to improve classification performance.
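
As a rough illustration of the chi-square side of the comparison described in the abstract, the sketch below scores categorical features by Pearson's chi-square statistic against the class label, i.e. the sum over all cells of (observed - expected)^2 / expected, where expected = (feature-value total x class total) / n, and then ranks the features by that score. It is not the paper's implementation. The function names, feature names and toy records are hypothetical and were chosen only to keep the arithmetic easy to follow. No patient data or results from the paper are used.

```python
from collections import Counter


def chi_square_statistic(feature_values, class_labels):
    """Pearson's chi-square statistic of independence between one
    categorical feature and the class label."""
    n = len(feature_values)
    if n == 0 or n != len(class_labels):
        raise ValueError("inputs must be non-empty and of equal length")

    observed = Counter(zip(feature_values, class_labels))  # cell counts
    feature_totals = Counter(feature_values)               # row totals
    class_totals = Counter(class_labels)                   # column totals

    statistic = 0.0
    for value, value_count in feature_totals.items():
        for label, label_count in class_totals.items():
            # Expected count under independence of feature and class.
            expected = value_count * label_count / n
            statistic += (observed[(value, label)] - expected) ** 2 / expected
    return statistic


def rank_features(columns, class_labels):
    """Return (feature name, chi-square score) pairs, highest score first."""
    scores = {
        name: chi_square_statistic(values, class_labels)
        for name, values in columns.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy records (assumption: invented for illustration only).
    labels = ["type_a", "type_a", "type_b", "type_b"]
    columns = {
        "noise": ["low", "high", "low", "high"],
        "informative": ["low", "low", "high", "high"],
    }
    print(rank_features(columns, labels))
    # prints [('informative', 4.0), ('noise', 0.0)]
```

A higher score means the feature's distribution differs more across classes, so a filter would keep the top-ranked features before training a classifier. PCA, by contrast, builds new combined components rather than selecting among the original variables.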

Index Terms—Knowledge elicitation, data engineering, feature selection, principal component analysis (PCA), Pearson’s chi-square, machine learning, β-thalassemia.

Patcharaporn Paokanta is with the Development, System Analysis and Design, and Information Technology at the College of Arts, Media and Technology, Chiang Mai University (CMU), Thailand.


Cite: P. Paokanta, "β-Thalassemia Knowledge Elicitation Using Data Engineering: PCA, Pearson’s Chi Square and Machine Learning," International Journal of Computer Theory and Engineering, vol. 4, no. 5, pp. 702-706, 2012.

Copyright © 2008-2022. International Association of Computer Science and Information Technology. All rights reserved.