IJCTE 2012 Vol.4(5): 702-706 ISSN: 1793-8201
DOI: 10.7763/IJCTE.2012.V4.561

β-Thalassemia Knowledge Elicitation Using Data Engineering: PCA, Pearson’s Chi Square and Machine Learning

P. Paokanta

Abstract—Data engineering is one of several methods for knowledge elicitation and analysis. Within it, feature selection plays an important role in data mining, especially in classification tasks. Filtering is an important pre-processing step for any classifier: selecting appropriate variables not only reduces computational time and cost but also increases classification accuracy. In this paper, Thalassemia knowledge was elicited using data engineering techniques (PCA, Pearson’s Chi Square and Machine Learning). This knowledge is presented as a comparison of the classification performance of machine learning techniques using Principal Component Analysis (PCA) and Pearson’s Chi Square for screening the genotypes of β-Thalassemia patients. With PCA, the classification results show that the Multi-Layer Perceptron (MLP) is the best algorithm, reaching an accuracy of 86.61%, followed by K-Nearest Neighbors (KNN), Naive Bayes, Bayesian Networks (BNs) and Multinomial Logistic Regression with accuracies of 85.83%, 85.04%, 85.04% and 82.68%, respectively. On the other hand, these results were compared to the Pearson’s Chi Square and presented that…. In future work, we will investigate other feature selection techniques, such as hybrid and filtering methods, in order to improve classification performance.
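
As an illustration of the Pearson’s chi-square filtering idea named in the abstract, the following minimal Python sketch scores categorical features against a class label and ranks them by the strength of association. The function names (`chi_square_score`, `rank_features`) are hypothetical and are not taken from the paper; the paper's own data set, variables and software are not reproduced here, and any real use would also need a significance threshold or a chosen number of features to keep.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson's chi-square statistic for a feature-by-class contingency table.

    `feature` and `labels` are equal-length sequences of categorical values.
    A larger score indicates a stronger association with the class label.
    """
    n = len(feature)
    if n == 0 or n != len(labels):
        raise ValueError("feature and labels must be non-empty and equal in length")

    observed = Counter(zip(feature, labels))  # joint counts per (value, class) cell
    feature_totals = Counter(feature)         # row totals
    label_totals = Counter(labels)            # column totals

    score = 0.0
    for value, value_count in feature_totals.items():
        for cls, cls_count in label_totals.items():
            # Expected count under independence of feature and class.
            expected = value_count * cls_count / n
            score += (observed[(value, cls)] - expected) ** 2 / expected
    return score


def rank_features(columns, labels):
    """Return (name, score) pairs sorted from most to least associated.

    `columns` maps a (hypothetical) feature name to its list of values.
    """
    scores = {name: chi_square_score(values, labels)
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A filter of this kind would typically keep only the top-ranked features before training a classifier, whereas PCA replaces the original variables with a smaller set of derived components.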

Index Terms—Knowledge elicitation, data engineering, feature selection, principal component analysis (PCA), Pearson’s chi-square, machine learning, β-thalassemia.

Patcharaporn Paokanta is with the Development, System Analysis and Design, and Information Technology at the College of Arts, Media and Technology, Chiang Mai University (CMU), Thailand.


Cite: P. Paokanta, "β-Thalassemia Knowledge Elicitation Using Data Engineering: PCA, Pearson’s Chi Square and Machine Learning," International Journal of Computer Theory and Engineering, vol. 4, no. 5, pp. 702-706, 2012.
