Abstract—Heart disease is the leading cause of mortality worldwide, so early detection and prevention can reduce the mortality rate. Various machine learning algorithms are employed to classify and predict diseases. For accurate prediction, feature selection algorithms choose the features that have a significant association with the disease, or target variable, which reduces computing time and improves prediction performance. In this paper, the ModifiedBoostARoota (MBAR) algorithm is used for feature selection, and the classifiers CatBoost, XGBoost, Decision Tree, Extra Trees Classifier, Support Vector Classifier, Logistic Regression, K-Nearest Neighbors, Naive Bayes, and Random Forest are applied to the UCI Arrhythmia dataset and the UCI Z-Alizadeh Sani dataset. The Synthetic Minority Over-sampling Technique (SMOTE) is used to balance the datasets. A comparison of the models on the imbalanced and balanced datasets shows that MBAR with the CatBoost classifier gives better accuracy, reaching 92.76% on the balanced Z-Alizadeh Sani dataset and 86.33% on the balanced Arrhythmia dataset.
Index Terms—Heart disease, feature selection, CatBoost, classification.
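The balancing step named in the abstract, SMOTE, creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbors. The following is a minimal standard-library sketch of that core idea only, not the implementation used in the paper; the function name `smote`, its parameters, and the toy data are hypothetical and chosen for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Sketch of SMOTE: interpolate between minority samples and
    their k nearest minority-class neighbours (illustrative only)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of `base` among the other minority samples
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 6 majority samples versus 3 minority samples
majority = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]

new_points = smote(minority, len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies. In practice, oversampling of this kind is usually applied to the training split only, so that synthetic samples do not leak into the evaluation data.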
Anuradha. P and Vasantha Kalyani David are with the Department of Computer Science, Avinashilingam Institute for Home Science and Higher Education for Women, deemed to be University, Coimbatore, India (e-mail: anujith72@gmail.com).
Cite: Anuradha. P and Vasantha Kalyani David, "Feature Selection by ModifiedBoostARoota and Classification by CatBoost Model on High Dimensional Heart Disease Datasets," International Journal of Computer Theory and Engineering, vol. 14, no. 4, pp. 141-148, 2022.
Copyright © 2022 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).