Clustering is the process of grouping similar objects. The Naïve Bayes classifier is a classification technique widely used to predict unknown class labels. In this paper we extend this concept to unsupervised classification, that is, clustering. As in K-modes, the proposed method starts the clustering process with the modes. Based on the prior information, Bayes' theorem is used to place each object in its respective cluster. The proposed algorithm is scalable and needs only one data scan. The proposed Bayesian clustering method for categorical data is evaluated on real data sets obtained from the UCI Machine Learning Repository and compared with the well-known K-modes algorithm. Experimental results show that the proposed method is more efficient than K-modes.
Index Terms: clustering, categorical data, Bayesian theorem, mode.
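The abstract only outlines the procedure, so the following is a minimal illustrative sketch, not the authors' algorithm. It assumes each cluster is seeded with one object acting as its initial mode, that attributes are treated as conditionally independent (the naive Bayes assumption), and that Laplace smoothing with a hypothetical parameter `alpha` handles unseen values. The function name `bayes_cluster` and its arguments are likewise hypothetical. Each object is read once, assigned to the cluster with the highest posterior score, and the counts of that cluster are updated immediately.

```python
from collections import Counter
from math import log


def bayes_cluster(rows, seeds, alpha=1.0):
    """Single-pass clustering sketch for categorical data.

    rows  : iterable of tuples of categorical values
    seeds : list of k tuples used as the initial modes (assumption)
    alpha : Laplace smoothing constant (assumption, not from the paper)
    """
    k, m = len(seeds), len(seeds[0])
    # Each cluster starts with its seed object as the only member.
    sizes = [1] * k
    counts = [[Counter({s[j]: 1}) for j in range(m)] for s in seeds]
    # Distinct values seen so far per attribute, used for smoothing.
    seen = [{s[j] for s in seeds} for j in range(m)]
    labels = []

    for row in rows:  # the only scan over the data
        for j in range(m):
            seen[j].add(row[j])
        total = sum(sizes)
        best, best_score = 0, float("-inf")
        for c in range(k):
            # log prior P(c) plus the sum of log likelihoods P(x_j | c)
            score = log(sizes[c] / total)
            for j in range(m):
                num = counts[c][j][row[j]] + alpha
                den = sizes[c] + alpha * len(seen[j])
                score += log(num / den)
            if score > best_score:
                best, best_score = c, score
        labels.append(best)
        # Update the winning cluster so later objects see the new counts.
        sizes[best] += 1
        for j in range(m):
            counts[best][j][row[j]] += 1
    return labels
```

Because the counts are updated as objects arrive, the result of this sketch depends on the order of the input and on the choice of seeds.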
Aranganayagi. S is with the J. K. K. Nataraja College of Arts & Science, Komarapalayam, Tamilnadu, India, and is doing research in the Department of Computer Science and Applications, Gandhigram Rural University, Gandhigram, Tamilnadu, India. Member of IAENG. Corresponding author; phone: 0424-2230855, 9842723085.
Dr. K. Thangavel is with Periyar University, Salem, Tamilnadu, India, as Professor in Computer Science.
Cite: Aranganayagi. S and Thangavel. K, "Clustering Categorical Data using Bayesian Concept," International Journal of Computer Theory and Engineering, vol. 1, no. 2, pp. 119-125, 2009.