Abstract—Data warehouses provide many opportunities for data mining tasks such as classification and clustering. Typically, updates are collected and applied to the data warehouse periodically, after which all patterns derived from the warehouse by a data mining algorithm have to be updated as well. Because the databases are very large, it is highly desirable to perform these updates incrementally. In this paper, we present a new incremental clustering algorithm based on a genetic algorithm. The algorithm is applicable to any database containing data from a metric space, e.g., a spatial database. Based on the formal definition of clusters, it can be proven that the incremental algorithm yields the same result as any other algorithm. A performance evaluation of the algorithm, Incremental Clustering using Genetic Algorithm (ICGA), on a spatial database is presented, demonstrating the efficiency of the proposed algorithm. ICGA yields significant speed-up factors over other clustering algorithms.
Index Terms—Data Mining, Clustering, Genetic Algorithm
Atul Kamble is with the D.K.T.E.S. Textile and Engineering Institute, Ichalkaranji-416115, India. (Mobile phone: +91-9673274518; e-mail: firstname.lastname@example.org).
Cite: Atul Kamble, "Incremental Clustering in Data Mining using Genetic Algorithm," International Journal of Computer Theory and Engineering, vol. 2, no. 3, pp. 326-328, 2010.