Abstract—Source code plagiarism is a severe problem in academia. Programming assignments are used to evaluate students in programming courses, so checking those assignments for plagiarism is essential. When a course has a large number of students, it is impractical for a human inspector to check each assignment, and automated tools are needed to detect plagiarism. The majority of current source code plagiarism detection tools are based on structural methods. Because the structural properties of a plagiarized program and the original program differ significantly, it is hard to detect plagiarized programs with tools based on structural methods when the plagiarism level is four or above. This paper proposes a new plagiarism detection method based on the attribute counting technique. The novelty of our method is that it uses a meta-learning algorithm to improve the accuracy of the plagiarism detection system.
Index Terms—Plagiarism detection, machine learning, source code, naïve Bayes classifier, k-nearest neighbor
U. Bandara is with the Virtusa Corporation, Sri Lanka (e-mail: upulbandara@gmail.com).
G. Wijayarathna is with the Faculty of Science, University of Kelaniya, Sri Lanka (email@example.com).
Cite: Upul Bandara and Gamini Wijayarathna, "Detection of Source Code Plagiarism Using Machine Learning Approach," International Journal of Computer Theory and Engineering, vol. 4, no. 5, pp. 674-678, 2012.