Multilevel Troll Classification of Twitter Data Using Machine Learning Techniques

IJCTE 2024 Vol.16(1): 21-28
DOI: 10.7763/IJCTE.2024.V16.1350

Susan Mathew K*, Deborah Alex, Nidhi Deshpande, Richa Sharma, Arti Arya, and D. P. Balendra

Department of Computer Science and Engineering, PES University, Bengaluru, India
Email: susanmatk@gmail.com (S.M.K.); itsdeborahalex@gmail.com (D.A.); nidhideshpande15@gmail.com (N.D.); richasharma@pes.edu (R.S.); artiarya@pes.edu (A.A.); balendradp@gmail.com (D.P.B.)
*Corresponding author

Manuscript received June 27, 2023; revised July 20, 2023; accepted August 11, 2023; published February 15, 2024

Abstract—Trolling on social media is the use of provocative or offensive text in an attempt to dominate, disrupt, or derail the main topic of discussion. Identifying trolls can help protect organic users of the platform from the negative consequences of interacting with a troll. In this work, five condensed feature sets, namely sentiment, readability, post analysis, network, and frequency analysis, are used to make the broad distinction between troll and non-troll users. An ensemble of machine learning algorithms (with Random Forest, Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM) as base classifiers and Random Forest as the meta-classifier) is used to perform the multilevel classification. In the first level, trolls are distinguished from non-trolls, and in the second level, the trolls are classified into their respective types: Political, Communal, Conspiracy, or Asocial. Additionally, through data-driven observations, the traditional understanding of antisocial behavior in trolls is expanded to develop a more multidimensional representation of trolling behavior. Using the Stacking Classifier, an accuracy of 78.72% was achieved for distinguishing trolls from non-trolls in the first phase, and an accuracy of 83.24% was achieved for classifying trolls into their respective categories in the second phase.
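The two-level routing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `classify_multilevel`, the feature names `sentiment` and `political_terms`, and the two toy rule-based callables are hypothetical stand-ins, used only to show how a first-level troll/non-troll decision gates a second-level type decision. In practice, each callable would presumably wrap the prediction of a trained stacking ensemble.

```python
from typing import Callable, Dict, List, Sequence

Features = Dict[str, float]

# The four troll types named in the abstract.
TROLL_TYPES = ("Political", "Communal", "Conspiracy", "Asocial")


def classify_multilevel(
    users: Sequence[Features],
    is_troll: Callable[[Features], bool],
    troll_type: Callable[[Features], str],
) -> List[str]:
    """Level 1 separates trolls from non-trolls; level 2 types the trolls only."""
    labels: List[str] = []
    for user in users:
        if not is_troll(user):
            # Non-trolls never reach the second-level classifier.
            labels.append("Non-troll")
            continue
        label = troll_type(user)
        if label not in TROLL_TYPES:
            raise ValueError(f"unexpected troll type: {label!r}")
        labels.append(label)
    return labels


# Hypothetical stand-ins for trained models (assumptions, not from the paper).
def toy_is_troll(user: Features) -> bool:
    return user["sentiment"] < -0.5


def toy_troll_type(user: Features) -> str:
    return "Political" if user["political_terms"] > 0.5 else "Asocial"


users = [
    {"sentiment": 0.4, "political_terms": 0.9},
    {"sentiment": -0.8, "political_terms": 0.7},
    {"sentiment": -0.9, "political_terms": 0.1},
]
labels = classify_multilevel(users, toy_is_troll, toy_troll_type)
print(labels)  # ['Non-troll', 'Political', 'Asocial']
```

Keeping the two levels as separate callables mirrors the two phases, whose accuracies the abstract reports separately, and lets either level be swapped out without touching the other.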

Keywords—machine learning, trolls, types of trolls, multi-class classification, random forest, Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), ensemble

Cite: Susan Mathew K, Deborah Alex, Nidhi Deshpande, Richa Sharma, Arti Arya, and D. P. Balendra, "Multilevel Troll Classification of Twitter Data Using Machine Learning Techniques," International Journal of Computer Theory and Engineering, vol. 16, no. 1, pp. 21-28, 2024.

Copyright © 2024 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
