Hate Speech Detection: A Social Network Story that Needs Serious Attention

Authors

  • Muhammad Akram, Department of Software Engineering, Balochistan University of Information Technology, Engineering and Management, Pakistan
  • Wajid Hassan Moosa, Department of Software Engineering, Balochistan University of Information Technology, Engineering and Management, Pakistan
  • Najiba Zahra, Department of Software Engineering, Balochistan University of Information Technology, Engineering and Management, Pakistan

Keywords:

Hate speech detection; Machine learning; Neural network; Pre-trained language model; Multilingual; Context-based embeddings; Transfer learning.

Abstract

Hate speech detection is an important task in natural language processing with significant implications for online safety and social harmony. In this paper, we introduce a new benchmark for multilingual hate speech detection, assembled from 24 existing datasets in 14 languages and covering a range of hate speech types, including racism, Islamophobia, and misogyny. The instances in the benchmark have been carefully curated and annotated to ensure high quality and representativeness. We evaluated several machine learning models on this benchmark, including mBERT, XLM-RoBERTa, and tree-based models such as XGBoost. We also investigated several related aspects; for example, LASER was found to be resilient to biased data. Our results show that these models achieve high F1 scores in detecting hate speech across languages. We also compare the performance of these models with other state-of-the-art models on our benchmark, demonstrating that it is a valuable addition to the multilingual hate speech detection field. The benchmark and evaluation results provide a useful resource for future researchers, who can use them to compare the performance of different models and to develop new models that detect hate speech more effectively in different languages. Overall, our work contributes to advancing the field of natural language processing and to promoting online safety and social harmony.

Published

2024-08-14

How to Cite

Akram, M., Moosa, W. H., & Zahra, N. (2024). Hate Speech Detection: A Social Network Story that Needs Serious Attention. Journal of Information Systems Research and Practice, 2(3), 97–108. Retrieved from https://mjlis.um.edu.my/index.php/JISRP/article/view/55850