Toxic Speech Classification Using Deep Learning and Machine Learning: A Survey
Abstract
The rapid growth of toxic speech has seriously undermined efforts to create an inclusive environment for all individuals. Although attempts have been made to detect and restrict such content online, it remains difficult to identify. Deep learning-based methods have driven much of the recent progress in toxic speech detection. However, the context-dependent nature of toxic speech, user intent, unintended biases, and related factors make the task highly challenging. To examine these challenges thoroughly, this study organizes the difficulties of automated toxic speech detection into a hierarchy, covering both machine learning and deep learning approaches. At the top level, we distinguish data-level, model-level, and human-level issues. We then analyze each level of the hierarchy in detail, with examples. This survey is intended to help researchers in toxic speech detection develop better solutions. It presents an extensive literature review of deep learning and machine learning methods for the automatic identification of toxic speech, taking recent technological advances into account. A wide range of algorithms and architectures have been evaluated in this context. The paper assesses the strengths and weaknesses of various recognition and classification models for speech expressed vocally in multilingual contexts, and analyzes occurrences of code-mixing. Finally, to show how these techniques affect overall model effectiveness, it further analyzes the feature selection methods used in toxic speech detection.