Preprint has been published in a journal as an article
Preprint / Version 1

Hate Speech and Emotions Classification in Indonesian Language Texts on Twitter Using Naïve Bayes Classifier

Klasifikasi Hate Speech dan Emosi Dalam Teks Berbahasa Indonesia pada Pengguna Twitter Menggunakan Metode Naïve Bayes Classifier




Classification, Hate Speech, Emotional Description, Naive Bayes, Tweet


Hate speech is a form of expression that incites, spreads, justifies, or encourages hatred, discrimination, and violence against individuals and groups for various reasons. It is commonly found on internet-connected social media. This study examines hate speech on Twitter and classifies texts using the Naïve Bayes Classifier method. The dataset consists of 1800 items labeled not hate speech and 2250 items labeled hate speech, split into 60% training data and 40% test data. Evaluating the test data with a confusion matrix gave a mean accuracy of 0.89 for hate speech classification and 0.59 for emotion classification. Based on these results, it can be concluded that when classifying hate speech and emotions on Twitter with Naïve Bayes, the best results, as measured with the confusion matrix, are obtained without Information Gain feature selection.
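The pipeline summarized in the abstract (a 60/40 train/test split, a Naïve Bayes classifier, and accuracy read off a confusion matrix) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy English sentences in `DATA`, the whitespace tokenizer, the multinomial model with Laplace smoothing, and all function names are hypothetical, and the study's actual preprocessing, features, and evaluation protocol are not described in this abstract.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data; the study's real dataset is Indonesian tweets.
DATA = [
    ("i hate you all", "hate"),
    ("have a nice day", "not_hate"),
    ("you are stupid trash", "hate"),
    ("thanks for the help", "not_hate"),
    ("hate this stupid group", "hate"),
    ("nice weather and nice food", "not_hate"),
    ("you stupid trash", "hate"),
    ("have a nice meal", "not_hate"),
    ("i hate trash", "hate"),
    ("thanks for the nice day", "not_hate"),
]


def split_60_40(items):
    """Split items in order: first 60% for training, last 40% for testing."""
    cut = (len(items) * 60) // 100
    return items[:cut], items[cut:]


def train_nb(examples, alpha=1.0):
    """Fit a multinomial Naive Bayes model with Laplace smoothing (alpha)."""
    class_counts = Counter(label for _, label in examples)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()  # assumed tokenizer: whitespace only
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_counts": class_counts,
        "word_counts": word_counts,
        "vocab": vocab,
        "alpha": alpha,
        "n_docs": len(examples),
    }


def predict_nb(model, text):
    """Return the label with the highest log posterior for the given text."""
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, count in model["class_counts"].items():
        score = math.log(count / model["n_docs"])  # log prior
        total = sum(model["word_counts"][label].values())
        for token in text.lower().split():
            if token not in model["vocab"]:
                continue  # ignore words never seen in training
            seen = model["word_counts"][label][token]
            score += math.log(
                (seen + model["alpha"]) / (total + model["alpha"] * vocab_size)
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def accuracy(matrix):
    """Accuracy is the diagonal sum divided by the total count."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def run_demo():
    train, test = split_60_40(DATA)
    model = train_nb(train)
    y_true = [label for _, label in test]
    y_pred = [predict_nb(model, text) for text, _ in test]
    matrix = confusion_matrix(y_true, y_pred, ["hate", "not_hate"])
    return matrix, accuracy(matrix)


if __name__ == "__main__":
    matrix, acc = run_demo()
    print("confusion matrix:", matrix)
    print("accuracy:", acc)
```

The abstract reports a mean accuracy, which suggests averaging over several runs or folds. This sketch evaluates a single split only and does not attempt to reproduce the reported figures.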



