Preprint has been published in a journal as an article
Preprint / Version 1

Hate Speech Detection Using Support Vector Machine (SVM) Method


Deteksi Ujaran Kebencian Menggunakan Metode Support Vector Machine (SVM)

DOI:

https://doi.org/10.21070/ups.2545

Keywords:

Predictions, Hate Speech, SVM, XGBoost, RSCV

Abstract

Hate speech is a linguistic phenomenon that deviates from the norms of polite language and communication ethics. This research aims to detect whether a word or sentence contains hate speech, using the Support Vector Machine (SVM) method for classification. The data were collected using the Tweepy API, yielding a total of 1,681 samples. For word weighting, the researchers used TF-IDF, which weights words according to how frequently they occur in the dataset. Two classification methods were compared, SVM and XGBoost. The best result came from SVM with 90% training data and 10% test data, which obtained a training score of 95.87% and a test score of 87.30%, a gap of 8.57 percentage points. The SVM model was then tuned using randomized search with cross-validation (RSCV), which raised the training score to 100% and the test score to 93.20%, a gap of 6.80 percentage points.
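The word-weighting step can be illustrated with a minimal sketch. It assumes the textbook form of TF-IDF (relative term frequency multiplied by ln(N / document frequency)); the abstract does not state which variant the authors used, and library implementations often apply smoothed versions. The function name `tf_idf` and the toy corpus are hypothetical and are not taken from the paper's data.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus standing in for tokenized tweets.
docs = [
    "you are awful".split(),
    "you are kind".split(),
    "you have a kind day".split(),
]
weights = tf_idf(docs)
```

In this sketch a word that appears in every document ("you") receives a weight of zero, while a word confined to a single document ("awful") receives the highest weight in that document. Vectors of this kind are what a classifier such as SVM or XGBoost would take as input.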

Posted

2023-08-21