Web Scraping and Natural Language Processing Using CNN for Cross-Platform Digital Sentiment Analysis
Web Scraping dan Natural Language Processing Menggunakan CNN untuk Analisis Sentimen Lintas Platform Digital
DOI:
https://doi.org/10.21070/ups.9802
Keywords:
CNN, Llama LLM, Sentiment Analysis, YouTube, Detik.com, Web Scraping
Abstract
The rapid growth of public opinion on digital platforms such as YouTube and Detik.com has produced a large volume of data reflecting public perception. Processing this data manually is inefficient, and it is typically limited to sentiment classification, which offers no concrete solutions for decision-makers. This study develops an end-to-end cross-platform sentiment analysis system that both classifies sentiment and generates automated policy recommendations. The method combines dynamic keyword-based web scraping with classification by a Convolutional Neural Network (CNN). The system also integrates the Llama Large Language Model (LLM) to synthesize keyword trends into draft policy solutions. Empirical results indicate that the CNN model achieves an accuracy of 81%. Together, real-time visualization and LLM-based policy recommendations bridge the gap between technical data analysis and practical decision-making, making the system a relevant and adaptive response to changing public issues.
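As a rough illustration of the CNN classification step mentioned in the abstract, the following is a minimal sketch of a text CNN forward pass (embedding lookup, 1D convolution with ReLU, global max pooling, dense softmax). It is not the authors' model: the class name `TinyTextCNN`, the layer sizes, the three-class output, the whitespace tokenizer, and the random untrained weights are all assumptions made for illustration.

```python
import math
import random


def conv1d_relu(seq, kernel, bias):
    """Slide one filter over a sequence of embedding vectors, then apply ReLU.

    seq is a list of T vectors of size D; kernel is a list of K vectors of size D.
    Returns T - K + 1 activations (no padding, stride 1).
    """
    k = len(kernel)
    out = []
    for start in range(len(seq) - k + 1):
        total = bias
        for offset in range(k):
            total += sum(x * w for x, w in zip(seq[start + offset], kernel[offset]))
        out.append(max(0.0, total))
    return out


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


class TinyTextCNN:
    """Hypothetical, untrained text CNN; all hyperparameters are assumptions."""

    def __init__(self, vocab, embed_dim=8, num_filters=4, kernel_size=3,
                 num_classes=3, seed=0):
        rng = random.Random(seed)

        def rand_vec(n):
            return [rng.uniform(-0.5, 0.5) for _ in range(n)]

        # Index 0 is reserved for padding and unknown tokens (zero vector).
        self.vocab = {word: i + 1 for i, word in enumerate(vocab)}
        self.kernel_size = kernel_size
        self.embeddings = [[0.0] * embed_dim] + [rand_vec(embed_dim) for _ in vocab]
        self.filters = [[rand_vec(embed_dim) for _ in range(kernel_size)]
                        for _ in range(num_filters)]
        self.filter_bias = [0.0] * num_filters
        self.dense = [rand_vec(num_filters) for _ in range(num_classes)]
        self.dense_bias = [0.0] * num_classes

    def predict_proba(self, text):
        # Whitespace tokenization is a simplification; real comments need cleaning.
        ids = [self.vocab.get(tok, 0) for tok in text.lower().split()]
        # Pad short inputs so at least one convolution window exists.
        ids += [0] * max(0, self.kernel_size - len(ids))
        seq = [self.embeddings[i] for i in ids]
        # Global max pooling keeps the strongest response of each filter.
        pooled = [max(conv1d_relu(seq, f, b))
                  for f, b in zip(self.filters, self.filter_bias)]
        logits = [sum(p * w for p, w in zip(pooled, row)) + b
                  for row, b in zip(self.dense, self.dense_bias)]
        return softmax(logits)
```

Calling `TinyTextCNN(["layanan", "bagus", "buruk"]).predict_proba("layanan bagus")` returns one probability per assumed class. Because the weights are random, the values carry no meaning until the model is trained on labeled comments.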
License
Copyright (c) 2026 UMSIDA Preprints Server

This work is licensed under a Creative Commons Attribution 4.0 International License.
