Implementation of Data Mining in Breast Cancer Diagnosis Classification Using Logistic Regression Algorithm
Implementasi Data Mining dalam Klasifikasi Diagnosa Kanker Payudara Menggunakan Algoritma Logistic Regression
DOI: https://doi.org/10.21070/ups.2767
Keywords: Logistic Regression, Breast Cancer Diagnosis Classification, Confusion Matrix
Abstract
Breast cancer is a dangerous disease and is considered one of the most serious threats to women's health. Surgery and chemotherapy are two common treatments. Early diagnosis is important because it limits the severity of the disease and increases the chance of a cure. This study aims to classify breast cancer diagnoses using Logistic Regression. The data are secondary data downloaded from Kaggle.com, totaling 569 records. After pre-processing, the data are divided into training and testing sets at a ratio of 70%:30%. After classification modeling, the model was tested using a confusion matrix and reached an accuracy of 98% in predicting breast cancer diagnoses.
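The workflow described in the abstract (a 70%:30% split, a Logistic Regression classifier, and evaluation with a confusion matrix) can be illustrated with a minimal sketch in plain Python. This is not the authors' code or data: the two-feature synthetic records, the gradient-descent settings, and every function name below are assumptions made for illustration, and the accuracy it prints says nothing about the 98% reported in the study. Only the record count (569) and the split ratio come from the abstract.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=300):
    # Batch gradient descent on the mean log-loss; returns (weights, bias).
    # lr and epochs are illustrative choices, not values from the study.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b, threshold=0.5):
    # Label a record 1 when the predicted probability reaches the threshold.
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold else 0
        for xi in X
    ]


def confusion_matrix(y_true, y_pred):
    # Counts for a binary problem, treating label 1 as the positive class.
    cm = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for t, p in zip(y_true, y_pred):
        if t == 1:
            cm["tp" if p == 1 else "fn"] += 1
        else:
            cm["fp" if p == 1 else "tn"] += 1
    return cm


# Synthetic stand-in data (assumption): 569 records, two numeric features,
# drawn from two well-separated Gaussian clusters.
rng = random.Random(0)
n_records = 569
data = []
for i in range(n_records):
    label = i % 2
    center = 2.0 if label == 1 else -2.0
    data.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
rng.shuffle(data)

# 70%:30% split; the training share is rounded down here (assumption).
n_train = int(0.7 * n_records)
train, test = data[:n_train], data[n_train:]

w, b = fit_logistic([x for x, _ in train], [y for _, y in train])
y_test = [y for _, y in test]
y_pred = predict([x for x, _ in test], w, b)

cm = confusion_matrix(y_test, y_pred)
accuracy = (cm["tp"] + cm["tn"]) / len(test)
print(cm, round(accuracy, 3))
```

Accuracy is read directly off the confusion matrix as (TP + TN) divided by the number of test records. The same four counts also give precision and recall, which matter in a diagnostic setting where false negatives and false positives carry different costs.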
License
Copyright (c) 2023 UMSIDA Preprints Server
This work is licensed under a Creative Commons Attribution 4.0 International License.