Preprint / Version 1

Modeling Early-stage Diabetes Mellitus using an Ensemble Learning Approach


Pemodelan Deteksi Dini Diabetes Mellitus menggunakan Pendekatan Ensemble Learning

DOI:

https://doi.org/10.21070/ups.4219

Keywords:

Early-stage prediction, Diabetes Mellitus, RapidMiner, Classification, Random Forest

Abstract

Diabetes mellitus is characterized by hyperglycemia caused by the pancreas's inability to produce insulin properly. Its early-stage symptoms can serve as indicators for determining whether a person has diabetes mellitus. According to data from Sidoarjo Regional General Hospital, diabetes ranks fourth among the ten most common diseases treated there. The purpose of this research is to detect early symptoms of type 2 diabetes mellitus. Data annotation was performed by paramedics proficient in their respective fields. This research applies an ensemble learning classification method in RapidMiner: the Split Data operator divides the data into training and testing sets at a 60:40 ratio, and a Performance operator is added to compute accuracy. In the evaluation, Random Forest achieved an accuracy of 87.30%, which can be categorized as excellent classification.
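The evaluation protocol described in the abstract (a 60:40 hold-out split, an ensemble classifier, and an accuracy score) can be illustrated with a minimal Python sketch. This is not the authors' RapidMiner process and not a full Random Forest: it is a simplified bagged ensemble of one-feature rules with random feature subsets and majority voting, written with the standard library only. The data are synthetic binary "symptom" flags generated by a hypothetical rule, and every function name and parameter here is an assumption made for illustration.

```python
import random
from collections import Counter


def split_60_40(rows, seed=0):
    """Shuffle the rows and return (train, test) at a 60:40 ratio."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = len(shuffled) * 6 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def train_stump(sample, n_features, rng):
    """Fit a one-feature rule, choosing the best feature from a random subset."""
    candidates = rng.sample(range(n_features), k=max(1, n_features // 2))
    best = None
    for f in candidates:
        rule = {}
        for value in (0, 1):
            # Majority label among samples whose feature f equals `value`
            labels = [y for x, y in sample if x[f] == value]
            rule[value] = Counter(labels).most_common(1)[0][0] if labels else 0
        hits = sum(rule[x[f]] == y for x, y in sample)
        if best is None or hits > best[0]:
            best = (hits, f, rule)
    return best[1], best[2]


def train_ensemble(train, n_estimators=25, seed=0):
    """Bagging: each stump is trained on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    n_features = len(train[0][0])
    stumps = []
    for _ in range(n_estimators):
        bootstrap = [rng.choice(train) for _ in train]
        stumps.append(train_stump(bootstrap, n_features, rng))
    return stumps


def predict(stumps, x):
    """Majority vote over all stumps in the ensemble."""
    votes = Counter(rule[x[f]] for f, rule in stumps)
    return votes.most_common(1)[0][0]


def accuracy(stumps, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(stumps, x) == y for x, y in rows) / len(rows)


def make_synthetic(n=200, seed=1):
    """Synthetic data only: four binary flags and a noisy, hypothetical label rule."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = tuple(rng.randint(0, 1) for _ in range(4))
        y = 1 if x[0] + x[1] >= 1 else 0  # hypothetical rule, not clinical
        if rng.random() < 0.1:  # flip about 10% of labels as noise
            y = 1 - y
        rows.append((x, y))
    return rows


data = make_synthetic()
train, test = split_60_40(data)
model = train_ensemble(train)
test_accuracy = accuracy(model, test)
print(f"train={len(train)} test={len(test)} accuracy={test_accuracy:.2%}")
```

The accuracy printed by this sketch reflects only the synthetic data and bears no relation to the 87.30% reported above. Its purpose is to show that the score is computed on the held-out 40% rather than on the data used for training.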

Posted

2024-02-23