Optimizing K-NN with Bayes Search CV for Breast Cancer Classification

DOI:

https://doi.org/10.33795/jip.v12i3.9766

Keywords:

Machine learning, data science, breast cancer, K-Nearest Neighbors, Bayes Search Cross Validation, mRNA gene expression, classification, hyperparameter optimization

Abstract

Breast cancer is one of the leading causes of mortality among women worldwide and remains a serious public health challenge. Because the disease is heterogeneous, accurate and reliable classification methods are needed to support diagnosis and the choice of an appropriate treatment strategy. This study aims to optimize the K-Nearest Neighbors (KNN) algorithm by applying Bayes Search Cross Validation (CV) to improve classification accuracy on messenger ribonucleic acid (mRNA) gene expression data. The method combines KNN modeling with hyperparameter optimization using Bayes Search CV on the Breast Cancer Gene Expression Profiles (METABRIC) dataset, which comprises 1,904 samples and 692 attributes covering gene expression data and patients' clinical characteristics. The research stages are data preprocessing, splitting the data into training and test sets, optimization, and model evaluation using the accuracy metric. The results show that Bayes Search CV raised classification accuracy to 78.68%, compared with 66.81% for the baseline model without optimization, a gain of 11.87 percentage points. This finding indicates that choosing optimal hyperparameters contributes substantially to model performance. The study concludes that Bayes Search CV-based optimization is effective in improving KNN performance on complex genetic data and has the potential to support the development of clinical decision support systems that are more accurate, personalized, efficient, adaptive, and relevant to implementing precision medicine in modern breast cancer diagnosis.
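The workflow described in the abstract (train/test split, a baseline KNN, cross-validated hyperparameter selection, then accuracy on held-out data) can be illustrated with a minimal, standard-library-only sketch. Several things here are assumptions rather than the authors' setup: the data are synthetic Gaussian blobs, not METABRIC; the baseline uses k = 5 with unweighted votes; the candidate values of k and the weighting option are hypothetical; and a plain exhaustive search stands in for Bayes Search CV, which would instead pick candidates adaptively from a surrogate model of the cross-validation score.

```python
import math
import random
from collections import Counter


def make_toy_data(n_per_class=60, n_features=5, seed=0):
    """Synthetic two-class data (NOT METABRIC): two Gaussian blobs."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 0.0), (1, 1.5)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, query, k, weighted):
    """Predict one label by (optionally distance-weighted) vote of k neighbors."""
    dists = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter()
    for d, label in dists[:k]:
        votes[label] += 1.0 / (d + 1e-9) if weighted else 1.0
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k, weighted):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_X, train_y, q, k, weighted) == t
        for q, t in zip(test_X, test_y)
    )
    return hits / len(test_y)


def cv_accuracy(X, y, k, weighted, n_folds=5):
    """Mean accuracy over n_folds interleaved folds of the training data."""
    scores = []
    for fold in range(n_folds):
        tr = [i for i in range(len(X)) if i % n_folds != fold]
        te = [i for i in range(len(X)) if i % n_folds == fold]
        scores.append(accuracy([X[i] for i in tr], [y[i] for i in tr],
                               [X[i] for i in te], [y[i] for i in te],
                               k, weighted))
    return sum(scores) / n_folds


def search_hyperparameters(X, y, k_values, weight_options):
    """Exhaustive stand-in for Bayes Search CV; returns (best score, k, weighted)."""
    best = None
    for k in k_values:
        for weighted in weight_options:
            score = cv_accuracy(X, y, k, weighted)
            if best is None or score > best[0]:
                best = (score, k, weighted)
    return best


# Shuffle, then hold out 20% of the samples for the final evaluation.
X, y = make_toy_data()
order = list(range(len(X)))
random.Random(1).shuffle(order)
split = int(0.8 * len(order))
X_train = [X[i] for i in order[:split]]
y_train = [y[i] for i in order[:split]]
X_test = [X[i] for i in order[split:]]
y_test = [y[i] for i in order[split:]]

# Baseline (assumed k=5, unweighted) versus the cross-validated choice.
baseline_acc = accuracy(X_train, y_train, X_test, y_test, k=5, weighted=False)
best_cv, best_k, best_weighted = search_hyperparameters(
    X_train, y_train, k_values=range(1, 22, 2), weight_options=(False, True)
)
tuned_acc = accuracy(X_train, y_train, X_test, y_test, best_k, best_weighted)

print(f"baseline test accuracy: {baseline_acc:.4f}")
print(f"selected k={best_k}, weighted={best_weighted}, CV accuracy={best_cv:.4f}")
print(f"tuned test accuracy: {tuned_acc:.4f}")
```

Because the hyperparameters are selected on cross-validation folds of the training data only, the held-out test accuracy stays an independent estimate. On toy data the tuned model is not guaranteed to beat the baseline on the test set; the sketch shows the procedure, not the reported 66.81% and 78.68% figures.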


Published

2026-05-31

How to Cite

Arifin, T., Agung, I. W. P., Wibowo, I. R., & Junianto, E. (2026). Optimasi K-NN dengan Bayes Search CV untuk Klasifikasi Kanker Payudara [Optimizing K-NN with Bayes Search CV for breast cancer classification]. Jurnal Informatika Polinema, 12(3), 553–560. https://doi.org/10.33795/jip.v12i3.9766