K-NN Optimization with Bayes Search CV for Breast Cancer Classification
DOI:
https://doi.org/10.33795/jip.v12i3.9766
Keywords:
Machine Learning, Data Science, Breast Cancer, K-Nearest Neighbors, Bayes Search Cross Validation, mRNA Gene Expression, Classification, Hyperparameter Optimization
Abstract
Breast cancer is one of the leading causes of mortality among women worldwide and poses a serious challenge in healthcare. The heterogeneous nature of the disease calls for accurate and reliable classification methods to support diagnosis and the choice of an appropriate therapeutic strategy. This study aims to optimize the performance of the K-Nearest Neighbors (KNN) algorithm by applying Bayes Search Cross Validation (CV) to improve classification accuracy on messenger ribonucleic acid (mRNA) gene expression data. The methodology combines KNN modeling with hyperparameter optimization using Bayes Search CV on the Breast Cancer Gene Expression Profiles (METABRIC) dataset, which comprises 1,904 samples and 692 attributes covering gene expression data and patients' clinical characteristics. The research stages are data preprocessing, splitting into training and test sets, optimization, and model evaluation using the accuracy metric. The results show that applying Bayes Search CV raised classification accuracy to 78.68%, compared with 66.81% for the baseline model without optimization, a gain of 11.87 percentage points. This finding indicates that selecting optimal hyperparameters contributes substantially to model performance. It can therefore be concluded that Bayes Search CV-based optimization is effective in improving KNN performance on complex genetic data and has the potential to support the development of clinical decision support systems that are more accurate, personalized, efficient, adaptive, and relevant to implementing precision medicine in modern breast cancer diagnosis.
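The workflow summarized in the abstract (train/test split, cross-validated hyperparameter search, comparison against an untuned baseline) can be illustrated with a minimal, dependency-free sketch. This is not the authors' code. Synthetic two-class data stand in for METABRIC, the baseline of k = 5 with Euclidean distance is an assumption (the abstract does not state the baseline settings), and an exhaustive loop over a small hypothetical grid stands in for Bayes Search CV. A Bayesian optimizer such as scikit-optimize's `BayesSearchCV` would instead choose which candidates to evaluate using a surrogate model, but it would score each candidate with the same kind of cross-validation shown here.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k, p):
    """Predict the label of x by majority vote among its k nearest neighbours."""
    # Minkowski distance raised to the power p; the p-th root is monotone,
    # so skipping it does not change the neighbour ranking.
    neighbours = sorted(
        (sum(abs(a - b) ** p for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k, p):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_X, train_y, x, k, p) == t
        for x, t in zip(test_X, test_y)
    )
    return hits / len(test_y)


def cv_accuracy(X, y, k, p, n_folds=5):
    """Mean accuracy over n_folds interleaved folds of (X, y)."""
    scores = []
    for fold in range(n_folds):
        held = set(range(fold, len(X), n_folds))
        tr_X = [X[i] for i in range(len(X)) if i not in held]
        tr_y = [y[i] for i in range(len(X)) if i not in held]
        te_X = [X[i] for i in sorted(held)]
        te_y = [y[i] for i in sorted(held)]
        scores.append(accuracy(tr_X, tr_y, te_X, te_y, k, p))
    return sum(scores) / n_folds


# Synthetic stand-in data (assumption): two overlapping Gaussian classes.
rng = random.Random(0)
X, y = [], []
for i in range(150):
    label = i % 2
    X.append([rng.gauss(label * 1.0, 1.5) for _ in range(6)])
    y.append(label)

# Hold-out split: the search only ever sees the training portion.
train_X, train_y = X[:120], y[:120]
test_X, test_y = X[120:], y[120:]

# Assumed baseline and hypothetical search space (odd k avoids voting ties).
baseline = {"k": 5, "p": 2}
grid = [{"k": k, "p": p} for k in (1, 3, 5, 7, 9, 11, 15) for p in (1, 2)]

# Exhaustive search stands in for the Bayesian sampler.
cv_scores = [cv_accuracy(train_X, train_y, **params) for params in grid]
best_cv, best_params = max(zip(cv_scores, grid), key=lambda pair: pair[0])
baseline_cv = cv_accuracy(train_X, train_y, **baseline)

print("baseline CV accuracy:", round(baseline_cv, 4))
print("best CV accuracy:", round(best_cv, 4), "with", best_params)
print("baseline test accuracy:",
      round(accuracy(train_X, train_y, test_X, test_y, **baseline), 4))
print("tuned test accuracy:",
      round(accuracy(train_X, train_y, test_X, test_y, **best_params), 4))
```

Because the baseline configuration is part of the grid, the best cross-validated score can never fall below the baseline's cross-validated score. The held-out test accuracy carries no such guarantee, which is why reporting it separately, as the study does, matters.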