Improving the Accuracy of Boosting Models for Sleep Health Prediction Using Optuna
DOI:
https://doi.org/10.33795/jip.v12i2.8878

Keywords:
boosting, optuna, hyperparameter tuning, sleep health, machine learning

Abstract
Sleep quality plays an important role in maintaining both physical and mental health, while sleep disorders can increase the risk of various chronic diseases. Advances in machine learning open opportunities to predict sleep health more accurately by leveraging lifestyle data. This study focuses on applying boosting algorithms, namely XGBoost, LightGBM, AdaBoost, and GradientBoosting, supported by Optuna-based hyperparameter tuning to improve prediction accuracy. The dataset used is the Sleep Health and Lifestyle Dataset, which contains demographic variables, lifestyle habits, and sleep conditions. The research stages comprise data preprocessing, splitting the data into training and test sets, model training, hyperparameter optimization using Optuna with the Tree-structured Parzen Estimator (TPE) method, and model evaluation using the accuracy metric. Experimental results show that tuning with Optuna improved the accuracy of several models, in particular LightGBM and AdaBoost, which reached accuracies of 93.3% and 90.7%, respectively. Meanwhile, XGBoost and GradientBoosting performed stably, with accuracy remaining high both before and after tuning. These findings confirm that the effectiveness of tuning depends on the characteristics of the algorithm being used. Overall, this study demonstrates that Optuna can be an effective solution for improving the performance of boosting models in sleep health prediction. As directions for future work, the use of more diverse evaluation metrics, the application of data-balancing techniques, and the exploration of integration with deep learning methods are recommended to enrich the analysis.






