Diabetes Detection Using Extreme Gradient Boosting (XGBoost) with Hyperparameter Tuning
Abstract
Diabetes is a metabolic disorder caused by problems with insulin production in the body. It is one of the deadliest diseases worldwide, particularly in Indonesia, and can cause serious complications and death. With current technological advances, machine learning algorithms can identify diabetes from available data. One applicable method is Extreme Gradient Boosting (XGBoost). This study aims to find the best classification performance on a diabetes dataset using XGBoost. The dataset consists of 768 rows and 9 columns, with a binary target (0 or 1). SMOTE resampling is applied to address class imbalance, and hyperparameters are optimized using GridSearchCV and RandomSearchCV. The model is evaluated using a confusion matrix and the metrics accuracy, precision, recall, and F1-score. Three test scenarios were conducted. The first test was hyperparameter optimization using GridSearchCV. The second test was hyperparameter optimization using RandomSearchCV. The third test applied data resampling, and in it XGBoost achieved the highest accuracy, 82%, with GridSearchCV hyperparameter optimization.
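The evaluation metrics named in the abstract all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of that relationship, not the authors' code: the helper names `count_confusion` and `compute_metrics` are hypothetical, and the labels are made-up illustration data rather than results from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # actual 1 (diabetic), predicted 1
    fp: int  # actual 0, predicted 1
    fn: int  # actual 1, predicted 0
    tn: int  # actual 0, predicted 0


def count_confusion(y_true, y_pred):
    """Tally the four confusion-matrix cells for binary labels 0/1."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return ConfusionMatrix(tp, fp, fn, tn)


def compute_metrics(cm):
    """Derive accuracy, precision, recall, and F1 from the cell counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    total = cm.tp + cm.fp + cm.fn + cm.tn
    accuracy = (cm.tp + cm.tn) / total if total else 0.0
    precision = cm.tp / (cm.tp + cm.fp) if (cm.tp + cm.fp) else 0.0
    recall = cm.tp / (cm.tp + cm.fn) if (cm.tp + cm.fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels for ten patients (illustration only, not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

cm = count_confusion(y_true, y_pred)
scores = compute_metrics(cm)
print(cm)
print(scores)
```

On an imbalanced dataset, accuracy alone can look good while recall on the minority class stays low, which is one reason to report precision, recall, and F1-score alongside it.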
Copyright (c) 2024 Devi Aprilya Dinathi, Elisa Ramadanti, Christian Sri Kusuma Aditya, Didih Rizki Chandranegara
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).