Building a forecast model for costs in treating type 2 diabetes based on social insurance data in Vietnam
Abstract
Diabetes mellitus (DM) is a chronic disease that leads to severe complications and significant treatment costs, placing a heavy burden on healthcare systems. Predicting future healthcare costs associated with DM is essential for efficient healthcare planning and policymaking, and advanced machine learning models offer promising tools for cost prediction. This study aimed to develop and compare classical statistical models and machine learning models for forecasting the cost of type 2 diabetes treatment in Vietnam using real-world healthcare data. A cross-sectional analysis was conducted using electronic payment data from the Hanoi and Ho Chi Minh City Social Security systems. The study included all patients diagnosed with type 2 diabetes who met predefined inclusion criteria within the 2018-2022 timeframe. The performance of classical statistical models (LR, Ridge, Lasso, Elastic Net) and modern machine learning models (RF, SVR, MLP, XGBoost) was compared for predicting the cost of a diabetes treatment course. The best-fitting model was determined using MAE, RMSE, and R2. Based on these criteria, the XGBoost model was selected to forecast the treatment cost of type 2 diabetes in Vietnam. It achieved the highest R2 (0.4991) and the lowest prediction error (RMSE = 0.6562) among the compared models, including MLP and SVR. To optimize the XGBoost model, grid search was performed on the training dataset. The optimal hyperparameter set comprised 200 trees, a learning rate of 0.1, a maximum tree depth of 7, a minimum child weight of 3, and a subsample ratio per tree of 1.0. The XGBoost model with the optimal hyperparameter set was then evaluated on the entire dataset. The results were stable, with no significant differences in R2 and RMSE between the training and testing sets.
A prediction application incorporating 17 patient features was developed, offering quick and accurate cost estimates. The XGBoost model, selected for its superior performance, was used to forecast type 2 diabetes treatment costs in Vietnam.
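As an illustration of the evaluation criteria named in the abstract, the following is a minimal Python sketch of how MAE, RMSE, and R2 are conventionally computed from observed and predicted costs. It is not the authors' code: the function names, the toy numbers, and the `REPORTED_PARAMS` dictionary are assumptions introduced here. In particular, the mapping of the reported hyperparameters onto XGBoost's scikit-learn style parameter names (`n_estimators`, `learning_rate`, `max_depth`, `min_child_weight`, `subsample`) is a guess, since the abstract does not give the parameter names or the grid that was searched.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: the average of |observed - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hyperparameter values reported in the abstract. The keys are ASSUMED to
# correspond to XGBoost's scikit-learn wrapper parameters; "subsample ratio
# per tree" is read here as `subsample`, which may not match the authors' setup.
REPORTED_PARAMS = {
    "n_estimators": 200,
    "learning_rate": 0.1,
    "max_depth": 7,
    "min_child_weight": 3,
    "subsample": 1.0,
}

if __name__ == "__main__":
    # Toy values for demonstration only; these are not study data.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.5, 2.0, 2.5, 4.0]
    print("MAE :", mae(observed, predicted))
    print("RMSE:", rmse(observed, predicted))
    print("R2  :", r2(observed, predicted))
```

If the `xgboost` package is installed, a dictionary like this could be passed as `xgboost.XGBRegressor(**REPORTED_PARAMS)`. Comparing the three metrics on the training and testing sets is one way to check the kind of stability the abstract describes.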
Keywords: Forecasting cost model, Machine learning, Diabetes type 2, Vietnam