, Dhihram Tenrisau2, Syarif Rahman Hasibuan3,4, Bhirau Wilaksono5, Yeni Indriyani6, Andi Afdal Abdullah7, Halik Malik7, Andi Alfian Zainuddin8
1Sekolah Tinggi Ilmu Kesehatan Dharma Husada, Bandung, Indonesia
2Public Health Literature Club, Yogyakarta, Indonesia
3Faculty of Medicine, Universitas Pembangunan Nasional Veteran, Jakarta, Indonesia
4Center for Health Administration and Policy Studies, Faculty of Public Health, Universitas Indonesia, Jakarta, Indonesia
5Center for Longevity Research, Faculty of Medicine, Universitas Negeri Makassar, Makassar, Indonesia
6Department of Public Health, Faculty of Health Science, Universitas Muhammadiyah Surakarta, Surakarta, Indonesia
7Social Insurance Administration Organization, Jakarta, Indonesia
8Department of Public Health and Community, Faculty of Medicine, Universitas Hasanuddin, Makassar, Indonesia
Copyright © 2026 The Korean Society for Preventive Medicine
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Conflict of Interest
The authors have no conflicts of interest associated with the material presented in this paper.
Funding
None.
Acknowledgements
The authors thank BPJS Kesehatan for granting access to the anonymized dataset used in this study. They also thank Pierre Masselot, Alex Lewin, and Andree Valle Campos for sharing valuable insights from the “Machine Learning” and “Data Challenge” modules of the MSc Health Data Science program at the London School of Hygiene and Tropical Medicine (LSHTM).
Author Contributions
Conceptualization: Mahwati Y, Tenrisau D. Data curation: Hasibuan SR. Formal analysis: Tenrisau D. Funding acquisition: None. Methodology: Mahwati Y, Tenrisau D, Hasibuan SR. Project administration: Mahwati Y. Visualization: Tenrisau D. Writing – original draft: Mahwati Y, Tenrisau D, Hasibuan SR, Wilaksono B, Indriyani Y, Abdullah AA, Malik H, Zainuddin AA. Writing – review & editing: Mahwati Y, Tenrisau D, Hasibuan SR, Wilaksono B, Indriyani Y, Abdullah AA, Malik H, Zainuddin AA.
| Characteristics | n (%) or mean±SD |
|---|---|
| Age (y) | 68.90±6.75 |
| Sex | |
| Male | 53 165 (51.8) |
| Female | 49 563 (48.2) |
| Marital status | |
| Unmarried | 6780 (6.6) |
| Divorced | 17 977 (17.5) |
| Married | 74 247 (72.3) |
| Undefined | 3724 (3.6) |
| Utilization by severity level | |
| Mild | 0.83±1.35 |
| Moderate | 0.33±0.82 |
| Severe | 0.14±0.44 |
| Outpatient care | 14.30±32.50 |
| Utilization by ICD-10 diagnosis | |
| ICD Chapter 1 | 0.16±0.55 |
| ICD Chapter 2 | 0.13±0.69 |
| ICD Chapter 3 | 0.04±0.38 |
| ICD Chapter 4 | 0.23±0.98 |
| ICD Chapter 5 | 0.02±0.33 |
| ICD Chapter 6 | 0.06±0.60 |
| ICD Chapter 7 | 0.50±1.60 |
| ICD Chapter 8 | 0.06±0.31 |
| ICD Chapter 9 | 0.63±1.70 |
| ICD Chapter 10 | 0.26±1.09 |
| ICD Chapter 11 | 0.28±0.88 |
| ICD Chapter 12 | 0.05±0.35 |
| ICD Chapter 13 | 0.26±1.95 |
| ICD Chapter 14 | 0.30±4.26 |
| ICD Chapter 15 | 0.00±0.03 |
| ICD Chapter 17 | 0.00±0.08 |
| ICD Chapter 18 | 0.19±0.78 |
| ICD Chapter 19 | 0.10±0.39 |
| ICD Chapter 21 | 12.30±30.50 |
| Utilization by type of service received | |
| Outpatient | 14.30±32.50 |
| Inpatient | 1.29±1.88 |
| Utilization by ward class | |
| Ward class I | 6.11±25.00 |
| Ward class II | 2.80±14.50 |
| Ward class III | 6.68±20.20 |
| Length of stay (day) | 5.32±9.03 |
| Utilization by participant segmentation | |
| Non-worker | 25 054 (24.4) |
| PBI APBD | 9116 (8.9) |
| PBI APBN | 26 553 (25.8) |
| PBPU | 36 155 (35.2) |
| PPU | 5850 (5.7) |
| Utilization by facility ownership | |
| Government | 6.55±19.80 |
| Private | 9.03±25.40 |
| Utilization by facility type | |
| Type A hospital | 0.72±6.56 |
| Type B hospital | 4.16±18.60 |
| Type C hospital | 7.98±21.70 |
| Type D hospital | 1.79±8.39 |
| Special hospital | 0.87±5.26 |
| Other | 0.07±2.19 |
| Claim cost (IDR) | 12 740 059±25 021 824 |
| Models | RMSE | R² | MAE |
|---|---|---|---|
| Linear regression | 13 710 035 | 0.72 | 5 641 598 |
| Random forest | | | |
| ntree 50 | 12 434 238 | 0.78 | 4 696 412 |
| ntree 100 | 12 438 436 | 0.78 | 4 672 509 |
| ntree 200 | 12 437 736 | 0.78 | 4 667 147 |
| XGBoost | | | |
| nrounds 100 | 11 508 546 | 0.80 | 4 579 177 |
| nrounds 200 | 11 451 146 | 0.81 | 4 549 908 |
| nrounds 500 | 11 360 283 | 0.81 | 4 485 917 |
SD, standard deviation; ICD-10, International Classification of Diseases, 10th revision; PBI APBD, government-subsidized premium (local); PBI APBN, government-subsidized premium (national); PBPU, informal self-employed; PPU, formal worker participant; IDR, Indonesian rupiah.
RMSE, root mean square error; MAE, mean absolute error.
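For readers unfamiliar with the three evaluation metrics reported for the models, the following is a minimal sketch of how they are conventionally computed from paired observed and predicted values. It is illustrative only and is not the authors' code: the function name `regression_metrics` and the toy values are hypothetical, and R² is assumed here to be the coefficient of determination (1 minus the ratio of residual to total sum of squares), whereas some software reports the squared correlation between observed and predicted values instead.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, r2, mae) for paired observed and predicted values.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Residual and total sums of squares.
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")

    rmse = math.sqrt(ss_res / n)           # root mean square error
    mae = sum(abs(e) for e in errors) / n  # mean absolute error
    r2 = 1.0 - ss_res / ss_tot             # assumed definition of R2
    return rmse, r2, mae


# Toy values (not study data): three exact predictions and one off by 2.
rmse, r2, mae = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
print(f"RMSE={rmse:.2f}, R2={r2:.2f}, MAE={mae:.2f}")
```

Because RMSE squares each error before averaging, a few very large prediction errors raise it much more than they raise MAE, which is one common reason the two metrics can differ substantially on skewed outcomes.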