A Choice of Performance Metrics for Evaluating Predictive Accuracy of Survival Models

Authors

  • Kumur John Haganawiga, Department of Mathematics, Sharda School of Basic Sciences and Research, Sharda University, Greater Noida-201310, India. ORCID: https://orcid.org/0009-0000-8723-232X
  • Surya Kant Pal, Department of Mathematics, Sharda School of Basic Sciences and Research, Sharda University, Greater Noida-201310, India
  • Anu Sirohi, Department of Statistics, Amity Institute of Applied Sciences, Amity University, Noida, India

DOI:

https://doi.org/10.6000/1929-6029.2025.14.16

Keywords:

Survival analysis, Infant Mortality, Child Mortality, Predictive Performance, Penalized Cox Proportional Hazards, Performance Metrics

Abstract

This research critically assessed the predictive accuracy of parametric survival models (Weibull, Exponential, Log-logistic, and Gompertz) against penalized Cox PH models (Ridge, Lasso, and Elastic Net) using both simulated data (sample sizes of 100, 200, and 1000) and real-world data from the Nigerian Demographic and Health Survey (NDHS). The findings showed that parametric models, particularly the Weibull and Log-logistic models, consistently outperformed the others, achieving the highest Concordance Index (C-index) and the lowest Mean Absolute Error (MAE) and Mean Squared Error (MSE), indicating superior discrimination and calibration. In contrast, penalized Cox models underperformed, especially with a larger number of covariates, and the Gompertz model exhibited poor predictive performance under all conditions. Notably, parametric models remained stable and consistent even with smaller sample sizes and high-dimensional, complex data. These results highlighted the reliability of parametric models in survival analysis, particularly in small-sample and high-dimensional settings, offering key insights to inform future infant and child health research.
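
As a rough illustration of the three metrics named in the abstract, the following minimal Python sketch computes Harrell's C-index, MAE, and MSE on a small made-up dataset. It is not the authors' code: the function names, the toy values, the choice to skip pairs with tied times, and the choice to compute MAE and MSE only on subjects with an observed event are all assumptions made for this example, since the abstract does not state how censoring was handled.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Share of comparable pairs in which the subject who failed earlier
    has the higher predicted risk (ties in risk count as 0.5)."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject a has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # Assumption: a pair is comparable only if the times differ and
        # the shorter time is an observed event (not censored).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def mae_mse_uncensored(times, events, predicted_times):
    """MAE and MSE of predicted survival times, using only subjects
    whose event was observed (an assumption of this sketch)."""
    errors = [p - t for t, e, p in zip(times, events, predicted_times) if e]
    if not errors:
        raise ValueError("no uncensored observations")
    mae = sum(abs(x) for x in errors) / len(errors)
    mse = sum(x * x for x in errors) / len(errors)
    return mae, mse


# Hypothetical toy data: follow-up time, event indicator (1 = event observed,
# 0 = censored), model risk score, and model-predicted survival time.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
risk_scores = [0.9, 0.7, 0.8, 0.3, 0.1]
predicted_times = [3, 4, 7, 6, 12]

c_index = harrell_c_index(times, events, risk_scores)
mae, mse = mae_mse_uncensored(times, events, predicted_times)
print(f"C-index={c_index:.3f}  MAE={mae:.3f}  MSE={mse:.3f}")
```

A C-index near 1 means the model ranks subjects well (discrimination), while lower MAE and MSE mean predicted survival times lie closer to the observed ones. Dropping censored subjects from the error metrics is a simplification; established survival libraries offer censoring-aware alternatives.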

References

Collett D. Modelling survival data in medical research. Chapman and Hall/CRC 2023. DOI: https://doi.org/10.1201/9781003282525

Uno H, Cai T, Tian L, Wei LJ. Evaluating prediction rules for t-year survivors with censored regression models. Journal of the American Statistical Association 2007; 102(478): 527-37. DOI: https://doi.org/10.1198/016214507000000149

Harrell FE, Califf RM, Pryor DB, Lee KL, Rosati RA. Evaluating the yield of medical tests. JAMA 1982; 247(18): 2543-6. DOI: https://doi.org/10.1001/jama.1982.03320430047030

Jing B, Zhang T, Wang Z, Jin Y, Liu K, Qiu W, Ke L, Sun Y, He C, Hou D, Tang L. A deep survival analysis method based on ranking. Artificial Intelligence in Medicine 2019; 98: 1-9. DOI: https://doi.org/10.1016/j.artmed.2019.06.001

Amico M, Van Keilegom I, Han B. Assessing cure status prediction from survival data using receiver operating characteristic curves. Biometrika 2021; 108(3): 727-40. DOI: https://doi.org/10.1093/biomet/asaa080

Devlin SM, Heller G. Concordance probability as a meaningful contrast across disparate survival times. Statistical Methods in Medical Research 2021; 30(3): 816-25. DOI: https://doi.org/10.1177/0962280220973694

Hsu TC, Lin C. Generative adversarial networks for robust breast cancer prognosis prediction with limited data size. In: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) 2020; pp. 5669-5672. IEEE. DOI: https://doi.org/10.1109/EMBC44109.2020.9175736

Willmott CJ, Matsuura K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Climate Research 2005; 30(1): 79-82. DOI: https://doi.org/10.3354/cr030079

Hastie T, Tibshirani R, Friedman JH. The elements of statistical learning: data mining, inference, and prediction. New York: Springer 2009. DOI: https://doi.org/10.1007/978-0-387-84858-7

Zhou H, Wang H, Wang S, Zou Y. SurvMetrics: An R package for Predictive Evaluation Metrics in Survival Analysis. R J 2023; 14(4): 252-263. DOI: https://doi.org/10.32614/RJ-2023-009

Wang P, Li Y, Reddy CK. Machine learning for survival analysis: A survey. ACM Computing Surveys (CSUR) 2019; 51(6): 1-36. DOI: https://doi.org/10.1145/3214306

Zhang L, Dong D, Zhong L, Li C, Hu C, Yang X, Liu Z, Wang R, Zhou J, Tian J. Multi-focus network to decode imaging phenotype for overall survival prediction of gastric cancer patients. IEEE Journal of Biomedical and Health Informatics 2021; 25(10): 3933-42. DOI: https://doi.org/10.1109/JBHI.2021.3087634

Choodari‐Oskooei B, Royston P, Parmar MK. A simulation study of predictive ability measures in a survival model I: explained variation measures. Statistics in Medicine 2012; 31(23): 2627-43. DOI: https://doi.org/10.1002/sim.4242

Fatima-Tuz-Zahura M, Mohammad KA, Bari W. Log-logistic proportional odds model for analyzing infant mortality in Bangladesh. Asia Pacific Journal of Public Health 2017; 29(1): 60-9. DOI: https://doi.org/10.1177/1010539516680023

Setu SP, Kabir R, Islam MA, Alauddin S, Nahar MT. Factors associated with time to first birth interval among ever married Bangladeshi women: A comparative analysis on Cox-PH model and parametric models. PLOS Global Public Health 2024; 4(12): e0004062. DOI: https://doi.org/10.1371/journal.pgph.0004062

Zou H, Hastie T. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society Series B: Statistical Methodology 2005; 67(2): 301-20. DOI: https://doi.org/10.1111/j.1467-9868.2005.00503.x

Published

2025-03-25

How to Cite

Haganawiga, K. J., Pal, S. K., & Sirohi, A. (2025). A Choice of Performance Metrics for Evaluating Predictive Accuracy of Survival Models. International Journal of Statistics in Medical Research, 14, 153–160. https://doi.org/10.6000/1929-6029.2025.14.16

Section

General Articles