A Correlation Technique to Reduce the Number of Predictors to Estimate the Survival Time of HIV/AIDS Patients on ART


  • Anurag Sharma Department of Statistics, University of Delhi, Delhi-110007, India
  • Vajala Ravi Department of Statistics, Lady Shri College, University of Delhi, Delhi-110024, India
  • Gurprit Grover Department of Statistics, University of Delhi, Delhi-110007, India
  • Rabindra Nath Das Department of Statistics, The University of Burdwan, Burdwan, West Bengal-713104, India
  • M.K. Varshney Department of Statistics, Hindu College, University of Delhi, Delhi-110007, India




AIDS, AFT, Correlation, Chi-Square test, One-Way ANOVA.


Many research papers have been published that aim to estimate the survival time of HIV/AIDS patients using all available predictors, viz. Age, Sex, CD4, MOT, Smoking, Weight, HB, Coinfection, Time, BMI, Location Status, Marital Status, Drug, etc., although not all predictors need to be included in the model. Some of the predictors may be correlated or associated with one another while also influencing the outcome variable; therefore, instead of including both predictors of a significantly correlated/associated pair, we may include only one of the two. In this way, we may be able to reduce the number of predictors without affecting the estimated survival time. In this paper, we try to reduce the number of predictors by identifying the highly positively correlated predictors and then evaluating the effect of correlation/association on the survival time of HIV/AIDS patients. The predictors considered initially are Age, Sex, State, Smoking, Alcohol, Drugs, Opportunistic Infections (OI), Living Status (LS), Occupation (OC), Marital Status (MS) and Spouse, for data collected from 2004 to 2014 on AIDS patients at an ART center in Delhi, India. We performed one-way ANOVA to test the association between a quantitative and a categorical variable, and the Chi-square test to test the association between two categorical variables. To select one of two highly correlated/associated predictors, a suitable model is fitted twice, treating each predictor in turn as the independent variable and the other as the dependent variable; the fit with the smaller AIC is preferred, and its independent variable is included in the modified model. The fitted models are logistic, linear or multinomial logistic, depending on the type of the variable being modelled. The true model (containing all the predictors) and the modified model (with the reduced number of predictors) are then compared on the basis of their AICs, and the model with the minimum AIC is chosen.
In this way, we reduced the number of predictors by almost 50% without affecting the estimated survival time, and with a reduced standard error.
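
The screening step described above (Chi-square test for two categorical predictors, one-way ANOVA for a quantitative and a categorical predictor) can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation: the function names and the toy data are hypothetical, and only the test statistics and their degrees of freedom are computed. The p-values would then be obtained from the chi-square and F distributions using a statistics package.

```python
from collections import Counter


def chi_square_statistic(x, y):
    """Pearson chi-square statistic and degrees of freedom for
    two categorical sequences of equal length (hypothetical helper)."""
    n = len(x)
    row_totals = Counter(x)
    col_totals = Counter(y)
    observed = Counter(zip(x, y))
    stat = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def one_way_anova_f(values, groups):
    """One-way ANOVA F statistic and (between, within) degrees of freedom
    for a quantitative variable split by a categorical variable.
    Assumes at least two groups and non-zero within-group variation."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    n = len(values)
    k = len(by_group)
    grand_mean = sum(values) / n
    ss_between = 0.0
    ss_within = 0.0
    for vs in by_group.values():
        group_mean = sum(vs) / len(vs)
        ss_between += len(vs) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in vs)
    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f_stat, (k - 1, n - k)


# Toy data, invented purely for illustration (not from the study).
smoking = ["yes", "yes", "no", "no"]
alcohol = ["yes", "yes", "no", "no"]
age = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sex = ["a", "a", "a", "b", "b", "b"]

chi2, chi2_dof = chi_square_statistic(smoking, alcohol)
f_stat, f_dof = one_way_anova_f(age, sex)
print(chi2, chi2_dof)  # 4.0 1
print(f_stat, f_dof)   # 13.5 (1, 4)
```

A large statistic relative to its reference distribution flags a pair of predictors as associated; such a pair would then go to the AIC-based step that decides which of the two predictors is retained.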


