The Hybrid ROC (HROC) Curve and its Divergence Measures for Binary Classification

Authors

  • S. Balaswamy, Department of Statistics, Pondicherry University, Puducherry – 605 014, India
  • R. Vishnu Vardhan, Department of Statistics, Pondicherry University, Puducherry – 605 014, India
  • K.V.S. Sarma, Department of Statistics, Sri Venkateswara University, Tirupati – 517 502, India

DOI:

https://doi.org/10.6000/1929-6029.2015.04.01.11

Keywords:

AUC, Exponential distribution, Half-Normal distribution, Hybrid ROC Curve, Kullback-Leibler Divergence.

Abstract

The Receiver Operating Characteristic (ROC) curve is a widely used tool for assessing the performance of a diagnostic test. The Binormal model is commonly used when the test scores in both the diseased and healthy populations follow Normal distributions. In real applications, however, the two populations may follow different distributions, each with a continuous density function. In this paper we consider a model in which the healthy and diseased populations follow Half-Normal and Exponential distributions respectively, and name the resulting curve the Hybrid ROC (HROC) curve. Its properties and expressions for the Area Under the Curve (AUC) are derived. Further, to measure the distance between the two distributions, a popular divergence measure, the Kullback-Leibler Divergence (KLD), is used. Simulation studies were conducted to study the functional behavior of the HROC curve and to show the importance of KLD in classification.
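
As a rough illustration of the setting described in the abstract, the following Python sketch builds ROC points for Half-Normal healthy scores and Exponential diseased scores, and approximates the AUC and the two directed Kullback-Leibler divergences by numerical integration. The parameter names sigma and rate, the convention that larger scores indicate disease, the integration limits, and all function names are assumptions made for this example; they are not taken from the paper, and the paper's own expressions may be parameterized differently.

```python
import math


def halfnormal_logpdf(t, sigma):
    # Log density of a Half-Normal with scale sigma on t >= 0 (assumed healthy model).
    return 0.5 * math.log(2.0 / math.pi) - math.log(sigma) - t * t / (2.0 * sigma * sigma)


def exponential_logpdf(t, rate):
    # Log density of an Exponential with the given rate on t >= 0 (assumed diseased model).
    return math.log(rate) - rate * t


def halfnormal_sf(t, sigma):
    # Survival function P(H > t); plays the role of the false positive rate at threshold t.
    return math.erfc(t / (sigma * math.sqrt(2.0)))


def exponential_sf(t, rate):
    # Survival function P(D > t); plays the role of the true positive rate at threshold t.
    return math.exp(-rate * t)


def trapezoid(f, a, b, n):
    # Composite trapezoidal rule with n equal sub-intervals on [a, b].
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def roc_points(sigma, rate, thresholds):
    # (FPR, TPR) pairs obtained by sweeping the threshold t.
    return [(halfnormal_sf(t, sigma), exponential_sf(t, rate)) for t in thresholds]


def auc(sigma, rate, upper=60.0, n=60000):
    # AUC = P(D > H), approximated as the integral of S_D(t) * f_H(t) over [0, upper].
    def integrand(t):
        return exponential_sf(t, rate) * math.exp(halfnormal_logpdf(t, sigma))
    return trapezoid(integrand, 0.0, upper, n)


def kld(logp, logq, upper=60.0, n=60000):
    # KL(p || q), approximated as the integral of p(t) * (log p(t) - log q(t)) over [0, upper].
    def integrand(t):
        lp = logp(t)
        return math.exp(lp) * (lp - logq(t))
    return trapezoid(integrand, 0.0, upper, n)


if __name__ == "__main__":
    sigma, rate = 1.0, 0.5  # illustrative values only

    def log_h(t):
        return halfnormal_logpdf(t, sigma)

    def log_d(t):
        return exponential_logpdf(t, rate)

    print("ROC points:", roc_points(sigma, rate, [0.0, 0.5, 1.0, 2.0, 4.0]))
    print("AUC:", auc(sigma, rate))
    print("KL(diseased || healthy):", kld(log_d, log_h))
    print("KL(healthy || diseased):", kld(log_h, log_d))
```

Because the KLD is not symmetric, the sketch reports both directions separately. Working with log densities avoids taking the logarithm of a value that has underflowed to zero in the far tail.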

Published

2015-01-27

How to Cite

Balaswamy, S., Vardhan, R. V., & Sarma, K. (2015). The Hybrid ROC (HROC) Curve and its Divergence Measures for Binary Classification. International Journal of Statistics in Medical Research, 4(1), 94–102. https://doi.org/10.6000/1929-6029.2015.04.01.11

Section

General Articles