Methods Inf Med 2001; 40(03): 259-264
DOI: 10.1055/s-0038-1634163
Original Article
Schattauer GmbH

Application of Resampling Techniques to the Statistical Analysis of the Brier Score

M. Ikeda (1), S. Itoh (2), T. Ishigaki (3), K. Yamauchi (1)
1   Department of Medical Information and Medical Records, Nagoya University Hospital, Japan
2   Division of Technical Radiology, Department of Health Science, Nagoya University, School of Medicine, Japan
3   Department of Radiology, Nagoya University, School of Medicine, Japan

Publication History

Publication Date: 07 February 2018 (online)

Abstract:

We investigated the application of resampling techniques to the statistical analysis of the Brier score (B) and extended them to the statistical comparison of two Brier scores derived from the same set of patients. Resampling techniques proved helpful in the statistical analysis of B, and the jackknife and bootstrap methods gave almost identical results in this analysis. We therefore believe that B should be used more often as an index for evaluating probabilistic judgments when the assessment data sets are “degenerate” as receiver operating characteristic (ROC) data sets.
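
The abstract does not give the authors' formulas, so the following is only a minimal sketch of how textbook jackknife and bootstrap standard errors could be computed for a Brier score, and for the difference between two Brier scores from the same patients. It assumes B is the mean squared difference between forecast probability and the 0/1 outcome; the function names and the toy data are hypothetical and are not taken from the article.

```python
import math
import random


def squared_errors(probs, outcomes):
    """Per-case squared difference between forecast probability and 0/1 outcome."""
    return [(p - o) ** 2 for p, o in zip(probs, outcomes)]


def brier_score(probs, outcomes):
    """Brier score, taken here as the mean of the per-case squared errors."""
    errs = squared_errors(probs, outcomes)
    return sum(errs) / len(errs)


def jackknife_se(values):
    """Jackknife standard error of the mean of per-case values."""
    n = len(values)
    total = sum(values)
    # Leave-one-out estimates: the mean recomputed with each case removed in turn.
    loo = [(total - v) / (n - 1) for v in values]
    loo_mean = sum(loo) / n
    return math.sqrt((n - 1) / n * sum((x - loo_mean) ** 2 for x in loo))


def bootstrap_se(values, n_boot=2000, seed=0):
    """Bootstrap standard error of the mean: resample cases with replacement."""
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(n_boot):
        sample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    grand = sum(means) / n_boot
    return math.sqrt(sum((m - grand) ** 2 for m in means) / (n_boot - 1))


# Hypothetical toy data (NOT from the article): two forecasters, same 8 patients.
outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
probs_a = [0.9, 0.2, 0.7, 0.6, 0.3, 0.1, 0.8, 0.4]
probs_b = [0.7, 0.4, 0.6, 0.5, 0.5, 0.3, 0.6, 0.5]

errs_a = squared_errors(probs_a, outcomes)
errs_b = squared_errors(probs_b, outcomes)

print("B (A):", brier_score(probs_a, outcomes))
print("jackknife SE:", jackknife_se(errs_a))
print("bootstrap SE:", bootstrap_se(errs_a))

# Paired comparison: work with per-patient differences so that resampling
# keeps both forecasts for a patient together.
diffs = [a - b for a, b in zip(errs_a, errs_b)]
mean_diff = sum(diffs) / len(diffs)
print("B(A) - B(B):", mean_diff)
print("z (jackknife):", mean_diff / jackknife_se(diffs))
print("z (bootstrap):", mean_diff / bootstrap_se(diffs))
```

Because B as defined here is a simple mean over cases, the jackknife standard error coincides with the usual standard error of the mean, and the bootstrap estimate should lie close to it. Resampling whole patients, rather than each set of forecasts separately, is what preserves the correlation between two scores obtained from the same cases.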

 