Regression Methods for Epidemiologic Analysis

  • Chapter
Handbook of Epidemiology

Abstract

Basic tabular and graphical methods are an essential component of epidemiologic analysis and are often sufficient, especially when one need consider only a few variables at a time. They are, however, limited in the number of variables that they can examine simultaneously. Even sparse-strata methods (such as Mantel-Haenszel) require that some strata have two or more subjects; yet, as more and more variables or categories are added to a stratification, the number of subjects in each stratum may eventually drop to 0 or 1.
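
The arithmetic behind this thinning can be made concrete with a minimal sketch. The study size of 1000 and the choice of three categories per variable are hypothetical values picked for illustration, and the helper names `stratum_count` and `mean_subjects_per_stratum` are not from the chapter. The sketch only counts cells in a full cross-classification; real data are spread unevenly, so many strata would fall below the average shown.

```python
from math import prod


def stratum_count(levels):
    """Number of strata when variables with the given numbers of
    categories are fully cross-classified."""
    return prod(levels)


def mean_subjects_per_stratum(n_subjects, levels):
    """Average stratum size if subjects were spread evenly (an optimistic
    assumption; actual strata are usually far less balanced)."""
    return n_subjects / stratum_count(levels)


if __name__ == "__main__":
    n = 1000  # hypothetical study size
    for k in range(1, 9):
        levels = [3] * k  # hypothetical: k variables, 3 categories each
        strata = stratum_count(levels)
        mean_size = mean_subjects_per_stratum(n, levels)
        print(f"{k} variables -> {strata} strata, "
              f"mean {mean_size:.2f} subjects per stratum")
```

Under these assumptions the average stratum size falls below two subjects once seven three-category variables are stratified together, which is the situation the abstract describes as defeating even sparse-strata methods.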

© 2005 Springer-Verlag Berlin Heidelberg

Cite this chapter

Greenland, S. (2005). Regression Methods for Epidemiologic Analysis. In: Ahrens, W., Pigeot, I. (eds) Handbook of Epidemiology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-26577-1_17