Introduction

High fasting plasma glucose was recently ranked as the fifth leading risk for death [1] and 6.8% of global excess mortality was attributed to diabetes [2]. Prevalence of this metabolic disorder is predicted to reach nearly 600 million cases by 2035 [3], posing both a substantial morbidity and mortality burden and a large financial cost on individuals and healthcare systems [4, 5].

Evidence on the effects of physical activity (PA) on risk of diabetes arises from interventional [6–9] and observational studies [10–14]. Prevention trials conducted in patients with impaired glucose tolerance provide some understanding of the extent to which PA may confer a preventive effect on progression to type 2 diabetes in high-risk populations [6–9, 15]. However, the majority of these studies include both diet and PA interventions, and isolation of the impact of PA itself is rarely possible. It is also difficult to evaluate the benefit of the whole PA exposure continuum from trials, as most intervention studies focus on shifting participants’ behaviours towards the recommended levels of exercise rather than assessing the benefits of changes at the lowest ends of the normal PA spectrum, or the additional benefits gained at the highest level. Therefore, although associated with a higher risk of confounding, evidence from cohort studies in the general population can provide complementary evidence of the dose–response relationship between PA and diabetes, independent of diet.

Public health guidelines [16, 17] recommend a minimum of 150 min of moderate to vigorous PA (MVPA) or 75 min vigorous PA (VPA) a week to maintain general health. Self-report data suggest that around a third of adults globally are not meeting these targets [18]. A fundamental consideration in the formulation of PA guidelines, however, is the nature of the dose–response relationship between PA and non-communicable disease incidence.

Dose–response curves for PA and health outcomes, ranging from cardiovascular disease to all-cause mortality, suggest a non-linear dose–response shape [19–24], often with large gains when low activity is compared with being completely sedentary but much smaller additional benefits beyond that. A recent review suggested a non-linear relationship between PA and diabetes. However, it found differently shaped dose–response curves based on the different ways in which PA was reported in the original studies [25]. Each of the dose–response analyses included only a small portion of the total studies available in this area of research, owing to a lack of data harmonisation. This left considerable uncertainty about the relative risk for any given exposure, since not all of the evidence could be considered.

Providing quantitative estimates regarding the dose–response relationship is essential for approximating how changes in levels of PA in the general population would impact disease incidence, and would support more nuanced guidance to the public and evidence-based dialogue in clinical settings.

Calculating the dose of PA is associated with considerable uncertainty and can be achieved using a variety of methods. In deciding how to equate activities of varying intensity, one issue is whether to include the resting metabolic rate. In this review we investigate the dose–response relationship between PA and type 2 diabetes via a systematic review and dose–response meta-analysis. We report results quantifying PA dose, both via inclusion and exclusion of the resting metabolic rate in the summation of PA volume.

Methods

Search strategy

PubMed and EMBASE were searched for prospective cohort studies on the association between PA and type 2 diabetes using a combination of medical subject heading (MeSH) and indexed terms (details in electronic supplementary material [ESM] Fig. 1). Search filters for observational studies were applied to refine the search output. The reference lists of past systematic reviews were manually searched for further studies [26–32]. No restrictions on date of publication were set and new results were included up until December 2015.

Eligibility criteria

Prospective studies were included if they: (1) followed a cohort of adults; (2) excluded individuals with type 2 diabetes at baseline; (3) ascertained levels of leisure-time PA (LTPA) or total PA at baseline; and (4) reported RRs, ORs or HRs for incidence of type 2 diabetes. Exclusion criteria were: (1) studies which reported insufficient detail of PA assessment to estimate PA dose in metabolic equivalent of task (MET) h/week; (2) studies using measures of fitness as the exposure; (3) studies reporting PA as a dichotomous variable; and (4) duplicate data.

Two researchers (ADS and BR-S) screened titles and abstracts for eligibility according to the pre-specified criteria. When eligibility was ambiguous, the full text was retrieved. To ensure no duplicate data were included, cohort name, recruitment periods or protocols were compared, and only the most complete publication was included. A third researcher (O. Olayinka, London School of Hygiene and Tropical Medicine, London, UK) assessed the identified articles and any disagreements were discussed until consensus was reached. A breakdown of the literature search is shown in ESM Fig. 2.

Data extraction and exposure harmonisation

Data were extracted (by ADS) from eligible studies on first author, publication date, geographical location, cohort size, sex and age characteristics, cumulative incidence or incidence rate of type 2 diabetes, case count per category of PA exposure, total persons or person-years per PA category, method and unit of PA assessment, reported levels of PA exposure, ORs/RRs/HRs for type 2 diabetes with 95% CIs for each PA category, and covariates for which the analyses were adjusted. Overall study quality score was derived using the Newcastle Ottawa Scale (NOS); inter-rater reliability (between ADS and O. Olayinka) was 86% (full NOS results are shown in ESM Table 1).

In prospective studies where HRs or ORs for type 2 diabetes were reported, we assumed these approximated the RR [33]. We pooled the most adjusted risk estimates both including and excluding adjustment for BMI. Initially we harmonised group-level exposure estimates to the common unit of MET h/week, thereby allowing integration of activities differing in intensity and duration amassed over the course of a week. For the assignment of specific intensities to categories of PA exposure, average intensity of MVPA and VPA was defined as 4.5 and 8 METs (or 3.5 and 7 marginal METs [MMETs]), respectively [34]. Studies reporting data independently for men and women [35–39] or for multiple cohorts within a study [35] were treated as separate observations. Risk estimates from studies that used the highest category of PA as the referent [36, 40–42] were re-calculated so that the lowest PA category was the referent [43].

When not directly reported, classic PA volume (MET h/week) was calculated by multiplying the median or mid-point duration of the reported category by its assigned gross MET value. Open-ended categories for average LTPA duration were converted to point estimates by assuming that the median of the open-ended category lay above its lower boundary by half the interval width of the neighbouring category [44]. For one study that reported PA as PA level (PAL, a measure of energy expenditure expressed as a multiple of 24 h resting metabolic rate), an approximation of LTPA MET h/week was performed using descriptions of typical PA levels for each category [45]. If PA was reported only as frequency of sessions per week, a single session was assumed to consist of 45 min in the main analysis, with an assumption of 30 min tested in sensitivity analysis. Likewise, if only average duration for PA (e.g. walking, cycling) was reported, we assumed this was undertaken at an intensity of 4.5 METs. Marginalised PA volume (MMET h/week) was calculated by discounting the resting metabolic rate of 1 MET in the quantification of PA intensity. An overview of dose assignment calculations is shown in ESM Table 2. For summary data, we subtracted 1 MET h from each 1 h increment over which total reported activity was performed. When the required data were not reported in the original articles we emailed authors from the identified cohorts to acquire further details, e.g. on duration of PA and number of type 2 diabetes cases for each PA exposure category. Following correspondence, updated follow-up data [11, 13] and further details on PA behaviour [11, 38, 46, 47] were obtained.
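The dose assignment rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, and the open-ended category rule is implemented under the reading that the point estimate is the lower boundary plus half the width of the neighbouring category.

```python
def met_hours_per_week(hours_per_week, gross_met):
    """Classic PA volume: weekly duration times gross intensity (rest included)."""
    return hours_per_week * gross_met


def mmet_hours_per_week(hours_per_week, gross_met, resting_met=1.0):
    """Marginal PA volume: discount the resting rate of 1 MET from the intensity."""
    return hours_per_week * (gross_met - resting_met)


def open_ended_point_estimate(lower_bound, neighbour_width):
    """Assumed reading of the open-ended rule: lower bound plus half the
    interval width of the neighbouring (closed) category."""
    return lower_bound + neighbour_width / 2.0


def sessions_to_hours(sessions_per_week, minutes_per_session=45):
    """Convert session frequency to weekly hours (45 min/session in the main
    analysis, 30 min/session in the sensitivity analysis)."""
    return sessions_per_week * minutes_per_session / 60.0


# Guideline volume: 150 min/week of MVPA at an assumed 4.5 METs.
guideline_hours = 150 / 60
print(met_hours_per_week(guideline_hours, 4.5))   # 11.25 MET h/week
print(mmet_hours_per_week(guideline_hours, 4.5))  # 8.75 MMET h/week
```

For a study reporting only three sessions per week, `sessions_to_hours(3)` gives 2.25 h/week under the main-analysis assumption, which can then be passed to either volume function.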

Statistical analysis

Generalised least-squares (GLS) regression was performed to estimate study-specific dose–response curves. GLS regression estimates the linear dose–response coefficients taking into account the covariance for each exposure category within each study, as they are estimated relative to a common referent PA exposure category [48, 49]. Study-specific dose–response coefficients were pooled using the DerSimonian–Laird estimator in a random-effects model [50]. First, a linear association was assumed; study-specific RR estimates were calculated per 10 MET h/week increment and subsequently pooled. Two cohorts [51, 52] did not provide sufficient data to be included in this model. However, variance-weighted least-squares regression analysis was used to estimate linear associations for both of these studies, allowing us to quantify the influence of excluding these on the overall effect estimates.
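The pooling step can be illustrated with a minimal DerSimonian–Laird sketch. It assumes that each study contributes one log RR per 10 MET h/week and its standard error (here recovered from a 95% CI); the study-specific GLS step is not reproduced, and the function names and input values are hypothetical rather than taken from the included cohorts.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of a log RR recovered from its 95% CI limits."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def dersimonian_laird(log_rrs, ses):
    """Random-effects pooling of study-specific log RRs (DerSimonian-Laird)."""
    k = len(log_rrs)
    w = [1.0 / se ** 2 for se in ses]                      # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_rrs)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rrs))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0   # between-study variance
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_rrs)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return {"rr": math.exp(pooled),
            "ci": (math.exp(pooled - 1.96 * se_pooled),
                   math.exp(pooled + 1.96 * se_pooled)),
            "tau2": tau2, "i2": i2}


# Hypothetical study-level RRs per 10 MET h/week with 95% CIs.
studies = [(0.80, 0.72, 0.89), (0.90, 0.85, 0.95), (0.97, 0.95, 0.99)]
log_rrs = [math.log(rr) for rr, _, _ in studies]
ses = [se_from_ci(lo, hi) for _, lo, hi in studies]
print(dersimonian_laird(log_rrs, ses))
```

When the study estimates agree closely, the estimated between-study variance is truncated at zero and the result coincides with the fixed-effect estimate.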

Sensitivity analyses were conducted by consecutive removal of individual studies from the summary risk estimate and via restriction to high-quality studies. The impact of duration and intensity assumptions (when necessary) was assessed by applying lower values. Subgroup analyses by sex, study location, cohort size and follow-up time were undertaken. Mediation by BMI was explored according to the degree of adjustment (BMI adjusted vs non-BMI adjusted) and participant obesity (BMI < 30 vs > 30 kg/m²). To further reduce heterogeneity, we separately pooled risk estimates that focused either on LTPA or on the more inclusive measures of total PA. Significance of subgroup and sensitivity analyses was judged by the p value for heterogeneity [53].

In addition, we examined possible non-linear associations by modelling PA using restricted cubic spline with three knots located at the 25th, 50th and 75th percentiles of the distribution. Only studies reporting risk estimates for at least three PA exposure levels for incident type 2 diabetes [54] were included in this analysis. Departure from linearity of the final cubic spline model was assessed using the Wald test for non-linearity [55].
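A three-knot restricted cubic spline adds a single non-linear term to the linear dose term. The sketch below uses the standard truncated-power parameterisation, which may differ in scaling from the Stata routine used in the analysis; the function names, the example dose values and the coefficients `beta_linear` and `beta_spline` are hypothetical.

```python
import statistics


def knots_at_quartiles(doses):
    """Knots at the 25th, 50th and 75th percentiles of the exposure distribution."""
    return tuple(statistics.quantiles(doses, n=4))


def rcs_nonlinear_term(x, knots):
    """Non-linear basis term of a three-knot restricted cubic spline.

    The term is zero below the first knot and linear above the last knot,
    which is what 'restricted' refers to.
    """
    t1, t2, t3 = knots

    def cube_plus(u):
        return max(u, 0.0) ** 3

    scale = (t3 - t1) ** 2  # conventional normalisation
    return (cube_plus(x - t1)
            - cube_plus(x - t2) * (t3 - t1) / (t3 - t2)
            + cube_plus(x - t3) * (t2 - t1) / (t3 - t2)) / scale


def log_rr(x, beta_linear, beta_spline, knots):
    """Log RR at dose x relative to a dose of zero."""
    return beta_linear * x + beta_spline * rcs_nonlinear_term(x, knots)


# Hypothetical category doses in MET h/week.
doses = [0, 2.25, 4.5, 9, 11.25, 18, 22.5, 30, 45, 60]
knots = knots_at_quartiles(doses)
print(knots, rcs_nonlinear_term(11.25, knots))
```

Testing whether the coefficient on the non-linear term is zero corresponds to the Wald test for departure from linearity mentioned above.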

Publication bias was investigated by funnel plot and Egger’s test for asymmetry. All reported p values were two sided. All analyses were performed using Stata 13.1 (StataCorp, College Station, TX, USA). Interactive dose–response curves were visualised using R (R Foundation for Statistical Computing, Vienna, Austria) [56].
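Egger's test regresses each study's standardised effect on its precision and asks whether the intercept differs from zero. The following is a minimal sketch with hypothetical inputs, not the Stata routine used in the analysis. It returns the intercept, its standard error and the t statistic; a p value would additionally require the t distribution with n − 2 degrees of freedom, which is left out here.

```python
import math


def egger_regression(effects, ses):
    """Ordinary least squares of (effect / SE) on (1 / SE); needs >= 3 studies.

    Returns (intercept, se_intercept, t_statistic). An intercept far from
    zero suggests funnel-plot asymmetry.
    """
    z = [y / s for y, s in zip(effects, ses)]
    prec = [1.0 / s for s in ses]
    n = len(z)
    mx = sum(prec) / n
    mz = sum(z) / n
    sxx = sum((x - mx) ** 2 for x in prec)
    sxz = sum((x - mx) * (v - mz) for x, v in zip(prec, z))
    slope = sxz / sxx
    intercept = mz - slope * mx
    resid = [v - (intercept + slope * x) for x, v in zip(prec, z)]
    s2 = sum(r * r for r in resid) / (n - 2)          # residual variance
    se_int = math.sqrt(s2 * (1.0 / n + mx ** 2 / sxx))
    t = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t


# Hypothetical log RRs in which smaller studies report larger benefits.
ses = [0.05, 0.10, 0.20, 0.40]
effects = [-0.12, -0.25, -0.55, -0.80]
print(egger_regression(effects, ses))
```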

Results

Literature search

In total, 28 eligible cohort studies were identified which returned a total of 32 independent observations on PA and incidence of type 2 diabetes. The majority of studies (24 cohorts) yielded information on the association between LTPA and type 2 diabetes (28 observations), while four cohorts [39, 57–59] reported findings on total PA. Overall, this review includes 1,261,991 individuals and 84,134 incident cases of type 2 diabetes.

Study characteristics

Cohort size ranged from 916 to 675,496 people, with cumulative type 2 diabetes incidence ranging from 1.6% [42] to 27.5% [46]. Follow-up time varied from 3 [42] to 23.1 [60] years. Twelve studies were conducted in the USA [12, 14, 35, 38, 46, 58, 60–65], six in Asia [47, 57, 59, 66–68], two in Australia [40, 42] and eight across Europe [13, 36, 37, 39, 41, 69–71]. All cohorts relied on self-reported PA collected using questionnaires or by interview, apart from one study in Hawaiians [58]. A descriptive summary of the cohort characteristics can be found in Table 1.

Table 1 Summary of the characteristics of 28 prospective cohort studies that investigate the association between levels of PA and incident type 2 diabetes, identified in the systematic literature search

Age was the only variable for which all cohorts had adjusted their findings, with adjustment for other confounders varying considerably. Four cohorts [14, 36, 58, 64] did not adjust for BMI, a key variable believed to mediate the effect of PA on type 2 diabetes. Overall, inverse associations between PA and incident type 2 diabetes were observed for all identified cohorts.

Linear association between PA and incidence of type 2 diabetes

Study-specific linear RRs (95% CI) for 10 MET h/week increments of PA, sorted by PA domain and publication year, are shown in Fig. 1.

Fig. 1

Forest plot of the study-specific RRs for type 2 diabetes for every 10 MET h/week exposure of PA, sorted by PA domain and publication year. Study-specific estimates obtained by a generalised least squares regression assuming a linear relationship of the RR to the referent in a random-effects model. Referents for PA were the individuals reporting no or lowest level of PA within the specific study. (I)/(II) indicate subcohorts with independently reported risk estimates for type 2 diabetes. The black midline indicates the line of no effect. The diamond indicates the pooled (subgroup) estimate. Grey boxes are relative to study size and the black vertical lines indicate 95% CIs around the effect size estimate

The mean pooled risk reduction for type 2 diabetes was 13% (95% CI 11%, 16%) per 10 MET h/week increment of PA, albeit observed in the presence of high heterogeneity (I² 93.5%, p_Het < 0.001). Consecutive removal of single studies indicated no significant impact of any one study on the overall heterogeneity in the model (I² 88.3–92.3%, p_Het < 0.001). Likewise, restriction to studies rated as high quality did not substantially influence model heterogeneity (I² 82%, p_Het < 0.001, n = 17).

Risk reductions for type 2 diabetes were considerably more pronounced for LTPA compared with the benefits estimated for total PA. Each 10 MET h/week increment of LTPA reduced type 2 diabetes risk by 17% (95% CI 13%, 21%) compared with 5% (95% CI 2%, 7%) for each 10 MET h/week increment of total PA. Benefits from VPA integrated over time to MET h/week were much larger, with a decrease in risk of type 2 diabetes of 56% (95% CI 16%, 77%) per 10 MET h/week increment.

The effects appeared to be more pronounced in women with a pooled RR of 0.83 (95% CI 0.77, 0.90, I² 89.5%, p_Het < 0.001, n = 10 observations) compared with a pooled RR for men of 0.89 (95% CI 0.86, 0.93, I² 95.3%, p_Het < 0.001, n = 13 observations) per 10 MET h/week. Studies conducted in Asia on average observed less benefit, with a mean RR of 0.97 (95% CI 0.95, 0.98, I² 65.2%, p_Het < 0.001, n = 6 observations) per 10 MET h/week when compared to the USA (0.85 [95% CI 0.79, 0.91, I² 96.6%, p_Het < 0.001, n = 13]) or Europe (0.83 [95% CI 0.77, 0.89, I² 80.6%, p_Het < 0.001, n = 11 observations]). The two studies in Australia reported, on average, the highest benefit (0.81 [95% CI 0.65, 1.01, I² 77.1%, p_Het < 0.001]; see Table 2).

Table 2 Relative risk estimates for type 2 diabetes per 10 MET h/week of physical activity, stratified by study design and population characteristics

Adjustment for BMI appeared to attenuate the pooled protective effect size by around one-third, from 0.81 (95% CI 0.77, 0.84, I² 96.8%, p_Het < 0.001, n = 21 observations) to 0.87 (95% CI 0.84, 0.90, I² 92.6%, p_Het < 0.001, n = 27 observations). Stratification by participant BMI suggested the protective effect of activity was more pronounced in those with BMI < 30 kg/m², with an observed mean RR of 0.75 (95% CI 0.65, 0.95, I² 63.1%, p_Het = 0.01, n = 4 observations) vs 0.88 (95% CI 0.80, 0.96, I² 0.00, p_Het < 0.001, n = 3 observations) for obese individuals. Inspection of funnel plots and Egger’s test for asymmetry (p < 0.001) indicated the presence of publication bias or small-study effects (ESM Fig. 3).
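The attenuation of about one-third can be checked directly from the two pooled RRs. The text does not state which scale was used, so the sketch below computes it on both the risk-reduction scale and the log RR scale; the function name is hypothetical.

```python
import math


def attenuation(rr_unadjusted, rr_adjusted, log_scale=False):
    """Proportion of the protective effect removed by further adjustment."""
    if log_scale:
        return 1 - math.log(rr_adjusted) / math.log(rr_unadjusted)
    return 1 - (1 - rr_adjusted) / (1 - rr_unadjusted)


# Pooled RRs per 10 MET h/week without and with adjustment for BMI.
print(round(attenuation(0.81, 0.87), 3))                  # 0.316
print(round(attenuation(0.81, 0.87, log_scale=True), 3))  # 0.339
```

Both scales give roughly one-third, that is, a fall from a 19% to a 13% risk reduction per 10 MET h/week.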

Non-linear dose–response analysis

In total, data from 23 cohorts were included in the restricted cubic spline analysis and the ensuing pooling in a two-stage multivariate dose–response model. A significant non-linear dose–response is shown in Fig. 2a (p_Non-linearity < 0.001), with greater risk reduction at moderate exposures compared with higher ones.

Fig. 2

(a–d) Dose–response association between LTPA and incidence of type 2 diabetes modelled using restricted cubic splines and comparison of predicted RR point estimates for type 2 diabetes using different dose-assignment assumptions. LTPA converted to MET h/week with results pooled in a two-stage random-effects model. RRs were derived from a common lowest PA category within each study. Listed exposure levels were chosen to represent meaningful and easy to interpret PA volumes equivalent to the following: 30 min of MVPA; 1 h MVPA; rounded value to allow for comparison with GLS PA exposure increment; 150 min PA/current recommended guidelines; double the recommended guidelines and two high PA exposure levels investigating the risk reductions at the higher end of the LTPA spectrum. The bold lines indicate the pooled restricted cubic spline model and the black dashed line indicates the 95% CIs of the pooled curve. Duration assumption was necessary in nine out of 27 observations, applied as 45 min/session in scenarios (a) and (c), and 30 min/session in scenarios (b) and (d). Intensity assumption was necessary in 15 out of 27 observations, applied as low-intensity PA (LPA) = 3 MET, MVPA = 4.5 MET and VPA = 8 MET in scenarios (a) and (b), and LPA = 2 MET, MVPA = 3.5 MET and VPA = 7 MET in scenarios (c) and (d)

Results from the cubic spline model suggest that individuals who accumulate 11.25 MET h/week (equivalent to meeting the recommended guidelines of 150 min/week of activity at 4.5 MET) have a reduced risk of developing type 2 diabetes equal to 26% (95% CI 20%, 31%) relative to completely inactive individuals.

We found no indication of a substantial threshold effect or plateau for the obtained benefit across increasing levels of PA. Being active at a level corresponding to double that of the recommended minimal PA (22.5 MET h/week) was associated with a reduced risk of type 2 diabetes of 36% (95% CI 27%, 46%) with further reductions at higher doses (60 MET h/week, risk reduction of 53%), in the cubic spline model.

For 8.75 MMET h/week (equivalent to 11.25 MET h/week at a mean gross intensity of 4.5 MET) the pooled RR for type 2 diabetes was 0.74 (95% CI 0.69, 0.80), with risk being 0.64 (95% CI 0.56, 0.73) for those doing twice as much. Point risk estimates of the pooled dose–response relation for LTPA (in MET h/week) and type 2 diabetes are tabulated in Fig. 2 (also available online as an interactive version at http://epiweb.mrc-epid.cam.ac.uk/meta-analyses/pa/diabetes/).

Sensitivity analyses were run to assess the effect of assumptions regarding duration or intensity of the PA exposure used in the LTPA dose assignment procedure for those studies where this information was not directly available; see Fig. 2b–d and ESM Fig. 4. The shape of the dose–response curve was similar under these different assumptions. Benefits were larger for a given exposure if duration and intensity were assumed to be smaller in the original studies where these assumptions were needed. Furthermore, we repeated the final cubic spline model including variance-weighted linear dose–response gradients of the two identified studies that could not be used in the main model because of incomplete data. The impact of excluding these studies was minimal on the overall final result, with a risk reduction of 24% (95% CI 19%, 29%) at 11.25 MET h/week in this more inclusive model (ESM Table 3 and ESM Fig. 4).

Discussion

Our results from a comprehensive literature search identifying relevant longitudinal studies indicate an inverse association between PA and incidence of type 2 diabetes, which was consistently observed across the identified cohorts. In the restricted cubic spline model, accumulating an activity volume commensurate with the current public health recommendation of 150 min of MVPA per week was associated with a 26% (95% CI 20%, 31%) lower risk of type 2 diabetes in the general population, compared with being sedentary.

Our results suggest that the benefits of higher activity levels extend considerably beyond the minimum recommendations. Using the restricted cubic spline model we found that a doubling of activity volume from 11.25 MET h/week to 22.5 MET h/week would further reduce the risk of type 2 diabetes by 10% to a total risk reduction of 36% compared with being inactive. For an intensity of 4.5 MET, our results were very similar under the MMET analysis. However, a greater benefit would be gained from using MMETs for more intensive activity, whereas less intensive activity would gain smaller benefits.

Central to any dose–response analysis for assessing PA in relation to health is the issue of uncertainty in the way by which PA was assessed in free-living individuals. Self-reported PA generally correlates significantly but weakly with objective methods of PA ascertainment, with approximately 10% shared variance [60]. A further crucial issue which may have affected our findings is the substantial heterogeneity in the measurement and reporting of PA behaviour, resulting from questionnaires ascertaining different domains, timeframes and/or units of PA. Methods of outcome assessment were also not consistent across the identified cohorts and it is possible that diagnostic bias may have distorted the results of some of the studies because of differences in diabetes detection accuracy.

When interpreting the findings, the fact that most studies were primarily conducted in samples of well-educated white populations in high-income countries must be taken into account. In the context of type 2 diabetes, earlier studies have found that dose–response curves may be different for Asian Indians who may require more PA to be protected from their relatively higher susceptibility to develop type 2 diabetes [72, 73].

A potential strength of our present analyses is the expression of PA exposure dose in MMET h/week rather than just MET h/week. There is a fine distinction between these two measures; an individual expending 3 METs on a given activity is using double the activity-related energy above rest than an individual performing an activity at 2 METs. By setting the starting point of the PA volume at 0 MMET h/week, better mathematical properties (proportionality) of the exposure variable are taken into account, allowing different intensities of activity to be more fairly equated, both within and across individuals and populations. This calculation gives a relatively higher weighting to time spent in more vigorous activity compared with classic METs. This means that doing more intensive activity would equate to a relatively larger dose in the MMET model than under the MET model. For example, 2.5 h/week of MVPA at 4.5 MET (equal to 11.25 MET h/week) is volume equivalent to 1.41 h of 8 MET of intense activity, while 2.5 h/week of MVPA at 3.5 MMET (equal to 8.75 MMET h/week) is volume equivalent to 1.25 h of 7 MMET of intense activity. Results for MVPA were similar, but benefits were larger for more intense PA.
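The volume equivalence in this example follows from dividing a fixed PA volume by the intensity of the substitute activity. The short sketch below reproduces the two figures quoted above; the function name is hypothetical.

```python
def equivalent_hours(hours, from_intensity, to_intensity):
    """Hours at `to_intensity` that give the same volume as `hours` at `from_intensity`."""
    return hours * from_intensity / to_intensity


# Gross METs: 2.5 h of MVPA at 4.5 MET expressed as vigorous activity at 8 MET.
print(equivalent_hours(2.5, 4.5, 8.0))  # 1.40625, i.e. about 1.41 h
# Marginal METs: the same activities at 3.5 and 7 MMET after subtracting 1 MET.
print(equivalent_hours(2.5, 3.5, 7.0))  # 1.25 h
```

Because less vigorous time is needed to match the same moderate volume, the MMET scale gives more credit to each hour of vigorous activity than the classic MET scale does.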

Most cohorts were not designed to specifically investigate PA and the resulting paucity of comprehensive data on all PA behaviours may have hindered our analysis. We used aggregated exposure measures across a range of reported activities from each study, which relied on the intensity values originally assigned to each activity in the primary study analysis alongside aggregated durations; however, it is likely that more accurate MMET h estimates could be calculated with access to individual-level raw PA data. Nevertheless, expressing PA in marginal MET units is a promising method to account for activities of differing intensity and would be aided by better reporting of intensity and duration characteristics for each exposure group.

As a restricted cubic spline regression model was used to study the shape of the dose–response relationship, we were able to improve precision as to how the association between PA and incident type 2 diabetes varies at different exposure levels [49].

An earlier systematic review [25] also conducted dose–response meta-analyses for PA and type 2 diabetes. However, that review achieved far less data harmonisation than our analysis. Aune et al reported results separately for MET h/week (five studies), hours per week (ten studies) and energy expenditure (four studies). They found a larger benefit (based on an assumption of moderate intensity activity) and a more linear dose–response curve using the time-based measure compared with the MET h measure. Our results, which are derived from 23 studies, suggest considerably larger benefits for the same PA exposure level, e.g. RR of 0.65 vs RR of 0.76 at 20 MET h/week. Given that our more extensive approach to harmonisation requires more assumptions, it is encouraging that our sensitivity analysis found relatively small differences in the size of the effects, and little difference in the shape of the dose–response curve.

Previous research into PA and other health outcomes has often provided evidence in favour of a strongly curvilinear dose–response relationship [20–23, 74]. This curvilinear association has been the basis for further health impact modelling studies [75] and, as such, is used to estimate how much gain there would be in population health from different PA interventions or scenarios. Uncertainty about the dose–response shape has been found to contribute substantially to uncertainty about the final results of partaking in PA for disease prevention. Our results indicate that for type 2 diabetes prevention, while probably curvilinear over a much wider exposure range, the relationship is much closer to linearity than that found previously for all-cause mortality or ischaemic heart disease [21]. Our effect estimates are likely to be conservative, given the diluting impact that exposure measurement error stemming from a single self-report measure of activity will have on the observed associations. Even so, our results suggest a major potential for PA to slow down or reverse the global increase in type 2 diabetes prevalence and should prove useful for health impact modelling, which frequently forms part of the evidence base for policy decisions (e.g. WebTAG for transport [76]).

Increasingly, PA research is incorporating the use of objective data, e.g. UK Biobank has recently collected accelerometry data in 100,000 individuals who are also followed up over time to link these data with health outcomes. However, before such studies accrue enough major clinical events to examine prospective relationships, self-report data may be calibrated against objective measures to enhance translation of findings based on self-report into public health action [77].

Given the non-linear nature of the dose–response curve between LTPA and type 2 diabetes, the effects of LTPA are likely to depend on the exposure to non-leisure activity. Our finding of a smaller effect for total PA is unexpected but was based on a much smaller evidence base and may reflect differences in measurement properties between domains. Assuming, however, that the non-linear relationship holds across all domains, the marginal effect of LTPA will be greater in a population that is less active in other domains and vice versa. One way to address this would be to conduct a meta-analysis of LTPA by level of non-leisure PA, e.g. occupational grouping.

The results from this dose–response meta-analysis provide evidence in support of the clinically meaningful role of PA in the primary prevention of type 2 diabetes in the general population. We highlight the necessity for progress in PA measurement and reporting of PA of different intensities and duration in cohort studies. Additionally, we recommend investigations to consider the dose–response relationship of PA and type 2 diabetes prevention in more ethnically diverse population groups.

Overall, we found the dose–response curve for PA and incident type 2 diabetes is curvilinear. Our study suggests that notable health benefits of PA can be realised even at relatively low levels of PA but also that considerable additional decreases in risk for type 2 diabetes are afforded when substantially exceeding the current PA guidelines.

Our meta-analysis supports the generally accepted notion of a graded association between PA and health maintenance [78, 79]. It favours a ‘some is good but more is better’ guideline, in which specific targets are mainly used for a psychological effect. There is no clear cut-off at which benefits are not achieved and health protection increases at activity levels well beyond current recommendations. Enabling cultures and built environments to increase PA at the population-wide level may prevent substantial personal suffering and economic burden. Given the current obesity and diabetes epidemic, the utility of such a strategy may reach beyond any present-day approaches to improve population health.