Introduction

Fractures in older women constitute a serious health risk and often cause great suffering in both the short and the long term for the affected individuals [1, 2]. Additionally, patients with hip fracture often experience loss of physical function, decreased social engagement, increased dependence, and worsened quality of life [3,4,5]. A large proportion of patients who have had a hip fracture are forced to alter their living conditions, which can involve relocating from their home to a residential aged care facility, with extensive impact on the affected person’s autonomy [6,7,8]. Furthermore, all fractures, and especially hip fractures, increase the risk of mortality and morbidity in older men and women [9]. The etiology of fracture involves both bone fragility and fall risk. For each standard deviation (SD) decrease in bone mineral density (BMD) as assessed using dual-energy x-ray absorptiometry (DXA) in the femoral neck, the risk of hip fracture is increased nearly 3-fold [10]. Osteoporosis, sarcopenia, and reduced physical performance become more prevalent with increasing age and therefore contribute to the increasing risk of fall injuries and fractures with aging [11, 12]. Over 70% of all fractures affect women older than 65 years [13]. Risk factors for falls such as immobility and previous falls also contribute to the risk of fractures in the elderly [14, 15]. Identifying individuals who will fracture based on BMD alone has low sensitivity [16]. Therefore, fracture risk calculators that incorporate clinical risk factors in addition to BMD have been developed in recent years. Among those, the fracture risk assessment tool FRAX is the most widely used [17]. The timed up and go test (TUG) measures physical performance and has been used to identify frail older individuals. A slow TUG time (> 10 s) was associated with increased risk of fracture in a large study of Australian women, an association independent of BMD and some clinical risk factors [18].
However, the most appropriate cutoff time to define slow TUG time in relation to fracture risk has not been established. It is also not known whether TUG time contributes independently to fracture risk when all clinical risk factors incorporated in FRAX are considered, or to what extent TUG time affects fracture probability. The aim of the present work was to study the relationship between TUG time and risk of fracture, to investigate whether TUG time independently contributes to fracture risk when all clinical risk factors currently included in FRAX and femoral neck BMD are also considered, and to determine to what extent TUG time contributes to the overall fracture probability in older women.

Materials and methods

Subjects

SUPERB (Sahlgrenska University Hospital Prospective Evaluation of the Risk of Bone fractures) is a prospective population-based study carried out in the greater Gothenburg area. The study comprises 3028 women, 75–80 years old [19]. Women were chosen randomly from the Swedish national population register. All subjects signed an informed consent form prior to participation. The study protocol was approved by the regional Ethics Review Board in Gothenburg. The criteria for being invited to the Study Clinic (Department of Geriatrics, Sahlgrenska University Hospital, Mölndal, Sweden) were as follows: (1) accepting the invitation sent by letter and responding positively to the follow-up telephone call, (2) being ambulatory and able to attend the clinic visit, (3) understanding Swedish, and (4) being a woman between the ages of 75 and 80 years.

Anthropometrics and TUG test

Body height was measured using a wall-mounted calibrated stadiometer. Body weight was measured to the nearest 0.1 kg using the same scale for all women. The timed up and go (TUG) test investigates balance and mobility [20, 21]. The participants were timed (in seconds) from the moment they rose from a chair (45 cm high, equipped with armrests), while walking 3 m at a normal pace, turning around, walking back, and sitting down again. The time for this procedure was recorded. Participants could wear their regular footwear and were allowed to use any mobility aids that they normally required. The TUG test was completed by 3004 participants; 24 could not complete the test.

Questionnaires

Data regarding medical and fracture history, physical activity, occurrence of falls in the last 12 months, alcohol consumption, parental history of hip fracture, oral glucocorticoid use, and calcium intake were collected using questionnaires. Self-reported fractures after the age of 50 years at any location, except the skull and face, were included in the FRAX score calculations. Current smoking was defined using a validated questionnaire [22]. A small proportion of participants (≈ 1.6%) could not recall if a parent had sustained a hip fracture. A null response was assumed in those cases. A high alcohol consumption was defined as 21 standard drinks per week or more [23]. Mental and physical health (MC12 and PCS12) related quality of life was assessed using the SF12 Health Survey [24,25,26]. The validated questionnaire Physical Activity Scale for the Elderly (PASE) was used to estimate physical activity in the 7 days before inclusion in the study [27]. PASE is a self-reported questionnaire that targets individuals over the age of 65 years. Participation (yes/no) or the number of hours spent in an activity is multiplied by given weights and the products are summed, giving a total score. The daily calcium intake was estimated using a validated questionnaire, by adding the food-derived calcium intake to the amount of calcium provided by supplements [28]. No information regarding vitamin D intake or sunlight exposure was available.
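
The weighted-sum scoring described for PASE can be illustrated with a minimal sketch. The item names and weights below are hypothetical placeholders chosen for illustration; they are not the published PASE items or weights, and the function name is our own.

```python
# Illustrative weighted-sum activity score in the style described for PASE.
# NOTE: item names and weights are hypothetical, NOT the published PASE values.
HYPOTHETICAL_WEIGHTS = {
    "walking_hours_per_day": 20.0,  # hours-based item
    "light_housework": 25.0,        # yes/no item
    "gardening": 20.0,              # yes/no item
}


def weighted_activity_score(responses, weights=HYPOTHETICAL_WEIGHTS):
    """Return the sum of response * weight over all weighted items.

    Yes/no answers may be given as booleans (True/False count as 1/0);
    items missing from `responses` contribute nothing.
    """
    total = 0.0
    for item, weight in weights.items():
        value = responses.get(item, 0)
        total += float(value) * weight
    return total


# Hypothetical participant: 1.5 h walking per day, light housework, no gardening.
example = {"walking_hours_per_day": 1.5, "light_housework": True, "gardening": False}
score = weighted_activity_score(example)
print(f"Total activity score: {score:.1f}")
```
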

Dual-energy x-ray absorptiometry measures

Bone density measurements were performed using the same DXA device for most participants (n = 2995) (Discovery A S/N 86491; Hologic, Waltham, MA, USA). Owing to a temporary machine failure, a few women (n = 33) were measured using another Discovery A Hologic DXA device. A cross-calibration was performed between the two instruments and has been reported elsewhere [19]. The areal BMD (aBMD) (g/cm²) of the femoral neck (FN) and lumbar spine (LS) was used in the analyses. Lateral scan imaging at baseline was performed using DXA with the participant in the supine position, in order to diagnose vertebral fractures, using the software program Physician’s Viewer (Hologic) as previously described [29]. After assessment of the anteroposterior lumbar scan, vertebrae in the LS (L1 to L4) that were fractured and/or contained osteosynthesis materials were excluded. The LS aBMD was calculated as the mean of L1 to L4 if at least two vertebrae were assessable. The coefficients of variation (CV) for FN aBMD and LS aBMD were 0.7% and 1.3%, respectively. The 4- and 10-year probabilities of MOF (major osteoporotic fracture; fractures of the spine, hip, forearm, and proximal humerus) for women 75 and 80 years old were calculated according to femoral neck BMD T-score [30]. The National Health and Nutrition Examination Survey (NHANES) III reference database for total hip and femoral neck in young (20–29-year-old) Caucasian women and the Hologic sample for lumbar spine measurements comprising 30-year-old Caucasian American women were used to calculate the corresponding T-scores [31, 32].
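
As a minimal sketch of the two calculations described above, the following shows the lumbar spine rule (mean of the assessable vertebrae, requiring at least two) and the standard T-score definition (difference from the young-adult reference mean in units of the reference SD). It assumes a simple unweighted mean over vertebrae; the function names and the reference mean and SD in the example are hypothetical and are not taken from the NHANES III or Hologic databases.

```python
from statistics import mean


def lumbar_spine_abmd(vertebra_abmd):
    """Mean aBMD (g/cm^2) over the assessable vertebrae among L1-L4.

    `vertebra_abmd` maps a vertebra label to its aBMD, or to None if the
    vertebra was excluded (fractured or containing osteosynthesis material).
    Returns None when fewer than two vertebrae are assessable.
    """
    usable = [value for value in vertebra_abmd.values() if value is not None]
    if len(usable) < 2:
        return None
    return mean(usable)


def t_score(abmd, reference_mean, reference_sd):
    """Number of reference SDs by which `abmd` differs from the reference mean."""
    return (abmd - reference_mean) / reference_sd


# Hypothetical participant with L2 and L3 excluded.
ls = lumbar_spine_abmd({"L1": 0.80, "L2": None, "L3": None, "L4": 0.90})

# Hypothetical reference values, for illustration only.
fn_t = t_score(0.62, reference_mean=0.86, reference_sd=0.12)
print(f"LS aBMD: {ls:.3f} g/cm^2, FN T-score: {fn_t:.1f}")
```
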

Biochemical analyses

Blood samples were drawn from all study participants, and serum was separated, aliquoted, and stored at −80 °C until analysis. Serum 25-hydroxyvitamin D, calcium, and parathyroid hormone (PTH) were analyzed at the Department of Clinical Chemistry (Swedac accredited no. 1342), Linköping University Hospital, Sweden, and all samples were assayed with reagents from the same batch. Serum 25-hydroxyvitamin D was measured on the DiaSorin LIAISON® XL analyzer with the 25-hydroxyvitamin D total chemiluminescence immunoassay (DiaSorin, Stillwater, MN, USA), which demonstrates 100% cross-reactivity for 25-hydroxyvitamin D2 and 25-hydroxyvitamin D3. The assay had an analytical range of 10–375 nmol/L and total CVs of 8.8%, 6.4%, and 6.8% at levels of 25 nmol/L, 68 nmol/L, and 150 nmol/L, respectively. Serum intact PTH was determined using the Elecsys electrochemiluminescence immunoassay on a Roche Cobas e601 platform (Roche Diagnostics Scandinavia AB, Gothenburg, Sweden), which has an analytical range of 0.13–530 pmol/L and total CVs of 4.0% and 2.9% at levels of 1.9 pmol/L and 8.6 pmol/L, respectively.

Incident fracture assessment

Incident fractures were verified using radiographs. X-ray reports and/or images were retrieved from a regional digital X-ray archive that included all 49 municipalities in the Västra Götaland region surrounding Gothenburg. All radiology reports from the baseline exam until May 24, 2018, were initially reviewed by research nurses. All reported fractures were recorded, and all radiographs without available radiology reports or with reports giving an uncertain fracture diagnosis were examined by an experienced orthopedic surgeon. Major osteoporotic fractures included clinical spine, hip, forearm, and proximal humerus fractures. Nonvertebral fractures included all fractures, except for fractures of the spine, skull, face, hand, and foot.

Statistical analyses

For continuous variables, independent samples t tests were used to examine differences between groups. χ2 and Fisher’s exact tests were used for dichotomous variables. The association between TUG as a continuous variable and the risk of fracture was examined using an extension of the Poisson regression model in the whole cohort [33, 34]. The hazard functions for major osteoporotic fracture and death were calculated using a modification of the Poisson regression model. For fracture, the variables in the hazard function were current time since baseline, current age, BMI, previous fracture, family history of hip fracture, smoking, corticosteroids, rheumatoid arthritis, alcohol use, and BMD. One additional model was constructed by adding slow TUG (0/1) to these variables. For death, the variables in the hazard function were current time since baseline, current age, BMI, current smoking, oral corticosteroid use, and BMD. Here too, one additional model was constructed by adding slow TUG (0/1). From the hazard functions for fracture and death, the 10-year probability of major osteoporotic fracture was calculated [35]. Follow-up was approximately 4 years for the SUPERB cohort, so the hazard functions were extrapolated in time when calculating the 10-year probability. It is important to note that the probabilities were based on purpose-built models similar, but not identical, to FRAX. The observation period of each participant was divided into intervals of 1 month. The first fracture per person was counted for each relevant outcome. Covariates included current age and time since start of follow-up. To study the association between TUG and fracture risk in more detail, a spline Poisson regression model was fitted using cohort-specific knots at the 10th, 50th, and 90th percentiles of BMI, as recommended by Harrell [36]. The splines were second-order functions between the breakpoints and linear functions at the tails, resulting in a smooth curve.
The difference in log-likelihood between the spline and linear models was tested to determine which model provided the better curve fit. A p value less than 0.05 was considered significant. Incidence was calculated as the number of events divided by the total follow-up time (until fracture, death, or censoring) and expressed per 1000 person-years. Associations between TUG (≤ 12 s or > 12 s) and incident fractures were also studied using Cox proportional hazard models adjusted for age, height, and weight as well as additional covariates, including all FRAX clinical risk factors (previous fracture, family history of hip fracture, current smoking, oral glucocorticoid use, rheumatoid arthritis, excessive alcohol intake), osteoporosis medication, and history of falls, with censoring at death or end of study (May 24, 2018). Hazard ratios (HR) and 95% confidence intervals derived from Cox models are presented. Statistical analyses were performed using SPSS Statistics Version 24 (IBM Corporation, Armonk, NY, USA). The hazard function developed by Fine and Gray was used to assess death as a competing risk for osteoporotic fractures [37]. For the calculations according to Fine and Gray, STATA, Statistics/Data Analysis, version 16.0, serial number 401609206078, licensed to the University of Gothenburg, was used.
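
The incidence calculation, and the general principle of deriving a fracture probability from hazard functions for fracture and death, can be sketched as follows. This is a generic competing-risk calculation with hazards held constant within each 1-month interval; it is not the purpose-built model used in the study (see [35]), and the function names and all numerical inputs are hypothetical.

```python
import math


def incidence_per_1000_person_years(events, person_years):
    """Events divided by total follow-up time, scaled to 1000 person-years."""
    return 1000.0 * events / person_years


def fracture_probability(fracture_hazard, death_hazard, years=10.0, step=1.0 / 12.0):
    """Probability of a first fracture within `years`, with death as a competing event.

    `fracture_hazard` and `death_hazard` are callables taking time since
    baseline (years) and returning a hazard (events per person-year). Each
    hazard is treated as constant within an interval of length `step`.
    """
    n_steps = round(years / step)
    at_risk = 1.0      # probability of being alive and fracture-free
    probability = 0.0  # accumulated probability of fracture
    for i in range(n_steps):
        t = i * step
        h_fracture = fracture_hazard(t)
        h_total = h_fracture + death_hazard(t)
        if h_total <= 0.0:
            continue
        stay = math.exp(-h_total * step)
        # Share of those leaving the at-risk state who leave through fracture.
        probability += at_risk * (1.0 - stay) * h_fracture / h_total
        at_risk *= stay
    return probability


# Hypothetical inputs, for illustration only.
rate = incidence_per_1000_person_years(events=50, person_years=10000.0)
p_no_death = fracture_probability(lambda t: 0.03, lambda t: 0.0)
p_with_death = fracture_probability(lambda t: 0.03, lambda t: 0.05)
print(f"{rate:.1f} per 1000 person-years; "
      f"10-year probability {p_no_death:.3f} (no deaths) vs {p_with_death:.3f}")
```

Accounting for mortality lowers the fracture probability because participants who die can no longer sustain a fracture, which is why the hazard for death enters the calculation alongside the hazard for fracture.
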

Results

Baseline characteristics and TUG time

During a median follow-up of 3.6 (interquartile range [IQR] 1.48) years, 335 women sustained a MOF, 314 a nonvertebral fracture, and 66 a hip fracture. In the whole cohort (n = 3004), median TUG time was 8.00 (IQR 2.6) seconds. The risk of incident MOF and hip fracture increased steeply with TUG time up to about 12 s and then started to level off (Fig. 1a and b). The spline function provided a significantly improved curve fit in comparison to a linear function for MOF (p = 0.01), but the improvement was only borderline significant for hip fracture (p = 0.05).

Fig. 1
a The relationship between timed up and go (TUG) time and incidence of major osteoporotic fracture (MOF). Incidence and 95% confidence intervals of MOF according to TUG (seconds) are shown per 100,000 person-years. b The relationship between timed up and go (TUG) time and incidence of hip fracture. Incidence and 95% confidence intervals of hip fracture according to TUG (seconds) are shown per 100,000 person-years

We therefore chose TUG above 12 s as the cutoff and divided the women with available TUG into two groups, TUG time ≤ 12 s (n = 2711) and TUG time > 12 s (n = 293). Characteristics of women with TUG ≤ 12 s and women with TUG > 12 s at baseline are presented in Table 1. Women with TUG > 12 s were older, shorter, and heavier than women with TUG ≤ 12 s. Physical activity and physical and mental health were inferior, while calcium and PTH were higher among those with slow TUG. Falls, self-reported prior fracture, rheumatoid arthritis, hyperthyroidism, self-reported osteoporosis, hypertension, stroke, myocardial infarction, angina, heart failure, type 2 diabetes, chronic bronchitis/asthma/emphysema, and use of osteoporosis medication were all more common among those with TUG > 12 s (p < 0.05), indicating an increased comorbidity in this group of women.

Table 1 Characteristics of older women with timed up and go (TUG) ≤ 12 s and > 12 s

A slow TUG time was associated with a higher fracture incidence

The incident fractures were divided into three groups: nonvertebral fracture, MOF, and hip fracture. The incidence of nonvertebral fracture, MOF, and hip fracture was substantially higher in women with TUG > 12 s than in women with faster TUG. Cox proportional hazard models adjusted for age, height, and weight demonstrated that TUG > 12 s was associated with increased risk of nonvertebral fracture (hazard ratio (HR) and 95% confidence interval (CI) 2.09 [1.53–2.84]), MOF (HR and 95% CI 2.39 [1.80–3.18]), and hip fracture (HR and 95% CI 2.96 [1.62–5.40]). These associations were somewhat attenuated but remained significant after adjustments for clinical risk factors included in FRAX, use of osteoporosis medication, prior falls, and femoral neck BMD (Table 2).
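
As a reading aid for the hazard ratios above, the short sketch below recovers the approximate standard error of log(HR) from each reported 95% CI and checks that the reported HR lies at the geometric midpoint of its interval. It assumes Wald-type intervals that are symmetric on the log scale, which is an assumption on our part and is not stated in the text; the function names are our own.

```python
import math


def log_se_from_ci(lower, upper, z=1.96):
    """Approximate SE of log(HR), assuming a CI symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def geometric_midpoint(lower, upper):
    """Point estimate implied by a CI that is symmetric on the log scale."""
    return math.sqrt(lower * upper)


# HR, lower and upper 95% CI limits as reported for TUG > 12 s
# (models adjusted for age, height, and weight).
reported = {
    "nonvertebral fracture": (2.09, 1.53, 2.84),
    "MOF": (2.39, 1.80, 3.18),
    "hip fracture": (2.96, 1.62, 5.40),
}

for outcome, (hr, lower, upper) in reported.items():
    se = log_se_from_ci(lower, upper)
    midpoint = geometric_midpoint(lower, upper)
    print(f"{outcome}: reported HR {hr:.2f}, implied HR {midpoint:.2f}, SE(log HR) {se:.3f}")
```

The wider interval for hip fracture corresponds to a larger standard error, consistent with the smaller number of hip fractures (n = 66).
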

Table 2 Associations between TUG time > 12 s and fracture risk in older women

The impact of TUG > 12 s on fracture probabilities

Study subject follow-up time was extrapolated up to 10 years to allow for calculations of 10-year fracture probability. The 10-year probabilities of MOF for women 75 and 80 years old were calculated according to femoral neck BMD T-score, with BMI set to 26 kg/m², previous fracture set to yes, and all other clinical risk factors set to no, with or without consideration of TUG (≤ 12 s or > 12 s) in the analysis. For a 75-year-old woman with BMD T-score − 2, a slow TUG > 12 s increased the 10-year probability substantially, from 34.3 to 47.5%. The equivalent 10-year probability for an 80-year-old woman with a T-score of − 2 was 37.3% and 48.7%, for TUG ≤ 12 s and > 12 s, respectively (Fig. 2a, b). The 4-year probabilities of MOF for women 75 and 80 years old, with previous fracture, BMI of 26 kg/m², and no additional clinical risk factors, were also calculated according to femoral neck BMD T-score, with or without consideration of TUG (≤ 12 s or > 12 s) in the analysis. For a 75-year-old woman with BMD T-score − 2, TUG > 12 s was associated with a markedly higher 4-year probability (14% vs. 24%). The corresponding 4-year probability for an 80-year-old woman with a T-score of − 2 was 16% and 26%, for TUG ≤ 12 s and > 12 s, respectively (Fig. 2c, d). The ratios between the calculated 4-year probability with TUG > 12 s and without considering TUG for women 75 and 80 years old, with previous fracture, BMI of 26 kg/m² but no other clinical risk factors, according to femoral neck BMD T-score, are presented in Fig. 3. The relative impact of TUG > 12 s on fracture probability increased with BMD in both 75- and 80-year-old women.
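
To make the size of these differences concrete, the sketch below computes the absolute difference (percentage points) and the ratio between the probabilities reported above at T-score − 2. It assumes that the lower value in each pair corresponds to TUG ≤ 12 s, which is stated explicitly only for the 80-year-old examples. Note that these ratios compare TUG > 12 s with TUG ≤ 12 s and are therefore not the ratios shown in Fig. 3, which compare TUG > 12 s with the probability calculated without TUG.

```python
# Probabilities of MOF (%) reported in the text at femoral neck T-score -2,
# given as (TUG <= 12 s, TUG > 12 s). Pairing for the 75-year-old examples
# is an assumption based on the wording of the text.
reported = {
    ("10-year", 75): (34.3, 47.5),
    ("10-year", 80): (37.3, 48.7),
    ("4-year", 75): (14.0, 24.0),
    ("4-year", 80): (16.0, 26.0),
}


def compare(p_fast, p_slow):
    """Return (absolute difference in percentage points, ratio slow/fast)."""
    return p_slow - p_fast, p_slow / p_fast


comparisons = {key: compare(*pair) for key, pair in reported.items()}

for (horizon, age), (difference, ratio) in comparisons.items():
    print(f"{horizon}, age {age}: +{difference:.1f} percentage points, ratio {ratio:.2f}")
```
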

Fig. 2
10-year (a, b) and 4-year (c, d) probability of major osteoporotic fracture according to femoral neck BMD and TUG time. a, b 10-year probability of a major osteoporotic fracture (MOF) in a 75-year-old (a) or 80-year-old (b) woman according to T-score of femoral neck BMD. The symbol (closed circle) denotes probabilities calculated without TUG and the lines denote the range of probabilities with TUG > 12 s and TUG ≤ 12 s using the model incorporating TUG. In the model used, BMI is set to 26 kg/m², previous fracture to yes, and all other clinical risk factors to no. c, d 4-year probability of a major osteoporotic fracture (MOF) in a 75-year-old (c) or 80-year-old (d) woman according to T-score of femoral neck BMD. The symbol (closed circle) denotes probabilities calculated without TUG and the lines denote the range of probabilities with TUG > 12 s and TUG ≤ 12 s using the model incorporating TUG. In the model used, BMI is set to 26 kg/m², previous fracture to yes, and all other clinical risk factors to no

Fig. 3
The ratio between the 4-year probability of major osteoporotic fracture with and without considering TUG time is dependent on femoral neck BMD. The ratio between the 4-year probability of major osteoporotic fracture with TUG > 12 s and without considering TUG, shown for women 75 and 80 years old according to femoral neck BMD T-score. In the model used, BMI is set to 26 kg/m², previous fracture to yes, and all other clinical risk factors to no

The impact of competing risk of death according to Fine and Gray

The association between TUG > 12 s and risk of major osteoporotic fracture (subhazard ratio (SHR) and 95% CI 2.31 (1.73–3.09)), hip fracture (SHR 2.85 (1.56–5.22)), and nonvertebral fracture (SHR 2.01 (1.47–2.76)) did not change substantially when a competing risk survival regression model, adjusted for age, height, and weight, was applied.

Discussion

In the present study, we demonstrate that TUG is an independent predictor of nonvertebral fracture, MOF, and hip fracture, and that these associations are independent of clinical risk factors included in FRAX and BMD of the femoral neck. Fracture risk increased progressively with TUG time and started to level off when TUG time exceeded 12 s. Having a slow TUG time (> 12 s) had a substantial impact on the probability of MOF and hip fracture, indicating that evaluation of TUG could be useful in determining fracture risk in older women.

Although the spline functions describing the relationship between TUG and MOF or hip fracture provided better curve fits than linear models, the proposed and used cutoff of 12 s is clearly not perfect, since it is evident that the risk of fracture progresses further with TUG time slower than 12 s. It should be acknowledged that the spline model herein presented for hip fracture is based on few fractures (n = 66) with resulting large confidence intervals. However, it can be argued that deriving the cutoff using spline models for fracture risk is superior to, as previously done, using age-specific means without any consideration of fracture risk [18, 20].

FRAX is the most widely used fracture risk assessment tool for estimating individualized 10-year probability of hip and major osteoporotic fracture [30, 38,39,40,41]. The calculated 10-year fracture probability is based on clinical risk factors with or without BMD, and is recommended in many clinical guidelines for calculating the probability of a MOF or hip fracture [41, 42]. In the presently investigated cohort, a slow TUG time (> 12 s) increased the 10-year probability substantially in women of all ages and with low to normal BMD, to risk levels above 20%, a commonly used treatment threshold [42]. Thus, performing the TUG test and considering TUG time would have a substantial impact on treatment decisions in women in this age group.

TUG performance is known to capture several different aspects of aging, such as poor balance, falls, and disability of daily living [43,44,45,46]. TUG time has been shown to be able to predict frailty with high sensitivity and specificity but is to a lesser extent able to discriminate fallers from non-fallers [47, 48]. In the present study, it was apparent that women with a slow TUG time (> 12 s) had a considerably poorer general health than women with normal TUG, as indicated by a higher BMI, worse physical and mental quality of life, as well as more prevalent falls and fractures, type 2 diabetes, stroke, osteoporosis, Parkinson’s disease, myocardial infarction, and heart failure. Thus, a slow TUG time served as a proxy for worse general health and can therefore be used to efficiently identify physical frailty which negatively impacts the risk of falls and fractures.

It can be problematic to identify risk factors in older populations with considerable comorbidities and a high mortality rate, due to the competing risk of death [49]. We therefore also analyzed the relationship between TUG time and incident fracture using the Fine and Gray method to adjust for competing mortality. The robust associations between TUG and fracture risk remained when considering competing mortality, supporting the notion that TUG performance is useful to test also in this age group.

In a randomized controlled trial of older women (age 75 years) investigating the effect of calcium supplementation on fracture risk, TUG was associated with incident fractures after adjusting for calcium treatment and several other risk factors and covariates, including BMD [18]. In a recent, very large study (n = 1,070,320) of male and female Koreans, 66 years old, a slow TUG time (≥ 10 s) was found to be associated with a modest 21% increased risk of hip fracture and a 7% increased risk of vertebral fracture, compared to those with a faster TUG time [50]. In contrast to the analysis in the present study, neither of these previous studies attempted to identify the threshold most strongly associated with an elevated fracture risk. Furthermore, not all risk factors presently included in FRAX were considered, and the impact of TUG on the overall fracture probability was not assessed in these studies [18, 50].

This study has some limitations. The inclusion criteria required that women were older (75–80 years old), ambulatory, and able to understand Swedish. Thus, the results may not apply to women in other age groups, women residing in nursing homes, or women with other ethnic backgrounds. Although women with walking aids were included, women with more severe disability (not able to walk at all) were excluded. TUG data were not available for all women in this study, but a very large proportion were able to complete the test (3004 out of 3028), demonstrating its usefulness in this population. The prevalence of a slow TUG is likely higher in women with a higher prevalence of disability but lower in younger women. The importance of TUG as a predictor of fracture in such populations could be different from that in the population investigated here.

This study also has strengths. It is a large population-based, prospective study, comprising over 3000 older women, a population with very high fracture risk. All identified fractures were confirmed using x-rays or radiology reports, ensuring high quality of the fracture data. It is the first study to investigate and present results on the impact of TUG performance on fracture probability, also after considering all currently used FRAX clinical risk factors and BMD. However, additional similar studies and meta-analyses will be required to determine if TUG performance could provide additional value to future FRAX-models.

In conclusion, the present study demonstrates that TUG time is strongly associated with hip, MOF, and nonvertebral fractures in older women, also after adjustments for all FRAX clinical risk factors and BMD. These results indicate that TUG performance could be included as a routine clinical assessment in order to improve fracture prediction in older women.