Summary of Findings (SoF) tables present, for each of the seven (or fewer) most important outcomes, the following: the number of studies and number of participants; the confidence in effect estimates (quality of evidence); and the best estimates of relative and absolute effects. Potentially challenging choices in preparing SoF tables include whether to use direct evidence (which may have very few events) or indirect evidence (from a surrogate) as the best evidence for a treatment effect. If a surrogate is chosen, it must be labeled as substituting for the corresponding patient-important outcome.
Another such choice is whether to present evidence from low-quality randomized trials or from high-quality observational studies. When in doubt, a reasonable approach is to present both sets of evidence; if the two bodies of evidence have similar quality but discrepant results, one would rate down further for inconsistency.
For binary outcomes, relative risks (RRs) are the preferred measure of relative effect and, in most instances, are applied to the baseline or control group risks to generate absolute risks. Ideally, the baseline risks come from observational studies including representative patients and identifying easily measured prognostic factors that define groups at differing risk. In the absence of such studies, relevant randomized trials provide estimates of baseline risk.
When confidence intervals (CIs) around the relative effect include no difference, one may simply state in the absolute risk column that results fail to show a difference, omit the point estimate and report only the CIs, or add a comment emphasizing the uncertainty associated with the point estimate.
Introduction
What is new?
Key points
Summary of Findings (SoF) tables provide succinct, easily digestible presentations of confidence in effect estimates (quality of evidence) and magnitude of effects.
SoF tables should present the seven (or fewer) most important outcomes. These outcomes must always be patient-important outcomes and never surrogates, although surrogates can be used to estimate effects on patient-important outcomes.
SoF tables should present the highest quality evidence. When the quality of two bodies of evidence (e.g., randomized trials and observational studies) is similar, SoF tables may include summaries from both.
SoF tables should include both relative and absolute effect measures, and separate estimates of absolute effect for identifiable patient groups with substantially different baseline or control group risks.
The first 11 articles in this series introduced the GRADE approach to systematic reviews and guideline development [1], discussed the framing of the question [2], and presented GRADE’s concept of confidence in effect estimates [3] and how to apply it [4], [5], [6], [7], [8], [9]. In this 12th article, we describe the final product of a systematic review using the GRADE process: Summary of Findings (SoF) tables. For each relevant comparison of alternative management strategies, these tables present the quality rating for each outcome, the best estimate of the magnitude of effect in relative terms, and the absolute effect that one might see across subgroups of patients with varying baseline or control group risks. The focus of this article is on binary outcomes. Box 1 presents the seven elements recommended for SoF tables. Tables 1, 2, and 3, which are examples of SoF tables, highlight some of the issues in constructing such a table. Readers will find additional details in the Cochrane Handbook, Chapter 11 [10].
The seven elements of a SoF table
SoF tables include seven elements (Box 1). Uniformity of presentation is likely to facilitate readers’ familiarity and comfort with SoF tables and is therefore desirable and facilitated by the use of GRADEpro software [11]. Initial user testing with consumers of guidelines (clinicians and researchers) guided the format of Table 1 [12], [13]. In Table 1, putting what is most important first guided the order of the columns, and the presentation of absolute risks was guided by a finding that some
Choosing which outcomes to present
SoF tables should ideally present results of all patient-important outcomes—possibly noting which ones are critical—without, however, overwhelming the reader. GRADE suggests inclusion of no more than seven outcomes, including both benefits and harms. If there are more than seven outcomes that are judged important, reviewers should choose the seven most important. This number is based on our intuition about the amount of information users can grasp, and an informal survey of attendees at
Presentation of direct vs. indirect evidence
Sometimes, direct measures of the patient-important outcomes are unavailable or, as in Table 1, no events have occurred (for symptomatic venous thrombosis and pulmonary embolism). In such instances, reviewers should present their inferences regarding treatment effects on patient-important outcomes on the basis of the results of surrogate measures. That the inferences are coming from surrogates should be clearly labeled, and will almost certainly result in rating down the confidence in effect
Presentation of randomized controlled trials or observational studies
Randomized controlled trials (RCTs) usually provide higher-quality evidence than observational studies and, if RCTs are available, SoF tables should generally restrict themselves to reporting RCT results. On occasion, however, limitations of RCTs or particular strengths of observational studies may lead to conclusions that their confidence in effect estimates is similar, or that observational studies provide higher-quality evidence.
For instance, consider the use of octreotide to prevent
Dealing with analytic approaches that yield different results
Systematic reviews, in exploring sources of heterogeneity, may sometimes find that alternative analyses (“sensitivity analyses”) yield appreciably different results. For example, a systematic review of glucosamine for treating osteoarthritis found differences in pain reduction when including only trials with concealed allocation vs. all trials [20]. Presenting two rows, one summarizing each analytic approach, would have left the inevitably less-equipped readers with the decision about which
Measures of relative effect
Options for expressing relative measures of effect include the RR (synonym: risk ratio), odds ratio (OR), rate ratio, and hazard ratio [21], [22], [23]. ORs have advantageous statistical properties [24]. RRs, however, are more understandable intuitively, and easier to use for estimating absolute measures of effect in individual patients [21]. We find these advantages of RRs compelling (for more details, see Box 3). Meta-analysis can generate RRs or ORs from 2 × 2 tables using appropriate
Measures of absolute effect
As we have pointed out, relative measures tend to be consistent across risk groups, whereas absolute measures do not [22], [27], [28], [29]. Making management choices, however, focuses on trading off absolute effects on patient-important outcomes, therefore requiring both relative and absolute measures to appear in SoF tables.
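To illustrate why both measures are needed, the following minimal Python sketch applies a single RR to two different baseline risks and expresses the results as natural frequencies. The function name `absolute_effect`, the RR of 0.75, and the baseline risks of 2% and 20% are hypothetical values chosen only for illustration; they are not taken from Table 1 or from any study cited here.

```python
def absolute_effect(baseline_risk, rr, per=1000):
    """Apply a relative risk (RR) to a baseline (control group) risk.

    Returns events per `per` patients in the control group, in the
    intervention group, and the difference between the two.
    """
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be a proportion between 0 and 1")
    if rr < 0:
        raise ValueError("rr must be non-negative")
    # A risk cannot exceed 1, so cap the product for large RRs.
    intervention_risk = min(baseline_risk * rr, 1.0)
    control_per = round(baseline_risk * per)
    intervention_per = round(intervention_risk * per)
    return {
        "control_per": control_per,
        "intervention_per": intervention_per,
        # Negative values mean fewer events with the intervention.
        "difference_per": intervention_per - control_per,
    }


# Hypothetical inputs: one RR, two risk groups.
HYPOTHETICAL_RR = 0.75
for label, risk in [("low risk", 0.02), ("high risk", 0.20)]:
    print(label, absolute_effect(risk, HYPOTHETICAL_RR))
```

Under these assumed inputs the same relative effect corresponds to 5 fewer events per 1,000 patients in the low-risk group and 50 fewer per 1,000 in the high-risk group, which is the kind of contrast that separate rows for risk groups are meant to make visible.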
The unrepresentativeness of patients in randomized trials, and the lack of consistency of absolute measures across risk groups and across individual trials argue against
Presentation of absolute effects
We suggest presenting the absolute effect—both benefits and harms—as natural frequencies (events per 10,000 patients in Table 1, although more frequent events can be presented as events per 1,000 or even per 100 patients) because this facilitates decision making [31], [32], [33], [34]. When events are sufficiently frequent, percentages may be as well, or marginally better, understood [35]. Although many clinicians prefer numbers needed to treat (NNTs), they may be more difficult to interpret
Absolute effects—confidence intervals
We further suggest reporting the CIs around the absolute risk in the intervention group (as in Tables 1 and 6) or around the difference between intervention and control groups (as in Tables 2 to 5). Just as one calculates the absolute risk in the intervention group on the basis of the absolute risk in the comparison group and the point estimate of the RR, the calculation of the CIs around the absolute risks in the intervention group is based on the absolute risk in the
Absolute effects—choice of time frame
In Table 1, the time frame for measurement of outcome is both obvious and short: symptomatic thrombosis, if it exists, will occur within days of a long flight. For conditions such as primary and secondary prevention of cardiovascular events, or cancer recurrence, there are options for choice of the duration of follow-up. Reviewers should therefore always indicate the length of follow-up to which the estimates of absolute effect refer. Note that this length of follow-up may not be the length of
Dealing with no events in either group
When no participant in any trial has suffered the outcome of interest, the trials provide no information about relative effects (and one can thus argue that there is no point in rating the quality of the evidence). However, particularly if there are large numbers of patients, the data may provide high-quality evidence that the absolute difference between alternative management strategies is small or very small. If reviewers believe this is the appropriate inference for an important or crucial
Uncertainty around estimates of baseline risk
Note that Table 1 provides estimates of risk in the intervention group based on the CIs around the RR. We do not, however, provide estimates of uncertainty regarding the estimates of baseline risk in high- and low-risk control groups. Not presenting such estimates reflects a high priority on simple presentations that clinicians and patients will find easily digestible.
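The calculation described above can be sketched in a few lines of Python. This is a simplified illustration that assumes, as the paragraph states, that the baseline risk is treated as a fixed value with no uncertainty of its own, so that the interval for the intervention group comes entirely from the CI around the RR. The function name `intervention_risk_ci` and all numbers (a baseline risk of 10% and an RR of 0.80 with a 95% CI of 0.60 to 1.05) are hypothetical and are not drawn from Table 1.

```python
def intervention_risk_ci(baseline_risk, rr, rr_low, rr_high, per=1000):
    """Translate an RR and its CI into absolute risks per `per` patients.

    The baseline risk is assumed fixed; only the uncertainty in the RR
    is propagated to the intervention group.
    """
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be a proportion between 0 and 1")
    if not 0 < rr_low <= rr <= rr_high:
        raise ValueError("expected 0 < rr_low <= rr <= rr_high")

    def events(relative_risk):
        # Events per `per` patients, capped because a risk cannot exceed 1.
        return round(min(baseline_risk * relative_risk, 1.0) * per)

    control = round(baseline_risk * per)
    point, low, high = events(rr), events(rr_low), events(rr_high)
    return {
        "control": control,
        "intervention": point,
        "intervention_ci": (low, high),
        # Differences are taken between the rounded frequencies.
        "difference": point - control,
        "difference_ci": (low - control, high - control),
    }


# Hypothetical inputs, per 1,000 patients.
print(intervention_risk_ci(0.10, rr=0.80, rr_low=0.60, rr_high=1.05))
```

With these assumed inputs the intervention group risk is 80 per 1,000 (60 to 105), and the difference ranges from 40 fewer to 5 more events per 1,000. Because the CI around the RR includes 1, the interval for the difference includes zero.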
Potentially, all the issues that raise uncertainty about estimates of absolute effects could raise uncertainty about estimates
What to do when there is no published evidence regarding an important outcome
We encourage systematic review authors and guideline developers to specify all important outcomes before commencing their reviews. If they do so, it is possible that they may find no published evidence regarding one or more outcomes (quality of life and rare side effects are two outcomes that may be subject to this problem). We suggest that if sufficiently important, such an outcome would warrant a row in the SoF table, with the confidence in effect estimates rating (and other cells aside from
Conclusion
The SoF table provides all the key information necessary for making decisions between competing health care management strategies [38]. Therefore, although not an absolute requirement for use of the GRADE approach, the SoF table is an invaluable tool for providing a succinct, accessible, transparent evidence summary for patients, health care providers, and policy makers.
The GRADE system has been developed by the GRADE Working Group. The named authors drafted and revised this article. A complete list of contributors to this series can be found on the journal’s Web site at www.elsevier.com.