Chest
Volume 129, Issue 1, January 2006, Pages 174-181

Special Features
Grading Strength of Recommendations and Quality of Evidence in Clinical Guidelines: Report From an American College of Chest Physicians Task Force

https://doi.org/10.1378/chest.129.1.174

While grading the strength of recommendations and the quality of underlying evidence enhances the usefulness of clinical guidelines, the profusion of guideline grading systems undermines the value of the grading exercise. An American College of Chest Physicians (ACCP) task force formulated criteria for a grading system to be used in all ACCP guidelines; these included simplicity and transparency, explicitness of methodology, and consistency with current methodological approaches to the grading process. The working group examined currently available systems, and ultimately modified an approach formulated by the international GRADE group. The grading scheme classifies recommendations as strong (grade 1) or weak (grade 2), according to the balance among benefits, risks, burdens, and possibly cost, and the degree of confidence in estimates of benefits, risks, and burdens. The system classifies quality of evidence as high (grade A), moderate (grade B), or low (grade C) according to factors that include the study design, the consistency of the results, and the directness of the evidence. For all future ACCP guidelines, the College has adopted a simple, transparent approach to grading recommendations that is consistent with current developments in the field. The trend toward uniformity of approaches to grading will enhance the usefulness of practice guidelines for clinicians.
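
The two-axis scheme described above can be sketched as a small data structure. This is only an illustrative sketch, not part of the ACCP report: the class names, the `action` field, and the "(Grade 1A)" output format are assumptions introduced here, while the grade values (1 or 2, A through C) and the "We recommend" / "We suggest" wording come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Strength(Enum):
    """Strength of recommendation: strong (grade 1) or weak (grade 2)."""
    STRONG = "1"
    WEAK = "2"


class Quality(Enum):
    """Quality of evidence: high (A), moderate (B), or low (C)."""
    HIGH = "A"
    MODERATE = "B"
    LOW = "C"


@dataclass(frozen=True)
class Recommendation:
    strength: Strength
    quality: Quality
    action: str  # hypothetical free-text description of the intervention

    @property
    def grade(self) -> str:
        # Combine the two axes into a single label such as "1A" or "2C".
        return self.strength.value + self.quality.value

    def wording(self) -> str:
        # Strong recommendations use "We recommend"; weak ones use "We suggest".
        verb = "recommend" if self.strength is Strength.STRONG else "suggest"
        # The parenthesized grade suffix is an assumed display format.
        return f"We {verb} {self.action} (Grade {self.grade})."


# Placeholder usage with a made-up action string.
example = Recommendation(Strength.WEAK, Quality.LOW, "intervention X")
print(example.wording())
```

Keeping strength and quality as separate fields mirrors the separation in the scheme: a strong recommendation can rest on evidence of any quality level, so neither axis is derived from the other here.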

Section snippets

Strength of Recommendation

Guideline panels should make recommendations to administer, or not administer, an intervention, on the basis of tradeoffs between benefits on the one hand, and risks, burdens, and, potentially, costs on the other. If benefits outweigh risks and burdens, experts will recommend that clinicians offer a treatment to appropriately chosen patients. The uncertainty associated with the tradeoff between the benefits and the risks and burdens will determine the strength of recommendations.

The ACCP task

Factors That Influence the Strength of a Recommendation

Guideline panels must consider a number of factors in grading recommendations (Table 3). One issue is their confidence in the best estimates of benefit and harm. The rating of methodological quality, which we discuss below, captures that degree of confidence.

The prevention of outcomes with high patient importance (6) should, in general, lead to stronger recommendations than the prevention of outcomes of lesser patient importance. For instance, one needs to expose four patients to a respiratory

Wording of Recommendations

Given the proliferation of grading systems, and the resulting confusion, it is desirable to provide clinicians with as many indicators as possible in interpreting the strength of recommendations. ACCP panels, when they are making a strong recommendation, will use the terminology, “We recommend…” When they make a weak recommendation, ACCP guideline panels will use less definitive wording, such as, “We suggest…” Further, the clarity of recommendations requires that the target patient

Confidence in Estimates of Magnitude of Benefits, Risks, Burdens, and Costs

Early systems of grading methodological quality relied primarily on the basic study design (ie, randomized controlled trials [RCTs], or observational studies). The fundamental study design remains critically important in determining our confidence in estimates of beneficial and detrimental treatment effects. Because of prognostic differences between groups, and the lack of safeguards such as blinding that can avoid biased ascertainment of outcomes, evidence based on observational studies will, in

Factors That Modify the Quality of Evidence: Limitations in RCTs

When RCTs have addressed the impact of alternative management strategies (both benefits and harms) on all relevant outcomes, they will yield high-quality evidence unless they have one of a number of limitations. The following limitations may decrease the quality of evidence supporting a recommendation (Table 4).

  1. Our confidence in recommendations decreases if the available RCTs have major deficiencies that are likely to result in a biased assessment of the treatment effect. These methodological

Factors That Modify the Quality of Evidence: Observational Studies Can Provide Moderate or Strong Evidence

While observational studies will generally yield only low-quality evidence, there may be unusual circumstances in which guideline panels will classify such evidence as of moderate quality, or even high quality.

  1. On the rare occasions when they yield extremely large and consistent estimates of the magnitude of a treatment effect, we may be confident about the results of observational studies. For example, oral anticoagulation in mechanical heart valves has not been compared to placebo in an RCT.

What to Do When Quality of Evidence Differs Across Outcomes?

When RCT results are available, the quality of evidence will often differ between primary efficacy and toxicity outcomes, usually between efficacy outcomes and cost, and almost always between efficacy outcomes and rare but serious side effects. On most occasions, efficacy outcomes will be the most important, and guideline panels can base their rating of the quality of the evidence exclusively on these end points. Panels should, however, consider whether toxicity end points are also crucial to

The ACCP Grading System and Initiatives Toward Uniform Grading Across Guideline Panels

In considering alternative grading systems, we found that the structure and guides for application and interpretation suggested by the GRADE group largely met the criteria in Table 1 (18, 19). As a result, the categories presented in Table 2 permit similar interpretation to those of the GRADE group. The important aspect in which the ACCP task force approach differs is in combining low-quality and very low-quality evidence. While we achieved the primary goal of the ACCP task force, to identify a

Summary

In the system that the ACCP has adopted, the strength of any recommendation depends on the following two factors: the tradeoff between the benefits and the risks and burdens; and the quality of the evidence regarding treatment effect. We grade the tradeoff between the benefits and the risks and burdens into the following two categories: category 1, in which the tradeoff is clear enough that most patients, despite differences in values, would make the same choice, leading to a strong

Reproduction of this article is prohibited without written permission from the American College of Chest Physicians (www.chestjournal.org/misc/reprints.shtml).
