• J Manag Care Spec Pharm · Mar 2014

    The GRACE checklist for rating the quality of observational studies of comparative effectiveness: a tale of hope and caution.

    • Nancy A Dreyer, Priscilla Velentgas, Kimberly Westrich, and Robert Dubois.
    • Quintiles Real-World Late Phase Research, 201 Broadway, Cambridge, MA 02139, USA. nancy.dreyer@quintiles.com.
    • J Manag Care Spec Pharm. 2014 Mar 1;20(3):301-8.

    Background: While there is growing demand for information about comparative effectiveness (CE), there is substantial debate about whether and when observational studies have sufficient quality to support decision making.

    Objective: To develop and test an item checklist that can be used to qualify those observational CE studies sufficiently rigorous in design and execution to contribute meaningfully to the evidence base for decision support.

    Methods: An 11-item checklist about data and methods (the GRACE checklist) was developed through literature review and consultation with experts from professional societies, payer groups, the private sector, and academia. Since no single gold standard exists for validation, checklist item responses were compared with 3 different types of external quality ratings (N=88 articles). The articles compared treatment effectiveness and/or safety of drugs, medical devices, and medical procedures. We validated checklist item responses 3 ways against external quality ratings, using published articles of observational CE or safety studies: (a) Systematic Review: quality assessment from a published systematic review; (b) Single Expert Review: quality assessment made according to the solicited "expert opinion" of a senior researcher; and (c) Concordant Expert Review: quality assessments from 2 experts for which there was concordance. Volunteers (N=113) from 5 continents completed 280 article assessments using the checklist. Positive and negative predictive values (PPV, NPV, respectively) of individual items were estimated to compare testers' assessments with those of experts.

    Results: Taken as a whole, the scale had better NPV than PPV, for both data and methods. The most consistent predictor of quality relates to the validity of the primary outcomes measurement for the study purpose. Other consistent markers of quality relate to using concurrent comparators, minimizing the effects of bias by prudent choice of covariates, and using sensitivity analysis to test robustness of results. Concordance of expert opinion on the quality of the rated articles was 52%; most checklist items performed better.

    Conclusions: The 11-item GRACE checklist provides guidance to help determine which observational studies of CE have used strong scientific methods and good data that are fit for purpose and merit consideration for decision making. The checklist contains a parsimonious set of elements that can be objectively assessed in published studies, and user testing shows that it can be successfully applied to studies of drugs, medical devices, and clinical and surgical interventions. Although no scoring is provided, study reports that rate relatively well across checklist items merit in-depth examination to understand applicability, effect size, and likelihood of residual bias. The current testing and validation efforts did not achieve clear discrimination between studies fit for purpose and those not, but we have identified a critical, though remediable, limitation in our approach. Not specifying a specific granular decision for evaluation, or not identifying a single study objective in reports that included more than one, left reviewers with too broad an assessment challenge. We believe that future efforts will be more successful if reviewers are asked to focus on a specific objective or question. Despite the challenges encountered in this testing, an agreed-upon set of assessment elements, checklists, or score cards is critical for the maturation of this field. Substantial resources will be expended on studies of real-world effectiveness, and if the rigor of these observational assessments cannot be assessed, then the impact of the studies will be suboptimal. Similarly, agreement on key elements of quality will ensure that budgets are appropriately directed toward those elements. Given the importance of this task and the lessons learned from these extensive efforts at validation and user testing, we are optimistic about the potential for improved assessments that can be used for diverse situations by people with a wide range of experience and training. Future testing would benefit by directing reviewers to address a single, granular research question, which would avoid problems that arose by using the checklist to evaluate multiple objectives, by using other types of validation test sets, and by employing further multivariate analysis to see if any combination or sequence of item responses has particularly high predictive validity.
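
The per-item PPV and NPV described in the Methods can be illustrated with a minimal sketch. It assumes each checklist item response and each external quality rating has been reduced to a yes/no value; the abstract does not describe the authors' actual coding or software. The function name `predictive_values` and the toy data are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence, Tuple


def predictive_values(
    item_positive: Sequence[bool],
    reference_good: Sequence[bool],
) -> Tuple[Optional[float], Optional[float]]:
    """Return (PPV, NPV) for one checklist item against a reference rating.

    item_positive[i]  -- True if the tester's response to the item was favorable
    reference_good[i] -- True if the external rating judged the article good
    Both encodings are assumptions made for this illustration.
    """
    if len(item_positive) != len(reference_good):
        raise ValueError("inputs must have the same length")

    tp = fp = tn = fn = 0
    for item, ref in zip(item_positive, reference_good):
        if item and ref:
            tp += 1  # favorable response, article rated good
        elif item and not ref:
            fp += 1  # favorable response, article rated poor
        elif not item and not ref:
            tn += 1  # unfavorable response, article rated poor
        else:
            fn += 1  # unfavorable response, article rated good

    # PPV = TP / (TP + FP); NPV = TN / (TN + FN); undefined if denominator is 0.
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


# Hypothetical toy data, not results from the study.
item = [True, True, True, False, False, False, True, False]
reference = [True, True, False, False, False, True, True, False]
ppv, npv = predictive_values(item, reference)
print(f"PPV={ppv:.2f}, NPV={npv:.2f}")
```

In this framing, a high NPV means that an unfavorable response on an item usually coincides with a poor external rating, which is the direction in which the abstract reports the checklist performed better.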
