Funding: Funded by the China Scholarship Council (Grant Nos. 202206180007 and 202206180006).
Abstract: Background: This study aimed to develop a comprehensive instrument for evaluating and ranking clinical practice guidelines, named the Scientific, Transparent and Applicable Rankings tool (STAR), and to test its reliability, validity, and usability. Methods: This study set up a multidisciplinary working group including guideline methodologists, statisticians, journal editors, clinicians, and other experts. A scoping review, Delphi methods, and hierarchical analysis were used to develop the STAR tool. We evaluated the instrument’s intrinsic and interrater reliability, content and criterion validity, and usability. Results: STAR contained 39 items grouped into 11 domains. The mean intrinsic reliability of the domains, indicated by Cronbach’s α coefficient, was 0.588 (95% confidence interval [CI]: 0.414, 0.762). Interrater reliability, as assessed with Cohen’s kappa coefficient, was 0.774 (95% CI: 0.740, 0.807) for methodological evaluators and 0.618 (95% CI: 0.587, 0.648) for clinical evaluators. The overall content validity index was 0.905. Pearson’s r correlation for criterion validity was 0.885 (95% CI: 0.804, 0.932). The mean usability score of the items was 4.6, and the median time spent evaluating each guideline was 20 min. Conclusion: The instrument performed well in terms of reliability, validity, and efficiency, and can be used for comprehensively evaluating and ranking guidelines.