Abstract
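Using a non-equivalent groups anchor test design, this study compared two methods: IRT true score equating and IRT observed score equating. The data were taken from the TIMSS 2003 database. Item parameter estimates and examinee ability distributions were first obtained with the BILOG program; the item parameters were then recalibrated with four methods; finally, the two IRT equating methods were run with the PIE program. For the equating situation studied, the four recalibration methods showed no significant differences, and IRT true score equating and IRT observed score equating differed only slightly, and only in the lower score range. An analysis of sample size showed that the precision of IRT observed score equating was affected more strongly by sample size.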
IRT true score equating and IRT observed score equating were compared under the common-item non-equivalent groups design. The data were drawn from the TIMSS 2003 database and processed with the BILOG, ST, and PIE programs. In this equating situation, the two IRT methods yielded very similar results; the larger differences between them occurred in the lower part of the score distribution. For both methods, the standard error of equating could be reduced by increasing the sample size. However, IRT observed score equating was more sensitive to sample size.
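As an illustration of what IRT true score equating computes, the following is a minimal sketch, not the PIE program's implementation. It assumes a 3PL model with scaling constant D = 1.7 and item parameters already placed on a common scale. The item parameter values and all function names are hypothetical, and the test characteristic curve is inverted by simple bisection.

```python
import math

D = 1.7  # scaling constant often used with the logistic 3PL model (assumption)


def p3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def tcc(theta, items):
    """Test characteristic curve: expected number-correct (true) score."""
    return sum(p3pl(theta, a, b, c) for a, b, c in items)


def theta_for_true_score(tau, items, lo=-10.0, hi=10.0, tol=1e-10):
    """Invert the TCC by bisection (the TCC is increasing in theta when all a > 0)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if tcc(mid, items) < tau:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def true_score_equate(x, items_x, items_y):
    """Map a Form X true score to the Form Y true score at the same theta.

    Only defined for scores strictly between the sum of the guessing
    parameters and the number of items on Form X.
    """
    floor = sum(c for _, _, c in items_x)
    if not floor < x < len(items_x):
        raise ValueError("score outside the range where the TCC can be inverted")
    theta = theta_for_true_score(x, items_x)
    return tcc(theta, items_y)


# Hypothetical (a, b, c) item parameters, assumed to be on a common scale.
form_x = [(1.0, -1.0, 0.20), (1.2, 0.0, 0.20), (0.8, 0.5, 0.25), (1.5, 1.0, 0.20)]
form_y = [(1.1, -0.8, 0.20), (0.9, 0.2, 0.20), (1.3, 0.7, 0.25), (1.0, 1.2, 0.20)]

for score in (1, 2, 3):
    print(score, round(true_score_equate(score, form_x, form_y), 3))
```

Scores at or below the sum of the guessing parameters have no corresponding ability value in this sketch and would need separate handling. IRT observed score equating takes a different route: it uses the model to generate observed score distributions for both forms and then applies equipercentile equating to them.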
Source
《心理科学》
CSSCI
CSCD
PKU Core Journals (北大核心)
2010, No. 3, pp. 676-680 (5 pages)
Journal of Psychological Science
Funding
Supported by the National Natural Science Foundation of China (Grant No. 30870784)