Abstract
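Using the Grade 4 (primary school) and Grade 8 (junior secondary) mathematics tests from a large-scale educational assessment as examples, this study used decision consistency coefficients based on item response theory (IRT) as one criterion for evaluating test reliability, and compared how these coefficients differ when different cut scores and score scales are chosen. The results showed that overall test reliability based on IRT was higher than the reliability coefficient obtained under classical test theory (CTT); that the fewer cut scores were set, the larger the decision consistency coefficient; that the location of a cut score affects the decision consistency coefficient, because examinees whose ability lies near a cut score are more likely to be assigned to different categories; and that, once raw test scores are converted to scale scores, a rule that maps several raw scores to a single converted score increases the decision consistency coefficient.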
Two real data sets from a large-scale educational assessment program were used to investigate classification consistency indices and to explore the key factors that influence them. The overall reliability based on item response theory (IRT) was higher than that based on classical test theory (CTT). Classification consistency indices were higher with fewer cut scores and under a many-to-one transformation rule than under other conditions. In future work, it will be useful to apply IRT methods and classification consistency indices to actual educational measurement.
Source
《北京师范大学学报(自然科学版)》
CAS
CSCD
PKU Core (Peking University core journal list)
2015, No. 6, pp. 643-648 (6 pages)
Journal of Beijing Normal University (Natural Science)
Funding
National Natural Science Foundation of China (31371047)
Major Project of Philosophy and Social Sciences Research, Ministry of Education of China (12JZD040)
Keywords
classification consistency indices
item response theory
cut score
score scale