Two new termination rules for multidimensional computerized classification testing (cited by: 2)


Abstract: Because it is built for classification, computerized classification testing (CCT) is now widely used in tests whose purpose is classification, such as professional qualification examinations and health and nursing questionnaires. As a key component of CCT, the termination rule not only determines when the test stops but also directly affects classification accuracy and test efficiency. However, few studies have explored termination rules for multidimensional CCT (MCCT). To address the shortcomings of existing MCCT termination rules, this study proposes two new ones, the Mahalanobis distance-based multidimensional sequential probability ratio rule (Mahalanobis-SPRT) and the multidimensional generalized likelihood ratio rule with stochastic curtailment (M-SCGLR), and evaluates them in a simulation study under varying conditions (e.g., item bank structure, correlation between ability dimensions, and classification boundary function). The results show that (1) with a compensatory boundary function, Mahalanobis-SPRT achieves relatively high classification accuracy with test lengths close to those of comparable methods; (2) under almost all conditions, M-SCGLR is substantially more accurate than the existing multidimensional stochastic curtailment rule while also keeping test length short.

Computerized classification testing (CCT) is a subset of computerized adaptive testing (CAT) that aims to classify examinees into one of at least two categories, such as pass/fail or non-mastery/partial mastery/mastery. CCTs therefore focus on classification accuracy, whereas CATs are designed for precise measurement. The termination rule is one of the key components of CCT. However, as Nydick (2013) pointed out, most CCTs (i.e., unidimensional CCTs, UCCTs) were designed under unidimensional item response theory (IRT), whose unidimensionality assumption is easily violated in practice. Researchers therefore began to construct termination rules for multidimensional CCT (MCCT) based on multidimensional IRT. To date, however, these rules still have deficiencies in classification accuracy or test efficiency. Most current MCCT termination rules are extensions of UCCT termination rules. In UCCT, a termination rule requires a cut point, θ0, on the latent trait in order to compute its statistic; when the rule is extended to MCCT, the cut point becomes a classification boundary curve or even a surface (i.e., g(θ) = 0). The question is then how to convert the curve or surface into θ0. The projected sequential probability ratio test (P-SPRT), the constrained SPRT (C-SPRT; Nydick, 2013), and the multidimensional generalized likelihood ratio (M-GLR) rule were each proposed to solve this problem in a different way.
P-SPRT and C-SPRT choose a specific point on the boundary g(θ) = 0 as the approximate cut point, θ0, by projecting in Euclidean space or by constraining to the boundary, respectively; M-GLR can be applied to MCCT directly because the generalized likelihood ratio statistic can be computed without a cut point. To overcome the limitation that P-SPRT may give unstable results at the beginning of the test, this study proposes the Mahalanobis distance-based SPRT (Mahalanobis-SPRT). In addition, stochastic curtailment is a technique that shortens the test by predicting whether an examinee's classification would change if the test continued. This study also combines M-GLR with stochastic curtailment and proposes the M-GLR with stochastic curtailment (M-SCGLR). A full-scale simulation study was conducted to (1) compare Mahalanobis-SPRT and M-SCGLR with P-SPRT, C-SPRT, M-GLR, and the multidimensional stochastically curtailed SPRT (M-SCSPRT) under varying conditions; and (2) compare the classification performance of these six termination rules for examinees with specific abilities, to explore whether the rules differ significantly in their sensitivity when classifying such examinees. For the first objective, three levels of correlation between dimensions (ρ = 0, 0.5, and 0.8), two item bank structures (within-item and between-item multidimensionality), and two kinds of classification boundary (compensatory and non-compensatory) were considered; for the second objective, 36 specific ability points (θ1, θ2) were generated, where θ1, θ2 ∈ {-0.5, -0.3, -0.1, 0.1, 0.3, 0.5}.
The results showed that (1) when the compensatory classification function was used, Mahalanobis-SPRT led to higher classification accuracy and a test length similar to that of the rules without stochastic curtailment; (2) under almost all conditions, M-SCGLR not only achieved higher precision but also kept the test short, compared with M-SCSPRT, which also uses stochastic curtailment; (3) across specific examinees, the six termination rules showed consistent changes in precision and test length. In summary, this article puts forward two new MCCT termination rules (Mahalanobis-SPRT and M-SCGLR). Although the simulation results are very promising, several research directions merit further investigation, such as developing MCCT termination rules for more than two categories and constructing MCCT termination rules that incorporate process data such as response time.
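
The step of turning a boundary g(θ) = 0 into a single cut point θ0 and then running an SPRT can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear compensatory boundary w·θ = c, a compensatory M2PL response model, and hypothesis points placed a distance delta on either side of θ0 along the boundary normal. The function names, delta, alpha, beta, and the covariance matrix used for the Mahalanobis projection are all hypothetical choices made for illustration.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def mat_vec(m, v):
    return [dot(row, v) for row in m]

def project_to_boundary(theta_hat, w, c, cov=None):
    """Closest point to theta_hat on the linear boundary w . theta = c.

    cov=None gives the Euclidean projection. Passing a covariance matrix
    gives the Mahalanobis projection (distance measured with cov^-1).
    Both the linear boundary and the choice of cov are assumptions.
    """
    direction = w if cov is None else mat_vec(cov, w)
    step = (dot(w, theta_hat) - c) / dot(w, direction)
    return [t - step * d for t, d in zip(theta_hat, direction)]

def log_likelihood(theta, items, responses):
    """Log-likelihood under a compensatory M2PL model, P = logistic(a . theta + d)."""
    total = 0.0
    for (a, d), x in zip(items, responses):
        p = 1.0 / (1.0 + math.exp(-(dot(a, theta) + d)))
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total

def sprt_decision(theta0, w, items, responses, delta=0.3, alpha=0.05, beta=0.05):
    """Wald SPRT between two points delta away from theta0 along the normal w."""
    norm = math.sqrt(dot(w, w))
    upper = [t + delta * wi / norm for t, wi in zip(theta0, w)]
    lower = [t - delta * wi / norm for t, wi in zip(theta0, w)]
    log_lr = (log_likelihood(upper, items, responses)
              - log_likelihood(lower, items, responses))
    if log_lr >= math.log((1 - beta) / alpha):
        return "above"      # stop: classify above the boundary
    if log_lr <= math.log(beta / (1 - alpha)):
        return "below"      # stop: classify below the boundary
    return "continue"       # administer another item

# Hypothetical usage: same ability estimate, two different projections.
theta_hat = [1.0, 0.0]
w, c = [1.0, 1.0], 0.0
euclid = project_to_boundary(theta_hat, w, c)
mahal = project_to_boundary(theta_hat, w, c, cov=[[1.0, 0.5], [0.5, 2.0]])
```

With unequal variances the two projections land on different points of the same boundary, which is why the choice of metric can change the cut point that the SPRT statistic is computed around.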

Authors: REN He; CHEN Ping (Collaborative Innovation Center of Assessment for Basic Education Quality, Beijing Normal University, Beijing 100875, China)

Affiliation: Collaborative Innovation Center of Assessment for Basic Education Quality, Beijing Normal University

Source: Acta Psychologica Sinica (《心理学报》), indexed in CSSCI, CSCD, and the Peking University Core Journals list; 2021, Issue 9, pp. 1044-1058, I0001-I0003 (18 pages)

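Funding: Supported by the General Program of the National Natural Science Foundation of China (32071092); the Basic Education Quality Monitoring Research Fund of the Collaborative Innovation Center of Assessment for Basic Education Quality (2019-01-082-BZK01 and 2019-01-082-BZK02); and an independent research project of the Collaborative Innovation Center of Assessment for Basic Education Quality (BJZK-2019A2-19003).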

Keywords: computerized classification testing; termination rule; multidimensional item response theory; Mahalanobis distance; stochastic curtailment

Classification code: B841 [Philosophy and Religion: Basic Psychology]



