Abstract
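Reviewable cognitive diagnostic computerized adaptive testing (RCD-CAT), which allows examinees to change their answers, helps diagnose examinees' knowledge states more accurately. The Item Pocket (IP) method gives examinees the opportunity to set items aside and revise their responses later, and the modified IP (MIP) method rescores items that are revised within the IP. A simulation study compared the performance of IP, MIP, Stocking design I, and Stocking design II in RCD-CAT. The results showed that the Stocking designs performed best, with Stocking design II slightly better than Stocking design I, and that the IP and MIP methods had lower correct classification rates than traditional CD-CAT. The Stocking designs show good application prospects for RCD-CAT.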
Cognitive diagnostic computerized adaptive testing (CD-CAT) combines cognitive diagnosis with computerized adaptive testing and aims to diagnose examinees' mastery of a set of discretely defined skills, or attributes, more efficiently and accurately than paper-and-pencil tests. Examinees naturally review and sometimes change their answers in paper-and-pencil tests, but this is less common in CD-CAT because it can reduce measurement efficiency. The absence of review opportunities in operational CD-CAT creates a dilemma for test developers, since examinees need to review and change answers during the test in order to obtain more accurate estimates of their true ability. Item Pocket (Han, 2013) is a method for reviewable computerized adaptive testing (RCAT). It provides test takers with an Item Pocket (IP) into which they can place items for later review and response change. Test takers can skip items by putting them in the IP, but the capacity of the IP is not easy to control: if the capacity is too large, the estimation error becomes comparatively large. Building on the IP method, this study proposes a modified IP (MIP) method that applies a new scoring method to items in the IP. Compared with IP, the Stocking (1997) designs place greater restrictions on examinee behavior. Under Stocking design I, examinees are told in advance that they may revise answers to a fixed number of items. Under Stocking design II, the test is divided into separate sections, and examinees are told in advance that they may revise answers only to items within a section. The advantage of design II is that it simultaneously restricts examinee control over the items actually presented, because revised responses from previous sections influence the selection of items in subsequent sections.
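The revision rules of the two Stocking designs can be illustrated with a minimal Python sketch. The function names, the 0-based item positions, and the equal-length sections are assumptions made for illustration only; the abstract does not specify how the designs were implemented.

```python
def can_revise_design_one(revisions_used: int, max_revisions: int) -> bool:
    """Design I (as described above): only a fixed number of answers may be revised.

    `max_revisions` is a hypothetical parameter; the abstract gives no value.
    """
    return revisions_used < max_revisions


def can_revise_design_two(item_position: int, current_position: int,
                          section_length: int) -> bool:
    """Design II (as described above): revision is allowed only within a section.

    Positions are 0-based and sections are assumed to have equal length.
    An item can be revised only if it has already been administered and
    lies in the same section as the item currently being answered.
    """
    already_seen = item_position <= current_position
    same_section = (item_position // section_length
                    == current_position // section_length)
    return already_seen and same_section
```

With sections of five items, for example, `can_revise_design_two(3, 4, 5)` allows the revision, while `can_revise_design_two(4, 5, 5)` does not, because the examinee has already moved on to the next section.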
CD-CAT is a further development of CAT, but the two differ in some important ways. To evaluate the above methods in reviewable CD-CAT (RCD-CAT), two Monte Carlo simulation studies with different experimental conditions were conducted. Interim and final knowledge states were estimated using maximum likelihood estimation (MLE), 5,000 examinees were simulated, and tests were assembled from a pool of 300 items. The experimental conditions were the cognitive diagnosis model (DINA and R-RUM), the number of attributes (5 and 7), the item selection strategy (KL, PWKL, HKL, and MPWKL), and the fixed test length (10 and 20 items). The simulation results showed that: (1) under the DINA model, MIP and IP had very similar classification accuracy, whereas under the R-RUM, MIP had higher classification accuracy than IP; furthermore, both MIP and IP had lower classification accuracy than traditional CD-CAT; (2) the Stocking designs had higher classification accuracy than the other methods in all simulations, and Stocking design II was slightly better than Stocking design I. In sum, RCD-CAT is more consistent with traditional examination habits and can also improve classification accuracy. This study provides theoretical and methodological support for future research and practical applications.
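As a rough illustration of how a knowledge state can be estimated by MLE under the DINA model, the sketch below evaluates the likelihood of every attribute pattern and returns the most likely one. It is not the authors' code: the Q-matrix rows, the slip and guess parameters, and the function names are all assumptions chosen for the example.

```python
from itertools import product
from math import log


def dina_prob(alpha, q, slip, guess):
    """P(correct) under DINA: 1 - slip if every required attribute is mastered, else guess."""
    masters_all = all(a >= r for a, r in zip(alpha, q))
    return 1.0 - slip if masters_all else guess


def log_likelihood(alpha, responses, q_rows, slips, guesses):
    """Log-likelihood of 0/1 responses for one candidate attribute pattern."""
    total = 0.0
    for x, q, s, g in zip(responses, q_rows, slips, guesses):
        p = dina_prob(alpha, q, s, g)
        total += log(p) if x == 1 else log(1.0 - p)
    return total


def mle_pattern(responses, q_rows, slips, guesses):
    """Search all 2^K attribute patterns and return the one with the highest likelihood."""
    k = len(q_rows[0])
    patterns = product((0, 1), repeat=k)
    return max(patterns,
               key=lambda a: log_likelihood(a, responses, q_rows, slips, guesses))


# Hypothetical 3-item, 2-attribute example (values are illustrative only).
q_rows = [(1, 0), (0, 1), (1, 1)]
slips = [0.1, 0.1, 0.1]
guesses = [0.2, 0.2, 0.2]
estimate = mle_pattern([1, 0, 0], q_rows, slips, guesses)
```

In a reviewable test, a revised answer simply replaces the earlier entry in `responses` before the pattern is re-estimated. The exhaustive search is practical for the 5 and 7 attributes considered here (32 and 128 patterns).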
Source
《心理科学》
CSSCI
CSCD
Peking University Core Journals
2017, No. 3, pp. 721-727 (7 pages)
Journal of Psychological Science
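Funding
Supported by the National Natural Science Foundation of China (31660278, 31300876, 31100756, 31360237); the Jiangxi Provincial University Humanities and Social Sciences Project (XL1507, XL1508); the Open Project of the Key Laboratory of Applied Statistics of the Ministry of Education, Northeast Normal University (KLAS130028614); and a supporting project of the Wuhan Municipal Health and Family Planning Commission (WG16C0).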