Abstract
Objective: To develop the Multiple Achievement Tests (MATs) for grades 4-6 and provide an assessment instrument for educational assessment, clinical diagnosis, and social evaluation. Methods: Two forms believed to be equivalent, MATs-A and MATs-B, were assembled after item selection and repeated pilot testing. The formal sample for item analysis consisted of 2002 elementary and high school students. In addition, 768 subjects took part in the reliability studies (test-retest, parallel forms, and test-retest with parallel forms). For concurrent validity, 227 students took an academic aptitude test and the academic performance records of 646 subjects were collected. Five experts rated the content validity of the MATs. Results: Item difficulty fell between .20 and .80 for 77.5% of the items, and 77% of the items showed good discrimination; the discrimination indices (D) of the subtests and subscales were above .30. For the subscales and full scales of the two forms, test-retest reliability (Spearman correlations) ranged from .91 to .95, parallel-forms reliability from .87 to .94, test-retest with parallel forms reliability from .82 to .89, split-half reliability from .79 to .90, and Cronbach's α from .90 to .96. Inter-scorer reliability ranged from .94 to .98. The proportions of true-score variance for MATs-A and MATs-B were .82 and .86, respectively. Generalizability analysis indicated that about 15 items per subtest, 50 items per subscale, and 100 items for the full scale are sufficient. Experts rated 82% of the language items and 86% of the mathematics items as highly appropriate. Correlations of the subscales and full scales with academic performance ranged from .23 to .60, and with the academic aptitude test from .39 to .66. Scores differed significantly across schools and grades, and the language subtests showed gender differences. When two factors were extracted from the 10 variables by exploratory factor analysis, they corresponded to language and mathematics, matching the subscales; when more factors were extracted, the results suggested five possible factors: language knowledge, memory, mathematical computation, number and figure, and mathematical reasoning. Conclusion: The MATs items are of appropriate difficulty and good discrimination. The reliability estimates generally meet psychometric standards, content validity and concurrent validity are good, construct validity is fairly satisfactory, and the two forms are essentially parallel.
Source
Chinese Journal of Clinical Psychology (《中国临床心理学杂志》)
CSCD
2005, Issue 3, pp. 253-257 (5 pages)
Funding
Scientific Research Project of the Hunan Provincial Department of Education (B/1914020742)
Keywords
Achievement test
Development
Item
Reliability
Validity