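Abstract
Using the many-facet Rasch model from item response theory (IRT), this study analyzed rater bias among 12 raters in two panels in a structured interview for national civil servants. Two kinds of rater bias were proposed and verified: differences between raters in severity, and problems with each rater's own consistency. The results showed that raters differed significantly in severity, and that raters also differed in the self-consistency of their rating behavior across candidates, dimensions, gender, and time. The study indicates that analysis at the level of the individual rater overcomes the limitation of classical test theory (CTT), which analyzes raters only as a group. It provides detailed, specific diagnostic information on each rater's bias behavior, and thereby offers a new method from modern measurement theory for the targeted training of raters and the construction of a rater pool.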
Introduction
The structured interview is one of the most important methods in personnel selection. Rater bias, however, can seriously threaten its reliability and validity. Different test theories offer different solutions to this problem. The many-faceted Rasch model (MFRM), an extension of the Rasch model, overcomes shortcomings of classical test theory (CTT). By parameterizing not only interviewee ability and item difficulty but also judge severity, MFRM offers an effective way to estimate rater bias and provides detailed information on each rater's bias behavior. This study divided rater bias into two kinds, intra-rater inconsistency and between-rater differences in stringency, and used MFRM to analyze both.
Method
The data come from a structured interview of national civil servant candidates. There were 200 interviewees and 21 raters, who were randomly assigned to 3 panels on the morning of each of two days. Ratings from two panels were used in this study. The first 34 interviewees (numbers 1 through 34) were interviewed by raters A, B, C, D, E, F, and G on the first morning. Interviewees 35 to 66 were interviewed by raters A, E, H, I, J, K, and L on the second morning. Using a 10-point rating scale (1 to 10), each rater rated each interviewee independently on five dimensions.
Using FACETS 3.55.0, a computer program based on MFRM, we examined between-rater differences in stringency and intra-rater inconsistency. Rater bias across candidates, rating dimensions, gender, and time periods was further examined using the bias analysis provided by FACETS.
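To make the role of the severity parameter concrete, the following is a minimal sketch of how a many-facet rating scale model turns interviewee ability, dimension difficulty, and rater severity into probabilities for each point on a 10-point scale. It assumes the common adjacent-category formulation, log(P_k / P_{k-1}) = ability - difficulty - severity - threshold_k; the abstract does not state the exact model specification used in FACETS, and the function names, threshold values, and parameter values below are hypothetical illustrations, not estimates from the study.

```python
import math


def mfrm_category_probs(ability, difficulty, severity, thresholds):
    """Category probabilities under an (assumed) many-facet rating scale model.

    Adjacent-category logit: log(P_k / P_{k-1}) = ability - difficulty - severity - thresholds[k-1].
    Returns one probability per scale category (len(thresholds) + 1 values).
    """
    # Cumulative sums of the adjacent-category logits; the lowest category is the reference.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (ability - difficulty - severity - tau))
    # Normalize with the maximum subtracted for numerical stability.
    peak = max(logits)
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_rating(probs, lowest=1):
    """Expected rating on a scale whose lowest category is `lowest`."""
    return sum((lowest + k) * p for k, p in enumerate(probs))


# Hypothetical thresholds: a 10-point scale (1 to 10) needs 9 step thresholds.
thresholds = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]

# Same interviewee and dimension, rated by a lenient and a severe rater (hypothetical logits).
lenient = mfrm_category_probs(ability=0.5, difficulty=0.0, severity=-1.0, thresholds=thresholds)
severe = mfrm_category_probs(ability=0.5, difficulty=0.0, severity=1.0, thresholds=thresholds)

print(round(expected_rating(lenient), 2), round(expected_rating(severe), 2))
```

Under this formulation, a more severe rater shifts probability toward lower scale points for the same interviewee, which is the between-rater difference in stringency that the analysis estimates.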
Results
(1) In the structured interview, raters differed significantly from one another in severity;
(2) In the structured interview, raters demonstrated different levels of internal consistency. The specific behaviors of the inconsistent raters and the overly consistent raters were identified.
(3) Raters also showed different patterns of rater bias across candidates, rating dimensions, gender, and time periods.
Conclusions
The results suggest that there are two sources of rater bias. Applying MFRM to the analysis of structured interviews offers an effective way to select competent national civil servant candidates fairly. It also provides valuable information for selecting qualified raters and for identifying each rater's strengths and weaknesses, which is useful for building a rater bank and for further training of less competent raters.
Source
《心理学报》
CSSCI
CSCD
PKU Core (Peking University core journal list)
2006, Issue 4, pp. 614-625 (12 pages)
Acta Psychologica Sinica
Funding
Project of the Beijing Normal University Research Fund for Young Teachers in the Humanities and Social Sciences.