A comparison of standard residual methods and a mixture hierarchical model for detecting non-effortful responses
Abstract Using a simulation study, this article compared the mixture hierarchical model method with the family of standardized residual methods in detecting non-effortful responses and in estimating parameters, both when the assumptions of the mixture hierarchical model were met and when they were violated. The results showed that: (1) when non-effortful responses were absent or their severity was low, all methods performed similarly; (2) when severity was high, the iterative standardized residual method with fixed parameters was generally superior, and the mixture hierarchical model performed well only when its assumptions were met and the response times of the two types of responses differed greatly. The fixed-parameter iterative standardized residual method is recommended as the first choice in practice.

Assessment datasets contaminated by non-effortful responses may lead to serious consequences if not handled appropriately. Previous research has proposed two different strategies: down-weighting and accommodating. Down-weighting tries to limit the influence of aberrant responses on parameter estimation by reducing their weight. The extreme form of down-weighting is the detection and removal of irregular responses and response times (RTs). The standard residual-based methods, including the recently developed residual method using an iterative purification process, can be used to detect non-effortful responses within the down-weighting framework. In accommodating, on the other hand, one extends a model so that it accounts for the contamination directly. This boils down to a mixture hierarchical model (MHM) for responses and RTs. However, to the authors' knowledge, few studies have compared standard residual methods and MHM under different simulation conditions, so it is unknown which method should be applied in which situation. Moreover, MHM makes strong assumptions about the different types of responses, and it would be valuable to examine its performance when those assumptions are violated. The purpose of this study is to compare standard residual methods and MHM under a fully crossed simulation design and to provide specific recommendations for their application.

The simulation study included two scenarios. In simulation scenario I, data were generated under the assumptions of MHM. In simulation scenario II, the assumptions of MHM concerning non-effortful responses and RTs were both violated. Simulation scenario I had three manipulated factors. (1) Non-effort prevalence ($\pi$), the proportion of individuals with non-effortful responses, had three levels: 0%, 20% and 40%. (2) Non-effort severity ($\pi_i^{\text{non}}$), the proportion of non-effortful responses for each non-effortful individual, had two levels: low and high. When severity was low, $\pi_i^{\text{non}}$ was generated from $U(0, 0.25)$; when severity was high, $\pi_i^{\text{non}}$ was generated from $U(0.5, 0.75)$, where $U$ denotes a uniform distribution. (3) The difference between the RTs of non-effortful and effortful responses ($d_{RT}$) had two levels: small and large. The log RTs of non-effortful responses were generated from the normal distribution $N(\mu, 0.5^2)$, where $\mu = -1$ when $d_{RT}$ was small and $\mu = -2$ when $d_{RT}$ was large. Non-effortful responses were generated following Wang, Xu and Shang (2018), with the probability of a correct response $g_j$ set at 0.25 for all non-effortful responses. In simulation scenario II, only the first two factors were considered. Non-effortful RTs were generated from a uniform distribution with a lower bound of $\exp(-5)$ and an upper bound equal to the 5th percentile of RT on item $j$ with $\tau = 0$. The probability of a correct response for non-effortful responses depended on the ability level of each examinee. In all conditions, sample size was fixed at $I = 2{,}000$ and test length at $J = 30$. For each condition, 30 replications were generated. Effortful responses and RTs were simulated from van der Linden's (2007) hierarchical model. Item parameters were generated with $a_j \sim U(1, 2.5)$, $b_j \sim N(0, 1)$, $\alpha_j \sim U(1.5, 2.5)$ and $\beta_j \sim U(-0.2, 0.2)$. For simulees, the person parameters $(\theta_i, \tau_i)$ were generated from a bivariate normal distribution with mean vector $\boldsymbol{\mu} = (0, 0)'$ and covariance matrix $\Sigma$. Four methods were compared under each condition: the original standard residual method (OSR), the conditional-estimate standard residual method (CSR), the conditional-estimate standard residual method with fixed item parameters and an iterative purification procedure (CSRI), and MHM. These methods were implemented in R and JAGS, using Bayesian MCMC sampling for parameter calibration. Finally, the methods were evaluated in terms of convergence rate, detection accuracy and parameter recovery.

The results are as follows. First, MHM suffered from convergence issues, especially for the latent variable indicating non-effortful responses, whereas all the standard residual methods converged successfully. The convergence issues were more serious in simulation scenario II. Second, when all responses were effortful, the false positive rate (FPR) of MHM was 0. Although the standard residual methods had an FPR of around 5% (the nominal level), the accuracy of parameter estimates was similar for all methods. Third, when data were contaminated by non-effortful responses, CSRI had the higher true positive rate (TPR) in almost all conditions. MHM showed a lower TPR but also a lower false discovery rate (FDR), and its TPR was lower still in simulation scenario II. When $\pi_i^{\text{non}}$ was high, CSRI and MHM showed greater advantages over the other methods in terms of parameter recovery. However, when $\pi_i^{\text{non}}$ was high and $d_{RT}$ was small, MHM generally had a higher RMSE than CSRI. MHM performed worse in simulation scenario II than in simulation scenario I. The only weakness of CSRI was that it overestimated the time discrimination parameter in all conditions except when $\pi = 40\%$ and $d_{RT}$ was large. In a real data example, all the methods were applied to a dataset collected for program assessment and accountability purposes from undergraduates at a mid-sized southeastern university in the USA. Evidence of convergent validity showed that CSRI and MHM might detect non-effortful responses more accurately and obtain more precise parameter estimates for these data.

In conclusion, CSRI generally performed better than the other methods across all conditions. It is highly recommended in practice because: (1) it showed an acceptable FPR and fairly accurate parameter estimates even when all responses were effortful; (2) it was free of strong assumptions, which suggests that it would be robust in various situations; (3) its advantages in detecting non-effortful responses and improving parameter estimation were greatest when $\pi_i^{\text{non}}$ was high. To improve the estimation of the time discrimination parameter in CSRI, robust estimation methods that down-weight flagged response patterns can be used as an alternative to directly removing non-effortful responses (the approach taken in the current study). MHM can perform well when all its assumptions are met, $\pi_i^{\text{non}}$ is high and $d_{RT}$ is large. However, some parameters are difficult to bring to convergence under MHM, which will limit its application in practice.
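The RT part of the scenario I design described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' R/JAGS code. The function name `simulate_scenario_one` is hypothetical. The speed distribution $\tau_i \sim N(0, 1)$ is an assumption, because the covariance matrix is not given in the abstract. Marking each response of a non-effortful person independently with probability $\pi_i^{\text{non}}$ is also an assumption. Effortful log RTs use the lognormal form $\log T_{ij} \sim N(\beta_j - \tau_i, \alpha_j^{-2})$.

```python
import numpy as np


def simulate_scenario_one(n_persons=2000, n_items=30, prevalence=0.2,
                          severity="high", mu_non=-2.0, seed=0):
    """Generate log RTs contaminated by non-effortful responses (sketch)."""
    rng = np.random.default_rng(seed)

    # Time-related item parameters, with ranges taken from the abstract.
    alpha = rng.uniform(1.5, 2.5, size=n_items)   # time discrimination
    beta = rng.uniform(-0.2, 0.2, size=n_items)   # time intensity

    # ASSUMPTION: tau ~ N(0, 1); the source does not give the covariance matrix.
    tau = rng.normal(0.0, 1.0, size=n_persons)

    # Effortful log RTs: log T_ij ~ N(beta_j - tau_i, 1 / alpha_j^2).
    noise = rng.normal(size=(n_persons, n_items)) / alpha[None, :]
    log_rt = beta[None, :] - tau[:, None] + noise

    # A proportion `prevalence` of persons is non-effortful.
    n_non = int(round(prevalence * n_persons))
    non_persons = rng.choice(n_persons, size=n_non, replace=False)

    # Severity per non-effortful person: U(0, 0.25) if low, U(0.5, 0.75) if high.
    low, high = (0.0, 0.25) if severity == "low" else (0.5, 0.75)
    pi_non = np.zeros(n_persons)
    pi_non[non_persons] = rng.uniform(low, high, size=n_non)

    # ASSUMPTION: each response is non-effortful independently with prob pi_non[i].
    is_non = rng.random((n_persons, n_items)) < pi_non[:, None]

    # Non-effortful log RTs ~ N(mu_non, 0.5^2); mu_non is -1 (small d_RT) or -2 (large).
    log_rt[is_non] = rng.normal(mu_non, 0.5, size=int(is_non.sum()))
    return log_rt, is_non, pi_non
```

Calling `simulate_scenario_one(prevalence=0.4, severity="low", mu_non=-1.0)` would correspond to the condition with 40% prevalence, low severity and a small RT difference; item responses are omitted to keep the sketch short.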
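The detection criteria named in the abstract (TPR, FPR, FDR) can be computed from a matrix of true non-effort indicators and a matrix of flags. The sketch below pairs them with a standardized log-RT residual, $z_{ij} = \alpha_j\,(\log t_{ij} - (\beta_j - \tau_i))$. The function names are hypothetical. The one-sided cutoff of $-1.645$ is an assumption chosen to match a 5% nominal level, and the parameters are treated as known here, whereas the study estimates them by MCMC.

```python
import numpy as np


def standardized_residuals(log_rt, alpha, beta, tau):
    """z_ij = alpha_j * (log t_ij - (beta_j - tau_i)) under the lognormal RT model."""
    return alpha[None, :] * (log_rt - (beta[None, :] - tau[:, None]))


def detection_rates(truth, flagged):
    """Return TPR, FPR and FDR for boolean matrices of the same shape."""
    truth = np.asarray(truth, dtype=bool)
    flagged = np.asarray(flagged, dtype=bool)
    tp = int(np.sum(truth & flagged))     # non-effortful and flagged
    fp = int(np.sum(~truth & flagged))    # effortful but flagged
    fn = int(np.sum(truth & ~flagged))    # non-effortful but missed
    tn = int(np.sum(~truth & ~flagged))   # effortful and not flagged

    def ratio(num, den):
        # Undefined rates (empty denominator) are reported as NaN.
        return num / den if den > 0 else float("nan")

    return {"TPR": ratio(tp, tp + fn),
            "FPR": ratio(fp, fp + tn),
            "FDR": ratio(fp, fp + tp)}


# Tiny worked example: 2 persons, 3 items, parameters assumed known.
alpha = np.array([2.0, 2.0, 2.0])
beta = np.zeros(3)
tau = np.zeros(2)
log_rt = np.array([[0.0, -2.0, 0.5],
                   [-1.0, 0.1, -0.2]])
truth = np.array([[False, True, False],
                  [False, False, True]])

z = standardized_residuals(log_rt, alpha, beta, tau)
flagged = z < -1.645   # ASSUMPTION: one-sided 5% cutoff for unusually fast responses
rates = detection_rates(truth, flagged)
```

In this toy example one of the two non-effortful responses is flagged and one of the four effortful responses is flagged by mistake, so `rates` holds a TPR of 0.5, an FPR of 0.25 and an FDR of 0.5.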
Authors LIU Yue; LIU Hongyun; YOU Xiaofeng; YANG Jianqin (Institute of Brain and Psychological Sciences, Sichuan Normal University, Chengdu 610066, China; Beijing Key Laboratory of Applied Experimental Psychology, Beijing Normal University, Beijing 100875, China; Faculty of Psychology, Beijing Normal University, Beijing 100875, China; School of Mathematics and Information Science, Nanchang Normal University, Nanchang 360111, China)
Source Acta Psychologica Sinica (心理学报; indexed in CSSCI, CSCD and the Peking University Core list), 2022, Issue 4, pp. 411-425 (15 pages)
Funding National Natural Science Foundation of China (32071091).
Keywords non-effortful response; standard response time residual; iterative purification; mixture hierarchical model; Bayesian estimation