Funding: Shandong Social Science Planning Program (山东省社科规划项目).
Abstract: The objective of this paper is to explore the reliability of Online Automatic Scoring (OAS) by comparing it with Teacher Scoring (TS), and further to demonstrate the feasibility of integrating the two scoring methods. Pearson correlation statistics for the two sets of scores on 115 compositions were computed in SPSS; the correlation between the two methods reaches 0.83, indicating that OAS is relatively reliable for scoring students' compositions. After the second stage of the TS experiment, questionnaire results show that students generally accept OAS and have a clear understanding of the advantages and disadvantages of the two scoring methods. Combined with the student interviews, the conclusion is that OAS is reliable and that integrating the two scoring methods would produce better results.
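As a minimal illustration of the correlation analysis described above (the study itself used SPSS, not code), the following Python sketch computes a Pearson correlation between paired OAS and teacher scores; the arrays are placeholders for illustration only, not the paper's 115 compositions.

import numpy as np
from scipy.stats import pearsonr

# Placeholder data: stand-ins for paired (OAS, TS) scores, not the study's dataset.
oas_scores = np.array([78, 85, 62, 90, 74])   # hypothetical OAS scores
ts_scores  = np.array([80, 83, 65, 88, 70])   # hypothetical teacher scores

r, p_value = pearsonr(oas_scores, ts_scores)  # Pearson correlation and p-value
print(f"Pearson r = {r:.2f}, p = {p_value:.3f}")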
Abstract: The present aim is to update, upon arrival of new learning data, the parameters of a score constructed with an ensemble method involving linear discriminant analysis and logistic regression in an online setting, without the need to store all of the previously obtained data. Poisson bootstrap and stochastic approximation processes were used with online standardized data to avoid numerical explosions; the convergence of these processes has been established theoretically. The empirical convergence of the online ensemble scores to a reference “batch” score was studied on five different datasets from which data streams were simulated, comparing six different processes for constructing the online scores. For each score, 50 replications using a total of 10N observations (N being the size of the dataset) were performed to assess the convergence and the stability of the method, computing the mean and standard deviation of a convergence criterion. A complementary study using 100N observations was also performed. All tested processes on all datasets converged after N iterations, except for one process on one dataset. The best processes were averaged processes using online standardized data and a piecewise-constant step size.
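The following Python sketch illustrates, under stated assumptions, the kind of online update described above: an averaged stochastic-approximation step for the logistic-regression component of the score, with online standardization of the inputs, Poisson bootstrap weights, and a piecewise-constant step size. The class and parameter names are illustrative, the linear-discriminant-analysis component of the ensemble is omitted, and the step-size schedule is an assumption rather than the authors' exact choice.

import numpy as np

class OnlineLogisticScore:
    """Sketch only: averaged stochastic-approximation update for a logistic
    score, with online standardization, Poisson bootstrap weights, and a
    piecewise-constant step size. Not the authors' exact implementation."""

    def __init__(self, dim, step=0.1, block=1000):
        self.theta = np.zeros(dim + 1)      # current iterate (weights + intercept)
        self.theta_bar = np.zeros(dim + 1)  # running average of iterates
        self.n = 0                          # number of observations seen
        self.mean = np.zeros(dim)           # running mean (online standardization)
        self.m2 = np.zeros(dim)             # running sum of squared deviations
        self.step = step                    # initial step size (assumed value)
        self.block = block                  # observations per constant-step block

    def _standardize(self, x):
        # Welford's online mean/variance update, then standardize x.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        std = np.sqrt(self.m2 / max(self.n - 1, 1))
        std[std == 0.0] = 1.0               # guard against zero variance
        return (x - self.mean) / std

    def partial_fit(self, x, y, rng):
        z = np.append(self._standardize(x), 1.0)    # standardized features + intercept
        w = rng.poisson(1.0)                         # Poisson(1) bootstrap weight
        p = 1.0 / (1.0 + np.exp(-z @ self.theta))    # logistic prediction
        grad = w * (p - y) * z                       # weighted log-loss gradient
        gamma = self.step / 2 ** (self.n // self.block)  # piecewise-constant step
        self.theta -= gamma * grad                   # stochastic-approximation step
        self.theta_bar += (self.theta - self.theta_bar) / self.n  # averaging

# Illustrative usage on a simulated stream (placeholder data, not the paper's datasets):
rng = np.random.default_rng(0)
model = OnlineLogisticScore(dim=3)
for _ in range(10_000):
    x = rng.normal(size=3)
    y = int(rng.random() < 1.0 / (1.0 + np.exp(-x.sum())))
    model.partial_fit(x, y, rng)

Here theta_bar plays the role of the “averaged process” mentioned in the abstract, and the running mean and variance implement the online standardization intended to prevent numerical explosions.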