Funding: Funded by the China National Planning Office of Philosophy and Social Science (No. 08XYY007).
Abstract: This exploratory study investigates (1) whether and how quantitative measures of writing can be used to identify raters' specific tendencies in scoring EFL writing; (2) how knowledge of raters' tendencies and scoring results can help determine the best way of combining raters' scores; and (3) how closely the EFL writing scores predicted from quantitative writing performance measures (WPM) match the actual scores given by raters. Using a tentative CAF (complexity, accuracy, fluency) framework of writing measures, the study observed raters' scoring tendencies and found patterns of both similarity and difference among the raters. The results of multiple linear regressions indicate that all raters give priority to accuracy in their scoring, although clear differences among raters also emerge. As for combining different raters' scores, the study finds that the weighted average is the best of the three methods for this group of raters: it yields better predicted scores than the "pure average" and is even slightly better than the results obtained by facet analysis on some important indices, such as R-squared and the Durbin-Watson value. The match between the predicted scores and the actual scores is well over 50 percent. The results are further discussed in relation to the application of WPM and possible improvements to the WPM framework. The methodological, theoretical and practical implications of the study are also addressed in the relevant part of the article.
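As an illustration of the quantities named in this abstract, the following is a minimal sketch of a "pure" versus a weighted average of rater scores, together with the standard formulas for R-squared and the Durbin-Watson statistic. The scores, weights and function names are hypothetical; the abstract does not say how the weights were derived, so they are simply supplied by hand here.

```python
from typing import Sequence


def pure_average(scores: Sequence[float]) -> float:
    """Unweighted mean of the scores given by several raters."""
    return sum(scores) / len(scores)


def weighted_average(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Mean of rater scores, with each rater's score scaled by a weight."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def durbin_watson(residuals: Sequence[float]) -> float:
    """Durbin-Watson statistic; values near 2 suggest no first-order autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical scores from three raters for one essay, and hypothetical weights.
rater_scores = [12.0, 14.0, 10.0]
rater_weights = [0.5, 0.3, 0.2]

print("pure average:", pure_average(rater_scores))
print("weighted average:", weighted_average(rater_scores, rater_weights))
```

Either combined score could then serve as the dependent variable in a regression on the writing measures, with `r_squared` and `durbin_watson` computed from the resulting predictions and residuals to compare the combination methods.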
Funding: Funded by the American University of Sharjah through its competitive university research grant program.
Abstract: This study investigates a particular application of speech recognition technology to the assessment of English proficiency. The application, the Versant English Test, is examined in a country where English is not the first language of communication, in order to determine whether the status of English as a first language in the country where the test is taken has a bearing on the test result. As suggested by Chun (2006), the study compares the results achieved by test takers in a non-English-speaking environment with those obtained by different test takers in an English-speaking environment. To decide whether the Versant is more prone to setting-related bias than other English proficiency tests, the Versant scores of test takers in a non-English-speaking setting are correlated with their TOEFL scores, and the resulting correlation coefficient is compared with that obtained in an English-speaking environment. The results suggest that the correlation between the Versant and TOEFL in a non-English-speaking environment is not significantly different from that obtained in an English-speaking environment.
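One common way to compare two correlation coefficients obtained from independent groups of test takers is Fisher's r-to-z transformation. The sketch below assumes that approach; the abstract does not state which test the study used, and the function name and sample values are hypothetical.

```python
import math
from typing import Tuple


def compare_independent_correlations(
    r1: float, n1: int, r2: float, n2: int
) -> Tuple[float, float]:
    """Fisher r-to-z test for two correlations from independent samples.

    Returns the z statistic and the two-sided p-value.
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each sample needs more than 3 observations")
    # Fisher transformation: z = atanh(r) is approximately normal
    # with variance 1 / (n - 3).
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical correlations and sample sizes, for illustration only.
z_stat, p_value = compare_independent_correlations(r1=0.75, n1=150, r2=0.70, n2=140)
print("z =", z_stat, "p =", p_value)
```

A p-value above the chosen significance level would be read as no significant difference between the two correlations, which is the kind of conclusion the abstract reports.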