Abstract: This paper focuses primarily on the major developments in language testing since the 1980s. Central to testing is the concept of language proficiency or ability, which consists of two components: language competence (or language knowledge) and strategic competence. A communicative approach to testing is needed to keep pace with the new teaching concept and to measure a learner's communicative competence. Communicative language tests are intended to measure people's ability to use language communicatively in a variety of real-life situations.
Funding: Financial support from Bethel Hearing and Speaking Training Center Inc. for all research studies in this paper.
Abstract: Objective: The paper discusses recent evidence on the assessment of language outcomes in children with hearing loss acquiring oral language. Methods: Research emphasizes that language tests must be specific enough to capture subtle deficits in vocabulary and grammar learning at different developmental ages. The Diagnostic Receptive and Expressive Assessment of Mandarin (DREAM) was carefully designed to be a comprehensive standardized Mandarin assessment normed in mainland China. Results: This paper summarizes the evidence-based item design process and the validity and reliability results of DREAM. A pilot study reported here shows that DREAM provides detailed information about hearing-impaired children's language abilities and can be used to aid intervention planning to maximize progress. Conclusion: DREAM represents an example of translational science, transferring methods from empirical studies of language acquisition in research environments into applied domains such as assessment and intervention. Research on outcomes in China will advance significantly with the availability of evidence-based comprehensive language tests that measure a sufficient age range of skills, are normed on Mandarin-speaking children in mainland China, and are designed to capture features central to Mandarin language acquisition.
Abstract: The 37th Language Testing Research Colloquium (LTRC 2015) was held at the Eaton Chelsea Hotel in Toronto, Canada, on March 16-20, 2015. The first two days, March 16-17, were pre-conference workshop days, with March 18-20 as the three main conference days. More than 300 participants from 27 countries and regions joined the conference. The top numbers of the
Abstract: Assessment and evaluation can be used not only to test how well students have acquired knowledge but also to reflect the quality and effectiveness of teaching. This study uses SPSS 24.0 to carry out a quantitative analysis of the scores on a TEM-4 practice test, which provides a reference for improving the TEM-4 test and offers insights for English teaching.
Abstract: Post-admission language tests tend to have a restricted range of proficiency levels among test-takers due to considerations made during the admission selection process. Although range restriction can present challenges for proficiency-focused assessment, it can also bring opportunities to zoom in on fine-grained performance profiles of test-takers. This study reports on the validation of a profile-based rating scale for an ESL writing placement test in a US university. The profile-based rating scale was created by employing a three-stage, hybrid scale development approach, to provide not only accurate placement decisions but also fine-grained diagnostic information regarding ESL students' writing performance profiles. The scale strikes a balance between argument development and lexico-grammar, to better account for the range of writing performances among test-takers. To gather validity evidence for the profile-based rating scale, this study employs a sequential, mixed-methods approach to examine the quality of test-taker performances across profiles and rater perceptions of the scale. Nine certified raters were recruited to conduct independent evaluations of lexico-grammar and argumentation on a sample of 150 test-taker performances. These evaluations were subjected to many-facet Rasch measurement analysis to examine the differences across writing performance profiles included in the rating scale. Next, semi-structured, follow-up interviews were conducted with the raters, to complement the quantitative findings on the usability and effectiveness of the scale. The findings provide supportive evidence for the validity of the profile-based rating scale. I argue that by focusing on performance profiles, post-admission language tests can strengthen the alignment across curriculum, instruction, and assessment in ESL writing programs.
Abstract: To describe the levels of English language proficiency expected at each stage of our school's Diploma and BA programs, we attempted to compare the level of our courses with national standards as embodied in the national TEM4 and TEM8, with international standards as embodied in international examinations such as Cambridge ESOL, and with other descriptions such as the Common European Framework, via one quantifiable parameter: vocabulary range. This is justified because vocabulary range offers an approximate but useful guide to the level of a course or a testing system. We hypothesize that the language competence at different levels of our program matches various standard proficiency examinations. Paul Nation's Range software was used both in its standard form, using his three BASEWRD files, and in an adapted form, adding the authors' own BASEWRD files extrapolated from various levels of our textbook series. This enabled us to compare the vocabulary range of our courses with that of both national and international examinations where word lists are available or recoverable. The research results supported the hypotheses suggested.
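The idea of vocabulary range as a comparison parameter can be illustrated with a minimal sketch. This is not Paul Nation's Range software and does not read BASEWRD files; the function name `coverage_by_list`, the tiny word lists, and the sample sentence are all hypothetical and serve only to show how the share of running words covered by successive word lists might be computed.

```python
import re


def coverage_by_list(text, base_lists):
    """Return the share of running words in `text` covered by each word list.

    `base_lists` maps a list name to a set of lowercase word forms (hypothetical
    stand-ins for base word lists). A token is credited to the first list that
    contains it; all remaining tokens are counted as 'off-list'.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = {name: 0 for name in base_lists}
    counts["off-list"] = 0
    for token in tokens:
        for name, words in base_lists.items():
            if token in words:
                counts[name] += 1
                break
        else:
            # No list contained this token.
            counts["off-list"] += 1
    total = len(tokens)
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}


# Hypothetical word lists and sample text, for illustration only.
lists = {"list_1": {"the", "test", "is", "a"}, "list_2": {"valid", "measure"}}
shares = coverage_by_list("The test is a valid measure of proficiency", lists)
print(shares)
```

Comparing such coverage profiles for a course text and for an examination word list gives a rough, single-parameter indication of how close the two are in level.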
Funding: This study received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Abstract: Objectives: This study aimed to systematically evaluate the effects of constraint-induced aphasia therapy (CIAT) for aphasic patients reported by randomized controlled trials. Methods: Relevant randomized controlled trials were retrieved from 11 electronic databases. A methodological quality assessment was conducted in accordance with the Cochrane Handbook, and meta-analyses were performed using RevMan 5.2. A descriptive analysis was conducted when the included trials were not suitable for a meta-analysis. Results: A total of 12 trials were included. The meta-analysis showed a statistically significant group difference in the results measured by the Western Aphasia Battery (random-effects model, MD = 1.23, 95% CI = 0.31 to 2.14, P < 0.01). However, there were no statistically significant differences in the results of the Boston Naming Test (fixed-effects model, MD = -1.79, 95% CI = -11.19 to 7.62, P > 0.05) and the Aachen Aphasia Test (fixed-effects model, MD = -1.11, 95% CI = -4.49 to 2.27, P > 0.05). The descriptive analysis showed positive results in the language performances of naming, repetition, and comprehension. Conclusion: This systematic review indicated that, based on the current evidence, CIAT was effective for improving language performance with regard to naming, comprehension, repetition, written language, and oral language. The review also provides some meaningful guidance for clinical practice: extend the therapy duration to 2 or 3 h per day, focus on naming, and choose the best assessment tool. It also indicates a need for more rigorous, large-scale, and high-quality trials in the future.
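The reported intervals and P values can be cross-checked with a short calculation. The sketch below assumes that each 95% confidence interval is symmetric around the mean difference and based on a normal approximation (critical value 1.96); the function name `z_and_p_from_ci` is hypothetical, and this is a consistency check on the published summary figures, not a re-run of the RevMan analysis.

```python
import math


def z_and_p_from_ci(md, lower, upper, critical_z=1.96):
    """Recover standard error, z statistic and two-sided p-value
    from a mean difference and its (assumed symmetric) 95% CI."""
    se = (upper - lower) / (2 * critical_z)
    z = md / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Summary figures as reported in the abstract.
wab = z_and_p_from_ci(1.23, 0.31, 2.14)    # Western Aphasia Battery
aat = z_and_p_from_ci(-1.11, -4.49, 2.27)  # Aachen Aphasia Test

print("WAB: se=%.3f z=%.2f p=%.4f" % wab)
print("AAT: se=%.3f z=%.2f p=%.4f" % aat)
```

Under these assumptions the Western Aphasia Battery result falls below the 0.01 threshold and the Aachen Aphasia Test result stays above 0.05, in line with the P values stated in the abstract.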
Abstract: CSE (China's Standards of English) is China's first English proficiency assessment standard covering all educational stages. Its ability grading reflects the correspondence between the English level of students at each educational stage and the ability descriptors, and it is targeted and instructive for college English teaching. As an English assessment reform that benefits all Chinese English learners, the Chinese English Proficiency Rating Scale has epoch-making significance. One of the core tasks in constructing a foreign language assessment system is to formulate a unified competency assessment standard and to provide a scientific competency index system and an accurate competency scale for various foreign language examinations. CSE provides a common reference framework for English learning, teaching, and assessment in China, and brings English learning, English teaching, and English assessment together in one system. Improving the quality of teaching is a concern of every English teacher. The introduction of CSE undoubtedly provides authoritative criteria and a powerful impetus for the deepening and refinement of College English reform.
Abstract: As a large-scale standardized English proficiency test, IELTS attracts millions of test takers internationally. This essay, drawing on current theories in language testing and assessment, attempts to evaluate the writing test of the IELTS Academic Module. It concludes that the test is generally valid and reliable and generates overall beneficial effects, though several threats may undermine its validity and reliability. Some suggestions for improvement are proposed at the end.
Abstract: With a growing number of foreign language studies on proficiency outcomes, it is imperative to address the challenge of measuring students' proficiency development in a language program where standardized proficiency testing is not readily available. This article reports on an instructor's administration of a Chinese elicited imitation test (EIT) to track students' global oral proficiency development in a small language program at a mid-size U.S. public university. The results from the EIT of second language (L2) Chinese suggest that this tool can provide the instructor with valuable insights into students' oral proficiency. This study also discusses the potential practical value of using this EIT in a language program with limited resources for standardized proficiency assessment. The hope is that this study will encourage language educators who are not already doing so to start using empirical evidence from a valid and reliable proficiency measurement tool to reflect on, improve, and guide their instructional practices.
Funding: Supported by the Fundamental Research Funds for the Central Universities (105563GK).
Abstract: This study reports the development, piloting, and initial validation of a test measuring language analytic ability, one component of foreign language aptitude, for Chinese learners of foreign languages (FL). A test with 50 items was constructed and administered to 53 third-year English majors. Rasch analyses showed that the subtest of inductive language learning ability was too easy. After misfitting items were removed, the reduced grammatical sensitivity subtest showed satisfactory psychometric properties. The Rasch measures of the students' grammatical sensitivity were also found to correlate significantly with their TEM-4 scores and their English reading grades, thus providing further evidence for the validity of this subtest.
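What it means for a subtest to be "too easy" in Rasch terms can be shown with the dichotomous Rasch model, in which the probability of a correct response depends only on the difference between person ability and item difficulty (both in logits). The sketch below is a minimal illustration; the function names and the ability and difficulty values are hypothetical and are not taken from the study.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def expected_score(ability, difficulties):
    """Expected number of correct answers on a set of items."""
    return sum(rasch_probability(ability, b) for b in difficulties)


# Hypothetical values: three items far easier than the test-taker's ability.
easy_items = [-3.0, -2.5, -2.0]
score = expected_score(1.0, easy_items)
print(round(score, 2), "expected correct out of", len(easy_items))
```

When item difficulties sit well below the abilities of the test-takers, nearly everyone is expected to answer nearly every item correctly, so the items carry little information for separating the candidates.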
Funding: National Natural Science Foundation of China (61003016); Supported Project of the State Key Laboratory of Software Development Environment (SKLSDE-2009ZX-13).
Abstract: The spacecraft automatic test system, a comprehensive spacecraft test information system based on various spacecraft test specifications formalized as a spacecraft test language, is an important means of improving test efficiency. With the new requirements of multi-spacecraft testing in China, the study of spacecraft test languages has become a new challenge for the spacecraft test field. In this article, a high-order spacecraft test language, the China aerospace test and operation language (CATOL), is presented in line with current test requirements, together with the structure of the language. Then, to characterize and formalize spacecraft processes, the syntax and operational semantics of one of its sub-languages, CATOL-PR, are defined. Finally, a prototype system for the proposed language is presented. This language will improve the specification of spacecraft test work in China and the efficiency of spacecraft testers, and will promote the development of spacecraft automatic testing.
Funding: Funded by the American University of Sharjah through the university research grant program on a competitive basis.
Abstract: This study investigates a particular use of an application of speech recognition technology in the assessment of English proficiency. The use of the application, called the Versant English Test, is examined in the context of a country where English is not the first language of communication, in order to determine whether English being the first language of the country in which the test is taken could have a bearing on the test result. As suggested by Chun (2006), this study compares the results achieved by test takers in a non-English-speaking environment with those obtained by different test takers in an English-speaking environment. To decide whether the Versant is more prone to setting-related bias than other English proficiency tests, the Versant test scores are correlated with the TOEFL scores of the test-takers in a non-English-speaking setting, and the correlation coefficient is then compared with that achieved in an English-speaking environment. The results suggest that the correlation between the Versant and TOEFL in a non-English-speaking environment is not significantly different from that obtained in an English-speaking environment.
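The abstract does not state which procedure was used to compare the two correlation coefficients. One standard option for correlations from independent samples is Fisher's r-to-z transformation, sketched below as an assumption; the function name and the r and n values are hypothetical and are not results from the study.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Test whether two Pearson correlations from independent samples differ,
    using Fisher's r-to-z transformation. Returns (z, two-sided p)."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical inputs: identical correlations in two settings.
same_z, same_p = compare_independent_correlations(0.70, 50, 0.70, 60)
# Hypothetical inputs: clearly different correlations.
diff_z, diff_p = compare_independent_correlations(0.80, 100, 0.50, 100)
print(same_z, same_p)
print(diff_z, diff_p)
```

A large p-value from such a test would be read as "no significant difference" between the correlation observed in the non-English-speaking setting and the one observed in the English-speaking setting.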
Funding: This study was supported by grants from the Chinese Ministry of Science and Technology (G1999054006) and the National Institutes of Health, USA (5RO1MH55346).
Abstract: Objectives: To identify the cortical areas engaged during Chinese word processing using functional magnetic resonance imaging (fMRI) and to examine the reliability and reproducibility of fMRI for localization of functional areas in the human brain. Methods: fMRI data were collected from 8 young, right-handed, native Chinese speakers during performance of Chinese synonym and homophone judgment tasks on two different clinical MRI systems (1.5 T GE Signa Horizon and 1.5 T Siemens Vision). A cross-correlation analysis was used to statistically generate the activation map. Results: Broca's area, Wernicke's area, bilateral extrastriate, and ventral temporal cortex were significantly activated during both the synonym and homophone activities. There was essentially no difference between results acquired on the two different MRI systems. Conclusions: fMRI can be used for localizing cortical areas critical to Chinese language processing in the human brain. The results are reliable and well reproducible across different clinical MRI systems.
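The principle behind a cross-correlation activation map can be sketched as follows: each voxel's signal time course is correlated with a reference waveform that follows the task blocks, and voxels whose correlation exceeds a threshold are marked as active. The example below is a simplified illustration only. It ignores the hemodynamic delay and any statistical correction, and the boxcar reference, the voxel time courses, and the 0.5 threshold are hypothetical rather than taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def activation_map(voxel_series, reference, threshold=0.5):
    """Mark each voxel as active when its correlation with the reference
    waveform reaches the (hypothetical) threshold."""
    return {
        voxel: pearson(series, reference) >= threshold
        for voxel, series in voxel_series.items()
    }


# Hypothetical boxcar reference: rest (0) and task (1) blocks, repeated twice.
reference = [0, 0, 0, 1, 1, 1] * 2
voxels = {
    "task_like": [100 + 5 * r for r in reference],  # follows the task blocks
    "flat": [100] * 12,                             # no signal change
    "inverse": [100 - 5 * r for r in reference],    # anti-correlated
}
active = activation_map(voxels, reference)
print(active)
```

Running the same procedure on data from two scanners and comparing the resulting maps is one simple way to think about the reproducibility question the abstract raises.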
Abstract: Language testing is important and necessary. In English language testing today, the multiple-choice item is the most widely used, and many users regard it as the most flexible and probably the most effective of the objective item types. The multiple-choice item has its own characteristics, advantages, and disadvantages. We should bring out its strengths to make up for its weaknesses and use it appropriately. Although it has its limitations, it is suitable for large-scale tests and tests covering a wide range of knowledge. We should apply testing principles and methods correctly in order to make testing more effective and reliable.