Funding: Supported by the National Natural Science Foundation of China under Grants No. 61173100 and No. 61173101, and by the Fundamental Research Funds for the Central Universities under Grant No. DUT10RW202.
Abstract: A new joint decoding strategy that combines character-based and word-based conditional random field models is proposed. In this segmentation framework, fragments are used to generate candidate out-of-vocabulary (OOV) words. After the initial segmentation, the segmentation fragments are divided into two classes: "combination" (combining several fragments into one unknown word) and "segregation" (segregating the fragments into separate words). As a result, more OOV words can be recalled. Moreover, to suit the characteristics of cross-domain segmentation, context information is used to guide Chinese Word Segmentation (CWS). Experiments on test data from the SIGHAN Bakeoff 2007 and Bakeoff 2010 show that the method is effective: OOV recall improves, and overall segmentation performance is good.
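Character-based segmentation is commonly cast as labeling each character with its position in a word. The sketch below illustrates that encoding with the B/M/E/S tag set (begin, middle, end, single), which is an assumption here: the abstract does not say which tag set the authors used, and the function names are hypothetical.

```python
def words_to_tags(words):
    """Convert a segmented sentence (list of words) to per-character B/M/E/S tags."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # first character, any middle characters, last character
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their B/M/E/S tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends at E or S
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that ends mid-word
        words.append(current)
    return words


segmented = ["中文", "分词", "很", "有趣"]
tags = words_to_tags(segmented)
restored = tags_to_words("".join(segmented), tags)
```

A word-based model scores whole candidate words instead; a joint decoder of the kind the abstract describes would have to reconcile the two views, which this sketch does not attempt.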
Funding: Supported by the National "863" Project (No. 2014AA06A605).
Abstract: Transform-based methods, which exploit the sparse representation of seismic data in a transform domain, are among the most commonly used techniques for seismic denoising. The choice of transform basis functions influences the denoising results. We propose a learned overcomplete dictionary based on the K-singular value decomposition (K-SVD) algorithm. To construct the dictionary and use it for random seismic noise attenuation, we replace fixed transform basis functions with an overcomplete, redundant function library. Because it adapts to the data, the learned dictionary describes essential data characteristics much better than conventional denoising methods. The sparsest representation of the signals is obtained by training on the seismic data. Comparing the results for the same seismic data obtained with the K-SVD-based learned overcomplete dictionary and with other denoising methods, we find that the learned dictionary represents the seismic data more sparsely, effectively suppressing random noise and improving the signal-to-noise ratio.
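For orientation, dictionary learning of the K-SVD type is usually posed as the sparse approximation problem below. The symbols are assumptions introduced here, not notation from the paper: the columns of $Y$ are training patches of seismic data, $D$ is the overcomplete dictionary (more atoms than patch dimensions), $X$ holds the sparse coefficient vectors $x_i$, and $T_0$ bounds the number of nonzero coefficients per patch.

```latex
\min_{D,\,X} \; \lVert Y - D X \rVert_F^2
\quad \text{subject to} \quad
\lVert x_i \rVert_0 \le T_0 \quad \text{for all } i
```

K-SVD alternates between a sparse coding step with $D$ fixed and an atom-by-atom update of $D$ through a rank-one SVD approximation of the residual; a denoised patch is then the reconstruction $D x_i$.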
Funding: National Natural Science Foundation of China (No. 71171003); Natural Science Foundation of Anhui Province of China (No. 090416225); Natural Science Foundation of Universities of Anhui Province of China (No. KJ2010A037).
Abstract: A class of stochastic differential equations (SDEs) driven by semimartingales with non-Lipschitz coefficients is studied. Using the Gronwall inequality, the non-confluence of solutions is proved under general conditions.
Abstract: Vocabulary knowledge is one of the most important aspects of language development. For bilingual students, early vocabulary development often predicts future bilingual success. This paper examines the early bilingual receptive vocabulary knowledge of ethnic minority children (N=135) from two large ethnic language communities (Uyghur and Kazak) in three national-level poverty-stricken counties in Xinjiang, China. The children's bilingual vocabulary knowledge was assessed using translated versions of the Peabody Picture Vocabulary Test-IV (PPVT-IV) in Putonghua (PTH) and in their mother tongue (MT), Uyghur or Kazak. Data were analyzed with four General Linear Models (GLM). The analyses showed that both groups scored higher in MT vocabulary knowledge than in PTH, although the Kazak students' MT vocabulary scores were lower than those of the Uyghurs. Gender, age, L1, and residence location were not significant factors in the differences between the two groups in PTH; among the Kazak children, however, the main effect of age was significant in MT, and among the Uyghur children, residence location had a significant effect. The two groups also differed in their patterns of acquisition across parts of speech (nouns, verbs, and attributes), with Uyghur children performing strongest on MT and PTH verbs. The findings have important implications for ensuring the quality of early bilingual education in impoverished Chinese minority communities.
Funding: Supported by the National Natural Science Foundation of China under Grants No. 60863011, No. 61175068, No. 61100205, and No. 60873001; the Fundamental Research Funds for the Central Universities under Grant No. 2009RC0212; the National Innovation Fund for Technology-based Firms under Grant No. 11C26215305905; and the Open Fund of the Software Engineering Key Laboratory of Yunnan Province under Grant No. 2011SE14.
Abstract: We propose a method for automatic Naxi-English bilingual word alignment based on a log-linear model. The method defines Naxi-English structural feature functions, namely an English-Naxi interval switching function and a Naxi-English bilingual word position transformation function. Using a manually labeled Naxi-English word alignment corpus, the model parameters are trained under a minimum-error criterion, so that Naxi-English bilingual word alignment is achieved automatically. Experiments are conducted with IBM Model 3 as the baseline, and Naxi language constraints are introduced. The final results show that the proposed alignment method performs well: introducing the language-specific feature functions effectively improves the accuracy of Naxi-English bilingual word alignment.
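A log-linear model scores each candidate alignment by a weighted sum of feature functions and normalizes the exponentiated scores into probabilities. The sketch below shows only that general mechanism: the two feature functions, the weights, and the candidate alignments are hypothetical stand-ins, not the interval switching or position transformation functions defined in the paper.

```python
import math


def position_feature(alignment):
    """Hypothetical feature: penalize links whose source and target positions differ."""
    return -sum(abs(i - j) for i, j in alignment)


def coverage_feature(alignment):
    """Hypothetical feature: count the distinct target positions that are linked."""
    return len({j for _, j in alignment})


def log_linear_probs(candidates, feature_fns, weights):
    """Return P(a) proportional to exp(sum_k w_k * h_k(a)) for each candidate a."""
    scores = [sum(w * h(a) for w, h in zip(weights, feature_fns)) for a in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Each alignment is a list of (source_index, target_index) links.
candidates = [
    [(0, 0), (1, 1), (2, 2)],  # monotone alignment
    [(0, 2), (1, 1), (2, 0)],  # fully reversed alignment
    [(0, 0), (1, 0), (2, 0)],  # every source word linked to one target word
]
weights = [1.0, 0.5]  # illustrative values; in practice these are trained
probs = log_linear_probs(candidates, [position_feature, coverage_feature], weights)
best = max(range(len(candidates)), key=probs.__getitem__)
```

With these toy weights the monotone alignment receives the highest probability; training on a labeled alignment corpus would instead choose the weights that minimize alignment error.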
Abstract: This paper investigates the teletandem learning interactions between a group of Brazilian students from the Instituto Federal de Educação, Ciência e Tecnologia do Estado de Goiás, Brazil, and a group of foreign students from two German universities. In this study, the Brazilian students helped their foreign partners learn Portuguese and were helped by them in learning English. The participants used a synchronous communication tool called Openmeetings, with an electronic dictionary as a complementary tool. This case study, which adopted a qualitative perspective in data collection and analysis, was conducted in the second semester of 2010. The data were collected from conversation sessions held through Openmeetings and were analyzed in the light of sociocultural theory and of research on tandem/teletandem language learning. The analyses showed that the participants used English as an anchoring language to work on Portuguese and on English itself, and that German was introduced in the teletandem sessions. The data also showed that the whiteboard and the electronic dictionary were used as complementary resources alongside audio and video in the participants' language learning process.
Abstract: This paper investigates the properties of lexical causatives from a cross-linguistic perspective. Specifically, we contrast the lexical causatives of English, Japanese, and Chinese. We adopt Pylkkänen's (2008) minimalist model as the framework. According to this model, the cross-linguistic similarity of causatives is attributed to the presence of the functional head vCAUSE. Variation among causatives in different languages can be attributed to two parameters: (i) whether vCAUSE obligatorily requires the presence of an external argument; (ii) whether the complement of vCAUSE is root-selecting, verb-selecting, or phase-selecting. Causatives can be roughly divided into two types, lexical causatives and productive ones. As far as lexical causatives are concerned, languages can be classified as Voice-bundling or Non-Voice-bundling according to whether an external argument (i.e., a causer or cause) is obligatorily required in lexical causatives. English is Voice-bundling and Japanese is Non-Voice-bundling. Chinese represents a third type, which may be called semi-Voice-bundling, since lexical unaccusative causatives in Chinese are Non-Voice-bundling while action-result compound unaccusatives (resultative unaccusatives) are Voice-bundling. The causative heads of lexical causatives in these three languages are all root-selecting.
Abstract: Predicate-Argument (PA) structure analysis is often divided into three subtasks: predicate sense disambiguation, argument identification, and argument classification. To date, they have mostly been modeled in isolation. However, this approach neglects the logical constraints between them. We therefore explore integrating predicate sense disambiguation with each of the latter two subtasks, which verifies that automatic predicate sense disambiguation can help the semantic role labeling task. In addition, a dual decomposition algorithm is used to alleviate error propagation between the argument identification and argument classification subtasks, which greatly benefits argument identification. Experimental results show that our approach achieves better PA analysis performance than other pipeline approaches.
Abstract: The purpose of this study is to shed light on the southern part of Italy where Catalonians ruled. Great numbers of Spaniards, principally Catalonians, headed to that country. This affected the language and, in turn, the history of people's last names. At first, some Spanish last names were used as nicknames for Italians. To gather data from primary sources, I spent four consecutive summers in Italy visiting towns in the Naples area, where I collected surnames found on houses. The Catalonians came to rule Sardinia, and their language, and subsequently Spanish, were official on the island. The linguistic influence of Spanish does not stop with surnames. A list of Spanish and Basque surnames that is redolent of the history of southern Italy and Sicily is appended. The geolinguistic interest lies in the way that the study of language, both ordinary words and proper nouns, offers important clues to the lives and movements of people of ages past, reflects political and economic aspects, and also explains the ethnic origin of people who live in Sicily and Italy today or who are descendants of Italians who have been important immigrants in the Americas, in Australia, and indeed around the world.
Abstract: The issue of "headedness" is a product of Chomsky's (1988) notion of UG (Universal Grammar), which led to the development of the framework known as P&P (Principles and Parameters) theory. This is the theory adopted for the analysis in this paper. The purpose of the paper is to examine the inconsistency in the value of the Head Parameter with reference to the DP (determiner phrase) in Yorùbá. As a native speaker of Yorùbá, the author has adopted an introspective method of data collection and has drawn on the intuitive knowledge of other native speakers of the language for necessary clarifications. Although English and Yorùbá are both head-initial, the structure of NPs (noun phrases) in English shows that the head noun is always pre-modified, making the NP "head-final", a violation of the value of the Head Parameter in the language. This motivated Abney's (1987) DP hypothesis, in which the determiner heads its own phrase, thereby making an NP in English head-initial. This solves the problem of the Head Parameter in English. However, since nouns in Yorùbá are post-modified, adopting the "DP-analysis" will automatically produce a head-final structure, a violation of the value of the Head Parameter in the language. Given the inconsistency in the specification of head-complement order among the noun phrases of English and Yorùbá, this paper proposes setting a parameter under which SVO (Subject-Verb-Object) languages with pre-modification (like English) adopt the "DP-analysis" and those with post-modification (like Yorùbá) adopt the "NP-analysis". This will ensure a "head-initial" value for both categories of SVO languages.
Abstract: This preliminary study investigates the relationship between lexical richness and communicative effectiveness in placement essays written by graduate students whose first language (L1) is Chinese. Lexical richness is measured with Nation's web-based vocabulary profiler software (Cobb, undated), and communicative effectiveness is scored by multiple raters. The structure of the study, the compilation and review of the lexical data, the analysis of the results, and the teaching implications are discussed. The study indicates that lexical richness needs to be taken into account when scoring the quality of writing. However, further investigation and analysis are still needed as the study continues.