Abstract: Short-term memory allows individuals to recall stimuli, such as numbers or words, for several seconds to several minutes without rehearsal. Although the capacity of short-term memory is considered to be 7 ± 2 items, this can be increased through a process called chunking. For example, in Japan, 11-digit cellular phone numbers and 10-digit toll-free numbers are chunked into three groups of three or four digits: 090-XXXX-XXXX and 0120-XXX-XXX, respectively. We use probability theory to predict that the most effective chunking involves groups of three or four items, such as in phone numbers. However, a 16-digit credit card number exceeds the capacity of short-term memory, even when chunked into groups of four digits, such as XXXX-XXXX-XXXX-XXXX. Based on these data, 16-digit credit card numbers should be sufficient for security purposes.
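The grouping patterns quoted in this abstract can be illustrated with a short Python sketch. The function name `chunk_digits` and the sample digit strings are hypothetical and are not taken from the paper; the sketch only shows the mechanical grouping, not the probabilistic analysis.

```python
from typing import List


def chunk_digits(digits: str, sizes: List[int]) -> str:
    """Split a digit string into hyphen-separated groups of the given sizes."""
    if sum(sizes) != len(digits):
        raise ValueError("group sizes must add up to the number of digits")
    groups = []
    start = 0
    for size in sizes:
        groups.append(digits[start:start + size])
        start += size
    return "-".join(groups)


# Made-up numbers that follow the patterns named in the abstract.
mobile = chunk_digits("09012345678", [3, 4, 4])            # 11 digits, 3 groups
toll_free = chunk_digits("0120123456", [4, 3, 3])          # 10 digits, 3 groups
card = chunk_digits("1234567812345678", [4, 4, 4, 4])      # 16 digits, 4 groups
print(mobile, toll_free, card)
```

Grouping reduces the number of units to hold in memory (three for the phone numbers, four for the card number), which is the effect the abstract attributes to chunking.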
Funding: Supported by the National Natural Science Foundation of China (No. 60504021).
Abstract: This letter presents a new chunking method based on a Maximum Entropy (ME) model with an N-fold template correction model. First, two types of machine learning models are described. Based on an analysis of the two models, a chunking model that combines the advantages of the conditional probability model and the rule-based model is then proposed. The selection of features and rule templates in the chunking model is discussed. Experimental results on the CoNLL-2000 corpus show that this approach achieves impressive accuracy, with an F-score of 92.93%. Compared with the ME model and the ME Markov model, the new chunking model achieves better performance.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61173100 and the Fundamental Research Funds for the Central Universities under Grant No. GDUT10RW202.
Abstract: A hybrid approach to English Part-of-Speech (PoS) tagging is presented, with English-Chinese machine translation in the business domain as its target application. It demonstrates how an existing tagger can be adapted to learn from a small amount of data and handle unknown words for the purpose of machine translation. A small annotated English corpus of 998 k in the business domain is built semi-automatically based on a new tagset; the maximum entropy model is adopted, and a rule-based approach is used in post-processing. The tagger is further applied to Noun Phrase (NP) chunking. Experiments show that the tagger achieves an accuracy of 98.14%, which is a quite satisfactory result. In the application to NP chunking, the tagger yields a 2.21% increase in F-score compared with the results obtained using the Stanford tagger.
Abstract: With the development of medical technology, millions of health-related data items, such as scanned images, are generated. Storing and handling such a massive volume of data is a great challenge. Healthcare data is stored in cloud-fog storage environments. This cloud-fog based health model allows users to get health-related data from different sources, and duplicated information is also present in the background. As a result, it requires additional storage space, increases data acquisition time, and leads to insecure data replication in the environment. This paper proposes eliminating duplicate data using a window size chunking algorithm with a biased sampling-based Bloom filter, and securing health data using the Advanced Signature-Based Encryption (ASE) algorithm in the Fog-Cloud Environment (WCA-BF+ASE). WCA-BF+ASE eliminates duplicate copies of the data and minimizes storage space and maintenance cost. The data is also stored efficiently and in a highly secure manner. In the cloud storage environment, the reported security level is 86.5% for the Windows Chunking Algorithm (WSCA), 80% for two thresholds two divisors (TTTD), 84.4% for Ordinal in Python (ORD), and 82% for the Bloom Filter (BF), while the proposed work reaches 97%. After de-duplication, the proposed WCA-BF+ASE method also requires less storage space across file sizes: 10 KB for 200 MB, 22 KB for 400 MB, 35 KB for 600 MB, 38 KB for 800 MB, and 40 KB for 1000 MB.
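As a rough illustration of how a Bloom filter can support chunk-level de-duplication, the following Python sketch uses fixed-size chunks and a plain Bloom filter. It is not the paper's WCA-BF+ASE method: the biased sampling and the ASE encryption are not modeled, and the class name, function name, and all parameter values are assumptions chosen for the example.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter; the sizes are illustrative assumptions."""

    def __init__(self, num_bits: int = 8192, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: bytes):
        # Derive several bit positions by salting SHA-256 with a seed byte.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def deduplicate(data: bytes, chunk_size: int = 64):
    """Store each distinct fixed-size chunk once and keep a rebuild recipe."""
    seen = BloomFilter()
    store = {}    # fingerprint -> chunk bytes (stored once)
    recipe = []   # ordered fingerprints needed to rebuild the data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).digest()
        # The filter answers "definitely new" or "maybe seen"; a "maybe"
        # is confirmed against the exact store to rule out false positives.
        if not (seen.might_contain(fingerprint) and fingerprint in store):
            store[fingerprint] = chunk
            seen.add(fingerprint)
        recipe.append(fingerprint)
    return store, recipe


data = b"A" * 64 * 3 + b"B" * 64 + b"A" * 64   # five chunks, two distinct
store, recipe = deduplicate(data)
rebuilt = b"".join(store[f] for f in recipe)
print(len(recipe), "chunks referenced,", len(store), "chunks stored")
```

The Bloom filter keeps the common "definitely new" case cheap, while the exact lookup guards against its false positives.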
Abstract: English speaking is one of the most important skills that senior high school students need to acquire in learning English. However, many problems remain in students' speaking practice. As a teaching and learning strategy, chunking is gradually being used in the English classroom and has received positive feedback. This paper therefore investigates and analyzes the influence of chunking on the English speaking skills of senior high school students, using a questionnaire and follow-up interviews to answer four questions: (1) What effect does chunking have on the oral fluency of high school students? (2) What effect does chunking have on their oral accuracy? (3) What effect does chunking have on their vocabulary? (4) Is English speaking performance related to gender? After analyzing the questionnaire results with SPSS and summarizing the interview records, we found that most students agree that the chunking strategy benefits their oral fluency, oral accuracy, and vocabulary. In addition, female students scored higher than male students.
Funding: Supported by the National Natural Science Foundation of China (No. 60673001) and the State Key Development Program of Basic Research of China (No. 2004CB318203).
Abstract: Based on variable-sized chunking, this paper proposes a content-aware chunking scheme, called CAC, that does not assume fully random file contents but considers the characteristics of the file types. CAC uses a candidate anchor histogram and file-type-specific knowledge to refine how anchors are determined when performing de-duplication of file data, and it enforces the selected average chunk size. CAC finds more chunks, which in turn produces smaller average chunks and a better reduction in data. We present a detailed evaluation of CAC, and the experimental results show that this scheme can improve the compression ratio of chunking for file types whose bytes are not randomly distributed (from 11.3% to 16.7%, depending on the dataset) and improve write throughput by 9.7% on average.
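For readers unfamiliar with variable-sized chunking, the baseline that CAC builds on can be sketched as follows. This is a generic content-defined chunker with a simple rolling-sum hash, not CAC itself: the candidate anchor histogram and the file-type knowledge are not modeled, and the function name, the window, mask, and size limits, and the test data are all assumptions.

```python
from typing import List


def content_defined_chunks(data: bytes, window: int = 16, mask: int = 0x3F,
                           min_size: int = 32, max_size: int = 256) -> List[bytes]:
    """Cut data at anchors chosen from its content, within size limits."""
    chunks = []
    start = 0      # start offset of the chunk being built
    rolling = 0    # sum of the last `window` bytes (a deliberately weak hash)
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]   # drop the byte leaving the window
        length = i - start + 1
        is_anchor = (rolling & mask) == mask
        # Cut at an anchor once the chunk is long enough, or force a cut
        # when the chunk reaches the maximum size.
        if (length >= min_size and is_anchor) or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])       # trailing partial chunk
    return chunks


# Deterministic, non-random test data (an arbitrary assumption).
data = bytes((i * 37 + (i // 7) * 11) % 256 for i in range(2000))
chunks = content_defined_chunks(data)
print(len(chunks), "chunks, sizes:", [len(c) for c in chunks])
```

Because the anchor test looks only at a sliding window of content, chunk boundaries tend to realign after an insertion or deletion, which is what makes this family of schemes useful for de-duplication. A production chunker would use a stronger rolling hash, such as a Rabin fingerprint, instead of a byte sum.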
Abstract: This paper suggests a chunk approach to solve the plateau problem among advanced English learners. The paper first discusses the extant problems and then provides a definition of the chunk approach. Based on some research results in cognitive psychology, it analyses the important role that chunks play in language acquisition and production and thus provides a cognitive foundation for implementing the chunk approach in English teaching. The paper also offers a set of classroom activities which can be easily adopted or adapted by other teachers.
Abstract: Fluency in oral English has always been a goal of Chinese learners of English. Language corpora offer great convenience for language research. Prefabricated chunks are a great help for learners in achieving oral English fluency. With the help of computer software, the chunks in SECCL are categorized. The conclusion is that, in the process of acquiring chunks, emphasis should be placed on content-related chunks, especially those related to specific topics. One effective way to gain topic-related chunks is to build a topic-related English corpus of native speakers.
Abstract: This paper aims to demonstrate the pervasiveness of metaphor chunks in News English and to introduce effective ways of understanding them correctly from the perspective of cognitive linguistics. Considering the difficulty of working out the accurate meaning of metaphor chunks in News English, some translation strategies are also proposed in the hope that they will benefit readers in their understanding and appreciation of News English.
Abstract: Language is the most important tool for human beings to communicate with the outside world. To improve the efficiency of communication, people need to maximize the efficiency of language processing to ensure the smooth production and understanding of meaning, although this is a subtle and complex process in human communication. In the 1980s, Dr. Widdowson proposed that language knowledge is largely chunk knowledge. The process of language output is the process of copying prefabricated chunk knowledge and then transferring it to language output. Based on previously collected data, this paper studies the dominant reproduction and implicit output of the English language from the perspective of prefabricated chunks, with the aim of guiding the optimization of EFL learners' output ability.
Abstract: Based on the concepts of Lexical Chunks and Multimodal Teaching, this paper focuses on the input source of English vocabulary learning, integrating the advantages of the Lexical Approach with Multimodal Teaching. As a new exploration in teaching English vocabulary, classroom practice has shown that teachers should make full and reasonable use of various teaching means and resources to achieve multimodal teaching of lexical chunks, which helps students learn vocabulary quickly and effectively and improves their English language competence and performance.
Abstract: Large amounts of information currently exist on Web sites and in various digital media. Most of it is in natural language, which is easy for people to browse but difficult for computers to understand. Chunk parsing and entity relation extraction are important for understanding the semantics of information in natural language processing. Chunk analysis is a shallow parsing method, and entity relation extraction is used to establish relationships between entities. Because full syntactic parsing is complex in Chinese text understanding, many researchers are more interested in chunk analysis and relation extraction. The conditional random fields (CRFs) model is a valid probabilistic model for segmenting and labeling sequence data. This paper models the chunk and entity relation problems in Chinese text. By transforming them into labeling problems, we can use CRFs to perform chunk analysis and entity relation extraction.
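The step of turning chunking into a labeling problem can be illustrated with a small Python sketch. It assumes the common BIO (begin, inside, outside) labeling scheme, which the abstract does not name, and it uses an English example sentence for readability although the paper works on Chinese text. The function names and the sample chunks are hypothetical. The resulting per-token labels are the kind of sequence a CRF would be trained to predict.

```python
from typing import List, Tuple

Chunk = Tuple[int, int, str]   # (start, end_exclusive, chunk_type)


def chunks_to_bio(tokens: List[str], chunks: List[Chunk]) -> List[str]:
    """Encode chunk spans as one BIO label per token."""
    labels = ["O"] * len(tokens)
    for start, end, chunk_type in chunks:
        labels[start] = "B-" + chunk_type           # first token of the chunk
        for i in range(start + 1, end):
            labels[i] = "I-" + chunk_type           # remaining tokens
    return labels


def bio_to_chunks(labels: List[str]) -> List[Chunk]:
    """Decode BIO labels back into spans (assumes well-formed labels)."""
    chunks = []
    start = None
    chunk_type = ""
    for i, label in enumerate(labels + ["O"]):      # sentinel closes the last chunk
        if start is not None and not label.startswith("I-"):
            chunks.append((start, i, chunk_type))
            start = None
        if label.startswith("B-"):
            start, chunk_type = i, label[2:]
    return chunks


tokens = ["He", "reckons", "the", "current", "account", "deficit", "will", "narrow"]
chunks = [(0, 1, "NP"), (1, 2, "VP"), (2, 6, "NP"), (6, 8, "VP")]
labels = chunks_to_bio(tokens, chunks)
print(list(zip(tokens, labels)))
```

Decoding with `bio_to_chunks` recovers the original spans, so a sequence labeler's output can be mapped back to chunks without loss.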
Abstract: Lexical chunks minimize language learners' burden of memorization and play a very important role in saving language processing effort, thereby improving learners' language fluency, appropriacy, and idiomaticity. Lexical chunks are used as "scaffolding" in college English teaching to effectively enhance learners' language proficiency.