We use many devices in our daily life to communicate with others. In the modern world, people use email, Facebook, Twitter, and many other social networking sites to exchange information. People lose valuable time to misspelling and retyping, and some are reluctant to type long sentences because of unnecessary words or grammatical issues. For this reason, word-prediction systems help people exchange textual information more quickly, easily, and comfortably. These systems predict the next most probable words and let users choose the needed word from the suggestions. Word prediction can help a writer by predicting the next word and helping complete the sentence correctly. This research aims to forecast the most suitable next word to complete a sentence for any given context. We have worked on the Bangla language, presenting a process that predicts the next most probable and proper words and suggests a complete sentence using the predicted words. A GRU-based RNN applied to an N-gram dataset is used to develop the proposed model. We collected a large Bangla-language dataset from multiple sources and compared the model against other approaches such as LSTM and Naive Bayes; the suggested approach provides better accuracy than the others. The Unigram model achieves 88.22%, the Bi-gram model 99.24%, the Tri-gram model 97.69%, and the 4-gram and 5-gram models 99.43% and 99.78% average accuracy, respectively. We believe the proposed method can have a profound impact on Bangla search engines.
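The n-gram prediction idea underlying the abstract above can be illustrated with a minimal count-based sketch. This is an assumption-laden toy, not the paper's GRU model: it simply counts which word most often follows a given (n−1)-word context and offers the top candidates.

```python
from collections import Counter, defaultdict

def build_ngram_model(corpus, n=2):
    """Count (n-1)-word contexts -> next-word frequencies (toy model, not the paper's GRU)."""
    model = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for i in range(len(tokens) - n + 1):
            context = tuple(tokens[i:i + n - 1])
            model[context][tokens[i + n - 1]] += 1
    return model

def predict_next(model, context, k=3):
    """Return up to k most probable next words for the given context."""
    counts = model[tuple(context)]
    return [word for word, _ in counts.most_common(k)]

# Hypothetical mini-corpus for illustration only.
corpus = [
    "i want to eat rice",
    "i want to go home",
    "i want to eat fish",
]
model = build_ngram_model(corpus, n=3)
print(predict_next(model, ["want", "to"]))  # → ['eat', 'go']
```

A GRU-based model would replace the raw counts with a learned distribution over the vocabulary, but the interface (context in, ranked candidates out) is the same.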
Text summarization is an essential area of text mining, which provides procedures for text extraction. In natural language processing, text summarization maps documents to a representative set of descriptive words; the objective of text extraction is thus to obtain reduced, expressive content from text documents. Text summarization has two main areas: abstractive and extractive summarization. Extractive text summarization has two further approaches: the first applies a sentence-score algorithm, and the second follows word-embedding principles. All such text extractions are limited in conveying the basic theme of the underlying documents. In this paper, we employ text summarization using TF-IDF with PageRank keywords, a sentence-score algorithm, and Word2Vec word embeddings. The study compares these forms of text summarization with the actual text by calculating cosine similarities. Furthermore, TF-IDF-based PageRank keywords are extracted from the other two extractive summarizations, and an intersection over these three sets of TF-IDF keywords is performed to generate a more representative set of keywords for each text document. This technique generates variable-length keyword sets according to document diversity instead of selecting a fixed-length set for each document. This form of abstractive summarization improves metadata similarity to the original text compared with all other forms of summarized text, and it also resolves the issue of deciding the number of representative keywords for a specific text document. To evaluate the technique, the study used a sample of more than eighteen hundred text documents. The abstractive summarization follows deep-learning principles to create uniform similarity of the extracted words with the actual text and all other forms of text summarization. The proposed technique provides a more stable measure of similarity than existing forms of text summarization.
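The comparison step described above, scoring summaries against the original text by cosine similarity over TF-IDF vectors, can be sketched in a few lines. This is a simplified illustration under my own tokenization assumptions, not the paper's pipeline:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute sparse TF-IDF vectors for a list of tokenized documents."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # document frequency counts each doc once per term
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf}
        vectors.append(vec)
    return vectors

def cosine_similarity(a, b):
    """Cosine similarity between two sparse term->weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical tokenized documents for illustration.
docs = [["text", "mining", "text"], ["text", "summary"], ["graph", "rank"]]
vecs = tfidf_vectors(docs)
```

With vectors like these, a summary that shares weighted terms with the original document scores higher than an unrelated one, which is the stability property the study measures.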
Building on the paraphrasing of simple Chinese sentences, template-based paraphrasing of complex sentences is studied. Through the classification of complex sentences, syntactic analysis, and structural analysis, the proposed methods construct complex-sentence paraphrasing templates with the associated (connective) words as the core. Part-of-speech tagging is used to calculate the similarity between the paraphrasing sentences and the paraphrasing template. Joint complex sentences are divided into parallel, sequence, selection, progressive, and interpretive relationships; subordinate complex sentences are divided into transition, conditional, hypothesis, causal, and objective relationships. Joint and subordinate complex sentences are distinguished by their associated words. Using preprocessed sentences, a preliminary experiment is carried out to decide the similarity threshold between a paraphrasing sentence and a template. A small-scale paraphrase experiment shows the method is effective, achieving a paraphrasing-template coverage rate of 40.20% and a paraphrase correctness rate of 62.61%.
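The thresholded sentence-to-template matching described above can be sketched with a simple overlap measure over (word, POS) pairs. The actual similarity measure and threshold in the paper are not specified here, so this Jaccard-based version, the tag names, and the 0.5 threshold are all illustrative assumptions:

```python
def tag_similarity(sentence_tags, template_tags):
    """Jaccard overlap between two sets of (word, POS) pairs (illustrative measure)."""
    a, b = set(sentence_tags), set(template_tags)
    return len(a & b) / len(a | b) if a | b else 0.0

def matches_template(sentence_tags, template_tags, threshold=0.5):
    """Accept the template if similarity reaches an experimentally chosen threshold."""
    return tag_similarity(sentence_tags, template_tags) >= threshold

# Hypothetical POS-tagged sentence and template sharing a connective word.
sent = [("because", "CONJ"), ("it", "PRON"), ("rained", "VERB")]
tmpl = [("because", "CONJ"), ("it", "PRON"), ("snowed", "VERB")]
```

The role of the preliminary experiment in the paper is precisely to pick the threshold that separates usable template matches from spurious ones.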
To deal with issues such as overly simple image features and word-noise interference in product image sentence annotation, a product image sentence annotation model focusing on image-feature learning and keyword summarization is described. Three kernel descriptors, for gradient, shape, and color, are extracted respectively. Feature late-fusion is then executed by the multiple kernel learning model to obtain more discriminant image features. The absolute rank and relative rank of the tag-rank model are used to boost the keywords' weights. A new word-integration algorithm named word sequence blocks building (WSBB) is designed to create N-gram word sequences, and sentences are generated according to the N-gram word sequences and predefined templates. Experimental results show that both the BLEU-1 and BLEU-2 scores of the generated sentences are superior to those of state-of-the-art baselines.
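The BLEU-1 evaluation mentioned above is modified unigram precision with a brevity penalty. A minimal single-reference sketch (assuming whitespace tokenization; real evaluations typically use a toolkit implementation):

```python
import math
from collections import Counter

def bleu1(candidate, reference):
    """BLEU-1: clipped unigram precision times a brevity penalty (single reference)."""
    cand = candidate.split()
    ref = reference.split()
    cand_counts = Counter(cand)
    ref_counts = Counter(ref)
    # Clip each candidate word's count by its count in the reference.
    clipped = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    precision = clipped / len(cand)
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision
```

For example, `bleu1("a red shoe", "a red leather shoe")` has perfect clipped precision but a brevity penalty of exp(−1/3), since the candidate is one word shorter than the reference.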
The present paper makes a contrastive study of Chinese and English word order with a view to identifying the discrepancies and proposing their significance for mutual translation. As to the research methodology, qualitative analysis and comparative analysis are adopted to explain the similarities and differences of English and Chinese word order. It can be concluded that Chinese and English word order differs at the phrase level and the sentence level: discrepancies in phrase structure are mainly manifested in adverbial and attributive phrases, while discrepancies in sentence structure are reflected in simple, complex, and special sentences.
Funding: The National Natural Science Foundation of China (No. 61133012); the Humanity and Social Science Foundation of the Ministry of Education (No. 12YJCZH274); the Humanity and Social Science Foundation of Jiangxi Province (No. XW1502, TQ1503); the Science and Technology Project of Jiangxi Science and Technology Department (No. 20121BBG70050, 20142BBG70011).