Abstract: To convey complete meanings, Chinese often uses multiple running sentences. Xu Jingning (2023, p. 66) states, “In communication, a complete expression of meaning often requires more than one clause, which is common in human languages.” Research in China on running sentences covers the definition of the concept and its structural features, sentence properties, sentence-pattern classifications and their criteria, and issues in translating running sentences into English. This article focuses on Chinese scholarship on the English translation of running sentences, highlighting recent achievements and identifying open issues. A review of the translation literature finds that research on non-core running sentences is limited, and the paper proposes strategies to address this gap.
Abstract: Sentence classification is the process of categorizing a sentence based on its context. It relies more on semantic features than tasks such as dependency parsing, which rely more on syntactic elements. Most existing strategies focus on the general semantics of a conversation without considering the context of each sentence, tracking the conversation's progress, or comparing impacts. Here, an ensemble of pre-trained language models is used to classify sentences from a conversation corpus into four categories: information, question, directive, and commission. The resulting label sequences are used to analyze the progress of the conversation and to predict its pecking order. The ensemble consists of Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Generative Pre-Trained Transformer (GPT), DistilBERT, and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models, each trained on the conversation corpus, with hyperparameter tuning applied to improve sentence classification performance. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT, and XLNet transformer models, and the ensemble with fine-tuned parameters achieved an F1 score of 0.88.
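The abstract does not say how the individual models' predictions are combined. As an illustration only, the sketch below assumes soft voting, where the per-class probabilities of each model are averaged and the highest-scoring label wins. The function name `soft_vote` and the example probabilities are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# The four sentence categories named in the abstract.
LABELS = ["information", "question", "directive", "commission"]


def soft_vote(model_probs: List[Dict[str, float]]) -> str:
    """Average per-class probabilities across models and return the top label.

    Assumption: soft voting. The paper may combine its models differently.
    """
    if not model_probs:
        raise ValueError("at least one model output is required")
    totals = {label: 0.0 for label in LABELS}
    for probs in model_probs:
        for label in LABELS:
            totals[label] += probs.get(label, 0.0)
    averaged = {label: total / len(model_probs) for label, total in totals.items()}
    return max(averaged, key=averaged.get)


# Hypothetical outputs of three fine-tuned classifiers for one sentence.
example = [
    {"information": 0.10, "question": 0.70, "directive": 0.15, "commission": 0.05},
    {"information": 0.40, "question": 0.35, "directive": 0.20, "commission": 0.05},
    {"information": 0.20, "question": 0.60, "directive": 0.10, "commission": 0.10},
]
predicted = soft_vote(example)
```

Here the second model alone would pick "information", but the averaged probabilities favor "question", which is the kind of disagreement an ensemble is meant to smooth over.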
Abstract: We use many devices in daily life to communicate with others. People use email, Facebook, Twitter, and many other social network sites to exchange information. They lose valuable time to misspelling and retyping, and some are reluctant to type long sentences because of unnecessary words or grammatical issues. Word prediction systems therefore help people exchange textual information more quickly, easily, and comfortably. These systems predict the most probable next words and let the user choose the needed word from the suggestions, helping the writer complete the sentence correctly. This research aims to predict the most suitable next word to complete a sentence in any given context, working on the Bangla language. We present a process that predicts the most probable and appropriate next words and suggests a complete sentence using the predicted words. A GRU-based RNN is trained on an N-gram dataset to develop the proposed model. We collected a large Bangla dataset from multiple sources and compared the model with other approaches such as LSTM and Naive Bayes; the proposed approach achieved higher accuracy than the others. On average, the unigram model reaches 88.22% accuracy, the bigram model 99.24%, the trigram model 97.69%, and the 4-gram and 5-gram models 99.43% and 99.78%. We believe the proposed method could have a profound impact on Bangla search engines.
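To make the N-gram framing concrete, the sketch below shows a plain count-based next-word suggester. It is not the paper's GRU-based model: it only illustrates how an N-gram dataset pairs a context of n-1 words with the word that follows. The toy English corpus and the names `build_ngram_table` and `suggest` are assumptions made for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple


def build_ngram_table(tokens: List[str], n: int) -> Dict[Tuple[str, ...], Counter]:
    """Map each (n-1)-word context to a counter of the words that follow it."""
    table: Dict[Tuple[str, ...], Counter] = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        next_word = tokens[i + n - 1]
        table[context][next_word] += 1
    return table


def suggest(table: Dict[Tuple[str, ...], Counter],
            context: Sequence[str], k: int = 3) -> List[str]:
    """Return up to k most frequent next words for the given context."""
    counts = table.get(tuple(context))
    if not counts:
        return []  # unseen context: no suggestion
    return [word for word, _ in counts.most_common(k)]


# Toy corpus standing in for the Bangla data described in the abstract.
corpus = "i like tea . i like coffee . i like tea with milk .".split()
trigram_table = build_ngram_table(corpus, 3)
suggestions = suggest(trigram_table, ["i", "like"])
```

In this toy corpus "tea" follows "i like" twice and "coffee" once, so "tea" is ranked first. A neural model such as the GRU in the abstract replaces the count table with learned parameters but consumes the same context and next-word pairs.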
Funding: Supported in part by the National Natural Science Foundation of China under Grants 62273272 and 61873277, in part by the Chinese Postdoctoral Science Foundation under Grant 2020M673446, in part by the Key Research and Development Program of Shaanxi Province under Grant 2023-YBGY-243, and in part by the Youth Innovation Team of Shaanxi Universities.
Abstract: Current encoder-decoder video captioning models rely mainly on a single video input source. Few studies use external corpus information to guide caption generation, which limits caption content and hinders accurate description and understanding of the video. To address this issue, this paper proposes a video captioning method guided by a sentence retrieval generation network (ED-SRG). First, a ResNeXt network, an efficient convolutional network for online video understanding (ECO), and a long short-term memory (LSTM) network are integrated into an encoder-decoder, which extracts the 2D features, 3D features, and object features of the video data, respectively. These features are decoded into textual sentences that conform to the video content and serve as queries for sentence retrieval. Then, a sentence-transformer network retrieves sentences from an external corpus that are semantically similar to these textual sentences, and candidate sentences are screened by similarity measurement. Finally, a new model is built on the GPT-2 network structure. It introduces a random selector that randomly picks high-probability predicted words from the corpus to guide the generation of sentences closer to natural human expression. The proposed method is compared experimentally with several existing works. The BLEU-4, CIDEr, ROUGE_L, and METEOR scores improve by 3.1%, 1.3%, 0.3%, and 1.5% on the public MSVD dataset and by 1.3%, 0.5%, 0.2%, and 1.9% on the public MSR-VTT dataset, respectively. The proposed method thus generates captions with richer semantics than several state-of-the-art approaches.
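The screening step can be sketched as a cosine-similarity filter over sentence embeddings. This is a minimal illustration under stated assumptions: the abstract does not name the similarity measure or any threshold, so cosine similarity, the 0.8 cutoff, the three-dimensional toy vectors, and the function names are all hypothetical. In practice the vectors would come from a sentence-transformer model.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def screen_candidates(query_vec: Sequence[float],
                      corpus: List[Tuple[str, Sequence[float]]],
                      threshold: float = 0.8) -> List[Tuple[str, float]]:
    """Keep corpus sentences whose similarity to the query meets the threshold.

    The threshold is an assumed value, not one reported in the paper.
    """
    scored = [(sentence, cosine(query_vec, vec)) for sentence, vec in corpus]
    kept = [(sentence, score) for sentence, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy embeddings; real ones would be produced by a sentence encoder.
query = [1.0, 0.0, 1.0]
external_corpus = [
    ("a man is playing a guitar", [0.9, 0.1, 1.0]),
    ("a dog runs on the beach", [0.0, 1.0, 0.1]),
    ("someone plays an instrument", [1.0, 0.0, 0.8]),
]
kept = screen_candidates(query, external_corpus)
```

The two music-related sentences pass the filter and the unrelated one is dropped. The surviving candidates are what a downstream generator would use as guidance.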
Abstract: Cognitive grammar, a linguistic theory that emphasizes the relationship between language and thinking, offers a more comprehensive way to understand the structure, semantics, and cognitive processing of noun predicate sentences. Within the framework of cognitive grammar, this paper analyzes the semantic connections and cognitive processes in noun predicate sentences from a semantic perspective using example-based analysis, and discusses the motivation behind the formation of this construction, so as to provide a reference for in-depth analysis of the cognitive laws behind noun predicate sentences.
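Abstract: Document-level relation extraction aims to identify the relations between entity pairs in a document. Compared with traditional sentence-level relation extraction, the document-level task is closer to practical applications, but it poses new challenges for cross-sentence reasoning over entity pairs and for awareness of contextual information. This paper proposes a document-level relation extraction method that fuses entity and context information (Fuse entity and context information, FECI). It comprises two modules: an entity information extraction module and a context information extraction module. The entity information extraction module automatically extracts, from the two entities, features that represent the relation of the entity pair. The context information extraction module extracts different contextual relation features from the document according to the mention positions of the entity pair. Experiments on three document-level relation extraction datasets show significant improvements.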