Abstract: This work extends earlier experiments on improving the intelligence of robotic systems, with the aim of achieving richer linguistic communication between humans and robots. The authors attempt an algorithmic approach to natural language generation through hole semantics, applying the OMAS-III computational model as a grammatical formalism. The original work used a technical language; in later work this was replaced by a limited Greek natural-language dictionary. The present effort gives the evolving system the ability to ask questions, and the authors also developed an initial dialogue system using these techniques. The results suggest that these techniques can lead to a more sophisticated dialogue system in the future.
Abstract: With the growth of data on the internet, data analysis has become indispensable for saving time and improving efficiency, especially in bibliographic information retrieval systems. The number of current scientific journals is estimated at around 40,000, with about four million articles published each year. Machine learning and deep learning applied to recommender systems have become unavoidable in both industry and research. In this paper, we propose an optimized interface for bibliographic information retrieval as a running example, which allows different kinds of researchers to find what they need according to relevant criteria through natural language understanding. Papers indexed in Web of Science and Scopus are in high demand. Text- and linguistics-based natural language techniques, such as tokenization, named entity recognition, and syntactic and semantic analysis, are used to process natural language queries. Our interface uses association rules to find more related papers for recommendation. Spanning trees are used to optimize the system's search process.
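The association-rule step mentioned in this abstract can be illustrated with a minimal sketch of support and confidence. The abstract does not describe the authors' actual rule-mining procedure, so the session data, the paper IDs, and the function names below are all hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) over the transactions."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / base


def recommend(paper, transactions, min_conf=0.5):
    """Papers co-read with `paper`, ranked by confidence of paper -> other."""
    candidates = set().union(*transactions) - {paper}
    scored = [(other, confidence({paper}, {other}, transactions))
              for other in candidates]
    kept = [(other, conf) for other, conf in scored if conf >= min_conf]
    # Highest confidence first; ties broken by paper ID for determinism.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical reading sessions: each set holds paper IDs viewed together.
sessions = [
    {"p1", "p2", "p3"},
    {"p1", "p2"},
    {"p2", "p4"},
    {"p1", "p3"},
]

for other, conf in recommend("p1", sessions):
    print(other, round(conf, 2))
```

A threshold on confidence alone can favour papers that are popular everywhere; a real system would likely also filter on support or lift.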
Abstract: Sentence classification is the process of categorizing a sentence based on its context. It relies more on semantic features than tasks such as dependency parsing, which rely more on syntactic elements. Most existing strategies focus on the general semantics of a conversation without considering the context of each sentence, recognizing the conversation's progress, or comparing impacts. Here, an ensemble of pre-trained language models is used to classify the sentences of a conversation corpus into four categories: information, question, directive, and commission. The resulting label sequences are used to analyze the progress of the conversation and to predict its pecking order. An ensemble of Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Generative Pre-Trained Transformer (GPT), DistilBERT, and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus, with hyperparameter tuning carried out to improve sentence classification performance. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT, and XLNet transformer models, and the ensemble model with fine-tuned parameters achieved an F1 score of 0.88.
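One common way to combine several classifiers is soft voting, sketched below together with an F1 computation. The abstract does not say how EPLM-HT merges its member models or which F1 averaging it reports, so the averaging rule, the macro averaging, and the probability values here are assumptions for illustration only.

```python
LABELS = ["information", "question", "directive", "commission"]


def soft_vote(model_probs):
    """Average per-class probabilities across models; return the top label."""
    n_models = len(model_probs)
    avg = [sum(probs[i] for probs in model_probs) / n_models
           for i in range(len(LABELS))]
    best = max(range(len(LABELS)), key=avg.__getitem__)
    return LABELS[best]


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical class probabilities from three models for one sentence,
# ordered as in LABELS.
one_sentence = [
    [0.1, 0.6, 0.2, 0.1],
    [0.3, 0.4, 0.2, 0.1],
    [0.5, 0.3, 0.1, 0.1],
]
print(soft_vote(one_sentence))
```

Averaging probabilities lets a confident model outweigh two hesitant ones, which is the main difference from majority (hard) voting.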
Funding: Supported by the National Natural Science Foundation of China (No. 70572090, No. 60305009) and the Ph.D. Degree Teacher Foundation of North China Electric Power University.
Abstract: A method for automatic abstracting based on text clustering and natural language understanding is explored, aimed at overcoming shortcomings of some current methods. The method makes use of text clustering and supports automatic abstracting of multiple documents. A two-pass word segmentation algorithm based on the title and the first sentences of paragraphs is investigated; its precision and recall are above 95%. For the specific domain of plastics, an automatic abstracting system named TCAAS is implemented. The precision and recall of its multi-document automatic abstracting are above 75%. The experiments also show that it is feasible to use the method to develop a domain-specific automatic abstracting system, which merits further in-depth study.
Funding: This paper was supported by the National Natural Science Foundation of China and the National '863' Hi-Tech Programme of China.
Abstract: The process of understanding natural language can be viewed as a process of model construction. This paper, employing Kripke frames for intuitionistic logic semantics as the means of model construction for natural language, introduces a method of incremental model construction.
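For readers unfamiliar with Kripke semantics for intuitionistic logic, the standard forcing relation can be sketched over a small finite model. This is textbook semantics, not the paper's construction method; the two-world model, the tuple encoding of formulas, and the function names are assumptions made for illustration.

```python
def forces(world, formula, reach, val):
    """Intuitionistic forcing at `world`.

    reach[w] lists every world v with w <= v (including w itself);
    val[w] is the set of atoms true at w and must grow along `reach`.
    """
    kind = formula[0]
    if kind == "atom":
        return formula[1] in val[world]
    if kind == "bot":
        return False
    if kind == "and":
        return (forces(world, formula[1], reach, val)
                and forces(world, formula[2], reach, val))
    if kind == "or":
        return (forces(world, formula[1], reach, val)
                or forces(world, formula[2], reach, val))
    if kind == "imp":
        # An implication must hold at every later information state.
        return all(not forces(v, formula[1], reach, val)
                   or forces(v, formula[2], reach, val)
                   for v in reach[world])
    raise ValueError(f"unknown connective: {kind}")


def neg(formula):
    """Negation is defined as implication of falsum."""
    return ("imp", formula, ("bot",))


# Two information states: at w0 nothing is known yet, at w1 p is known.
reach = {"w0": ["w0", "w1"], "w1": ["w1"]}
val = {"w0": set(), "w1": {"p"}}

p = ("atom", "p")
excluded_middle = ("or", p, neg(p))
print(forces("w0", excluded_middle, reach, val))
```

At `w0` neither `p` nor its negation is forced, because `p` may still become known at `w1`. Extending a model with further worlds is one way to picture information growing step by step, which is the intuition behind reading such frames incrementally.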
Abstract: Spoken dialogue systems are an active research field with wide applications. However, the distinctions in Chinese spoken dialogue systems are not as clear-cut as in English ones. Chinese spoken dialogue exhibits many language phenomena. First, most utterances are ill-formed. Second, ellipsis, anaphora, and negation are widely used. Determining how to extract semantic information from incomplete sentences and how to resolve negation, anaphora, and ellipsis is crucial. SHTQS (Shanghai Transportation Query System) is an intelligent telephone-based spoken dialogue system that provides information about the best route between any two sites in Shanghai. After a brief description of the system, the paper focuses on its natural language processing. Sentences produced by speech recognition unavoidably contain errors. In sequential language processing, these errors are easily passed on to later stages, producing a ripple effect. To detect and recover from these errors as early as possible, dedicated language-processing strategies are adopted. For errors resulting from words being split in speech recognition, segmentation and POS tagging approaches that can rectify them are designed. Since most inquiry utterances are ill-formed, and negation, anaphora, and ellipsis are common, language understanding must be sufficiently adaptive. A partial syntactic parsing scheme is therefore adopted, using a chart algorithm and a parser based on unification grammar. The semantic frame extracted from the best arc set of the chart is used to represent the meaning of a sentence. Negation, anaphora, and ellipsis are also analyzed, and corresponding processing approaches are presented. The accuracy of the language processing component is 88.39%, and the test results show that the language processing strategies are rational and effective.
Abstract: Based on the characteristics of the Chinese language, such as its simple and uniform structure, distinct hierarchy, and construction by word order and function words, and in view of the human cognitive mechanism, this paper puts forward a hierarchical combination method for computer understanding of Chinese. With this method, the complete information of a sentence is combined hierarchically from the partial information of its basic units, using the unification operation under attribute description frames. The method combines syntactic analysis with semantic analysis effectively, is easy to implement, and is well suited to computer systems for understanding Chinese.
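The unification operation that this abstract relies on can be sketched over frames represented as nested dictionaries. The abstract does not specify the attribute description frames themselves, so the attribute names (`pred`, `dest`, `type`, `name`) and the `unify` helper are hypothetical.

```python
def unify(a, b):
    """Unify two frames (nested dicts of atomic values).

    Returns the merged frame, or None when two atomic values clash.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:
                    return None  # propagate the clash upwards
                result[key] = merged
            else:
                result[key] = value
        return result
    # Atomic values unify only when they are identical.
    return a if a == b else None


# Hypothetical partial frames contributed by two units of one sentence.
verb_frame = {"pred": "go", "dest": {"type": "place"}}
noun_frame = {"dest": {"type": "place", "name": "Shanghai"}}
time_frame = {"dest": {"type": "time"}}

print(unify(verb_frame, noun_frame))
print(unify(verb_frame, time_frame))
```

The first call merges the two partial descriptions into one frame; the second returns `None` because the `type` values conflict, which is how unification rejects incompatible combinations while building larger units from smaller ones.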
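Abstract: A BERT-based Bi-directional Association Model (BBAM) is constructed to establish a bidirectional relationship between intent recognition and slot filling. It fuses the contextual information of the two tasks and mines the connection between them in depth, so as to improve the overall performance of question understanding. To verify the model's practicality and effectiveness in the tourism domain, the Tourism Field Question Dataset (TFQD) was built through distant supervision and manual verification. On this dataset, BBAM achieves a slot-filling F1 score of 95.21%, an intent classification accuracy (A) of 96.71%, and an overall sentence-level recognition accuracy (A_sentence) of 89.62%, significantly outperforming several baseline models. Comparative experiments against mainstream joint models on the public ATIS and Snips datasets indicate that the proposed model has a degree of generalization ability.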
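The gap between the sentence-level figure and the two per-task figures follows from how sentence-level accuracy is usually defined in joint intent and slot work: an utterance counts only if its intent and every slot tag are correct. The sketch below assumes that definition applies here; the intent names, slot tags, and toy predictions are hypothetical.

```python
def intent_accuracy(gold, pred):
    """Share of utterances whose predicted intent matches the gold intent."""
    hits = sum(g["intent"] == p["intent"] for g, p in zip(gold, pred))
    return hits / len(gold)


def sentence_accuracy(gold, pred):
    """Share of utterances with the intent and all slot tags correct."""
    hits = sum(g["intent"] == p["intent"] and g["slots"] == p["slots"]
               for g, p in zip(gold, pred))
    return hits / len(gold)


# Hypothetical gold annotations and predictions for three questions.
gold = [
    {"intent": "ask_ticket", "slots": ["O", "B-site", "O"]},
    {"intent": "ask_route", "slots": ["B-site", "O"]},
    {"intent": "ask_hotel", "slots": ["O", "B-city"]},
]
pred = [
    {"intent": "ask_ticket", "slots": ["O", "B-site", "O"]},  # fully correct
    {"intent": "ask_ticket", "slots": ["B-site", "O"]},       # wrong intent
    {"intent": "ask_hotel", "slots": ["O", "O"]},             # missed slot
]

print(intent_accuracy(gold, pred), sentence_accuracy(gold, pred))
```

Because an error in either task disqualifies the whole utterance, sentence-level accuracy can never exceed intent accuracy.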