Funding: Supported by the National Natural Science Foundation of China (61035004).
Abstract: Biographies are a direct and extensive way to learn about well-known people; for ordinary people, however, little recorded knowledge exists by which they can be recognized. In recent years, information extraction (IE) technologies have been used to automatically generate biographies for anyone with online information. One of the key challenges is entity linking (EL), which links biography sentences to the corresponding entities. General-purpose EL systems in current use often produce errors that originate from entity name variation and ambiguity. Compared with general text, biography sentences carry distinctive yet rarely studied relational knowledge (RK) and temporal knowledge (TK), which can be sufficient to distinguish entities. This article proposes a new statistical framework, the knowledge-enhanced EL (KeEL) system, for automated biography construction. It uses commonsense knowledge such as RK and TK to enhance entity linking. KeEL was evaluated on Wikipedia data. The results show that, compared with a state-of-the-art method, KeEL significantly improves the precision and recall of entity linking.
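The role of temporal knowledge in disambiguation can be illustrated with a small sketch. This is not the KeEL statistical model, which the abstract does not specify; it is a simplified hard filter that assumes each candidate entity carries a birth and death year and that the biography sentence mentions an event year. The `Candidate` class, its fields, and the sample entries are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """A hypothetical knowledge-base entry for one person."""
    entity_id: str
    name: str
    birth_year: int
    death_year: Optional[int]  # None means the person is still living


def temporally_consistent(candidate: Candidate, event_year: int) -> bool:
    """Return True if the event year falls within the candidate's lifetime."""
    end = candidate.death_year if candidate.death_year is not None else float("inf")
    return candidate.birth_year <= event_year <= end


def filter_by_time(candidates: List[Candidate], event_year: int) -> List[Candidate]:
    """Keep only candidates whose lifetime covers the event year."""
    return [c for c in candidates if temporally_consistent(c, event_year)]


# Two invented entities that share the same surface name.
candidates = [
    Candidate("E1", "John Smith", 1580, 1631),
    Candidate("E2", "John Smith", 1938, 1994),
]

# A sentence dated 1992 can only refer to the second candidate.
survivors = filter_by_time(candidates, 1992)
print([c.entity_id for c in survivors])
```

A statistical system would more plausibly treat temporal agreement as one weighted feature among several, rather than as a hard filter, so that noisy dates do not eliminate the correct entity outright.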
Funding: Supported by the National Science Foundation of China (Grant Nos. 62267001, 61906051).
Abstract: With the popularity of online learning, and given the significant influence of emotion on learning outcomes, a growing body of research focuses on emotion recognition in online learning. Most current research uses comments on the learning platform or learners' expressions for emotion recognition; research data on other modalities are scarce. Most studies also ignore the impact of instructional videos on learners and the guidance that knowledge can provide for data. To address the need for data on other modalities, we construct a synchronous multimodal data set for analyzing learners' emotional states in online learning scenarios. The data set records the eye movement data and photoplethysmography (PPG) signals of 68 subjects, along with the instructional videos they watched. To address the neglect of instructional videos and of knowledge, we propose a knowledge-enhanced multimodal emotion recognition method for video learning. The method uses knowledge-based features extracted from instructional videos, such as brightness, hue, saturation, the videos' click-through rate, and emotion generation time, to guide emotion recognition from physiological signals. It uses Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks to extract deeper emotional representations and spatiotemporal information from shallow features. A multi-head attention (MHA) mechanism then obtains the critical information in the extracted deep features. Next, a Temporal Convolutional Network (TCN) learns the information in the deep features and the knowledge-based features, with the knowledge-based features supplementing and enhancing the deep features of the physiological signals. Finally, a fully connected layer performs emotion recognition, and the recognition accuracy reaches 97.51%. Compared with two recent studies, the accuracy improves by 8.57% and 2.11%, respectively. On four public data sets, the proposed method also achieves better results than the two recent studies. The experimental results show that the proposed knowledge-enhanced multimodal emotion recognition method has good performance and robustness.
Abstract: For decades, humanity has fantasized about artificial intelligence tools able to converse fluently with human beings. Numerous efforts have been made, ranging from ELIZA to modern voice assistants. Despite the great interest in this field of research and innovation, there is a lack of common understanding of the concept of conversational agents, and general over-expectations hide the current limitations of existing solutions. This work presents a literature review on the subject, with a focus on the most promising type of conversational agents: those built on top of knowledge bases, which can supply the grounding knowledge needed to hold a conversation autonomously on different topics. We describe a conceptual architecture to define knowledge-enhanced conversational agents and investigate different application domains. We conclude by listing some promising research pathways for future work.
Funding: Supported by the National Natural Science Foundation of China (Nos. 82174276 and 82074580); the Key Research and Development Program of Jiangsu Province (No. BE2022712); the China Postdoctoral Foundation (No. 2021M701674); the Postdoctoral Research Program of Jiangsu Province (No. 2021K457C); and the Qinglan Project of Jiangsu Universities 2021.
Abstract: Making Chinese medicine (CM) diagnosis intelligent is one of the hotspots in research on CM modernization. Traditional CM intelligent diagnosis models transform CM diagnosis into a classification problem; however, they struggle when categories are excessive in number or similar to one another. With the development of natural language processing, text generation techniques have become increasingly mature. In this study, we aimed to establish a CM diagnosis generation model by transforming CM diagnosis into a text generation problem. With Transformer as the backbone network, the capacity to learn semantic context characteristics was enhanced by drawing on Bidirectional Long Short-Term Memory (BILSTM). Meanwhile, the CM diagnosis generation model, Knowledge Graph Enhanced Transformer (KGET), was established by introducing medical domain knowledge to enhance inferential capability. The KGET model was built on 566 CM case texts and compared with classic text generation models, including Long Short-Term Memory sequence-to-sequence (LSTM-seq2seq), Bidirectional and Auto-Regression Transformer (BART), and Chinese Pre-trained Unbalanced Transformer (CPT), to analyze model performance. Finally, ablation experiments were performed to explore the influence of the optimized components on the KGET model. The Bilingual Evaluation Understudy (BLEU), Recall-Oriented Understudy for Gisting Evaluation 1 (ROUGE1), ROUGE2, and Edit distance scores of the KGET model were 45.85, 73.93, 54.59, and 7.12, respectively. Compared with the LSTM-seq2seq, BART, and CPT models, the KGET model was higher in BLEU, ROUGE1, and ROUGE2 by 6.00–17.09, 1.65–9.39, and 0.51–17.62, respectively, and lower in Edit distance by 0.47–3.21. The ablation results revealed that introducing the BILSTM model and prior knowledge could significantly increase model performance. Additionally, manual assessment indicated that the CM diagnosis results of the KGET model were highly consistent with practical diagnosis results. In conclusion, text generation technology can be effectively applied to CM diagnostic modeling, and it avoids the poor diagnostic performance caused by excessive and similar categories in traditional CM diagnostic classification models. CM diagnostic text generation technology has broad application prospects.
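Two of the reported metrics are simple enough to sketch directly. The code below is a minimal illustration of Levenshtein edit distance (lower is better) and a unigram-overlap recall in the spirit of ROUGE-1 (higher is better). The abstract does not state the tokenization or the exact ROUGE variant used, so treating each character as a token and reporting recall only are assumptions made here; the function names are likewise illustrative.

```python
from collections import Counter
from typing import Sequence


def edit_distance(hyp: Sequence, ref: Sequence) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    # prev[j] holds the distance between the processed prefix of hyp and ref[:j].
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rouge1_recall(hyp_tokens: Sequence, ref_tokens: Sequence) -> float:
    """Fraction of reference unigrams that also occur in the hypothesis (clipped counts)."""
    if not ref_tokens:
        return 0.0
    overlap = Counter(hyp_tokens) & Counter(ref_tokens)
    return sum(overlap.values()) / len(ref_tokens)


# Character-level comparison of a generated string against a reference string.
generated = "kitten"
reference = "sitting"
print(edit_distance(generated, reference))
print(rouge1_recall(list(generated), list(reference)))
```

For Chinese diagnosis text, character-level tokens avoid the need for a word segmenter, although a segmenter-based tokenization would yield different scores.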
Funding: Supported by the High-Level University Construction Special Project of Guangdong Province, China, 2019 (No. 5041700175) and the New Engineering Research and Practice Project of the Ministry of Education, China (No. E-RGZN20201036).
Abstract: Open Relation Extraction (ORE) is the task of extracting semantic relations from a text document. Current ORE systems have significantly improved their efficiency in obtaining Chinese relations compared with conventional systems, which depend heavily on feature engineering or syntactic parsing. However, these ORE systems do not use robust neural networks, such as pre-trained language models, to take effective advantage of large-scale unstructured data. In response to this issue, this paper presents a new system called Chinese Open Relation Extraction with Knowledge Enhancement (CORE-KE). The CORE-KE system employs a pre-trained language model, supported by a Bidirectional Long Short-Term Memory (BiLSTM) layer and a Masked Conditional Random Field (Masked CRF) layer, on unstructured data to improve Chinese open relation extraction. Entity descriptions in Wikidata and additional knowledge (in the form of triple facts) extracted from Chinese ORE datasets are used to fine-tune the pre-trained language model. In addition, syntactic features are adopted in the training stage of the CORE-KE system for knowledge enhancement. Experimental results on two large-scale datasets of open Chinese entities and relations demonstrate that the CORE-KE system is superior to other ORE systems. Compared with benchmark ORE systems, the F1-scores of the CORE-KE system on the two datasets show relative improvements of 20.1% and 1.3%, respectively. The source code is available at https://github.com/cjwen15/CORE-KE.
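The idea behind a masked CRF layer, as the term is commonly used, is to forbid tag transitions that can never be valid, typically by giving them a large negative score. The sketch below builds such a mask for a BIO tagging scheme. The tag set, the penalty value, and the function names are assumptions made for illustration; the abstract does not describe the CORE-KE tag scheme or implementation.

```python
from typing import List


def build_transition_mask(tags: List[str]) -> List[List[bool]]:
    """mask[i][j] is True when moving from tags[i] to tags[j] is legal under BIO."""
    def allowed(prev: str, curr: str) -> bool:
        # Only "I-X" is constrained: it must follow "B-X" or "I-X".
        if not curr.startswith("I-"):
            return True
        label = curr[2:]
        return prev in (f"B-{label}", f"I-{label}")

    return [[allowed(p, c) for c in tags] for p in tags]


def apply_mask(scores: List[List[float]], mask: List[List[bool]],
               penalty: float = -1e4) -> List[List[float]]:
    """Replace the scores of illegal transitions with a large negative penalty."""
    return [[s if ok else penalty for s, ok in zip(score_row, mask_row)]
            for score_row, mask_row in zip(scores, mask)]


# Hypothetical tag set: entity spans (ENT) and relation spans (REL).
tags = ["O", "B-ENT", "I-ENT", "B-REL", "I-REL"]
index = {t: i for i, t in enumerate(tags)}

mask = build_transition_mask(tags)
scores = [[0.0] * len(tags) for _ in tags]  # placeholder transition scores
masked = apply_mask(scores, mask)

print(mask[index["O"]][index["I-ENT"]])      # O -> I-ENT is illegal
print(mask[index["B-ENT"]][index["I-ENT"]])  # B-ENT -> I-ENT is legal
```

With the illegal transitions penalized, decoding cannot produce sequences such as `O, I-REL`, which would otherwise yield malformed relation spans.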
Abstract: To improve the accuracy of short text matching, this study proposes a short text matching method with knowledge and structure enhancement for BERT (KS-BERT). The method first introduces external knowledge into the input text, then sends the expanded text to both the context encoder BERT and the structure encoder GAT to capture the contextual relationship features and structural features of the input text. Finally, the match is determined from the fusion of the two sets of features. Experimental results on the public datasets BQ_corpus and LCQMC show that KS-BERT outperforms advanced models such as ERNIE 2.0. The study shows that knowledge enhancement and structure enhancement are two effective ways to improve BERT in short text matching: on BQ_corpus, ACC improved by 0.2% and 0.3%, respectively, while on LCQMC, ACC improved by 0.4% and 0.9%, respectively.