Abstract: Slot filling, which extracts entities for specific types of information (slots), is a vital module of dialogue systems for automatic diagnosis. Doctor responses can be regarded as weak supervision for patient queries. In this way, a large amount of weakly labeled data can be obtained from unlabeled diagnosis dialogues, alleviating the problem of costly and time-consuming data annotation. However, weakly labeled data suffers from extremely noisy samples. To alleviate this problem, we propose a simple and effective Co-WeakTeaching method. The method trains two slot filling models simultaneously. The two models learn from two different sets of weakly labeled data, so that learning proceeds from two aspects. One model then iteratively uses selected weakly labeled data generated by the other. The model obtained by Co-WeakTeaching on weakly labeled data can be tested directly on the test data or subsequently fine-tuned on a small amount of human-annotated data. Experimental results in these two settings illustrate the effectiveness of the method, with increases of 8.03% and 14.74% in micro and macro F1 scores, respectively.
Funding: Supported by the National Natural Science Foundation of China (No. 62076081, No. 61772153, No. 61936010).
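The slot filling abstract above reports gains in both micro and macro F1. As a reminder of how the two averages differ, here is a minimal sketch that computes both from per-utterance sets of (slot type, value) pairs. The function names, the exact-match scoring rule, and the toy data are assumptions made for illustration; the abstract does not describe the paper's evaluation code.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred):
    """Return (micro_f1, macro_f1).

    gold, pred: lists of sets of (slot_type, value) pairs, one set per
    utterance. A prediction counts as correct only on an exact match
    (an assumption for this sketch).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        for slot, _ in g & p:   # predicted and correct
            tp[slot] += 1
        for slot, _ in p - g:   # predicted but wrong
            fp[slot] += 1
        for slot, _ in g - p:   # missed
            fn[slot] += 1
    slots = sorted(set(tp) | set(fp) | set(fn))
    # Micro: pool the counts over all slot types, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per slot type, then take the unweighted mean.
    macro = sum(f1(tp[s], fp[s], fn[s]) for s in slots) / len(slots) if slots else 0.0
    return micro, macro


# Toy data (hypothetical): a frequent slot type predicted well,
# a rare slot type predicted badly.
gold = [{("symptom", "cough")}, {("symptom", "fever")},
        {("symptom", "headache")}, {("medicine", "aspirin")}]
pred = [{("symptom", "cough")}, {("symptom", "fever")},
        {("symptom", "headache")}, {("medicine", "ibuprofen")}]

micro, macro = micro_macro_f1(gold, pred)
print(f"micro F1 = {micro:.2f}, macro F1 = {macro:.2f}")
```

Because macro F1 weights every slot type equally, errors on rare slot types pull it down more than they pull down micro F1, so the two scores can move by different amounts for the same model change.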
Abstract: There is growing interest in developing human-computer dialogue systems, an important branch of artificial intelligence (AI). However, the evaluation of large-scale Chinese human-computer dialogues remains a challenging task. To attract more attention to dialogue evaluation work, we held the fourth Evaluation of Chinese Human-Computer Dialogue Technology (ECDT). It consists of few-shot learning in spoken language understanding (SLU) (Task 1) and a knowledge-driven multi-turn dialogue competition (Task 2), the data sets for which are provided by Harbin Institute of Technology and Tsinghua University. In this paper, we introduce the evaluation tasks and data sets in detail. We also analyze the evaluation results and the existing problems in the evaluation.
Abstract: Human-computer dialogue has recently attracted extensive attention from both academia and industry as an important branch of artificial intelligence (AI). However, there are few studies on the evaluation of large-scale Chinese human-computer dialogue systems. In this paper, we introduce the Second Evaluation of Chinese Human-Computer Dialogue Technology, which focuses on the identification of a user’s intents and the intelligent processing of intent words. The Evaluation consists of user intent classification (Task 1) and online testing of task-oriented dialogues (Task 2), the data sets for which are provided by iFLYTEK Corporation. The evaluation tasks and data sets are introduced in detail, and the evaluation results and the existing problems in the evaluation are discussed.