Abstract
Question understanding aims to identify the underlying intent of a given utterance and to extract all relevant slot labels in a question-answering system. Most traditional methods build joint task models on a single-language corpus, disregarding the fact that user queries in real-world scenarios are often multilingual and diverse; as a result, effective methods for multilingual joint intent detection and slot filling are lacking. This paper proposes a joint question understanding model, the Cross-lingual Bi-directional Propagation Model (XBPM), which handles the joint task of cross-lingual intent detection and slot filling and focuses on improving recognition performance in multilingual scenarios, particularly for the languages of China's ethnic minorities. The model builds bi-directional connections between the intent detection and slot filling tasks on top of a cross-lingual pre-trained model, which gives it strong cross-lingual transfer ability. To address the scarcity of corpora for ethnic minority languages, we construct the Cross-lingual Tourism Field Question Dataset (XTFQD), which contains 16,548 Chinese and 1,399 Kazakh utterances in the tourism domain and provides new training and evaluation data for the joint task of cross-lingual intent detection and slot filling. Comparative and ablation experiments on the public cross-lingual joint question understanding dataset MTOD (Multilingual Task Oriented Dialog) and on XTFQD show that XBPM improves significantly over the baseline models in both monolingual and cross-lingual settings, confirming the effectiveness of the model.
Authors
刘涵
古丽拉·阿东别克
于迎霞
马雅静
LIU Han; ALTENBEK Culi; YU Yingxia; MA Yajing (Xinjiang University, Urumqi 830046, China; The Base of Kazakh and Kirghiz Language of National Language Resources Monitoring and Research Center for Minority Languages, Urumqi 830046, China; Xinjiang Laboratory of Multi-language Information Technology, Urumqi 830046, China)
Source
《中央民族大学学报(自然科学版)》
2024, No. 4, pp. 20-29, 56 (11 pages in total)
Journal of Minzu University of China (Natural Sciences Edition)
Keywords
cross-lingual transfer
intent detection
slot filling
question answering system