Abstract
Joint models for intent detection and slot filling have advanced the state of the art in spoken language understanding (SLU). However, rarely seen or unseen slot mentions (0-shot slot mentions) limit their sequence labeling performance, and these joint models typically do not exploit the syntactic knowledge present in the input sequence. Earlier research shows that sequence labeling tasks can benefit from dependency tree structure, which helps infer the presence of slots. Because a Chinese utterance is a sequence of characters, and characters and slot labels correspond one to one in Chinese SLU, the dominant slot filling models are character-based, so a word-level dependency tree cannot be integrated into them directly. To resolve this conflict between characters and words, this paper proposes a dependency-guided character-based slot filling (DCSF) model, which provides a concise way to incorporate the word-level dependency tree structure into a Chinese character-level model. By modeling the different types of relationships between the characters in an utterance, including the relationships inside words, DCSF preserves word-level context and segmentation information. Experimental results on the public benchmark corpora SMP-ECDT and CrossWOZ show that the model outperforms the compared models, with especially large improvements in low-resource and unseen-slot-mention scenarios.
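The abstract does not spell out how the word-level tree is mapped onto characters. The following is a minimal sketch of one plausible conversion, not the paper's DCSF formulation. It assumes that the first character of each word stands in for the word, that the remaining characters attach to that first character with a hypothetical `intra-word` relation, and that word-level arcs are carried over between representative characters. The function name `word_tree_to_char_tree`, the relation labels, and the sample utterance are all illustrative.

```python
from typing import List, Tuple


def word_tree_to_char_tree(
    words: List[str], heads: List[int], labels: List[str]
) -> List[Tuple[str, int, str]]:
    """Expand a word-level dependency tree into character-level arcs.

    words  : segmented utterance
    heads  : index of each word's head word, or -1 for the root
    labels : dependency relation of each word to its head
    Returns one (character, head_character_index, relation) triple per
    character; a head index of -1 marks the root.
    """
    # Character offset at which each word starts.
    starts = []
    offset = 0
    for word in words:
        starts.append(offset)
        offset += len(word)

    arcs = []
    for i, word in enumerate(words):
        for k, ch in enumerate(word):
            if k == 0:
                # Assumption: the first character represents the whole word
                # and inherits the word-level arc.
                head = -1 if heads[i] < 0 else starts[heads[i]]
                arcs.append((ch, head, labels[i]))
            else:
                # Assumption: other characters attach to the first character
                # of their own word, which keeps segmentation recoverable.
                arcs.append((ch, starts[i], "intra-word"))
    return arcs


# Hypothetical parse of "播放 周杰伦 的 歌" ("play Jay Chou's songs").
words = ["播放", "周杰伦", "的", "歌"]
heads = [-1, 3, 1, 0]
labels = ["root", "nmod", "case", "obj"]
for arc in word_tree_to_char_tree(words, heads, labels):
    print(arc)
```

Under these assumptions every character receives exactly one head, so the result can be aligned one to one with character-level slot labels, and the `intra-word` arcs mark word boundaries.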
Authors
朱展标
黄沛杰
张业兴
刘树东
张华林
黄均曜
林丕源
ZHU Zhanbiao; HUANG Peijie; ZHANG Yexing; LIU Shudong; ZHANG Hualin; HUANG Junyao; LIN Piyuan (College of Mathematics and Informatics, South China Agricultural University, Guangzhou, Guangdong 510462, China; Guangzhou Key Laboratory of Intelligent Agriculture, Guangzhou, Guangdong 510462, China)
Source
《中文信息学报》
CSCD
PKU Core Journal
2022, No. 8, pp. 118-126 (9 pages)
Journal of Chinese Information Processing
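Funding
Natural Science Foundation of Guangdong Province (2021A1515011864)
Guangdong Provincial Key Laboratory of Intelligent Agriculture (201902010081)
National Natural Science Foundation of China (71472068)
Characteristic Innovation Project of Guangdong Provincial Universities (2020KTSCX016)
Guangdong Province College Students' Innovation Training Program (S202010564169, S202110564051)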
Keywords
spoken language understanding
slot filling
dependency structure
character-based model