Abstract
Customer service dialogue texts in the electric power industry contain many erroneous passages, are highly colloquial, and have little annotated data. To address these problems, a text mining algorithm for power customer service dialogues based on a bidirectional propagation framework is proposed. The algorithm uses an external corpus to obtain emotional words and evaluation attributes, which expand the evaluation elements of power customer service dialogues, and uses a word-vector-based corpus similarity calculation to identify long-tail words, thereby extracting the emotional words and evaluation attributes from the dialogue texts. Experimental results show that the proposed algorithm achieves higher recognition accuracy and emotional word extraction accuracy than traditional back propagation methods, and that the low-frequency word extraction and phrase expansion methods also improve recognition accuracy.
Authors
胡若云
孙钢
丁麒
沈然
谷泓杰
HU Ruo-yun; SUN Gang; DING Qi; SHEN Ran; GU Hong-jie (Electric Power Research Institute, State Grid Zhejiang Electric Power Company, Hangzhou 310009, China)
Source
《沈阳工业大学学报》
EI
CAS
Peking University Core Journals (北大核心)
2021, No. 2, pp. 188-192 (5 pages)
Journal of Shenyang University of Technology
Funding
Science and Technology Project of State Grid Corporation of China (5211DS17002Q).
Keywords
bidirectional propagation
customer service
dialogue
emotional word
evaluation attribute
low-frequency word
word vector
phrase expansion