Abstract
In recent years, with the rise of tourism-related Internet products, a large number of subjective comments on tourist attractions have been generated online. Using deep learning algorithms to mine opinions from these comments, helping tourists quickly understand the characteristics of scenic spots and providing a basis for tourism supervision, has become a new trend. How to apply fine-grained opinion mining methods, such as aspect-level sentiment analysis, to tourism comments has become an urgent problem. To address it, we propose a BERT-based end-to-end opinion mining method for tourism comments that combines opinion word extraction and category classification, two subtasks of aspect-level sentiment analysis. The method first encodes a tourism comment with BERT, then decodes the resulting representation with a downstream pointer network to perform sequence labeling, yielding <opinion word, category> pairs that form a complete opinion expression. Experiments show that the proposed method achieves better mining performance than existing sequence labeling methods. In addition, we construct a dataset of comments from Chinese tourism websites, which can be used for research on the various subtasks of aspect-level sentiment analysis in Chinese.
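As a rough illustration of the last step described in the abstract, the sketch below turns a labeled token sequence into <opinion word, category> pairs. It does not reproduce the BERT encoder or the pointer network, and the tag scheme is not given in this record: the B-/I-/O tags, the `SCENERY` and `PRICE` category names, the sample comment, and the function name `decode_pairs` are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def decode_pairs(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect <opinion word, category> pairs from per-token tags.

    Assumed (hypothetical) tag scheme: "B-<CAT>" opens an opinion span,
    "I-<CAT>" continues a span of the same category, "O" is outside any span.
    """
    pairs: List[Tuple[str, str]] = []
    span: List[str] = []
    category: Optional[str] = None

    def flush() -> None:
        # Emit the span collected so far, if any, and reset the state.
        nonlocal span, category
        if span and category is not None:
            # Tokens are single Chinese characters here, so join without spaces.
            pairs.append(("".join(span), category))
        span, category = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            span, category = [token], tag[2:]
        elif tag.startswith("I-") and span and tag[2:] == category:
            span.append(token)
        else:
            # "O", or an "I-" tag that does not continue the open span.
            flush()
    flush()
    return pairs


# Hypothetical comment: "风景很美门票太贵" (the scenery is beautiful, the ticket is too expensive).
tokens = list("风景很美门票太贵")
tags = ["O", "O", "B-SCENERY", "I-SCENERY", "O", "O", "B-PRICE", "I-PRICE"]
print(decode_pairs(tokens, tags))
```

Encoding the category in the tag itself is one simple way to obtain both the opinion word and its category from a single labeling pass; the paper's actual label design may differ.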
Authors
CAI Yu-shu (蔡玉舒)
CAO Yang (曹扬)
JIANG Wei (江维)
ZHAN Jin-yu (詹瑾瑜)
LI Xiang (李响)
YANG Rui (杨瑞)
Affiliations: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China; CETC Big Data Research Institute Co., Ltd., Guiyang 550022, China; Big Data Application on Improving Government Governance Capabilities National Engineering Laboratory, Guiyang 550022, China
Source
Computer Technology and Development (《计算机技术与发展》)
2021, Issue 9, pp. 118-123 (6 pages)
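Funding
Open Fund of the National Engineering Laboratory for Big Data Application on Improving Government Governance Capabilities (W-2019007)
National Natural Science Foundation of China, General Program (62072076)
Open Project of the State Key Laboratory of Computer Architecture, Chinese Academy of Sciences (CARCH201811)
Fundamental Research Funds for the Central Universities (ZYGX2019J078)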
Keywords
opinion mining of tourism comments
fine-grained opinion mining
BERT
sequence labeling
pointer network