Abstract
[Objective] This study investigates how to recognize associations between product features and opinions in online product reviews, in order to extract feature tags and condense the essence of the reviews. The goal is to help users obtain valuable information efficiently in an environment where online information varies widely in quality. [Methods] Dependency grammar relations are used to derive extraction templates from reviews; the templates are automatically classified, filtered, and generalized to build a template library. Feature tags are extracted using the template library together with external lexicons, and a screening mechanism filters the candidate tags. [Results] On a real-world set of online reviews, the proposed method outperforms extraction methods that rely on template filtering or generalization alone. The best F-measure is 56.5%, and accuracy reaches 65% after parameter adjustment. [Limitations] Reviews were not pre-filtered by sentence quality before feature extraction. Automatic construction of the feature lexicon remains to be considered, and more syntactic relations need to be added during template formation to further improve extraction accuracy. [Conclusions] Filtering templates purely by syntactic-template frequency leaves room for improvement. Better extraction results are obtained by taking template length into account, setting an extraction window, and screening and merging the extracted feature tags.
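As a rough illustration of the template-based idea described in the abstract, the following minimal sketch matches two hand-written dependency templates against pre-parsed sentences and scores the extracted feature-opinion pairs with an F-measure. It is not the authors' implementation: the `Token` structure, the `TEMPLATES` list, the Universal Dependencies style labels (`nsubj`, `amod`), and the toy English sentences are all assumptions made for illustration, and no real parser, template library, or lexicon is involved.

```python
from typing import NamedTuple


class Token(NamedTuple):
    text: str
    pos: str   # coarse part-of-speech tag
    head: int  # index of the head token, -1 for the root
    rel: str   # dependency relation to the head


# Hypothetical templates: (relation, dependent POS, head POS, role of dependent).
TEMPLATES = [
    ("nsubj", "NOUN", "ADJ", "feature"),  # "screen" is the subject of "great"
    ("amod", "ADJ", "NOUN", "opinion"),   # "long" modifies "life"
]


def extract_pairs(sentence):
    """Return (feature, opinion) pairs for every token that matches a template."""
    pairs = []
    for tok in sentence:
        if tok.head < 0:
            continue  # the root has no head to match against
        head = sentence[tok.head]
        for rel, dep_pos, head_pos, dep_role in TEMPLATES:
            if tok.rel == rel and tok.pos == dep_pos and head.pos == head_pos:
                if dep_role == "feature":
                    pairs.append((tok.text, head.text))
                else:
                    pairs.append((head.text, tok.text))
    return pairs


def f_measure(predicted, gold):
    """Balanced F-measure of predicted pairs against a gold-standard set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hand-built parses of two toy sentences (assumed, not parser output).
s1 = [
    Token("The", "DET", 1, "det"),
    Token("screen", "NOUN", 3, "nsubj"),
    Token("is", "AUX", 3, "cop"),
    Token("great", "ADJ", -1, "root"),
]
s2 = [
    Token("long", "ADJ", 2, "amod"),
    Token("battery", "NOUN", 2, "compound"),
    Token("life", "NOUN", -1, "root"),
]

predicted = extract_pairs(s1) + extract_pairs(s2)
gold = [("screen", "great"), ("battery", "long")]
score = f_measure(predicted, gold)
```

In this toy setup the second sentence yields the pair `("life", "long")` rather than a pair for the full phrase "battery life", which hints at why a step that screens and merges candidate tags can matter.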
Source
《现代图书情报技术》
CSSCI
Peking University Core Journal
2014, Issue 12, pp. 44-50 (7 pages)
New Technology of Library and Information Service
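Funding
One of the research outputs of the 2013 project under the Guangdong Province Philosophy and Social Sciences "Twelfth Five-Year" Plan, "Research on a Knowledge Recommendation Mechanism Based on Context and User Perception" (Project No. CD13CTS01)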
Keywords
Review mining
Tag extraction
Dependency parsing