Using Dependency Parsing Pattern to Extract Product Feature Tags (依存句法模板下的商品特征标签抽取研究)

Cited by: 9
Abstract [Objective] This study targets online product reviews. It explores methods for identifying the correspondence between product features and opinions in order to extract product feature tags and condense the essence of reviews, helping users obtain valuable information efficiently when online information varies widely in quality. [Methods] Dependency grammar relations are introduced to classify, filter, and generalize review templates automatically and to build a template library. Feature tags are extracted using the template library and external lexicons, and a screening mechanism is established for candidate tags. [Results] On a real-world set of online reviews, the proposed method outperforms extraction methods that rely on template filtering or generalization alone. The best F-measure reaches 56.5%, and after parameter adjustment the accuracy reaches 65%. [Limitations] Reviews were not pre-filtered by sentence quality before feature extraction. Future work should consider acquiring the feature lexicon automatically and adding more syntactic relations during template formation to further improve the accuracy of feature tag extraction. [Conclusions] Template filtering based solely on the frequency of syntactic templates leaves room for improvement. Taking template length into account during extraction, setting an extraction window, and screening and merging feature tags yields better extraction results.
Authors Nie Hui (聂卉), Du Jiazhong (杜嘉忠)
Source New Technology of Library and Information Service (《现代图书情报技术》), CSSCI, PKU Core Journal, 2014, No. 12, pp. 44-50 (7 pages)
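Funding One of the research outcomes of the 2013 project under the Guangdong Province Philosophy and Social Sciences "Twelfth Five-Year" Plan, "Research on Knowledge Recommendation Mechanisms Based on Context and User Perception" (Project No. CD13CTS01)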
Keywords Review mining; Tag extraction; Dependency parsing analysis
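
The pipeline summarized in the abstract (build a template library from dependency relations, then extract feature tags using the library, an external lexicon, and an extraction window) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `Token` structure, the triple-shaped template `(dependent POS, relation, head POS)`, the `min_freq` and `window` parameters, and the LTP-style relation labels (`SBV`, `ADV`, `HED`) are all assumptions made for illustration; the abstract does not name the parser, the label set, or the template format.

```python
from collections import Counter
from typing import NamedTuple, Optional, Set, Tuple


class Token(NamedTuple):
    idx: int   # 1-based position in the sentence
    word: str
    pos: str   # coarse POS tag, e.g. "n" (noun), "a" (adjective)
    head: int  # index of the head token, 0 for the root
    rel: str   # dependency relation to the head (assumed label set)


# Hypothetical template shape: (dependent POS, relation, head POS)
Template = Tuple[str, str, str]


def build_template_library(annotated, min_freq=2) -> Set[Template]:
    """Count templates that link annotated (feature, opinion) pairs
    and keep those seen at least `min_freq` times (frequency filter)."""
    counts = Counter()
    for tokens, pairs in annotated:
        by_idx = {t.idx: t for t in tokens}
        for t in tokens:
            head: Optional[Token] = by_idx.get(t.head)
            if head is not None and (t.word, head.word) in pairs:
                counts[(t.pos, t.rel, head.pos)] += 1
    return {tpl for tpl, c in counts.items() if c >= min_freq}


def extract_tags(tokens, library, opinion_lexicon, window=3):
    """Return (feature, opinion) tags whose dependency arc matches a
    library template, whose opinion word is in the external lexicon,
    and whose two words lie within `window` positions of each other."""
    by_idx = {t.idx: t for t in tokens}
    tags = []
    for t in tokens:
        head = by_idx.get(t.head)
        if head is None:
            continue  # root token has no head
        if (t.pos, t.rel, head.pos) not in library:
            continue
        if head.word not in opinion_lexicon:
            continue
        if abs(head.idx - t.idx) > window:
            continue
        tags.append((t.word, head.word))
    return tags


# Toy data: English glosses stand in for parsed review sentences.
train = [
    ([Token(1, "screen", "n", 3, "SBV"), Token(2, "very", "d", 3, "ADV"),
      Token(3, "clear", "a", 0, "HED")], {("screen", "clear")}),
    ([Token(1, "price", "n", 3, "SBV"), Token(2, "very", "d", 3, "ADV"),
      Token(3, "cheap", "a", 0, "HED")], {("price", "cheap")}),
]
test_sentence = [Token(1, "battery", "n", 3, "SBV"),
                 Token(2, "extremely", "d", 3, "ADV"),
                 Token(3, "durable", "a", 0, "HED")]
lexicon = {"clear", "cheap", "durable"}

library = build_template_library(train, min_freq=2)
tags = extract_tags(test_sentence, library, lexicon, window=3)
print(tags)
```

In this sketch the frequency threshold plays the role of the template filtering that the conclusions describe as leaving room for improvement, while `window` stands in for the extraction window; template generalization and tag merging are omitted.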