
Extraction and Ranking of Tags for User Opinions (用户评论中的标签抽取以及排序)

Cited by: 11
Abstract: An entity (e.g., a product or a merchant) is often accompanied by thousands of user reviews. Extracting concise, useful information that describes the entity from this redundant mass of reviews is an active research problem. This paper proposes a method that extracts feature tags for each entity and removes semantic duplicates, so that the tags are mutually independent in the semantic space. First, all reviews of an entity undergo Chinese word segmentation, part-of-speech (POS) tagging, and dependency parsing. Then, key tags are extracted according to the dependency relations in each sentence to form the entity's tag library, and semantically duplicate tags in that library are removed explicitly. Finally, K-Means clustering and the Latent Dirichlet Allocation (LDA) topic model map each tag into a semantically independent topic space, and the tags are ranked by their confidence with respect to their topic. These steps yield a semantically independent set of key tags describing each entity. In the experiments, a statistical analysis of the accuracy and semantic diversity of the returned tag lists verifies the feasibility and effectiveness of the tag extraction method.
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2012, No. 5, pp. 14-19, 45 (7 pages)
Funding: National Natural Science Foundation of China (60970047, 61103151, 61173068); Doctoral Program Fund of the Ministry of Education of China (20110131110028)
Keywords: opinion mining; topic model; semantic independence; tag extraction; ranking
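The final step in the abstract, mapping tags to topics and ranking them by topic confidence, can be illustrated with a minimal sketch. The abstract does not say how the ranking is computed, so this is only one plausible reading: each tag is assigned to its highest-confidence topic, the best tag per topic is kept so that no two returned tags share a topic, and the survivors are ordered by confidence. The function name `rank_tags`, the parameter `top_k`, and the confidence values are hypothetical; in practice the confidences would come from a topic model such as LDA.

```python
from collections import defaultdict


def rank_tags(tag_topic_conf, top_k=3):
    """Return up to top_k tags, at most one per topic, by confidence.

    tag_topic_conf maps each tag to a list of per-topic confidences
    (assumed here to be produced by a topic model such as LDA).
    """
    # Assign each tag to the topic where its confidence is highest.
    by_topic = defaultdict(list)
    for tag, conf in tag_topic_conf.items():
        topic = max(range(len(conf)), key=conf.__getitem__)
        by_topic[topic].append((conf[topic], tag))

    # Keep only the most confident tag of each topic, so the result
    # never contains two tags from the same topic.
    best = [max(items) for items in by_topic.values()]

    # Order the per-topic winners by confidence, highest first.
    best.sort(reverse=True)
    return [tag for _, tag in best[:top_k]]


# Hypothetical confidences over three topics (taste, service, price).
example = {
    "tasty food": [0.90, 0.05, 0.05],
    "delicious dishes": [0.80, 0.10, 0.10],
    "friendly staff": [0.10, 0.85, 0.05],
    "fair price": [0.20, 0.10, 0.70],
}
print(rank_tags(example))  # ['tasty food', 'friendly staff', 'fair price']
```

In this toy input, "tasty food" and "delicious dishes" fall into the same topic, so only the more confident of the two is returned. That is the sense in which the result list avoids semantic redundancy.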













