Research on web page feature extraction method based on semantic orientation (original title: 基于特征倾向性的网页特征提取方法研究)

Cited by: 3
Abstract Web pages contain rich information, and the traditional TFIDF formula struggles to meet the requirements of content filtering systems. To address the problems of existing feature selection methods in web page filtering, semantic information is added and the TFIDF formula is improved, yielding a feature selection method better suited to web page filtering. The method jointly considers the length of a feature and its position in the web page, and attaches sentiment polarity, a form of semantic information, to each feature. Experimental results show that the method performs well in web page filtering systems, especially real-time content filtering systems, and has some practical value.
Source Computer Engineering and Design (《计算机工程与设计》), CSCD, Peking University Core Journal, 2009, No. 16, pp. 3894-3896 (3 pages)
Funding National Natural Science Foundation of China (60673041); National High-Tech Research and Development Program of China (863 Program) (2006AA01Z147)
Keywords web page filtering; feature extraction; semantic orientation; sentiment analysis; Chinese information processing