
Hierarchical Scoring Function Based Multi-Granularity Searching Method
Abstract: Online forums contain a great deal of useful information, and searching forum data lets users conveniently obtain the knowledge they need. However, the hierarchical structure of forum data poses a serious challenge to content retrieval. To address this problem, this paper proposes a multi-granularity search method based on a hierarchical scoring function. First, forum data is represented as a tree-shaped hierarchy, and a hierarchical scoring function is defined, based on several factors, that combines four granularities: topics, posts, sentences, and words. Second, to keep data of different granularities from being duplicated in the returned results, a constrained result-maximization model is proposed. Finally, the result-maximization model is transformed into a maximum independent set problem, and a heuristic optimization algorithm is given. Experiments show that the proposed algorithm retrieves forum data efficiently and with high accuracy, outperforming related work on both counts.
Authors: Jiang Pan (姜攀), Li Yuexin (李跃新)
Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list; 2016, No. 1, pp. 101-103, 121 (4 pages)
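Funding: Hubei Province International Exchange and Cooperation Project (2012IHA0140); Guiding Project of the Science and Technology Research Program of the Hubei Provincial Department of Education (B2014153)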
Keywords: forum; information retrieval; hierarchical scoring function; multi-granularity searching
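The abstract does not give the scoring function or the heuristic itself, so the following is only a minimal sketch of the general idea under stated assumptions: forum content forms a tree (topic, posts, sentences), every node receives a score, and results are chosen so that no returned node is an ancestor or descendant of another, which is an independent set in the graph that links overlapping nodes. The `Node` class, the word-overlap leaf score, the `boost` rule for inner nodes, and the greedy selection are all hypothetical stand-ins, not the paper's method.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A unit of forum content: a topic, a post, or a sentence (hypothetical)."""
    label: str
    text: str = ""                       # only sentence leaves carry text here
    children: list = field(default_factory=list)


def score_tree(node, query, path=(), boost=1.2, out=None):
    """Return {path: (label, score)} for every node in the tree.

    Leaf score: fraction of query words present in the sentence.
    Inner score: boost * mean child score (an illustrative rule only).
    """
    if out is None:
        out = {}
    if node.children:
        child_scores = []
        for i, child in enumerate(node.children):
            score_tree(child, query, path + (i,), boost, out)
            child_scores.append(out[path + (i,)][1])
        score = boost * sum(child_scores) / len(child_scores)
    else:
        words = set(node.text.lower().split())
        score = sum(w in words for w in query) / len(query)
    out[path] = (node.label, score)
    return out


def overlaps(a, b):
    """Two nodes overlap when one tree path is a prefix of the other."""
    n = min(len(a), len(b))
    return a[:n] == b[:n]


def select_results(scored, k):
    """Greedy heuristic: best-scoring nodes first, skipping any overlap."""
    ranked = sorted(scored.items(), key=lambda kv: -kv[1][1])
    chosen = []
    for path, (label, score) in ranked:
        if len(chosen) == k or score <= 0:
            break
        if not any(overlaps(path, p) for p, _, _ in chosen):
            chosen.append((path, label, score))
    return [(label, score) for _, label, score in chosen]


# Toy thread: every sentence of P1 matches the query, only one sentence of P2 does.
topic = Node("T1", children=[
    Node("P1", children=[
        Node("S1", "solr index rebuild fails"),
        Node("S2", "rebuild the solr index nightly"),
    ]),
    Node("P2", children=[
        Node("S3", "thanks for the help"),
        Node("S4", "solr index works now"),
        Node("S5", "see you"),
    ]),
])

results = select_results(score_tree(topic, ["solr", "index"]), k=2)
print([label for label, _ in results])  # ['P1', 'S4']
```

In this toy run the whole post P1 is returned instead of its two matching sentences, while from P2 only the single matching sentence S4 is returned, so the result list mixes granularities without repeating content. A greedy pass of this kind does not guarantee an optimal selection; it only illustrates why a heuristic is a natural fit for the independent-set formulation.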