
A Topic Model for Extracting Expansion Items (cited by: 2)
Abstract: Topic models can support pseudo-relevance feedback in query expansion. The main shortcoming of classic topic models is that the topic layer must be assumed in advance. To build a topic layer over the result set returned by a search engine that stays close to user intent, and to establish a mapping between document words and topics, social annotation is introduced into the classic LDA (latent Dirichlet allocation) model. The result is a three-layer topic model based on the relations among topics, tags, and document words, which is applied to selecting expansion terms for pseudo-relevance feedback. Experimental results show that the expansion terms extracted by the model describe the semantics of the tags, and that when the model is used for pseudo-relevance feedback, the extracted terms cover the query. In most cases, the NDCG of the resulting list is higher than that of basic pseudo-relevance feedback and of result-set clustering.
Source: Journal of Northeastern University (Natural Science), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2013, No. 3, pp. 348-351 (4 pages).
Funding: Natural Science Foundation of Liaoning Province (20102060).
Keywords: topic model; pseudo-relevance feedback; query expansion; expansion term selection; social annotation
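
The abstract evaluates result lists by NDCG but does not say which gain and discount variant was used. As a rough illustration only, the sketch below assumes the common formulation with linear gain and a log2 position discount; the function names `dcg` and `ndcg` and the graded relevance labels are hypothetical and are not taken from the paper.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain: linear gain, log2(rank + 1) discount (assumed variant)."""
    if k is not None:
        relevances = relevances[:k]
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances, k=None):
    """Normalize DCG by the DCG of the ideal (descending) ordering of the same labels."""
    ideal_dcg = dcg(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents at all: define NDCG as 0
    return dcg(relevances, k) / ideal_dcg


# Hypothetical graded judgments for two ranked lists returned for the same query.
baseline_list = [1, 3, 0, 2]
expanded_list = [3, 2, 1, 0]
print(ndcg(baseline_list), ndcg(expanded_list))
```

Under these assumptions, a list that ranks the most relevant documents first scores 1.0, so comparing the two values indicates which ranking places relevant documents higher.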

References (10)

  • 1. Lin Y, Lin H F, Song J, et al. Social annotation in query expansion: a machine learning approach [C]//Special Interest Group on Information Retrieval. Beijing, 2011: 405-414.
  • 2. Cao G H, Nie J Y, Gao J F, et al. Selecting good expansion terms for pseudo-relevance feedback [C]//Special Interest Group on Information Retrieval. Singapore, 2008: 243-250.
  • 3. Zhai C X. Beyond search: statistical topic models for text analysis [C]//Special Interest Group on Information Retrieval. Beijing, 2011: 3-4.
  • 4. Lee K S, Croft W B, Allan J. A cluster-based resampling method for pseudo-relevance feedback [C]//Special Interest Group on Information Retrieval. Singapore, 2008: 235-242.
  • 5. Inna G K, Oren K. Cluster-based query expansion [C]//Special Interest Group on Information Retrieval. Boston, 2009: 646-647.
  • 6. Song W, Zhang Y, Liu T, et al. Bridging topic modeling and personalized search [C]//The 23rd International Conference on Computational Linguistics. Beijing, 2010: 1167-1175.
  • 7. Guo Pengwei, Gao Kening, Zhang Bin. Blog clustering algorithm based on comment correction [J]. Journal of Northeastern University (Natural Science), 2010, 31(6): 782-785. (Cited by: 2)
  • 8. Tsai F S. A tag-topic model for blog mining [J]. Expert Systems with Applications, 2011, 38(5): 5330-5335.
  • 9. Ramage D, Hall D, Nallapati R, et al. Labeled LDA: a supervised topic model for credit attribution in multi-labeled corpora [C]//Empirical Methods in Natural Language Processing. Singapore, 2009: 248-256.
  • 10. Samaneh M, Martin E. ILDA: interdependent LDA model for learning latent aspects and their ratings from online product reviews [C]//Special Interest Group on Information Retrieval. Beijing, 2011: 665-669.

