
Information Retrieval Model Combining Sentence-Level Retrieval (结合句子级别检索的信息检索模型) Cited by: 6
Abstract Documents in which the query terms occur close to one another are more likely to be relevant, and building this proximity information into a retrieval model can effectively improve retrieval performance. However, estimating the distances between query terms in a document directly requires a large amount of training text and has high computational complexity. This paper proposes an information retrieval model that combines sentence-level retrieval: each document is divided into a number of windows, and the co-occurrence of query terms within a given window is assessed by computing the relevance of the sentence to the query. The method raises the relevance score of documents whose query terms lie close together, so the retrieval model returns more relevant documents. Experimental results on standard data sets show that the proposed model performs better than the baseline models.
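The idea in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: sentences are used as the windows, relevance is measured by simple query-term overlap, and the document-level and best sentence-level scores are mixed linearly with a hypothetical weight `lam`. It is not the scoring function used in the paper.

```python
import re

def tokenize(text):
    """Lowercase word tokenizer (assumes whitespace/punctuation-delimited text)."""
    return re.findall(r"\w+", text.lower())

def split_windows(document):
    """Split a document into windows; here one window = one sentence (assumption)."""
    return [s for s in re.split(r"[.!?]+", document) if s.strip()]

def overlap_score(query_terms, text_terms):
    """Fraction of distinct query terms that occur in the text.

    A stand-in relevance measure, not the one from the paper.
    """
    distinct = set(query_terms)
    if not distinct:
        return 0.0
    present = set(text_terms)
    return sum(1 for t in distinct if t in present) / len(distinct)

def combined_score(query, document, lam=0.5):
    """Mix the document-level score with the best sentence-level score.

    `lam` is a hypothetical interpolation weight in [0, 1].
    """
    q = tokenize(query)
    doc_score = overlap_score(q, tokenize(document))
    best_sentence = max(
        (overlap_score(q, tokenize(w)) for w in split_windows(document)),
        default=0.0,
    )
    return (1 - lam) * doc_score + lam * best_sentence

# Both documents contain both query terms, but only the first has them
# in the same sentence, so it receives the higher combined score.
query = "sentence retrieval"
near = "Sentence retrieval improves ranking. The weather is nice."
far = "Sentence length varies. Retrieval is hard."
print(combined_score(query, near), combined_score(query, far))
```

Under these assumptions, two documents that match the query equally well at the document level are separated by whether the query terms co-occur inside one window, which is the effect the abstract describes without requiring explicit term-distance estimation.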
Source Journal of Chinese Information Processing (《中文信息学报》), CSCD, PKU Core Journal, 2016, No. 2, pp. 107-112, 120 (7 pages)
Funding National Natural Science Foundation of China (61462043, 61462045, 61562042); Natural Science Foundation of Jiangxi Province (20151BAB217014)
Keywords information retrieval model; sentence-level retrieval; sentence relevance











