
INFORMATION RETRIEVAL FOR SHORT DOCUMENTS (Cited by: 2)

Abstract: The major problem with most current information retrieval models is that individual words provide unreliable evidence about the content of a text. When the document is short, e.g. when only the abstract is available, the word-use variability problem has a substantial impact on Information Retrieval (IR) performance. To address this problem, this letter proposes a new approach to short document retrieval named the Reference Document Model (RDM). RDM derives the statistical semantics of the query and the document through pseudo feedback from reference documents, applied to both the query and the document. The contributions of this model are threefold: (1) pseudo feedback for both the query and the document; (2) building the query model and the document model from reference documents; (3) flexible indexing units, which can be any linguistic elements such as documents, paragraphs, sentences, n-grams, terms, or characters. For short document retrieval, RDM achieves significant improvements over the classical probabilistic models on the ad hoc retrieval task on Text REtrieval Conference (TREC) test sets. Results also show that the shorter the document, the better RDM performs.
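The abstract does not give RDM's formulas, so the following is only a rough sketch of the general idea of applying pseudo feedback to both the query and the document through a shared pool of reference documents. Everything in it is an assumption made for illustration: the function names (`tokenize`, `cosine`, `expand`, `rdm_like_score`), the whitespace tokenizer, raw term counts, cosine similarity, and the `top_k` parameter. It is not the authors' model.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for the "indexing units".
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand(text, reference_docs, top_k=1):
    # Pseudo feedback: add the terms of the top_k most similar reference
    # documents to the text's own term counts. References with zero
    # similarity are ignored.
    vec = Counter(tokenize(text))
    scored = [(cosine(vec, Counter(tokenize(ref))), ref) for ref in reference_docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    model = Counter(vec)
    for sim, ref in scored[:top_k]:
        if sim > 0:
            model.update(tokenize(ref))
    return model


def rdm_like_score(query, document, reference_docs, top_k=1):
    # Hypothetical scoring rule: compare the expanded query model with the
    # expanded document model instead of comparing the raw short texts.
    query_model = expand(query, reference_docs, top_k)
    doc_model = expand(document, reference_docs, top_k)
    return cosine(query_model, doc_model)
```

In this toy setup, a one-word query and a one-word document that share no terms can still receive a nonzero score when both resemble the same reference document, which is the word-use variability effect the abstract describes for short texts.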
Source: Journal of Electronics (China) [电子科学学刊(英文版)], 2006, No. 6, pp. 933-936 (4 pages)
Funding: Supported by the Funds of Heilongjiang Outstanding Young Teacher (1151G037).
Keywords: Information retrieval; Short documents; Reference Document Model (RDM); Information theory

