Context-Sensitive Document Ranking (cited by: 2)

Abstract: Ranking is a central research issue in IR-style keyword search over a set of documents. In this paper, we study a new keyword search problem, called context-sensitive document ranking: ranking documents with the help of a context that provides additional information about the application domain in which the documents are searched and ranked. The work is motivated by the observation that additional information associated with the documents can help users find more relevant documents when the documents alone are not enough to locate what they need. Here, a context is a multi-attribute graph, which can represent any information maintained in a relational database: multi-attribute nodes represent tuples, and edges represent primary-key and foreign-key references among nodes. Context-sensitive ranking raises several research issues: how to score documents, how to evaluate the additional information obtained from the context that may contribute to the document ranking, and how to rank the documents by combining the scores/costs from the documents and the context. More importantly, the relationships between documents and the information stored in a relational database may be uncertain, because the two come from different data sources and the relationships are determined systematically by similarity matching, which introduces uncertainty. We concentrate on these research issues and provide a solution for ranking documents in a context where uncertainty exists between the documents and the context. We confirm the effectiveness of our approaches through extensive experimental studies using real datasets, and we present our findings in this paper.
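To make the setting in the abstract concrete, the following is a minimal illustrative sketch, not the scoring method of the paper. It assumes a toy keyword-overlap text score, a breadth-first hop count as a stand-in for the structure cost, a similarity value in [0, 1] treated as the probability of each document-to-tuple link, and a hypothetical weight `alpha` for combining the two parts. All function names and sample data are invented for illustration.

```python
from collections import deque

def text_score(text, keywords):
    """Toy relevance: fraction of (non-empty) query keywords found in the text."""
    words = set(text.lower().split())
    return sum(1 for k in keywords if k.lower() in words) / len(keywords)

def structure_cost(graph, nodes, start, keywords):
    """Hop count (BFS) from `start` to the nearest node matching any keyword.

    Assumption: hop count stands in for a structure cost; returns None
    when no matching node is reachable.
    """
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        attr_text = " ".join(str(v) for v in nodes[node].values())
        if text_score(attr_text, keywords) > 0:
            return dist
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def rank(docs, links, graph, nodes, keywords, alpha=0.5):
    """Combine the document score with an expected context contribution.

    Assumption: each uncertain link (node, prob) contributes prob / (1 + cost).
    """
    scored = []
    for doc_id, text in docs.items():
        ctx = 0.0
        for node, prob in links.get(doc_id, []):
            cost = structure_cost(graph, nodes, node, keywords)
            if cost is not None:
                ctx += prob / (1 + cost)
        score = alpha * text_score(text, keywords) + (1 - alpha) * ctx
        scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical context: tuples as multi-attribute nodes, key references as edges.
nodes = {
    "author:1": {"name": "Alice"},
    "paper:7": {"title": "uncertain ranking", "year": 2008},
}
graph = {"author:1": ["paper:7"], "paper:7": ["author:1"]}

# Hypothetical documents; d2 is linked to author:1 with similarity 0.8.
docs = {"d1": "notes on keyword search", "d2": "a memo by Alice"}
links = {"d2": [("author:1", 0.8)]}

ranking = rank(docs, links, graph, nodes, ["uncertain", "ranking"])
print(ranking)
```

In this toy data neither document contains the query keywords, so the text score alone cannot separate them; `d2` ranks first only because its uncertain link leads, one hop away, to a tuple that matches the query.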
Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition); SCIE, EI, CSCD; 2010, Issue 3, pp. 444-457 (14 pages)
Funding: Supported by the Research Grants Council of the Hong Kong SAR, China, under Grant Nos. 419008 and 419109.
Keywords: document ranking, uncertain ranking, structure cost, similarity
