期刊文献+

基于抽样的不确定图k最近邻搜索算法 被引量:1

K-NEAREST NEIGHBOR SEARCH ALGORITHM FOR UNCERTAIN GRAPHS BASED ON SAMPLING
下载PDF
导出
摘要 在诸如生物网络或社交网络等各种由不确定数据组成的网络中,不确定图是一种十分重要和普遍使用的数学模型。由于不确定图中计算两点连通概率问题是#P完全问题,其k最近邻查询问题要比确定图复杂得多,并且与"距离"的定义相关。采用"最短距离"作为距离定义,讨论了在不确定图是加权图的情况下,求解k最近邻搜索问题(k-NN问题)。为了克服计算两点连通概率带来的时间指数爆炸问题,提出了一个基于Dijkstra算法的抽样k-NN查询算法,研究了其收敛性和收敛速度,同时通过实验验证了所提出的方法效率优于kMinDist方法并且具有很高的查全率。 Uncertain graphs are a very important and widely used mathematical model in networks composed of uncertain data such as biological networks or social networks. Since the two-point connectivity probability problem in the graph is a complete problem, the k nearest-neighbor query problem is m u c h more complex than the normal graph and is related to the definition of " distance" . Using the shortest distance as the definition of distance, w e discuss the k-nearest neighbor search problem (k- NN ) under the condition that the uncertain graph is a weighted graph. In order to overcome the problem of time exponential explosion caused by computing two-point connectivity probability, a sampling k - N N query algorithm based on Dijkstra algorithm is proposed. The convergence rate and convergence rate are studied. The proposed method is proved to be more efficient than kMinDist method and has high recall rate.
作者 张伟
出处 《计算机应用与软件》 2017年第6期180-186,共7页 Computer Applications and Software
基金 国家科技支撑计划项目(2014BAI17B01) 软件开发环境国家重点实验室开放课题(SKLSDE-2012KF-02)
关键词 人工智能 不确定图 概率图 生物网络 K-NN 抽样技术 Artificial intelligence Uncertain graphs Probability graph Biological network k - N NSampling technique
  • 相关文献

参考文献15

二级参考文献223

  • 1解(亻刍),汪小帆.复杂网络中的社团结构分析算法研究综述[J].复杂系统与复杂性科学,2005,2(3):1-12. 被引量:86
  • 2周水庚 蔚赵春 蒋豪良.图结构数据搜索的概念、问题与进展[J].中国计算机学会通讯,2007,3(8):59-65.
  • 3Shasha D, Wang T L J, Giugno R. Algorithmics and applications of tree and graph searching//Proceedings of the 21st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. Madison, 2002:39-52.
  • 4Yan X, Yu P S, Han J. Graph indexing: A frequent structure-based approach//Proeeedings of the 2004 ACM SIGMOD International Conference on Management of Data. Paris, 2004:335-346.
  • 5Saito R, Suzuki H, Hayashizaki Y. Interaction generality, a measurement to assess the reliability of a protein-protein interaction. Nucleic Acids Research, 2002, 30(5): 1163-1168.
  • 6李建中 于戈 周傲英.不确定性数据管理的要求与挑战[J].中国计算机学会通讯,2009,5(4):6-14.
  • 7高宏 张炜.不确定图数据管理研究现状[J].中国计算机学会通讯,2009,5(4):31-36.
  • 8Dalvi N N, Suciu D. Efficient query evaluation on probabilistic databases. VLDB Journal, 2007, 16(4): 523-544.
  • 9Soliman M A, Ilyas I F, Chang K C. Top-k query processing in uncertain databases//Proceedings of the 23rd International Conference on Data Engineering. Istanbul, 2007:896-905.
  • 10Hintsanen P. The most reliable subgraph problem//Proceedings of the llth European Conference on Principles and Practice of Knowledge Discovery in Databases. Warsaw, 2007: 471-478.

共引文献164

引证文献1

二级引证文献3

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部