摘要
数据挖掘算法现面临挑战,这个挑战就是要处理日益增长的复杂对象。对于图数据,随机游走核是有力的容错图匹配方法。由于随机游走核的局部定义,它的适用性取决于潜在图表示的特性。另外通过定义图实例的核函数,数据挖掘算法的整个工具变得可用。迄今为止,已经提出了基于图的游走、子树和循环的图核。一般问题在于,这些核要么运算量大要么受限于他们的表达性。我们试着通过定义基于路径有表达性的图核克服这个问题。由于计算图的所有路径和最长路径是NP-难,我们建议基于最短路径图核。这些核在多项式时间内就可以计算,保持表现力并且仍然是正定的。
Data mining algorithms are facing the challenge to deal with an increasing number of complex objects.For graph data,random walk kernels are powerful methods for error-tolerant graph matching.Because of their local definition,however,the ap plicability of random walk kernels strongly depends on the characteristics of the underlying graph representation.Additionally,a whole toolbox of data mining algorithms becomes available by defining a kernel function on instances of graphs.Graph kernels based on walks,subtrees and cycles in graphs have been proposed so far.As a general problem,these kernels are either computa tionally expensive or limited in their expressiveness.We try to overcome this problem by defining expressive graph kernels which are based on paths.As the computation of all paths and longest paths in a graph is NP-hard,we propose graph kernels based on shortest paths.These kernels are computable in polynomial time,retain expressivity and are still positive definite.
作者
张燕
ZHANG Yan(Department of Information and Control Engineering Institute,Xi'an University of Architecture and Technology,Xi'an 710055,China)
出处
《电脑知识与技术》
2013年第3期1622-1625,1629,共5页
Computer Knowledge and Technology
关键词
NP-难
图核
核方法
随机游走核
最短路径核
正定
NP-hard
graph kernels
kernel methods
random walk kernel
shortest path graph kernel
positive definite