Structural Entropy Minimization Method of Embedding Dimension Estimation for Graph Neural Networks (图神经网络嵌入维度估计的结构熵极小化方法)

Abstract: Graph neural networks (GNNs) have become the most widely used method for representation learning on graph-structured data, achieving significant results in the representation, application, and analysis of such data at all levels. The embeddings learned by GNNs integrate both structural features and node semantics. According to the granularity of the embedded object, GNN methods are divided into node-level, subgraph-level, and whole-graph-level representation learning methods, which correspond to different downstream tasks. However, existing GNN models still set the embedding dimension through manual, experience-driven trial and error and lack a computable method with a theoretical basis, which often leads to suboptimal performance of GNN representation learning models in downstream applications. Based on the structural entropy minimization principle, this paper proposes a novel, interpretable theoretical and methodological framework for estimating the embedding dimension of GNN models. For node-level GNN models, the framework considers both structural entropy and node attribute entropy and gives a single, unified optimal dimension estimate for the embedding vectors of all nodes. For whole-graph-level or subgraph-level GNN models, the framework additionally considers the differences between graph samples and provides a customized optimal embedding dimension estimate for graph samples of different complexity. Extensive downstream experiments on 18 graph-structured datasets verify that the proposed framework effectively and stably improves accuracy on graph learning classification tasks, which confirms the correctness of the proposed theory and method for GNN embedding dimension estimation.
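
The abstract names structural entropy and node attribute entropy as the two quantities the node-level framework combines, but it does not state how they are turned into a dimension estimate. As an illustrative sketch only, and not the paper's estimator, the Python below computes the standard one-dimensional structural entropy of an undirected, unweighted graph, H1(G) = -sum_v (d_v / 2m) log2(d_v / 2m), where d_v is the degree of node v and m is the number of edges, together with the Shannon entropy of a list of discrete node attributes. The function names `one_dim_structural_entropy` and `attribute_entropy` are hypothetical, and treating node attributes as discrete labels is an assumption made here for simplicity.

```python
import math
from collections import Counter


def one_dim_structural_entropy(edges):
    """One-dimensional structural entropy of an undirected, unweighted graph.

    `edges` is an iterable of (u, v) pairs. The result is the Shannon
    entropy (in bits) of the degree distribution d_v / 2m.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    volume = sum(degree.values())  # equals 2m for m edges
    return -sum((d / volume) * math.log2(d / volume) for d in degree.values())


def attribute_entropy(labels):
    """Shannon entropy (in bits) of discrete node attributes.

    Assumption for illustration: each node carries a single discrete label.
    """
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# A 4-cycle: every node has degree 2, so the degree distribution is uniform.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(one_dim_structural_entropy(cycle_edges))
print(attribute_entropy(["a", "a", "b", "b"]))
```

On a regular graph such as the 4-cycle the degree distribution is uniform, so the structural entropy reduces to log2 of the number of nodes; graphs with more skewed degree distributions score lower.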

Authors: PENG Hao (彭浩); SU Dingli (苏丁力); LI Angsheng (李昂生); SU Jianlin (苏剑林); SUN Shuo (孙硕) (School of Cyber Science and Technology, Beihang University, Beijing 100191, China; School of Computer Science and Engineering, Beihang University, Beijing 100191, China; State Key Laboratory of Software Development Environment, Beihang University, Beijing 100191, China; Shenzhen Zhuiyi Technology Co., Ltd., Shenzhen 518054, China)

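Affiliations: School of Cyber Science and Technology, Beihang University; School of Computer Science and Engineering, Beihang University; State Key Laboratory of Software Development Environment, Beihang University; Shenzhen Zhuiyi Technology Co., Ltd.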

Source: Journal of Cybersecurity (《网络空间安全科学学报》), 2023, No. 3, pp. 107-125 (19 pages)

Funding: Beijing Natural Science Foundation (4222030); National Natural Science Foundation of China (62322202, 61932002).

Keywords: dimension estimation; structure entropy; GNN; entropy; interpretability

Classification number: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]
