Graph embedding aims to map high-dimensional nodes to a low-dimensional space and to learn graph relationships from the resulting latent representations. Most existing graph embedding methods focus on the topological structure of graph data but ignore its semantic information, which leads to unsatisfactory performance in practical applications. To overcome this problem, this paper proposes a novel deep convolutional adversarial graph autoencoder (GAE) model. To embed the semantic information between nodes in the graph data, a random walk strategy is first used to construct the positive pointwise mutual information (PPMI) matrix; a graph convolutional network (GCN) is then employed to encode the PPMI matrix and node content into the latent representation. Finally, a decoder uses the learned latent representation to reconstruct the topological structure of the graph data. Furthermore, a deep convolutional adversarial training algorithm is introduced so that the learned latent representation conforms better to the prior distribution. State-of-the-art experimental results on three standard datasets, Cora, Citeseer and Pubmed, validate the effectiveness of the proposed model on link prediction, node clustering and graph visualization tasks.
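The abstract above builds a PPMI matrix from random walks before encoding it with a GCN. The following is a minimal sketch of that first step only, assuming uniform random walks and a symmetric sliding window; the function names and the parameters `walk_length`, `walks_per_node` and `window` are illustrative choices, not settings taken from the paper.

```python
import numpy as np

def random_walk_cooccurrence(adj, walk_length=10, walks_per_node=20, window=2, seed=0):
    """Count how often two nodes appear within `window` steps on uniform random walks."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    counts = np.zeros((n, n))
    neighbors = [np.flatnonzero(adj[i]) for i in range(n)]
    for start in range(n):
        for _ in range(walks_per_node):
            walk = [start]
            # Extend the walk by one uniformly chosen neighbor at a time.
            while len(walk) < walk_length and len(neighbors[walk[-1]]) > 0:
                walk.append(int(rng.choice(neighbors[walk[-1]])))
            # Symmetric counting keeps the co-occurrence matrix symmetric.
            for i, u in enumerate(walk):
                for v in walk[i + 1 : i + 1 + window]:
                    counts[u, v] += 1
                    counts[v, u] += 1
    return counts

def ppmi(counts):
    """PPMI(i, j) = max(0, log(c_ij * total / (row_i * col_j)))."""
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts * total / (row * col))
    pmi[~np.isfinite(pmi)] = 0.0  # pairs that never co-occur get 0
    return np.maximum(pmi, 0.0)

# Toy path graph 0 - 1 - 2 - 3 (hypothetical data).
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
ppmi_matrix = ppmi(random_walk_cooccurrence(adj))
```

The resulting matrix is symmetric and non-negative, so it can stand in for an adjacency matrix in a GCN-style encoder.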
Autoencoder-based rating prediction methods with external attributes have received wide attention due to their ability to accurately capture users' preferences. However, existing methods still have two significant limitations: i) external attributes are often unavailable in the real world due to privacy issues, leading to low-quality representations; and ii) existing methods do not consider the complex associations in users' rating behaviors during the encoding process. To meet these challenges, this paper proposes an inherent-attribute-aware dual-graph autoencoder, named IADGAE, for rating prediction. To address the low quality of representations caused by the unavailability of external attributes, we propose an inherent attribute perception module that mines inductive user activity patterns and item popularity patterns from users' rating behaviors to strengthen user and item representations. To exploit the complex associations hidden in users' rating behaviors, we design an encoder on the item-item co-occurrence graph to capture the co-occurrence frequency features among items. Moreover, we propose a dual-graph feature encoder framework to simultaneously encode and fuse the high-order representations learned from the user-item rating graph and the item-item co-occurrence graph. Extensive experiments on three real datasets demonstrate that IADGAE is effective and outperforms existing rating prediction methods, achieving an improvement of 4.51% to 41.63% in the RMSE metric.
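Two quantities in the abstract above are easy to make concrete: the item-item co-occurrence graph and the RMSE metric. The sketch below assumes that two items co-occur whenever the same user rated both and that a rating of 0 means "not rated"; both conventions, and the toy ratings, are assumptions for illustration and not details from the IADGAE paper.

```python
import numpy as np

def item_cooccurrence(ratings):
    """Entry (i, j) counts the users who rated both item i and item j."""
    rated = (ratings > 0).astype(int)   # users x items indicator matrix
    co = rated.T @ rated                # items x items co-rating counts
    np.fill_diagonal(co, 0)             # drop self co-occurrence
    return co

def rmse(pred, truth, mask):
    """Root mean squared error over the observed entries only."""
    diff = (pred - truth)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

# Hypothetical 3 users x 3 items rating matrix.
ratings = np.array([[5, 3, 0],
                    [4, 2, 0],
                    [0, 1, 4]])
co = item_cooccurrence(ratings)
baseline = np.full(ratings.shape, 3.0)  # constant predictor, for illustration
error = rmse(baseline, ratings, ratings > 0)
```

Here items 0 and 1 share two raters and items 1 and 2 share one, so `co` has edge weights 2 and 1 on those pairs.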
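The variational graph autoencoder is an important deep learning model in graph embedding research, but it suffers from the limitations of a normal prior distribution and is prone to posterior collapse during training. Starting from a mapping between the cloud concept space and the latent space, this paper introduces the numerical characteristics of the cloud model to represent the nodes of a network as uncertain concepts, and designs a Variational Graph Autoencoder based on Multidimensional Cloud Model (MCM-VGAE). The model realizes multidimensional cloud concept embedding in the latent space together with a corresponding drift loss measure, extends the prior distribution to a pan-normal distribution, and implements the reparameterization process with a multidimensional forward cloud generator and a cloud-envelope-band corrected sampling algorithm, which effectively alleviates posterior collapse. In terms of application, the model outperforms the baseline models in link prediction, node clustering and graph embedding visualization experiments on multiple types of datasets, further demonstrating the general effectiveness of the method.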
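The reparameterization step mentioned above can be illustrated with the textbook forward normal cloud generator, applied independently per dimension. This is a sketch under stated assumptions: it uses the usual cloud model characteristics (expectation `ex`, entropy `en`, hyper-entropy `he`), and it omits the envelope-band correction described in the abstract, so it should not be read as the MCM-VGAE sampling algorithm itself.

```python
import numpy as np

def forward_cloud_sample(ex, en, he, n_drops, seed=0):
    """Draw cloud drops: En' = En + He * eps1, then x = Ex + |En'| * eps2."""
    rng = np.random.default_rng(seed)
    ex, en, he = (np.asarray(a, dtype=float) for a in (ex, en, he))
    eps1 = rng.standard_normal((n_drops, ex.size))
    eps2 = rng.standard_normal((n_drops, ex.size))
    # Second-order randomness: the spread itself is perturbed by the hyper-entropy.
    scale = np.maximum(np.abs(en + he * eps1), 1e-12)
    drops = ex + scale * eps2
    # Joint certainty degree of each drop under the multidimensional normal cloud.
    certainty = np.exp(-0.5 * np.sum(((drops - ex) / scale) ** 2, axis=1))
    return drops, certainty

# Hypothetical two-dimensional concept.
drops, certainty = forward_cloud_sample(ex=[0.0, 1.0], en=[1.0, 2.0],
                                        he=[0.1, 0.2], n_drops=1000)
```

Because the noise enters only through `eps1` and `eps2`, the samples are differentiable functions of `ex`, `en` and `he`, which is the property a reparameterization trick needs. With `he = 0` the generator reduces to ordinary Gaussian sampling.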
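A structured deep clustering network model is proposed to predict associations between microRNAs (miRNAs) and diseases. The model feeds the integrated similarity of miRNAs and diseases into an autoencoder, passes the autoencoder's output to the graph convolutional layers through a delivery operator, and is trained with a dual supervision mechanism. Five-fold cross-validation shows that the model achieves average AUC (Area Under the Curve) values of 93.23% and 94.58% on the HMDD v2.0 and HMDD v3.0 datasets, respectively.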
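For readers unfamiliar with the evaluation protocol above, the following sketch shows one common way to compute AUC (as the probability that a positive pair is scored above a negative pair, with ties counted as one half) and to split samples into five folds. The helper names and toy scores are hypothetical, and the paper's exact cross-validation procedure may differ.

```python
import numpy as np

def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))

def five_fold_indices(n_samples, seed=0):
    """Shuffle sample indices and split them into five disjoint test folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), 5)

# Toy example: 3 of the 4 positive-negative pairs are ordered correctly.
example_auc = auc_score([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
folds = five_fold_indices(23)
```

In a five-fold run, each fold serves once as the test set while the other four are used for training, and the reported figure is the mean AUC over the five folds.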
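Single-cell data clustering plays an important role in bioinformatics analysis. However, owing to the limitations of sequencing principles and sequencing platforms, single-cell datasets commonly exhibit high-dimensional sparsity, high-variance noise and missing gene data, so clustering analysis and application of single-cell data still face many challenges. Existing single-cell clustering methods mainly model the relationship between cells and gene expression, neglecting both thorough mining of the latent feature relationships between cells and noise removal; this leads to unsatisfactory clustering results and hinders downstream analysis of the data. To address these problems, a self-optimized single-cell clustering algorithm that combines the Zero Inflated Negative Binomial (ZINB) model with a graph attention autoencoder is proposed (Self-optimized Single Cell Clustering Using ZINB Model and Graph Attention Autoencoder, scZDGAC). The algorithm first uses the ZINB model together with the scalable DCA denoising algorithm: the ZINB distribution fits the data feature distribution better, which improves the denoising performance of the autoencoder and reduces the influence of noise and data dropout on the output of the KNN algorithm. A graph attention autoencoder then propagates information between cells with different weights, better capturing the latent features between cells for clustering. Finally, scZDGAC adopts a self-optimization method so that the originally independent clustering module and feature module benefit each other, iteratively updating the cluster centers to further improve clustering performance. The clustering results are evaluated with two common metrics, the adjusted Rand index (ARI) and normalized mutual information (NMI). Comparative experiments against other algorithms on six single-cell datasets of different scales show that the proposed algorithm substantially improves clustering performance over the other methods and demonstrates its robustness.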
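The ZINB distribution referred to above mixes a point mass at zero (with probability `pi`) with a negative binomial count model. The sketch below writes out its log-probability under the common mean and dispersion parameterization (`mu`, `theta`); this parameterization and the function names are assumptions for illustration, not code from scZDGAC or DCA.

```python
import math
import numpy as np

_lgamma = np.vectorize(math.lgamma)  # element-wise log-gamma

def nb_log_pmf(x, mu, theta):
    """Log-pmf of a negative binomial with mean mu and dispersion theta."""
    x = np.asarray(x, dtype=float)
    return (_lgamma(x + theta) - _lgamma(theta) - _lgamma(x + 1.0)
            + theta * np.log(theta / (theta + mu))
            + x * np.log(mu / (theta + mu)))

def zinb_log_pmf(x, mu, theta, pi):
    """Log-pmf of ZINB: zeros come from dropout (pi) or from the NB itself."""
    x = np.asarray(x, dtype=float)
    nb = nb_log_pmf(x, mu, theta)
    zero_case = np.log(pi + (1.0 - pi) * np.exp(nb))
    nonzero_case = np.log(1.0 - pi) + nb
    return np.where(x == 0, zero_case, nonzero_case)

# Hypothetical gene with mean 3, dispersion 2 and 30% dropout.
counts = np.arange(0, 500)
probs = np.exp(zinb_log_pmf(counts, mu=3.0, theta=2.0, pi=0.3))
```

A denoising autoencoder of this kind is typically trained by minimizing the negative of this log-likelihood, with `mu`, `theta` and `pi` predicted per gene and per cell by the decoder.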
Funding (deep convolutional adversarial GAE paper): Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDC02070600).
Funding (IADGAE paper): Supported in part by the National Natural Science Foundation of China (U21B2015, 61972300), in part by the Young Scientists Fund of the National Natural Science Foundation of China (62202356), in part by the Young Talent Fund of Association for Science and Technology in Shaanxi (20220113), and in part by the Intelligent Financial Software Engineering New Technology Joint Laboratory Project (99901220858).