With the widespread collection and processing of data, privacy-preserving machine learning has become increasingly important for addressing privacy risks to individuals. The support vector machine (SVM) is one of the most elementary learning models in machine learning, and privacy issues surrounding SVM classifier training have attracted increasing attention. In this paper, we investigate Differential Privacy-compliant Federated Machine Learning with Dimensionality Reduction, called FedDPDR-DPML, which greatly improves data utility while providing strong privacy guarantees. In distributed learning scenarios, multiple participants usually hold unbalanced or small amounts of data. FedDPDR-DPML therefore enables multiple participants to collaboratively learn a global model based on weighted model averaging and knowledge aggregation; the server then distributes the global model to each participant to improve local data utility. For high-dimensional data, we apply differential privacy in both the principal component analysis (PCA)-based dimensionality reduction phase and the SVM classifier training phase, which improves model accuracy while achieving strict differential privacy protection. In addition, we train differential privacy (DP)-compliant SVM classifiers by adding noise to the objective function itself, which leads to better data utility. Extensive experiments on three high-dimensional datasets demonstrate that FedDPDR-DPML can achieve high accuracy while ensuring strong privacy protection.
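The weighted model averaging step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the weights are chosen, so the sketch assumes each participant is weighted by its local sample count (as in standard federated averaging); the function name `weighted_average` and the toy inputs are hypothetical, and the knowledge aggregation and differential privacy steps are not modeled here.

```python
from typing import List, Sequence


def weighted_average(models: Sequence[Sequence[float]],
                     sizes: Sequence[int]) -> List[float]:
    """Average parameter vectors, weighting each by its sample count.

    Assumption: weights are proportional to local dataset size; the
    abstract only says "weighted model averaging".
    """
    if not models or len(models) != len(sizes):
        raise ValueError("need exactly one sample count per model")
    total = sum(sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(models[0])
    if any(len(m) != dim for m in models):
        raise ValueError("all models must have the same dimension")
    # Coordinate-wise weighted mean of the local parameter vectors.
    return [sum(n * m[j] for m, n in zip(models, sizes)) / total
            for j in range(dim)]


# Two hypothetical participants holding 1 and 3 samples respectively.
global_model = weighted_average([[1.0, 0.0], [3.0, 2.0]], [1, 3])
```

Weighting by sample count lets participants with more data contribute more to the global model, which is one plausible way to handle the unbalanced holdings the abstract describes.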
Cyber Threat Intelligence (CTI) is a valuable resource for cybersecurity defense, but it also poses challenges due to its multi-source and heterogeneous nature. Security personnel may be unable to use CTI effectively to understand the condition and trend of a cyberattack and respond promptly. To address these challenges, we propose a novel approach that consists of three steps. First, we construct the attack and defense analysis of the cybersecurity ontology (ADACO) model by integrating multiple cybersecurity databases. Second, we develop the threat evolution prediction algorithm (TEPA), which can automatically detect threats at device nodes, correlate and map multi-source threat information, and dynamically infer the threat evolution process. TEPA leverages knowledge graphs to represent comprehensive threat scenarios and achieves better performance in simulated experiments by combining structural and textual features of entities. Third, we design the intelligent defense decision algorithm (IDDA), which can recommend the most suitable defense techniques to security personnel. IDDA outperforms the baseline methods in the comparative experiment.
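Existing knowledge-graph-based recommendation algorithms tend to focus on item-side neighbor information while neglecting user-side interest features. To address this problem, a dual-end neighbor recommendation algorithm that fuses graph attention with a knowledge graph convolutional network is proposed. First, on the user side, the user's historical interests are used as seeds, preferences are propagated iteratively through the knowledge graph, and graph attention is fused in to form the user's latent interest vector. Second, on the item side, a graph convolutional network aggregates important neighborhood information along knowledge-graph traversal paths to obtain an item preference aggregation vector. In addition, a label smoothing regularization term is incorporated into the loss function. Finally, an inner product yields the predicted preference of the user for the item. Experimental results on public datasets show that, compared with other baseline algorithms, the proposed algorithm improves the evaluation metrics AUC (Area Under Curve), F1 (F1-score), and recall in the CTR (Click-Through Rate) and Top-K (evaluation of the top K predictions returned by the model) recommendation scenarios. The proposed algorithm offers good recommendation performance and interpretability.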
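The final two steps of this abstract (inner-product prediction and a label-smoothed loss) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact form of the label smoothing term, so the sketch uses the common binary form that moves each 0/1 label toward 0.5 by a factor `eps`, and passes the inner product through a sigmoid to get a click probability. The names `predict` and `smoothed_bce` and the value `eps=0.1` are hypothetical, and the graph attention and graph convolution stages that produce the two vectors are not modeled.

```python
import math
from typing import Sequence


def predict(user_vec: Sequence[float], item_vec: Sequence[float]) -> float:
    """Click probability: sigmoid of the user-item inner product."""
    if len(user_vec) != len(item_vec):
        raise ValueError("vectors must have the same dimension")
    score = sum(u * v for u, v in zip(user_vec, item_vec))
    return 1.0 / (1.0 + math.exp(-score))


def smoothed_bce(p: float, y: int, eps: float = 0.1) -> float:
    """Binary cross-entropy against a smoothed label.

    Assumption: y_smooth = y * (1 - eps) + eps / 2, the usual two-class
    label smoothing; the paper's exact regularization term may differ.
    """
    y_smooth = y * (1.0 - eps) + 0.5 * eps
    return -(y_smooth * math.log(p) + (1.0 - y_smooth) * math.log(1.0 - p))


# Hypothetical 2-dimensional user and item vectors.
p = predict([0.5, -1.0], [2.0, 1.0])   # inner product is 0.0
loss = smoothed_bce(p, y=1)
```

Smoothing the labels keeps the target away from exactly 0 or 1, which discourages overconfident predictions.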
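Automatic knowledge extraction methods can automatically identify and extract factual knowledge in Web documents that matches an ontology. This factual knowledge can be used both to build knowledge-based services and to supply the semantic data needed to realize the Semantic Web. However, automatic knowledge extraction from natural language, and from Chinese natural language in particular, is very difficult. A new automatic knowledge extraction method, AKE, is proposed, based on Semantic Web theory and Chinese natural language processing (NLP) techniques. It uses the concept of aggregated knowledge to characterize N-ary relational knowledge, and it can automatically identify both explicit and implicit simple factual knowledge and N-ary relational complex factual knowledge in Chinese natural-language documents without relying on large-scale linguistic knowledge bases or synonym tables. Experimental results show that the method outperforms other currently known methods.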
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 62102311, 62202377, 62272385), in part by the Natural Science Basic Research Program of Shaanxi (Nos. 2022JQ-600, 2022JM-353, 2023-JC-QN-0327), in part by the Shaanxi Distinguished Youth Project (No. 2022JC-47), in part by the Scientific Research Program Funded by Shaanxi Provincial Education Department (No. 22JK0560), in part by Distinguished Youth Talents of Shaanxi Universities, and in part by the Youth Innovation Team of Shaanxi Universities.