
知识增强的自监督表格数据异常检测方法研究

Self-Supervised Tabular Data Anomaly Detection Method Based on Knowledge Enhancement
Abstract: Traditional supervised anomaly detection methods have developed rapidly. To reduce the dependence on labels, self-supervised pre-training methods have been widely studied, and research shows that embedding additional intrinsic semantic knowledge is crucial for tabular learning. To mine the rich knowledge contained in tabular data, a self-supervised tabular data anomaly detection method based on knowledge enhancement (STKE) is proposed, with the following improvements. The proposed data processing module integrates domain (semantic) knowledge and statistical and mathematical knowledge into feature construction, while self-supervised pre-training (parameter learning) provides contextual knowledge priors, enabling the transfer of rich information from tabular data. A mask mechanism is applied to the original data so that the model learns the masked features from the related unmasked features, while also predicting the original values from representations perturbed by additive Gaussian noise in the hidden-layer space. This strategy encourages the model to recover the original feature information even when its inputs are noisy. A hybrid attention mechanism is used to effectively extract the associations between data features. Experimental results on six datasets show that the proposed method achieves superior performance.
Authors: GAO Xiaoyu (高小玉); ZHAO Xiaoyong (赵晓永); WANG Lei (王磊) (School of Information Management, Beijing Information Science & Technology University, Beijing 100192, China)
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2024, No. 10, pp. 140-147 (8 pages)
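Funding: National Key R&D Program of China (2019YFB1705402); Humanities and Social Sciences Planning Fund of the Ministry of Education (20YJAZH129); Key Social Science Program of the Beijing Municipal Education Commission (SZ202011232024).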
Keywords: anomaly detection; self-supervised; knowledge enhancement; pre-training
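The masking and latent-noise pretext task described in the abstract can be illustrated with a rough sketch. This is not the authors' STKE implementation: the zero-value masking, the single tanh layer, the untrained random weights, and every name below (`mask_features`, `forward`, `masked_mse`, `mask_ratio`, `sigma`) are assumptions made only for illustration, and the knowledge-enhanced feature construction and hybrid attention are omitted.

```python
import numpy as np


def mask_features(x, mask_ratio, rng):
    """Hide a random subset of entries; return the corrupted copy and the mask."""
    mask = rng.random(x.shape) < mask_ratio  # True where an entry is hidden
    return np.where(mask, 0.0, x), mask


def forward(x_masked, w_enc, w_dec, sigma, rng):
    """Encode, perturb the hidden representation with Gaussian noise, decode."""
    h = np.tanh(x_masked @ w_enc)                       # hidden-layer representation
    h_noisy = h + rng.normal(0.0, sigma, size=h.shape)  # additive Gaussian noise
    return h_noisy @ w_dec                              # reconstruction of the input


def masked_mse(x, x_hat, mask):
    """Mean squared error over the hidden entries only."""
    if not mask.any():
        return 0.0
    return float(np.mean((x[mask] - x_hat[mask]) ** 2))


# Toy data and untrained weights (hypothetical sizes: 8 rows, 5 features, 16 hidden units).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5))
w_enc = rng.normal(scale=0.1, size=(5, 16))
w_dec = rng.normal(scale=0.1, size=(16, 5))

x_masked, mask = mask_features(x, mask_ratio=0.3, rng=rng)
x_hat = forward(x_masked, w_enc, w_dec, sigma=0.1, rng=rng)
loss = masked_mse(x, x_hat, mask)
```

In a real pre-training loop, `loss` would be minimized with respect to the encoder and decoder weights, so that the model learns to infer hidden features from the visible ones while tolerating noise in the hidden representation.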
