1,139 articles found
Backward Support Computation Method for Positive and Negative Frequent Itemset Mining
1
Authors: Mrinmoy Biswas Akash, Indrani Mandal, Md. Selim Al Mamun. 《Journal of Data Analysis and Information Processing》, 2023, No. 1, pp. 37-48 (12 pages)
Association rule mining is a major data mining field that leads to the discovery of associations and correlations among items in today’s big data environment. Conventional association rule mining focuses mainly on positive itemsets generated from frequently occurring itemsets (PFIS). However, a significant body of work has focused on infrequent itemsets, using negative association rules to mine interesting negative frequent itemsets (NFIS) from transactions. In this work, we propose an efficient backward-calculating negative frequent itemset algorithm, EBC-NFIS, which computes backward supports and can extract both positive and negative frequent itemsets synchronously from a dataset. EBC-NFIS is based on the popular e-NFIS algorithm, which computes the supports of negative itemsets from the supports of positive itemsets. The proposed algorithm reuses previously computed supports held in memory to minimize computation time. In addition, positive and negative association rules (PNARs) are generated from the frequent itemsets discovered by EBC-NFIS. The efficiency of the proposed algorithm is verified by several experiments and by comparison with e-NFIS. The experimental results confirm that the proposed algorithm successfully discovers NFIS and PNARs and runs significantly faster than the conventional e-NFIS algorithm.
Keywords: Data Mining; Positive Frequent Itemset; Negative Frequent Itemset; Association Rule; Backward Support
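Editor's sketch: the abstract above says that supports of negative itemsets are computed from supports of positive itemsets. One standard way to do this is the inclusion-exclusion identity; the code below illustrates that identity only and is not the EBC-NFIS or e-NFIS algorithm. The function names and the toy transactions are assumptions made for illustration.

```python
from itertools import combinations

def support_count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = frozenset(items)
    return sum(items <= t for t in transactions)

def negative_support_count(transactions, positive, negated):
    """Count transactions containing all of `positive` and none of `negated`,
    using only positive-itemset counts (inclusion-exclusion)."""
    positive = frozenset(positive)
    negated = sorted(negated)
    total = 0
    for k in range(len(negated) + 1):
        for subset in combinations(negated, k):
            # Alternate signs over subsets of the negated items.
            total += (-1) ** k * support_count(transactions, positive | set(subset))
    return total

# Hypothetical toy data, not taken from the paper.
transactions = [frozenset(t) for t in (
    {"a", "b"}, {"a"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"},
)]
print(negative_support_count(transactions, {"a"}, {"b"}))
```

In a real miner the positive counts would be looked up from previously computed results rather than recounted, which is the kind of reuse the abstract describes.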
Frequent Itemset Mining of User’s Multi-Attribute under Local Differential Privacy (Cited by: 1)
2
Authors: Haijiang Liu, Lianwei Cui, Xuebin Ma, Celimuge Wu. 《Computers, Materials & Continua》 SCIE EI, 2020, No. 10, pp. 369-385 (17 pages)
Frequent itemset mining is an essential problem in data mining and plays a key role in many data mining applications. However, users’ personal privacy can be leaked in the mining process. In recent years, applying local differential privacy protection models to mine frequent itemsets has become a relatively reliable and secure protection method. Local differential privacy means that users first perturb their original data and then send the perturbed data to the aggregator, preventing the aggregator from revealing the users’ private information. We propose a novel framework that implements frequent itemset mining under local differential privacy and is applicable to users’ multi-attribute data. The main technique uses bitmap encoding to convert a user’s original data into a binary string. The framework also covers how to choose the best perturbation algorithm for varying user attributes, and uses the frequent pattern tree (FP-tree) algorithm to mine frequent itemsets. Finally, we incorporate the threshold random response (TRR) algorithm into the framework, compare it with existing algorithms, and demonstrate that the TRR algorithm has higher accuracy for mining frequent itemsets.
Keywords: local differential privacy; frequent itemset mining; user’s multi-attribute
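Editor's sketch: the abstract above describes users perturbing a bitmap-encoded record before sending it to an aggregator. The code below shows plain per-bit randomized response with the usual unbiased count estimator. It is not the TRR algorithm from the paper, and `epsilon` here is a per-bit parameter assumed for illustration; the overall privacy guarantee depends on the encoding and is not analysed here.

```python
import math
import random

def keep_probability(epsilon):
    """Probability of reporting a bit truthfully under randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))

def perturb_bits(bits, epsilon, rng):
    """User side: keep each bit with probability p, otherwise flip it."""
    p = keep_probability(epsilon)
    return [b if rng.random() < p else 1 - b for b in bits]

def estimate_count(observed_ones, n, epsilon):
    """Aggregator side: unbiased estimate of the true number of 1s among n
    reports, from E[observed] = true * p + (n - true) * (1 - p)."""
    p = keep_probability(epsilon)
    return (observed_ones - n * (1.0 - p)) / (2.0 * p - 1.0)

# Hypothetical usage: one user's bitmap, perturbed with a seeded generator.
rng = random.Random(0)
report = perturb_bits([1, 0, 1, 1, 0], epsilon=1.0, rng=rng)
print(len(report))
```

The estimated per-item counts could then feed a frequent itemset miner such as FP-growth, as the abstract's framework does with its own perturbation scheme.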
FPGA-Based Stream Processing for Frequent Itemset Mining with Incremental Multiple Hashes
3
Authors: Kasho Yamamoto, Masayuki Ikebe, Tetsuya Asai, Masato Motomura. 《Circuits and Systems》, 2016, No. 10, pp. 3299-3309 (11 pages)
With the advent of the IoT era, the amount of real-time data that is processed in data centers has increased explosively. As a result, stream mining, which extracts useful knowledge from a huge amount of data in real time, is attracting more and more attention. It is said, however, that real-time stream processing will become more difficult in the near future, because the performance of processing applications continues to increase at a rate of 10% - 15% each year, while the amount of data to be processed is increasing exponentially. In this study, we focused on identifying a promising stream mining algorithm, specifically a Frequent Itemset Mining (FIsM) algorithm, and then improved its performance using an FPGA. FIsM algorithms are important, basic data-mining techniques used to discover association rules from transactional databases. We improved on a recently proposed approximate FIsM algorithm so that it would fit onto a hardware architecture efficiently. We then ran experiments on an FPGA. As a result, we have been able to achieve a speed 400% faster than the original algorithm implemented on a CPU. Moreover, our FPGA prototype showed a 20 times speed improvement compared to the CPU version.
Keywords: Data Mining; Frequent Itemset Mining; FPGA; Stream Processing
Double-layer Bayesian Classifier Ensembles Based on Frequent Itemsets (Cited by: 3)
4
Authors: Wei-Guo Yi, Jing Duan, Ming-Yu Lu. 《International Journal of Automation and Computing》 EI, 2012, No. 2, pp. 215-220 (6 pages)
Numerous models have been proposed to reduce the classification error of Naïve Bayes by weakening its attribute independence assumption, and some have demonstrated remarkable error performance. Considering that ensemble learning is an effective method of reducing the classification error of a classifier, this paper proposes a double-layer Bayesian classifier ensembles (DLBCE) algorithm based on frequent itemsets. DLBCE constructs a double-layer Bayesian classifier (DLBC) for each frequent itemset contained in the new instance and finally ensembles all the classifiers by assigning a different weight to each classifier according to the conditional mutual information. The experimental results show that the proposed algorithm outperforms other outstanding algorithms.
Keywords: naive Bayes classification; frequent itemset; Bayesian classifier; error performance; ensemble learning; weight assignment; multiple models; algorithm
A novel algorithm for frequent itemset mining in data warehouses (Cited by: 2)
5
Authors: 徐利军, 谢康林. 《Journal of Zhejiang University-Science A (Applied Physics & Engineering)》 SCIE EI CAS CSCD, 2006, No. 2, pp. 216-224 (9 pages)
Current technology for frequent itemset mining mostly applies to data stored in a single transaction database. This paper presents a novel algorithm, MultiClose, for frequent itemset mining in data warehouses. MultiClose computes the results in each dimension table separately and merges the results with a very efficient approach. The closed-itemset technique is used to improve the performance of the algorithm. The authors propose an efficient implementation for star schemas in which their algorithm outperforms state-of-the-art single-table algorithms.
Keywords: data warehouse; data mining; frequent itemset; closed itemset
FICW: Frequent Itemset Based Text Clustering with Window Constraint
6
Authors: ZHOU Chong, LU Yansheng, ZOU Lei, HU Rong. 《Wuhan University Journal of Natural Sciences》 CAS, 2006, No. 5, pp. 1345-1351 (7 pages)
Most existing text clustering algorithms overlook the fact that a document is a word sequence with semantic information. Important semantic information exists in the positions of words in the sequence. In this paper, a novel method named Frequent Itemset-based Clustering with Window (FICW) is proposed, which uses this semantic information for text clustering with a window constraint. Experimental results from tests on three (hypertext) text sets show that FICW outperforms the compared method in both clustering accuracy and efficiency.
Keywords: text clustering; search engine; information retrieval; semantics; FICW
Mining φ-Frequent Itemset Using FP-Tree
7
Authors: 李天瑞. 《Journal of Modern Transportation》, 2001, No. 1, pp. 67-74 (8 pages)
The problem of association rule mining has gained considerable prominence in the data mining community for its use as an important tool for knowledge discovery from large-scale databases, and there has been a spurt of research activity around this problem. However, traditional association rule mining often derives many rules in which people are uninterested. This paper reports a generalization of association rule mining called φ-association rule mining. It allows people to have different interests in different itemsets, as real applications require. It also helps derive interesting rules and substantially reduces the number of rules. An algorithm based on the FP-tree for mining φ-frequent itemsets is presented. Experiments show that the proposed method is efficient and scalable over large databases.
Keywords: data processing; databases; φ-association rule mining; φ-frequent itemset; FP-tree; data mining
A Depth-first Algorithm of Finding All Association Rules Generated by a Frequent Itemset
8
Authors: 武坤, 姜保庆, 魏庆. 《Journal of Donghua University (English Edition)》 EI CAS, 2006, No. 6, pp. 1-4, 9 (5 pages)
The classical algorithm for finding the association rules generated by a frequent itemset has to generate all non-empty subsets of the frequent itemset as the candidate set of consequents. Xiongfei Li addressed this and proposed an improved algorithm. That algorithm finds all consequents layer by layer, so it is breadth-first. In this paper, we propose a new algorithm, Generate Rules by using Set-Enumeration Tree (GRSET), which uses the structure of the set-enumeration tree and a depth-first method to find all consequents of the association rules one by one and obtain all association rules corresponding to the consequents. Experiments show the GRSET algorithm to be practicable and efficient.
Keywords: computation method; itemset; depth-first algorithm; breadth-first algorithm
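Editor's sketch: the abstract above contrasts GRSET with the classical approach of enumerating subsets of a frequent itemset as candidate consequents. The code below illustrates only that classical baseline, restricted to non-empty proper subsets so that the antecedent is never empty. It is not GRSET, and the function name, the `support` dictionary, and the toy counts are assumptions made for illustration.

```python
from itertools import combinations

def rules_from_itemset(itemset, support, min_conf):
    """Try every non-empty proper subset of `itemset` as a consequent and keep
    the rules antecedent -> consequent whose confidence reaches `min_conf`.
    `support` maps frozensets to support counts for the itemset and its subsets."""
    itemset = frozenset(itemset)
    rules = []
    for k in range(1, len(itemset)):
        for consequent in combinations(sorted(itemset), k):
            consequent = frozenset(consequent)
            antecedent = itemset - consequent
            # confidence = supp(antecedent and consequent) / supp(antecedent)
            conf = support[itemset] / support[antecedent]
            if conf >= min_conf:
                rules.append((antecedent, consequent, conf))
    return rules

# Hypothetical support counts, not taken from the paper.
support = {frozenset({"a"}): 4, frozenset({"b"}): 3, frozenset({"a", "b"}): 2}
rules = rules_from_itemset({"a", "b"}, support, min_conf=0.6)
print(rules)
```

The exhaustive enumeration grows exponentially with the itemset size, which is the cost that layer-by-layer and depth-first variants aim to reduce.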
Hadamard Encoding Based Frequent Itemset Mining under Local Differential Privacy (Cited by: 1)
9
Authors: 赵丹, 赵素云, 陈红, 刘睿瑄, 李翠平, 张晓莹. 《Journal of Computer Science & Technology》 SCIE EI CSCD, 2023, No. 6, pp. 1403-1422 (20 pages)
Local differential privacy (LDP) approaches to collecting sensitive information for frequent itemset mining (FIM) can reliably guarantee privacy. Most current approaches to FIM under LDP add "padding and sampling" steps to obtain frequent itemsets and their frequencies because each user transaction represents a set of items. The current state-of-the-art approach, namely set-value itemset mining (SVSM), must balance variance and bias to achieve accurate results. Thus, an unbiased FIM approach with lower variance is highly promising. To narrow this gap, we propose an item-level LDP frequency oracle approach, named the Integrated-with-Hadamard-Transform-Based Frequency Oracle (IHFO). For the first time, Hadamard encoding is introduced to a set of values to encode all items into a fixed vector, and perturbation can be subsequently applied to the vector. An FIM approach, called optimized united itemset mining (O-UISM), is proposed to combine the padding-and-sampling-based frequency oracle (PSFO) and the IHFO into a framework for acquiring accurate frequent itemsets with their frequencies. Finally, we theoretically and experimentally demonstrate that O-UISM significantly outperforms the extant approaches in finding frequent itemsets and estimating their frequencies under the same privacy guarantee.
Keywords: local differential privacy; frequent itemset mining; frequency oracle
Parallel Incremental Frequent Itemset Mining for Large Data (Cited by: 5)
10
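Authors: Yu-Geng Song, Hui-Min Cui, Xiao-Bing Feng. 《Journal of Computer Science & Technology》 SCIE EI CSCD, 2017, No. 2, pp. 368-385 (18 pages)
Frequent itemset mining (FIM) is a popular data mining problem adopted in many fields, such as commodity recommendation in the retail industry, log analysis in web searching, and query recommendation (or related search). Many FIM algorithms have been proposed to obtain better performance, including parallelized algorithms for processing large data volumes. Incremental FIM algorithms have also been proposed to deal with incremental database updates. However, most of these incremental algorithms have low parallelism, causing low efficiency on huge databases. This paper presents two parallel incremental FIM algorithms, called IncMiningPFP and IncBuildingPFP, implemented on the MapReduce framework. IncMiningPFP preserves the FP-tree mining results of the original pass and utilizes them for incremental calculations. In particular, we propose a method to generate a partial FP-tree in the incremental pass, in order to avoid unnecessary mining work. Further, some of the incremental parallel tasks can be omitted when the inserted transactions include fewer items. IncBuildingPFP preserves the CanTrees built in the original pass and then adds new transactions to them during the incremental passes. Our experimental results show that IncMiningPFP can achieve significant speedup over PFP (Parallel FPGrowth) and a sequential incremental algorithm (CanTree) in most cases of incremental input database, and in the other cases IncBuildingPFP can achieve it.
Keywords: incremental parallel FPGrowth; data mining; frequent itemset mining; MapReduce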
Effect of Count Estimation in Finding Frequent Itemsets over Online Transactional Data Streams (Cited by: 2)
11
Authors: Joong Hyuk Chang, Won Suk Lee. 《Journal of Computer Science & Technology》 SCIE EI CSCD, 2005, No. 1, pp. 63-69 (7 pages)
A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. For this reason, most algorithms for data streams sacrifice the correctness of their results for fast processing time. The processing time is greatly influenced by the amount of information that should be maintained. This issue becomes more serious in finding frequent itemsets or frequency counting over an online transactional data stream, since there can be a large number of itemsets to be monitored. We have proposed a method called the estDec method for finding frequent itemsets over an online data stream. In order to reduce the number of monitored itemsets in this method, monitoring the count of an itemset is delayed until its support is large enough to become a frequent itemset in the near future. For this purpose, the count of an itemset should be estimated. Consequently, how to estimate the count of an itemset is a critical issue in minimizing memory usage as well as processing time. In this paper, the effects of various count estimation methods for finding frequent itemsets are analyzed in terms of mining accuracy, memory usage and processing time.
Keywords: count estimation; data stream; data analysis; frequent itemset
Mining Frequent Itemsets in Correlated Uncertain Databases (Cited by: 1)
12
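Authors: 童咏昕, 陈雷, 余洁莹. 《Journal of Computer Science & Technology》 SCIE EI CSCD, 2015, No. 4, pp. 696-712 (17 pages)
Recently, with the growing popularity of the Internet of Things (IoT) and pervasive computing, a large amount of uncertain data, e.g., RFID data, sensor data, and real-time video data, has been collected. As one of the most fundamental problems of uncertain data mining, uncertain frequent pattern mining has attracted much attention in the database and data mining communities. Although there are some solutions for uncertain frequent pattern mining, most of them assume that the data are independent, which is not true in most real-world scenarios. Therefore, current methods based on the independence assumption may generate inaccurate results for correlated uncertain data. In this paper, we focus on the problem of mining frequent itemsets over correlated uncertain data, where correlation can exist between any pair of uncertain data objects (transactions). We propose a novel probabilistic model, called the Correlated Frequent Probability model (CFP model), to represent the probability distribution of support in a given correlated uncertain dataset. Based on the distribution of support derived from the CFP model, we observe that some probabilistic frequent itemsets are frequent only in several transactions with high positive correlation. In particular, itemsets that are globally probabilistic frequent are more significant in eliminating the influence of the noise and correlation present in the data. In order to reduce redundant frequent itemsets, we further propose a new type of pattern, called global probabilistic frequent itemsets, to identify itemsets that are always frequent in each group of transactions when the whole correlated uncertain database is divided into disjoint groups based on their correlation. To speed up the mining process, we also design a dynamic programming solution, as well as two pruning and bounding techniques. Extensive experiments on both real and synthetic datasets verify the effectiveness and efficiency of the proposed model and algorithms.
Keywords: frequent itemset; data mining; database; frequent pattern mining; probabilistic model; uncertain data; correlation; pervasive computing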
Text Classification Using Sentential Frequent Itemsets
13
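Authors: 刘石竹, 胡和平. 《Journal of Computer Science & Technology》 SCIE EI CSCD, 2007, No. 2, pp. 334-336, F0003 (4 pages)
Text classification techniques mostly rely on single-term analysis of the document data set, while more concepts, especially specific ones, are usually conveyed by sets of terms. To achieve a more accurate text classifier, more informative features, including frequently co-occurring words in the same sentence and their weights, are particularly important in such scenarios. In this paper, we propose a novel approach to text classification using sentential frequent itemsets, a concept that comes from association rule mining. The approach treats a sentence rather than a document as a transaction and uses a method based on variable precision rough sets to evaluate each sentential frequent itemset’s contribution to the classification. Experiments on the Reuters and newsgroup corpora are carried out and validate the practicability of the proposed system. Electronic supplementary material for this article is available at http://dx.doi.org/10.1007/s11390-007-9041-7 and is accessible to authorized users.
Keywords: text classification; text information processing; sentence; rough set model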
Mining Frequent Closed Itemsets in Large High Dimensional Data
14
Authors: 余光柱, 曾宪辉, 邵世煌. 《Journal of Donghua University (English Edition)》 EI CAS, 2008, No. 4, pp. 416-424 (9 pages)
Large high-dimensional data have posed great challenges to existing algorithms for frequent itemset mining. To solve the problem, a hybrid method, consisting of a novel row enumeration algorithm and a column enumeration algorithm, is proposed. The intention of the hybrid method is to decompose the mining task into two subtasks and then choose appropriate algorithms to solve them respectively. The novel algorithm, i.e., Inter-transaction, is based on the characteristic that there are few common items between or among long transactions. In addition, an optimization technique is adopted to improve the performance of the intersection of bit-vectors. Experiments on synthetic data show that our method achieves high performance on large high-dimensional data.
Keywords: frequent closed itemsets; large high-dimensional data; hybrid method; computer program
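Editor's sketch: the abstract above mentions row enumeration and the intersection of bit-vectors. The code below shows one common way to intersect transactions represented as bit-vectors over the item universe, which yields the items shared by a group of rows. The abstract does not say how the paper lays out its bit-vectors or what its optimization is, so the layout, the function names, and the toy data are assumptions made for illustration.

```python
def encode(transaction, item_index):
    """Encode a transaction as an int with bit i set when item i is present."""
    mask = 0
    for item in transaction:
        mask |= 1 << item_index[item]
    return mask

def common_items(masks, items):
    """Intersect a non-empty list of transaction bit-vectors with bitwise AND
    and decode the surviving bits back into item names."""
    acc = masks[0]
    for m in masks[1:]:
        acc &= m
    return {item for i, item in enumerate(items) if (acc >> i) & 1}

# Hypothetical toy data, not taken from the paper.
items = ["a", "b", "c", "d"]
item_index = {item: i for i, item in enumerate(items)}
masks = [encode(t, item_index) for t in ({"a", "b", "c"}, {"b", "c", "d"}, {"a", "c"})]
print(common_items(masks, items))
```

When long transactions share few items, as the abstract notes, such intersections shrink quickly, which keeps the row-enumeration search small.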
Sliding-Window-Based High-Utility Pattern Mining with Negative Items
15
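Authors: 武妍, 荀亚玲, 马煜. 《计算机工程与设计》 PKU Core, 2024, No. 3, pp. 845-851 (7 pages)
Traditional high-utility pattern mining does not consider items with negative utility values and does not handle stream data in a timely manner. To address this, a sliding-window-based high-utility mining algorithm, HUPN_SW, is proposed. A newly defined sliding-window positive-and-negative utility list, PNSWU-List, maintains all the information needed to mine the high-utility pattern set of the most recent batches. This enables efficient batch-by-batch mining, avoids repeated database scans, and mines high-utility patterns directly without generating candidate utility pattern sets, so that HUPN_SW adapts effectively to dynamic stream data. Experimental results show that HUPN_SW performs well in terms of running time and scalability.
Keywords: frequent pattern mining; sliding window; high-utility pattern mining; high-utility itemset; negative utility; stream data; utility list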
Frequent Itemset Mining: Research Frontiers and Prospects
16
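Authors: 张晴, 谭旭, 吕欣. 《深圳信息职业技术学院学报》, 2024, No. 1, pp. 1-14 (14 pages)
Frequent itemset mining is one of the core tasks in data mining; its goal is to discover patterns that occur frequently in a database. These patterns play an important role in many data mining tasks, such as association rules, classification, and anomaly detection. Because the number of item combinations grows exponentially with itemset size, computational complexity rises sharply, and researchers have long worked to develop efficient algorithms for this problem. Focusing on the algorithms, compact representations, and frontier applications of frequent itemset mining, this paper examines the working principles, strengths, and limitations of the different techniques and gives a comprehensive summary of the state of research in the field. Finally, it discusses frontier trends in the field and identifies potential future research directions, including computational efficiency, constraint-based frequent itemset mining, pattern interpretability, and innovative applications of the algorithms in different domains.
Keywords: frequent itemset; data mining; pattern growth; association rules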
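Analysis of Medication Patterns in the Traditional Chinese Medicine Syndrome-Differentiation Treatment of Diabetic Heart Disease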
17
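Authors: 陈丽霞, 郭苗苗, 李儒婷, 彭剑飞, 张惠玲, 王靓, 施慧. 《陕西中医药大学学报》, 2024, No. 3, pp. 74-81 (8 pages)
Objective: To explore medication patterns for diabetic heart disease based on modern literature. Methods: Literature on the syndrome-differentiation treatment of diabetic heart disease with Chinese medicine was retrieved from databases including China National Knowledge Infrastructure (CNKI) and the China Biology Medicine database (CBM), from database inception to December 2021. Lantern 5.0 and Weka 3.8.5 were used, respectively, for latent structure analysis of herbs and symptoms and for frequent itemset analysis of herb-herb, herb-syndrome, and herb-symptom combinations. Results: A total of 131 articles were included. Data mining identified 51 common symptoms, including white tongue coating, lusterless complexion, and dizziness, and 145 herbs, including Danshen (丹参), Maidong (麦冬), and Huangqi (黄芪). Herb functions included tonifying deficiency, activating blood and resolving stasis, and clearing heat. The latent structure model of herbs yielded 4 latent classes, including tonifying the liver and kidney and astringing essence to arrest collapse; the latent structure model of symptoms yielded syndrome elements such as qi deficiency, yin deficiency, yang deficiency, and phlegm-dampness. Twelve herb-herb frequent itemsets were mined, including Chuanxiong (川芎) + Maidong (麦冬) + Danshen (丹参); 17 herb-syndrome frequent itemsets, including Rougui (肉桂) + Wuweizi (五味子) + deficiency of both yin and yang; and 12 herb-symptom frequent itemsets, including Gualou (瓜蒌) + loose stool + white tongue coating. Conclusion: Chinese medicine treatment of diabetic heart disease mainly regulates and tonifies the heart and kidney and strengthens the spleen and replenishes qi, with herbs selected according to the specific syndrome type; this can serve as a reference for clinical intervention in diabetic heart disease.
Keywords: diabetes; heart disease; data mining; latent structure; frequent itemset; medication patterns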
A Party and Government Fee Collection Platform Based on Parallel Frequent Itemsets
18
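Authors: 郭振华, 孙艳青, 王中兴. 《电子设计工程》, 2024, No. 5, pp. 31-36 (6 pages)
To improve the timeliness and informatized management of Party and government fee collection, an efficient and intelligent fee collection management platform is developed based on a parallel frequent itemset mining algorithm. The overall architecture of the platform is built on cloud computing technology and provides functions such as cloud payment, Party-building education and study, and Party-building publicity. A Spark cluster is built on the Spark distributed computing framework, and a frequent itemset mining matrix for fee collection is constructed. The support of frequent k-itemsets is obtained from operations between the rows and columns of the matrix, and a master-slave node mode is used to implement parallel frequent itemset mining and obtain classification results for fee collection management information. Test results show that the maximum average response time of the platform's functions is only 1.51 s, and that mining frequent itemsets of fee collection information has a short time overhead and a high non-empty recommendation rate, indicating good efficiency and quality of frequent itemset mining. The platform helps optimize the working mode of Party and government fee payment and supports the informatization and intelligent management of Party members.
Keywords: parallel; cloud computing; frequent itemset; Spark platform; mining; Party and government fee collection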
Application of an Improved Association Rule Algorithm in the Natural Resources Cloud
19
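Authors: 李佳临, 邬阳, 魏奇, 赵雯雯, 李芳芳, 陈卉. 《时空信息学报》, 2024, No. 1, pp. 140-147 (8 pages)
To address problems such as decentralized management of natural resources information, weak network security defenses, and the difficulty of tracing threat and attack behavior, this study establishes a security protection system in the natural resources cloud to integrate network security resources, strengthen network security situational awareness, and enable agile attack prediction and rapid backtracking. The key to improving the effectiveness of the security protection system is the improvement of the association rule algorithm in the detection engine module of its security component. First, in the data collection stage, threat alert data are preprocessed into a standard, machine-processable data format. Second, in the matrix computation stage, the MapReduce distributed computing framework is used to improve the efficiency of frequent itemset processing. Finally, taking the Apriori algorithm as the baseline, the algorithm is improved with three measures: locking the range of frequent k-itemsets in a single scan, using matrix-vector inner products, and reducing the generation of redundant candidate itemsets. Simulation experiments show that, when processing a multi-source network security dataset of the same size with an association rule matrix of the same dimensions, the proposed algorithm improves processing efficiency by more than a factor of 3 over the classic Apriori algorithm; as the instantaneous size of the input dataset grows, its time complexity remains stable, at less than half that of the incremental mining algorithm. The results can help shift the network security protection work of the Ministry of Natural Resources from traditional passive defense to active defense.
Keywords: natural resources cloud; association rules; MapReduce; frequent itemset; Apriori; network security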
A Method for Determining Network Element Topology Connections Based on PrefixSpan and LightGBM
20
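Authors: 倪晋宇, 涂泾伦, 杨天昊, 陈晓峰, 白云飞. 《数字通信世界》, 2024, No. 1, pp. 41-44, 89 (5 pages)
This article proposes a method for determining network element topology connections based on PrefixSpan and LightGBM. The PrefixSpan algorithm is used to extract and mine alarm data; the mining results are then analyzed, and the analysis results are fed into LightGBM for supervised learning to obtain the final model for determining network element topology connections. Experimental results show that the method achieves an F1 score of 0.89 in inferring the topology connections of base stations and related network elements. It effectively improves the accuracy of topology connection determination, provides a strong means of correcting network element topology connections, and lays a solid foundation for building digital twin networks.
Keywords: digital twin network; frequent itemset; time series; network element topology connection; machine learning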