Abstract
To address the low mining accuracy, slow running speed, and large memory footprint that arise when mining complex, large-scale datasets in biological information networks, this paper proposes a multidimensional data mining algorithm for biological information networks based on association rule mapping. The algorithm uses the association mapping relationships among network datasets to determine their association rules, and introduces a mining factor and a relative error to improve mining accuracy. It then distinguishes subspaces, and the datasets within each subspace, according to the degree of association among datasets in the multidimensional subspaces, so that different datasets can be mined effectively. In the experiments, memory usage, mining accuracy, and running time were simulated for varying numbers of datasets. The results show that the algorithm based on association rule mapping effectively improves mining accuracy and also offers some advantage in reducing memory usage and increasing computation speed.
Source
《计算机应用研究》
CSCD
PKU Core Journals (北大核心)
2015, No. 6, pp. 1614-1616, 1620 (4 pages)
Application Research of Computers
Funding
Guangdong Province Industry-University-Research Funded Project (2012B091100043)
Keywords
data mining
association rule mapping
biological information network
multidimensional data mining