Abstract
In software fault testing and database access, mining data with non-significant features is difficult: such data are sparsely distributed and follow irregular patterns, which complicates data access. This paper proposes a mining algorithm for non-significant feature data based on chain distance estimation. The chain distance estimation model is shifted in the time domain, an outlier factor is defined for non-significant feature data, and the principal correlation features are extracted. From the chain distance estimates, a probability density value for effective feature mining is obtained, yielding an improved mining algorithm for non-significant feature data. Simulation experiments show that the algorithm identifies non-significant feature points effectively both when point clusters of different densities lie close together and when patterns deviate, which improves the effectiveness and generality of the mining algorithm. The method improves mining performance on non-significant feature data, achieves a relatively high hit rate, and has application value in database access and software fault testing.
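The abstract does not give the formulas for its chain distance estimate or its outlier factor. As a rough illustration of the general idea only, the sketch below uses the classic connectivity-based outlier factor, which is built on an average chaining distance. This is a substitute technique, not the authors' algorithm: it omits the time-domain shift and the probability density step, and the function names and the parameter `k` are assumptions made for this example.

```python
import math


def knn(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def avg_chain_distance(points, i, k):
    """Average chaining distance of points[i] over its k-neighbourhood.

    A chain is grown from points[i] by repeatedly attaching the remaining
    neighbour that is closest to any point already in the chain. Links
    added early receive larger weights; the weights sum to 1.
    """
    remaining = set(knn(points, i, k))
    chain = [i]
    r = len(remaining) + 1  # chain length including the start point
    total = 0.0
    for step in range(1, r):
        cost, nxt = min(
            (min(math.dist(points[c], points[j]) for c in chain), j)
            for j in remaining
        )
        total += 2 * (r - step) / (r * (r - 1)) * cost
        chain.append(nxt)
        remaining.remove(nxt)
    return total


def outlier_factor(points, i, k):
    """Chaining distance of points[i] relative to that of its neighbours."""
    neigh = knn(points, i, k)
    own = avg_chain_distance(points, i, k)
    mean_neigh = sum(avg_chain_distance(points, j, k) for j in neigh) / len(neigh)
    if mean_neigh == 0:
        # Degenerate case (duplicate neighbours): assumed convention.
        return float("inf") if own > 0 else 1.0
    return own / mean_neigh


# Hypothetical data: five points on a line plus one isolated point.
pts = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (2, 5)]
scores = [outlier_factor(pts, i, k=2) for i in range(len(pts))]
print(scores)
```

On this toy data the points on the line score 1.0 and the isolated point scores well above 1. A score near 1 means a point is connected to its neighbourhood about as tightly as its neighbours are to theirs, which is why this family of measures copes with clusters of differing density.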
Source
《科技通报》
Peking University Core Journal
2015, No. 6, pp. 142-144 (3 pages)
Bulletin of Science and Technology
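Funding
Outstanding Young and Middle-aged Project of the Hubei Provincial Department of Education (Q20132904, D20132903)
Natural Science Foundation of Hubei Province (2013CFB473)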
Keywords
non-significant feature
data mining
software testing
database access