Abstract
Existing privacy-preserving association rule mining algorithms fail to strike a good trade-off between efficiency and accuracy. To address this, the paper proposes the average information distributed clustering hybrid algorithm (AIDCH). The algorithm builds association rule vectors, drawing on information theory. It computes the accumulated occurrence-count associations of each feature of the information source and extracts salient latent features; the latent features of a neighborhood are then taken as the objects to be clustered. The concept of an association ontology from data mining is introduced, and mining is carried out under non-monotonic constraints, which overcomes the weakening of association-space data caused by privacy protection. Experiments show that, while preserving privacy, the algorithm achieves a good trade-off between accuracy and efficiency and has some practical value.
Source
Microelectronics & Computer (《微电子学与计算机》)
CSCD
Peking University Core Journals list (北大核心)
2014, No. 2, pp. 168-172 (5 pages)
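Funding
Henan Province Key Science and Technology Research Project (102102210265)
Henan Province Basic and Frontier Research Project (132300410400)
Henan Province Information Technology Education Research Planning Project (ITE12064)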
Keywords
privacy protection
association rule mining
association ontology
latent feature extraction
clustering