Abstract
Microarray datasets have few rows (samples) and many columns (genes), which makes mining frequent closed patterns difficult for traditional algorithms that search the column enumeration space. Working in the row enumeration space instead, this paper proposes a hyperlink structure, HT-struct, and a new frequent closed pattern mining algorithm built on it, HTCLOSE. HTCLOSE searches the row enumeration space depth-first and combines efficient pruning techniques with a carefully organized hyperlink (linked-list) layout, so that it is optimized in both time and space. Experiments on real-life microarray datasets show that HTCLOSE is generally faster than CARPENTER, another algorithm based on the row enumeration space.
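To make the idea of searching the row enumeration space concrete, the following is a minimal Python sketch under stated assumptions. It is not HTCLOSE or CARPENTER and does not use HT-struct: it only enumerates combinations of rows depth-first, intersects their item sets, and prunes branches whose intersection is empty. The function name `mine_closed_patterns` and the toy dataset are hypothetical and chosen purely for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Optional


def mine_closed_patterns(
    rows: Iterable[Iterable[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Naive row enumeration: every closed pattern is the intersection
    of some set of rows, so we walk row combinations depth-first."""
    row_sets: List[FrozenSet[str]] = [frozenset(r) for r in rows]
    closed: Dict[FrozenSet[str], int] = {}

    def extend(next_row: int, pattern: Optional[FrozenSet[str]]) -> None:
        for i in range(next_row, len(row_sets)):
            # Intersect the current pattern with row i (or start from row i).
            new_pattern = row_sets[i] if pattern is None else pattern & row_sets[i]
            if not new_pattern:
                # Empty intersection: nothing below this node can be a pattern.
                continue
            if new_pattern not in closed:
                # Support = number of rows that contain the whole pattern.
                support = sum(1 for r in row_sets if new_pattern <= r)
                if support >= min_support:
                    closed[new_pattern] = support
            extend(i + 1, new_pattern)

    extend(0, None)
    return closed


# Toy data (hypothetical): 4 samples, items stand in for expressed genes.
toy_rows = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
patterns = mine_closed_patterns(toy_rows, min_support=2)
for items, support in sorted(patterns.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(items), support)
```

Because the search depth is bounded by the number of rows rather than the number of columns, this style of enumeration suits datasets with few samples and many genes. The sketch revisits row combinations that a real algorithm would prune, so it should be read as an illustration of the search space only.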
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2008, Issue 2, pp. 274-278 (5 pages)
Funding
Supported by the Key Program of the National Natural Science Foundation of China (Grant No. 60533020).