To reduce the computational and space cost of rerunning a sequential pattern query, this paper proposes FISP, a fast interactive sequential pattern mining algorithm based on previously mined sequential patterns and on projection databases. Because mining builds on the previously mined sequences, the projection databases that FISP constructs contain fewer frequent items. Furthermore, using a global threshold greatly reduces the number of iterations the algorithm runs. Experimental results show that FISP outperforms PrefixSpan in interactive mining.
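The abstract does not give FISP's internals, so the following is only a minimal sketch of the projection database idea that FISP shares with PrefixSpan, not the FISP algorithm itself. It assumes each sequence is a plain list of single items (no itemsets); the function names `frequent_items`, `project`, and `prefixspan` are hypothetical.

```python
from collections import Counter


def frequent_items(db, min_support):
    """Count each item once per sequence and keep those meeting min_support."""
    counts = Counter()
    for seq in db:
        counts.update(set(seq))
    return {item: c for item, c in counts.items() if c >= min_support}


def project(db, item):
    """Build the projection database: the suffix after the first occurrence of item."""
    projected = []
    for seq in db:
        if item in seq:
            suffix = seq[seq.index(item) + 1:]
            if suffix:  # empty suffixes cannot extend the prefix
                projected.append(suffix)
    return projected


def prefixspan(db, min_support, prefix=()):
    """Recursively grow prefixes; returns {pattern tuple: support count}."""
    patterns = {}
    for item, count in sorted(frequent_items(db, min_support).items()):
        new_prefix = prefix + (item,)
        patterns[new_prefix] = count
        patterns.update(prefixspan(project(db, item), min_support, new_prefix))
    return patterns


if __name__ == "__main__":
    toy_db = [list("abc"), list("acb"), list("ab"), list("bc")]
    for pattern, count in prefixspan(toy_db, min_support=2).items():
        print(pattern, count)
```

In this baseline every query with a new threshold reruns the full recursion from the original database; that rerun is the cost the abstract says FISP reduces by reusing previously mined sequences.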
Association rule mining is an important issue in data mining. The paper proposes a binary-based method that efficiently generates candidate frequent itemsets and their support counts using only operations such as "and", "or", and "xor". Applying this idea to the existing distributed association rule mining algorithm FDM yields the improved algorithm BFDM. Theoretical analysis and experiments show that BFDM is effective and efficient.
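The abstract does not spell out the binary encoding, so the sketch below assumes one plausible reading: each item is assigned a bit position, and transactions and itemsets are stored as integer bitmasks. The helpers `encode`, `support`, and `join` are hypothetical names, and this is not the BFDM algorithm itself.

```python
def encode(items, index):
    """Turn a collection of items into a bitmask using an item -> bit position map."""
    mask = 0
    for item in items:
        mask |= 1 << index[item]
    return mask


def support(candidate, transactions):
    """Count transactions containing every item of the candidate ("and" test)."""
    return sum(1 for t in transactions if t & candidate == candidate)


def join(a, b):
    """Join two k-itemsets into a (k+1)-itemset candidate.

    Two k-itemsets share k-1 items exactly when their "xor" has two set bits;
    the candidate is then their "or". Returns None otherwise.
    """
    if bin(a ^ b).count("1") == 2:
        return a | b
    return None


if __name__ == "__main__":
    index = {"a": 0, "b": 1, "c": 2}
    transactions = [encode(t, index) for t in ("abc", "ab", "ac", "bc")]
    ab = join(encode("a", index), encode("b", index))
    print(bin(ab), support(ab, transactions))
```

With this representation a support check is one "and" and one comparison per transaction, regardless of how many items the candidate holds.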
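Data provenance describes the mechanisms and processes by which data are produced and evolve, and it is important for data quality assessment, data recovery, and data analysis. As data sharing deepens, the need to share provenance workflows, the main structural representation of data provenance, is becoming increasingly urgent. The node modules in a provenance workflow, and the temporal relations between nodes, may involve the data owner's privacy, so sharing them inevitably raises privacy protection problems. Existing work focuses on preserving the local mapping relations of provenance workflows. It is weak at preserving workflow temporal constraint relations, an important aspect of provenance workflow utility, and it does not protect the privacy of the directed degree distribution of adjacent workflow nodes. To address these problems, the paper introduces the Input and Output Degree Sequence with Scale i (IO-iD) model, which describes the node degree distribution of a provenance workflow while also capturing the workflow's directional characteristics. It proposes the Previous-Next temporal sequence structure to describe the substructure features of a node and its adjacent nodes. On this basis, it proposes DpriPP, a differential-privacy-based algorithm for publishing privacy-preserving provenance workflows, which achieves publication with weak dependence on background knowledge while effectively preserving the utility of workflow temporal dependencies. Theoretical analysis and experimental results show that the proposed algorithm protects the privacy of the directed degree distribution of locally adjacent nodes while effectively maintaining the utility of the local and global temporal dependencies among workflow nodes.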
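The abstract does not define the IO-iD model or DpriPP in detail, so the following is only a generic sketch of the standard Laplace mechanism applied to the in/out degrees of a directed workflow graph. It is not the DpriPP algorithm. It assumes edge-level neighbouring graphs, for which adding or removing one edge changes one out-degree and one in-degree by 1 (L1 sensitivity 2); the function names and the `epsilon` parameter are hypothetical.

```python
import random


def in_out_degrees(edges):
    """Return {node: (in_degree, out_degree)} for a list of directed edges."""
    degrees = {}
    for src, dst in edges:
        in_s, out_s = degrees.get(src, (0, 0))
        degrees[src] = (in_s, out_s + 1)
        in_d, out_d = degrees.get(dst, (0, 0))
        degrees[dst] = (in_d + 1, out_d)
    return degrees


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_degrees(degrees, epsilon, sensitivity=2.0, rng=None):
    """Add Laplace noise with scale sensitivity / epsilon to every degree.

    sensitivity=2.0 is an assumption tied to edge-level neighbouring graphs.
    """
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return {
        node: (d_in + laplace_noise(scale, rng), d_out + laplace_noise(scale, rng))
        for node, (d_in, d_out) in degrees.items()
    }


if __name__ == "__main__":
    workflow = [("a", "b"), ("a", "c"), ("b", "c")]
    print(noisy_degrees(in_out_degrees(workflow), epsilon=1.0, rng=random.Random(0)))
```

A smaller `epsilon` gives stronger privacy but noisier degrees, which is the privacy/utility trade-off the abstract refers to.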
Funding: Supported by the National Natural Science Foundation of China (70371015) and the Natural Science Foundation of Jiangsu Province (BK2004058)
Funding: Supported by the National Natural Science Foundation of China (70371015)