Knowledge Reduction Algorithms in Cloud Computing

Cited by: 42
Abstract: Knowledge reduction is one of the central research topics in rough set theory. Classical knowledge reduction algorithms assume that the entire dataset can be loaded into main memory at once, which is infeasible for massive data. Massive, high-dimensional data makes attribute reduction a challenging task. To address this, starting from the discernibility and indiscernibility of attributes (attribute sets), the paper defines discernible and indiscernible object pairs, gives their properties, and explains their relationship to the discernibility matrix. It then uses MapReduce to design a method for computing equivalence classes in parallel, proposes data-parallel knowledge reduction algorithms for large-scale data, and discusses and implements three parallelism strategies. Experimental results show that knowledge reduction algorithms in a cloud computing environment are effective and feasible, scale well, and can process massive datasets efficiently on commodity computers.
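The parallel computation of equivalence classes mentioned in the abstract can be illustrated with a small single-process simulation of the map and reduce phases. This is a minimal sketch and not the authors' algorithm: the function names (`map_phase`, `reduce_phase`, `equivalence_classes`, `indiscernible_pair_count`), the toy decision table, and the way data splits are represented are all assumptions made for illustration. The idea is that objects with identical values on a chosen attribute subset share a key, so grouping by key yields the equivalence classes, and each class of size n contributes n(n-1)/2 indiscernible object pairs.

```python
from collections import defaultdict
from itertools import chain


def map_phase(split, attrs):
    """Map step (hypothetical): emit (key, object_id), where the key is the
    tuple of the object's values on the attribute subset `attrs`."""
    for obj_id, row in split:
        yield tuple(row[a] for a in attrs), obj_id


def reduce_phase(pairs):
    """Reduce step (hypothetical): group object ids that share a key.
    Each group is one equivalence class of the indiscernibility relation."""
    classes = defaultdict(list)
    for key, obj_id in pairs:
        classes[key].append(obj_id)
    return dict(classes)


def equivalence_classes(splits, attrs):
    """Run the map step over every data split, then reduce the merged output.
    A real MapReduce job would run the splits on different machines."""
    mapped = chain.from_iterable(map_phase(s, attrs) for s in splits)
    return reduce_phase(mapped)


def indiscernible_pair_count(classes):
    """A class with n objects contributes n * (n - 1) / 2 indiscernible pairs."""
    return sum(len(ids) * (len(ids) - 1) // 2 for ids in classes.values())


# Toy decision table, already divided into two splits (assumed data).
splits = [
    [(1, {"a": 0, "b": 1}), (2, {"a": 0, "b": 1})],
    [(3, {"a": 1, "b": 0}), (4, {"a": 0, "b": 0})],
]

classes_a = equivalence_classes(splits, ["a"])
classes_ab = equivalence_classes(splits, ["a", "b"])
print(classes_a, indiscernible_pair_count(classes_a))
print(classes_ab, indiscernible_pair_count(classes_ab))
```

In this toy table, adding attribute `b` splits the class of objects 1, 2, and 4, so fewer object pairs remain indiscernible. Because each split is mapped independently, only the grouped keys need to be brought together, which is what makes the step data parallel.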
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2011, No. 12, pp. 2332-2343 (12 pages).
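Funding: Supported by the National Natural Science Foundation of China (60970061, 61075056, 61103067), the Fundamental Research Funds for the Central Universities, and the Natural Science Fund Project of Jiangsu Provincial Universities (09KJD520004).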
Keywords: cloud computing; rough set; knowledge reduction; data parallel; MapReduce
  • Related literature

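References: 10; second-level references: 54; co-citing documents: 1374; co-cited documents: 484; citing documents: 42; second-level citing documents: 573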
