3 articles found
Application of Parallel-Computing-Based Big Data Mining in Power Grids (cited by: 3)
1
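Authors: Jia Jinwei (贾金伟), Wu Xupeng (吴旭鹏), +1 more author, Li Qiben (李启本), Dai Renjie (戴人杰). 《电力与能源》 (Power & Energy), 2017, No. 6, pp. 724-729, 6 pages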
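Parallel computing and cloud computing platforms have become important means of tackling big data mining. Parallel computing divides big data into independent smaller pieces that are computed separately. The paper describes common data mining approaches, including the distributed method and the MapReduce method. The distributed method manually partitions the big data into several subsets, processes each subset with a corresponding data mining algorithm, and merges the subset results to obtain the final result. The MapReduce method, running on a cloud computing platform, filters and sorts the data, splits the work into a number of map tasks, and finally aggregates them into the final output. Using four big data sets from the State Grid, the distributed and MapReduce methods are compared in terms of data mining accuracy and efficiency. Simulation results show that, except on class-imbalanced data sets, MapReduce clearly outperforms the baseline and the distributed computing mode.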
Keywords: data mining; parallel computing; cloud computing; distributed; MapReduce
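The partition, map, sort, and aggregate flow described in the abstract above can be sketched in a few lines. This is a minimal single-process illustration, not the paper's implementation: the function names, the label-counting task, and the sample records are all assumptions made for the example, and a real MapReduce job would run the map tasks on separate workers.

```python
from itertools import groupby

def partition(records, n_parts):
    """Split the records into n_parts subsets (round-robin)."""
    return [records[i::n_parts] for i in range(n_parts)]

def map_task(subset):
    """Map step: emit a (key, 1) pair for every record in one subset."""
    return [(label, 1) for label in subset]

def shuffle(pairs):
    """Sort the pairs by key and group the values that share a key."""
    ordered = sorted(pairs, key=lambda kv: kv[0])
    return {key: [v for _, v in group]
            for key, group in groupby(ordered, key=lambda kv: kv[0])}

def reduce_task(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

def map_reduce_count(records, n_parts=3):
    """Count how often each label occurs, MapReduce style."""
    mapped = []
    for subset in partition(records, n_parts):
        mapped.extend(map_task(subset))  # each call is an independent map task
    return reduce_task(shuffle(mapped))

# Hypothetical event labels, used only to exercise the sketch.
records = ["normal", "fault", "normal", "normal", "fault", "overload"]
counts = map_reduce_count(records)
```

Because each map task sees only its own subset, the subsets can be processed independently; only the shuffle and reduce steps need the combined intermediate pairs.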
An Improving Indexing Approach on Time Series Based on Minimum Bounding Rectangle
2
Authors: Sun Meiyu (孙梅玉), Tang Yang (唐漾), Fang Jian'an (方建安). 《Journal of Donghua University(English Edition)》 (indexed in EI, CAS), 2009, No. 1, pp. 75-79, 5 pages
A fundamental problem in whole sequence matching and subsequence matching is the representation of time series. In the last decade, many high-level representations of time series have been proposed for data mining, each involving a trade-off between accuracy and compactness. In this paper the author proposes a novel time series representation called Grid Minimum Bounding Rectangle (GMBR), which is based on the Minimum Bounding Rectangle and applies the binary idea to it. Experiments were performed on synthetic as well as real data sequences to evaluate the proposed method. They demonstrate that 69%-92% of irrelevant sequences are pruned using the proposed method.
Keywords: GMBR; representation; time series; data mining
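The abstract above does not spell out how GMBR is built, so the sketch below shows only the general idea it builds on: pruning irrelevant sequences with a plain per-segment bounding range. It is an illustrative assumption, not the GMBR method. All names are hypothetical, and series are assumed to have equal length. Each segment is summarised by its (min, max) range, and the distance from the query to that range never exceeds the true Euclidean distance, so a sequence whose lower bound exceeds the search radius can be discarded without reading its raw values.

```python
import math

def segment_ranges(series, seg_len):
    """Summarise each segment of the series by its (min, max) value range."""
    return [(min(series[i:i + seg_len]), max(series[i:i + seg_len]))
            for i in range(0, len(series), seg_len)]

def lower_bound(query, ranges, seg_len):
    """Distance from the query to the bounding ranges; never exceeds the true distance."""
    total = 0.0
    for i, q in enumerate(query):
        lo, hi = ranges[i // seg_len]
        gap = max(lo - q, q - hi, 0.0)  # zero when q lies inside [lo, hi]
        total += gap * gap
    return math.sqrt(total)

def euclidean(a, b):
    """Exact Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def range_search(query, database, seg_len, radius):
    """Return indices within the radius, plus how many series were pruned."""
    hits, pruned = [], 0
    for idx, series in enumerate(database):
        ranges = segment_ranges(series, seg_len)  # precomputed in a real index
        if lower_bound(query, ranges, seg_len) > radius:
            pruned += 1  # cannot match, skip the exact computation
            continue
        if euclidean(query, series) <= radius:
            hits.append(idx)
    return hits, pruned

# Hypothetical toy data, used only to exercise the sketch.
database = [[0, 0, 0, 0], [10, 10, 10, 10], [0, 1, 0, 1]]
hits, pruned = range_search([0, 0, 0, 0], database, seg_len=2, radius=1.5)
```

The pruning is safe because every original value lies inside its segment's range, so the gap to the range is a lower bound on the gap to the value itself.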
A REAL-TIME C-V CLUSTERING ALGORITHM FOR WEB-MINING
3
Authors: Li Haiying, Zhuang Zhenquan, Li Bin, Wan Ke (Dept. of Electronic S&T, University of Science and Technology of China, Hefei 230026). 《Journal of Electronics(China)》, 2002, No. 1, pp. 71-75, 5 pages
In this letter, a real-time C-V (Characteristic-Vector) clustering algorithm is proposed to handle the large volume of action data collected dynamically from a web site. The algorithm uses the concept of the C-V to denote characteristics, and it adopts binary (0/1) input and a user-defined vigilance parameter to design the clustering architecture. The Vector Degree of Matching (VDM), which determines the magnitude of a typical characteristic, plays a key role in the algorithm. Stability analysis confirms that the classifications have a reliably hierarchical structure as the vigilance parameter shifts from 0.1 to 0.99. This non-linear relation between the vigilance parameter and the classification upper limit helps mine representative classifications of net users according to the actual web resource. The administering system can then map them to the web resource space to implement intelligent configuration effectively and rapidly.
Keywords: clustering algorithm; characteristic vector; vector degree of matching
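The abstract above names a vigilance parameter and a Vector Degree of Matching but gives no formulas, so the following is only a rough sketch of vigilance-controlled online clustering on binary vectors. The matching score used here (shared 1-bits divided by 1-bits in either vector) and the prototype update (bitwise AND) are assumptions chosen for illustration, not the authors' VDM or C-V definitions, and all names are hypothetical.

```python
def match_degree(x, proto):
    """Assumed matching score: shared 1-bits over 1-bits present in either vector."""
    both = sum(a & b for a, b in zip(x, proto))
    either = sum(a | b for a, b in zip(x, proto))
    return both / either if either else 1.0

def cluster_stream(vectors, vigilance):
    """Assign each incoming binary vector to a cluster in a single pass."""
    prototypes, labels = [], []
    for x in vectors:
        best, best_score = None, -1.0
        for k, proto in enumerate(prototypes):
            score = match_degree(x, proto)
            if score > best_score:
                best, best_score = k, score
        if best is not None and best_score >= vigilance:
            # Close enough: join the cluster and keep only the shared bits.
            prototypes[best] = [a & b for a, b in zip(x, prototypes[best])]
            labels.append(best)
        else:
            # No prototype passes the vigilance test: open a new cluster.
            prototypes.append(list(x))
            labels.append(len(prototypes) - 1)
    return labels, prototypes

# Hypothetical binary action vectors, used only to exercise the sketch.
vectors = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
loose_labels, _ = cluster_stream(vectors, vigilance=0.5)
strict_labels, _ = cluster_stream(vectors, vigilance=0.9)
```

Raising the vigilance value makes the matching test stricter, so the same input splits into more, finer clusters; on this toy data that is two clusters at 0.5 and four at 0.9.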