1,365 articles found
Hybrid 1DCNN-Attention with Enhanced Data Preprocessing for Loan Approval Prediction
1
Authors: Yaru Liu, Huifang Feng. Journal of Computer and Communications, 2024, Issue 8, pp. 224-241 (18 pages)
To reduce the risk of non-performing loans and losses and to improve loan approval efficiency, an intelligent loan risk and approval prediction system is needed. A hybrid deep learning model that combines a 1DCNN-attention network with enhanced preprocessing techniques is proposed for loan approval prediction. The model consists of enhanced data preprocessing followed by a stack of multiple hybrid modules. First, the enhanced preprocessing combines standardization, SMOTE oversampling, feature construction, recursive feature elimination (RFE), information value (IV), and principal component analysis (PCA), which not only eliminates the effects of data jitter and class imbalance but also removes redundant features while improving feature representation. Next, a hybrid module that combines a 1DCNN with an attention mechanism is proposed to extract local and global spatio-temporal features. Finally, comprehensive experiments validate that the proposed model surpasses state-of-the-art baseline models across performance metrics including accuracy, precision, recall, F1 score, and AUC. The model helps automate the loan approval process and provides scientific guidance to financial institutions for loan risk control.
Keywords: Loan Approval Prediction; Deep Learning; One-Dimensional Convolutional Neural Network; Attention Mechanism; Data Preprocessing
Analysis and Mining of E-commerce Page Data Based on Improved K-means (Cited by: 4)
2
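Authors: Ye Hao (叶昊), Miao Yiheng (缪宜恒), Zhang Hongjun (张宏俊). Software (《软件》), 2023, Issue 6, pp. 35-43 (9 pages)
Data mining is a technology that uses the computing power of computers to replace part of manual analysis. In traditional data analysis, people use their own brains to analyze, reason about, and interpret data, but the amount of computation a human brain can carry is limited. Today the computing power of computers takes the place of the human brain: computers can handle create, read, update, and delete work that needs no independent thinking, and can sometimes also take on tasks that require self-learning ability, such as high-quality analysis and mining of web page data. To explore web page data analysis and mining further, this paper proposes an optimized method for computing sample distance, which in turn improves how the K-means algorithm computes its cluster centers. Specifically, the paper collects nearly 12,000 publicly available records from the common e-commerce site Dangdang (当当网) using "手机" (mobile phone) as the keyword and applies text mining to them. The text is fully preprocessed through cleaning, Chinese word segmentation, and keyword weight calculation, and K-means with optimized cluster centers is then used to uncover hidden information in seemingly unrelated data and to give e-commerce users market guidance.
Keywords: e-commerce pages; data mining; data preprocessing; Chinese text clustering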
Data preprocessing and preliminary results of the Moon-based Ultraviolet Telescope on the CE-3 lander (Cited by: 4)
3
Authors: Wei-Bin Wen, Fang Wang, Chun-Lai Li, Jing Wang, Li Cao, Jian-Jun Liu, Xu Tan, Yuan Xiao, Qiang Fu, Yan Su, Wei Zuo. Research in Astronomy and Astrophysics (SCIE, CAS, CSCD), 2014, Issue 12, pp. 1674-1681 (8 pages)
The Moon-based Ultraviolet Telescope (MUVT) is one of the payloads on the Chang'e-3 (CE-3) lunar lander. Because of the advantages of having no atmospheric disturbances and the slow rotation of the Moon, we can make long-term continuous observations of a series of important celestial objects in the near ultraviolet band (245-340 nm), and perform a sky survey of selected areas, which cannot be completed on Earth. We can find characteristic changes in celestial brightness with time by analyzing image data from the MUVT, and deduce the radiation mechanism and physical properties of these celestial objects after comparing with a physical model. In order to explain the scientific purposes of MUVT, this article analyzes the preprocessing of MUVT image data and makes a preliminary evaluation of data quality. The results demonstrate that the methods used for data collection and preprocessing are effective, and the Level 2A and 2B image data satisfy the requirements of follow-up scientific research.
Keywords: Chang'e-3 mission; the Moon-based Ultraviolet Telescope; data preprocessing; near ultraviolet band
SMK-means: An Improved Mini Batch K-means Algorithm Based on MapReduce with Big Data (Cited by: 1)
4
Authors: Bo Xiao, Zhen Wang, Qi Liu, Xiaodong Liu. Computers, Materials & Continua (SCIE, EI), 2018, Issue 9, pp. 365-379 (15 pages)
In recent years, big data technology has developed rapidly and attracted growing interest from scholars, and the problems of massive data storage and computation have largely been solved. At the same time, outlier detection in massive data has emerged as a problem, and more research has been devoted to it. However, the existing methods have high computation time. This paper proposes an improved outlier detection algorithm with higher detection performance. SMK-means is a fusion algorithm that combines Mini Batch K-means with simulated annealing for anomaly detection in massive household electricity data; it can determine the number of clusters, reduce the number of iterations, and improve clustering accuracy. Several experiments compare and analyze the algorithm on multiple performance measures, and the analysis indicates that the proposed algorithm is superior to the existing algorithms.
Keywords: big data; outlier detection; SMK-means; Mini Batch K-means; simulated annealing
Diabetes Type 2: Poincaré Data Preprocessing for Quantum Machine Learning (Cited by: 1)
5
Authors: Daniel Sierra-Sosa, Juan D. Arcila-Moreno, Begonya Garcia-Zapirain, Adel Elmaghraby. Computers, Materials & Continua (SCIE, EI), 2021, Issue 5, pp. 1849-1861 (13 pages)
Quantum Machine Learning (QML) techniques have recently been attracting massive interest. However, reported applications usually employ synthetic or well-known datasets. One of these techniques, based on a hybrid approach that combines quantum and classical devices, is the Variational Quantum Classifier (VQC), whose development seems promising. Although widely studied, VQC implementations for "real-world" datasets are still challenging on Noisy Intermediate-Scale Quantum (NISQ) devices. In this paper we propose a preprocessing pipeline based on Stokes parameters for data mapping. This pipeline enhances the prediction rates when applying VQC techniques, improving the feasibility of solving classification problems using NISQ devices. By including feature selection techniques and geometrical transformations, enhanced quantum state preparation is achieved. A representation based on the Stokes parameters in the Poincaré sphere is also possible for visualizing the data. Our results show that the proposed techniques improve the classification score for the incidence of acute comorbid diseases in Type 2 Diabetes Mellitus patients. We used the implementation of VQC available in IBM's Qiskit framework and obtained accuracies of 70% and 72% with two and three qubits, respectively.
Keywords: quantum machine learning; data preprocessing; Stokes parameters; Poincaré sphere
Polarimetric Meteorological Satellite Data Processing Software Classification Based on Principal Component Analysis and Improved K-Means Algorithm (Cited by: 1)
6
Authors: Manyun Lin, Xiangang Zhao, Cunqun Fan, Lizi Xie, Lan Wei, Peng Guo. Journal of Geoscience and Environment Protection, 2017, Issue 7, pp. 39-48 (10 pages)
As the variety of application software in meteorological satellite ground systems grows, increasing attention is being paid to how to provide reasonable hardware resources and improve software efficiency. This paper proposes a software classification method based on software operating characteristics. The method uses run-time resource consumption to describe how the software behaves while running. First, principal component analysis (PCA) is used to reduce the dimension of the software running feature data and to interpret the software characteristic information. Then a modified K-means algorithm is used to classify the meteorological data processing software. Finally, the clustering is combined with the results of the principal component analysis to explain the operating characteristics of each class of software. The result serves as the basis for optimizing the allocation of hardware resources to software and improving the efficiency of software operation.
Keywords: principal component analysis; improved K-means algorithm; meteorological data processing; feature analysis; similarity algorithm
DATA PREPROCESSING AND RE KERNEL CLUSTERING FOR LETTER
7
Authors: Zhu Changming, Gao Daqi. Journal of Electronics (China), 2014, Issue 6, pp. 552-564 (13 pages)
Many classifiers and methods have been proposed to deal with the letter recognition problem. Among them, clustering is a widely used method, but a single round of clustering is not adequate. Here, we adopt data preprocessing and a re kernel clustering method to tackle the letter recognition problem. To validate the effectiveness and efficiency of the proposed method, we introduce re kernel clustering into Kernel Nearest Neighbor classification (KNN), the Radial Basis Function Neural Network (RBFNN), and the Support Vector Machine (SVM). Furthermore, we compare re kernel clustering with one-time kernel clustering, which is denoted as kernel clustering for short. Experimental results validate that re kernel clustering forms fewer and more feasible kernels and attains higher classification accuracy.
Keywords: data preprocessing; kernel clustering; Kernel Nearest Neighbor (KNN); re kernel clustering
Power Data Preprocessing Method of Mountain Wind Farm Based on POT-DBSCAN
8
Authors: Anfeng Zhu, Zhao Xiao, Qiancheng Zhao. Energy Engineering (EI), 2021, Issue 3, pp. 549-563 (15 pages)
Because wind speed and wind direction change frequently, wind turbine (WT) power prediction based on traditional data preprocessing methods has low accuracy. This paper proposes a data preprocessing method that combines POT with DBSCAN (POT-DBSCAN) to improve the prediction efficiency of the wind power prediction model. First, using data from the WT under normal operating conditions, a WT power prediction model is established based on the Particle Swarm Optimization (PSO) algorithm combined with the BP Neural Network (PSO-BP). Second, the wind-power data obtained from the supervisory control and data acquisition (SCADA) system is preprocessed by the POT-DBSCAN method. Then the PSO-BP model predicts power from the preprocessed data. Finally, the necessity of preprocessing is verified with evaluation indexes. The case analysis shows that prediction after POT-DBSCAN preprocessing is better than after the Quartile method. Therefore, the accuracy of both the data and the prediction model can be improved by using this method.
Keywords: wind turbine; SCADA data; data preprocessing method; power prediction
D-IMPACT: A Data Preprocessing Algorithm to Improve the Performance of Clustering
9
Authors: Vu Anh Tran, Osamu Hirose, Thammakorn Saethang, Lan Anh T. Nguyen, Xuan Tho Dang, Tu Kien T. Le, Duc Luu Ngo, Gavrilov Sergey, Mamoru Kubo, Yoichi Yamada, Kenji Satou. Journal of Software Engineering and Applications, 2014, Issue 8, pp. 639-654 (16 pages)
In this study, we propose a data preprocessing algorithm called D-IMPACT inspired by the IMPACT clustering algorithm. D-IMPACT iteratively moves data points based on attraction and density to detect and remove noise and outliers, and separate clusters. Our experimental results on two-dimensional datasets and practical datasets show that this algorithm can produce new datasets such that the performance of the clustering algorithm is improved.
Keywords: attraction; clustering; data preprocessing; density; shrinking
A State of Art Analysis of Telecommunication Data by k-Means and k-Medoids Clustering Algorithms
10
Authors: T. Velmurugan. Journal of Computer and Communications, 2018, Issue 1, pp. 190-202 (13 pages)
Cluster analysis is one of the major data analysis methods and is widely used in many practical applications in emerging areas of data mining. A good clustering method produces high-quality clusters with high intra-cluster similarity and low inter-cluster similarity. Clustering techniques are applied in different domains to predict future trends from available data and their real-world uses. This research examines the performance of two of the most representative partition-based clustering algorithms, k-Means and k-Medoids. A state-of-the-art analysis of the two algorithms is implemented, and performance is assessed from the quality of their clustering results, their execution time, and other components. Telecommunication data is the source data for this analysis. Connection-oriented broadband data is given as input to measure the clustering quality of the algorithms, and the distance between the server locations and their connections is used for clustering. The execution time of each algorithm is analyzed and the results are compared with one another. The results of the comparison study are satisfactory for the chosen application.
Keywords: k-Means algorithm; k-Medoids algorithm; data clustering; time complexity; telecommunication data
Detecting Anomalies in Irregular Data Using K-means Clustered Signal Dictionary
11
Authors: G. Talavera Reyes, Rajan M. Chandra, Ha Thu Le, Zekeriya Aliyazicioglu. Computer Technology and Application, 2016, Issue 5, pp. 244-252 (9 pages)
The critical nature of satellite network traffic provides a challenging environment in which to detect intrusions. The intrusion detection method presented aims to raise an alert whenever satellite network signals begin to exhibit anomalous patterns, as determined by a Euclidean distance metric. In line with anomaly-based intrusion detection systems, the method relies heavily on building a model of "normal" through the creation of a signal dictionary using windowing and k-means clustering. The results for three signals from our case study are discussed to highlight the benefits and drawbacks of the method. Our preliminary results demonstrate that the clustering technique used has great potential for intrusion detection on non-periodic satellite network signals.
Keywords: intrusion detection; irregular data; k-means clustering; machine learning; signal dictionary
Application of Federated Learning Algorithm Based on K-Means in Electric Power Data
12
Authors: Weimin He, Lei Zhao. Journal of New Media, 2022, Issue 4, pp. 191-203 (13 pages)
Accurate electricity forecasting is the key basis for guiding the power sector in arranging operation plans and for guaranteeing the profitability of electric power companies. However, as enterprises and departments demand more data security, the "Isolated Data Island" phenomenon is becoming more serious, which reduces the accuracy of traditional electricity prediction models. Federated learning, an emerging artificial intelligence technology, is designed to ensure data privacy while carrying out efficient machine learning, and it offers a new way to solve the "Isolated Data Island" problem in electricity forecasting. Nonetheless, because smart meters are now widespread, the collected electricity data is unevenly distributed and very large, so an electricity prediction model produced by federated learning alone is difficult to apply in practice. To solve this problem, a clustering federated learning method (C-FL) is proposed to protect data privacy while improving the accuracy of power prediction. C-FL first uses the K-means algorithm to cluster power data locally within each power enterprise, and then builds accurate power forecasting models for each class of power data together with other local clients through federated learning. A large number of experimental results show that the proposed clustering federated learning method is superior to the existing federated learning models in the accuracy of electric power forecasting.
Keywords: electricity forecast; federated learning; K-means; data security
Research on the Application of the K-MEANS Algorithm in IDS (Cited by: 3)
13
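Authors: Li Lingjuan (李玲娟), Li Bing (李冰), Xue Ming (薛明). Computer Technology and Development (《计算机技术与发展》), 2010, Issue 7, pp. 129-131, F0003 (4 pages)
Clustering algorithms are widely used in data mining for intrusion detection systems (IDS). Although K-MEANS is one of the most classic clustering algorithms, IDS datasets have special characteristics, and running K-MEANS on them directly gives poor results. To improve the clustering accuracy of K-MEANS on IDS datasets, a data preprocessing method is introduced. The method standardizes the features of IDS records so that numeric features whose value ranges originally differ widely take values within the same interval, which removes the adverse effect of different measurement scales in the raw data and thereby improves the clustering. Simulation experiments show that the clustering accuracy of K-MEANS on the preprocessed IDS dataset improves considerably.
Keywords: data mining; intrusion detection system; K-means clustering; preprocessing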
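Note: the standardization step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which scaling formula the authors use; min-max scaling to the interval [0, 1] is assumed here, and the function name `min_max_scale` and the sample records are hypothetical.

```python
def min_max_scale(rows):
    """Rescale every numeric column of `rows` to the interval [0, 1].

    Assumption: min-max scaling stands in for the paper's unspecified
    standardization formula.
    """
    columns = list(zip(*rows))          # transpose rows into columns
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column carries no information; map it to 0.0.
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return scaled


# Hypothetical records: (connection duration, bytes sent). The second
# feature would dominate Euclidean distances if left unscaled.
records = [[0.0, 181.0], [2.0, 5450.0], [1.0, 239.0]]
scaled = min_max_scale(records)
```

After scaling, both features contribute on the same scale to the distances that K-means computes.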
Data Analysis and Predictive Modeling of Student Performance Based on the K-means Clustering Algorithm (Cited by: 5)
14
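Authors: Lü Ding (吕丁). Microcomputer Applications (《微型电脑应用》), 2021, Issue 5, pp. 148-150 (3 pages)
By analyzing and mining behavioral data on students' daily life, study, and activities, an improved K-means clustering algorithm is used to build a model of student performance categories, so that students can be classified according to their performance data. Six attributes are selected as feature evaluation indicators: moral education score, physical education score, intellectual education score, competition level, financial hardship level, and scholarship level. To address the problems caused by the large number of categories in university student management systems, such as duplicated data, missing data, and inconsistent storage types, the data are cleaned, integrated, and converted to a common storage format to obtain input that meets the requirements of the K-means algorithm.
Keywords: K-means clustering algorithm; student performance; data preprocessing; cluster centers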
An efficient enhanced k-means clustering algorithm (Cited by: 30)
15
Authors: FAHIM A.M, SALEM A.M, TORKEY F.A, RAMADAN M.A. Journal of Zhejiang University-Science A (Applied Physics & Engineering) (SCIE, EI, CAS, CSCD), 2006, Issue 10, pp. 1626-1633 (8 pages)
In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k, and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this paper, we present a simple and efficient clustering algorithm based on the k-means algorithm, which we call the enhanced k-means algorithm. The algorithm is easy to implement, requiring only a simple data structure that keeps some information from each iteration for use in the next. Our experimental results demonstrate that the scheme improves the computational speed of the k-means algorithm, reducing both the total number of distance calculations and the overall computation time.
Keywords: clustering algorithms; cluster analysis; k-means algorithm; data analysis
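Note: as a point of reference for the objective stated in this abstract, the following is a minimal sketch of the standard Lloyd-style k-means iteration. It is not the enhanced algorithm the paper proposes, whose details are not given in the abstract. All function names and the sample points are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]


def update(points, labels, centers):
    """Move each center to the mean of its members; keep empty ones in place."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            new_centers.append(old)
            continue
        new_centers.append(tuple(
            sum(m[d] for m in members) / len(members) for d in range(len(old))
        ))
    return new_centers


def kmeans(points, initial_centers, max_iter=100):
    """Alternate assignment and update steps until the centers stop moving."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(max_iter):
        new_centers = update(points, assign(points, centers), centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


def mean_squared_distance(points, centers, labels):
    """The objective from the abstract: mean squared distance to the nearest center."""
    total = sum(squared_distance(p, centers[l]) for p, l in zip(points, labels))
    return total / len(points)


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
centers, labels = kmeans(points, initial_centers=[(0, 0), (10, 0)])
objective = mean_squared_distance(points, centers, labels)
```

Every iteration of this baseline recomputes the distance from each point to each center, which is the cost the abstract says the enhanced scheme reduces.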
Application of Self-Organizing Feature Map Neural Network Based on K-means Clustering in Network Intrusion Detection (Cited by: 5)
16
Authors: Ling Tan, Chong Li, Jingming Xia, Jun Cao. Computers, Materials & Continua (SCIE, EI), 2019, Issue 7, pp. 275-288 (14 pages)
Due to the widespread use of the Internet, customer information is vulnerable to attacks on computer systems, which creates an urgent need for intrusion detection technology. Recently, network intrusion detection has become one of the most important technologies in network security. Existing methods have reached high detection accuracy, but they are very inefficient, even the most popular SOM neural network method. This paper proposes an efficient and fast network intrusion detection method. First, the fundamentals of the two underlying methods are introduced. Then a self-organizing feature map neural network based on K-means clustering (KSOM) is presented to improve the efficiency of network intrusion detection. Finally, NSL-KDD is used as the network intrusion data set to demonstrate that the KSOM method needs significantly fewer clustering iterations than the SOM method without substantially affecting the clustering results, and that its accuracy is much higher than that of the K-means method. The experimental results show that the method improves the accuracy of network intrusion detection and significantly reduces the number of clustering iterations.
Keywords: K-means clustering; self-organizing feature map neural network; network security; intrusion detection; NSL-KDD data set
Application of the k-means Algorithm in a Lottery Abnormal Transaction Detection System (Cited by: 3)
17
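Authors: Wang Xuan (王选), Zhao Peng (赵鹏). Fujian Computer (《福建电脑》), 2022, Issue 6, pp. 1-4 (4 pages)
Huge volumes of lottery transaction data conceal some abnormal transactions, which undermine the security of the lottery business and infringe the rights of lottery buyers. Quickly finding anomalous data points in large transaction data and handling them in real time has become an important open problem. This paper designs a DPA algorithm based on the idea of feature engineering so that candidate anomalous data points can be detected quickly. Experimental results show that the algorithm effectively improves detection precision and execution time.
Keywords: transactions; data preprocessing; anomaly detection; k-means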
A Tradeoff Between Accuracy and Speed for K-Means Seed Determination
18
Authors: Farzaneh Khorasani, Morteza Mohammadi Zanjireh, Mahdi Bahaghighat, Qin Xin. Computer Systems Science & Engineering (SCIE, EI), 2022, Issue 3, pp. 1085-1098 (14 pages)
With the sharp increase in information volume, analyzing and retrieving this vast amount of data is more essential than ever. One of the main techniques that helps in this regard is clustering. Clustering aims to group objects so that all objects within a cluster have similar features while objects in different clusters are as distinct as possible. One of the most widely used clustering algorithms, with well-proven performance in different applications, is the k-means algorithm. The main problem of the k-means algorithm is that its performance is directly affected by the selection of the initial clusters. Neglecting this issue has consequences such as creating empty clusters and slowing convergence. In addition, selecting appropriate initial seeds can reduce cluster inconsistency. In this paper, we present a new method to determine the initial seeds of the k-means algorithm in order to improve accuracy and decrease the number of iterations. The proposed method uses the average distance between objects to determine the initial seeds, and it attempts to provide a proper tradeoff between the accuracy and the speed of the clustering algorithm. The experimental results show that the proposed approach outperforms the Chithra method by 1.7% and 2.1% in clustering accuracy on the Wine and Abalone detection data, respectively. Furthermore, the results indicate that, compared with the Reverse Nearest Neighbor (RNN) search approach, the proposed method converges faster.
Keywords: data clustering; k-means algorithm; information retrieval; outlier detection; clustering accuracy; unsupervised learning
Mining Profitability of Telecommunication Customers Using K-Means Clustering
19
Authors: Hasitha Indika Arumawadu, R. M. Kapila Tharanga Rathnayaka, S. K. Illangarathne. Journal of Data Analysis and Information Processing, 2015, Issue 3, pp. 63-71 (9 pages)
Data mining is a powerful technique that can be widely used for discovering customers' behaviors and preferences. As a result, top-level companies now use it widely to evaluate their Customer Relationship Management (CRM) systems. In this study, a new K-means clustering method is proposed to evaluate and cluster customers' profitability in the telecommunication industry in Sri Lanka. The RFM model is used as the main input for K-means clustering, and a distortion curve is used to identify the optimal number of initial clusters. Based on the results, the profitability of telecommunication customers in Sri Lanka is categorized into three main levels.
Keywords: K-means clustering; data mining; RFM model; Customer Relationship Management
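Note: the abstract does not define how its RFM inputs are computed. The sketch below assumes the usual reading of RFM (recency in days since the last purchase, frequency as the number of transactions, monetary value as total spend). The function name `rfm_features` and the transaction records are hypothetical.

```python
from datetime import date


def rfm_features(transactions, today):
    """Build (recency_days, frequency, monetary) per customer.

    `transactions` is an iterable of (customer_id, purchase_date, amount).
    Assumption: this is the conventional RFM definition, not necessarily
    the exact variables used in the paper.
    """
    latest, count, total = {}, {}, {}
    for customer, day, amount in transactions:
        if customer not in latest or day > latest[customer]:
            latest[customer] = day
        count[customer] = count.get(customer, 0) + 1
        total[customer] = total.get(customer, 0.0) + amount
    return {
        c: ((today - latest[c]).days, count[c], total[c])
        for c in latest
    }


transactions = [
    ("a", date(2015, 1, 10), 20.0),
    ("a", date(2015, 3, 1), 30.0),
    ("b", date(2015, 2, 1), 5.0),
]
features = rfm_features(transactions, today=date(2015, 3, 11))
```

Each customer's three-value tuple can then serve as one input point for a clustering step; the three features are on different scales, so rescaling them first is usually advisable.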
Parallel K-Means Algorithm for Shared Memory Multiprocessors
20
Authors: Tayfun Kucukyilmaz. Journal of Computer and Communications, 2014, Issue 11, pp. 15-23 (9 pages)
Clustering is the task of assigning a set of instances into groups in such a way that the dissimilarity of instances within each group is minimized. Clustering is widely used in areas such as data mining, pattern recognition, machine learning, image processing, and computer vision. K-means is a popular clustering algorithm that partitions instances into a fixed number of clusters in an iterative fashion. Although k-means is considered a poor clustering algorithm in terms of result quality, its simplicity, speed in practical applications, and iterative nature led to its selection as one of the top 10 algorithms in data mining [1]. Parallelization of k-means has also been studied during the last two decades, and most of this work concentrates on shared-nothing architectures. With recent advances in GPU technology, implementations of the k-means algorithm on shared memory architectures have started to attract attention. However, to the best of our knowledge, no in-depth analysis of the performance of k-means on shared memory multiprocessors exists in the literature. In this work, our aim is to fill this gap by providing a theoretical analysis of the performance of the k-means algorithm and presenting extensive tests on a shared memory architecture.
Keywords: k-means; clustering; data mining; shared memory systems; high performance