10 articles found
Hierarchical hesitant fuzzy K-means clustering algorithm (Cited by: 21)
1
Authors: CHEN Na XU Ze-shui XIA Mei-mei 《Applied Mathematics (A Journal of Chinese Universities)》 SCIE CSCD 2014, No. 1, pp. 1-17 (17 pages)
Because of limited knowledge and hesitation, the membership degree of an element in a given set often takes several different values, a case that conventional fuzzy sets cannot handle. Hesitant fuzzy sets are a powerful tool for this case. This paper investigates a clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm, which takes the results of hierarchical clustering as the initial clusters. Two examples demonstrate the validity of the algorithm.
Keywords: 90B50; 68T10; 62H30; hesitant fuzzy set; hierarchical clustering; k-means clustering; intuitionistic fuzzy set
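The idea of seeding K-means with the output of hierarchical clustering can be sketched as follows. This is only an illustration on ordinary real-valued points with Euclidean distance, not the hesitant fuzzy distance measure used in the paper; the function names `centroid`, `agglomerate`, and `kmeans` are hypothetical, and centroid linkage is an assumed choice.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def agglomerate(points, k):
    """Assumed linkage: merge the two clusters with the closest centroids until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # j > i, so index i stays valid
    return clusters


def kmeans(points, centroids, max_iter=100):
    """Lloyd iterations starting from the given (hierarchical) centroids."""
    for _ in range(max_iter):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            groups[nearest].append(p)
        # Keep the old centroid if a group happens to be empty.
        updated = [centroid(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, groups


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
    seeds = [centroid(c) for c in agglomerate(data, 2)]
    print(kmeans(data, seeds))
```

Starting from hierarchical centroids makes the result deterministic, whereas random initialization can give different partitions from run to run.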
A Kernel Clustering Algorithm for Fast Training of Support Vector Machines
2
Authors: 刘笑嶂 冯国灿 《Journal of Donghua University (English Edition)》 EI CAS 2011, No. 1, pp. 53-56 (4 pages)
A new algorithm named kernel bisecting k-means and sample removal (KBK-SR) is proposed as sampling preprocessing for support vector machine (SVM) training to improve efficiency. The proposed algorithm tends to quickly produce balanced clusters of similar sizes in the kernel feature space, which makes it efficient and effective for reducing training samples. Theoretical analysis and experimental results on three UCI real-data benchmarks both show that, with very short sampling time, the proposed algorithm dramatically accelerates SVM sampling and training while maintaining high test accuracy.
Keywords: support vector machines (SVMs); sample reduction; top-down hierarchical clustering; kernel bisecting k-means
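A rough sketch of plain bisecting k-means, the top-down scheme that KBK-SR builds on, is given below. It works in the input space with Euclidean distance, so the kernel feature space and the sample-removal step described in the abstract are not implemented. The names `two_means` and `bisecting_kmeans`, the deterministic seeding, and the choice to always split the largest cluster are assumptions made for illustration.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def two_means(points, max_iter=50):
    """Split points into two groups with 2-means.

    Assumed seeding: the first point and the point farthest from it.
    """
    a = points[0]
    b = max(points, key=lambda p: dist(p, a))
    left, right = list(points), []
    for _ in range(max_iter):
        left = [p for p in points if dist(p, a) <= dist(p, b)]
        right = [p for p in points if dist(p, a) > dist(p, b)]
        if not right:  # all points coincide with the seeds; nothing to split
            break
        new_a, new_b = centroid(left), centroid(right)
        if (new_a, new_b) == (a, b):
            break
        a, b = new_a, new_b
    return left, right


def bisecting_kmeans(points, k):
    """Repeatedly bisect the largest cluster until k clusters exist."""
    clusters = [list(points)]
    while len(clusters) < k:
        clusters.sort(key=len)
        largest = clusters.pop()
        left, right = two_means(largest)
        if not right:  # cannot split further
            clusters.append(left)
            break
        clusters += [left, right]
    return clusters


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0), (20, 0), (20, 1), (21, 0)]
    print(bisecting_kmeans(data, 3))
```

Always splitting the largest cluster is one simple way to push toward clusters of similar size, which is the property the abstract highlights.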
Performances of Clustering Methods Considering Data Transformation and Sample Size: An Evaluation with Fisheries Survey Data
3
Authors: WO Jia ZHANG Chongliang XU Binduo XUE Ying REN Yiping 《Journal of Ocean University of China》 SCIE CAS CSCD 2020, No. 3, pp. 659-668 (10 pages)
Clustering is a group of unsupervised statistical techniques commonly used in many disciplines. When these techniques are applied to fish abundance data, many technical details need to be considered to ensure reasonable interpretation. However, the reliability and stability of clustering methods have rarely been studied in the context of fisheries. This study presents an intensive evaluation of three common clustering methods, hierarchical clustering (HC), K-means (KM), and expectation-maximization (EM), based on fish community surveys in the coastal waters of Shandong, China. We evaluated the performance of these three methods under different numbers of clusters, data sizes, and data transformation approaches, focusing on consistency validation using the index of average proportion of non-overlap (APN). The results indicate that the three methods tend to be inconsistent in the optimal number of clusters. EM performed relatively better in avoiding unbalanced classification, whereas HC and KM provided more stable clustering results. Data transformation, including scaling, square-root, and log-transformation, had a substantial influence on the clustering results, especially for KM. Moreover, transformation also influenced clustering stability, wherein scaling tended to provide a stable solution at the same number of clusters. The APN values indicated improved stability with increasing data size, and the effect leveled off beyond 70 samples in general and most quickly in EM. We conclude that the best clustering method can be chosen depending on the aim of the study and the number of clusters. In general, KM was relatively robust in our tests. We also provide recommendations for future application of clustering analyses. This study helps ensure the credibility of the application and interpretation of clustering methods.
Keywords: hierarchical cluster; k-means cluster; expectation-maximization cluster; optimal number of clusters; stability; data transformation
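The three transformations named in the abstract can be illustrated on a single column of abundance values. This is a minimal sketch under stated assumptions: the abstract does not give formulas, so the log transform is taken as log(x + 1) (a common choice when counts include zeros) and scaling is taken as a z-score using the population standard deviation. The function names are hypothetical.

```python
from math import log1p, sqrt
from statistics import mean, pstdev


def sqrt_transform(column):
    """Square-root transform; dampens the influence of very abundant species."""
    return [sqrt(x) for x in column]


def log_transform(column):
    """Assumed form log(x + 1), so that zero counts map to zero."""
    return [log1p(x) for x in column]


def scale(column):
    """Assumed z-score scaling: zero mean, unit (population) standard deviation."""
    m, s = mean(column), pstdev(column)
    # A constant column has no spread; map it to zeros instead of dividing by 0.
    return [(x - m) / s if s else 0.0 for x in column]


if __name__ == "__main__":
    counts = [0, 3, 12, 150, 7]
    for transform in (sqrt_transform, log_transform, scale):
        print(transform.__name__, transform(counts))
```

Because K-means relies on Euclidean distance, a column dominated by a few large counts can drive the partition, which is consistent with the abstract's remark that transformation matters most for KM.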
Comparison of Clustering Methods in Yeast Saccharomyces Cerevisiae
4
Authors: Wen Wang Ni-Ni Rao Xi Chen Shang-Lei Xu 《Journal of Electronic Science and Technology》 CAS 2010, No. 2, pp. 178-182 (5 pages)
In recent years, microarray technology has been widely applied in biological and clinical studies for simultaneous monitoring of gene expression in thousands of genes. Gene clustering analysis is useful for discovering groups of correlated genes that are potentially co-regulated or associated with the disease or conditions under investigation. Many clustering methods, including k-means, fuzzy c-means, and hierarchical clustering, have been widely used in the literature. Yet no comprehensive comparative study has been performed to evaluate the effectiveness of these methods, especially in the yeast Saccharomyces cerevisiae. In this paper, these three gene clustering methods are compared. Classification accuracy and CPU time cost are used to measure the performance of the algorithms. Our results show that hierarchical clustering outperforms k-means and fuzzy c-means clustering. The analysis provides insight into the complicated problem of clustering gene expression profiles and serves as a practical guideline for routine microarray cluster analysis of gene expression.
Keywords: fuzzy c-means; hierarchical clustering; k-means; yeast Saccharomyces cerevisiae
A Clustering Approach for Customer Billing Prediction in Mall: A Machine Learning Mechanism
5
Authors: Sriramakrishnan Chandrasekaran Abhishek Kumar 《Journal of Computer and Communications》 2019, No. 3, pp. 55-66 (12 pages)
Machine learning is widely applied in science and technology, especially in the medical field. In this article, we focus on applying machine learning to mall customers, based on their income and how much they can spend on purchases in a mall. The dataset includes the features Customer ID, gender, age, income, and spending score, where the spending score rates the customer's purchases in the mall. We apply clustering mechanisms to the public mall-customers dataset and create clusters related to customer purchases. We implement machine learning models to predict whether a visiting customer will purchase a product or not. This kind of work requires many inputs, such as the features mentioned in the paper, and a model with machine learning capability to handle them. We perform K-Means clustering and Hierarchical clustering, and finally use a confusion matrix to identify which of the two algorithms achieves the higher accuracy. Here, we consider machine learning mechanisms to predict the category of the customer, that is, whether they will buy a product or not, based on the independent variables. This work presents a simple machine learning prediction model with which we can predict the category of the customer based on clustering. Before clustering, we do not know which group a customer belongs to; after clustering, we can identify the category that each data node belongs to. In this article, we describe the process of determining the employee based information using machine learning clustering mechanisms.
Keywords: clustering; machine learning; category; technology; hierarchical; k-means
A Direct Data-Cluster Analysis Method Based on Neutrosophic Set Implication
6
Authors: Sudan Jha Gyanendra Prasad Joshi Lewis Nkenyereya Dae Wan Kim Florentin Smarandache 《Computers, Materials & Continua》 SCIE EI 2020, No. 11, pp. 1203-1220 (18 pages)
Raw data are classified using clustering techniques in a reasonable manner to create disjoint clusters. Many clustering algorithms based on specific parameters have been proposed to access a high volume of datasets. This paper focuses on cluster analysis based on neutrosophic set implication, i.e., a k-means algorithm with a threshold-based clustering technique. This algorithm addresses the shortcomings of the k-means clustering algorithm by overcoming the limitations of the threshold-based clustering algorithm. To evaluate the validity of the proposed method, several validity measures and validity indices are applied to the Iris dataset (from the University of California, Irvine, Machine Learning Repository) along with k-means and threshold-based clustering algorithms. The proposed method results in more segregated datasets with compacted clusters, thus achieving higher validity indices. The method also eliminates the limitations of the threshold-based clustering algorithm and validates measures and respective indices along with k-means and threshold-based clustering algorithms.
Keywords: data clustering; data mining; neutrosophic set; k-means; validity measures; cluster-based classification; hierarchical clustering
Analyzing the Urban Hierarchical Structure Based on Multiple Indicators of Economy and Industry: An Econometric Study in China (Cited by: 1)
7
Authors: Jing Cheng Yang Xie Jie Zhang 《Computer Modeling in Engineering & Sciences》 SCIE EI 2022, No. 6, pp. 1831-1855 (25 pages)
For a city, analyzing its advantages, disadvantages, and level of economic development within a country is important, especially for cities in China, which are developing rapidly. The existing literature on cities in China has not considered the indicators of economy and industry in detail. In this paper, based on multiple indicators of economy and industry, the urban hierarchical structure of 285 cities above the prefecture level in China is investigated. Indicators of the economy, industry, infrastructure, medical care, population, education, culture, and employment levels are selected to establish a new indicator system for analyzing urban hierarchical structure. The factor analysis method is used to investigate the relationship between the variables of the selected indicators and to obtain the score of each common factor, as well as comprehensive scores and rankings, for the 285 cities. According to the comprehensive scores, the 285 cities are clustered into 15 levels using the K-means clustering algorithm. The hierarchical structure system of the cities above the prefecture level in China is then obtained and corresponding policy implications are proposed. The results and implications can be applied to urban planning and development in China and also offer a reference for other developing countries. The methodologies used in this paper can also be applied to study the urban hierarchical structure in other countries.
Keywords: urban planning; hierarchical structure; prefecture-level city; factor analysis method; k-means clustering algorithm; China
A Novel Clustering Technique for Efficient Clustering of Big Data in Hadoop Ecosystem (Cited by: 5)
8
Authors: Sunil Kumar Maninder Singh 《Big Data Mining and Analytics》 2019, No. 4, pp. 240-247 (8 pages)
Big data analytics and data mining are techniques used to analyze data and to extract hidden information. Traditional approaches to analysis and extraction do not work well for big data because this data is complex and of very high volume. A major data mining technique known as data clustering groups the data into clusters and makes it easy to extract information from these clusters. However, existing clustering algorithms, such as k-means and hierarchical clustering, are not efficient, as the quality of the clusters they produce is compromised. Therefore, there is a need to design an efficient and highly scalable clustering algorithm. In this paper, we put forward a new clustering algorithm called hybrid clustering in order to overcome the disadvantages of existing clustering algorithms. We compare the new hybrid algorithm with existing algorithms on the basis of precision, recall, F-measure, execution time, and accuracy of results. The experimental results show that the proposed hybrid clustering algorithm is more accurate and has better precision, recall, and F-measure values.
Keywords: clustering; Hadoop; big data; k-means; hierarchical
Classification of Hourly Clearness Index of Solar Radiation in the District of Yamoussoukro
9
Authors: Siaman Paule Carine Yeboua Yao N’Goran Kouakou Konan 《Energy and Power Engineering》 2019, No. 5, pp. 220-231 (12 pages)
Systems that use solar energy as their energy source are not free of fluctuations, because the brief passage of clouds affects solar radiation. The amplitude, persistence, and frequency of these fluctuations should be analyzed with appropriate tools, instead of focusing on their location over time. The analysis of these fluctuations should use the instantaneous clearness index, whose distribution is, to a first approximation, independent of both the season and the site. It is important to evaluate the potential solar energy in a region, since such an evaluation helps decision-makers in their reflections on agricultural or photovoltaic solar projects. This study was therefore conducted for a predictive purpose. The method used in our work combines a classification method, hierarchical ascending classification, with two partitioning methods, principal component analysis and the K-means method. The partitioning method made it possible to obtain a number of well-known situations (known in advance) that are representative of the day. The study was based on data from a climatic weather station in the district of Yamoussoukro, located in the central region of Côte d’Ivoire, during the year 2017. Using the clearness index, the study allowed the classification of the solar radiation in the region. It showed that only 346 of the 365 days in 2017 were classified (95%). We identified three clusters of days: cloudy sky (29%), partly cloudy sky (32%), and clear sky (39%). The statistical tests used for the characterization of these clusters will be detailed in a future study.
Keywords: clearness index; hierarchical clustering; principal component analysis; k-means method; classification
Hybrid Data Mining Models for Predicting Customer Churn
10
Authors: Amjad Hudaib Reham Dannoun Osama Harfoushi Ruba Obiedat Hossam Faris 《International Journal of Communications, Network and System Sciences》 2015, No. 5, pp. 91-96 (6 pages)
The term “customer churn” is used in the information and communication technology (ICT) industry to indicate those customers who are about to leave for a new competitor or end their subscription. Predicting this behavior is very important in real-life markets and competition, and it is essential to manage it. In this paper, three hybrid models are investigated to develop an accurate and efficient churn prediction model. The three models are based on two phases: the clustering phase and the prediction phase. In the first phase, customer data is filtered. The second phase predicts the customer behavior. The first model investigates the k-means algorithm for data filtering and Multilayer Perceptron Artificial Neural Networks (MLP-ANN) for prediction. The second model uses hierarchical clustering with MLP-ANN. The third one uses self-organizing maps (SOM) with MLP-ANN. The three models are developed based on real data, and then the accuracy and churn rate values are calculated and compared. The comparison with other models shows that the three hybrid models outperformed single common models.
Keywords: data mining; k-means; hierarchical cluster; self-organizing maps; multilayer perceptron; artificial neural networks; churn prediction