Funding: Supported by the National Natural Science Foundation of China (Grant No. 51475053).
Abstract: During condition monitoring of a planetary gearbox, features are extracted from raw data for fault diagnosis. However, different features differ in their sensitivity to different fault types, so selecting a sensitive feature subset from the entire feature set while retaining as much class-discriminatory information as possible has a direct effect on the accuracy of the classification results. In this paper, an improved hybrid feature selection technique (IHFST) that combines a distance evaluation technique (DET), Pearson’s correlation analysis, and an ad hoc technique is proposed. In IHFST, a temporary feature subset without irrelevant features is first selected according to the distance evaluation criterion of DET; Pearson’s correlation analysis and the ad hoc technique are then employed to find and remove, respectively, the redundant features in the temporary feature subset. The result is a sensitive feature subset without irrelevant or redundant features, selected from the entire feature set. The k-means clustering method is then applied to classify the different health conditions. The effectiveness of the proposed method was validated through several experiments carried out on a planetary gearbox with incipient cracks seeded in the tooth root of the sun gear, planet gear, and ring gear. The results show that the proposed method can successfully distinguish the different health conditions of a planetary gearbox and achieves better classification performance than other methods. This study proposes a sensitive feature subset selection method that achieves an obvious improvement in fault classification accuracy.
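The two-stage selection described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' IHFST: the abstract does not give the DET criterion, so `distance_score` uses an assumed between-class to within-class distance ratio, the thresholds `score_min` and `corr_max` are hypothetical, and the unspecified ad hoc technique is replaced by simply dropping the lower-scoring feature of a highly correlated pair.

```python
from math import sqrt
from statistics import mean


def distance_score(values, labels):
    """Between-class / within-class distance ratio for ONE feature.

    Simplified stand-in for a distance evaluation criterion (assumption);
    requires at least two classes.
    """
    classes = sorted(set(labels))
    groups = [[v for v, l in zip(values, labels) if l == c] for c in classes]
    centers = [mean(g) for g in groups]
    # Average absolute deviation of samples from their own class center.
    within = mean([mean([abs(v - c) for v in g]) for g, c in zip(groups, centers)])
    # Average absolute distance between every pair of class centers.
    between = mean([abs(a - b) for i, a in enumerate(centers) for b in centers[i + 1:]])
    return between / within if within > 0 else float("inf")


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def select_features(columns, labels, score_min=1.0, corr_max=0.95):
    """columns maps a feature name to its list of values; returns kept names."""
    scores = {name: distance_score(col, labels) for name, col in columns.items()}
    # Stage 1: discard irrelevant features (low distance score).
    candidates = sorted((n for n in scores if scores[n] >= score_min),
                        key=lambda n: -scores[n])
    # Stage 2: discard features strongly correlated with a better-scoring one.
    kept = []
    for name in candidates:
        if all(abs(pearson(columns[name], columns[k])) < corr_max for k in kept):
            kept.append(name)
    return kept
```

With two classes, a discriminative feature, a near-copy of it, and a feature whose class means coincide, only the first one survives both stages.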
Abstract: To ensure that the large-scale application of photovoltaic (PV) power generation does not affect the stability of the grid, accurate PV power generation forecasting is essential. A short-term PV power generation forecast method is proposed that combines K-means++, grey relational analysis (GRA), and support vector regression (SVR) based on feature selection (Hybrid Kmeans-GRA-SVR, HKGSVR). The historical power data were clustered with the multi-index K-means++ algorithm and divided into ideal and non-ideal weather. The GRA algorithm was used to match the similar day and the nearest-neighbor similar day of the prediction day, and appropriate input features were selected for each weather type to train the SVR model. Under ideal weather, the average values of MAE, RMSE and R² were 0.8101, 0.9608 kW and 99.66%, respectively, and the method reduced the average training time by 77.27% compared with the standard SVR model. Under non-ideal weather, the average values of MAE, RMSE and R² were 1.8337, 2.1379 kW and 98.47%, respectively, and the method reduced the average training time by 98.07% compared with the standard SVR model. The experimental results show that the prediction accuracy of the proposed model is significantly improved compared with the other five models, which verifies the effectiveness of the method.
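The similar-day matching step can be sketched with the standard (Deng) grey relational grade. This is an illustration rather than the paper's implementation: the function names, the distinguishing coefficient `rho = 0.5`, and the toy day profiles are assumptions, and the inputs are taken to be already normalized to a common scale.

```python
def grey_relational_grades(reference, candidates, rho=0.5):
    """Grey relational grade of each candidate sequence against the reference."""
    deltas = [[abs(r - c) for r, c in zip(reference, cand)] for cand in candidates]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:  # every candidate is identical to the reference
        return [1.0 for _ in candidates]
    grades = []
    for row in deltas:
        # Grey relational coefficient at each point, then its average.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades


def most_similar_day(target, history, rho=0.5):
    """history maps a day label to its feature sequence; returns (label, grade)."""
    names = list(history)
    grades = grey_relational_grades(target, [history[n] for n in names], rho)
    return max(zip(names, grades), key=lambda pair: pair[1])
```

A grade close to 1 means the historical day tracks the prediction day closely; the day with the highest grade would be used as the similar day.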
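Abstract: The Bag of Features algorithm is introduced into the field of automobile image recognition, and a method is proposed that combines the DoG (Difference of Gaussian) feature extraction algorithm with the PLSA classification algorithm to classify vehicle and background images. First, the DoG feature extraction algorithm is used to extract image features. These features are then clustered to generate a codebook, and each image is described by a histogram. Finally, a PLSA classifier is designed to classify vehicle images and background images. Experiments compare the algorithm with the Tamura texture feature algorithm and the Gabor texture feature algorithm on vehicle image recognition. The results show that the classification accuracy of the proposed algorithm is better than that of the other two methods.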
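The histogram description step of a Bag of Features pipeline can be sketched as follows. The sketch assumes that local descriptors and a codebook (cluster centers) are already available as plain lists of numbers; DoG extraction, the clustering that builds the codebook, and the PLSA classifier are not shown, and the function names are hypothetical.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the descriptor (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[i])),
    )


def bof_histogram(descriptors, codebook):
    """Normalized histogram of codeword occurrences for one image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts)
    # An image without descriptors yields an all-zero histogram.
    return [c / total for c in counts] if total else [0.0] * len(codebook)
```

Each image is thereby reduced to a fixed-length vector regardless of how many local features it contains, which is what lets a classifier compare images directly.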
Abstract: Recent work has established that digital images of a human face, when collected with a fixed pose but under a variety of illumination conditions, possess discriminatory information that can be used in classification. In this paper we perform classification on Grassmannians to demonstrate that sufficient discriminatory information persists in feature patch (e.g., nose or eye patch) illumination spaces. We further employ the Karcher mean on the Grassmannians to demonstrate that this compressed representation can accelerate computations with a relatively minor sacrifice in performance. The combination of these two ideas introduces a novel perspective on face recognition.
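For reference, the Karcher mean mentioned in this abstract is usually defined as the minimizer of the summed squared geodesic distances. The formulation below uses the standard arc-length distance on the Grassmannian computed from principal angles; the symbols are introduced here for illustration, and the abstract does not state which distance the authors use.

```latex
% X, Y: n-by-k matrices with orthonormal columns spanning two subspaces.
% Principal angles come from the singular values of X^T Y:
\cos\theta_i = \sigma_i\!\left(X^{\mathsf{T}} Y\right), \qquad i = 1, \dots, k

% Geodesic (arc-length) distance on the Grassmannian Gr(k, n):
d\big(\mathcal{X}, \mathcal{Y}\big) = \Big( \sum_{i=1}^{k} \theta_i^{2} \Big)^{1/2}

% Karcher mean of N subspaces \mathcal{X}_1, \dots, \mathcal{X}_N:
\mu = \operatorname*{arg\,min}_{\mathcal{Y} \in \mathrm{Gr}(k, n)}
      \sum_{j=1}^{N} d\big(\mathcal{Y}, \mathcal{X}_j\big)^{2}
```

Replacing a set of illumination subspaces by their Karcher mean leaves one representative point per class, which is the sense in which the representation is compressed.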
Abstract: Feature selection is very important for obtaining meaningful and interpretable clustering results from a clustering analysis. In soil data clustering, the response of clustering performance to different feature subsets is not well understood. In the present paper, we analyzed the performance differences between the k-means, fuzzy c-means, and spectral clustering algorithms under different feature subsets of soil data sets. The experimental results demonstrated that the spectral clustering algorithm generally performed better than k-means and fuzzy c-means across the feature subsets. Feature subsets containing environmental attributes improved clustering performance more than those containing spatial attributes and produced more accurate and meaningful clustering results. Our results demonstrated that combining the spectral clustering algorithm with feature subsets containing environmental attributes rather than spatial attributes may be a better choice in applications of soil data clustering.
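As an illustration of one of the compared algorithms, the following is a minimal fuzzy c-means sketch together with a helper that restricts the data to a feature subset. It uses the textbook update rules with fuzzifier `m = 2`; the function names, the fixed iteration count, and the caller-supplied initial centers are assumptions and do not reflect the study's experimental setup.

```python
from math import dist


def project(points, columns):
    """Keep only the chosen feature columns (a feature subset)."""
    return [[p[c] for c in columns] for p in points]


def memberships(points, centers, m=2.0):
    """u[i][j]: degree to which point i belongs to cluster j (rows sum to 1)."""
    power = 2.0 / (m - 1.0)
    u = []
    for p in points:
        d = [dist(p, c) for c in centers]
        if 0.0 in d:  # point lies exactly on a center: crisp membership
            hits = [1.0 if x == 0.0 else 0.0 for x in d]
            u.append([h / sum(hits) for h in hits])
        else:
            u.append([1.0 / sum((dj / dk) ** power for dk in d) for dj in d])
    return u


def update_centers(points, u, m=2.0):
    """Each center is the mean of all points weighted by membership ** m."""
    dims = len(points[0])
    centers = []
    for j in range(len(u[0])):
        w = [row[j] ** m for row in u]
        total = sum(w)
        centers.append([sum(wi * p[k] for wi, p in zip(w, points)) / total
                        for k in range(dims)])
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    for _ in range(iterations):
        centers = update_centers(points, memberships(points, centers, m), m)
    return centers, memberships(points, centers, m)
```

Running the same routine on `project(points, columns)` for different column choices is one simple way to compare how feature subsets change the resulting partition.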
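Abstract: The K-means clustering algorithm determines the initial number of clusters randomly, and the large number of redundant features in the original data set reduces clustering accuracy, while the cuckoo search (CS) algorithm suffers from slow convergence and weak local search ability. To address these problems, a K-means clustering algorithm based on adaptive cuckoo optimization feature selection (DCFSK) is proposed. First, to improve the search speed and accuracy of the CS algorithm, an adaptive step-size factor is designed for the Lévy flight phase; to adjust the balance between the global and local search of the CS algorithm and accelerate its convergence, the discovery probability is adjusted dynamically. This yields an improved dynamic CS algorithm (IDCS), on the basis of which a feature selection algorithm combining dynamic CS (DCFS) is constructed. Second, to improve the accuracy of the traditional Euclidean distance, a weighted Euclidean distance is designed that accounts for the contributions of both samples and features to the distance computation; to determine how the optimal number of clusters is selected, weighted intra-cluster and inter-cluster distances are constructed from the improved weighted Euclidean distance. Finally, to overcome the defect that the traditional K-means objective function considers only the intra-cluster distance and ignores the inter-cluster distance, an objective function based on a median-based silhouette coefficient is proposed, and DCFSK is then designed. Experimental results show that IDCS achieves better results on all metrics across 10 benchmark test functions; compared with algorithms such as K-means and DBSCAN (Density-Based Spatial Clustering of Applications with Noise), DCFSK achieves the best clustering performance on 6 synthetic data sets and 6 UCI data sets.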
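Two ingredients named in this abstract, a weighted Euclidean distance and a silhouette-based objective, can be sketched as below. The abstract does not give the exact formulas, so this is an assumed simplification: the distance weights features only (the paper also weights samples), and the objective is taken to be the median of the ordinary per-sample silhouette values.

```python
from math import sqrt
from statistics import mean, median


def weighted_distance(a, b, weights):
    """Euclidean distance with one non-negative weight per feature."""
    return sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def silhouette_values(points, labels, weights):
    """Per-sample silhouette (b - a) / max(a, b) under the weighted distance."""
    values = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(
                    weighted_distance(p, q, weights))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            values.append(0.0)  # singleton cluster, or only one cluster overall
            continue
        a = mean(own)                                  # mean intra-cluster distance
        b = min(mean(d) for d in by_cluster.values())  # nearest other cluster
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def median_silhouette(points, labels, weights):
    """Assumed objective: larger values indicate a better partition."""
    return median(silhouette_values(points, labels, weights))
```

Because the silhouette compares intra-cluster with inter-cluster distance, an objective built on it rewards compact and well-separated clusters at the same time, and it can be evaluated for several cluster counts to pick one.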
Abstract: In the Chinese language, many nouns are used as verbs, and the same phenomenon exists in English. The paper covers the preponderance of nouns over verbs, the classification and development of noun-to-verb conversion, and the three features of noun-to-verb conversion.
Funding: Supported by the Natural Science Foundation of China under contract 61233007.
Abstract: A wind farm power prediction method based on an adaptive feature-weight entropy fuzzy clustering algorithm is proposed. Using the fuzzy clustering method, a large amount of historical data from a wind farm in Inner Mongolia is analyzed and classified. An adaptive entropy weight model for clustering is built, followed by a wind power prediction model based on adaptive entropy fuzzy clustering feature weights. Simulation results show that the proposed method can identify abnormal data, forecast more accurately, and compute quickly.
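The idea of deriving feature weights from entropy can be illustrated with the classical entropy weight method. This is a generic sketch, not the adaptive scheme of the paper, whose details the abstract does not give; it assumes non-negative feature values and at least two samples, and the function name is hypothetical.

```python
from math import log


def entropy_weights(rows):
    """rows: samples x features (non-negative). Returns weights summing to 1."""
    n = len(rows)
    n_features = len(rows[0])
    divergences = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        total = sum(column)
        # Share of each sample in the feature total; uniform if the column is all zero.
        probs = [v / total for v in column] if total > 0 else [1.0 / n] * n
        entropy = -sum(p * log(p) for p in probs if p > 0) / log(n)
        # Low entropy means the feature varies a lot across samples.
        divergences.append(max(0.0, 1.0 - entropy))
    s = sum(divergences)
    if s == 0:  # no feature carries any contrast: fall back to equal weights
        return [1.0 / n_features] * n_features
    return [d / s for d in divergences]
```

Weights obtained this way could then scale each feature inside the distance used by a fuzzy clustering routine, so that features which barely vary contribute little to cluster assignment.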