The classification of springtime water masses has an important influence on the hydrography, regional climate change and fisheries of the Taiwan Strait. Based on CTD profiling data from 58 stations collected in the western and southwestern Taiwan Strait during the spring cruise of 2019, we analyze the spatial distributions of temperature (T) and salinity (S) in the investigation area. Then, using the fuzzy cluster method combined with the T-S similarity number, we classify the waters of the investigation area into five water masses: the Minzhe Coastal Water (MZCW), the Taiwan Strait Mixed Water (TSMW), the South China Sea Surface Water (SCSSW), the South China Sea Subsurface Water (SCSUW) and the Kuroshio Branch Water (KBW). The MZCW appears in the near-surface layer along the western coast of the Taiwan Strait, showing low-salinity (<32.0) tongues near the Minjiang River Estuary and the mouth of Xiamen Bay. The TSMW covers most of the upper layer of the investigation area. The SCSSW is mainly distributed in the upper layer of the southwestern Taiwan Strait, beneath which lies the SCSUW. The KBW is a high-temperature (core value of 26.36 °C) and high-salinity (core value of 34.62) water mass located southeast of the Taiwan Bank and partially in the central Taiwan Strait.
In this research, an integrated classification method based on principal component analysis-simulated annealing genetic algorithm-fuzzy cluster means (PCA-SAGA-FCM) was proposed for the unsupervised classification of tight sandstone reservoirs that lack prior information and core experiments. A variety of evaluation parameters were selected, including lithology characteristic parameters, poro-permeability quality characteristic parameters, engineering quality characteristic parameters, and pore structure characteristic parameters. PCA was used to reduce the dimension of the evaluation parameters, and the low-dimensional data were used as input. The unsupervised classification of the tight sandstone reservoir was carried out by SAGA-FCM, and the characteristics of the reservoir in the different categories were analyzed and compared with the lithological profiles. The analysis of numerical simulations and actual logging data shows that: 1) compared with the FCM algorithm, SAGA-FCM has stronger stability and higher accuracy; 2) the proposed method can cluster the reservoir flexibly and effectively according to the degree of membership; 3) the results of the integrated reservoir classification match well with the lithologic profile, which demonstrates the reliability of the classification method.
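The PCA-then-cluster pipeline described above can be sketched as follows. This is only an illustration under stated assumptions: plain fuzzy c-means stands in for the SAGA-FCM optimizer (the simulated annealing and genetic search are not reproduced), the data are synthetic rather than logging data, and the function names `pca_reduce` and `fuzzy_c_means` are hypothetical.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project standardized data onto its leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions, sorted by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain FCM (not SAGA-FCM). Returns (centers, U); U[i, k] is the
    membership of sample k in cluster i."""
    rng = np.random.default_rng(seed)
    U = rng.random((n_clusters, X.shape[0]))
    U /= U.sum(axis=0)  # memberships of each sample sum to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W @ X) / W.sum(axis=1, keepdims=True)
        # Squared Euclidean distances, shape (n_clusters, n_samples).
        d2 = ((X[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Synthetic stand-in for the evaluation parameters: two groups, six features.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (50, 6)), rng.normal(5.0, 1.0, (50, 6))])

scores = pca_reduce(X, n_components=2)          # low-dimensional input
centers, U = fuzzy_c_means(scores, n_clusters=2)
labels = U.argmax(axis=0)                       # hard label = largest membership
```

Because `U` holds graded memberships, samples near a class boundary can be flagged by a small gap between their two largest memberships instead of being forced into a single class.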
An evaluation index is a prerequisite for the scientific evaluation of a public meteorological service. This paper aims to explore a technical method for determining and screening evaluation indicators. Based on public satisfaction survey data obtained in Wafangdian, China in 2010, this study investigates the suitability of the fuzzy clustering analysis method for establishing an evaluation index. Through quantitative analysis of multilayer fuzzy clustering of various evaluation indicators, correlation analysis indicates that if the clustering results are identical for two evaluation indicators in the same sub-evaluation layer, then one indicator can be removed, or the two indicators merged. For evaluation indicators in different sub-evaluation layers, although clustering reveals attribute correlations, these indicators may not be substituted for one another. Analysis of the applicability of the fuzzy clustering method shows that it plays a certain role in the establishment and correction of an evaluation index.
During power system black start after an accident, decomposing the power system into several independent partitions for parallel recovery helps to optimize resource allocation and accelerate the recovery process. With the fuzziness of black-start zone partitioning taken into account, a new algorithm based on fuzzy clustering analysis is presented. Characteristic indexes are extracted fully and accurately. The raw data matrix is made up of the electrical distances between every node and the black-start resources. The transitive closure method is used to obtain the dynamic clustering. Finally, the availability and feasibility of the proposed algorithm are verified on the New England 39-bus system.
Clustering analysis, which identifies unknown heterogeneous subgroups of a population (or a sample), has become increasingly popular along with machine learning techniques. Although many software packages run clustering analysis, there is a lack of packages that conduct clustering analysis within a structural equation modeling framework. The package gscaLCA, which is implemented in the R statistical computing environment, was developed for conducting clustering analysis and has been extended to latent variable modeling. More specifically, by applying both a fuzzy clustering (FC) algorithm and generalized structured component analysis (GSCA), the package gscaLCA computes membership prevalence and item response probabilities as posterior probabilities, which is applicable in mixture modeling such as latent class analysis in statistics. As a hybrid between data clustering in classification and the model-based mixture modeling approach, fuzzy clusterwise GSCA, denoted as gscaLCA, combines advantages of both methods: (1) soft partitioning from FC and (2) efficiency in estimating model parameters with the bootstrap method via resolution of the global optimization problem from GSCA. The main function, gscaLCA, works for both binary and ordered categorical variables. In addition, gscaLCA can be used for latent class regression. Visualization of the profiles of latent classes based on the posterior probabilities is also available in the package. This paper contributes a methodological tool, gscaLCA, with which applied researchers such as social scientists and medical researchers can apply clustering analysis in their research.
With the rapid development of the economy, the scale of the power grid is expanding. The number of power equipment items that constitute the power grid has become very large, which makes the state data of power equipment grow explosively. These multi-source heterogeneous data differ from one another, which leads to data variation in the process of transmission and preservation and thus produces bad information in the form of incomplete data. Therefore, research on data integrity has become an urgent task. This paper is based on the characteristics of random chance and the spatio-temporal difference of the system. According to the characteristics and data sources of the massive data generated by power equipment, a fuzzy mining model of power equipment data is established, and the data are divided into numerical and non-numerical data. The text data of power equipment defects are taken as the mining material. Then, an array-based Apriori algorithm is used for deep mining. The strong association rules in the incomplete data of power equipment are obtained and analyzed. Judging from the change trends of the NRMSE metric and classification accuracy, most of the filling methods combined with the two frameworks in this method show a relatively stable filling trend and do not fluctuate greatly as the missing rate grows. The experimental results show that the proposed algorithm model can effectively improve the filling effect of the existing filling methods on most data sets, and the filling effect fluctuates greatly with the increase of the missing rate; that is, as the missing rate increases, the improvement of the model over the existing filling methods is higher than 4.3%. Through the incomplete data clustering technology studied in this paper, a more innovative state assessment of smart grid reliability operation is carried out, which has good research value and reference significance.
The influence of major cultural practices, including different nitrogen application rates, population densities, transplanting leaf ages of seedlings, and water regimes, on rice canopy spectral reflectance was investigated. Results showed that increased nitrogen rates, water regimes and population densities and decreased seedling ages could enhance reflectance at NIR (near infrared) bands and reduce reflectance at visible bands. Using the reflectance of the green, red and NIR bands and the ratio index of 810-560 nm, the different types of rice could be distinguished by fuzzy cluster analysis.
A risk recognition model for preventing and monitoring Coronary Heart Disease (CHD) in the aged is proposed, based on the test results of four indexes: Low Density Lipoprotein (LDL), Total Cholesterol (TC), Triglyceride (TG) and age. People who took a health checkup at Shanghai Xinhua Hospital are classified into 3 groups, and each group is associated with a prevalence risk of contracting CHD. The fuzzy recognition method is then applied to evaluate the risk of CHD. The accuracy rate is up to 85%. The model is applicable not only to risk analysis in medicine but also to risk analysis in finance, insurance and other fields.
A new approach to glacier classification is suggested on the basis of fuzzy cluster analysis of cations in ice cores. Cations in an ice core act as a synthetic index that reflects both the local and the global climate. Fuzzy cluster analysis of long time series of cations in ice cores from five representative glaciers (from south to north) has been used to create a similarity scale matrix R among these glaciers. Accordingly, any change in R represents a change in environment and climate. This type of analysis can determine the degree of relatedness of samples (glaciers) according to a cluster level (λ). Fuzzy cluster analysis of cations in ice cores collected from Antarctica and the Qinghai-Tibetan Plateau indicates a drastic difference between the glaciers of these two regions.
Turbulent motion can be regarded as the superposition of fluctuations with different scales. Classifying turbulent scales quantitatively is of great theoretical and practical importance for better describing vortex motions of different scales, for studying the interaction among vortices of different scales, and for constructing better turbulence models. The mathematical method that carries out classification according to a given requirement is called cluster analysis. In this paper, the fuzzy cluster analysis method is used to study the classification of turbulent scales quantitatively under smooth and rough wall boundary conditions. Furthermore, the properties of and interactions among the various flow structures are also studied. The results help to give some insight into the properties and interactions of turbulent scales of all kinds in wall turbulent shear flow.
Thermally induced error is a very important source of machining error in machine tools. To compensate for thermally induced machining errors, a model of the relationship between the thermal field and the deformations is needed. The relationship can be deduced by virtue of FEM (Finite Element Method), ANN (Artificial Neural Network) or MRA (Multiple Regression Analysis). MRA relies on a thorough understanding of the temperature distribution of the machine tool. Although the MRA becomes more accurate as more temperatures are measured, too many temperatures hinder the analysis and calculation. It is therefore necessary to identify the key temperatures of the machine tool. The selection of key temperatures determines the efficiency and precision of MRA. Because of the complexity and the multi-input, multi-output structure of the relationships, the exact quantitative portions as well as the unclear portions must be taken into consideration together to improve the identification of key temperatures. In this paper, fuzzy cluster analysis was used to select the key temperatures. The essence of identifying the key temperatures is to group all temperatures by their degree of relatedness, and then to select one temperature from each group as its representative. Fuzzy cluster analysis can uncover the relationships between the thermal field and the deformations more truly and thoroughly. Fuzzy cluster analysis is cluster analysis based on fuzzy sets. Given U = {u_i | i = 0, ..., N}, in which u_i is a measured temperature, a fuzzy matrix R can be obtained. The transitive closure t(R) can be deduced from R. A fuzzy clustering of U is then conducted on the basis of t(R).
Based on the fuzzy cluster analysis discussed above, this paper identified the key temperatures of a horizontal machining center. The number of measured temperatures was reduced from 32 to 4, and multiple regression models relating the 4 temperatures to the thermal deformations of the spindle were then derived. The residual errors between the regression models and the measured deformations reached a satisfactorily low level. At the same time, the reduction in the number of temperature variables greatly improved the efficiency of measurement and analysis.
A fuzzy clustering analysis model based on the quotient space is proposed. Firstly, the conversion from coarse to fine granularity and the hierarchical structure are used to reduce the multidimensional samples. Secondly, the fuzzy compatibility relation matrix of the model is converted into a fuzzy equivalence relation matrix. Finally, the clustering dendrogram is generated according to the fuzzy equivalence relation matrix, which enables the dynamic selection of different thresholds and effectively solves the problem of cluster analysis of samples with multi-dimensional attributes.
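The step from a fuzzy compatibility relation to a fuzzy equivalence relation, followed by threshold selection, can be illustrated with a small sketch. It assumes the standard max-min transitive closure and λ-cuts; the quotient-space granularity reduction is not modeled, and the similarity matrix `R` and the function names are made up for illustration.

```python
import numpy as np


def max_min_compose(A, B):
    """Max-min composition: C[i, j] = max_k min(A[i, k], B[k, j])."""
    return np.minimum(A[:, :, None], B[None, :, :]).max(axis=1)


def transitive_closure(R):
    """Square R under max-min composition until it stops changing.
    For a reflexive, symmetric R the result is a fuzzy equivalence relation."""
    while True:
        R2 = max_min_compose(R, R)
        if np.array_equal(R2, R):
            return R
        R = R2


def lambda_cut_clusters(T, lam):
    """Group samples whose closure similarity is at least lam."""
    clusters, seen = [], set()
    for i in range(T.shape[0]):
        if i not in seen:
            members = [j for j in range(T.shape[0]) if T[i, j] >= lam]
            clusters.append(members)
            seen.update(members)
    return clusters


# Hypothetical fuzzy compatibility (reflexive, symmetric) matrix for 4 samples.
R = np.array([
    [1.0, 0.8, 0.2, 0.1],
    [0.8, 1.0, 0.6, 0.1],
    [0.2, 0.6, 1.0, 0.3],
    [0.1, 0.1, 0.3, 1.0],
])

T = transitive_closure(R)
print(lambda_cut_clusters(T, 0.7))  # [[0, 1], [2], [3]]
print(lambda_cut_clusters(T, 0.5))  # [[0, 1, 2], [3]]
```

Lowering the threshold merges clusters and never splits them, which is why one closure matrix yields the whole hierarchy of partitions.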
The fuzzy C-means clustering algorithm (FCM) is extended to the fuzzy kernel C-means clustering algorithm (FKCM) in order to perform cluster analysis effectively on data with diverse structures, such as non-hyperspherical data, data with noise, data with a mixture of heterogeneous cluster prototypes, asymmetric data, etc. Based on the Mercer kernel, the FKCM clustering algorithm is derived by combining the FCM algorithm with the kernel method. The results of experiments with synthetic and real data show that, in contrast to the FCM algorithm, the FKCM clustering algorithm is universal and can effectively analyze datasets with varied structures in an unsupervised manner. Kernel-based clustering can thus be expected to be one of the important research directions of fuzzy clustering analysis.
Cables that have been in service for over 20 years in Shanghai, a city with abundant surface water, have failed more frequently and caused various cable accidents. This calls for research on the insulation aging state of cables working in such special circumstances. We performed multi-parameter tests on samples from about 300 cable lines in Shanghai. The tests included water tree investigation, tensile testing, dielectric spectroscopy, thermogravimetric analysis (TGA), Fourier transform infrared spectroscopy (FTIR), and electrical aging tests. We then carried out regression analysis between every pair of test parameters. Moreover, through the two-sample t-test and analysis of variance (ANOVA) of each test parameter, we analyzed the influences of the cable-laying method and the sampling section, respectively, on the degradation of cable insulation. The test parameters that showed strong correlation in the regression analysis or significant differences in the t-test or ANOVA were selected as the ones identifying the XLPE cable insulation aging state. The thresholds for distinguishing insulation aging states were also obtained with the aid of statistical analysis and fuzzy clustering. Based on fuzzy inference, we established a cable insulation aging diagnosis model using the intensity transfer method. The results of the regression analysis indicate that the degradation of cable insulation accelerates as the degree of in-service aging increases. This validates the rule that the increase of microscopic imperfections in solid material enhances the dielectric breakdown strength. The results of the two-sample t-test and the ANOVA indicate that direct-buried cables are more sensitive to insulation degradation than duct cables.
This confirms that the tensile strength and breakdown strength are reliable functional parameters in cable insulation evaluations. A case study further indicates that the proposed diagnosis model based on the fuzzy inference can reflect the comprehensive aging state of cable insulation well, and that the cable service time has no correlation with the insulation aging state.
Sequence analysis technology under big data provides unprecedented opportunities for modern life science. A novel gene coding sequence identification method is proposed in this paper. Firstly, an improved short-time Fourier transform algorithm based on the Morlet wavelet is applied to extract the power spectrum of the DNA sequence. Then, a threshold determination method based on kernel fuzzy C-means clustering is used to combine the Signal to Noise Ratio (SNR) data of exons and introns into one sequence, classify the sequence into two types, and calculate the weighted sum of the two SNR cluster centers obtained, which gives the discrimination threshold. Finally, an exon interval endpoint identification algorithm based on the Takagi-Sugeno fuzzy identification model is presented, which trains the Takagi-Sugeno model, optimizes the model parameters with the Levenberg-Marquardt least squares method, completes the model and determines the fuzzy rules. To verify the effectiveness of the proposed method, example tests are conducted on typical gene sequence sample data.
Traditional clustering methods tend to converge slowly because of high data dimensionality and randomly set initial cluster centers. To address these problems, a novel method combining subtractive clustering with fuzzy C-means (FCM) clustering is proposed. In this method, the initial number of clusters and the initial cluster centers are obtained using subtractive clustering. On this basis, the clustering result is further optimized with FCM. In addition, the data dimension is reduced through the analytic hierarchy process (AHP) before the clustering calculation. To verify the effectiveness of the fusion algorithm, an example of enterprise credit evaluation is carried out. The results show that the fusion clustering algorithm is suitable for classifying high-dimensional data, and that it also speeds up processing and improves the visibility of the results. The method is therefore suitable for wider use.
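How subtractive clustering can supply the number of clusters and the initial centers before FCM refinement is sketched below. This is a simplified variant under stated assumptions: it follows the usual potential-and-subtraction scheme with a single stopping ratio, the values of `ra` and `stop_ratio` are illustrative, the data are synthetic, and the AHP dimension reduction is not included.

```python
import numpy as np


def subtractive_clustering(X, ra=0.5, stop_ratio=0.15):
    """Return candidate cluster centers chosen among the data points.
    ra is the neighbourhood radius on features rescaled to [0, 1]."""
    # Rescale each feature to [0, 1] so one radius applies to all dimensions.
    lo, hi = X.min(axis=0), X.max(axis=0)
    Z = (X - lo) / np.where(hi > lo, hi - lo, 1.0)
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    # Potential of a point: how densely it is surrounded by other points.
    potential = np.exp(-4.0 * d2 / ra**2).sum(axis=1)
    rb = 1.5 * ra  # wider radius for subtraction, discourages close centers
    first_peak = potential.max()
    chosen = []
    while True:
        c = int(potential.argmax())
        peak = potential[c]
        if peak < stop_ratio * first_peak:
            break  # remaining points are not dense enough to be centers
        chosen.append(c)
        # Remove the influence of the new center from every point.
        potential = potential - peak * np.exp(-4.0 * d2[c] / rb**2)
    return X[chosen]


# Synthetic data: three compact groups in two dimensions.
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([rng.normal(mu, 0.3, (40, 2)) for mu in means])

centers = subtractive_clustering(X)
print(centers.shape[0])  # number of clusters suggested for the FCM step
```

The returned rows can then replace the random initialization of an FCM run, which is the source of the faster convergence the abstract describes.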
To investigate the problem of judging the optimal partition matrix among several fuzzy partition matrices in the fuzzy partition space, which is determined by the various choices of cluster samples in the total sample space, two algorithms are proposed on the basis of the data analysis method of rough set theory: an information system discretization algorithm (algorithm 1) and a sample representativeness judging algorithm (algorithm 2). On the principle of the farthest distance, algorithm 1 transforms continuous data into a discrete form that can be handled by rough set theory. Taking the approximation precision as the criterion, algorithm 2 chooses the sample space with good representativeness. Hence, the clustering sample set used for inducing and computing the optimal partition matrix can be obtained. Several theorems are proposed to provide a strict theoretical foundation for the execution of the algorithm model. An applied example based on the new algorithm model is given, whose result verifies the feasibility of the new algorithm model.
Data analysis and automatic processing are often interpreted as knowledge acquisition. In many cases it is necessary to classify data or to find regularities in them. The results obtained in the search for regularities in intelligent data analysis applications are mostly represented with the help of IF-THEN rules. With these rules the following tasks are solved: prediction, classification, pattern recognition and others. Using different approaches (clustering algorithms, neural network methods, fuzzy rule processing methods), we can extract rules that characterize the data in an understandable language. This allows interpreting the data, finding relationships in the data and extracting new rules that characterize them. Knowledge acquisition in this paper is defined as the process of extracting knowledge from numerical data in the form of rules. The extraction of rules in this context is based on the K-means and fuzzy C-means clustering methods. With the assistance of the K-means clustering algorithm, rules are derived from trained neural networks. Fuzzy C-means is used in the fuzzy rule based design method. The rule extraction methodology is demonstrated on samples from Fisher's Iris flower data set. The effectiveness of the extracted rules is evaluated. The clustering and rule extraction methodology can be widely used in evaluating and analyzing various economic and financial processes.
Funding (Taiwan Strait water mass classification study): The National Natural Science Foundation of China under contract Nos. 42106005, 91958203, 41676131 and 41876155.
Funding (PCA-SAGA-FCM tight sandstone reservoir classification study): The National Natural Science Foundation of China (42174131) and the Strategic Cooperation Technology Projects of CNPC and CUPB (ZLZX2020-03).
Funding (public meteorological service evaluation index study): National Science Foundation of China (91637105, 41775048 and 41475041); National Key R&D Program of China (2018YFC1507800); Research on Tourism Traffic Meteorological Service Products in Heilongjiang Province (HQZD2017004).
Funding (gscaLCA package study): The Yonsei University Research Fund of 2021 (2021-22-0060).
Abstract: With the rapid development of the economy, the scale of the power grid is expanding. The amount of power equipment that makes up the grid is now very large, so the state data of power equipment grows explosively. These multi-source heterogeneous data differ from one another, which leads to data variation during transmission and storage and thus produces incomplete data. Research on data integrity has therefore become an urgent task. This paper is based on the characteristics of random chance and the spatio-temporal difference of the system. According to the characteristics and sources of the massive data generated by power equipment, a fuzzy mining model of power equipment data is established, and the data are divided into numerical and non-numerical data. The text data of power equipment defects are taken as the mining material, and an array-based Apriori algorithm is then used for deep mining. The strong association rules in incomplete power equipment data are obtained and analyzed. Judging from the trends of the NRMSE metric and the classification accuracy, most filling methods combined with the two frameworks in this method show a relatively stable filling trend and do not fluctuate greatly as the missing rate grows. The experimental results show that the proposed algorithm model can effectively improve the filling effect of existing filling methods on most data sets, and the filling effect fluctuates greatly with the increase of the missing rate; that is, as the missing rate increases, the improvement of the model over the existing filling methods is higher than 4.3%. Through the incomplete data clustering technology studied in this paper, a more innovative state assessment of reliable smart grid operation is carried out, which has good research value and reference significance.
Abstract: The influence of major cultural practices, including different nitrogen application rates, population densities, transplanting leaf ages of seedlings, and water regimes, on rice canopy spectral reflectance was investigated. Results showed that increased nitrogen rates, water regimes and population densities and decreased seedling ages could enhance reflectance at NIR (near infrared) bands and reduce reflectance at visible bands. The reflectance of the green, red and NIR bands together with the ratio index of 810-560 nm could distinguish the different types of rice by fuzzy cluster analysis,
Funding: Projects supported by the Swiss Re-Fudan Research Foundation and the Shanghai Key-point Science & Constructive project
Abstract: A risk recognition model for preventing and monitoring Coronary Heart Disease (CHD) in the aged is proposed. It is based on the test results of four indexes: Low Density Lipoprotein (LDL), Total Cholesterol (TC), Triglycerides (TG) and age. People who took a health checkup at Shanghai Xinhua Hospital are classified into 3 groups, and each group is associated with a prevalence risk of contracting CHD. The fuzzy recognition method is then applied to evaluate the risk of CHD. The accuracy rate is up to 85%. The model is applicable not only to risk analysis in medicine but also to risk analysis in finance, insurance and other fields.
Abstract: A new approach to glacier classification is suggested on the basis of fuzzy cluster analysis of cations in ice cores. Cations in an ice core act as a synthetic index that reflects both the local and the global climate. Fuzzy cluster analysis of long time series of cation data from five representative glacial ice cores (from south to north) has been used to create a similarity scale matrix R among these glaciers. Accordingly, any change in R represents a change in environment and climate. This type of analysis can determine the relatedness of samples (glaciers) according to a cluster level (λ). Fuzzy cluster analysis of cations in ice cores collected from Antarctica and the Qinghai-Tibetan Plateau indicates a drastic difference between the glaciers of these two regions.
Abstract: Turbulent motion can be regarded as the superposition of fluctuations with different scales. Determining the classification of turbulent scales quantitatively is of great theoretical and practical importance for better describing vortex motions of different scales, for studying the interaction among vortices of different scales, and for constructing better turbulence models. The mathematical method that carries out classification according to a given requirement is called cluster analysis. In this paper, the fuzzy cluster analysis method is used to study the classification of turbulent scales quantitatively under smooth and rough wall boundary conditions. Furthermore, the properties of and interactions among all kinds of flow structures are also studied. The results help to gain some insight into the properties and interactions of all kinds of turbulent scales in wall turbulent shear flow.
Abstract: Thermally induced error is a very important source of machining error in machine tools. To compensate for thermally induced machining errors, a model relating the thermal field to the deformations is needed. The relationship can be derived by virtue of FEM (Finite Element Method), ANN (Artificial Neural Network) or MRA (Multiple Regression Analysis). MRA rests on a full understanding of the temperature distribution of the machine tool. Although MRA becomes more accurate as more temperatures are measured, too many temperatures hinder the analysis and calculation, so it is necessary to identify the key temperatures of the machine tool. The selection of key temperatures determines the efficiency and precision of MRA. Because of the complexity and the multi-input, multi-output structure of the relationships, the exact quantitative portions and the unclear portions must be considered together to improve the identification of key temperatures. In this paper, fuzzy cluster analysis is used to select the key temperatures. The essence of identifying the key temperatures is to group all temperatures by their correlation and then to select one temperature from each group as its representative. Fuzzy cluster analysis can uncover the relationships between the thermal field and the deformations more truly and thoroughly. Fuzzy cluster analysis is cluster analysis based on fuzzy sets. Given U = {u_i | i = 0, ..., N}, in which u_i is a measured temperature, a fuzzy matrix R can be obtained. The transitive closure t(R) can be deduced from R, and a fuzzy clustering of U is then conducted on the basis of t(R). Based on the fuzzy cluster analysis discussed above, this paper identifies the key temperatures of a horizontal machining center. The number of measured temperatures was reduced from 32 to 4, and multiple regression models relating the 4 temperatures to the thermal deformations of the spindle were then built. The residual errors between the regression models and the measured deformations reached a satisfactorily low level. At the same time, reducing the number of temperature variables greatly improved the efficiency of measurement and analysis.
Abstract: A fuzzy clustering analysis model based on the quotient space is proposed. Firstly, the conversion from coarse to fine granularity and the hierarchical structure are used to reduce the multidimensional samples. Secondly, the fuzzy compatibility relation matrix of the model is converted into a fuzzy equivalence relation matrix. Finally, the clustering genealogy diagram is generated from the fuzzy equivalence relation matrix, which enables the dynamic selection of different thresholds and effectively solves the problem of cluster analysis of samples with multi-dimensional attributes.
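The conversion from a fuzzy compatibility (reflexive, symmetric) relation to a fuzzy equivalence relation, followed by threshold selection, can be sketched as below. This is a minimal illustration that assumes the usual max-min transitive closure by repeated squaring; the abstract does not state the exact procedure used, and the function names and the sample matrix in the usage note are hypothetical.

```python
def maxmin_compose(a, b):
    """Max-min composition of two square fuzzy relation matrices."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def transitive_closure(r):
    """Square R repeatedly (R -> R o R) until it no longer changes.

    Assumes R is reflexive and symmetric, so the fixed point is the
    fuzzy equivalence relation t(R).
    """
    current = [row[:] for row in r]
    while True:
        squared = maxmin_compose(current, current)
        if squared == current:
            return current
        current = squared


def lambda_cut_clusters(t, lam):
    """Group samples i and j whenever t[i][j] >= lam.

    t must already be a fuzzy equivalence relation, so each lambda-cut
    is an ordinary equivalence relation and the groups do not overlap.
    """
    clusters = []
    assigned = set()
    for i in range(len(t)):
        if i in assigned:
            continue
        group = [j for j in range(len(t)) if t[i][j] >= lam]
        assigned.update(group)
        clusters.append(group)
    return clusters
```

For a hypothetical 4-sample matrix with rows `[1.0, 0.8, 0.2, 0.1]`, `[0.8, 1.0, 0.4, 0.1]`, `[0.2, 0.4, 1.0, 0.6]`, `[0.1, 0.1, 0.6, 1.0]`, lowering the threshold from 0.8 to 0.6 to 0.4 merges the samples step by step, which is the information a clustering genealogy diagram displays.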
Abstract: The fuzzy C-means clustering algorithm (FCM) is extended to the fuzzy kernel C-means clustering algorithm (FKCM) to effectively perform cluster analysis on diverse structures, such as non-hyperspherical data, data with noise, data with a mixture of heterogeneous cluster prototypes, asymmetric data, etc. Based on the Mercer kernel, the FKCM clustering algorithm is derived from the FCM algorithm combined with the kernel method. The results of experiments with synthetic and real data show that, in contrast to the FCM algorithm, the FKCM clustering algorithm is general and can effectively analyze datasets with varied structures in an unsupervised manner. It can be expected that kernel-based clustering will be one of the important research directions in fuzzy clustering analysis.
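A minimal sketch of one common kernel fuzzy C-means variant follows, assuming a Gaussian kernel and prototypes kept in the input space. The abstract does not say which Mercer kernel or update rules the authors used, so the function names, the `sigma` width, and the caller-supplied initial centers are illustrative assumptions, not the paper's FKCM.

```python
import math


def gaussian_kernel(x, v, sigma):
    """Gaussian (RBF) kernel, a Mercer kernel with K(x, x) = 1."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_fcm(points, centers, m=2.0, sigma=1.0, iters=100, tol=1e-6):
    """Kernel fuzzy c-means with prototypes in the input space (assumed variant).

    points  : list of equal-length coordinate lists
    centers : initial prototypes supplied by the caller
    Returns (centers, u), where u[i][j] is the membership of points[j]
    in cluster i.
    """
    eps = 1e-12  # avoids division by zero when a point coincides with a prototype
    centers = [list(c) for c in centers]
    n, c, dim = len(points), len(centers), len(points[0])
    u = []
    for _ in range(iters):
        # Kernel values; the kernel-induced squared distance is 2 * (1 - K).
        k = [[gaussian_kernel(x, v, sigma) for x in points] for v in centers]
        # Membership update: u_ij proportional to (1 - K_ij)^(-1 / (m - 1)).
        w = [[(1.0 - k[i][j] + eps) ** (-1.0 / (m - 1.0)) for j in range(n)]
             for i in range(c)]
        totals = [sum(w[i][j] for i in range(c)) for j in range(n)]
        u = [[w[i][j] / totals[j] for j in range(n)] for i in range(c)]
        # Prototype update: mean of the points weighted by u_ij^m * K_ij.
        new_centers = []
        for i in range(c):
            coeff = [(u[i][j] ** m) * k[i][j] for j in range(n)]
            denom = sum(coeff) + eps
            new_centers.append(
                [sum(coeff[j] * points[j][d] for j in range(n)) / denom
                 for d in range(dim)])
        shift = max(abs(a - b)
                    for old, new in zip(centers, new_centers)
                    for a, b in zip(old, new))
        centers = new_centers
        if shift < tol:
            break
    return centers, u
```

Because the Gaussian kernel down-weights distant points in the prototype update, outliers pull the centers less than in plain FCM, which is one reason kernel variants are said to cope better with noisy data.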
Funding: Project supported by the National Natural Science Foundation of China (51277117) and the Shanghai Science and Technology Commission (11DZ2283000).
Abstract: Cables that have been in service for over 20 years in Shanghai, a city with abundant surface water, failed more frequently and induced different cable accidents. This necessitates research on the insulation aging state of cables working in special circumstances. We performed multi-parameter tests with samples from about 300 cable lines in Shanghai. The tests included water tree investigation, tensile test, dielectric spectroscopy test, thermogravimetric analysis (TGA), Fourier transform infrared spectroscopy (FTIR), and electrical aging test. Then, we carried out regression analysis between every two test parameters. Moreover, through the two-sample t-test and analysis of variance (ANOVA) of each test parameter, we analyzed the influences of the cable-laying method and the sampling section on the degradation of cable insulation, respectively. Furthermore, the test parameters that have a strong correlation in the regression analysis or significant differences in the t-test or ANOVA were determined to be the ones identifying the XLPE cable insulation aging state. The thresholds for distinguishing insulation aging states were also obtained with the aid of statistical analysis and fuzzy clustering. Based on fuzzy inference, we established a cable insulation aging diagnosis model using the intensity transfer method. The results of the regression analysis indicate that the degradation of cable insulation accelerates as the degree of in-service aging increases. This validates the rule that the increase of microscopic imperfections in solid material enhances the dielectric breakdown strength. The results of the two-sample t-test and the ANOVA indicate that direct-buried cables are more sensitive to insulation degradation than duct cables. This confirms that the tensile strength and breakdown strength are reliable functional parameters in cable insulation evaluations. A case study further indicates that the proposed diagnosis model based on fuzzy inference can reflect the comprehensive aging state of cable insulation well, and that the cable service time has no correlation with the insulation aging state.
Abstract: Sequence analysis technology under big data provides unprecedented opportunities for modern life science. A novel method for identifying gene coding sequences is proposed in this paper. Firstly, an improved short-time Fourier transform algorithm based on the Morlet wavelet is applied to extract the power spectrum of the DNA sequence. Then, a threshold determination method based on kernel fuzzy C-means clustering is used to combine the Signal to Noise Ratio (SNR) data of exons and introns into one sequence, classify the sequence into two types, and calculate the weighted sum of the two SNR cluster centers obtained, which gives the discrimination threshold. Finally, an exon interval endpoint identification algorithm based on the Takagi-Sugeno fuzzy identification model is presented to train the Takagi-Sugeno model, optimize the model parameters with the Levenberg-Marquardt least squares method, complete the model and determine the fuzzy rules. To verify the effectiveness of the proposed method, example tests are conducted on typical gene sequence sample data.
Funding: Innovation Program of Shanghai Municipal Education Commission, China (No. 12YZ191)
Abstract: Traditional clustering methods tend to converge slowly because of high data dimensionality and randomly set initial cluster centers. To address these problems, a novel method combining subtractive clustering with fuzzy C-means (FCM) clustering is proposed. In this method, the initial number of clusters and the cluster centers are obtained by subtractive clustering. On this basis, the clustering result is further optimized with FCM. In addition, the data dimension is reduced through the analytic hierarchy process (AHP) before the clustering calculation. To verify the effectiveness of the fusion algorithm, an example of enterprise credit evaluation is carried out. The results show that the fusion clustering algorithm is suitable for classifying high-dimensional data, and that it also speeds up processing and improves the visibility of the results. The method is therefore suitable for wider use.
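The initialization step can be sketched as follows: a simplified form of subtractive clustering that returns candidate centers (and hence a cluster count) which could then seed FCM. This is an illustration under stated assumptions, not the paper's implementation: the data are assumed to be scaled to the unit hypercube, the radius `ra`, the `squash` factor and the single `stop_ratio` stopping rule are hypothetical defaults, and the AHP dimension-reduction step is not shown.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def subtractive_centers(points, ra=0.5, squash=1.5, stop_ratio=0.15):
    """Pick cluster centers from the data points by density potential.

    ra         : neighbourhood radius used when measuring density
    squash     : the suppression radius is squash * ra
    stop_ratio : stop once the best remaining potential drops below
                 stop_ratio times the first (largest) potential
    """
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (squash * ra) ** 2
    # Potential of a point = summed Gaussian influence of all points on it.
    potential = [sum(math.exp(-alpha * sq_dist(p, q)) for q in points)
                 for p in points]
    first_peak = max(potential)
    centers = []
    while True:
        best = max(range(len(points)), key=lambda i: potential[i])
        peak = potential[best]
        if peak < stop_ratio * first_peak:
            break
        centers.append(points[best])
        # Suppress the potential near the chosen center so that the next
        # pick comes from a different dense region.
        potential = [p - peak * math.exp(-beta * sq_dist(points[i], points[best]))
                     for i, p in enumerate(potential)]
    return centers
```

The returned list supplies both the number of clusters and the starting prototypes, so an FCM run no longer depends on a random start.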
Abstract: To address the problem of judging the optimal partition matrix among several fuzzy partition matrices in the fuzzy partition space, which is determined by the various choices of cluster samples in the total sample space, two algorithms are proposed on the basis of the data analysis method in rough set theory: an information system discretization algorithm (algorithm 1) and a sample representativeness judging algorithm (algorithm 2). On the principle of the farthest distance, algorithm 1 transforms continuous data into a discrete form that can be processed by rough set theory. Taking the approximation precision as the criterion, algorithm 2 chooses the sample space that is well representative. Hence, the clustering sample set for inducing and computing the optimal partition matrix can be obtained. Several theorems are proposed to provide a strict theoretical foundation for the execution of the algorithm model. An applied example based on the new algorithm model is given, and its result verifies the feasibility of the model.
Abstract: Data analysis and automatic processing are often interpreted as knowledge acquisition. In many cases it is necessary to classify data or find regularities in them. Results obtained in the search for regularities in intelligent data analysis applications are mostly represented with IF-THEN rules. With these rules the following tasks are solved: prediction, classification, pattern recognition and others. Using different approaches (clustering algorithms, neural network methods, fuzzy rule processing methods), we can extract rules that characterize the data in an understandable language. This allows interpreting the data, finding relationships in the data and extracting new rules that characterize them. Knowledge acquisition in this paper is defined as the process of extracting knowledge from numerical data in the form of rules. Rule extraction in this context is based on the clustering methods K-means and fuzzy C-means. With the K-means clustering algorithm, rules are derived from trained neural networks. Fuzzy C-means is used in the fuzzy rule-based design method. The rule extraction methodology is demonstrated on samples from Fisher's Iris flower data set. The effectiveness of the extracted rules is evaluated. The clustering and rule extraction methodology can be widely used in evaluating and analyzing various economic and financial processes.