Funding: The National Natural Science Foundation of China (No. 50674086); the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20060290508); the Postdoctoral Scientific Program of Jiangsu Province (No. 0701045B).
Abstract: To mine production and security information from security supervising data, and to ensure safety in production and decision-making, a clustering analysis algorithm for semantically described security supervising data in coal mines is studied. First, a hybrid semantic and numerical description method for security supervising data in coal mines is described. Second, similarity measures for semantic data and for numerical data are given separately, and a weight-based hybrid similarity measure for the semantically described security supervising data is presented. Third, taking the hybrid similarity measure as the distance criterion and borrowing from grid methodology, an improved grid-based CURE clustering algorithm is presented. Finally, simulation results on a coal mine security supervising data set validate the efficiency of the algorithm.
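The weight-based hybrid similarity mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the actual semantic or numerical measures, so everything below is an assumption made for illustration: Jaccard overlap stands in for the semantic measure, a range-normalised absolute difference stands in for the numerical measure, and the record layout (`labels`, `values`), the `ranges` argument, and the `weight` parameter are hypothetical names.

```python
def semantic_similarity(labels_a, labels_b):
    """Jaccard overlap between two sets of semantic labels (assumed measure)."""
    a, b = set(labels_a), set(labels_b)
    if not a and not b:
        return 1.0  # two empty descriptions are treated as identical
    return len(a & b) / len(a | b)


def numerical_similarity(values_a, values_b, ranges):
    """1 minus the mean range-normalised absolute difference (assumed measure)."""
    diffs = [abs(x - y) / r for x, y, r in zip(values_a, values_b, ranges)]
    return 1.0 - sum(diffs) / len(diffs)


def hybrid_similarity(rec_a, rec_b, ranges, weight=0.5):
    """Weighted combination of the semantic and numerical similarities.

    weight is the share given to the semantic part (hypothetical parameter).
    """
    sem = semantic_similarity(rec_a["labels"], rec_b["labels"])
    num = numerical_similarity(rec_a["values"], rec_b["values"], ranges)
    return weight * sem + (1.0 - weight) * num


# Hypothetical records: semantic tags plus two numerical sensor readings.
record_a = {"labels": {"gas", "alarm"}, "values": [0.8, 20.0]}
record_b = {"labels": {"gas"}, "values": [0.4, 20.0]}
score = hybrid_similarity(record_a, record_b, ranges=[1.0, 40.0], weight=0.5)
```

A clustering algorithm that expects a distance rather than a similarity could use `1.0 - score`, again as an assumption rather than the authors' stated choice.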
Abstract: This paper presents a fuzzy logic approach to unsupervised character classification that improves the robustness, correctness and speed of a character recognition system. The characters are first split into eight typographical categories. The classification scheme then uses pattern matching to classify the characters in each category into a set of fuzzy prototypes, based on a nonlinear weighted similarity function. The fuzzy unsupervised character classification, which is natural in the repre...
Funding: Project (60763001) supported by the National Natural Science Foundation of China; Project (2010GZS0072) supported by the Natural Science Foundation of Jiangxi Province, China; Project (GJJ12271) supported by the Science and Technology Foundation of the Provincial Education Department of Jiangxi Province, China.
Abstract: The category-based statistical language model is an important method for addressing data sparsity, but it has two bottlenecks: 1) word clustering, where it is hard to find a method that combines good performance with low computational cost; 2) domain adaptation, where class-based methods lose the ability to adapt their predictions to text from different domains. To solve these problems, a definition of word similarity based on mutual information is presented, and from it a definition of word set similarity is derived. Experiments show that the similarity-based word clustering algorithm outperforms the conventional greedy clustering method in both speed and performance, reducing perplexity from 283 to 218. In addition, an absolute weighted difference method is presented and used to construct a vari-gram language model with good prediction ability. Compared with the category-based model, the vari-gram model reduces perplexity from 234.65 to 219.14 on Chinese corpora and from 195.56 to 184.25 on English corpora.
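To put the reported perplexity figures in context, the sketch below computes perplexity from per-word probabilities using the standard definition (the exponential of the negative mean log-probability) and the relative reduction for each pair of values quoted in the abstract. The function names are hypothetical, and the code does not reproduce the authors' models or their clustering method.

```python
import math


def perplexity(word_probabilities):
    """Perplexity of a test sequence given the model probability of each word."""
    n = len(word_probabilities)
    log_sum = sum(math.log(p) for p in word_probabilities)
    return math.exp(-log_sum / n)


def relative_reduction(before, after):
    """Fraction by which perplexity drops when moving from `before` to `after`."""
    return (before - after) / before


# Perplexity pairs quoted in the abstract: (baseline, improved).
reported = {
    "similarity-based clustering vs greedy clustering": (283.0, 218.0),
    "vari-gram vs category-based, Chinese corpora": (234.65, 219.14),
    "vari-gram vs category-based, English corpora": (195.56, 184.25),
}
reductions = {name: relative_reduction(b, a) for name, (b, a) in reported.items()}

for name, value in reductions.items():
    print(f"{name}: {value:.1%} lower perplexity")
```

By this measure, the clustering change accounts for a drop of roughly 23%, while the vari-gram model gives a smaller drop of about 6 to 7% over the category-based model.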
Funding: Supported by the National Basic Research Program of China ("973" Program) (Grant No. 2010CB950800), the International S&T Cooperation Program of China (Grant No. 2010DFA21880), and the China Postdoctoral Science Foundation (Grant No. 2012M510053).
Abstract: An algorithm for classifying hyperspectral remote sensing images is proposed, based on the frequency spectrum of the spectral signature. The spectral signature of each pixel in the hyperspectral image is treated as a discrete signal, and its frequency spectrum is obtained with the discrete Fourier transform. Because the frequency spectra of different ground objects' spectral signatures differ visibly, the difference between the frequency spectra of the reference and target spectral signatures is used to measure spectral similarity. The Canberra distance is introduced to increase the contribution of the higher-frequency components. The number of harmonics used by the algorithm is then determined by analyzing the cumulative distribution function of the frequency spectrum energy of the ground objects. To evaluate the algorithm, two hyperspectral remote sensing images are used as experimental data. The proposed algorithm is compared with the spectral angle mapper (SAM), spectral information divergence (SID) and Euclidean distance (ED) in terms of producer's accuracy, user's accuracy, overall accuracy, average accuracy and the Kappa coefficient. The results show that the proposed algorithm can be applied effectively to hyperspectral image classification.
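The matching step described in this abstract can be sketched as follows: take the discrete Fourier transform of each spectral signature, keep the magnitudes of the first few harmonics, and compare them with the Canberra distance. This is only an illustration under stated assumptions. The abstract does not say whether magnitudes or complex coefficients are compared, so magnitudes are assumed here; the function names, the fixed `n_harmonics` default, and the sample signatures are hypothetical, and the energy-based choice of the harmonic count is not reproduced.

```python
import cmath


def dft_magnitudes(signature, n_harmonics):
    """Magnitudes of the first n_harmonics DFT coefficients of a signature."""
    n = len(signature)
    magnitudes = []
    for k in range(n_harmonics):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signature))
        magnitudes.append(abs(coeff))
    return magnitudes


def canberra_distance(p, q):
    """Sum of |p_i - q_i| / (|p_i| + |q_i|); terms with a zero denominator are skipped."""
    total = 0.0
    for a, b in zip(p, q):
        denominator = abs(a) + abs(b)
        if denominator > 0:
            total += abs(a - b) / denominator
    return total


def classify_pixel(pixel, references, n_harmonics=4):
    """Return the name of the reference signature closest to the pixel."""
    target = dft_magnitudes(pixel, n_harmonics)
    distances = {
        name: canberra_distance(target, dft_magnitudes(ref, n_harmonics))
        for name, ref in references.items()
    }
    return min(distances, key=distances.get)


# Hypothetical 8-band reference signatures for two ground-object classes.
references = {
    "vegetation": [0.10, 0.20, 0.60, 0.70, 0.50, 0.40, 0.30, 0.20],
    "soil": [0.30, 0.30, 0.35, 0.40, 0.40, 0.45, 0.45, 0.50],
}
label = classify_pixel(references["vegetation"], references)
```

Because each Canberra term is divided by the sum of the two magnitudes, small high-frequency components contribute on the same scale as the large low-frequency ones, which is consistent with the abstract's stated reason for choosing this distance.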