Funding: The National Natural Science Foundation of China (No. 60673060); the Natural Science Foundation of Jiangsu Province (No. BK2005047).
Abstract: A new algorithm for clustering multiple data streams is proposed. The algorithm can effectively cluster data streams that show similar behavior with unknown time delays. It uses the autoregressive (AR) modeling technique to measure correlations between data streams, exploiting estimated frequency spectra to extract the essential features of the streams. Each stream is represented as a sum of spectral components, and the correlation is measured component-wise. Each spectral component is described by four parameters: amplitude, phase, damping rate, and frequency. The ε-lag-correlation between two spectral components is calculated, and the algorithm uses this information as the similarity measure for clustering the streams. Based on a sliding window model, the algorithm can continuously report the most recent clustering results and adjust the number of clusters. Experiments on real and synthetic streams show that the proposed clustering method achieves higher speed and clustering quality than other similar methods.
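The idea of scoring two streams by their correlation under an unknown delay can be illustrated with a much simpler stand-in. The sketch below does not implement the AR spectral-component method described in the abstract; it assumes raw samples, a plain Pearson correlation, and a brute-force search over non-negative lags. The function names and the two sine streams are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def best_lag_correlation(x, y, max_lag):
    """Return (lag, corr) maximizing the correlation of x[t] with y[t + lag]."""
    best = (0, -1.0)
    for lag in range(max_lag + 1):
        corr = pearson(x[: len(x) - lag], y[lag:])
        if corr > best[1]:
            best = (lag, corr)
    return best

# Hypothetical streams: y repeats x with a delay of 3 samples.
x = [math.sin(0.3 * t) for t in range(200)]
y = [math.sin(0.3 * (t - 3)) for t in range(200)]
lag, corr = best_lag_correlation(x, y, max_lag=10)
print(lag, round(corr, 3))
```

A clustering step could then treat `corr` at the best lag as the similarity between two streams; how the paper maps ε-lag-correlations of spectral components to a similarity measure is not specified in the abstract.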
Funding: The present work is supported by the National Science Foundation of China (41604138, 41427901, 41621063, 41474133, 41674158, 41874179, 41322030).
Abstract: The wavenumber spectral components WN4 at mesosphere and lower thermosphere (MLT) altitudes (70–10 km) and in the latitude range between ±45° are obtained from temperature data (T) observed by the Sounding of the Atmosphere using Broadband Emission Radiometry (SABER) instruments on board the National Aeronautics and Space Administration (NASA)’s Thermosphere–Ionosphere–Mesosphere Energetics and Dynamics (TIMED) spacecraft during the 11-year solar period from 2002 to 2012. We analyze these spectral components WNk in detail and obtain the main properties of their vertical profiles and global structures. We report that all of the wavenumber spectral components WNk occur mainly around 100 km altitude, and that the most prominent component is the WN4 structure. Comparing these long-duration temperature data with the results of previous investigations, we find that the yearly variation of spectral component WN4 is similar to that of the eastward-propagating non-migrating diurnal tide with zonal wavenumber 3 (DE3) at low latitudes, and to that of the semi-diurnal tide with zonal wavenumber 2 (SE2) at mid-latitudes: the amplitudes A4 are larger during boreal summer and autumn at low latitudes, while at mid-latitudes the amplitudes have a weak peak in March. In addition, the amplitudes of component WN4 undergo a remarkable short-period variation: significant day-to-day variation of the spectral amplitudes A4 occurs primarily in July and September at low latitudes. In summary, we conclude that the non-migrating tides DE3 and SE2 are likely to be the origins of the observed wavenumber spectral component WN4 in the MLT region at low latitudes and mid-latitudes, respectively.
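The abstract does not say how the WNk amplitudes were computed. As a generic illustration only, the sketch below assumes temperatures sampled at equally spaced longitudes along one latitude circle and extracts the amplitude of zonal wavenumber k with a discrete Fourier sum. The function name, grid size, and synthetic temperatures are assumptions, not values from the paper.

```python
import cmath
import math

def wavenumber_amplitude(values, k):
    """Amplitude of zonal wavenumber k (0 < k < n/2) from n equally spaced longitude samples."""
    n = len(values)
    coeff = sum(v * cmath.exp(-2j * math.pi * k * j / n) for j, v in enumerate(values))
    return 2.0 * abs(coeff) / n

# Synthetic temperatures (K) on a hypothetical 5-degree longitude grid:
# a 200 K mean, a 5 K wavenumber-4 wave, and a 1 K wavenumber-2 wave.
n_lon = 72
temps = [
    200.0
    + 5.0 * math.cos(2 * math.pi * 4 * j / n_lon + 0.7)
    + 1.0 * math.cos(2 * math.pi * 2 * j / n_lon)
    for j in range(n_lon)
]
a4 = wavenumber_amplitude(temps, 4)
a2 = wavenumber_amplitude(temps, 2)
print(round(a4, 6), round(a2, 6))
```

Repeating this per altitude, latitude, and day would give amplitude profiles of the kind the abstract discusses, although real satellite sampling is not uniform in longitude and local time and needs more careful treatment.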
Funding: Supported by the National Natural Science Foundation of China (41371292).
Abstract: The physical and chemical heterogeneity of soils makes soil spectra varied and complicated, so pre-classification is valuable for increasing the accuracy of prediction models for soil organic matter (SOM). This experiment was conducted in a controlled environment, and soil samples from northeast China were measured using an ASD2500 hyperspectral instrument. The results showed that reflectance differs between soil types. The correlation between SOM and reflectance is statistically significant at the 0.05 and 0.01 levels in 550–850 nm, and for all soil types it is significant at the 0.01 level in 650–750 nm. The results indicated that the soil types of the northeast can be divided into three categories: the first shows relatively flat and low reflectance over the entire band; in the second, the spectral reflectance curve rises fastest in the 460–610 nm band, with a sharp but uneven increase in slope; the third rises slowly in the visible band, and its slope there is clearly higher than that of the first category. Besides classification by the shape of the reflectance curve, principal component analysis is another effective method for classifying soil types. The first principal component contains 62.13–97.19% of the spectral information and relates mainly to the information in 560–600, 630–690 and 690–760 nm. The second mainly represents spectral information in 1640–1740, 2050–2120 and 2200–2300 nm. In the scatter plot (with the first principal component on the horizontal axis and the second on the vertical axis), samples with high OM tend to lie on the left and those with low OM on the right. Soil types in northeast China can be classified effectively by these two approaches, which also provide a valuable reference for soils in other areas.
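The statement that the first principal component carries a given share of the spectral information corresponds to an explained-variance ratio. The following sketch shows how such a ratio and the per-sample scores can be computed for the first component; it uses power iteration on the covariance matrix rather than a full eigendecomposition, and the three-band reflectance values are invented for illustration.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, explained_variance_ratio, scores) for the first principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)] for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, eigenvalue / total_variance, scores

# Hypothetical reflectance of five soil samples at three wavelengths.
spectra = [
    [0.10, 0.12, 0.15],
    [0.20, 0.24, 0.30],
    [0.30, 0.36, 0.45],
    [0.40, 0.48, 0.60],
    [0.50, 0.60, 0.76],
]
direction, ratio, scores = first_principal_component(spectra)
print(round(ratio, 4))
```

Plotting the first-component scores against second-component scores would give a scatter plot of the kind the abstract describes; the second component would need a deflation step that this sketch omits.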
Abstract: The structure of Bangla numerals is more complex than that of English numerals. Two pairs of Bangla numerals closely resemble each other: “one and nine” and “five and six”. We found that handwritten Bangla numerals cannot be recognized reliably using a single machine learning algorithm or the discrete wavelet transform (DWT) alone. This observation motivated us to combine DWT, a Fuzzy Inference System (FIS) and Principal Component Analysis (PCA) to recognize handwritten Bangla numerals. In the first phase, the four lowest spectral components of a preprocessed image are extracted using DWT and used as the feature vector for recognizing the digits. The feature vector is then applied to FIS and PCA separately. The combined method provides a recognition accuracy of 95.8%, whereas each individual method gives lower accuracy. Instead of storing the images themselves in a folder, we can store the feature vectors obtained from DWT in tabular form. The records of the table can then be fed to FIS, PCA or another object detection algorithm. Although the technique used in the paper detects objects with only a moderate rate of accuracy, it can save a large amount of storage compared with a benchmark database of images. If a tradeoff is made between storage requirements and recognition accuracy, the model of the paper is preferable to other state-of-the-art methods. Another finding of the paper is that the spectral components of images acquired by DWT work well with FIS and PCA for classification but not with unsupervised (K-means clustering) or supervised (support vector machine) learning.
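The abstract does not name the wavelet or say exactly how the four lowest components are selected. One possible reading, sketched below purely as an assumption, is to apply the 2D Haar transform repeatedly and keep only the approximation (low-low) band until a 2×2 block of four coefficients remains. The function names and the toy image are hypothetical, and the image is assumed square with a power-of-two side.

```python
def haar_ll(image):
    """One level of the orthonormal 2D Haar transform, keeping only the low-low band."""
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 2.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]

def lowest_four_coefficients(image):
    """Reduce a square power-of-two image to its 2x2 approximation band, flattened."""
    band = image
    while len(band) > 2:
        band = haar_ll(band)
    return [value for row in band for value in row]

# Toy 4x4 binary image: dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
features = lowest_four_coefficients(image)
print(features)  # [0.0, 2.0, 0.0, 2.0]
```

Storing one such four-value row per image, instead of the image itself, is the kind of tabular feature store the abstract proposes.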
Funding: Supported by the National Natural Science Foundation of China (60974061).
Abstract: An integrated method for identifying the propagation of multi-loop process oscillations and their source location is proposed in this paper. Oscillatory process loop variables are automatically selected based on the component-related ratio index and a mixing matrix, both of which are obtained during data preprocessing by spectral independent component analysis. The complex causality among oscillatory process variables is then revealed by the Granger causality test and visualized in the form of a causality diagram. The causal connectivity in the diagram is simplified according to process knowledge, and the final, simplest causality diagram, which represents the main oscillation propagation paths, is obtained by an automated cut-off threshold search that filters out less significant causality pathways. The source of the oscillation disturbance can be identified intuitively from the final causality diagram. Tests on both simulated and real plant data are presented to demonstrate the effectiveness and feasibility of the proposed method.
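The abstract names the Granger causality test but gives no model order or data. As a minimal sketch under stated assumptions (one lag, ordinary least squares, and synthetic series in which `x` drives `y`), the F statistic compares an autoregression of the effect on its own past with one that also includes the past of the candidate cause. All names and numbers here are illustrative.

```python
import random

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def rss(design, target):
    """Residual sum of squares of an ordinary least-squares fit."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * t for row, t in zip(design, target)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, row))) ** 2 for row, t in zip(design, target))

def granger_f(cause, effect):
    """F statistic for 'cause Granger-causes effect' using a single lag."""
    target = effect[1:]
    restricted = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    unrestricted = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    rss_r = rss(restricted, target)
    rss_u = rss(unrestricted, target)
    dof = len(target) - 3  # observations minus parameters of the unrestricted model
    return (rss_r - rss_u) / (rss_u / dof)

# Synthetic loop variables: x drives y with a one-sample delay.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(300)]
y = [0.0]
for t in range(1, 300):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.gauss(0.0, 0.1))
f_xy = granger_f(x, y)
f_yx = granger_f(y, x)
print(f_xy > f_yx)
```

Thresholding such pairwise statistics yields the edges of a causality diagram; the automated cut-off search described in the abstract is not reproduced here.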
Abstract: On the basis of the analytical results for the period components of monthly mean sea level at 236 stations in the Pacific, the period components plus a linear trend are used to fit the monthly mean sea level series. The statistical results for the linear trend coefficients of these stations indicate that, if abnormal values of sea-level rise and fall are neglected, the average rate of rise of relative sea level in the Pacific is 1.16 mm/a. Affected by the nonuniformity of land subsidence and other factors, the regional variation in relative sea level rise or fall in the Pacific is greater. According to the positive or negative values of the linear trend coefficients and the geographical position of each sea area, the sea level rise or fall in the Pacific, including the coastal areas of China and Southeast Asia, is zoned to obtain the average rate of rise or fall in each sea area. The rise or fall trends of relative sea level obtained for the entire Pacific Ocean, the west coast of North America, northern and central South America, the greater part of the tropical Pacific and the coastal islands of Japan are basically consistent with other relevant results. The regional average estimate of relative sea level on the coast of East Asia is rising, whereas the estimates provided by Barnett tend to drop; the main cause of this discrepancy is the number of stations selected and their distributional density.
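Fitting period components plus a linear trend is a linear least-squares problem. The sketch below assumes a single annual harmonic and a noise-free synthetic monthly series; the period components actually used for the 236 stations are not listed in the abstract, and the mean level, amplitude, and phase are invented. The 1.16 mm/a slope is borrowed from the abstract only as a convenient synthetic value.

```python
import math

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def fit_trend_and_annual(times_years, levels):
    """Least-squares fit of level = offset + trend*t + c*cos(2*pi*t) + s*sin(2*pi*t)."""
    design = [[1.0, t, math.cos(2 * math.pi * t), math.sin(2 * math.pi * t)] for t in times_years]
    k = 4
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * v for row, v in zip(design, levels)) for i in range(k)]
    offset, trend, c, s = solve(xtx, xty)
    return trend, math.hypot(c, s)

# Synthetic 20-year monthly series (mm): 1.16 mm/a trend plus a 50 mm annual cycle.
times = [month / 12.0 for month in range(240)]
levels = [7000.0 + 1.16 * t + 50.0 * math.cos(2 * math.pi * t - 0.4) for t in times]
trend, amplitude = fit_trend_and_annual(times, levels)
print(round(trend, 3), round(amplitude, 3))
```

Additional period components would simply add cosine and sine columns to the design matrix; the trend coefficient remains the second fitted parameter.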