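This paper introduces the basic functions and features of ArcGIS Data Reviewer and describes in detail, with examples, the methods and steps for applying it to quality checks of forestry geographic information vector data. The checks cover spatial relationships such as duplicated or overlapping patches, gaps between patches, multipart features, sliver polygons, sharp acute angles, and omitted features, as well as logical consistency checks between attribute fields. The paper can serve as a reference for using this software module.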
Funding: Project (61374140) supported by the National Natural Science Foundation of China
Abstract: Real industrial processes have multiple operating modes, and the collected data follow a complex multimodal distribution, so most traditional process monitoring methods are no longer applicable because they presume that the sampled data obey a single Gaussian or non-Gaussian distribution. To solve this problem, a novel weighted local standardization (WLS) strategy is proposed to standardize the multimodal data; it eliminates the multi-mode characteristics of the collected data and normalizes them into a unimodal distribution. After a detailed analysis of the proposed preprocessing strategy, a new algorithm combining the WLS strategy with support vector data description (SVDD) is put forward for multi-mode process monitoring. Unlike strategies that build multiple local models, the developed method uses only a single model and requires no prior knowledge of the multi-mode process. To demonstrate its validity, the proposed method is applied to a numerical example and the Tennessee Eastman (TE) process. The simulation results show that the WLS strategy is very effective at standardizing multimodal data, and that the WLS-SVDD monitoring method has great advantages over the traditional SVDD and over PCA combined with a local standardization strategy (LNS-PCA) in multi-mode process monitoring.
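To illustrate why local standardization can remove multi-mode structure, here is a minimal sketch. It is not the paper's WLS strategy: the abstract does not give the weighting scheme, so this uses a plain, unweighted standardization against the `k` nearest training samples. The function name `local_standardize` and the parameter `k` are assumptions made for illustration.

```python
import math

def local_standardize(sample, training, k=3):
    """Standardize `sample` with the mean and std of its k nearest training points.

    Assumption: unweighted neighbors; the weighted variant in the paper is not
    specified in the abstract.
    """
    # Pick the k training points closest to the sample (Euclidean distance).
    neighbors = sorted(training, key=lambda p: math.dist(p, sample))[:k]
    n = len(neighbors)
    result = []
    for d in range(len(sample)):
        values = [p[d] for p in neighbors]
        mean = sum(values) / n
        var = sum((v - mean) ** 2 for v in values) / n
        std = math.sqrt(var) or 1.0  # avoid dividing by zero for flat neighborhoods
        result.append((sample[d] - mean) / std)
    return result

# Two operating modes, centered near (0, 0) and (100, 100).
training = [(-1, 0), (1, 0), (0, -1), (0, 1),
            (99, 100), (101, 100), (100, 99), (100, 101)]
a = local_standardize((0, 0), training, k=4)
b = local_standardize((100, 100), training, k=4)
```

Because each sample is scaled against its own neighborhood, the centers of both modes land on the same standardized value, which is what allows a single monitoring model to cover all modes.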
Abstract: Because of the conflict between the huge volume of map data and limited network bandwidth, rapid transmission of vector map data over the Internet has become a bottleneck for spatial data delivery in web-based environments. This paper proposes an approach to organizing multi-scale vector river network data and transmitting it progressively via the Internet. The approach takes account of two levels of importance, i.e. the importance of each river branch and the importance of the points belonging to each river branch, and forms data packages accordingly. Our experiments show that the proposed approach can reduce the original data by 90% while preserving the river structure well.
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2006AA040308), the National Natural Science Foundation of China (60736021), and the National Creative Research Groups Science Foundation of China (60721062)
Abstract: In geographical information systems (GIS), geological data are stored in vector format, while other data such as remote sensing images, geographical data and geochemical data are stored in raster format. This paper programmatically converts each vector data layer into an 8-bit image according to its importance to mineralization, so that the geological meaning can be conveyed through the raster images. The paper also fuses the geographical and geochemical data with the programmatically converted strata data. The results show that image fusion can express different intensities effectively and visualize structural characteristics in two dimensions. Furthermore, it can produce optimized information from multi-source data and express it more directly.
Funding: The National Natural Science Foundation of China (No. 41971356, 41701446) and the National Key Research and Development Program of China (No. 2017YFB0503600, 2018YFB0505500, 2017YFC0602204).
Abstract: Parallel vector buffer analysis approaches can be classified into two types: algorithm-oriented parallel strategies and data-oriented parallel strategies. These methods do not take into consideration their applicability to existing geographic information system (GIS) platforms. To address this problem, a spatial decomposition approach for accelerating buffer analysis of vector data is proposed. The relationship between the number of vertices of each feature and the buffer analysis computing time is analyzed to generate computational intensity transformation functions (CITFs). Then, computational intensity grids (CIGs) of polylines and polygons are constructed based on the corresponding CITFs. Using these CIGs, a spatial decomposition method for parallel buffer analysis is developed. Based on the computational intensity of the features and the sub-domains generated in the decomposition, the features within the sub-domains are evenly assigned to parallel buffer analysis tasks for load balance. Compared with typical regular domain decomposition methods, the new approach achieves a more balanced decomposition of computational intensity for parallel buffer analysis and near-linear speedups.
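The load-balancing idea, assigning features to parallel tasks so that each task carries a similar total computational intensity, can be sketched with a generic greedy heuristic. This is not the paper's CIG-based decomposition, which the abstract does not detail. The function `assign_tasks` and the sample intensity values are hypothetical; each intensity stands in for the output of a CITF applied to a feature's vertex count.

```python
import heapq

def assign_tasks(intensities, n_workers):
    """Greedy longest-processing-time assignment.

    Returns one list of feature indices per worker. Assumption: a generic
    heuristic, not the spatial decomposition described in the paper.
    """
    heap = [(0.0, w) for w in range(n_workers)]  # (current load, worker id)
    heapq.heapify(heap)
    tasks = [[] for _ in range(n_workers)]
    # Place the most expensive features first, always on the lightest worker.
    order = sorted(range(len(intensities)), key=lambda i: -intensities[i])
    for i in order:
        load, w = heapq.heappop(heap)
        tasks[w].append(i)
        heapq.heappush(heap, (load + intensities[i], w))
    return tasks

intensities = [8, 7, 6, 5, 4]  # hypothetical per-feature computational intensities
tasks = assign_tasks(intensities, 2)
loads = [sum(intensities[i] for i in t) for t in tasks]
```

A spatial approach like the one in the abstract additionally keeps neighboring features together in sub-domains, which this index-based sketch ignores.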
Funding: National Natural Science Foundation of China (No. 61374140); the Youth Foundation of National Natural Science Foundation of China (No. 61403072)
Abstract: Complex industrial processes often need multiple operating modes to meet changes in production conditions. Within a given mode, there are discrete samples that belong to the mode but are sparsely distributed, so it is important to account for these sparse samples. To address this issue, a new approach called density-based support vector data description (DBSVDD) is proposed. In this article, an algorithm combining the Gaussian mixture model (GMM) with the DBSVDD technique is proposed for process monitoring. The GMM is used to obtain the center of each mode and to determine the number of modes. Considering the complexity of the data distribution and the discrete samples in the monitored process, DBSVDD is then used for process monitoring. Finally, the validity and effectiveness of the DBSVDD method are illustrated on the Tennessee Eastman (TE) process.
Funding: Supported by the National Basic Research Program of China (the 973 Program, Grant No. 2010CB951101) and the Program for Changjiang Scholars and Innovative Research Teams in Universities, the Ministry of Education, China (Grant No. IRT0717)
Abstract: Hybrid data assimilation (DA) is seeing increasing use in recent hydrology and water resources research. In this study, a DA method coupling support vector machines (SVMs) with the ensemble Kalman filter (EnKF) was used to predict soil moisture in different soil layers: 0-5 cm, 30 cm, 50 cm, 100 cm, 200 cm, and 300 cm. SVM models were first trained on ground measurements of soil moisture and meteorological parameters from the Meilin study area, in East China, to construct statistical prediction models of soil moisture. Subsequent observations and their statistics were then used for prediction with two approaches: the SVM predictor alone, and the SVM-EnKF model obtained by coupling the SVM model with the EnKF technique through the DA method. Validation results showed that the proposed SVM-EnKF model can improve soil moisture predictions in different layers, from the surface to the root zone.
Abstract: Multistage vector quantization (MSVQ) can achieve very low encoding and storage complexity in comparison with unstructured vector quantization. However, conventional MSVQ is suboptimal with respect to the overall performance measure. This paper proposes a new technique for designing a decoder codebook that differs from the encoder codebook in order to optimise overall performance. The performance improvement is achieved with no effect on encoding complexity, in either storage or time, at the cost of a modest increase in the storage complexity of the decoder.
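For readers unfamiliar with MSVQ, the conventional scheme can be sketched as follows: each stage quantizes the residual left by the previous stage, and the decoder sums the selected codewords. The codebooks and function names below are made up for illustration. The decoder-codebook design proposed in the paper is not shown, since the abstract does not describe it.

```python
def nearest(codebook, vector):
    """Index of the codeword closest to `vector` (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vector)))

def msvq_encode(vector, stages):
    """Quantize the residual stage by stage; return one index per stage."""
    residual = list(vector)
    indices = []
    for codebook in stages:
        i = nearest(codebook, residual)
        indices.append(i)
        # The next stage only sees what this stage failed to represent.
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices

def msvq_decode(indices, stages):
    """Reconstruct by summing the selected codeword of every stage."""
    out = [0.0] * len(stages[0][0])
    for i, codebook in zip(indices, stages):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out

# Hypothetical two-stage codebooks with two codewords each.
encoder_stages = [[(0, 0), (4, 4)], [(0, 0), (1, -1)]]
indices = msvq_encode((5, 3), encoder_stages)
reconstruction = msvq_decode(indices, encoder_stages)
```

Since `msvq_decode` takes its codebooks as an argument, a decoder could be handed a different set of codebooks than the encoder, which is the degree of freedom the paper exploits.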
Funding: Project (70671039) supported by the National Natural Science Foundation of China
Abstract: Given the chaotic and non-linear character of power load data, a time series matrix is established using the theory of phase-space reconstruction, and Lyapunov exponents of the chaotic time series are computed to determine the time delay and the embedding dimension. Because the data have different features, a data mining algorithm is used to classify them into groups. Data mining eliminates redundant information, and the system searches for historical loads whose features are highly similar to those of the forecasting day. As a result, the training data are reduced and computing speed improves when constructing the support vector machine (SVM) model. The SVM algorithm is then used to predict the power load with the parameters obtained in preprocessing. To demonstrate the effectiveness of the new model, the results of the data mining SVM (DSVM) algorithm are compared with those of a single SVM and a back propagation (BP) network. The new DSVM algorithm improves forecast accuracy by 0.75%, 1.10% and 1.73% compared with SVM at two randomly chosen dimensions (11 and 14) and with the BP network, respectively. This indicates that DSVM yields a clear improvement in short-term power load forecasting.
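The time series matrix from phase-space reconstruction is the standard time-delay embedding: each row collects `dim` samples spaced `tau` steps apart. A minimal sketch follows. The function name and the toy load values are illustrative, and choosing `dim` and `tau` (done via Lyapunov exponents in the paper) is outside its scope.

```python
def delay_embed(series, dim, tau):
    """Return the rows of the time-delay embedding (trajectory) matrix.

    Row i is [x[i], x[i + tau], ..., x[i + (dim - 1) * tau]].
    """
    n_rows = len(series) - (dim - 1) * tau
    if n_rows <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[i + j * tau] for j in range(dim)] for i in range(n_rows)]

load = [1, 2, 3, 4, 5, 6]  # toy stand-in for an hourly load series
matrix = delay_embed(load, dim=3, tau=2)
```

Each row of `matrix` can then serve as an input vector for a regression model such as an SVM, with the sample following the row as the target.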
Funding: NOAA Grant NA17RJ1227 and NSF Grant EIA-0205628 provided financial support for this work; supported by RSF Grant 14-41-00039
Abstract: Geophysical data sets are growing at an ever-increasing rate, requiring computationally efficient data selection (thinning) methods to preserve essential information. Satellites, such as WindSat, provide large data sets for assessing the accuracy and computational efficiency of data selection techniques. A new data thinning technique, based on support vector regression (SVR), is developed and tested. To manage large on-line satellite data streams, observations from WindSat are formed into subsets by Voronoi tessellation and then each is thinned by SVR (TSVR). Three experiments are performed. The first confirms the viability of TSVR for a relatively small sample, comparing it to several commonly used data thinning methods (random selection, averaging and Barnes filtering), producing a 10% thinning rate (90% data reduction), low mean absolute errors (MAE) and high correlations with the original data. A second experiment, using a larger dataset, shows TSVR retrievals with MAE < 1 m s⁻¹ and correlations ≥ 0.98. TSVR was an order of magnitude faster than the commonly used thinning methods. A third experiment applies a two-stage pipeline to TSVR to accommodate online data. The pipeline subsets reconstruct the wind field with the same accuracy as in the second experiment, and the pipeline is an order of magnitude faster than the nonpipeline TSVR. Therefore, pipeline TSVR is two orders of magnitude faster than commonly used thinning methods that ingest the entire data set. This study demonstrates that TSVR pipeline thinning is an accurate and computationally efficient alternative to commonly used data selection techniques.
Abstract: Multi-source multi-class classification methods based on multi-class Support Vector Machines and data fusion strategies are proposed in this paper. Centralized and distributed fusion schemes are applied to combine information from several data sources. In the centralized scheme, all information from the data sources is combined to construct a single input space, and a multi-class Support Vector Machine classifier is then trained. In the distributed schemes, the individual data sources are processed separately and modelled using the multi-class Support Vector Machine. New data fusion strategies are then proposed to combine the information from the individual multi-class Support Vector Machine models. The proposed fusion strategies take into account that a Support Vector Machine (SVM) classifier achieves classification by finding the optimal separating hyperplane with maximal margin. The proposed methods are applied to fault diagnosis of a diesel engine. The experimental results show that almost all the proposed approaches substantially improve diagnostic accuracy. The robustness of the diagnosis is also improved by the data fusion strategies. The proposed methods can also be applied in other fields.
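Abstract: Studying teacher attention is of great research value for evaluating teacher behavior in the classroom. However, existing teacher attention recognition algorithms have problems such as being unable to handle extreme head pose angles. To address this, a teacher attention state recognition algorithm based on the 6DRepNet360 model is proposed, which improves the accuracy of head pose estimation at extreme angles. In contrast to traditional methods that rely on conditional rules to classify teacher attention states, a teacher attention classification model based on a support vector machine (SVM) is designed to accurately recognize attention states under complex head pose angles. To further deal with erroneous data caused by limits in algorithm stability and accuracy, a sliding-window-based data cleaning algorithm is proposed, which effectively improves the fidelity and reliability of the overall recognition results. A series of evaluations on the constructed CCNUTeacherState dataset shows that the proposed teacher attention recognition algorithm achieves an accuracy of 90.67% on this dataset.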
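One plausible form of sliding-window cleaning for per-frame attention labels is a majority vote over neighboring frames. The abstract does not specify the actual rule, so the voting scheme, the window size, and the label names below are all assumptions.

```python
from collections import Counter

def clean_labels(labels, window=5):
    """Replace each label by the most common label in a centered window.

    Assumption: majority voting; the paper's cleaning rule is not given
    in the abstract. The window is truncated at the sequence boundaries.
    """
    half = window // 2
    cleaned = []
    for i in range(len(labels)):
        lo, hi = max(0, i - half), min(len(labels), i + half + 1)
        counts = Counter(labels[lo:hi])
        cleaned.append(counts.most_common(1)[0][0])
    return cleaned

# Hypothetical per-frame predictions with one isolated misclassification.
frames = ["focused"] * 4 + ["distracted"] + ["focused"] * 4
smoothed = clean_labels(frames, window=3)
```

Isolated single-frame flips like the one in `frames` are overwritten by their neighbors, while sustained changes of state that span more than half the window survive.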