Funding: Supported by the National Key Basic Research and Development Program of China under contract No. 2006CB701305, the National Natural Science Foundation of China under contract No. 40571129, and the National High-Technology Program of China under contract Nos 2002AA639400, 2003AA604040 and 2003AA637030.
Abstract: Marine information has been increasing rapidly, and traditional database technologies are poorly suited to managing large volumes of marine information tied to 3-D position and time. Recently, greater emphasis has been placed on GIS (geographic information systems) for handling marine information. GIS has achieved great success in terrestrial applications over recent decades, but its use in marine fields has been far more restricted. A main reason is that most GIS systems and their data models are designed for land applications; they cope poorly with the nature of the marine environment and of marine information, which poses a fundamental challenge to traditional GIS and its data structures. This work designs a data model, the raster-based spatio-temporal hierarchical data model (RSHDM), for marine information systems and for knowledge discovery from spatio-temporal data. It is grounded in the nature of marine data and overcomes the shortcomings of current spatio-temporal models when applied in this field. As an experiment, a marine fishery data warehouse (FDW) for marine fishery management was built on the RSHDM. The experiment showed that the RSHDM handles the data well and easily extracts the aggregations that management needs at different levels.
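As a rough illustration of the hierarchical aggregation the RSHDM is described as supporting, the sketch below rolls a raster of monthly values up in time and then to coarser spatial levels. The grid shape, aggregation factor, and variable names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def aggregate_level(grid, factor=2):
    """Roll a 2-D raster up one hierarchy level by summing factor x factor blocks."""
    rows, cols = grid.shape
    return grid.reshape(rows // factor, factor, cols // factor, factor).sum(axis=(1, 3))

# Hypothetical monthly catch rasters: 12 months of an 8 x 8 cell grid.
rng = np.random.default_rng(0)
monthly = rng.random((12, 8, 8))

# Temporal roll-up (months -> year), then spatial roll-ups (level 0 -> 1 -> 2).
yearly = monthly.sum(axis=0)
level1 = aggregate_level(yearly)   # 4 x 4
level2 = aggregate_level(level1)   # 2 x 2
print(level1.shape, level2.shape)
```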
Funding: Under the auspices of the National High Technology Research and Development Program of China (No. 2007AA12Z242).
Abstract: Incremental updating, which better guarantees that a navigational map reflects the real-time situation, is the development direction of navigational road network updating. The data center of a vehicle navigation system is in charge of storing incremental data, and the spatio-temporal data model used for that storage directly affects how efficiently the data center responds to requests for incremental data from vehicle terminals. Based on an analysis of the shortcomings of several typical spatio-temporal data models used in data centers, and building on the base map with overlay model, the reverse map with overlay model (RMOM) is proposed so that the data center can respond rapidly to incremental data requests. RMOM lets the data center store not only the current complete road network data but also, for each past version of the road network, the overlay of incremental data from the time of that change to the current moment. The storage mechanism and index structure of the incremental data were designed, and the implementation algorithm of RMOM was developed. Taking the navigational road network of Guangzhou City as an example, a simulation test was conducted to validate the efficiency of RMOM. Results show that with RMOM the navigation database in the data center can answer an incremental data request with a single query and in less time. Compared with the base map with overlay model, the data center does not need to overlay incremental data on the fly, so response time is significantly reduced. RMOM greatly improves response efficiency and provides strong support for keeping the navigational road network current.
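A minimal sketch of the reverse-overlay idea as described above: alongside the current map, the store keeps one precomputed overlay per historical version, so serving a client takes a single lookup rather than a runtime merge of deltas. The dictionary layout and names are assumptions for illustration.

```python
# Hypothetical RMOM-style store. Each client version maps to the precomputed
# overlay of all changes from that version up to the current moment.
current_version = 5
reverse_overlays = {
    3: [("add_edge", "E101"), ("del_edge", "E042"), ("add_edge", "E117")],
    4: [("add_edge", "E117")],
    5: [],  # client already current: empty overlay
}

def answer_incremental_request(client_version):
    """One lookup, no runtime merging of per-version deltas (the RMOM claim)."""
    return reverse_overlays[client_version]

print(answer_incremental_request(3))
```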
Funding: Sponsored by the Beijing Municipal Natural Science Foundation (4082027).
Abstract: To address the unbalanced data problem in learning models for semantic concepts, an optimized modeling method based on the posterior probability support vector machine (PPSVM) is presented, together with a neighbor-based posterior probability estimator for visual concepts. The method was applied in a high-level visual semantic concept classification system, and the experimental results show enhanced performance over the baseline SVM models as well as improved robustness in high-level visual semantic concept classification.
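One common way to realize a neighbor-based posterior estimate is sketched below, under the assumption (not stated in the abstract) that the fraction of positive labels among each point's k nearest neighbors serves as its posterior and then weights the SVM training samples. The data and weighting rule are invented for illustration.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

# Hypothetical unbalanced toy data: 200 negatives, 20 positives.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(2, 1, (20, 2))])
y = np.array([0] * 200 + [1] * 20)

# Neighbor-based posterior: share of positive labels among k nearest neighbors.
k = 10
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
_, idx = nn.kneighbors(X)
posterior = y[idx[:, 1:]].mean(axis=1)  # column 0 is the point itself

# Use the posterior to weight samples in an otherwise standard SVM (assumed rule).
weights = np.where(y == 1, 0.5 + posterior, 1.5 - posterior)
clf = SVC(kernel="rbf").fit(X, y, sample_weight=weights)
print(clf.score(X, y))
```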
Funding: Under the auspices of the National Key Research and Development Program of China (No. 2017YFA0603002), the National Natural Science Foundation of China (No. 31800358, 31700369), the Jiangsu Agricultural Science and Technology Innovation Fund (No. CX(19)3099), and the Foundation of Jiangsu Vocational College of Agriculture and Forestry (No. 2019kj014).
Abstract: Detailed information on the spatio-temporal changes of cropland soil organic carbon (SOC) can contribute significantly to improving soil fertility and mitigating climate change. Nonetheless, knowledge of the national-scale spatio-temporal changes of SOC in Chinese upland soils, and of the corresponding uncertainties, remains limited. The CENTURY model was used to estimate SOC storage and its changes in Chinese uplands from 1980 to 2010, and the Monte Carlo method was used to quantify the uncertainties of the CENTURY-modelled SOC dynamics associated with spatially heterogeneous model inputs. Results revealed that SOC storage in Chinese uplands increased from 3.03 (1.59 to 4.78) Pg C in 1980 to 3.40 (2.39 to 4.62) Pg C in 2010, an increment of 370 Tg C with an uncertainty interval of –440 to 1110 Tg C. Regional disparities in SOC change were significant, with considerable SOC accumulation in the Huang-Huai-Hai Plain and SOC loss in northeastern China. SOC loss from Meadow soils, Black soils and Chernozems was most severe, whereas accumulation in Fluvo-aquic soils, Cinnamon soils and Purplish soils was most pronounced. In modelling large-scale SOC dynamics, the initial soil properties were the major source of uncertainty; hence, more detailed information on soil properties must be collected. The SOC stock of Chinese uplands in 2010 was still relatively low, indicating that recommended agricultural management practices, combined with effective economic and policy incentives for farmers to improve soil fertility, are indispensable for future carbon sequestration in these regions.
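The Monte Carlo treatment of input uncertainty described above can be illustrated schematically: sample the uncertain inputs many times, run the model on each draw, and report percentile bounds on the outputs. The stand-in model and the distributions below are assumptions; the real study drives the CENTURY model.

```python
import numpy as np

rng = np.random.default_rng(42)

def soc_model(initial_soc, carbon_input):
    """Stand-in for a CENTURY run: SOC after 30 years (illustrative only)."""
    return initial_soc * 0.98 + carbon_input * 30 * 0.1

# Sample uncertain inputs (hypothetical distributions) and propagate them.
n = 10_000
initial_soc = rng.normal(3.0, 0.5, n)      # Pg C
carbon_input = rng.normal(0.05, 0.02, n)   # Pg C per year
soc_2010 = soc_model(initial_soc, carbon_input)
change = soc_2010 - initial_soc

lo, hi = np.percentile(change, [2.5, 97.5])
print(f"SOC change: {change.mean():.2f} Pg C ({lo:.2f} to {hi:.2f})")
```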
Abstract: The development of spatio-temporal data models is reviewed. In view of the soil characteristics of reclaimed land, the base state with amendments model over multi-layer rasters is adopted to organize the spatio-temporal data, with a combined linear quadtree and linear octree structure used for coding. The advantage of this model is that information for a given layer can be obtained easily and the data can be analyzed in an integrated way with other methods. Methods for retrieving and analyzing the data are then introduced. The approach provides a tool for studying the change and spatial distribution of soil characteristics in reclaimed land.
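A linear quadtree stores Morton (interleaved-bit) codes of occupied cells instead of pointer-based tree nodes; the octree case interleaves three coordinates instead of two. A minimal sketch of the 2-D encoding follows, with the grid contents assumed for illustration.

```python
def morton_encode(row, col, bits=8):
    """Interleave the bits of (row, col) into a linear quadtree key."""
    code = 0
    for i in range(bits):
        code |= ((row >> i) & 1) << (2 * i + 1)
        code |= ((col >> i) & 1) << (2 * i)
    return code

# Hypothetical raster layer: store non-empty cells as sorted (key, value) pairs.
cells = {(3, 5): "soil_A", (3, 6): "soil_A", (10, 2): "soil_B"}
linear_quadtree = sorted((morton_encode(r, c), v) for (r, c), v in cells.items())
print(linear_quadtree)
```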
Abstract: Detecting naturally arising structures in data is central to knowledge extraction. In most applications, the main challenge lies in choosing an appropriate model for exploring the data features; the choice is generally poorly understood, and any tentative choice may be too restrictive. Growing data volumes, disparate data sources and diverse modelling techniques call for model optimization via adaptability rather than mere comparability. We propose a novel two-stage algorithm for modelling continuous data, consisting of an unsupervised stage in which the algorithm searches the data for optimal parameter values and a supervised stage that adapts those parameters for predictive modelling. The method is implemented on the sunspots data, which have inherently Gaussian distributional properties and assumed bi-modality. Optimal values separating high from low cycles are obtained via multiple simulations. Early patterns in each recorded cycle reveal that the first 3 years provide a sufficient basis for predicting the peak. Multiple Support Vector Machine runs using repeatedly improved data parameters show that the approach yields greater accuracy and reliability than conventional approaches and provides a good basis for model selection. Model reliability is established via multiple simulations of this type.
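The two-stage idea, unsupervised parameter search followed by supervised prediction, might be sketched as below: a two-component mixture splits cycles into high and low, and an SVM then predicts the class from early-cycle features alone. The synthetic data and the choice of a Gaussian mixture for stage one are assumptions, not the paper's procedure.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(7)

# Hypothetical cycles: peak amplitude plus the first 3 years of observations.
peaks = np.concatenate([rng.normal(80, 10, 30), rng.normal(150, 15, 30)])
early_years = peaks[:, None] * 0.02 + rng.normal(0, 0.5, (60, 3))

# Stage 1 (unsupervised): learn the high/low separation from the peaks alone.
gmm = GaussianMixture(n_components=2, random_state=0).fit(peaks.reshape(-1, 1))
labels = gmm.predict(peaks.reshape(-1, 1))

# Stage 2 (supervised): predict the peak class from the first 3 years only.
clf = SVC(kernel="rbf").fit(early_years, labels)
print("training accuracy:", clf.score(early_years, labels))
```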
Funding: Supported by the National Natural Science Foundation of China (No. U1960202).
Abstract: With increasing automation and informatization in the steelmaking industry, human operators can no longer cope with the growing amount of data generated during the steelmaking process. Machine learning offers a way of handling such data beyond production experience and metallurgical principles, and its application to the steelmaking process has become a research hotspot in recent years. This paper reviews applications of machine learning in steelmaking process modeling, covering hot metal pretreatment, primary steelmaking, secondary refining, and other aspects. The three most frequently used algorithms are the artificial neural network, the support vector machine, and case-based reasoning, accounting for 56%, 14%, and 10% of studies, respectively. Data collected in steelmaking plants are frequently faulty, so data processing, especially data cleaning, is crucial to the performance of machine learning models. Variable importance detection can be used to optimize process parameters and guide production. In hot metal pretreatment, machine learning is used mainly for endpoint S content prediction; in primary steelmaking, prediction of endpoint element compositions and process parameters is widely investigated; in secondary refining, applications concentrate on the ladle furnace, Ruhrstahl–Heraeus, vacuum degassing, argon oxygen decarburization, and vacuum oxygen decarburization processes. Further development of machine learning in steelmaking process modeling will require building data platforms, transferring research achievements to practical steelmaking, and improving the universality of machine learning models.
Funding: Project supported by the National Natural Science Foundation of China (No. 49871066).
Abstract: Current GIS can only handle 2-D or 2.5-D information on the earth's surface; a new 3-D data structure and data model are needed for 3-D GIS. This paper analyzes diverse 3-D spatial phenomena, from mining to geology, together with their complicated relations, and proposes several new kinds of spatial objects, including the cross-section, the column body and the digital surface model, to represent special spatial phenomena such as tunnels and the irregular surfaces of an ore body. An integrated data structure combining vector, raster and object-oriented data models is used to represent the various 3-D spatial objects and their relations. This integrated data structure and object-oriented data model can serve as the basis for designing and realizing a 3-D geographic information system.
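An object-oriented reading of the proposed spatial objects might look like the sketch below. The class names mirror the abstract, but every attribute and the base-class design are assumptions, not the paper's schema.

```python
from dataclasses import dataclass, field

@dataclass
class SpatialObject3D:
    """Hypothetical base class for the 3-D spatial objects named in the abstract."""
    object_id: str

@dataclass
class CrossSection(SpatialObject3D):
    # Vector representation: a closed polygon on a cutting plane (assumed).
    plane_normal: tuple = (0.0, 0.0, 1.0)
    vertices: list = field(default_factory=list)

@dataclass
class ColumnBody(SpatialObject3D):
    # Raster-like representation: footprint cell plus top/bottom elevations (assumed).
    footprint: tuple = (0, 0)
    z_top: float = 0.0
    z_bottom: float = 0.0

@dataclass
class DigitalSurfaceModel(SpatialObject3D):
    # Grid of elevations for irregular surfaces such as an ore body roof (assumed).
    elevations: list = field(default_factory=list)

tunnel_section = CrossSection("T-01", vertices=[(0, 0), (4, 0), (4, 3), (0, 3)])
print(tunnel_section)
```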
Funding: This study stemmed from a research project (code number: 96000838) sponsored by the Institute for Futures Studies in Health at Kerman University of Medical Sciences.
Abstract: Workers' exposure to excessive noise is a major universal occupational challenge, and one of its chief consequences is permanent or transient hearing loss. This study used audiometric data to weight and prioritize the factors affecting workers' hearing loss by means of the Support Vector Machine (SVM) algorithm. The cross-sectional descriptive study was conducted in 2017 in a mining industry in southeast Iran. The participating workers (n = 150) were divided into three groups of 50 based on the sound pressure level (SPL) to which they were exposed (two experimental groups and one control group), and audiometric tests were carried out for all members of each group. The study entailed the following steps: (1) selecting predictor variables for weighting and prioritizing the factors affecting hearing loss; (2) conducting audiometric tests, assessing permanent hearing loss in each ear, and evaluating total hearing loss; (3) categorizing the types of hearing loss; (4) weighting and prioritizing the factors using the SVM algorithm; and (5) assessing the error rate and accuracy of the models. The collected data were fed into SPSS 18, followed by linear regression and paired-samples t-tests. In the first model (SPL < 70 dBA), the frequency of 8 kHz had the greatest impact (weight 33%) and noise the smallest (weight 5%); model accuracy was 100%. In the second model (70 < SPL < 80 dBA), 4 kHz had the strongest effect (weight 21%) and 250 Hz the weakest (weight 6%); accuracy was again 100%. In the third model (SPL > 85 dBA), 4 kHz had the highest impact (weight 22%) and 250 Hz the smallest (weight 3%); accuracy was again 100%. In the fourth model, 4 kHz had the greatest effect (weight 24%) and 500 Hz the smallest (weight 4%); accuracy was 94%. According to the SVM modeling, the frequency of 4 kHz has the strongest effect in predicting changes in hearing loss. Given the high accuracy of the obtained models, the SVM algorithm is an appropriate and powerful tool for predicting and modeling hearing loss.
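One standard way to turn an SVM into factor weights of the kind reported above is to normalize the absolute coefficients of a linear SVM into percentages. The features and data below are invented for illustration; the paper's exact weighting scheme is not specified in the abstract.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(3)

# Hypothetical predictors per worker: audiometric frequencies plus noise exposure.
features = ["250Hz", "500Hz", "1kHz", "2kHz", "4kHz", "8kHz", "noise"]
X = rng.normal(size=(150, len(features)))
y = (X[:, 4] * 1.2 + X[:, 5] * 0.8 + rng.normal(0, 0.5, 150) > 0).astype(int)

clf = LinearSVC(dual=False).fit(X, y)  # hearing-loss class from predictors
weights = np.abs(clf.coef_[0])
weights = 100 * weights / weights.sum()  # percentage importance per factor

for name, w in sorted(zip(features, weights), key=lambda t: -t[1]):
    print(f"{name}: {w:.0f}%")
```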
Abstract: The purpose of this paper is to study the theory of conservative estimating functions in nonlinear regression models with aggregated data. In this setting, a quasi-score function with aggregated data is defined. When this function happens to be conservative, it is the projection of the true score function onto a class of estimating functions. By construction, the potential function for the projected score with aggregated data is obtained, and it shares some properties of the log-likelihood function.
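For orientation, the quasi-score that such constructions typically start from has the familiar form below; this is the standard definition from quasi-likelihood theory, not a formula quoted from the paper, and the paper's aggregated-data version is an analogue of it.

```latex
% Standard quasi-score for a nonlinear regression with E[y_i] = \mu_i(\beta)
% and Var(y_i) = \phi V(\mu_i); "conservative" means U is a gradient field.
U(\beta) = \sum_{i=1}^{n} \frac{\partial \mu_i(\beta)}{\partial \beta}\,
           \frac{y_i - \mu_i(\beta)}{\phi\, V(\mu_i(\beta))},
\qquad
U \text{ conservative} \iff
\frac{\partial U_j}{\partial \beta_k} = \frac{\partial U_k}{\partial \beta_j}.
```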
Funding: National Natural Science Foundation of China (No. 61374140); the Youth Foundation of the National Natural Science Foundation of China (No. 61403072).
Abstract: Complex industrial processes often need multiple operating modes to meet changing production conditions, and within a given mode the samples belonging to it may be discrete and sparse. It is therefore important to account for samples that are sparse within a mode. To address this issue, a new approach called density-based support vector data description (DBSVDD) is proposed, and an algorithm combining the Gaussian mixture model (GMM) with the DBSVDD technique is developed for process monitoring. The GMM is used to obtain the center of each mode and to determine the number of modes; given the complexity of the data distribution and the discrete samples encountered in monitoring, DBSVDD is then used for the process monitoring itself. Finally, the validity and effectiveness of the DBSVDD method are illustrated on the Tennessee Eastman (TE) process.
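A rough sketch of the GMM-then-per-mode-boundary pipeline: fit a mixture to find the modes, then fit one one-class boundary per mode. Here sklearn's OneClassSVM stands in for the paper's density-based SVDD, which is a different, stronger variant; data and parameters are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(5)

# Hypothetical two-mode process data.
X = np.vstack([rng.normal(0, 0.5, (200, 2)), rng.normal(4, 0.5, (200, 2))])

# Step 1: GMM finds the modes and their centers (number of modes fixed at 2 here).
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
modes = gmm.predict(X)

# Step 2: one boundary per mode; OneClassSVM stands in for DBSVDD.
boundaries = {m: OneClassSVM(nu=0.05, gamma=2.0).fit(X[modes == m]) for m in (0, 1)}

def monitor(x):
    """Flag a new sample as faulty if its mode's boundary rejects it."""
    m = int(gmm.predict(x.reshape(1, -1))[0])
    return "normal" if boundaries[m].predict(x.reshape(1, -1))[0] == 1 else "fault"

print(monitor(np.array([0.1, 0.2])), monitor(np.array([2.0, 2.0])))
```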
Funding: This research was funded by the National Key Research and Development Plan (2018YFB0505300), the Guangxi Science and Technology Major Project (AA18118025), the Opening Foundation of the Key Laboratory of Environment Change and Resources Use in Beibu Gulf, Ministry of Education (Nanning Normal University), and the Guangxi Key Laboratory of Earth Surface Processes and Intelligent Simulation (Nanning Normal University) (No. NNNU-KLOP-K1905).
Abstract: Marine big data are characterized by large volume and complex structure, which pose great challenges to data management and retrieval. Based on the GeoSOT grid code and the composite index structure of the MongoDB database, this paper proposes a spatio-temporal grid index model (STGI) for efficient, optimized querying of marine big data. A spatio-temporal secondary index is created on the spatial code and time code columns to build a composite index in the MongoDB database used to store the massive marine data. Multiple comparative experiments demonstrate that retrieval efficiency with the STGI approach improves by a factor of two to three or more compared with other index models. Theoretical analysis and experimental verification together support the conclusion that the STGI model is well suited to retrieving large-scale spatial data with low time frequency, such as marine big data.
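The composite-index construction described above maps naturally onto MongoDB's compound indexes. A minimal pymongo sketch follows; the collection name, field names, and example codes are assumptions rather than the paper's schema.

```python
from pymongo import ASCENDING, MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumed local instance
coll = client["marine"]["observations"]

# Compound (composite) secondary index on the spatial and time code columns.
coll.create_index([("geosot_code", ASCENDING), ("time_code", ASCENDING)])

# Hypothetical record and a range query that the compound index can serve.
coll.insert_one({"geosot_code": "G001310231", "time_code": 2020071512, "sst": 28.4})
cursor = coll.find({
    "geosot_code": {"$gte": "G0013", "$lt": "G0014"},  # spatial grid prefix range
    "time_code": {"$gte": 2020070100, "$lt": 2020080100},
})
print(list(cursor))
```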
Funding: This work was supported by the Hainan Provincial Natural Science Foundation of China (2018CXTD333, 617048), the National Natural Science Foundation of China (61762033, 61702539), the Hainan University Doctor Start Fund Project (kyqd1328), and the Hainan University Youth Fund Project (qnjj1444).
Abstract: Accurate prediction of the Producer Price Index (PPI) plays an indispensable role in government economic work, yet the PPI is difficult to forecast. This research proposes a novel hybrid model based on fuzzy information granulation that integrates GA-SVR and ARIMA (Autoregressive Integrated Moving Average) models, intended to deal with imprecision in PPI estimation. The model applies a fuzzy information granulation algorithm to pre-process the monthly PPI training samples, producing three sequences of fuzzy information granules; for each sequence, a Support Vector Regression (SVR) forecast model is established with its parameters optimized by a Genetic Algorithm (GA). Finally, the residual errors of the GA-SVR model are corrected through ARIMA modeling to obtain the PPI estimate. Several comparative experiments show that the PPI values predicted by this hybrid model are more accurate than those of other models, including ARIMA, GRNN, and GA-SVR, confirming the precision and validity of the hybrid model's PPI prediction and demonstrating its consistent ability to combine the forecasting strengths of GA-SVR in nonlinear space and of ARIMA in linear space.
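The residual-correction step, fit an SVR and then model what it misses with ARIMA before summing the two forecasts, can be sketched as below. The synthetic series, the lag-feature construction, and the (1, 1, 1) ARIMA order are assumptions, and the fuzzy granulation and GA tuning stages are omitted.

```python
import numpy as np
from sklearn.svm import SVR
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(11)

# Hypothetical monthly index: trend + nonlinearity + autocorrelated noise.
t = np.arange(120)
series = 100 + 0.1 * t + 3 * np.sin(t / 6) + rng.normal(0, 0.5, 120).cumsum() * 0.2

# Stage 1: SVR on lagged values captures the nonlinear part.
lags = 3
X = np.column_stack([series[i : len(series) - lags + i] for i in range(lags)])
y = series[lags:]
svr = SVR(C=10.0, gamma="scale").fit(X, y)
resid = y - svr.predict(X)

# Stage 2: ARIMA models the residual (linear) structure; forecasts are summed.
arima = ARIMA(resid, order=(1, 1, 1)).fit()
next_svr = svr.predict(series[-lags:].reshape(1, -1))[0]
next_resid = arima.forecast(1)[0]
print("hybrid one-step forecast:", next_svr + next_resid)
```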
Abstract: Post-translational modification (PTM) increases the functional diversity of proteins by introducing new functional groups to the side chains of amino acids. Among all residues, the side chain of lysine (K) can undergo many types of PTM, collectively called K-PTM, such as acetylation, crotonylation, methylation and succinylation; several PTMs can also occur on the same lysine of a protein, which makes multi-label PTM site identification necessary. However, most existing computational methods predict single-label PTM sites, and the few developed for the multi-label problem need further improvement. Here we develop a computational tool, mLysPTMpred, to predict multi-label lysine PTM sites by 1) incorporating sequence-coupled information into the general pseudo amino acid composition, 2) balancing the effect of the skewed training dataset with the Different Error Cost method, and 3) constructing a multi-label predictor from a combination of support vector machines (SVMs). The predictor achieved 83.73% accuracy in predicting multi-label PTM sites of the K-PTM types, and all experimental results, including accuracy, outperformed the existing predictor iPTM-mLys. A user-friendly web server for mLysPTMpred is available at http://research.ru.ac.bd/mLysPTMpred/.
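Multi-label prediction with SVMs is commonly realized as one binary SVM per PTM type. In the sketch below, class weighting stands in for the Different Error Cost idea, and the toy feature vectors are invented, not the paper's pseudo amino acid composition.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(9)

# Hypothetical feature vectors for 300 lysine sites, 4 PTM labels per site.
ptm_types = ["acetylation", "crotonylation", "methylation", "succinylation"]
X = rng.normal(size=(300, 20))
Y = (rng.random((300, 4)) < 0.2).astype(int)  # sparse multi-label matrix

# One weighted SVM per label; class_weight="balanced" stands in for
# the Different Error Cost handling of the skewed dataset.
clf = OneVsRestClassifier(SVC(kernel="rbf", class_weight="balanced"))
clf.fit(X, Y)

site = rng.normal(size=(1, 20))
pred = clf.predict(site)[0]
print([name for name, flag in zip(ptm_types, pred) if flag])
```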
Abstract: Studying teacher attention is of great research value for evaluating classroom teaching behavior. However, existing teacher attention recognition algorithms cannot cope with extreme head pose angles. To address this, a teacher attention state recognition algorithm based on the 6DRepNet360 model is proposed, improving the accuracy of head pose estimation at extreme angles. In contrast to traditional methods that classify teacher attention states through conditional rules, a Support Vector Machine (SVM)-based teacher attention classification model is designed to accurately recognize attention states under complex head pose angles. To further remove erroneous data caused by limits on algorithm stability and accuracy, a sliding-window data cleaning algorithm is proposed, effectively improving the authenticity and reliability of the overall recognition results. A series of evaluations on the constructed CCNUTeacherState dataset shows that the proposed teacher attention recognition algorithm achieves 90.67% accuracy.
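The sliding-window cleaning step might be realized as a majority vote over the last N per-frame predictions, smoothing out spurious flips in the recognized state. The window size and the label stream below are assumptions for illustration.

```python
from collections import Counter, deque

def clean_stream(labels, window=5):
    """Replace each per-frame attention label by the majority vote
    over a sliding window of recent frames (assumed cleaning rule)."""
    buf, cleaned = deque(maxlen=window), []
    for lab in labels:
        buf.append(lab)
        cleaned.append(Counter(buf).most_common(1)[0][0])
    return cleaned

# Hypothetical per-frame states: a single spurious 'distracted' flip is removed.
frames = ["attentive"] * 4 + ["distracted"] + ["attentive"] * 4
print(clean_stream(frames))
```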