Short-term building energy predictions serve as one of the fundamental tasks in building operation management. While large numbers of studies have explored the value of various supervised machine learning techniques in energy predictions, few studies have addressed the potential data shortage problem in developing data-driven models. One promising solution is data augmentation, which aims to enrich existing building data resources for reliable predictive modeling. This study proposes a deep generative modeling-based data augmentation strategy for improving short-term building energy predictions. Two types of conditional variational autoencoders have been designed for synthetic energy data generation, using fully connected and one-dimensional convolutional layers respectively. Data experiments have been designed to evaluate the value of data augmentation using actual measurements from 52 buildings. The results indicate that conditional variational autoencoders are capable of generating high-quality synthetic data samples, which in turn helps to enhance the accuracy of short-term building energy predictions. The average performance enhancement ratios in terms of CV-RMSE range between 12% and 18%. Practical guidelines have been obtained to ensure the validity and quality of synthetic building energy data. The research outcomes are valuable for enhancing the robustness and reliability of data-driven models for smart building operation management.
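The abstract above reports gains as enhancement ratios in CV-RMSE without stating the formulas. A minimal sketch follows, assuming the common definition of CV-RMSE (RMSE divided by the mean of the measured values) and assuming the enhancement ratio is the relative reduction in CV-RMSE; the function names and the numbers are hypothetical and are not taken from the study.

```python
import math

def cv_rmse(actual, predicted):
    """RMSE divided by the mean of the actual values (assumed definition)."""
    n = len(actual)
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    return rmse / (sum(actual) / n)

def enhancement_ratio(baseline, augmented):
    """Relative reduction in error after augmentation (assumed definition)."""
    return (baseline - augmented) / baseline

# Hypothetical hourly energy readings and two sets of predictions.
actual = [100.0, 120.0, 80.0, 100.0]
pred_baseline = [110.0, 110.0, 90.0, 90.0]    # model trained on real data only
pred_augmented = [108.0, 112.0, 88.0, 92.0]   # model trained on real + synthetic data

baseline_score = cv_rmse(actual, pred_baseline)
augmented_score = cv_rmse(actual, pred_augmented)
ratio = enhancement_ratio(baseline_score, augmented_score)
print(f"CV-RMSE baseline={baseline_score:.3f}, augmented={augmented_score:.3f}, ratio={ratio:.1%}")
```

Some guidelines divide by n minus the number of model parameters rather than n, so the denominator here is a simplification.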
Over the past decade, the presence of mistletoe (Viscum album ssp. austriacum) in Scots pine stands has increased in many European countries. Understanding the factors that influence the occurrence of mistletoe in stands is key to making appropriate forest management decisions to limit damage and prevent the spread of mistletoe in the future. Therefore, the main objective of this study was to determine the probability of mistletoe occurrence in Scots pine stands in relation to stand-related endogenous factors such as age, top height, and stand density, as well as topographic and edaphic factors. We used unmanned aerial vehicle (UAV) imagery from 2,247 stands to detect mistletoe in Scots pine stands, while the majority of stand and site characteristics were calculated from airborne laser scanning (ALS) data. Information on stand age and site type from the State Forest database was also used. We found that mistletoe infestation in Scots pine stands is influenced by stand and site characteristics. We documented that the densest, tallest, and oldest stands were more susceptible to mistletoe infestation. Site type and specific microsite conditions associated with topography were also important factors driving mistletoe occurrence. In addition, climatic water balance was a significant factor in increasing the probability of mistletoe occurrence, which is important in the context of predicted temperature increases associated with climate change. Our results are important for better understanding patterns of mistletoe infestation and ecosystem functioning under climate change. In an era of climate change and technological development, the use of remote sensing methods to determine the risk of mistletoe infestation can be a very useful tool for managing forest ecosystems to maintain forest sustainability and prevent forest disturbance.
Recently, generative adversarial networks (GANs) have become a research focus of artificial intelligence. Inspired by the two-player zero-sum game, GANs comprise a generator and a discriminator, both trained under the adversarial learning idea. The goal of GANs is to estimate the potential distribution of real data samples and generate new samples from that distribution. Since their introduction, GANs have been widely studied due to their enormous prospects for applications, including image and vision computing, speech and language processing, etc. In this review paper, we summarize the state of the art of GANs and look into the future. Firstly, we survey the background of GANs' proposal, their theoretical and implementation models, and their application fields. Then, we discuss GANs' advantages and disadvantages, and their development trends. In particular, we investigate the relation between GANs and parallel intelligence, with the conclusion that GANs have great potential in parallel systems research in terms of virtual-real interaction and integration. Clearly, GANs can provide substantial algorithmic support for parallel intelligence.
Habitat suitability index (HSI) models have been widely used to analyze the relationship between species abundance and environmental factors, and ultimately inform management of marine species. The response of species abundance to each environmental variable is different, and habitat requirements may change over life history stages and seasons. Therefore, it is necessary to determine the optimal combination of environmental variables in HSI modelling. In this study, generalized additive models (GAMs) were used to determine which environmental variables should be included in the HSI models. Significant variables were retained and weighted in the HSI model according to their relative contribution (%) to the total deviation explained by the boosted regression tree (BRT). The HSI models were applied to evaluate the habitat suitability of mantis shrimp Oratosquilla oratoria in the Haizhou Bay and adjacent areas in 2011 and 2013–2017. Ontogenetic and seasonal variations in HSI models of mantis shrimp were also examined. Among the four models (non-optimized model, BRT informed HSI model, GAM informed HSI model, and both BRT and GAM informed HSI model), the both BRT and GAM informed HSI model showed the best performance. Four environmental variables (bottom temperature, depth, distance offshore and sediment type) were selected in the HSI models for four groups (spring-juvenile, spring-adult, fall-juvenile and fall-adult) of mantis shrimp. The distribution of habitat suitability showed similar patterns between juveniles and adults, but obvious seasonal variations were observed. This study suggests that the process of optimizing environmental variables in HSI models improves the performance of HSI models, and this optimization strategy could be extended to other marine organisms to enhance the understanding of the habitat suitability of target species.
In this paper, we propose a novel coverless image steganographic scheme based on a generative model. In our scheme, the secret image is first fed to the generative model database to generate a meaning-normal and independent image different from the secret image. The generated image is then transmitted to the receiver and fed to the generative model database to generate another image visually the same as the secret image. Thus, we only need to transmit the meaning-normal image, which is not related to the secret image, and we can achieve the same effect as transmitting the secret image. In contrast with traditional image steganography, this is the first time a coverless image information steganographic scheme based on a generative model has been proposed. In this method the transmitted image is not embedded with any information of the secret image, and it can therefore effectively resist steganalysis tools. Experimental results show that our scheme has high capacity, security and reliability.
This research develops a new mathematical modeling method by combining industrial big data and process mechanism analysis under the framework of generalized additive models (GAM) to generate a practical model with generalization and precision. Specifically, the proposed modeling method includes the following steps. Firstly, the influence factors are screened using mechanism knowledge and data-mining methods. Secondly, the unary GAM without interactions is constructed, which includes cleaning the data, building the sub-models, and verifying the sub-models. Subsequently, the interactions between the various factors are explored, and the binary GAM with interactions is constructed. The relationships among the sub-models are analyzed, and the integrated model is built. Finally, based on the proposed modeling method, two prediction models of mechanical property and deformation resistance for hot-rolled strips are established. Verification with actual industrial data demonstrates that the new models have good prediction precision, and the mean absolute percentage errors of tensile strength, yield strength and deformation resistance are 2.54%, 3.34% and 6.53%, respectively. The experimental results suggest that the proposed method offers a new approach to industrial process modeling.
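The precision figures above are mean absolute percentage errors (MAPE). As a minimal sketch of how such a figure is computed, assuming the standard definition of MAPE, the snippet below uses hypothetical tensile strength values that are not taken from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (standard definition)."""
    # Undefined when any actual value is zero; the caller must ensure nonzero targets.
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical tensile strengths in MPa: measured vs. model output.
measured = [400.0, 500.0]
modeled = [410.0, 490.0]

error_pct = mape(measured, modeled)
print(f"MAPE = {error_pct:.2f}%")
```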
For reservoirs with complex non-Gaussian geological characteristics, such as carbonate reservoirs or reservoirs with sedimentary facies distribution, it is difficult to implement history matching directly, especially for the ensemble-based data assimilation methods. In this paper, we propose a multi-source information fused generative adversarial network (MSIGAN) model, which is used for parameterization of the complex geologies. In MSIGAN, various information such as facies distribution, microseismic data, and inter-well connectivity can be integrated to learn the geological features. Two major generative models in deep learning, the variational autoencoder (VAE) and the generative adversarial network (GAN), are combined in our model. The proposed MSIGAN model is then integrated into the ensemble smoother with multiple data assimilation (ESMDA) method to conduct history matching. We tested the proposed method on two reservoir models with fluvial facies. The experimental results show that the proposed MSIGAN model can effectively learn the complex geological features, which can improve the accuracy of history matching.
Predictive models for assessing the risk of developing lung cancers can help identify high-risk individuals with the aim of recommending further screening and early intervention. To facilitate pre-hospital self-assessments, some studies have exploited predictive models trained on non-clinical data (e.g., smoking status and family history). The performance of these models is limited due to not considering clinical data (e.g., blood test and medical imaging results). Deep learning has shown the potential in processing complex data that combine both clinical and non-clinical information. However, predicting lung cancers remains difficult due to the severe lack of positive samples among follow-ups. To tackle this problem, this paper presents a generative-discriminative framework for improving the ability of deep learning models to generalize. According to the proposed framework, two nonlinear generative models, one based on the generative adversarial network and another on the variational autoencoder, are used to synthesize auxiliary positive samples for the training set. Then, several discriminative models, including a deep neural network (DNN), are used to assess the lung cancer risk based on a comprehensive list of risk factors. The framework was evaluated on over 55,000 subjects questioned between January 2014 and December 2017, with 699 subjects being clinically diagnosed with lung cancer between January 2014 and August 2019. According to the results, the best performing predictive model built using the proposed framework was based on DNN. It achieved an average sensitivity of 76.54% and an area under the curve of 69.24% in distinguishing between the cases of lung cancer and normal cases on test sets.
The trapdoor is a key component of public key cryptography design, which is the essential security foundation of modern cryptography. Normally, the traditional way of designing a trapdoor is to identify a computationally hard problem, such as the NPC problems. As a result, the trapdoor in a public key encryption mechanism turns out to be a type of limited resource. In this paper, we generalize the methodology of the adversarial learning model in artificial intelligence and introduce a novel way to conveniently obtain sub-optimal and computationally hard trapdoors based on the automatic information theoretic search technique. The basic routine is constructing a generative architecture to search and discover a probabilistic reversible generator which can correctly encode and decode any input messages. The architecture includes a trapdoor generator built on a variational autoencoder (VAE) responsible for searching the appropriate trapdoors satisfying a maximum of entropy, a random message generator yielding random noise, and a dynamic classifier taking the results of the two generators. The evaluation of our construction shows that the architecture satisfies basic indistinguishability of outputs under the chosen-plaintext attack (CPA) model and achieves high efficiency in generating cheap trapdoors.
Fault monitoring of a bioprocess is important to ensure the safety of a reactor and maintain high quality of products. It is difficult to build an accurate mechanistic model for a bioprocess, so fault monitoring based on a rich historical or online database is an effective way. With the bootstrap method, a group of data can be resampled stochastically, improving the generalization capability of the model. In this paper, online fault monitoring using generalized additive models (GAMs) combined with bootstrap is proposed for the glutamate fermentation process. GAMs and bootstrap are first used to decide the confidence interval based on the online and off-line normal sampled data from glutamate fermentation experiments. Then GAMs are used for online fault monitoring of time, dissolved oxygen, oxygen uptake rate, and carbon dioxide evolution rate. The method can provide accurate fault alarms online and is helpful in providing useful information for removing faults and abnormal phenomena in the fermentation.
Background: In this paper, a regression model for predicting the spatial distribution of forest cockchafer larvae in the Hessian Ried region (Germany) is presented. The forest cockchafer, a native biotic pest, is a major cause of damage in forests in this region, particularly during the regeneration phase. The model developed in this study is based on a systematic sample inventory of forest cockchafer larvae by excavation across the Hessian Ried. These forest cockchafer larvae data were characterized by excess zeros and overdispersion. Methods: Using specific generalized additive regression models, different discrete distributions, including the Poisson, negative binomial and zero-inflated Poisson distributions, were compared. The methodology employed allowed the simultaneous estimation of non-linear model effects of causal covariates and, to account for spatial autocorrelation, of a 2-dimensional spatial trend function. In the validation of the models, both the Akaike information criterion (AIC) and more detailed graphical procedures based on randomized quantile residuals were used. Results: The negative binomial distribution was superior to the Poisson and the zero-inflated Poisson distributions, providing a near perfect fit to the data, which was proven in an extensive validation process. The causal predictors found to affect the density of larvae significantly were distance to water table and percentage of pure clay layer in the soil to a depth of 1 m. Model predictions showed that larva density increased with an increase in distance to the water table up to almost 4 m, after which it remained constant, and with a reduction in the percentage of pure clay layer. However, this latter correlation was weak and requires further investigation. The 2-dimensional trend function indicated a strong spatial effect, and thus explained by far the highest proportion of variation in larva density. Conclusions: As such, the model can be used to support forest practitioners in their decision making for regeneration and forest protection planning in the Hessian Ried. However, the application of the model for predicting future spatial patterns of the larva density is still somewhat limited because the causal effects are comparatively weak.
A mode of ontology-based information integration and management (OIIM) for the testability scheme was proposed by expounding the connotation of the system testability scheme. To address the complexity of influencing factors in the optimal design procedure of the testability scheme, the information of concept entities, concept attributions and concept relationships was analyzed and extracted, and then the testability scheme information ontology (TSIO) was built and coded via the web ontology language (OWL). Based on the information ontology, the generalized model for testability scheme (GMTS) was founded by defining transformation rules. The preliminary study shows that the mode of OIIM for the testability scheme can make up for the deficiencies in knowledge representation and reasoning that exist in traditional information models, and achieve information sharing and reuse. It provides an effective model basis for the optimal design of the testability scheme.
Spectrum management and resource allocation (RA) problems are challenging and critical in a vast number of research areas such as wireless communications and computer networks. The traditional approaches for solving such problems usually consume considerable time and memory, especially for large-size problems. Recently, different machine learning approaches have been considered as potential promising techniques for combinatorial optimization problems, especially the generative models of deep neural networks. In this work, we propose a resource allocation deep autoencoder network, as one of the promising generative models, for enabling spectrum sharing in underlay device-to-device (D2D) communication by solving linear sum assignment problems (LSAPs). Specifically, we investigate the performance of three different architectures for the conditional variational autoencoder (CVAE). The three proposed architectures are the convolutional neural network (CVAE-CNN) autoencoder, the feed-forward neural network (CVAE-FNN) autoencoder, and the hybrid (H-CVAE) autoencoder. The simulation results show that the proposed approach could be used as a replacement for conventional RA techniques, such as the Hungarian algorithm, due to its ability to find solutions of LSAPs of different sizes with high accuracy and very fast execution time. Moreover, the simulation results reveal that the proposed hybrid autoencoder architecture outperforms the other proposed architectures and the state-of-the-art DNN techniques in accuracy.
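For readers unfamiliar with the linear sum assignment problem mentioned above, the sketch below states it concretely: given a square cost matrix, pick one column per row, with no column reused, so that the total cost is minimal. It uses exhaustive search over permutations purely for illustration; it is neither the Hungarian algorithm nor the autoencoder approach of the paper, and the cost matrix is hypothetical.

```python
from itertools import permutations

def solve_lsap_brute_force(cost):
    """Return (best_assignment, best_cost) for a square cost matrix.

    best_assignment[i] is the column assigned to row i.
    Exhaustive search is O(n!) and only practical for tiny matrices.
    """
    n = len(cost)
    best_assignment, best_cost = None, float("inf")
    for cols in permutations(range(n)):
        total = sum(cost[row][col] for row, col in enumerate(cols))
        if total < best_cost:
            best_assignment, best_cost = cols, total
    return best_assignment, best_cost

# Hypothetical costs, e.g. interference when D2D pair i shares channel j.
cost_matrix = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]

assignment, total_cost = solve_lsap_brute_force(cost_matrix)
print(assignment, total_cost)
```

The factorial growth of this search is the reason polynomial-time methods such as the Hungarian algorithm, or learned approximations like those in the abstract, are used for larger instances.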
In recent years, an increasing number of studies about quantum machine learning have not only provided powerful tools for quantum chemistry and quantum physics but also improved classical learning algorithms. The hybrid quantum-classical framework, which is constructed from a variational quantum circuit (VQC) and an optimizer, plays a key role in the latest quantum machine learning studies. Nevertheless, in these hybrid-framework-based quantum machine learning models, the VQC is mainly constructed with a fixed structure, and this structure causes inflexibility problems. There are also few studies focused on comparing the performance of quantum generative models with different loss functions. In this study, we address the inflexibility problem by adopting the variable-depth VQC model to automatically change the structure of the quantum circuit according to the qBAS score. The basic idea behind the variable-depth VQC is to consider the depth of the quantum circuit as a parameter during the training. Meanwhile, we compared the performance of the variable-depth VQC model based on four widely used statistical distances set as the loss functions, including Kullback-Leibler divergence (KL-divergence), Jensen-Shannon divergence (JS-divergence), total variation distance, and maximum mean discrepancy. Our numerical experiment shows a promising result that the variable-depth VQC model works better than the original VQC in the generative learning tasks.
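Three of the four loss functions named above have simple closed forms for discrete distributions. The sketch below uses the standard textbook definitions with natural logarithms; it is not the paper's implementation, the example distributions are hypothetical, and maximum mean discrepancy is omitted because it additionally requires a choice of kernel.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p_i = 0 contribute 0."""
    # Assumes q_i > 0 wherever p_i > 0, otherwise the divergence is infinite.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def total_variation(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# Hypothetical target distribution and model output over two basis states.
target = [0.5, 0.5]
model = [0.25, 0.75]

print(kl_divergence(target, model), js_divergence(target, model), total_variation(target, model))
```

Unlike KL divergence, the JS divergence and total variation distance are symmetric and bounded, which is one reason they may behave differently as training losses.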
Stochastic models are derived to estimate the level of coliform count in terms of the MPN index, one of the most important water quality characteristics of ground water, based on a set of water source location and soil characteristics. The study is based on about twenty location and soil characteristics, most of which were observed through laboratory analysis of soil and water samples collected from nearly three hundred locations of drinking water sources, wells and bore wells, selected at random from the district of Kasaragod. Water contamination in wells is found to be relatively higher than in bore wells. The study reveals that only 7% of the wells and 40% of the bore wells of the district are within the permissible limit of the WHO standard of drinking water quality. The level of contamination is very high in hospital premises and very low in forest areas. Two separate multiple ordinal logistic regression models are developed to predict the level of coliform count, one for wells and the other for bore wells. The significant feature of this study is that, in addition to scientifically proving the dependence of the water quality on the distances from waste disposal areas, septic tanks, etc., it highlights the dependence on two other very significant soil characteristics, the soil organic carbon and soil porosity. The models make it possible to predict the quality of water in a location based on the set of soil and location characteristics. One of the important uses of the model is in fixing safe locations for waste dump areas, septic tanks, well digging, etc. in town planning, designing residential layouts, industrial layouts, hospital/hostel construction, etc. This is the first ever study to describe the ground water quality in terms of the location and soil characteristics.
BACKGROUND Treatment efficacy for attention-deficit/hyperactivity disorder (ADHD) is reported to be poor, possibly due to heterogeneity of ADHD symptoms. Little is known about poor treatment efficacy owing to ADHD heterogeneity. AIM To use generalized structural equation modeling (GSEM) to show how the heterogeneous nature of hyperactivity/impulsivity (H/I) symptoms in ADHD, irritable oppositional defiant disorder (ODD), and the presentation of aggression in children interferes with treatment responses in ADHD. METHODS A total of 231 children and adolescents completed ADHD inattention and H/I tests. ODD scores from the Swanson, Nolan, and Pelham, version IV scale were obtained. The child behavior checklist (CBCL) and parent's satisfaction questionnaire were completed. The relationships were analyzed by GSEM. RESULTS GSEM revealed that the chance of ADHD remission was lower in children with a combination of H/I symptoms of ADHD, ODD symptoms, and childhood aggressive behavior. ODD directly mediated ADHD symptom severity. The chance of reaching remission based on H/I symptoms of ADHD was reduced by 13.494% [=exp(2.602)] in children with comorbid ADHD and ODD [odds ratio (OR)=2.602, 95% confidence interval (CI): 1.832-3.373, P=0.000] after adjusting for the effects of other factors. Childhood aggression mediated ODD symptom severity. The chance of reaching remission based on ODD symptoms was lowered by 11.000% [=1-exp(-0.117)] in children with more severe baseline symptoms of aggression based on the CBCL score at study entry [OR=-0.117, 95% CI: (-0.190)-(-0.044), P=0.002]. CONCLUSION Mediation through ODD symptoms and aggression may influence treatment effects in ADHD after adjusting for the effects of baseline ADHD symptom severity. More attention could be directed to the early recognition of risks leading to ineffective ADHD treatment, e.g., symptoms of ODD and the presentation of aggressive or delinquent behaviors and thought problems in children with ADHD.
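The bracketed expressions in the RESULTS above, exp(2.602) and 1-exp(-0.117), are transformations of model coefficients. The sketch below simply evaluates them as written; treating the two reported values as log-odds coefficients is an assumption made here for illustration, and the variable names are hypothetical.

```python
import math

# Values quoted in the abstract, assumed here to be coefficients on the log-odds scale.
coef_odd = 2.602          # reported for comorbid ADHD and ODD
coef_aggression = -0.117  # reported for baseline aggression (CBCL score)

# exp(coefficient) converts a log-odds coefficient into an odds ratio.
exp_odd = math.exp(coef_odd)

# 1 - exp(coefficient) is the proportional reduction in odds for a negative coefficient.
reduction_aggression = 1.0 - math.exp(coef_aggression)

print(f"exp({coef_odd}) = {exp_odd:.3f}")
print(f"1 - exp({coef_aggression}) = {reduction_aggression:.3%}")
```

Evaluating the expressions gives values close to, but not exactly, the 13.494 and 11.000% printed in the abstract, which would be consistent with the published coefficients having been rounded.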
A new covariate-dependent zero-truncated bivariate Poisson model is proposed in this paper employing the generalized linear model. A marginal-conditional approach is used to construct the bivariate model. The proposed model, together with its estimation procedure and tests for goodness-of-fit and under- (or over-) dispersion, is presented and applied to road safety data. The two correlated outcome variables considered in this study are the number of cars involved in an accident and the number of casualties for a given number of cars.
A suitable statistical model has been explored for investors as well as researchers to estimate future share volume by using daily stock volume data from the Dhaka Stock Exchange (DSE). The daily volume data from June 1, 2004 to April 19, 2010 were retrieved from the DSE website as a secondary data source. The Maximum Likelihood-Autoregressive Conditional Heteroskedasticity (ARCH) (Marquardt) method has been applied to construct the models for the stock volume data of DSE by using the statistical software package E-Views, version 5. First, an Auto Regressive Integrated Moving Average (ARIMA) model was fitted, and it was observed that heteroscedastic volatilities were still present. To resolve this problem, the ARCH class of volatility models was used, and finally the ARIMA with EGARCH model was explored. The findings of this study show that the ARIMA with EGARCH model yields low mean square error, low mean absolute error, low bias proportion, and low variance proportion for share volume data compared with other models. Hence, the modelling concept established in this study should be useful for investors as well as researchers.
The limited amount of data in the healthcare domain and the necessity of training samples for increased performance of deep learning models is a recurrent challenge, especially in medical imaging. Newborn Solutions aims to enhance its non-invasive white blood cell counting device, Neosonics, by creating synthetic in vitro ultrasound images to facilitate a more efficient image generation process. This study addresses the data scarcity issue by designing and evaluating a continuous scalar conditional Generative Adversarial Network (GAN) to augment in vitro peritoneal dialysis ultrasound images, increasing both the volume and variability of training samples. The developed GAN architecture incorporates novel design features: varying kernel sizes in the generator's transposed convolutional layers and a latent intermediate space, projecting noise and condition values for enhanced image resolution and specificity. The experimental results show that the GAN successfully generated diverse images of high visual quality, closely resembling real ultrasound samples. While visual results were promising, the use of GAN-based data augmentation did not consistently improve the performance of an image regressor in distinguishing features specific to varied white blood cell concentrations. Ultimately, while this continuous scalar conditional GAN model made strides in generating realistic images, further work is needed to achieve consistent gains in regression tasks, aiming for robust model generalization.
In the assessment of car insurance claims, the claim rate for car insurance presents a highly skewed probability distribution, which is typically modeled using the Tweedie distribution. The traditional approach to obtaining the Tweedie regression model involves training on a centralized dataset. When the data is provided by multiple parties, training a privacy-preserving Tweedie regression model without exchanging raw data becomes a challenge. To address this issue, this study introduces a novel vertical federated learning-based Tweedie regression algorithm for multi-party auto insurance rate setting in data silos. The algorithm keeps sensitive data local and uses privacy-preserving techniques to achieve intersection operations between the two parties holding the data. After determining which entities are shared, the participants train the model locally using the shared entity data to obtain the local generalized linear model intermediate parameters. Homomorphic encryption algorithms are introduced to exchange and update the model intermediate parameters to collaboratively complete the joint training of the car insurance rate-setting model. Performance tests on two publicly available datasets show that the proposed federated Tweedie regression algorithm can effectively generate Tweedie regression models that leverage the value of data from both parties without exchanging data. The assessment results of the scheme approach those of the Tweedie regression model learned from centralized data, and outperform the Tweedie regression model learned independently by a single party.
Funding: Supported by the National Natural Science Foundation of China (No. 51908365, No. 71772125) and the Philosophical and Social Science Program of Guangdong Province, China (GD18YGL07).
Abstract: Short-term building energy predictions serve as one of the fundamental tasks in building operation management. While large numbers of studies have explored the value of various supervised machine learning techniques in energy predictions, few studies have addressed the potential data shortage problem in developing data-driven models. One promising solution is data augmentation, which aims to enrich existing building data resources for reliable predictive modeling. This study proposes a deep generative modeling-based data augmentation strategy for improving short-term building energy predictions. Two types of conditional variational autoencoders have been designed for synthetic energy data generation, using fully connected and one-dimensional convolutional layers respectively. Data experiments have been designed to evaluate the value of data augmentation using actual measurements from 52 buildings. The results indicate that conditional variational autoencoders are capable of generating high-quality synthetic data samples, which in turn helps to enhance the accuracy of short-term building energy predictions. The average performance enhancement ratios in terms of CV-RMSE range between 12% and 18%. Practical guidelines have been obtained to ensure the validity and quality of synthetic building energy data. The research outcomes are valuable for enhancing the robustness and reliability of data-driven models for smart building operation management.
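The abstract above reports gains in CV-RMSE without defining the metric. As a minimal sketch, assuming the common definition (RMSE divided by the mean of the measured values) and assuming that the "performance enhancement ratio" is the relative reduction in CV-RMSE, the two quantities could be computed as follows. The function names and both definitions are illustrative assumptions, not taken from the paper.

```python
import math


def cv_rmse(actual, predicted):
    """Coefficient of variation of the RMSE (assumed definition: RMSE / mean of actual)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    return rmse / (sum(actual) / n)


def enhancement_ratio(baseline_cv_rmse, augmented_cv_rmse):
    """Relative CV-RMSE reduction after augmentation (assumed definition)."""
    return (baseline_cv_rmse - augmented_cv_rmse) / baseline_cv_rmse


# Hypothetical hourly loads in kWh, for illustration only.
measured = [100.0, 100.0, 100.0, 100.0]
forecast = [90.0, 110.0, 90.0, 110.0]
score = cv_rmse(measured, forecast)  # RMSE is 10 and the mean is 100
```

Under these assumptions, a model whose CV-RMSE drops from 0.20 to 0.17 after augmentation would have an enhancement ratio of 0.15, which falls inside the 12% to 18% range quoted above.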
Funding: Funded by the National Science Centre, Poland under the project "Assessment of the impact of weather conditions on forest health status and forest disturbances at regional and national scale based on the integration of ground and space-based remote sensing datasets" (project no. 2021/41/B/ST10/). Data collection and research were also supported by project no. EZ.271.3.19.2021 "Modele ryzyka zamierania drzewostanow glownych gatunkow lasotworczych Polski", funded by the General Directorate of State Forests in Poland.
Abstract: Over the past decade, the presence of mistletoe (Viscum album ssp. austriacum) in Scots pine stands has increased in many European countries. Understanding the factors that influence the occurrence of mistletoe in stands is key to making appropriate forest management decisions to limit damage and prevent the spread of mistletoe in the future. Therefore, the main objective of this study was to determine the probability of mistletoe occurrence in Scots pine stands in relation to stand-related endogenous factors such as age, top height, and stand density, as well as topographic and edaphic factors. We used unmanned aerial vehicle (UAV) imagery from 2,247 stands to detect mistletoe in Scots pine stands, while most stand and site characteristics were calculated from airborne laser scanning (ALS) data. Information on stand age and site type from the State Forest database was also used. We found that mistletoe infestation in Scots pine stands is influenced by stand and site characteristics. We documented that the densest, tallest, and oldest stands were more susceptible to mistletoe infestation. Site type and specific microsite conditions associated with topography were also important factors driving mistletoe occurrence. In addition, climatic water balance was a significant factor in increasing the probability of mistletoe occurrence, which is important in the context of predicted temperature increases associated with climate change. Our results are important for better understanding patterns of mistletoe infestation and ecosystem functioning under climate change. In an era of climate change and technological development, the use of remote sensing methods to determine the risk of mistletoe infestation can be a very useful tool for managing forest ecosystems to maintain forest sustainability and prevent forest disturbance.
Funding: Supported by the National Natural Science Foundation of China (61533019, 71232006, 91520301).
Abstract: Recently, generative adversarial networks (GANs) have become a research focus of artificial intelligence. Inspired by the two-player zero-sum game, GANs comprise a generator and a discriminator, both trained under the adversarial learning idea. The goal of GANs is to estimate the potential distribution of real data samples and generate new samples from that distribution. Since their initiation, GANs have been widely studied due to their enormous prospects for applications, including image and vision computing, speech and language processing, etc. In this review paper, we summarize the state of the art of GANs and look into the future. Firstly, we survey GANs' proposal background, theoretic and implementation models, and application fields. Then, we discuss GANs' advantages and disadvantages, and their development trends. In particular, we investigate the relation between GANs and parallel intelligence, with the conclusion that GANs have great potential in parallel systems research in terms of virtual-real interaction and integration. Clearly, GANs can provide substantial algorithmic support for parallel intelligence.
Funding: The National Key R&D Program of China under contract No. 2017YFE0104400; the National Natural Science Foundation of China under contract No. 31772852; the Marine S&T Fund of Shandong Province for Pilot National Laboratory for Marine Science and Technology (Qingdao) under contract No. 2018SDKJ0501-2.
Abstract: Habitat suitability index (HSI) models have been widely used to analyze the relationship between species abundance and environmental factors, and ultimately inform management of marine species. The response of species abundance to each environmental variable is different, and habitat requirements may change over life history stages and seasons. Therefore, it is necessary to determine the optimal combination of environmental variables in HSI modelling. In this study, generalized additive models (GAMs) were used to determine which environmental variables to include in the HSI models. Significant variables were retained and weighted in the HSI model according to their relative contribution (%) to the total deviation explained by the boosted regression tree (BRT). The HSI models were applied to evaluate the habitat suitability of mantis shrimp Oratosquilla oratoria in the Haizhou Bay and adjacent areas in 2011 and 2013–2017. Ontogenetic and seasonal variations in HSI models of mantis shrimp were also examined. Among the four models (non-optimized model, BRT informed HSI model, GAM informed HSI model, and both BRT and GAM informed HSI model), the both BRT and GAM informed HSI model showed the best performance. Four environmental variables (bottom temperature, depth, distance offshore and sediment type) were selected in the HSI models for four groups (spring-juvenile, spring-adult, fall-juvenile and fall-adult) of mantis shrimp. The distribution of habitat suitability showed similar patterns between juveniles and adults, but obvious seasonal variations were observed. This study suggests that the process of optimizing environmental variables in HSI models improves the performance of HSI models, and this optimization strategy could be extended to other marine organisms to enhance the understanding of the habitat suitability of target species.
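The weighting step described above can be sketched as a weighted arithmetic mean of per-variable suitability indices, with weights proportional to each variable's relative contribution. This is only an illustration: the abstract does not state the exact aggregation formula, and the function name, the suitability values, and the contribution percentages below are all hypothetical.

```python
def weighted_hsi(suitability, contribution):
    """Weighted arithmetic-mean HSI (assumed form, not confirmed by the abstract).

    suitability:  variable name -> suitability index in [0, 1]
    contribution: variable name -> relative contribution (%), e.g. from a BRT
    """
    total = sum(contribution[name] for name in suitability)
    if total <= 0:
        raise ValueError("contributions must sum to a positive value")
    # Normalize the contributions so that the weights sum to 1.
    return sum(suitability[name] * contribution[name] / total for name in suitability)


# Hypothetical values for the four variables named in the abstract.
si = {"bottom_temperature": 0.8, "depth": 0.6, "distance_offshore": 0.4, "sediment_type": 1.0}
brt_contribution = {"bottom_temperature": 40.0, "depth": 30.0, "distance_offshore": 20.0, "sediment_type": 10.0}
hsi = weighted_hsi(si, brt_contribution)
```

With equal contributions the function reduces to an unweighted mean, which would correspond to a non-optimized baseline under this assumed form.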
Funding: This paper was supported by the National Natural Science Foundation of China (No. U1204606), the Key Programs for Science and Technology Development of Henan Province (No. 172102210335), and Key Scientific Research Projects in Henan Universities (No. 16A520058).
Abstract: In this paper, we propose a novel coverless image steganographic scheme based on a generative model. In our scheme, the secret image is first fed to the generative model database to generate a meaning-normal and independent image different from the secret image. The generated image is then transmitted to the receiver and fed to the generative model database to generate another image visually the same as the secret image. Thus, we only need to transmit the meaning-normal image, which is not related to the secret image, and we can achieve the same effect as transmitting the secret image. This is the first proposal of a coverless image information steganographic scheme based on a generative model, as opposed to traditional image steganography. In this method the transmitted image is not embedded with any information of the secret image and can therefore effectively resist steganalysis tools. Experimental results show that our scheme has high capacity, security and reliability.
Funding: Project (51774219) supported by the National Natural Science Foundation of China.
Abstract: This research develops a new mathematical modeling method by combining industrial big data and process mechanism analysis under the framework of generalized additive models (GAM) to generate a practical model with generalization and precision. Specifically, the proposed modeling method includes the following steps. Firstly, the influence factors are screened using mechanism knowledge and data-mining methods. Secondly, the unary GAM without interactions is constructed, which includes cleaning the data, building the sub-models, and verifying the sub-models. Subsequently, the interactions between the various factors are explored, and the binary GAM with interactions is constructed. The relationships among the sub-models are analyzed, and the integrated model is built. Finally, based on the proposed modeling method, two prediction models of mechanical property and deformation resistance for hot-rolled strips are established. Verification on actual industrial data demonstrates that the new models have good prediction precision, and the mean absolute percentage errors of tensile strength, yield strength and deformation resistance are 2.54%, 3.34% and 6.53%, respectively. The experimental results suggest that the proposed method offers a new approach to industrial process modeling.
Funding: Supported by the National Natural Science Foundation of China under Grants 51722406, 52074340, and 51874335; the Shandong Provincial Natural Science Foundation under Grant JQ201808; the Fundamental Research Funds for the Central Universities under Grant 18CX02097A; the Major Scientific and Technological Projects of CNPC under Grant ZD2019-183-008; the Science and Technology Support Plan for Youth Innovation of University in Shandong Province under Grant 2019KJH002; the National Research Council of Science and Technology Major Project of China under Grant 2016ZX05025001-006; the 111 Project under Grant B08028; and the Sinopec Science and Technology Project under Grant P20050-1.
Abstract: For reservoirs with complex non-Gaussian geological characteristics, such as carbonate reservoirs or reservoirs with sedimentary facies distribution, it is difficult to implement history matching directly, especially for ensemble-based data assimilation methods. In this paper, we propose a multi-source information fused generative adversarial network (MSIGAN) model, which is used for parameterization of the complex geologies. In MSIGAN, various information such as facies distribution, microseismic, and inter-well connectivity can be integrated to learn the geological features. Two major generative models in deep learning, the variational autoencoder (VAE) and the generative adversarial network (GAN), are combined in our model. The proposed MSIGAN model is then integrated into the ensemble smoother with multiple data assimilation (ESMDA) method to conduct history matching. We tested the proposed method on two reservoir models with fluvial facies. The experimental results show that the proposed MSIGAN model can effectively learn the complex geological features, which can improve the accuracy of history matching.
Funding: Supported in part by the Zhejiang Provincial Natural Science Foundation of China (LQ20F030013); the Research Foundation of Hwa Mei Hospital, University of Chinese Academy of Sciences (2020HMZD22); the Ningbo Public Service Technology Foundation (202002N3181); and the Medical Scientific Research Foundation of Zhejiang Province (2021431314).
Abstract: Predictive models for assessing the risk of developing lung cancers can help identify high-risk individuals with the aim of recommending further screening and early intervention. To facilitate pre-hospital self-assessments, some studies have exploited predictive models trained on non-clinical data (e.g., smoking status and family history). The performance of these models is limited because they do not consider clinical data (e.g., blood test and medical imaging results). Deep learning has shown potential in processing complex data that combine both clinical and non-clinical information. However, predicting lung cancers remains difficult due to the severe lack of positive samples among follow-ups. To tackle this problem, this paper presents a generative-discriminative framework for improving the ability of deep learning models to generalize. According to the proposed framework, two nonlinear generative models, one based on the generative adversarial network and another on the variational autoencoder, are used to synthesize auxiliary positive samples for the training set. Then, several discriminative models, including a deep neural network (DNN), are used to assess the lung cancer risk based on a comprehensive list of risk factors. The framework was evaluated on over 55000 subjects questioned between January 2014 and December 2017, with 699 subjects being clinically diagnosed with lung cancer between January 2014 and August 2019. According to the results, the best performing predictive model built using the proposed framework was based on the DNN. It achieved an average sensitivity of 76.54% and an area under the curve of 69.24% in distinguishing between the cases of lung cancer and normal cases on test sets.
Funding: The National Natural Science Foundation of China (No. 61572521, U1636114); the National Key Project of Research and Development Plan (2017YFB0802000); the Natural Science Foundation of Shaanxi Province (2021JM-252); the Innovative Research Team Project of Engineering University of PAP (KYTD201805); and the Fundamental Research Project of Engineering University of PAP (WJY201910).
Abstract: The trapdoor is a key component of public key cryptography design, which is the essential security foundation of modern cryptography. Normally, the traditional way of designing a trapdoor is to identify a computationally hard problem, such as the NPC problems. So the trapdoor in a public key encryption mechanism turns out to be a type of limited resource. In this paper, we generalize the methodology of the adversarial learning model in artificial intelligence and introduce a novel way to conveniently obtain sub-optimal and computationally hard trapdoors based on the automatic information theoretic search technique. The basic routine is to construct a generative architecture to search for and discover a probabilistic reversible generator which can correctly encode and decode any input messages. The architecture includes a trapdoor generator built on a variational autoencoder (VAE), responsible for searching for appropriate trapdoors satisfying a maximum of entropy, a random message generator yielding random noise, and a dynamic classifier taking the results of the two generators. The evaluation of our construction shows that the architecture satisfies basic indistinguishability of outputs under the chosen-plaintext attack (CPA) model and achieves high efficiency in generating cheap trapdoors.
Funding: Supported by the National Natural Science Foundation of China (61273131), the 111 Project (B12018), the Innovation Project of Graduate in Jiangsu Province (CXZZ12_0741), and the Fundamental Research Funds for the Central Universities (JUDCF12034).
Abstract: Fault monitoring of a bioprocess is important to ensure the safety of a reactor and maintain high quality of products. It is difficult to build an accurate mechanistic model for a bioprocess, so fault monitoring based on a rich historical or online database is an effective way. With the bootstrap method, a group of data can be resampled stochastically, improving the generalization capability of the model. In this paper, online fault monitoring with generalized additive models (GAMs) combined with bootstrap is proposed for the glutamate fermentation process. GAMs and bootstrap are first used to determine the confidence interval based on the online and off-line normal sampled data from glutamate fermentation experiments. Then GAMs are used for online fault monitoring of time, dissolved oxygen, oxygen uptake rate, and carbon dioxide evolution rate. The method can provide accurate fault alarms online and helps provide useful information for removing faults and abnormal phenomena in the fermentation.
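The bootstrap step mentioned above can be illustrated with a plain percentile bootstrap for the mean of a sample. This is a simplified sketch: it omits the GAM fit entirely, and the function name, the default resample count, and the example readings are assumptions made for illustration rather than details from the paper.

```python
import random


def bootstrap_interval(samples, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `samples`."""
    if not samples:
        raise ValueError("samples must be non-empty")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(samples)
    # Resample with replacement and record the mean of each resample.
    means = sorted(sum(rng.choices(samples, k=n)) / n for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical dissolved-oxygen readings (%) from normal batches.
normal_readings = [30.1, 29.8, 30.5, 31.0, 29.5, 30.2, 30.7, 29.9]
low, high = bootstrap_interval(normal_readings)
```

In a monitoring setting, an interval of this kind would typically be built around model predictions or residuals from normal runs, and observations falling outside it would be flagged for inspection.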
Abstract: Background: In this paper, a regression model for predicting the spatial distribution of forest cockchafer larvae in the Hessian Ried region (Germany) is presented. The forest cockchafer, a native biotic pest, is a major cause of damage in forests in this region, particularly during the regeneration phase. The model developed in this study is based on a systematic sample inventory of forest cockchafer larvae by excavation across the Hessian Ried. These forest cockchafer larvae data were characterized by excess zeros and overdispersion. Methods: Using specific generalized additive regression models, different discrete distributions, including the Poisson, negative binomial and zero-inflated Poisson distributions, were compared. The methodology employed allowed the simultaneous estimation of non-linear model effects of causal covariates and, to account for spatial autocorrelation, of a 2-dimensional spatial trend function. In the validation of the models, both the Akaike information criterion (AIC) and more detailed graphical procedures based on randomized quantile residuals were used. Results: The negative binomial distribution was superior to the Poisson and the zero-inflated Poisson distributions, providing a near perfect fit to the data, which was proven in an extensive validation process. The causal predictors found to affect the density of larvae significantly were distance to water table and percentage of pure clay layer in the soil to a depth of 1 m. Model predictions showed that larva density increased with an increase in distance to the water table up to almost 4 m, after which it remained constant, and with a reduction in the percentage of pure clay layer. However, this latter correlation was weak and requires further investigation. The 2-dimensional trend function indicated a strong spatial effect, and thus explained by far the highest proportion of variation in larva density.
Conclusions: As such, the model can be used to support forest practitioners in their decision making for regeneration and forest protection planning in the Hessian Ried. However, the application of the model for predicting future spatial patterns of the larva density is still somewhat limited because the causal effects are comparatively weak.
Abstract: A mode of ontology-based information integration and management (OIIM) for testability schemes was proposed by expatiating on the connotation of the system testability scheme. Aiming at the complexity of the influencing factors in the optimal design procedure of the testability scheme, the information on concept entities, concept attributions and concept relationships was analyzed and extracted, and then the testability scheme information ontology (TSIO) was built and coded via the web ontology language (OWL). Based on the information ontology, the generalized model for testability scheme (GMTS) was founded by defining transformation rules. The primary study shows that the mode of OIIM for testability schemes can make up for the deficiencies in knowledge representation and reasoning existing in traditional information models, and achieve information sharing and reuse. It provides an effective model basis for the optimal design of the testability scheme.
Funding: Supported in part by the China NSFC Grant 61872248; Guangdong NSF 2017A030312008; the Fok Ying-Tong Education Foundation for Young Teachers in the Higher Education Institutions of China (Grant No. 161064); and GDUPS (2015).
Abstract: Spectrum management and resource allocation (RA) problems are challenging and critical in a vast number of research areas such as wireless communications and computer networks. The traditional approaches for solving such problems usually consume time and memory, especially for large-size problems. Recently, different machine learning approaches have been considered as potential promising techniques for combinatorial optimization problems, especially the generative model of deep neural networks. In this work, we propose a resource allocation deep autoencoder network, as one of the promising generative models, for enabling spectrum sharing in underlay device-to-device (D2D) communication by solving linear sum assignment problems (LSAPs). Specifically, we investigate the performance of three different architectures for the conditional variational autoencoders (CVAE). The three proposed architectures are the convolutional neural network (CVAE-CNN) autoencoder, the feed-forward neural network (CVAE-FNN) autoencoder, and the hybrid (H-CVAE) autoencoder. The simulation results show that the proposed approach could be used as a replacement for conventional RA techniques, such as the Hungarian algorithm, due to its ability to find solutions of LSAPs of different sizes with high accuracy and very fast execution time. Moreover, the simulation results reveal that the accuracy of the proposed hybrid autoencoder architecture outperforms the other proposed architectures and the state-of-the-art DNN techniques.
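For readers unfamiliar with the linear sum assignment problem mentioned above: given an n-by-n cost matrix, the task is to pick one entry per row and per column so that the total cost is minimal. The sketch below solves a tiny instance by exhaustive search. It is neither the Hungarian algorithm nor the proposed CVAE approach; it only defines the problem concretely and could serve as a ground-truth check on small cases. The function name and the cost matrix are illustrative.

```python
from itertools import permutations


def solve_lsap_bruteforce(cost):
    """Exact LSAP solver by enumerating all n! assignments (small n only)."""
    n = len(cost)
    best_assignment, best_total = None, float("inf")
    for perm in permutations(range(n)):
        # perm[i] is the column (e.g. channel) assigned to row i (e.g. D2D pair).
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_assignment, best_total = list(perm), total
    return best_assignment, best_total


# Hypothetical 3x3 cost matrix.
cost_matrix = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment, total_cost = solve_lsap_bruteforce(cost_matrix)
```

Exhaustive search grows factorially with n, which is why polynomial-time methods such as the Hungarian algorithm, or learned approximations of the kind the abstract describes, are of interest for larger instances.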
Funding: This work has received support from the National Key Research & Development Plan of China under Grant No. 2018YFA0306703.
Abstract: In recent years, an increasing number of studies on quantum machine learning have not only provided powerful tools for quantum chemistry and quantum physics but also improved classical learning algorithms. The hybrid quantum-classical framework, which is constructed from a variational quantum circuit (VQC) and an optimizer, plays a key role in the latest quantum machine learning studies. Nevertheless, in these hybrid-framework-based quantum machine learning models, the VQC is mainly constructed with a fixed structure, and this structure causes inflexibility problems. There are also few studies focused on comparing the performance of quantum generative models with different loss functions. In this study, we address the inflexibility problem by adopting the variable-depth VQC model to automatically change the structure of the quantum circuit according to the qBAS score. The basic idea behind the variable-depth VQC is to consider the depth of the quantum circuit as a parameter during training. Meanwhile, we compared the performance of the variable-depth VQC model based on four widely used statistical distances set as the loss functions, including Kullback-Leibler divergence (KL-divergence), Jensen-Shannon divergence (JS-divergence), total variation distance, and maximum mean discrepancy. Our numerical experiment shows a promising result: the variable-depth VQC model works better than the original VQC in the generative learning tasks.
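Three of the four statistical distances named above have simple closed forms for discrete distributions, sketched here in plain Python using the standard textbook definitions with natural logarithms. Maximum mean discrepancy is omitted because it depends on a kernel choice that the abstract does not specify. The function names and example distributions are illustrative, and nothing here reflects how the paper evaluates these losses on a quantum circuit.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; requires q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by log(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def total_variation(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


# Hypothetical target and model distributions over two outcomes.
target = [0.5, 0.5]
model = [0.9, 0.1]
tv = total_variation(target, model)
js = js_divergence(target, model)
```

Unlike KL divergence, the JS and total variation distances stay finite when the model assigns zero probability to an outcome the target can produce, which is one practical reason to compare several losses.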
Abstract: Stochastic models are derived to estimate the level of coliform count in terms of the MPN index, one of the most important water quality characteristics of ground water, based on a set of water source location and soil characteristics. The study is based on about twenty location and soil characteristics, the majority of which were observed through laboratory analysis of soil and water samples collected from nearly three hundred locations of drinking water sources, wells and bore wells, selected at random from the district of Kasaragod. Water contamination in wells is found to be relatively higher than in bore wells. The study reveals that only 7% of the wells and 40% of the bore wells of the district are within the permissible limit of the WHO standard of drinking water quality. The level of contamination is very high in hospital premises and very low in the forest area. Two separate multiple ordinal logistic regression models are developed to predict the level of coliform count, one for wells and the other for bore wells. The significant feature of this study is that, in addition to scientifically proving the dependence of water quality on the distances from waste disposal areas, septic tanks etc., it highlights the dependence on two other very significant soil characteristics, the soil organic carbon and soil porosity. The models make it possible to predict the quality of water in a location based on the set of soil and location characteristics. One of the important uses of the model is in fixing safe locations for waste dump areas, septic tanks, digging wells etc. in town planning, designing residential layouts, industrial layouts, hospital/hostel construction etc. This is the first ever study to describe ground water quality in terms of location and soil characteristics.
Abstract: BACKGROUND Treatment efficacy for attention-deficit/hyperactivity disorder (ADHD) is reported to be poor, possibly due to heterogeneity of ADHD symptoms. Little is known about poor treatment efficacy owing to ADHD heterogeneity. AIM To use generalized structural equation modeling (GSEM) to show how the heterogeneous nature of hyperactivity/impulsivity (H/I) symptoms in ADHD, irritable oppositional defiant disorder (ODD), and the presentation of aggression in children interferes with treatment responses in ADHD. METHODS A total of 231 children and adolescents completed ADHD inattention and H/I tests. ODD scores from the Swanson, Nolan, and Pelham, version IV scale were obtained. The child behavior checklist (CBCL) and parent's satisfaction questionnaire were completed. The relationships were analyzed by GSEM. RESULTS GSEM revealed that the chance of ADHD remission was lower in children with a combination of H/I symptoms of ADHD, ODD symptoms, and childhood aggressive behavior. ODD directly mediated ADHD symptom severity. The chance of reaching remission based on H/I symptoms of ADHD was reduced by 13.494% [= exp(2.602)] in children with comorbid ADHD and ODD [odds ratio (OR) = 2.602, 95% confidence interval (CI): 1.832-3.373, P = 0.000] after adjusting for the effects of other factors. Childhood aggression mediated ODD symptom severity. The chance of reaching remission based on ODD symptoms was lowered by 11.000% [= 1 - exp(-0.117)] in children with more severe baseline symptoms of aggression based on the CBCL score at study entry [OR = -0.117, 95% CI: (-0.190)-(-0.044), P = 0.002]. CONCLUSION Mediation through ODD symptoms and aggression may influence treatment effects in ADHD after adjusting for the effects of baseline ADHD symptom severity. More attention could be directed to the early recognition of risks leading to ineffective ADHD treatment, e.g., symptoms of ODD and the presentation of aggressive or delinquent behaviors and thought problems in children with ADHD.
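The bracketed expressions in the RESULTS above exponentiate the reported estimates. A minimal sketch of that arithmetic follows, under the assumption that 2.602 and -0.117 are coefficients on the log-odds scale (the abstract labels them OR, so this reading is an interpretation rather than something the source states). The function names are illustrative.

```python
import math


def odds_ratio(log_odds_coefficient):
    """Exponentiate a log-odds coefficient to obtain an odds ratio."""
    return math.exp(log_odds_coefficient)


def percent_reduction_in_odds(log_odds_coefficient):
    """Percentage reduction in odds implied by a negative log-odds coefficient."""
    return (1.0 - math.exp(log_odds_coefficient)) * 100.0


or_hi = odds_ratio(2.602)                          # roughly 13.49
reduction_agg = percent_reduction_in_odds(-0.117)  # roughly 11.0
```

Under this reading, the second figure matches the 11.000% quoted in the abstract, while exp(2.602) evaluates to about 13.49, close to the quoted 13.494 and presumably differing only through rounding of the coefficient.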
Abstract: A new covariate-dependent zero-truncated bivariate Poisson model is proposed in this paper, employing a generalized linear model. A marginal-conditional approach is used to show the bivariate model. The proposed model, with its estimation procedure and tests for goodness-of-fit and under- (or over-) dispersion, is shown and applied to road safety data. The two correlated outcome variables considered in this study are the number of cars involved in an accident and the number of casualties for a given number of cars.
Abstract: A suitable statistical model has been explored for investors as well as researchers to resolve the future estimation of share volume by using daily stock volume data from the Dhaka Stock Exchange (DSE). The daily volume data from June 1, 2004 to April 19, 2010 were retrieved from the DSE website as a secondary data source. The Maximum Likelihood-Autoregressive Conditional Heteroskedasticity (ARCH) (Marquardt) method has been applied to construct the models for the stock volume data of DSE by using the statistical software package E-Views, version 5. First of all, an Auto Regressive Integrated Moving Average (ARIMA) model was fitted, and it was observed that heteroscedastic volatilities were still present. To eliminate this problem, the ARCH class of volatility models has been used, and finally the ARIMA with EGARCH model has been explored. The findings of this study show that the ARIMA with EGARCH model yields low mean square error, low mean absolute error, low bias proportion, and low variance proportion for share volume data compared to other models. Hence, the modelling concept established in this study would be a useful reference for investors as well as researchers.
Abstract: The limited amount of data in the healthcare domain and the necessity of training samples for increased performance of deep learning models is a recurrent challenge, especially in medical imaging. Newborn Solutions aims to enhance its non-invasive white blood cell counting device, Neosonics, by creating synthetic in vitro ultrasound images to facilitate a more efficient image generation process. This study addresses the data scarcity issue by designing and evaluating a continuous scalar conditional Generative Adversarial Network (GAN) to augment in vitro peritoneal dialysis ultrasound images, increasing both the volume and variability of training samples. The developed GAN architecture incorporates novel design features: varying kernel sizes in the generator's transposed convolutional layers and a latent intermediate space, projecting noise and condition values for enhanced image resolution and specificity. The experimental results show that the GAN successfully generated diverse images of high visual quality, closely resembling real ultrasound samples. While visual results were promising, the use of GAN-based data augmentation did not consistently improve the performance of an image regressor in distinguishing features specific to varied white blood cell concentrations. Ultimately, while this continuous scalar conditional GAN model made strides in generating realistic images, further work is needed to achieve consistent gains in regression tasks, aiming for robust model generalization.
Funding: This research was funded by the National Natural Science Foundation of China (No. 62272124); the National Key Research and Development Program of China (No. 2022YFB2701401); the Guizhou Province Science and Technology Plan Project (Grant Nos. Qiankehe Platform Talent [2020]5017); the Research Project of Guizhou University for Talent Introduction (No. [2020]61); the Cultivation Project of Guizhou University (No. [2019]56); and the Open Fund of Key Laboratory of Advanced Manufacturing Technology, Ministry of Education (GZUAMT2021KF[01]).
Abstract: In the assessment of car insurance claims, the claim rate for car insurance presents a highly skewed probability distribution, which is typically modeled using the Tweedie distribution. The traditional approach to obtaining the Tweedie regression model involves training on a centralized dataset. When the data is provided by multiple parties, training a privacy-preserving Tweedie regression model without exchanging raw data becomes a challenge. To address this issue, this study introduces a novel vertical federated learning-based Tweedie regression algorithm for multi-party auto insurance rate setting in data silos. The algorithm can keep sensitive data local and uses privacy-preserving techniques to achieve intersection operations between the two parties holding the data. After determining which entities are shared, the participants train the model locally using the shared entity data to obtain the local generalized linear model intermediate parameters. Homomorphic encryption algorithms are introduced to exchange and update the model intermediate parameters to collaboratively complete the joint training of the car insurance rate-setting model. Performance tests on two publicly available datasets show that the proposed federated Tweedie regression algorithm can effectively generate Tweedie regression models that leverage the value of data from both parties without exchanging data. The assessment results of the scheme approach those of the Tweedie regression model learned from centralized data, and outperform the Tweedie regression model learned independently by a single party.