Traditional electroencephalography (EEG)-based emotion recognition requires a large number of calibration samples to build a model for a specific subject, which restricts the application of the affective brain-computer interface (BCI) in practice. We attempt to use multi-modal data from past sessions to realize emotion recognition with only a small amount of calibration samples. To solve this problem, we propose a multimodal domain adaptive variational autoencoder (MMDA-VAE) method, which learns shared cross-domain latent representations of the multi-modal data. Our method builds a multi-modal variational autoencoder (MVAE) to project the data of multiple modalities into a common space. Through adversarial learning and cycle-consistency regularization, our method reduces the distribution difference of each domain on the shared latent representation layer and realizes the transfer of knowledge. Extensive experiments are conducted on two public datasets, SEED and SEED-IV, and the results show the superiority of the proposed method. Our work can effectively improve the performance of emotion recognition with a small amount of labelled multi-modal data.
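For concreteness, the shared-latent idea can be sketched roughly as follows: two modality encoders feed one latent space, and a domain discriminator supplies the adversarial alignment term. The dimensions, posterior fusion rule, and loss weights below are illustrative assumptions, not the authors' MMDA-VAE implementation.

```python
# Minimal sketch: two modality encoders map into one shared latent space; a domain
# discriminator (adversarial) encourages source/target latents to align.
# All names and sizes are illustrative, not the MMDA-VAE reference implementation.
import torch
import torch.nn as nn

class ModalityEncoder(nn.Module):
    def __init__(self, in_dim, z_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, z_dim)
        self.logvar = nn.Linear(128, z_dim)

    def forward(self, x):
        h = self.net(x)
        return self.mu(h), self.logvar(h)

def reparameterize(mu, logvar):
    return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

# EEG and eye-movement encoders share one latent space (feature sizes assumed).
eeg_enc, eye_enc = ModalityEncoder(310), ModalityEncoder(33)
decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 310))
domain_disc = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))

eeg, eye = torch.randn(8, 310), torch.randn(8, 33)
mu_e, lv_e = eeg_enc(eeg)
mu_y, lv_y = eye_enc(eye)
z = reparameterize((mu_e + mu_y) / 2, (lv_e + lv_y) / 2)   # simple fusion of the two posteriors

recon = decoder(z)
kl = -0.5 * torch.mean(1 + lv_e - mu_e.pow(2) - lv_e.exp())  # KL term (EEG branch only, for brevity)
adv = nn.functional.binary_cross_entropy_with_logits(
    domain_disc(z), torch.ones(8, 1))                        # adversarial term aligning domains
loss = nn.functional.mse_loss(recon, eeg) + kl + 0.1 * adv
loss.backward()
```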
In this study, the hourly directions of eight banking stocks in Borsa Istanbul were predicted using linear-based, deep-learning (LSTM), and ensemble-learning (LightGBM) models. These models were trained with four different feature sets, and their performances were evaluated in terms of accuracy and F-measure metrics. While the first experiments directly used each stock's own features as the model inputs, the second experiments utilized stock features reduced through Variational AutoEncoders (VAE). In the last experiments, in order to grasp the effects of the other banking stocks on individual stock performance, the features belonging to other stocks were also given as inputs to our models. While combining other stock features was done for both the own (named allstock_own) and VAE-reduced (named allstock_VAE) stock features, the expanded dimensions of the feature sets were reduced by Recursive Feature Elimination. While the highest success rate reached 0.685 with allstock_own and the LSTM-with-attention model, the combination of allstock_VAE and the LSTM-with-attention model obtained an accuracy of 0.675. Although the classification results achieved with the two feature types were close, allstock_VAE achieved these results using nearly 16.67% fewer features than allstock_own. When all experimental results were examined, it was found that the models trained with allstock_own and allstock_VAE achieved higher accuracy rates than those using individual stock features. It was also concluded that the results obtained with the VAE-reduced stock features were similar to those obtained with the stocks' own features.
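The VAE-based feature reduction step can be illustrated with a minimal sketch: a small VAE is trained on the raw stock features, and the latent means are then used as the reduced inputs to a downstream direction classifier. The feature sizes and the stand-in classifier below are assumptions, not the paper's actual models.

```python
# Sketch: use a VAE's latent means as reduced stock features for a downstream
# direction classifier. Feature dimensions and the classifier are illustrative.
import torch, torch.nn as nn
from sklearn.linear_model import LogisticRegression
import numpy as np

D_IN, D_Z = 25, 5                      # assumed raw / reduced feature sizes
enc = nn.Sequential(nn.Linear(D_IN, 32), nn.ReLU())
mu, logvar = nn.Linear(32, D_Z), nn.Linear(32, D_Z)
dec = nn.Sequential(nn.Linear(D_Z, 32), nn.ReLU(), nn.Linear(32, D_IN))
opt = torch.optim.Adam([*enc.parameters(), *mu.parameters(),
                        *logvar.parameters(), *dec.parameters()], lr=1e-3)

X = torch.randn(512, D_IN)             # hourly feature vectors (synthetic here)
y = np.random.randint(0, 2, 512)       # up/down direction labels (synthetic)

for _ in range(200):                   # train the VAE on the raw features
    h = enc(X); m, lv = mu(h), logvar(h)
    z = m + torch.randn_like(m) * torch.exp(0.5 * lv)
    loss = nn.functional.mse_loss(dec(z), X) \
         - 0.5 * torch.mean(1 + lv - m.pow(2) - lv.exp())
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    Z = mu(enc(X)).numpy()             # latent means = VAE-reduced features
clf = LogisticRegression().fit(Z, y)   # stand-in for the LSTM/LightGBM models
print("train accuracy on reduced features:", clf.score(Z, y))
```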
In modern industry, process monitoring plays a significant role in improving the quality of process conduct. With the increasing dimensionality of industrial data, monitoring methods based on latent variables have been widely applied in order to reduce the waste of the industrial database. Nevertheless, these latent variables do not usually follow the Gaussian distribution and are thus unsuitable for some statistical indices, especially the T² statistic, applied to them. Variational AutoEncoders (VAE), an unsupervised deep learning algorithm with a hierarchical learning structure, can make the latent variables follow the Gaussian distribution. Partial least squares (PLS) is used to obtain the information between the dependent and independent variables. In this paper, we integrate these two methods and compare the result with other methods. The superiority of the proposed method is verified on a simulation and on the trimethylchlorosilane purification process in terms of multivariate control charts.
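A minimal sketch of the monitoring statistic on the latent layer, assuming the VAE latents of normal operating data are already available, might look as follows; the control limit and dimensions are illustrative.

```python
# Sketch: Hotelling's T^2 monitoring statistic computed on VAE latent variables.
# Because the VAE latents are pushed toward a Gaussian, the usual chi-square
# control limit for T^2 becomes reasonable. Everything below is illustrative.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
Z_train = rng.normal(size=(500, 5))          # latent codes of normal operating data
mu = Z_train.mean(axis=0)
S_inv = np.linalg.inv(np.cov(Z_train, rowvar=False))

def t2(z):
    d = z - mu
    return float(d @ S_inv @ d)

limit = chi2.ppf(0.99, df=Z_train.shape[1])  # 99% control limit
z_new = rng.normal(size=5) + 3.0             # a (simulated) faulty sample
print(f"T2 = {t2(z_new):.2f}, limit = {limit:.2f}, alarm = {t2(z_new) > limit}")
```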
Learning disentangled representations of data is a key problem in deep learning. Specifically, disentangling 2D facial landmarks into different factors (e.g., identity and expression) is widely used in applications such as face reconstruction, face reenactment, and talking heads. However, due to the sparsity of landmarks and the lack of accurate labels for the factors, it is hard to learn a disentangled representation of landmarks. To address these problems, we propose a simple and effective model named FLD-VAE to disentangle arbitrary facial landmarks into identity and expression latent representations, based on a Variational Autoencoder framework. Besides, we propose three invariant loss functions at both the latent and data levels to constrain the invariance of the representations during the training stage. Moreover, we implement an identity preservation loss to further enhance the representation ability of the identity factor. To the best of our knowledge, this is the first work to disentangle identity and expression factors simultaneously, end-to-end, from a single facial landmark input.
Supervised machine learning algorithms have been widely used in seismic exploration processing, but the lack of labeled examples complicates their application. Therefore, we propose a seismic labeled-data expansion method based on deep variational autoencoders (VAE), which are made of neural networks and contain two parts, an encoder and a decoder. A lack of training samples leads to overfitting of the network, so we train the VAE with the whole seismic dataset, which is a data-driven process and greatly alleviates the risk of overfitting. The encoder captures the ability to map the seismic waveform Y to latent deep features z, and the decoder captures the ability to reconstruct the high-dimensional waveform Ŷ from the latent deep features z. We then feed the labeled seismic data into the encoder to obtain their latent deep features, and a Gaussian mixture model can easily be fitted to the deep-feature distribution of each labeled class. We resample a large number of expansion deep features z* from the Gaussian mixture model and feed them into the decoder to generate expansion seismic data. Experiments on synthetic and real data show that our method alleviates the problem of lacking labeled seismic data for supervised seismic facies analysis.
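The expansion step can be sketched as follows, assuming a VAE encoder and decoder have already been trained on the whole seismic dataset; the shapes, the number of GMM components, and the sample count are illustrative.

```python
# Sketch of the expansion step: fit a GMM to the encoder's latent codes of one
# labeled class, resample latents, and decode them into synthetic seismic traces.
# The trained `encoder`/`decoder` and data shapes are assumed, not provided here.
import torch, torch.nn as nn
from sklearn.mixture import GaussianMixture

latent_dim, trace_len = 16, 256
encoder = nn.Sequential(nn.Linear(trace_len, 64), nn.ReLU(), nn.Linear(64, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, trace_len))

labeled_class_waveforms = torch.randn(40, trace_len)      # few labeled traces of one facies
with torch.no_grad():
    z = encoder(labeled_class_waveforms).numpy()

gmm = GaussianMixture(n_components=3).fit(z)              # fit class-conditional latent GMM
z_star, _ = gmm.sample(500)                               # resample expansion latents z*
with torch.no_grad():
    expanded = decoder(torch.tensor(z_star, dtype=torch.float32))
print(expanded.shape)                                     # 500 synthetic labeled traces
```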
Contemporary attackers, mainly motivated by financial gain, consistently devise sophisticated penetration techniques to access important information or data. The growing use of Internet of Things (IoT) technology in the contemporary convergence environment to connect to corporate networks and cloud-based applications only worsens this situation, as it facilitates multiple new attack vectors to emerge effortlessly. As such, existing intrusion detection systems suffer from performance degradation, mainly because of insufficient considerations and poorly modeled detection systems. To address this problem, we designed a blended threat detection approach, considering the possible impact and dimensionality of new attack surfaces due to the aforementioned convergence. We collectively refer to the convergence of different technology sectors as the internet of blended environment. The proposed approach encompasses an ensemble of heterogeneous probabilistic autoencoders that leverage the corresponding advantages of a convolutional variational autoencoder and a long short-term memory variational autoencoder. An extensive experimental analysis conducted on the TON_IoT dataset demonstrated 96.02% detection accuracy. Furthermore, the performance of the proposed approach was compared with various single-model (autoencoder)-based network intrusion detection approaches: autoencoder, variational autoencoder, convolutional variational autoencoder, and long short-term memory variational autoencoder. The proposed model outperformed all compared models, demonstrating F1-score improvements of 4.99%, 2.25%, 1.92%, and 3.69%, respectively.
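The ensemble idea can be sketched as follows: each flow is scored by the reconstruction error of two heterogeneous autoencoders and the scores are fused. The architectures, feature shapes, and equal-weight fusion below are assumptions, not the paper's exact models.

```python
# Sketch of the ensemble idea: score a flow with two heterogeneous autoencoders
# (a convolutional one and an LSTM one) and combine their reconstruction errors.
# Architectures, feature shapes, and the fusion rule are illustrative assumptions.
import torch, torch.nn as nn

class ConvAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv1d(1, 8, 3, padding=1), nn.ReLU())
        self.dec = nn.Conv1d(8, 1, 3, padding=1)
    def forward(self, x):                      # x: (batch, 1, features)
        return self.dec(self.enc(x))

class LSTMAE(nn.Module):
    def __init__(self, d=20):
        super().__init__()
        self.enc = nn.LSTM(d, 16, batch_first=True)
        self.dec = nn.Linear(16, d)
    def forward(self, x):                      # x: (batch, time, features)
        h, _ = self.enc(x)
        return self.dec(h)

conv_ae, lstm_ae = ConvAE(), LSTMAE()
flow_vec = torch.randn(4, 1, 20)               # per-flow feature vector
flow_seq = torch.randn(4, 10, 20)               # per-flow packet/time sequence

err_c = ((conv_ae(flow_vec) - flow_vec) ** 2).mean(dim=(1, 2))
err_l = ((lstm_ae(flow_seq) - flow_seq) ** 2).mean(dim=(1, 2))
score = 0.5 * err_c + 0.5 * err_l               # ensemble anomaly score
print(score > score.mean() + 2 * score.std())   # simple thresholding example
```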
Single-cell RNA sequencing (scRNA-seq) is a powerful technique to analyze the transcriptomic heterogeneities at the single cell level. It is an important step for studying cell subpopulations and lineages, with an effective low-dimensional representation and visualization of the original scRNA-seq data. At the single cell level, the transcriptional fluctuations are much larger than the average of a cell population, and the low amount of RNA transcripts will increase the rate of technical dropout events. Therefore, scRNA-seq data are much noisier than traditional bulk RNA-seq data. In this study, we proposed the deep variational autoencoder for scRNA-seq data (VASC), a deep multi-layer generative model, for the unsupervised dimension reduction and visualization of scRNA-seq data. VASC can explicitly model the dropout events and find the nonlinear hierarchical feature representations of the original data. Tested on over 20 datasets, VASC shows superior performances in most cases and exhibits broader dataset compatibility compared to four state-of-the-art dimension reduction and visualization methods. In addition, VASC provides better representations for very rare cell populations in the 2D visualization. As a case study, VASC successfully re-establishes the cell dynamics in pre-implantation embryos and identifies several candidate marker genes associated with early embryo development. Moreover, VASC also performs well on a 10× Genomics dataset with more cells and higher dropout rate.
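One way to picture explicit dropout modelling is a zero-inflation gate on top of the decoder output, as in the rough sketch below; the gating form and dimensions are illustrative and not necessarily VASC's exact formulation.

```python
# Illustrative sketch of modelling technical dropouts on top of a VAE decoder:
# the decoder output y_hat passes through a zero-inflation gate whose dropout
# probability decays with expression level. This mirrors the spirit of VASC's
# dropout layer but is not its exact formulation.
import torch, torch.nn as nn

genes, z_dim = 1000, 2
enc = nn.Sequential(nn.Linear(genes, 128), nn.ReLU(), nn.Linear(128, 2 * z_dim))
dec = nn.Sequential(nn.Linear(z_dim, 128), nn.ReLU(), nn.Linear(128, genes), nn.Sigmoid())

x = torch.rand(16, genes)                       # normalised expression of 16 cells
mu, logvar = enc(x).chunk(2, dim=1)
z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

y_hat = dec(z)
p_drop = torch.exp(-y_hat ** 2)                 # low expression -> high dropout prob.
gate = torch.sigmoid((torch.rand_like(y_hat) - p_drop) * 20)  # soft, differentiable mask
recon = y_hat * gate                            # zero-inflated reconstruction

loss = nn.functional.mse_loss(recon, x) \
     - 0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss.backward()
print(mu.detach()[:2])                          # 2-D latents used for visualisation
```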
With the development of intelligent agents pursuing humanisation, artificial intelligence must consider emotion, the most basic spiritual need in human interaction. Traditional emotional dialogue systems usually use an external emotional dictionary to select appropriate emotional words to add to the response, or concatenate emotional tags and semantic features in the decoding step to generate appropriate responses. However, selecting emotional words from a fixed emotional dictionary may result in a loss of diversity and consistency in the response. We propose a semantic and emotion-based dual latent variable generation model (Dual-LVG) for dialogue systems, which is able to generate appropriate emotional responses without an emotional dictionary. Different from previous work, the conditional variational autoencoder (CVAE) adopts a standard Transformer structure. Dual-LVG then regularises the CVAE latent space by introducing a dual latent space of semantics and emotion. The content diversity and emotional accuracy of the generated responses are improved by learning emotion and semantic features respectively. Moreover, an average attention mechanism is adopted to better extract semantic features at the sequence level, and a semi-supervised attention mechanism is used in the decoding step to strengthen the fusion of the model's emotional features. Experimental results show that Dual-LVG can successfully achieve the effect of generating different content by controlling emotional factors.
Stocks that are fundamentally connected with each other tend to move together. Considering such common trends is believed to benefit stock movement forecasting tasks. However, such signals are not trivial to model because the connections among stocks are not physically presented and need to be estimated from volatile data. Motivated by this observation, we propose a framework that incorporates the inter-connection of firms to forecast stock prices. To effectively utilize a large set of fundamental features, we further design a novel pipeline. First, we use a variational autoencoder (VAE) to reduce the dimension of stock fundamental information and then cluster stocks into a graph structure (fundamental clustering). Second, a hybrid model of a graph convolutional network and a long short-term memory network (GCN-LSTM), with an adjacency graph matrix learnt from the VAE, is proposed for graph-structured stock market forecasting. Experiments on minute-level U.S. stock market data demonstrate that our model effectively captures both spatial and temporal signals and achieves superior improvement over baseline methods. The proposed model is promising for other applications in which there is a possible but hidden spatial dependency to improve time-series prediction.
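The first stage of the pipeline, compressing fundamentals with a VAE and clustering the latent codes into a graph, can be sketched as follows; the encoder, feature sizes, and the way the adjacency matrix is formed are assumptions for illustration.

```python
# Sketch of the pipeline's first stage: compress fundamentals with a VAE encoder,
# cluster stocks on the latent codes, and build a binary adjacency matrix that a
# GCN-LSTM could consume. Encoder weights and feature sizes are assumed.
import torch, torch.nn as nn
import numpy as np
from sklearn.cluster import KMeans

n_stocks, n_fundamentals, z_dim = 50, 80, 8
encoder = nn.Sequential(nn.Linear(n_fundamentals, 32), nn.ReLU(), nn.Linear(32, z_dim))

fundamentals = torch.randn(n_stocks, n_fundamentals)
with torch.no_grad():
    z = encoder(fundamentals).numpy()            # latent codes (reparam. omitted)

labels = KMeans(n_clusters=5, n_init=10).fit_predict(z)
adj = (labels[:, None] == labels[None, :]).astype(np.float32)  # same cluster -> edge
np.fill_diagonal(adj, 1.0)                       # self-loops for the GCN
print(adj.shape, adj.sum())                      # adjacency fed to the GCN-LSTM model
```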
Weak feature extraction is of great importance for condition monitoring and intelligent diagnosis of aero-engines. Aiming at intelligent diagnosis of aero-engine main shaft bearings, an enhanced sparsity-assisted intelligent condition monitoring method is proposed in this paper. By analyzing the weakness of the convex sparse model, i.e., the tradeoff between noise reduction and feature reconstruction, this paper proposes an enhanced-sparsity nonconvex regularized convex model based on the Moreau envelope to achieve weak feature extraction. Accordingly, a sparsity-assisted deep convolutional variational autoencoder network is proposed, which achieves intelligent identification of the fault state by training on denoised normal data. Finally, the effectiveness of the proposed method is verified through an aero-engine bearing run-to-failure experiment. The comparison results show that the proposed method is good at abnormal pattern recognition, showing good potential for weak fault intelligent diagnosis of aero-engine main shaft bearings.
Most existing learning-based image dehazing methods are less able to perform well on real hazy images. An important reason is that they are trained on synthetic hazy images whose distribution is different from that of real hazy images. To alleviate this issue, this paper proposes a new hazy scene generation model based on domain adaptation, which uses a variational autoencoder to encode the synthetic hazy image pairs and the real hazy images into the latent space to adapt. The synthetic hazy image pairs guide the model to learn the mapping from clear images to hazy images, while the real hazy images are used to adapt the latent space of the synthetic hazy images to the real hazy images through a generative adversarial loss, so as to make the distribution of the generated hazy images as close to that of the real hazy images as possible. Comparing the results of the model with traditional physical scattering models and Adobe Lightroom CC software shows that the hazy images generated in this paper are more realistic. Our end-to-end domain adaptation model is also very convenient for synthesizing hazy images without a depth map. When a traditional method is used to dehaze the synthetic hazy images generated by our model, both SSIM and PSNR are improved, which proves the effectiveness of our method. The non-reference haze density evaluation algorithm and other quantitative evaluations also illustrate the advantages of our method for synthetic hazy images.
Data-driven garment animation is a current topic of interest in the computer graphics industry. Existing approaches generally establish the mapping between a single human pose, or a temporal pose sequence, and garment deformation, but it is difficult to quickly generate diverse clothed human animations. We address this problem with a method to automatically synthesize dressed human animations with temporal consistency from a specified human motion label. At the heart of our method is a two-stage strategy. Specifically, we first learn a latent space encoding the sequence-level distribution of human motions utilizing a transformer-based conditional variational autoencoder (Transformer-CVAE). Then a garment simulator synthesizes dynamic garment shapes using a transformer encoder-decoder architecture. Since the learned latent space comes from varied human motions, our method can generate a variety of styles of motions given a specific motion label. By means of a novel beginning-of-sequence (BOS) learning strategy and a self-supervised refinement procedure, our garment simulator is capable of efficiently synthesizing garment deformation sequences corresponding to the generated human motions while maintaining temporal and spatial consistency. We verify our ideas experimentally. This is the first generative model that directly dresses human animation.
As one of the essential tools for spatio-temporal traffic data mining, vehicle trajectory clustering is widely used to mine the behavior patterns of vehicles. However, uploading original vehicle trajectory data to the server for clustering carries the risk of privacy leakage. Therefore, one of the current challenges is determining how to perform vehicle trajectory clustering while protecting user privacy. We propose a privacy-preserving vehicle trajectory clustering framework and construct a vehicle trajectory clustering model (IKV) based on the variational autoencoder (VAE) and an improved K-means algorithm. In the framework, the client calculates the hidden variables of the vehicle trajectory and uploads the variables to the server; the server uses the hidden variables for clustering analysis and delivers the analysis results to the client. The IKV workflow is as follows: first, we train the VAE with historical vehicle trajectory data (when the VAE's decoder can approximate the original data, the encoder is deployed to the edge computing device); second, the edge device transmits the hidden variables to the server; finally, clustering is performed using improved K-means, which prevents the leakage of the vehicle trajectory. IKV is compared to numerous clustering methods on three datasets. In the nine performance comparison experiments, IKV achieves optimal or sub-optimal performance in six of the experiments. Furthermore, in the nine sensitivity analysis experiments, IKV not only demonstrates significant stability in seven experiments but also shows good robustness to hyperparameter variations. These results validate that the framework proposed in this paper is not only suitable for privacy-conscious production environments, such as carpooling tasks, but also adapts to clustering tasks of different magnitudes due to its low sensitivity to the number of cluster centers.
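The privacy-preserving split can be sketched as follows: only VAE latent codes leave the client, and the server clusters those codes. Plain K-means stands in for the paper's improved K-means, and the trajectory encoding is an assumption.

```python
# Sketch of the privacy-preserving split: the client shares only VAE latent codes
# of its trajectories; the server clusters those codes. The encoder, trajectory
# length, and the use of plain KMeans (instead of the paper's improved K-means)
# are simplifying assumptions.
import torch, torch.nn as nn
from sklearn.cluster import KMeans

traj_len, z_dim = 2 * 50, 10                     # 50 (lat, lon) points, flattened
encoder = nn.Sequential(nn.Linear(traj_len, 64), nn.ReLU(), nn.Linear(64, z_dim))

# --- client / edge side ---------------------------------------------------
trajectories = torch.randn(200, traj_len)        # raw trajectories never leave the client
with torch.no_grad():
    latents = encoder(trajectories).numpy()      # only these are uploaded

# --- server side ----------------------------------------------------------
clusters = KMeans(n_clusters=4, n_init=10).fit_predict(latents)
print(clusters[:10])                             # behaviour-pattern labels sent back
```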
Machine learning (ML) is a rapidly growing tool even in the lithium-ion battery (LIB) research field. To utilize this tool, more and more datasets have been published. However, the applicability of an ML model to different information sources or various LIB cell types has not been well studied. In this paper, an unsupervised learning model called the variational autoencoder (VAE) is evaluated with three datasets of charge-discharge cycles obtained under different conditions. The model was first trained with a publicly available dataset of commercial cylindrical cells, and then evaluated with our private datasets of commercial pouch and hand-made coin cells. These cells used different chemistries and were tested with different cycle testers for different purposes, which gives each dataset its own characteristics. We report that researchers can recognise these characteristics with the VAE and use them to plan proper data preprocessing. We also discuss the interpretability of an ML model.
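The kind of cross-dataset check described here can be sketched roughly as follows: train a VAE on one cell type's cycle features, then compare reconstruction error and latent statistics on another dataset to expose its characteristics. The feature definitions and shapes are assumptions.

```python
# Sketch of the cross-dataset check: train a VAE on one cell type's cycle features,
# then compare reconstruction error and latent statistics on another dataset to
# surface dataset-specific characteristics. Feature definitions are assumptions.
import torch, torch.nn as nn

d_in, d_z = 12, 3                                  # e.g. per-cycle summary features
enc = nn.Sequential(nn.Linear(d_in, 32), nn.ReLU(), nn.Linear(32, 2 * d_z))
dec = nn.Sequential(nn.Linear(d_z, 32), nn.ReLU(), nn.Linear(32, d_in))
opt = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=1e-3)

cylindrical = torch.randn(400, d_in)               # training dataset (public)
pouch = torch.randn(100, d_in) + 0.8               # evaluation dataset, shifted stats

for _ in range(300):
    mu, logvar = enc(cylindrical).chunk(2, dim=1)
    z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
    loss = nn.functional.mse_loss(dec(z), cylindrical) \
         - 0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    for name, data in [("cylindrical", cylindrical), ("pouch", pouch)]:
        mu, _ = enc(data).chunk(2, dim=1)
        err = nn.functional.mse_loss(dec(mu), data).item()
        print(name, "recon error:", round(err, 3), "latent mean:", mu.mean().item())
```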
The capability of Convolutional Neural Networks (CNNs) for sparse representation has significant application to complex tasks like Representation Learning (RL). However, labelled datasets of sufficient size for learning this representation are not easily obtainable. The unsupervised learning capability of Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) provides a promising solution to this issue through their capacity to learn representations for novel data samples and classification tasks. In this research, a texture-based latent space disentanglement technique is proposed to enhance the learning of representations for novel data samples. A comparison is performed among different VAEs and GANs with the proposed approach for the synthesis of new data samples. Two different VAE architectures are considered, a single-layer dense VAE and a convolution-based VAE, to compare the effectiveness of different architectures for learning the representations. The GANs are selected based on the distance metric for disjoint distribution divergence estimation of complex representation learning tasks. The proposed texture-based disentanglement has been shown to provide a significant improvement to the representation learning process by conditioning the random noise and synthesising texture-rich images of fruit and vegetables.
An optimal configuration method for a multi-energy microgrid system based on the deep joint generation of source-load-temperature scenarios is proposed to improve multi-energy complementation and the reliability of energy supply in extreme scenarios. First, based on historical meteorological data, the typical meteorological clusters and extreme temperature types are obtained. Then, to reflect the uncertainty of energy consumption and renewable energy output in different weather types, a deep joint generation model of radiation, electric load, and temperature scenarios, based on a denoising variational autoencoder, is established for each weather module. At the same time, to cover the potential high-energy-consumption scenarios with extreme temperatures, the extreme scenarios with fewer data samples are expanded. Then, the scenarios are reduced by clustering analysis. The normal days of different typical scenarios and extreme temperature scenarios are determined, and the cooling and heating loads are determined by temperature. Finally, the optimal configuration of the multi-energy microgrid system is carried out. Experiments show that the optimal configuration based on the extreme scenarios and typical scenarios can improve the power supply reliability of the system. The proposed method accurately captures the complementary potential of the energy sources, and the economy of the system configuration is improved by 14.56%.
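The scenario-generation step can be sketched as follows, assuming 24-hour radiation/load/temperature profiles for one weather type; the corruption level, network sizes, and sample counts are illustrative rather than the paper's configuration.

```python
# Sketch of the denoising-VAE scenario generator: corrupt the joint
# radiation/load/temperature profiles during training, then sample the prior to
# generate new joint scenarios for one weather type. Profile lengths are assumed.
import torch, torch.nn as nn

hours, channels, d_z = 24, 3, 8                    # 24-h radiation, load, temperature
d_in = hours * channels
enc = nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(), nn.Linear(64, 2 * d_z))
dec = nn.Sequential(nn.Linear(d_z, 64), nn.ReLU(), nn.Linear(64, d_in))
opt = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=1e-3)

profiles = torch.rand(300, d_in)                   # historical joint daily profiles

for _ in range(300):
    noisy = profiles + 0.05 * torch.randn_like(profiles)   # denoising corruption
    mu, logvar = enc(noisy).chunk(2, dim=1)
    z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
    loss = nn.functional.mse_loss(dec(z), profiles) \
         - 0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():                              # joint scenario generation
    scenarios = dec(torch.randn(1000, d_z)).reshape(1000, hours, channels)
print(scenarios.shape)                             # 1000 generated 24-h scenarios
```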
Building Integrated Photovoltaics (BIPV) is a promising technology to decarbonize urban energy systems via harnessing solar energy available on building envelopes. While methods to assess solar irradiation, especially on rooftops, are well established, the assessment on building facades usually involves a higher effort due to more complex urban features and obstructions. The drawback of existing physics-based simulation programs is that they require significant manual modeling effort and computing time for generating time-resolved deterministic results. Yet, solar irradiation is highly intermittent, and representing its inherent uncertainty may be required for designing robust BIPV energy systems. Targeting these drawbacks, this paper proposes a data-driven model based on Deep Generative Networks (DGN) to efficiently generate stochastic ensembles of annual hourly solar irradiance time series on building facades with uncompromised spatiotemporal resolution at the urban scale. The only input required is easily obtainable fisheye images as categorical shading masks captured from 3D models. In principle, even actual photographs of urban contexts can be utilized, given they are semantically segmented. The potential of our approach is that it may be applied as a surrogate for time-consuming simulations when information is lacking (e.g., no 3D model exists), and to use the generated stochastic time-series ensembles in robust energy systems planning. Our validations exemplify a good fidelity of the generated time series when compared to the physics-based simulator. Due to the nature of the used DGNs, it remains an open challenge to precisely reconstruct the ground truth one-to-one for each hour of the year. However, we consider the benefits of the approach to outweigh the shortcomings. To demonstrate the model's relevance for urban energy planning, we showcase its potential for generative design by parametrically altering characteristic features of the urban environment and producing corresponding time series on building facades under different climatic contexts in real time.
Significant progress has been made in image inpainting methods in recent years. However, they are incapable of producing inpainting results with reasonable structures, rich detail, and sharpness at the same time. In this paper, we propose the Pyramid-VAE-GAN network for image inpainting to address this limitation. Our network is built on a variational autoencoder (VAE) backbone that encodes high-level latent variables to represent complicated high-dimensional prior distributions of images. The prior assists in reconstructing reasonable structures when inpainting. We also adopt a pyramid structure in our model to maintain rich detail in low-level latent variables. To avoid the usual incompatibility of requiring both reasonable structures and rich detail, we propose a novel cross-layer latent variable transfer module. This transfers information about long-range structures contained in high-level latent variables to low-level latent variables representing more detailed information. We further use adversarial training to select the most reasonable results and to improve the sharpness of the images. Extensive experimental results on multiple datasets demonstrate the superiority of our method. Our code is available at https://github.com/thy960112/Pyramid-VAE-GAN.
Ensuring the safe and efficient operation of self-driving vehicles relies heavily on accurately predicting their future trajectories. Existing approaches commonly employ an encoder-decoder neural network structure to enhance information extraction during the encoding phase. However, these methods often neglect the inclusion of road rule constraints during trajectory formulation in the decoding phase. This paper proposes a novel method that combines neural networks and rule-based constraints in the decoder stage to improve trajectory prediction accuracy while ensuring compliance with vehicle kinematics and road rules. The approach separates vehicle trajectories into lateral and longitudinal routes and utilizes a conditional variational autoencoder (CVAE) to capture trajectory uncertainty. The evaluation results demonstrate reductions of 32.4% and 27.6% in the average displacement error (ADE) for predicting the top five and top ten trajectories, respectively, compared to the baseline method.
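The CVAE decoding stage can be sketched as follows: a latent sample per draw is concatenated with the encoded history and decoded into lateral/longitudinal offsets, with the rule-based filtering left as a placeholder; the horizon and sizes are assumptions.

```python
# Sketch of a CVAE trajectory head: condition on an encoded history, sample a
# latent per draw, and decode several candidate future (lateral, longitudinal)
# offset sequences. Horizon, sizes, and the kinematic/rule checks are assumed.
import torch, torch.nn as nn

hist_dim, z_dim, horizon = 64, 16, 12              # 12 future steps of (lat, lon)

prior = nn.Linear(hist_dim, 2 * z_dim)             # p(z | history)
decoder = nn.Sequential(nn.Linear(hist_dim + z_dim, 128), nn.ReLU(),
                        nn.Linear(128, horizon * 2))

history_code = torch.randn(1, hist_dim)            # output of the encoder stage
mu, logvar = prior(history_code).chunk(2, dim=1)

candidates = []
for _ in range(10):                                # draw top-10 style candidates
    z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
    traj = decoder(torch.cat([history_code, z], dim=1)).reshape(horizon, 2)
    candidates.append(traj)                        # rule/kinematic filtering would go here
print(torch.stack(candidates).shape)               # (10, 12, 2) candidate trajectories
```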