Landslides are destructive natural disasters that cause catastrophic damage and loss of life worldwide. Accurately predicting landslide displacement enables effective early warning and risk management. However, the limited availability of on-site measurement data has been a substantial obstacle in developing data-driven models, such as state-of-the-art machine learning (ML) models. To address this challenge, this study proposes a data augmentation framework that uses generative adversarial networks (GANs), a recent advance in generative artificial intelligence (AI), to improve the accuracy of landslide displacement prediction. The framework provides effective data augmentation to enhance limited datasets. A recurrent GAN model, RGAN-LS, is proposed, specifically designed to generate realistic synthetic multivariate time series that mimic the characteristics of real landslide on-site measurement data. A customized moment-matching loss is incorporated in addition to the adversarial loss during the training of RGAN-LS to capture the temporal dynamics and correlations in real time series data. The synthetic data generated by RGAN-LS are then used to enhance the training of long short-term memory (LSTM) networks and particle swarm optimization-support vector machine (PSO-SVM) models for landslide displacement prediction tasks. Results on two landslides in the Three Gorges Reservoir (TGR) region show a significant improvement in LSTM prediction performance when the model is trained on augmented data. For instance, for the Baishuihe landslide, the average root mean square error (RMSE) improves by 16.11% and the mean absolute error (MAE) by 17.59%. More importantly, the model’s responsiveness during mutational stages is enhanced for early warning purposes. However, the static PSO-SVM model sees only marginal gains compared to recurrent models such as LSTM. Further analysis indicates that an optimal synthetic-to-real data ratio (50% in the illustrated cases) maximizes the improvements. This also demonstrates the robustness and effectiveness of supplementing training data for dynamic models to obtain better results. By using a generative AI approach, RGAN-LS can generate high-fidelity synthetic landslide data. This is critical for improving the performance of advanced ML models in predicting landslide displacement, particularly when training data are limited. Additionally, this approach has the potential to expand the use of generative AI in geohazard risk management and other research areas.
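The abstract does not give the form of the customized moment-matching loss. As a rough illustration only, the sketch below assumes a penalty on the gap between per-feature means and standard deviations of real and synthetic batches. The function names and the data layout (a list of series, each a list of time steps, each a list of feature values) are hypothetical and not taken from the paper.

```python
from statistics import fmean, pstdev

def feature_moments(batch):
    """Return per-feature (mean, std) pooled over all series and time steps.

    `batch` is a list of series; each series is a list of time steps;
    each time step is a list of feature values.
    """
    n_features = len(batch[0][0])
    moments = []
    for j in range(n_features):
        values = [step[j] for series in batch for step in series]
        moments.append((fmean(values), pstdev(values)))
    return moments

def moment_matching_loss(real, fake):
    """Hypothetical penalty: average gap in per-feature mean and std."""
    gaps = [
        abs(m_r - m_f) + abs(s_r - s_f)
        for (m_r, s_r), (m_f, s_f) in zip(feature_moments(real), feature_moments(fake))
    ]
    return fmean(gaps)

# One toy series with three time steps and two features.
real = [[[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]]]
# A constant shift changes the means but leaves the spreads untouched.
shifted = [[[v + 1.0 for v in step] for step in series] for series in real]
print(moment_matching_loss(real, real))
print(moment_matching_loss(real, shifted))
```

This sketch matches only pooled first and second moments. The loss described in the abstract is meant to capture temporal dynamics and cross-variable correlations as well, which would require additional terms such as autocorrelation or cross-covariance gaps.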
Generating realistic synthetic video from text is a highly challenging task due to the multitude of issues involved, including digit deformation, noise interference between frames, blurred output, and the need for temporal coherence across frames. In this paper, we propose a novel approach for generating coherent videos of moving digits from textual input using a Deep Deconvolutional Generative Adversarial Network (DD-GAN). The DD-GAN comprises a Deep Deconvolutional Neural Network (DDNN) as a Generator (G) and a modified Deep Convolutional Neural Network (DCNN) as a Discriminator (D) to ensure temporal coherence between adjacent frames. The proposed research involves several steps. First, the input text is fed into a Long Short-Term Memory (LSTM) based text encoder and then smoothed using Conditioning Augmentation (CA) techniques to enhance the effectiveness of the Generator (G). Next, the DDNN generates video frames from the enhanced text and random noise, and the modified DCNN acts as the Discriminator (D), distinguishing between generated and real videos. This research evaluates the quality of the generated videos using standard metrics such as Inception Score (IS), Fréchet Inception Distance (FID), Fréchet Inception Distance for video (FID2vid), and Generative Adversarial Metric (GAM), along with a human study based on realism, coherence, and relevance. Experiments on Single-Digit Bouncing MNIST GIFs (SBMG), Two-Digit Bouncing MNIST GIFs (TBMG), and a custom dataset of essential mathematics videos with related text demonstrate significant improvements in both metrics and human study results, confirming the effectiveness of DD-GAN. This research also took on the challenge of generating preschool math videos from text, handling complex structures, digits, and symbols, and achieved successful results. The proposed research demonstrates promising results for generating coherent videos from textual input.
Sarcasm detection in text data is an increasingly vital area of research due to the prevalence of sarcastic content in online communication. This study addresses challenges associated with small datasets and class imbalances in sarcasm detection by employing comprehensive data pre-processing and Generative Adversarial Network (GAN) based augmentation on diverse datasets, including iSarcasm, SemEval-18, and Ghosh. This research offers a novel pipeline for augmenting sarcasm data with a Reverse Generative Adversarial Network (RGAN). The proposed RGAN method works by inverting labels between original and synthetic data during the training process. This inversion of labels provides feedback to the generator for generating high-quality data closely resembling the original distribution. Notably, the proposed RGAN model exhibits performance on par with standard GAN, showcasing its efficacy in augmenting text data. The exploration of various datasets highlights the nuanced impact of augmentation on model performance, with cautionary insights into maintaining a balance between synthetic and original data. The methodological framework encompasses comprehensive data pre-processing and GAN-based augmentation, with a comparison against Natural Language Processing Augmentation (NLPAug) as an alternative augmentation technique. Overall, the F1-score of the proposed technique exceeds that of the synonym replacement augmentation technique using NLPAug. The increase in F1-score in experiments using RGAN ranged from 0.066% to 1.054%, and the use of standard GAN resulted in a 2.88% increase in F1-score. The proposed RGAN model outperformed the NLPAug method and demonstrated comparable performance to standard GAN, emphasizing its efficacy in text data augmentation.
Recently, generative adversarial networks (GANs) have become a research focus of artificial intelligence. Inspired by the two-player zero-sum game, GANs comprise a generator and a discriminator, both trained under the adversarial learning idea. The goal of GANs is to estimate the underlying distribution of real data samples and generate new samples from that distribution. Since their introduction, GANs have been widely studied due to their enormous prospects for applications, including image and vision computing, speech and language processing, etc. In this review paper, we summarize the state of the art of GANs and look into the future. First, we survey the background of GANs, their theoretical and implementation models, and their application fields. Then, we discuss the advantages and disadvantages of GANs and their development trends. In particular, we investigate the relation between GANs and parallel intelligence, with the conclusion that GANs have great potential in parallel systems research in terms of virtual-real interaction and integration. Clearly, GANs can provide substantial algorithmic support for parallel intelligence.
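For reference, the two-player zero-sum game mentioned above is usually written as the standard minimax objective of the original GAN formulation. The symbols are not defined in the abstract and are introduced here as assumptions: $G$ is the generator, $D$ the discriminator, $p_{\text{data}}$ the real data distribution, and $p_z$ the noise prior.

```latex
\min_{G} \max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator maximizes $V$ by assigning high probability to real samples and low probability to generated ones, while the generator minimizes the same quantity, which is what makes the game zero-sum.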
Sampling-based path planning is a popular methodology for robot path planning. With a uniform sampling strategy to explore the state space, a feasible path can be found without complex geometric modeling of the configuration space. However, the quality of the initial solution is not guaranteed, and convergence to the optimal solution is slow. In this paper, we present a novel image-based path planning algorithm to overcome these limitations. Specifically, a generative adversarial network (GAN) is designed to take the environment map (represented as an RGB image) as input without any other preprocessing. The output is also an RGB image in which the promising region (where a feasible path probably exists) is segmented. This promising region is used as a heuristic to achieve non-uniform sampling for the path planner. We conduct a number of simulation experiments to validate the effectiveness of the proposed method, and the results demonstrate that our method performs much better in terms of the quality of the initial solution and the convergence speed to the optimal solution. Furthermore, beyond environments similar to the training set, our method also works well in environments that are very different from the training set.
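The non-uniform sampling step can be sketched as follows. This is a minimal illustration, assuming the segmented promising region has already been reduced to a binary grid mask and that the planner mixes biased and uniform draws with a probability `bias`. The mask, the `bias` parameter, and the function names are hypothetical and not taken from the paper.

```python
import random

def promising_cells(mask):
    """Collect (row, col) cells flagged as promising in a binary mask."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]

def sample_cell(mask, rng, bias=0.8):
    """Draw a cell: with probability `bias` from the promising region, else uniformly."""
    cells = promising_cells(mask)
    if cells and rng.random() < bias:
        return rng.choice(cells)
    rows, cols = len(mask), len(mask[0])
    return (rng.randrange(rows), rng.randrange(cols))

# Toy 4x4 map whose diagonal band stands in for a segmented promising region.
mask = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
rng = random.Random(0)
# With bias=1.0 every sample is drawn from the promising region.
samples = [sample_cell(mask, rng, bias=1.0) for _ in range(20)]
```

Keeping `bias` below 1 leaves some uniform draws in the mix, so the planner can still explore the rest of the map if the predicted region misses part of a feasible path.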
Generative adversarial networks (GANs) have become a competitive method for computer vision tasks. Many studies have been devoted to using generative networks for generative tasks, such as image synthesis. In this paper, a semi-supervised learning scheme is combined with a generative adversarial network on image classification tasks to improve classification accuracy. Two applications of GANs are the main focus: semi-supervised learning and the generation of images that are as realistic as possible. The whole process is divided into two stages. First, only a small part of the dataset is used as labeled training data. Then, a large number of samples produced by the generator are added to the training samples to improve the generalization of the discriminator. Through the semi-supervised learning scheme, full use is made of the unlabeled data, which may contain useful information. Thus, the classification accuracy of the discriminator can be improved. Experimental results demonstrate the improvement in the classification accuracy of the discriminator on different datasets, such as MNIST and CIFAR-10.
With the aperture synthesis (AS) technique, a number of small antennas can be assembled to form a large telescope whose spatial resolution is determined by the distance between the two farthest antennas instead of the diameter of a single-dish antenna. In contrast to a direct imaging system, an AS telescope captures the Fourier coefficients of a spatial object and then applies an inverse Fourier transform to reconstruct the spatial image. Due to the limited number of antennas, the Fourier coefficients are extremely sparse in practice, resulting in a very blurry image. To remove or reduce the blur, “CLEAN” deconvolution has been widely used in the literature. However, it was initially designed for a point source. For an extended source, like the Sun, its efficiency is unsatisfactory. In this study, a deep neural network, specifically a generative adversarial network (GAN), is proposed for solar image deconvolution. The experimental results demonstrate that the proposed model is markedly better than traditional CLEAN on solar images. The main purpose of this work is visual inspection rather than quantitative scientific computation. We believe that this will also help scientists better understand solar phenomena with high-quality images.
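The blur caused by sparse Fourier sampling can be seen in a one-dimensional toy example. The sketch below is an illustration only, not the paper's pipeline: it assumes a 16-sample "sky" containing a single point source and a hypothetical set of measured Fourier indices, and it reconstructs the signal from those coefficients alone.

```python
import cmath

def dft(signal):
    """Discrete Fourier transform of a 1-D sequence (direct O(n^2) sum)."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]

def inverse_dft(coeffs):
    """Inverse discrete Fourier transform, matching `dft` above."""
    n = len(coeffs)
    return [
        sum(c * cmath.exp(2j * cmath.pi * k * t / n) for k, c in enumerate(coeffs)) / n
        for t in range(n)
    ]

def dirty_signal(signal, sampled):
    """Reconstruct from only the Fourier coefficients whose index is in `sampled`."""
    coeffs = dft(signal)
    kept = [c if k in sampled else 0 for k, c in enumerate(coeffs)]
    return inverse_dft(kept)

# A point source at index 3 of a 16-sample "sky".
sky = [0.0] * 16
sky[3] = 1.0
full = dirty_signal(sky, set(range(16)))    # all coefficients measured
sparse = dirty_signal(sky, {0, 1, 2, 15})   # only 4 of 16 measured
```

With every coefficient available the point source is recovered. With 4 of 16 coefficients the peak drops to 4/16 of its height and the remainder spreads into neighboring samples, which is the kind of blur that CLEAN or a learned deconvolution model tries to undo.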
Class imbalance is a common characteristic of industrial data that adversely affects industrial data mining because it leads to the biased training of machine learning models. To address this issue, the augmentation of samples in minority classes based on generative adversarial networks (GANs) has been demonstrated as an effective approach. This study proposes a novel GAN-based minority class augmentation approach named classifier-aided minority augmentation generative adversarial network (CMAGAN). In the CMAGAN framework, an outlier elimination strategy is first applied to each class to minimize the negative impacts of outliers. Subsequently, a newly designed boundary-strengthening learning GAN (BSLGAN) is employed to generate additional samples for minority classes. By incorporating a supplementary classifier and innovative training mechanisms, the BSLGAN focuses on learning the distribution of samples near classification boundaries. Consequently, it can fully capture the characteristics of the target class and generate highly realistic samples with clear boundaries. Finally, the new samples are filtered based on the Mahalanobis distance to ensure that they are within the desired distribution. To evaluate the effectiveness of the proposed approach, CMAGAN was used to solve the class imbalance problem in eight real-world fault-prediction applications. The performance of CMAGAN was compared with that of seven other algorithms, including state-of-the-art GAN-based methods, and the results indicated that CMAGAN could provide higher-quality augmented results.
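The final filtering step can be illustrated with a small sketch. It assumes two-dimensional samples, the population covariance of the real minority-class samples, and a fixed distance threshold. The threshold value and the function names are hypothetical, since the abstract does not state how the cut-off is chosen.

```python
import math

def mahalanobis_2d(point, reference):
    """Mahalanobis distance from `point` to the 2-D sample set `reference`.

    Uses the population covariance (divide by n) and assumes it is non-singular.
    """
    n = len(reference)
    mx = sum(p[0] for p in reference) / n
    my = sum(p[1] for p in reference) / n
    sxx = sum((p[0] - mx) ** 2 for p in reference) / n
    syy = sum((p[1] - my) ** 2 for p in reference) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in reference) / n
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T, written out for the 2x2 case.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

def filter_generated(candidates, reference, threshold):
    """Keep generated samples whose distance to the real class is within `threshold`."""
    return [p for p in candidates if mahalanobis_2d(p, reference) <= threshold]

# Real minority-class samples (toy data) and three generated candidates.
real_samples = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
generated = [(1.0, 1.0), (2.0, 2.0), (4.0, 1.0)]
kept = filter_generated(generated, real_samples, threshold=2.0)
```

Unlike a plain Euclidean cut-off, the Mahalanobis distance scales each direction by the spread of the real class, so a candidate is rejected only when it is far away relative to how much the class itself varies in that direction.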
Recently, the evolution of Generative Adversarial Networks (GANs) has begun to revolutionize the field of artificial and computational intelligence. To improve the generating ability of GANs, various loss functions have been introduced to measure the degree of similarity between the samples generated by the generator and the real data samples, and the effectiveness of these loss functions in improving the generating ability of GANs has been studied. In this paper, we present a detailed survey of the loss functions used in GANs and provide a critical analysis of their pros and cons. First, the basic theory of GANs and their training mechanism are introduced. Second, the most commonly used loss functions in GANs are introduced and analyzed. Third, experimental analyses and comparisons of these loss functions are presented for different GAN architectures. Finally, several suggestions on choosing suitable loss functions for image synthesis tasks are given.
In this paper, we propose a hybrid model that maps the input noise vector to the label of the image generated by a generative adversarial network (GAN). This model mainly consists of a pre-trained deep convolutional generative adversarial network (DCGAN) and a classifier. Using the model, we visualize the distribution of the two-dimensional input noise that leads to a specific type of generated image after each training epoch of the GAN. The visualization reveals the distribution features of the input noise vector and the performance of the generator. With this feature, we try to build a guided generator (GG) with the ability to produce the fake image we need. Two methods are proposed to build the GG. One is the most significant noise (MSN) method, and the other uses labeled noise. The MSN method can generate images precisely but with less variation. In contrast, the labeled noise method produces more variation but is slightly less stable. Finally, we propose a criterion to measure the performance of the generator, which can be used as a loss function to effectively train the network.
The Quick Access Recorder (QAR), an important device for storing data from various flight parameters, contains a large amount of valuable data and comprehensively records the real state of an airline flight. However, the recorded data have missing values due to factors such as weather and equipment anomalies. These missing values seriously affect the analysis of QAR data by aeronautical engineers, such as airline flight scenario reproduction and airline flight safety status assessment. Therefore, imputing missing values in QAR data, which can further guarantee the flight safety of airlines, is crucial. QAR data also have multivariate, multiprocess, and temporal features. We therefore propose the imputation models A-AEGAN ("A" denotes attention mechanism, "AE" denotes autoencoder, and "GAN" denotes generative adversarial network) and SA-AEGAN ("SA" denotes self-attention mechanism) for missing values of QAR data. Specifically, we apply a generative adversarial network to impute missing values in QAR data. An improved gated recurrent unit is introduced as the neural unit of the GAN, which can capture the temporal relationships in QAR data. In addition, we modify the basic structure of the GAN by using an autoencoder as the generator and a recurrent neural network as the discriminator. The missing values in the QAR data are imputed by using the adversarial relationship between the generator and the discriminator. We introduce an attention mechanism in the autoencoder to further improve the capability of the proposed model to capture the features of QAR data. Attention mechanisms can maintain the correlation among QAR data and improve the capability of the model to impute missing data. Furthermore, we improve the proposed model by integrating a self-attention mechanism to further capture the relationship between different parameters within the QAR data. Experimental results on real datasets demonstrate that the model can reasonably impute the missing values in QAR data with excellent results.
The objective of style transfer is to maintain the content of an image while transferring the style of another image. However, conventional methods face challenges in preserving facial features, especially in Korean portraits where elements like the “Gat” (a traditional Korean hat) are prevalent. This paper proposes a deep learning network designed to perform style transfer that includes the “Gat” while preserving the identity of the face. Unlike traditional style transfer techniques, the proposed method aims to preserve the texture, attire, and the “Gat” in the style image by employing image sharpening and facial landmarks together with a GAN. The color, texture, and intensity were extracted differently based on the characteristics of each block and layer of the pre-trained VGG-16, and only the elements necessary during training were preserved using a facial landmark mask. The head area was located using the eyebrow area in order to transfer the “Gat”. Furthermore, the identity of the face was retained, and style correlation was considered based on the Gram matrix. To evaluate performance, we introduced a metric using PSNR and SSIM, with an emphasis on median values through new weightings for style transfer in Korean portraits. Additionally, we conducted a survey that evaluated the content, style, and naturalness of the transferred results; based on this assessment, we conclude that our method surpasses previous work in maintaining the integrity of content. Our approach, enriched by landmark preservation and diverse loss functions, including those related to the “Gat”, outperformed previous work in facial identity preservation.
Super-resolution reconstruction in medical imaging has become more demanding due to the necessity of obtaining high-quality images under constraints such as minimal radiation dose or low-field magnetic resonance imaging (MRI). However, image super-resolution reconstruction remains a difficult task because of the complexity and high texture requirements for diagnostic purposes. In this paper, we offer a deep learning based strategy for reconstructing medical images from low resolutions using a Transformer and generative adversarial networks (T-GAN). By inserting the Transformer into the generative adversarial network for image reconstruction, the integrated system can extract more precise texture information and focus more on important locations through global image matching. Furthermore, we use a weighted combination of content loss, adversarial loss, and adversarial feature loss as the final multi-task loss function during the training of the proposed T-GAN model. As measured by established metrics such as peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM), the proposed T-GAN achieves optimal performance and recovers more texture features in super-resolution reconstruction of MRI scans of the knee and abdomen.
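PSNR, one of the two metrics named above, follows directly from the mean squared error between a reference image and its reconstruction. The sketch below uses the standard definition, with images represented as nested lists of pixel intensities and an assumed peak value of 255. The pixel values in the example are made up for illustration.

```python
import math

def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images."""
    ref = [v for row in reference for v in row]
    rec = [v for row in reconstructed for v in row]
    if len(ref) != len(rec):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy 1x2 "images": each pixel is off by 10 intensity levels, so MSE = 100.
score = psnr([[100, 100]], [[110, 90]])
```

Higher values indicate a reconstruction closer to the reference. PSNR is purely pixel-wise, which is one reason it is commonly reported alongside a structural metric such as SSIM.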
Concrete subjected to fire loads is susceptible to explosive spalling, which can lead to the exposure of reinforcing steel bars to the fire, substantially jeopardizing structural safety and stability. The spalling of fire-loaded concrete is closely related to the evolution of pore pressure and temperature. Conventional analytical methods involve the resolution of complex, strongly coupled multifield equations, necessitating significant computational effort. To rapidly and accurately obtain the distributions of pore pressure and temperature, the Pix2Pix model, which is known for its capabilities in image generation, is adopted in this work. The open-source dataset used herein features RGB images we generated using a sophisticated coupled model, while the grayscale images encapsulate the 15 principal variables influencing spalling. After a series of tests with different layer configurations, activation functions, and loss functions, a Pix2Pix model suitable for assessing the spalling risk of fire-loaded concrete has been designed and trained. The applicability and reliability of the Pix2Pix model in concrete parameter prediction are verified by comparing its outcomes with those derived from the strongly coupled THC model. Notably, for practical engineering applications, our findings indicate that using monochrome images as the initial target for analysis yields more dependable results. This work not only offers valuable insights for civil engineers specializing in concrete structures but also establishes a robust methodological approach for researchers seeking to create similar predictive models.
Accurate displacement prediction is critical for the early warning of landslides. The complexity of the coupling relationship between multiple influencing factors and displacement makes accurate prediction difficult. Moreover, in engineering practice, insufficient monitoring data limit the performance of prediction models. To alleviate this problem, a displacement prediction method based on multisource domain transfer learning, which helps accurately predict data in the target domain through the knowledge of one or more source domains, is proposed. First, an optimized variational mode decomposition model based on the minimum sample entropy is used to decompose the cumulative displacement into trend, periodic, and stochastic components. The trend component is predicted by an autoregressive model, and the periodic component is predicted by a long short-term memory network. Because the stochastic component is affected by uncertainties, it is predicted by a combination of a Wasserstein generative adversarial network and multisource domain transfer learning for improved prediction accuracy. The proposed prediction method was validated using a real mine slope as a case study. This study therefore provides new insights that can be applied to scenarios lacking sample data.
Unsupervised multi-modal image translation is an emerging domain of computer vision whose goal is to transform an image from the source domain into many diverse styles in the target domain. However, the advanced approaches available employ a multi-generator mechanism to model different domain mappings, which results in inefficient training of neural networks and mode collapse, leading to limited diversity in the generated images. To address this issue, this paper introduces a multi-modal unsupervised image translation framework that uses a single generator to perform multi-modal image translation. Specifically, first, a domain code is introduced to explicitly control the different generation tasks. Second, the squeeze-and-excitation (SE) mechanism and a feature attention (FA) module are incorporated. Finally, the model integrates multiple optimization objectives to ensure efficient multi-modal translation. Qualitative and quantitative experiments on multiple unpaired benchmark image translation datasets demonstrate the benefits of the proposed method over existing techniques. Overall, the experimental results show that the proposed method is versatile and scalable.
The recent interest in the deployment of Generative AI applications that use large language models (LLMs) has brought to the forefront significant privacy concerns, notably the leakage of Personally Identifiable Information (PII) and other confidential or protected information that may have been memorized during training, specifically during a fine-tuning or customization process. This inadvertent leakage of sensitive information typically occurs when the models are subjected to black-box attacks. To address the growing concerns of safeguarding private and sensitive information while simultaneously preserving its utility, we analyze the performance of Targeted Catastrophic Forgetting (TCF). TCF involves preserving targeted pieces of sensitive information within datasets through an iterative pipeline which significantly reduces the likelihood of such information being leaked or reproduced by the model during black-box attacks, such as the autocompletion attack in our case. The experiments conducted using TCF demonstrate its capability to reduce the extraction of PII while still preserving the context and utility of the target application.
The fifth-generation (5G) mobile communication system has been widely developed and deployed, reshaping our daily lives. However, anomalies such as cell outages and congestion in 5G critically influence the quality of experience and significantly increase operational expenditures. Although several big data and artificial intelligence-based anomaly detection methods have been proposed for wireless cellular systems, they change the distribution of the data and ignore the relevance among user activities, making anomaly detection ineffective for some cells. In this paper, we propose a highly effective and accurate anomaly detection framework that uses generative adversarial networks (GANs) and long short-term memory (LSTM) neural networks. The framework expands the original dataset while keeping the distribution of the data unchanged, and explores the relevance among user activities to further improve system performance. The results demonstrate that our framework can achieve 97.16% accuracy and a 2.30% false positive rate by exploiting the correlation of user activities and data expansion.
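The two reported metrics follow from the entries of a binary confusion matrix. The sketch below uses the standard definitions. The counts are invented purely for illustration and are not the paper's data.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all cells classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def false_positive_rate(fp, tn):
    """Fraction of normal cells wrongly flagged as anomalous."""
    return fp / (fp + tn)

# Hypothetical counts: 40 anomalies caught, 5 missed, 50 normal cells passed, 5 false alarms.
acc = accuracy(tp=40, tn=50, fp=5, fn=5)
fpr = false_positive_rate(fp=5, tn=50)
```

Reporting both matters for anomaly detection, because when anomalies are rare a detector can reach high accuracy simply by never raising an alarm, while the false positive rate shows how often normal cells are flagged.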
The sonar systems of marine animals, evolved through natural selection, are far superior to current artificial sonar. Therefore, the development of bionic underwater concealed detection is of great strategic significance to the military and the economy. In this paper, a generative adversarial network (GAN) is trained on a dolphin vocal sound dataset we constructed, which achieves unsupervised generation of dolphin vocal sounds with global consistency. Analysis of the generated and real audio samples in the time domain and the frequency domain shows that the generated audio samples are close to the real ones, which meets the requirements of bionic underwater concealed detection.
Funding (RGAN-LS landslide displacement study): supported by the Natural Science Foundation of Jiangsu Province (Grant No. BK20220421), the State Key Program of the National Natural Science Foundation of China (Grant No. 42230702), and the National Natural Science Foundation of China (Grant No. 82302352).
Abstract: Landslides are destructive natural disasters that cause catastrophic damage and loss of life worldwide. Accurately predicting landslide displacement enables effective early warning and risk management. However, the limited availability of on-site measurement data has been a substantial obstacle in developing data-driven models, such as state-of-the-art machine learning (ML) models. To address these challenges, this study proposes a data augmentation framework that uses generative adversarial networks (GANs), a recent advance in generative artificial intelligence (AI), to improve the accuracy of landslide displacement prediction. The framework provides effective data augmentation to enhance limited datasets. A recurrent GAN model, RGAN-LS, is proposed, specifically designed to generate realistic synthetic multivariate time series that mimic the characteristics of real landslide on-site measurement data. A customized moment-matching loss is incorporated in addition to the adversarial loss during the training of RGAN-LS to capture the temporal dynamics and correlations in real time series data. Then, the synthetic data generated by RGAN-LS is used to enhance the training of long short-term memory (LSTM) networks and particle swarm optimization-support vector machine (PSO-SVM) models for landslide displacement prediction tasks. Results on two landslides in the Three Gorges Reservoir (TGR) region show a significant improvement in LSTM model prediction performance when trained on augmented data. For instance, in the case of the Baishuihe landslide, the average root mean square error (RMSE) improves by 16.11%, and the mean absolute error (MAE) by 17.59%. More importantly, the model’s responsiveness during mutational stages is enhanced for early warning purposes. However, the results show that the static PSO-SVM model sees only marginal gains compared to recurrent models such as LSTM. Further analysis indicates that an optimal synthetic-to-real data ratio (50% in the illustrative cases) maximizes the improvements. This also demonstrates the robustness and effectiveness of supplementing training data for dynamic models to obtain better results. By using the powerful generative AI approach, RGAN-LS can generate high-fidelity synthetic landslide data. This is critical for improving the performance of advanced ML models in predicting landslide displacement, particularly when training data are limited. Additionally, this approach has the potential to expand the use of generative AI in geohazard risk management and other research areas.
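The abstract does not give the exact form of the moment-matching loss, so the following is only a minimal sketch under stated assumptions: the loss is taken to be the sum of absolute gaps in per-feature mean and standard deviation between a real and a synthetic batch, and the 50% synthetic-to-real ratio is read as "add half as many synthetic samples as there are real ones". The names `moment_matching_loss` and `augment` are hypothetical and are not taken from RGAN-LS.

```python
from statistics import fmean, pstdev


def moment_matching_loss(real, synthetic):
    """Sum of absolute gaps in per-feature mean and standard deviation.

    real, synthetic: lists of samples, each sample a list of feature values.
    Assumed form for illustration; the paper's customized loss may differ.
    """
    n_features = len(real[0])
    loss = 0.0
    for j in range(n_features):
        real_col = [row[j] for row in real]
        syn_col = [row[j] for row in synthetic]
        # First moment: gap between the feature means.
        loss += abs(fmean(real_col) - fmean(syn_col))
        # Second moment: gap between the population standard deviations.
        loss += abs(pstdev(real_col) - pstdev(syn_col))
    return loss


def augment(real_samples, synthetic_samples, ratio=0.5):
    """Append synthetic samples up to `ratio` times the number of real samples."""
    n_extra = int(len(real_samples) * ratio)
    return list(real_samples) + list(synthetic_samples[:n_extra])
```

Identical batches give a loss of zero, and a batch shifted by a constant is penalized only through the mean term, which is the behavior one would want from a term that complements the adversarial loss.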
Funding: Supported by the General Program of the National Natural Science Foundation of China (Grant No. 61977029).
Abstract: Generating realistic synthetic video from text is a highly challenging task due to the multitude of issues involved, including digit deformation, noise interference between frames, blurred output, and the need for temporal coherence across frames. In this paper, we propose a novel approach for generating coherent videos of moving digits from textual input using a Deep Deconvolutional Generative Adversarial Network (DD-GAN). The DD-GAN comprises a Deep Deconvolutional Neural Network (DDNN) as a Generator (G) and a modified Deep Convolutional Neural Network (DCNN) as a Discriminator (D) to ensure temporal coherence between adjacent frames. The proposed research involves several steps. First, the input text is fed into a Long Short-Term Memory (LSTM) based text encoder and then smoothed using Conditioning Augmentation (CA) techniques to enhance the effectiveness of the Generator (G). Next, the DDNN generates video frames from the enhanced text and random noise, and the modified DCNN acts as the Discriminator (D), distinguishing between generated and real videos. This research evaluates the quality of the generated videos using standard metrics such as Inception Score (IS), Fréchet Inception Distance (FID), Fréchet Inception Distance for video (FID2vid), and Generative Adversarial Metric (GAM), along with a human study based on realism, coherence, and relevance. Experiments on Single-Digit Bouncing MNIST GIFs (SBMG), Two-Digit Bouncing MNIST GIFs (TBMG), and a custom dataset of essential mathematics videos with related text demonstrate significant improvements in both metrics and human study results, confirming the effectiveness of DD-GAN. This research also took on the challenge of generating preschool math videos from text, handling complex structures, digits, and symbols, and achieved successful results. The proposed research demonstrates promising results for generating coherent videos from textual input.
Abstract: Sarcasm detection in text data is an increasingly vital area of research due to the prevalence of sarcastic content in online communication. This study addresses challenges associated with small datasets and class imbalances in sarcasm detection by employing comprehensive data pre-processing and Generative Adversarial Network (GAN) based augmentation on diverse datasets, including iSarcasm, SemEval-18, and Ghosh. This research offers a novel pipeline for augmenting sarcasm data with a Reverse Generative Adversarial Network (RGAN). The proposed RGAN method works by inverting labels between original and synthetic data during the training process. This inversion of labels provides feedback to the generator for generating high-quality data closely resembling the original distribution. Notably, the proposed RGAN model exhibits performance on par with standard GAN, showcasing its robust efficacy in augmenting text data. The exploration of various datasets highlights the nuanced impact of augmentation on model performance, with cautionary insights into maintaining a delicate balance between synthetic and original data. The methodological framework encompasses comprehensive data pre-processing and GAN-based augmentation, with a meticulous comparison against Natural Language Processing Augmentation (NLPAug) as an alternative augmentation technique. Overall, the F1-score of our proposed technique outperforms that of the synonym replacement augmentation technique using NLPAug. The increase in F1-score in experiments using RGAN ranged from 0.066% to 1.054%, and the use of standard GAN resulted in a 2.88% increase in F1-score. The proposed RGAN model outperformed the NLPAug method and demonstrated comparable performance to standard GAN, emphasizing its efficacy in text data augmentation.
Funding: Supported by the National Natural Science Foundation of China (61533019, 71232006, 91520301).
Abstract: Recently, generative adversarial networks (GANs) have become a research focus of artificial intelligence. Inspired by the two-player zero-sum game, GANs comprise a generator and a discriminator, both trained under the adversarial learning idea. The goal of GANs is to estimate the potential distribution of real data samples and generate new samples from that distribution. Since their initiation, GANs have been widely studied due to their enormous prospects for applications, including image and vision computing, speech and language processing, etc. In this review paper, we summarize the state of the art of GANs and look into the future. Firstly, we survey GANs' proposal background, theoretic and implementation models, and application fields. Then, we discuss GANs' advantages and disadvantages, and their development trends. In particular, we investigate the relation between GANs and parallel intelligence, with the conclusion that GANs have great potential in parallel systems research in terms of virtual-real interaction and integration. Clearly, GANs can provide substantial algorithmic support for parallel intelligence.
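The two-player zero-sum game between generator and discriminator is usually written as a minimax objective. The block below gives the standard formulation from the original GAN literature; it is not quoted from this review, and the symbols ($G$, $D$, $p_{\text{data}}$, $p_z$) follow common convention rather than the abstract.

```latex
\min_{G} \max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \bigl( 1 - D(G(z)) \bigr) \right]
```

Here $D(x)$ is the discriminator's estimate that $x$ is a real sample and $G(z)$ maps a noise vector $z$ to a generated sample. The discriminator maximizes $V$ while the generator minimizes it, which is the sense in which the game is zero-sum.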
Funding: This work was partially supported by the National Key R&D Program of China (2019YFB1312400), Shenzhen Key Laboratory of Robotics Perception and Intelligence (ZDSYS20200810171800001), Hong Kong RGC GRF (14200618), and Hong Kong RGC CRF (C4063-18G).
Abstract: Sampling-based path planning is a popular methodology for robot path planning. With a uniform sampling strategy to explore the state space, a feasible path can be found without complex geometric modeling of the configuration space. However, the quality of the initial solution is not guaranteed, and convergence to the optimal solution is slow. In this paper, we present a novel image-based path planning algorithm to overcome these limitations. Specifically, a generative adversarial network (GAN) is designed to take the environment map (represented as an RGB image) as the input without any other preprocessing. The output is also an RGB image in which the promising region (where a feasible path probably exists) is segmented. This promising region is used as a heuristic to achieve non-uniform sampling for the path planner. We conduct a number of simulation experiments to validate the effectiveness of the proposed method, and the results demonstrate that our method performs much better in terms of the quality of the initial solution and the convergence speed to the optimal solution. Furthermore, beyond environments similar to the training set, our method also works well on environments that are very different from the training set.
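The non-uniform sampling idea can be sketched as follows. This is a hedged illustration and not the authors' planner: it assumes the GAN output has already been converted into a list of pixel coordinates `promising_cells`, and the mixing probability `bias` is a hypothetical parameter. With probability `bias` a sample is drawn from the promising region; otherwise it is drawn uniformly from the whole map, so the planner can still explore outside the predicted region.

```python
import random


def sample_point(width, height, promising_cells, bias=0.8, rng=random):
    """Return an (x, y) sample for a sampling-based planner.

    promising_cells: list of (x, y) pixels assumed to come from the
    segmented promising region; bias: probability of sampling from it.
    """
    if promising_cells and rng.random() < bias:
        # Heuristic branch: focus sampling where a path probably exists.
        return rng.choice(promising_cells)
    # Fallback branch: uniform sampling over the full map.
    return (rng.randrange(width), rng.randrange(height))
```

Keeping a nonzero share of uniform samples is a common design choice for biased samplers, because it leaves every cell reachable even when the predicted region is wrong.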
Funding: Supported by the National Natural Science Foundation of China (No. 61501457) and the National Key Technology R&D Program (No. 2015BAK21B00).
Abstract: Generative adversarial networks (GANs) have become a competitive method for computer vision tasks. Many studies have been devoted to using generative networks for generative tasks, such as image synthesis. In this paper, a semi-supervised learning scheme is combined with a generative adversarial network on image classification tasks to improve classification accuracy. Two applications of GANs are the main focus: semi-supervised learning and the generation of images that are as realistic as possible. The whole process is divided into two stages. First, only a small part of the dataset is used as labeled training data. Then, a large number of samples produced by the generator are added to the training samples to improve the generalization of the discriminator. Through the semi-supervised learning scheme, full use is made of the unlabeled data, which may contain potential information. Thus, the classification accuracy of the discriminator can be improved. Experimental results demonstrate the improvement in the classification accuracy of the discriminator on different datasets, such as MNIST and CIFAR-10.
Funding: The National Natural Science Foundation of China (NSFC) (Grant Nos. 61572461, 61811530282, 61872429, 11790301 and 11790305).
Abstract: With the aperture synthesis (AS) technique, a number of small antennas can be assembled to form a large telescope whose spatial resolution is determined by the distance between the two farthest antennas instead of the diameter of a single-dish antenna. In contrast to a direct imaging system, an AS telescope captures the Fourier coefficients of a spatial object and then applies an inverse Fourier transform to reconstruct the spatial image. Due to the limited number of antennas, the Fourier coefficients are extremely sparse in practice, resulting in a very blurry image. To remove or reduce blur, “CLEAN” deconvolution has been widely used in the literature. However, it was initially designed for a point source. For an extended source, like the Sun, its efficiency is unsatisfactory. In this study, a deep neural network, namely a Generative Adversarial Network (GAN), is proposed for solar image deconvolution. The experimental results demonstrate that the proposed model is markedly better than traditional CLEAN on solar images. The main purpose of this work is visual inspection rather than quantitative scientific computation. We believe that this will also help scientists better understand solar phenomena with high-quality images.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52375256) and the Natural Science Foundation of Shanghai Municipality (Grant Nos. 21ZR1431500 and 23ZR1431600).
Abstract: Class imbalance is a common characteristic of industrial data that adversely affects industrial data mining because it leads to the biased training of machine learning models. To address this issue, the augmentation of samples in minority classes based on generative adversarial networks (GANs) has been demonstrated as an effective approach. This study proposes a novel GAN-based minority class augmentation approach named classifier-aided minority augmentation generative adversarial network (CMAGAN). In the CMAGAN framework, an outlier elimination strategy is first applied to each class to minimize the negative impacts of outliers. Subsequently, a newly designed boundary-strengthening learning GAN (BSLGAN) is employed to generate additional samples for minority classes. By incorporating a supplementary classifier and innovative training mechanisms, the BSLGAN focuses on learning the distribution of samples near classification boundaries. Consequently, it can fully capture the characteristics of the target class and generate highly realistic samples with clear boundaries. Finally, the new samples are filtered based on the Mahalanobis distance to ensure that they are within the desired distribution. To evaluate the effectiveness of the proposed approach, CMAGAN was used to solve the class imbalance problem in eight real-world fault-prediction applications. The performance of CMAGAN was compared with that of seven other algorithms, including state-of-the-art GAN-based methods, and the results indicated that CMAGAN could provide higher-quality augmented results.
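The final filtering step can be illustrated with a small sketch. It is not the CMAGAN implementation: it assumes two-dimensional samples so the covariance inverse can be written out by hand, and the cut-off `threshold=3.0` is a hypothetical value chosen only for the example.

```python
from statistics import fmean


def mahalanobis_2d(point, samples):
    """Mahalanobis distance from `point` to the distribution of 2-D `samples`."""
    xs = [s[0] for s in samples]
    ys = [s[1] for s in samples]
    mx, my = fmean(xs), fmean(ys)
    n = len(samples)
    # Sample covariance matrix entries (n - 1 denominator).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, with the 2x2 inverse expanded.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 ** 0.5


def filter_generated(generated, real_samples, threshold=3.0):
    """Keep generated samples whose distance to the real class is within threshold."""
    return [g for g in generated if mahalanobis_2d(g, real_samples) <= threshold]
```

Unlike a plain Euclidean cut-off, the Mahalanobis distance scales each direction by the class's own spread, so a generated sample is rejected only when it is far away relative to how the real samples vary.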
Abstract: Recently, the evolution of Generative Adversarial Networks (GANs) has embarked on a journey of revolutionizing the field of artificial and computational intelligence. To improve the generating ability of GANs, various loss functions are introduced to measure the degree of similarity between the samples generated by the generator and the real data samples, and the effectiveness of the loss functions in improving the generating ability of GANs. In this paper, we present a detailed survey of the loss functions used in GANs, and provide a critical analysis of the pros and cons of these loss functions. First, the basic theory of GANs along with the training mechanism is introduced. Then, the most commonly used loss functions in GANs are introduced and analyzed. Third, the experimental analyses and comparison of these loss functions are presented in different GAN architectures. Finally, several suggestions on choosing suitable loss functions for image synthesis tasks are given.
Funding: Supported by the Shenzhen Science and Technology Innovation Committee under Grants No. JCYJ20170306170559215 and No. JCYJ20180302153918689.
Abstract: In this paper, we propose a hybrid model that maps the input noise vector to the label of the image generated by the generative adversarial network (GAN). This model mainly consists of a pre-trained deep convolution generative adversarial network (DCGAN) and a classifier. Using the model, we visualize the distribution of the two-dimensional input noise that leads to a specific type of generated image after each training epoch of the GAN. The visualization reveals the distribution feature of the input noise vector and the performance of the generator. With this feature, we try to build a guided generator (GG) with the ability to produce the fake image we need. Two methods are proposed to build the GG. One is the most significant noise (MSN) method, and the other uses labeled noise. The MSN method can generate images precisely but with fewer variations. In contrast, the labeled noise method has more variations but is slightly less stable. Finally, we propose a criterion to measure the performance of the generator, which can be used as a loss function to effectively train the network.
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 61972456, 61402329) and the Natural Science Foundation of Tianjin (Nos. 19JCYBJC15400, 21YDTPJC00440).
Abstract: The Quick Access Recorder (QAR), an important device for storing data from various flight parameters, contains a large amount of valuable data and comprehensively records the real state of the airline flight. However, the recorded data have certain missing values due to factors such as weather and equipment anomalies. These missing values seriously affect the analysis of QAR data by aeronautical engineers, such as airline flight scenario reproduction and airline flight safety status assessment. Therefore, imputing missing values in the QAR data, which can further guarantee the flight safety of airlines, is crucial. QAR data also have multivariate, multiprocess, and temporal features. Therefore, we propose the imputation models A-AEGAN ("A" denotes attention mechanism, "AE" denotes autoencoder, and "GAN" denotes generative adversarial network) and SA-AEGAN ("SA" denotes self-attentive mechanism) for missing values of QAR data, which can be effectively applied to QAR data. Specifically, we apply an innovative generative adversarial network to impute missing values from QAR data. The improved gated recurrent unit is then introduced as the neural unit of the GAN, which can successfully capture the temporal relationships in QAR data. In addition, we modify the basic structure of the GAN by using an autoencoder as the generator and a recurrent neural network as the discriminator. The missing values in the QAR data are imputed by using the adversarial relationship between generator and discriminator. We introduce an attention mechanism in the autoencoder to further improve the capability of the proposed model to capture the features of QAR data. Attention mechanisms can maintain the correlation among QAR data and improve the capability of the model to impute missing data. Furthermore, we improve the proposed model by integrating a self-attention mechanism to further capture the relationship between different parameters within the QAR data. Experimental results on real datasets demonstrate that the model can reasonably impute the missing values in QAR data with excellent results.
Funding: Supported by the Metaverse Lab Program funded by the Ministry of Science and ICT (MSIT) and the Korea Radio Promotion Association (RAPA).
Abstract: The objective of style transfer is to maintain the content of an image while transferring the style of another image. However, conventional methods face challenges in preserving facial features, especially in Korean portraits where elements like the “Gat” (a traditional Korean hat) are prevalent. This paper proposes a deep learning network designed to perform style transfer that includes the “Gat” while preserving the identity of the face. Unlike traditional style transfer techniques, the proposed method aims to preserve the texture, attire, and the “Gat” in the style image by employing image sharpening and facial landmarks together with the GAN. The color, texture, and intensity were extracted differently based on the characteristics of each block and layer of the pre-trained VGG-16, and only the necessary elements during training were preserved using a facial landmark mask. The head area was located using the eyebrow area to transfer the “Gat”. Furthermore, the identity of the face was retained, and style correlation was considered based on the Gram matrix. To evaluate performance, we introduced a metric using PSNR and SSIM, with an emphasis on median values through new weightings for style transfer in Korean portraits. Additionally, we conducted a survey that evaluated the content, style, and naturalness of the transferred results, and based on the assessment, we conclude that our method surpasses previous research in maintaining the integrity of content. Our approach, enriched by landmark preservation and diverse loss functions, including those related to the “Gat”, outperformed previous research in facial identity preservation.
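For readers unfamiliar with PSNR, one of the two measures the evaluation metric builds on, the following sketch computes the standard definition, PSNR = 10 log10(MAX^2 / MSE), on flattened pixel lists. It does not reproduce the paper's weighted, median-emphasizing metric, whose details are not given in the abstract; the function name `psnr` and the default `max_value=255.0` (8-bit images) are assumptions.

```python
import math


def psnr(reference, result, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if not reference or len(reference) != len(result):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, result)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the transferred image stays closer to the reference pixel by pixel, which is why PSNR is usually paired with a structural measure such as SSIM rather than used alone.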
Abstract: Super-resolution reconstruction in medical imaging has become more demanding due to the necessity of obtaining high-quality images with minimal radiation dose, such as in low-field magnetic resonance imaging (MRI). However, image super-resolution reconstruction remains a difficult task because of the complexity and high texture requirements for diagnostic purposes. In this paper, we offer a deep learning based strategy for reconstructing medical images from low resolutions using Transformer and generative adversarial networks (T-GANs). After the Transformer is inserted into the generative adversarial network for image reconstruction, the integrated system can extract more precise texture information and focus more on important locations through global image matching. Furthermore, we used a weighted combination of content loss, adversarial loss, and adversarial feature loss as the final multi-task loss function during the training of our proposed T-GAN model. Measured by established metrics such as peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM), our suggested T-GAN achieves optimal performance and recovers more texture features in super-resolution reconstruction of MRI scanned images of the knees and belly.
Funding: The National Natural Science Foundation of China (NSFC) (52178324).
Abstract: Concrete subjected to fire loads is susceptible to explosive spalling, which can lead to the exposure of reinforcing steel bars to the fire, substantially jeopardizing the structural safety and stability. The spalling of fire-loaded concrete is closely related to the evolution of pore pressure and temperature. Conventional analytical methods involve the resolution of complex, strongly coupled multifield equations, necessitating significant computational efforts. To rapidly and accurately obtain the distributions of pore pressure and temperature, the Pix2Pix model, which is celebrated for its capabilities in image generation, is adopted in this work. The open-source dataset used herein features RGB images we generated using a sophisticated coupled model, while the grayscale images encapsulate the 15 principal variables influencing spalling. After a series of tests with different layer configurations, activation functions, and loss functions, a Pix2Pix model suitable for assessing the spalling risk of fire-loaded concrete was designed and trained. The applicability and reliability of the Pix2Pix model in concrete parameter prediction are verified by comparing its outcomes with those derived from the strong coupling THC model. Notably, for practical engineering applications, our findings indicate that using monochrome images as the initial target for analysis yields more dependable results. This work not only offers valuable insights for civil engineers specializing in concrete structures but also establishes a robust methodological approach for researchers seeking to create similar predictive models.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51674169), the Department of Education of Hebei Province of China (Grant No. ZD2019140), the Natural Science Foundation of Hebei Province of China (Grant No. F2019210243), the S&T Program of Hebei (Grant No. 22375413D), and the School of Electrical and Electronics Engineering.
Abstract: Accurate displacement prediction is critical for the early warning of landslides. The complexity of the coupling relationship between multiple influencing factors and displacement makes the accurate prediction of displacement difficult. Moreover, in engineering practice, insufficient monitoring data limit the performance of prediction models. To alleviate this problem, a displacement prediction method based on multisource domain transfer learning, which helps accurately predict data in the target domain through the knowledge of one or more source domains, is proposed. First, an optimized variational mode decomposition model based on the minimum sample entropy is used to decompose the cumulative displacement into the trend, periodic, and stochastic components. The trend component is predicted by an autoregressive model, and the periodic component is predicted by the long short-term memory. For the stochastic component, because it is affected by uncertainties, it is predicted by a combination of a Wasserstein generative adversarial network and multisource domain transfer learning for improved prediction accuracy. Considering a real mine slope as a case study, the proposed prediction method was validated. Therefore, this study provides new insights that can be applied to scenarios lacking sample data.
Funding: The National Natural Science Foundation of China (No. 61976080), the Academic Degrees & Graduate Education Reform Project of Henan Province (No. 2021SJGLX195Y), the Teaching Reform Research and Practice Project of Henan Undergraduate Universities (No. 2022SYJXLX008), and the Key Project on Research and Practice of Henan University Graduate Education and Teaching Reform (No. YJSJG2023XJ006).
Abstract: Unsupervised multi-modal image translation is an emerging domain of computer vision whose goal is to transform an image from the source domain into many diverse styles in the target domain. However, the advanced approaches available employ a multi-generator mechanism to model different domain mappings, which results in inefficient training of neural networks and mode collapse, limiting the diversity of the generated images. To address this issue, this paper introduces a multi-modal unsupervised image translation framework that uses a single generator to perform multi-modal image translation. First, a domain code is introduced to explicitly control the different generation tasks. Second, this paper brings in the squeeze-and-excitation (SE) mechanism and a feature attention (FA) module. Finally, the model integrates multiple optimization objectives to ensure efficient multi-modal translation. This paper performs qualitative and quantitative experiments on multiple unpaired benchmark image translation datasets and demonstrates the benefits of the proposed method over existing technologies. Overall, experimental results show that the proposed method is versatile and scalable.
Abstract: The recent interest in the deployment of Generative AI applications that use large language models (LLMs) has brought to the forefront significant privacy concerns, notably the leakage of Personally Identifiable Information (PII) and other confidential or protected information that may have been memorized during training, specifically during a fine-tuning or customization process. This inadvertent leakage of sensitive information typically occurs when the models are subjected to black-box attacks. To address the growing concerns of safeguarding private and sensitive information while simultaneously preserving its utility, we analyze the performance of Targeted Catastrophic Forgetting (TCF). TCF involves preserving targeted pieces of sensitive information within datasets through an iterative pipeline which significantly reduces the likelihood of such information being leaked or reproduced by the model during black-box attacks, such as the autocompletion attack in our case. The experiments conducted using TCF demonstrate its capability to reduce the extraction of PII while still preserving the context and utility of the target application.
Funding: Supported by the National Natural Science Foundation of China under Grant 61772406 and Grant 61941105, in part by the projects of the Fundamental Research Funds for the Central Universities, and the Innovation Fund of Xidian University under Grant 500120109215456.
Abstract: Nowadays, the fifth-generation (5G) mobile communication system has seen prosperous development and deployment, reshaping our daily lives. However, anomalies such as cell outages and congestion in 5G critically influence the quality of experience and significantly increase operational expenditures. Although several big data and artificial intelligence-based anomaly detection methods have been proposed for wireless cellular systems, they change the distributions of the data and ignore the relevance among user activities, making anomaly detection ineffective for some cells. In this paper, we propose a highly effective and accurate anomaly detection framework that uses generative adversarial networks (GANs) and long short-term memory (LSTM) neural networks. The framework expands the original dataset while keeping the distribution of data unchanged, and explores the relevance among user activities to further improve the system performance. The results demonstrate that our framework can achieve 97.16% accuracy and a 2.30% false positive rate by using the correlation of user activities and data expansion.
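The two reported figures, accuracy and false positive rate, follow from confusion-matrix counts. The sketch below uses the standard definitions and assumes binary labels where 1 marks an anomalous cell and 0 a normal one; the function name and label encoding are illustrative and not taken from the paper.

```python
def accuracy_and_fpr(y_true, y_pred):
    """Return (accuracy, false positive rate) for binary anomaly labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # anomaly correctly flagged
        elif truth == 0 and pred == 1:
            fp += 1  # normal cell wrongly flagged
        elif truth == 0 and pred == 0:
            tn += 1  # normal cell correctly passed
        else:
            fn += 1  # anomaly missed
    accuracy = (tp + tn) / len(y_true)
    # FPR is the share of normal cells that were wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr
```

Reporting both numbers matters for anomaly detection because anomalies are rare: a detector can reach high accuracy by rarely raising alarms, while the false positive rate shows how often normal cells trigger them.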
Funding: Supported by the National Natural Science Foundation of China under Grants No. 62027803, No. 61701095, No. 61601096, No. 61801089, and No. 61971111; the Science and Technology Program of Sichuan under Grants No. 2020YFG0044, No. 2020YFG0046, and No. 2021YFG0200; the Science and Technology Program under Grant No. 2021-JCJQ-JJ-0949; and the Defense Industrial Technology Development Program under Grant No. JCKY2020110C041.
Abstract: The marine biological sonar system, which evolved through the struggle of nature, is far superior to current artificial sonar. Therefore, the development of bionic underwater concealed detection is of great strategic significance to the military and the economy. In this paper, a generative adversarial network (GAN) is trained on a dolphin vocal sound dataset that we constructed, achieving unsupervised generation of dolphin vocal sounds with global consistency. Analysis of the generated and real audio samples in the time domain and the frequency domain shows that the generated samples are close to the real ones, which meets the requirements of bionic underwater concealed detection.