Journal Articles
24 articles found
1. Generating Synthetic Data to Reduce Prediction Error of Energy Consumption
Authors: Debapriya Hazra, Wafa Shafqat, Yung-Cheol Byun. 《Computers, Materials & Continua》 (SCIE, EI), 2022, Issue 2, pp. 3151-3167 (17 pages)
Renewable and nonrenewable energy sources are widely incorporated, with solar and wind energy producing electricity without increasing carbon dioxide emissions. Energy industries worldwide are trying hard to predict future energy consumption, which could eliminate over- or under-contracting of energy resources and unnecessary financing. Machine learning techniques for predicting energy are the trending solution to overcome the challenges faced by energy companies. Machine learning algorithms require a considerable amount of training data for accurate prediction; another critical factor is balancing the data for enhanced prediction. Data augmentation is a technique used to increase the data available for training, and synthetic data generation creates new data that can be used in training to improve the accuracy of prediction models. In this paper, we propose a model that takes time series energy consumption data as input, pre-processes the data, and then uses multiple augmentation techniques and generative adversarial networks to generate synthetic data which, when combined with the original data, reduces energy consumption prediction error. We propose TGAN-skip-Improved-WGAN-GP to generate synthetic energy consumption time series tabular data. We modify TGAN with skip connections, then improve WGAN-GP by defining a consistency term, and finally use the architecture of improved WGAN-GP for training TGAN-skip. We used various evaluation metrics and visual representations to compare the performance of our proposed model. We also measured prediction accuracy along with the mean and maximum error generated while predicting with different variations of augmented and synthetic data combined with the original data. The mode collapse problem could be handled by the TGAN-skip-Improved-WGAN-GP model, and it also converged faster than existing GAN models for synthetic data generation. The experimental results show that our proposed technique of combining synthetic data with original data could significantly reduce the prediction error rate and increase the prediction accuracy of energy consumption.
Keywords: energy consumption; generative adversarial networks; synthetic data; time series data; TGAN; WGAN-GP; TGAN-skip; prediction error; augmentation
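The classical time-series augmentation step that precedes the GAN in such a pipeline can be illustrated with two standard transforms, jittering and scaling. The sketch below is a minimal pure-Python illustration under invented parameters (the function names, noise levels, and toy consumption series are not from the paper, and the TGAN-skip-Improved-WGAN-GP generator itself is not reproduced):

```python
import random

def jitter(series, sigma=0.05, rng=None):
    """Add small Gaussian noise to every point of a time series."""
    rng = rng or random.Random(0)
    return [x + rng.gauss(0.0, sigma) for x in series]

def scale(series, sigma=0.1, rng=None):
    """Multiply the whole series by one random factor near 1.0."""
    rng = rng or random.Random(0)
    factor = 1.0 + rng.gauss(0.0, sigma)
    return [x * factor for x in series]

def augment(dataset, n_copies=2, rng=None):
    """Return the original series plus jittered and scaled copies."""
    rng = rng or random.Random(42)
    out = list(dataset)
    for series in dataset:
        for _ in range(n_copies):
            out.append(jitter(series, rng=rng))
            out.append(scale(series, rng=rng))
    return out

hourly_load = [3.2, 3.0, 2.8, 3.5, 5.1, 6.4]   # toy consumption series (kWh)
augmented = augment([hourly_load])
```

Each call enlarges the training set with noisy and rescaled copies of every input series, without requiring new measurements.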
2. A Survey of Synthetic Data Augmentation Methods in Machine Vision
Authors: Alhassan Mumuni, Fuseini Mumuni, Nana Kobina Gerrar. 《Machine Intelligence Research》 (EI, CSCD), 2024, Issue 5, pp. 831-869 (39 pages)
The standard approach to tackling computer vision problems is to train deep convolutional neural network (CNN) models using large-scale image datasets that are representative of the target task. However, in many scenarios, it is challenging to obtain sufficient image data for the target task. Data augmentation is a way to mitigate this challenge. A common practice is to explicitly transform existing images in desired ways to create the required volume and variability of training data necessary to achieve good generalization performance. In situations where data for the target domain are not accessible, a viable workaround is to synthesize training data from scratch, i.e., synthetic data augmentation. This paper presents an extensive review of synthetic data augmentation techniques. It covers data synthesis approaches based on realistic 3D graphics modelling, neural style transfer (NST), differential neural rendering, and generative modelling using generative adversarial networks (GANs) and variational autoencoders (VAEs). For each of these classes of methods, we focus on the important data generation and augmentation techniques, the general scope of application and specific use-cases, as well as existing limitations and possible workarounds. Additionally, we provide a summary of common synthetic datasets for training computer vision models, highlighting the main features, application domains and supported tasks. Finally, we discuss the effectiveness of synthetic data augmentation methods. Since this is the first paper to explore synthetic data augmentation methods in great detail, we hope to equip readers with the necessary background information and in-depth knowledge of existing methods and their attendant issues.
Keywords: data augmentation; generative modelling; neural rendering; data synthesis; synthetic data; neural style transfer (NST)
3. Transfer learning from synthetic data for open-circuit voltage curve reconstruction and state of health estimation of lithium-ion batteries from partial charging segments
Authors: Tobias Hofmann, Jacob Hamar, Bastian Mager, Simon Erhard, Jan Philipp Schmidt. 《Energy and AI》 (EI), 2024, Issue 3, pp. 80-97 (18 pages)
Data-driven models for battery state estimation require extensive experimental training data, which may not be available or suitable for specific tasks like open-circuit voltage (OCV) reconstruction and subsequent state of health (SOH) estimation. This study addresses this issue by developing a transfer-learning-based OCV reconstruction model using a temporal convolutional long short-term memory (TCN-LSTM) network trained on synthetic data from an automotive nickel cobalt aluminium oxide (NCA) cell generated through a mechanistic model approach. The data consist of voltage curves at constant temperature, C-rates between C/30 and 1C, and an SOH range from 70% to 100%. The model is refined via Bayesian optimization and then applied to four use cases with reduced experimental nickel manganese cobalt oxide (NMC) cell training data for the higher use cases. The TL models' performance is compared with models trained solely on experimental data, focusing on different C-rates and voltage windows. The results demonstrate that the OCV reconstruction mean absolute error (MAE) within the average battery electric vehicle (BEV) home charging window (30% to 85% state of charge (SOC)) is less than 22 mV for the first three use cases across all C-rates. The SOH estimated from the reconstructed OCV exhibits a mean absolute percentage error (MAPE) below 2.2% for these cases. The study further investigates the impact of the source domain on TL by incorporating two additional synthetic datasets, a lithium iron phosphate (LFP) cell and an entirely artificial, non-existing cell, showing that the shifting and scaling of gradient changes in the charging curve alone suffice to transfer knowledge, even between different cell chemistries. A key limitation with respect to extrapolation capability is identified and evidenced in our fourth use case, where the absence of such comprehensive data hindered the TL process.
Keywords: lithium-ion battery; state of health estimation; transfer learning; OCV curve; partial charging; synthetic data
4. A Study of EM Algorithm as an Imputation Method: A Model-Based Simulation Study with Application to a Synthetic Compositional Data
Authors: Yisa Adeniyi Abolade, Yichuan Zhao. 《Open Journal of Modelling and Simulation》, 2024, Issue 2, pp. 33-42 (10 pages)
Compositional data, such as relative information, is a crucial aspect of machine learning and other related fields. It is typically recorded as closed data, i.e., data summing to a constant such as 100%. The statistical linear model is the most commonly used technique for identifying hidden relationships between underlying random variables of interest, and when estimating linear regression parameters, which are useful for future prediction and partial-effects analysis of independent variables, maximum likelihood estimation (MLE) is the method of choice. However, data quality is a significant challenge in machine learning, and many datasets contain missing observations, which can lead to costly and time-consuming data recovery. To address this issue, the expectation-maximization (EM) algorithm has been suggested for situations involving missing data. The EM algorithm iteratively finds maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models that depend on unobserved variables or data. Using the current estimate as input, the expectation (E) step constructs the expected log-likelihood function; the maximization (M) step then finds the parameters that maximize it. This study examined how well the EM algorithm performed on a synthetic compositional dataset with missing observations, using both ordinary least squares and robust least squares regression techniques. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-nearest neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
Keywords: compositional data; linear regression model; least square method; robust least square method; synthetic data; Aitchison distance; maximum likelihood estimation; expectation-maximization algorithm; k-nearest neighbor; mean imputation
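The E/M alternation described in this abstract can be sketched for the simplest missing-data case: one fully observed predictor and a partially observed response under a linear working model. This simplified sketch imputes each missing value with its conditional mean and refits the regression (a full EM would also carry the conditional variance into the M-step); the Aitchison distance used for evaluation is included as well. All names and data are illustrative, not the paper's setup:

```python
import math

def em_impute(x, y, n_iter=50):
    """EM-style imputation for a partially observed response y given x:
    the E-step fills each missing y with its conditional mean under the
    current linear model; the M-step refits the model on completed data."""
    obs = [i for i, v in enumerate(y) if v is not None]
    mean_obs = sum(y[i] for i in obs) / len(obs)
    filled = [v if v is not None else mean_obs for v in y]
    for _ in range(n_iter):
        n = len(x)
        mx, my = sum(x) / n, sum(filled) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, filled))
        beta = sxy / sxx
        alpha = my - beta * mx
        # E-step: replace missing entries by their conditional expectation.
        filled = [y[i] if y[i] is not None else alpha + beta * x[i]
                  for i in range(n)]
    return filled

def aitchison_distance(p, q):
    """Aitchison distance between two compositions via the clr transform."""
    def clr(c):
        g = sum(math.log(v) for v in c) / len(c)
        return [math.log(v) - g for v in c]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(p), clr(q))))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 4.0, None, 8.1, 9.9]      # one missing observation
completed = em_impute(x, y)
```

Since the toy data lie close to y = 2x, the imputed middle value settles near 6.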
5. Synthetic data as an investigative tool in hypertension and renal diseases research
Authors: Aleena Jamal, Som Singh, Fawad Qureshi. 《World Journal of Methodology》, 2025, Issue 1, pp. 9-13 (5 pages)
There is a growing body of clinical research on the utility of synthetic data derivatives, an emerging research tool in medicine. In nephrology, clinicians can use machine learning and artificial intelligence as powerful aids in their clinical decision-making while also preserving patient privacy. This is especially important given the epidemiology of chronic kidney disease, renal oncology, and hypertension worldwide. However, there remains a need for a framework to guide how synthetic data can be better utilized as a practical application in this research.
Keywords: synthetic data; artificial intelligence; nephrology; blood pressure; research; editorial
6. A Study of Using Synthetic Data for Effective Association Knowledge Learning
Authors: Yuchi Liu, Zhongdao Wang, Xiangxin Zhou, Liang Zheng. 《Machine Intelligence Research》 (EI, CSCD), 2023, Issue 2, pp. 194-206 (13 pages)
Association, aiming to link bounding boxes of the same identity in a video sequence, is a central component in multi-object tracking (MOT). To train association modules, e.g., parametric networks, real video data are usually used. However, annotating person tracks in consecutive video frames is expensive, and such real data, due to their inflexibility, offer limited opportunities to evaluate system performance w.r.t. changing tracking scenarios. In this paper, we study whether 3D synthetic data can replace real-world videos for association training. Specifically, we introduce a large-scale synthetic data engine named MOTX, where the motion characteristics of cameras and objects are manually configured to be similar to those of real-world datasets. We show that, compared with real data, association knowledge obtained from synthetic data can achieve very similar performance on real-world test sets without domain adaptation techniques. Our intriguing observation is credited to two factors. First and foremost, 3D engines can well simulate motion factors such as camera movement, camera view, and object movement, so that the simulated videos can provide association modules with effective motion features. Second, the experimental results show that the appearance domain gap hardly harms the learning of association knowledge. In addition, the strong customization ability of MOTX allows us to quantitatively assess the impact of motion factors on MOT, which brings new insights to the community.
Keywords: multi-object tracking (MOT); data association; synthetic data; motion simulation; association knowledge learning
7. “stppSim”: A Novel Analytical Tool for Creating Synthetic Spatio-Temporal Point Data
Author: Monsuru Adepeju. 《Open Journal of Modelling and Simulation》, 2023, Issue 4, pp. 99-116 (18 pages)
In crime science, understanding the dynamics and interactions between crime events is crucial for comprehending the underlying factors that drive their occurrence. Nonetheless, gaining access to detailed spatiotemporal crime records from law enforcement faces significant challenges due to confidentiality concerns. In response to these challenges, this paper introduces an innovative analytical tool named “stppSim,” designed to synthesize fine-grained spatiotemporal point records while safeguarding the privacy of individual locations. By utilizing the open-source R platform, the tool ensures easy accessibility for researchers, facilitating download, re-use, and potential advancement in various research domains beyond crime science.
Keywords: open-source; synthetic data; crime; spatio-temporal patterns; data privacy
8. Synthetic Data Generation and Shuffled Multi-Round Training Based Offline Handwritten Mathematical Expression Recognition
Authors: Lan-Fang Dong, Han-Chao Liu, Xin-Ming Zhang. 《Journal of Computer Science & Technology》 (SCIE, EI, CSCD), 2022, Issue 6, pp. 1427-1443 (17 pages)
Offline handwritten mathematical expression recognition is a challenging optical character recognition (OCR) task due to the various ambiguities of handwritten symbols and complicated two-dimensional structures. Recent work in this area usually constructs ever deeper neural networks trained end-to-end to improve performance. However, the higher the complexity of the network, the more computing resources and time are required. To improve performance without additional computing requirements, we concentrate on the training data and the training strategy in this paper. We propose a data augmentation method that can generate synthetic samples with new LaTeX notations using only the official CROHME training data. Moreover, we propose a novel training strategy called Shuffled Multi-Round Training (SMRT) to regularize the model. With the generated data and the shuffled multi-round training strategy, we achieve state-of-the-art expression accuracy, 59.74% and 61.57% on CROHME 2014 and 2016, respectively, using attention-based encoder-decoder models for offline handwritten mathematical expression recognition.
Keywords: handwritten mathematical expression recognition; offline; synthetic data generation; training strategy
9. Sarve: synthetic data and local differential privacy for private frequency estimation
Authors: Gatha Varmal, Ritu Chauhan, Dhananjay Singh. 《Cybersecurity》 (EI, CSCD), 2022, Issue 4, pp. 97-116 (20 pages)
The collection of user attributes by service providers is a double-edged sword. The attributes are instrumental in driving statistical analysis to train more accurate predictive models such as recommenders, and the analysis of the collected user data includes frequency estimation for categorical attributes. Nonetheless, users deserve privacy guarantees against inadvertent identity disclosure. Algorithms called frequency oracles were therefore developed to randomize or perturb user attributes and estimate the frequencies of their values. We propose Sarve, a frequency oracle that uses Randomized Aggregatable Privacy-Preserving Ordinal Response (RAPPOR) and Hadamard Response (HR) for randomization in combination with fake data. The design of a service-oriented architecture must consider two types of complexity, computational and communication; the functions of such systems aim to minimize both, and the choice of privacy-enhancing methods must therefore be a calculated decision. The variant of RAPPOR we used was realized through Bloom filters, memory-efficient data structures that offer O(1) time complexity. HR, on the other hand, has been proven to give the best communication costs, of the order of log(b) for b-bit communication. Sarve is thus a step towards frequency oracles that show how the privacy provisions of existing methods can be combined with those of fake data to achieve statistical results comparable to the original data. Sarve also implements an adaptive solution enhanced from the work of Arcolezi et al. The use of RAPPOR was found to provide better privacy-utility trade-offs for specific privacy budgets in both the high and general privacy regimes.
Keywords: synthetic data; differential privacy; frequency estimation; frequency oracle; privacy
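The Bloom filter underlying Sarve's RAPPOR variant can be sketched in a few lines: k hash functions set bits in an m-bit array, so insertion and membership testing each touch a fixed number of bits, i.e., O(1) time. A minimal illustration (the parameters m and k and the example items are invented; this is not Sarve's implementation):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash functions over an m-bit array.
    Membership tests touch a fixed number of bits, hence O(1) time.
    No false negatives; false positives occur with small probability."""

    def __init__(self, m=256, k=4):
        self.m, self.k = m, k
        self.bits = [0] * m

    def _positions(self, item):
        # Derive k positions from salted SHA-256 digests of the item.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("color=blue")
```

The false-positive rate after n insertions is roughly (1 - e^(-kn/m))^k, which is what a RAPPOR-style encoder then perturbs for privacy.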
10. Resolving co- and early post-seismic slip variations of the 2021 MW 7.4 Madoi earthquake in east Bayan Har block with a block-wide distributed deformation mode from satellite synthetic aperture radar data (Cited: 14)
Authors: Shuai Wang, Chuang Song, ShanShan Li, Xing Li. 《Earth and Planetary Physics》 (CSCD), 2022, Issue 1, pp. 108-122 (15 pages)
On 21 May 2021 (UTC), an MW 7.4 earthquake jolted the east Bayan Har block in the Tibetan Plateau. The earthquake received widespread attention as the largest event in the Tibetan Plateau and its surroundings since the 2008 Wenchuan earthquake, and especially for its proximity to the seismic gaps on the east Kunlun fault. Here we use satellite interferometric synthetic aperture radar data and subpixel offset observations along the range directions to characterize the coseismic deformation of the earthquake. Range offset displacements depict clear surface ruptures with a total length of ~170 km involving two possible activated fault segments. Coseismic modeling results indicate that the earthquake was dominated by left-lateral strike-slip motion of up to 7 m within the top 12 km of the crust. The well-resolved slip variations are characterized by five major slip patches along strike and a 64% shallow slip deficit, suggesting a young seismogenic structure. Spatial-temporal changes of the postseismic deformation are mapped from early 6-day and 24-day InSAR observations and are well explained by time-dependent afterslip models. Analysis of Global Navigation Satellite System (GNSS) velocity profiles and strain rates suggests that the eastward extrusion of the plateau is diffusely distributed across the east Bayan Har block but exhibits significant lateral heterogeneities, as evidenced by magnetotelluric observations. The block-wide distributed deformation of the east Bayan Har block, along with the significant co- and post-seismic stress loadings from the Madoi earthquake, implies high seismic risks along regional faults, especially the Tuosuo Lake and Maqên-Maqu segments of the Kunlun fault that are known as seismic gaps.
Keywords: Madoi earthquake; Bayan Har block; synthetic aperture radar data; co- and post-seismic slip; block-wide distributed deformation; seismic risk
11. An envelope-based machine learning workflow for locating earthquakes in the southern Sichuan Basin
Authors: Kang Wang, Jie Zhang, Ji Zhang, Zhangyu Wang, Ziyu Li. 《Earthquake Research Advances》 (CSCD), 2024, Issue 2, pp. 45-54 (10 pages)
The development of machine learning technology enables more robust real-time earthquake monitoring through automated implementations. However, the application of machine learning to earthquake location problems faces challenges in regions with limited available training data. To address the issues of sparse event distribution and inaccurate ground truth in historical seismic datasets, we expand the training dataset by using a large number of synthetic envelopes that closely resemble real data and build an earthquake location model named ENVloc. We propose an envelope-based machine learning workflow for simultaneously determining earthquake location and origin time. The method eliminates the need for phase picking and avoids the accumulation of location errors resulting from inaccurate picking results. In practical application, ENVloc is applied to several data windows intercepted at different starting points. We take the starting point of the time window corresponding to the highest prediction probability value as the origin time and save the predicted result as the earthquake location. We apply ENVloc to observed data acquired in the southern Sichuan Basin, China, between September 2018 and March 2019. The results show that the average differences from the catalog in latitude, longitude, depth, and origin time are 0.02°, 0.02°, 2 km, and 1.25 s, respectively. These suggest that our envelope-based method provides an efficient and robust way to locate earthquakes without phase picking and can be used in near-real-time earthquake monitoring.
Keywords: waveform envelope; earthquake location; local seismicity; synthetic data; sparse stations
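A waveform envelope of the kind ENVloc consumes can be approximated with a simple moving-RMS; practical workflows typically use the Hilbert-transform envelope instead. A toy sketch (the window length and the synthetic trace are invented for illustration):

```python
import math

def envelope(signal, window=5):
    """Moving-RMS envelope: a smooth amplitude outline of a waveform
    (a crude stand-in for the Hilbert-transform envelope used in practice)."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        seg = signal[max(0, i - half): i + half + 1]
        out.append(math.sqrt(sum(s * s for s in seg) / len(seg)))
    return out

# Toy 'seismogram': 20 quiet samples, a 20-sample oscillation burst, 20 quiet.
trace = ([0.0] * 20
         + [math.sin(2 * math.pi * k / 4) for k in range(20)]
         + [0.0] * 20)
env = envelope(trace)
peak = max(range(len(env)), key=env.__getitem__)   # index of envelope maximum
```

The envelope stays near zero in the quiet segments and rises over the burst, which is the smooth feature a location network can learn from without phase picks.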
12. Storm surge model based on variational data assimilation method (Cited: 1)
Authors: Shi-li Huang, Jian Xu, De-guan Wang, Dong-yan Lu. 《Water Science and Engineering》 (EI, CAS), 2010, Issue 2, pp. 166-173 (8 pages)
By combining computation and observation information, the variational data assimilation method has the ability to eliminate errors caused by the uncertainty of parameters in practical forecasting. It was applied to a storm surge model based on unstructured grids with high spatial resolution, meant to improve the forecasting accuracy of the storm surge. By controlling the wind stress drag coefficient, the variation-based model was developed and validated through data assimilation tests in an actual storm surge induced by a typhoon. In the data assimilation tests, the model accurately identified the wind stress drag coefficient and obtained results close to the true state. The actual storm surge induced by Typhoon 0515 was then forecast by the developed model, and the results demonstrate its efficiency in practical application.
Keywords: storm surge; variational data assimilation; synthetic data; unstructured grid
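The core idea, tuning the wind stress drag coefficient so that model output matches observations, can be shown with a toy twin experiment: synthetic observations are generated with a known coefficient, then a scan over candidates minimizes the squared misfit (the cost function J). The surge model, its scale constant, and all numbers below are invented; a real variational scheme minimizes J with adjoint-based gradients rather than a scan:

```python
def surge_model(cd, wind):
    """Toy storm-surge model: surge height proportional to Cd * wind^2."""
    scale = 1e-3                      # invented proportionality constant
    return [cd * scale * w * w for w in wind]

def assimilate(wind, observed, candidates):
    """Variational idea in miniature: choose the drag coefficient that
    minimizes the squared model-observation misfit (the cost function J)."""
    def cost(cd):
        return sum((s - o) ** 2
                   for s, o in zip(surge_model(cd, wind), observed))
    return min(candidates, key=cost)

wind = [10.0, 15.0, 22.0, 30.0, 25.0, 18.0]     # wind speeds (m/s)
true_cd = 2.6e-3
observed = surge_model(true_cd, wind)           # synthetic 'observations'
candidates = [i * 1e-4 for i in range(10, 41)]  # scan 1.0e-3 .. 4.0e-3
best = assimilate(wind, observed, candidates)
```

Because the observations were generated with a known coefficient, the scan recovers it exactly, which mirrors the "results close to the true state" twin tests in the abstract.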
13. Shadow Detection and Removal From Photo-Realistic Synthetic Urban Image Using Deep Learning (Cited: 1)
Authors: Hee-Jin Yoon, Kang-Jik Kim, Jun-Chul Chun. 《Computers, Materials & Continua》 (SCIE, EI), 2020, Issue 1, pp. 459-472 (14 pages)
Recently, virtual reality technology that can interact with various data has been used for urban design and analysis. Reality, one of the most important elements in virtual reality technology, means visual expression that lets a person experience three-dimensional space as in the real world. To obtain this realism, real-world data are used in various fields; for example, real aerial images are utilized in 3D modelling to increase the realism of 3D-modeled building textures. However, aerial images captured during the day can be shadowed by the sun, causing distortion or deterioration of the image. To resolve this problem, research on detecting and removing shadows has been conducted, but it is still considered a challenging problem. In this paper, we propose a novel method for detecting and removing shadows using deep learning. For this work, we first build a new dataset of photo-realistic synthetic urban data based on a virtual environment using 3D spatial information provided by VWORLD. To detect and remove shadows from the dataset, a 1-channel shadow mask image is first inferred from the 3-channel shadow image through a CNN. Then, to generate a shadow-free image, the 3-channel shadow image and the detected 1-channel shadow mask are fed into a GAN. The experiments show that the proposed method outperforms existing methods in detecting and removing shadows.
Keywords: deep learning; shadow detection; shadow removal; synthetic data
14. Generation of meaningful synthetic sensor data—Evaluated with a reliable transferability methodology
Authors: Michael Meiser, Benjamin Duppe, Ingo Zinnikus. 《Energy and AI》 (EI), 2024, Issue 1, pp. 248-264 (17 pages)
As households are equipped with smart meters, supervised machine learning (ML) models, and especially Non-Intrusive Load Monitoring (NILM) disaggregation algorithms, are becoming increasingly important. To be robust, these models require a large amount of data, which is difficult to collect. Consequently, the generation of meaningful synthetic data is becoming more relevant. We use a simulation framework to generate multiple datasets using different techniques and evaluate their quality statistically by measuring the performance of NILM models for transferability. We demonstrate that the method of data generation is crucial to train ML models in a meaningful way. The experiments conducted reveal that adding noise to the synthetic smart meter data is essential to train robust NILM models for transferability. The best results are obtained when this noise is derived from unknown appliances for which no ground truth data are available. Since we observed that NILM models can provide unstable results, we develop a reliable evaluation methodology based on Cochran's sample size. Finally, we compare the quality of the generated synthetic data with real data and observe that multiple NILM models trained on synthetic data perform significantly better than those trained on real data.
Keywords: smart home; synthetic sensor data; energy data; transfer learning; evaluation methodology; machine learning; neural networks; NILM; Seq2point; WindowGRU; DAE; Seq2seq; RNN
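The paper's central finding, that synthetic aggregate signals need added noise to train robust NILM models, can be illustrated by summing per-appliance power profiles and superposing Gaussian noise standing in for unknown appliances. The profiles, noise level, and function name below are invented for illustration, not taken from the simulation framework:

```python
import random

def synth_aggregate(appliance_profiles, noise_sigma=20.0, seed=0):
    """Sum per-appliance power profiles into a synthetic smart-meter
    signal, then add Gaussian noise standing in for unknown appliances."""
    rng = random.Random(seed)
    length = len(next(iter(appliance_profiles.values())))
    total = [0.0] * length
    for profile in appliance_profiles.values():
        for t, p in enumerate(profile):
            total[t] += p
    # Clamp at zero: metered power cannot be negative.
    return [max(0.0, p + rng.gauss(0.0, noise_sigma)) for p in total]

profiles = {
    "fridge": [80, 80, 80, 80, 80, 80],       # watts per time step
    "kettle": [0, 0, 2000, 2000, 0, 0],
    "tv":     [0, 120, 120, 120, 120, 0],
}
meter = synth_aggregate(profiles)
```

A NILM model trained on such noisy aggregates sees the same ground-truth appliance signals but a more realistic meter signal than a clean sum would give.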
15. An online fast multi-track locating algorithm for high-resolution single-event effect test platform (Cited: 1)
Authors: Yu-Xiao Hu, Hai-Bo Yang, Hong-Lin Zhang, Jian-Wei Liao, Fa-Tai Mai, Cheng-Xin Zhao. 《Nuclear Science and Techniques》 (SCIE, EI, CAS, CSCD), 2023, Issue 5, pp. 86-100 (15 pages)
To improve the efficiency and accuracy of single-event effect (SEE) research at the Heavy Ion Research Facility at Lanzhou, Hi'Beam-SEE must precisely localize the position at which each heavy ion hitting the integrated circuit (IC) causes an SEE. In this study, we propose a fast multi-track location (FML) method based on deep learning to locate the position of each particle track with high speed and accuracy. FML can process the vast amount of data supplied by Hi'Beam-SEE online, revealing sensitive areas in real time. FML is a slot-based, object-centric encoder-decoder structure in which each slot learns the location information of one track in the image. To make the method more accurate on real data, we designed an algorithm to generate a simulated dataset with a distribution similar to that of the real data, which was then used to train the model. Extensive comparison experiments demonstrated that the FML method, which has the best performance on simulated datasets, has high accuracy on real datasets as well. In particular, FML can reach 238 fps with a standard error of 1.6237 μm. This study discusses the design and performance of FML.
Keywords: beam tracks; multi-track location; rapid location; high accuracy; synthetic data; deep neural network; single-event effects; silicon pixel sensors; HIRFL
16. Big Data Interprets US Opioid Crisis
Authors: Zidong Wang, Poning Fan. 《Proceedings of Business and Economic Studies》, 2020, Issue 6, pp. 68-74 (7 pages)
Since 2010, there has been a new round of drug crises in the United States. The abuse of opioids has led to a sharp increase in the number of people involved in drug crimes, and there is an urgent need to explore solutions to the drug crisis. In this paper, a model for in-depth analysis is established on opioid data and influence-factor data from a large sample covering five states[1]. In the first part, we use the Highway Safety Research Institute model, based on a differential equation model, to predict the initial value and find the initial position of drug transfer, and we obtain curves of the numbers of different groups over time by fitting the data, so that the changing trends of the groups can be predicted. It was found that in Kentucky, the counties most likely to start using opioids were Pike and Bale; in Ohio, Jackson and Scioto; in Pennsylvania, Mercer and Lackawanna; in Virginia, Martinsville and Galax; and in West Virginia, Logan and Mingo. In the second part, the gray prediction model is used to further analyze the time series of each factor, the maximum likelihood estimation method is used to obtain the weight of each factor, and the weight coefficient matrix is used to fit a multivariate regression equation; the factors that have the greatest influence on opioid abuse are educational background and family composition. In the third part, a two-group hypothesis test model (with proportional data) is used to verify the differences between the influence factors (including the predicted values) in the first two parts across the states, thus verifying the feasibility of the approach. We also put forward suggestions combining the current situation in the United States with CDC data: we believe that to address the opioid crisis, the U.S. government needs not only to strengthen oversight of doctors' prescriptions, but also to make joint efforts across all sectors of society to fundamentally reduce the barriers to the use of opioids.
Keywords: Highway Safety Research Institute model; synthetic drug data fitting; gray prediction; hypothesis test; anti-drug advice
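The gray prediction step described in the second part can be illustrated with the classic GM(1,1) model, which fits a first-order gray differential equation to a short positive time series. The sketch below is a minimal pure-Python illustration (the function name and the sample growth series are ours, not from the paper):

```python
import math

def gm11_forecast(x0, steps=1):
    """Fit a GM(1,1) gray model to a short positive series and forecast
    the next `steps` values. Minimal illustrative implementation."""
    n = len(x0)
    # 1-AGO: accumulated generating operation
    x1 = [x0[0]]
    for v in x0[1:]:
        x1.append(x1[-1] + v)
    # background values: means of consecutive accumulated terms
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = x0[1:]
    # closed-form least squares for (a, b) in x0[k] + a*z[k] = b
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zv * yv for zv, yv in zip(z, y))
    det = szz * m - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    # time-response function of the accumulated series
    c = x0[0] - b / a
    x1_hat = lambda k: c * math.exp(-a * k) + b / a
    # recover original-series forecasts by first differences
    return [x1_hat(n + i) - x1_hat(n + i - 1) for i in range(steps)]
```

For an approximately exponential input such as `[100, 110, 121, 133.1, 146.41]` (10% growth per step), the one-step forecast comes out close to the true next value of about 161.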
Generation of Synthetic Transcriptome Data with Defined Statistical Properties for the Development and Testing of New Analysis Methods (cited by 1)
17
Authors: Guillaume Brysbaert, Sebastian Noth, Arndt Benecke. 《Genomics, Proteomics & Bioinformatics》 SCIE CAS CSCD, 2007, No. 1, pp. 45-52 (8 pages)
We have previously developed a combined signal/variance distribution model that accounts for the particular statistical properties of datasets generated on the Applied Biosystems AB1700 transcriptome system. Here we show that this model can be used efficiently to generate synthetic datasets with statistical properties virtually identical to those of the actual data, with the aid of the Java application ace.map creator 1.0 that we have developed. The fundamentally different structure of AB1700 transcriptome profiles requires re-evaluation, adaptation, or even redevelopment of many of the standard microarray analysis methods, in order to avoid misinterpretation of the data on the one hand, and to draw full benefit from their increased specificity and sensitivity on the other. Our composite data model and the ace.map creator 1.0 application thereby not only provide proof of the correctness of our parameter estimation, but also offer a tool for the generation of synthetic test data that will be useful for the further development and testing of analysis methods.
Keywords: transcriptome; microarray analysis; signal/variance distribution; distribution modeling; parameter approximation; synthetic data generation
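The core idea of a combined signal/variance model is that measurement noise is not constant but depends on signal intensity. The abstract does not give the model's parameters, so the sketch below is a generic stand-in: a toy generator in which noise shrinks as log-intensity grows, a pattern common in microarray data (all numeric constants are illustrative assumptions):

```python
import random
import statistics

def synthetic_expression(n_genes=1000, seed=42):
    """Generate toy log2-intensity values whose noise level depends on
    the underlying signal (illustrative stand-in for a combined
    signal/variance distribution model; parameters are invented)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_genes):
        log_signal = rng.gauss(8.0, 2.0)   # true log2 intensity
        # heteroscedastic noise: large for dim probes, small for bright ones
        sd = 0.1 + 1.5 / (1.0 + 2 ** (log_signal - 6.0))
        data.append(log_signal + rng.gauss(0.0, sd))
    return data

values = synthetic_expression()
```

A generator like this is useful for exactly the purpose the paper names: feeding analysis pipelines with data whose ground-truth distribution is known.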
Synthetic demand data generation for individual electricity consumers: Generative Adversarial Networks (GANs) (cited by 2)
18
Authors: Bilgi Yilmaz, Ralf Korn. 《Energy and AI》 2022, No. 3, pp. 37-50 (14 pages)
Load modeling is one of the crucial tasks for improving smart grids' energy efficiency. Among many alternatives, machine learning-based load models have become popular in applications and have shown outstanding performance in recent years. The performance of these models relies heavily on the quality and quantity of data available for training. However, gathering a sufficient amount of high-quality data is time-consuming and extremely expensive. In the last decade, Generative Adversarial Networks (GANs) have demonstrated their potential to solve the data shortage problem by generating synthetic data learned from recorded/empirical data. Such synthetic datasets can reduce the prediction error of electricity consumption when combined with empirical data, and they can further be used to enhance risk management calculations. Therefore, in this study we propose RCGAN, TimeGAN, CWGAN, and RCWGAN, which take individual electricity consumption data as input to provide synthetic data. Our work focuses on one-dimensional time series, and numerical experiments on an empirical dataset show that GANs are indeed able to generate synthetic data with a realistic appearance.
Keywords: electricity consumption; generative adversarial networks; synthetic data generation; unsupervised learning; RCGAN; TimeGAN; CWGAN; RCWGAN
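Whatever GAN variant is used, one-dimensional consumption data must first be sliced into fixed-width windows and scaled to the generator's output range. The abstract does not specify this preprocessing, so the sketch below shows one common way to do it (the window width, stride, and [-1, 1] scaling are illustrative choices, not the paper's):

```python
def make_training_windows(series, width=24, stride=24):
    """Slice a consumption series into fixed-width windows scaled to
    [-1, 1], the usual input format for time-series GAN training.
    Width/stride values here are illustrative (e.g. daily windows of
    hourly readings)."""
    lo, hi = min(series), max(series)
    # min-max scale the whole series to [-1, 1]
    scaled = [2.0 * (v - lo) / (hi - lo) - 1.0 for v in series]
    # non-overlapping windows when stride == width
    return [scaled[i:i + width]
            for i in range(0, len(scaled) - width + 1, stride)]
```

For a 48-point series this yields two 24-point windows; a GAN's discriminator is then trained to tell such real windows apart from generated ones.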
A new oil spill detection algorithm based on Dempster-Shafer evidence theory (cited by 1)
19
Authors: Tianlong ZHANG, Jie GUO, Chenqi XU, Xi ZHANG, Chuanyuan WANG, Baoquan LI. 《Journal of Oceanology and Limnology》 SCIE CAS CSCD, 2022, No. 2, pp. 456-469 (14 pages)
Features of oil spills and look-alikes in polarimetric synthetic aperture radar (SAR) images always play an important role in oil spill detection, and many detection algorithms have been implemented based on these features. Although environmental factors such as wind speed are important for distinguishing oil spills from look-alikes, some detection algorithms do not take them into account. To distinguish oil spills and look-alikes more accurately based on both environmental factors and image features, a new oil spill detection algorithm based on Dempster-Shafer evidence theory is proposed. The detection process accounting for environmental factors was modeled using the subjective Bayesian model, while the Faster region-based convolutional neural network (RCNN) model was used for oil spill detection based on convolution features. The detection results of the two models were fused at the decision level using Dempster-Shafer evidence theory. The proposed algorithm was established and tested on our oil spill and look-alike sample database, which contains 1798 image samples together with environmental information records related to the samples. Analysis and evaluation show a good ability to detect oil spills at a higher detection rate, with an identification rate greater than 75% and a false alarm rate lower than 19% in experiments. A total of 12 oil spill SAR images were collected for validation and evaluation; the results show that the proposed algorithm performs well, with an overall detection rate greater than 70%.
Keywords: synthetic aperture radar (SAR) data; oil spill detection; subjective Bayesian; Faster region-based convolutional neural network (RCNN); Dempster-Shafer evidence theory
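The decision-level fusion step can be illustrated with Dempster's rule of combination. A minimal sketch, representing hypotheses as frozensets over the frame {oil, lookalike}; the mass values assigned to the two detectors below are illustrative, not taken from the paper:

```python
def dempster_combine(m1, m2):
    """Combine two mass functions over the same frame of discernment
    with Dempster's rule: multiply masses of intersecting hypotheses
    and renormalize by (1 - conflict)."""
    combined = {}
    conflict = 0.0
    for h1, v1 in m1.items():
        for h2, v2 in m2.items():
            inter = h1 & h2
            if inter:
                combined[inter] = combined.get(inter, 0.0) + v1 * v2
            else:
                conflict += v1 * v2   # mass assigned to disjoint hypotheses
    norm = 1.0 - conflict
    return {h: v / norm for h, v in combined.items()}

# illustrative masses from the two detectors (not from the paper)
m_bayes = {frozenset({'oil'}): 0.6, frozenset({'oil', 'lookalike'}): 0.4}
m_rcnn = {frozenset({'oil'}): 0.7, frozenset({'lookalike'}): 0.2,
          frozenset({'oil', 'lookalike'}): 0.1}
fused = dempster_combine(m_bayes, m_rcnn)
```

When both sources lean toward "oil", the fused mass on {oil} exceeds either individual mass, which is the behavior that makes this rule attractive for fusing a Bayesian environmental model with a CNN image classifier.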
Unsupervised change detection of man-made objects using coherent and incoherent features of multi-temporal SAR images
20
Authors: FENG Hao, WU Jianzhong, ZHANG Lu, LIAO Mingsheng. 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD, 2022, No. 4, pp. 896-906 (11 pages)
Constrained by a complex imaging mechanism and extraordinary visual appearance, change detection with synthetic aperture radar (SAR) images has been a difficult research topic, especially in urban areas. Although existing studies have extended from bi-temporal data pairs to multi-temporal datasets to derive more plentiful information, two problems remain in practical applications. First, change indicators constructed from incoherent features alone cannot characterize the changed objects accurately. Second, the results of pixel-level methods are usually presented as noisy binary maps, making the spatial change unintuitive and the temporal change of a single pixel meaningless. In this study, we propose an unsupervised man-made object change detection framework using both coherent and incoherent features derived from multi-temporal SAR images. The coefficients of variation of time-series incoherent features and the man-made object index (MOI) defined with coherent features are first combined to identify the initial change pixels. Afterwards, an improved spatiotemporal clustering algorithm is developed based on density-based spatial clustering of applications with noise (DBSCAN) and dynamic time warping (DTW), which can transform the initial results into noiseless object-level patches and take the cluster center as a representative of the man-made object to determine the change pattern of each patch. An experiment with a stack of 10 TerraSAR-X images in Stripmap mode demonstrates that this method is effective in urban scenes and potentially applicable to wide-area change detection.
Keywords: change detection; multi-temporal synthetic aperture radar (SAR) data; coherent and incoherent features; clustering
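The incoherent change indicator named in the abstract, the coefficient of variation of a pixel's time series, is simple to compute: a pixel whose backscatter is stable over time has a low std/mean ratio, while a pixel that changed does not. A minimal sketch (the function name and the 0.3 threshold are illustrative assumptions, not the paper's values):

```python
import statistics

def change_candidates(stack, threshold=0.3):
    """Flag pixels whose temporal coefficient of variation (population
    std / mean of intensity over time) exceeds a threshold; a simplified
    stand-in for the paper's incoherent change indicator."""
    flags = []
    for pixel_series in stack:
        mean = statistics.fmean(pixel_series)
        cv = statistics.pstdev(pixel_series) / mean if mean else 0.0
        flags.append(cv > threshold)
    return flags
```

In the full framework these pixel-level flags are only the initial result; the MOI test and the DBSCAN+DTW clustering then turn them into object-level patches.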