320 articles found
Missing Data Imputations for Upper Air Temperature at 24 Standard Pressure Levels over Pakistan Collected from Aqua Satellite (cited by: 4)
1
Authors: Muhammad Usman Saleem, Sajid Rashid Ahmed. 《Journal of Data Analysis and Information Processing》, 2016, No. 3, pp. 132-146 (16 pages)
This research sought to select the best imputation method for missing upper air temperature data over 24 standard pressure levels. We implemented four imputation techniques: inverse distance weighting, bilinear, natural, and nearest interpolation. The performance indicators adopted were the root mean square error (RMSE), absolute mean error (AME), correlation coefficient, and coefficient of determination (R²). We randomly withheld 30% of the samples (324 in total) and predicted them from the remaining 70%. Although all four interpolation methods performed well (RMSE and AME below 1), the bilinear method was the most accurate. Its RMSE remained below 0.01 at all pressure levels except 1000 hPa, where it was 0.6, and its AME was below 0.1 at all pressure levels. A very strong correlation (>0.99) was found between actual and predicted air temperature, and the high coefficient of determination (0.99) indicates a close fit to the surface. Natural interpolation gave similar results, but monthly scatter plots showed that its imputations were somewhat less precise than the bilinear method's in certain months.
Keywords: Missing data imputations; Spatial Interpolation; AQUA Satellite; Upper Level Air Temperature; AIRX3STML
AQUA Satellite Data and Imputation of Geopotential Height: A Case Study for Pakistan
2
Authors: Usman Saleem, Mian Sohail Akram, Muhammad Fahad Ullah, Faisal Rehman, Muhammad Riaz Khan. 《Open Journal of Geology》, 2018, No. 10, pp. 1002-1018 (17 pages)
This study attempts to fill missing geopotential height data over Pakistan and to identify the optimum interpolation method. Geopotential height values over Pakistan were missing during the last thirteen years, and interpolation techniques were used to fill these gaps. The techniques included bilinear interpolation [BI], nearest neighbor [NN], natural [NI], and inverse distance weighting [IDW]. The imputations were judged on four performance parameters: root mean square error [RMSE], mean absolute error [MAE], correlation coefficient [Corr], and coefficient of determination [R2]. The NN and IDW imputations were not precise and accurate, whereas the natural neighbor and bilinear interpolations fitted the data set very well. A good correlation was found for the natural neighbor imputations, which fitted the geopotential height surface very well. The maximum and minimum RMSE values were ±5.10 m and ±2.28 m, respectively, while the MAE was near 1. Validation of the imputations revealed that NN interpolation produced more accurate results than BI. The study concludes that natural interpolation was the best-suited technique for filling missing geopotential height data from the AQUA satellite.
Keywords: AIRX3STML; Missing data imputations; Missing climatic data; Upper air temperature
Accuracy Comparison of Data Imputation Estimation Using Structural Equation Modeling Between Constrained and Unconstrained Approaches
3
Authors: Narong Phothi, Somchai Prakancharoen. 《通讯和计算机(中英文版)》, 2012, No. 3, pp. 297-302 (6 pages)
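Keywords: structural equation model; measurement accuracy; M-estimation; mineral resources; imputation; University of California; test data; online database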
Superiority of Bayesian Imputation to Mice in Logit Panel Data Models
4
Authors: Peter Otieno Opeyo, Weihu Cheng, Zhao Xu. 《Open Journal of Statistics》, 2023, No. 3, pp. 316-358 (43 pages)
Non-responses leading to missing data are common in most studies and cause inefficient and biased statistical inferences if ignored. When faced with missing data, many studies employ complete case analysis to estimate the model parameters. This, however, compromises the low bias and minimum variance expected of the estimates. Several classical and model-based techniques for imputing missing values have been described in the literature. The Bayesian approach is deemed superior to the other techniques because it lends itself naturally to missing data settings: the missing values are treated as unobserved random variables whose distribution depends on the observed data. This paper examines the superiority of Bayesian imputation over Multiple Imputation with Chained Equations (MICE) when estimating logistic panel data models with single fixed effects. The study validates the superiority of conditional maximum likelihood estimates for the nonlinear binary choice logit panel model in the presence of missing observations. A Monte Carlo simulation was designed to determine the magnitude of the bias and root mean square error (RMSE) arising from MICE and full Bayesian imputation. The simulation results show that the conditional maximum likelihood (ML) logit estimator presented in this paper is less biased and more efficient when Bayesian imputation is used to handle non-responses.
Keywords: Panel data; Imputation; Monte Carlo; Bias; Conditional Maximum Likelihood
Study on the Missing Data Mechanisms and Imputation Methods
5
Authors: Abdullah Z. Alruhaymi, Charles J. Kim. 《Open Journal of Statistics》, 2021, No. 4, pp. 477-492 (16 pages)
The absence of some data values in any observed dataset has been a real hindrance to achieving valid results in statistical research. This paper addresses the widespread missing data problem faced by analysts and statisticians in academia and professional environments. Some data-driven methods were studied to obtain accurate data. Projects that rely heavily on data face this missing data problem, and since machine learning models are only as good as the data used to train them, the problem has a real impact on the solutions developed for real-world problems. Therefore, this dissertation attempts to solve the problem using different mechanisms. It tests the effectiveness of both traditional and modern data imputation techniques by determining the loss of statistical power when these approaches are used to tackle the missing data problem. At the end of this research dissertation, it should be easy to establish which methods are best for handling the research problem.
Multivariate Imputation by Chained Equations (MICE) is recommended as the best approach to dealing with missing data under MAR missingness.
Keywords: Missing data; Mechanisms; Imputation Techniques; Models
Sequence-To-Sequence Learning for Online Imputation of Sensory Data
6
Authors: Kaitai TONG, Teng LI. 《Instrumentation》, 2019, No. 2, pp. 63-70 (8 pages)
Online sensing can provide useful information in monitoring applications, for example, machine health monitoring, structural condition monitoring, environmental monitoring, and many more. Missing data is generally a significant issue in the sensory data that is collected online by sensing systems, which may affect the goals of monitoring programs. In this paper, a sequence-to-sequence learning model based on a recurrent neural network (RNN) architecture is presented. In the proposed method, the multivariate time series of the monitored parameters is embedded into the neural network through layer-by-layer encoders, where the hidden features of the inputs are adaptively extracted. Afterwards, predictions of the missing data are generated by network decoders, which are one-step-ahead predictive data sequences of the monitored parameters. The prediction performance of the proposed model is validated on a real-world sensory dataset. The experimental results demonstrate the performance of the proposed RNN encoder-decoder model and its capability in sequence-to-sequence learning for online imputation of sensory data.
Keywords: Data imputation; Recurrent Neural Network; Sequence-To-Sequence Learning; Sequence Prediction
Fault detection and diagnosis for data incomplete industrial systems with new Bayesian network approach (cited by: 15)
7
Authors: Zhengdao Zhang, Jinlin Zhu, Feng Pan. 《Journal of Systems Engineering and Electronics》 (SCIE, EI, CSCD), 2013, No. 3, pp. 500-511 (12 pages)
For the fault detection and diagnosis problem in large-scale industrial systems, there are two important issues: missing data samples and the non-Gaussian property of the data. However, most existing data-driven methods cannot handle both. Thus, a new fault detection and diagnosis method based on a Bayesian network classifier is proposed. First, a non-imputation method is presented to handle incomplete data samples: owing to the properties of the proposed Bayesian network classifier, the missing values can be marginalized in an elegant manner. Furthermore, a Gaussian mixture model is used to approximate the non-Gaussian data with a linear combination of a finite number of Gaussian components, so that the Bayesian network can process the non-Gaussian data effectively. The entire method can therefore deal with high-dimensional incomplete process samples in an efficient and robust way. The diagnosis results are expressed as probabilities with reliability scores. The proposed approach is evaluated on a benchmark problem, the Tennessee Eastman process. The simulation results show the effectiveness and robustness of the proposed method in fault detection and diagnosis for large-scale systems with missing measurements.
Keywords: fault detection and diagnosis; Bayesian network; Gaussian mixture model; data incomplete; non-imputation
Comparative Study of Four Methods in Missing Value Imputations under Missing Completely at Random Mechanism (cited by: 3)
8
Authors: Michikazu Nakai, Ding-Geng Chen, Kunihiro Nishimura, Yoshihiro Miyamoto. 《Open Journal of Statistics》, 2014, No. 1, pp. 27-37 (11 pages)
In analyzing data from clinical trials and longitudinal studies, missing values are always a fundamental challenge, since missing data can introduce bias and lead to erroneous statistical inferences. To deal with this challenge, several methods have been developed in the literature, of which the most commonly used are the complete case method, mean imputation, last observation carried forward (LOCF), and multiple imputation (MI). In this paper, we conduct a simulation study to investigate the efficiency of these four typical methods in a longitudinal data setting under missing completely at random (MCAR). We consider three levels of missingness: 5%, 30%, and 50%. From this simulation study, we conclude that the LOCF method has more bias than the other three methods in most situations, while the MI method has the least bias and the best coverage probability. Thus, MI is the most effective imputation method in our MCAR simulation study.
Keywords: Missing data; Imputation; MCAR; Complete Case; LOCF
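Two of the single-imputation methods named in the abstract above, mean imputation and LOCF, are simple enough to sketch in a few lines. The following is an illustrative sketch only, not the authors' simulation code: the function names `locf` and `mean_impute` and the use of `None` to mark a missing value are assumptions made for this example.

```python
from typing import List, Optional

Series = List[Optional[float]]  # None marks a missing value (assumption for this sketch)


def locf(series: Series) -> Series:
    """Last observation carried forward: replace each missing value with the
    most recent observed value. Leading gaps stay missing because there is
    nothing to carry forward yet."""
    filled: Series = []
    last: Optional[float] = None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def mean_impute(series: Series) -> List[float]:
    """Mean imputation: replace every missing value with the mean of the
    observed values in the same series."""
    observed = [v for v in series if v is not None]
    if not observed:
        raise ValueError("cannot impute a series with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]


if __name__ == "__main__":
    visits = [1.0, None, 3.0, None, None]
    print(locf(visits))         # carries 1.0 and then 3.0 forward
    print(mean_impute(visits))  # fills every gap with the observed mean, 2.0
```

On a series with a trend, LOCF freezes the last observed level while mean imputation pulls every gap toward the centre, which is one intuition for why both can bias longitudinal estimates.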
Missing Values Imputation Based on Iterative Learning (cited by: 1)
9
Author: Huaxiong Li. 《International Journal of Intelligence Science》, 2013, No. 1, pp. 50-55 (6 pages)
Databases for machine learning and data mining often have missing values. Developing effective methods for missing values imputation is a crucially important problem in machine learning and data mining. In this paper, several methods for dealing with missing values in incomplete data are reviewed, and a new method for missing values imputation based on iterative learning is proposed. The proposed method rests on a basic assumption: there exist cause-effect connections among condition attribute values, and the missing values can be induced from known values. During imputation, a part of the missing values is filled in first and converted to known values, which are then used in the next imputation step. The iterative learning process continues until the incomplete data set is entirely converted to a complete one. The paper also presents an example to illustrate the framework of iterative learning for missing values imputation.
Keywords: Incomplete data; Missing values imputation; Iterative learning; Intension; Extension
Revealing GE Interactions from Trial Data without Replications
10
Authors: Jixiang Wu, Johnie Jenkins, Jack C. McCarty. 《Open Journal of Statistics》, 2019, No. 3, pp. 407-419 (13 pages)
Detecting genotype-by-environment (GE) interaction effects or yield stability is one of the most important components of crop trial data analysis, especially for historical crop trial data. However, it is statistically challenging to discover GE interaction effects because many published data sets report only entry means for each environment rather than repeated field plot data. In this study, we propose a new methodology that can be used to impute replicated trial data sets and so reveal GE interactions from the original data. As a demonstration, we used a data set of 28 potato genotypes and six environments with three replications to numerically evaluate the properties of this new imputation method. We compared the phenotypic means and predicted random effects from the imputed data with the results from the original data. The results from the imputed data were highly consistent with those from the original data set, indicating that data imputed by the proposed method can be used to reveal information, including GE interaction effects, harbored in the original data. This study could therefore pave the way to detecting GE interactions and other related information from historical crop trial reports when replications are not available.
Keywords: GE Interaction; Historical Crop Trial Data; Imputation
Copy Mean: A New Method to Impute Intermittent Missing Values in Longitudinal Studies
11
Authors: Christophe Genolini, René Écochard, Hélène Jacqmin-Gadda. 《Open Journal of Statistics》, 2013, No. 4, pp. 26-40 (15 pages)
Longitudinal studies are those in which the same variable is repeatedly measured at different times. These studies are more likely than others to suffer from missing values. Since missing values may have an important impact on statistical analyses, it is important that they be dealt with properly. In this paper, we present "Copy Mean", a new method to impute intermittent missing values. We compared its efficiency against eleven imputation methods dedicated to the treatment of missing values in longitudinal data. All these methods were tested on three markedly different real datasets (stationary, increasing, and sinusoidal pattern) with complete data. For each of them, we generated nine types of incomplete datasets containing 10%, 30%, or 50% missing data under a Missing Completely at Random, a Missing at Random, or a Missing Not at Random mechanism. Our results show that Copy Mean is highly effective, exceeding or equaling the performance of the other methods in almost all configurations. The effectiveness of linear interpolation is highly data-dependent. The Last Occurrence Carried Forward method is strongly discouraged.
Keywords: Imputation; Longitudinal data; Intermittent missing values
Using Statistical Learning to Treat Missing Data: A Case of HIV/TB Co-Infection in Kenya
12
Authors: Joshua O. Mwaro, Linda Chaba, Collins Odhiambo. 《Journal of Data Analysis and Information Processing》, 2020, No. 3, pp. 110-133 (24 pages)
In this study, we investigate the effects of missing data when estimating HIV/TB co-infection. We revisit the concept of missing data and examine three available approaches for dealing with missingness. The main objective is to identify the best method for correcting missing data in the TB/HIV co-infection setting. We employ both empirical data analysis and an extensive simulation study to examine the effects of missing data and the accuracy, sensitivity, specificity, and training and test error of the different approaches. The novelty of this work hinges on the use of modern statistical learning algorithms when treating missingness. In the empirical analysis, imputations were performed on both HIV data and TB-HIV co-infection data, and the missing values were imputed using different approaches. In the simulation study, sets of 0% (complete case), 10%, 30%, 50%, and 80% of the data were drawn randomly and replaced with missing values. The results show that the complete-case analysis gave a co-infection rate (95% confidence interval) of 29% (25%, 33%), the weighted method 27% (23%, 31%), the likelihood-based approach 26% (24%, 28%), and the multiple imputation approach 21% (20%, 22%). In conclusion, MI remains the best approach for dealing with missing data, and failure to apply it results in overestimation of the HIV/TB co-infection rate by 8%.
Keywords: Missing data; HIV/TB Co-Infection; Imputation; Missing at Random; Count data
Fraction of Missing Information (γ) at Different Missing Data Fractions in the 2012 NAMCS Physician Workflow Mail Survey
13
Authors: Qiyuan Pan, Rong Wei. 《Applied Mathematics》, 2016, No. 10, pp. 1057-1067 (11 pages)
In his classic 1987 book on multiple imputation (MI), Rubin used the fraction of missing information, γ, to define the relative efficiency (RE) of MI as RE = (1 + γ/m)^(−1/2), where m is the number of imputations, leading to the conclusion that a small m (≤5) would be sufficient for MI. However, evidence has been accumulating that many more imputations are needed. Why would the apparently sufficient m deduced from the RE actually be too small? The answer may lie with γ. In this research, γ was determined at missing data fractions (δ) of 4%, 10%, 20%, and 29% using the 2012 Physician Workflow Mail Survey of the National Ambulatory Medical Care Survey (NAMCS). The γ values were strikingly small, ranging from the order of 10^−6 to 0.01. As δ increased, γ usually increased but sometimes decreased. How the data were analysed had the dominant effect on γ, overshadowing the effect of δ. The results suggest that it is impossible to predict γ from δ and that it may not be appropriate to use the γ-based RE to determine a sufficient m.
Keywords: Multiple imputation; Fraction of Missing Information (γ); Sufficient Number of imputations; Missing data; NAMCS
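The relative efficiency formula quoted in the abstract above, RE = (1 + γ/m)^(−1/2), can be evaluated directly to see why a small m looks sufficient on paper. The sketch below is a worked illustration only; the function name `relative_efficiency` and the chosen γ and m values are assumptions for this example and are not taken from the paper.

```python
def relative_efficiency(gamma: float, m: int) -> float:
    """Relative efficiency of multiple imputation as quoted in the abstract:
    RE = (1 + gamma / m) ** (-1/2), with gamma the fraction of missing
    information and m the number of imputations."""
    if m < 1:
        raise ValueError("m must be a positive number of imputations")
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma is a fraction and must lie in [0, 1]")
    return (1.0 + gamma / m) ** -0.5


if __name__ == "__main__":
    # Illustrative values only: gamma = 0.01 is the upper end of the range
    # reported in the abstract; 0.3 and 0.5 are hypothetical larger values.
    for gamma in (0.01, 0.3, 0.5):
        for m in (3, 5, 20):
            print(f"gamma={gamma:<5} m={m:<3} RE={relative_efficiency(gamma, m):.4f}")
```

Even for a hypothetical γ of 0.5, m = 5 already gives an RE of about 0.953, and RE approaches 1 as γ shrinks, which is consistent with the abstract's point that the RE criterion alone makes very small m appear adequate.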
A Study of Data Imputation Methods for Rear-End Crash Data Based on Generative Adversarial Networks
14
Authors: 周备, 张莹, 张生瑞, 周千喜, 汪琴. 《交通运输系统工程与信息》 (EI, CSCD, PKU Core), 2024, No. 1, pp. 132-137, 198 (7 pages)
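In-depth analysis of traffic crash data can provide an important theoretical basis for preventing crashes and reducing their severity. However, data are often lost during crash data collection, transmission, and storage, which lowers the accuracy of statistical analyses and raises the risk of model misjudgment. This paper takes 101,452 rear-end crash records from Chicago for 2016-2021 as the study object and randomly splits the raw data into training and test sets at a 7:3 ratio. On the training set, a Generative Adversarial Imputation Network (GAIN) is used to fill in the missing data. To compare the effects of different imputation methods, Multiple Imputation by Chained Equations (MICE), Expectation Maximization (EM) imputation, MissForest, and K-Nearest Neighbor (KNN) are also applied to the same dataset, and the effect of each algorithm on data variability is compared through the change in variable variance before and after imputation. After imputation, a three-class LightGBM model is built to analyze the factors influencing crash severity. The model is trained separately on the original training set and on each imputed training set, and its predictive performance is checked on the unimputed test set. The results show that model performance improves to some extent after missing values are imputed: compared with the model trained on the original data, the model trained on the GAIN-imputed dataset improves accuracy by 6.84%, F1 by 4.61%, and AUC (Area Under the Curve) by 10.09%, and the improvement exceeds that of the other four imputation methods.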
Keywords: urban traffic; data imputation; generative adversarial network; rear-end crash; LightGBM model
A Comparative Analysis of Generalized Estimating Equations Methods for Incomplete Longitudinal Ordinal Data with Ignorable Dropouts
15
Authors: Kago Edwin Ditlhong, Oscar Owino Ngesa, Abdalla Yusuf Kombo. 《Open Journal of Statistics》, 2018, No. 5, pp. 770-792 (23 pages)
In longitudinal studies, measurements are taken repeatedly over time on the same experimental unit, so the measurements are correlated. Missing data are very common in longitudinal studies, and much research has addressed ways to analyze such data sets appropriately. Generalized Estimating Equations (GEE) is a popular method for the analysis of non-Gaussian longitudinal data. In the presence of missing data, GEE requires the strong assumption of missing completely at random (MCAR). Multiple Imputation Generalized Estimating Equations (MIGEE), Inverse Probability Weighted Generalized Estimating Equations (IPWGEE), and Double Robust Generalized Estimating Equations (DRGEE) have been proposed as elegant ways to ensure valid inference under missing at random (MAR). In this study, these three extensions of GEE are compared under various dropout rates and sample sizes through simulation studies. Under both the MAR and MCAR mechanisms, the simulation results revealed better performance of DRGEE than of IPWGEE and MIGEE. The optimum method was then applied to a real data set.
Keywords: Longitudinal Ordinal Data; MAR; MCAR; Multiple Imputation GEE; Inverse Probability Weighted GEE; Double Robust GEE
A Missing-Value Imputation Algorithm for Incomplete Big Data under the Spatial Autoregressive Model
16
Authors: 刘晓燕, 翟建国. 《吉林大学学报(信息科学版)》 (CAS), 2024, No. 2, pp. 312-317 (6 pages)
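To address the heavy computation and low imputation accuracy caused by the irregular structure of incomplete big data during missing-value imputation, this paper proposes a missing-value imputation algorithm for incomplete big data under the spatial autoregressive model. A transfer learning algorithm with dynamic weights filters redundant data out of the raw data, distinguishes abnormal from normal data, and extracts the incomplete data, which are then repaired by least squares regression. Missing-value imputation is divided into three types: first-order spatial autoregressive model imputation, spatial autoregressive model imputation, and multiple imputation. According to the actual situation, the repaired data are inserted at the appropriate positions, completing the imputation of missing values in incomplete big data. Experimental results show that the proposed method has good missing-value imputation capability.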
Keywords: transfer learning; incomplete big data; missing-value imputation; spatial regression model; data correction
Missing Data Imputation: A Comprehensive Review
17
Authors: Majed Alwateer, El-Sayed Atlam, Mahmoud Mohammed Abd El-Raouf, Osama A. Ghoneim, Ibrahim Gad. 《Journal of Computer and Communications》, 2024, No. 11, pp. 53-75 (23 pages)
Missing data presents a significant challenge in statistical analysis and machine learning, often resulting in biased outcomes and diminished efficiency. This comprehensive review investigates various imputation techniques, categorizing them into three primary approaches: deterministic methods, probabilistic models, and machine learning algorithms. Traditional techniques, including mean or mode imputation, regression imputation, and last observation carried forward, are evaluated alongside more contemporary methods such as multiple imputation, expectation-maximization, and deep learning strategies. The strengths and limitations of each approach are outlined. Key considerations for selecting appropriate methods, based on data characteristics and research objectives, are discussed. The importance of evaluating imputation's impact on subsequent analyses is emphasized. This synthesis of recent advancements and best practices provides researchers with a robust framework for effectively handling missing data, thereby improving the reliability of empirical findings across diverse disciplines.
Keywords: Missing data; Machine Learning; Prediction; Deep Learning; Imputation
A Missing Data Imputation Method for Arch Dams Based on a Panel Data Model (cited by: 2)
18
Authors: 崔欣然, 石立, 陆希, 顾昊, 吴艳, 朱明远. 《水力发电学报》 (CSCD, PKU Core), 2024, No. 3, pp. 94-107 (14 pages)
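Concrete arch dams are important hydraulic structures. Because of monitoring equipment failures, human factors, and other influences, their monitoring data are frequently missing, which reduces the effectiveness and accuracy of dam safety assessment and prediction. Traditional methods mostly rely only on the measurements of a single monitoring point for imputation, ignoring the correlation and heterogeneity among monitoring points. This paper proposes an imputation method for missing deformation data based on a panel data model. First, the traditional incremental-velocity index of deformation similarity is improved to solve the problem that its denominator may equal zero. Second, a combined weighting method is proposed to compute a comprehensive deformation similarity index, and an improved density-based clustering method is used to classify the deformation monitoring points. A panel model is then established to fill in the missing data within each zone. The proposed method fills in missing concrete arch dam deformation data more accurately and thus effectively addresses the problem of missing deformation monitoring data.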
Keywords: missing data imputation; deformation similarity index; clustering method; panel data model; concrete arch dam
Missing-Value Imputation for Wind Turbine Data Based on a Multivariate Spatiotemporal Integration Network (cited by: 1)
19
Authors: 詹兆康, 胡旭光, 赵浩然, 张思琪, 张峻凯, 马大中. 《自动化学报》 (EI, CAS, CSCD, PKU Core), 2024, No. 6, pp. 1171-1184 (14 pages)
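The integrity of wind farm data can be compromised by severe weather, loss of input signals, sensor failures, and other causes, and large-scale data loss poses a serious challenge to the operation and maintenance of wind turbine equipment. A multivariate spatiotemporal integration network (MSIN) is therefore proposed to address the missing data problem. First, an MSIN structure with a missing-value localization and guidance mechanism is proposed to reveal the latent information in the missing portion of the data and ensure that the imputed data follow the true distribution. Second, a multi-view spatiotemporal convolution module is designed in the network to capture the local spatial and global temporal correlations among multiple variables of the same turbine and among the same variable across multiple turbines, improving the fidelity of the imputed data. Next, a real-time self-updating mechanism is proposed that adjusts the network online according to real-time changes in the wind farm, improving its generalization ability and avoiding the high time and space cost of retraining the model. Finally, the effectiveness and superiority of the proposed network are verified on real wind turbine data. The analysis shows that, compared with traditional imputation methods such as MissForest, the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE) decrease by more than 18.54%, 41.00%, and 3.15%, respectively.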
Keywords: wind turbine data; data imputation; spatiotemporal features; generative adversarial network
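A Coupled Enhancement Method for Time-Series Data of Distributed Photovoltaic Clusters Based on a Bidirectional Recurrent Imputation Network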
20
Authors: 廖若愚, 刘友波, 沈晓东, 高红均, 唐冬来, 刘俊勇. 《电网技术》 (EI, CSCD, PKU Core), 2024, No. 7, pp. 2784-2794, I0042-I0048 (18 pages)
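Distributed photovoltaic (PV) installations are numerous and widely spread, have high local penetration, and operate in complex, changeable environments; true and reliable measurement data are the basis for their performance analysis, output forecasting, and operation and maintenance control. However, factors such as sensor failures and communication congestion cause missing measurements, degrading the quality of the raw data and in turn affecting the accuracy of distribution network operating decisions. Traditional data repair methods consider only the distribution characteristics of a single measurement and ignore the latent coupling among multidimensional time series, so their repair accuracy is limited. This paper therefore proposes a coupled enhancement method for time-series data based on a bidirectional multi-stage recurrent imputation network and Seq2SeqAttention. It improves the structure of the recurrent imputation network and introduces a decay mechanism, so that a small amount of non-missing data can be used to mine the overall distribution pattern of the raw data and to repair the data of multiple PV stations with high quality in a single pass. Experimental results show that the proposed method retains good repair performance even at high missing rates, markedly improves the basic data quality of distributed PV clusters, and enhances grid operators' fine-grained awareness of PV clusters.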
Keywords: missing data repair; bidirectional recurrent imputation network; coupled time-series data; distributed photovoltaic cluster