21 articles found
Physics-Informed AI Surrogates for Day-Ahead Wind Power Probabilistic Forecasting with Incomplete Data for Smart Grid in Smart Cities (cited by: 1)
1
Authors: Zeyu Wu, Bo Sun, Qiang Feng, Zili Wang, Junlin Pan 《Computer Modeling in Engineering & Sciences》 SCIE EI 2023, No. 10, pp. 527-554 (28 pages)
Due to the high inherent uncertainty of renewable energy, probabilistic day-ahead wind power forecasting is crucial for modeling and controlling the uncertainty of renewable energy smart grids in smart cities. However, the accuracy and reliability of high-resolution day-ahead wind power forecasting are constrained by unreliable local weather prediction and incomplete power generation data. This article proposes a physics-informed artificial intelligence (AI) surrogates method to augment the incomplete dataset and quantify its uncertainty to improve wind power forecasting performance. The incomplete dataset, built with numerical weather prediction data, historical wind power generation, and weather factors data, is augmented based on generative adversarial networks. After augmentation, the enriched data is fed into a multiple AI surrogates model constructed from two extreme learning machine networks to train the forecasting model for wind power. The forecasting models' accuracy and generalization ability are thereby improved by mining the implicit physics information from the incomplete dataset. An incomplete dataset gathered from a wind farm in North China, containing only 15 days of weather and wind power generation data with missing points caused by occasional shutdowns, is used to verify the proposed method's performance. Compared with other probabilistic forecasting methods, the proposed method shows better accuracy and probabilistic performance on the same incomplete dataset, which highlights its potential for more flexible and sensitive maintenance of smart grids in smart cities.
Keywords: Physics-informed method; probabilistic forecasting; wind power; generative adversarial network; extreme learning machine; day-ahead forecasting; incomplete data; smart grids
Power Incomplete Data Clustering Based on Fuzzy Fusion Algorithm
2
Authors: Yutian Hong, Yuping Yan 《Energy Engineering》 EI 2023, No. 1, pp. 245-261 (17 pages)
With the rapid development of the economy, the scale of the power grid is expanding. The number of power equipment items that constitute the power grid has become very large, which makes the state data of power equipment grow explosively. These multi-source heterogeneous data differ from one another, which leads to data variation in the process of transmission and preservation and thus produces the bad information of incomplete data. Therefore, research on data integrity has become an urgent task. This paper is based on the characteristics of random chance and the spatio-temporal difference of the system. According to the characteristics and data sources of the massive data generated by power equipment, a fuzzy mining model of power equipment data is established, and the data is divided into numerical and non-numerical data based on numerical data. The text data of power equipment defects is taken as the mining material, and an array-based Apriori algorithm is then used for deep mining. The strong association rules in incomplete data of power equipment are obtained and analyzed. Judging from the trend of the NRMSE metrics and classification accuracy, most of the filling methods combined with the two frameworks in this method show a relatively stable filling trend and do not fluctuate greatly as the missing rate grows. The experimental results show that the proposed algorithm model can effectively improve the filling effect of the existing filling methods on most data sets, and the filling effect fluctuates greatly with the increase of the missing rate; that is, as the missing rate increases, the improvement of the model over the existing filling methods is higher than 4.3%. Through the incomplete data clustering technology studied in this paper, a more innovative state assessment of smart grid reliability operation is carried out, which has good research value and reference significance.
Keywords: Power system; equipment parameter; incomplete data; fuzzy analysis; data clustering
Analysis of Incomplete Data of Accelerated Life Testing with Competing Failure Modes (cited by: 10)
3
Authors: TAN Yuanyuan, ZHANG Chunhua, CHEN Xun 《Chinese Journal of Mechanical Engineering》 SCIE EI CAS CSCD 2009, No. 6, pp. 883-889 (7 pages)
Data obtained from accelerated life testing (ALT) when there are two or more failure modes, commonly referred to as competing failure modes, are often incomplete. The incompleteness is mainly due to censoring, as well as masking, in which the failure time is observed but its corresponding failure mode is not identified, because identifying the failure mode may be expensive or very difficult owing to a lack of appropriate diagnostics. A method is proposed for analyzing incomplete data of constant-stress ALT with competing failure modes. It is assumed that failure modes have s-independent latent lifetimes and that the log lifetime of each failure mode can be written as a linear function of stress. The parameters of the model are estimated by using the expectation-maximization (EM) algorithm with incomplete data. Simulation studies are performed to check model validity and investigate the properties of the estimates. For further validation, the method is also illustrated by an example, which shows the process of analyzing incomplete data from ALT of an insulation system. Because it considers the incompleteness of data in modeling and makes use of the EM algorithm in estimation, the method is more flexible in ALT analysis.
Keywords: accelerated life testing; competing failure modes; expectation maximum algorithm; incomplete data; Monte Carlo simulation
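The EM idea in the abstract above can be illustrated with a deliberately simplified sketch. It is not the authors' model: it assumes two failure modes with exponential latent lifetimes, no stress covariate, no censoring, and masking that is independent of the true failure mode. The function name `em_masked_exponential` and the sample data are hypothetical.

```python
def em_masked_exponential(times, causes, n_iter=200):
    """EM estimate of two exponential failure rates when some causes are masked.

    times  : observed failure times (all units failed; no censoring assumed)
    causes : 1 or 2 for an identified failure mode, None when the mode is masked
    """
    total_time = sum(times)
    lam = [1.0 / total_time, 1.0 / total_time]  # equal starting rates
    for _ in range(n_iter):
        # E-step: expected number of failures attributed to each mode.
        expected = [0.0, 0.0]
        for cause in causes:
            if cause is None:
                # A masked failure is split by the posterior mode probabilities.
                s = lam[0] + lam[1]
                expected[0] += lam[0] / s
                expected[1] += lam[1] / s
            else:
                expected[cause - 1] += 1.0
        # M-step: exponential MLE is (expected failures) / (total exposure).
        lam = [expected[0] / total_time, expected[1] / total_time]
    return lam


# Made-up illustration: four failures, the last one with a masked mode.
rates = em_masked_exponential([1.0, 2.0, 3.0, 4.0], [1, 1, 2, None])
```

In this toy setting the iteration converges to the closed-form answer: the total rate is the number of failures divided by total time, split between the modes in the proportion seen among the identified failures.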
Energy Consumption Prediction of a CNC Machining Process With Incomplete Data (cited by: 6)
4
Authors: Jian Pan, Congbo Li, Ying Tang, Wei Li, Xiaoou Li 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2021, No. 5, pp. 987-1000 (14 pages)
Energy consumption prediction of a CNC machining process is important for energy efficiency optimization strategies. To improve generalization ability, more and more parameters are acquired for energy prediction modeling. However, the data collected from workshops may be incomplete because of misoperation, unstable network connections, frequent transfers, etc. This work proposes a framework for energy modeling based on incomplete data to address this issue. First, some necessary preliminary operations are applied to the incomplete data sets. Then, missing values are estimated to generate a new complete data set based on generative adversarial imputation nets (GAIN). Next, the gene expression programming (GEP) algorithm is used to train the energy model on the generated data sets. Finally, the predictive accuracy of the obtained model is tested. Computational experiments are designed to investigate the performance of the proposed framework with different rates of missing data. Experimental results demonstrate that even when the missing data rate increases to 30%, the proposed framework can still make efficient predictions, with a corresponding RMSE and MAE of 0.903 kJ and 0.739 kJ, respectively.
Keywords: Energy consumption prediction; incomplete data; generative adversarial imputation nets (GAIN); gene expression programming (GEP)
Deep learning technique for process fault detection and diagnosis in the presence of incomplete data (cited by: 3)
5
Authors: Cen Guo, Wenkai Hu, Fan Yang, Dexian Huang 《Chinese Journal of Chemical Engineering》 SCIE EI CAS CSCD 2020, No. 9, pp. 2358-2367 (10 pages)
In modern industrial processes, timely detection and diagnosis of process abnormalities are critical for monitoring process operations. Various fault detection and diagnosis (FDD) methods have been proposed and implemented, the performance of which, however, could be drastically influenced by the common presence of incomplete or missing data in real industrial scenarios. This paper presents a new FDD approach based on an incomplete data imputation technique for process fault recognition. It employs the modified stacked autoencoder, a deep learning structure, in the phase of incomplete data treatment, and classifies data representations rather than the imputed complete data in the phase of fault identification. A benchmark process, the Tennessee Eastman process, is employed to illustrate the effectiveness and applicability of the proposed method.
Keywords: Alarm configuration; Deep learning; Fault detection and diagnosis; incomplete data; Stacked autoencoder
Bayesian estimation of a power law process with incomplete data (cited by: 2)
6
Authors: HU Junming, HUANG Hongzhong, LI Yanfeng 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 2021, No. 1, pp. 243-251 (9 pages)
Due to its simplicity and flexibility, the power law process is widely used to model the failures of repairable systems. Although statistical inference on the parameters of the power law process has been well developed, numerous studies largely depend on complete failure data. A few methods for handling incomplete data have been reported, but they are limited to specific cases, especially the case where missing data occur at the early stage of the failures. No framework to handle generic scenarios is available. To overcome this problem, statistical inference for the power law process with incomplete data is established in this paper from the point of view of order statistics. The theoretical derivation is carried out, and case studies demonstrate and verify the proposed method. Order statistics offer an alternative for statistical inference of the power law process with incomplete data, as they can reformulate current studies on left-censored failure data and interval-censored data in a unified framework. The results show that the proposed method has more flexibility and wider applicability.
Keywords: incomplete data; power law process; Bayesian inference; order statistics; repairable system
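For orientation, the power law process named in the abstract above has intensity λ(t) = (β/θ)(t/θ)^(β−1) and expected cumulative failures (t/θ)^β. The sketch below shows only the textbook maximum likelihood estimates for complete, time-truncated data, which is the baseline that the abstract says most studies rely on. It is not the Bayesian order-statistics method of the paper, and the function names and failure times are made up for illustration.

```python
import math


def plp_mle_time_truncated(failure_times, T):
    """MLE of (beta, theta) for a power law process observed on (0, T].

    Assumes every failure time in (0, T) was recorded (complete data).
    """
    n = len(failure_times)
    beta = n / sum(math.log(T / t) for t in failure_times)
    theta = T / n ** (1.0 / beta)
    return beta, theta


def expected_failures(t, beta, theta):
    """Mean cumulative number of failures by time t: (t / theta) ** beta."""
    return (t / theta) ** beta


# Made-up failure times (hours) observed up to T = 2000 hours.
times = [5.0, 40.0, 43.0, 175.0, 389.0, 712.0, 747.0, 795.0, 1299.0, 1478.0]
beta_hat, theta_hat = plp_mle_time_truncated(times, 2000.0)
```

By construction the fitted mean function reproduces the observed failure count at the truncation time, which is a quick sanity check on any implementation. A value of `beta_hat` below 1 indicates a decreasing failure intensity (reliability growth), and a value above 1 indicates deterioration.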
A Fast and Effective Multiple Kernel Clustering Method on Incomplete Data (cited by: 1)
7
Authors: Lingyun Xiang, Guohan Zhao, Qian Li, Gwang-Jun Kim, Osama Alfarraj, Amr Tolba 《Computers, Materials & Continua》 SCIE EI 2021, No. 4, pp. 267-284 (18 pages)
Multiple kernel clustering is an unsupervised data analysis method that has been used in various scenarios where data is easy to collect but hard to label. However, multiple kernel clustering for incomplete data is a critical yet challenging task. Although the existing absent multiple kernel clustering methods have achieved remarkable performance on this task, they may fail when data has a high value-missing rate, and they may easily fall into a local optimum. To address these problems, in this paper, we propose an absent multiple kernel clustering (AMKC) method on incomplete data. The AMKC method first clusters the initialized incomplete data. Then, it constructs a new multiple-kernel-based data space, referred to as K-space, from multiple sources to learn kernel combination coefficients. Finally, it seamlessly integrates an incomplete-kernel-imputation objective, a multiple-kernel-learning objective, and a kernel-clustering objective in order to achieve absent multiple kernel clustering. The three stages in this process are carried out simultaneously until the convergence condition is met. Experiments on six datasets with various characteristics demonstrate that the kernel imputation and clustering performance of the proposed method is significantly better than that of state-of-the-art competitors. Meanwhile, the proposed method converges quickly.
Keywords: Multiple kernel clustering; absent-kernel imputation; incomplete data; kernel k-means clustering
Belief Combination of Classifiers for Incomplete Data
8
Authors: Zuowei Zhang, Songtao Ye, Yiru Zhang, Weiping Ding, Hao Wang 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2022, No. 4, pp. 652-667 (16 pages)
Data with missing values, or incomplete information, brings some challenges to the development of classification, as the incompleteness may significantly affect the performance of classifiers. In this paper, we handle missing values in both training and test sets with uncertainty and imprecision reasoning by proposing a new belief combination of classifier (BCC) method based on the evidence theory. The proposed BCC method aims to improve the classification performance of incomplete data by characterizing the uncertainty and imprecision brought by incompleteness. In BCC, different attributes are regarded as independent sources, and the collection of each attribute is considered as a subset. Then, multiple classifiers are trained with each subset independently and allow each observed attribute to provide a sub-classification result for the query pattern. Finally, these sub-classification results with different weights (discounting factors) are used to provide supplementary information to jointly determine the final classes of query patterns. The weights consist of two aspects: global and local. The global weight calculated by an optimization function is employed to represent the reliability of each classifier, and the local weight obtained by mining attribute distribution characteristics is used to quantify the importance of observed attributes to the pattern classification. Extensive comparative experiments with seven methods on twelve datasets demonstrate that BCC outperforms all baseline methods in terms of accuracy, precision, recall, and F1 measure, with pertinent computational costs.
Keywords: Classifier fusion; classification; evidence theory; incomplete data; missing values
Damage Identification under Incomplete Mode Shape Data Using Optimization Technique Based on Generalized Flexibility Matrix
9
Authors: Qianhui Gao, Zhu Li, Yongping Yu, Shaopeng Zheng 《Journal of Applied Mathematics and Physics》 2023, No. 12, pp. 3887-3901 (15 pages)
A generalized flexibility-based objective function for structural damage identification is constructed for solving the constrained nonlinear least squares optimization problem. To begin with, the generalized flexibility matrix (GFM) proposed to solve the damage identification problem is recalled and a modal expansion method is introduced. Next, the objective function for the iterative optimization process based on the GFM is formulated, and the Trust-Region algorithm is used to obtain the solution of the optimization problem for multiple damage cases. Then, for computing the gradient of the objective function, the sensitivity analysis with respect to the design variables is derived. In addition, because of spatial incompleteness, the influence of stiffness reduction and incomplete modal measurement data is discussed by means of two numerical examples with several damage cases. Finally, based on the computational results, it is evident that the presented approach provides good validity and reliability for large and complicated engineering structures.
Keywords: Generalized Flexibility Matrix; Damage Identification; Constrained Nonlinear Least Squares; Trust-Region Algorithm; Sensitivity Analysis; Incomplete Modal Data
IDEA: A Utility-Enhanced Approach to Incomplete Data Stream Anonymization (cited by: 1)
10
Authors: Lu Yang, Xingshu Chen, Yonggang Luo, Xiao Lan, Wei Wang 《Tsinghua Science and Technology》 SCIE EI CAS CSCD 2022, No. 1, pp. 127-140 (14 pages)
The prevalence of missing values in the data streams collected in real environments makes them impossible to ignore in the privacy preservation of data streams. However, most privacy preservation methods are developed without considering missing values. A few studies allow them to participate in data anonymization but introduce considerable extra information loss. To balance the utility and privacy preservation of incomplete data streams, we present a utility-enhanced approach for Incomplete Data strEam Anonymization (IDEA). In this approach, a slide-window-based processing framework is introduced to anonymize data streams continuously, in which each tuple can be output with clustering or anonymized clusters. We consider the dimensions of attribute and tuple as the similarity measurement, which enables the clustering between incomplete records and complete records and generates the cluster with minimal information loss. To avoid missing value pollution, we propose a generalization method based on maybe match for generalizing incomplete data. The experiments conducted on real datasets show that the proposed approach can efficiently anonymize incomplete data streams while effectively preserving utility.
Keywords: anonymization; generalization; incomplete data streams; privacy preservation; utility
Incomplete data management: a survey (cited by: 1)
11
Authors: Xiaoye MIAO, Yunjun GAO, Su GUO, Wanqi LIU 《Frontiers of Computer Science》 SCIE EI CSCD 2018, No. 1, pp. 4-25 (22 pages)
Incomplete data accompanies our life processes and covers almost all fields of scientific studies, as a result of delivery failure, battery power loss, accidental loss, etc. However, how to model, index, and query incomplete data poses big challenges. For example, queries over incomplete data usually return unsatisfactory results because of improper incompleteness handling methods. In this paper, we systematically review the management of incomplete data, including modelling, indexing, querying, and handling methods for incomplete data. We also give an overview of several application scenarios of incomplete data and summarize the existing systems related to incomplete data. It is our hope that this survey could provide insights to the database community on how incomplete data is managed, and inspire database researchers to develop more advanced processing techniques and tools to cope with the issues resulting from incomplete data in the real world.
Keywords: incomplete data; query processing; indexing; application; system
Effective Density-Based Clustering Algorithms for Incomplete Data (cited by: 2)
12
Authors: Zhonghao Xue, Hongzhi Wang 《Big Data Mining and Analytics》 EI 2021, No. 3, pp. 183-194 (12 pages)
Density-based clustering is an important category among clustering algorithms. In real applications, many datasets suffer from incompleteness. Traditional imputation technologies or other techniques for handling missing values are not suitable for density-based clustering and decrease clustering result quality. To avoid these problems, we develop a novel density-based clustering approach for incomplete data based on Bayesian theory, which conducts imputation and clustering concurrently and makes use of intermediate clustering results. To avoid the impact of low-density areas inside non-convex clusters, we introduce a local imputation clustering algorithm, which aims to impute points to high-density local areas. The performance of the proposed algorithms is evaluated using ten synthetic datasets and five real-world datasets with induced missing values. The experimental results show the effectiveness of the proposed algorithms.
Keywords: density-based clustering; incomplete data; clustering algorithm
Efficient k-dominant skyline query over incomplete data using MapReduce
13
Authors: Linlin DING, Shu WANG, Baoyan SONG 《Frontiers of Computer Science》 SCIE EI CSCD 2021, No. 4, pp. 151-164 (14 pages)
Skyline queries are extensively incorporated in various real-life applications to filter out uninteresting data objects. Sometimes, a skyline query may return too many results because it cannot control the retrieval conditions, especially for high-dimensional datasets. As an extension of the skyline query, the k-dominant skyline query relaxes the dominance condition across dimensions by controlling the value of the parameter k, thereby reducing the number of retrieved objects. In addition, with the continuous promotion of big data applications, the data we acquire may not have the entire content that people want, for practical reasons such as delivery failure, battery power loss, or accidental loss, so the data might be incomplete, with missing values in some attributes. Obviously, k-dominant skyline query algorithms for incomplete data depend on the user definition to some degree, and the results cannot be shared. Meanwhile, the existing algorithms cannot be applied directly to incomplete big data. Based on the above situations, this paper mainly studies the k-dominant skyline query problem over incomplete datasets and combines this problem with a distributed structure such as the MapReduce environment. First, we propose an index structure over incomplete data, named incomplete data index based on dominate hierarchical tree (ID-DHT). Applying the bucket strategy, the incomplete data is divided into different buckets according to the dimensions of missing attributes. Second, we put forward a query algorithm for incomplete data in the MapReduce environment, named MapReduce incomplete data based on dominant hierarchical tree algorithm (MR-ID-DHTA). The data in each bucket is allocated to a subspace according to the dominance condition by the Map function. The Reduce function processes the data according to the key value and returns the k-dominant skyline query result. Experiments demonstrate the validity and usability of our index structure and algorithm.
Keywords: k-dominant skyline query; incomplete data; MapReduce; index structure; big data
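Two ingredients mentioned in the abstract above, bucketing records by their missing-attribute pattern and the k-dominance test, can be sketched in a few lines. This is a single-machine illustration under stated assumptions, not the ID-DHT index or the MR-ID-DHTA algorithm: missing values are encoded as `None`, smaller values are treated as better, and two records are compared only on the dimensions both have observed. All function names are hypothetical.

```python
from collections import defaultdict


def bucket_by_missing(points):
    """Group records by which attributes are missing (None)."""
    buckets = defaultdict(list)
    for p in points:
        pattern = tuple(v is None for v in p)
        buckets[pattern].append(p)
    return dict(buckets)


def k_dominates(p, q, k):
    """True if p is no worse than q on at least k commonly observed
    dimensions and strictly better on at least one of them."""
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    no_worse = sum(1 for a, b in common if a <= b)
    strictly_better = any(a < b for a, b in common)
    return no_worse >= k and strictly_better


def k_dominant_skyline(points, k):
    """Naive quadratic scan: keep records that no other record k-dominates."""
    return [
        q for i, q in enumerate(points)
        if not any(k_dominates(p, q, k) for j, p in enumerate(points) if j != i)
    ]


# Made-up records with three attributes each.
data = [(1, 2, None), (2, 3, 4), (3, None, 1), (5, 5, 5)]
buckets = bucket_by_missing(data)
skyline = k_dominant_skyline(data, k=2)
```

With these four records, `(2, 3, 4)` and `(5, 5, 5)` are each 2-dominated by `(1, 2, None)`, so only the two incomplete records remain in the result. A distributed version would need to replace the quadratic scan, which is the part the paper's index and MapReduce stages address.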
Fault detection and diagnosis for data incomplete industrial systems with new Bayesian network approach (cited by: 15)
14
Authors: Zhengdao Zhang, Jinlin Zhu, Feng Pan 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 2013, No. 3, pp. 500-511 (12 pages)
For the fault detection and diagnosis problem in large-scale industrial systems, there are two important issues: missing data samples and the non-Gaussian property of the data. However, most of the existing data-driven methods cannot handle both of them. Thus, a new fault detection and diagnosis method based on a Bayesian network classifier is proposed. First, a non-imputation method is presented to handle incomplete data samples: by exploiting a property of the proposed Bayesian network classifier, the missing values can be marginalized in an elegant manner. Furthermore, the Gaussian mixture model is used to approximate the non-Gaussian data with a linear combination of finite Gaussian mixtures, so that the Bayesian network can process the non-Gaussian data effectively. Therefore, the entire fault detection and diagnosis method can deal with high-dimensional incomplete process samples in an efficient and robust way. The diagnosis results are expressed as probabilities with reliability scores. The proposed approach is evaluated with a benchmark problem called the Tennessee Eastman process. The simulation results show the effectiveness and robustness of the proposed method in fault detection and diagnosis for large-scale systems with missing measurements.
Keywords: fault detection and diagnosis; Bayesian network; Gaussian mixture model; data incomplete; non-imputation
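The non-imputation idea in the abstract above, marginalizing missing values instead of filling them in, is easiest to see in a much simpler classifier than the one the paper proposes. The sketch below assumes a naive Bayes model with one Gaussian per feature and class (not a Gaussian mixture and not a general Bayesian network), so a missing feature's likelihood factor integrates to 1 and can simply be skipped. The parameter values and names are made up for illustration.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def class_posteriors(x, params, priors):
    """Posterior class probabilities for a sample with missing entries (None).

    params : {label: [(mean, var), ...]} with one pair per feature
    priors : {label: prior probability}
    """
    log_post = {}
    for label, feats in params.items():
        lp = math.log(priors[label])
        for xi, (mean, var) in zip(x, feats):
            if xi is not None:  # a missing feature is marginalized out
                lp += log_gauss(xi, mean, var)
        log_post[label] = lp
    # Normalize in a numerically stable way.
    m = max(log_post.values())
    z = sum(math.exp(v - m) for v in log_post.values())
    return {label: math.exp(v - m) / z for label, v in log_post.items()}


# Made-up two-class, two-feature model.
params = {"normal": [(0.0, 1.0), (0.0, 1.0)], "fault": [(3.0, 1.0), (0.0, 1.0)]}
priors = {"normal": 0.5, "fault": 0.5}
post = class_posteriors((3.0, None), params, priors)
```

When every feature is missing, the posterior falls back to the priors, which is the behavior one would expect from exact marginalization.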
A MODEL IDENTIFICATION METHOD OF VIBRATING STRUCTURES FROM INCOMPLETE MODAL INFORMATION
15
Authors: 郑小平, 姚振汉, 蘧时胜 《Applied Mathematics and Mechanics(English Edition)》 SCIE EI 1995, No. 10, pp. 971-976 (6 pages)
Accurate mathematical models for complicated structures are very difficult to construct. The work presented here provides an identification method for estimating the mass, damping, and stiffness matrices of linear dynamical systems from incomplete experimental data. The mass, stiffness, and damping matrices are assumed to be real, symmetric, and positive definite. A partial set of experimental complex eigenvalues and corresponding eigenvectors is given. In the proposed method, the least squares algorithm is combined with the iteration technique to determine the system's identified matrices and corresponding design parameters. Several illustrative examples are presented to demonstrate the reliability of the proposed method. It is emphasized that the mass, damping, and stiffness matrices can be identified simultaneously.
Keywords: vibrating structures; model identification; incomplete experimental modal data; the least squares method; iteration technique
A deep neural network based surrogate model for damage identification in full-scale structures with incomplete noisy measurements
16
Authors: Tram BUI-NGOC, Duy-Khuong LY, Tam T TRUONG, Chanachai THONGCHOM, T. NGUYEN-THOI 《Frontiers of Structural and Civil Engineering》 SCIE EI CSCD 2024, No. 3, pp. 393-410 (18 pages)
The paper introduces a novel approach for detecting structural damage in full-scale structures using surrogate models generated from incomplete modal data and deep neural networks (DNNs). A significant challenge in this field is the limited availability of measurement data for full-scale structures, which is addressed in this paper by generating data sets using a reduced finite element (FE) model constructed with SAP2000 software and a MATLAB programming loop. The surrogate models are trained using response data obtained from the monitored structure through a limited number of measurement devices. The proposed approach involves training a single surrogate model that can quickly predict the location and severity of damage for all potential scenarios. To achieve the most generalized surrogate model, the study explores different types of layers and hyperparameters of the training algorithm and employs state-of-the-art techniques to avoid overfitting and to accelerate the training process. The approach's effectiveness, efficiency, and applicability are demonstrated by two numerical examples. The study also verifies the robustness of the proposed approach on data sets with sparse and noisy measured data. Overall, the proposed approach is a promising alternative to traditional approaches that rely on FE model updating and optimization algorithms, which can be computationally intensive. This approach also shows potential for broader applications in structural damage detection.
Keywords: vibration-based damage detection; deep neural network; full-scale structures; finite element model updating; noisy incomplete modal data
A visual analysis approach for data imputation via multi-party tabular data correlation strategies
17
Authors: Haiyang ZHU, Dongming HAN, Jiacheng PAN, Yating WEI, Yingchaojie FENG, Luoxuan WENG, Ketian MAO, Yuankai XING, Jianshu LV, Qiucheng WAN, Wei CHEN 《Frontiers of Information Technology & Electronic Engineering》 SCIE EI CSCD 2024, No. 3, pp. 398-414 (17 pages)
Data imputation is an essential pre-processing task for data governance, aimed at filling in incomplete data. However, conventional data imputation methods can only partly alleviate data incompleteness using isolated tabular data, and they fail to achieve the best balance between accuracy and efficiency. In this paper, we present a novel visual analysis approach for data imputation. We develop a multi-party tabular data association strategy that uses intelligent algorithms to identify similar columns and establish column correlations across multiple tables. Then, we perform the initial imputation of incomplete data using correlated data entries from other tables. Additionally, we develop a visual analysis system to refine data imputation candidates. Our interactive system combines the multi-party data imputation approach with expert knowledge, allowing for a better understanding of the relational structure of the data. This significantly enhances the accuracy and efficiency of data imputation, thereby enhancing the quality of data governance and the intrinsic value of data assets. Experimental validation and user surveys demonstrate that this method supports users in verifying and judging the associated columns and similar rows using their domain knowledge.
Keywords: data governance; data incompleteness; data imputation; data visualization; interactive visual analysis
Flexible Factor Model for Handling Missing Data in Supervised Learning
18
Authors: Andriette Bekker, Farzane Hashemi, Mohammad Arashi 《Communications in Mathematics and Statistics》 SCIE CSCD 2023, No. 2, pp. 477-501 (25 pages)
This paper presents an extension of the factor analysis model based on the normal mean-variance mixture of the Birnbaum-Saunders distribution in the presence of nonresponses and missing data. This model can be used as a powerful tool to model non-normal features observed in data, such as strongly skewed and heavy-tailed noise. Missing data may occur due to operator error or incomplete data capturing and therefore cannot be ignored in factor analysis modeling. We implement an EM-type algorithm for maximum likelihood estimation and propose single imputation of possible missing values under a missing at random mechanism. The potential and applicability of our proposed method are illustrated through analyzing both simulated and real datasets.
Keywords: Automobile dataset; Asymmetry; ECME algorithm; Factor analysis model; Heavy tails; incomplete data; Liver disorders dataset
Regression Analysis of Right-censored Failure Time Data with Missing Censoring Indicators
19
Authors: Ping Chen, Ren He, Jun-shan Shen, Jian-guo Sun 《Acta Mathematicae Applicatae Sinica》 SCIE CSCD 2009, No. 3, pp. 415-426 (12 pages)
This paper discusses regression analysis of right-censored failure time data when censoring indicators are missing for some subjects. Several methods have been developed for the analysis under different situations; in particular, Goetghebeur and Ryan considered the situation where both the failure time and the censoring time follow proportional hazards models marginally and developed an estimating equation approach. One limitation of their approach is that the two baseline hazard functions were assumed to be proportional to each other. We consider the same problem and present an efficient estimation procedure for regression parameters that does not require the proportionality assumption. An EM algorithm is developed and the method is evaluated by a simulation study, which indicates that the proposed methodology performs well for practical situations. An illustrative example is provided.
Keywords: Efficient estimation; EM algorithm; incomplete data; missing at random; Proportional hazards model
Tolerance Limits Under Gamma Mixtures: Application in Hydrology
20
Authors: JIAO Junjun, CHENG Weihu 《Journal of Systems Science & Complexity》 SCIE EI CSCD 2023, No. 3, pp. 1285-1301 (17 pages)
In this study, the authors propose upper tolerance limits for the gamma mixture distribution based on generalized fiducial inference, and an MCMC simulation is performed to sample from the generalized fiducial distributions. The simulation results and a real hydrological data example show that the proposed tolerance limits are more efficient.
Keywords: Gamma mixture distribution; generalized fiducial inference; incomplete data; latent variable; Markov chain Monte Carlo