3 articles found
QAR Data Imputation Using Generative Adversarial Network with Self-Attention Mechanism
1
Authors: Jingqi Zhao, Chuitian Rong, Xin Dang, Huabo Sun. Big Data Mining and Analytics (EI, CSCD), 2024, No. 1, pp. 12-28 (17 pages)
Quick Access Recorder (QAR), an important device for storing data from various flight parameters, contains a large amount of valuable data and comprehensively records the real state of the airline flight. However, the recorded data have certain missing values due to factors, such as weather and equipment anomalies. These missing values seriously affect the analysis of QAR data by aeronautical engineers, such as airline flight scenario reproduction and airline flight safety status assessment. Therefore, imputing missing values in the QAR data, which can further guarantee the flight safety of airlines, is crucial. QAR data also have multivariate, multiprocess, and temporal features. Therefore, we innovatively propose the imputation models A-AEGAN ("A" denotes attention mechanism, "AE" denotes autoencoder, and "GAN" denotes generative adversarial network) and SA-AEGAN ("SA" denotes self-attentive mechanism) for missing values of QAR data, which can be effectively applied to QAR data. Specifically, we apply an innovative generative adversarial network to impute missing values from QAR data. The improved gated recurrent unit is then introduced as the neural unit of GAN, which can successfully capture the temporal relationships in QAR data. In addition, we modify the basic structure of GAN by using an autoencoder as the generator and a recurrent neural network as the discriminator. The missing values in the QAR data are imputed by using the adversarial relationship between generator and discriminator. We introduce an attention mechanism in the autoencoder to further improve the capability of the proposed model to capture the features of QAR data. Attention mechanisms can maintain the correlation among QAR data and improve the capability of the model to impute missing data. Furthermore, we improve the proposed model by integrating a self-attention mechanism to further capture the relationship between different parameters within the QAR data. Experimental results on real datasets demonstrate that the model can reasonably impute the missing values in QAR data with excellent results.
Keywords: multivariate time series; data imputation; self-attention; Generative Adversarial Network (GAN)
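The abstract above does not say how generator output is combined with the recorded values. A common convention in generator-based imputation is to keep every observed reading and take the generator's reconstruction only where a value is missing. The sketch below illustrates that merge step alone, under that assumption. The function name `impute`, the use of `None` as the missing-value marker, and the sample numbers are all hypothetical, and the `generated` matrix stands in for the output of a trained generator.

```python
def impute(observed, generated):
    """Keep observed readings; fill gaps (None) from the generator's reconstruction.

    Both arguments are lists of rows (time steps) of equal shape.
    This merge rule is an assumption, not taken from the abstract.
    """
    return [
        [gen if obs is None else obs for obs, gen in zip(obs_row, gen_row)]
        for obs_row, gen_row in zip(observed, generated)
    ]


# Hypothetical two-parameter series with one gap per time step.
observed = [[1.0, None], [None, 4.0]]
# Stand-in for the reconstruction a trained generator would produce.
generated = [[0.9, 2.1], [3.2, 3.8]]

completed = impute(observed, generated)
```

Observed values pass through unchanged, so the generator can only affect the positions that were missing in the record.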
Fusion of Multi-Source Relational Data (面向多源关系数据的融合) Cited by: 10
2
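Authors: Ding Yue (丁玥), Wang Juan (王涓), Lu Wei (卢卫), Rong Chuitian (荣垂田), Du Xiaoyong (杜小勇). Scientia Sinica Informationis (《中国科学:信息科学》) (CSCD, Peking University Core), 2020, No. 5, pp. 649-661 (13 pages)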
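To address the problem of fusing relational data held in "information silos", this paper proposes and implements a basic framework for multi-source relational data fusion (MSF). The framework comprises three main modules: schema matching, entity alignment, and entity fusion. Schema matching targets the attribute alignment problem in multi-source relational data: combining multi-dimensional features of attribute values, it proposes an attribute alignment discovery mechanism based on the Hungarian algorithm, which achieves fast schema matching across multi-source relational data. Entity alignment links tuple pairs across the multi-source relations; by introducing a diversity sampling strategy and an entity feature extraction method, it improves the quality of entity alignment. Finally, the aligned entities are fused to provide a unified data view for data analysis. To verify the effectiveness and efficiency of MSF, the data fusion system DataPuzzle was implemented, and the proposed methods were validated on this system using real, publicly available data from multiple domains. The results show that the proposed methods achieve data fusion efficiently, with high recall and precision.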
Keywords: multi-source heterogeneous data; relational data; information silos; schema matching; entity alignment; data fusion
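The schema-matching step described above can be pictured as an assignment problem: each attribute of one source is paired with at most one attribute of the other so that total dissimilarity is minimal, which is what the Hungarian algorithm computes. The sketch below is a minimal illustration, not the MSF implementation. The single feature used (Jaccard overlap of attribute values), the function names, and the sample tables are assumptions made for the example; the abstract only says that multi-dimensional features of attribute values are combined.

```python
from math import inf


def jaccard(a, b):
    """Jaccard similarity of two value collections (an assumed attribute feature)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns a list where result[i] is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    u = [0.0] * (n + 1)          # row potentials
    v = [0.0] * (m + 1)          # column potentials
    p = [0] * (m + 1)            # p[j]: row matched to column j (1-based, 0 = free)
    way = [0] * (m + 1)          # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:       # reached a free column: augment
                break
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [0] * n
    for j in range(1, m + 1):
        if p[j]:
            match[p[j] - 1] = j - 1
    return match


def match_attributes(source_a, source_b):
    """Pair each attribute of source_a with one attribute of source_b.

    Assumes source_a has no more attributes than source_b.
    """
    names_a, names_b = list(source_a), list(source_b)
    cost = [[1.0 - jaccard(source_a[x], source_b[y]) for y in names_b]
            for x in names_a]
    return {names_a[i]: names_b[j] for i, j in enumerate(hungarian(cost))}


# Hypothetical sources: attribute name -> sample of attribute values.
source_a = {"name": ["alice", "bob", "carol"], "city": ["paris", "rome", "oslo"]}
source_b = {"town": ["rome", "oslo", "lima"],
            "full_name": ["bob", "carol", "dave"],
            "zip": ["75001", "00100"]}

alignment = match_attributes(source_a, source_b)
```

With this toy data the assignment pairs `name` with `full_name` and `city` with `town`, leaving `zip` unmatched. A practical matcher would also reject pairs whose similarity falls below some cutoff instead of forcing every attribute into a pair.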
String similarity join with different similarity thresholds based on novel indexing techniques Cited by: 2
3
Authors: Chuitian Rong, Yasin N. Silva, Chunqing Li. Frontiers of Computer Science (SCIE, EI, CSCD), 2017, No. 2, pp. 307-319 (13 pages)
String similarity join is an essential operation of many applications that need to find all similar string pairs from two given collections. A quantitative way to determine whether two strings are similar is to compute their similarity based on a certain similarity function. The string pairs with similarity above a certain threshold are regarded as results. The current approach to solving the similarity join problem is to use a unique threshold value. There are, however, several scenarios that require the support of multiple thresholds, for instance, when the dataset includes strings of various lengths. In this scenario, longer string pairs typically tolerate many more typos than shorter ones. Therefore, we proposed a solution for string similarity joins that supports different similarity thresholds in a single operator. In order to support different thresholds, we devised two novel indexing techniques: partition based indexing and similarity aware indexing. To utilize the new indices and improve the join performance, we proposed new filtering methods and index probing techniques. To the best of our knowledge, this is the first work that addresses this problem. Experimental results on real-world datasets show that our solution performs efficiently while providing a more flexible threshold specification.
Keywords: similarity join; similarity aware index; similarity thresholds
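To make the idea of per-pair thresholds concrete, the sketch below runs a naive nested-loop join in which the number of tolerated edits grows with string length. It shows only the join semantics and a basic length filter. It does not implement the partition based or similarity aware indexes from the paper. Edit distance as the similarity function, the rule in `max_edits` (one edit per five characters of the shorter string), and the sample strings are all assumptions made for the example.

```python
def edit_distance(s, t):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # delete from s
                           cur[j - 1] + 1,             # insert into s
                           prev[j - 1] + (cs != ct)))  # substitute or match
        prev = cur
    return prev[-1]


def max_edits(length):
    """Hypothetical threshold rule: one tolerated edit per five characters."""
    return length // 5


def similarity_join(left, right):
    """Return all (s, t) pairs whose edit distance is within their own threshold."""
    results = []
    for s in left:
        for t in right:
            tau = max_edits(min(len(s), len(t)))
            # Length filter: the distance is at least the length difference.
            if abs(len(s) - len(t)) > tau:
                continue
            if edit_distance(s, t) <= tau:
                results.append((s, t))
    return results


left = ["cat", "similarity", "thresholds"]
right = ["cut", "similarty", "threshold"]

pairs = similarity_join(left, right)
```

Here the short pair `cat`/`cut` is rejected because three-character strings get a threshold of zero edits, while the longer pairs are accepted with one edit each. The nested loop compares every pair, which is the cost that filtering and indexing techniques aim to avoid.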