
Gas Station Vehicle Plate Defect Data Filling Based on Truth Discovery
Abstract  Owing to imperfect data acquisition equipment and unreliable data transmission, gas station vehicle fueling data often suffer from loss and errors. These problems reduce the completeness of the fueling data and seriously hamper subsequent data analysis. Although many algorithms can handle missing values in continuous numerical data, they are not suitable for discrete categorical data such as vehicle plates. This paper therefore proposes a missing-value filling framework based on an improved TruthFinder algorithm. Building on truth discovery and taking into account how similarity is computed for discrete data, the framework improves the original algorithm's model for computing the support of a data value. In experiments on a real gas station vehicle dataset, accuracy improved by 7% and 23% over the original algorithm and the more general Voting algorithm, respectively. The method partially solves the problem of filling missing values in multi-source discrete data such as gas station vehicle fueling data, and greatly improves the usability of such data.
Authors  Peng Xinliang; Cheng Li; Wang Yi; Ma Bo; Zhao Fan; Zhou Xi (The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, Xinjiang, China; University of the Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, Xinjiang, China)
Source  Computer Applications and Software (《计算机应用与软件》), PKU Core Journal, 2019, No. 8, pp. 41-46, 74 (7 pages)
Funding  2017 "Tianshan Cedar Plan" project (2017XS05); Xinjiang Uygur Autonomous Region 13th Five-Year Major Special Project (2016A03007-2)
Keywords  Data cleaning; Gas station data; Defect data filling; Truth discovery
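
The abstract describes a TruthFinder-style framework in which similar discrete values lend support to one another. The following is a minimal Python sketch of that general idea, not the authors' implementation: the similarity measure (`difflib.SequenceMatcher`), the parameters `rho`, `gamma`, `base_sim`, and `t0`, the source names, and the sample plates are all illustrative assumptions. A plain majority `vote` is included only as the kind of baseline the abstract compares against.

```python
import math
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; an assumed stand-in for the paper's measure."""
    return SequenceMatcher(None, a, b).ratio()


def vote(claims):
    """Baseline: pick the most frequently claimed value for each object."""
    return {
        obj: Counter(sorted(by_src.values())).most_common(1)[0][0]
        for obj, by_src in claims.items()
    }


def truth_finder(claims, rho=0.5, gamma=0.3, base_sim=0.5, t0=0.8, iters=20):
    """claims: {object_id: {source_id: value}} -> {object_id: chosen value}.

    A source that recorded nothing for an object is simply absent from
    that object's dict; the chosen value is what would fill the gap.
    """
    sources = {s for by_src in claims.values() for s in by_src}
    trust = {s: t0 for s in sources}  # initial source trustworthiness
    conf = {}
    for _ in range(iters):
        conf = {}
        for obj, by_src in claims.items():
            # sigma[v]: summed trust scores of the sources claiming v
            sigma = defaultdict(float)
            for src, val in by_src.items():
                sigma[val] += -math.log(1.0 - trust[src])
            for val in sigma:
                # Similar values add support; dissimilar ones subtract it.
                adjusted = sigma[val] + rho * sum(
                    sigma[other] * (similarity(other, val) - base_sim)
                    for other in sigma
                    if other != val
                )
                conf[(obj, val)] = 1.0 / (1.0 + math.exp(-gamma * adjusted))
        # A source's trust is the mean confidence of the values it claimed.
        for s in sources:
            own = [conf[(o, c[s])] for o, c in claims.items() if s in c]
            trust[s] = min(sum(own) / len(own), 0.999)  # keep log() finite
    return {
        obj: max(sorted(set(by_src.values())), key=lambda v: conf[(obj, v)])
        for obj, by_src in claims.items()
    }


# Hypothetical readings of the same fueling transactions from three sources.
claims = {
    "txn-1": {"cam_a": "新A12345", "cam_b": "新A12345", "pos": "新A1234S"},
    "txn-2": {"cam_a": "新B67890", "cam_b": "新B6789O", "pos": "新B67890"},
    "txn-3": {"cam_a": "新C13579", "pos": "新C13579"},  # cam_b missed this one
}
filled = truth_finder(claims)
baseline = vote(claims)
```

In this toy input both methods agree; the similarity term matters when no value has a clear majority, because near-identical readings then reinforce each other instead of competing as unrelated candidates.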