Abstract
To improve the detection and filtering of massive data, a data filtering method based on a probabilistic mathematical model is proposed. A statistical feature analysis model for big data filtering is built using descriptive statistical analysis. Higher-order cumulants are used to design the probability density feature statistics for data filtering, and fuzzy mathematical reasoning is used to analyze the detection statistics. In a massive-data environment, regression analysis is performed on the probability density of the detection statistic distribution, and data filtering is realized through threshold testing and threshold decision. Test results indicate that the method filters big data with good accuracy and that the mathematical model shows good reliability and convergence.
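The abstract names higher-order cumulants and a threshold decision but gives no formulas, so the following is only a minimal sketch of that general idea and not the authors' model. It assumes the statistic is the sample fourth-order cumulant of a fixed-size window, and that a window is filtered out when the absolute value of that statistic exceeds a threshold. The function names, the window size, and the threshold value are all hypothetical.

```python
from statistics import fmean


def fourth_order_cumulant(xs):
    """Sample fourth-order cumulant of xs: m4 - 3 * m2**2 on centered data."""
    mu = fmean(xs)
    centered = [x - mu for x in xs]
    m2 = fmean([c * c for c in centered])   # second central moment
    m4 = fmean([c ** 4 for c in centered])  # fourth central moment
    return m4 - 3.0 * m2 * m2


def filter_windows(data, window, threshold):
    """Keep non-overlapping windows whose |cumulant| is at or below threshold.

    Assumption: a large fourth-order cumulant marks a window to discard.
    Trailing samples that do not fill a whole window are ignored.
    """
    kept = []
    for start in range(0, len(data) - window + 1, window):
        chunk = data[start:start + window]
        if abs(fourth_order_cumulant(chunk)) <= threshold:
            kept.append(chunk)
    return kept


if __name__ == "__main__":
    regular = [1, -1] * 4        # evenly spread values
    spiky = [0] * 7 + [8]        # one isolated spike
    print(filter_windows(regular + spiky, window=8, threshold=10.0))
```

In this toy input the window containing the isolated spike has a much larger cumulant than the regular window, so only the regular window is kept. The fuzzy reasoning and regression steps mentioned in the abstract are not represented here.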
Authors
汪苗苗
焦学磊
Wang Miaomiao; Jiao Xuelei (Information Engineering, Sichuan Technology & Business College, Dujiangyan, Sichuan 611830, China; State Owned Assets Management Office, Sichuan Vocational College of Finance and Economics, Chengdu, Sichuan 610101, China)
Source
《科技通报》
2019, No. 6, pp. 20-23, 57 (5 pages in total)
Bulletin of Science and Technology
Keywords
probabilistic mathematical model
data filtering
regression analysis
test statistics