Big Data Valuation Algorithm (cited by: 2)
Abstract With the rapid development of information technology, the volume of generated data is growing exponentially. "Big data" has become one of the most frequently used technical terms in computing, and it has gradually become a commodity name as well. Whether from the perspective of academic research or of data trading, evaluating the usability of big data sets is a new problem. This paper proposes a big data usability evaluation model to provide a reference for academia and for the data circulation field. Building on the 4V characteristics of big data (Volume, Variety, Velocity, Value), the distribution of the 4V characteristics of sample data is counted segment by segment, which yields a probability model of big data characteristics based on piecewise distributions and a weighted usability evaluation model. The paper also proposes an algorithm for block sampling of big data and an algorithm for estimating the weighting coefficient of each characteristic in the evaluation model. The application of the proposed model and algorithms is demonstrated on the usability evaluation requirements of video big data. The model can be used to evaluate data for data science experiments and to price data sets in big data trading markets. For practical evaluation work, solutions are given for standardizing (commercializing) data sets and for determining evaluation benchmarks. The application case supports the proposed model and further tests its feasibility.
Authors ZHAO Hui-qun (赵会群); WU Kai-feng (吴凯锋) (College of Computer Science and Technology, North China University of Technology, Beijing 100144, China; Beijing Key Laboratory of Large-scale Stream Data Integration and Analysis Technology, North China University of Technology, Beijing 100144, China)
Source Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2020, No. 9, pp. 110-116 (7 pages)
Funding National Natural Science Foundation of China (61672041).
Keywords Big data availability evaluation; Probability model; Big data blocking algorithm; Video big data
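
The abstract names three building blocks: block sampling, a piecewise (segment-wise) distribution of each 4V characteristic, and a weighted evaluation score. The record does not include the paper's actual algorithms, so the following is only a minimal sketch of how such pieces could fit together. All function names, the contiguous-block sampling strategy, and the weighted-average scoring rule are assumptions made for illustration, not the authors' method.

```python
import random
from bisect import bisect_right


def block_sample(records, block_size, num_blocks, seed=0):
    """Split records into contiguous blocks and return the rows of a
    random subset of blocks (hypothetical sampling strategy)."""
    blocks = [records[i:i + block_size]
              for i in range(0, len(records), block_size)]
    rng = random.Random(seed)
    chosen = rng.sample(blocks, min(num_blocks, len(blocks)))
    return [row for block in chosen for row in block]


def segment_probabilities(values, edges):
    """Empirical probability of each segment [edges[k], edges[k+1]).
    The last segment also includes its right edge; values outside
    [edges[0], edges[-1]] are ignored."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if edges[0] <= v <= edges[-1]:
            k = min(bisect_right(edges, v) - 1, len(counts) - 1)
            counts[k] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def weighted_score(scores, weights):
    """Weighted average of per-characteristic scores (assumed to lie in
    [0, 1]); the weights here are illustrative, not estimated."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Toy usage: sample 3 blocks of 10 rows from 100 numeric records,
# then compute the segment distribution of the sampled values.
records = list(range(100))
sample = block_sample(records, block_size=10, num_blocks=3, seed=42)
probs = segment_probabilities(sample, edges=[0, 25, 50, 75, 100])

# Made-up per-characteristic scores and equal weights for the 4V features.
scores = {"volume": 1.0, "variety": 0.5, "velocity": 0.5, "value": 0.0}
weights = {"volume": 1.0, "variety": 1.0, "velocity": 1.0, "value": 1.0}
overall = weighted_score(scores, weights)
```

In this sketch the weights are fixed by hand; the paper describes a separate algorithm for estimating them, which is not reproduced here.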