Parallel Similarity Join of Multi-core (多核的并行相似连接)
Abstract: A similarity join is an operation that, given a data set and a similarity function, measures the similarity between data items and finds all pairs whose similarity is not less than a given threshold. With the continuing development of the Internet, mobile applications, and other information technologies, data volumes are growing explosively, and analyzing massive data requires substantial computing power; similarity join has become one of the active topics in big data processing. The processing capacity of traditional single-core platforms can hardly meet the computational demands of massive data processing. To improve computational efficiency and performance, multi-threaded parallel programming on multi-core platforms, which exploits the advantages of multi-core architectures, has become a trend for low-cost personal parallel computing and for the development of multi-core technology. Therefore, to improve the efficiency of similarity join by making full use of the multi-core features of modern architectures and of multi-threading, an improved method for parallelizing similarity join is proposed. Experimental results show that the method greatly improves efficiency.
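Author: Feng Linjing (冯林静)
Affiliation: Tianjin Polytechnic University (天津工业大学)
Source: Computer Technology and Development (《计算机技术与发展》), 2017, No. 7, pp. 43-46, 50 (5 pages)
Funding: National Natural Science Foundation of China (61402329)
Keywords: multi-core; multi-thread; parallel; similarity join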
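
The definition in the abstract can be illustrated with a minimal sketch. This is not the method of the paper, whose details are not given in this record. It assumes records are sets of tokens, uses Jaccard similarity as the similarity function, and splits the candidate pairs across a pool of worker threads. The names `jaccard`, `join_chunk`, and `parallel_similarity_join` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (assumed measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def join_chunk(records, pairs, threshold):
    """Check one chunk of candidate index pairs against the threshold."""
    return [(i, j) for i, j in pairs
            if jaccard(records[i], records[j]) >= threshold]


def parallel_similarity_join(records, threshold, workers=4):
    """Return all index pairs (i, j), i < j, with similarity >= threshold."""
    pairs = list(combinations(range(len(records)), 2))
    # Round-robin split so each worker receives a similar number of pairs.
    chunks = [pairs[k::workers] for k in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(join_chunk, [records] * workers, chunks,
                         [threshold] * workers)
        return sorted(p for part in parts for p in part)


records = [{"a", "b", "c"}, {"a", "b", "d"}, {"x", "y"}, {"a", "b", "c", "d"}]
result = parallel_similarity_join(records, threshold=0.5)
print(result)
```

In standard CPython the global interpreter lock keeps threads from running this pure-Python work on several cores at once, so the sketch shows the partitioning structure rather than a real speedup; swapping in `ProcessPoolExecutor` (with the call placed under an `if __name__ == "__main__":` guard) is one way to use multiple cores.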
