[5] Apache H. What is Apache Hadoop?[EB/OL]. 2013-08-26[2016-04-13]. http://hadoop.apache.org.
[6] Dean J, Ghemawat S. MapReduce: Simplified data processing on large clusters[J]. Communications of the ACM, 2008, 51(1): 107-113.
[7] Zaharia M, Chowdhury M, Das T, et al. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing[C]//Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation. Berkeley, CA: USENIX Association, 2012: 141-146.
[8] Lublinsky B, Smith K T, Yakubovich A. Professional Hadoop solutions[M]. Birmingham: Wrox Press, 2013.
[9] Gartner Research Report. Magic quadrant for data quality tools[EB/OL]. [2016-04-12]. http://useready.com/wp-content/uploads/2013/07/Gartner-Data-Quality-2012.pdf.
[10] Gonzalez J E, Low Y, Gu H, et al. PowerGraph: Distributed graph-parallel computation on natural graphs[C]//Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation. Berkeley, CA: USENIX Association, 2012: 17-30.