Abstract
Every distributed system is subject to the constraints of the CAP theorem. In a data analysis system, the most important properties are efficiency and reliability, and data collection is the cornerstone of the whole system. This paper builds a real-time data collection system based on an improved Flume: data is collected through Flume, and a composite Channel is combined with Flume to raise collection efficiency while preserving the richness and reliability of the data sources. Experimental results show that the system's functions behave as expected and that using the composite Channel in Flume can improve collection efficiency.
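The abstract does not say how the composite Channel is built. As a rough illustration only, the sketch below assumes one plausible reading: a bounded in-memory buffer (fast, like Flume's memory channel) that spills to a file (durable, like Flume's file channel) once it is full. The class name `CompositeChannel`, its parameters, and the spill-file layout are hypothetical and are not taken from the paper or from the Flume API.

```python
import json
import os
import tempfile
from collections import deque


class CompositeChannel:
    """Hypothetical memory-first channel that spills to a file when full."""

    def __init__(self, memory_capacity, spill_path):
        self.memory_capacity = memory_capacity
        self.memory = deque()
        self.spill_path = spill_path
        self.read_offset = 0  # byte offset of the next unread spilled event
        self.spilled = 0      # spilled events that have not been taken yet
        open(spill_path, "wb").close()  # start with an empty spill file

    def put(self, event):
        # Use memory only while nothing is waiting on disk, so FIFO order holds.
        if self.spilled == 0 and len(self.memory) < self.memory_capacity:
            self.memory.append(event)
            return "memory"
        with open(self.spill_path, "ab") as f:
            f.write(json.dumps(event).encode("utf-8") + b"\n")
        self.spilled += 1
        return "file"

    def take(self):
        # Events in memory are always older than events on disk.
        if self.memory:
            return self.memory.popleft()
        if self.spilled == 0:
            return None
        with open(self.spill_path, "rb") as f:
            f.seek(self.read_offset)
            line = f.readline()
            self.read_offset = f.tell()
        self.spilled -= 1
        return json.loads(line.decode("utf-8"))


def demo():
    # Put four events into a channel whose memory holds two, then drain it.
    path = os.path.join(tempfile.mkdtemp(), "spill.log")
    channel = CompositeChannel(memory_capacity=2, spill_path=path)
    placements = [channel.put({"id": i}) for i in range(4)]
    drained = [channel.take()["id"] for _ in range(4)]
    return placements, drained
```

Calling `demo()` shows the trade-off the abstract points at: the first events stay on the fast in-memory path, later ones fall back to the slower but durable file path, and delivery order is preserved across both.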
Authors
ZHU Tao (朱涛); SUN Zhixin (孙知信); GONG Jing (宫婧)
Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu Province, 210023, China
Source
Science & Technology Information (《科技资讯》), 2021, No. 11, pp. 73-75, 79 (4 pages in total)