In the era of Big Data, a typical architecture for distributed real-time stream processing systems is the combination of Flume, Kafka, and Storm. As a distributed messaging system, Kafka offers horizontal scalability and high throughput, and it is widely deployed to address the speed mismatch between message producers and consumers. Kafka must receive data from producers quickly and deliver data to consumers quickly, so its performance is critical to the performance of the whole stream processing system. In this paper, we propose an improved design for real-time stream processing systems that focuses on Kafka's data loading process. We use Kafka cat to transfer data from the source directly to a Kafka topic, which reduces network transmission. We also use a memory file system to accelerate data loading, which addresses the bottleneck and performance problems caused by disk I/O. Extensive experiments are conducted to evaluate performance, and the results show the superiority of the improved design.
In this paper we describe how progressive download and adaptive streaming can be combined into a simple and efficient streaming framework. Based on the MPEG-4 file format (MP4), we use HTTP for transport and argue that these two components are sufficient for specifying an open streaming architecture. The client selects appropriate chunks of the MP4 file to transfer based on (1) the header information (i.e. the 'moov' box) in the first part of the file and (2) its own observation of network throughput. The framework is completely client driven, which allows for better server scalability and reduces signalling overhead. We discuss architecture and implementation issues such as complexity, interoperability and scalability, and compare the framework with 3GPP PSS Rel-6 adaptive streaming where appropriate. Measurements from a simple MP4/HTTP streaming client are presented, showing that appropriate chunks are selected and that increased reliability is achieved.
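The client-driven selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Representation` type, the bitrate ladder, and the 0.8 safety factor are all hypothetical, and the sketch assumes the available bitrates have already been read from the file header.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    """One encoded variant of the stream (hypothetical type for this sketch)."""
    name: str
    bitrate_bps: int


def estimate_throughput(bytes_received: int, seconds: float) -> float:
    """Return observed throughput in bits per second for one finished download."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_received * 8 / seconds


def select_representation(reps, throughput_bps, safety=0.8):
    """Pick the highest bitrate that fits within a fraction of the throughput.

    The safety factor is an assumption of this sketch, not a value from the
    paper. If nothing fits, the lowest bitrate is returned as a fallback.
    """
    budget = throughput_bps * safety
    ordered = sorted(reps, key=lambda r: r.bitrate_bps)
    chosen = ordered[0]
    for rep in ordered:
        if rep.bitrate_bps <= budget:
            chosen = rep
    return chosen


# Hypothetical bitrate ladder, as it might be derived from the 'moov' box.
LADDER = [
    Representation("low", 250_000),
    Representation("mid", 800_000),
    Representation("high", 2_000_000),
]

if __name__ == "__main__":
    observed = estimate_throughput(bytes_received=150_000, seconds=1.0)
    print(select_representation(LADDER, observed).name)
```

Because the decision is made entirely from client-side measurements, the server only has to answer plain HTTP requests, which is the scalability argument the abstract makes.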
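This paper gives a systematic introduction to the working principle and structure of live streaming systems based on the HTTP Live Streaming (HLS) protocol. A network data analyzer is used to study the interaction and transfer process between client and server in depth. The analysis reveals a redundancy problem when the technique is applied to live Internet TV: the video stream index file is downloaded repeatedly, which wastes network traffic to some extent and lowers transmission efficiency. To address this problem, an improved method called the flag method is proposed, in which a new tag is added to the .m3u8 file. The effect before and after the improvement is analyzed and compared quantitatively through calculation and experimental verification. The experimental results show that the method effectively reduces the traffic waste rate and improves transmission performance, and that it is highly feasible.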
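One way to make the redundancy concrete is to measure how many bytes a client spends re-downloading an index file that has not changed. The sketch below is only an illustration: the function name and the definition of "wasted" (a fetch whose body is identical to the previous fetch) are assumptions of this example, not the waste-rate metric or the flag method defined in the paper.

```python
def wasted_fraction(playlist_bodies):
    """Return the share of downloaded playlist bytes that were redundant.

    playlist_bodies: the .m3u8 bodies (bytes) in the order they were fetched.
    A fetch counts as redundant here when its body equals the previous one,
    which is an assumption made for this sketch.
    """
    total = 0
    wasted = 0
    previous = None
    for body in playlist_bodies:
        total += len(body)
        if body == previous:
            wasted += len(body)
        previous = body
    return wasted / total if total else 0.0
```

For example, if every second fetch returns an unchanged playlist of the same size, half of the index-file traffic is redundant under this definition.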
Funding: Supported by the Research Fund of National Key Laboratory of Computer Architecture under Grant No. CARCH201501 and the Open Project Program of the State Key Laboratory of Mathematical Engineering and Advanced Computing under Grant No. 2016A09.