Abstract: With the increasing popularity of solid-state lighting devices, Visible Light Communication (VLC) is globally recognized as an advanced and promising technology for short-range, high-speed, large-capacity wireless data transmission. In this paper, we propose a prototype real-time audio and video broadcast system using inexpensive, commercially available light-emitting diode (LED) lamps. Experimental results show that real-time, high-quality audio and video can be transmitted over a maximum distance of 3 m through proper layout of the LED sources and improved light concentration. A lighting model of the room environment is designed and simulated; it indicates a close relationship between the layout of the light sources and the distribution of illuminance.
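The link between lamp layout and illuminance mentioned in this abstract can be illustrated with the Lambertian point-source model commonly used in VLC studies. This is only a sketch under stated assumptions: the abstract does not say which lighting model the authors used, and the function names, the 60-degree half-power angle, and the geometry (lamps facing straight down, a horizontal receiving plane, no wall reflections) are all hypothetical.

```python
import math
from typing import Sequence, Tuple


def lambertian_order(half_angle_deg: float) -> float:
    """Lambertian order m derived from the LED half-power semi-angle."""
    return -math.log(2) / math.log(math.cos(math.radians(half_angle_deg)))


def illuminance(point: Tuple[float, float],
                leds: Sequence[Tuple[float, float]],
                height: float,
                i0: float,
                half_angle_deg: float = 60.0) -> float:
    """Direct illuminance (lux) at a point on a horizontal plane.

    Assumes every LED sits `height` metres above the plane and points
    straight down, so the emission and incidence angles are equal.
    `i0` is the on-axis luminous intensity of one LED in candela.
    """
    m = lambertian_order(half_angle_deg)
    total = 0.0
    for lx, ly in leds:
        d2 = (point[0] - lx) ** 2 + (point[1] - ly) ** 2 + height ** 2
        cos_angle = height / math.sqrt(d2)
        # E = I0 * cos^m(phi) * cos(psi) / d^2, with phi == psi here.
        total += i0 * cos_angle ** m * cos_angle / d2
    return total
```

Evaluating `illuminance` over a grid of points for different `leds` layouts shows how moving the lamps changes the illuminance distribution on the receiving plane.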
Abstract: In multimedia conferencing, audio processing is a basic capability and is subject to demanding real-time requirements. In this article, we categorize and analyze mixing schemes and provide several multipoint speech audio mixing schemes based on a weighted algorithm. They meet the practical demands of real-time multipoint speech mixing; the ASW and AEW schemes are especially recommended. By applying adaptive algorithms, the high-performance schemes we provide avoid the saturation operation widely used in multimedia processing, so no additional noise is added to the output. These adaptive algorithms have relatively low computational complexity and good perceived audio quality. The schemes are designed for parallel processing, can be easily implemented in hardware such as DSPs, and can be widely applied in multimedia conference systems.
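The idea of mixing without a saturation (hard clipping) step can be sketched as follows. This is a generic frame-level weighting illustration, not the ASW or AEW scheme, whose definitions the abstract does not give; the function name and the 16-bit PCM format are assumptions.

```python
from typing import List, Sequence

INT16_MAX = 32767
INT16_MIN = -32768


def mix_frame(frames: Sequence[Sequence[int]]) -> List[int]:
    """Mix equal-length 16-bit PCM frames without hard clipping."""
    if not frames:
        return []
    length = len(frames[0])
    if any(len(f) != length for f in frames):
        raise ValueError("all frames must have the same length")
    # Plain sample-by-sample sum of all speakers; it may overflow 16 bits.
    summed = [sum(f[i] for f in frames) for i in range(length)]
    peak = max((abs(s) for s in summed), default=0)
    # One weight per frame: 1.0 when the sum already fits, otherwise
    # just small enough to bring the loudest sample back into range.
    weight = 1.0 if peak <= INT16_MAX else INT16_MAX / peak
    return [int(s * weight) for s in summed]
```

Because the whole frame is scaled by one weight, the waveform shape is preserved, whereas clipping individual samples would distort it. A practical mixer would also smooth the weight across frames to avoid audible gain jumps.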
Funding: Supported by the National Natural Science Foundation of China (NSFC 60622110, 61471220, 91538107, 91638205), the National Basic Research Project of China (973, 2013CB329006), and GY22016058.
Abstract: With the popularity of smart handheld devices, mobile streaming video has multiplied global network traffic in recent years. Strong concern for users' quality of experience (QoE) has made rate adaptation methods very attractive. In this paper, we propose a two-phase rate adaptation strategy to improve users' real-time video QoE. First, to measure and assess video QoE, we provide a continuous QoE prediction engine modeled by a recurrent neural network (RNN). Unlike traditional QoE models, which consider the QoE-aware factors separately or incompletely, our RNN-QoE model accounts for three descriptive factors (video quality, rebuffering, and rate change) and reflects the impact of cognitive memory and recency. Second, video playback is separated into an initial startup phase and a steady playback phase, and we take a different optimization goal for each phase: the former aims to shorten the startup delay, while the latter improves video quality and reduces rebuffering. Simulation results show that RNN-QoE follows subjective QoE quite well, and that the proposed strategy effectively reduces the rebuffering caused by the mismatch between the requested video rates and the fluctuating throughput, attaining standout performance on real-time QoE compared with classical rate adaptation methods.
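The split into a startup phase and a steady playback phase can be sketched with a simple throughput rule. This is not the RNN-QoE-driven strategy of the paper, whose details are not in the abstract; the function name, the buffer threshold, and the safety factor are hypothetical values chosen only for illustration.

```python
from typing import Sequence


def choose_bitrate(bitrates_kbps: Sequence[int],
                   throughput_kbps: float,
                   buffer_s: float,
                   startup_buffer_s: float = 4.0,
                   safety: float = 0.8) -> int:
    """Pick the bitrate for the next video segment.

    Startup phase (buffer below the threshold): take the lowest rate so
    the buffer fills quickly and playback starts sooner.
    Steady phase: take the highest rate that fits under a discounted
    throughput estimate, which limits rebuffering when throughput drops.
    """
    ladder = sorted(bitrates_kbps)
    if not ladder:
        raise ValueError("bitrate ladder must not be empty")
    if buffer_s < startup_buffer_s:
        return ladder[0]
    budget = safety * throughput_kbps
    affordable = [r for r in ladder if r <= budget]
    return affordable[-1] if affordable else ladder[0]
```

A QoE-aware controller would replace the fixed safety factor with a predicted QoE score for each candidate rate.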
Abstract: Audio description (AD), unlike interlingual translation and interpretation, is subject to unique constraints as a spoken text. With the help of AD, educational videos on COVID-19 anti-virus measures are made accessible to the visually disadvantaged. In this study, a corpus of AD for COVID-19 educational videos is developed, named the "Audio Description Corpus of COVID-19 Educational Videos" (ADCCEV). Drawing on the Textual and Linguistic Audio Description Matrix (TLADM) model, this paper aims to identify the linguistic and textual idiosyncrasies of AD for COVID-19 response videos released by the New Zealand Government. The study finds that, linguistically, the AD script uses a mix of complete sentences and phrases, the majority in the present simple tense. Present participles and the "with" structure are used for brevity. Vocabulary is diverse, with simpler words used for animated explainers. Third-person pronouns are common in educational videos. Color words are a salient feature of AD: "yellow" denotes urgency, and "red" indicates importance, negativity, and hostility. Textually, coherence is achieved through intermodal components that align with the video's mood and style. AD style varies with the video's purpose, from informative to narrative or expressive.
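Abstract: During BIRTV2023, in the on-site interview room of Modern Television Technology (《现代电视技术》) at the China Media Group booth, this journal interviewed Jia Yiyang (贾毅阳), head of Audio for Video sales for professional audio at Sennheiser in mainland China, and Chu Haitao (储海涛), head of sales at Neumann in mainland China. The interview covered the two brands' product highlights, strengths, and market positioning. Cao Xuyang (曹徐洋): At this year's BIRTV exhibition, the Sennheiser and Neumann booths both displayed a large number of excellent products. Which of them are the key launches? Please introduce their main highlights.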
Abstract: Video data are composed of multimodal information streams, including visual, auditory, and textual streams. This paper therefore describes an approach to story segmentation for news video based on multimodal analysis. The proposed approach detects topic-caption frames and integrates them with silence clip detection results and shot segmentation results to locate news story boundaries. Integrating audio-visual features with text information overcomes the weakness of approaches that use only image analysis techniques. On test data with 135 400 frames, detecting the boundaries between news stories yields an accuracy rate of 85.8% and a recall rate of 97.5%. The experimental results show that the approach is valid and robust.
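The two reported rates can be computed from detected and reference boundaries as sketched below. Reading "accuracy rate" as precision is an assumption, as is the frame tolerance used for matching; the abstract states neither, and the function name is hypothetical.

```python
from typing import Iterable, Tuple


def boundary_scores(detected: Iterable[int],
                    truth: Iterable[int],
                    tolerance: int = 0) -> Tuple[float, float]:
    """Return (precision, recall) for story boundaries given as frame indices."""
    found = sorted(set(detected))
    remaining = sorted(set(truth))
    hits = 0
    for d in found:
        # Match each detection to at most one still-unmatched true
        # boundary that lies within the tolerance window.
        match = next((t for t in remaining if abs(t - d) <= tolerance), None)
        if match is not None:
            remaining.remove(match)
            hits += 1
    n_truth = hits + len(remaining)
    precision = hits / len(found) if found else 0.0
    recall = hits / n_truth if n_truth else 0.0
    return precision, recall
```

A high recall with a lower precision, as in the abstract, means that nearly all true boundaries are found at the cost of some false detections.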
Funding: Supported by the Science Item of National Power Company (No. SPKJ016-071).
Abstract: A schema for content-based analysis of broadcast news video is presented. First, we separate commercials from news using audiovisual features. Then, we automatically organize news programs into a content hierarchy at various levels of abstraction by effectively integrating the video, audio, and text data available from the news programs. Based on these news video structure and content analysis technologies, a TV news video library is generated, from which users can retrieve specific news stories according to their needs.
Funding: Supported by the Integration and Application Project of Meteorological Key Technology of China Meteorological Administration (CMAGJ2012M30) and the Technology Development Projects of Tai'an Science and Technology Bureau in 2010 (201002045) and 2011
Abstract: An audio and video network monitoring system for weather modification operations, which transmits information over 3G, ADSL, and the Internet, has been developed and applied in the weather modification operations of Tai'an City. The all-in-one 3G audio and video network unit tightly integrates all front-end devices used for audio and video collection, communication, power supply, and information storage. Its advantages include wireless video transmission, clear two-way voice intercom with the command center, waterproof and dustproof construction, simple operation, good portability, and long working hours. The compressed stream is transmitted with dynamic bandwidth, and its rate varies from 32 kbps to 4 Mbps depending on network conditions. The system has a forwarding mode: monitoring information from each front-end monitoring point is transmitted to the command center server over 3G/ADSL, the server encodes and decodes it again, and back-end users then request images from the server. This addresses 3G network stoppages caused by many users requesting front-end video at the same time. In addition, the system has been applied in surface weather modification operations in Tai'an City and has contributed greatly to transmitting operation orders in real time; monitoring, standardizing, and recording the operating process; and improving operating safety.
Abstract: With the rapid development of the Internet around the world, networks now transmit all kinds of information to people. Net news, also called cyber news, is affecting the way people express themselves in everyday English. A large number of cyber words, phrases, and even sentences that differ from conventional English have been formed and have become popular in the cyber world. This paper discusses the different markers of net news, taking Internet video news and Internet audio news as examples, so that readers can fully understand the properties of net news.
Abstract: In recent years, real-time video streaming has grown in popularity. The growing popularity of the Internet of Things (IoT) and other wireless heterogeneous networks mandates that network resources be carefully apportioned among versatile users in order to achieve the best Quality of Experience (QoE) and performance objectives. Most researchers have focused on Forward Error Correction (FEC) techniques when attempting to strike a balance between QoE and performance. However, as network capacity increases, performance degrades, impacting the live visual experience. Recently, Deep Learning (DL) algorithms have been successfully integrated with FEC to stream videos across multiple heterogeneous networks, but these algorithms need to be changed to improve the experience without worsening packet loss and delay. To address this challenge, this paper proposes a novel intelligent algorithm that streams video in multi-home heterogeneous networks based on network-centric characteristics. The proposed framework contains modules such as the Intelligent Content Extraction Module (ICEM), the Channel Status Monitor (CSM), and Adaptive FEC (AFEC). The framework adopts the Cognitive Learning-based Scheduling (CLS) Module, which works on the deep Reinforced Gated Recurrent Networks (RGRN) principle and embeds them along with the FEC to achieve better performance. The complete framework was developed using the Objective Modular Network Testbed in C++ (OMNET++), Internet networking (INET), and Python 3.10, with Keras as the front end and TensorFlow 2.10 as the back end. In extensive experiments, the proposed model outperforms other existing intelligent models in improving the QoE and minimizing the End-to-End Delay (EED), while maintaining the highest accuracy (98%) and a lower Root Mean Square Error (RMSE) value of 0.001.
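The basic principle behind FEC, sending redundancy so that a lost packet can be rebuilt without retransmission, can be shown with single XOR parity. This is a textbook illustration, not the AFEC module of the paper, whose coding scheme the abstract does not describe; the function names are hypothetical and all packets are assumed to have equal length.

```python
from typing import List, Optional


def xor_parity(packets: List[bytes]) -> bytes:
    """Build one parity packet as the byte-wise XOR of equal-length packets."""
    if not packets:
        raise ValueError("need at least one packet")
    size = len(packets[0])
    if any(len(p) != size for p in packets):
        raise ValueError("packets must have equal length")
    parity = bytearray(size)
    for p in packets:
        for i, b in enumerate(p):
            parity[i] ^= b
    return bytes(parity)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Restore at most one missing packet (marked None) using the parity packet."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one loss")
    present = [p for p in received if p is not None]
    if not missing:
        return present
    # XOR of the surviving packets and the parity equals the lost packet.
    rebuilt = xor_parity(present + [parity])
    return [rebuilt if p is None else p for p in received]
```

An adaptive scheme would vary how many parity packets protect each group according to the measured channel loss, trading bandwidth overhead against recovery capability.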