Depression is a common mental health disorder.With current depression detection methods,specialized physicians often engage in conversations and physiological examinations based on standardized scales as auxiliary mea...Depression is a common mental health disorder.With current depression detection methods,specialized physicians often engage in conversations and physiological examinations based on standardized scales as auxiliary measures for depression assessment.Non-biological markers-typically classified as verbal or non-verbal and deemed crucial evaluation criteria for depression-have not been effectively utilized.Specialized physicians usually require extensive training and experience to capture changes in these features.Advancements in deep learning technology have provided technical support for capturing non-biological markers.Several researchers have proposed automatic depression estimation(ADE)systems based on sounds and videos to assist physicians in capturing these features and conducting depression screening.This article summarizes commonly used public datasets and recent research on audio-and video-based ADE based on three perspectives:Datasets,deficiencies in existing research,and future development directions.展开更多
Audio description(AD),unlike interlingual translation and interpretation,is subject to unique constraints as a spoken text.Facilitated by AD,educational videos on COVID-19 anti-virus measures are made accessible to th...Audio description(AD),unlike interlingual translation and interpretation,is subject to unique constraints as a spoken text.Facilitated by AD,educational videos on COVID-19 anti-virus measures are made accessible to the visually disadvantaged.In this study,a corpus of AD of COVID-19 educational videos is developed,named“Audio Description Corpus of COVID-19 Educational Videos”(ADCCEV).Drawing on the model of Textual and Linguistic Audio Description Matrix(TLADM),this paper aims to identify the linguistic and textual idiosyncrasies of AD themed on COVID-19 response released by the New Zealand Government.This study finds that linguistically,the AD script uses a mix of complete sentences and phrases,the majority being in Present Simple tense.Present participles and the“with”structure are used for brevity.Vocabulary is diverse,with simpler words for animated explainers.Third-person pronouns are common in educational videos.Color words are a salient feature of AD,where“yellow”denotes urgency,and“red”indicates importance,negativity,and hostility.On textual idiosyncrasies,coherence is achieved through intermodal components that align with the video’s mood and style.AD style varies depending on the video’s purpose,from informative to narrative or expressive.展开更多
BIRTV2023期间,在中央广播电视总台展台《现代电视技术》现场访谈间,本刊对森海塞尔中国内地地区专业音频Audio for Video销售负责人贾毅阳以及诺音曼中国内地地区销售负责人储海涛进行了采访,采访围绕两个品牌的产品亮点、优势及市场...BIRTV2023期间,在中央广播电视总台展台《现代电视技术》现场访谈间,本刊对森海塞尔中国内地地区专业音频Audio for Video销售负责人贾毅阳以及诺音曼中国内地地区销售负责人储海涛进行了采访,采访围绕两个品牌的产品亮点、优势及市场定位等话题展开。曹徐洋:在今年的BIRTV展会上,森海塞尔和诺音曼的展台都展出了大量优秀的产品,这些产品里有哪些是重点推出的?请介绍一下它们的主要亮点。展开更多
Important in many different sectors of the industry, the determination of stream velocity has become more and more important due to measurements precision necessity, in order to determine the right production rates, d...Important in many different sectors of the industry, the determination of stream velocity has become more and more important due to measurements precision necessity, in order to determine the right production rates, determine the volumetric production of undesired fluid, establish automated controls based on these measurements avoiding over-flooding or over-production, guaranteeing accurate predictive maintenance, etc. Difficulties being faced have been the determination of the velocity of specific fluids embedded in some others, for example, determining the gas bubbles stream velocity flowing throughout liquid fluid phase. Although different and already applicable methods have been researched and already implemented within the industry, a non-intrusive automated way of providing those stream velocities has its importance, and may have a huge impact in projects budget. Knowing the importance of its determination, this developed script uses a methodology of breaking-down real-time videos media into frame images, analyzing by pixel correlations possible superposition matches for further gas bubbles stream velocity estimation. In raw sense, the script bases itself in functions and procedures already available in MatLab, which can be used for image processing and treatments, allowing the methodology to be implemented. Its accuracy after the running test was of around 97% (ninety-seven percent);the raw source code with comments had almost 3000 (three thousand) characters;and the hardware placed for running the code was an Intel Core Duo 2.13 [Ghz] and 2 [Gb] RAM memory capable workstation. Even showing good results, it could be stated that just the end point correlations were actually getting to the final solution. So that, making use of self-learning functions or neural network, one could surely enhance the capability of the application to be run in real-time without getting exhaust by iterative loops.展开更多
With the increasing popularity of solid sate lighting devices, Visible Light Communication (VLC) is globally recognized as an advanced and promising technology to realize short-range, high speed as well as large capac...With the increasing popularity of solid sate lighting devices, Visible Light Communication (VLC) is globally recognized as an advanced and promising technology to realize short-range, high speed as well as large capacity wireless data transmission. In this paper, we propose a prototype of real-time audio and video broadcast system using inexpensive commercially available light emitting diode (LED) lamps. Experimental results show that real-time high quality audio and video with the maximum distance of 3 m can be achieved through proper layout of LED sources and improvement of concentration effects. Lighting model within room environment is designed and simulated which indicates close relationship between layout of light sources and distribution of illuminance.展开更多
Video data are composed of multimodal information streams including visual, auditory and textual streams, so an approach of story segmentation for news video using multimodal analysis is described in this paper. The p...Video data are composed of multimodal information streams including visual, auditory and textual streams, so an approach of story segmentation for news video using multimodal analysis is described in this paper. The proposed approach detects the topic-caption frames, and integrates them with silence clips detection results, as well as shot segmentation results to locate the news story boundaries. The integration of audio-visual features and text information overcomes the weakness of the approach using only image analysis techniques. On test data with 135 400 frames, when the boundaries between news stories are detected, the accuracy rate 85.8% and the recall rate 97.5% are obtained. The experimental results show the approach is valid and robust.展开更多
A schema for content-based analysis of broadcast news video is presented. First, we separate commercials from news using audiovisual features. Then, we automatically organize news programs into a content hierarchy at ...A schema for content-based analysis of broadcast news video is presented. First, we separate commercials from news using audiovisual features. Then, we automatically organize news programs into a content hierarchy at various levels of abstraction via effective integration of video, audio, and text data available from the news programs. Based on these news video structure and content analysis technologies, a TV news video Library is generated, from which users can retrieve definite news story according to their demands.展开更多
An audio and video network monitoring system for weather modification operation transmitting information by 3G, ADSL and Internet has been developed and applied in weather modification operation of Tai'an City. The a...An audio and video network monitoring system for weather modification operation transmitting information by 3G, ADSL and Internet has been developed and applied in weather modification operation of Tai'an City. The all-in-one machine of 3G audio and video network highly integrates all front-end devices used for audio and video collection, communication, power supply and information storage, and has advantages of wireless video transmission, clear two-way voice intercom with the command center, waterproof and dustproof function, simple operation, good portability, and long working hours. Compression code of the system is transmitted by dynamic bandwidth, and compression rate varies from 32 kbps to 4 Mbps under different network conditions. This system has forwarding mode, that is, monitoring information from each front-end monitoring point is trans- mitted to the server of the command center by 3G/ADSL, and the server codes'and decodes again, then beck-end users call images from the serv- er, which can address 3G network stoppage caused by many users calling front-end video at the same time. In addition, the system has been ap- plied in surface weather modification operation of Tai'an City, and has made a great contribution to transmitting operation orders in real time, monitoring, standardizing and recording operating process, and improving operating safety.展开更多
The alpha stable self-similar stochastic process has been proved an effective model for high variable data traffic. A deep insight into some special issues and considerations on use of the process to model aggregated ...The alpha stable self-similar stochastic process has been proved an effective model for high variable data traffic. A deep insight into some special issues and considerations on use of the process to model aggregated VBR video traffic is made. Different methods to estimate stability parameter a and self-similar parameter H are compared. Processes to generate the linear fractional stable noise (LFSN) and the alpha stable random variables are provided. Model construction and the quantitative comparisons with fractional Brown motion (FBM) and real traffic are also examined. Open problems and future directions are also given with thoughtful discussions.展开更多
Object detection plays a vital role in the video surveillance systems.To enhance security,surveillance cameras are now installed in public areas such as traffic signals,roadways,retail malls,train stations,and banks.Ho...Object detection plays a vital role in the video surveillance systems.To enhance security,surveillance cameras are now installed in public areas such as traffic signals,roadways,retail malls,train stations,and banks.However,monitor-ing the video continually at a quicker pace is a challenging job.As a consequence,security cameras are useless and need human monitoring.The primary difficulty with video surveillance is identifying abnormalities such as thefts,accidents,crimes,or other unlawful actions.The anomalous action does not occur at a high-er rate than usual occurrences.To detect the object in a video,first we analyze the images pixel by pixel.In digital image processing,segmentation is the process of segregating the individual image parts into pixels.The performance of segmenta-tion is affected by irregular illumination and/or low illumination.These factors highly affect the real-time object detection process in the video surveillance sys-tem.In this paper,a modified ResNet model(M-Resnet)is proposed to enhance the image which is affected by insufficient light.Experimental results provide the comparison of existing method output and modification architecture of the ResNet model shows the considerable amount improvement in detection objects in the video stream.The proposed model shows better results in the metrics like preci-sion,recall,pixel accuracy,etc.,andfinds a reasonable improvement in the object detection.展开更多
With the rapid development of Internet around the world, network is transmitting all kinds of information to human beings nowadays. Net news, also called cyber news is affecting people’s expression of daily English. ...With the rapid development of Internet around the world, network is transmitting all kinds of information to human beings nowadays. Net news, also called cyber news is affecting people’s expression of daily English. A large number of cyber words, phrases even sentences, which are different from conventional English, are formed and become popular in the cyber world. This paper discusses different markers of net news by taking Internet video news and Internet audio news as examples so that the readers can fully understand the properties of net news.展开更多
Currently,worldwide industries and communities are concerned with building,expanding,and exploring the assets and resources found in the oceans and seas.More precisely,to analyze a stock,archaeology,and surveillance,s...Currently,worldwide industries and communities are concerned with building,expanding,and exploring the assets and resources found in the oceans and seas.More precisely,to analyze a stock,archaeology,and surveillance,sev-eral cameras are installed underseas to collect videos.However,on the other hand,these large size videos require a lot of time and memory for their processing to extract relevant information.Hence,to automate this manual procedure of video assessment,an accurate and efficient automated system is a greater necessity.From this perspective,we intend to present a complete framework solution for the task of video summarization and object detection in underwater videos.We employed a perceived motion energy(PME)method tofirst extract the keyframes followed by an object detection model approach namely YoloV3 to perform object detection in underwater videos.The issues of blurriness and low contrast in underwater images are also taken into account in the presented approach by applying the image enhancement method.Furthermore,the suggested framework of underwater video summarization and object detection has been evaluated on a publicly available brackish dataset.It is observed that the proposed framework shows good performance and hence ultimately assists several marine researchers or scientists related to thefield of underwater archaeology,stock assessment,and surveillance.展开更多
Video steganography plays an important role in secret communication that conceals a secret video in a cover video by perturbing the value of pixels in the cover frames.Imperceptibility is the first and foremost requir...Video steganography plays an important role in secret communication that conceals a secret video in a cover video by perturbing the value of pixels in the cover frames.Imperceptibility is the first and foremost requirement of any steganographic approach.Inspired by the fact that human eyes perceive pixel perturbation differently in different video areas,a novel effective and efficient Deeply‐Recursive Attention Network(DRANet)for video steganography to find suitable areas for information hiding via modelling spatio‐temporal attention is proposed.The DRANet mainly contains two important components,a Non‐Local Self‐Attention(NLSA)block and a Non‐Local Co‐Attention(NLCA)block.Specifically,the NLSA block can select the cover frame areas which are suitable for hiding by computing the correlations among inter‐and intra‐cover frames.The NLCA block aims to effectively produce the enhanced representations of the secret frames to enhance the robustness of the model and alleviate the influence of different areas in the secret video.Furthermore,the DRANet reduces the model parameters by performing similar operations on the different frames within an input video recursively.Experimental results show the proposed DRANet achieves better performance with fewer parameters than the state‐of‐the‐art competitors.展开更多
基金Supported by Shandong Province Key R and D Program,No.2021SFGC0504Shandong Provincial Natural Science Foundation,No.ZR2021MF079Science and Technology Development Plan of Jinan(Clinical Medicine Science and Technology Innovation Plan),No.202225054.
文摘Depression is a common mental health disorder.With current depression detection methods,specialized physicians often engage in conversations and physiological examinations based on standardized scales as auxiliary measures for depression assessment.Non-biological markers-typically classified as verbal or non-verbal and deemed crucial evaluation criteria for depression-have not been effectively utilized.Specialized physicians usually require extensive training and experience to capture changes in these features.Advancements in deep learning technology have provided technical support for capturing non-biological markers.Several researchers have proposed automatic depression estimation(ADE)systems based on sounds and videos to assist physicians in capturing these features and conducting depression screening.This article summarizes commonly used public datasets and recent research on audio-and video-based ADE based on three perspectives:Datasets,deficiencies in existing research,and future development directions.
文摘Audio description(AD),unlike interlingual translation and interpretation,is subject to unique constraints as a spoken text.Facilitated by AD,educational videos on COVID-19 anti-virus measures are made accessible to the visually disadvantaged.In this study,a corpus of AD of COVID-19 educational videos is developed,named“Audio Description Corpus of COVID-19 Educational Videos”(ADCCEV).Drawing on the model of Textual and Linguistic Audio Description Matrix(TLADM),this paper aims to identify the linguistic and textual idiosyncrasies of AD themed on COVID-19 response released by the New Zealand Government.This study finds that linguistically,the AD script uses a mix of complete sentences and phrases,the majority being in Present Simple tense.Present participles and the“with”structure are used for brevity.Vocabulary is diverse,with simpler words for animated explainers.Third-person pronouns are common in educational videos.Color words are a salient feature of AD,where“yellow”denotes urgency,and“red”indicates importance,negativity,and hostility.On textual idiosyncrasies,coherence is achieved through intermodal components that align with the video’s mood and style.AD style varies depending on the video’s purpose,from informative to narrative or expressive.
文摘BIRTV2023期间,在中央广播电视总台展台《现代电视技术》现场访谈间,本刊对森海塞尔中国内地地区专业音频Audio for Video销售负责人贾毅阳以及诺音曼中国内地地区销售负责人储海涛进行了采访,采访围绕两个品牌的产品亮点、优势及市场定位等话题展开。曹徐洋:在今年的BIRTV展会上,森海塞尔和诺音曼的展台都展出了大量优秀的产品,这些产品里有哪些是重点推出的?请介绍一下它们的主要亮点。
基金financial support from the Brazilian Federal Agency for Support and Evaluation of Graduate Education(Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior—CAPES,scholarship process no BEX 0506/15-0)the Brazilian National Agency of Petroleum,Natural Gas and Biofuels(Agencia Nacional do Petroleo,Gas Natural e Biocombustiveis—ANP),in cooperation with the Brazilian Financier of Studies and Projects(Financiadora de Estudos e Projetos—FINEP)the Brazilian Ministry of Science,Technology and Innovation(Ministério da Ciencia,Tecnologia e Inovacao—MCTI)through the ANP’s Human Resources Program of the State University of Sao Paulo(Universidade Estadual Paulista—UNESP)for the Oil and Gas Sector PRH-ANP/MCTI no 48(PRH48).
文摘Important in many different sectors of the industry, the determination of stream velocity has become more and more important due to measurements precision necessity, in order to determine the right production rates, determine the volumetric production of undesired fluid, establish automated controls based on these measurements avoiding over-flooding or over-production, guaranteeing accurate predictive maintenance, etc. Difficulties being faced have been the determination of the velocity of specific fluids embedded in some others, for example, determining the gas bubbles stream velocity flowing throughout liquid fluid phase. Although different and already applicable methods have been researched and already implemented within the industry, a non-intrusive automated way of providing those stream velocities has its importance, and may have a huge impact in projects budget. Knowing the importance of its determination, this developed script uses a methodology of breaking-down real-time videos media into frame images, analyzing by pixel correlations possible superposition matches for further gas bubbles stream velocity estimation. In raw sense, the script bases itself in functions and procedures already available in MatLab, which can be used for image processing and treatments, allowing the methodology to be implemented. Its accuracy after the running test was of around 97% (ninety-seven percent);the raw source code with comments had almost 3000 (three thousand) characters;and the hardware placed for running the code was an Intel Core Duo 2.13 [Ghz] and 2 [Gb] RAM memory capable workstation. Even showing good results, it could be stated that just the end point correlations were actually getting to the final solution. So that, making use of self-learning functions or neural network, one could surely enhance the capability of the application to be run in real-time without getting exhaust by iterative loops.
文摘With the increasing popularity of solid sate lighting devices, Visible Light Communication (VLC) is globally recognized as an advanced and promising technology to realize short-range, high speed as well as large capacity wireless data transmission. In this paper, we propose a prototype of real-time audio and video broadcast system using inexpensive commercially available light emitting diode (LED) lamps. Experimental results show that real-time high quality audio and video with the maximum distance of 3 m can be achieved through proper layout of LED sources and improvement of concentration effects. Lighting model within room environment is designed and simulated which indicates close relationship between layout of light sources and distribution of illuminance.
文摘Video data are composed of multimodal information streams including visual, auditory and textual streams, so an approach of story segmentation for news video using multimodal analysis is described in this paper. The proposed approach detects the topic-caption frames, and integrates them with silence clips detection results, as well as shot segmentation results to locate the news story boundaries. The integration of audio-visual features and text information overcomes the weakness of the approach using only image analysis techniques. On test data with 135 400 frames, when the boundaries between news stories are detected, the accuracy rate 85.8% and the recall rate 97.5% are obtained. The experimental results show the approach is valid and robust.
基金Supported by the Science Item of National Power Company( No.SPKJ0 16 -0 71)
文摘A schema for content-based analysis of broadcast news video is presented. First, we separate commercials from news using audiovisual features. Then, we automatically organize news programs into a content hierarchy at various levels of abstraction via effective integration of video, audio, and text data available from the news programs. Based on these news video structure and content analysis technologies, a TV news video Library is generated, from which users can retrieve definite news story according to their demands.
基金Supported by the Integration and Application Project of Meteorological Key Technology of China Meteorological Administration(CMAGJ2012M30) Technology Development Projects of Tai'an Science and Technology Bureau in 2010 (201002045) and 2011
文摘An audio and video network monitoring system for weather modification operation transmitting information by 3G, ADSL and Internet has been developed and applied in weather modification operation of Tai'an City. The all-in-one machine of 3G audio and video network highly integrates all front-end devices used for audio and video collection, communication, power supply and information storage, and has advantages of wireless video transmission, clear two-way voice intercom with the command center, waterproof and dustproof function, simple operation, good portability, and long working hours. Compression code of the system is transmitted by dynamic bandwidth, and compression rate varies from 32 kbps to 4 Mbps under different network conditions. This system has forwarding mode, that is, monitoring information from each front-end monitoring point is trans- mitted to the server of the command center by 3G/ADSL, and the server codes'and decodes again, then beck-end users call images from the serv- er, which can address 3G network stoppage caused by many users calling front-end video at the same time. In addition, the system has been ap- plied in surface weather modification operation of Tai'an City, and has made a great contribution to transmitting operation orders in real time, monitoring, standardizing and recording operating process, and improving operating safety.
文摘The alpha stable self-similar stochastic process has been proved an effective model for high variable data traffic. A deep insight into some special issues and considerations on use of the process to model aggregated VBR video traffic is made. Different methods to estimate stability parameter a and self-similar parameter H are compared. Processes to generate the linear fractional stable noise (LFSN) and the alpha stable random variables are provided. Model construction and the quantitative comparisons with fractional Brown motion (FBM) and real traffic are also examined. Open problems and future directions are also given with thoughtful discussions.
文摘Object detection plays a vital role in the video surveillance systems.To enhance security,surveillance cameras are now installed in public areas such as traffic signals,roadways,retail malls,train stations,and banks.However,monitor-ing the video continually at a quicker pace is a challenging job.As a consequence,security cameras are useless and need human monitoring.The primary difficulty with video surveillance is identifying abnormalities such as thefts,accidents,crimes,or other unlawful actions.The anomalous action does not occur at a high-er rate than usual occurrences.To detect the object in a video,first we analyze the images pixel by pixel.In digital image processing,segmentation is the process of segregating the individual image parts into pixels.The performance of segmenta-tion is affected by irregular illumination and/or low illumination.These factors highly affect the real-time object detection process in the video surveillance sys-tem.In this paper,a modified ResNet model(M-Resnet)is proposed to enhance the image which is affected by insufficient light.Experimental results provide the comparison of existing method output and modification architecture of the ResNet model shows the considerable amount improvement in detection objects in the video stream.The proposed model shows better results in the metrics like preci-sion,recall,pixel accuracy,etc.,andfinds a reasonable improvement in the object detection.
文摘With the rapid development of Internet around the world, network is transmitting all kinds of information to human beings nowadays. Net news, also called cyber news is affecting people’s expression of daily English. A large number of cyber words, phrases even sentences, which are different from conventional English, are formed and become popular in the cyber world. This paper discusses different markers of net news by taking Internet video news and Internet audio news as examples so that the readers can fully understand the properties of net news.
基金supported by the National Research Foundation of Korea(NRF)grant funded by the Korea government(MSIT)(No.2020R1G1A1099559).
文摘Currently,worldwide industries and communities are concerned with building,expanding,and exploring the assets and resources found in the oceans and seas.More precisely,to analyze a stock,archaeology,and surveillance,sev-eral cameras are installed underseas to collect videos.However,on the other hand,these large size videos require a lot of time and memory for their processing to extract relevant information.Hence,to automate this manual procedure of video assessment,an accurate and efficient automated system is a greater necessity.From this perspective,we intend to present a complete framework solution for the task of video summarization and object detection in underwater videos.We employed a perceived motion energy(PME)method tofirst extract the keyframes followed by an object detection model approach namely YoloV3 to perform object detection in underwater videos.The issues of blurriness and low contrast in underwater images are also taken into account in the presented approach by applying the image enhancement method.Furthermore,the suggested framework of underwater video summarization and object detection has been evaluated on a publicly available brackish dataset.It is observed that the proposed framework shows good performance and hence ultimately assists several marine researchers or scientists related to thefield of underwater archaeology,stock assessment,and surveillance.
基金supported in part by NSFC(62002320,U19B2043,61672456)the Key R&D Program of Zhejiang Province,China(2021C01119).
文摘Video steganography plays an important role in secret communication that conceals a secret video in a cover video by perturbing the value of pixels in the cover frames.Imperceptibility is the first and foremost requirement of any steganographic approach.Inspired by the fact that human eyes perceive pixel perturbation differently in different video areas,a novel effective and efficient Deeply‐Recursive Attention Network(DRANet)for video steganography to find suitable areas for information hiding via modelling spatio‐temporal attention is proposed.The DRANet mainly contains two important components,a Non‐Local Self‐Attention(NLSA)block and a Non‐Local Co‐Attention(NLCA)block.Specifically,the NLSA block can select the cover frame areas which are suitable for hiding by computing the correlations among inter‐and intra‐cover frames.The NLCA block aims to effectively produce the enhanced representations of the secret frames to enhance the robustness of the model and alleviate the influence of different areas in the secret video.Furthermore,the DRANet reduces the model parameters by performing similar operations on the different frames within an input video recursively.Experimental results show the proposed DRANet achieves better performance with fewer parameters than the state‐of‐the‐art competitors.