Video data are composed of multimodal information streams including visual, auditory and textual streams, so an approach of story segmentation for news video using multimodal analysis is described in this paper. The p...Video data are composed of multimodal information streams including visual, auditory and textual streams, so an approach of story segmentation for news video using multimodal analysis is described in this paper. The proposed approach detects the topic-caption frames, and integrates them with silence clips detection results, as well as shot segmentation results to locate the news story boundaries. The integration of audio-visual features and text information overcomes the weakness of the approach using only image analysis techniques. On test data with 135 400 frames, when the boundaries between news stories are detected, the accuracy rate 85.8% and the recall rate 97.5% are obtained. The experimental results show the approach is valid and robust.展开更多
This paper is dedicated to a thorough review on the audio-visual related translations from both home and abroad.In reviewing the foreign achievements on this specific field of translation studies it can shed some ligh...This paper is dedicated to a thorough review on the audio-visual related translations from both home and abroad.In reviewing the foreign achievements on this specific field of translation studies it can shed some lights on our national audio-visual practice and research.The review on the Chinese scholars’ audio-visual translation studies is to offer the potential developing direction and guidelines to the studies and aspects neglected as well.Based on the summary of relevant studies,possible topics for further studies are proposed.展开更多
Emotion recognition has become an important task of modern human-computer interac- tion. A multilayer boosted HMM ( MBHMM ) classifier for automatic audio-visual emotion recognition is presented in this paper. A mod...Emotion recognition has become an important task of modern human-computer interac- tion. A multilayer boosted HMM ( MBHMM ) classifier for automatic audio-visual emotion recognition is presented in this paper. A modified Baum-Welch algorithm is proposed for component HMM learn- ing and adaptive boosting (AdaBoost) is used to train ensemble classifiers for different layers (cues). Except for the first layer, the initial weights of training samples in current layer are decided by recognition results of the ensemble classifier in the upper layer. Thus the training procedure using current cue can focus more on the difficult samples according to the previous cue. Our MBHMM clas- sifier is combined by these ensemble classifiers and takes advantage of the complementary informa- tion from multiple cues and modalities. Experimental results on audio-visual emotion data collected in Wizard of Oz scenarios and labeled under two types of emotion category sets demonstrate that our approach is effective and promising.展开更多
February 10 (US Central Time), 2019, China National Peking Opera Company (CNPOC) and the Hubei Chime Bells National Chinese Orchestra presented a fantastic audio-visual performance of Chinese Peking Opera and Chinese ...February 10 (US Central Time), 2019, China National Peking Opera Company (CNPOC) and the Hubei Chime Bells National Chinese Orchestra presented a fantastic audio-visual performance of Chinese Peking Opera and Chinese chime bells for the American audience at the world s top-level Buntrock Hall at Symphony Center.展开更多
Mongolian audio-visual works are an important carrier of exploring the true significance to this national culture.This paper believes that the Mongolian people in Inner Mongolia constantly enhance the individual sense...Mongolian audio-visual works are an important carrier of exploring the true significance to this national culture.This paper believes that the Mongolian people in Inner Mongolia constantly enhance the individual sense of identity to the overall ethnic group through the influence of film and television and music,and on this basis constantly evolve a new culture in line with modern and contemporary life to further enhance their sense of belonging to the ethnic nation.展开更多
《民族音乐学》:Ethnomusicology,是美国民族音乐学会The Society for Ethnomusicology的官方期刊,是世界上创刊最早的专门刊载民族音乐学研究成果的学术期刊,代表当代民族音乐学及相关领域的研究水准和最新观点,极具学术性、权威性和...《民族音乐学》:Ethnomusicology,是美国民族音乐学会The Society for Ethnomusicology的官方期刊,是世界上创刊最早的专门刊载民族音乐学研究成果的学术期刊,代表当代民族音乐学及相关领域的研究水准和最新观点,极具学术性、权威性和前瞻性。其创刊主旨在于希望通过协会期刊的创办更好的促进民族音乐学在不同时期和不同文化背景下的研究、学习、交流以及表演,为拓展美国及世界其他地区的民族音乐学研究起到了推动作用。笔者透过对美国民族音乐学协会创办历史、Ethnomusicology期刊中专栏设置情况及Ethnomusicology网络期刊的创建情况的介绍及分析,为国内民族音乐学研究者提供一些新的研究视野。展开更多
The object-based scalable coding in MPEG-4 is investigated, and a prioritized transmission scheme of MPEG-4 audio-visual objects (AVOs) over the DiffServ network with the QoS guarantee is proposed. MPEG-4 AVOs are e...The object-based scalable coding in MPEG-4 is investigated, and a prioritized transmission scheme of MPEG-4 audio-visual objects (AVOs) over the DiffServ network with the QoS guarantee is proposed. MPEG-4 AVOs are extracted and classified into different groups according to their priority values and scalable layers (visual importance). These priority values are mapped to the 1P DiffServ per hop behaviors (PHB). This scheme can selectively discard packets with low importance, in order to avoid the network congestion. Simulation results show that the quality of received video can gracefully adapt to network state, as compared with the ‘best-effort' manner. Also, by allowing the content provider to define prioritization of each audio-visual object, the adaptive transmission of object-based scalable video can be customized based on the content.展开更多
In recent years,computing art has developed rapidly with the in-depth cross study of artificial intelligence generated con-tent(AIGC)and the main features of artworks.Audio-visual content generation has gradually been...In recent years,computing art has developed rapidly with the in-depth cross study of artificial intelligence generated con-tent(AIGC)and the main features of artworks.Audio-visual content generation has gradually been applied to various practical tasks,including video or game score,assisting artists in creation,art education and other aspects,which demonstrates a broad application pro-spect.In this paper,we introduce innovative achievements in audio-visual content generation from the perspective of visual art genera-tion and auditory art generation based on artificial intelligence(Al).We outline the development tendency of image and music datasets,visual and auditory content modelling,and related automatic generation systems.The objective and subjective evaluation of generated samples plays an important role in the measurement of algorithm performance.We provide a cogeneration mechanism of audio-visual content in multimodal tasks from image to music and display the construction of specific stylized datasets.There are still many new op-portunities and challenges in the field of audio-visual synesthesia generation,and we provide a comprehensive discussion on them.展开更多
著名的美国民族音乐学家蒂莫西·赖斯(Timothy Rice)为“牛津通识读本”(VERY SHORT INTRODUCTIONS)系列撰写的《民族音乐学》,是一部以学科观念、方法论深入学者实践和学术发展史的言简意赅的导论。该著分别从学科定义、学科简史...著名的美国民族音乐学家蒂莫西·赖斯(Timothy Rice)为“牛津通识读本”(VERY SHORT INTRODUCTIONS)系列撰写的《民族音乐学》,是一部以学科观念、方法论深入学者实践和学术发展史的言简意赅的导论。该著分别从学科定义、学科简史、研究方法论、音乐的性质、音乐与文化的关系、音乐家的个人研究、撰写音乐史、现代的民族音乐学,以及民族音乐学家的工作九个方面分章勾勒学科轮廓。作者以简洁的语言说明学术观点,使读者清晰并轻松地随其表述,走进丰富的民族音乐学世界。展开更多
文摘Video data are composed of multimodal information streams including visual, auditory and textual streams, so an approach of story segmentation for news video using multimodal analysis is described in this paper. The proposed approach detects the topic-caption frames, and integrates them with silence clips detection results, as well as shot segmentation results to locate the news story boundaries. The integration of audio-visual features and text information overcomes the weakness of the approach using only image analysis techniques. On test data with 135 400 frames, when the boundaries between news stories are detected, the accuracy rate 85.8% and the recall rate 97.5% are obtained. The experimental results show the approach is valid and robust.
文摘This paper is dedicated to a thorough review on the audio-visual related translations from both home and abroad.In reviewing the foreign achievements on this specific field of translation studies it can shed some lights on our national audio-visual practice and research.The review on the Chinese scholars’ audio-visual translation studies is to offer the potential developing direction and guidelines to the studies and aspects neglected as well.Based on the summary of relevant studies,possible topics for further studies are proposed.
基金Supported by the National Natural Science Foundation of China(60905006)the NSFC-Guangdong Joint Fund(U1035004)
文摘Emotion recognition has become an important task of modern human-computer interac- tion. A multilayer boosted HMM ( MBHMM ) classifier for automatic audio-visual emotion recognition is presented in this paper. A modified Baum-Welch algorithm is proposed for component HMM learn- ing and adaptive boosting (AdaBoost) is used to train ensemble classifiers for different layers (cues). Except for the first layer, the initial weights of training samples in current layer are decided by recognition results of the ensemble classifier in the upper layer. Thus the training procedure using current cue can focus more on the difficult samples according to the previous cue. Our MBHMM clas- sifier is combined by these ensemble classifiers and takes advantage of the complementary informa- tion from multiple cues and modalities. Experimental results on audio-visual emotion data collected in Wizard of Oz scenarios and labeled under two types of emotion category sets demonstrate that our approach is effective and promising.
文摘February 10 (US Central Time), 2019, China National Peking Opera Company (CNPOC) and the Hubei Chime Bells National Chinese Orchestra presented a fantastic audio-visual performance of Chinese Peking Opera and Chinese chime bells for the American audience at the world s top-level Buntrock Hall at Symphony Center.
基金This paper is the periodic research result of the research project:Basic Research Project of Beijing Institute of Graphic Communication:Research on the Artistic,Modern Communication and Publishing of Dian-shi Zhai Pictorial(1884-1898)(Serial Number Eb202008).
文摘Mongolian audio-visual works are an important carrier of exploring the true significance to this national culture.This paper believes that the Mongolian people in Inner Mongolia constantly enhance the individual sense of identity to the overall ethnic group through the influence of film and television and music,and on this basis constantly evolve a new culture in line with modern and contemporary life to further enhance their sense of belonging to the ethnic nation.
文摘《民族音乐学》:Ethnomusicology,是美国民族音乐学会The Society for Ethnomusicology的官方期刊,是世界上创刊最早的专门刊载民族音乐学研究成果的学术期刊,代表当代民族音乐学及相关领域的研究水准和最新观点,极具学术性、权威性和前瞻性。其创刊主旨在于希望通过协会期刊的创办更好的促进民族音乐学在不同时期和不同文化背景下的研究、学习、交流以及表演,为拓展美国及世界其他地区的民族音乐学研究起到了推动作用。笔者透过对美国民族音乐学协会创办历史、Ethnomusicology期刊中专栏设置情况及Ethnomusicology网络期刊的创建情况的介绍及分析,为国内民族音乐学研究者提供一些新的研究视野。
文摘The object-based scalable coding in MPEG-4 is investigated, and a prioritized transmission scheme of MPEG-4 audio-visual objects (AVOs) over the DiffServ network with the QoS guarantee is proposed. MPEG-4 AVOs are extracted and classified into different groups according to their priority values and scalable layers (visual importance). These priority values are mapped to the 1P DiffServ per hop behaviors (PHB). This scheme can selectively discard packets with low importance, in order to avoid the network congestion. Simulation results show that the quality of received video can gracefully adapt to network state, as compared with the ‘best-effort' manner. Also, by allowing the content provider to define prioritization of each audio-visual object, the adaptive transmission of object-based scalable video can be customized based on the content.
基金This work was supported by National Natural Science Foundation of China(No.62176006)the National Key Research and Development Program of China(No.2022YFF0902302).
文摘In recent years,computing art has developed rapidly with the in-depth cross study of artificial intelligence generated con-tent(AIGC)and the main features of artworks.Audio-visual content generation has gradually been applied to various practical tasks,including video or game score,assisting artists in creation,art education and other aspects,which demonstrates a broad application pro-spect.In this paper,we introduce innovative achievements in audio-visual content generation from the perspective of visual art genera-tion and auditory art generation based on artificial intelligence(Al).We outline the development tendency of image and music datasets,visual and auditory content modelling,and related automatic generation systems.The objective and subjective evaluation of generated samples plays an important role in the measurement of algorithm performance.We provide a cogeneration mechanism of audio-visual content in multimodal tasks from image to music and display the construction of specific stylized datasets.There are still many new op-portunities and challenges in the field of audio-visual synesthesia generation,and we provide a comprehensive discussion on them.
文摘著名的美国民族音乐学家蒂莫西·赖斯(Timothy Rice)为“牛津通识读本”(VERY SHORT INTRODUCTIONS)系列撰写的《民族音乐学》,是一部以学科观念、方法论深入学者实践和学术发展史的言简意赅的导论。该著分别从学科定义、学科简史、研究方法论、音乐的性质、音乐与文化的关系、音乐家的个人研究、撰写音乐史、现代的民族音乐学,以及民族音乐学家的工作九个方面分章勾勒学科轮廓。作者以简洁的语言说明学术观点,使读者清晰并轻松地随其表述,走进丰富的民族音乐学世界。