Funding: Supported by the National Key R&D Program of China (No. 2020AAA0108904) and the Science and Technology Plan of Shenzhen (No. JCYJ20200109140410340).
Abstract: Audio-visual wake word spotting is a challenging multi-modal task that uses visual information from lip motion patterns to supplement acoustic speech and improve overall detection performance. However, most audio-visual wake word spotting models are suitable only for simple single-speaker scenarios and have high computational complexity. Further development is hindered by complex multi-person scenarios and by computational limitations in mobile environments. In this paper, a novel audio-visual model is proposed for on-device multi-person wake word spotting. First, an attention-based audio-visual voice activity detection module is presented, which generates an attention score matrix of audio and visual representations to derive the active speaker representation. Second, knowledge distillation is introduced to transfer knowledge from the large model to the on-device model, which keeps the model size under control. Moreover, a new audio-visual dataset, PKU-KWS, is collected for sentence-level multi-person wake word spotting. Experimental results on the PKU-KWS dataset show that this approach outperforms previous state-of-the-art methods.
Abstract: Experimental single-case studies on the automatic processing of emotion were carried out on a sample of people with an anxiety disorder. Participants took three Audio Visual Entrainment (AVE) sessions to test for the anxiety reduction claimed by some academic research. Explicit reports were measured, as was pre-attentive bias to stressful information, using affective priming studies before and after the AVE intervention. Group analysis shows that AVE program applications do reduce anxiety, producing significant changes in explicit reports of anxiety levels and in the automatic processing bias of emotion. However, case-by-case analysis of six anxious participants shows that even though all of the participants reported emotional improvement after the intervention, not all of them reduced or eliminated their dysfunctional bias to stressful information. Rather, they showed a variety of processing styles following the intervention, and some showed no change at all. Implications of this differential effect for clinical settings are discussed.
Abstract: In this paper, the urban landscape of Changsha is divided into four categories: background landscape, contour landscape, architectural and humanity landscape, and garden and green landscape. The components and characteristics of each landscape are analyzed in detail and used to establish a city identity system for Changsha. Five subsystems are taken as contexts and their components analyzed specifically: the mind identity system, visual identity system, behavior identity system, audio identity system and environment identity system. The mind identity system is decomposed into a characteristic landscape system with the urban center as the core, surrounded by a natural landscape belt and covering historical deposits. The visual identity system is constituted through the application of the city flower, city tree, mascot and standard color. The behavior identity system is decomposed into the capital of entertainment and the behavioral customs of the old city. The audio identity system is decomposed into the audio identity of urban streets, of the natural landscape and of the commercial landscape. The environment identity system is established from the perspective of the ecological environment. It is concluded that establishing a city identity system for Changsha could directly promote economic development, and that the city identity system of Changsha needs further study.
Abstract: The multimedia percussion theatre work "The Call from Sigangli--A Dialogue of Natural Character and Avant-garde" was a comprehensive practical exploration of visual and audio design. These two vocabularies, visual and audio, brought out the best in each other with the support of multimedia and digital audio technology, and formed a new audio-visual language. The native, unprocessed character of the percussion, together with multimedia imagery and interactive technologies, brought the natural and the avant-garde into collision. This practice may offer a new form for the spread of Chinese culture.