Abstract: This paper presents a video coding algorithm based on the 3-D wavelet transform. When the frame index t of a video signal is treated as a third spatial coordinate z, the video signal can be viewed as a block in 3-D space. After splitting the block into smaller sub-blocks, we transform each sub-block with a 3-D wavelet, following the method used for the 2-D wavelet transform of images. Most of the video signal's energy is concentrated in the low-frequency sub-bands of the decomposition, and these sub-bands have the greatest effect on visual quality. By quantizing different sub-bands with different precision and then entropy coding each sub-band, we remove inter- and intra-frame redundancy and compress the data. Simulation experiments show that the algorithm achieves very good results.
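The decomposition described above can be illustrated with a minimal sketch. It assumes the Haar wavelet, a single decomposition level, and a block stored as a nested list `block[t][y][x]` with even sizes; the abstract does not name the wavelet or the block size, so the function names and the synthetic test block are illustrative only.

```python
import math
from itertools import product

def haar3d_level(block):
    """One level of a separable orthonormal 3-D Haar transform.

    block[t][y][x] is a nested list with even sizes along every axis.
    Returns a dict mapping sub-band names such as "LLL" or "HLH"
    (letter order: t, y, x) to half-size nested lists of coefficients.
    """
    T, Y, X = len(block), len(block[0]), len(block[0][0])
    scale = 1.0 / (2.0 * math.sqrt(2.0))  # three 1/sqrt(2) filtering steps
    bands = {}
    for bt, by, bx in product((0, 1), repeat=3):
        name = "".join("LH"[b] for b in (bt, by, bx))
        band = [[[0.0] * (X // 2) for _ in range(Y // 2)] for _ in range(T // 2)]
        for t, y, x in product(range(T // 2), range(Y // 2), range(X // 2)):
            acc = 0.0
            # Combine the 2x2x2 cell; a high-pass axis flips the sign of the odd sample.
            for dt, dy, dx in product((0, 1), repeat=3):
                sign = -1.0 if (bt * dt + by * dy + bx * dx) % 2 else 1.0
                acc += sign * block[2 * t + dt][2 * y + dy][2 * x + dx]
            band[t][y][x] = acc * scale
        bands[name] = band
    return bands

def energy(nested):
    """Sum of squared values over a 3-D nested list."""
    return sum(v * v for plane in nested for row in plane for v in row)

# Synthetic, slowly varying 4x4x4 block (an assumption, not data from the paper).
block = [[[100.0 + t + y + x for x in range(4)] for y in range(4)] for t in range(4)]
bands = haar3d_level(block)
total = sum(energy(b) for b in bands.values())
lll_share = energy(bands["LLL"]) / total
```

For a smooth block such as this one, `lll_share` is close to 1, which matches the abstract's observation that most of the energy ends up in the low-frequency sub-band; the remaining seven sub-bands could then be quantized more coarsely.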
Funding: Supported by the National Science Foundation of China (Grant Nos. 62267001, 61906051).
Abstract: With the popularity of online learning, and because emotion significantly influences learning outcomes, a growing body of research focuses on emotion recognition in online learning. Most current work uses comments on the learning platform or the learner's facial expression for emotion recognition; research data on other modalities are scarce. Most studies also ignore the impact of instructional videos on learners and the guidance that knowledge can provide for the data. To address the lack of data on other modalities, we construct a synchronous multimodal data set for analyzing learners' emotional states in online learning scenarios. The data set records the eye movement data and photoplethysmography (PPG) signals of 68 subjects, together with the instructional videos they watched. To address the neglect of instructional videos and of knowledge, we propose a knowledge-enhanced multimodal emotion recognition method for video learning. The method uses knowledge-based features extracted from the instructional videos, such as brightness, hue, saturation, the videos' click-through rate, and emotion generation time, to guide emotion recognition from physiological signals. Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks extract deeper emotional representations and spatiotemporal information from shallow features. A multi-head attention (MHA) mechanism then selects critical information from the extracted deep features. Next, a Temporal Convolutional Network (TCN) learns from both the deep features and the knowledge-based features, with the knowledge-based features supplementing and enhancing the deep features of the physiological signals. Finally, a fully connected layer performs emotion recognition, and the recognition accuracy reaches 97.51%. Compared with two recent studies, the accuracy improves by 8.57% and 2.11%, respectively. On four public data sets, the proposed method also achieves better results than the two recent studies. The experimental results show that the proposed knowledge-enhanced multimodal emotion recognition method has good performance and robustness.
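As an illustration of the color-related knowledge-based features mentioned above, the following minimal sketch computes mean brightness, hue and saturation for one video frame. It assumes the HSV color space (with V as brightness), 8-bit RGB pixels, and a circular mean for hue; the abstract does not say how the authors compute these features, so `frame_color_features` is a hypothetical helper.

```python
import colorsys
import math

def frame_color_features(pixels):
    """Mean brightness, hue and saturation of one frame.

    pixels: iterable of (r, g, b) tuples with 8-bit channels.
    Hue is averaged on the circle so that reds near 0 and near 1 agree.
    """
    n = 0
    sum_s = sum_v = sum_sin = sum_cos = 0.0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        sum_sin += math.sin(2.0 * math.pi * h)
        sum_cos += math.cos(2.0 * math.pi * h)
        sum_s += s
        sum_v += v
        n += 1
    if n == 0:
        raise ValueError("frame has no pixels")
    hue = (math.atan2(sum_sin, sum_cos) / (2.0 * math.pi)) % 1.0
    return {"brightness": sum_v / n, "hue": hue, "saturation": sum_s / n}

# Two tiny synthetic frames (illustrative, not from the data set).
green_frame = [(0, 255, 0)] * 4
gray_frame = [(128, 128, 128)] * 4
green_features = frame_color_features(green_frame)
gray_features = frame_color_features(gray_frame)
```

Per-frame values like these could be averaged over a video segment and concatenated with the other knowledge-based features before being passed to the network.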
Funding: Supported by the National Natural Science Foundation of P. R. China (60121302) and the National High Technology Research and Development Program of P. R. China (2002AA142100).
Abstract: This paper addresses the problem of detecting objectionable videos, which has not been carefully studied before. The method can be used to filter objectionable videos on the Internet efficiently. We present a tensor-based key-frame selection algorithm, a cube-based color model, and an objectionable-video estimation algorithm. Key-frame selection is based on motion analysis using the three-dimensional structure tensor. The cube-based color model is then used to detect skin color in each key frame. Finally, the estimation algorithm is applied to estimate the degree to which a video is objectionable. Experimental results on a variety of real-world videos downloaded from the Internet show that the method is promising.
Abstract: For application as a projectile-borne video reconnaissance system, the overall design and a proof-of-principle prototype of a mortar-launched video reconnaissance round were developed. Mortar launch tests show that the initial integrated system can transmit images over tens of kilometers, with an image resolution sufficient to identify tactical targets such as roads, hills, caverns, trees and rivers. The projectile-borne video reconnaissance system can meet the needs of tactical target identification and battle damage assessment in tactical operations. The study provides significant technological support for further independent development.
Abstract: Wireless Local Area Networks (WLANs) such as IEEE 802.11a/g and Hiperlan/2 utilise numerous transmission modes, each providing different throughputs and reliability levels. Many link adaptation algorithms proposed in the literature either maximise the error-free data throughput based on channel conditions or are based on the number of failed transmissions. However, these algorithms do not take into account the content of the data stream and rely strongly on Automatic Repeat Requests (ARQs). Low-latency video applications such as real-time video transmission may require no retransmission, or only a limited number of retransmissions. Moreover, completely error-free communication is not essential, especially if robust video compression techniques are applied. In such scenarios, better decoded video quality can be obtained with a video stream transmitted at a higher bit rate using a higher link speed but with some degree of transmission error, rather than an error-free video stream at a lower bit rate using a lower link speed. In this work, we investigate a link adaptation scheme that improves the Quality of Service (QoS) for video transmission based on the overall received video quality (Peak Signal to Noise Ratio, PSNR), rather than by maximising the error-free throughput. We also study a practical link adaptation approach that uses packet error rate (PER) thresholds at the PHY layer. An empirical study showed that the thresholds for switching from one mode to another are much lower (almost error free) than those currently used by throughput-based schemes. We show that traditional link adaptation strategies are not appropriate for real-time video transmission with no retransmission. Simulation results using the H.264 video compression standard over IEEE 802.11a are presented.
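The PER-threshold approach can be sketched as a simple mode-switching rule. The rate list holds the nominal IEEE 802.11a PHY rates; the function `next_mode` and both threshold pairs are hypothetical and chosen only to show how a stricter, near error-free threshold changes the decision. They are not values from the paper.

```python
# Nominal IEEE 802.11a PHY rates in Mbit/s, most robust mode first.
RATES_MBPS = (6, 9, 12, 18, 24, 36, 48, 54)

def next_mode(mode, per, down_threshold, up_threshold):
    """Return the mode index to use next, given the measured PER.

    Steps down one mode when PER exceeds down_threshold, steps up one
    mode when PER falls below up_threshold, and otherwise keeps the mode.
    """
    if not 0.0 <= up_threshold < down_threshold <= 1.0:
        raise ValueError("need 0 <= up_threshold < down_threshold <= 1")
    if per > down_threshold:
        return max(mode - 1, 0)
    if per < up_threshold:
        return min(mode + 1, len(RATES_MBPS) - 1)
    return mode

# Hypothetical (down, up) threshold pairs; illustrative only.
VIDEO_THRESHOLDS = (0.01, 0.001)       # near error-free switching
THROUGHPUT_THRESHOLDS = (0.20, 0.05)   # tolerant, relies on retransmission

measured_per = 0.05
mode = 5  # index of the 36 Mbit/s mode
video_next = next_mode(mode, measured_per, *VIDEO_THRESHOLDS)
throughput_next = next_mode(mode, measured_per, *THROUGHPUT_THRESHOLDS)
```

With these assumed numbers, a 5% PER makes the video-oriented policy fall back to the 24 Mbit/s mode, while the throughput-oriented policy stays at 36 Mbit/s. This mirrors the abstract's point that switching thresholds suited to video without retransmission sit much lower than those of throughput-based schemes.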
Abstract: This paper briefly reviews the architecture of softswitch-based Next Generation Network (NGN) systems and the Session Initiation Protocol (SIP), and analyzes the openness and extensibility problems of conventional remote video-monitoring systems (RVMS). An RVMS framework model based on softswitch is then given, followed by the design and implementation of the system on the T 6000 Softswitch Platform. The innovation lies in treating the RVMS as a part of the softswitch system. This is a feasible scheme for implementing next-generation video-monitoring systems based on broadband IP technology.
Abstract: An algorithm that derives the quantization factor purely from buffer occupancy is presented. Through simulation, the relationship between the quantization factor and the compression ratio is analyzed first, the dependence of image PSNR on the quantization factor is discussed second, and finally the control of the coder's output rate by adjusting the quantization factor is studied.
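A minimal sketch of buffer-occupancy-driven quantization follows. It assumes a linear mapping from buffer fullness to a quantization factor in the range 1 to 31 and a toy encoder in which the bits for a frame equal a complexity value divided by the factor. Both are assumptions made for illustration; the abstract does not give the actual mapping or bit model.

```python
def quant_from_buffer(occupancy_bits, buffer_size_bits, q_min=1, q_max=31):
    """Map buffer fullness (clamped to 0..1) linearly onto a quantization factor."""
    fullness = min(max(occupancy_bits / buffer_size_bits, 0.0), 1.0)
    return q_min + round(fullness * (q_max - q_min))

def simulate(frame_complexities, channel_bits_per_frame, buffer_size_bits):
    """Toy encoder loop: a fuller buffer gives coarser quantization and fewer bits.

    Returns a list of (q, bits_produced, occupancy_after_frame) tuples.
    """
    occupancy = buffer_size_bits / 2  # start half full
    history = []
    for complexity in frame_complexities:
        q = quant_from_buffer(occupancy, buffer_size_bits)
        bits = complexity / q  # assumed bit model, not from the paper
        occupancy += bits - channel_bits_per_frame
        occupancy = min(max(occupancy, 0.0), buffer_size_bits)
        history.append((q, bits, occupancy))
    return history

# Constant-complexity source drained at 40 000 bits per frame (illustrative numbers).
history = simulate([600000.0] * 50, 40000.0, 400000.0)
final_q, final_bits, final_occupancy = history[-1]
```

In this toy run the factor settles at the value for which the produced bits match the channel rate, which is the feedback behaviour the abstract refers to when it studies output-rate control through the quantization factor.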
Funding: The High Technology Research and Development Programme of China.
Abstract: Transmission and switching of video services will be among the important services provided by the information superhighway, in which ATM (Asynchronous Transfer Mode) will be one of the key techniques. This paper discusses the adaptation of video services in ATM networks and presents an implementation scheme. As a component of an ATM network supported by the High Technology Research and Development Programme of China, the circuit designed on this principle works successfully.
Abstract: The transmission characteristics of two video transmission media, coaxial cable and optical fiber, are discussed. Formulas for frequency bandwidth are given to evaluate the video transmission distance. For typical video transmission systems with BB/IM and PFM/IM using optical fiber as the channel, expressions and calculated results for both S/N and sensitivity are given. Finally, the principle for selecting different types of transmission systems according to the transmission distance of industrial TV is presented.
Funding: National Natural Science Foundation of China (Nos. 60972035, 61074009); Natural Science Foundation Program of Shanghai, China (No. 10ZR1432800).
Abstract: For rate control (RC) of hierarchical structure coding, an independent rate-quantization (R-Q) model is proposed based on mean absolute differences (MADs) in different temporal levels (TLs). Within the proposed R-Q model, a novel MAD model is developed according to the hierarchical structure. The experimental results demonstrate that the proposed algorithm provides better performance than the H.264 reference model JM14.2, in terms of average peak signal-to-noise ratio (PSNR) and quality smoothness, across various test sequences.
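The abstract does not state the form of the proposed R-Q model. As a hedged illustration of what a MAD-based R-Q model with independent parameters per temporal level might look like, the sketch below assumes the classical quadratic form R = x1·MAD/Q + x2·MAD/Q², with a separate (x1, x2) pair for each level. The parameter values and function names are hypothetical.

```python
import math

def rate_from_q(q, mad, x1, x2):
    """Quadratic R-Q model: estimated bits for quantization step q."""
    return x1 * mad / q + x2 * mad / (q * q)

def q_from_rate(target_bits, mad, x1, x2):
    """Invert the quadratic model for the quantization step q (x1, x2 >= 0)."""
    if target_bits <= 0 or mad <= 0:
        raise ValueError("target_bits and mad must be positive")
    if x2 == 0:
        return x1 * mad / target_bits
    # Solve x2*mad*u^2 + x1*mad*u - target_bits = 0 for u = 1/q (positive root).
    a, b = x2 * mad, x1 * mad
    u = (-b + math.sqrt(b * b + 4.0 * a * target_bits)) / (2.0 * a)
    return 1.0 / u

# Hypothetical (x1, x2) per temporal level; illustrative values only.
PARAMS_BY_LEVEL = {0: (2000.0, 8000.0), 1: (1500.0, 5000.0), 2: (1000.0, 3000.0)}

def q_for_frame(level, target_bits, predicted_mad):
    """Pick the quantization step using the parameters of the frame's temporal level."""
    x1, x2 = PARAMS_BY_LEVEL[level]
    return q_from_rate(target_bits, predicted_mad, x1, x2)

q_level1 = q_for_frame(1, 12000.0, 4.0)
check_bits = rate_from_q(q_level1, 4.0, *PARAMS_BY_LEVEL[1])
```

Keeping one parameter pair per temporal level means that frames in different levels of the hierarchy, which typically have different prediction accuracy, do not share a single fitted model; `check_bits` recovers the target rate, confirming that the inversion is consistent with the assumed model.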