In order to further improve the efficiency of video compression, we introduce a perceptual characteristics of Human Visual System (HVS) to video coding, and propose a novel video coding rate control algorithm based on...In order to further improve the efficiency of video compression, we introduce a perceptual characteristics of Human Visual System (HVS) to video coding, and propose a novel video coding rate control algorithm based on human visual saliency model in H.264/AVC. Firstly, we modifie Itti's saliency model. Secondly, target bits of each frame are allocated through the correlation of saliency region between the current and previous frame, and the complexity of each MB is modified through the saliency value and its Mean Absolute Difference (MAD) value. Lastly, the algorithm was implemented in JVT JM12.2. Simulation results show that, comparing with traditional rate control algorithm, the proposed one can reduce the coding bit rate and improve the reconstructed video subjective quality, especially for visual saliency region. It is very suitable for wireless video transmission.展开更多
We present a method of 3D image mosaicing for real 3D representation of roadside buildings, and implement a Web-based interactive visualization environment for the 3D video mosaics created by 3D image mosaicing. The 3...We present a method of 3D image mosaicing for real 3D representation of roadside buildings, and implement a Web-based interactive visualization environment for the 3D video mosaics created by 3D image mosaicing. The 3D image mo- saicing technique developed in our previous work is a very powerful method for creating textured 3D-GIS data without excessive data processing like the laser or stereo system. For the Web-based open access to the 3D video mosaics, we build an interactive visualization environment using X3D, the emerging standard of Web 3D. We conduct the data preprocessing for 3D video mosaics and the X3D modeling for textured 3D data. The data preprocessing includes the conversion of each frame of 3D video mosaics into concatenated image files that can be hyperlinked on the Web. The X3D modeling handles the representation of concatenated images using necessary X3D nodes. By employing X3D as the data format for 3D image mosaics, the real 3D representation of roadside buildings is extended to the Web and mobile service systems.展开更多
Video data are composed of multimodal information streams including visual, auditory and textual streams, so an approach of story segmentation for news video using multimodal analysis is described in this paper. The p...Video data are composed of multimodal information streams including visual, auditory and textual streams, so an approach of story segmentation for news video using multimodal analysis is described in this paper. The proposed approach detects the topic-caption frames, and integrates them with silence clips detection results, as well as shot segmentation results to locate the news story boundaries. The integration of audio-visual features and text information overcomes the weakness of the approach using only image analysis techniques. On test data with 135 400 frames, when the boundaries between news stories are detected, the accuracy rate 85.8% and the recall rate 97.5% are obtained. The experimental results show the approach is valid and robust.展开更多
Video summarization is applied to reduce redundancy and developa concise representation of key frames in the video, more recently, video summaries have been used through visual attention modeling. In these schemes,the...Video summarization is applied to reduce redundancy and developa concise representation of key frames in the video, more recently, video summaries have been used through visual attention modeling. In these schemes,the frames that stand out visually are extracted as key frames based on humanattention modeling theories. The schemes for modeling visual attention haveproven to be effective for video summaries. Nevertheless, the high cost ofcomputing in such techniques restricts their usability in everyday situations.In this context, we propose a method based on KFE (key frame extraction)technique, which is recommended based on an efficient and accurate visualattention model. The calculation effort is minimized by utilizing dynamicvisual highlighting based on the temporal gradient instead of the traditionaloptical flow techniques. In addition, an efficient technique using a discretecosine transformation is utilized for the static visual salience. The dynamic andstatic visual attention metrics are merged by means of a non-linear weightedfusion technique. Results of the system are compared with some existing stateof-the-art techniques for the betterment of accuracy. The experimental resultsof our proposed model indicate the efficiency and high standard in terms ofthe key frames extraction as output.展开更多
Related to the growth of data sharing on the Internet and the wide-spread use of digital media,multimedia security and copyright protection have become of broad interest.Visual cryptography(VC)is a method of sharing a...Related to the growth of data sharing on the Internet and the wide-spread use of digital media,multimedia security and copyright protection have become of broad interest.Visual cryptography(VC)is a method of sharing a secret image between a group of participants,where certain groups of participants are defined as qualified and may combine their share of the image to obtain the original,and certain other groups are defined as prohibited,and even if they combine knowledge of their parts,they can’t obtain any information on the secret image.The visual cryptography is one of the techniques which used to transmit the secrete image under the cover picture.Human vision systems are connected to visual cryptography.The black and white image was originally used as a hidden image.In order to achieve the owner’s copy right security based on visual cryptography,a watermarking algorithm is presented.We suggest an approach in this paper to hide multiple images in video by meaningful shares using one binary share.With a common share,which we refer to as a smart key,we can decrypt several images simultaneously.Depending on a given share,the smart key decrypts several hidden images.The smart key is printed on transparency and the shares are involved in video and decryption is performed by physically superimposing the transparency on the video.Using binary,grayscale,and color images,we test the proposed method.展开更多
From WeChat,QQ to Weibo,Tik Tok this series of online social networking applications,especially with the increasingly maturity of online shopping in recent years,short video and mobile intelligence have been developin...From WeChat,QQ to Weibo,Tik Tok this series of online social networking applications,especially with the increasingly maturity of online shopping in recent years,short video and mobile intelligence have been developing rapidly,which have strongly influenced many people’s living habits and shopping habits.Their development not only promotes the social and economic development,but also produces a new form of visual communication.This paper discusses the characteristics of short video,analyzes the shift of public visual consumption behind the popularity of short video,and hopes to further build a healthy development path of short video.展开更多
The Learning management system(LMS)is now being used for uploading educational content in both distance and blended setups.LMS platform has two types of users:the educators who upload the content,and the students who ...The Learning management system(LMS)is now being used for uploading educational content in both distance and blended setups.LMS platform has two types of users:the educators who upload the content,and the students who have to access the content.The students,usually rely on text notes or books and video tutorials while their exams are conducted with formal methods.Formal assessments and examination criteria are ineffective with restricted learning space which makes the student tend only to read the educational contents and videos instead of interactive mode.The aim is to design an interactive LMS and examination video-based interface to cater the issues of educators and students.It is designed according to Human-computer interaction(HCI)principles to make the interactive User interface(UI)through User experience(UX).The interactive lectures in the form of annotated videos increase user engagement and improve the self-study context of users involved in LMS.The interface design defines how the design will interact with users and how the interface exchanges information.The findings show that interactive videos for LMS allow the users to have a more personalized learning experience by engaging in the educational content.The result shows a highly personalized learning experience due to the interactive video and quiz within the video.展开更多
基金supported by National Natural Science Foundation of China under Grant No.610700800973 Sub-Program Projects under Grant No.2009CB320906+3 种基金National Science and Technology of Major Special Projects under Grant No.2010ZX03004-003S&T Planning Project of Hubei Provincial Department of Education under Grant No. Q20112805H&SPlanning Project of Hubei Provincial Department of Education under Grant No.2011jyte142Science Foundation of HubeiProvincial under Grant No.2010CDB05103
文摘In order to further improve the efficiency of video compression, we introduce a perceptual characteristics of Human Visual System (HVS) to video coding, and propose a novel video coding rate control algorithm based on human visual saliency model in H.264/AVC. Firstly, we modifie Itti's saliency model. Secondly, target bits of each frame are allocated through the correlation of saliency region between the current and previous frame, and the complexity of each MB is modified through the saliency value and its Mean Absolute Difference (MAD) value. Lastly, the algorithm was implemented in JVT JM12.2. Simulation results show that, comparing with traditional rate control algorithm, the proposed one can reduce the coding bit rate and improve the reconstructed video subjective quality, especially for visual saliency region. It is very suitable for wireless video transmission.
文摘We present a method of 3D image mosaicing for real 3D representation of roadside buildings, and implement a Web-based interactive visualization environment for the 3D video mosaics created by 3D image mosaicing. The 3D image mo- saicing technique developed in our previous work is a very powerful method for creating textured 3D-GIS data without excessive data processing like the laser or stereo system. For the Web-based open access to the 3D video mosaics, we build an interactive visualization environment using X3D, the emerging standard of Web 3D. We conduct the data preprocessing for 3D video mosaics and the X3D modeling for textured 3D data. The data preprocessing includes the conversion of each frame of 3D video mosaics into concatenated image files that can be hyperlinked on the Web. The X3D modeling handles the representation of concatenated images using necessary X3D nodes. By employing X3D as the data format for 3D image mosaics, the real 3D representation of roadside buildings is extended to the Web and mobile service systems.
文摘Video data are composed of multimodal information streams including visual, auditory and textual streams, so an approach of story segmentation for news video using multimodal analysis is described in this paper. The proposed approach detects the topic-caption frames, and integrates them with silence clips detection results, as well as shot segmentation results to locate the news story boundaries. The integration of audio-visual features and text information overcomes the weakness of the approach using only image analysis techniques. On test data with 135 400 frames, when the boundaries between news stories are detected, the accuracy rate 85.8% and the recall rate 97.5% are obtained. The experimental results show the approach is valid and robust.
基金This work was supported in part by Qatar National Library,Doha,Qatar,and in part by the Qatar University Internal under Grant IRCC-2021-010。
文摘Video summarization is applied to reduce redundancy and developa concise representation of key frames in the video, more recently, video summaries have been used through visual attention modeling. In these schemes,the frames that stand out visually are extracted as key frames based on humanattention modeling theories. The schemes for modeling visual attention haveproven to be effective for video summaries. Nevertheless, the high cost ofcomputing in such techniques restricts their usability in everyday situations.In this context, we propose a method based on KFE (key frame extraction)technique, which is recommended based on an efficient and accurate visualattention model. The calculation effort is minimized by utilizing dynamicvisual highlighting based on the temporal gradient instead of the traditionaloptical flow techniques. In addition, an efficient technique using a discretecosine transformation is utilized for the static visual salience. The dynamic andstatic visual attention metrics are merged by means of a non-linear weightedfusion technique. Results of the system are compared with some existing stateof-the-art techniques for the betterment of accuracy. The experimental resultsof our proposed model indicate the efficiency and high standard in terms ofthe key frames extraction as output.
文摘Related to the growth of data sharing on the Internet and the wide-spread use of digital media,multimedia security and copyright protection have become of broad interest.Visual cryptography(VC)is a method of sharing a secret image between a group of participants,where certain groups of participants are defined as qualified and may combine their share of the image to obtain the original,and certain other groups are defined as prohibited,and even if they combine knowledge of their parts,they can’t obtain any information on the secret image.The visual cryptography is one of the techniques which used to transmit the secrete image under the cover picture.Human vision systems are connected to visual cryptography.The black and white image was originally used as a hidden image.In order to achieve the owner’s copy right security based on visual cryptography,a watermarking algorithm is presented.We suggest an approach in this paper to hide multiple images in video by meaningful shares using one binary share.With a common share,which we refer to as a smart key,we can decrypt several images simultaneously.Depending on a given share,the smart key decrypts several hidden images.The smart key is printed on transparency and the shares are involved in video and decryption is performed by physically superimposing the transparency on the video.Using binary,grayscale,and color images,we test the proposed method.
基金Academic research project of Huashang College of Guangdong University of Finance and Economics(No.2018HSXS15).
文摘From WeChat,QQ to Weibo,Tik Tok this series of online social networking applications,especially with the increasingly maturity of online shopping in recent years,short video and mobile intelligence have been developing rapidly,which have strongly influenced many people’s living habits and shopping habits.Their development not only promotes the social and economic development,but also produces a new form of visual communication.This paper discusses the characteristics of short video,analyzes the shift of public visual consumption behind the popularity of short video,and hopes to further build a healthy development path of short video.
文摘The Learning management system(LMS)is now being used for uploading educational content in both distance and blended setups.LMS platform has two types of users:the educators who upload the content,and the students who have to access the content.The students,usually rely on text notes or books and video tutorials while their exams are conducted with formal methods.Formal assessments and examination criteria are ineffective with restricted learning space which makes the student tend only to read the educational contents and videos instead of interactive mode.The aim is to design an interactive LMS and examination video-based interface to cater the issues of educators and students.It is designed according to Human-computer interaction(HCI)principles to make the interactive User interface(UI)through User experience(UX).The interactive lectures in the form of annotated videos increase user engagement and improve the self-study context of users involved in LMS.The interface design defines how the design will interact with users and how the interface exchanges information.The findings show that interactive videos for LMS allow the users to have a more personalized learning experience by engaging in the educational content.The result shows a highly personalized learning experience due to the interactive video and quiz within the video.