The growing number of videos on the Internet and in digital libraries highlights the importance of interactive video services, such as automatically associating additional materials (e.g., advertising logos and relevant selling information) with the video content to enrich the viewing experience. Toward this end, this paper presents a novel approach for user-targeted video content association (VCA). In this approach, salient objects are extracted automatically from the video stream using complementary saliency maps. Based on these salient objects, the VCA system can push related logo images to users. Since salient objects often correspond to important video content, the associated images can be considered content-related. Our VCA system also allows users to associate images with their preferred video content through simple interactions using a mouse and an infrared pen. Moreover, by learning each user's preferences from feedback on the pulled or pushed images, the VCA system can provide user-targeted services. Experimental results show that our approach extracts salient objects effectively and efficiently, and subjective evaluations show that our system provides content-related and user-targeted VCA services in a less intrusive way.
Understanding the characteristics and predicting the popularity of newly published online videos has direct implications in various contexts such as service design, advertisement planning, and network management. In this paper, we collect a real-world large-scale dataset from Youku, a leading online video service provider in China. We first analyze the dynamics of content publication and content popularity for the online video service. Then, we propose a rich set of features and exploit various effective classification methods to estimate the future popularity level of an individual video in various scenarios. We show that the future popularity level of a video can be predicted even before the video's release, and that introducing historical popularity information improves the prediction performance dramatically. In addition, we investigate the importance of each feature group and each feature in the popularity prediction, and further reveal the factors that may affect video popularity. We also discuss how the early monitoring period influences the popularity level prediction. Our work provides insight into the popularity of newly published online videos and demonstrates promising practical applications for content publishers, service providers, online advisers and network operators.
Large-scale dynamic relational data visualization has attracted considerable research attention recently. We introduce dynamic data visualization into the multimedia domain, and present an interactive and scalable system, Video Map, for exploring large-scale video content. A long video or movie has much content, and the associations between pieces of content are complicated. Video Map uses new visual representations to extract meaningful information from video content. Map-based visualization naturally and easily summarizes and reveals important features and events in video. Multi-scale descriptions are used to describe the layout and distribution of temporal information, spatial information, and associations between video content. Firstly, semantic associations are used in which map elements correspond to video contents. Secondly, video contents are visualized hierarchically from a large scale to a fine-detailed scale. Video Map uses a small set of sketch gestures to invoke analysis, and automatically completes charts by synthesizing visual representations from the map and binding them to the underlying data. Furthermore, Video Map allows users to use gestures to move and resize the view, as when using a map, facilitating interactive exploration. Our experimental evaluation of Video Map demonstrates how the system can assist in exploring video content and significantly reduce browsing time when trying to understand and find events of interest.
Content-based copy detection (CBCD) is widely used in copyright control to protect digital video against unauthorized use, and its key issue is extracting fingerprints that are robust across different attacked versions of the same video. In this paper, the “natural parts” (coarse scales) of the Shearlet coefficients are used to generate robust video fingerprints for content-based video copy detection applications. The proposed Shearlet-based video fingerprint (SBVF) is constructed from the Shearlet coefficients in Scale 1 (lowest coarse scale), which reveal the spatial features, and Scale 2 (second lowest coarse scale), which reveal the directional features. To achieve a spatiotemporal nature, the proposed SBVF is applied to the Temporal Informative Representative Image (TIRI) of the video sequences for final fingerprint generation. A TIRI-SBVF based CBCD system is constructed using an Invert Index File (IIF) hash searching approach for performance evaluation and comparison on the TRECVID 2010 dataset. Common attacks are imposed on the queries, such as luminance attacks (luminance change, salt and pepper noise, Gaussian noise, text insertion); geometry attacks (letter box and rotation); and temporal attacks (dropping frame, time shifting). The experimental results demonstrate that the proposed TIRI-SBVF fingerprinting algorithm is robust in CBCD applications against most of the attacks. It achieves an average F1 score of about 0.99, a false positive rate (FPR) below 0.01%, and 97% localization accuracy.
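The F1 score and false positive rate quoted in the abstract above are standard detection metrics. As a minimal sketch of how such figures are typically derived from confusion-matrix counts (the function names and the counts in the usage note are illustrative assumptions, not taken from the paper):

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for a copy detector."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # share of reported copies that are real copies
    recall = tp / (tp + fn)     # share of real copies that were reported
    return 2 * precision * recall / (precision + recall)


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of non-copy queries that were wrongly flagged as copies."""
    total_negatives = fp + tn
    return fp / total_negatives if total_negatives else 0.0
```

For instance, with hypothetical counts of 99 true positives, 1 false positive and 1 false negative, `f1_score(99, 1, 1)` evaluates to roughly 0.99, and `false_positive_rate(1, 9999)` corresponds to 0.01%.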
To cope with the rapid growth of mobile video, video providers have leveraged cloud technologies to deploy their mobile video service systems for more cost-effective and scalable performance. The emergence of Software-Defined Networking (SDN) provides a promising solution for managing the underlying network. In this paper, we introduce an SDN-enabled cloud mobile video distribution architecture and propose a joint video placement, request dispatching and traffic management mechanism to improve user experience and reduce the system operational cost. We use a utility function to capture two aspects of user experience, the level of satisfaction and the average latency, and formulate the joint optimization problem as a mixed integer programming problem. We develop an optimal algorithm based on dual decomposition and prove its optimality. We conduct simulations to evaluate the performance of our algorithm, and the results show that our strategy can effectively cut down the total cost while guaranteeing user experience.
A schema for content-based analysis of broadcast news video is presented. First, we separate commercials from news using audiovisual features. Then, we automatically organize news programs into a content hierarchy at various levels of abstraction via effective integration of the video, audio, and text data available from the news programs. Based on these news video structure and content analysis technologies, a TV news video library is generated, from which users can retrieve specific news stories according to their demands.
Video quality assessment (VQA) plays a vital role in the field of video processing, including video acquisition, video filtering in retrieval, video compression, video restoration, and video enhancement. Since VQA has gained much attention in recent years, this paper gives an up-to-date review of VQA research and highlights current challenges in this field. The subjective study and common VQA databases are first reviewed. Then, a survey of the objective VQA methods, including full-reference, reduced-reference, and no-reference VQA, is reported. Last but most importantly, the key limitations of current research and several challenges in the field of VQA are discussed, which include the impact of video content, memory effects, computational efficiency, personalized video quality prediction, and quality assessment of newly emerged videos.
Identifying inter-frame forgery is a hot topic in video forensics. In this paper, we propose a method based on the assumption that the correlation coefficients of gray values are consistent in an original video, while in forgeries this consistency is destroyed. We first extract the consistency of correlation coefficients of gray values (CCCoGV for short) after normalization and quantization as the distinguishing feature for identifying inter-frame forgeries. Then we test the CCCoGV on a large database with the help of an SVM (Support Vector Machine). Experimental results show that the proposed method is efficient in classifying original videos and forgeries. Furthermore, the proposed method also performs well in classifying frame insertion and frame deletion forgeries.
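The underlying assumption, that correlation between the gray values of adjacent frames stays consistent in an untouched video, can be illustrated with a small sketch. This is not the authors' CCCoGV pipeline: the normalization, quantization and SVM stages are omitted, frames are assumed to be flat lists of gray values, and the median-based rule with its `threshold` parameter is a hypothetical stand-in for the classifier.

```python
from math import sqrt
from statistics import median


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length gray-value lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # constant frame: correlation undefined, treated as 0 here
    return cov / sqrt(var_a * var_b)


def adjacent_correlations(frames):
    """Correlation between each frame and the frame that follows it."""
    return [pearson(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]


def suspicious_transitions(frames, threshold=0.5):
    """Indices i where corr(frame i, frame i+1) deviates strongly from the median."""
    corrs = adjacent_correlations(frames)
    typical = median(corrs)
    return [i for i, c in enumerate(corrs) if abs(c - typical) > threshold]
```

On a toy sequence where the content changes abruptly between the third and fourth frames, only that transition is reported; a real detector would feed such features to a trained classifier rather than a fixed threshold.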
Video training platforms are now being implemented on a large scale in organizations. In this paper, I look at a video training platform that includes open educational resources and is available to many employees with varying patterns of and motivations for use. This has provided me with a research challenge: to find methods that help other practitioners in the field understand and explain such initiatives. I describe ways to model the research and identify where pressures and contradictions can be found, drawing on a reflective view of my own practice in performing the research. Open educational resources are defined as technology-enabled educational resources that are openly available for consultation, use and adaptation by users for non-commercial purposes [1]. The bank that is the subject of this case study was the first organisation in Turkey to provide open educational resources for all its employees. The video platform (called “For @ Tube”) provides users with over 100 video lectures drawn from reputable universities around the world, including Yale and Harvard. Other learning tools such as discussion forums, blogs and traditional e-learning courses have been available to users on the e-learning platform called “For @” since 2006. In this paper, I aim to introduce the new video training platform (“For @ Tube”) and outline some of the main research issues surrounding such an initiative. I seek to explore theoretical and practical approaches that can provide suitable tools for analysis. Activity theory is seen as a suitable approach for macroanalysis, and its use is illustrated in terms of the complexity of large-scale research. Activity theory, besides informing research perspectives, can be turned in upon the research process itself, allowing us to consider the challenges and context of the research. By using activity theory in this way and illustrating it with a range of practical approaches, I demonstrate a useful research approach.
A screen content coding (SCC) algorithm that uses a primary reference buffer (PRB) and a secondary reference buffer (SRB) for string matching and string copying is proposed. The PRB is typically the traditional reconstructed picture buffer, which provides reference string pixels for the current pixels being coded. The SRB stores a few recently and frequently referenced pixels for repetitive reference by the current pixels being coded. In the encoder, the search for an optimal reference string is performed in both the PRB and the SRB, and either a PRB or an SRB string is selected as the optimal reference string on a string-by-string basis. Compared with the HM-16.4+SCM-40 reference software, the proposed SCC algorithm improves coding performance, measured as an average bit-distortion rate reduction of 4.19% in the all-intra configuration, for the "text and graphics with motion" category of test sequences defined by the JCT-VC common test condition.
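The string-by-string choice between the two buffers can be sketched as follows. This is a simplified illustration under stated assumptions, not the proposed encoder: pixels are modelled as plain integers, the selection criterion is match length alone (a real encoder would weigh rate and distortion), ties are resolved in favour of the PRB arbitrarily, and the function names are hypothetical.

```python
def longest_match(reference, current):
    """Return (start, length) of the longest prefix of `current` found in `reference`."""
    best_start, best_len = -1, 0
    for start in range(len(reference)):
        length = 0
        while (length < len(current)
               and start + length < len(reference)
               and reference[start + length] == current[length]):
            length += 1
        if length > best_len:
            best_start, best_len = start, length
    return best_start, best_len


def choose_reference(prb, srb, current):
    """Search both buffers and pick the one offering the longer matching string."""
    prb_start, prb_len = longest_match(prb, current)
    srb_start, srb_len = longest_match(srb, current)
    if srb_len > prb_len:
        return ("SRB", srb_start, srb_len)
    return ("PRB", prb_start, prb_len)
```

Calling `choose_reference` once per string to be coded mirrors the string-by-string selection described in the abstract, with the returned start offset and length standing in for the copy parameters an encoder would signal.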
Funding (video content association paper): Project supported by the CADAL Project and the National Natural Science Foundation of China (Nos. 60973055 and 90820003).
Funding (Video Map paper): Supported by the National Natural Science Foundation of China (Project Nos. U1435220, 61232013).
Funding (SDN-enabled mobile video distribution paper): Supported by the State Key Program of National Natural Science Foundation of China (Grant No. 61233003) and the National Natural Science Foundation of China (Grant No. 61503358).
Funding (broadcast news video analysis paper): Supported by the Science Item of National Power Company (No. SPKJ016-071).
Funding (video quality assessment review): Partially supported by the National Basic Research Program of China ("973" Program) (2015CB351803), the National Natural Science Foundation of China (61390514, 61527804, 61572042, 61520106004), and the Sino-German Center (GZ 1025).
Funding (screen content coding paper): Supported in part by the National Natural Science Foundation of China under Grant Nos. 61201226 and 61271096, the Natural Science Foundation of Shanghai under Grant No. 12ZR1433800, and the Specialized Research Fund for the Doctoral Program under Grant No. 20130072110054.