Funding: Supported by a research grant from City University of Hong Kong under Grant No. 7008178 and by the National Natural Science Foundation of China under Grant Nos. 61228205, 61303175, and 61172153.
Abstract: Associating faces that appear in Web videos with names presented in the surrounding context is an important task in many applications. However, the problem has not been well investigated, particularly in large-scale realistic scenarios, mainly because datasets constructed under such conditions are scarce. In this paper, we introduce a Web video dataset of celebrities, named WebV-Cele, for name-face association. The dataset consists of 75,073 Internet videos totaling over 4,000 hours and covering 2,427 celebrities and 649,001 faces. This is, to our knowledge, the most comprehensive dataset for this problem. We describe the details of dataset construction, discuss several interesting findings from analyzing the dataset, such as celebrity community discovery, and provide experimental results of name-face association using five existing techniques. We also outline important and challenging research problems that could be investigated in the future.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61373121, 61071184, 60972111, and 61036008; the Research Funds for the Doctoral Program of Higher Education of China under Grant No. 20100184120009; the Program for Sichuan Provincial Science Fund for Distinguished Young Scholars under Grant Nos. 2012JQ0029 and 13QNJJ0149; the Fundamental Research Funds for the Central Universities of China under Grant Nos. SWJTU09CX032 and SWJTU10CX08; and the Program of China Scholarships Council under Grant No. 201207000050.
Abstract: The massive volume of web videos creates a pressing demand for efficiently grasping the major events they cover. However, the distinct characteristics of web videos, such as the limited number of features, the noisy text information, and the unavoidable errors in near-duplicate keyframe (NDK) detection, make web video event mining a challenging task. In this paper, we propose a novel four-stage framework to improve the performance of web video event mining. The first stage is data preprocessing. In the second stage, Multiple Correspondence Analysis (MCA) is applied to explore the correlation between terms and classes, with the aim of bridging the gap between NDKs and high-level semantic concepts. In the third stage, co-occurrence information is used to measure the similarity between NDKs and classes, based on which NDKs appear within each video. Finally, the outputs of the MCA and co-occurrence stages are integrated for web video event mining through negative NDK pruning and positive NDK enhancement. Moreover, both NDKs and terms with relatively low frequencies are treated as useful information in our experiments. Experimental results on large-scale web videos from YouTube demonstrate that the proposed framework outperforms several existing mining methods and obtains good results for web video event mining.
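The co-occurrence stage can be illustrated with a minimal sketch. The abstract does not give the similarity formula, so the code below assumes a simple one: the fraction of videos containing a given NDK that belong to each event class. The function name `ndk_class_cooccurrence`, the input layout, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def ndk_class_cooccurrence(videos):
    """Score each (NDK, class) pair by co-occurrence frequency.

    `videos` is an iterable of (event_class, ndk_ids) pairs, where
    ndk_ids are the near-duplicate keyframes found within one video.
    Assumed score: P(class | NDK), estimated over videos. This is an
    illustrative stand-in, not the formula used in the paper.
    """
    ndk_total = Counter()                 # number of videos containing each NDK
    ndk_in_class = defaultdict(Counter)   # per-NDK video counts for each class
    for event_class, ndk_ids in videos:
        for ndk in set(ndk_ids):          # count each NDK once per video
            ndk_total[ndk] += 1
            ndk_in_class[ndk][event_class] += 1
    return {
        ndk: {cls: n / ndk_total[ndk] for cls, n in per_class.items()}
        for ndk, per_class in ndk_in_class.items()
    }


# Toy data (hypothetical): three videos, two event classes.
videos = [
    ("election", {"ndk1", "ndk2"}),
    ("election", {"ndk1"}),
    ("earthquake", {"ndk2", "ndk3"}),
]
scores = ndk_class_cooccurrence(videos)
```

In this toy setting, an NDK that appears only in videos of one class receives a score of 1.0 for that class, while an NDK shared evenly between two classes receives 0.5 for each. Scores of this kind could then feed a pruning or enhancement step, though how the paper combines them with the MCA output is not specified in the abstract.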