Abstract
The Internet hosts an enormous number of short videos, but most of them are unlabeled. This paper proposes a coarse labelling method for short videos based on an image classification neural network. A convolutional auto-encoder is trained on unlabeled video frames to learn features at a specific level. These features are then clustered to extract the key-frames of each video. The key-frames, which represent the video content, are fed into an image classification network to obtain the labels of every video clip. In addition, several convolutional auto-encoder architectures are evaluated, and the one that performs best in the experiments is selected. In the final experiment, the video frame features from the convolutional auto-encoder are compared with those from other extraction methods, and the proposed method shows remarkable results.
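The key-frame step described in the abstract can be illustrated with a minimal sketch. It assumes that per-frame feature vectors (for example, the encoder output of a convolutional auto-encoder) are already available as a NumPy array, and it uses plain k-means as the clustering method; the abstract does not name the clustering algorithm, so k-means, the function name `select_key_frames`, and all of its parameters are assumptions made for illustration only.

```python
import numpy as np


def select_key_frames(features, n_clusters=3, n_iters=20, seed=0):
    """Cluster frame features with k-means (an assumed choice) and return
    the sorted indices of the frames closest to each cluster centroid."""
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[0]
    k = min(n_clusters, n_frames)
    rng = np.random.default_rng(seed)

    # Initialise the centroids from k distinct frames.
    centroids = features[rng.choice(n_frames, size=k, replace=False)].copy()

    for _ in range(n_iters):
        # Squared Euclidean distance from every frame to every centroid.
        diffs = features[:, None, :] - centroids[None, :, :]
        labels = (diffs ** 2).sum(axis=2).argmin(axis=1)
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = features[labels == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)

    # Final assignment, then pick the member nearest to each centroid.
    diffs = features[:, None, :] - centroids[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    key_frames = []
    for c in range(k):
        member_idx = np.flatnonzero(labels == c)
        if member_idx.size:
            nearest = member_idx[dists[member_idx, c].argmin()]
            key_frames.append(int(nearest))
    return sorted(key_frames)


# Toy stand-in for auto-encoder features: six frames, two obvious groups.
toy_features = np.array([[0.0], [1.0], [2.0], [100.0], [101.0], [102.0]])
key_frame_indices = select_key_frames(toy_features, n_clusters=2)
```

In the pipeline the abstract outlines, the frames at the returned indices would then be passed to the image classification network, and their predicted classes would serve as the labels of the clip.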
Funding
Supported by the National Key R&D Program of China (2018YFB1404100) and the Fundamental Research Funds for the Central Universities (CUC18A002-2).