Abstract
As commercial visual search engines mature, Web data may become the next important data source for scaling up visual recognition. We observe that Web images retrieved by querying an action name depict discriminative action scenes, and that the discriminative information in Web images complements the temporal information in videos. Building on this observation, we propose a method that uses a large number of Web images to enhance action recognition. The framework extracts dense trajectory features from action videos, combines them with Web image features, and trains a support vector machine (SVM) classifier on the result. Because this is a cross-domain learning problem, we introduce a cross-domain dictionary learning algorithm to process the Web images and address the domain difference between the Web image domain and the video domain, so that the Web image features can be used effectively. Since Web images are easy to obtain online, the method enhances action recognition at almost zero cost. Experiments on the KTH and YouTube datasets show that the proposed method effectively improves the accuracy of human action recognition.
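The fusion-and-classification step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how the video and Web image features are combined, so the concatenation rule below is an assumption, as are the toy descriptors, the function names, and the hyperparameters. The dense trajectory extraction and the cross-domain dictionary learning step are not reproduced here; the image features are assumed to have already been adapted to the video domain. The classifier is a plain linear SVM trained by sub-gradient descent, standing in for whatever SVM implementation the paper uses.

```python
import math
import random


def l2_normalize(vec):
    """Scale a feature vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fuse(video_feat, image_feat):
    """Assumed fusion rule: normalize each modality, then concatenate."""
    return l2_normalize(video_feat) + l2_normalize(image_feat)


def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=100, seed=0):
    """Linear SVM without a bias term, trained by sub-gradient descent on the
    L2-regularized hinge loss. Labels must be +1 or -1."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # The regularization term shrinks w on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # The hinge loss contributes only when the margin is below 1.
            if margin < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
    return w


def predict(w, x):
    """Return +1 or -1 according to the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


# Hypothetical toy descriptors: (video feature, Web image feature, label).
toy = [
    ([0.9, 0.1], [0.8, 0.2], 1),
    ([1.0, 0.2], [0.7, 0.1], 1),
    ([0.1, 0.9], [0.2, 0.8], -1),
    ([0.2, 1.0], [0.1, 0.7], -1),
]
fused = [fuse(v, g) for v, g, _ in toy]
labels = [y for _, _, y in toy]
w = train_linear_svm(fused, labels)
predictions = [predict(w, x) for x in fused]
```

Normalizing each modality before concatenation keeps either feature type from dominating the classifier purely because of its scale; other fusion rules, such as kernel-level fusion, would fit the abstract's description equally well.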
Author
WEN Hao (闻号), School of Electronics and Information Engineering, Anhui University, Hefei 230601, China
Source
Computer Technology and Development (《计算机技术与发展》), 2019, No. 1, pp. 31-34 (4 pages)
Funding
Natural Science Foundation of Anhui Province (1508085MF120)
Keywords
Web learning
transfer learning
action recognition
dense trajectory
dictionary learning