Video Scanpath Prediction Based on Visual Saliency and a Saccadic Probabilistic Model

Abstract: Research on visual attention has produced many models that predict visual saliency maps for images, but comparatively few that predict scanpaths for video. Taking into account both the dynamic characteristics of video scenes and the properties of human vision, this paper proposes a scanpath prediction model for video that uses both low-level and high-level features. A hidden Markov model (HMM) models the sequence of gaze shifts, with the gaze location as the hidden state. First, a convolutional neural network (CNN) produces a visual saliency map for the video, and the saliency values of each frame serve as the HMM observation probabilities, indicating how strongly a region of the scene attracts human visual attention. Then, drawing on existing work in visual psychology, the HMM transition probabilities are modeled with a saccadic probabilistic model based on Lévy flight. Finally, the Viterbi algorithm infers the most likely scanpath for the whole video. Experiments were carried out on the public video dataset HOLLYWOOD2, which includes recorded eye movements of observers, and the results were compared with related algorithms. The proposed model achieves better predictions under both the Hausdorff distance and the mean Euclidean distance.
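The pipeline in the abstract (saliency as observation probability, a heavy-tailed saccade model as transition probability, Viterbi decoding) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the saliency maps are taken as given arrays rather than CNN output, the Lévy-flight model is approximated by a Cauchy-shaped kernel on saccade amplitude with a hypothetical scale parameter `gamma`, the initial gaze position is assumed uniform, and all function names are invented for this example. The two evaluation helpers assume that scanpaths are arrays of (row, col) points, and the mean Euclidean distance assumes two paths of equal length compared frame by frame.

```python
import numpy as np

def levy_transition_log(height, width, gamma=2.0):
    """Log transition matrix between grid cells.

    Assumption: a Cauchy-shaped kernel on saccade amplitude stands in for
    the Levy-flight saccade model; gamma is a hypothetical scale parameter.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    # Pairwise Euclidean distance between all cell centres.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    kernel = 1.0 / (1.0 + (dist / gamma) ** 2)
    kernel /= kernel.sum(axis=1, keepdims=True)  # each row sums to 1
    return np.log(kernel), coords

def viterbi_scanpath(saliency, gamma=2.0, eps=1e-12):
    """Most likely gaze sequence for saliency maps of shape (T, H, W).

    Returns a (T, 2) integer array of (row, col) positions.
    """
    T, H, W = saliency.shape
    log_trans, coords = levy_transition_log(H, W, gamma)
    flat = saliency.reshape(T, -1) + eps
    # Normalised saliency at each cell plays the role of the observation term.
    log_emit = np.log(flat / flat.sum(axis=1, keepdims=True))
    score = log_emit[0] - np.log(H * W)  # assumed uniform initial state
    back = np.zeros((T, H * W), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans  # cand[i, j]: leave i, land on j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]
    # Backtrack from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = score.argmax()
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return coords[path].astype(int)

def mean_euclidean(a, b):
    """Mean frame-by-frame distance between two equal-length scanpaths."""
    return float(np.linalg.norm(a - b, axis=1).mean())

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))
```

The transition matrix here is dense, with (H·W)² entries, so the sketch only suits coarse grids; a full-resolution frame would need downsampling or a sparse neighbourhood, neither of which the abstract specifies.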

Authors: LUO Lingbing; FENG Hui; HU Bo; WANG Qiyao (Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai 200433, China; Research Center of Smart Networks and Systems, Fudan University, Shanghai 200433, China)

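Affiliations: Department of Electronic Engineering, School of Information Science and Technology, Fudan University; Research Center of Smart Networks and Systems, Fudan University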

Source: Journal of Fudan University (Natural Science); indexed in CAS, CSCD, and the PKU Core Journals list; 2019, No. 4, pp. 407-415 (9 pages)

Funding: National Key Basic Research Development Program of China (2017YFC0821300)

Keywords: scanpath; hidden Markov model; visual saliency; convolutional neural network

Classification number: TP391.4 [Automation and Computer Technology - Computer Application Technology]