Abstract
To automatically identify important persons in news images, and to cope with the variation in visual appearance caused by differences in expression, illumination, and pose, we propose a person identification method that fuses textual and visual multi-modal information. First, for each target name, we find the subset of face images associated with that name and establish a mapping between the name and the faces. Second, we compute similarity in the text space, and in the visual space we cluster each face image subset and compute similarity. Finally, we use a weighted Borda method to fuse the similarity rankings from the text space and the visual space. Experiments on a data set of approximately half a million Yahoo! News images show that the proposed method significantly improves on clustering-only methods.
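The final fusion step can be illustrated with a minimal sketch of a weighted Borda count over two rankings. The abstract does not give the paper's exact weighting scheme, so the function names, the `text_weight` parameter, and the point assignment (the top item of n candidates receives n points) are assumptions made for illustration only.

```python
from typing import Dict, List


def borda_points(ranking: List[str]) -> Dict[str, float]:
    """Assign Borda points: the top item of n gets n points, the last gets 1."""
    n = len(ranking)
    return {item: float(n - pos) for pos, item in enumerate(ranking)}


def weighted_borda(
    text_ranking: List[str],
    visual_ranking: List[str],
    text_weight: float = 0.5,  # hypothetical parameter, not taken from the paper
) -> List[str]:
    """Fuse two rankings of the same candidate faces into one ranking.

    Each candidate's fused score is
        text_weight * text_points + (1 - text_weight) * visual_points,
    and candidates are returned from highest to lowest fused score.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    if set(text_ranking) != set(visual_ranking):
        raise ValueError("both rankings must contain the same candidates")

    text_pts = borda_points(text_ranking)
    visual_pts = borda_points(visual_ranking)
    fused = {
        face: text_weight * text_pts[face] + (1.0 - text_weight) * visual_pts[face]
        for face in text_pts
    }
    # Sort by descending fused score; break ties by identifier for determinism.
    return sorted(fused, key=lambda face: (-fused[face], face))


if __name__ == "__main__":
    by_text = ["face_a", "face_b", "face_c"]    # ranking by textual score
    by_visual = ["face_b", "face_a", "face_c"]  # ranking by visual score
    print(weighted_borda(by_text, by_visual, text_weight=0.7))
```

In this sketch a larger `text_weight` lets the textual ranking dominate when the two spaces disagree, while a smaller value favours the visual ranking.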
Source
Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》)
EI
CSCD
Peking University Core Journal
2013, Issue 12, pp. 1842-1847 (6 pages)
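Funding
National Natural Science Foundation of China (61075014, 61103062)
Doctoral Program Foundation of the Ministry of Education of China (20116102110027, 20116102120031, 20106102110028)
Ministry of Education Academic Newcomer Award for Doctoral Students
Northwestern Polytechnical University Basic Research Fund (JC201249)
Northwestern Polytechnical University Doctoral Dissertation Innovation Fund (cx201114)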
Keywords
local Gabor binary pattern histogram sequence
affinity propagation clustering
multi-modal
textual score
visual score