Abstract
Most existing scene recognition methods focus on features of the scene itself and ignore detail information such as the contextual relationships between objects and the appearance features of the scene, so a single holistic feature rarely yields satisfactory classification results. To address this, a scene recognition method that fuses object semantic description with texture feature learning is proposed. A long short-term memory (LSTM) network learns contextual information from vectors of locally aggregated descriptors built from the semantic information of the objects appearing in the scene; texture features of the scene describe the image distribution in detail; and both are finally fused with features extracted by multiple models. The method achieves recognition accuracies of 96.06%, 89.35%, and 78.88% on the widely used scene datasets Scene15, MIT67, and SUN397, respectively, which indicates that the fused features are complementary and demonstrates the effectiveness of the method.
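The aggregation and fusion steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the descriptor sizes, the codebook, and the function names `vlad_encode` and `fuse_features` are assumptions chosen for illustration, all inputs are random placeholders, and the LSTM context-learning step is omitted. Fusion is shown as plain concatenation of L2-normalised vectors, which is one common choice and may differ from what the paper uses.

```python
import numpy as np


def vlad_encode(descriptors, centers):
    """Aggregate local descriptors (n, d) against k centers (k, d) into a VLAD vector (k*d,)."""
    # Assign each descriptor to its nearest center (squared Euclidean distance).
    dists = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)

    k, d = centers.shape
    vlad = np.zeros((k, d))
    for i in range(k):
        members = descriptors[nearest == i]
        if len(members):
            # Sum of residuals between the assigned descriptors and their center.
            vlad[i] = (members - centers[i]).sum(axis=0)

    vlad = vlad.ravel()
    # Signed square-root normalisation followed by L2 normalisation.
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad


def l2_normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def fuse_features(*features):
    """Late fusion: concatenate L2-normalised feature vectors (an assumed fusion rule)."""
    return np.concatenate([l2_normalize(np.asarray(f, dtype=float)) for f in features])


# All inputs below are hypothetical placeholders, not data from the paper.
rng = np.random.default_rng(0)
object_descriptors = rng.normal(size=(50, 8))  # per-object semantic vectors
centers = rng.normal(size=(4, 8))              # codebook, e.g. obtained by k-means
texture_feature = rng.normal(size=16)          # texture descriptor of the image
cnn_feature = rng.normal(size=32)              # global feature from another model

semantic_vlad = vlad_encode(object_descriptors, centers)
fused = fuse_features(semantic_vlad, texture_feature, cnn_feature)
print(semantic_vlad.shape, fused.shape)
```

Normalising each feature before concatenation keeps any single modality from dominating the fused vector purely because of its scale; a classifier would then be trained on `fused`.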
Authors
韦金言
刘大明
Wei Jinyan; Liu Daming (College of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 200090, China)
Source
Computer Applications and Software (《计算机应用与软件》)
Peking University Core Journal (北大核心)
2024, No. 10, pp. 202-211, 253 (11 pages in total)
Funding
Natural Science Foundation of Gansu Province (SKLLDJ032016021).
Keywords
Scene recognition
Locally aggregated descriptors (VLAD)
Texture feature learning
Model ensemble