Abstract
[Purpose/Significance] To address missing and incomplete semantics in image organization and retrieval, this paper proposes a semantic description framework for images in social media. The framework is intended to enrich the existing theoretical system of image description, improve the efficiency of image retrieval and the utilization of images, and provide a reference for automatic semantic annotation of images. [Method/Process] First, we surveyed and analyzed research progress on image description in China and abroad, summarizing existing theories of image description and annotation, metadata specifications, and related technical methods. Second, building on image metadata standards and the theory of hierarchical and categorical description of image features, we constructed a semantic description framework for social media images covering seven layers: the external feature layer, content layer, object layer, relationship layer, scene layer, event layer, and emotion layer. We also elaborated on each semantic layer and the relationships among them. Finally, we verified the feasibility of the framework by describing example portrait images and landscape images. [Results/Conclusions] The portrait and landscape examples indicate that the framework can eliminate the "semantic gap" in image description through semantic associations between layers, and can achieve a multi-faceted, multi-dimensional, and multi-level structured and semantic description of the external and content features of images. It also shows strong portability and flexibility. The study has limitations and leaves room for improvement: (1) a prototype image annotation system based on the proposed framework still needs to be developed; (2) images posted by users on social media are closely related to their context and are more likely to express emotions, so future work can study the semantic layers of images further using the text that users post; (3) future research can explore the application of deep learning to image and text fusion to achieve more accurate event and emotion recognition, constructing more complex neural network structures to mine and fuse the event and emotion information in images; (4) image description should attend not only to static visual features but also to the dynamic course of events, and future frameworks could combine static and dynamic information to provide richer, more vivid descriptions of images.
Authors
胡守敏
董焕晴
HU Shoumin; DONG Huanqing (Central China Normal University Library, Wuhan 430079; School of Information Management, Central China Normal University, Wuhan 430079)
Source
《农业图书情报学报》
2024, No. 2, pp. 51-60 (10 pages)
Journal of Library and Information Science in Agriculture
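Funding
National Social Science Fund of China project "Research on Social Q&A Knowledge Organization and Services Based on Event Logic Graphs" (19BTQ075).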
Keywords
semantic description framework
image feature
semantic annotation
Sora