Abstract
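Objective The digitization of artworks offers great opportunities for studying art from a computer vision perspective. To better provide artwork classification and art retrieval functions for digital art museums, help people understand the meaning of artworks in depth, promote traditional culture, and advance cultural heritage preservation, this paper introduces multi-task learning into automatic art analysis and proposes an original adaptive multi-task learning method based on Bayesian theory.
Method Based on hierarchical Bayesian theory, the correlation among tasks is exploited to introduce a task-cluster constraint into the loss function model. Following the Bayesian modeling approach, a multi-task loss function is constructed by maximizing a Gaussian likelihood with uncertainty, which yields an adaptive multi-task learning model. The model extends easily to any learning tasks of the same kind and, compared with other recent models, improves learning performance and achieves better analysis results.
Result The proposed method addresses the difficulty of choosing the relative weights of the task losses in multi-task learning by determining the loss weights automatically. To evaluate it, artwork classification and cross-modal art retrieval experiments are conducted on SemArt, a dataset for multi-modal semantic art understanding. In artwork classification, the method improves on fixed-weight multi-task learning by 4.43% on the "Timeframe" attribute and also outperforms existing methods that determine loss weights automatically. In cross-modal art retrieval, the improvement over the latest knowledge-graph-based model using the "Author" attribute is 9.91%, which is consistent with the classification results.
Conclusion The proposed method adaptively learns the weight of each task within a multi-task learning framework and, compared with currently popular methods, significantly improves the performance of automatic art analysis.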
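The idea of building a multi-task loss by maximizing a Gaussian likelihood with uncertainty can be illustrated with a minimal sketch. It uses a commonly cited uncertainty-weighting form, total = sum_i exp(-s_i) * L_i + 0.5 * s_i with s_i = log(sigma_i^2), which is an assumption here and not necessarily the exact loss of the paper; it also omits the task-cluster constraint. The function names and the plain gradient-descent fit are hypothetical and serve only to show how the weights adapt.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses L_i using log-variances s_i = log(sigma_i^2).

    Assumed form (not taken from the paper):
        total = sum_i exp(-s_i) * L_i + 0.5 * s_i
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need exactly one log-variance per task loss")
    return sum(math.exp(-s) * loss + 0.5 * s
               for loss, s in zip(task_losses, log_vars))


def log_var_gradients(task_losses, log_vars):
    """Analytic gradient of the combined loss with respect to each s_i."""
    return [-math.exp(-s) * loss + 0.5
            for loss, s in zip(task_losses, log_vars)]


def fit_log_vars(task_losses, steps=2000, lr=0.1):
    """Gradient descent on the s_i while the task losses are held fixed.

    In real training the network parameters and the s_i are updated
    jointly; the losses are frozen here only to isolate the weighting.
    """
    log_vars = [0.0] * len(task_losses)
    for _ in range(steps):
        grads = log_var_gradients(task_losses, log_vars)
        log_vars = [s - lr * g for s, g in zip(log_vars, grads)]
    return log_vars


if __name__ == "__main__":
    losses = [2.0, 0.5]  # hypothetical per-task loss values
    s = fit_log_vars(losses)
    print("task weights:", [math.exp(-v) for v in s])
    print("combined loss:", uncertainty_weighted_loss(losses, s))
```

Under this assumed form, each log-variance settles where exp(-s_i) * L_i = 0.5, so a task with a larger loss receives a smaller weight without any hand tuning.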
Objective Multi-task learning aims to improve learning efficiency and prediction accuracy by tackling multiple tasks jointly, under the assumption that generic features shared across tasks precede task-specific features. It has been applied in a variety of computer vision applications, including object detection and tracking, object recognition, person identification and facial attribute classification. The worldwide digitization of artwork has opened art research to computer vision and further facilitated cultural heritage preservation. Automatic artwork analysis covers the analysis of art style, painting content and related attributes. Our multi-task learning approach to automatic art analysis draws on historical, social and artistic information. Existing multi-task joint learning methods learn multiple tasks with a weighted sum of losses whose weights are labor-intensive and time-consuming to tune. Our method provides art classification and art retrieval tools for digital art museums, which helps researchers understand the connotation of art in depth and supports research on traditional cultural heritage.
Method We propose a multi-task learning method based on Bayesian theory. Following Bayesian analysis, we use the correlation among tasks and introduce a task-cluster constraint on the model. We then formulate a multi-task loss function by maximizing a Gaussian likelihood with homoscedastic, task-dependent uncertainty in Bayesian modeling.
Result To study art classification and art retrieval, we use the SemArt dataset, a recent multi-modal benchmark for semantic art understanding. It is designed for cross-modal retrieval of art paintings and can be readily adapted to painting classification. The dataset contains 21384 painting images, randomly split into training, validation and test sets of 19244, 1069 and 1069 samples, respectively. First, we conduct art classification experiments on the SemArt dataset and evaluate performance by classification accuracy, i.e., the proportion of correctly predicted paintings among all paintings in the test set. The classification results show that the proposed adaptive multi-task learning model outperforms the previous multi-task learning model, in which the weight of each task is fixed. For example, on the "Timeframe" classification task, the improvement is about 4.43% over the previous model. Previous models that compute task-specific weights automatically are limited by the need for two backward passes. The classification results also validate the importance of introducing weighting constraints in our model. Next, we evaluate our model on cross-modal art retrieval. Experiments follow the Text2Art Challenge evaluation, in which painting samples are ranked by their similarity to a query text, and vice versa. The rankings are evaluated on the test set by median rank and recall at K, with K being 1, 5 and 10. Median rank is the value separating the higher half from the lower half of the ranking positions of the relevant items over all samples, whereas recall at K is the fraction of samples whose relevant image appears in the top K positions of the ranking. Compared with the most recent knowledge-graph-based model using the author attribute, the improvement is about 9.91% on average, which is consistent with the classification results. Finally, we compare our model with human evaluators. Given an artistic text containing the comment, title, author, type, school and timeframe, participants are asked to pick the most appropriate painting from a collection of 10 images. The task has two levels: at the easy level, the 10 images are randomly selected from the test set; at the difficult level, the 10 images share the same attribute category (e.g., portraits or landscapes). Each participant performs the task for 100 artistic texts at each level. Performance is reported as the proportion of correct responses over all responses. The results show that the accuracy of our model is close to that of human evaluators.
Conclusion We propose an adaptive multi-task learning method that weights multiple loss functions based on Bayesian theory for automatic art analysis. We conduct several experiments on a publicly available art dataset, covering both art classification and art retrieval.
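The retrieval metrics named in the Result section, median rank and recall at K, can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation code of the paper: each query is assumed to have exactly one relevant item, ranks are 1-based, and the function names and sample values are hypothetical.

```python
from statistics import median


def rank_of_match(similarities, match_index):
    """1-based rank of the relevant candidate when candidates are sorted
    by descending similarity to the query (ties resolved optimistically)."""
    target = similarities[match_index]
    return 1 + sum(score > target for score in similarities)


def retrieval_metrics(ranks, ks=(1, 5, 10)):
    """Median rank and recall at K from one relevant-item rank per query."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    metrics = {"median_rank": median(ranks)}
    for k in ks:
        # Fraction of queries whose relevant item is within the top k.
        metrics[f"recall@{k}"] = sum(r <= k for r in ranks) / len(ranks)
    return metrics


if __name__ == "__main__":
    # Hypothetical ranks of the matching painting for five text queries.
    print(retrieval_metrics([1, 3, 7, 12, 2]))
```

A lower median rank and a higher recall at K both indicate better retrieval, so the two are usually reported together.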
Authors
Yang Bing (杨冰)
Xiang Xueqin (向学勤)
Kong Wanzeng (孔万增)
Shi Yan (施妍)
Yao Jinliang (姚金良)
Yang Bing; Xiang Xueqin; Kong Wanzeng; Shi Yan; Yao Jinliang (College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China; Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province, Hangzhou 310018, China; uSens Incorporated Company, Hangzhou 310051, China; School of Media and Design, Hangzhou Dianzi University, Hangzhou 310018, China)
Source
Journal of Image and Graphics (《中国图象图形学报》)
CSCD
Peking University Core Journals (北大核心)
2022, Issue 4, pp. 1226-1237 (12 pages)
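Funding
National Natural Science Foundation of China (61633010, U1909202)
Zhejiang Provincial Basic Public Welfare Research Program (LGG22F020027)
Key Research and Development Program of Zhejiang Province (2020C04009)
Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province (2020E10010)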
Keywords
automatic art analysis (自动艺术分析)
adaptive multi-task learning (自适应多任务学习)
Bayesian theory (贝叶斯理论)
art classification (艺术分类)
cross-modal art retrieval (跨模态艺术检索)