Abstract
Cross-depiction is the recognition, and synthesis, of objects whether they are photographed, painted, drawn, etc. It is a significant yet under-researched problem. Emulating the remarkable human ability to recognise and depict objects in an astonishingly wide variety of depictive forms is likely to advance both the foundations and the applications of computer vision. In this paper we motivate the cross-depiction problem, explain why it is difficult, and discuss some current approaches. Our main conclusions are: (i) appearance-based recognition systems tend to be over-fitted to one depiction, (ii) models that explicitly encode spatial relations between parts are more robust, and (iii) recognition and non-photorealistic synthesis are related tasks.