Research on Food Image Generation Based on Diffusion Models

Abstract: Food image generation studies how to synthesize meal images from a given set of ingredients, and falls under text-to-image generation. Because many interacting factors determine how a meal looks, generating realistic food images has not yet been fully achieved. Existing methods use generative adversarial networks (GANs), conditioned on ingredient and cooking information, to progressively produce high-quality samples, but they do not cover the entire data distribution, which makes high-quality conditional generation difficult. Diffusion models, a class of likelihood-based models, have recently been shown to generate high-quality images while offering desirable properties such as distribution coverage, a fixed training objective, and ease of scaling. This paper associates cross-modal information and uses it to guide a diffusion model to generate high-quality food images from category information. Results on the Recipe1M dataset show a significant improvement over baseline methods.
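To make the "fixed training objective" mentioned in the abstract concrete, the following is a minimal sketch of a standard DDPM-style forward noising step and noise-prediction loss with a class label as the condition. The abstract does not state the paper's exact formulation, so the linear beta schedule, the function names (`make_alpha_bar`, `q_sample`, `noise_prediction_loss`), and the placeholder predictor are all assumptions for illustration, not the author's implementation.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, noise):
    """Forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * noise."""
    a = alpha_bar[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise

def noise_prediction_loss(predict_noise, x0, t, label, alpha_bar, rng):
    """Mean squared error between the true noise and the predicted noise.

    `predict_noise(x_t, t, label)` is a hypothetical stand-in for a
    class-conditional denoising network.
    """
    noise = rng.standard_normal(x0.shape)
    x_t = q_sample(x0, t, alpha_bar, noise)
    pred = predict_noise(x_t, t, label)
    return float(np.mean((pred - noise) ** 2))

def zero_predictor(x_t, t, label):
    """Placeholder model that always predicts zero noise (illustration only)."""
    return np.zeros_like(x_t)

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
x0 = rng.uniform(-1.0, 1.0, size=(3, 8, 8))  # toy "image" scaled to [-1, 1]
loss = noise_prediction_loss(zero_predictor, x0, 500, 7, alpha_bar, rng)
```

The objective has the same form at every timestep and for every class label, which is one common reading of "fixed training objective"; a real model would replace `zero_predictor` with a trained network and average the loss over randomly sampled timesteps.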

Author: Xu Huancheng (School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China)

Affiliation: School of Computing and Artificial Intelligence, Southwest Jiaotong University

Source: Modern Computer (《现代计算机》), 2024, No. 16, pp. 69-73 (5 pages)

Keywords: diffusion models; recipe; image generation

Classification code: TP3 [Automation and Computer Technology: Computer Science and Technology]
