Most existing image inpainting methods aim to fill in missing content within a known hole region of the target image. In realistically degraded images, however, the areas to be restored are unspecified, and previous studies fail to recover such degradations because no explicit mask is provided. Moreover, the inconsistent patterns are complexly blended with the image content. It is therefore necessary to estimate whether individual pixels are out of distribution and whether objects are consistent with their context. Motivated by these observations, a two-stage blind image inpainting network is proposed that uses global semantic features of the image to locate semantically inconsistent regions and then generates reasonable content in those areas. Specifically, the representation differences between inconsistent and available content are first amplified, and the region to be restored is predicted iteratively from coarse to fine. A confidence-driven inpainting network based on the predicted masks then estimates the content of the missing regions. Furthermore, a multiscale contextual aggregation module is introduced for spatial feature transfer to refine the generated content. Extensive experiments on multiple datasets demonstrate that the proposed method generates visually plausible and structurally complete results and is particularly effective at recovering diversely degraded images.
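The abstract does not give architectural specifics, so the following is a minimal PyTorch sketch of the two-stage pipeline it describes: a mask predictor that locates inconsistent regions, followed by an inpainting network with a multiscale contextual aggregation module. All module names, channel widths, and layer choices here (e.g., MaskPredictor, parallel dilated convolutions for the multiscale aggregation) are illustrative assumptions rather than the authors' implementation, and the iterative coarse-to-fine refinement and confidence-driven mechanism are simplified to a single soft-mask pass.

import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, dilation=1):
    # 3x3 conv + ReLU; padding equals dilation so spatial size is preserved.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=dilation, dilation=dilation),
        nn.ReLU(inplace=True),
    )

class MaskPredictor(nn.Module):
    # Stage 1 (assumed form): predict a soft mask of inconsistent regions.
    def __init__(self, ch=32):
        super().__init__()
        self.body = nn.Sequential(conv_block(3, ch), conv_block(ch, ch), conv_block(ch, ch))
        self.head = nn.Conv2d(ch, 1, 1)

    def forward(self, x):
        # Sigmoid gives a per-pixel confidence that the region is degraded.
        return torch.sigmoid(self.head(self.body(x)))

class MultiscaleAggregation(nn.Module):
    # Assumed stand-in for the multiscale contextual aggregation module:
    # parallel dilated convolutions gather context at several receptive fields.
    def __init__(self, ch):
        super().__init__()
        self.branches = nn.ModuleList(
            [conv_block(ch, ch, dilation=d) for d in (1, 2, 4, 8)]
        )
        self.fuse = nn.Conv2d(4 * ch, ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

class InpaintingNet(nn.Module):
    # Stage 2 (assumed form): fill the predicted regions given the soft mask.
    def __init__(self, ch=32):
        super().__init__()
        self.enc = nn.Sequential(conv_block(4, ch), conv_block(ch, ch))
        self.agg = MultiscaleAggregation(ch)
        self.dec = nn.Sequential(conv_block(ch, ch), nn.Conv2d(ch, 3, 1))

    def forward(self, x, mask):
        # Condition on the masked image plus the mask itself.
        feat = self.enc(torch.cat([x * (1 - mask), mask], dim=1))
        out = self.dec(self.agg(feat))
        # Composite: keep trusted pixels, fill predicted-inconsistent ones.
        return x * (1 - mask) + out * mask

# Usage on a dummy degraded image:
degraded = torch.rand(1, 3, 64, 64)
mask = MaskPredictor()(degraded)            # soft mask in [0, 1]
restored = InpaintingNet()(degraded, mask)
print(restored.shape)                       # torch.Size([1, 3, 64, 64])

In this sketch the final composite keeps pixels the mask deems trustworthy and replaces only the predicted-inconsistent ones, mirroring the paper's premise that degraded regions must first be localized before they can be filled.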
Funding: supported by the Natural Science Foundation of Shandong Province of China (No. ZR2020MF140), the Major Scientific and Technological Projects of CNPC (No. ZD2019-183-004), and the Fundamental Research Funds for the Central Universities (No. 20CX05019A).