Teaser image for IntrinsicEdit

Abstract

Generative diffusion models have advanced image editing with high-quality results and intuitive interfaces such as prompts and semantic drawing. However, these interfaces lack precise control, and the associated methods typically specialize in a single editing task. We introduce a versatile, generative workflow that operates in an intrinsic-image latent space, enabling semantic, local manipulation with pixel precision for a range of editing operations. Building atop the RGB-X diffusion framework, we address key challenges of identity preservation and intrinsic-channel entanglement. By incorporating exact diffusion inversion and disentangled channel manipulation, we enable precise, efficient editing with automatic resolution of global illumination effects -- all without additional data collection or model fine-tuning. We demonstrate state-of-the-art performance across a variety of tasks on complex images, including color and texture adjustments, object insertion and removal, global relighting, and their combinations.
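To make the workflow concrete, below is a minimal, hypothetical Python sketch of the edit-by-inversion idea the abstract describes: run deterministic DDIM inversion under the original intrinsic channels to recover a noise latent that reproduces the input (identity preservation), then sample forward under the edited channels so the change propagates with global-illumination effects resolved by the generative prior. The stub denoiser stands in for the RGB-X diffusion model; every name and signature here is an illustrative assumption, not the paper's actual API.

# Hypothetical sketch of an inversion-based intrinsic-space edit.
# StubDenoiser stands in for an RGB-X diffusion denoiser eps_theta(z_t, t, c)
# conditioned on intrinsic channels (albedo, normals, irradiance, ...).
import torch
import torch.nn as nn

class StubDenoiser(nn.Module):
    """Placeholder denoiser; a real model would be a conditioned U-Net."""
    def __init__(self, channels: int = 4):
        super().__init__()
        self.net = nn.Conv2d(channels * 2, channels, kernel_size=3, padding=1)

    def forward(self, z_t, t, cond):
        # Condition via channel-wise concatenation (one common choice);
        # the timestep t is ignored by this stub.
        return self.net(torch.cat([z_t, cond], dim=1))

def ddim_step(z, eps, a_t, a_next):
    """One deterministic DDIM update between alpha-bar values a_t -> a_next."""
    z0_pred = (z - (1 - a_t).sqrt() * eps) / a_t.sqrt()
    return a_next.sqrt() * z0_pred + (1 - a_next).sqrt() * eps

@torch.no_grad()
def invert_then_edit(model, z0, cond_orig, cond_edited, alphas):
    """Invert under the ORIGINAL intrinsics, then sample under the EDITED ones."""
    z = z0
    for i in range(len(alphas) - 1):             # inversion: clean -> noisy
        eps = model(z, i, cond_orig)
        z = ddim_step(z, eps, alphas[i], alphas[i + 1])
    for i in reversed(range(len(alphas) - 1)):   # sampling: noisy -> clean
        eps = model(z, i, cond_edited)
        z = ddim_step(z, eps, alphas[i + 1], alphas[i])
    return z

# Toy usage with random tensors in place of real latents and intrinsics.
model = StubDenoiser()
alphas = torch.linspace(0.999, 0.01, 50)   # alpha-bar schedule, clean -> noisy
z0 = torch.randn(1, 4, 16, 16)             # latent of the input image
albedo = torch.randn(1, 4, 16, 16)         # stand-in intrinsic condition
edited = albedo.clone()
edited[:, 0] += 0.5                        # e.g., shift one albedo channel
out = invert_then_edit(model, z0, albedo, edited, alphas)
print(out.shape)

Because the inversion pass is deterministic, re-sampling with unchanged conditions reconstructs the input, which is why only the intentionally edited regions and their illumination effects change.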

Results

Object Removal
Object removal: Our method performs on par with or better than specialized approaches (RGB→X→RGB, Photoshop Generative Fill, and Stable Diffusion Inpainting) while handling complex cases such as reflection removal. Notably, it preserves surrounding textures and removes objects completely, including their shadows, where other methods struggle.

Object Insertion
Object insertion: Our method outperforms specialized insertion techniques (RGB→X→RGB, IntrinsicComp, ZeroComp, etc.) by better harmonizing objects with scene lighting and geometry. Key advantages include realistic handling of reflections (top row) and of strong directional lighting (bottom row), despite not being task-specific.

Material Editing
Material editing: Our method enables precise control over material properties (color, texture, roughness) where prompt-based approaches fail, while outperforming intrinsic-space methods in edit quality and scene harmony. Key examples include color-accurate wall reflections (top row), lighting-preserving texture edits (middle rows), and automatic material-consistent floor extensions (third row). A sketch of this kind of disentangled channel edit follows below.
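As a small hypothetical illustration of the disentangled channel manipulation behind these edits, the snippet below recolors the albedo channel inside a user-painted mask while leaving the shading-related channels untouched, so lighting is preserved by construction. The tensors and values are stand-ins, not data from the paper.

# Pixel-precise, disentangled intrinsic edit: recolor albedo inside a mask.
import torch

albedo = torch.rand(3, 256, 256)           # stand-in albedo map, RGB in [0, 1]
mask = torch.zeros(1, 256, 256)
mask[:, 64:192, 64:192] = 1.0              # user-painted edit region
target_color = torch.tensor([0.8, 0.1, 0.1]).view(3, 1, 1)

# Only albedo changes; irradiance/normal channels stay fixed, so the
# downstream diffusion pass re-renders consistent lighting automatically.
edited_albedo = albedo * (1 - mask) + target_color * mask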

Relighting
Relighting: Our method produces more natural relighting than RGB→X→RGB, handling both prompted irradiance changes (top rows) and volumetric shading (bottom row) while preserving scene identity. It robustly adapts to drastic lighting changes (second row) without altering material properties or object appearance.

BibTeX

@article{lyu2025intrinsic,
  title   = {IntrinsicEdit: Precise generative image manipulation in intrinsic space},
  author  = {Lyu, Linjie and Deschaintre, Valentin and Hold-Geoffroy, Yannick and Ha\v{s}an, Milo\v{s} and Yoon, Jae Shin and Leimk{\"u}hler, Thomas and Theobalt, Christian and Georgiev, Iliyan},
  journal = {ACM Transactions on Graphics},
  volume  = {44},
  number  = {4},
  year    = {2025}
}