TurboEdit: Instant text-based image editing

New research from Adobe: TurboEdit enables real-time, high-quality image editing using few-step diffusion models and detailed text prompts.

Nov 10, 2024

Original Problem 🔍:

Existing image editing techniques for few-step diffusion models struggle with precise image inversion and disentangled editing, limiting their real-time application potential.

Key Insights 💡:

• Detailed text prompts enable better disentangled control in few-step diffusion models

• Freezing noise maps while modifying text prompts allows targeted attribute changes

• Iterative inversion with encoder-based techniques improves reconstruction quality
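
To make the iterative-inversion insight concrete, here is a minimal toy sketch. It is not the paper's network: the linear `generate` function, the matrix `W`, and the hand-written `encoder` are all hypothetical stand-ins for the few-step generator and the learned inversion encoder. The only idea it illustrates is the loop itself, where each pass sees the input image and the previous reconstruction and refines the noise estimate.

```python
from typing import List, Tuple

Vector = List[float]

# Hypothetical toy "generator": a fixed linear map from noise to image.
W = [[1.0, 0.5],
     [0.0, 1.0]]


def generate(noise: Vector) -> Vector:
    """Map a noise vector to an 'image' (here just W @ noise)."""
    return [sum(w * z for w, z in zip(row, noise)) for row in W]


def encoder(image: Vector, prev_recon: Vector, step: float = 0.5) -> Vector:
    """Stand-in for a learned encoder (assumption, not the paper's model).

    It is conditioned on the input image and the previous reconstruction
    and returns a correction to the current noise estimate.
    """
    residual = [x - r for x, r in zip(image, prev_recon)]
    # Project the image-space residual back into noise space (W^T @ residual).
    return [
        step * sum(W[i][j] * residual[i] for i in range(len(W)))
        for j in range(len(W[0]))
    ]


def invert(image: Vector, num_iters: int = 5) -> Tuple[Vector, List[float]]:
    """Refine the noise estimate over several passes, tracking squared error."""
    noise = [0.0] * len(W[0])
    errors = []
    for _ in range(num_iters):
        recon = generate(noise)
        errors.append(sum((x - r) ** 2 for x, r in zip(image, recon)))
        correction = encoder(image, recon)
        noise = [z + c for z, c in zip(noise, correction)]
    return noise, errors


noise, errors = invert([1.0, 2.0])
```

In this toy setup the reconstruction error shrinks on every pass, which mirrors why feeding the previous reconstruction back into the encoder can improve inversion quality.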

Solution in this Paper 🛠️:

• Encoder-based iterative inversion network conditioned on input and previous reconstructions

• Automatic generation of detailed text prompts for disentangled control

• Frozen noise maps combined with single-attribute changes to the text prompt for targeted editing

• Linear interpolation of text embeddings for editing strength control

• Integration with LLMs for instruction-based editing
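
The editing-strength idea can be sketched in a few lines. This is an illustrative example only: the embeddings are made-up two-element vectors, and `toy_generate` is a hypothetical stand-in for the few-step diffusion model, not an actual TurboEdit or diffusion-library API. It shows the noise map staying fixed while the prompt embedding is linearly interpolated between the source and edited prompts.

```python
from typing import List

Embedding = List[float]


def lerp_embeddings(source: Embedding, target: Embedding, strength: float) -> Embedding:
    """Blend two prompt embeddings; strength 0 keeps the source, 1 gives the target."""
    if len(source) != len(target):
        raise ValueError("embeddings must have the same length")
    return [(1.0 - strength) * s + strength * t for s, t in zip(source, target)]


def toy_generate(noise: List[float], embedding: Embedding) -> List[float]:
    """Hypothetical stand-in for a few-step generator: output depends on both inputs."""
    return [n + e for n, e in zip(noise, embedding)]


# Frozen noise map: recovered once by inversion and reused for every edit.
frozen_noise = [0.25, -0.5]

# Made-up embeddings for two prompts that differ in a single attribute.
source_emb = [0.0, 2.0]
target_emb = [2.0, 4.0]

# Same noise, halfway between the two prompts: a "half-strength" edit.
half_edit = toy_generate(frozen_noise, lerp_embeddings(source_emb, target_emb, 0.5))
```

Because the noise is held fixed, the only thing that changes between outputs is the interpolated embedding, so sweeping `strength` from 0 to 1 gives a gradual transition from the original to the fully edited result.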

Results 📊:

• Inversion: 8 NFEs (one-time cost)

• Editing: 4 NFEs per edit

• Speed: <0.5 seconds per edit (vs >3 seconds for multi-step methods)

• Outperforms state-of-the-art methods on background preservation and CLIP similarity
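
As a quick back-of-the-envelope check on the NFE figures above, the sketch below adds up the cost of an editing session. It assumes the inversion is run once per image and its result reused for every subsequent edit, as the "one-time cost" note suggests; the function names are illustrative.

```python
INVERSION_NFES = 8  # one-time cost per input image (from the results above)
EDIT_NFES = 4       # cost of each individual edit (from the results above)


def total_nfes(num_edits: int) -> int:
    """Total function evaluations for one inversion followed by num_edits edits."""
    if num_edits < 0:
        raise ValueError("num_edits must be non-negative")
    return INVERSION_NFES + EDIT_NFES * num_edits


def amortized_nfes_per_edit(num_edits: int) -> float:
    """Average cost per edit once the inversion cost is spread across edits."""
    if num_edits <= 0:
        raise ValueError("num_edits must be positive")
    return total_nfes(num_edits) / num_edits
```

Under that assumption, a single edit costs 12 NFEs in total, while ten edits on the same image cost 48, or 4.8 per edit, so the inversion overhead fades quickly in an interactive session.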

Rohan's Bytes
