TurboEdit: Instant text-based image editing
New research from Adobe: TurboEdit enables real-time, high-quality image editing using few-step diffusion models and detailed text prompts.
Original Problem 🔍:
Existing image editing techniques for few-step diffusion models struggle with precise image inversion and disentangled editing, limiting their real-time application potential.
Key Insights 💡:
• Detailed text prompts enable better disentangled control in few-step diffusion models
• Freezing noise maps while modifying text prompts allows targeted attribute changes
• Iterative inversion with encoder-based techniques improves reconstruction quality
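To make the iterative inversion idea concrete, here is a toy sketch of the loop: an encoder looks at the input and the previous reconstruction, and the noise estimate is refined over a few rounds. Everything in it is an assumption for illustration. The names `iterative_inversion`, `toy_encoder`, and `toy_generator` are hypothetical, images are flat lists of floats, and the "add a correction to the noise" update is a stand-in rather than the paper's actual network or update rule.

```python
from typing import Callable, List, Tuple

Vector = List[float]  # stand-in for an image or a noise map (assumption)


def iterative_inversion(
    image: Vector,
    encoder: Callable[[Vector, Vector], Vector],
    generator: Callable[[Vector], Vector],
    num_iters: int,
) -> Tuple[Vector, Vector]:
    """Refine a noise estimate so that generator(noise) approaches image.

    The encoder is conditioned on the input image and the previous
    reconstruction, and returns a correction to the current noise.
    """
    noise = [0.0] * len(image)
    recon = generator(noise)
    for _ in range(num_iters):
        correction = encoder(image, recon)
        noise = [n + c for n, c in zip(noise, correction)]
        recon = generator(noise)
    return noise, recon


def toy_generator(noise: Vector) -> Vector:
    # Hypothetical "few-step generator": simply scales the noise.
    return [2.0 * n for n in noise]


def toy_encoder(image: Vector, recon: Vector) -> Vector:
    # Hypothetical encoder: proposes a step proportional to the residual.
    return [0.25 * (x - r) for x, r in zip(image, recon)]


def max_abs_error(a: Vector, b: Vector) -> float:
    return max(abs(x - y) for x, y in zip(a, b))


target = [1.0, -2.0, 0.5]
noise, recon = iterative_inversion(target, toy_encoder, toy_generator, num_iters=8)
```

In this toy setup each round halves the reconstruction error, which mirrors the intuition that conditioning on the previous reconstruction lets the encoder fix what the last pass got wrong.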
Solution in this Paper 🛠️:
• Encoder-based iterative inversion network conditioned on input and previous reconstructions
• Automatic generation of detailed text prompts for disentangled control
• Freezing noise maps while modifying a single attribute in the text prompt for targeted editing
• Linear interpolation of text embeddings for editing strength control
• Integration with LLMs for instruction-based editing
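The editing-strength control can be illustrated with a minimal sketch of linear interpolation between two text embeddings. The function name `lerp_embeddings` and the `strength` parameter are assumptions, and plain lists of floats stand in for whatever embedding tensors the real text encoder produces.

```python
from typing import List


def lerp_embeddings(source: List[float], target: List[float], strength: float) -> List[float]:
    """Blend the source-prompt embedding toward the edited-prompt embedding.

    strength = 0.0 returns the source embedding (no edit),
    strength = 1.0 returns the target embedding (full edit).
    Values outside [0, 1] extrapolate; this sketch does not clamp them.
    """
    if len(source) != len(target):
        raise ValueError("embeddings must have the same length")
    return [(1.0 - strength) * s + strength * t for s, t in zip(source, target)]


# Toy embeddings (hypothetical values, not real encoder output).
source_embedding = [0.0, 2.0]
edited_embedding = [1.0, 4.0]
half_edit = lerp_embeddings(source_embedding, edited_embedding, 0.5)
```

With the noise maps held fixed, feeding the blended embedding to the generator in place of the fully edited one would give an intermediate version of the edit.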
Results 📊:
• Inversion: 8 NFEs (one-time cost)
• Editing: 4 NFEs per edit
• Speed: <0.5 seconds per edit (vs >3 seconds for multi-step methods)
• Outperforms state-of-the-art methods in background preservation and CLIP similarity
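Since inversion is paid once per image and each edit costs a fixed number of function evaluations (NFEs), the total cost of an editing session follows directly from the figures above. A small sketch, where the function name `total_nfes` is an assumption:

```python
INVERSION_NFES = 8  # one-time cost per input image (from the results above)
EDIT_NFES = 4       # cost of each edit (from the results above)


def total_nfes(num_edits: int) -> int:
    """Total function evaluations to invert one image and apply num_edits edits."""
    if num_edits < 0:
        raise ValueError("num_edits must be non-negative")
    return INVERSION_NFES + EDIT_NFES * num_edits
```

The one-time inversion cost is amortized as more edits are applied to the same image, so the average cost per edit approaches 4 NFEs.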