"AutoPresent: Designing Structured Visuals from Scratch"

The podcast below, covering this paper, was generated with Google's Illuminate.

Rohan Paul

Jan 12, 2025

Turn words into slides: AutoPresent bridges the gap between thoughts and visuals

AutoPresent automates slide creation from natural language instructions: it generates executable code that renders presentation slides, with quality comparable to slides produced by GPT-4o.

-----

https://arxiv.org/abs/2501.00912

🎯 Original Problem:

Creating presentation slides requires both content creation and visual design skills, making it time-consuming even for experts. Current AI solutions excel at general image generation but struggle with structured visual content like slides.

-----

🔧 Solution in this Paper:

→ Introduces SlidesBench, a benchmark with 7k training and 585 test examples across 10 domains for evaluating slide generation

→ Proposes program generation over direct image generation, where models produce Python code that creates slides

→ Develops AutoPresent, an 8B-parameter LLM trained on instruction-code pairs

→ Creates SlidesLib, a toolkit with high-level functions to simplify slide program generation

→ Implements iterative refinement where models self-improve their output
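
To make the program-generation idea concrete, here is a minimal sketch, not the paper's actual code. The `Slide` class and the `add_title` / `add_bullets` helpers are hypothetical stand-ins for the kind of high-level functions a toolkit like SlidesLib might offer; they are not its real API. The `generated_program` string plays the role of model output for one instruction.

```python
class Slide:
    """Hypothetical in-memory slide: an ordered list of (kind, content) elements."""

    def __init__(self):
        self.elements = []


def add_title(slide, text):
    """Hypothetical high-level helper: place a title on the slide."""
    slide.elements.append(("title", text))


def add_bullets(slide, items):
    """Hypothetical high-level helper: place a bullet list on the slide."""
    slide.elements.append(("bullets", list(items)))


# Stand-in for the code an LLM might emit for the instruction
# "Make a slide titled 'Q3 Results' with two bullet points".
generated_program = '''
slide = Slide()
add_title(slide, "Q3 Results")
add_bullets(slide, ["Revenue grew", "Costs fell"])
'''


def run_slide_program(program):
    """Execute a generated program with the toolkit names in scope."""
    namespace = {"Slide": Slide, "add_title": add_title, "add_bullets": add_bullets}
    exec(program, namespace)  # illustration only; real model output should be sandboxed
    return namespace["slide"]


slide = run_slide_program(generated_program)
```

The point of the helpers is that the generated program stays short: the model writes three readable calls instead of low-level layout code, which is plausibly why a toolkit helps smaller models.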

-----

🔍 Key Insights:

→ Program generation produces better slides than end-to-end image generation

→ Small models struggle with direct code generation but improve with SlidesLib

→ Iterative refinement enhances slide quality across all scenarios
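
The refinement idea can be sketched as a generate, run, revise loop. This is a toy under stated assumptions: `stub_model` is a hypothetical stand-in for an LLM call, and the rule-based `check_slide` critic is invented for illustration. The post only says that models self-improve their output, so the feedback signal used here should not be read as the paper's method.

```python
def stub_model(instruction, previous_program=None, feedback=None):
    """Hypothetical stand-in for an LLM call; returns slide-program text."""
    if previous_program is None:
        # First draft: forgets the title.
        return 'slide = {"bullets": ["Revenue grew"]}'
    # Revision: a real model would condition on the previous program and feedback.
    return 'slide = {"title": "Q3 Results", "bullets": ["Revenue grew"]}'


def check_slide(slide):
    """Toy critic (an assumption): return a feedback string, or None if acceptable."""
    if "title" not in slide:
        return "slide has no title"
    return None


def refine(instruction, model, max_rounds=3):
    """Generate a program, run it, and request revisions until the critic passes."""
    program, feedback, slide = None, None, None
    for round_number in range(1, max_rounds + 1):
        program = model(instruction, program, feedback)
        namespace = {}
        exec(program, namespace)  # illustration only; sandbox real model output
        slide = namespace["slide"]
        feedback = check_slide(slide)
        if feedback is None:
            return slide, round_number
    return slide, max_rounds


final_slide, rounds_used = refine("Make a slide titled 'Q3 Results'", stub_model)
```

Capping the loop with `max_rounds` keeps the cost bounded when a revision never satisfies the critic; the last attempt is returned in that case.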