Bits beat pixels: A new way to teach AI about images
Infinity turns every image into bits, making high-resolution AI image generation faster and higher quality
Infinity introduces a bitwise visual autoregressive model that transforms high-resolution image generation by using infinite vocabulary tokenization and self-correction mechanisms.
-----
https://arxiv.org/abs/2412.04431
🤖 Original Problem:
Current autoregressive models struggle with high-resolution image generation due to limited vocabulary size, poor reconstruction quality, and train-test discrepancy issues.
-----
🔧 Solution in this Paper:
→ Introduces bitwise modeling framework with three key components: bitwise multi-scale residual quantizer, infinite-vocabulary classifier, and bitwise self-correction
→ Scales tokenizer vocabulary to 2^64 while reducing memory by 99.95% through dimension-independent bitwise quantization
→ Implements parallel binary classifiers in place of a single conventional softmax classifier, making the extremely large vocabulary tractable
→ Uses random bit flipping and re-quantization during training (bitwise self-correction) so the model learns to recover from its own prediction errors
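The core ideas above can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's actual implementation: `bitwise_quantize`, `self_correct`, and `flip_prob` are hypothetical names/parameters chosen for clarity.

```python
import numpy as np

def bitwise_quantize(z):
    """Sign-based binary quantization: each continuous feature dimension
    maps independently to a bit (+1 / -1), so a d-dim code indexes a 2^d
    vocabulary with no explicit codebook lookup."""
    return np.where(z >= 0, 1.0, -1.0)

def self_correct(bits, flip_prob=0.1, rng=None):
    """Randomly flip a fraction of bits during training, mimicking
    prediction errors the model must learn to correct at inference."""
    rng = rng or np.random.default_rng(0)
    mask = rng.random(bits.shape) < flip_prob
    return np.where(mask, -bits, bits)

z = np.array([0.3, -1.2, 0.7, -0.1])   # toy continuous features
bits = bitwise_quantize(z)             # binary code for this scale
noisy = self_correct(bits, 0.25)       # corrupted code seen in training
residual = z - bits                    # residual passed to the next scale
```

In the multi-scale residual setup, the residual (rather than the raw feature) is what gets quantized again at the next, finer scale.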
-----
💡 Key Insights:
→ Bitwise tokenization enables nearly infinite vocabulary while maintaining low memory usage
→ Parallel binary classification is more efficient than conventional methods for large vocabularies
→ Self-correction mechanism significantly reduces train-test discrepancy
→ Progressive training strategy improves generation quality across resolutions
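Why parallel binary classification saves memory: a conventional softmax head over 2^d codes needs a weight matrix proportional to 2^d, while d independent binary classifiers need only d output units. A back-of-the-envelope sketch (the sizes `h` and `d` here are illustrative, not from the paper):

```python
def conventional_head_params(hidden, vocab_bits):
    # Softmax classifier over 2^d codes: hidden * 2^d weights.
    return hidden * (2 ** vocab_bits)

def binary_head_params(hidden, vocab_bits):
    # d parallel binary classifiers: hidden * d weights.
    return hidden * vocab_bits

h, d = 2048, 16                          # hypothetical model sizes
conv = conventional_head_params(h, d)    # 2048 * 65,536 weights
binh = binary_head_params(h, d)          # 2048 * 16 weights
savings = 1 - binh / conv                # > 99.9% fewer parameters
```

Even at a modest 16-bit vocabulary the reduction already exceeds 99.9%, which is the scaling behavior behind the paper's reported 99.95% memory savings at much larger vocabularies.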
-----
📊 Results:
→ Generates 1024×1024 images 2.6× faster than SD3-Medium (0.8s vs 2.1s)
→ Improves GenEval score from 0.62 to 0.73 compared to SD3-Medium
→ Achieves 66% win rate in human evaluation
→ Reduces memory usage by 99.95% compared to conventional classifiers
------
Are you into AI and LLMs❓ Join me on X/Twitter with 52K+ others, to remain on the bleeding-edge of AI every day.
𝕏/🐦 https://x.com/rohanpaul_ai