Turns out, it's all in how you ask the question. Prompting is super-powerful.
A single finite Transformer can compute anything computable, just by changing its prompts
Mathematical proof that prompting makes Transformers universal computers
https://arxiv.org/abs/2411.01992
Original Problem 🤔:
The field lacks a theoretical understanding of how a single LLM can perform many different tasks purely through prompting. Despite plenty of empirical success, the fundamental capabilities and limitations of the prompting paradigm remain unknown.
-----
Solution in this Paper 🔧:
→ The paper introduces a novel computational model, the two-tape Post-Turing Machine (2-PTM), that can efficiently simulate Turing machines
→ They construct a finite-size decoder-only Transformer that can execute 2-PTMs through chain-of-thought steps
→ The solution uses ReLU activation, layer normalization, and causal attention to implement Boolean operations and equality checks (see the sketch after this list)
→ The paper develops special encoding schemes to represent arbitrary computations in finite-sized prompts
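To make the Boolean-logic point concrete, here is a minimal sketch (not the paper's actual construction) of how ReLU arithmetic can express AND, OR, NOT, and an equality check on {0, 1} values, the kind of primitive the constructed Transformer layers rely on. All function names here are illustrative.

```python
# Illustrative only: Boolean operations and equality checks via ReLU arithmetic.
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def AND(a, b):   # a, b in {0, 1}
    return relu(a + b - 1.0)

def OR(a, b):
    return 1.0 - relu(1.0 - a - b)

def NOT(a):
    return 1.0 - a

def EQ(a, b):    # 1 if a == b, else 0
    return 1.0 - relu(a - b) - relu(b - a)

# Quick check over all Boolean inputs.
for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        assert AND(a, b) == float(int(a) and int(b))
        assert EQ(a, b) == float(a == b)
```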
-----
Key Insights 💡:
→ Prompting is Turing-complete: a single finite Transformer can compute any computable function given the right prompt
→ The model doesn't need to memorize all functions or have answers embedded in prompts
→ The construction achieves nearly the same complexity bounds as unbounded-size Transformers
→ Chain-of-thought steps are necessary for universal computation (schematic below)
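Here is a minimal schematic of that chain-of-thought execution loop, assuming a hypothetical `transformer_next_token` stand-in for the constructed decoder-only model; each generated token corresponds to one simulated 2-PTM step, which is why the number of steps (not the model size) grows with the computation.

```python
# Schematic only: chain-of-thought as a stepwise simulator.
def run_with_chain_of_thought(transformer_next_token, prompt_tokens, halt_token):
    tape = list(prompt_tokens)               # prompt encodes the target 2-PTM and its input
    while True:
        nxt = transformer_next_token(tape)   # one forward pass = one simulated machine step
        tape.append(nxt)                     # append to the context and keep generating
        if nxt == halt_token:
            return tape                      # the generated trace contains the answer
```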
-----
Results 📊:
→ Can compute any TIME(t(n)) function within O(t(n) log t(n)) chain-of-thought steps
→ Requires only O(log(n + t(n))) bits of precision for any length-n input
→ Can decide any language in P using log precision (worked instance below)
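As an illustrative reading of these bounds (the choice t(n) = n² is an example, not from the paper):

```latex
\[
  \underbrace{O\bigl(t(n)\log t(n)\bigr)}_{\text{chain-of-thought steps}} = O(n^{2}\log n),
  \qquad
  \underbrace{O\bigl(\log(n + t(n))\bigr)}_{\text{precision (bits)}} = O(\log n).
\]
```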