Tweet by Jay Hack @mathemagic1an

Speculative Sampling: Accelerating Text Generation


🧵 The TL;DR

DeepMind has developed a way to use a smaller/faster model to generate K (potentially obvious) tokens that are then checked by a slower/smarter model, resulting in a 2x+ speedup on natural language generation.
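A minimal sketch of how the accept/reject scheme from the paper can work, assuming hypothetical `draft_probs` and `target_probs` callables that return next-token distributions for the small draft model and the large target model; this is an illustration, not DeepMind's actual implementation:

```python
import numpy as np

def speculative_step(prefix, draft_probs, target_probs, K, rng):
    """One speculative decoding step: the draft model proposes K tokens,
    the target model verifies them. Returns the accepted tokens."""
    # 1) Draft model proposes K tokens autoregressively (cheap).
    drafted, q = [], []
    ctx = list(prefix)
    for _ in range(K):
        q_t = draft_probs(ctx)                 # draft distribution q(. | ctx)
        tok = rng.choice(len(q_t), p=q_t)
        drafted.append(tok)
        q.append(q_t)
        ctx.append(tok)

    # 2) Target model scores the prefix plus all K drafted tokens.
    #    In practice this is a single parallel forward pass, which is
    #    where the speedup comes from.
    p = [target_probs(list(prefix) + drafted[:t]) for t in range(K + 1)]

    # 3) Accept each drafted token with probability min(1, p(tok)/q(tok)).
    accepted = []
    for t, tok in enumerate(drafted):
        if rng.random() < min(1.0, p[t][tok] / q[t][tok]):
            accepted.append(tok)
        else:
            # Rejected: resample from the normalized residual (p - q)_+,
            # which keeps the output distribution equal to the target's.
            residual = np.maximum(p[t] - q[t], 0.0)
            residual /= residual.sum()
            accepted.append(rng.choice(len(residual), p=residual))
            return accepted

    # All K drafts accepted: take one bonus token from the target model.
    accepted.append(rng.choice(len(p[K]), p=p[K]))
    return accepted
```

Because the acceptance rule corrects for the mismatch between the two models, the output distribution matches the large model's while most tokens only cost a draft-model forward pass.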


🔑 Key Points

  • DeepMind has developed a technique called Speculative Sampling
  • This technique uses a small/fast model to quickly generate K (potentially obvious) tokens
  • The slower/smarter model checks the work of the small model, resulting in a 2x+ speedup on natural language generation
