What stands out:

- Uses diffusion-based generation instead of sequential token-by-token decoding
- Generates tokens in parallel and refines them over a few steps
- Claims 1,009 tokens/sec on NVIDIA Blackwell GPUs
- Pricing: $0.25 / 1M input tokens, $0.75 / 1M output tokens
- 128K context
- Tunable reasoning
- Native tool use + schema-aligned JSON output
- OpenAI API compatible

They’re positioning it heavily for:

- Coding assistants
- Agentic loops (multi-step inference chains)
- Real-time voice systems
- RAG/search pipelines with multi-hop retrieval
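To make the "parallel generation + refinement" claim concrete, here's a toy sketch of how diffusion-style decoding differs from left-to-right decoding: start with every position masked, then commit a batch of positions in parallel on each refinement step. This is purely illustrative (random choices stand in for the model's predictions; function and variable names are made up), not the model's actual algorithm.

```python
import random

def diffusion_decode(vocab, length, steps=4, seed=0):
    """Toy sketch of diffusion-style decoding: all positions start
    masked, and each step un-masks a batch of positions in parallel,
    instead of emitting one token at a time left-to-right."""
    rng = random.Random(seed)
    MASK = "<mask>"
    tokens = [MASK] * length
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        # commit a fraction of the remaining masked positions this step
        k = max(1, len(masked) // (steps - step))
        for i in rng.sample(masked, min(k, len(masked))):
            tokens[i] = rng.choice(vocab)  # stand-in for a model prediction
    # final pass: fill anything still masked
    for i, t in enumerate(tokens):
        if t == MASK:
            tokens[i] = rng.choice(vocab)
    return tokens
```

The throughput win comes from filling many positions per forward pass over a few steps, rather than one position per pass.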
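Since it advertises OpenAI API compatibility with JSON output, requests should follow the standard Chat Completions shape. A minimal sketch of the request body — the base URL and model id below are placeholders, not the provider's real values:

```python
import json

# Hypothetical endpoint and model id -- check the provider's docs for
# the real values; only the request shape is standard Chat Completions.
BASE_URL = "https://api.example.com/v1"

payload = {
    "model": "example-diffusion-model",  # placeholder model id
    "messages": [
        {"role": "user", "content": "Summarize this ticket as JSON."}
    ],
    "response_format": {"type": "json_object"},  # JSON-mode output
    "max_tokens": 256,
}
body = json.dumps(payload)  # POST this to f"{BASE_URL}/chat/completions"
```

Because the shape matches, existing OpenAI SDK clients should work by pointing `base_url` at the provider's endpoint.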
Originally posted by u/TyedalWaves on r/ArtificialInteligence

