MiniMax M1

MiniMax’s first open-source large-scale hybrid-attention reasoning model. It has a 1M-token context window (roughly 8× DeepSeek R1’s at the time) and supports up to 80K tokens of reasoning output. Released in two variants, MiniMax-M1-40k and MiniMax-M1-80k, named for their maximum reasoning-output lengths. Open weights are on Hugging Face.
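
A minimal loading sketch, assuming the Hugging Face repo ids follow the announced variant names (MiniMaxAI/MiniMax-M1-40k / -80k) and that the repo ships custom modeling code (hence `trust_remote_code=True`); prompt and generation settings here are illustrative, not MiniMax’s recommended defaults:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "MiniMaxAI/MiniMax-M1-80k"  # 80K reasoning-output variant (assumed repo id)
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    trust_remote_code=True,
    device_map="auto",  # shard across available GPUs; the weights are large
)

inputs = tokenizer("Prove that sqrt(2) is irrational.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```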

Why it matters

The “lightning attention” hybrid architecture is the technical moat: a mix of lightning (linear) attention layers and standard softmax attention layers that keeps long-context processing efficient, letting the model use its full 1M window without the throughput collapse that hits pure softmax attention at long sequence lengths. M1 anchors a model family that continued into M2.7.
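
A toy sketch of the linear-attention idea that makes this possible; this is not MiniMax’s actual kernel (lightning attention is computed blockwise with fused kernels, and the feature map here is an assumption), it only shows why the recurrent form scales linearly with sequence length:

```python
import numpy as np

def linear_attention(q, k, v):
    """Causal linear attention via a running key-value state.

    q, k, v: arrays of shape (seq_len, d). Uses a simple elu+1 feature
    map (an assumption; the real kernel differs). Cost per token is
    O(d^2), so a full sequence costs O(n * d^2) rather than the
    O(n^2 * d) of standard softmax attention.
    """
    def phi(x):  # positive feature map: elu(x) + 1
        return np.where(x > 0, x + 1.0, np.exp(x))

    q, k = phi(q), phi(k)
    d = q.shape[-1]
    kv_state = np.zeros((d, d))   # running sum of outer(k_t, v_t)
    norm_state = np.zeros(d)      # running sum of k_t, for normalization
    out = np.empty_like(v)
    for t in range(q.shape[0]):
        kv_state += np.outer(k[t], v[t])
        norm_state += k[t]
        out[t] = (q[t] @ kv_state) / (q[t] @ norm_state + 1e-6)
    return out

# Doubling seq_len roughly doubles the work; softmax attention would quadruple it.
q = k = v = np.random.randn(1024, 64).astype(np.float32)
_ = linear_attention(q, k, v)
```

The fixed-size `kv_state` is the design point: memory and per-token compute stay constant as the sequence grows, which is what lets the full 1M window stay usable.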

Status in the wiki

Sources

See also