Mistral AI released Mistral Small 4 on March 16, a 119-billion-parameter Mixture-of-Experts model that consolidates multiple specialized capabilities into a single open-source package.
Model Architecture
Mistral Small 4 uses a 128-expert Mixture-of-Experts design with only 6 billion active parameters per inference pass. This sparse activation keeps inference fast while maintaining the representational power of a much larger dense model.
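The per-token routing idea behind sparse activation can be sketched in a few lines. This is a toy illustration only: the `TOP_K` value, the gating function, and the renormalization step are assumptions for demonstration, since the release notes state active parameter count rather than how many experts are routed per token.

```python
# Toy sketch of sparse Mixture-of-Experts routing (illustrative only;
# real gating, expert shapes, and load balancing differ).
import math
import random

NUM_EXPERTS = 128  # matches the reported expert count
TOP_K = 2          # hypothetical: the routed-expert count is not stated


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, top_k=TOP_K):
    """Pick the top_k experts for one token from router logits."""
    probs = softmax(router_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    # Renormalize the gate weights over the chosen experts only.
    total = sum(probs[i] for i in chosen)
    return [(i, probs[i] / total) for i in chosen]


random.seed(0)
logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
selection = route(logits)
print(selection)  # only TOP_K of the 128 experts run for this token
```

The point of the sketch is the ratio: each token touches only a handful of experts, which is why a 119B-parameter model can run with roughly 6B parameters active per pass.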
The model supports two inference modes:
- `reasoning_effort="none"` for low-latency responses
- `reasoning_effort="high"` for extended reasoning, with "equivalent verbosity to previous Magistral models," according to Mistral's release notes
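A request can select between the two modes per call. The sketch below builds a chat-completions payload carrying the `reasoning_effort` field; the endpoint URL, model id, and exact field placement are assumptions (modeled on an OpenAI-compatible API), not documented behavior.

```python
# Sketch of selecting an inference mode per request. The API_URL and
# model id are assumed for illustration, not confirmed identifiers.
import json

API_URL = "https://api.mistral.ai/v1/chat/completions"  # assumed endpoint


def build_request(prompt: str, effort: str = "none") -> dict:
    """Build a chat payload with the given reasoning_effort setting."""
    if effort not in ("none", "high"):
        raise ValueError("reasoning_effort must be 'none' or 'high'")
    return {
        "model": "mistral-small-4",  # hypothetical model id
        "reasoning_effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }


fast = build_request("Summarize this ticket.", effort="none")
deep = build_request("Prove the invariant holds.", effort="high")
print(json.dumps(deep, indent=2))
```

In practice the payload would be POSTed to the API with an auth header; only the payload construction is shown here.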
Unified Capabilities
The release consolidates three prior model families into one:
- Reasoning — depth equivalent to Magistral
- Multimodal — text + image, previously Pixtral
- Agentic coding — code generation for autonomous systems, previously Devstral
The model supports a 256K context window, enabling longer documents and complex multi-step agent tasks without context truncation.
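A quick pre-flight check helps decide whether a long document fits the window before sending it. The 4-characters-per-token heuristic below is a common rough approximation, not a property of Mistral's tokenizer, and the reserved output budget is an arbitrary choice.

```python
# Rough check that a document fits a 256K-token context window.
# CHARS_PER_TOKEN is a crude heuristic, not Mistral's tokenizer.
CONTEXT_WINDOW = 256_000
CHARS_PER_TOKEN = 4


def fits_in_context(text: str, reserved_for_output: int = 4_096) -> bool:
    """Estimate token count and leave room for the model's reply."""
    est_tokens = len(text) // CHARS_PER_TOKEN
    return est_tokens + reserved_for_output <= CONTEXT_WINDOW


doc = "x" * 1_000_000  # ~250K estimated tokens
print(fits_in_context(doc))  # True: fits with output budget to spare
```

For real deployments the estimate should be replaced with an actual tokenizer count, since character-based heuristics can be badly off for code or non-Latin text.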
Licensing and Availability
Mistral Small 4 ships under the Apache 2.0 license, available on Hugging Face, the Mistral API, and NVIDIA Build. Apache 2.0 means any organization can download, deploy, and modify the model without licensing fees or usage restrictions.
Performance Gains
According to Mistral's internal testing, the model generates responses 40% faster than its predecessor and sustains three times the concurrent query throughput.
NVIDIA Partnership Announcement
Embedded in the Small 4 release is a new NVIDIA-Mistral partnership to co-develop frontier open models, announced simultaneously at GTC 2026. The partnership positions Mistral as NVIDIA’s preferred open-model provider for the NemoClaw agentic framework and broader developer ecosystem.
This dual-track strategy — closed enterprise models (NemoClaw) alongside open community models (Mistral) — reflects NVIDIA’s shift toward a full-stack agentic AI platform play, where NVIDIA controls infrastructure and partners provide the model layer.
Why It Matters
Mistral Small 4’s Apache 2.0 license on a 119B model makes the open-source stack viable for production agentic AI. Prior open models at this scale required custom training or faced licensing restrictions. Small 4 unifies reasoning, multimodal, and code capabilities that previously required separate model deployments.
For teams building autonomous agents, this means deploying a single open model across multiple agent types — reasoning agents, code-generation agents, and multimodal agents — without vendor lock-in or per-token fees.
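One way to realize that single-model deployment is a dispatch table mapping agent roles to per-request settings on the same model. The role names, parameter values, and payload shape below are illustrative assumptions, not a documented configuration.

```python
# Hypothetical dispatch table: one open model, different per-agent
# settings. Role names and parameter values are illustrative.
AGENT_PROFILES = {
    "reasoning": {"reasoning_effort": "high", "temperature": 0.2},
    "coding": {"reasoning_effort": "none", "temperature": 0.0},
    "multimodal": {"reasoning_effort": "none", "temperature": 0.7},
}


def request_for(role: str, content: str) -> dict:
    """Build a request for the given agent role against the one model."""
    profile = AGENT_PROFILES.get(role)
    if profile is None:
        raise KeyError(f"unknown agent role: {role}")
    return {
        "model": "mistral-small-4",  # hypothetical model id
        "messages": [{"role": "user", "content": content}],
        **profile,
    }


print(request_for("coding", "Refactor this function.")["reasoning_effort"])
# prints "none"
```

The design point is that swapping agent behavior becomes a configuration change rather than a separate model deployment.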
The NVIDIA partnership signals that open-source infrastructure is no longer competing against proprietary closed models, but complementing them. NVIDIA runs enterprise proprietary models on Vera Rubin infrastructure. Developers run Mistral open models on commodity hardware and dev clouds. Both strategies coexist within a single platform narrative.