Mistral AI released Medium 3.5 on April 29, a 128B dense model with a 256k context window and open weights under a modified MIT license. Alongside the model, the company launched remote coding agents in its Vibe CLI that run in isolated cloud sandboxes, work through long tasks without developer supervision, and open pull requests when finished.
Medium 3.5 scores 77.6% on SWE-Bench Verified, ahead of Mistral’s own Devstral 2 and Qwen3.5 397B A17B, according to Mistral. The model also scores 91.4 on the τ³-Telecom agentic benchmark. Reasoning effort is configurable per request, meaning the same model handles quick chat replies and extended agentic runs. Mistral trained the vision encoder from scratch for variable image sizes and aspect ratios.
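Per-request reasoning effort might look like the following sketch. This is an assumption, not documented Mistral API behavior: the `reasoning_effort` field, the effort level names, and the `mistral-medium-3.5` model identifier are all hypothetical, modeled loosely on how chat-completions payloads are typically shaped.

```python
# Hypothetical sketch of configuring reasoning effort per request.
# The `reasoning_effort` field and model name are assumptions, not
# confirmed parameters of the Mistral API.

def build_request(prompt: str, effort: str = "low") -> dict:
    """Build a chat-completions-style payload with a chosen effort level."""
    if effort not in {"low", "medium", "high"}:
        raise ValueError(f"unknown effort level: {effort}")
    return {
        "model": "mistral-medium-3.5",  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }

# The same model serves both ends of the spectrum: a quick chat reply
# and an extended agentic run differ only in the requested effort.
chat_payload = build_request("Summarize this diff.", effort="low")
agent_payload = build_request("Refactor the auth module.", effort="high")
```

The point of the sketch is that effort is a property of the request, not the deployment, so no separate "fast" and "slow" models need to be routed to.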
Cloud Agents, Not Local Loops
The bigger product shift is architectural. Coding agents in Vibe now run remotely in isolated sandboxes rather than on a developer’s laptop. Sessions can be spawned from the CLI or directly from Le Chat, Mistral’s web interface. Multiple sessions run in parallel. A local CLI session can be “teleported” to the cloud mid-task, carrying session history, task state, and approvals across environments.
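Mistral has not published how teleportation works internally; conceptually, the state the announcement lists as traveling with a session (history, task state, approvals) has to be serialized locally and rehydrated in the cloud sandbox. The class and field names below are hypothetical, a minimal sketch of that round trip:

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical sketch of session "teleportation": the state that must
# travel (history, task state, approvals) is exported on the laptop and
# rehydrated in the cloud sandbox. All names here are assumptions.

@dataclass
class AgentSession:
    history: list = field(default_factory=list)    # conversation turns
    task_state: dict = field(default_factory=dict) # branch, progress, etc.
    approvals: list = field(default_factory=list)  # permissions already granted

    def export(self) -> str:
        """Serialize the session for transfer to another environment."""
        return json.dumps(asdict(self))

    @classmethod
    def rehydrate(cls, blob: str) -> "AgentSession":
        """Reconstruct the session on the receiving side."""
        return cls(**json.loads(blob))

local = AgentSession(
    history=["user: fix flaky test", "agent: reproducing locally"],
    task_state={"branch": "fix/flaky-test", "step": 2},
    approvals=["install-deps"],
)
remote = AgentSession.rehydrate(local.export())
```

Because approvals travel with the session, the cloud copy can continue mid-task without re-prompting the developer for permissions already granted locally.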
Each session gets broad permissions within its sandbox to edit files and install dependencies. When a task completes, the agent can open a pull request on GitHub and send a notification, so developers review finished output rather than monitoring every step.
“Coding agents have mostly lived on your laptop. Today we’re moving them to the cloud, where they run on their own, in parallel, and you stop being the bottleneck on every step the agent takes,” Mistral wrote in the announcement.
Vibe integrates with GitHub for code and pull requests, Linear and Jira for issues, Sentry for incidents, and Slack or Teams for notifications. The target workload is high-volume, well-defined work: module refactors, test generation, dependency upgrades, CI investigations, and bug fixes.
The Coders Blog characterized the shift as a move “from a synchronous request-response model to an ‘offload and notify’ paradigm,” arguing that local LLM integration patterns are hitting scalability limits as agentic tasks grow more complex and longer-running.
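The "offload and notify" pattern the blog describes can be sketched with a standard worker pool: tasks are submitted to run independently and a completion callback stands in for the pull-request notification. This is an illustration of the paradigm, not of Vibe's actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# Minimal sketch of "offload and notify": agent tasks run in parallel
# without a developer in the loop, and a done-callback plays the role
# of the PR notification. Illustrative only, not Vibe's internals.

notifications = []

def agent_task(name: str) -> str:
    # Stand-in for a long-running remote agent session.
    return f"PR opened for {name}"

def notify(future):
    # Fires when a session finishes; the developer reviews afterward.
    notifications.append(future.result())

with ThreadPoolExecutor(max_workers=3) as pool:
    for task in ["module refactor", "test generation", "dependency upgrade"]:
        pool.submit(agent_task, task).add_done_callback(notify)

# After the pool drains, results are reviewed in batch rather than
# supervised step by step.
```

Contrast this with the synchronous request-response model, where the caller blocks on each step and throughput is capped by developer attention.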
Work Mode in Le Chat
Mistral also launched Work mode in Le Chat (preview), a new agentic mode powered by Medium 3.5 that handles multi-step tasks across connected tools. Work mode keeps connectors on by default, letting the agent reach into email, calendar, documents, and messaging platforms without manual configuration.
Use cases include: catching up across email, messages, and calendar in a single run; preparing meeting briefs with attendee context and talking points; triaging inboxes and drafting replies; creating Jira issues from discussions; and sending Slack summaries.
Every tool call and reasoning step is visible during execution. Le Chat requires explicit approval before sensitive actions like sending messages or modifying data.
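An approval gate of this kind can be sketched as a wrapper around tool calls: non-sensitive actions run directly, while sensitive ones require an explicit yes. Which actions Le Chat classifies as sensitive, and how its gate is implemented, are not specified; the set and function below are assumptions.

```python
# Hypothetical sketch of gating sensitive actions behind approval.
# The action names and the SENSITIVE_ACTIONS set are assumptions.

SENSITIVE_ACTIONS = {"send_message", "modify_data"}

def execute(action: str, approve) -> str:
    """Run a tool call, requiring explicit approval for sensitive ones."""
    if action in SENSITIVE_ACTIONS and not approve(action):
        return f"{action}: blocked (approval denied)"
    return f"{action}: executed"

log = [
    execute("read_calendar", approve=lambda a: False),  # not sensitive, runs
    execute("send_message", approve=lambda a: False),   # sensitive, blocked
    execute("send_message", approve=lambda a: True),    # approved, runs
]
```

Keeping the log of every call, as above, is what makes the agent's run auditable after the fact.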
Pricing and Availability
Medium 3.5 is available through the Mistral API at $1.50 per million input tokens and $7.50 per million output tokens. Open weights are on Hugging Face. The model is also available on NVIDIA build.nvidia.com endpoints and as an NVIDIA NIM containerized inference microservice. Self-hosting requires as few as four GPUs.
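At the listed prices, a back-of-the-envelope cost calculation is straightforward; the token counts in the example are illustrative, not measurements of any real run.

```python
# Cost arithmetic at the listed API prices: $1.50 per million input
# tokens, $7.50 per million output tokens. Token counts are made up.

INPUT_PER_M = 1.50
OUTPUT_PER_M = 7.50

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1e6) * INPUT_PER_M + (output_tokens / 1e6) * OUTPUT_PER_M

# A long agentic run reading 400k tokens and emitting 80k:
run = cost_usd(400_000, 80_000)  # 0.40 * 1.50 + 0.08 * 7.50 = $1.20
```

The 5x input/output price gap means long agentic runs, which are read-heavy, stay relatively cheap compared to generation-heavy workloads.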
Remote agents and Work mode are available on Le Chat Pro, Team, and Enterprise plans. Medium 3.5 replaces Devstral 2 as the default model in both Vibe CLI and Le Chat.