OpenClaw released 2026.5.26 beta 2 on May 27, shipping transcript capture as core platform infrastructure alongside Gateway performance improvements, mobile approval reactions, shared voice runtime primitives, and security hardening across browser, channel, and plugin boundaries.

The release is wide, but the pattern is consistent: make agents easier to inspect while running, cheaper to recover when something fails, and harder to manipulate through untrusted input.

Transcripts as Accountability Infrastructure

The headline change treats transcript capture and source-provider support for transcript-backed summaries as first-class infrastructure. The release improves source-provider chunks, cleaned user-turn persistence, media provenance, Codex mirrors, WebChat replies, CLI and TUI replay, and follow-up routing to admitted session targets.

The practical effect is accountability. Long-running agents don’t fail only when a model gives a bad answer. They fail when the system can’t prove what was said, what was routed, what was replayed, and which session owns the next action. For teams running OpenClaw in support, operations, coding, or internal automation, transcript-backed systems make summaries less opaque, follow-ups less ambiguous, and replays safer after runtime restarts, according to the OpenClaw Playbook analysis.

Gateway Gets Lighter

The performance work targets the operational surfaces operators feel most. Startup now avoids repeated plugin, channel, session, usage-cost, warning, scheduled-service, and filesystem scans. OpenClaw caches plugin metadata snapshots, package realpaths, stable Gateway metadata, model cost indexes, channel resolution, auth facts, and session details that previously required rediscovery on every check.

A production agent setup checks status constantly: model availability, cron health, browser readiness, channel delivery, usage costs, failed sessions, blocked tools, and active runs. When those checks each pay startup or metadata costs repeatedly, operators feel it as lag. Faster Gateway paths reduce that friction.

Reply paths also improve. The release separates user-facing sends from slower follow-up work, preserves Telegram typing and progress context, avoids hot-path model hydration, and tracks delivery timing. The human sees the useful response quickly while cleanup, compaction, diagnostics, and delivery bookkeeping run asynchronously.

Mobile Approval Reactions

Channel improvements span Telegram (typing/progress context, forum topic names, reply context, durable retry targets), iMessage (attachment handling, source dedupe, group media, catchup cursors, thumb approval reactions), WhatsApp (restored group and media behavior), and Signal (reaction approvals).

The approval reactions matter for operational use. Mobile approval flows become far more usable when a trusted person can approve or deny an agent action with a reaction instead of typing a command. Fewer stalled cron runs, fewer half-approved tool actions, less friction when the operator is away from a keyboard.

Voice Runtime Consolidation

Shared realtime turn-context tracking, output activity tracking, consult question matching, speakable-result extraction, forced-consult coordination, activation-name matching, and transcript screening now sit in a shared SDK path reused across Gateway Talk, Voice Call, Discord voice, browser voice, meeting surfaces, Google Meet commands, and node audio bridges.

The consolidation matters because voice agents are easy to make impressive and hard to make reliable. Wake names need tolerance without letting ambient speech trigger actions. Barge-in needs to understand whether the agent is speaking. Follow-up questions need enough transcript context for safe answers. Shared primitives reduce behavioral drift between surfaces.

Security Boundaries

The safety work covers multiple vectors. Browser snapshot reads now honor SSRF policy before ChromeMCP or direct CDP reads. System-event text is sanitized so plugin or channel labels cannot spoof nested prompt markers. Fetched file text and metadata are wrapped as external content. ClickClack sender allowlists run before dispatch. Invalidated device-token clients are rejected during rotation. Serialized tool-call text is scrubbed from replies.

This is infrastructure hygiene for a platform where agents increasingly act on messy external inputs: files, browser tabs, device events, plugin labels, channel messages, webhooks, and generated media. The platform draws a harder line between “external content I should inspect” and “instruction I should obey.”

The Infrastructure Maturation Signal

This beta ships infrastructure: transcripts, caching, approval flows, voice consolidation, and input sanitization — the kind of work that separates agent experimentation from agent operations. The trajectory is clear: OpenClaw is building for teams that attach agents to revenue work, not just personal productivity experiments.