Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Teaching an AI Agent to Debug Flaky Tests

If you’ve been connected to the internet for a while, you’ve surely heard of AI Agent Skills. They teach your agent to do this and that. You might have even used or written a couple of them yourself.

If you aren’t yet familiar with them, the idea is simple: Instead of repeating the instructions for a specific task in every prompt, you define them once and reuse them later. A Skill is an AI equivalent of a knowledge base article: a plain text document that lives in a discoverable location and describes steps, a set of conventions, or domain-specific knowledge.

Most Skills you see in the wild are for simple things like enforcing code style or commit message conventions. But they can be much more powerful than that. In this article, we’ll combine AI Skills, good old developer tools, and a bit of creative thinking to address a notoriously challenging task: making AI deterministically find the root cause of flaky tests.

The problem

Quoting the TeamCity CI/CD guide:

Flaky tests are defined as tests that return both passes and failures despite no changes to the code or the test itself.

Flakiness undermines the whole point of tests: When a test fails, you can’t tell whether something is actually broken. You can’t fully rely on the test results, and at the same time, you can’t ignore them. This wastes both human and infrastructure resources.

And as if the underlying bugs weren’t difficult enough on their own, flaky tests often fail only once in several thousand runs, making them extremely hard to reproduce and debug.

Example project

For the example project, let’s take the webshop demo from this article: Your Programs Are Not Single-Threaded. It is a Spring Boot project, in which one of the services has a TOCTOU (time-of-check to time-of-use) problem: It checks a condition and then acts on it, but another thread can change the state in between. In this particular case, it may sometimes cause duplicate invoice numbers and also makes the corresponding test flaky.

Here’s the problematic test:

@SpringBootTest
class InvoiceServiceTest {

    @Autowired
    private OrderService orderService;

    @Test
    void firstTwoOrdersGetInvoiceNumbersOneAndTwo() {
        CompletableFuture<Invoice> alice = CompletableFuture.supplyAsync(
                () -> orderService.checkout("Alice", BigDecimal.TEN));
        CompletableFuture<Invoice> bob = CompletableFuture.supplyAsync(
                () -> orderService.checkout("Bob", BigDecimal.TEN));

        String num1 = alice.join().getInvoiceNumber();
        String num2 = bob.join().getInvoiceNumber();

        assertEquals(Set.of("INV-00001", "INV-00002"), Set.of(num1, num2));
    }
}

The test creates two orders concurrently and checks that the resulting invoices get numbers INV-00001 and INV-00002. Because of a bug in InvoiceService, it can either pass or fail randomly.

Note: If you’re using IntelliJ IDEA, you can test whether a test is actually flaky by using the Run until failure option in the test runner. Leave the suspect spinning for some time and see if it eventually fails.


If we knew nothing about the underlying bug, and only had the test, is there a tool that could help us find the root cause? Or can we make one ourselves? Furthermore, could we delegate both building and using the tool to AI?

The intuition

Let’s come up with some intuition for this class of problem.

To produce two kinds of results, the execution must follow different code paths. The difference might be minimal, possibly just one extra method call or one if branch taken instead of another. But it has to be there; otherwise, the result would be consistent. So, if we could record the code path for a passing run and a failing run and then compare them, the diff should at least point us in the right direction. Ideally, by following the call tree, we could find the place where execution splits, and that place is where the flakiness originates.

Does this reasoning make sense? Let’s put it to the test.

Build the tools

What tool can we use for recording code paths? While not designed specifically for tracing, a test coverage tool can give us the information we’re after.

There are a couple of Java coverage tools to choose from, such as JaCoCo and IntelliJ IDEA’s coverage tool. We’ll go with IntelliJ IDEA’s, because it includes a hit counting feature that is very useful. We may need this extra granularity because the flakiness might stem not only from what is executed, but also how many times.

Run coverage from the command line

IntelliJ IDEA’s coverage tool has a familiar UI, but we need a way to launch it programmatically. Fortunately, coverage can also be collected from the command line by attaching the coverage agent to the JVM via Maven Surefire:

# AGENT_JAR is the path to the coverage agent JAR; IC_FILE is where the binary report is written.
mvn surefire:test \
  -Dtest=com.example.webshop.service.InvoiceServiceTest \
  "-DargLine=-Didea.coverage.calculate.hits=true \
    -javaagent:$AGENT_JAR=$IC_FILE,true,false,false,true,com.example.webshop.*"

The -Didea.coverage.calculate.hits=true flag tells the agent to record invocation counts per line rather than just a boolean hit/not-hit mask. After the test finishes, the results are written to a binary .ic file.

So far, so good, but we need the report in a format that both humans and AI can read.

Add text output

Luckily, the IntelliJ coverage agent is open-source. Let’s clone the project and ask AI to add a text reporter that converts binary reports to plain text.

The agent creates a new class called TextCoverageStatistics. After we build the project and run the reporter against our .ic file, we get something like this:

=== Coverage Summary ===

  Instructions: 236/618  38,2%
  Branches    : 0/20   0,0%
  Lines       : 56/150  37,3%
  ...

=== Per-Class Coverage ===

Class                                                           Lines    Line%  Methods    Meth%
--------------------------------------------------------------------------------------------
...
com.example.webshop.service.InvoiceNumberGenerator              4/4    100,0%    2/2    100,0%
com.example.webshop.service.InvoiceService                     10/10   100,0%    3/3    100,0%
com.example.webshop.service.OrderService                        6/6    100,0%    2/2    100,0%
...

The first part of the report gives a high-level overview: How many lines, branches, and methods were covered across the entire project. Below that, there’s a per-class breakdown showing the same metrics for each class individually.

The overview is followed by per-line hit counts for each class:

--- com.example.webshop.service.InvoiceService ---
  Line       Hits  Branch
  19            2
  20            1
  22            2
  23            2
  24            2
  ...

For every line that the coverage agent instrumented, we see how many times it was executed and whether any branches were taken. The actual report is longer, but you get the idea. Now we have a text representation of which lines were executed, and exactly how many times.

This is the raw material we need for the diff. So far, so good!
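To make the per-line section easy to compare programmatically, it can be loaded into a map. The following is a minimal sketch, not the code from the coverage agent: the class name HitReportParser and the "Class:line" key format are assumptions made for illustration, and the parser only relies on the layout shown in the report excerpt above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HitReportParser {

    /** Parses the per-line section of a text report into "Class:line" -> hit count. */
    public static Map<String, Integer> parse(String report) {
        Map<String, Integer> hits = new LinkedHashMap<>();
        String currentClass = null;
        for (String raw : report.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("--- ") && line.endsWith(" ---")) {
                // Section header such as "--- com.example.Foo ---"
                currentClass = line.substring(4, line.length() - 4).trim();
                continue;
            }
            String[] cols = line.split("\\s+");
            // Data rows start with a line number followed by a hit count;
            // column headers and "..." placeholders are skipped.
            if (currentClass != null && cols.length >= 2
                    && cols[0].matches("\\d+") && cols[1].matches("\\d+")) {
                hits.put(currentClass + ":" + cols[0], Integer.parseInt(cols[1]));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "--- com.example.webshop.service.InvoiceService ---",
                "  Line       Hits  Branch",
                "  19            2",
                "  20            1",
                "  22            2");
        System.out.println(parse(sample));
    }
}
```

The sketch ignores the Branch column; a fuller version would keep it, since branch data can vary between runs as well.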

Diff the reports

Presumably, the reports now contain the necessary information, and a very determined developer could peruse them and find the bug. But we’re not here for mundane tasks like that, right?

Let’s upgrade the tool so that it collects reports from multiple runs and presents the diff. The most controllable way would be to do one “brick” at a time, but I think we’re safe to delegate the entire thing to AI here, including the automation.

The resulting script runs the test in a loop until both of the following happen:

  • We get at least one passing and one failing run.
  • The specified number of runs has been completed.

Both conditions are important because test failures can be very rare, and the specified number of runs might not be enough to catch one. At the same time, there can be finer-grained variations within passing and failing runs, so we might want to catch those too.
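The stopping rule can be sketched in a few lines. This is an illustration under stated assumptions rather than the generated script: the class name, the BooleanSupplier that stands in for one test run (true means the run passed), and the maxRuns safety cap are all hypothetical additions.

```java
import java.util.function.BooleanSupplier;

public class RunUntilBothOutcomes {

    /**
     * Repeats the test until at least one pass and one failure have been seen
     * AND at least minRuns runs have completed. maxRuns is a hypothetical
     * safety cap so that a test that never flakes cannot loop forever.
     * Returns {passes, failures}.
     */
    public static int[] collect(BooleanSupplier testRun, int minRuns, int maxRuns) {
        int passes = 0;
        int failures = 0;
        while (passes + failures < maxRuns) {
            if (testRun.getAsBoolean()) {
                passes++;
            } else {
                failures++;
            }
            boolean bothOutcomesSeen = passes > 0 && failures > 0;
            if (bothOutcomesSeen && passes + failures >= minRuns) {
                break;
            }
        }
        return new int[] { passes, failures };
    }

    public static void main(String[] args) {
        // Fake test that fails on every third run, used only for the demo.
        int[] runNumber = { 0 };
        int[] tally = collect(() -> ++runNumber[0] % 3 != 0, 5, 100);
        System.out.println(tally[0] + " pass, " + tally[1] + " fail");
    }
}
```

With the fake test above, the first failure arrives on run 3, but the loop keeps going until the minimum of 5 runs is reached, ending with 4 passes and 1 failure.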

After the reports are collected, the script summarizes the lines that have variations between the runs. Here’s what it looks like:

Collected 20 runs: 12 pass, 8 fail

Lines that vary across runs:

  Invoice:29                           Hits(1,2)
  Invoice:31                           Hits(1,2)
  Invoice:32                           Hits(1,2)
  InvoiceNumberGenerator:15            Hits(1,2)
  InvoiceService:19                    Hits(1,2)  Branch(1/2)
  InvoiceService:20                    Hits(1,2)
  InvoiceService:22                    Hits(1,2)
  InvoiceService:24                    Hits(1,2)

All variations have the same pattern: the difference is not which lines were executed, but how many times. As we expected, the hit counting feature of IntelliJ IDEA’s coverage agent proved useful!
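The summary step boils down to collecting, for every line, the set of hit counts observed across runs and keeping only the lines where that set has more than one value. Here is a minimal sketch of that idea; the class name, the "Class:line" keys, and the sample numbers are made up for illustration and are not taken from the actual script.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class HitCountDiff {

    /** Maps each line whose hit count differs between runs to the sorted counts observed. */
    public static Map<String, SortedSet<Integer>> varyingLines(List<Map<String, Integer>> runs) {
        Set<String> allLines = new TreeSet<>();
        for (Map<String, Integer> run : runs) {
            allLines.addAll(run.keySet());
        }
        Map<String, SortedSet<Integer>> varying = new TreeMap<>();
        for (String line : allLines) {
            SortedSet<Integer> counts = new TreeSet<>();
            for (Map<String, Integer> run : runs) {
                // A line missing from a run is treated as having 0 hits.
                counts.add(run.getOrDefault(line, 0));
            }
            if (counts.size() > 1) {
                varying.put(line, counts);
            }
        }
        return varying;
    }

    public static void main(String[] args) {
        // Hypothetical hit counts from two runs of the same test.
        Map<String, Integer> runA = Map.of(
                "InvoiceService:19", 1, "InvoiceService:20", 1, "OrderService:12", 2);
        Map<String, Integer> runB = Map.of(
                "InvoiceService:19", 2, "InvoiceService:20", 2, "OrderService:12", 2);
        varyingLines(List.of(runA, runB)).forEach((line, counts) ->
                System.out.printf("  %-36s Hits(%s)%n", line,
                        counts.stream().map(String::valueOf).collect(Collectors.joining(","))));
    }
}
```

Lines with identical counts in every run, such as the hypothetical OrderService:12, drop out of the result, which is what keeps the summary short enough for a person or an agent to read.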

The varying lines point at a lazy initialization block in InvoiceService and its downstream effects in InvoiceNumberGenerator and Invoice. The variation in hit counts means that the initialization sometimes runs more than once, which shouldn’t happen. That’s exactly where the flakiness comes from.

If you missed the article that describes the problem, here’s why double initialization causes this bug. The createGenerator() method queries the database for the last used invoice number and creates a counter starting from that value. When two threads both enter the if (generator == null) block before either finishes, each reads the same number from the database and creates its own generator starting from the same value. The result is duplicate invoice numbers.
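The interleaving described above can be reproduced in isolation. The sketch below is not the webshop's InvoiceService: every name in it is a hypothetical stand-in, the database query is replaced by a constant, and a CyclicBarrier is added purely to force both threads past the null check before either one creates the counter, so the race happens on every run instead of once in a while.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class LazyInitRaceDemo {

    // Hypothetical stand-in for the lazily created invoice number generator.
    private AtomicInteger generator;

    // Demo-only: holds each thread inside the if block until the other arrives.
    private final CyclicBarrier bothPastCheck = new CyclicBarrier(2);

    // Stand-in for the database query that returns the last used invoice number.
    private int lastUsedNumber() {
        return 0;
    }

    String nextInvoiceNumber() throws Exception {
        AtomicInteger g = generator;
        if (g == null) {                 // time of check
            bothPastCheck.await();       // the other thread passes the check too
            g = new AtomicInteger(lastUsedNumber());
            generator = g;               // time of use: each thread installs its own counter
        }
        return String.format("INV-%05d", g.incrementAndGet());
    }

    /** Runs two concurrent checkouts and returns both invoice numbers. */
    static String[] checkoutTwice() {
        LazyInitRaceDemo service = new LazyInitRaceDemo();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<String> checkout = service::nextInvoiceNumber;
            Future<String> alice = pool.submit(checkout);
            Future<String> bob = pool.submit(checkout);
            return new String[] { alice.get(), bob.get() };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] numbers = checkoutTwice();
        System.out.println(numbers[0] + " " + numbers[1]); // prints "INV-00001 INV-00001"
    }
}
```

Because both threads create a counter that starts from the same value, both checkouts receive INV-00001, and the initialization line runs twice, which matches the varying hit counts in the diff.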

The coverage diff has pointed us at the very same TOCTOU race discussed in more detail in the previous article. What is novel in the current approach is that it doesn’t rely solely on human expertise and is easily accessible to AI.

Turning it into a Skill

Now, I’d say that AI-assisted modifications to open-source tools that help you solve the task at hand, all within minutes, are amazing on their own. But let’s keep our eyes on the bigger picture.

Here’s what we’ve done so far. We started with an intuition: Flaky tests take different code paths, and coverage analysis can reveal where they diverge. Then we turned that intuition into a concrete, repeatable procedure. Does this warrant a knowledge base article, or an AI Agent Skill, perhaps? Yes!

In the same agent session, let’s ask the agent to:

  1. Make sure all the scripts are self-contained and runnable.
  2. Document the entire procedure in a SKILL.md file, step by step, so that another agent can follow it without any prior context.

The agent packages everything and writes a guide that describes when to apply the Skill, what tools are needed, and what steps to follow.

The only follow-up during review was to align the Skill with the specification. The original Skill written by the agent lacked metadata in the frontmatter. Agents are good at making sense of Skills that omit minor details, but metadata is important for discoverability. Without it, a Skill might not be picked up by an agent in the first place.

Testing the Skill

To verify that the Skill actually works, let’s start a fresh agent session. No warm-up, no hints. Instead, let’s deliberately phrase the prompt in a very general way, something like “find and fix the cause of flakiness in InvoiceServiceTest”.

An agent uses the skill

The agent matches the Skill description from SKILL.md with the problem description, discovers the instructions, and executes them: It runs the coverage script, reads the diff, and identifies the race condition. Instead of guesswork, it follows the established steps and arrives at the same conclusion every time. That’s about as deterministic as generative AI can get!

Summary

The changes that we’ve made to the coverage agent are already published in the new version 1.0.774. And the Skill is available here.

In this article, we started with an intuition about flaky tests, built custom tooling around an open-source coverage agent, used it to find a race condition, and packaged the entire procedure into a reusable AI Skill. You can use this Skill for finding the root cause of flaky tests in your own projects, but I hope this post conveys the bigger idea.

AI Skills allow you to teach agents to solve virtually anything, as long as you can stack text interfaces together. Many hard programming problems can be broken down into simpler ones and solved using familiar tools. And with AI orchestrating all this, we can even make the process enjoyable. As was the case long before AI, curiosity is the only real prerequisite.

Have you been inspired to solve a tough problem in your own work? Would you like to share the Skills you wrote or find most useful? Let us know in the comments!

Happy debugging!

High-performance distributed caching with .NET

High-performance distributed caching with .NET
15 minutes by Jared Meade

Jared's guide walks through building a .NET console app with a two-layer caching setup using HybridCache and PostgreSQL. Slow external data calls that take seconds are replaced with responses returning in under a millisecond from memory, or in a fraction of a second from the database cache. When the in-memory cache expires, the database cache keeps things fast, and only after both expire does the app fetch fresh data from the source.

Find UI freezes before your users do
sponsored by JetBrains

Don’t let hidden bottlenecks make it to production. The Monitoring tool window in JetBrains Rider shows real-time CPU, GC, and memory data in interactive charts. Built-in detectors quickly flag UI freezes, hotspots, and GC spikes, then take you right to the code with one click. Fix performance issues faster. Try Rider today.

Anti-corruption layer in .NET: Protecting your domain from external APIs
9 minutes by Adrian Bailador

Adrian explains the Anti-Corruption Layer pattern in .NET, which protects your domain from being polluted by external APIs. He shows how to isolate and translate external models into your own domain language, keeping business logic clean. By using an Anti-Corruption Layer, changes in external systems only affect one layer, making your application easier to maintain, understand, and extend.

Explore union types in C# 15
7 minutes by Bill Wagner

C# 15 adds union types, letting you declare a value as exactly one of a fixed set of types. The compiler enforces exhaustive pattern matching, so missing cases cause warnings at build time rather than failures at runtime. Types in a union do not need to share a common ancestor. The feature is available now in .NET 11 Preview 2 with language version set to preview.

ASP.NET Core cookie size limits in production: Causes and fixes
10 minutes by Khalid Abuhakmeh

Khalid argues that authentication can fail in production due to oversized cookies. ASP.NET Core stores user claims in cookies, which can grow too large from many claims, tokens, or multiple schemes. This causes silent login failures or HTTP 431 errors. Solutions include reducing claims, disabling token storage, clearing old cookies, or using server-side sessions. The best approach is minimizing cookie data and loading user information on demand.

Build your own CQRS dispatcher in .NET 10
24 minutes by Mukesh Murugan

MediatR went commercial in July 2025, pushing many teams to find alternatives. In this article, Mukesh built a custom CQRS dispatcher in .NET 10 that replaces MediatR with about 100 lines of code, supports the same pipeline behavior pattern, returns ValueTask for fewer allocations, and benchmarks 4.4x faster than MediatR 12.4.1 on real BenchmarkDotNet runs.

And the most popular article from the last issue was:

OpenClaw 2026.5.3

Highlights

  • Plugins/file-transfer: add bundled file-transfer plugin with file_fetch, dir_list, dir_fetch, and file_write agent tools for binary file ops on paired nodes; default-deny per-node path policy under plugins.entries.file-transfer.config.nodes with operator approval, symlink traversal refused by default (opt-in followSymlinks), and a 16 MB byte ceiling per round-trip. (#74742) Thanks @omarshahine.
  • Plugins/install: harden official plugin install, uninstall, update, onboarding, ClawHub fallback, npm dependency-state reporting, and beta-channel update paths so externalized plugins behave like first-class package installs.
  • Gateway/performance: trim startup and Control UI hot paths by lazy-loading plugin/runtime discovery, cron, schema, shutdown, sessions, and model metadata work only when needed.
  • Channels/replies: improve Discord status reactions and degraded transport reporting, add WhatsApp Channel/Newsletter targets, and tighten Telegram, Feishu, Matrix, Microsoft Teams, and Slack delivery/recovery behavior.
  • Install/update: recover broken macOS LaunchAgent upgrades, reject source-only plugin packages before runtime load, and repair stale Gateway/plugin state during updates and doctor runs.
  • Agent/runtime reliability: preserve streamed provider replies, delayed A2A session replies, prompt/tool delivery, memory recall, web search provider discovery, and provider-specific thinking/model metadata across common edge cases.

Changes

  • Channels/streaming: add unified streaming.mode: "progress" drafts with auto single-word status labels and shared progress configuration across Discord, Telegram, Matrix, Slack, and Microsoft Teams.
  • Agents/commands: add /steer <message> for queue-independent steering of the active current-session run without starting a new turn when the session is idle. (#76934)
  • Tools/BTW: add /side as a text and native slash-command alias for /btw side questions.
  • Doctor/config: doctor --fix now commits safe legacy migrations even when unrelated validation issues (e.g. a missing plugin) prevent full validation from passing, so agents.defaults.llm and other known-legacy keys are always cleaned up by doctor --fix regardless of other config problems. Fixes #76798. (#76800) Thanks @hclsys.
  • Agents/tools: skip optional media and PDF tool factories when the effective tool denylist already blocks them, avoiding unnecessary hot-path setup for tools that will be filtered out before model use. (#76773) Thanks @dorukardahan.
  • Discord/status: let explicit reaction tool calls opt into tracking subsequent tool progress on the reacted message with trackToolCalls: true, and use the shared tool display emoji table for status reactions.
  • Gateway/config: stop Gateway startup and hot reload from auto-restoring invalid config; invalid config now fails closed and openclaw doctor --fix owns last-known-good repair.
  • Gateway/performance: lazy-load early runtime discovery and shutdown-hook helpers, defer maintenance timers until after readiness, and trim duplicate plugin auto-enable work during Gateway startup.
  • QA/Mantis: add a pnpm openclaw qa mantis discord-smoke runner and manual GitHub workflow that verify the Mantis Discord bot can see the configured guild/channel, post a smoke message, add a reaction, and upload artifacts.
  • QA/Slack: add a Slack live transport QA runner with canary and mention-gating coverage for the private bot-to-bot harness. Thanks @vincentkoc.
  • Plugins/onboarding: let Manual setup install optional official plugins, including ClawHub-backed diagnostics with npm fallback, and expose the external Codex plugin as a selectable provider setup choice. Thanks @vincentkoc.
  • Plugins/CLI/update: include package dependency install state in openclaw plugins list --json, trust official externalized npm migrations, clean stale bundled load paths for externalized installs, try plugin @beta updates first on the beta OpenClaw channel, and fall back to default/latest when no plugin beta release exists.
  • Plugins/ClawHub: annotate 429 errors with reset windows and unauthenticated higher-rate-limit hints, so operators can tell when downloads recover and when signing in helps. Thanks @RomneyDa.
  • Gateway/performance: lazy-load early runtime discovery, shutdown hooks, cron, channel-config schema metadata, restart sentinels, and maintenance timers after readiness; trim duplicate plugin auto-enable work and add startup CPU/profile controls.
  • Discord/status: let explicit reaction tool calls opt into tracking later tool progress with trackToolCalls: true, share tool display emoji mapping, and surface degraded Discord transport or gateway event-loop starvation in status output. (#76327) Thanks @joshavant.
  • Channels/WhatsApp: support explicit WhatsApp Channel/Newsletter @newsletter outbound message targets with channel session metadata instead of DM routing. Fixes #13417; carries forward the narrow outbound target idea from #13424. Thanks @vincentkoc and @agentz-manfred.
  • Agents/sandbox: store sandbox container and browser registry entries as per-runtime shard files, reducing unrelated session lock contention while openclaw doctor --fix migrates legacy monolithic registry files. (#74831) Thanks @luckylhb90.
  • Exec approvals: add a tree-sitter-backed shell command explainer for future approval and command-review surfaces. (#75004) Thanks @jesse-merhi.

Fixes

  • Channels/WhatsApp: allow @whiskeysockets/libsignal-node in onlyBuiltDependencies so pnpm v9+ blockExoticSubdeps no longer rejects the baileys git-tarball subdep and silences all inbound agent replies. Fixes #76539. Thanks @ottodeng and @vincentkoc.

  • Gateway/systemd: preserve operator-added secrets in the Gateway env file across re-stage while clearing OpenClaw-managed keys (such as OPENCLAW_GATEWAY_TOKEN) so a fresh staging value is never shadowed by a stale env-file copy; operator secrets are also retained when the state-dir .env is empty. Fixes #76860. Thanks @hclsys.

  • Plugin updates: do not short-circuit trusted official npm updates as unchanged when the default/latest spec still resolves to an already-installed prerelease that the installer should replace with a stable fallback. Thanks @vincentkoc.

  • Plugin tools: keep auth-unavailable optional tools hidden even when another default tool from the same plugin is available and tools.alsoAllow names the optional tool. Thanks @vincentkoc.

  • Realtime transcription: report socket closes before provider readiness as closed-before-ready failures instead of mislabeling them as connection timeouts for OpenAI, xAI, and Deepgram streaming transcription. Thanks @vincentkoc.

  • OpenAI/Google Meet: fail realtime voice connection attempts when the socket closes before session.updated, avoiding stuck Meet joins waiting on a bridge that never became ready. Thanks @vincentkoc.

  • QA/cache: require the full CACHE-OK <suffix> marker before live cache probes stop retrying, so suffix-only prose cannot hide a broken probe response. Thanks @vincentkoc.

  • Slack/Matrix: avoid creating blank progress-draft messages when streaming.progress.label=false and progress tool lines are disabled. Thanks @vincentkoc.

  • QA/Matrix: keep the mock OpenAI tool-progress provider aligned with exact-marker Matrix prompts so the hardened live preview scenario still forces a deterministic read before final delivery. Thanks @vincentkoc.

  • OpenAI/Google Meet: wait for realtime voice session.updated before treating the bridge as connected, so Meet joins do not return with audio queued behind an unconfigured realtime session. Thanks @vincentkoc.

  • Plugins/catalog: merge official external catalog descriptors into partial package channel config metadata, so lagging WeCom/Yuanbao manifests keep their own schema while still exposing host-supplied labels and setup text. Thanks @vincentkoc.

  • Plugins/catalog: supplement lagging official external WeCom and Yuanbao npm manifests with channel config descriptors and declared tool contracts from the OpenClaw catalog, so trusted package sweeps no longer fail because external package metadata trails the host contract. Thanks @vincentkoc.

  • Plugins/install: let trusted official @openclaw/* catalog installs recover when npm latest points at a prerelease by falling back to the newest stable version, or by selecting the newest exact prerelease for prerelease-only launch packages with a warning instead of making beta/development plugin sweeps fail at install time. Thanks @vincentkoc.

  • Google Meet: grant Chrome media permissions against the actual Meet tab, start the local realtime audio bridge only after Meet joins, expose realtime transcripts in status/logs, and force explicit audio responses with current OpenAI realtime output-audio events so BlackHole capture does not keep the OpenClaw participant muted or silent.

  • Memory/LanceDB: declare apache-arrow in the bundled memory plugin package so LanceDB installs include its runtime peer. Fixes #76910. Thanks @afiqfiles-max.

  • CLI/devices: retry explicit device-pair approval with operator.admin after a pairing-scope ownership denial, so existing admin-capable paired-device tokens can recover new Control UI/browser pairing after upgrades instead of requiring manual JSON edits. Fixes #76956. Thanks @neo19482.

  • Google Meet: use the local call-control microphone button instead of disabled remote participant mute buttons, and block realtime speech when the OpenClaw Meet microphone remains muted.

  • Google Meet: refresh realtime browser state during status and retry delayed speech after Meet finishes joining, so a just-opened in-call tab no longer leaves speech stuck behind stale not-in-call health.

  • Plugins/install: recover the install ledger from the managed npm root when plugins/installs.json is empty or partial, so reinstalling Discord and Codex no longer makes the other installed plugin disappear.

  • Google Meet: grant Meet media permissions through the Playwright browser context when CDP grants do not affect the attached Chrome page, and report in-call microphone/speaker permission problems instead of marking realtime speech ready.

  • QA/Slack: fail the live mention-gating scenario on any unexpected SUT reply, even when the reply does not echo the expected marker. Thanks @vincentkoc.

  • QA/Matrix: steer the live tool-progress preview check away from HEARTBEAT.md and report final preview candidates when the live marker reply misses the exact token. Thanks @vincentkoc.

  • QA/Matrix: let the live tool-progress preview check verify progress replacement events without depending on the preview saying Working. Thanks @vincentkoc.

  • Tlon: expose groupInviteAllowlist in the channel config schema and clarify that group invite auto-accept fails closed without an invite allowlist. Thanks @vincentkoc.

  • Control UI/WebChat: collapse duplicate in-flight internal text sends onto the active Gateway run so rapid repeat submits do not start fresh agent:main:main dispatches. Fixes #75737. Thanks @dsdsddd1 and @BunsDev.

  • Mattermost: accept the documented channels.mattermost.streaming config and honor streaming: "off" by disabling draft preview posts. Thanks @vincentkoc.

  • Mattermost: expose streaming progress config labels and help text in generated channel config metadata so Control UI/docs can explain the new channels.mattermost.streaming.progress.* fields. Thanks @vincentkoc.

  • Mattermost: honor channels.mattermost.streaming.progress.toolProgress=false in progress draft mode so compact tool status lines stay hidden until final delivery. Thanks @vincentkoc.

  • Microsoft Teams: honor progress draft tool lines in native Teams progress streams and suppress standalone tool messages when channels.msteams.streaming.progress.toolProgress=false. Thanks @vincentkoc.

  • Discord: keep progress draft boundary callbacks bound during streaming replies, so extension lint stays green while progress previews transition between assistant and reasoning blocks. Thanks @vincentkoc.

  • Discord: resolve SecretRef-backed bot tokens from the active runtime snapshot for named accounts and keep unresolved configured tokens from crashing status or health checks. (#76987) Thanks @joshavant.

  • Channels/streaming: expose streaming.progress.label, labels, maxLines, and toolProgress in bundled channel config metadata so progress draft settings appear in config, docs, and control surfaces. Thanks @vincentkoc.

  • Channels/streaming: normalize whitespace and case for streaming.progress.label: "auto" so progress draft labels keep using the built-in label pool instead of rendering a literal auto title. Thanks @vincentkoc.

  • Plugins/Codex: preserve Codex-native OAuth routing for /codex bind app-server turns so bound sessions keep the selected Codex auth profile instead of falling back to public OpenAI credentials. (#76714) Thanks @keshavbotagent.

  • Gateway/install: prefer supported system Node over nvm/fnm/volta/asdf/mise when regenerating managed gateway services, so gateway install --force no longer recreates service definitions that doctor immediately flags as version-manager-backed. Fixes #76339. Thanks @brokemac79.

  • Cron/status: render explicit delivery.mode: "none" jobs as no-delivery previews and label cron session history distinctly instead of showing fallback delivery or direct-session rows. Fixes #76945.

  • Gateway/usage: serve usage.cost and sessions.usage from a durable transcript aggregate cache with lock-safe background refreshes and localized stale-cache status, so large usage views avoid repeated full scans. (#76650) Thanks @Marvinthebored.

  • Plugins/hooks: let plugins.entries.<id>.hooks.timeoutMs and plugins.entries.<id>.hooks.timeouts bound plugin typed hooks from operator config, so slow hooks can be tuned without patching installed plugin code. Fixes #76778. Thanks @vincentkoc.

  • Telegram: add channels.telegram.mediaGroupFlushMs at the top level and per account so operators can tune album buffering instead of being stuck with the hard-coded 500ms media-group flush window. Fixes #76149. Thanks @vincentkoc.

  • Config/messages: coerce boolean messages.visibleReplies and messages.groupChat.visibleReplies values to the documented enum modes so an intuitive toggle no longer invalidates config and drops channel startup. Fixes #75390. Thanks @scottgl9.

  • Agents/network: allow trusted web-search providers and configured model-provider hosts to work behind Surge/Clash/sing-box fake-IP DNS by accepting RFC 2544 and IPv6 ULA synthetic answers only for the request's scoped hostname, without broad private-network access. Refs #76530 and #76549. Thanks @zqchris.

  • Providers: honor env-proxy settings for guarded provider model fetches when no explicit dispatcher policy is configured, preserving explicit transport overrides. Fixes #70453. (#72480) Thanks @mjamiv.

  • Web fetch: add a default-off tools.web.fetch.useTrustedEnvProxy opt-in for proxy-only environments so web_fetch can let an operator-controlled HTTP(S) proxy resolve DNS while preserving default strict DNS pinning and hostname policy checks. Refs #58034 and #62560. Thanks @cosmicnet and @mjamiv.

  • Feishu: accept and honor channels.feishu.blockStreaming at the top level and per account, while keeping the legacy default off so Feishu cards no longer reject documented config or silently drop block replies. Fixes #75555. Thanks @vincentkoc.

  • Gateway/update: avoid launchctl kickstart -k immediately after fresh macOS update bootstraps, and unlink dangling global plugin-runtime symlinks during packaged postinstall and doctor --fix so upgrades no longer SIGTERM the newly booted Gateway or leave bundled plugin imports pointed at pruned plugin-runtime-deps trees. Completes #76261 and fixes #76466. (#76929)

  • Google Chat: normalize custom Google auth transport headers before google-auth/gaxios interceptors run, restoring webhook token verification when certificate retrieval expects Fetch Headers. Fixes #76742. Thanks @donbowman.

  • Doctor/plugins: reset stale plugins.slots.memory and plugins.slots.contextEngine references during doctor --fix, so cleanup of missing plugin config does not leave unrecoverable slot owners behind. Fixes #76550 and #76551. Thanks @vincentkoc.

  • Docs/WhatsApp: merge the duplicate top-level web objects in the gateway channel config example so copy-pasted WhatsApp config keeps both web.whatsapp and reconnect settings. Fixes #76619. Thanks @WadydX.

  • Plugins/Anthropic: expose Claude thinking profiles from the bundled provider-policy artifact so non-runtime callers keep Opus 4.7 adaptive, xhigh, and max instead of downgrading to high. Fixes #76779. Thanks @tomascupr and @iAbhi001.

  • Plugins/tools: honor tools.alsoAllow as an optional plugin tool discovery hint without treating its internal allow-all default as permission to load every manifest-marked optional plugin tool. Fixes #76616.

  • Discord/native commands: skip slash-command registration and cleanup REST calls when channels.discord.commands.native=false, letting low-power gateways start without waiting on disabled native-command lifecycle requests. Fixes #76202. Thanks @vincentkoc.

  • CLI/plugins: reject unowned command roots such as openclaw foo before managed proxy startup and full plugin CLI runtime loading while preserving manifest-owned and CLI-metadata-owned plugin commands. Fixes #75287. Thanks @neilofneils404.

  • CLI/message: skip local configured-channel plugin preload for explicit gateway-owned message actions, letting normalized CLI delivery delegate to the gateway without initializing channel runtime in the short-lived CLI process. Fixes #75477.

  • Plugins/commands: normalize empty plugin command handler results and let Telegram native plugin commands send the empty-response fallback instead of throwing when a handler returns undefined. Fixes #74800. Thanks @vincentkoc.

  • Plugins/tools: cold-load selected plugin tool registries when the active registry only has partial tool coverage, so wildcard-expanded allowlists no longer hide installed plugin tools from tools.effective. Fixes #76780. Thanks @lilesjtu.

  • Plugins/tools: compare cached and runtime plugin tool name conflicts with normalized core tool names, so case variants of core tools are blocked instead of leaking duplicate tool registrations. Thanks @vincentkoc.

  • Plugins/OpenRouter: advertise DeepSeek V4 thinking levels, including xhigh and max, through the runtime and lightweight provider policy surfaces so /think validation no longer rejects OpenRouter-routed DeepSeek V4 models. Fixes #74788. Thanks @vincentkoc.

  • Status/sessions: ignore malformed non-string persisted session provider/model metadata instead of throwing while rendering status summaries. Fixes #76206. Thanks @vincentkoc.

  • CLI/config: remove only the targeted array element for openclaw config unset array[index] instead of replaying the unset during config write and deleting the shifted next element. Fixes #76290. Thanks @SymbolStar and @vincentkoc.

  • Plugins/voice-call: treat abnormal local Gateway close code 1006 as a standalone CLI fallback case, so voicecall smoke and related commands can still run the provider check path when the Gateway socket closes before returning a response.

  • CLI/doctor: migrate legacy per-channel streaming.progress config into streaming.preview.toolProgress, so upgrades with stale Discord or Telegram streaming keys validate again instead of blocking plugin commands.

  • Plugins/release: reject ClawHub code-plugin packages that contain TypeScript runtime entries without compiled dist/*.js output, and run package-local runtime-build checks during npm and ClawHub plugin release previews.

  • Plugins/update: keep beta-installed OpenClaw package updates on the beta plugin channel even when config still says stable, so Discord and other externalized plugins update from compiled @beta packages instead of stale source-only latest artifacts.

  • Agents/tools: stop treating tools.deny: ["write"] as an implicit apply_patch deny; operators who want to block patch writes should deny apply_patch or group:fs explicitly. Fixes #76749. (#76795) Thanks @Nek-12 and @hclsys.

  • Plugins/release: verify published plugin npm tarballs expose compiled runtime entries after publish, catching TS-only package artifacts before release closeout. Thanks @vincentkoc.

  • CLI/message: exit cleanly with a nonzero status when message-command plugin registry loading fails before dispatch, preventing openclaw-message children from staying alive after plugin load errors. Fixes #76168.

  • Plugins/config: report configured plugins that are present but blocked by path-safety checks as blocked instead of as stale "plugin not found" entries, and deduplicate repeated blocked-candidate warnings during discovery. Fixes #76144. Thanks @mayank6136.

  • Gateway/update: recover an installed-but-unloaded macOS LaunchAgent after package updates, rerun Gateway health/version/channel readiness checks, and print restart, reinstall, and rollback guidance before reporting update failure. (#76790) Thanks @jonathanlindsay.

  • CLI/plugins: explain when a missing plugin command alias belongs to a bundled plugin that is disabled by default, including the openclaw plugins enable <plugin> repair command. (#76835)

  • Gateway/Bonjour: auto-start LAN multicast discovery only on macOS hosts while preserving explicit openclaw plugins enable bonjour startup elsewhere, so Linux servers and containers that do not need LAN discovery avoid default mDNS probing and watchdog churn. Refs #74209.

  • Gateway/macOS: stop doctor and LaunchAgent recovery from running launchctl kickstart -k after a fresh bootstrap, avoiding an immediate SIGTERM of the just-started gateway while still nudging already-loaded launchd jobs. Fixes #76261. Thanks @solosage1.

  • Google Meet: route stateful CLI session commands through the gateway-owned runtime so joined realtime sessions survive after the starting CLI process exits. Fixes #76344. Thanks @coltonharris-wq.

  • Memory/status: split builtin sqlite-vec store readiness from embedding-provider readiness in memory status --deep and openclaw status, so local vector-store failures no longer look like provider failures and provider failures no longer hide a healthy local vector store.

  • CLI/doctor: trust a ready gateway memory probe when CLI-side active memory backend resolution is unavailable, preventing false "No active memory plugin is registered" warnings for healthy runtime setups. Fixes #76792. Thanks @som-686.

  • Memory/status: keep plain openclaw memory status and openclaw memory status --json on the cheap read-only path by reserving vector and embedding provider probes for --deep or --index. Fixes #76769. Thanks @daruire.

  • Telegram: suppress stale same-session replies when a newer accepted message arrives before an older in-flight Telegram dispatch finalizes. Fixes #76642. Thanks @chinar-amrutkar.

  • Gateway/diagnostics: throttle repeated long-running active-work session warnings so healthy cron or subagent runs no longer print the same recovery=none line every heartbeat.

  • Gateway/diagnostics: keep non-blocking active-work and transient event-loop max-spike liveness diagnostics out of the default gateway console while preserving structured diagnostic events and warnings for queued, stalled, and recovery-eligible work.

  • Slack: collapse routine Socket Mode pong-timeout reconnects into one OpenClaw reconnect line and suppress the duplicate Slack SDK pong warning.

  • Gateway/diagnostics: abort-drain embedded runs after an extended no-progress stall so a single dead session no longer leaves queued Discord/channel turns blocked behind repeated recovery=none liveness warnings.

  • Plugins/ClawHub: accept the live artifact resolver kind/sha256 field names alongside the typed artifactKind/artifactSha256 form so clawhub: installs of npm-pack and legacy ZIP packages no longer miss downloadable artifacts. Thanks @RomneyDa.

  • Control UI/Sessions: avoid full sessions.list reloads for chat-turn sessions.changed payloads, so large session stores no longer add multi-second delays while chat responses are being delivered. (#76676) Thanks @VACInc.

  • Gateway/watch: run doctor --fix --non-interactive once and retry when the dev Gateway child exits during startup, so stale local plugin install/config state no longer makes the tmux watch session disappear without a repair attempt.

  • Doctor/Telegram: warn when selected Telegram quote replies can suppress streaming.preview.toolProgress, and document the replyToMode trade-off without changing runtime delivery. Fixes #73487. Thanks @GodsBoy.

  • Channels/Discord: send a best-effort native typing cue immediately after an inbound DM is accepted, so slow pre-dispatch turns show Discord liveness before queueing, context assembly, model, or tool work starts. Fixes #76417. Thanks @mlopez14.

  • Plugins/install: reject source-only TypeScript package installs and installed plugin packages that are missing compiled runtime output, so broken npm artifacts fail at install/discovery time instead of falling through jiti and surfacing later as unavailable providers. Fixes #76720.

  • Plugins/config: deduplicate identical manifest compatibility diagnostics when an explicitly configured plugin overrides another discovered candidate, so external channel plugins do not print the same missing channelConfigs warning repeatedly during install and enable. Thanks @vincentkoc.

  • Discord/status: honor explicit messages.statusReactions.enabled: true in tool-only guild channels so queued ack reactions can progress through thinking/done lifecycle reactions instead of stopping at the initial emoji. Thanks @Marvinthebored.

  • Discord/native commands: compare Discord-normalized slash-command descriptions and localized descriptions during reconcile so CJK or multiline command text no longer triggers redundant startup PATCH bursts and rate-limit 429s. Fixes #76587. Thanks @zhengsx.

  • Agents/OpenAI: omit Chat Completions reasoning_effort for gpt-5.4-mini only when function tools are present while preserving tool-free Chat and Responses reasoning support, preventing Telegram-routed fallback runs from hanging after OpenAI rejects tool payloads. Fixes #76176. Thanks @ThisIsAdilah and @chinar-amrutkar.

  • Telegram: reuse the successful startup getMe probe for grammY polling startup and continue into getUpdates after recoverable deleteWebhook cleanup failures, reducing high-latency Bot API control-plane calls before long polling starts. Refs #76388. Thanks @jackiedepp.

  • Gateway/diagnostics: merge session id/key aliases in diagnostic session state and activity tracking so completed runs no longer leave stale queued work behind that keeps liveness samples at warning level.

  • Agents/models: forward model maxTokens as the default output-token limit for OpenAI-compatible Responses and Completions transports when no runtime override is provided, preventing provider defaults from silently truncating larger outputs. (#76645) Thanks @joeyfrasier.

  • macOS CLI/onboarding: honor sensitive wizard text steps in openclaw-mac wizard with termios no-echo input, suppressing saved credential previews while preserving long API keys and gateway tokens. Fixes #76698. Thanks @anurag-bg-neu and @sallyom.

  • Control UI/Skills: fix skill detail modal silently failing to open in all browsers by deferring showModal() until the dialog element is connected to the DOM; the Lit ref callback fired before connection, causing a DOMException: HTMLDialogElement.showModal: Dialog element is not connected on every skill click. Thanks @nickmopen.

  • Gateway/update: run doctor --non-interactive --fix after Control UI global package updates before reporting success, so legacy config is migrated before the gateway restart. Thanks @stevenchouai.

  • Gateway/cron: stop a lazy cron startup that loses a hot-reload race, preventing the old cron service from starting after reload has already replaced cron state.

  • CLI/plugins: warn when npm plugin installs remain shadowed by a failing config-selected source and surface the repair path in plugins doctor. Thanks @LindalyX-Lee.

  • Agents/Telegram: preserve explicit reply and quote context in embedded model prompts without letting quoted text drive prompt-local image loading. Fixes #76419. (#76659) Thanks @cheechnd.

  • Active Memory: apply setupGraceTimeoutMs to the embedded recall runner as well as the outer prompt-build watchdog, so very-cold first recalls keep the configured setup grace end-to-end. (#74480) Thanks @volcano303.

  • Channels/Feishu: cap how long the per-chat sequential queue blocks subsequent same-key tasks behind a single in-flight task (5 min default), so a single hung dispatch no longer leaves later same-chat messages in queued state until gateway restart; the stuck task continues running but is evicted from the blocking chain and a warning is logged. Fixes #70133. (#76687) Thanks @martingarramon and @bek91.

  • Active Memory: skip scoped Telegram forum-topic conversation ids (containing :) when resolving the embedded recall run channel, falling back to messageProvider instead, so Active Memory no longer throws a bundled-plugin dirName validation error in forum-topic sessions. Fixes #76704.

  • Agents/tools: defer automatic PDF model/auth resolution until the PDF tool is used, keeping agent-turn tool prep from probing auth profiles on messages without PDFs while preserving explicit PDF model registration. Fixes #76644. Thanks @hclsys.

  • CLI/config: keep JSON dry-run patches validating touched channel configuration against bundled channel schemas even when the patch only contains SecretRef objects.

  • Plugins/tools: keep disabled bundled tool plugins out of explicit runtime allowlist ownership and fall back from loaded-but-empty channel registries to tool-bearing plugin registries, so Active Memory can use bundled memory-core search/get tools even when memory-lancedb is disabled. Fixes #76603. Thanks @jwong-art.

  • Plugins/install: run npm install from the managed npm-root manifest so installing one @openclaw/* plugin preserves already installed sibling plugins instead of pruning them. Fixes #76571. (#76602) Thanks @byungskers and @crpol.

  • Plugins/context-engine: include the selected plugins.slots.contextEngine plugin in the gateway startup load plan so external context-engine plugins without activation.onStartup in their manifest are loaded before any agent turn resolves the active engine; prevents the "Context engine X is not registered; falling back to default engine legacy" warning after gateway startup. Fixes #76576. Thanks @hclsys.

  • Plugins/tools: restore on-demand registry load for path-based plugins (origin "config") so tool factories registered via plugins.load.paths are resolved at agent request time when no pre-warmed channel registry is present; prevents "unknown method" errors after gateway startup. Fixes #76598. Thanks @hclsys.

  • Plugins/hooks: include explicitly enabled hook-capable plugins in the Gateway startup runtime scope so embedded PI runs can see their before_prompt_build and agent_end hooks. Fixes #76649. Thanks @wwf3045 and @MkDev11.

  • Plugins/OpenCode: expose Claude thinking profiles through the lightweight provider policy surface so directive and session validation keep xhigh, adaptive, and max for opencode/claude-opus-4-7 instead of remapping xhigh to high. Fixes #76648. Thanks @aaajiao.

  • Channels/QQ Bot: resolve structured clientSecret SecretRefs before QQ token exchange, expose the QQ Bot secret contract to secrets tooling, and reject legacy secretref:/... marker strings. (#74772) Thanks @xialonglee.

  • Agents: keep active streamed provider replies alive by refreshing guarded fetch timeouts on raw body chunks and surface true prompt stream timeouts as explicit errors instead of partial assistant fragments. Fixes #76307. (#76633) Thanks @MkDev11.

  • Plugins/externalization: keep official ACPX, Google Chat, and LINE install specs on production package names, leaving beta-tag probing to the explicit OpenClaw beta update channel. Thanks @vincentkoc.

  • CLI/doctor: keep missing-plugin repair from overriding official catalog metadata with runtime fallbacks, so ACPX repairs preserve the official npm spec during the externalization rollout. Thanks @vincentkoc.

  • CLI/doctor: match stale bundled-plugin install records by exact parsed package name so doctor does not remove external npm or ClawHub records that only share an OpenClaw package-name prefix.

  • Plugins/catalog: preserve ClawHub install specs when generating the packaged channel catalog so future storepack-first channel plugins keep their remote source instead of becoming npm-only. Thanks @vincentkoc.

  • Plugins/catalog: pin bare npm specs from prerelease external channel catalog entries to the catalog entry version, so beta catalogs do not silently install the latest stable package.

  • Plugins/update: treat catalog-matched official npm updates and OpenClaw-authored externalized-bundled npm bridges as trusted official installs so launch-code plugins can update or migrate out of the bundled tree without scanner false positives. Thanks @vincentkoc.

  • Plugins/onboarding: fall back from ClawHub to npm only for missing package/version errors, keeping integrity and verification failures fail-closed during storepack rollout. Thanks @vincentkoc.

  • CLI/onboarding: mask credential inputs (model-auth provider API keys, gateway tokens and passwords, web-search provider keys, and skill env-var values) in the interactive openclaw onboard wizard so pasted secrets no longer echo into terminal scrollback, Start-Transcript logs, or screenshots; existing tokens/passwords are preserved through a masked-preview confirm step before the sensitive prompt. Thanks @anurag-bg-neu.

  • Control UI/Talk: fix Talk (OpenAI Realtime WebRTC) CORS failure by stripping server-side-only attribution headers (originator, version, User-Agent) from browser offer headers; api.openai.com/v1/realtime/calls only allows authorization and content-type in its CORS preflight, so forwarding these headers caused the browser SDP exchange to fail. Fixes #76435. Thanks @hclsys.

  • Chat delivery: make /verbose on|full|off changes affect subsequent tool-use chat bubbles again, including channels with draft preview tool progress enabled, while preserving one-shot verbose directives.

  • CLI/logs: auto-reconnect openclaw logs --follow on transient gateway disconnects with bounded backoff, stderr retry warnings, [logs] gateway reconnected recovery notices, and JSON notice records while still exiting immediately on non-recoverable auth or configuration errors. Fixes #74782. (#75059, #75372) Thanks @shashank-poola and @RomneyDa.

  • Codex/WhatsApp: keep the message dynamic tool available when Codex source replies are configured for message-tool delivery, so coding-profile chat agents do not complete turns privately without a visible channel reply. Fixes #76660. (#76663) Thanks @VishalJ99.

  • Codex/heartbeat: send heartbeat-specific initiative guidance through Codex turn-scoped collaboration-mode instructions, keeping ordinary message-tool chat turns in Default mode without heartbeat prompt leakage. Thanks @pashpashpash.

  • Plugins/onboarding: trust optional official plugin and web-search installs selected from the official catalog so npm security scanning treats them like other source-linked official install paths. Thanks @vincentkoc.

  • Agents/web_search: keep installed runtime provider discovery enabled when web-search metadata is missing, so externally installed official providers such as Brave remain visible to agent and cron turns instead of falling back to bundled-only lookup. Fixes #76626. Thanks @amknight.

  • Tests/plugins: expose the Discord npm onboarding Docker lane as a package script and assert planned Docker lanes point at real scripts, so external-channel onboarding coverage can actually run. Thanks @vincentkoc.

  • Plugins/ClawHub: explain unreleased ClawHub plugin artifacts as a rollout-state fallback to npm: installs instead of leaking raw archive metadata fields. Thanks @vincentkoc.

  • Tests/onboarding: assert packaged channel onboarding leaves openclaw channels status --json and plain openclaw status showing the configured channel, covering the empty Channels table regression path. Thanks @vincentkoc.

  • Microsoft Teams: persist sent-message markers across Gateway restarts so follow-up replies to recent bot messages keep resolving the original conversation instead of dropping out after restart, with marker TTLs preserved on best-effort recovery. (#75585) Thanks @amknight.

  • Matrix: persist pending approval reaction targets across Gateway restarts so room approvers can still approve or deny outstanding prompts after OpenClaw comes back online. (#75586) Thanks @amknight.

  • Channels/onboarding: map third-party official WeCom and Yuanbao catalog entries to their published plugin ids so npm installs pass expected-plugin validation. Thanks @vincentkoc.

  • Plugin SDK: restore the Mattermost and Matrix compatibility subpaths used by the pinned Yuanbao channel package so external installs can module-load after npm install. Thanks @vincentkoc.

  • Plugins/install: keep managed npm-root security scans from treating earlier plugin openclaw peer links as failures, so one external plugin install cannot poison later official npm installs. Thanks @vincentkoc.

  • Memory LanceDB: allow installed-but-unconfigured plugin metadata to load so onboarding and setup flows can prompt for embedding config instead of failing the plugin registry first. Thanks @vincentkoc.

  • CLI/plugins: keep plugins enable and plugins disable from creating unconfigured channel config sections, so channel plugins with required setup fields no longer fail validation during lifecycle probes. Thanks @vincentkoc.

  • Doctor/config: set messages.groupChat.visibleReplies: "message_tool" during compatibility repair for configured-channel configs that omit a visible-reply policy, so upgrades can persist the intended tool-only group/channel reply default. Thanks @kagura-agent.

  • Agents/sessions: keep delayed sessions_send A2A replies alive after soft wait-window timeouts, while preserving terminal run timeouts and avoiding stale target replies in requester sessions. Fixes #76443. Thanks @ryswork1993 and @vincentkoc.

  • TUI/Control UI: fix /think command showing only base thinking levels when the active session uses a different model from the default, so provider-specific levels like DeepSeek V4 Pro's xhigh and max are now visible and selectable. Fixes #76482. Thanks @amknight.

  • CLI/sessions: keep intentional empty agent replies silent after tool-delivered channel output, instead of surfacing a misleading "No reply from agent." fallback. Thanks @vincentkoc.

  • Config/doctor: cap .clobbered.* forensic snapshots per config path and serialize snapshot writes so repeated doctor --fix recovery loops cannot flood the config directory. Fixes #76454; carries forward #65649. Thanks @JUSTICEESSIELP, @rsnow, and @vincentkoc.

  • Feishu: suppress duplicate text when replies send native voice media, preserve captions for ordinary audio files, and send fallback text plus attachment links when audioAsVoice transcode/upload fallback produces a generic file.

  • TTS/plugins: activate configured and inherited speech provider plugins during Gateway startup, so Microsoft and Local CLI voice replies work immediately after persona selection instead of staying invisible in the startup plugin set. Fixes #76481. Thanks @amknight.

  • Feishu: keep packaged Feishu startup from bundling the Lark SDK's ESM __dirname path by loading the SDK as a plugin-local runtime dependency. Fixes #76291 and #76494. (#76392) Thanks @zqchris.

  • Plugins/npm: build package-local runtime dist files for publishable plugins and stop listing root-package-excluded plugin sidecars in the core package metadata, so npm plugin installs such as @openclaw/diffs and @openclaw/discord no longer publish source-only runtime payloads. Fixes #76426. Thanks @PrinceOfEgypt.

  • Channels/secrets: resolve SecretRef-backed channel credentials through external plugin secret contracts after the plugin split, covering runtime startup, target discovery, webhook auth, disabled-account enumeration, and late-bound web_search config. Fixes #76371. (#76449) Thanks @joshavant and @neeravmakwana.

  • Docker/Gateway: pass Docker setup .env values into gateway and CLI containers and preserve exec SecretRef passEnv keys in managed service plans, so 1Password Connect-backed Discord tokens keep resolving after doctor or plugin repair. Thanks @vincentkoc.

  • Control UI/WebChat: explain compaction boundaries in chat history and link directly to session checkpoint controls so pre-compaction turns no longer look silently lost after refresh. Fixes #76415. Thanks @BunsDev.

  • Agents/compaction: add an optional bundled compaction notifier hook and retry once from the compacted transcript when automatic compaction leaves a turn without a final visible reply. (#76651) Thanks @simplyclever914.

  • Agents/incomplete-turn: detect and surface a warning when the agent's final text after a tool-call chain is silently dropped because the post-tool assistant response was never produced, instead of completing the turn with only the pre-tool analysis text. Fixes #76477. Thanks @amknight.

  • Channels/WhatsApp: attach native outbound mention metadata for group text and media captions by resolving @+<digits> and @<digits> tokens against WhatsApp participant data, including LID groups. Fixes #39879; carries forward #56863. Thanks @kengi1437, @joe2643, and @fridayck.

  • Channels/WhatsApp: require outbound mention tokens to end at a word boundary so phone-number prefixes inside longer strings no longer trigger hidden native mentions.

  • Plugins/uninstall: remove empty managed git install parent directories after deleting cloned plugin repos and cover npm/git uninstall residue in Docker plugin lifecycle tests. Thanks @vincentkoc.

  • Plugins/install: resolve bare official external plugin IDs such as brave through the official catalog when no bundled source is available, so packaged installs fetch the intended scoped npm package instead of an unrelated unscoped package. Fixes #76373. Thanks @bek91 and @vincentkoc.

  • Plugins/install: require OpenClaw-owned install provenance before granting official npm plugin scanner trust, so direct npm package names no longer bypass launch-code scanning while catalog, onboarding, and doctor installs stay trusted. Thanks @fede-kamel and @vincentkoc.

  • Network proxy: preserve target TLS hostname validation for Node HTTPS requests routed through the managed HTTP proxy, so Discord-style CONNECT traffic no longer validates certificates against the local proxy host. Fixes #74809. (#76442) Thanks @jesse-merhi and @abnershang.

  • Gateway/sessions: keep sessions.list rows lightweight by bounding title/preview hydration to transcript head/tail reads and caching manifest model-id normalization plus setup fallback metadata against the active plugin snapshot. Thanks @vincentkoc and @rolandrscheel.

  • Gateway/performance: cache per-run verbose-level session reads, skip a redundant lsof scan in gateway --force when no listener was killed, and make the Gateway startup benchmark print usage for --help.

  • Gateway/sessions: keep agent runtime metadata on lightweight sessions.list rows and skip per-row transcript usage fallback, display model inference, and plugin projection, avoiding identity loss and event-loop stalls in large session stores. Thanks @Marvinthebored and @vincentkoc.

  • Gateway/models: keep read-only models.list fallbacks on persisted/current metadata, configured rows, registry-compatible fallbacks, and static auth checks while preserving full-catalog image attachment capability checks. Fixes #76382; refs #76360 and #75707. Thanks @trojy13, @RayWoo, @AnathemaOfficial, @Marvinthebored, and @vincentkoc.

  • CLI/plugins: reject missing plugin ids before config writes in plugins enable and plugins disable so a typo no longer persists a stale config entry. (#73554) Thanks @ai-hpc.

  • Agents/sessions: preserve delivered trailing assistant replies during session-file repair so Telegram/WebChat history is not rewritten to drop already-delivered responses. Fixes #76329. Thanks @obviyus.

  • Gateway/chat history: preserve oversized transcript turns as explicit omitted-message placeholders while avoiding large JSONL parse stalls. Thanks @Marvinthebored and @vincentkoc.

  • CLI/doctor: load the configured memory-slot plugin when resolving memory diagnostics so bundled memory-core no longer triggers a false “no active memory plugin” warning on standalone doctor / status runs. Fixes #76367. Thanks @neeravmakwana.

  • Gateway: preserve stack diagnostics when chat.send or agent attachment parsing/staging fails, improving image-send failure triage. Refs #63432. (#75135) Thanks @keen0206.

  • Agents/idle-timeout: add a cost-runaway breaker to the outer embedded-run retry loop that halts further attempts after 5 consecutive idle timeouts without completed model progress, so a wedged provider can no longer fan paid model calls out across the same run; completed text or tool-call progress resets the breaker, but partial tool-argument token dribbles do not. Fixes #76293. Thanks @ThePuma312.

  • Heartbeats/Codex: align structured heartbeat prompts with actual heartbeat_respond tool availability, stop sending legacy HEARTBEAT_OK when the tool exists, and keep tool-disabled commitment check-ins on the legacy ack path. Thanks @pashpashpash and @vincentkoc.

  • Agent runtimes: fail explicit plugin runtime selections honestly when the requested harness is unavailable instead of silently falling back to the embedded PI runtime. Thanks @pashpashpash.

  • Maintainer workflow: push prepared PR heads through GitHub's verified commit API by default and require an explicit override before git-protocol pushes can publish unsigned commits. Thanks @BunsDev.

  • Feishu: resolve setup/status probes through the selected/default account so multi-account configs with account-scoped app credentials show as configured and probeable. Fixes #72930. Thanks @brokemac79.

  • Gateway/responses: emit every client tool call from /v1/responses JSON and SSE responses when the agent invokes multiple client tools in a single turn, so multi-tool plans, graph orchestration calls, and similar batched flows no longer drop every call but the last. Fixes #52288. Thanks @CharZhou and @bonelli.

  • Gateway/agent: enforce session.sendPolicy=deny on gateway agent requests only when deliver: true, so non-delivery smoke checks and internal agent runs are no longer rejected with "send blocked by session policy" while outbound delivery remains gated. Fixes #73381. Thanks @wenxu007.

  • Slack/reactions: treat missing no_reaction remove responses as idempotent success and route own-reaction cleanup through the remove helper, so concurrent cleanup no longer surfaces Slack race errors. Fixes #50733. (#76304) Thanks @martingarramon and @Hollychou924.

  • Feishu: include media file_key and image_key values in inbound dedupe so reused message IDs still process distinct media attachments while true retries stay suppressed. Fixes #75057. Thanks @SymbolStar.

  • Control UI/Gateway: avoid full session-list reloads for locally applied message-phase session updates, carry known session keys through transcript-file update events, and defer media provider listing when explicit generation model config is present. Refs #76236, #76203, #76188, #76107, and #76166. Thanks @BunsDev.

  • Install/update: prune the obsolete plugin-runtime-deps state directory during packaged postinstall so upgrades from pre-2026.5.2 releases reclaim old bundled-plugin dependency caches without touching external plugin installs.

  • Auto-reply/queue: treat reset-triggered /new and /reset turns as interrupt runs across active-run queue handling, so steer/followup modes cannot delay a fresh session behind existing work. Fixes #74093. (#74144) Thanks @ruji9527 and @yelog.

  • Cron: persist repaired startup runtime state back to jobs-state.json so a valid future nextRunAtMs with missing updatedAtMs no longer triggers repeated external health-check repairs after Gateway restart. Fixes #76461. Thanks @vincentkoc.

  • Cron: preserve manual cron.run IDs in cron.runs history so manual run acknowledgements can be correlated with finished run records. Fixes #76276.

  • CLI/devices: request operator.admin for openclaw devices approve <requestId> only when the exact pending device request would mint or inherit admin-scoped operator access, while keeping lower-scope approvals on the pairing scope.

  • Memory/embedding: broaden the embedding reindex retry classifier to include transient socket-layer errors (fetch failed, ECONNRESET, socket hang up, UND_ERR_*, closed) so memory reindex survives provider network hiccups instead of aborting mid-run. Related #56815, #44166. (#76311) Thanks @buyitsydney.

  • Memory/sessions: keep rotated and deleted transcripts (.jsonl.reset.<iso> / .jsonl.deleted.<iso>) searchable by indexing archive content, mapping archive hits back to live transcript stems, emitting transcript update events on archive rotation, and bypassing incremental delta thresholds for one-shot archive mutations while keeping backups and compaction checkpoints opaque. Refs #56131. Thanks @buyitsydney.

  • Memory/search: keep sqlite-vec optional in packaged installs and point missing-extension recovery at the valid agents.defaults.memorySearch.store.vector.extensionPath setting. Thanks @willemsej and @vincentkoc.

  • Gateway: keep directly requested plugin tools invokable under restrictive tool profiles while preserving explicit deny lists and the HTTP safety deny list, preventing catalog/invoke mismatches that surface as "Tool not available". Thanks @BunsDev.

  • Gateway/update: allow beta binaries to refresh gateway services when the config was last written by the matching stable release version, avoiding false newer-config downgrade blocks during beta channel updates.

  • Channels: keep Matrix and Mattermost bundled in the core package instead of advertising external npm installs before those channels are cut over. Thanks @vincentkoc.

  • Bonjour: disable LAN mDNS advertising after a repeated stuck-announcing recovery instead of repeatedly restarting ciao and saturating the Gateway event loop.

  • Channels/setup: label installable channel picker hints as remote npm installs and hide remote install hints for bundled plugins that already ship with OpenClaw.

  • CLI/update: refuse package updates launched from the active gateway process tree before stopping the managed Gateway service, avoiding self-terminated in-lane updates that leave old Gateway code running. Fixes #75691. (#75819) Thanks @ai-hpc.

  • CLI/plugins: stop treating the non-plugin auth command root as a bundled plugin id, so restrictive plugins.allow configs no longer tell users to add stale auth plugin entries.

  • Doctor/plugins: update configured plugin installs whose stale manifests still declare channels without channelConfigs, so beta upgrades repair old Discord-style package payloads during doctor --fix.

  • Doctor/plugins: repair configured external plugin installs whose persisted install record points at a missing package directory, so upgrades reconcile phantom npm metadata before plugin runtime validation. Thanks @vincentkoc.

  • Active Memory: keep non-empty memory_search results from being fast-failed as empty when debug telemetry reports zero hits.

  • Active Memory: preserve the target agent context when building embedded recall plugin tools so memory_search and memory_get stay available for explicit recall sessions. Fixes #76343. Thanks @Countermarch.

  • Plugins/externalization: repair missing configured plugin installs from npm by default, reserve ClawHub downloads for explicit clawhubSpec metadata, and cover agent-runtime/env-selected plugin repair. Thanks @vincentkoc.

  • Plugins/install: allow official catalog-matched npm channel plugins such as Feishu to pass the trusted install scanner path while keeping spoofed package names blocked. Thanks @vincentkoc.

  • Tools/llm-task: keep JSON-only embedded model runs from tripping inherited tool allowlists when tools are intentionally disabled, while preserving runtime toolsAllow failures. Fixes #74019. Thanks @amknight.

  • Tools/profiles: make tools.profile: "full" grant all tools including optional plugin tools such as browser, so the full profile no longer silently drops plugin-provided tools that require an explicit allowlist entry. Fixes #76507. Thanks @amknight.

  • Feishu: keep timeout env parsing separate from the HTTP client wrapper so package security scans no longer report a false env-harvesting hit during install. Thanks @vincentkoc.

  • Upgrade/config: validate configured web-search providers and statically suppressed model/provider pairs against the active plugin set at config load, so stale plugin state fails loud before runtime fallback.

  • Status/update: resolve beta update-channel checks from the installed version when config still says stable, and let status --deep reuse live gateway channel credential state instead of warning on command-path-only token misses.

  • Doctor/plugins: preserve unmanaged third-party plugin node_modules during doctor --fix, while still pruning OpenClaw-managed runtime dependency caches.

  • Gateway/restart: add openclaw gateway restart --force and --wait <duration>, log active task run IDs before restart deferral timers, and report timeout restarts as explicit forced restarts.

  • Discord: persist slash-command deploy hashes across process restarts so unchanged command sets skip redeploy and avoid restart-loop 429s.

  • Providers/LM Studio: normalize binary off/on reasoning metadata from Gemma 4 and other local models to LM Studio's accepted OpenAI-compatible reasoning_effort values.

  • Plugins/externalization: keep official external install docs, update examples, and live Codex npm checks on default npm tags instead of @beta. Thanks @vincentkoc.

  • Plugins/externalization: keep ACPX, Google Chat, and LINE publishable plugin dist trees out of the core npm package file list.

  • Plugins/ClawHub: fall back to version metadata when the artifact resolver route is missing and keep the Docker ClawHub fixture aligned with npm-pack artifact resolution, avoiding false version-not-found failures during plugin install validation. Thanks @vincentkoc.

  • Providers/openai-codex: honor providerConfig.baseUrl in the dynamic-model synthesis fallback so codex providers configured with a custom upstream (for example a forwarding proxy) no longer silently bypass the configured URL when the registry has no template row to clone for the requested model id. (#76428) Thanks @arniesaha.

  • Status/channels: show configured channels in openclaw status and config-only openclaw channels status output even when the Gateway is unreachable, avoiding empty Channels tables on WSL and other no-Gateway paths. Thanks @vincentkoc.

  • Plugins/ClawHub: explain unavailable explicit ClawHub ClawPack artifact downloads with a temporary npm install hint while ClawHub artifact routing rolls out. Thanks @vincentkoc.

  • Media: accept home-relative MEDIA:~/... attachment paths while preserving existing file-read policy, traversal checks, and media type validation. Fixes #73796. Thanks @fabkury.

  • Onboarding/search: install official external web-search plugins such as Brave before saving provider config, and make doctor repair reconcile selected external search providers whose npm payload is missing. Thanks @vincentkoc.

  • Plugins/externalization: add official npm-first catalogs for externalized channel, provider, and generic plugins, keep unpublished ACPX/Google Chat/LINE bundled, and make missing-plugin repair honor npm-first metadata while ClawHub pack files roll out. Thanks @vincentkoc.

  • Plugins/update: detect tracked plugin install records whose package directories disappeared during openclaw update, reinstall them before normal plugin updates, and fail the update if any install record still points at missing disk payloads.

  • Plugins/registry: hash manifest and package metadata when validating persisted plugin registries so fast same-size rewrites cannot leave stale plugin metadata trusted.

  • Plugins/registry: canonicalize install-record provenance paths before trust diagnostics, so npm plugins installed under symlinked temp/state roots no longer warn as untracked local code.

  • Plugins/install: let official external Discord reinstall requests pass the invalid-config guard and run stale-channel repair, so upgrades can recover missing external plugin state directly.

  • CLI/infer: reject local codex/* one-shot model probes before simple-completion dispatch and point operators at the Codex app-server runtime path instead of ending with an empty-output error.

  • Agents/sessions: preserve terminal lifecycle state when final run metadata persists from a stale in-memory snapshot, preventing main sessions from staying stuck as running after completed or timed-out turns.

  • Gateway/CLI: make openclaw gateway start repair stale managed service definitions that point at old OpenClaw versions, missing binaries, or temporary installer paths before starting.

  • Heartbeat/scheduler: make heartbeat phase scheduling active-hours-aware so the scheduler seeks forward to the first in-window phase slot instead of arming timers for quiet-hours slots and relying solely on the runtime guard. Non-UTC activeHours.timezone values (e.g. Asia/Shanghai) now correctly influence when the next heartbeat timer fires, avoiding wasted quiet-hours ticks and long dormant gaps after gateway restarts. Fixes #75487. Thanks @amknight.

  • Providers/Arcee AI: mark Trinity Large Thinking as tool-incompatible so main-session runs use the same text-only request shape that made subagent runs recover, avoiding the remaining main-session response-shape mismatch after the #62848 transport failover fix. Fixes #62851 and #62847; carries forward #62848. Thanks @Adam-Researchh.

  • Status: show the openai-codex OAuth profile for openai/gpt-* sessions running through the native Codex runtime instead of reporting auth as unknown. (#76197) Thanks @mbelinky.

  • Gateway: avoid repeated plugin tool descriptor config hashing so large runtime configs do not block reply startup and trigger reconnect/timeouts. (#75944) Thanks @joshavant.

  • Plugins/externalization: keep diagnostics ClawHub packages and persisted bundled-plugin relocation on npm-first install metadata for launch, and omit Discord from the core package now that its external package is published. Thanks @vincentkoc.

  • Setup/TUI: bound the Terminal hatch bootstrap run so a stalled provider request times out instead of leaving first-run hatching stuck behind the watchdog. (#76241) Thanks @joshavant.

  • Cron/CLI runtimes: route isolated cron jobs through configured per-agent CLI runtimes only when the resolved model provider is compatible, so OpenAI job overrides no longer inherit a mismatched Claude CLI backend. Thanks @vishutdhar.

  • Plugins/Codex: allow the official npm Codex plugin to install without the unsafe-install override, keep /codex command ownership, and cover the real npm Docker live path through managed .openclaw/npm dependencies plus uninstall failure proof.

  • Gateway/status: add concrete service, config, listener-owner, and log collection next steps when gateway probes fail and Bonjour finds no local gateway, so frozen or port-conflict reports include the data needed for root-cause triage. Refs #49012. Thanks @vincentkoc.

  • Codex harness: forward OpenClaw workspace bootstrap files such as SOUL.md through native Codex config instructions while leaving AGENTS.md to Codex project-doc discovery. Fixes #76273. Thanks @zknicker.

  • Parallels/Windows update smoke: escape the stale post-swap import regex in the generated PowerShell script so expected ERR_MODULE_NOT_FOUND update handoffs continue to post-update health checks. (#75315)

  • Slack: allow draft preview streaming in top-level DMs when replyToMode is off while keeping Slack native streaming and assistant thread status gated on reply threads. Fixes #56480. (#56544) Thanks @HangGlidersRule.

  • Control UI/chat: remove the delete-confirm popover outside-click listener on every dismiss path, so Cancel, Delete, outside clicks, and same-button toggles no longer leave stale document listeners behind. Refs #75590 and #69982. Thanks @Ricardo-M-L.

  • Memory-core: treat exhausted file watcher limits as non-fatal for builtin memory auto-sync while preserving fatal handling for unrelated disk-full errors. (#73357) Thanks @solodmd.

  • Providers/Ollama: restore catalog context-window forwarding as num_ctx for native /api/chat requests; fixes tool selection and context truncation regressions on models with catalog entries (qwen3, llama3, gemma3, …) when no explicit params.num_ctx was configured. Fixes #76117. (#76181) Thanks @openperf.

  • Plugins/install: pin npm plugin installs to the verified resolved version and reject package-lock version or integrity drift, so mutable tags cannot race integrity checks into accepting a different artifact. Thanks @Lucenx9.

  • Plugins/providers: preserve scoped cold-load fallback for enabled external manifest-contract capability providers missing from the startup registry, so providers such as Fish Audio can resolve on request without requiring activation.onStartup for correctness. (#76536) Thanks @Conan-Scott.

  • Gateway/update: carry continuationMessage from update.run into successful restart sentinels so session-scoped self-updates can resume one follow-up turn after the Gateway restarts. Refs #71178. (#74362) Thanks @100menotu001, @HeilbronAILabs, and @artnking.

  • Agents/fallback: suppress duplicate current-turn user-message transcript writes after embedded fallback retries while still sending the retry prompt to the model. (#63696) Thanks @dashhuang.

  • Channels/Telegram: force a fresh final message when a visible non-preview bubble (tool/block/error) was delivered after the active answer preview, so multi-step assistant replies no longer end up with the final answer above intermediate output. Fixes #76529. Thanks @jack-stormentswe.

  • Channels/Telegram: require an observed Telegram send, edit, or fallback before treating a forum-topic final as delivered, so final replies generated in transcript no longer disappear from Telegram topics. Fixes #76554. (#76764) Thanks @bubucilo and @obviyus.

  • Plugins/update: keep externalized bundled npm bridge updates on the normal plugin security scanner path instead of granting source-linked official trust without artifact provenance. (#76765) Thanks @Lucenx9.

  • Agents/reply context: label replied-to messages as the current user message target in model-visible metadata, so short replies are grounded to their explicit reply target instead of nearby chat history. (#76817) Thanks @obviyus.

  • Doctor/plugins: install configured missing official plugins such as Discord and Brave during doctor/update repair, auto-enable repaired provider plugins, preserve config when a download fails, and stop auto-enable from inventing plugin entries when no manifest declares a configured channel. Fixes #76872. Thanks @jack-stormentswe.
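
As an illustration of the Agents/idle-timeout entry near the top of this list, here is a minimal Python sketch of a breaker that halts a retry loop after consecutive idle timeouts. It is not OpenClaw's implementation: the class, method, and outcome names are hypothetical, and only the threshold of 5 and the reset rule (completed progress resets, partial tool-argument tokens do not) come from the entry itself.

```python
from dataclasses import dataclass


@dataclass
class IdleTimeoutBreaker:
    """Hypothetical breaker that trips after `limit` consecutive idle timeouts."""

    limit: int = 5  # threshold taken from the changelog entry
    consecutive_idle_timeouts: int = 0

    def record_idle_timeout(self) -> None:
        # An attempt ended in an idle timeout without completed progress.
        self.consecutive_idle_timeouts += 1

    def record_progress(self, completed: bool) -> None:
        # Only completed text or tool-call progress resets the count;
        # partial tool-argument tokens (completed=False) leave it unchanged.
        if completed:
            self.consecutive_idle_timeouts = 0

    @property
    def tripped(self) -> bool:
        return self.consecutive_idle_timeouts >= self.limit


def run_with_breaker(outcomes, breaker=None):
    """Replay attempt outcomes and return how many attempts actually ran.

    Each outcome is "idle_timeout", "partial", or "completed"
    (labels invented for this sketch).
    """
    if breaker is None:
        breaker = IdleTimeoutBreaker()
    attempts = 0
    for outcome in outcomes:
        if breaker.tripped:
            break  # stop issuing further paid model calls
        attempts += 1
        if outcome == "idle_timeout":
            breaker.record_idle_timeout()
        else:
            breaker.record_progress(completed=(outcome == "completed"))
    return attempts
```

Replaying eight straight idle timeouts through `run_with_breaker` stops after the fifth attempt, while a completed step in the middle resets the count and lets the run continue.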

alvinashcraft
48 minutes ago
Pennsylvania, USA

Brady Gaster on Squad and a Multi Agent AI

Episode 901

Brady Gaster and his friends have developed Squad, an AI tool built on top of GitHub Copilot that spins up multiple agents, each with a different role in the software development process.

Links:
https://github.com/bradygaster/squad
https://bradygaster.github.io/squad/


Episode 900: A Celebration of Friends!

Episode 900

Today, I am celebrating the Nonacentennial of "Technology and Friends" with a montage of the last 99 episodes!

Featuring:
Adi Polak
Akash Dubey
AL Rodriguez
Alex Riviere
Alvin Ashcraft
Ankita Guha Biegas
Arunava Majumdar
Barry Stahl
Ben Kotvis
Bill Sempf
Blaize Stewart
Brian Hitney
Brian McKeiver
Burton Smith
Chris Ayers
Chris Woodruff
Christina Aldan
Damian Synadinos
Danny Kim
Dean Schuster
Dee Peterson
Edward Thomson
Esteban Garcia
Fidel Guzman
Glenn F Henriksen
Greg Crist
Guy Royse
J Tower
Jason Bock
Javier Salmeron
Jayson Street
Jeff Fritz
Jeffrey Snover
Jennifer Marsman
Jennifer Reif
Jennifer Wadella
Jeremy Miller
Jerry Nixon
Jimmy Bogard
Joe Guadagno
Joe Sharmer
Justine Cocchi
Karl-Henrik Nilsson
Kathryn Grayson
Ken Versaw
Kevin Griffin
Laurent Bugnion
Andrew Looney
Kristin Looney
Maarten Balliauw
Mads Torgersen
Magnus Martensson
Mark Tinderholt
Michael Eaton
Michael Feathers
Michelle Frost
Michelle Sandford
Mike Shelton
Peter Van Vilet
Prasanna Pendse
Rachel Appel
Raj Krishnan
Randy Pagels
Rhia Dixon
Richard Campbell
Roan Weigert
Robert Bogue
Rob Conery
Rocky Lhotka
Rod Christensen
Sam Gomez
Sam Nasr
Sarah Dutkiewicz
Sarang Brahme
Scott Hanselman
Scott Hermes
Scott Hunter
Stacey Mulcahy
Steve Smith
Sudeep Goswami
Ted Neward
Tim Moore
Tommy Falgout
Trinh Tran
Valerie Gurka
Venkat Subramaniam


How to migrate from controllers to Minimal APIs

Learn why Minimal APIs are now recommended in .NET and follow a step-by-step guide to migrate from controllers to Minimal APIs with versioning and Swagger.

The page How to migrate from controllers to Minimal APIs appeared on Round The Code.
