Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Qwen3.6-27B Brings Open-Weight Vision and Coding Power

Qwen3.6-27B is an open-weight multimodal model built for coding, reasoning, visual understanding, and long-context AI workflows.
Read the whole story
alvinashcraft
47 minutes ago
Pennsylvania, USA

Susurrus: Crafting a Cozy Watercolor World with Three.js and Shaders

A behind-the-scenes look at blending NPR shading, sound, and interaction to shape a meditative WebGL scene.



Videos from the article:

  • https://codrops-1f606.kxcdn.com/codrops/wp-content/uploads/2026/04/susurrus-short-video.mp4?x30804
  • https://codrops-1f606.kxcdn.com/codrops/wp-content/uploads/2026/04/0414.mp4?x30804
  • https://codrops-1f606.kxcdn.com/codrops/wp-content/uploads/2026/04/spawn.mp4?x30804
  • https://codrops-1f606.kxcdn.com/codrops/wp-content/uploads/2026/04/dissssolve.mp4?x30804

Why Claude needs a real environment to validate cloud-native code

Artistic blue illustration of a forest landscape reflecting in a lake at night, symbolizing the validation and mirroring of cloud-native environments for AI coding agents.

Boris Cherny, who built Claude Code, recently shared on X how to get the most out of it following the release of Opus 4.7. He left the most important tip for last:

“Make sure Claude has a way to verify its work. This has always been a way to 2-3x what you get out of Claude, and with 4.7 it’s more important than ever.”

That observation describes a pattern establishing itself as the standard model for developing software with coding agents. It is also a pattern that is easy to implement locally, against a single codebase with limited dependencies.

It is much more difficult against a cloud-native application with a complex topology. Closing that gap is the difference between coding agents that accelerate teams and those that bury them in review queues and manual validation.

The pattern emerging across coding agents

Boris’s tip mirrors a pattern emerging across the industry. Every major coding agent has shipped infrastructure in the last six months whose explicit purpose is to let the agent check its own work before handing it off.

OpenAI’s Codex iterates in a loop within an isolated cloud container, editing code, running checks, and validating its changes against commands specified in the team’s AGENTS.md file. The validation loop is the product, not a feature on the side.
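To make the AGENTS.md mechanism concrete, here is a minimal sketch of what such a file might contain. The commands are hypothetical stand-ins for whatever a given team actually runs:

```markdown
# AGENTS.md

## Validation

Before considering a task complete, run:

- `make lint`: style and static analysis
- `make test`: unit tests
- `make integration-test`: service-level checks against local dependencies
```

The agent reads this file at the start of a task and runs the listed commands as its exit criteria, so the team's own checks become the agent's definition of done.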


GitHub’s Copilot coding agent runs in an ephemeral GitHub Actions environment that automatically executes the repo’s tests, linters, CodeQL, and secret scanning on every task. If anything fails, Copilot attempts to fix it before marking the task ready for review. Cursor’s cloud agents run in sandboxed VMs with shell and browser access so the agent can exercise its changes end-to-end and produce screenshots, videos, and logs as evidence of what it tested.

Claude Code exposes the same shape as composable primitives. Stop hooks block the agent from finishing a task until tests pass. Subagents can run dedicated validation passes that inspect work without modifying it. The verification loop is something the team assembles, but the building blocks are explicit and well-supported.
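As a sketch of the Stop-hook primitive, based on the hooks format in Claude Code's documentation: the `npm test` command below is a placeholder for the team's real suite, and exit code 2 is what (per the docs) blocks the stop and feeds the stderr message back to the agent so it keeps working.

```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "npm test >/dev/null 2>&1 || { echo 'Tests are failing; fix them before stopping.' >&2; exit 2; }"
          }
        ]
      }
    ]
  }
}
```

This goes in the project's `.claude/settings.json`, so the gate travels with the repo rather than with any one developer's setup.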

The convergence is not a coincidence. Every team building a coding agent has identified the same problem: a model that writes plausible code without checking it pushes the entire correctness problem back onto the developer. The productivity gain disappears into review overhead.

A coding agent that can verify its own work operates differently. It iterates on the task, catches its own mistakes, and hands over something the developer can reasonably trust. That’s where useful agent work lives, and it’s the bet every major agent vendor is now making.

Cloud-native systems make the loop harder

All of this assumes the agent can run the change against a realistic environment that mirrors production. In modern cloud-native architectures, that assumption breaks quickly.

The code an agent is changing rarely fails in isolation. It fails at the seams. Services call other services. Async events fire through message buses. A schema change in one service cascades through its consumers. A new middleware header breaks callers three hops away.


The agent writing the change has no way to catch any of that with a mocked integration test. The mock returns whatever the agent told it to return.
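To make that concrete, here is a minimal Python sketch. The service, field names, and client function are invented for illustration:

```python
from unittest import mock

# Hypothetical client code the agent is changing: it assumes the
# orders service returns a "total" field.
def summarize_order(fetch_order):
    order = fetch_order("order-123")
    return f"Order total: {order['total']}"

# A "mocked integration test": the mock returns exactly what we stub,
# so this passes even if the real orders service renamed the field to
# "total_cents" last week. The service boundary is never exercised.
fake_fetch = mock.Mock(return_value={"total": 42})
assert summarize_order(fake_fetch) == "Order total: 42"
```

The test proves only that the code agrees with the mock, not that it agrees with the service.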

Real validation in a distributed system means running the change in a realistic environment and observing what happens as actual requests flow through it. Full end-to-end. Real dependencies. Real traffic patterns.

Anything less pushes the problem back onto the developer. More review rounds. More iteration cycles. Broken staging environments that slow other developers and agents down. The occasional bug that makes it into production.

Diagram of cloud-native distributed system compared with a monolithic system.

Realistic feedback, without duplicating the stack

What cloud-native teams need is a feedback loop that lets their agents see how a change actually behaves. Not against mocks. Not against a simplified approximation of production. Against real services, real data paths, and real traffic patterns, close enough to production that the integration failures the agent is most likely to cause are the integration failures it can most easily catch.

That loop has to satisfy three constraints at once.

It has to be realistic. The agent is trying to verify a change that crosses service boundaries, so it needs an environment where those boundaries exist and behave the way they will in production. Anything less and the agent ends up validating a version of the system that will not match what its code actually runs against when it ships.

It has to be isolated. Multiple agents and multiple developers will be exercising changes concurrently, often in overlapping parts of the system. If one agent’s test run breaks the environment for everyone else, the loop has closed for that agent but opened a bigger one for the rest of the team. An agent’s validation work cannot become a coordination problem for the humans around it.

And it has to scale with the way agents actually work. A team running coding agents is not running one task at a time. It is running many tasks in parallel, each on a different branch, each needing its own realistic place to validate. A model that requires duplicating the entire application stack for every agent collapses the moment the team gets serious about throughput, both on cost and on the wall-clock time it takes to stand each stack up.

The environment has to feel like production to the agent, stay out of the way of everyone else using it, and be cost-efficient enough that a team can run as many of them as they have agents in flight.
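One way to picture that scaling model is request routing: each agent's environment deploys only the services it changed, and a routing key carried on the request decides, hop by hop, whether to hit the sandboxed version or the shared baseline. The sketch below is illustrative only; the header name, service names, and function are not any specific product's API:

```python
# Shared baseline deployments that every environment falls back to.
BASELINE = {"checkout": "checkout-baseline", "orders": "orders-baseline"}

def resolve_backend(service, headers, sandboxes):
    """Route to a sandboxed deployment only when the request carries that
    sandbox's routing key AND the sandbox actually overrides this service.
    Everything else flows to the shared baseline, so N agents share one
    cluster instead of duplicating N full stacks."""
    key = headers.get("x-routing-key")
    if key and service in sandboxes.get(key, {}):
        return sandboxes[key][service]
    return BASELINE[service]

# Agent 42's environment overrides only the service it changed.
sandboxes = {"agent-42": {"checkout": "checkout-agent-42"}}

# With the routing key, checkout goes to the sandbox...
assert resolve_backend("checkout", {"x-routing-key": "agent-42"}, sandboxes) == "checkout-agent-42"
# ...but an unchanged service still uses the shared baseline.
assert resolve_backend("orders", {"x-routing-key": "agent-42"}, sandboxes) == "orders-baseline"
```

The cost profile follows directly: each additional in-flight agent adds only its changed services, not another copy of the whole topology.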

Diagram of each agent getting its own isolated environment, but sharing a cluster.

Agents need to know how to use the environment

Giving the agent a realistic environment is necessary, but on its own it does not close the loop. An agent with access to a production-like system, and no guidance about how to use it, behaves like a new engineer dropped into an unfamiliar codebase on day one. The access is there. The judgment about how to use it is not.

That judgment has two parts. One is the team’s operational knowledge: which upstream callers to exercise when a code path changes, which downstream dependencies actually affect the outcome, how to tell whether a failing request was caused by the change under review or by a flake three services away. The other is fluency with the tooling inside the environment itself: how to route traffic for testing, how to inspect state across services, which commands are available, how to read the logs the environment exposes. Generic testing know-how covers neither.

Agent skills are the vehicle for both. A skill captures how changes in a particular system should be validated and debugged, and how to drive the specific tools the environment provides to do that work. It is the team's institutional knowledge, plus the operating manual for the environment, handed to the agent alongside the access itself.
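A sketch of what such a skill file might look like, using the markdown-with-frontmatter shape of Claude Code skills. Every service name, command, and failure mode below is hypothetical:

```markdown
---
name: validate-checkout-change
description: How to validate and debug changes to the checkout service in an isolated environment
---

When a change touches the checkout service:

1. Deploy the branch to a fresh isolated environment
   (`envctl create --branch <branch>`, an illustrative command).
2. Exercise the main callers: send a cart-to-payment request with the
   environment's routing header set, not just a direct call to checkout.
3. Check the orders consumer, which reads checkout's events off the bus;
   a schema change usually surfaces there first.
4. A 5xx from the payments dependency is a known flake; retry once before
   attributing it to the change under review.
```

Steps 2 and 3 are the operational knowledge; step 1 and the routing header are the environment's operating manual. The file carries both.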

What this enables is the thing that matters: an agent that can validate its own work the way a senior engineer on the team would. Not just running tests, but exercising the right paths with the right tools, interpreting the right signals, and recognizing when something is wrong in a way that reflects how the system actually behaves.

Skills and environments have to ship together. An environment without a skill gives the agent access with no judgment. A skill without an environment gives the agent judgment with nothing to act on. Either one alone leaves the loop open.

Diagram emphasizing that skill and environments are required for the validation workflow to work.

The inner loop is where the next leap lives

The next unlock for cloud-native teams using coding agents is not a smarter model. It is a real environment the agent can work against in the inner loop, and the context to use it well.

What makes this more interesting over time is that the boundary between inner loop and outer loop starts to dissolve once agents are involved. When a test fails in CI, the natural next step for an agent is not to surface the failure and wait for a human. It is to drop the same change back into an inner-loop environment, reproduce the failure with real dependencies, debug it, and push a fix. The outer loop’s signal becomes the inner loop’s starting condition.

It runs the other way too. An ad-hoc validation an agent runs once in the inner loop often deserves to outlive the task. Encoded into the outer loop, it becomes part of the team’s standing regression suite. The inner loop’s one-off experiment becomes the outer loop’s durable guard.

Both directions depend on the same foundation: a real environment the agent can reach from either loop, and the context to use it well. This is the approach we’re building toward at Signadot, so the validation cycle stays continuous wherever the signal arrives.

This feedback loop is what turns agents from fast code generators into collaborators developers can trust. The teams that close it, across both loops, will be the ones getting the real benefits of coding agents while the rest are buried under their review queues.

The post Why Claude needs a real environment to validate cloud-native code appeared first on The New Stack.


Blazorise 2.1.1 - Stability Update and Package Maintenance

Blazorise 2.1.1 resolves a critical Bootstrap upgrade issue, updates dependencies, and introduces synchronized charts along with several fixes.

Pragmatic AI in .NET Show: AI-Powered Enterprise Software with Colin Whitlatch

From: UnoPlatform


Becoming a Better Python Developer Through Learning Rust


How can learning Rust help make you a better Python developer? How do techniques required by a compiled language translate to improving your Python code? Christopher Trudeau is back on the show this week with another batch of PyCoder’s Weekly articles and projects.

We discuss a recent article by Bob Belderbos titled “Learning Rust Made Me a Better Python Developer.” Bob has been on a journey learning to program in Rust, which has made him rethink how he’s been writing Python. The compiler forced him to confront things he’d been ignoring.

We also share other articles and projects from the Python community, including recent releases, a boatload of PEPs, NumPy as a synth engine, firing and forgetting with Python’s asyncio, managing state with signals in Python, a documentation site generator for Python packages, and a tool to explain your Python environment.

This episode is sponsored by AgentField.

Video Course Spotlight: Using Loguru to Simplify Python Logging

Learn how to use Loguru for simpler Python logging, from zero-config setup and custom formats to file rotation, retention, and adding context.

Topics:

  • 00:00:00 – Introduction
  • 00:02:23 – Python 3.15.0a8, 3.14.4 and 3.13.13 Released
  • 00:03:01 – Django Security Releases: 6.0.4, 5.2.13, and 4.2.30
  • 00:03:38 – DjangoCon Europe 2027 Call for Organizers
  • 00:04:04 – PEP 803: "abi3t": Stable ABI for Free-Threaded Builds
  • 00:04:44 – PEP 829: Structured Startup Configuration via .site.toml File
  • 00:05:18 – PEP 830 – Add timestamps to exceptions and tracebacks
  • 00:05:44 – PEP 831 – Frame Pointers Everywhere: Enabling System-Level Observability for Python
  • 00:06:59 – PEP 832 – Virtual environment discovery
  • 00:10:13 – PyCoder’s Weekly - Submit a Link
  • 00:11:15 – NumPy as Synth Engine
  • 00:21:04 – Sponsor: AgentField
  • 00:22:05 – Fire and Forget at Textual
  • 00:25:39 – Learning Rust Made Me a Better Python Developer
  • 00:34:06 – Video Course Spotlight
  • 00:35:49 – Signals: State Management for Python Developers
  • 00:40:34 – great-docs: Documentation Site Generator for Python Package
  • 00:42:32 – pywho: Explain Your Python Environment and Detect Shadows
  • 00:44:01 – Thanks and goodbye

Show Links:

  • NumPy as Synth Engine – Kenneth has “recorded” a song in a Python script. The catch? No sampling, no recording, no pre-recorded sound. Everything was done through generating wave functions in NumPy. Learn how to become a mathematical musician.
  • Fire and Forget at Textual – In this follow-up to a previous article (Fire and forget (or never) with Python’s asyncio), Michael discusses a similar article by Will McGugan as it relates to Textual. He found the problematic pattern in over 500K GitHub files.
  • Learning Rust Made Me a Better Python Developer – Bob thinks that learning Rust made him a better Python developer. Not because Rust is better, but because it made him think differently about how he has been writing Python. The compiler forced him to confront things he’d been ignoring.
  • Signals: State Management for Python Developers – If you’ve ever debugged why your cache didn’t invalidate or notifications stopped firing after a “simple” state change, this guide is for you. Signals are becoming a JavaScript standard, but Python developers can use the same patterns to eliminate “forgot to update that thing” bugs.

Download audio: https://dts.podtrac.com/redirect.mp3/files.realpython.com/podcasts/RPP_E292_03_PyCoders.b59c2ad7bc0f.mp3