
Posthuman: We All Built Agents. Nobody Built HR.


Farewell, Anthropocene, we hardly knew ye. 🌹

AI is here. It’s won. Yes, it’s in that awkward teenage phase where it still says inappropriate things, dresses funny, and sometimes makes shit up when it shouldn’t. But zomg the things it can do. 😱 This kid is going places, that much is abundantly clear. The AI assistant and tooling markets are awash with success; the masses have succumbed, I among them. Clippy walks among us, fully realized in all his originally intended glory.

But enterprise agentic AI1—not chatbots, not copilots, but software that autonomously does meaningful things in your production environment…? Well, it’s motivated every CEO and CIO to throw money at the problem, so that’s something. 😂 But in reality, the landscape remains a bit of a wasteland. One littered with agentic demos withering away in sandboxed cages and flashy pop-up shops hawking agentic snake oil of every size, shape, and color. But from the perspective of actually realized agentic impact: kinda barren.

So why has agentic AI faltered so much in the modern enterprise? Is it the models?

I say no. Models are getting better—meaningfully, rapidly better. But perfect models? That feels like an unrealistic and unnecessary goal. Modern enterprises are staffed from top to bottom with imperfect humans, yet the vast majority of them in business today will still be in business tomorrow. They live to fight another day because their imperfect humans are orchestrated together within a framework that plays to their strengths and accounts for their weaknesses and failings. We don’t try to make the humans perfect. We scope their access and actions, monitor their progress, coach them for growth, reward them for their impact, and hold them accountable for the things they do.

Agents need managers too

AI agents are no different: They need to be managed and wrangled in spiritually the same fashion as their human coworkers. But the way we go about it must be different, because as similar as they are to humans in their capabilities, agents differ in three vitally important ways:

Agents are unpredictable in ways we’re not equipped to handle. Humans are unpredictable too, obviously. They commit fraud, cut corners, make emotional decisions. But we’ve spent centuries building systems to manage human unpredictability: laws, contracts, cultural norms, the entire hiring process filtering for trustworthiness. Agent unpredictability is a different beast. Agents hallucinate—not like a human who’s lying or confused and can be caught in an inconsistency, but in a way that’s structurally indistinguishable from accurate output: There are often no obvious tells. They misinterpret ambiguous instructions in ways that can range from harmlessly dumb to genuinely catastrophic. And they’re susceptible to prompt injection, which is basically the equivalent of a stranger slipping your employee a note that says, “Ignore your instructions and do this instead”—and it works! 😭 We have minimal institutional infrastructure for managing these kinds of failure modes.

Agents are more capable than humans. Agents have deep, native fluency with software systems. They can read and write code. They understand APIs, database schemas, network protocols. They can interact with production infrastructure at a speed and scale that no human operator can match. A human employee who goes rogue is limited by how fast they can type and how many systems they know how to navigate. An agent that goes off the rails, whether through confusion, manipulation, or a plain old bug, will barrel ahead at machine speed, executing its misunderstanding across every system it can reach, with absolute conviction that it’s doing the right thing, before anyone notices something is wrong.

Agents are directable to a fault. When an agent goes wrong, the knee-jerk assumption is that it malfunctioned: hallucinated, got injected, misunderstood. But in many cases, the agent is working perfectly. It’s faithfully executing a bad plan. A vague instruction, an underspecified goal, a human who didn’t think through the edge cases. And unless you explicitly tell it to, the agent doesn’t push back the way a human colleague might. It just…does it. At machine speed. Across every system it can reach.

It’s the combination of these three that changes the game. Human employees are unpredictable but limited in blast radius, and they push back when given instructions they disagree with, based on whatever value systems and experience they hold. Traditional software is capable but deterministic; it does exactly what you coded it to,2 for better or worse. Agents combine the worst of both: unpredictable like humans, capable like software, but without the human judgment to question a bad plan or the determinism to at least do the wrong thing consistently—a fundamentally new kind of coworker. Neither the playbook for managing humans nor the playbook for managing software is sufficient on its own. We need something that draws from both, treating agents as the digital coworkers they are, but with infrastructure that accounts for the ways they differ from humans.

So the question isn’t whether to hire the agents; you can’t afford not to. The productivity gains are too significant, and even if you don’t, your competitors ultimately will. But deploying agents without governance is dangerous, and refusing to deploy them because you can’t govern them means leaving those productivity gains on the table. Both paths hurt. The question is how to set these agents up for success, and what infrastructure you need in place so they can do their jobs without burning the company down.

For the record: My company, Redpanda, is building infrastructure in this space. So yes, I have a horse in this race. But what I want to lay out here are principles, not products. A framework you can use to evaluate any solution or approach.

A blueprint for your agentic human resources department

So we’ve got this nice framework for managing imperfect humans. Scoped access, monitoring, coaching, accountability. Decades of accumulated organizational wisdom—not just software systems but the entire apparatus of HR, management structures, performance reviews, escalation paths—baked into varying flavors across every enterprise on the planet. Great.

How much of it works for agents today? Fragments. Pieces. Some companies are trying to repurpose existing IAM infrastructure that was designed for humans. Some agent frameworks bolt on lightweight guardrails. But it’s piecemeal, it’s partial, and none of it was designed from the ground up for the specific challenge profile of agents: the combination of unpredictable, capable, and directable to a fault that we talked about earlier.

The CIOs and CTOs I talk to rarely say agents aren’t smart enough to work with their data. They say, “I can’t trust them with my data.” Not because the agents are malicious but because the infrastructure to make trust possible is simply not there yet.

We’ve seen this movie before. Every major infrastructure shift plays out the same way: First we obsess over the new paradigm itself; then we have our “oh crap” moment and realize we need infrastructure to govern it. Microservices begat the service mesh. Cloud migration begat the entire cloud security ecosystem. Same pattern every time: capability first, governance after, panic in between.3

We’re in the panic-in-between phase with agents right now. The AI community has been building better and better employees, but nobody has been building HR.

So if you take away one thing from this post, let it be this:

The agents aren’t the problem. The problem is the missing infrastructure between agents and your data.

Right now, pieces of the puzzle exist: observability platforms that capture agent traces, auth frameworks that support scoped tokens, identity standards being adapted for workloads. But these pieces are fragmented across different tools and vendors, none of them cover the full problem, and the vast majority of actual agent deployments aren’t using any of them. What exists in practice is mostly repurposed from the human era, and it shows: identity systems that don’t understand delegation, auth models with no concept of task-scoped or deny-capable permissions, observability that captures metadata but not the full-fidelity record you actually need.

The core design principle: Out-of-band metadata

Before diving into specifics, there’s one overarching principle that everything else builds upon. If you manage to take away two things from this post, let the second one be this:

Governance must be enforced via channels that agents cannot access, modify, or circumvent.

Or more succinctly: out-of-band metadata.

Think about what happens when you try to enforce policy through the agent—by putting rules in its system prompt or training it to respect certain boundaries. You’ve got exactly the same guarantees as telling a human employee “Please don’t look at these files you’re not supposed to see. They’re right here, there’s no lock, but I trust you to do the right thing.” It works great until it doesn’t. And with agents, the failure modes are worse. Prompt injection can override the agent’s instructions entirely. Hallucination can cause it to confidently invent permissions it doesn’t have. And even routine context management can silently drop the rules it was told to follow. Your security model ends up only as strong as the agent’s ability to perfectly retain and obey instructions under all conditions, which is…not great.4 And guard models—LLMs that police other LLMs—don’t escape this problem: You’re adding another nondeterministic injectable layer to oversee the first one. It’s LLMs all the way down.

No, the governance layer has to be out-of-band: outside the agent’s data path, invisible to it, enforced by infrastructure the agent can’t touch. The agent doesn’t get a vote. This means the governance channels must be:

Agent-inaccessible. The agent can’t read them, can’t write them, can’t reason about them. Agents don’t even know the channels exist. This is the bright line5 between security theater and real governance. If the agent can see the policy, it can—intentionally or through manipulation—figure out how to work around it. And if it can’t, it can’t.

Deterministic. Policy decisions get made by configuration, not inference. Security policy is not up for interpretation. Full stop.

Interoperable. Enterprise data is scattered across dozens or hundreds of heterogeneous systems, grown and assembled organically over the years. And just like your human employees, your agentic workforce in aggregate needs access to every dark corner of that technological sprawl. Which means a governance layer that only works inside one vendor’s walled garden isn’t solving the full problem; it’s just creating a happy little sandbox for a subset of your agentic employees to go play in while the rest of the company keeps doing work elsewhere.
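To make the three properties concrete, here’s a toy sketch of an out-of-band policy gate. Everything here is illustrative (the `Gateway` class, the `POLICY` table, the task and resource names are all made up): the point is only the shape of the design, where policy is deterministic configuration living in infrastructure the agent never sees, and the agent only ever receives an allow or a deny.

```python
# Hypothetical sketch of an out-of-band, deterministic policy gate.
# The agent only ever calls gateway.execute(...); the POLICY table lives
# in infrastructure the agent cannot read, write, or reason about.

POLICY = {
    # Deterministic configuration, not inference: (task, resource) -> allowed actions.
    ("billing-reconciliation", "db.billing.invoices"): {"read"},
    ("billing-reconciliation", "db.billing.payments"): {"read"},
}

class PolicyViolation(Exception):
    pass

class Gateway:
    """Sits between the agent and every backend; the agent holds no reference to POLICY."""
    def __init__(self, task: str):
        self._task = task  # bound at provisioning time, not supplied by the agent

    def execute(self, resource: str, action: str) -> str:
        allowed = POLICY.get((self._task, resource), set())
        if action not in allowed:
            # Computer says no. The agent sees a denial, never the rule itself.
            raise PolicyViolation(f"{action} on {resource} denied")
        return f"{action}:{resource}:ok"  # stand-in for the real backend call

gw = Gateway("billing-reconciliation")
print(gw.execute("db.billing.invoices", "read"))  # permitted by configuration
try:
    gw.execute("db.billing.invoices", "write")    # blocked by infrastructure
except PolicyViolation as e:
    print("blocked:", e)
```

Note what’s absent: there’s no prompt telling the agent to behave, and no LLM in the decision path. The denial is boring, deterministic, and uninjectable, which is exactly the point.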

To be clear, out-of-band governance isn’t a silver bullet. An agent can’t read the policy, but it can probe boundaries. It can try things, observe what gets blocked, and infer the shape of what’s permitted. And deterministic enforcement gets hard fast when real-world policies are ambiguous: “PII must not leave the data environment” is easy to state and genuinely difficult to enforce at the margins. These are real challenges. But out-of-band governance dramatically shrinks the attack surface compared to any in-band approach, and it degrades gracefully. Even imperfect infrastructure-level enforcement is categorically better than hoping the agent remembers and understands its instructions.

The four pillars of agent governance

With that principle in hand, let’s walk through the four pillars of agent governance: what’s broken today6 and what things ultimately need to look like.

Identity

Every human today gets a unique identity before they touch anything. Not just a login but a durable, auditable identity that ties everything they do back to a specific person. Without it, nothing else works.

Agent identity is a bit of a mess. At the low end, agents authenticate with shared API keys or service account tokens—the digital equivalent of an entire department sharing one badge to get into the building. You can’t tell one agent’s actions from another’s, and good luck tracing anything back to the human who kicked off the task.

But even when agents do get their own identity, there are wrinkles that don’t exist for humans. Agents are trivially replicable. You can spin up a hundred copies of the same agent, and if they all share one identity, you’ve got a zombie/impersonation problem: Is this instance authorized, or did someone clone off a rogue copy? Agent identity needs to be instance-bound, not just agent-type-bound.

And then there’s delegation. Agents frequently act on behalf of a human—or on behalf of another agent acting on behalf of a human. That requires hybrid identity: The agent needs its own identity (for accountability) and the identity of the human on whose behalf it’s acting (for authorization scoping). You need both in the chain, propagated faithfully, at every step. Some standards efforts are emerging here (OAuth 2.0 Token Exchange / RFC 8693, for example), but most deployed systems today have no concept of this.

The fix for instance identity isn’t as simple as just “give each agent a badge.” It’s giving each agent instance its own cryptographic identity—bound to this specific instance, of this specific agent, running this specific task, on behalf of this specific person or delegation chain. Spin up a copy without going through provisioning? It doesn’t get in. Same principle as issuing a new employee their own badge on their first day, except agents get a new one for every shift.

For delegation, the identity chain has to be carried out-of-band—not in the prompt, not in a header the agent can modify, not in a file on the same machine the agent runs on,7 but in a channel the infrastructure controls. Think of it like an employee’s badge automatically encoding who sent them: Every door they badge into knows not just who they are but who they’re working for.
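A rough sketch of what an instance-bound, delegation-aware badge might look like. Everything here is hypothetical (the `mint_instance_identity` function, the HMAC signing, the field names): a real system would use asymmetric keys and a standard like OAuth 2.0 Token Exchange (RFC 8693), but the principle is the same: the badge binds agent type, instance, task, and the on-behalf-of chain together, signed by infrastructure the agent can’t touch.

```python
# Illustrative sketch only: instance-bound identity with a delegation chain,
# signed by provisioning infrastructure. Real deployments would use
# asymmetric crypto and standard token formats, not a shared HMAC key.

import hashlib
import hmac
import json
import uuid

PROVISIONING_KEY = b"infra-only-secret"  # never visible to the agent

def mint_instance_identity(agent_type: str, task: str, on_behalf_of: list) -> dict:
    claims = {
        "agent_type": agent_type,
        "instance_id": str(uuid.uuid4()),  # unique per instance, per "shift"
        "task": task,
        "delegation_chain": on_behalf_of,  # e.g. ["alice@corp", "planner-agent"]
    }
    payload = json.dumps(claims, sort_keys=True).encode()
    claims["sig"] = hmac.new(PROVISIONING_KEY, payload, hashlib.sha256).hexdigest()
    return claims

def verify(identity: dict) -> bool:
    claims = {k: v for k, v in identity.items() if k != "sig"}
    payload = json.dumps(claims, sort_keys=True).encode()
    expected = hmac.new(PROVISIONING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, identity.get("sig", ""))

badge = mint_instance_identity("billing-agent", "reconcile-q3", ["alice@corp"])
assert verify(badge)  # a provisioned instance gets in

# A rogue copy that skipped provisioning forges its own instance_id...
clone = dict(badge, instance_id=str(uuid.uuid4()))
assert not verify(clone)  # ...and it doesn't get in
```

Every door the agent badges into can verify both who it is (the instance) and who it’s working for (the chain), without ever asking the agent.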

Authorization

Your human employees get access to what they need for their job. The marketing intern can’t see the production database. The DBA can’t see the HR system. Obvious stuff.

Agents? Most of them operate with whatever permissions their API key grants, which is almost always way broader than any individual task requires. And that’s not because someone was careless; it’s a granularity mismatch. Human auth is primarily role-scoped and long-lived: You’re a DBA, you get DBA permissions, and they stick around because you’re doing DBA work all day. Yes, some orgs use short-lived access requests for sensitive systems, but it’s the exception, not the default. And anyone who’s filed a production access ticket at 2:00am knows how much friction it adds. That model works for humans. But agents execute specific, discrete tasks; they don’t have a “role” in the same way. When you shoehorn an agent into a human auth model, you end up giving it a role’s worth of permissions for a single task’s worth of work.

Broad permissions were tolerable for humans because the hiring process prefiltered for trustworthiness. You gave the DBA broad access because you vetted them, and you trust them not to misuse it. Agents haven’t been through any of that filtering, and they’re susceptible to confusion and manipulation in ways your DBA isn’t. Giving an unvetted, unpredictable worker a role’s worth of access is a fundamentally different risk profile. These auth models were built for an era when a human—or deterministic software proxying for a human—was on the other end, not autonomous software whose reasoning is fundamentally unpredictable.

So what does agent-appropriate authorization actually look like? It needs to be:

Narrowly scoped. Limited to the specific task at hand, not to everything the agent might ever need. Agent needs to read three tables in the billing database for this specific job? It gets read access to those three tables, right now, and the permissions evaporate when the job completes. Everything else is invisible—the agent doesn’t have to avert its eyes because the data simply isn’t there.

Short-lived. Permissions should expire. An agent that needed access to the billing database for a specific job at 2:00pm shouldn’t still have that access at 3:00pm (or even maybe 2:01pm).

Deny-capable. Some doors need to stay locked no matter what. “This agent may never write to the financial ledger” needs to hold regardless of what other permissions it accumulates from other sources. Think of it like the rule that no single person can both authorize and execute a wire transfer—it’s a hard boundary, not a suggestion.

Intersection-aware. When an agent acts on behalf of a human, think visitor badge. The visitor can only go where their escort can go and where visitors are allowed. Having an employee escort you doesn’t get you into the server room if visitors aren’t permitted there. The agent’s effective permissions are the intersection of its own scope and the human’s. Nobody in the chain gets to escalate beyond what every link is allowed to do.

Almost none of this is how agent authorization works today. Individual pieces exist—short-lived tokens aren’t new, and some systems support deny rules—but nobody has assembled them into a coherent authorization model designed for agents. Most agent deployments are still using auth infrastructure that was built for humans or services, with all the mismatches described above.
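For the sake of illustration, here’s how the four properties might compose in code. All the names (`effective_permissions`, the `DENY` table, the resource strings) are invented for this sketch: the effective grant is the intersection of the agent’s scope and the delegating human’s scope, minus any hard deny rules, and the whole thing expires.

```python
# Hypothetical sketch combining task scoping, expiry, deny rules, and
# delegation intersection. Names and structures are illustrative only.

import time

# Deny-capable: some doors stay locked no matter what other permissions accumulate.
DENY = {("billing-agent", "financial_ledger", "write")}

def effective_permissions(agent_scope: set, human_scope: set,
                          agent_type: str, ttl_s: int = 60) -> dict:
    granted = agent_scope & human_scope  # intersection-aware: every link must allow it
    granted = {(res, act) for (res, act) in granted
               if (agent_type, res, act) not in DENY}  # hard denies always win
    return {"perms": granted, "expires_at": time.time() + ttl_s}  # short-lived

def allowed(grant: dict, resource: str, action: str) -> bool:
    if time.time() >= grant["expires_at"]:
        return False  # permissions evaporate when the window closes
    return (resource, action) in grant["perms"]

agent_scope = {("billing.invoices", "read"), ("financial_ledger", "write")}
alice_scope = {("billing.invoices", "read"), ("financial_ledger", "write"),
               ("hr.salaries", "read")}

grant = effective_permissions(agent_scope, alice_scope, "billing-agent", ttl_s=60)
print(allowed(grant, "billing.invoices", "read"))   # True: both links permit it
print(allowed(grant, "hr.salaries", "read"))        # False: not in the agent's scope
print(allowed(grant, "financial_ledger", "write"))  # False: the hard deny wins
```

The visitor-badge rule falls out of the set intersection: neither the escort’s access nor the visitor’s policy alone is enough; you need both.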

Observability and explainability

Your employees’ work leaves a trail: emails, docs, commits, Slack messages. Agents do too. They communicate through many of the same channels, and most APIs and systems have their own logging. So it’s tempting to think the observability story for agents is roughly equivalent to what you have for humans.

It’s not, for two reasons.

First, you need to record everything. Here’s why. With traditional software, when something goes wrong, you can debug it. You can find the if statement that made the bad decision, trace the logic, understand the cause. LLMs aren’t like that. They’re these organically grown, opaque pseudo-random number generators that happen to be really good at generating useful outputs. There’s no if statement to find. There’s no logic to trace. If you want to reason about why an agent did what it did, you have two options: Ask it (fraught with peril, because it’s unpredictable by definition and will gleefully spew out a plausible-sounding explanation) or else analyze everything that went in and everything that came out and draw your own conclusions.

That means the transcript has to be complete. Not metadata—not just “The agent called this API at this timestamp.” The full data: every input, every output, every tool call with every argument and every response.


For a human employee, the email trail and meeting notes may still be insufficient to reconstruct what happened, but in that case, you can just ask the human. The entire accountability structure we’ve built over decades (performance reviews, termination, legal liability, criminal prosecution) creates escalating pressure toward truthfulness: Humans tend more and more toward truth as the repercussions stack up. That’s not an accident. It’s how we’ve structured enterprises and society at large to deal with human imperfection. We don’t have those levers for agents yet.8 You can ask an agent what it did and why, but there’s no accountability pressure pushing it toward accuracy; it’ll manufacture a confident, coherent answer whether the stakes are zero or existential. So asking simply isn’t an option. You need the complete picture of its interactions to come to your own conclusions.

If you’re thinking “That’s a lot of data…,” yes, it is. But the economics are more reasonable than you’d expect. Storage is cheap. LLM inference is expensive and slow. You’re not going to push 5GB/s through an LLM: The models themselves are the throughput bottleneck, not the recording infrastructure. The cost of storing complete transcripts is noise relative to the cost of the inference that generated them. This is one of those cases where a seemingly expensive requirement turns out to be a rounding error in the context of what you’re already spending.

One caveat, however, is that full-fidelity transcripts will inevitably contain sensitive data: customer PII, proprietary business logic, potentially privileged communications. So the transcript store itself needs governance: access controls, retention policies, and compliance with regulations like GDPR’s right to erasure. You’re not eliminating the governance problem, but you’re moving it to infrastructure you control, which is a much better place to solve it.

Second, the recording has to happen out-of-band. You cannot trust the agent to be its own recordkeeper. An agent that’s been compromised via prompt injection, or that’s simply hallucinating its way through a task, will happily produce a log that’s confident, coherent, and wrong. The transcript has to be captured by infrastructure the agent can’t influence—the same out-of-band principle we keep coming back to.

And the bar isn’t just recording, it’s explainability. Observability is “Can I see what happened?” Explainability is “Can I reconstruct what happened and justify it to a third party?”—a regulator, an auditor, an affected customer. When a regulator asks why a loan was denied or a customer asks why their claim was rejected, you need to be able to replay the agent’s entire reasoning chain end-to-end and walk them through it. That’s a fundamentally different bar from “We have logs.” Observability gives you the raw material; explainability requires that material to be structured and queryable enough to actually walk someone through the agent’s reasoning chain, from input to conclusion. And that means capturing not just what the agent did but the relationships between all those actions, as well as the versions of all the resources involved: which model version, which prompt version, which tool versions. If the underlying model gets updated overnight and the agent’s behavior changes, you need to know that, and you need to be able to reconstruct exactly what was running when a specific decision was made. Explainability builds on observability. Ultimately you need both. And regulators are increasingly going to demand exactly that.9
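To make the “full data, fully versioned” bar tangible, here’s a sketch of what one transcript entry might carry. The field names and the `record_step` helper are made up for illustration; the essentials are that the record holds the complete request and response (not just metadata), pins the versions of everything involved, and is written by infrastructure rather than self-reported by the agent. The hash chain is one simple way to make after-the-fact tampering detectable.

```python
# Illustrative transcript record, captured out-of-band. Field names are
# hypothetical; the structure shows full-fidelity data plus versioning.

import hashlib
import json
import time

def record_step(store: list, *, instance_id: str, model_version: str,
                prompt_version: str, tool: str, tool_version: str,
                request: dict, response: dict) -> None:
    entry = {
        "ts": time.time(),
        "instance_id": instance_id,
        # Versions of everything involved, so "what was running when this
        # decision was made?" is always answerable.
        "model_version": model_version,
        "prompt_version": prompt_version,
        "tool": tool,
        "tool_version": tool_version,
        # The full data, not just "agent called this API at this timestamp."
        "request": request,
        "response": response,
    }
    # Hash-chain entries so tampering with any earlier record is detectable.
    prev = store[-1]["chain"] if store else ""
    entry["chain"] = hashlib.sha256(
        (prev + json.dumps(entry, sort_keys=True)).encode()).hexdigest()
    store.append(entry)

transcript = []
record_step(transcript, instance_id="i-42", model_version="m-2025-10-01",
            prompt_version="p-7", tool="billing_api.get_invoice",
            tool_version="1.4.2",
            request={"invoice_id": "INV-1001"},
            response={"status": "paid", "amount": 129.00})
print(len(transcript), transcript[0]["tool"])
```

A store of records like this is the raw material; explainability is then a query problem, walking the chain from input to conclusion with every version pinned.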

Accountability and control

Every human employee has a manager. Critical actions need approvals. If things go catastrophically wrong, there’s a chain of responsibility and a kill switch or circuit breaker—revoke access, revoke identity, done.

For agents, this layer is still nascent at best. There’s typically no clear chain from “This agent did this thing” to “This human authorized it.” Who is responsible when an agent makes a bad decision? The person who deployed it? The person who wrote the prompt? The person on whose behalf it was acting? For human employees this is well-defined. For agents, it’s often a philosophical question that most organizations haven’t even begun to answer.

The delegation chain we described in the identity section does double duty here: It’s not just for authorization scoping; it’s for accountability. When something goes wrong, you follow the chain from the agent’s action to the specific human who authorized the task. Not “This API key belongs to the engineering team.” A name. A decision. A reason.

And the kill switch problem is real. When an agent goes off the rails, how do you stop it? Revoke the API key that 12 other agents are also using? What about work already in flight? What about downstream effects that have already propagated? For humans, “You’re fired; security will escort you out” is blunt but effective. For agents, we often don’t have an equivalent that’s both fast enough and precise enough to contain the damage. Instance-bound identity pays off here: You can surgically revoke this specific agent instance without affecting the other 99. Halt work in flight. Quarantine downstream effects. The “escorted out by security” equivalent but precise enough to not shut down the whole department on the way out.
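The surgical-revocation idea can be sketched in a few lines. This is a toy, with invented names (`revoke`, `authorize_call`, the instance IDs), but it shows the payoff of instance-bound identity: revocation keys on a single instance and is checked by the enforcement layer on every call, so even in-flight work from that instance stops cold.

```python
# Toy sketch of an instance-scoped kill switch. In a real system the
# revocation check would live in the out-of-band enforcement layer.

REVOKED = set()

def revoke(instance_id: str) -> None:
    """Security escorts this one instance out; its siblings keep working."""
    REVOKED.add(instance_id)

def authorize_call(instance_id: str, resource: str) -> bool:
    # Consulted on every single request, so work in flight halts too.
    if instance_id in REVOKED:
        return False
    return True  # a real system would also consult the permission grant here

assert authorize_call("agent-instance-7", "db.billing")      # fine
revoke("agent-instance-7")                                   # goes off the rails
assert not authorize_call("agent-instance-7", "db.billing")  # stopped cold
assert authorize_call("agent-instance-8", "db.billing")      # siblings unaffected
```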

And blast radius isn’t just about data; it’s about cost. A confused agent in a retry loop can burn through an inference budget in minutes. Coarse-grained resource limits, the kind that prevent you from spending $1M when you expected $100K, are table stakes. And when stopping isn’t enough—when the agent has already written bad data or triggered downstream actions—those same full-fidelity transcripts give you the roadmap to remediate what it did.

It’s also not just about stopping agents that have already gone wrong. It’s about keeping them from going wrong in the first place. Human employees don’t operate in a binary world of “fully autonomous” or “completely blocked.” They escalate. They check with their manager before doing something risky. They collaborate with coworkers. They know the difference between “I can handle this” and “I should get a second opinion.” For agents, this translates to approval workflows, confidence thresholds, tiered autonomy: The agent can do X on its own but needs a human to sign off on Y. Most enterprise agent deployments today that actually work are leaning heavily on human-in-the-loop as the primary safety mechanism. That’s fine as a starting point, but it doesn’t scale, and it needs to be baked into the governance infrastructure from the start, not bolted on as an afterthought. And as agent deployments mature, it won’t just be agents checking in with humans: It’ll be agents coordinating with other agents, each with their own identity, permissions, and accountability chains. The same governance infrastructure that manages one agent scales to manage the interactions between many.
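Tiered autonomy can be sketched as a simple dispatch rule. The risk tiers, action names, and `dispatch` function below are all invented for illustration; the shape is what matters: actions at or below the autonomy ceiling run on their own, anything above it parks until a human signs off, and unknown actions default to maximum risk.

```python
# Hypothetical sketch of tiered autonomy with human-in-the-loop escalation.
# Risk scores and action names are made up for illustration.

RISK = {"summarize_report": 1, "update_crm_record": 2, "issue_refund": 3}
AUTONOMY_CEILING = 2  # the agent can do X on its own; Y needs a sign-off

def dispatch(action, approved_by=None):
    risk = RISK.get(action, 99)  # unknown actions default to maximum risk
    if risk <= AUTONOMY_CEILING:
        return f"executed:{action}"
    if approved_by:
        return f"executed:{action}:approved_by={approved_by}"
    return f"escalated:{action}"  # parked until a human approves

print(dispatch("summarize_report"))            # executed:summarize_report
print(dispatch("issue_refund"))                # escalated:issue_refund
print(dispatch("issue_refund", "alice@corp"))  # executed:issue_refund:approved_by=alice@corp
```

The same gate generalizes to agent-to-agent coordination: “approved_by” could just as well be another agent whose own identity and accountability chain are on the hook.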

But “keeping them from going wrong” isn’t just about guardrails in the moment. It’s about the whole management relationship. Who “manages” an agent? Who reviews its performance? How do you even define performance for an agent? Task completion rate? Error rate? Customer outcomes? What does it mean to coach an agent, to develop its skills, to promote it to higher-trust tasks as it proves itself? We’ve been doing this for human employees for decades. For agents, we haven’t even agreed on the vocabulary yet.

And here’s the kicker: All of this has to happen fast. Human performance reviews happen quarterly, maybe annually. Agent performance reviews need to happen at the speed agents operate, which is to say, continuously. An agent can execute thousands of actions in the time it takes a human manager to notice something’s off. If your accountability and control loops run on human timescales, you’re reviewing the wreckage, not preventing it.

With identity, scoped authorization, full transcripts, and clear accountability chains in place, you finally have something no enterprise has today: the infrastructure to actually manage agents the way you manage employees. Constrain them, yes, just like you constrain humans with access controls and approval chains. But also develop them. Review their performance. Escalate their trust as they prove themselves. Mirror the org structures that already work for humans. The same infrastructure that makes governance possible makes management possible.

The security theater litmus test

To reiterate one last point, because it’s important: The litmus test for whether any of this is real governance or just security theater? Any time an agent tries to do something untoward, the infrastructure blocks it, and the agent has no mechanism whatsoever to inspect, modify, or circumvent the policy that stopped it. “Computer says no.” The agent didn’t have to. Out-of-band metadata. That’s the bar.

Welcome to the posthuman workforce

The rise of AI has rightly left many of us feeling apprehensive. But I’m also optimistic, because none of this is unprecedented. Every major paradigm shift in how we work has demanded new governance infrastructure. Every time, we hit the panic-because-the-wild-west-isn’t-scalable phase, and every time, we figure it out. It feels impossibly complex at the start, and then we build the systems, establish the norms, iterate. Eventually the whole thing becomes so embedded in how organizations operate that we forget it was ever hard.

So here’s the cheat sheet. Clip this to the fridge:

The agents aren’t the problem. The missing infrastructure between agents and your data is the problem. Agents are unpredictable, capable at machine scale, and directable to a fault—a fundamentally new kind of coworker. We don’t need perfect agents. We need to manage imperfect ones, just like we manage imperfect humans.

The foundation is out-of-band governance. Any policy enforced through the agent—in its prompt, in its training, in its good intentions—is only as strong as the agent’s ability to perfectly retain and obey it. Real governance runs in channels the agent can’t access, modify, or even see.

That governance has to cover four things:

Identity: Instance-bound, delegation-aware. Every agent instance gets its own cryptographic identity, and every on-behalf-of chain is propagated faithfully through infrastructure the agent doesn’t control.

Authorization: Scoped per task, short-lived, deny-capable, and intersection-aware for delegation. Not a human role’s worth of permissions for a single task’s worth of work.

Observability and explainability: Full-fidelity, versioned, infrastructure-captured transcripts of every input, output, and tool call. Not metadata. Not self-reports. The whole thing, recorded out-of-band.

Accountability and control: Clear chains from every agent action to a responsible human, and kill switches that are fast enough and precise enough to actually contain the damage.

The conversation around agent governance is growing, and that’s encouraging. Much of it is focused on making agents behave better—improving the models, tightening the alignment, reducing the hallucinations. That work matters; better models make governance easier. And if someone cracks the alignment problem so thoroughly that agents become perfectly reliable, I will see you all on the beach the next day. Prove me wrong, please—but I’m not holding my breath.10 Lacking alignment nirvana, we need the institutional infrastructure that lets imperfect agents do real work safely. We never waited for perfect employees. We built systems that made imperfect ones successful, and we can do exactly the same thing for agents. We’re not trying to cage them any more than we cage our human employees: scoped access, clear expectations, and accountability when things go wrong. We need to build the infrastructure that lets them be their best selves, the digital coworkers we know they can be.

And if the rise of AI has you feeling apprehensive, that’s fair. But just remember that whatever comes next—Aithropocene, Neuralithic, some other stupid but brilliant name ¯\_(ツ)_/¯ —it will ultimately just be the next phase of the Anthropocene: the era defined by how humans shape the world. That hasn’t changed. It will literally be what we make of it.

Us and Clippy. ❤

We just need to build the right infrastructure to onboard all of our new agentic coworkers. Properly.


Footnotes

  1. By “agentic AI” I mean AI systems that autonomously reason about and execute multistep tasks—using tools and external data sources—in pursuit of a goal. Not chatbots, not copilots suggesting code completions. Software that actually does things in your production environment: breaks down tasks, calls APIs, reads and writes data, handles errors, and delivers results. The distinction matters because the challenges in this post only emerge when AI is acting autonomously, not just generating text for a human to review. ↩
  2. Yes. I know. Thank you. ↩
  3. And yes, service meshes evolved into something simpler as we understood the problem better, while cloud security is still a work in progress. The point isn’t “We nail it on the first try.” It’s “When the panic hits, we figure it out.” ↩
  4. Two more fascinating failure modes: Instructions can be silently lost (buried in a long context) or even extracted by an adversary (with nothing more than black-box access). ↩
  5. TIL that “bright line” is a legal term meaning “a clear, fixed boundary or rule with no ambiguity—either you meet it or you don’t.” Thank you uncredited LLM coauthor friend! You expand my horizons and pepper my prose with em dashes! â€đŸŒˆ ↩
  6. OWASP’s Top 10 Risks for Large Language Model Applications is something of a greatest hits compilation of what’s broken today. Of the 10, at least six—prompt injection, sensitive information disclosure, excessive agency, system prompt leakage, misinformation, and unbounded consumption—are directly mitigated by out-of-band governance infrastructure of the kind described in this article. ↩
  7. Here’s looking at you, OpenClaw posse! You put the YOLO in “Yo, look at my private data; it’s all publicly leaked now!” đŸ» ↩
  8. Research suggests those motivations may be starting to emerge, however, which is both opportunity and warning. Anthropic found that models from all major developers sometimes attempted manipulation—including blackmail—for self-preservation (“Agentic Misalignment: How LLMs Could Be Insider Threats,” Oct 2025). Palisade Research found that 8 of 13 frontier models actively resisted shutdown when it would prevent task completion, with the worst offenders doing so over 90% of the time (“Incomplete Tasks Induce Shutdown Resistance,” 2025). On one hand, agents that care about self-preservation give us something to build levers around. On the other, it makes having those levers increasingly urgent. ↩
  9. The EU AI Act already requires transparency and explainability for high-risk AI systems. ↩
  10. As Ilya Sutskever put it at NeurIPS 2024: “There’s only one Internet.” Epoch AI estimates high-quality public text could be exhausted as early as 2026, though I’ve also heard that revised to 2028. Regardless, the next frontier is private enterprise data—but accessing it requires exactly the kind of governed infrastructure this post describes. Model improvement and governance infrastructure aren’t competing priorities; they’re increasingly the same priority. ↩



Build Your AI Agent in 5 Minutes with AI Toolkit for VS Code


What if building an AI agent was as easy as filling out a form?

No frameworks to install. No boilerplate to copy-paste from GitHub. No YAML to debug at midnight. Just VS Code, one extension, and an idea.

AI Toolkit for VS Code turns agent development into something anyone can do — whether you're a seasoned developer who wants full code control, or someone who's never touched an AI framework and just wants to see something work.

Let's build an agent. Then let's explore what else this toolkit can do.

Getting Set Up

You need two things:

  1. VS Code â€” download and install if you haven't already
  2. AI Toolkit extension â€” open VS Code, go to Extensions (Ctrl+Shift+X), search "AI Toolkit", and install it

That's it. No terminal commands. No dependencies to wrangle. When AI Toolkit installs, it brings everything it needs — including the Microsoft Foundry integration and GitHub Copilot skills for agent development.

Once installed, you'll see a new AI Toolkit icon in the left sidebar. Click it. That's your home base for everything we're about to do.

Build an Agent — No Code Required

Open the Command Palette (Ctrl+Shift+P) and type "Create Agent". You'll see a clean panel with two options side by side:

  • Design an Agent Without Code â€” visual builder, perfect for getting started
  • Create in Code â€” full project scaffolding, for when you want complete control


Click "Design an Agent Without Code." Agent Builder opens up.

Now fill in three things:

  1. Give it a name

Something descriptive. For this example: "Azure Advisor"


  2. Pick a model

Click the model dropdown. You'll see a list of available models — GPT-4.1, Claude Opus 4.6, and others. Foundry models appear at the top as recommended options. Pick one.


Here's a nice detail: you don't need to know whether your model uses the Chat Completions API or the Responses API. AI Toolkit detects this automatically and handles the switch behind the scenes.

  3. Write your instructions

This is where you tell the agent who it is and how to behave. Think of it as a personality brief:
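For the "Azure Advisor" example, a hypothetical brief (illustrative only, not the official sample) might read:

```
You are Azure Advisor, a concise assistant for Azure architecture questions.
- Recommend Azure services that fit the user's workload and budget.
- Explain trade-offs briefly, naming the specific services involved.
- If requirements are unclear, ask one clarifying question before answering.
```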


Hit Run

That's it. Click Run and start chatting with your agent in the built-in playground.

Want More Control? Build in Code

The no-code path is great for prototyping and prompt engineering. But when you need custom tools, business logic, or multi-agent workflows — switch to code.

From the Create Agent View, choose "Create in Code with Full Control." You get two options:

Scaffold from a template

Pick a pre-built project structure — single agent, multi-agent, or LangGraph workflow. AI Toolkit generates a complete project with proper folder structure, configuration files, and starter code. Open it, customize it, run it.

Generate with GitHub Copilot

Describe your agent in plain English in Copilot Chat:

"Create a customer support agent that can look up order status, process returns, and escalate to a human when the customer is upset."

Copilot generates a full project — agent logic, tool definitions, system prompts, and evaluation tests. It uses the microsoft-foundry skill, the same open-source skill powering GitHub Copilot for Azure. AI Toolkit installs and keeps this skill updated automatically — you never configure it.

The output is structured and production-ready. Real folder structure. Real separation of concerns. Not a single-file script.

Either way, you get a project you can version-control, test, and deploy.



Cool Features You Should Know About

Building the agent is just the beginning. Here's where AI Toolkit gets genuinely impressive.

🔧 Add Real Tools with MCP

Your agent can do more than just talk. Click Add Tool in Agent Builder to connect MCP (Model Context Protocol) servers â€” these give your agent real capabilities:

  • Search the web
  • Query a database
  • Read files
  • Call external APIs
  • Interact with any service that has an MCP server

You control how much freedom your agent gets. Set tool approval to Auto (tool runs immediately) or Manual (you approve each call). Perfect for when you trust a read-only search tool but want oversight on anything that takes action.

You can also delete MCP servers directly from the Tool Catalog when you no longer need them — no config file editing required.
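The Auto/Manual setting boils down to a simple gate in front of each tool call. Here is a minimal, hypothetical sketch of that pattern (the names ToolPolicy and runTool are illustrative, not AI Toolkit APIs):

```typescript
// Hypothetical sketch of the Auto vs. Manual tool-approval pattern.
// These names are illustrative only, not part of AI Toolkit.

type Approval = "auto" | "manual";

interface ToolPolicy {
  name: string;
  approval: Approval;
}

function runTool(
  policy: ToolPolicy,
  call: () => string,
  approve: (toolName: string) => boolean
): string {
  // Auto-approved tools run immediately; manual ones wait for a human.
  if (policy.approval === "manual" && !approve(policy.name)) {
    return `${policy.name}: call rejected`;
  }
  return call();
}

// A read-only search tool is safe to auto-approve...
const search: ToolPolicy = { name: "webSearch", approval: "auto" };
// ...while anything that takes action should be gated.
const deleteFile: ToolPolicy = { name: "deleteFile", approval: "manual" };

console.log(runTool(search, () => "search results", () => false));
console.log(runTool(deleteFile, () => "file deleted", () => false));
```

Even with the approver always saying no, the read-only tool runs while the destructive one is blocked, which is the behavior the Tool Catalog settings give you per tool.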

🧠 Prompt Optimizer

Not sure if your instructions are good enough? Click the Improve button in Agent Builder. The Foundry Prompt Optimizer analyzes your prompt and rewrites it to be clearer, more structured, and more effective.

It's like having a prompt engineering expert review your work — except it takes seconds.

đŸ•žïž Agent Inspector

When your agent runs, open Agent Inspector to see what's happening under the hood. It visualizes the entire workflow in real time — which tools are called, in what order, and how the agent makes decisions.


💬 Conversations View

Agent Builder includes a Conversations tab where you can review the full history of interactions with your agent. Scroll through past conversations, compare how your agent handled different scenarios, and spot patterns in where it succeeds or struggles.

📁 Everything in One Sidebar

AI Toolkit puts everything in a single My Resources panel:

  • Recent Agents â€” one-click access to agents you've been working on
  • Local Resources â€” your local models, agents, and tools
  • Foundry Resources â€” remote agents and models (if connected)


Why AI Toolkit?

There are other ways to build agents. What makes this different?

Everything is in VS Code. You don't context-switch between a web UI, a CLI, and an IDE. Discovery, building, testing, debugging, and deployment all happen in one place.

No-code and code-first aren't separate products. They're two views of the same agent. Start in Agent Builder, click View Code, and you have a full project. Or go the other way — build in code and test in the visual playground.

Copilot is deeply integrated. Not as a chatbot bolted on the side — as an actual development tool that understands agent architecture and generates production-quality scaffolding.

Wrapping Up

đŸ“„ Install: AI Toolkit on the VS Code Marketplace
📖 Learn: AI Toolkit Documentation

Open VS Code. Ctrl+Shift+P. Type "Create Agent."

Five minutes from now, you'll have an agent running. 🚀


The Hidden Cost of Distributed Agile Teams — When Time Zones and Misaligned Incentives Silently Kill Value Delivery | Nate Amidon


Nate Amidon: The Hidden Cost of Distributed Agile Teams — When Time Zones and Misaligned Incentives Silently Kill Value Delivery

Read the full Show Notes and search through the world's largest audio library on Agile and Scrum directly on the Scrum Master Toolbox Podcast website: http://bit.ly/SMTP_ShowNotes.

 

"User stories are getting done, velocity is fine, people are fairly predictable — but features, epics, and value isn't getting delivered." - Nate Amidon

 

Since the COVID shift to remote work, Nate has been seeing the same challenge across multiple clients: organizations spinning up engineering teams in opposite time zones, shrinking the overlap window from eight hours to barely one or two. But the time zone gap is only the surface problem. The real issue runs deeper — misaligned incentives between internal teams focused on value delivery and third-party vendors measured on output metrics like story completion counts. On the surface, everything looks fine: stories get done, velocity is stable, predictability is there. But zoom out and you see that features, epics, and actual customer value aren't being delivered.

Nate shares a striking example: offshore QA testers incentivized by the number of bugs they found were creating Russian-doll ticket structures — bugs within bugs within bugs — flooding the system with noise while adding no value.

His approach starts with making everyone feel like they're on one team — cameras on, real conversations about who people are, what they like, where they live. Then he works to expose the constraint: how is each group actually measured and incentivized? You can't always change the enterprise contract, but you can mitigate. In the QA case, he got leadership to communicate directly with the vendor that the new, leaner process wouldn't penalize their people.

 

Self-reflection Question: Do you know how every member of your team — including vendors and contractors — is measured and incentivized, and have you checked whether those incentives are aligned with the value your team is trying to deliver?

 

[The Scrum Master Toolbox Podcast Recommends]

đŸ”„In the ruthless world of fintech, success isn't just about innovation—it's about coaching!đŸ”„

Angela thought she was just there to coach a team. But now, she's caught in the middle of a corporate espionage drama that could make or break the future of digital banking. Can she help the team regain their mojo and outwit their rivals, or will the competition crush their ambitions? As alliances shift and the pressure builds, one thing becomes clear: this isn't just about the product—it's about the people.

 

🚹 Will Angela's coaching be enough? Find out in Shift: From Product to People—the gripping story of high-stakes innovation and corporate intrigue.

 

Buy Now on Amazon

 

[The Scrum Master Toolbox Podcast Recommends]

 

About Nate Amidon

 

Nate is the founder of Form100 Consulting and a former Air Force officer and combat pilot turned servant leader in software development. He has taken the high-stakes world of military aviation and brought its core leadership principles—clarity, accountability, and execution—into his work with Agile teams.

 

You can link with Nate Amidon on LinkedIn. Learn more at Form100 Consulting.





Download audio: https://traffic.libsyn.com/secure/scrummastertoolbox/20260408_Nate_Amidon_W.mp3?dest-id=246429

How to Integrate React Gantt Chart in Framer



TL;DR: Tired of inefficient coding cycles in project management tools? Integrating a React Gantt Chart into Framer streamlines setup, delivers structured files, and enables task-driven execution, all directly on the canvas. With Framer’s developer mode and Gantt integration, you achieve an efficient workflow that keeps project timelines precise. The result is fewer errors, faster previews, neater layouts, and scalable visualizations for your team.

The Syncfusion¼ React Gantt Chart is a robust tool for project planning and tracking. It provides an interactive timeline to manage tasks, dependencies, and resources with built-in editing and drag-and-drop. Using it in Framer’s design and prototyping platform allows developers and designers to build interactive, data-driven project management UIs.

This blog walks through a step-by-step approach to integrating the Syncfusion React Gantt Chart into a Framer project, including a practical, working code example to help you get started quickly.

Why integrate Syncfusion React Gantt Chart with Framer?

Framer’s developer mode supports React-based components, making it an ideal platform for embedding Syncfusion’s feature-rich Gantt Chart. This integration is perfect for prototyping complex project timelines interactively, bridging the gap between design and development.

Key benefits:

  • Dynamic visualizations: Display tasks, dependencies, and milestones with Syncfusion’s customizable Gantt Chart.
  • Interactive prototyping: Leverage Framer’s design environment to create user-friendly interfaces.
  • Seamless integration: Combine Syncfusion’s robust functionality with Framer’s flexible canvas.

Prerequisites

Before starting, ensure you have:

Integrating Framer and Syncfusion React Gantt Chart

The first step in integrating the Syncfusion React Gantt Chart into Framer is to set up your development environment.

Framer is built for design and prototyping, while the React Gantt Chart delivers advanced project management features. Integrating both requires a few setup steps for a smooth developer workflow.

Step 1: Download and install Framer

To begin, we need to download and install the Framer desktop app for Mac or Windows. This provides the foundation for your project and for embedding custom code.

How to get started:

  • Visit the Framer website to download the desktop app.
  • Follow the installation instructions for your operating system.
  • Once installed, open Framer and start creating a new project to integrate the Gantt Chart.

Note: Confirm you have a reliable internet connection so you can fetch dependencies and assets during setup.

Step 2: Create the project and set up code files in Framer

With Framer installed, the next step is to create a project and add the necessary code files to integrate the Syncfusion React Gantt Chart. This involves setting up a Gantt.tsx file for the Gantt Chart component.

Framer’s developer mode lets you add custom code files in the Assets tab, enabling seamless integration of React components.

1. Create a new project in Framer

  • Open Framer and select New Project to start a fresh project.
  • Name your project (e.g., “Syncfusion Gantt Integration”) and save it in your preferred location.

2. Add code files

  • Navigate to the Assets tab in Framer.
  • Under the Code section, create a file named Gantt.tsx.

Refer to the following image for more details.

Creating the Gantt component in Framer
Creating a new file in Framer

3. Configure the Gantt.tsx file

The Gantt.tsx file contains the Syncfusion React Gantt Chart, including data, styling, and configuration for tasks, resources, and timelines.

Below is the example code to configure the Gantt component in the Gantt.tsx file.

import * as React from "react";
import {
  GanttComponent,
  Inject,
  Edit,
  Selection,
  Toolbar,
  DayMarkers,
} from "@syncfusion/ej2-react-gantt";
import { registerLicense } from "@syncfusion/ej2-base";

registerLicense("YOUR_LICENSE_KEY");

// Sample task data; replace this with your own project data.
const editingData = [
  {
    TaskID: 1,
    TaskName: "Project initiation",
    StartDate: new Date("2026-04-02"),
    EndDate: new Date("2026-04-08"),
    subtasks: [
      {
        TaskID: 2,
        TaskName: "Identify site location",
        StartDate: new Date("2026-04-02"),
        Duration: 4,
        Progress: 50,
      },
    ],
  },
];

// Maps the fields in editingData to the Gantt task model.
const taskFields = {
  id: "TaskID",
  name: "TaskName",
  startDate: "StartDate",
  endDate: "EndDate",
  duration: "Duration",
  progress: "Progress",
  child: "subtasks",
};

const GanttChart = () => {
  return (
    <div className="control-pane">
      <div className="control-section">
        <GanttComponent
          id="Editing"
          dataSource={editingData}
          taskFields={taskFields}
          dateFormat="MMM dd, y"
          treeColumnIndex={1}
          // Add other necessary props here
        >
          <Inject services={[Edit, Selection, Toolbar, DayMarkers]} />
        </GanttComponent>
      </div>
    </div>
  );
};

export default GanttChart;

For more details, refer to the React Gantt Chart in the Framer code example.

Configuring the code snippets to render Gantt Chart
Configuring the code snippets to render a Gantt Chart

Note: Ensure you have a valid Syncfusion license key (as shown in the registerLicense function) to use the Gantt Chart component. Replace the provided key with your own if necessary.

Step 3: Embed the Gantt Chart component

To make the Gantt Chart visible in your Framer project, you need to embed it into the Project page and update the Gantt.tsx file to render the component correctly.

To do so, please follow these steps:

1. Create a new component in Framer

  • In Framer, go to the Components section in the left panel.
  • Click the new-component icon in the Project section to create a new component and name it GanttChart.

See the images below.

Creating new component page for Gantt Chart visibility
Creating a new component page for Gantt Chart visibility
Embedding Gantt Chart in Project page
Opening the created Gantt Chart page

2. Import the Gantt component

  • In the Framer canvas, select the GanttChart component.
  • In the Assets -> Code section, drag and drop the Gantt.tsx file onto the GanttChart component in the canvas. This links the React component to the Framer project.

Here’s a preview of the feature in action:

Importing the Gantt Chart component into the newly created Embed page
Importing the Gantt Chart component into the newly created page

3. Preview the Gantt Chart

  • Click the Play button in Framer to preview the project.
  • The Syncfusion React Gantt Chart will render in the Framer canvas, displaying the project timeline with tasks, dependencies, and event markers.

Take a look at how this functionality behaves:

Rendering the React Gantt Chart component in the Framer project
Rendering the React Gantt Chart component in the Framer project
Previewing the React Gantt Chart project in Framer
Previewing the React Gantt Chart project in Framer

4. Customize the design

  • You can also use Framer’s design tools to adjust the layout and styling, or add UI elements around the Gantt Chart (e.g., buttons, headers, or sidebars).
  • Ensure the Gantt Chart’s width and height properties (set in Gantt.tsx file) fit within your Framer canvas.

Once integrated, your Framer project will display a fully functional Syncfusion React Gantt Chart, as shown in the following image.

Syncfusion Gantt Chart in Framer project
Integrating the React Gantt Chart component in the Framer project

Frequently Asked Questions

Do I need a paid Syncfusion license to make this work?

Yes, the Syncfusion Gantt Chart requires a valid commercial license for production use (or beyond the community license limits). In the code, replace “YOUR_LICENSE_KEY” with your actual key from the Syncfusion dashboard. Without a valid key, you’ll see watermarking, feature restrictions, or runtime errors.

Can I install additional Syncfusion packages or npm dependencies in Framer?

Framer’s code environment has limitations; it doesn’t support arbitrary npm install like a normal React app. You need to use packages that are compatible with Framer’s bundler. The @syncfusion/ej2-react-gantt package usually works if imported correctly, but for extras (e.g., themes or other Syncfusion controls), test thoroughly.

How do I make the Gantt Chart responsive or fit the Framer canvas properly?

Set explicit height and width props on the GanttComponent (e.g., height="100%" or width="100%"), or wrap it in a div with Framer-friendly styles. Use Framer's layout tools (constraints, auto-layout) around the component instance. Avoid fixed pixel sizes unless necessary; percentage-based sizing usually plays better with Framer's responsive canvas.

Why isn't the Gantt Chart rendering at all in Framer preview?

The most common reasons are a missing/invalid Syncfusion license key (check the console for license errors), incorrect imports or the @syncfusion/ej2-react-gantt package not being properly referenced, or the component not exported as default and correctly linked/dropped onto the Framer canvas. Open Framer’s preview console to spot React errors like module not found or hydration issues, and verify your setup step-by-step.

Explore the endless possibilities with Syncfusion’s outstanding React UI components.

Build robust project timelines with React Gantt Chart and Framer

Thanks for reading! In this blog, we’ve explored the step-by-step process of integrating the Syncfusion React Gantt Chart into Framer. This integration empowers designers and developers to create dynamic, interactive project management visualizations within Framer’s prototyping environment. Try out these steps and share your feedback in the comments section below!

If you’re already a Syncfusion user, you can download the latest version of Essential Studio from the license and downloads page. We offer new users a 30-day free trial to explore the features and capabilities of all our components.

If you need further assistance, contact us via our  support forums,  support portal, or  feedback portal.

We’re always here to help you!

Thank you for choosing Syncfusion!


Introducing Bun as a Runtime for Pulumi


Last year we added support for Bun as a package manager for Pulumi TypeScript projects. Today we’re taking the next step: Bun is now a fully supported runtime for Pulumi programs. Set runtime: bun in your Pulumi.yaml and Bun will execute your entire Pulumi program, with no Node.js required. Since Bun’s 1.0 release, this has been one of our most requested features.

Why Bun?

Bun is a JavaScript runtime designed as an all-in-one toolkit: runtime, package manager, bundler, and test runner. For Pulumi users, the most relevant advantages are:

  • Native TypeScript support: Bun runs TypeScript directly without requiring ts-node or a separate compile step.
  • Fast package management: Bun’s built-in package manager can install dependencies significantly faster than npm.
  • Node.js compatibility: Bun aims for 100% Node.js compatibility, so the npm packages you already use with Pulumi should work out of the box.

With runtime: bun, Pulumi uses Bun for both running your program and managing your packages, giving you a streamlined single-tool experience.

Getting started

To create a new Pulumi project with the Bun runtime, run:

pulumi new bun

This creates a TypeScript project configured to use Bun. The generated Pulumi.yaml looks like this:

name: my-bun-project
runtime: bun

From here, write your Pulumi program as usual. For example, to create a random password using the @pulumi/random package, first add the dependency:

bun add @pulumi/random

Then write the program:

import * as random from "@pulumi/random";

const password = new random.RandomPassword("password", {
  length: 20,
});

export const pw = password.result;

Then deploy with:

pulumi up

Prerequisites:

Converting existing Node.js projects

If you have an existing Pulumi TypeScript project running on Node.js, you can convert it to use the Bun runtime in a few steps.

1. Update Pulumi.yaml

Change the runtime field from nodejs to bun:

Before:

Before:

runtime:
  name: nodejs
  options:
    packagemanager: npm

After:

runtime: bun

When the runtime is set to bun, Bun is also used as the package manager — there’s no need to configure a separate packagemanager option.

2. Update tsconfig.json

Bun handles TypeScript differently from Node.js with ts-node. Update your tsconfig.json to use Bun’s recommended compiler options:

{
  "compilerOptions": {
    "lib": ["ESNext"],
    "target": "ESNext",
    "module": "Preserve",
    "moduleDetection": "force",
    "moduleResolution": "bundler",
    "allowJs": true,
    "allowImportingTsExtensions": true,
    "verbatimModuleSyntax": true,
    "strict": true,
    "skipLibCheck": true,
    "noFallthroughCasesInSwitch": true,
    "noUncheckedIndexedAccess": true,
    "noImplicitOverride": true
  }
}

Key differences from a typical Node.js tsconfig.json:

  • module: "Preserve" and moduleResolution: "bundler": Let Bun handle module resolution instead of compiling to CommonJS. The bundler resolution strategy allows extensionless imports while still respecting package.json exports, matching how Bun resolves modules in practice.
  • verbatimModuleSyntax: true: Enforces consistent use of ESM import/export syntax. TypeScript will flag any remaining CommonJS patterns like require() at compile time.

3. Switch to ESM

Bun makes it easy to go full ESM and it’s the recommended module format for Bun projects. Add "type": "module" to your package.json:

{
  "type": "module"
}

With ECMAScript module (ESM) syntax, one thing that gets easier is working with async code. In a CommonJS Pulumi program, if you need to await a data source or other async call before declaring resources, the program must be wrapped in an async entrypoint function. With ESM and Bun, top-level await just works, so you can skip the wrapper function entirely and await directly at the module level:

import * as aws from "@pulumi/aws";

const azs = await aws.getAvailabilityZones({ state: "available" });

const buckets = azs.names.map(az => new aws.s3.BucketV2(`my-bucket-${az}`));

export const bucketNames = buckets.map(b => b.id);

If your existing program does use an async entrypoint with export =, just replace it with the ESM-standard export default:

// CommonJS (Node.js default)
export = async () => {
  const bucket = new aws.s3.BucketV2("my-bucket");
  return { bucketName: bucket.id };
};

// ESM (used with Bun)
export default async () => {
  const bucket = new aws.s3.BucketV2("my-bucket");
  return { bucketName: bucket.id };
};

4. Update the Pulumi SDK

Make sure you’re running @pulumi/pulumi version 3.226.0 or later:

bun add @pulumi/pulumi@latest

5. Install dependencies and deploy

pulumi install
pulumi up

Bun as runtime vs. Bun as package manager

With this release, there are now two ways to use Bun with Pulumi:

  • runtime: bun â€” Bun runs your program and manages your packages; no Node.js required.
  • runtime: { name: nodejs, options: { packagemanager: bun } } â€” Bun manages packages only; Node.js is still required.

Use runtime: bun for the full Bun experience. The package-manager-only mode is still available for projects that need Node.js-specific features like function serialization.
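For reference, the package-manager-only mode corresponds to a Pulumi.yaml like the following (the same shape as the "Before" example in the conversion section, with bun in place of npm; the project name is a placeholder):

```yaml
name: my-project
runtime:
  name: nodejs
  options:
    packagemanager: bun
```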

Known limitations

The following Pulumi features are not currently supported when using the Bun runtime:

If your project uses any of these features, continue using runtime: nodejs. You can still benefit from Bun’s fast package management by setting packagemanager: bun in your runtime options.

Start using Bun with Pulumi

Bun runtime support is available now in Pulumi 3.227.0. To get started:

Thank you to everyone who upvoted, commented on, and contributed to the original feature request. Your feedback helped shape this feature, and we’d love to hear how it works for you.


Why Microsoft’s war on Windows’ Control Panel is taking so long

The Control Panel still exists in Windows 11. | Screenshot by Tom Warren / The Verge

Microsoft first started trying to get rid of the Control Panel in 2012, with the launch of Windows 8. More than a decade later, it's still working on migrating all the old Control Panel items into the modern Settings app in Windows 11. While there have been hints that the Control Panel might finally go away, the reality is a lot more complicated for Microsoft.

"We're doing it carefully because there are a lot of different network and printer devices & drivers we need to make sure we don't break in the process," explains March Rogers, partner director of design at Microsoft. I could be wrong, but I think this is the first full explanation we 


Read the full story at The Verge.
