Co-Authored by Avneesh Kaushik
Why Trust Matters for AI Agents
Unlike static ML models, AI agents call tools and APIs, retrieve enterprise data, generate dynamic outputs, and can act autonomously based on their planning. This introduces expanded risk surfaces: prompt injection, data exfiltration, over-privileged tool access, hallucinations, and undetected model drift. A trustworthy agent must be designed with defense-in-depth controls spanning planning, development, deployment, and operations.
Key Principles for Trustworthy AI Agents
Trust Is Designed, Not Bolted On- Trust cannot be added after deployment. By the time an agent reaches production, its data flows, permissions, reasoning boundaries, and safety posture must already be structurally embedded. Trust is architecture, not configuration. Architecturally this means trust must exist across all layers:
|
Layer |
Design-Time Consideration |
|
Model |
Safety-aligned model selection |
|
Prompting |
System prompt isolation & injection defenses |
|
Retrieval |
Data classification & access filtering |
|
Tools |
Explicit allowlists |
|
Infrastructure |
Network isolation |
|
Identity |
Strong authentication & RBAC |
|
Logging |
Full traceability |
Implementing trustworthy AI agents in Microsoft Foundry requires embedding security and control mechanisms directly into the architecture.
Observability Is Mandatory - Observability converts AI from a black box into a managed system. AI agents are non-deterministic systems. You cannot secure or govern what you cannot see. Unlike traditional APIs, agents reason step-by-step, call multiple tools, adapt outputs dynamically and generate unstructured content which makes deep observability non-optional. When implementing observability in Microsoft Foundry, organizations must monitor the full behavioral footprint of the AI agent to ensure transparency, security, and reliability. This begins with
Least Privilege Everywhere- AI agents amplify the consequences of over-permissioned systems. Least privilege must be enforced across every layer of an AI agent’s architecture to reduce blast radius and prevent misuse.
Continuous Validation Beats One-Time Approval- Unlike traditional software that may pass QA testing and remain relatively stable, AI systems continuously evolve—models are updated, prompts are refined, and data distributions shift over time. Because of this dynamic nature, AI agents require ongoing validation rather than a single approval checkpoint.
Humans Remain Accountable - AI agents can make recommendations, automate tasks, or execute actions, but they cannot bear accountability themselves. Organizations must retain legal responsibility, ethical oversight, and governance authority over every decision and action performed by the agent. To enforce accountability, mechanisms such as immutable audit logs, detailed decision trace storage, user interaction histories, and versioned policy documentation should be implemented. Every action taken by an agent must be fully traceable to a specific model version, prompt version, policy configuration, and ultimately a human owner.
Together, these five principles—trust by design, observability, least privilege, continuous validation, and human accountability—form a reinforcing framework. When applied within Microsoft Foundry, they elevate AI agents from experimental tools to enterprise-grade, governed digital actors capable of operating reliably and responsibly in production environments.
|
Principle |
Without It |
With It |
|
Designed Trust |
Retroactive patching |
Embedded resilience |
|
Observability |
Blind production risk |
Proactive detection |
|
Least Privilege |
High blast radius |
Controlled exposure |
|
Continuous Validation |
Silent drift |
Active governance |
|
Human Accountability |
Unclear liability |
Clear ownership |
The AI Agent Lifecycle - We can structure trust controls across five stages:
Design & Planning:
Establishing Guardrails Early. Trustworthy AI agents are not created by adding controls at the end of development, they are architected deliberately from the very beginning. In platforms such as Microsoft Foundry, trust must be embedded during the design and planning phase, before a single line of code is written. This stage defines the security boundaries, governance structure, and responsible AI commitments that will shape the agent’s entire lifecycle.
From a security perspective, planning begins with structured threat modeling of the agent’s capabilities. Teams should evaluate what the agent is allowed to access and what actions it can execute. This includes defining least-privilege access to tools and APIs, ensuring the agent can only perform explicitly authorized operations. Data classification is equally critical. identifying whether information is public, confidential, or regulated determines how it can be retrieved, stored, and processed. Identity architecture should be designed using strong authentication and role-based access controls through Microsoft Entra ID, ensuring that both human users and system components are properly authenticated and scoped. Additionally, private networking strategies such as VNet integration and private endpoints should be defined early to prevent unintended public exposure of models, vector stores, or backend services.
Governance checkpoints must also be formalized at this stage. Organizations should clearly define the intended use cases of the agent, as well as prohibited scenarios to prevent misuse. A Responsible AI impact assessment should be conducted to evaluate potential societal, ethical, and operational risks before development proceeds. Responsible AI considerations further strengthen these guardrails. Finally, clear human-in-the-loop thresholds should be defined, specifying when automated outputs require review. By treating design and planning as a structured control phase rather than a preliminary formality, organizations create a strong foundation for trustworthy AI.
Development: Secure-by-Default Agent Engineering
During development in Microsoft Foundry, agents are designed to orchestrate foundation models, retrieval pipelines, external tools, and enterprise business APIs making security a core architectural requirement rather than an afterthought. Secure-by-default engineering includes model and prompt hardening through system prompt isolation, structured tool invocation and strict output validation schemas. Retrieval pipelines must enforce source allow-listing, metadata filtering, document sensitivity tagging, and tenant-level vector index isolation to prevent unauthorized data exposure.
Observability must also be embedded from day one. Agents should log prompts and responses, trace tool invocations, track token usage, capture safety classifier scores, and measure latency and reasoning-step performance. Telemetry can be exported to platforms such as Azure Monitor, Azure Application Insights, and enterprise SIEM systems to enable real-time monitoring, anomaly detection, and continuous trust validation.
Pre-Deployment: Red Teaming & Validation
Before moving to production, AI agents must undergo reliability, and governance validation. Security testing should include prompt injection simulations, data leakage assessments, tool misuse scenarios, and cross-tenant isolation verification to ensure containment boundaries are intact. Responsible AI validation should evaluate bias, measure toxicity and content safety scores, benchmark hallucination rates, and test robustness against edge cases and unexpected inputs. Governance controls at this stage formalize approval workflows, risk sign-off, audit trail documentation, and model version registration to ensure traceability and accountability. The outcome of this phase is a documented trustworthiness assessment that confirms the agent is ready for controlled production deployment.
Deployment: Zero-Trust Runtime Architecture
Deploying AI agents securely in Azure requires a layered, Zero Trust architecture that protects infrastructure, identities, and data at runtime. Infrastructure security should include private endpoints, Network Security Groups, Web Application Firewalls (WAF), API Management gateways, secure secret storage in Azure Key Vault, and the use of managed identities. Following Zero Trust principles verify explicitly, enforce least privilege, and assume breach ensures that every request, tool call, and data access is continuously validated. Runtime observability is equally critical. Organizations must monitor agent reasoning traces, tool execution outcomes, anomalous usage patterns, prompt irregularities, and output drift. Key telemetry signals include safety indicators (toxicity scores, jailbreak attempts), security events (suspicious tool call frequency), reliability metrics (timeouts, retry spikes), and cost anomalies (unexpected token consumption). Automated alerts should be configured to detect spikes in unsafe outputs, tool abuse attempts, or excessive reasoning loops, enabling rapid response and containment.
Operations: Continuous Governance & Drift Management
Trust in AI systems is not static, rather it should be continuously monitored, validated, and enforced throughout production. Organizations should implement automated evaluation pipelines that perform regression testing on new model versions, apply safety scoring to production logs, detect behavioral or data drift, and benchmark performance over time. Governance in production requires immutable audit logs, a versioned model registry, controlled policy updates, periodic risk reassessments, and well-defined incident response playbooks. Strong human oversight remains essential, supported by escalation workflows, manual review queues for high-risk outputs, and kill-switch mechanisms to immediately suspend agent capabilities if abnormal or unsafe behavior is detected.
To conclude - AI agents unlock powerful automation but those same capabilities can introduce risk if left unchecked. A well-architected trust framework transforms agents from experimental chatbots into enterprise-ready autonomous systems. By coupling Microsoft Foundry’s flexibility with layered security, observability, and continuous governance, organizations can confidently deliver AI agents that are:
TOC
Topic: Required tools
Topic: Compliant with your organization
Topic: Network limitations
Topic: Permission limitations
Let’s use a classic case where an HTTP trigger cannot be tested from the Azure Portal. As you can see, when clicking Test/Run in the Azure Portal, an error message appears.
At the same time, however, the home page does not show any abnormal status.
At this point, we first obtain the Function App’s SAMI and assign it the Owner role for the entire resource group. This is only for demonstration purposes. In practice, you should follow the principle of least privilege and scope permissions down to only the specific resources and operations that are actually required.
Next, go to the Kudu container, which is the always-on maintenance container dedicated to the app.
Install and enable Copilot CLI.
Then we can describe the problem we are encountering.
After the agent processes the issue and interacts with you further, it can generate a reasonable investigation report. In this example, it appears that the Function App’s Storage Account access key had been rotated previously, but the Function App had not updated the corresponding environment variable.
Once we understand the issue, we could perform the follow-up actions ourselves. However, to demonstrate the agent’s capabilities, you can also allow it to fix the problem directly, provided that you have granted the corresponding permissions through SAMI.
During the process, the container restart will disconnect the session, so you will need to return to the Kudu container and resume the previous session so it can continue.
Finally, it will inform you that the issue has been fixed, and then you can validate the result.
This is the validation result, and it looks like the repair was successful.
After each repair, we can even extract the experience from that case into a skill and store it in a Storage Account for future reuse. In this way, we can not only reduce the agent’s initial investigation time for similar issues, but also save tokens. This makes both time and cost management more efficient.
1177. This week, we look at behind-the-scenes of being a curator at Harvard's Houghton Library with John Overholt. We look at why 18th-century paper is surprisingly tough, how John managed the high-stakes transport of a George Washington book, and why curators actually prefer bare hands over white gloves. This bonus discussion originally ran for Grammarpaloozians back in January.
Find John Overholt on Mastodon.
🔗 Join the Grammar Girl Patreon.
🔗 Share your familect recording in Speakpipe or by leaving a voicemail at 833-214-GIRL (833-214-4475)
🔗 Watch my LinkedIn Learning writing courses.
🔗 Subscribe to the newsletter.
🔗 Take our advertising survey.
🔗 Transcript available on QuickandDirtyTips.com.
🔗 Get Grammar Girl books.
| HOST: Mignon Fogarty
| Grammar Girl is part of the Quick and Dirty Tips podcast network.
| Theme music by Catherine Rannus.
| Grammar Girl Social Media: YouTube. TikTok. Facebook. Threads. Instagram. LinkedIn. Mastodon. Bluesky.
Hosted on Acast. See acast.com/privacy for more information.
Share Episode
We explore the past and AI-driven future of Infrastructure as Code with Cloud Posse's Eric Osterman, discussing various IaC traumas. Erik maintains the world's largest repository of open-source IaC modules. Looking back at the dark ages of infrastructure, from the early days of raw CloudFormation and Capistrano to the rise and fall of tools like Puppet and Chef, we discuss the organic, messy growth of cloud environments. Where organizations frequently scale a single AWS account into a tangled web rather than adopting a robust multi-account architecture guided by a proper framework.
The conversation then shifts to the modern era of rapid integration of infrastructure development. While generating IaC with large language models can be incredibly fast, it introduces severe risks if left unchecked, and we explore how organizations can protect themselves by relying on Architectural Decision Records (ADRs) and predefined "skills". The hopeful goal of ensuring autonomous deployments are compliant, reproducible, and secure instead of relying on hallucinated architecture.
Finally, we tackle the compounding issue of code review in an age where developers can produce a year's worth of engineering slop progress in a single week.
Autonomous driving is not just a big tech or closed-source game, it's becoming accessible through open innovation and real-world deployment. Dan and Chris sit down with Harald Schäfer, CTO at Comma AI, to explore how OpenPilot is bringing self-driving to everyday vehicles using open source AI. We dive into the intersection of machine learning, robotics, and simulation, including how world models are enabling training at scale and shaping the future of autonomy.
Featuring:
Links: