Deploying Copilot for Microsoft 365 without first establishing a solid data protection baseline is one of the fastest ways to create unintentional exposure. Copilot is powerful, fast, and capable of pulling together insights from across your Microsoft 365 estate. But it does this strictly through the permissions and governance you already have in place. If you have inconsistent labels, weak data classification, limited auditing, or a messy information architecture, Copilot will immediately reveal those weaknesses. Baseline data protection is the controlling force that keeps Copilot aligned with your compliance requirements and your organizational risk posture. This step ensures that before AI begins summarizing, correlating, or generating content, you’ve tightened the rules for how your data should be accessed, protected, and monitored.
Sensitivity labels are the core mechanism that informs Copilot what content can be viewed, summarized, extracted, or restricted. If your taxonomy is bloated, unclear, or inconsistently applied, Copilot will reflect that confusion. A strong baseline begins with a clean, intentional label hierarchy used universally across your tenant. Rather than piling on dozens of labels, the goal is to create a small, predictable set that can be easily understood by humans and honored faithfully by the AI.
A typical Copilot-ready taxonomy might include labels such as “Public,” “Internal,” “Confidential,” “Highly Confidential,” and “Restricted.” The practical difference is not the name; it’s the permissions behind the label, especially the distinction between VIEW rights and EXTRACT rights. Copilot relies on EXTRACT rights to generate summaries or reinterpretations of content. When EXTRACT rights are removed, the user may still open the file, but Copilot cannot interact with it. This becomes crucial for content you never want to pass through the AI pipeline, such as legal hold material, executive board documents, certain financial reports, or private HR records.
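To make the VIEW/EXTRACT distinction concrete, here is a minimal Python sketch of the decision logic described above. The `SensitivityLabel` class and the right names are illustrative stand-ins, not Purview or Graph API objects; real enforcement happens inside Microsoft Information Protection, and this snippet only models the rule that removing EXTRACT leaves a file viewable but AI-inert.

```python
from dataclasses import dataclass, field

# Illustrative usage rights; real enforcement happens inside Microsoft
# Information Protection (MIP), not in application code like this.
VIEW = "VIEW"
EXTRACT = "EXTRACT"

@dataclass
class SensitivityLabel:
    name: str
    usage_rights: set = field(default_factory=set)

def user_can_open(label: SensitivityLabel) -> bool:
    # A user needs at least VIEW to open the file at all.
    return VIEW in label.usage_rights

def copilot_can_summarize(label: SensitivityLabel) -> bool:
    # Per the model described above: Copilot can only summarize or
    # reinterpret content when EXTRACT is granted; removing EXTRACT keeps
    # the file viewable but AI-inert.
    return VIEW in label.usage_rights and EXTRACT in label.usage_rights

confidential = SensitivityLabel("Confidential", {VIEW, EXTRACT})
board_docs = SensitivityLabel("Executive Board Confidential", {VIEW})

print(user_can_open(board_docs))            # True  - still readable by the user
print(copilot_can_summarize(confidential))  # True  - summaries allowed
print(copilot_can_summarize(board_docs))    # False - viewable, but not AI-usable
```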
Label policies must be applied intentionally. Some departments may require more flexible data interaction capabilities; Finance, for example, might allow Copilot to summarize financial models internally, while others, such as Legal or HR, may require non-negotiable restrictions. What matters most is that your taxonomy is predictable, consistently enforced, and structured around actual business needs rather than hypothetical scenarios. A stable sensitivity label hierarchy is one of the most important prerequisites for Copilot adoption.
To support this structure, the following table provides a valid, Microsoft-supported baseline set of sensitivity labels, each using only real controls available in Microsoft Purview today. This table outlines encryption behavior, permissions, and the resulting effect on Copilot’s ability to read or summarize content.
| Label Name | Purpose / Description | Encryption | Allowed Permissions | Effect on Copilot |
|---|---|---|---|---|
| Public | Content intended for public or widely visible sharing. | Off | N/A (no encryption) | Fully accessible; Copilot can read and summarize. |
| Internal | Default internal business content. | Off (or On without restrictions) | View, Edit, Copy, Export, Print | Copilot can read and summarize normally. |
| Confidential | Sensitive organizational content requiring protection. | On | View, Edit, Copy, Export, Print | Copilot can view and summarize content securely. |
| Highly Confidential | Critical information requiring strict access limitations. | On | View only (No Copy, No Export; Print optional) | Copilot cannot summarize or extract content because Copy/Export are disabled. |
| Restricted | High-risk or regulated data with the most stringent controls. | On (assigned to specific groups) | View only (No Edit, No Copy, No Export, No Print) | Copilot cannot read, reference, or summarize. Full AI restriction. |
| Finance Confidential | Financial statements, forecasting, budgeting. | On (scoped to Finance group) | View, Edit, Copy, Export | Copilot fully available to Finance users only. |
| Legal Privileged | Attorney-client privileged documents. | On | View only (No Copy, No Export) | Copilot is blocked; summarization prevented by permissions. |
| HR Sensitive | Employee data, performance, compensation. | On (HR group only) | View, Edit (No Copy, No Export) | Copilot can help with drafting but cannot summarize or extract. |
| Project Sensitive | R&D, M&A, confidential product work. | On (dynamic group) | View, Edit, Copy, Export (project members only) | Copilot available to authorized project members. |
| Executive Board Confidential | Board packets, strategy discussions, critical reviews. | On (Exec group only) | View only (No Copy, No Export, No Print) | Copilot fully blocked, protecting executive material. |
Disclaimer: The sensitivity label behaviors and Copilot access outcomes described in the table are based on Microsoft’s documented enforcement model for Microsoft Information Protection (MIP), including encrypted content handling, usage rights (such as View, Copy, Export, and Print), and the principle that Copilot operates strictly within the user’s existing permissions. While Microsoft does not explicitly state Copilot behavior for every individual permission scenario, such as “Copy/Export restrictions directly prevent summarization,” the examples presented here are based on the way the MIP SDK, encryption policies, usage rights, and Graph API content access controls function together in the Microsoft 365 ecosystem. Organizations should validate these configurations in their own environments, noting that Copilot’s behavior aligns with the underlying permissions and protection rules applied through Microsoft Purview, rather than through Copilot-specific policy settings.
Manual classification is never enough. Users forget, don’t understand labels, or misclassify content. Auto-labeling ensures your governance applies universally, even across years of legacy data. The goal is not to replace human decisions but to protect the organization from gaps created by human error.
Auto-labeling should be configured to detect easily identifiable sensitive information such as financial data, customer identifiers, personally identifiable information, regulated industry terms, and other key patterns. When these patterns appear, Purview can automatically elevate or assign the correct label. This is especially important for older SharePoint libraries and OneDrive folders where unlabeled or incorrectly labeled files would otherwise slip through Copilot’s visibility filters.
Simulation mode is a practical starting point. It reveals labeling patterns without changing any content, allowing you to tune detection before enforcement. Once refined, enabling auto-labeling across SharePoint and OneDrive ensures Copilot interacts with a dataset that reflects your intended protection strategy.
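As a rough analogue of what simulation mode surfaces, the sketch below scans a folder of exported text files for two simplified sensitive-information patterns and only reports what it would label. The regular expressions, folder path, and proposed label are illustrative assumptions; Purview's built-in sensitive information types are far more sophisticated and are configured in the compliance portal, not in code.

```python
import re
from pathlib import Path

# Simplified illustrative detectors; Purview ships far richer sensitive
# information types (checksums, keyword proximity, confidence levels, etc.).
PATTERNS = {
    "Credit card number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "U.S. SSN":           re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}
PROPOSED_LABEL = "Confidential"

def simulate_auto_labeling(root: str) -> None:
    """Report, without changing anything, which files would be labeled."""
    for path in Path(root).rglob("*.txt"):
        text = path.read_text(errors="ignore")
        hits = [name for name, rx in PATTERNS.items() if rx.search(text)]
        if hits:
            print(f"{path}: would apply '{PROPOSED_LABEL}' (matched: {', '.join(hits)})")

if __name__ == "__main__":
    simulate_auto_labeling("./legacy-export")  # hypothetical export folder
```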
Encryption, when tied to sensitivity labels, becomes the most precise tool for controlling whether Copilot can interpret content. The interplay is straightforward: if a user has EXTRACT rights under an encrypted label, Copilot can summarize that content. If EXTRACT is denied, Copilot cannot operate—even if the user can view the file.
A well-designed encryption strategy lets you allow Copilot to work with most organizational content while protecting your highest-risk material. Typically, “Confidential” content remains accessible, while “Highly Confidential” or “Restricted” content becomes off-limits by removing EXTRACT permissions.
Below is a recommended table of encryption configurations aligned with Copilot behavior:
| Label Name | Encryption Setting | Usage Rights (User / Copilot) | Copilot Behavior | Intended Outcome |
|---|---|---|---|---|
| Internal | No encryption | VIEW + EXTRACT allowed | Copilot fully accessible | Normal business workflows |
| Confidential | Encryption enabled | VIEW + EXTRACT allowed | Copilot can summarize securely | Balanced productivity and security |
| Highly Confidential | Encryption enabled | VIEW allowed, EXTRACT denied | Copilot cannot summarize | Protects sensitive operations |
| Restricted | Encryption w/ strict access | VIEW restricted; EXTRACT removed entirely | Copilot fully blocked | Ensures regulatory or legal data stays out of AI workflows |
| Department-Confidential | Encrypted, scoped to department | LIMITED EXTRACT for department members only | Copilot works only for authorized users | Supports controlled AI within departments |
| Project-Sensitive | Encryption with dynamic groups | EXTRACT only for project participants | Copilot aids project teams securely | Enables AI for time-limited initiatives |
Disclaimer: The encryption configurations and Copilot interaction outcomes described here are based on Microsoft’s documented behavior for Microsoft Information Protection (MIP) sensitivity labels, encryption enforcement, and usage rights such as View, Copy, Export, and Print. Microsoft does not explicitly document Copilot-specific responses to each usage right; however, Copilot relies on the exact underlying access mechanisms and MIP enforcement controls as other Microsoft 365 applications. When encryption prevents applications from extracting or exporting protected content, Copilot is likewise unable to read or summarize it. The described outcomes—including scenarios where removing Copy/Export rights prevents AI summarization—are therefore inferred from the established MIP encryption model rather than stated as Copilot-specific rules. Organizations should validate these configurations in their own tenant to confirm that AI interactions align with their intended sensitivity label design and encryption enforcement strategy.
Copilot introduces new forms of data access, such as summaries, correlations, generated insights, and cross-workload retrieval, all of which require a robust auditing foundation. Microsoft Purview Audit, whether on the standard or premium tier, provides visibility into how Copilot interacts with your content. Without it, Copilot’s activities become opaque, leaving security teams blind to how AI-assisted workflows influence data access and movement across the tenant.
Audit logs capture and surface the key events behind each Copilot interaction, such as which content was referenced, which user initiated the request, and when the activity occurred.
This visibility becomes essential for investigations. If a user claims that Copilot surfaced content they did not expect, or if sensitive information appears in a generated output, the audit trail provides the historical record needed to understand what happened and why. This level of transparency is fundamental in regulated sectors, where AI-assisted content handling may be reviewed by internal compliance teams, external auditors, or legal entities.
Retention policies must align with your operational and regulatory requirements, and many organizations function effectively with the retention periods included in their audit tier, extending them only where regulation demands a longer record.
The goal is simple: maintain an audit trail that ensures every Copilot interaction involving sensitive data remains discoverable long after the event. A well-configured audit environment doesn’t just support Copilot; it reinforces trust, accountability, and responsible AI adoption across the entire Microsoft 365 ecosystem.
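If you export audit records for analysis, a small script can highlight Copilot-related activity. The sketch below assumes a CSV export where each row carries an `AuditData` JSON payload with `Operation` and `UserId` fields, and that Copilot events are identifiable by an operation name such as `CopilotInteraction`; treat those field and operation names as assumptions to validate against a sample from your own tenant.

```python
import csv
import json
from collections import Counter

# Assumes a CSV export from Purview Audit where each row has an 'AuditData'
# column containing the JSON record. Field and operation names below are
# assumptions; validate them against a real export from your tenant.
COPILOT_OPERATIONS = {"CopilotInteraction"}

def summarize_copilot_activity(csv_path: str) -> None:
    by_user = Counter()
    with open(csv_path, newline="", encoding="utf-8-sig") as fh:
        for row in csv.DictReader(fh):
            record = json.loads(row["AuditData"])
            if record.get("Operation") in COPILOT_OPERATIONS:
                by_user[record.get("UserId", "unknown")] += 1
    for user, count in by_user.most_common(10):
        print(f"{user}: {count} Copilot interactions")

if __name__ == "__main__":
    summarize_copilot_activity("audit-export.csv")  # hypothetical export file
```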
Copilot introduces a new category of content into your Microsoft 365 environment: AI-generated text, AI-assisted edits, rewritten summaries, suggested responses, and contextual references to existing documents and conversations. While Copilot does not store prompts or responses as separate artifacts in the tenant, the actions it performs and the content it interacts with can become subject to legal discovery, internal investigations, compliance reviews, or regulatory audits. This makes it essential that your eDiscovery environment is configured to identify, preserve, and export the materials that Copilot interacts with.
Your eDiscovery (Standard or Premium) configuration should be capable of locating the documents, conversations, and mailboxes that Copilot draws on when generating or transforming content.
Although Copilot does not create new “AI objects” inside the tenant, it influences how users interact with existing content. This means discovery must be able to reconstruct which content was accessed, when it was accessed, and who accessed it. Purview Audit Premium is especially important because it captures the detailed events required to rebuild the sequence of AI-driven activity. If an investigation requires proof that Copilot was used to produce or retrieve specific content, the audit logs function as the authoritative source.
Beyond discovery, organizations must also strengthen Communication Compliance policies to monitor the behavioral risks introduced by generative AI. Copilot enables employees to retrieve and transform large volumes of data quickly, which may expose patterns of misuse that did not exist before. Communication Compliance helps detect these emerging patterns, such as attempts to aggregate, reword, or share sensitive information through AI-assisted workflows.
These policies are not designed to stop Copilot. They ensure that the way users engage with Copilot aligns with regulatory requirements, ethical expectations, and internal governance standards. As Copilot accelerates communication and content creation, it also accelerates the need for oversight, monitoring, and accountability.
Data Loss Prevention (DLP) becomes significantly more important once AI enters the environment. Without DLP, Copilot may legitimately summarize content that is labeled correctly but still too sensitive to be shared or discussed broadly.
This is where targeted, Copilot-specific DLP becomes essential. Policies should protect your most sensitive classifications, particularly “Highly Confidential” and “Restricted”, by restricting or auditing AI interactions with those labels. DLP can surface warnings to users, block certain AI actions, or require justification before a sensitive summarization occurs. When combined with sensitivity labels and contextual conditions such as location, device, and user risk, DLP becomes a layered security model that ensures sensitive material remains under strict control even when used inside productivity workflows.
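Conceptually, the layered model is just a rule evaluation over label plus context. The sketch below illustrates that idea only; real DLP policies are authored and enforced in Microsoft Purview, and the labels, conditions, and outcomes here are assumptions chosen to mirror the examples in this section.

```python
from dataclasses import dataclass

@dataclass
class AccessContext:
    label: str      # sensitivity label on the content
    location: str   # e.g. "managed-device" or "unmanaged-device"
    user_risk: str  # e.g. "low", "medium", "high"

def evaluate_copilot_action(ctx: AccessContext) -> str:
    """Return the illustrative enforcement outcome for an AI interaction."""
    if ctx.label in {"Restricted", "Highly Confidential"}:
        return "block"                      # never let Copilot touch this content
    if ctx.label == "Confidential" and ctx.user_risk == "high":
        return "block"                      # contextual condition tightens the rule
    if ctx.label == "Confidential" and ctx.location == "unmanaged-device":
        return "warn-and-audit"             # allow, but show a policy tip and log it
    if ctx.label == "Confidential":
        return "audit"                      # visibility without blocking
    return "allow"

print(evaluate_copilot_action(AccessContext("Restricted", "managed-device", "low")))      # block
print(evaluate_copilot_action(AccessContext("Confidential", "unmanaged-device", "low")))  # warn-and-audit
```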
Below is a suggested set of DLP rules tailored specifically for Copilot. These are by no means all you need; they are just examples:
| DLP Rule Name | Purpose / Scenario | Trigger Conditions | Action / Enforcement | Outcome for Copilot |
|---|---|---|---|---|
| DLP-Copilot-Block-Restricted-Data-Usage | Prevent AI from interacting with Restricted/HC data | Label = Restricted/HC | Block + High-Severity Alert | Copilot cannot access or summarize data |
| DLP-Copilot-Warn-High-Risk-Confidential-Access | Warn users interacting with regulated data | Confidential + Sensitive info types | Warning + Alert | Allows use but monitored |
| DLP-Copilot-Audit-Sensitive-Summaries | Track sensitive summaries | Confidential only | Audit | Visibility without blocking |
| DLP-Copilot-Block-Sharing-Outside-Department | Prevent departmental IP leakage | SharePoint/Teams dept sites | Block cross-department share | AI cannot leak content across groups |
| DLP-Copilot-Block-External-Sharing | Prevent AI-generated content leaving tenant | Any external share attempt | Block + Notify | Eliminates external exposure |
| DLP-Copilot-Monitor-Bulk-Data-Access | Detect AI-triggered mass data aggregation | Bulk summarization pattern | Alert + Monitor | Identifies compromised accounts |
| DLP-Copilot-Block-Unlabeled-Sensitive-Patterns | Protect unlabeled sensitive legacy data | Sensitive info types w/ no label | Block Copilot access | Forces proper labeling |
Disclaimer: The Data Loss Prevention (DLP) rules and enforcement outcomes described reflect Microsoft Purview’s documented capabilities, including auditing, policy tips, blocking actions, and automatic labeling. Microsoft does not provide Copilot-specific DLP actions; instead, DLP governs the underlying content access within SharePoint, OneDrive, Exchange, and Teams. The Copilot behaviors referenced, such as being unable to summarize or access restricted content, are inferred from the way DLP policies prevent users and applications from accessing or transmitting protected data. Because Copilot operates strictly within the user’s permissions and the platform’s data access controls, blocking an activity via DLP prevents Copilot from performing AI-driven actions on that content. Organizations should validate these rules within their own Purview environment to ensure they align with internal governance standards and real-world Copilot usage patterns.
All the technical controls in the world won’t matter if your underlying data environment is chaotic. Copilot has no special override; it merely reflects your existing permissions. If users can see something today, Copilot can help them find it faster tomorrow. Many organizations carry years of oversharing, outdated content, abandoned sites, and misconfigured permissions accumulated from legacy collaboration patterns. Before introducing AI into this environment, you must establish a clean governance baseline.
A stable baseline includes ensuring that every site has an accountable owner, stale and abandoned content is archived or removed, sharing defaults are tightened, and permissions reflect current business needs rather than legacy collaboration patterns.
This step is where governance meets practicality. It is not about achieving perfection across every site, library, or document. It’s about ensuring the data Copilot touches is structured, protected, intentional, and overseen by accountable owners. Establishing this baseline dramatically reduces the risk of accidental exposure when the AI begins connecting information across the tenant. It also gives you confidence that Copilot is amplifying the correct data, not the forgotten, misconfigured, or overshared data hiding in the shadows of your environment.
A strong data protection baseline ensures that Copilot operates within the boundaries you intentionally define, not the accidental ones your environment has inherited over time. Copilot is not an elevated identity or a privileged system; it is an accelerator that amplifies whatever access your users and your underlying governance model already provide. This makes your data protection posture the single most important factor in determining whether Copilot becomes a controlled, enterprise-grade asset or an uncontrolled accelerant of preexisting risks.
Sensitivity labels, auto-labeling rules, encryption enforcement, audit logging, DLP policies, and strong governance collectively shape the perimeter that Copilot must operate within. Each component plays a distinct role in controlling how data flows, how it’s classified, how it’s protected, and ultimately how AI is permitted to interpret it.
When these controls function together, they create a layered protection framework that prevents Copilot from interacting with inappropriate content, strengthens your Zero Trust posture, and reduces the likelihood of accidental data exposure. More importantly, they allow Copilot to operate confidently within well-defined risk parameters, enabling your organization to harness AI’s value without compromising compliance, privacy, or security. This alignment between AI capability and data protection discipline is the foundation of safe, scalable Copilot adoption.
In the next post, we will take a deeper step into that foundation by addressing one of the most overlooked areas of Copilot readiness: Fixing Oversharing in SharePoint and OneDrive Before Copilot Deployment. This step is essential because Copilot will surface, correlate, and summarize data based on user access, not based on your intentions. If broad or unintended access exists today, Copilot will faithfully amplify it at machine speed. By cleaning up oversharing, restructuring permissions, and enforcing least-privilege principles, you eliminate latent data exposure risks before Copilot begins interpreting your content at scale.
The work ahead is not simply technical; it is transformational. But done correctly, it enables your organization to deploy Copilot with confidence, clarity, and control.
Web components, as imagined in 1998 from a never-adopted specification:
Componentization is a powerful paradigm that allows component users to build applications using ‘building blocks’ of functionality without having to implement those building blocks themselves, or necessarily understand how the building works in fine detail. This method makes building complex applications easier by breaking them down into more manageable chunks and allowing the building blocks to be reused in other applications.
I still think of web components as a recent feature. The first time we even took a stab at explaining them here at CSS-Tricks was in a five-part series by Caleb Williams back in 2019. John Rhea followed that up with another six-part series in 2021. Not that long ago.
But nay! Jay Hoffman dug up the 1998 proposal cited above and shared it with me from a recent Igalia Chat (which is a great podcast, by the way) he had with Brian Kardell, Eric Meyer, and Jeremy Keith.
So, we’re really talking about a feature that’s been in the works for nearly 30 years. Style encapsulation is firmly a part of the time capsule that is web history.
It’s not that we need to know any of this information today, but the context is what’s key. It’s easy to overlook the early work put into something, especially when it comes to the web which is littered with arcane artifacts in disparate places.
In my last piece, we established a foundational truth: for users to adopt and rely on AI, they must trust it. We talked about trust being a multifaceted construct, built on perceptions of an AI’s Ability, Benevolence, Integrity, and Predictability. But what happens when an AI, in its silent, algorithmic wisdom, makes a decision that leaves a user confused, frustrated, or even hurt? A mortgage application is denied, a favorite song is suddenly absent from a playlist, and a qualified resume is rejected before a human ever sees it. In these moments, ability and predictability are shattered, and benevolence feels a world away.
Our conversation now must evolve from the why of trust to the how of transparency. The field of Explainable AI (XAI), which focuses on developing methods to make AI outputs understandable to humans, has emerged to address this, but it’s often framed as a purely technical challenge for data scientists. I argue it’s a critical design challenge for products relying on AI. It’s our job as UX professionals to bridge the gap between algorithmic decision-making and human understanding.
This article provides practical, actionable guidance on how to research and design for explainability. We’ll move beyond the buzzwords and into the mockups, translating complex XAI concepts into concrete design patterns you can start using today.
De-mystifying XAI: Core Concepts For UX Practitioners
XAI is about answering the user’s question: “Why?” Why was I shown this ad? Why is this movie recommended to me? Why was my request denied? Think of it as the AI showing its work on a math problem. Without it, you just have an answer, and you’re forced to take it on faith. In showing the steps, you build comprehension and trust. You also allow for your work to be double-checked and verified by the very humans it impacts.
There are a number of techniques we can use to clarify or explain what is happening with AI. While methods range from providing the entire logic of a decision tree to generating natural language summaries of an output, two of the most practical and impactful types of information UX practitioners can introduce into an experience are feature importance (Figure 1) and counterfactuals. These are often the most straightforward for users to understand and the most actionable for designers to implement.

This explainability method answers, “What were the most important factors the AI considered?” It’s about identifying the top 2-3 variables that had the biggest impact on the outcome. It’s the headline, not the whole story.
Example: Imagine an AI that predicts whether a customer will churn (cancel their service). Feature importance might reveal that “number of support calls in the last month” and “recent price increases” were the two most important factors in determining if a customer was likely to churn.
This powerful method answers, “What would I need to change to get a different outcome?” This is crucial because it gives users a sense of agency. It transforms a frustrating “no” into an actionable “not yet.”
Example: Imagine a loan application system that uses AI. A user is denied a loan. Instead of just seeing “Application Denied,” a counterfactual explanation would also share, “If your credit score were 50 points higher, or if your debt-to-income ratio were 10% lower, your loan would have been approved.” This gives the applicant clear, actionable steps they can take to potentially get a loan in the future.
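A counterfactual can be found, in the simplest case, by nudging one feature of a model input until the prediction flips. The sketch below does exactly that against a toy scikit-learn loan model trained on synthetic data; the features, thresholds, and search strategy are illustrative, and production-grade counterfactual methods add constraints for plausibility and fairness.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy loan model on synthetic data: features are [credit_score, debt_to_income].
# Everything here is illustrative; dedicated counterfactual libraries add
# constraints for plausibility, immutable features, and fairness.
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(650, 60, 500), rng.uniform(0.1, 0.6, 500)])
y = ((X[:, 0] > 660) & (X[:, 1] < 0.4)).astype(int)        # 1 = approved
model = LogisticRegression(max_iter=1000).fit(X, y)

def find_counterfactual(applicant, feature, step, limit):
    """Nudge one feature until the model's decision flips, then report it."""
    candidate = applicant.copy()
    for _ in range(limit):
        candidate[feature] += step
        if model.predict([candidate])[0] == 1:
            return candidate
    return None

applicant = np.array([600.0, 0.35])                        # currently denied
cf = find_counterfactual(applicant, feature=0, step=10, limit=20)
if cf is not None:
    print(f"If your credit score were about {cf[0]:.0f} instead of "
          f"{applicant[0]:.0f}, the model would approve the application.")
```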
Although technical specifics are often handled by data scientists, it’s helpful for UX practitioners to know that tools like LIME (Local Interpretable Model-agnostic Explanations), which explains individual predictions by approximating the model locally, and SHAP (SHapley Additive exPlanations), which uses a game-theory approach to explain the output of any machine learning model, are commonly used to extract these “why” insights from complex models. These libraries essentially help break down an AI’s decision to show which inputs were most influential for a given outcome.
When done properly, the data underlying an AI tool’s decision can be used to tell a powerful story. Let’s walk through feature importance and counterfactuals and show how the data science behind the decision can be utilized to enhance the user’s experience.
Now let’s cover feature importance with the assistance of local explanation (e.g., LIME) data. This approach answers, “Why did the AI make this specific recommendation for me, right now?” Instead of a general explanation of how the model works, it provides a focused reason for a single, specific instance. It’s personal and contextual.
Example: Imagine an AI-powered music recommendation system like Spotify. A local explanation would answer, “Why did the system recommend this specific song by Adele to you right now?” The explanation might be: “Because you recently listened to several other emotional ballads and songs by female vocalists.”
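For practitioners who want to see what a local explanation looks like in practice, here is a hedged sketch using the `lime` library against a toy recommendation classifier. The feature names, synthetic data, and class labels are assumptions made for illustration; the point is the shape of the output, a short ranked list of “because” factors for one specific prediction.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

# Toy recommendation-style classifier on synthetic data; feature names are
# illustrative. Requires `pip install lime scikit-learn`.
rng = np.random.default_rng(1)
feature_names = ["ballads_played_last_week", "female_vocalist_plays", "podcast_minutes"]
X = rng.integers(0, 50, size=(400, 3)).astype(float)
y = (X[:, 0] + X[:, 1] > 50).astype(int)     # 1 = "recommend the ballad"
model = RandomForestClassifier(n_estimators=50, random_state=1).fit(X, y)

explainer = LimeTabularExplainer(
    X, feature_names=feature_names, class_names=["skip", "recommend"],
    mode="classification",
)
# Explain one specific prediction: why this song, for this listener, right now.
explanation = explainer.explain_instance(X[0], model.predict_proba, num_features=2)
for feature, weight in explanation.as_list():
    print(f"{feature}: {weight:+.3f}")
```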
Finally, let’s cover adding value-based explanations (e.g., Shapley Additive Explanations, or SHAP, data) to an explanation of a decision. This is a more nuanced version of feature importance that answers, “How did each factor push the decision one way or the other?” It helps visualize what mattered, and whether its influence was positive or negative.
Example: Imagine a bank uses an AI model to decide whether to approve a loan application.
Feature Importance: The model output might show that the applicant’s credit score, income, and debt-to-income ratio were the most important factors in its decision. This answers what mattered.
Feature Importance with Value-Based Explanations (SHAP): SHAP values take feature importance further by showing not just which factors mattered, but whether each one pushed the decision toward approval or denial, and by how much.
This helps the loan officer explain to the applicant not only what was considered, but how each factor contributed to the final “yes” or “no” decision.
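To see how those signed contributions are produced, here is a minimal sketch using the `shap` library against a toy approval-score model. The feature names and synthetic data are assumptions, and a regressor is used to keep the output shape simple; the positive and negative values are what a “push-and-pull” explanation would visualize.

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

# Toy "loan approval score" regressor on synthetic data; feature names are
# illustrative. Requires `pip install shap scikit-learn`. A regressor keeps
# the SHAP output shape simple: one signed contribution per feature.
rng = np.random.default_rng(2)
feature_names = ["credit_score", "income", "debt_to_income"]
X = np.column_stack([
    rng.normal(650, 60, 500),
    rng.normal(70_000, 15_000, 500),
    rng.uniform(0.1, 0.6, 500),
])
score = 0.004 * X[:, 0] + 0.00001 * X[:, 1] - 3.0 * X[:, 2]
model = RandomForestRegressor(n_estimators=100, random_state=2).fit(X, score)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:1])    # contributions for one applicant
for name, value in zip(feature_names, shap_values[0]):
    direction = "pushed the score up" if value > 0 else "pushed the score down"
    print(f"{name}: {direction} by {abs(value):.3f}")
```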
It’s crucial to recognize that the ability to provide good explanations often starts much earlier in the development cycle. Data scientists and engineers play a pivotal role by intentionally structuring models and data pipelines in ways that inherently support explainability, rather than trying to bolt it on as an afterthought.
Research and design teams can foster this by initiating early conversations with data scientists and engineers about user needs for understanding, contributing to the development of explainability metrics, and collaboratively prototyping explanations to ensure they are both accurate and user-friendly.
XAI And Ethical AI: Unpacking Bias And Responsibility
Beyond building trust, XAI plays a critical role in addressing the profound ethical implications of AI, particularly concerning algorithmic bias. Explainability techniques, such as analyzing SHAP values, can reveal if a model’s decisions are disproportionately influenced by sensitive attributes like race, gender, or socioeconomic status, even if these factors were not explicitly used as direct inputs.
For instance, if a loan approval model consistently assigns negative SHAP values to applicants from a certain demographic, it signals a potential bias that needs investigation, empowering teams to surface and mitigate such unfair outcomes.
The power of XAI also comes with the potential for “explainability washing.” Just as “greenwashing” misleads consumers about environmental practices, explainability washing can occur when explanations are designed to obscure, rather than illuminate, problematic algorithmic behavior or inherent biases. This could manifest as overly simplistic explanations that omit critical influencing factors, or explanations that strategically frame results to appear more neutral or fair than they truly are. It underscores the ethical responsibility of UX practitioners to design explanations that are genuinely transparent and verifiable.
UX professionals, in collaboration with data scientists and ethicists, hold a crucial responsibility in communicating the why of a decision, and also the limitations and potential biases of the underlying AI model. This involves setting realistic user expectations about AI accuracy, identifying where the model might be less reliable, and providing clear channels for recourse or feedback when users perceive unfair or incorrect outcomes. Proactively addressing these ethical dimensions will allow us to build AI systems that are truly just and trustworthy.
From Methods To Mockups: Practical XAI Design Patterns
Knowing the concepts is one thing; designing them is another. Here’s how we can translate these XAI methods into intuitive design patterns.
This is the simplest and often most effective pattern. It’s a direct, plain-language statement that surfaces the primary reason for an AI’s action.
Example: Imagine a music streaming service. Instead of just presenting a “Discover Weekly” playlist, you add a small line of microcopy.
Song Recommendation: “Velvet Morning”
Because you listen to “The Fuzz” and other psychedelic rock.
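Turning ranked factors into microcopy can be as simple as a template lookup. The sketch below is one illustrative way to generate a “Because” statement from the top factors, whatever explainability method produced them; the factor keys and templates are invented for the example.

```python
# Minimal sketch: turn the top-ranked explanation factors (however they were
# computed) into a single plain-language "Because" statement. The factor keys
# and templates are illustrative.
TEMPLATES = {
    "recent_artist": 'you listen to "{value}"',
    "genre_affinity": "you often play {value}",
    "time_of_day": "you usually listen around {value}",
}

def because_statement(top_factors):
    """top_factors: list of (factor_key, value) pairs, most important first."""
    phrases = [TEMPLATES[key].format(value=value)
               for key, value in top_factors if key in TEMPLATES]
    return "Because " + " and ".join(phrases[:2]) + "."

print(because_statement([("recent_artist", "The Fuzz"),
                         ("genre_affinity", "psychedelic rock")]))
# Because you listen to "The Fuzz" and you often play psychedelic rock.
```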
Counterfactuals are inherently about empowerment. The best way to represent them is by giving users interactive tools to explore possibilities themselves. This is perfect for financial, health, or other goal-oriented applications.
Example: A loan application interface. After a denial, instead of a dead end, the user gets a tool to determine how various scenarios (what-ifs) might play out (See Figure 1).

When an AI performs an action on a user’s content (like summarizing a document or identifying faces in photos), the explanation should be visually linked to the source.
Example: An AI tool that summarizes long articles.
AI-Generated Summary Point:
Initial research showed a market gap for sustainable products.
Source in Document:
“...Our Q2 analysis of market trends conclusively demonstrated that no major competitor was effectively serving the eco-conscious consumer, revealing a significant market gap for sustainable products...”
For more complex decisions, users might need to understand the interplay of factors. Simple data visualizations can make this clear without being overwhelming.
Example: An AI screening a candidate’s profile for a job.
Why this candidate is a 75% match:
Factors pushing the score up:
- 5+ Years UX Research Experience
- Proficient in Python
Factors pushing the score down:
- No experience with B2B SaaS
Learning and using these design patterns in the UX of your AI product will help increase its explainability. You can also draw on additional techniques that I’m not covering in depth here.
A Note For the Front End: Translating these explainability outputs into seamless user experiences also presents its own set of technical considerations. Front-end developers often grapple with API design for efficiently retrieving explanation data, and performance implications (such as generating explanations in real time for every user interaction) need careful planning to avoid latency.
Some Real-world Examples
UPS Capital’s DeliveryDefense
UPS uses AI to assign a “delivery confidence score” to addresses to predict the likelihood of a package being stolen. Their DeliveryDefense software analyzes historical data on location, loss frequency, and other factors. If an address has a low score, the system can proactively reroute the package to a secure UPS Access Point, providing an explanation for the decision (e.g., “Package rerouted to a secure location due to a history of theft”). This system demonstrates how XAI can be used for risk mitigation and building customer trust through transparency.
Autonomous Vehicles
These vehicles will need to use XAI effectively to make safe, explainable decisions. When a self-driving car brakes suddenly, the system can provide a real-time explanation for its action, for example, by identifying a pedestrian stepping into the road. This is not only crucial for passenger comfort and trust but is also increasingly a regulatory requirement to prove the safety and accountability of the AI system.
IBM Watson Health (and its challenges)
While often cited as a general example of AI in healthcare, it’s also a valuable case study for the importance of XAI. The failure of its Watson for Oncology project highlights what can go wrong when explanations are not clear, or when the underlying data is biased or not localized. The system’s recommendations were sometimes inconsistent with local clinical practices because they were based on U.S.-centric guidelines. This serves as a cautionary tale on the need for robust, context-aware explainability.
The UX Researcher’s Role: Pinpointing And Validating Explanations
Our design solutions are only effective if they address the right user questions at the right time. An explanation that answers a question the user doesn’t have is just noise. This is where UX research becomes the critical connective tissue in an XAI strategy, ensuring that we explain the what and how that actually matters to our users. The researcher’s role is twofold: first, to inform the strategy by identifying where explanations are needed, and second, to validate the designs that deliver those explanations.
Before we can design a single explanation, we must understand the user’s mental model of the AI system. What do they believe it’s doing? Where are the gaps between their understanding and the system’s reality? This is the foundational work of a UX researcher.
Through deep, semi-structured interviews, UX practitioners can gain invaluable insights into how users perceive and understand AI systems. These sessions are designed to encourage users to literally draw or describe their internal “mental model” of how they believe the AI works. This often involves asking open-ended questions that prompt users to explain the system’s logic, its inputs, and its outputs, as well as the relationships between these elements.
These interviews are powerful because they frequently reveal profound misconceptions and assumptions that users hold about AI. For example, a user interacting with a recommendation engine might confidently assert that the system is based purely on their past viewing history. They might not realize that the algorithm also incorporates a multitude of other factors, such as the time of day they are browsing, the current trending items across the platform, or even the viewing habits of similar users.
Uncovering this gap between a user’s mental model and the actual underlying AI logic is critically important. It tells us precisely what specific information we need to communicate to users to help them build a more accurate and robust mental model of the system. This, in turn, is a fundamental step in fostering trust. When users understand, even at a high level, how an AI arrives at its conclusions or recommendations, they are more likely to trust its outputs and rely on its functionality.
By meticulously mapping the user’s journey with an AI-powered feature, we gain invaluable insights into the precise moments where confusion, frustration, or even profound distrust emerge. This uncovers critical junctures where the user’s mental model of how the AI operates clashes with its actual behavior.
Consider a music streaming service: Does the user’s trust plummet when a playlist recommendation feels “random,” lacking any discernible connection to their past listening habits or stated preferences? This perceived randomness is a direct challenge to the user’s expectation of intelligent curation and a breach of the implicit promise that the AI understands their taste. Similarly, in a photo management application, do users experience significant frustration when an AI photo-tagging feature consistently misidentifies a cherished family member? This error is more than a technical glitch; it strikes at the heart of accuracy, personalization, and even emotional connection.
These pain points are vivid signals indicating precisely where a well-placed, clear, and concise explanation is necessary. Such explanations serve as crucial repair mechanisms, mending a breach of trust that, if left unaddressed, can lead to user abandonment.
The power of AI journey mapping lies in its ability to move us beyond simply explaining the final output of an AI system. While understanding what the AI produced is important, it’s often insufficient. Instead, this process compels us to focus on explaining the process at critical moments, addressing the specific questions users have at the points where their mental model and the AI’s actual behavior diverge.
AI journey mapping transforms the abstract concept of XAI into a practical, actionable framework for UX practitioners. It enables us to move beyond theoretical discussions of explainability and instead pinpoint the exact moments where user trust is at stake, providing the necessary insights to build AI experiences that are powerful, transparent, understandable, and trustworthy.
Ultimately, research is how we uncover the unknowns. Your team might be debating how to explain why a loan was denied, but research might reveal that users are far more concerned with understanding how their data was used in the first place. Without research, we are simply guessing what our users are wondering.
Collaborating On The Design (How to Explain Your AI)
Once research has identified what to explain, the collaborative loop with design begins. Designers can prototype the patterns we discussed earlier—the “Because” statement, the interactive sliders—and researchers can put those designs in front of users to see if they hold up.
Targeted Usability & Comprehension Testing: We can design research studies that specifically test the XAI components. We don’t just ask, “Is this easy to use?” We ask, “After seeing this, can you tell me in your own words why the system recommended this product?” or “Show me what you would do to see if you could get a different result.” The goal here is to measure comprehension and actionability, alongside usability.
Measuring Trust Itself: We can use simple surveys and rating scales before and after an explanation is shown. For instance, we can ask a user on a 5-point scale, “How much do you trust this recommendation?” before they see the “Because” statement, and then ask them again afterward. This provides quantitative data on whether our explanations are actually moving the needle on trust.
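As a hedged illustration of that before-and-after measurement, the snippet below compares paired trust ratings with a paired t-test; the ratings are synthetic, and a real study would also consider sample size, scale validity, and the usual caveats about ordinal survey data.

```python
from scipy.stats import ttest_rel

# Illustrative 5-point trust ratings from the same participants before and
# after seeing the "Because" explanation (synthetic numbers).
trust_before = [2, 3, 3, 2, 4, 3, 2, 3, 3, 2]
trust_after  = [4, 4, 3, 3, 5, 4, 3, 4, 4, 3]

mean_shift = sum(a - b for a, b in zip(trust_after, trust_before)) / len(trust_before)
result = ttest_rel(trust_after, trust_before)

print(f"Average change in trust rating: {mean_shift:+.2f} points")
print(f"Paired t-test: t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
```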
This process creates a powerful, iterative loop. Research findings inform the initial design. That design is then tested, and the new findings are fed back to the design team for refinement. Maybe the “Because” statement was too jargony, or the “What-If” slider was more confusing than empowering. Through this collaborative validation, we ensure that the final explanations are technically accurate, genuinely understandable, useful, and trust-building for the people using the product.
The Goldilocks Zone Of Explanation
A critical word of caution: it is possible to over-explain. As in the fairy tale, where Goldilocks sought the porridge that was ‘just right’, the goal of a good explanation is to provide the right amount of detail—not too much and not too little. Bombarding a user with every variable in a model will lead to cognitive overload and can actually decrease trust. The goal is not to make the user a data scientist.
One solution is progressive disclosure.
This layered approach respects user attention and expertise, providing just the right amount of information for their needs. Let’s imagine you’re using a smart home device that recommends optimal heating based on various factors.
Start with the simple: “Your home is currently heated to 72 degrees, which is the optimal temperature for energy savings and comfort.”
Offer a path to detail: Below that, a small link or button: “Why is 72 degrees optimal?”
Reveal the complexity: Clicking that link could open a new screen showing:

It’s effective to combine multiple XAI methods, and the Goldilocks Zone of Explanation pattern, which advocates progressive disclosure, implicitly encourages this. You might start with a simple “Because” statement (Pattern 1) for immediate comprehension, and then offer a “Learn More” link that reveals a “What-If” Interactive (Pattern 2) or a “Push-and-Pull Visual” (Pattern 4) for deeper exploration.
For instance, a loan application system could initially state the primary reason for denial (feature importance), then allow the user to interact with a “What-If” tool to see how changes to their income or debt would alter the outcome (counterfactuals), and finally, provide a detailed “Push-and-Pull” chart (value-based explanation) to illustrate the positive and negative contributions of all factors. This layered approach allows users to access the level of detail they need, when they need it, preventing cognitive overload while still providing comprehensive transparency.
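One way to support this layering in a product is to hand the front end a single explanation object with clearly separated levels of detail. The sketch below shows one possible shape for such a payload; the field names and values are assumptions, not an established API.

```python
from dataclasses import dataclass, field

# Illustrative data shape for a progressively disclosed explanation: the UI shows
# `summary` first, reveals `counterfactual` on "Learn more", and only renders the
# full `contributions` chart on request. Field names are assumptions, not an API.
@dataclass
class LayeredExplanation:
    summary: str                                        # the one-line "Because" statement
    counterfactual: str                                 # actionable "what would change the outcome"
    contributions: dict = field(default_factory=dict)   # factor -> signed contribution

loan_denial = LayeredExplanation(
    summary="Denied mainly because of your debt-to-income ratio.",
    counterfactual="If your debt-to-income ratio were 10% lower, "
                   "the application would likely be approved.",
    contributions={"credit_score": +0.12, "income": +0.05, "debt_to_income": -0.41},
)

def render(explanation: LayeredExplanation, level: int) -> None:
    print(explanation.summary)
    if level >= 2:
        print(explanation.counterfactual)
    if level >= 3:
        for factor, value in sorted(explanation.contributions.items(),
                                    key=lambda kv: kv[1]):
            print(f"  {factor}: {value:+.2f}")

render(loan_denial, level=1)   # headline only
render(loan_denial, level=3)   # full detail for users who ask for it
```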
Determining which XAI tools and methods to use is primarily a function of thorough UX research. Mental model interviews and AI journey mapping are crucial for pinpointing user needs and pain points related to AI understanding and trust. Mental model interviews help uncover user misconceptions about how the AI works, indicating areas where fundamental explanations (like feature importance or local explanations) are needed. AI journey mapping, on the other hand, identifies critical moments of confusion or distrust in the user’s interaction with the AI, signaling where more granular or interactive explanations (like counterfactuals or value-based explanations) would be most beneficial to rebuild trust and provide agency.

Ultimately, the best way to choose a technique is to let user research guide your decisions, ensuring that the explanations you design directly address actual user questions and concerns, rather than simply offering technical details for their own sake.
XAI for Deep Reasoning Agents
Some of the newest AI systems, known as deep reasoning agents, produce an explicit “chain of thought” for every complex task. They do not merely cite sources; they show the logical, step-by-step path they took to arrive at a conclusion. While this transparency provides valuable context, a play-by-play that spans several paragraphs can feel overwhelming to a user simply trying to complete a task.
The principles of XAI, especially the Goldilocks Zone of Explanation, apply directly here. We can curate the journey, using progressive disclosure to show only the final conclusion and the most salient step in the thought process first. Users can then opt in to see the full, detailed, multi-step reasoning when they need to double-check the logic or find a specific fact. This approach respects user attention while preserving the agent’s full transparency.
Next Steps: Empowering Your XAI Journey
Explainability is a fundamental pillar for building trustworthy and effective AI products. For the advanced practitioner looking to drive this change within their organization, the journey extends beyond design patterns into advocacy and continuous learning.
To deepen your understanding and practical application, consider exploring resources like the AI Explainability 360 (AIX360) toolkit from IBM Research or Google’s What-If Tool, which offer interactive ways to explore model behavior and explanations. Engaging with communities like the Responsible AI Forum or specific research groups focused on human-centered AI can provide invaluable insights and collaboration opportunities.
Finally, be an advocate for XAI within your own organization. Frame explainability as a strategic investment. Consider a brief pitch to your leadership or cross-functional teams:
“By investing in XAI, we’ll go beyond building trust; we’ll accelerate user adoption, reduce support costs by empowering users with understanding, and mitigate significant ethical and regulatory risks by exposing potential biases. This is good design and smart business.”
Your voice, grounded in practical understanding, is crucial in bringing AI out of the black box and into a collaborative partnership with users.
MCP enables secure agent connections to Windows apps and services.
As part of this release, two agent connectors are built into Windows—File Explorer and Windows Settings.
Windows MIDI Services Console Monitor
Windows MIDI Services Settings app
We look forward to having Windows Insiders who create music try out the Windows MIDI Services release, especially with your current applications and devices. You can also join the discussion on Discord if you have questions or are looking to help or provide feedback! You will find a list of known issues documented here.
Feedback: Share your thoughts on GitHub or Discord.
Choose an app to open files—now with direct Microsoft Store integration.