Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

WebMCP for Beginners



Raise your hand if you thought WebMCP was just an MCP server. Guilty as charged. I did too. It turns out it's a proposed W3C web standard that borrows concepts from MCP. Here's what it actually is.

What is WebMCP?

WebMCP is a way for websites to define actions that AI agents can call directly.

Normally, when an agent interacts with a website, it has to interpret the interface. It looks at the page, tries to find inputs, clicks buttons, and hopes it is interacting with the right elements. That process works, but it is indirect and often fragile.

With WebMCP, the website removes that layer of guesswork. Instead of forcing the agent to figure out the UI, the site exposes functions that represent the actions it supports.

So instead of simulating a user flow like typing into an input and clicking a button, the agent can call something like:

set_background_color({ color: "coral" })

In this case, the agent is not trying to understand the layout of the page or navigate through it step by step. It is calling a function that the website explicitly defined, which makes the interaction more direct and more reliable.

WebMCP is not MCP

This is the part that matters most. MCP and WebMCP solve a similar problem, but they do it in completely different places.

With MCP, you run a server that exposes tools. Your agent connects to that server and calls those tools. You are responsible for building it, hosting it, handling authentication, and maintaining it over time. With WebMCP, everything happens in the browser. The website itself defines the tools, and the agent discovers them by visiting the page. There is no separate server to deploy for that interaction.

Another way to think about it is that MCP is something you build around systems you want to access, while WebMCP is something a website builds into itself.

Why WebMCP exists

The reasoning behind WebMCP makes more sense when you look at the limitations people ran into with MCP at scale. When teams tried to build large MCP servers, they often ended up with too many tools for the model to reason about effectively. On top of that, authentication became complicated because every service had its own requirements, and managing all of that in one place was difficult.

The browser already solves a lot of those problems. When you are logged into a website, your session, cookies, and authentication state are already in place. That system has existed for years and works reliably. WebMCP builds on that idea by letting websites expose their own actions within that authenticated context, instead of requiring a separate server to manage everything.

WebMCP is not the Playwright MCP Server

I livestreamed myself exploring WebMCP for the first time, and a common question I got was: "Is this the same thing as Chrome DevTools MCP server or Playwright MCP Server?" They're not the same.

These MCP servers enable browser automation. Browser automation allows an agent to control a browser by interacting with the interface. The agent can take screenshots, read the DOM, click elements, type into inputs, and navigate pages. This works on any website, but the agent has to interpret what it sees and decide how to act.

WebMCP takes a different approach. It only works on websites that implement it, but when they do, the agent does not need to interpret the UI at all. The website provides structured actions, and the agent calls them directly.

In practice, that difference changes the interaction model. With browser automation, the agent follows a sequence of steps that approximate what a user would do. With WebMCP, the agent skips that process and directly invokes the underlying action.

If you are using goose, Chrome DevTools MCP is still useful because it connects goose to the browser. It acts as the bridge. The improvement in how the agent interacts with the site comes from WebMCP itself.

WebMCP in Practice

To make this more concrete, think about ordering food from a restaurant website. Without WebMCP, you would need to build something that understands how that site works. That includes mapping out the ordering flow, handling login, parsing the menu, and submitting orders. You would also need to maintain that logic whenever the site changes. If you wanted to support multiple restaurants, you would repeat that process for each one.

With WebMCP, the restaurant defines a tool like place_order. The site already knows its menu structure, its modification options, and its checkout flow. It also already handles authentication. Instead of rebuilding all of that externally, the agent simply calls the tool that the site provides.
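To sketch what that could look like, here is a hypothetical place_order registration using the navigator.modelContext.registerTool API. Everything here (the tool name, the schema, and the submitOrder() helper) is illustrative, not taken from any real restaurant site or from a published spec:

```javascript
// Hypothetical sketch of what a restaurant site might register.
// The tool name, schema, and submitOrder() helper are illustrative
// assumptions, not part of any published WebMCP spec.
function submitOrder(items) {
  // Stand-in for the site's real checkout logic, which would reuse
  // the signed-in user's existing session.
  return `order #${1000 + items.length}: ${items.join(", ")}`;
}

const placeOrderTool = {
  name: "place_order",
  description: "Place a food order for the signed-in customer",
  inputSchema: {
    type: "object",
    properties: {
      items: {
        type: "array",
        items: { type: "string" },
        description: "Menu item names to add to the order"
      }
    },
    required: ["items"]
  },
  execute: ({ items }) => {
    const confirmation = submitOrder(items);
    return {
      content: [{ type: "text", text: `Order placed: ${confirmation}` }]
    };
  }
};

// Register only in a WebMCP-capable browser.
if (typeof navigator !== "undefined" && navigator.modelContext) {
  navigator.modelContext.registerTool(placeOrderTool);
}
```

An agent visiting the page could then call place_order with a list of menu items without ever touching the menu UI.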

Why it matters

There are a few reasons this approach stands out. Websites already understand their own structure and logic better than any external system. WebMCP allows them to encode that knowledge once and make it available to any agent. Authentication is already handled within the browser, which removes a large amount of complexity that MCP servers would otherwise need to manage. Maintenance also shifts to the right place. When a website changes, the people who own it update their tools. You are no longer responsible for maintaining integrations for systems you do not control.

Building a WebMCP site

To understand this better, I built a simple color picker demo that exposes one action: changing the background color of the page.

The structure looks like this:

my-webmcp-site/
├── index.html
├── style.css
└── webmcp.js

The HTML is a basic page:

<!DOCTYPE html>
<html>
  <head>
    <title>WebMCP Color Picker</title>
    <link rel="stylesheet" href="style.css">
  </head>
  <body>
    <div class="container">
      <h1>🎨 Color Picker</h1>
      <p>Ask an AI to change my background!</p>
      <p>Current color: <span id="colorName">#6366f1</span></p>
    </div>
    <script src="webmcp.js"></script>
  </body>
</html>

The WebMCP functionality comes from registering a tool:

if (window.navigator.modelContext) {
  window.navigator.modelContext.registerTool({
    name: "set_background_color",
    description: "Change the background color of the page",
    inputSchema: {
      type: "object",
      properties: {
        color: { type: "string" }
      },
      required: ["color"]
    },
    execute: ({ color }) => {
      document.body.style.backgroundColor = color;
      document.getElementById("colorName").textContent = color;

      return {
        content: [{
          type: "text",
          text: `Background color changed to ${color}`
        }]
      };
    }
  });
}

Once this is registered, an agent can discover and call the tool when it visits the page. One detail that stood out to me is how important the description is. The model uses that description to decide when to call the tool and what inputs to provide, so being specific makes a difference.
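For example, here is a sketch of the same tool with a more specific description and a parameter-level hint. The exact wording is my own, but the extra detail is what gives the model something concrete to reason about when deciding whether to call the tool and what to pass it:

```javascript
// Sketch: a more descriptive version of the same tool definition.
// The description text is illustrative, not from the WebMCP spec.
const setBackgroundColorTool = {
  name: "set_background_color",
  description:
    "Change the page background color. Accepts any valid CSS color " +
    "value, such as a named color ('coral'), hex ('#ff7f50'), or rgb().",
  inputSchema: {
    type: "object",
    properties: {
      color: {
        type: "string",
        description: "A CSS color value, e.g. 'coral' or '#ff7f50'"
      }
    },
    required: ["color"]
  }
};
```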

Two ways to define tools

You can define tools in JavaScript, which works well for dynamic behavior and applications that need more control. There is also a simpler option where you define tools directly in HTML using form attributes:

<form
  toolname="subscribe_newsletter"
  tooldescription="Subscribe an email address"
>
  <input
    type="email"
    name="email"
    required
  />
  <button type="submit">Subscribe</button>
</form>

In this case, the browser turns the form into a tool automatically. This approach works well for simple use cases where you do not need custom logic.

Connecting goose to WebMCP

Because WebMCP runs in the browser, you need a way for goose to interact with it. That is where Chrome DevTools MCP comes in. It acts as a bridge between goose and the browser, allowing the agent to access WebMCP tools.

One thing I noticed while testing this is that the prompt alone was not always enough to get the agent to use WebMCP. I had to provide hints that explained how to discover and execute tools:

const tools = await navigator.modelContextTesting.listTools();
const result = await navigator.modelContextTesting.executeTool("toolName", JSON.stringify({}));

Without that guidance, the agent defaulted to browser automation because that is what current models are more familiar with. As WebMCP becomes more common, this will likely become less necessary, but for now it helps guide the behavior.

Right now, WebMCP only works on sites that choose to implement it, which limits how widely it can be used. At the same time, the direction is important. Instead of agents trying to interpret interfaces, websites can define their capabilities directly and let agents interact with them in a structured way. That shift reduces guesswork, simplifies integration, and moves responsibility to the systems that already understand themselves best. It is still early, but this model makes more sense than trying to automate every interface on the web.


Monitor and improve your web app’s load performance

Today, large web applications are often assembled from many independent pieces, which all load their own data and resources. When all these pieces compete for the same network connection, congestion can build up and the user experience can suffer.

To address this problem, we're excited to introduce a new feature which web developers can start testing in Microsoft Edge today: Network Efficiency Guardrails.

If you're a web developer, improving the load performance of your app starts with knowing what to focus on first. However, when your web app embeds a mix of first-party and third-party content from different sources, optimizing performance depends on each of these pieces, not just on what you built. That's why identifying the resources that need to be optimized to improve your app's load time is crucial, and that's exactly what the Network Efficiency Guardrails feature does. So, if you're working on an app that embeds content, read on to learn more about Network Efficiency Guardrails and start using them.

Detecting bad resource-loading patterns

Based on our own experience working with large web-based applications, we know that there are certain resource-loading patterns which have a disproportionate impact on performance. For example:
  • Very large images
  • Uncompressed resources
  • Large data: URLs
With the Network Efficiency Guardrails feature, you can ask the browser to monitor your app's network resource usage. Once network monitoring starts, the browser automatically identifies inefficient resource-loading patterns and reports them to you. You can then use this information to optimize your app for all your users. The cycle is simple: the app loads resources, violations are reported, you optimize your site, and the app loads faster.

In practice, you first opt into the feature by setting a Document Policy. Once you've done that, the offending loading patterns which the browser detects are reported as policy violations through the Reporting API, a web platform mechanism that lets you send structured reports back to your server when something notable happens at runtime.

What gets reported

Currently, when you opt into the Network Efficiency Guardrails feature, Microsoft Edge will use the following criteria to identify policy violations:
  • Text-based resources that are not HTTP-compressed.
  • Images larger than 200 kB.
  • data: URLs larger than 100 kB.
These are our initial criteria, and we believe they are effective at flagging resource usage patterns that are atypical for well‑performing apps. We chose these criteria based on aggregate, real‑world data, established industry best practices, and Web Almanac findings. To learn more about how these values were chosen, you can read about it in our feature explainer document. We expect to make changes to these values as we continue to gather more feedback and data.

Try Network Efficiency Guardrails today

The feature is available in Microsoft Edge, starting with version 146. To try it, you'll first need to enable it:
  1. In Edge, go to edge://flags.
  2. Type "Experimental Web Platform features" in the Search flags text box at the top.
  3. Under the Experimental Web Platform features section, select Enabled in the dropdown menu.
  4. Restart Edge.

Enable the document policy on your site

Next, opt into the feature by enabling the document policy on your site, which you can do in either of these two ways:
  • Set the document policy by sending the following HTTP response headers from your server:

    Document-Policy: network-efficiency-guardrails; report-to=neg-endpoint
    Reporting-Endpoints: neg-endpoint="/neg-reporting/"

    The report-to endpoint name and value are not important yet. They're only required so you can start seeing reports, but you don't need to have the server endpoint running yet.
  • Or set the above response headers by creating a local override in DevTools instead. To learn how to do this, see Override HTTP response headers. This can be helpful to quickly get started since you don't have to modify your server code.

View the reported violations

Now that you have everything set up, you can use your app as normal, and the browser will start reporting problematic network usage patterns to you. You can view the reported violations either in DevTools or on your server.

DevTools is a simple way to get started. As violations get detected, they'll appear in the Console tool as error messages, for example: "Document policy violation: resource compression is required." You can also see them in the Application tool, under the Reporting API section, where each report includes more detail, such as a link to the source file that lacks compression.

If you want to use Network Efficiency Guardrails in production and receive real reports from your users' devices, you'll also need to configure a reporting endpoint on your server.

Configure a reporting server endpoint

To collect reports in production, use the report-to field in the Document-Policy header, giving it the name of your choosing, and then specify the value for this server endpoint in Reporting-Endpoints:

Document-Policy: network-efficiency-guardrails; report-to=neg-endpoint
Reporting-Endpoints: neg-endpoint="/neg-reporting/"

Now, configure the /neg-reporting/ endpoint on your server and make sure it can receive the reports, as well as accept preflight requests if you're using a cross-origin endpoint. To learn more, read the Reporting API documentation at MDN, and our feature explainer.
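To make the server side concrete, here is a minimal sketch of such an endpoint using Node's built-in http module. The browser POSTs reports as a JSON array of report objects per the Reporting API; the "document-policy-violation" type string and the body fields accessed below are assumptions based on the Document Policy reporting model, so verify them against the reports you actually receive:

```javascript
// Minimal sketch of a /neg-reporting/ endpoint. The report shape
// (array of { type, url, body } objects) is assumed from the
// Document Policy reporting model, not verified browser output.
import http from "node:http";

// Parse a raw request body and log any document-policy violations.
// Returns the number of reports received.
function handleReports(rawBody) {
  const reports = JSON.parse(rawBody);
  for (const report of reports) {
    if (report.type === "document-policy-violation") {
      console.log("Violation:", report.body?.message, "on", report.url);
    }
  }
  return reports.length;
}

const server = http.createServer((req, res) => {
  if (req.method === "POST" && req.url === "/neg-reporting/") {
    let body = "";
    req.on("data", (chunk) => (body += chunk));
    req.on("end", () => {
      handleReports(body);
      res.writeHead(204).end(); // acknowledge with no response body
    });
  } else {
    res.writeHead(404).end();
  }
});

// In production, serve this over HTTPS; the Reporting API only
// delivers reports to secure endpoints.
// server.listen(8443);
```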

View reports client-side

You can also retrieve violation reports on the client with JavaScript code. The reports are exposed to the document where they got created, through the Reporting client-side API. Use the ReportingObserver interface to access these reports as they are raised, and look for reports that have ReportBody.featureId === "network-efficiency-guardrails".
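Here is a small sketch of that, assuming the "document-policy-violation" report type used by Document Policy; since the feature is experimental, treat the exact report shape as subject to change:

```javascript
// Sketch of collecting guardrail reports in the page itself.
// The report type string and the featureId field are assumed from
// the Document Policy reporting model and the feature explainer.
function startGuardrailObserver(onViolation) {
  const observer = new ReportingObserver(
    (reports) => {
      for (const report of reports) {
        if (report.body?.featureId === "network-efficiency-guardrails") {
          onViolation(report);
        }
      }
    },
    // buffered: true also delivers reports generated before observe()
    { types: ["document-policy-violation"], buffered: true }
  );
  observer.observe();
  return observer;
}
```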

Let us know what you think

The Network Efficiency Guardrails feature is in its early stages of development, and we'd love you to try it and share your feedback with us. Learning about your app's specific network usage patterns can help us design the right API for you. We're actively exploring the following open questions, so now is a great opportunity to try the feature and help us improve it:
  • Fine-tuning existing guardrails: help us better detect the network usage patterns based on your data and feedback.
  • Adding new guardrails: are there additional patterns that we should be considering?
  • Cross-frame reporting: how should a parent frame monitor a child frame? Should bidirectional monitoring be possible? How should guardrails be enforced on embedded frames?
Check out the Network Efficiency Guardrails explainer and let us know your feedback by opening a new issue.

Two more EVs for the trash heap: Volvo EX30 and Honda Prologue


The steady stream of news about automakers cancelling or discontinuing electric vehicles continues apace. This week it's Volvo's small, quirky EX30 and Honda's solo electric offering in the US, the Prologue. Both are the latest victims of stagnating EV sales in the US thanks to the Trump administration's decision to eliminate tax incentives.

First, the EX30. The small SUV was the most affordable EV in Volvo's lineup, even if it took some time before it arrived on our shores. Volvo spokesperson Sophia Durr says that the automaker's US division has decided to discontinue the EX30 and EX30 Cross Country after the 2026 model year. It will, how …

Read the full story at The Verge.


The future of code is exciting and terrifying


Suddenly it seems like everyone's a coder. Or, at the very least, like they play one in the Claude Code app. But even for the seasoned pros, the act of software development is changing fast - many people are writing less code themselves and instead spending their time managing agents and projects. So what does all that change mean, both for the code and the people who make it?

Verge subscribers, don't forget you get exclusive access to ad-free Vergecast wherever you get your podcasts. Head here. Not a subscriber? You can sign up here.

On this episode of The Vergecast, Paul Ford, a writer and entrepreneur and longtime tech thinker, expl …

Read the full story at The Verge.


AI Agent & Copilot Podcast: Microsoft Data Scientists Vaishali Vinay and Raghav Bhatta on AI for Cyber Defense


In this episode of the AI Agent & Copilot Podcast, host Tom Smith speaks with Vaishali Vinay, Data Scientist at Microsoft, and Raghav Bhatta, Data Scientist at Microsoft, about their upcoming masterclass at the 2026 AI Agent & Copilot Summit NA in San Diego. They discuss how AI can serve as a threat research partner for cybersecurity teams, augmenting human expertise in threat hunting and detection engineering while helping organizations proactively defend against increasingly sophisticated cyber attacks.

Key Takeaways

  • AI as a Threat Research Partner: Vinay explains that traditional threat hunting and detection engineering have historically been highly manual processes requiring significant time and expertise. AI can now assist by analyzing attacker behavior and identifying detection opportunities faster. As Vinay notes, the goal is to augment our human experts and accelerate the threat research process.
  • Scaling Cyber Defense in an AI-Powered Threat Landscape: Bhatta highlights that as AI adoption grows across industries, the volume of data and potential attack vectors increases rapidly. Organizations must therefore adapt AI for defensive purposes as well. “The amount of data which is produced… is increasing at a nonlinear scale,” Bhatta explains. AI copilots help defenders process this scale by assisting with detection engineering, threat hunting, and proactive defense strategies that protect infrastructure and customers from evolving cyber threats.
  • Capturing and Sharing ‘Tribal Knowledge’ Through AI: Cybersecurity often depends on the deep experience of veteran researchers who understand attacker behavior patterns. Bhatta suggests AI copilots can help scale that expertise across teams. He explains that copilots can serve as “a source of tribal knowledge,” enabling newer analysts and teams to leverage insights that historically lived only in the heads of experienced researchers. This dramatically increases productivity and knowledge transfer within security organizations.
  • AI Attackers vs. AI Defenders: The session also acknowledges that cyber attackers are increasingly leveraging AI themselves. That makes defensive innovation essential. Vinay and Bhatta emphasize the importance of building AI systems that analyze attack techniques and automatically recommend detection rules. This dynamic defense model enables security teams to react faster to emerging threats and reduces the manual workload traditionally required to understand complex attack patterns.

The post AI Agent & Copilot Podcast: Microsoft Data Scientists Vaishali Vinay and Raghav Bhatta on AI for Cyber Defense appeared first on Cloud Wars.


Announcing Copilot leadership update


Satya Nadella, Chairman and CEO, and Mustafa Suleyman, Executive Vice President and CEO of Microsoft AI, shared the below communications with Microsoft employees this morning.

SATYA NADELLA MESSAGE

I want to share two org changes we’re making to our Copilot org and superintelligence effort.

It’s clear a new era of productivity is emerging as AI experiences rapidly evolve from answering questions and suggesting code, to executing multi-step tasks with clear user control points. You see this in our announcements over the last couple of weeks, like Copilot Tasks and Copilot Cowork, agentic capabilities in Office, and Agent 365. As these experiences connect more naturally across agents, apps, and workflows, we have an opportunity to help customers spend more time on higher-value work and reduce manual coordination, while providing people with more agency and empowerment and organizations with the governance and security controls they need.

To that end, we are bringing the Copilot system across commercial and consumer together as one unified effort. This will span four connected pillars: Copilot experience, Copilot platform, Microsoft 365 apps, and AI models. This is how we move from a collection of great products to a truly integrated system, one that is simpler and more powerful for customers.

Jacob Andreou will lead the Copilot experience across consumer and commercial, driving design, product, growth, and engineering, as EVP, Copilot, reporting to me. As CVP of Product and Growth at Microsoft AI, Jacob has accelerated our user-focused AI-first product making and growth framework. Prior to that, he was SVP at Snap, where he helped scale the company from its early days.

Progress at the AI model layer is more critical than ever to our success as a company over the next decade and is foundational to everything we build above it. We are doubling down on our superintelligence mission with the talent and compute to build models that have real product impact, in terms of evals, COGS reduction, as well as advancing the frontier when it comes to meeting enterprise needs and achieving the next set of research breakthroughs. Mustafa Suleyman and I have been working towards this plan for some time, and he will continue to lead this high ambition work, reporting to me. Mustafa is uniquely qualified to drive this forward, with his deep focus and commitment to advancing the frontiers of model science, while also ensuring that human control, agency, and economic opportunity remain at the center of these advancements.

Ryan Roslansky, Perry Clarke, and Charles Lamanna will lead M365 apps and the Copilot platform. Together, Jacob, Ryan, Charles, Perry, and Mustafa make up the Copilot LT and over the next few weeks they’ll work to align the teams.

Our org boundaries will simply reflect system architecture and product shape such that we can deliver more coherent and competitive experiences that continue to evolve with model capabilities. And I am looking forward to how together we apply all of this to empower people, organizations, and the world.

MUSTAFA SULEYMAN MESSAGE

Subject: A new structure for Microsoft AI

Technology and the future of our industry will be defined by two things: frontier models, and the products through which they are experienced. For some time, I’ve been thinking about how we best tackle these huge challenges, and today I’m excited to be evolving our structure at Microsoft AI, ensuring we’re positioned to succeed in both.

I came to Microsoft with an overriding mission: to create Superintelligence that delivers a transformative, positive impact for millions of people. This requires us to build frontier models, at scale, pushing the boundaries of what’s possible. Everything else follows from this. It’s the foundation for our future as a company. With our ambitious, long-term frontier scale compute roadmap locked, we now have everything we need to build truly SOTA models.

As you will have just heard from Satya, the next phase of this plan is to restructure our organization to enable me to focus all my energy on our Superintelligence efforts and be able to deliver world class models for Microsoft over the next 5 years. These models will enable us to build enterprise tuned lineages that help improve all our products across the company. They’ll also enable us to deliver the COGS efficiencies necessary to be able to serve AI workloads at the immense scale required in the coming years. Achieving all this will be a huge challenge, and I’m committing everything we have – and I have personally – to make it happen.

To that end, I’ve been working hard with other leaders in the background for a while now to define a strategy to unify Copilot by bringing together the Consumer and Commercial efforts as one. We all know this makes sense. Every user – whether at home or at work – will be able to enjoy the full benefit of what we are all building. Today, we’re combining these organizations into a single, unified Copilot org. Jacob has demonstrated himself to be an outstanding leader for the product experience and clearly has the product instincts, the operational range, and the conviction to make Copilot a great success.

Jacob will retain a dotted line to me, and I’ll stay directly involved in much of the day-to-day operation of MAI, attending Meetups, MMMs, LT, and supporting Jacob to drive all areas of product strategy. To ensure that the models we build and the products we ship are mutually reinforcing, we are establishing a Copilot Leadership Team that includes me, Jacob, Charles Lamanna, Perry Clarke, and Ryan Roslansky. This will enable us to focus our brand strategy, our product roadmap, our models and our core infrastructure as one to deliver the best experiences possible for all our users.

Thank you for everything you’ve done over the last few years. I know how hard everyone has been pushing and the sacrifices many of you have made to help the company adapt to this new era.

We really do have an incredible opportunity to redefine Microsoft for this agentic revolution.

Mustafa’s mail has been edited slightly for external use.

The post Announcing Copilot leadership update appeared first on The Official Microsoft Blog.
