A Function Inliner for Wasmtime and Cranelift


Note: I cross-posted this to the Bytecode Alliance blog.

Function inlining is one of the most important compiler optimizations, not because of its direct effects, but because of the follow-up optimizations it unlocks. It may reveal, for example, that an otherwise-unknown function parameter value is bound to a constant argument, which makes a conditional branch unconditional, which in turn exposes that the function will always return the same value. Inlining is the catalyst of modern compiler optimization.

Wasmtime is a WebAssembly runtime that focuses on safety and fast Wasm execution. But despite that focus on speed, Wasmtime has historically chosen not to perform inlining in its optimizing compiler backend, Cranelift. There were two reasons for this surprising decision: first, Cranelift is a per-function compiler designed such that Wasmtime can compile all of a Wasm module’s functions in parallel. Inlining is inter-procedural and requires synchronization between function compilations; that synchronization reduces parallelism. Second, Wasm modules are generally produced by an optimizing toolchain, like LLVM, that already did all the beneficial inlining. Any calls remaining in the module will not benefit from inlining — perhaps they are on slow paths marked [[unlikely]] or the callee is annotated with #[inline(never)]. But WebAssembly’s component model changes this calculus.

With the component model, developers can compose multiple Wasm modules — each produced by different toolchains — into a single program. Those toolchains only had a local view of the call graph, limited to their own module, and they couldn’t see cross-module or fused adapter function definitions. None of them, therefore, had an opportunity to inline calls to such functions. Only the Wasm runtime’s compiler, which has the final, complete call graph and function definitions in hand, has that opportunity.

Therefore we implemented function inlining in Wasmtime and Cranelift. Its initial implementation landed in Wasmtime version 36; however, it remains off by default and is still baking. You can test it out via the -C inlining=y command-line flag or the wasmtime::Config::compiler_inlining method. The rest of this article describes function inlining in more detail, digs into the guts of our implementation and the rationale for its design choices, and finally looks at some early performance results.
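If you are embedding Wasmtime as a library rather than using the CLI, enabling the inliner looks roughly like the following sketch. It assumes Wasmtime 36 or later and that Config::compiler_inlining takes a boolean toggle mirroring the CLI flag; check the wasmtime API documentation for the exact signature:

use wasmtime::{Config, Engine, Result};

fn main() -> Result<()> {
    let mut config = Config::new();
    // Assumed signature: a boolean toggle, off by default while the feature bakes.
    config.compiler_inlining(true);
    let engine = Engine::new(&config)?;
    // ... compile and run modules or components with `engine` as usual ...
    let _ = engine;
    Ok(())
}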

Function Inlining

Function inlining is a compiler optimization where a call to a function f is replaced by a copy of f’s body. This removes function call overheads (spilling caller-save registers, setting up the call frame, etc…) which can be beneficial on its own. But inlining’s main benefits are indirect: it enables subsequent optimization of f’s body in the context of the call site. That context is important — a parameter’s previously unknown value might be bound to a constant argument and exposing that to the optimizer might cascade into a large code clean up.

Consider the following example, where function g calls function f:

fn f(x: u32) -> bool {
    return x < u32::MAX / 2;
}

fn g() -> u32 {
    let a = 42;
    if f(a) {
        return a;
    } else {
        return 0;
    }
}

After inlining the call to f, function g looks something like this:

fn g() -> u32 {
    let a = 42;

    let x = a;
    let f_result = x < u32::MAX / 2;

    if f_result {
        return a;
    } else {
        return 0;
    }
}

Now the whole subexpression that defines f_result only depends on constant values, so the optimizer can replace that subexpression with its known value:

fn g() -> u32 {
    let a = 42;

    let f_result = true;
    if f_result {
        return a;
    } else {
        return 0;
    }
}

This reveals that the if-else conditional will, in fact, unconditionally transfer control to the consequent, and g can be simplified into the following:

fn g() -> u32 {
    let a = 42;
    return a;
}

In isolation, inlining f was a marginal transformation. When considered holistically, however, it unlocked a plethora of subsequent simplifications that ultimately led to g returning a constant value rather than computing anything at run-time.

Implementation

Cranelift’s unit of compilation is a single function, which Wasmtime leverages to compile each function in a Wasm module in parallel, speeding up compile times on multi-core systems. But inlining a function at a particular call site requires that function’s definition, which implies parallelism-hurting synchronization or some other compromise, like additional read-only copies of function bodies. So this was the first goal of our implementation: to preserve as much parallelism as possible.

Additionally, although Cranelift is primarily developed for Wasmtime by Wasmtime’s developers, it is independent of Wasmtime. It is a reusable library and is reused, for example, by the Rust project as an alternative backend for rustc. But a large part of inlining, in practice, is the heuristics for deciding when inlining a call is likely beneficial, and those heuristics can be domain specific. Wasmtime generally wants to leave most calls out-of-line, inlining only cross-module calls, while rustc wants something much more aggressive to boil away its Iterator combinators and the like. So our second implementation goal was to separate how we inline a function call from the decision of whether to inline that call.

These goals led us to a layered design where Cranelift has an optional inlining pass, but the Cranelift embedder (e.g. Wasmtime) must provide a callback to it. The inlining pass invokes the callback for each call site, and the callback returns a command: either “leave the call as-is” or “here is a function body, replace the call with it”. Cranelift is responsible for the inlining transformation and the embedder is responsible for deciding whether to inline a function call and, if so, getting that function’s body (along with whatever synchronization that requires).
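To make that division of labor concrete, here is a small, runnable Rust sketch of the same shape. The names (FuncId, FunctionBody, InlineCommand, inline_pass) are invented for illustration and are not Cranelift’s actual API; the point is only that the pass owns the mechanics while the embedder-supplied closure owns the policy:

use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct FuncId(u32);

#[derive(Clone, Debug)]
enum Inst {
    Call(FuncId),
    Other(&'static str),
}

#[derive(Clone, Debug)]
struct FunctionBody {
    insts: Vec<Inst>,
}

// What the embedder tells the inlining pass to do at one call site.
enum InlineCommand {
    KeepCall,
    Inline(FunctionBody),
}

// The "compiler side": mechanically splice bodies in when told to.
fn inline_pass(caller: &mut FunctionBody, mut decide: impl FnMut(FuncId) -> InlineCommand) {
    let mut out = Vec::new();
    for inst in caller.insts.drain(..) {
        match inst {
            Inst::Call(callee) => match decide(callee) {
                InlineCommand::KeepCall => out.push(Inst::Call(callee)),
                InlineCommand::Inline(body) => out.extend(body.insts),
            },
            other => out.push(other),
        }
    }
    caller.insts = out;
}

fn main() {
    // The "embedder side" owns the function bodies and the policy.
    let mut functions: HashMap<FuncId, FunctionBody> = HashMap::new();
    functions.insert(FuncId(0), FunctionBody { insts: vec![Inst::Other("x < MAX / 2")] });

    let mut g = FunctionBody {
        insts: vec![Inst::Other("a = 42"), Inst::Call(FuncId(0)), Inst::Other("return")],
    };

    // Policy: inline every call whose body we happen to have, standing in
    // for Wasmtime's real cross-module heuristics.
    inline_pass(&mut g, |callee| match functions.get(&callee) {
        Some(body) => InlineCommand::Inline(body.clone()),
        None => InlineCommand::KeepCall,
    });

    println!("{:?}", g.insts);
}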

The mechanics of the inlining transformation — wiring arguments to parameters, renaming values, and copying instructions and basic blocks into the caller — are, well, mechanical. Cranelift makes extensive use of arenas for various entities in its IR, and we begin by appending the callee’s arenas to the caller’s arenas, renaming entity references from the callee’s arena indices to their new indices in the caller’s arenas as we do so. Next we copy the callee’s block layout into the caller and replace the original call instruction with a jump to the caller’s inlined version of the callee’s entry block. Cranelift uses block parameters, rather than phi nodes, so the call arguments simply become jump arguments. Finally, we translate each instruction from the callee into the caller. This is done via a pre-order traversal to ensure that we process value definitions before value uses, simplifying instruction operand rewriting. The changes to Wasmtime’s compilation orchestration are more interesting.
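As a toy illustration of that renaming step (plain vectors standing in for Cranelift’s real entity arenas, which are much richer), appending a callee’s entities shifts every callee-side reference by the caller’s old arena length:

#[derive(Clone, Copy, Debug)]
struct Value(u32); // a reference into a function's value arena

// Append the callee's arena onto the caller's and rewrite a list of
// callee-space references into caller-space by adding the old length.
fn append_and_rename(
    caller_arena: &mut Vec<&'static str>,
    callee_arena: &[&'static str],
    callee_refs: &[Value],
) -> Vec<Value> {
    let offset = caller_arena.len() as u32;
    caller_arena.extend_from_slice(callee_arena);
    callee_refs.iter().map(|v| Value(v.0 + offset)).collect()
}

fn main() {
    let mut caller = vec!["c0", "c1", "c2"];
    let callee = ["k0", "k1"];
    // In the callee, Value(1) referred to "k1"; after appending it must
    // refer to index 3 + 1 = 4 in the caller's combined arena.
    let renamed = append_and_rename(&mut caller, &callee, &[Value(0), Value(1)]);
    for v in &renamed {
        println!("{:?} -> {}", v, caller[v.0 as usize]);
    }
}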

The following pseudocode describes Wasmtime’s compilation orchestration before Cranelift gained an inlining pass and also when inlining is disabled:

// Compile each function in parallel.
let objects = parallel map for func in wasm.functions {
    compile(func)
};

// Combine the functions into one region of executable memory, resolving
// relocations by mapping function references to PC-relative offsets.
return link(objects)

The naive way to update that process to use Cranelift’s inlining pass might look something like this:

// Optionally perform some pre-inlining optimizations in parallel.
parallel for func in wasm.functions {
    pre_optimize(func);
}

// Do inlining sequentially.
for func in wasm.functions {
    func.inline(|f| if should_inline(f) {
        Some(wasm.functions[f])
    } else {
        None
    })
}

// And then proceed as before.
let objects = parallel map for func in wasm.functions {
    compile(func)
};
return link(objects)

Inlining is performed sequentially, rather than in parallel, which is a bummer. But if we tried to make that loop parallel by logically running each function’s inlining pass in its own thread, then a callee function we are inlining might or might not have had its transitive function calls inlined already depending on the whims of the scheduler. That leads to non-deterministic output, and our compilation must be deterministic, so it’s a non-starter.1 But whether a function has already had transitive inlining done or not leads to another problem.

With this naive approach, we are either limited to one layer of inlining or else potentially duplicating inlining effort, repeatedly inlining e into f each time we inline f into g, h, and i. This is because f may come before or after g in our wasm.functions list. We would prefer it if f already contained e and was already optimized accordingly, so that every caller of f didn’t have to redo that same work when inlining calls to f.

This suggests we should topologically sort our functions based on their call graph, so that we inline in a bottom-up manner, from leaf functions (those that do not call any others) towards root functions (those that are not called by any others, typically main and other top-level exported functions). Given a topological sort, we know that whenever we are inlining f into g either (a) f has already had its own inlining done or (b) f and g participate in a cycle. Case (a) is ideal: we aren’t repeating any work because it’s already been done. Case (b), when we find cycles, means that f and g are mutually recursive. We cannot fully inline recursive calls in general (just as you cannot fully unroll a loop in general) so we will simply avoid inlining these calls.2 So topological sort avoids repeating work, but our inlining phase is still sequential.

At the heart of our proposed topological sort is a call graph traversal that visits callees before callers. To parallelize inlining, you could imagine that, while traversing the call graph, we track how many still-uninlined callees each caller function has. Then we batch all functions whose associated counts are currently zero (i.e. they aren’t waiting on anything else to be inlined first) into a layer and process them in parallel. Next, we decrement each of their callers’ counts and collect the next layer of ready-to-go functions, continuing until all functions have been processed.

let call_graph = CallGraph::new(wasm.functions);

let counts = { f: call_graph.num_callees_of(f) for f in wasm.functions };

let layer = [ f for f in wasm.functions if counts[f] == 0 ];
while layer is not empty {
    parallel for func in layer {
        func.inline(...);
    }

    let next_layer = [];
    for func in layer {
        for caller in call_graph.callers_of(func) {
            counts[caller] -= 1;
            if counts[caller] == 0 {
                next_layer.push(caller)
            }
        }
    }
    layer = next_layer;
}

This algorithm will leverage available parallelism, and it avoids repeating work via the same dependency-based scheduling that topological sorting did, but it has a flaw. It will not terminate when it encounters recursion cycles in the call graph. If function f calls function g which also calls f, for example, then it will not schedule either of them into a layer because they are both waiting for the other to be processed first. One way we can avoid this problem is by avoiding cycles.

If you partition a graph’s nodes into disjoint sets, where each set contains every node reachable from every other node in that set, you get that graph’s strongly-connected components (SCCs). If a node does not participate in a cycle, then it will be in its own singleton SCC. The members of a cycle, on the other hand, will all be grouped into the same SCC, since those nodes are all reachable from each other.
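Before looking at an example, here is a small, self-contained Rust sketch of computing SCCs with Kosaraju’s algorithm over a toy call graph of numbered functions. It is purely illustrative; Wasmtime’s real implementation operates over its own call-graph data structures:

fn sccs(n: usize, edges: &[(usize, usize)]) -> Vec<Vec<usize>> {
    let mut fwd = vec![Vec::new(); n];
    let mut rev = vec![Vec::new(); n];
    for &(a, b) in edges {
        fwd[a].push(b);
        rev[b].push(a);
    }

    // Pass 1: record a post-order over the forward graph (iterative DFS).
    let mut visited = vec![false; n];
    let mut order = Vec::with_capacity(n);
    for start in 0..n {
        if visited[start] {
            continue;
        }
        visited[start] = true;
        let mut stack = vec![(start, 0usize)];
        while let Some((node, i)) = stack.last().copied() {
            if i < fwd[node].len() {
                stack.last_mut().unwrap().1 += 1;
                let next = fwd[node][i];
                if !visited[next] {
                    visited[next] = true;
                    stack.push((next, 0));
                }
            } else {
                order.push(node);
                stack.pop();
            }
        }
    }

    // Pass 2: DFS over the reversed graph in reverse post-order; every tree
    // discovered this way is one strongly-connected component.
    let mut assigned = vec![false; n];
    let mut components = Vec::new();
    for &start in order.iter().rev() {
        if assigned[start] {
            continue;
        }
        assigned[start] = true;
        let mut component = Vec::new();
        let mut stack = vec![start];
        while let Some(node) = stack.pop() {
            component.push(node);
            for &next in &rev[node] {
                if !assigned[next] {
                    assigned[next] = true;
                    stack.push(next);
                }
            }
        }
        components.push(component);
    }
    components
}

fn main() {
    // Functions 0 and 1 call each other (a cycle); both call leaf function 2.
    let components = sccs(3, &[(0, 1), (1, 0), (0, 2), (1, 2)]);
    println!("{:?}", components); // [[0, 1], [2]]
}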

In the following example, the dotted boxes designate the graph’s SCCs:

Ignoring edges between nodes within the same SCC, and only considering edges across SCCs, gives us the graph’s condensation. The condensation is always acyclic, because the original graph’s cycles are “hidden” within the SCCs.

Here is the condensation of the previous example:

We can adapt our parallel-inlining algorithm to operate on strongly-connected components, and now it will correctly terminate because we’ve removed all cycles. First, we find the call graph’s SCCs and create the reverse (or transpose) condensation, where an edge a→b is flipped to b→a. We do this because we will query this graph to find the callers of a given function f, not the functions that f calls. I am not aware of an existing name for the reverse condensation, so, at Chris Fallin’s brilliant suggestion, I have decided to call it an evaporation. From there, the algorithm largely remains as it was before, although we keep track of counts and layers by SCC rather than by function.

let call_graph = CallGraph::new(wasm.functions);
let components = StronglyConnectedComponents::new(call_graph);
let evaporation = Evaporation::new(components);

let counts = { c: evaporation.num_callees_of(c) for c in components };

let layer = [ c for c in components if counts[c] == 0 ];
while layer is not empty {
    parallel for func in scc in layer {
        func.inline(...);
    }

    let next_layer = [];
    for scc in layer {
        for caller_scc in evaporation.callers_of(scc) {
            counts[caller_scc] -= 1;
            if counts[caller_scc] == 0 {
                next_layer.push(caller_scc);
            }
        }
    }
    layer = next_layer;
}

This is the algorithm we use in Wasmtime, modulo minor tweaks here and there to engineer some data structures and combine some loops. After parallel inlining, the rest of the compiler pipeline continues in parallel for each function, yielding unlinked machine code. Finally, we link all that together and resolve relocations, same as we did previously.

Heuristics are the only implementation detail left to discuss, but there isn’t much to say that hasn’t already been said. Wasmtime prefers not to inline calls within the same Wasm module, while cross-module calls are a strong hint that we should consider inlining. Beyond that, our heuristics are extremely naive at the moment, and only consider the code sizes of the caller and callee functions. There is a lot of room for improvement here, and we intend to make those improvements on-demand as people start playing with the inliner. For example, there are many things we don’t consider in our heuristics today, but possibly should:

  • Hints from WebAssembly’s compilation-hints proposal
  • The number of edges to a callee function in the call graph
  • Whether any of a call’s arguments are constants
  • Whether the call is inside a loop or a block marked as “cold”
  • Etc…
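As a purely hypothetical sketch of the kind of naive, size-based policy described above (the names and thresholds below are invented for illustration and are not Wasmtime’s actual values), a decision function might look like this:

struct CallSiteInfo {
    cross_module: bool, // does the call cross a module boundary in the component?
    caller_size: usize, // rough instruction count of the caller so far
    callee_size: usize, // rough instruction count of the candidate callee
}

fn should_inline(site: &CallSiteInfo) -> bool {
    // Invented thresholds, for illustration only.
    const SMALL_CALLEE: usize = 50;
    const CALLER_BUDGET: usize = 10_000;

    // Assume the producing toolchain already inlined within its own module,
    // so leave intra-module calls out-of-line.
    if !site.cross_module {
        return false;
    }
    site.callee_size <= SMALL_CALLEE && site.caller_size + site.callee_size <= CALLER_BUDGET
}

fn main() {
    let small = CallSiteInfo { cross_module: true, caller_size: 400, callee_size: 12 };
    let large = CallSiteInfo { cross_module: true, caller_size: 400, callee_size: 5_000 };
    println!("{} {}", should_inline(&small), should_inline(&large)); // true false
}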

Some Initial Results

The speed up you get (or don’t get) from enabling inlining is going to vary from program to program. Here are a couple synthetic benchmarks.

First, let’s investigate the simplest case possible, a cross-module call of an empty function in a loop:

(component
  ;; Define one module, exporting an empty function `f`.
  (core module $M
    (func (export "f")
      nop
    )
  )

  ;; Define another module, importing `f`, and exporting a function
  ;; that calls `f` in a loop.
  (core module $N
    (import "m" "f" (func $f))
    (func (export "g") (param $counter i32)
      (loop $loop
        ;; When counter is zero, return.
        (if (i32.eq (local.get $counter) (i32.const 0))
          (then (return)))
        ;; Do our cross-module call.
        (call $f)
        ;; Decrement the counter and continue to the next iteration
        ;; of the loop.
        (local.set $counter (i32.sub (local.get $counter)
                                     (i32.const 1)))
        (br $loop))
    )
  )

  ;; Instantiate and link our modules.
  (core instance $m (instantiate $M))
  (core instance $n (instantiate $N (with "m" (instance $m))))

  ;; Lift and export the looping function.
  (func (export "g") (param "n" u32)
    (canon lift (core func $n "g"))
  )
)

We can inspect the machine code that this compiles down to via the wasmtime compile and wasmtime objdump commands. Let’s focus only on the looping function. Without inlining, we see a loop around a call, as we would expect:

00000020 wasm[1]::function[1]:
        ;; Function prologue.
        20: pushq   %rbp
        21: movq    %rsp, %rbp

        ;; Check for stack overflow.
        24: movq    8(%rdi), %r10
        28: movq    0x10(%r10), %r10
        2c: addq    $0x30, %r10
        30: cmpq    %rsp, %r10
        33: ja      0x89

        ;; Allocate this function's stack frame, save callee-save
        ;; registers, and shuffle some registers.
        39: subq    $0x20, %rsp
        3d: movq    %rbx, (%rsp)
        41: movq    %r14, 8(%rsp)
        46: movq    %r15, 0x10(%rsp)
        4b: movq    0x40(%rdi), %rbx
        4f: movq    %rdi, %r15
        52: movq    %rdx, %r14

        ;; Begin loop.
        ;;
        ;; Test our counter for zero and break out if so.
        55: testl   %r14d, %r14d
        58: je      0x72
        ;; Do our cross-module call.
        5e: movq    %r15, %rsi
        61: movq    %rbx, %rdi
        64: callq   0
        ;; Decrement our counter.
        69: subl    $1, %r14d
        ;; Continue to the next iteration of the loop.
        6d: jmp     0x55

        ;; Function epilogue: restore callee-save registers and
        ;; deallocate this function's stack frame.
        72: movq    (%rsp), %rbx
        76: movq    8(%rsp), %r14
        7b: movq    0x10(%rsp), %r15
        80: addq    $0x20, %rsp
        84: movq    %rbp, %rsp
        87: popq    %rbp
        88: retq

        ;; Out-of-line traps.
        89: ud2
            ╰─╼ trap: StackOverflow

When we enable inlining, then M::f gets inlined into N::g. Despite N::g becoming a leaf function, we will still push %rbp and all that in the prologue and pop it in the epilogue, because Wasmtime always enables frame pointers. But because it no longer needs to shuffle values into ABI argument registers or allocate any stack space, it doesn’t need to do any explicit stack checks, and nearly all the rest of the code also goes away. All that is left is a loop decrementing a counter to zero:3

00000020 wasm[1]::function[1]:
        ;; Function prologue.
        20: pushq   %rbp
        21: movq    %rsp, %rbp

        ;; Loop.
        24: testl   %edx, %edx
        26: je      0x34
        2c: subl    $1, %edx
        2f: jmp     0x24

        ;; Function epilogue.
        34: movq    %rbp, %rsp
        37: popq    %rbp
        38: retq

With this simplest of examples, we can just count the difference in number of instructions in each loop body:

  • 12 without inlining (7 in N::g, plus 5 in M::f: 2 to push the frame pointer, 2 to pop it, and 1 to return)
  • 4 with inlining

But we might as well verify that the inlined version really is faster via some quick-and-dirty benchmarking with hyperfine. This won’t measure only Wasm execution time; it also measures spawning a whole Wasmtime process, loading code from disk, and so on. Still, it will work for our purposes if we crank up the number of iterations:

$ hyperfine \
    "wasmtime run --allow-precompiled -Cinlining=n --invoke 'g(100000000)' no-inline.cwasm" \
    "wasmtime run --allow-precompiled -Cinlining=y --invoke 'g(100000000)' yes-inline.cwasm"

Benchmark 1: wasmtime run --allow-precompiled -Cinlining=n --invoke 'g(100000000)' no-inline.cwasm
  Time (mean ± σ):     138.2 ms ±   9.6 ms    [User: 132.7 ms, System: 6.7 ms]
  Range (min … max):   128.7 ms … 167.7 ms    19 runs

Benchmark 2: wasmtime run --allow-precompiled -Cinlining=y --invoke 'g(100000000)' yes-inline.cwasm
  Time (mean ± σ):      37.5 ms ±   1.1 ms    [User: 33.0 ms, System: 5.8 ms]
  Range (min … max):    35.7 ms …  40.8 ms    77 runs

Summary
  'wasmtime run --allow-precompiled -Cinlining=y --invoke 'g(100000000)' yes-inline.cwasm' ran
    3.69 ± 0.28 times faster than 'wasmtime run --allow-precompiled -Cinlining=n --invoke 'g(100000000)' no-inline.cwasm'

Okay so if we measure Wasm doing almost nothing but empty function calls and then we measure again after removing function call overhead, we get a big speed up — it would be disappointing if we didn’t! But maybe we can benchmark something a tiny bit more realistic.

A program that we commonly reach for when benchmarking is a small wrapper around the pulldown-cmark markdown library that parses the CommonMark specification (which is itself written in markdown) and renders that to HTML. This is Real World™ code operating on Real World™ inputs that matches Real World™ use cases people have for Wasm. That is, good benchmarking is incredibly difficult, but this program is nonetheless a pretty good candidate for inclusion in our corpus. There’s just one hiccup: in order for our inliner to activate normally, we need a program using components and making cross-module calls, and this program doesn’t do that. But we don’t have a good corpus of such benchmarks yet because this kind of component composition is still relatively new, so let’s keep using our pulldown-cmark program but measure our inliner’s effects via a more circuitous route.

Wasmtime has tunables to enable the inlining of intra-module calls4 and rustc and LLVM have tunables for disabling inlining5. Therefore we can roughly estimate the speed ups our inliner might unlock on a similar, but extensively componentized and cross-module calling, program by:

  • Disabling inlining when compiling the Rust source code to Wasm

  • Compiling the resulting Wasm binary to native code with Wasmtime twice: once with inlining disabled, and once with intra-module call inlining enabled

  • Comparing those two different compilations’ execution speeds

Running this experiment with Sightglass, our internal benchmarking infrastructure and tooling, yields the following results:

execution :: instructions-retired :: pulldown-cmark.wasm

  Δ = 7329995.35 ± 2.47 (confidence = 99%)

  with-inlining is 1.26x to 1.26x faster than without-inlining!

  [35729153 35729164.72 35729173] without-inlining
  [28399156 28399169.37 28399179] with-inlining

Conclusion

Wasmtime and Cranelift now have a function inliner! Test it out via the -C inlining=y command-line flag or via the wasmtime::Config::compiler_inlining method. Let us know if you run into any bugs or whether you see any speed-ups when running Wasm components containing multiple core modules.

Thanks to Chris Fallin and Graydon Hoare for reading early drafts of this piece and providing valuable feedback. Any errors that remain are my own.


  1. Deterministic compilation gives a number of benefits: testing is easier, debugging is easier, builds can be byte-for-byte reproducible, it is well-behaved in the face of incremental compilation and fine-grained caching, etc… 

  2. For what it is worth, this still allows collapsing chains of mutually-recursive calls (a calls b calls c calls a) into a single, self-recursive call (abc calls abc). Our actual implementation does not do this in practice, preferring additional parallelism instead, but it could in theory. 

  3. Cranelift cannot currently remove loops without side effects, and generally doesn’t mess with control-flow at all in its mid-end. We’ve had various discussions about how we might best fit control-flow-y optimizations into Cranelift’s mid-end architecture over the years, but it also isn’t something that we’ve seen would be very beneficial for actual, Real World™ Wasm programs, given that (a) LLVM has already done much of this kind of thing when producing the Wasm, and (b) we do some branch-folding when lowering from our mid-level IR to our machine-specific IR. Maybe we will revisit this sometime in the future if it crops up more often after inlining. 

  4. -C cranelift-wasmtime-inlining-intra-module=yes 

  5. -Cllvm-args=--inline-threshold=0, -Cllvm-args=--inlinehint-threshold=0, and -Zinline-mir=no 


Defensive Technology: Ransomware Data Recovery


In a prior installment we looked at Controlled Folder Access, a Windows feature designed to hamper ransomware attacks by preventing untrusted processes from modifying files in certain user folders. In today’s post, we look at the other feature on the Ransomware protection page of the Windows Security Center app: Ransomware data recovery.

User-Interface

The UI of the feature is simple and reflects the state of your cloud file provider (if any), which for most folks will be OneDrive. Depending on whether OneDrive is enabled, and what kind of account you have, you’ll see one of the following four sets of details:

Windows 11 Ransomware data recovery feature status

What’s it do?

Conceptually, this whole feature is super-simple.

Ransomware works by encrypting your files with a secret key and holding that key for ransom. If you have a backup of your files, you can simply restore the files without paying the bad guys.

However, for backup to work well as a ransomware recovery method, you need

  1. to ensure that your backup processes don’t overwrite the legitimate files with the encrypted versions, and
  2. to easily recognize which files were modified by ransomware so that they can be replaced with their latest uncorrupted versions.

The mechanism of this feature is quite simple: If Defender recognizes a ransomware attack is underway, it battles the ransomware (killing its processes, etc.) and also notifies your cloud file provider of the timestamp of the detected infection. Internally, we’ve called this a shoulder tap, as if we tapped the backup software on the shoulder and said, “Uh, hang on, this device is infected right now.”

This notice serves two purposes:

  1. To allow the file backup provider to pause backups until given an “all clear” (remediation complete) notification, and
  2. To allow the file backup provider to determine which files may have been corrupted from the start of the infection so that it can restore their backups.

Simple, right?

-Eric

Appendix: Extensibility

As far as I can tell, this feature represents a semi-public interface that allows 3P security software and cloud backup software to integrate with the Windows Security Center via two notifications: OnDataCorruptionMalwareFoundNotification and OnRemediationNotification. Unfortunately, the documentation isn’t public — I suspect it’s only available to members of the Microsoft Virus Initiative program for AV partners.




Building AI Agents with Google Gemini 3 and Open Source Frameworks

Gemini 3 Pro Preview is introduced as a powerful, agentic model for complex, (semi)-autonomous workflows. New agentic features include `thinking_level` for reasoning control, Stateful Tool Use via Thought Signatures, and `media_resolution` for multimodal fidelity. It has Day 0 support for open-source frameworks like LangChain, AI SDK, LlamaIndex, Pydantic AI, and n8n. Best practices include simplifying prompts and keeping temperature at 1.0.

Announcing the azd AI agent extension: Publish Microsoft Foundry agents directly from your development environment



If you’ve ever built an AI agent, you know the real challenge is more than writing the code; it’s the complex dance of provisioning resources, managing models, and securely deploying everything to the cloud. We thought: what if that entire workflow could be simplified?

That’s why our team has been working on a new extension for the Azure Developer CLI (azd), which we’re excited to announce at Ignite 2025. Our goal was to simplify the entire workflow. This new extension helps you build, provision, and publish AI agents on Microsoft Foundry, bridging the gap between local development and cloud deployment. It brings everything you need into your terminal and code editor, letting you go from an agent idea on your machine to a running, shareable agent in Azure with minimal friction.

With this extension, agents become a part of your toolchain. You get the power of Foundry’s advanced features (from multi-model reasoning to integrated evaluations) combined with the developer-first workflow of azd. This extension empowers you to rapidly iterate on AI agents locally and reliably push them to the cloud for at-scale use, all using a consistent set of tools.

What is the new azd AI agent extension?

The new azd ai agent extension adds agent-centric workflows directly into the Azure Developer CLI. It integrates Foundry projects, model deployments, and agent definitions, leveraging familiar azd concepts such as azd init and azd up. When paired with a template that includes Infrastructure as Code (IaC) or when you bring your own IaC, this extension automates the process of provisioning Azure resources, configuring model deployments, and publishing agents.

azd ai agent extension workflow

These new features handle three critical aspects of agent development:

  • Project scaffolding: Initialize a new project with an agent-ready starter template that includes the pre-configured IaC files you need to get started, then initialize agents from a manifest file.
  • Resource provisioning: After you’ve initialized your IaC and agents, the extension provisions and configures Foundry projects, deploys models with the proper SKUs and capacity settings, and establishes required connections.
  • Agent deployment: Package your agent, push container images to Azure Container Registry, and deploy agents to Foundry where they become shareable, production-ready services.

The result? You can go from an empty folder to a deployed, callable agent endpoint in minutes instead of hours. Ready to see how?

Key features

The new azd ai agent extension provides several powerful features that accelerate your agent development:

  • Simplify agent project initialization: Use starter templates that include the necessary IaC to scaffold complete projects. azd creates the necessary directory structure, IaC files, and configuration so you can start building immediately.

  • Declarative configuration: Define your entire agent stack—services, resources, and model deployments—in a declarative azure.yaml file. This approach makes your agent infrastructure version-controlled, repeatable, and easy to share across teams.

  • Unified provisioning and deployment: Use the familiar azd up command to provision infrastructure and deploy your agent. azd handles the container builds, registry pushes, resource provisioning, and agent deployment automatically.

  • Agent definition management: Pull agent definitions from GitHub repositories or local paths. The azd ai agent init command analyzes the definition, updates your project configuration, and maps parameters to environment variables automatically.

  • Secure by default: With the IaC files included in its templates, azd handles the boilerplate security configuration for you. It automatically configures your agent with a managed identity for secure access to Azure resources, following recommended practices without manual setup.

Getting started

Let’s walk through creating and deploying your first Foundry agent using the new AI agent features in azd.

Prerequisites

Before you begin, ensure you have:

  • Azure Developer CLI (azd) (version 1.21.3 or later) installed and authenticated (azd auth login).
  • The azd ai agent extension installed (azd extension install azure.ai.agents). If you don’t have it installed, the extension is installed automatically when you initialize the starter template or run azd ai agent.
  • An Azure subscription with permissions to create resource groups and Foundry resources. Sign up for a free account at azure.com/free if you don’t have one.
  • Azure CLI (az) installed for certain operations.

Initialize with a Foundry configuration template

This extension works with an azd template and is also able to connect to an existing project using azd ai agent init --project-id [resourceID].

In this post, we’ll walk through the approach of starting with a template. It’s a basic template for Foundry Agent Service to get started quickly.

  • Basic Setup (azd-ai-starter-basic): Optimized for speed and simplicity with all essential resources

Start by initializing a new project with the Basic Setup template. In an empty folder, run:

azd init -t Azure-Samples/azd-ai-starter-basic

When prompted, provide an environment name for your agent project (for example, “my-agent-project”).

The azd init process:

  • Clones the starter template files into your project
  • Creates the directory structure with infra/ (IaC files) and src/ folders
  • Generates an azure.yaml configuration file
  • Sets up .azure/<env>/.env for environment-specific variables

Review your project structure

The initialized template includes these key files:

├── .azure/              # Environment-specific settings (.env)
├── infra/               # Bicep files for Azure infrastructure
└── azure.yaml           # Project configuration

Open the azure.yaml file to see how your agent project is configured:

# yaml-language-server: $schema=https://raw.githubusercontent.com/Azure/azure-dev/main/schemas/v1.0/azure.yaml.json
name: my-agent-project

infra:
  provider: bicep
  path: ./infra

requiredVersions:
  extensions:
    # the azd ai agent extension is required for this template
    "azure.ai.agents": latest

This declarative configuration defines your agent service and the Azure AI resources it needs, including model deployments. To read more about the azure.yaml schema, see the Azure Developer CLI documentation. For the azd environment variables that the agent relies on, such as AZURE_AI_ACCOUNT_NAME and AZURE_AI_PROJECT_NAME, see the azd environment variables documentation.

Initialize your agent definition

The starter template provides the project structure, but you need to add a specific agent definition. Agent definitions describe your agent’s behavior, tools, and capabilities. You can find example definitions in the Agent Framework repository.

For this walkthrough, use an agent definition manifest:

azd ai agent init -m <agent-definition-url>

As an example, you can use the following URL for a simple calculator agent:

azd ai agent init -m https://github.com/azure-ai-foundry/foundry-samples/blob/main/samples/microsoft/hosted-agents/python/calculator-agent/agent.yaml

Here’s what happens when you run azd ai agent init:

  • Downloads the agent definition YAML file into your project’s src/ directory.
  • Analyzes the agent definition to understand its requirements.
  • Updates azure.yaml with the corresponding services and configurations.
  • Maps agent parameters to environment variables.

After running azd ai agent init, review the updated azure.yaml and .env files to see how your agent is configured. The azure.yaml file looks like:

requiredVersions:
    extensions:
        azure.ai.agents: latest
name: my-agent-project
services:
    CalculatorAgent:
        project: src/CalculatorAgent
        host: azure.ai.agent
        language: docker
        docker:
            remoteBuild: true
        config:
            container:
                resources:
                    cpu: "1"
                    memory: 2Gi
                scale:
                    maxReplicas: 3
                    minReplicas: 1
            deployments:
                - model:
                    format: OpenAI
                    name: gpt-4o-mini
                    version: "2024-07-18"
                  name: gpt-4o-mini
                  sku:
                    capacity: 10
                    name: GlobalStandard

Provision and deploy your agent

Now that your project is configured, you can deploy everything to Azure with one command:

azd up

This single command orchestrates the entire deployment workflow, from infrastructure to a live agent endpoint. Here’s what happens step-by-step:

  1. Provisions Infrastructure: Creates the Foundry account, project, and all necessary Azure resources defined in the Bicep files.
    • Pre-provision hooks inspect the agents and their dependencies, models, and other resources, then populate environment variables so that Bicep knows what to provision, including:
    • AI_PROJECT_DEPLOYMENTS (JSON): Specification of the models to deploy.
    • AI_PROJECT_CONNECTIONS (JSON): Specification of the connections to create.
    • AI_PROJECT_DEPENDENT_RESOURCES (JSON): Specification of the dependent resources.
    • ENABLE_HOSTED_AGENTS (boolean): Whether hosted agents need to be provisioned (with an ACR and CapHost).
  2. Deploys Models: Provisions the model deployments specified in azure.yaml (for example, GPT-4o-mini with the configured SKU and capacity).
  3. Builds & Pushes Container: If the agent has custom code, azd packages it into a container image and pushes it to your Azure Container Registry.
  4. Publishes Agent: Creates an Agent Application in Foundry and deploys your agent as a live, callable service.

The entire process typically completes in 5-10 minutes for a new project. When azd up finishes, there are links in the terminal output to the agent playground portal and the agent endpoint.

The agent is now live and ready to use. From an empty folder to a running agent, all with just three commands:

  1. azd init -t Azure-Samples/azd-ai-starter-basic
  2. azd ai agent init -m <agent-definition-url>
  3. azd up

Test your agent in Foundry

Now for the fun part—let’s make sure your agent is working.

  1. Follow the link in the terminal output. (Alternatively, open the Foundry portal and navigate to the project provisioned by azd; its name was displayed in the azd up output.)
  2. Open the Agents section to see your deployed agent.
  3. Launch the agent in the playground and send a test query, for example: “Summarize your capabilities.”

If your agent responds successfully, congratulations! You’ve just deployed a working Foundry agent from your local development environment.

How it works: Under the hood

So, what’s happening behind the scenes when you run these azd commands? Let’s break down how azd transforms your local project into a cloud-native AI agent. The process follows a clear, logical flow from scaffolding to a secure, running service.

1. Project scaffolding and configuration

When you run azd init with a Foundry template, azd sets up a complete, well-structured project. The template includes:

  • Bicep infrastructure files in the infra/ directory that define all the necessary Foundry resources, model deployments, and networking.
  • An azure.yaml file that provides a declarative map of your services, resources, and dependencies.
  • Environment configurations in .azure/<env>/.env that store subscription IDs, resource names, and endpoints.
  • An agent.yaml file in the src/ directory that stores the environment variables the agent needs. For example:
kind: hosted
name: CalculatorAgent
description: This LangGraph agent can perform arithmetic calculations such as addition, subtraction, multiplication, and division.
metadata:
    authors:
        - migu
    example:
        - content: What is the size of France in square miles, divided by 27?
          role: user
    tags:
        - example
        - learning
protocols:
    - protocol: responses
      version: v1
environment_variables:
    - name: AZURE_OPENAI_ENDPOINT
      value: ${AZURE_OPENAI_ENDPOINT}
    - name: OPENAI_API_VERSION
      value: 2025-03-01-preview

Next, when you run azd ai agent init, the CLI:

  • Fetches the agent definition from the URL or local path you provided.
  • Parses the YAML to understand the agent’s requirements (models, tools, connections).
  • Updates azure.yaml to include the agent as a service.
  • Creates or updates the environment variables needed for the agent’s runtime.

2. Resource provisioning

The azd up command triggers all infrastructure provisioning through Azure Resource Manager. Based on your azure.yaml and Bicep files, azd:

  • Compiles your Bicep templates into ARM templates.
  • Creates the resource group in your specified Azure region.
  • Provisions the Foundry account and project.
  • Deploys the specified models to the project with your configured SKUs and capacity.

For example, if your azure.yaml specifies gpt-4o-mini with version 2024-07-18, azd creates that exact model deployment in your Foundry project. This declarative approach ensures consistency between environments, so your development, staging, and production deployments use identical configurations.

3. Container build and publishing

For agents with custom code (for custom tools, integrations, or business logic), azd handles the complete containerization workflow:

  1. Build: Packages your agent code into a Docker container image using the configuration from your project.
  2. Push: Authenticates to Azure Container Registry and pushes the image with a unique tag.
  3. Deploy: Creates an agent in the Foundry and a deployment that runs your container.

Your agent is deployed to the Foundry hosted agent service, which provides automatic scaling, managed compute, and integrated monitoring. The Agent Application becomes the stable, versionable interface for your agent, with a unique name and endpoint.

4. Identity and security

Finally, azd automatically configures secure access patterns with IaC so you don’t have to manage credentials manually:

  • Managed Identity: Your agent uses the Foundry project’s system-assigned managed identity for authenticating to other Azure resources.
  • Role Assignments: Required permissions are granted automatically (for example, giving your agent access to Azure AI services, storage, or databases) with the starter template.
  • Endpoint Security: Agent endpoints use Azure AD authentication by default to ensure only authorized users or applications can call your agent.

These security configurations in the template follow Azure recommended practices and work out of the box, giving you a secure foundation from the start.

Use cases and scenarios

Now that you have a solid understanding of how these azd features work, let’s explore how you can use them to build different kinds of agents. For complete, ready-to-deploy examples, check out:

Building conversational AI assistants

Create intelligent customer service agents that understand context, access knowledge bases, and provide personalized responses. With azd, you can:

  • Rapidly deploy multiple agent variations for A/B testing
  • Integrate agents with Azure AI Search for retrieval-augmented generation
  • Connect to business systems and APIs through custom tools
  • Version and roll back agent deployments as you iterate

Data analysis and insights agents

Build agents that analyze data, generate visualizations, and provide insights. With azd, you can:

  • Provision agents with access to Azure SQL Database or Cosmos DB.
  • Deploy specialized models for quantitative analysis.
  • Create agents that use code interpreter tools for calculations.
  • Publish agents to help learn French.

Multi-agent orchestration

Develop systems where multiple specialized agents collaborate on complex tasks:

  • Deploy a coordinator agent that routes requests to specialist agents
  • Provision each agent with different model configurations for optimal performance
  • Use the declarative azure.yaml to define agent relationships and dependencies
  • Scale individual agents independently based on workload

Enterprise agent deployment

Standardize agent development and deployment across your organization:

  • Create reusable agent blueprints that encode your organization’s best practices
  • Publish agent templates to internal catalogs for teams to consume
  • Enforce consistent security, compliance, and monitoring configurations
  • Automate agent deployment in CI/CD pipelines using azd provision and azd deploy

Advanced configuration

Once you’re comfortable with the basic workflow, you can start customizing your deployments to meet more advanced requirements.

Customizing model deployments

Your azure.yaml file gives you full control over which models get deployed. To add or change a model, edit the file:

services:
    CalculatorAgent:
        project: src/CalculatorAgent
        host: azure.ai.agent
        language: docker
        docker:
            remoteBuild: true
        config:
            container:
                resources:
                    cpu: "1"
                    memory: 2Gi
                scale:
                    maxReplicas: 3
                    minReplicas: 1
            deployments:
                - model:
                    format: OpenAI
                    name: gpt-4o-mini
                    version: "2024-07-18"
                  name: gpt-4o-mini
                  sku:
                    capacity: 10
                    name: GlobalStandard

You can add more entries under deployments to provision multiple models, enabling your agent to use different models for different tasks (for example, a larger model for complex reasoning and a smaller one for simple queries). The next time you run azd up, azd automatically deploys any new models and updates your project.

Managing environment variables

Key environment variables azd sets or uses:

  • AZURE_SUBSCRIPTION_ID: Target subscription for resources.
  • AZURE_RESOURCE_GROUP: Resource group hosting the AI project.
  • AZURE_LOCATION: Azure region (must support the chosen models).
  • AZURE_AI_PROJECT_ID: The full Azure resource ID of your project.
  • AZURE_AI_PROJECT_NAME: Project hosting the agent.
  • AZURE_AI_PROJECT_ENDPOINT: Endpoint for agent management and runtime calls.
  • AZURE_CONTAINER_REGISTRY_ENDPOINT: Endpoint used to build and push container images.

These variables are stored in .azure/<environment-name>/.env and can be customized for each of your environments (for example, dev, test, and prod).

Get started today

The new azd ai agent capabilities are available in public preview as of Microsoft Ignite 2025. While it’s an early release, it’s ready for you to try today, and we’re actively evolving these features based on your feedback.

Getting involved

To get started:

Your feedback shapes these priorities, so we encourage you to share your ideas and use cases.

Start building intelligent agents today

We believe this new extension will change how you develop agents by letting you focus on what’s important: building intelligent solutions. We handled the complexity so you can get back to creating. With Foundry’s advanced capabilities and azd’s developer-friendly workflow, you have everything you need to create, iterate, and deploy production-grade AI agents.

Install the Azure Developer CLI today and start building the next generation of AI agents.

Additional resources

Happy building!

The post Announcing the azd AI agent extension: Publish Microsoft Foundry agents directly from your development environment appeared first on Azure SDK Blog.


Azure Developer CLI (azd) Nov 2025 – Container Apps (GA), Layered Provisioning (Beta), Extension Framework, and Aspire 13


Welcome to the November 2025 edition of the Azure Developer CLI (azd) release blog! This post covers releases 1.20.1 through 1.21.2. To share your feedback and questions, join the November release discussion on GitHub.

Highlights:

  • Azure Container Apps support reaches GA
  • Layered provisioning enters Beta
  • Extension framework capabilities enhanced
  • Aspire 13 compatibility added
  • New community templates available for Azure Functions, Copilot Studio, and more

New features

🚀 Extension Framework

  • Custom ServiceConfig settings: Extensions can now define custom service configuration settings to enable richer service behaviors, allowing authors to extend the service configuration model beyond the built-in properties and enabling custom workflows and integration patterns. [#6013] For example, an extension could define custom settings for specialized deployment scenarios or service-specific configurations that go beyond standard azd capabilities.
  • Enhanced no-prompt support: The prompt service now provides better support for non-interactive scenarios in extensions. When running with --no-prompt or in CI/CD environments, extensions can now gracefully handle user input requirements without failing, ensuring that azd workflows run smoothly in automated pipelines and headless environments and making extensions more reliable for production use cases. [#6073]
  • ServiceContext in lifecycle events: Extensions can now access ServiceContext in service lifecycle events, providing richer context information for event handlers and enabling extensions to make more informed decisions during service provisioning and deployment. [#6002]
  • Language framework support: Extensions can now register and provide language framework support, expanding azd’s capabilities beyond built-in languages and opening up possibilities for supporting more programming languages and framework ecosystems through the extension model. [#5847]
  • Custom service target endpoints in azd show: The azd show command now displays endpoints from custom service targets provided by extensions, improving visibility into services deployed through extension-provided targets and giving developers a complete picture of their deployed resources regardless of the deployment method. [#6074]
  • AccountService gRPC API: azd added a new AccountService gRPC API and server implementation, providing programmatic access to account-related operations. Extensions can now integrate with account management workflows more easily. [#5995]

🚀 Aspire

  • Aspire 13 support: azd now supports Aspire 13, ensuring compatibility with the latest Aspire features and improvements. Developers can now use new Aspire capabilities while maintaining seamless integration with azd workflows. [#6093]
  • .NET 10 support in auto-generated pipeline configs: Pipeline configuration now includes .NET 10 in auto-generated templates, ensuring your CI/CD pipelines are ready for the latest .NET runtime. [#6133]
  • Aspire Dashboard URL for App Service deployments: When deploying Aspire applications to Azure App Service, azd now displays the Aspire dashboard URL, making it easier to monitor and manage your deployed applications. [#5881]

🚀 Prepublish and postpublish hooks

  • Prepublish and postpublish hooks: The v1.0 azure.yaml schema now supports prepublish and postpublish hooks, giving developers more control over the publishing workflow. These hooks enable custom pre-processing and post-processing steps during container image publishing. [#5841]

🚀 Azure Container Apps

  • Runtime environment variable management: Container Apps deployments now support the env property in azure.yaml for managing runtime environment variables. This feature allows you to merge environment variables during deployment, providing better control over your container configuration without modifying infrastructure code. [#6154]

Bugs fixed

  • Fixed Container App resource existence checks that were returning incorrect results [#5939]
  • Improved error messaging when the --environment flag is missing [#5930]
  • Fixed hyperlink ANSI escape codes appearing in non-terminal output [#6015]
  • Resolved flickering progress bar during agent deployment operations [#6032]
  • Fixed lifetime management issues with extension container helpers [#6028]
  • Environment values are now properly reloaded before loading parameters in the bicep provider [#6063]
  • Fixed remote build functionality on the agent extension [#6098]
  • Fixed project-level events not being invoked from extensions [#5964]
  • Fixed concurrent map write panics in FrameworkService [#5985]
  • Prevented index out of range panics in progressLog.Write() [#6001]
  • Fixed duplicate event registration in workflow commands [#6012]

Other changes

  • Hook warnings are now logged to the log file instead of being displayed in the console, reducing console noise while maintaining visibility for troubleshooting [#6083]
  • azd completion support added to Visual Studio Code

New docs & blog

The Azure Developer CLI documentation received several important updates this month:

In addition to these documentation updates, we published a new blog post in our “Dev to production” series. Perhaps after reading our first and second posts you wondered: “So what? I use Azure Container Apps (ACA), and azd publishes a new image to Azure Container Registry every time!” Well, we solved that problem! Check out Azure Developer CLI: Azure Container Apps Dev-to-Prod Deployment with Layered Infrastructure, and yes, as the title calls out, you also see layered provisioning (new in 1.20.0) in action!

New templates

Community-driven templates help you get started faster, solve real-world scenarios, and showcase best practices for deploying solutions with Azure Developer CLI.

The Azure Developer CLI template gallery continues to grow with exciting new contributions from the community. Thank you!

Contributor acknowledgments

Shout out to all the community contributors who helped make this release possible!

  • HadwaAbdelhalem for the Copilot Studio with Azure AI Search template!
  • JayChase for the Semantic Kernel Function App template!
  • marnixcox for the Logic Apps AI Agents and Azure AI Foundry templates!
  • Jeff Martinez for the Python dynamic sessions template!
  • Bill DeRusha for the Label Studio template!

Thank you for your contributions to the azd community!

New to azd?

If you’re new to the Azure Developer CLI, azd is an open-source command-line tool that accelerates the time it takes to get your application from local development environment to Azure. azd provides best practice, developer-friendly commands that map to key stages in your workflow, whether you’re working in the terminal, your editor or CI/CD.

Get started with azd:

The post Azure Developer CLI (azd) Nov 2025 – Container Apps (GA), Layered Provisioning (Beta), Extension Framework, and Aspire 13 appeared first on Azure SDK Blog.


6 Must-Have MCP Servers (and How to Use Them)


The era of AI agents has arrived, and with it, a new standard for how they connect to tools: the Model Context Protocol (MCP). MCP unlocks powerful, flexible workflows by letting agents tap into external tools and systems. But with thousands of MCP servers (including remote ones) now available, it’s easy to ask: Where do I even start?

I’m Oleg Šelajev, and I lead Developer Relations for AI products at Docker. I’ve been hands-on with MCP servers since the very beginning. In this post, we’ll cover what I consider to be the best MCP servers for boosting developer productivity, along with a simple, secure way to discover and run them using the Docker MCP Catalog and Toolkit.

Let’s get started.

Top MCP servers for developer productivity

Before we dive into specific servers, let’s first cover what developers should consider before incorporating these tools into their workflows. What makes an MCP server worth using?

From our perspective, the best MCP servers (regardless of your use case) should:

  1. Come from verified, trusted sources to reduce MCP security risk
  2. Easily connect to existing tools and fit into your workflow.
  3. Have real productivity payoff (whether it’s note-taking, fetching web content, or keeping your AI agents honest with additional context from trusted libraries). 

With that in mind, here are six MCP servers we’d consider must-haves for developers looking to boost their everyday productivity.

1. Context7 – Enhancing AI coding accuracy

  • What it is: Context7 is a powerful MCP tool specifically designed to make AI agents better at coding.
  • How it’s used with Docker: Add the Context7 MCP server by clicking on the tile in Docker Toolkit or use the CLI command docker mcp server enable context7.
  • Why we use it: It solves the “AI hallucination” problem. When an agent is working on code, Context7 injects up-to-date, version-specific documentation and code examples directly into the prompt. This means the agent gets accurate information from the actual libraries we’re using, not from stale training data.

2. Obsidian – Smarter note-taking and project management

  • What it is: Obsidian is a powerful, local-first knowledge base and note-taking app.
  • How it’s used with Docker: While Obsidian itself is a desktop app, you can install the community plugin that enables its local REST API and then configure the MCP server to talk to that localhost endpoint.
  • Why we use it: It brings all the power of Obsidian to our AI assistants. Taking notes and accessing your prior memories has never been easier.
  • Here’s a video on how you can use it.

3. DuckDuckGo – Bringing search capabilities to coding agents 

  • What it is: This is an MCP server for the DuckDuckGo search engine.
  • How it’s used with Docker: Simply enable the DuckDuckGo server in the MCP Toolkit or CLI.
  • Why we use it: It provides a secure and straightforward way for our AI agents to perform web searches and fetch content from URLs. Coding assistants like Claude Code or Gemini CLI can already do this with built-in functionality, but if your entry point is something more custom, like an application with an AI component, giving it access to a reliable search engine is fantastic.

4. Docker Hub – Exploring the world’s largest artifact repository

  • What it is: An MCP server from Docker that allows your AI to fetch info from the largest artifact repository in the world! 
  • How it’s used with Docker: You need to provide the personal access token and username you use to connect to Docker Hub, but enabling this server in the MCP Toolkit is as easy as clicking a few buttons.
  • Why we use it: From working with Docker Hardened Images to checking the repositories and which versions of Docker images you can use, accessing Docker Hub gives AI the power to tap into the largest artifact repository with ease. 
  • Here’s a video of updating a Docker Hub repository’s info automatically from the GitHub repo.

The powerful duo: GitHub + Notion MCP servers – turning customer feedback into actionable dev tasks

Some tools are just better together. When it comes to empowering AI coding agents, GitHub and Notion make a particularly powerful pair. These two MCP servers unlock seamless access to your codebase and knowledge base, giving agents the ability to reason across both technical and product contexts.

Whether it’s triaging issues, scanning PRs, or turning customer feedback into dev tasks, this combo lets developer agents move fluidly between source code and team documentation, all with just a few simple setup steps in Docker’s MCP Toolkit.

Let’s break down how these two servers work, why we love them, and how you can start using them today.

5. GitHub-official

  • What it is: This refers to the official GitHub server, which allows AI agents to interact with GitHub repositories.
  • How it’s used with Docker: Enabled via the MCP Toolkit, this server connects your agent to GitHub for tasks like reading issues, checking PRs, or even writing code. Either use a personal access token or log in via OAuth. 
  • Why we use it: GitHub is an essential tool in almost any developer’s toolbelt. From surfing the issues in the repositories you work on to checking whether the errors you see are already documented in the repo, the GitHub MCP server gives AI coding agents incredible power!

6. Notion

  • What it is: Notion actually has two MCP servers in the catalog: a remote MCP server hosted by Notion itself, and a containerized version. In either case, if you’re using Notion, enabling AI to access your knowledge base has never been easier.
  • How it’s used with Docker: Enable the MCP server, provide an integration token, or log in via OAuth if you choose to use the remote server.
  • Why we use it: It provides an easy way to, for example, plow through the customer feedback and create issues for developers. In any case, plugging your knowledge base into AI leads to almost unlimited power. 

Here’s a video where you can see how Notion and GitHub MCP servers work perfectly together. 

Getting started with MCP servers made easy 

While MCP unlocks powerful new workflows, it also introduces new complexities and security risks. How do developers manage all these new MCP servers? How do they ensure they’re configured correctly and, most importantly, securely?

This focus on a trusted, secure foundation is precisely why partners like E2B chose the Docker MCP Catalog to be the provider for their secure AI agent sandboxes. The MCP Catalog now hosts more than 270 MCP servers, including popular remote servers.

The security risks aren’t theoretical; our own “MCP Horror Stories” blog series documents attacks that are already happening. The latest episode in the series, the “Local Host Breach” (CVE-2025-49596), details how vulnerabilities in this new ecosystem can lead to full system compromise. The MCP Toolkit directly combats these threats with features like container isolation, signed image verification from the catalog, and an intelligent gateway that can intercept and block malicious requests before they ever reach your tools.

This is where the Docker MCP Toolkit comes in. It provides a comprehensive solution that gives you:

  • Server Isolation: Each MCP server runs in its own sandboxed container, preventing a breach in one tool from compromising your host machine or other services.
  • Convenient Configuration: The Toolkit offers a central place to configure all your servers, manage tokens, and handle OAuth flows, dramatically simplifying setup and maintenance.
  • Advanced Security: It’s designed to overcome the most common and dangerous attacks against MCP.

Figure 1: Docker Desktop UI showing MCP Toolkit with enabled servers (Context7, DuckDuckGo, GitHub, Notion, Docker Hub).

Find MCP servers that work best for you

This list, from private knowledge bases like Obsidian to global repositories like Docker Hub and essential tools like GitHub, is just a glimpse of what’s possible when you securely and reliably connect your AI agents to the tools you use every day.

The Docker MCP Toolkit is your central hub for this new ecosystem. It provides the essential isolation, configuration, and security to experiment and build with confidence, knowing you’re protected from the various real threats.

This is just our list of favorites, but the ecosystem is growing every day.

We invite you to explore the full Docker MCP Catalog to discover all the available servers that can supercharge your AI workflows. Get started with the Docker MCP Toolkit today and take control of your AI tool interactions.

We also want to hear from you: explore the Docker MCP Catalog and tell us, what are your must-have MCP servers? What amazing tool combinations have you built? Let us know in our community channel!

Learn more
