
AI agents for your digital chores

Ryan welcomes Dhruv Batra, co-founder and chief scientist at Yutori, to explore the future of AI agents, how AI usage is changing the way people interact with advertisements and the web as a whole, and the challenges that proactive AI agents may face when being integrated into workflows and personal internet use.

Transition to Azure Functions V2 on Azure Container Apps


Introduction

Azure Functions on Azure Container Apps lets you run serverless functions in a flexible, scalable container environment. As the platform has evolved, there are two mechanisms for deploying images that use the Functions programming model as Azure Container Apps:

  • Functions V1 (Legacy Microsoft.Web RP Model)
  • Functions V2 (New Microsoft.App RP Model)

V2 is the latest and recommended approach for hosting Functions on Azure Container Apps. In this article, we will look at the differences between these approaches and how you can transition to the V2 model.

V1 Limitations (Legacy approach)

Function App deployments on V1 have limited functionality and a limited experience, so transitioning to V2 is encouraged. Below are the limitations of V1.

Troubleshooting Limitations

  • Direct container access and real-time log viewing are not supported.
  • Console access and live log output are restricted due to system-generated container configurations.
  • Low-level diagnostics are available via Log Analytics, while application-level logs can be accessed through Application Insights.

Portal UI experience limitations

  • Lacks experiences for multi-revision management, easy auth, health probes, and custom domains

Dapr Integration Challenges

  • Compatibility issues between Dapr and .NET isolated functions, particularly during build processes, due to dependency conflicts.

Functions V2 (Improved and Recommended)

Deployment with --kind=functionapp using the Microsoft.App RP reflects the newer deployment approach (Functions on Container Apps V2).

Simplified Resource Management internally: Instead of relying on a proxy Function App (as in the V1 model), V2 provisions a native Azure Container App resource directly. This shift eliminates the need for dual-resource management (the proxy plus the container app), simplifying operations by consolidating everything into a single, standalone resource.

Feature rich and fully native: As a result, V2 brings the native features of Azure Container Apps to images deployed with the Azure Functions programming model, including multi-revision management, easy auth, health probes, and custom domains.

Since V2 is a significant upgrade in experience and functionality, we recommend transitioning existing V1 deployments to V2.

Legacy Direct Function image deployment approach 

Some customers continue to deploy Function images as standard container apps (without kind=functionapp) using the Microsoft.App resource provider. While this method enables access to native Container Apps features, it comes with key limitations: 

  • Not officially supported.
  • No auto-scale rules — manual configuration required
  • No access to new V2 capabilities in roadmap (e.g., List Functions, Function Keys, Invocation Count) 

Recommendation: Transition to Functions on Container Apps V2, which offers a significantly improved experience and enhanced functionality. 

Checklist for transitioning to Functions V2 on Azure Container Apps

Below is the transition guide

1. Preparation

  • Identify your current deployment: Confirm you are running Functions V1 (Web RP) in Azure Container Apps
  • Locate your container image: Ensure you have access to the container image used in your V1 deployment.
  • Document configuration: Record all environment variables, secrets, storage account connections, and networking settings from your existing app.
  • Check Azure Container Apps environment quotas: Review memory, CPU, and instance limits for your Azure Container Apps environment. Request quota increases if needed.

2. Create the New V2 App

  • Create a new Container App with kind=functionapp:
    • Use the Azure Portal (“Optimize for Functions app” option)
    • Or use the CLI (az functionapp create) and specify your existing container image (see the example command after this list).
    • See the detailed guide for creating Functions on Container Apps V2.

 

  • No code changes required: You can use the same container image you used for V1; there is no need to modify your Functions code or rebuild your image.
  • Replicate configuration: Apply all environment variables, secrets, and settings from your previous deployment.
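
For reference, here is a minimal CLI sketch for this step. The resource names are placeholders, and flag names can vary between Azure CLI versions (older releases use --deployment-container-image-name instead of --image), so check az functionapp create --help:

az functionapp create \
  --name <NEW_FUNCTION_APP> \
  --resource-group <RESOURCE_GROUP> \
  --environment <CONTAINER_APPS_ENVIRONMENT> \
  --storage-account <STORAGE_ACCOUNT> \
  --image <REGISTRY>/<IMAGE>:<TAG>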

3. Validation

  • Test function triggers: Confirm all triggers (HTTP, Event Hub, Service Bus, etc.) work as expected.
  • Test all integrations: Validate connections to databases, storage, and other Azure services.

4. DNS and Custom Domain Updates (optional)

  • Review DNS names: The new V2 app will have a different default DNS name than your V1 app.
  • Update custom domains:
    • If you use a custom domain (e.g., api.yourcompany.com), update your DNS records (CNAME or A record) to point to the new V2 app’s endpoint after validation.
    • Re-bind or update SSL/TLS certificates as needed.
  • Notify Users and Stakeholders: Inform anyone who accesses the app directly about the DNS or endpoint change.
  • Test endpoint: Ensure the new DNS or custom domain correctly routes traffic to the V2 app.

5. Cutover

  • Switch production traffic: Once validated, update DNS, endpoints, or routing to direct traffic to the new V2 app.
  • Monitor for issues: Closely monitor the new deployment for errors, latency, or scaling anomalies.
  • Communicate with stakeholders: Notify your team and users about the transition and any expected changes.

6. Cleanup

  • Remove the old V1 app: Delete the previous V1 deployment to avoid duplication and unnecessary costs.
  • Update documentation: Record the new deployment details, configuration, and any lessons learned

Feedback and Support

We’re continuously improving Functions on Container Apps V2 and welcome your input. 

  • Share Feedback: Let us know what’s working well and what could be better.
  • Submit an issue or a feature request to the Azure Container Apps GitHub repo
  • Support Channels 

Your feedback helps shape the future of serverless on containers—thank you for being part of the journey! 


How .NET Garbage Collector works (and when you should care)


This blog post is originally published on https://blog.elmah.io/how-net-garbage-collector-works-and-when-you-should-care/

In the world of .NET, memory management is an important aspect of any application. Fortunately, you don't have to shoulder this immense task yourself: .NET handles it through the superpower of the Garbage Collector (GC). The GC is an engine that keeps your app fast, responsive, and resource-efficient. Although, on the surface, you don't need to know much about what's going on beneath your brackets, it is better to understand how memory management works in your application. In this blog, we will discuss the GC and explore ways to better harness its capabilities.


What is a Garbage Collector in .NET?

The GC is a component of the Common Language Runtime (CLR) that performs automatic memory management. The GC allocates memory to a managed program and releases it when it is no longer needed. The automatic management relieves developers from writing code for memory management tasks and ensures unused objects do not consume memory indefinitely.

Any application uses memory to store data and objects. Operations such as variable declaration, data fetching, file streaming, and buffer initialization all allocate memory, and these operations can be frequent in any application. However, once these objects are no longer in use, we need to reclaim their memory, because any client or server has limited resources. If memory is not freed, unused objects accumulate, leading to a memory leak. Without a GC, developers would have to handle this manually by writing memory-management code. That is a tiresome process, because applications constantly create variables and other objects, so a big chunk of the program would end up dealing with deallocation. Besides the time it consumes, manual memory release often requires pointer tweaking, and any mishap with pointers can crash the application. You may also face a double-free situation, where releasing the same object multiple times causes undefined behaviour.

This is where the GC comes to the rescue and handles the entire process of allocation and deallocation automatically.

What is the managed heap?

The managed heap is a segment of memory used to store and manage objects allocated by the CLR for a process. It's a key part of .NET's automatic memory management system, which is handled by the GC. The managed heap is the GC's working ground.

Phases in Garbage Collection

1. Marking Phase

In the marking phase, the GC scans memory to identify live objects that are still reachable (referenced) from your program's active code. The GC then lists all live objects and marks unreferenced objects as garbage.

2. Relocating Phase

Now the GC updates the references of live objects, so the pointers will remain valid after all these objects are moved in memory later.

3. Compacting Phase

Here, the GC reclaims heap memory from dead objects and compacts live objects together to eliminate gaps in the heap. Compacting live objects reduces fragmentation and makes new memory allocations faster.

Heap generations in Garbage Collection

The GC in .NET follows a generational approach that organizes objects in the managed heap based on their lifespan. The GC uses this division because compacting a portion of the managed heap is faster than compacting the entire heap. Also, most of the garbage consists of short-lived objects such as local variables and lists, so a generational approach is practical too.

Generation 0

Generation 0 contains short-lived objects such as local variables, strings inside loops, etc. Since most objects in an application are short-lived, they are reclaimed by garbage collection in Generation 0 and don't survive to the next generation. If the application needs to create new objects while Generation 0 is full, the GC performs a collection to free address space for the new objects. Generation 0 collections run frequently, reclaiming memory from the unused objects of executed methods.

Generation 1

Generation 1 is the buffer zone between Generations 0 and 2. Objects that survive a Generation 0 collection are promoted to this generation. Typical examples are temporary but reused data structures shared across multiple methods, which live longer than a single method call but shorter than the application itself. The usual lifespan of Generation 1 objects ranges from seconds to minutes. Because these objects don't become garbage as quickly as Generation 0 objects, the GC collects Generation 1 less often, maintaining a balance between performance and memory use.

Generation 2

Generation 2 contains long-lived data whose lifespan can be as long as the application's lifetime. Survivors of multiple collections end up in Generation 2, such as singleton instances, static collections, application caches, large objects, and Dependency Injection container services.
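
To see generational promotion in action, here is a small illustrative snippet (not from the original post; exact generation numbers can vary by runtime and GC mode, and forcing collections like this is for demonstration only):

using System;

var data = new byte[10];                               // small object: starts in Generation 0
Console.WriteLine(GC.GetGeneration(data));             // typically 0

GC.Collect();                                          // demo only: force a collection
Console.WriteLine(GC.GetGeneration(data));             // survivor is promoted: typically 1

GC.Collect();
Console.WriteLine(GC.GetGeneration(data));             // typically 2 after surviving again

// How many collections each generation has seen so far
Console.WriteLine($"Gen0={GC.CollectionCount(0)}, Gen1={GC.CollectionCount(1)}, Gen2={GC.CollectionCount(2)}");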

Unmanaged resources

Most of your application relies on the GC for memory deallocation. However, unmanaged resources require explicit cleanup. The most prominent example of an unmanaged resource is an object that wraps an operating system resource, such as a file handle, window handle, or network connection. Objects like FileStream, StreamReader, StreamWriter, SqlConnection, SqlCommand, NetworkStream, and SmtpClient encapsulate an unmanaged resource, and the GC doesn't have specific knowledge of how to clean it up. We have to wrap them in a using block, which calls Dispose() to release their unmanaged handles properly. You can also call Dispose() explicitly where needed.

An example of a using block is below:

using (var writer = new StreamWriter("mytext.txt"))
{
    writer.Write("Hello!");
} // Dispose() automatically called here

.NET Garbage Collector best practices to boost app performance

Limit Large Object Heap

In .NET, any object larger than 85,000 bytes is allocated on the Large Object Heap (LOH) instead of the normal managed heap. The GC does not compact the LOH by default because copying large objects imposes a performance penalty, which can lead to fragmentation and wasted space. LOH cleanup is also expensive, since it happens during Generation 2 collections. Large JSON serialization results, large collections, data buffers, and image byte arrays are common examples of LOH allocations. Try to limit the use of such objects, and when you can't avoid them entirely, reuse them rather than creating them repeatedly.

For example:

// ❌ BAD: Creates new 1MB buffer each call
void Process()
{
    byte[] buffer = new byte[1024 * 1024];
    // use buffer
}

// ✅ GOOD: Reuse buffer
static byte[] sharedBuffer = new byte[1024 * 1024];
void Process()
{
    // reuse sharedBuffer safely
}
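
Another option for reusable buffers, not covered in the original post but a common pattern, is renting from System.Buffers.ArrayPool<T> rather than keeping your own static array (a sketch; the 1 MB size just mirrors the example above):

// ♻️ Alternative: rent and return pooled buffers (requires using System.Buffers;)
void ProcessPooled()
{
    byte[] buffer = ArrayPool<byte>.Shared.Rent(1024 * 1024); // may return a larger, non-zeroed array
    try
    {
        // use buffer
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(buffer); // hand it back so other callers can reuse it
    }
}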

Minimize unnecessary object allocations

Be careful with short-lived objects as well. Although they are collected in Generation 0, they still add to the collector's workload. Avoid creating objects repeatedly inside frequently called methods; instead, reuse them when possible.

// ❌ Avoid
for (int i = 0; i < 10000; i++)
{
    var sb = new StringBuilder();
}

// ✅ Better
var sb = new StringBuilder();
for (int i = 0; i < 10000; i++)
{
    sb.Clear();
    // ...reuse sb here
}

Use value types (structs) wisely

For small, short-lived data, opt for value types, as they live on the stack and are auto-cleaned. That way, you can save GC cycles and improve the application's speed. To know more about value types, check out Exploring C# Records and Their Use Cases.
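
As a small illustration (a sketch, not from the original post): an array of a struct type is a single contiguous allocation, while an array of an equivalent class ends up as one heap object per element that the GC must track and collect.

public readonly struct PointStruct
{
    public PointStruct(double x, double y) { X = x; Y = y; }
    public double X { get; }
    public double Y { get; }
}

public sealed class PointClass
{
    public PointClass(double x, double y) { X = x; Y = y; }
    public double X { get; }
    public double Y { get; }
}

// Inside a method:
var points = new PointStruct[1_000_000];   // one allocation holding all the values
var boxes = new PointClass[1_000_000];     // one array plus a million separate heap objects once filled
for (int i = 0; i < boxes.Length; i++)
    boxes[i] = new PointClass(i, i);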

Avoid long-lived object references

Long-lived references are promoted to the Generation 2 heap. That means they occupy memory longer, slowing down GC and increasing overall memory usage. Remove references (set to null, clear collections) once objects are no longer needed.

As in the code:

// ❌ Bad: Keeping large object references alive unintentionally
static List<byte[]> cache = new List<byte[]>();

void LoadData()
{
    cache.Add(new byte[1024 * 1024]); // never cleared
}

// ✅ Better: Clear or dereference when not needed
cache.Clear();

Cache Intelligently

While caching can offload work from the application and improve performance, overusing caches can fill the heap with long-lived Generation 2 objects. Only cache data where necessary, and if you use MemoryCache, fine-tune its expiration and size limits.
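
For example, with Microsoft.Extensions.Caching.Memory (an illustrative sketch; the size units and limits are arbitrary, and once SizeLimit is set every entry must declare a size):

using System;
using Microsoft.Extensions.Caching.Memory;

var cache = new MemoryCache(new MemoryCacheOptions
{
    SizeLimit = 100 // arbitrary "size units" shared by all entries
});

cache.Set("report:today", BuildReport(), new MemoryCacheEntryOptions()
    .SetSize(1)                                        // this entry's share of SizeLimit
    .SetAbsoluteExpiration(TimeSpan.FromMinutes(10))   // hard upper bound on lifetime
    .SetSlidingExpiration(TimeSpan.FromMinutes(2)));   // evict earlier if unused

string BuildReport() => "expensive-to-compute data";   // hypothetical placeholder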

Avoid memory leaks (event and static references)

Event handlers that are never unsubscribed, or long-lived static lists, can keep objects alive in the Generation 2 heap for a long time.

Example:

// ❌ Potential leak
publisher.Event += subscriber.Handler;

// ✅ Unsubscribe when done
publisher.Event -= subscriber.Handler;

Conclusion

The .NET GC is not just a memory sweeper but an unsung hero working round the clock to reclaim heap space and keep your applications responsive and efficient. It identifies dead objects in memory and releases their memory to prevent resources from being overburdened and the heap from fragmenting. In this post, we walked through the phases of the GC and learned about unmanaged resources. Finally, we went through some tips to help the GC work better. Garbage collection is a huge area, and we have only scratched the surface in this post.




Leveling up your deployment pipelines


Our Platform Engineering Pulse report gathered a list of features organizations commonly add to their internal developer platforms. We grouped sample platforms into common feature collections and found that platform teams follow similar patterns when implementing deployment pipelines.

They begin by creating an end-to-end deployment pipeline that automates the flow of changes to production and establishes monitoring. Next, they add security concerns into the pipeline to scan for vulnerabilities and manage secrets. This eventually leads to a DevOps pipeline, which adds documentation and additional automation.

You can use this pragmatic evolution of CI/CD pipelines as a benchmark and a source of inspiration for platform teams. It’s like a natural maturity model that has been discovered through practice, rather than one that has been designed upfront.

Stage 1: Deployment pipeline

The initial concern for platform teams is to establish a complete deployment pipeline, allowing changes to flow to production with high levels of automation. Although the goal is a complete yet minimal CI/CD process, it’s reassuring to see that both test automation and monitoring are frequently present at this early stage.

Early deployment pipelines involve integrating several tools, but these tools are designed to work together, making the integration quick and easy. Build servers, artifact management tools, deployment tools, and monitoring tools have low barriers to entry and lightweight touchpoints, so they feel very unified, even when provided by a mix of different tool vendors or open-source options.

In fact, when teams attempt to simplify the toolchain by using a single tool for everything, they often end up with more complexity, as tool separation enforces a good pipeline architecture.

Deployment pipeline with stages for build, test, artifact management, deployment, and monitoring

Builds

Building an application from source code involves compiling code, linking libraries, and bundling resources so you can run the software on a target platform. While this may not be the most challenging task for software teams, build processes can become complex and require troubleshooting that takes time away from feature development.

When a team rarely changes its build process, it tends to be less familiar with the tools it uses. It may not be aware of features that could improve build performance, such as dependency caching or parallelization.

Test automation

To shorten feedback loops, it is essential to undertake all types of testing continuously. This means you need fast and reliable test automation suites. You should cover functional, security, and performance tests within your deployment pipeline.

You must also consider how to manage test data as part of your test automation strategy. The ability to set up data in a known state will help you make tests less flaky. Test automation enables developers to identify issues early, reduces team burnout, and enhances software stability.

Artifact management

Your Continuous Integration (CI) process creates a validated build artifact that should be the canonical representation of the software version. An artifact repository ensures only one artifact exists for each version and allows tools to retrieve that version when needed.

Deployment automation

Even at the easier end of the complexity scale, deployments are risky and often stressful. Copying the artifact, updating the configuration, migrating the database, and performing related tasks present numerous opportunities for mistakes or unexpected outcomes.

When teams have more complex deployments or need to deploy at scale, the risk, impact, and stress increase.

Monitoring and observability

While test automation covers a suite of expected scenarios, monitoring and observability help you expand your view to the entirety of your real-world software use. Monitoring implementations tend to start with resource usage metrics, but mature into measuring software from the customer and business perspective.

The ability to view information-rich logs can help you understand how faults occur, allowing you to design a more robust system.

Stage 2: Secure pipeline

The natural next step for platform teams is to integrate security directly into the deployment pipeline. At this stage, teams add security scanning to automatically check for code weaknesses and vulnerable dependencies, alongside secrets management to consolidate how credentials and API keys are stored and rotated.

This shift is significant because security measures are now addressed earlier in the pipeline, reducing the risk of incidents in production. Rather than treating security as a separate concern that happens after development, it becomes part of the continuous feedback loop.

Security integration at this stage typically involves adding new tools to the existing pipeline, with well-defined touchpoints and clear interfaces. Security scanners and secrets management tools are designed to integrate with CI/CD systems, making the additions feel like natural extensions of the deployment pipeline rather than disruptive changes.

Two stages have been added to the pipeline for security scanning and secrets management


Security scanning

While everyone should take responsibility for software security, having automated scanning available within a deployment pipeline can help ensure security isn’t forgotten or delayed. Automated scanning can provide developers with rapid feedback.

You can supplement automated scanning with security reviews and close collaboration with information security teams.

Secrets management

Most software systems must connect securely to data stores, APIs, and other services. The ability to store secrets in a single location prevents the ripple effect when a secret is rotated. Instead of updating many tools with a new API key, you can manage the change centrally with a secret store.

When you deploy an application, you usually have to apply the correct secrets based on the environment or other characteristics of the deployment target.

Stage 3: DevOps pipeline

The DevOps pipeline represents a shift from building deployment infrastructure to accelerating developer productivity. At this stage, platform teams add documentation capabilities, infrastructure automation, and one-click setup for new projects. These features focus on removing friction from the developer experience.

The impact of this stage is felt most strongly at the start of new projects and when onboarding new team members. Instead of spending days or weeks on boilerplate setup, teams get a walking skeleton that fast-forwards them directly to writing their first unit test.

While the earlier stages focused on moving code through the pipeline efficiently and securely, this stage is about making the pipeline itself easy to replicate and understand. The automation added here helps teams maintain consistency across projects while giving developers the freedom to focus on features rather than configuration.

Three more stages have been added for documentation, one-click setup, and infrastructure automation

Documentation

To provide documentation as a service to teams, you may either supply a platform for storing and finding documentation or use automation to extract documentation from APIs, creating a service directory for your organization.

For documentation to be successful, it must be clear, well-organized, up-to-date, and easily accessible.

One-click setup for new projects

When setting up a new project, several boilerplate tasks are required to configure a source code repository, establish a project template, configure deployment pipelines, and set up associated tools. Teams often have established standards, but manual setup means projects unintentionally drift from the target setup.

One-click automation helps teams set up a walking skeleton with sample test projects, builds, and deployment automation. This ensures a consistent baseline and speeds up the time to start writing meaningful code.

Infrastructure automation

Traditional ClickOps infrastructure is hand-crafted and often drifts from the intended configuration over time. Environments may be set up differently, which means problems surface only in one environment and not another. Equally, two servers in the same environment with the same intended purpose may be configured differently, making troubleshooting problems more challenging.

Infrastructure automation solves these problems, making it easier to create new environments, spin up and tear down ephemeral (temporary) environments, and recover from major faults.

Evolving your platform’s pipelines

Whether you choose to introduce features according to this pattern or decide to approach things differently, it is advisable to take an evolutionary approach. Delivering a working solution that covers the flow of changes from commit to production brings early value. The evolution of the platform enhances the flow and incorporates broader concerns.

Your organization may have additional compliance or regulatory requirements that could become part of a “compliant pipeline”, or you may have a heavyweight change approval process you could streamline with an “approved pipeline”.

Regardless of the requirements you choose to bring under the platform’s capabilities, you’ll be more successful if you deliver a working pipeline and evolve it to add additional features.

Happy deployments!


Deprecating support for TLS 1.0 and 1.1


Transport Layer Security (TLS) 1.0 and 1.1 are legacy cryptographic protocols that first appeared in 1999 and 2006, respectively. These protocols contain known security vulnerabilities, and more secure versions have superseded them, particularly TLS 1.2 (2008) and TLS 1.3 (2018).

Microsoft has progressively phased out support for TLS 1.0 and 1.1 across Windows Server operating systems:

  • Windows Server 2019 and later: Disables TLS 1.0 and 1.1 by default
  • Windows Server 2016: Allows you to disable TLS 1.0 and 1.1 via registry settings
  • Windows Server 2012 R2: Requires updates to support TLS 1.2 as the default protocol
  • Windows Server 2012: Requires specific updates to support TLS 1.2

We’re following Microsoft’s recommendation by deferring TLS version selection to the Operating System. This approach prevents systems that don’t enable legacy protocols by default from using them.

Impact on Octopus Cloud customers

We’re removing support for these legacy protocols on Octopus Cloud to enhance security. This change will affect Tentacles on older operating systems that don’t support TLS 1.2+.

Tentacles affected by this change include those running on:

These Tentacles will need TLS 1.2+ support to maintain secure connections and continue deployments.

This will also affect newer Operating Systems if you have explicitly disabled TLS 1.2 or 1.3. If affected, you’ll need to re-enable TLS 1.2 or 1.3.

Impact on self-hosted customers using Linux Docker

Our upgrade to Debian 12 in January 2026 will also affect customers using our official Linux Docker image. Like Octopus Cloud, your Tentacles will need TLS 1.2+ support to connect to your Octopus Server.

Impact on self-hosted customers using Windows

Self-hosted customers running Octopus Server on Windows won’t see direct changes to their server. However, your Operating System configuration determines your TLS version availability, so you may already use TLS 1.2+ only.

Most Windows Server 2016+ installations already use TLS 1.2+ by default, so you’re likely already prepared.

Customer support and monitoring

For Octopus Cloud customers: We’re monitoring Octopus Cloud for usages of TLS 1.0 and 1.1, and will reach out to affected customers.

For self-hosted customers: To ensure you’re prepared, please review your environment for TLS 1.0/1.1 dependencies before the January 2026 timeline. This step will help you identify and address any compatibility requirements early.

If you believe your organization may be affected, or if you have questions about TLS protocol support, please don’t hesitate to contact our support team for assistance.

What you can do

To keep your systems connected, you have several options:

Recommended approach for all customers:

  • Upgrade your operating system to a supported version (Windows Server 2016 or later, recent Linux distributions)
  • Update your Tentacle to the latest version, which includes enhanced TLS support
  • Review external integrations to ensure they support TLS 1.2 or higher

Alternative options for specific systems:

How to check your current setup:

  • External service support: Most modern services already support TLS 1.2+, but you can test connections (see the OpenSSL sketch below) or contact service providers to confirm
  • Operating System TLS: Windows Server 2016+ and modern Linux distributions enable TLS 1.2+ by default. Older operating systems, such as Windows Server 2012/2012 R2, may require security updates to enable TLS 1.2. Since Tentacle uses your OS’s TLS capabilities, ensuring your OS supports TLS 1.2+ is the key step for compatibility
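
If you want a quick hands-on check, one option is OpenSSL's s_client (an illustrative sketch; replace the hostname with your own server, and note that the -tls1/-tls1_1 options only exist if your OpenSSL build still includes those legacy protocols):

# Should succeed when the endpoint can negotiate TLS 1.2
openssl s_client -connect your-octopus.example.com:443 -tls1_2 </dev/null

# Should fail once TLS 1.0 is disabled on the endpoint
openssl s_client -connect your-octopus.example.com:443 -tls1 </dev/null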

Deprecation timeline

Period                   | Octopus Cloud                                                                           | Self-Hosted Docker
October - November 2025  | We'll monitor for usages of TLS 1.0/1.1                                                 | Customers should assess their environments
Mid-November 2025        | We'll disable TLS 1.0/1.1 on Octopus Cloud (with accommodations for affected customers) | No immediate change
December 2025            | We'll continue to track and help affected customers                                     | Customers should continue preparation
January 2026             | Octopus Cloud will use TLS 1.2+ only                                                    | We'll upgrade the official Docker image to Debian 12, supporting TLS 1.2+ only

Note: We may adjust this timeline based on customer impact analysis and feedback. We’re committed to providing adequate notice and support throughout the transition process.

Summary

Removing support for these outdated protocols brings us in line with modern security standards. Most customers won’t be affected, but if you’re running older systems, now’s the time to plan your upgrade.

Key takeaways:

  • Octopus Cloud customers will see us disable TLS 1.0/1.1 from mid-November 2025, with complete removal by January 2026
  • Self-hosted Docker customers will experience changes when we upgrade the official image to Debian 12 in January 2026
  • Self-hosted Windows customers will continue to work as before

The best fix is upgrading to modern operating systems with built-in TLS 1.2+ support. If you need more time, apply security patches and enable TLS 1.2 as a temporary measure.

Our support team is here to help throughout this transition. If you have concerns about your environment or need help with remediation, please reach out early so we can work together to ensure a smooth migration.

Happy deployments!


Docker Model Runner on the new NVIDIA DGX Spark: a new paradigm for developing AI locally


We’re thrilled to bring NVIDIA DGX™ Spark support to Docker Model Runner. The new NVIDIA DGX Spark delivers incredible performance, and Docker Model Runner makes it accessible. With Model Runner, you can easily run and iterate on larger models right on your local machine, using the same intuitive Docker experience you already trust.

In this post, we’ll show how DGX Spark and Docker Model Runner work together to make local model development faster and simpler, covering the unboxing experience, how to set up Model Runner, and how to use it in real-world developer workflows.

What is NVIDIA DGX Spark

NVIDIA DGX Spark is the newest member of the DGX family: a compact, workstation-class AI system, powered by the Grace Blackwell GB10 Superchip, that delivers incredible performance for local model development. Designed for researchers and developers, it makes prototyping, fine-tuning, and serving large models fast and effortless, all without relying on the cloud.

Here at Docker, we were fortunate to get a preproduction version of DGX Spark. And yes, it’s every bit as impressive in person as it looks in NVIDIA’s launch materials.

Why Run Local AI Models and How Docker Model Runner and NVIDIA DGX Spark Make It Easy 

Many of us at Docker and across the broader developer community are experimenting with local AI models. Running locally has clear advantages:

  • Data privacy and control: no external API calls; everything stays on your machine
  • Offline availability: work from anywhere, even when you’re disconnected
  •  Ease of customization: experiment with prompts, adapters, or fine-tuned variants without relying on remote infrastructure

But there are also familiar tradeoffs:

  • Local GPUs and memory can be limiting for large models
  • Setting up CUDA, runtimes, and dependencies often eats time
  • Managing security and isolation for AI workloads can be complex

This is where DGX Spark and Docker Model Runner (DMR) shine. DMR provides an easy and secure way to run AI models in a sandboxed environment, fully integrated with Docker Desktop or Docker Engine. When combined with the DGX Spark’s NVIDIA AI software stack and large 128GB unified memory, you get the best of both worlds: plug-and-play GPU acceleration and Docker-level simplicity.

Unboxing NVIDIA DGX Spark

The device arrived well-packaged, sleek, and surprisingly small, resembling more a mini-workstation than a server.

Setup was refreshingly straightforward: plug in power, network, and peripherals, then boot into NVIDIA DGX OS, which includes NVIDIA drivers, CUDA, and AI software stack pre-installed.


Once on the network, enabling SSH access makes it easy to integrate the Spark into your existing workflow.

This way, the DGX Spark becomes an AI co-processor for your everyday development environment, augmenting, not replacing, your primary machine.

Getting Started with Docker Model Runner on NVIDIA DGX Spark

Installing Docker Model Runner on the DGX Spark is simple and can be done in a matter of minutes.

1. Verify Docker CE is Installed

DGX OS comes with Docker Engine (CE) preinstalled. Confirm you have it:

docker version

If it’s missing or outdated, install following the regular Ubuntu installation instructions.

2. Install the Docker Model CLI Plugin

The Model Runner CLI is distributed as a Debian package via Docker’s apt repository. Once the repository is configured (see linked instructions above) install via the following commands:

sudo apt-get update
sudo apt-get install docker-model-plugin

Or use Docker’s handy installation script:

curl -fsSL https://get.docker.com | sudo bash

You can confirm it’s installed with:

docker model version

3. Pull and Run a Model

Now that the plugin is installed, let’s pull a model from the Docker Hub AI Catalog. For example, the Qwen 3 Coder model:

docker model pull ai/qwen3-coder

The Model Runner container will automatically expose an OpenAI-compatible endpoint at:

http://localhost:12434/engines/v1

You can verify it’s live with a quick test:

# Test via API
curl http://localhost:12434/engines/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"ai/qwen3-coder","messages":[{"role":"user","content":"Hello!"}]}'

# Or via CLI
docker model run ai/qwen3-coder

GPUs are allocated to the Model Runner container via nvidia-container-runtime and the Model Runner will take advantage of any available GPUs automatically. To see GPU usage:

nvidia-smi

4. Architecture Overview

Here’s what’s happening under the hood:

[ DGX Spark Hardware (GPU + Grace CPU) ]
                  │
       (NVIDIA Container Runtime)
                  │
         [ Docker Engine (CE) ]
                  │
   [ Docker Model Runner Container ]
                  │
      OpenAI-compatible API :12434

The NVIDIA Container Runtime bridges the NVIDIA GB10 Grace Blackwell Superchip drivers and Docker Engine, so containers can access CUDA directly. Docker Model Runner then runs inside its own container, managing the model lifecycle and providing the standard OpenAI API endpoint. (For more info on Model Runner architecture, see this blog).

From a developer’s perspective, you interact with models much like any other Dockerized service: docker model pull, list, inspect, and run all work out of the box.
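
For example, a couple of illustrative commands (output will vary with what you’ve pulled):

docker model list                      # models already pulled to this machine
docker model inspect ai/qwen3-coder    # details for a specific model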

Using Local Models in Your Daily Workflows

If you’re using a laptop or desktop as your primary machine, the DGX Spark can act as your remote model host. With a few SSH tunnels, you can both access the Model Runner API and monitor GPU utilization via the DGX dashboard, all from your local workstation.

1. Forward the DMR Port (for Model Access)

To access the DGX Spark via SSH, first set up an SSH server:


sudo apt install openssh-server
sudo systemctl enable --now ssh

Run the following command to access Model Runner via your local machine. Replace user with the username you configured when you first booted the DGX Spark and replace dgx-spark.local with the IP address of the DGX Spark on your local network or a hostname configured in /etc/hosts. 

ssh -N -L localhost:12435:localhost:12434 user@dgx-spark.local


This forwards the Model Runner API from the DGX Spark to your local machine.
Now, in your IDE, CLI tool, or app that expects an OpenAI-compatible API, just point it to:

http://localhost:12435/engines/v1

Set the model name (e.g. ai/qwen3-coder) and you’re ready to use local inference seamlessly.
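
A quick way to confirm the tunnel works is to repeat the earlier API test, just against the forwarded port:

curl http://localhost:12435/engines/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"ai/qwen3-coder","messages":[{"role":"user","content":"Hello from my laptop!"}]}'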

2. Forward the DGX Dashboard Port (for Monitoring)

The DGX Spark exposes a lightweight browser dashboard showing real-time GPU, memory, and thermal stats, usually served locally at:

http://localhost:11000

You can forward it through the same SSH session or a separate one:

ssh -N -L localhost:11000:localhost:11000 user@dgx-spark.local

Then open http://localhost:11000 in your browser on your main workstation to monitor the DGX Spark performance while running your models.




This combination makes the DGX Spark feel like a remote, GPU-powered extension of your development environment. Your IDE or tools still live on your laptop, while model execution and resource-heavy workloads happen securely on the Spark.

Example application: Configuring Opencode with Qwen3-Coder


Let’s make this concrete.

Suppose you use OpenCode, an open-source, terminal-based AI coding agent.

Once your DGX Spark is running Docker Model Runner with ai/qwen3-coder pulled and the port is forwarded, you can configure OpenCode by adding the following to ~/.config/opencode/opencode.json

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "dmr": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Docker Model Runner",
      "options": {
        "baseURL": "http://localhost:12435/engines/v1"   // DMR’s OpenAI-compatible base
      },
      "models": {
        "ai/qwen3-coder": { "name": "Qwen3 Coder" }
      }
    }
  },
  "model": "ai/qwen3-coder"
}


Now run opencode and select Qwen3 Coder with the /models command:



That’s it! Completions and chat requests will be routed through Docker Model Runner on your DGX Spark, meaning Qwen3-Coder now powers your agentic development experience locally.



You can verify that the model is running by opening http://localhost:11000 (the DGX dashboard) to watch GPU utilization in real time while coding.
This setup lets you:

  • Keep your laptop light while leveraging the DGX Spark GPUs
  • Experiment with custom or fine-tuned models through DMR
  • Stay fully within your local environment for privacy and cost-control

Summary

Running Docker Model Runner on the NVIDIA DGX Spark makes it remarkably easy to turn powerful local hardware into a seamless extension of your everyday Docker workflow.

  • You install one plugin and use familiar Docker commands (docker model pull, docker model run).
  • You get full GPU acceleration through NVIDIA’s container runtime.
  • You can forward both the model API and monitoring dashboard to your main workstation for effortless development and visibility.

This setup bridges the gap between developer productivity and AI infrastructure, giving you the speed, privacy, and flexibility of local execution with the reliability and simplicity Docker provides.

As local model workloads continue to grow, the DGX Spark + Docker Model Runner combo represents a practical, developer-friendly way to bring serious AI compute to your desk — no data center or cloud dependency required.

Learn more:

  • Read the official announcement of DGX Spark launch on NVIDIA newsroom
  • Check out the Docker Model Runner General Availability announcement
  • Visit our Model Runner GitHub repo. Docker Model Runner is open-source, and we welcome collaboration and contributions from the community! Star, fork and contribute.
