
Kubernetes v1.35: Mutable PersistentVolume Node Affinity (alpha)


The PersistentVolume node affinity API dates back to Kubernetes v1.10. It is widely used to express that volumes may not be equally accessible by all nodes in the cluster. This field was previously immutable; as of Kubernetes v1.35 it is mutable (alpha). This change opens the door to more flexible online volume management.

Why make node affinity mutable?

This raises an obvious question: why make node affinity mutable now? While stateless workloads like Deployments can be changed freely and the changes will be rolled out automatically by re-creating every Pod, PersistentVolumes (PVs) are stateful and cannot be re-created easily without losing data.

However, storage providers evolve and storage requirements change. Most notably, multiple providers now offer regional disks. Some of them even support live migration from zonal to regional disks without disrupting workloads. This change can be expressed through the VolumeAttributesClass API, which recently graduated to GA in Kubernetes v1.34. However, even if the volume is migrated to regional storage, Kubernetes still prevents scheduling Pods to other zones because of the node affinity recorded in the PV object. In this case, you may want to change the PV node affinity from:

spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - us-east1-b

to:

spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/region
          operator: In
          values:
          - us-east1

As another example, providers sometimes offer new generations of disks, and new disks cannot always be attached to older nodes in the cluster. This accessibility constraint can also be expressed through PV node affinity, which ensures Pods are scheduled to the right nodes. But when the disk is upgraded, new Pods using it could still be scheduled to older nodes. To prevent this, you may want to change the PV node affinity from:

spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: provider.com/disktype.gen1
          operator: In
          values:
          - available

to:

spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: provider.com/disktype.gen2
          operator: In
          values:
          - available

So, node affinity is now mutable: a first step toward more flexible online volume management. While it is a simple change that removes one validation from the API server, there is still a long way to go to integrate it well with the rest of the Kubernetes ecosystem.

Try it out

This feature is for you if you are a Kubernetes cluster administrator, your storage provider offers online updates that you want to take advantage of, and those updates can affect which nodes can access the volume.

Note that changing PV node affinity alone will not actually change the accessibility of the underlying volume. Before using this feature, you must first update the underlying volume in the storage provider, and understand which nodes can access the volume after the update. You can then enable this feature and keep the PV node affinity in sync.

Currently, this feature is in alpha. It is disabled by default and may be subject to change. To try it out, enable the MutablePVNodeAffinity feature gate on the API server; you can then edit the spec.nodeAffinity field of a PV. Typically only administrators can edit PVs, so make sure you have the right RBAC permissions.
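
For example, here is a minimal sketch: enable the gate on the kube-apiserver, then update the PV's node affinity. The PV name my-pv and the region value are placeholders, and how you pass the flag depends on how your control plane is managed.

# Enable the alpha feature gate on the kube-apiserver:
#   --feature-gates=MutablePVNodeAffinity=true

# Then edit the PV directly...
kubectl edit pv my-pv

# ...or patch spec.nodeAffinity in place:
kubectl patch pv my-pv --type=merge -p '{
  "spec": {
    "nodeAffinity": {
      "required": {
        "nodeSelectorTerms": [
          {
            "matchExpressions": [
              {
                "key": "topology.kubernetes.io/region",
                "operator": "In",
                "values": ["us-east1"]
              }
            ]
          }
        ]
      }
    }
  }
}'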

Race condition between updating and scheduling

There are only a few factors outside of a Pod that can affect its scheduling decision, and PV node affinity is one of them. Relaxing node affinity so that more nodes can access the volume is fine, but there is a race condition when you tighten it: it is unclear when the scheduler will see the modified PV in its cache, so there is a small window where the scheduler may place a Pod on a node that can no longer access the volume. In that case, the Pod will be stuck in the ContainerCreating state.

One mitigation currently under discussion is for the kubelet to fail Pod startup if the PersistentVolume’s node affinity is violated. This has not landed yet, so if you are trying this out now, watch subsequent Pods that use the updated PV and make sure they are scheduled onto nodes that can access the volume. If you update a PV and immediately start new Pods in a script, it may not work as intended.
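
For example, using hypothetical names (a PVC called data-pvc in the default namespace), you can compare the PV's current affinity with where the Pods actually landed:

# Look up the PV bound to the claim (hypothetical PVC name):
PV_NAME=$(kubectl get pvc data-pvc -n default -o jsonpath='{.spec.volumeName}')

# Inspect the PV's current node affinity...
kubectl get pv "$PV_NAME" -o jsonpath='{.spec.nodeAffinity}'

# ...and check which nodes the Pods using it were scheduled onto.
kubectl get pods -n default -o wide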

Future integration with CSI (Container Storage Interface)

Currently, it is up to the cluster administrator to modify both PV's node affinity and the underlying volume in the storage provider. But manual operations are error-prone and time-consuming. It is preferred to eventually integrate this with VolumeAttributesClass, so that an unprivileged user can modify their PersistentVolumeClaim (PVC) to trigger storage-side updates, and PV node affinity is updated automatically when appropriate, without the need for cluster admin's intervention.

We welcome your feedback from users and storage driver developers

As noted earlier, this is only a first step.

If you are a Kubernetes user, we would like to learn how you use (or will use) PV node affinity. Is it beneficial to update it online in your case?

If you are a CSI driver developer, would you be willing to implement this feature? How would you like the API to look?

Please provide your feedback via:

For any inquiries or specific questions related to this feature, please reach out to the SIG Storage community.


Qwen3 vs GPT-5.2 vs Gemini 3 Pro: Which Should You Use and When?


A few years back, choosing an AI model was simple: you picked the most capable one you could afford and moved on. But today, that approach no longer works.

Today, teams use AI across many parts of a system. Customer-facing features. Internal tooling. Research workflows. Automation and agents. Each workload brings different requirements. Cost behaves differently. Reliability matters in different ways. Control becomes either a strength or a burden.

This is why model choice has become harder. Qwen3, GPT-5.2, and Gemini 3 Pro sit at the center of this shift. They are all capable models. The difference lies in what they are optimized for after deployment, when systems run continuously and constraints surface.

Some teams prioritize control and ownership. Others focus on predictable behavior and ecosystem maturity. Some depend on strong search, document handling, and multimodal inputs. These priorities pull teams in different directions.

This article focuses on those tradeoffs. In this piece, we will analyze:

  • What each model is designed to optimize for.

  • How they behave in real production workflows.

  • The operational and cost implications teams often underestimate.

  • Where each model becomes a poor fit.

  • How teams can choose an approach that holds up over time.

The goal is to help teams make a decision they can stand behind after deployment.


TL;DR: Quick Decision Guide

Qwen3

Best fit for teams that want control.

  • Self-hosted and private deployment.

  • Full ownership of data and cost behavior.

  • Requires platform and infrastructure maturity.

GPT-5.2

Best fit for teams that want reliability.

  • Stable APIs and mature tooling.

  • Strong support for production agents.

  • Less control over internals and pricing.

Gemini 3 Pro

Best fit for research and knowledge work.

  • Search- and document-centric design.

  • Strong multimodal understanding.

  • Works best inside Google’s ecosystem.

Mixed Workloads

Many teams use more than one model.

  • Stability for customer-facing systems.

  • Flexibility or cost control for internal tools.

These choices come from different design philosophies. The following sections break these down.

Three Models, Three Philosophies

Qwen3, GPT-5.2, and Gemini 3 Pro are shaped by different assumptions about how AI should be used in practice. Each model encodes a view on where intelligence should run, how much control teams should have, and which problems matter most after deployment. These assumptions explain why their strengths, limits, and tradeoffs look the way they do.

Qwen3: Open-Source Power and Control

Qwen3 is designed around ownership. Its Apache 2.0 license allows teams to run the model without usage restrictions, modify it if needed, and integrate it deeply into internal systems. For organizations that care about autonomy and long-term flexibility, this is a foundational advantage.

Deployment is a first-class concern. Qwen3 supports:

  • Self-hosted environments

  • Private cloud deployments

  • Hybrid setups that mix internal and external infrastructure

This makes it suitable for regulated environments, internal tools, and workloads where external APIs are not an option.

Qwen3 also favors agent-style systems. Its hybrid reasoning approach supports multi-step tasks and tool coordination without enforcing a strict execution pattern. This works well for custom automation, internal agents, and domain-specific workflows where teams want to shape behavior directly.

The tradeoffs are operational:

  • Infrastructure setup and maintenance sit with the team.

  • Monitoring, upgrades, and performance tuning are not managed.

  • The surrounding ecosystem is smaller than proprietary platforms.

Qwen3 fits teams that value control and can support it operationally. Platform teams, infrastructure-heavy organizations, and cost-sensitive environments tend to benefit most.

GPT-5.2: Reliability at Scale

GPT-5.2 is built for consistency. It is a proprietary frontier model optimized to behave predictably across a wide range of production workloads. For many teams, this predictability outweighs the need for deep customization.

The platform emphasizes:

  • Stable APIs.

  • Mature tooling for function calling and agents.

  • Strong support for multi-step workflows.

These features reduce engineering overhead. Teams spend less time managing models and more time shipping product features.

Safety and alignment are enforced at the platform level. Guardrails, usage controls, and behavioral constraints are part of the service. For customer-facing systems, this simplifies risk management and compliance. It also leads to more consistent behavior under load.

These characteristics explain its popularity with SaaS teams. GPT-5.2 works well when:

  • Time to production matters.

  • Reliability is critical.

  • Operational simplicity is preferred.

The tradeoff is dependency. Teams accept limited visibility into internals and pricing tied to usage. For many products, this is a reasonable exchange for stability.

Gemini 3 Pro: Multimodal, Search-Native Intelligence

Gemini 3 Pro is built around access to knowledge. Its design assumes that strong reasoning depends on retrieval, context, and synthesis across large information sources.

The model integrates closely with:

  • Search-driven workflows.

  • Document-heavy environments.

  • Multimodal inputs such as text, images, and files.

This makes it effective for research, analysis, and knowledge-centric tasks. Retrieval is not layered on top. It is part of how the model reasons and responds.

Multimodal understanding is a practical strength. Gemini 3 Pro handles mixed inputs uniformly, which is useful for reports, diagrams, scanned documents, and combined media sources.

The “Pro” tier matters because it targets sustained analytical work. It is designed for longer sessions, deeper context, and higher consistency in synthesis.

The tradeoff is focus. Gemini 3 Pro delivers the most value in environments that already depend on search and document workflows. Outside that context, its advantages are less pronounced.

These philosophies set expectations. What matters next is how they translate into core capabilities in practice.

Core Capabilities Comparison

Reasoning, coding, context handling, and multimodal support expose how a model behaves in practice.

Reasoning and Complex Problem Solving

The three models approach reasoning differently.

Qwen3 uses a hybrid reasoning style. It supports stepwise thinking and tool coordination without enforcing a rigid structure. This works well for custom agents and domain-specific workflows where teams want to guide how reasoning unfolds. The flexibility helps when tasks vary or require adaptation mid-process. The downside appears when guardrails are weak. Without careful design, reasoning paths can drift or become inconsistent across runs.

GPT-5.2 relies on a more structured approach. Reasoning behavior is constrained by platform-level controls and alignment systems. This leads to consistent outcomes across repeated tasks and makes behavior easier to predict in production. It performs well in multi-step workflows that need to be completed reliably. The limitation is flexibility. Teams have less influence over how reasoning is shaped internally.

Gemini 3 Pro leans on retrieval-enhanced reasoning. It performs best when answers depend on external context such as documents, search results, or large knowledge bases. Reasoning quality improves when the right information is available. Performance drops when tasks require extended internal reasoning without strong retrieval support.

In practice:

  • Qwen3 excels in customizable reasoning pipelines.

  • GPT-5.2 excels in consistent, repeatable reasoning.

  • Gemini 3 Pro excels in context-driven reasoning tied to knowledge sources.

Coding and Software Development

All three models can generate usable code. The differences appear in consistency and workflow integration.

GPT-5.2 performs strongly in production coding tasks. It produces consistent code style, handles refactoring well, and integrates cleanly with agent-based development workflows. Debugging tasks are reliable, especially when combined with tools. This makes it suitable for teams building features quickly with minimal oversight.

Qwen3 performs well in code generation and refactoring when tuned correctly. It is effective for internal tooling and automation where teams want control over prompts, tools, and execution logic. Repo-level understanding is possible but requires more scaffolding. The burden of orchestration sits with the team.

Gemini 3 Pro is strongest when coding tasks involve documentation, specifications, or external references. It handles code explanation, analysis, and synthesis well when source material is available. It is less consistent for long-running agentic coding workflows that require repeated execution and correction.

In practice:

  • GPT-5.2 fits continuous coding agents.

  • Qwen3 fits custom developer tooling.

  • Gemini 3 Pro fits analysis-heavy coding tasks.

Long-Context Understanding

Long-context handling matters for legal review, research, and policy analysis.

Gemini 3 Pro performs well with large documents. It maintains coherence when summarizing, comparing, and synthesizing information across long inputs. Retrieval support helps anchor responses to source material, which is important for accuracy.

GPT-5.2 handles long context reliably when tasks are structured. It maintains consistency over extended inputs and performs well in workflows that process documents in stages. Memory across steps is stable, which supports agent pipelines.

Qwen3 can handle long context effectively, but results depend on deployment and tuning. Performance varies with configuration, chunking strategy, and memory management. Teams that invest in these areas can achieve strong results. Teams that do not may see degradation over time.

In practice:

  • Gemini 3 Pro fits document-heavy analysis.

  • GPT-5.2 fits staged long-context workflows.

  • Qwen3 fits long-context tasks with custom handling.

Multimodal Capabilities

Multimodal support is no longer optional, but its usefulness varies.

Gemini 3 Pro leads in practical multimodal understanding. It handles text, images, and files together in a coherent way. This is valuable for research, reporting, and analysis that combines multiple input types.

GPT-5.2 supports multimodal inputs with reliable behavior. It works well when multimodality supports a broader workflow rather than being the focus. Integration with tools and agents remains the primary strength.

Qwen3 supports multimodal use cases through extensions and deployment choices. Flexibility is high, but implementation effort is high. The value depends on how much teams invest in integration.

In practice, multimodal capabilities matter most when they support real workflows. Integration quality and consistency matter more than surface-level demonstrations.

These capabilities lay the groundwork for examining how models behave when connected to tools, workflows, and automation.

Tool Use, Agents, and Automation

Tool use is where model behavior becomes visible quickly. Function calling, orchestration, and autonomous workflows expose strengths and weaknesses that are easy to miss in single-prompt interactions. Small inconsistencies compound when a model is expected to act repeatedly, coordinate with systems, and recover from errors.

Function calling and orchestration differ across the three models. GPT-5.2 is optimized for this layer. Tool invocation is predictable, schemas are respected consistently, and retries behave as expected. This makes it well-suited for production systems that rely on deterministic handoffs between the model and external services. Teams spend less time building guardrails around basic execution.

Qwen3 offers more flexibility, but less structure by default. Tool use works well when teams design the orchestration layer carefully. Custom routing, validation, and fallback logic are often required. The benefit is control. Teams can shape execution to closely match internal systems. The cost is engineering effort and ongoing maintenance.

Gemini 3 Pro approaches tool use from a retrieval-first perspective. It performs best when tools are tied to search, document access, or data lookup. Orchestration is most effective when tasks revolve around information gathering and synthesis. It is less suited to complex, action-oriented pipelines that require frequent state changes or corrective loops.

Autonomous agent workflows amplify these differences. GPT-5.2 performs reliably in long-running agents that execute plans, call tools, and adjust behavior across steps. State management is stable, which reduces drift over time. This reliability is a key reason it is often chosen for customer-facing automation.

Qwen3 supports agent workflows well when teams manage state explicitly. Memory, task boundaries, and stopping conditions need careful handling. When done properly, Qwen3 enables highly customized agents. When done poorly, agents become brittle or unpredictable.

Gemini 3 Pro works best in agents that prioritize analysis over action. Research agents, document reviewers, and synthesis pipelines benefit from its strengths. Action-heavy agents are more challenging.

Reliability in multi-step tasks is the dividing line. GPT-5.2 tends to fail gracefully. Qwen3 fails transparently. Gemini 3 Pro fails contextually, often due to missing or weak retrieval signals.

Common failure modes follow predictable patterns:

  • Silent tool misuse or partial execution.

  • Gradual reasoning drift across steps.

  • Over-reliance on missing context.

  • Feedback loops that amplify early errors.

Successful teams design around these risks. Model choice sets the baseline, but system design determines outcomes. In automation, models do not operate alone. They behave as components inside systems that either constrain them well or expose their limits quickly.

Once models are embedded into systems, cost, deployment, and ownership constraints start to shape how they can be used.

Cost, Access, and Deployment Reality

Cost, deployment, and data ownership shape how AI systems behave and adapt over time. These factors determine how models scale, where they can run, and how much control teams retain as usage grows. These constraints differ sharply across models.

Pricing and Cost Predictability

Pricing behavior varies significantly between API-based services and self-hosted models.

GPT-5.2 follows a usage-based pricing model. Costs scale with request volume, context length, and agent activity. This is easy to adopt early on, but becomes harder to forecast as systems mature. Spikes in usage, retries, and long-running workflows can quickly shift cost profiles. The advantage is operational simplicity. Infrastructure, scaling, and upgrades are handled by the provider.

Qwen3 moves cost into infrastructure. Compute, storage, and operations become the primary drivers. This requires upfront planning and ongoing management, but it offers clearer marginal costs once workloads stabilize. For steady internal use, this can be easier to budget for. For highly variable demand, it introduces capacity planning challenges.

Gemini 3 Pro also relies on usage-based pricing tied to managed services. Cost estimation works well for document-centric and search-driven workloads. Less predictability appears as workflows expand into automation and multi-step processes.

Across all three models, hidden costs matter. Monitoring, retries, failure handling, and human review rarely appear in pricing calculators, but they contribute materially to the total cost of ownership.

Deployment Flexibility

Deployment options define where and how models can operate.

Qwen3 offers the widest flexibility. It can run locally, in private cloud environments, or as part of hybrid architectures. This supports strict data residency requirements and deep integration with internal systems. Teams control latency, scaling behavior, and network boundaries.

GPT-5.2 is accessed through managed APIs. Deployment choices are limited, but the operational burden is low. For many teams, this tradeoff is acceptable. Infrastructure concerns are externalized, and reliability is handled at the platform level.

Gemini 3 Pro fits best within managed cloud environments. It integrates cleanly with existing services, particularly where document management and search workflows are already established. Outside those environments, deployment options narrow.

In regulated and enterprise contexts, deployment constraints often outweigh model preferences. Where a model can run is sometimes more important than how it performs.

Data Ownership and Compliance

Data ownership affects long-term risk, governance, and regulatory posture. How much visibility and control a team has depends largely on the model and deployment approach.

Qwen3 provides the highest level of control. Because it can be fully self-hosted, teams manage data flow, storage, retention, and logging directly. This simplifies auditability and supports strict compliance requirements. It also reduces dependency on external vendors and makes internal governance easier to enforce.

GPT-5.2 operates within a managed platform. Data handling, logging, and retention policies are defined by the provider. Compliance support is built in, which lowers setup effort, but limits visibility into internal processes. Teams must accept the provider’s controls and trust their enforcement.

Gemini 3 Pro follows a similar managed model. Data governance aligns closely with the surrounding ecosystem and its services. This works well for organizations already operating within that environment, but offers less flexibility for custom compliance or audit requirements outside it.

Across all three, governance depends on transparency. Teams need to understand where data moves, how it is processed, and how decisions are recorded. These concerns rarely block early adoption. They tend to surface later, when systems are already embedded and changes become costly.

Taken together, these constraints determine which models are practical for specific workloads.

Real-World Use-Case Matrix

At this point, the tradeoffs are clearer. The question is no longer which model is strongest in general, but which one fits a specific type of work. The table below maps common use cases to the model that best aligns with their constraints.

Use Case | Best Fit | Why
Open-source and internal platforms | Qwen3 | Full control over deployment, data, and cost behavior
Customer-facing SaaS products | GPT-5.2 | Stable APIs, predictable behavior, and mature tooling
Research and analysis workflows | Gemini 3 Pro | Strong retrieval, document handling, and synthesis
Cost-sensitive internal tools | Qwen3 | Infrastructure-based cost with clear marginal control
Regulated or enterprise environments | GPT-5.2 or Gemini 3 Pro | Built-in compliance support and managed operations

These mappings reflect patterns that emerge once systems are in regular use. They describe how teams tend to align models with operational needs over time.

Open-source projects and internal platforms commonly align with Qwen3. Ownership, deployment flexibility, and cost control are central concerns in these environments. Teams value the ability to shape infrastructure and governance directly. This approach assumes the presence of platform and operational expertise.

Customer-facing SaaS products often align with GPT-5.2. Stable behavior, mature tooling, and predictable execution support rapid iteration and sustained operation. These characteristics simplify delivery at scale and reduce coordination overhead across teams.

Research and analysis workflows align closely with Gemini 3 Pro. Document-heavy tasks, search-driven exploration, and synthesis across large information sets benefit from its design. These workflows emphasize context depth and retrieval quality.

Cost-sensitive internal tools frequently align with Qwen3 once usage patterns stabilize. Infrastructure-based cost models support planning and long-term budgeting when capacity is managed deliberately.

Enterprise environments often distribute workloads across models. Managed platforms support compliance and operational consistency. Self-hosted models support transparency and internal control. Many organizations combine both approaches to meet different requirements.

This matrix anchors decisions in workload and operational constraints, and exposes the limits that come with each choice.

Where Each Model Falls Short

Every model fits some environments better than others. Limits usually appear when assumptions built into a model no longer match how it is used. This section highlights where each option tends to strain, based on operating context rather than abstract capability.

When Qwen3 Is the Wrong Choice

Qwen3 places responsibility on the team. This works well where infrastructure ownership is expected, but it becomes a constraint when operational capacity is limited. Teams without strong platform or DevOps support often struggle to maintain reliability, monitor performance, and manage upgrades over time.

Qwen3 also demands deliberate system design. Agent workflows, memory handling, and tool orchestration need careful implementation. Without that discipline, behavior becomes inconsistent. In fast-moving product environments, this overhead can slow iteration.

Qwen3 fits best where control is a priority. It fits poorly where simplicity and speed outweigh autonomy.

When GPT-5.2 Is Overkill

GPT-5.2 is optimized for reliability at scale. In simpler workflows, that reliability can exceed what is required. Lightweight internal tools, offline processing, and low-frequency tasks often do not benefit from a fully managed frontier platform.

Cost sensitivity is another factor. Usage-based pricing is easy to adopt but harder to justify when workloads are predictable and stable. In these cases, infrastructure-backed models provide clearer long-term economics.

GPT-5.2 works best when failure carries real cost. It becomes less attractive when requirements are modest and control matters more than abstraction.

When Gemini 3 Pro Is Not Ideal

Gemini 3 Pro is strongest in knowledge-centric environments. When workflows depend less on documents, search, or retrieval, its advantages narrow. Action-oriented systems, especially those requiring frequent state changes or tight execution loops, expose these limits.

Gemini 3 Pro also aligns closely with managed cloud ecosystems. Outside those environments, integration options become more constrained. Teams building highly customized agent logic may find less flexibility than expected.

Gemini 3 Pro fits best where context depth drives value. It fits less cleanly where execution and customization dominate.

Seen together, these limits point toward a more deliberate way to choose.

How to Choose the Right Model in 2026

Choosing the right model in 2026 means matching a model’s strengths to how your system actually operates. The decision becomes clearer when questions are answered with specific models in mind.

Key Questions and How They Map to Models

  • Do you need full control over data, deployment, and cost behavior?

Choose Qwen3 when ownership matters. This applies to internal platforms, regulated environments, and teams that want to manage infrastructure directly.

  • Do you need predictable behavior in customer-facing systems?

Choose GPT-5.2 when reliability and consistency outweigh customization. This fits SaaS products, user-facing agents, and workflows where failure is visible and costly.

  • Does the work depend on search, documents, or large knowledge sources?

Choose Gemini 3 Pro when retrieval, synthesis, and document handling are central. This applies to research, analysis, and reporting-heavy workflows.

  • Is cost stability more important than speed of setup?

Choose Qwen3 for steady workloads with known demand. Infrastructure-backed cost models support long-term planning when teams can manage capacity.

  • Is speed to production the priority?

Choose GPT-5.2 when time and operational simplicity matter more than internal control.

Matching models to business goals

  • Product velocity and scale align with GPT-5.2.

  • Platform ownership and transparency align with Qwen3.

  • Knowledge-centric depth and synthesis align with Gemini 3 Pro.

  • Internal automation and experimentation often align with Qwen3.

  • External-facing automation often aligns with GPT-5.2.

The mistake teams make is to optimize for capability rather than alignment. Each model performs well when used for the type of work it was designed to support.

Why multi-model strategies are becoming the norm

  • Different parts of a system have different risk profiles.

  • No single model optimizes reliability, cost control, and knowledge depth simultaneously.

  • Routing workloads across models reduces lock-in and operational strain.

A common 2026 pattern:

  • GPT-5.2 for customer-facing reliability.

  • Qwen3 for internal systems and cost control.

  • Gemini 3 Pro for research and document-heavy analysis.

Choosing well means choosing deliberately. Teams that align models with workload realities avoid expensive rework later.

Closing Thoughts

In 2026, choosing an AI model is a question of fit. Fit to workload, operating constraints, and risk tolerance. Raw capability is no longer the deciding factor.

Qwen3, GPT-5.2, and Gemini 3 Pro succeed for different reasons. Qwen3 aligns with teams that want control, transparency, and predictable cost through ownership. GPT-5.2 aligns with products that require reliable behavior and minimal operational overhead. Gemini 3 Pro aligns with work centered on search, documents, and synthesis.

These models are not interchangeable. Each reflects a different set of tradeoffs. Using the wrong model for the wrong workload creates friction that surfaces later, usually through cost, complexity, or limited flexibility.

This is why multi-model use is becoming common. Teams separate workloads based on their needs. Customer-facing systems emphasize stability and consistency. Internal systems emphasize ownership and cost control. Research workflows emphasize access to significant knowledge sources and synthesis quality.

That approach holds up longer than chasing any single “best” model.




Fixing OllamaSharp Timeouts in C# (with a Simple Extension and just for fun 😄)


⚠ This blog post was created with the help of AI tools. Yes, I used a bit of magic from language models to organize my thoughts and automate the boring parts, but the geeky fun and the 🤖 in C# are 100% mine.

Hi!

Prefer not to read? Here’s the 5-minute video version of this post:

When working with local models in OllamaSharp, I hit a timeout while running long-running workloads like video analysis. The issue wasn’t the model; it was the default 100-second timeout coming from HttpClient.


The problem

By default, OllamaSharp uses an HttpClient with a fixed timeout.
If your model needs more time, you’ll see errors like:

Unhandled exception. System.Threading.Tasks.TaskCanceledException: The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.
 ---> System.TimeoutException: The operation was canceled.
 ---> System.Threading.Tasks.TaskCanceledException: The operation was canceled.
 ---> System.IO.IOException: Unable to read data from the transport connection: The I/O operation has been aborted because of either a thread exit or an application request..
 ---> System.Net.Sockets.SocketException (995): The I/O operation has been aborted because of either a thread exit or an application request.
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
   at System.Net.Http.HttpConnection.InitialFillAsync(Boolean async)

What I learned

The timeout must be configured on the HttpClient before it is passed to the OllamaSharp client. That works, but it’s not very clean or discoverable.
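
In code, that looks roughly like this. This is a minimal sketch: the exact OllamaApiClient constructor overload may differ between OllamaSharp versions, and the BaseAddress assumes a default local Ollama install.

using System;
using System.Net.Http;
using OllamaSharp;

// Configure the timeout on the HttpClient up front, then hand it to OllamaSharp.
var httpClient = new HttpClient
{
    BaseAddress = new Uri("http://localhost:11434"),
    Timeout = TimeSpan.FromMinutes(5) // instead of HttpClient's 100-second default
};

var ollamaClient = new OllamaApiClient(httpClient);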


The solution: a C# extension

I created a small extension library so you can do this instead:

ollamaClient.SetTimeout(TimeSpan.FromMinutes(5));

There are also convenience helpers:

// Sets a timeout suitable for quick queries (2 minutes).
ollamaClient.WithQuickTimeout(); 
// Sets a timeout suitable for standard prompts (5 minutes).
ollamaClient.WithStandardTimeout();
// Sets a timeout suitable for long-form generation (10 minutes).
ollamaClient.WithLongTimeout();


Get it here


Takeaway

If something hurts in C#…
it probably deserves an extension method 😄

Happy coding!

Greetings

El Bruno

More posts in my blog ElBruno.com.

More info in https://beacons.ai/elbruno







Scaling AI Agents with Aspire: The Missing Isolation Layer for Parallel Development


Liked this blog post? It was originally posted on Tamir Dresher’s blog, https://www.tamirdresher.com/, check out more content there!

In my previous post about Git worktrees, I showed how to run multiple AI agents in parallel, each working on different features in separate worktrees. Aspire is a game-changer for AI-assisted development because it gives your agents superpowers: with a single Program.cs, an agent can spawn an entire distributed system—backend APIs, Python services, frontends, databases, message queues—everything orchestrated and ready to test. Even better, using Aspire’s MCP server, agents can programmatically query resource status, retrieve logs, and troubleshoot issues. This means AI agents can interact with the whole system, not just individual components, dramatically simplifying development workflows.

But there’s a critical problem when you want to scale to multiple worktrees: port conflicts. When you try to run Aspire AppHost from multiple worktrees simultaneously, they all fight over the same ports, making parallel AI agent development impossible.

I solved this by adding an isolation layer that automatically allocates unique ports for each worktree. While I implemented this with scripts and a clever MCP proxy, I hope the Aspire team will bake this capability directly into the Aspire CLI as part of an aspire run --isolated command—making multi-instance isolation a first-class feature.

Why Aspire is Perfect for AI Agents

Before diving into the port isolation problem, let me explain why Aspire is such a powerful tool for AI-assisted development:

1. Spawn Entire Systems with Minimal Code

An AI agent can create a complete distributed application with just a few lines:

var builder = DistributedApplication.CreateBuilder(args);

var cache = builder.AddRedis("cache");
var db = builder.AddPostgres("db").AddDatabase("notetakerdb");
var messaging = builder.AddRabbitMQ("messaging");

var backend = builder.AddProject<Projects.Backend>("backend")
    .WithReference(cache)
    .WithReference(db)
    .WithReference(messaging)
    .WithHttpEndpoint(name: "http")
    .WithExternalHttpEndpoints();

var aiService = builder.AddPythonApp("ai-service", "../ai-service", "main.py")
    .WithReference(db)
    .WithReference(messaging)
    .WithHttpEndpoint(env: "PORT", name: "http")
    .WithExternalHttpEndpoints();

builder.AddJavaScriptApp("frontend", "../frontend")
    .WithReference(backend)
    .WithReference(aiService.GetEndpoint("http"))
    .WithHttpEndpoint(env: "PORT")
    .WithExternalHttpEndpoints();

builder.Build().Run();

This spins up:

  • Redis cache
  • PostgreSQL database
  • RabbitMQ message broker
  • C# backend API
  • Python AI service
  • JavaScript frontend
  • All networking between them configured automatically

An AI agent can modify this, run it, test the whole system, and iterate—all autonomously.

2. System-Wide Observability via Aspire MCP

Aspire’s MCP (Model Context Protocol) support lets AI agents interact with the running system:

Agent: "Check if all resources are healthy"
→ Uses list_resources tool
→ Gets status of all services, containers, and executables

Agent: "Why is the backend failing?"
→ Uses list_console_logs tool for backend
→ Reads startup errors and stack traces

Agent: "Show me traces for slow requests"
→ Uses list_traces tool
→ Analyzes distributed tracing data

This is transformative: instead of debugging individual components, agents can reason about the entire system, following request flows across services, correlating logs, and identifying root causes.

Aspire also provides distributed integration testing capabilities that enable agents to run comprehensive tests against the entire system—I’ll cover this later in the post.

The Problem: Port Conflicts Kill Parallelism

This all works beautifully—until you try to run multiple worktrees in parallel. Every AppHost instance tries to grab the same ports:

All worktrees try to use:

  • Port 18888 for Aspire Dashboard
  • Port 18889 for OTLP (OpenTelemetry) endpoint
  • Port 18890 for Resource Service endpoint
  • Port 4317 for MCP endpoint

Port Conflict on Startup

Terminal 1 (feature-auth worktree):

cd worktrees-example.worktrees\feature-auth\src\NoteTaker.AppHost
dotnet run
# ✅ Works - Dashboard on port 18888

Terminal 2 (feature-payments worktree):

cd worktrees-example.worktrees\feature-payments\src\NoteTaker.AppHost
dotnet run
# ❌ ERROR: Port 18888 is already in use!
# ❌ ERROR: Port 18889 is already in use!
# ❌ ERROR: Port 18890 is already in use!

You can’t run the second AppHost at all.

Manual Workarounds Don’t Scale

You could manually edit ports for each worktree, but this is tedious and error-prone:

  • ❌ You need to remember which ports are free
  • ❌ You have to manually set 3+ environment variables per worktree
  • ❌ Cleanup requires tracking which terminals use which ports
  • ❌ Biggest problem: Your agent’s MCP connection needs to know which port to connect to

The fundamental issue: your worktrees have isolated code, but shared port space.

The Solution: Port Isolation + MCP Proxy

The solution has two layers:

  1. Port allocation: Scripts that automatically find and allocate unique ports for each AppHost instance
  2. MCP proxy: An indirection layer that lets AI agents connect to whichever AppHost is currently running

Layer 1: Automatic Port Allocation

The start-apphost.ps1 and start-apphost.sh scripts:

  1. Find free ports using .NET’s port allocation
  2. Set environment variables for Aspire dashboard components
  3. Launch AppHost with those ports
  4. Save port configuration to scripts/settings.json for MCP proxy
  5. Display dashboard URL and Process ID for monitoring

Here’s what the output looks like:

Start AppHost Output

Notice the dynamically allocated ports (54772-54775) saved for the MCP proxy to use.
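
If you are curious how a script can grab free ports, here is an illustrative PowerShell sketch. The Get-FreePort helper is hypothetical, and the actual start-apphost.ps1 may allocate ports differently.

# Ask the OS for a free TCP port by binding to port 0.
function Get-FreePort {
    $listener = [System.Net.Sockets.TcpListener]::new([System.Net.IPAddress]::Loopback, 0)
    $listener.Start()
    $port = ([System.Net.IPEndPoint]$listener.LocalEndpoint).Port
    $listener.Stop()
    return $port
}

# Hand the allocated ports to Aspire via environment variables before launching.
$env:ASPIRE_DASHBOARD_PORT = "$(Get-FreePort)"
$env:ASPIRE_DASHBOARD_MCP_ENDPOINT_URL = "http://localhost:$(Get-FreePort)"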

Layer 2: The MCP Proxy Problem

Here’s the challenge: Aspire’s MCP server runs on a specific port (e.g., 54775) and requires an API key. With a direct HTTP MCP configuration, your AI agent’s configuration (.roo/mcp.json) would need both values hard-coded:

{
  "mcpServers": {
    "aspire-dashboard": {
      "type": "http",
      "url": "http://localhost:62980/mcp",  // ❌ This port is fixed!
      "headers": {
        "x-mcp-api-key": "McpKey"  // ❌ This API key is fixed!
      }
    }
  }
}

The two problems:

  1. Dynamic Ports: Which port to use?

    • Worktree 1’s AppHost MCP is on port 54775
    • Worktree 2’s AppHost MCP is on port 61450
    • Worktree 3’s AppHost MCP is on port 58232
  2. Dynamic API Keys: Each AppHost generates a unique API key for security

Your .roo/mcp.json can’t know either value in advance!

Solution: The aspire-mcp-proxy

The aspire-mcp-proxy.cs script adds the missing layer of indirection:

┌─────────────┐         ┌──────────────────┐         ┌─────────────────┐
│ Roo AI      │ stdio   │ aspire-mcp-proxy │  HTTP   │ Aspire AppHost  │
│ Agent       ├────────►│ (fixed config)   ├────────►│ (dynamic port)  │
│             │         │                  │         │                 │
└─────────────┘         └──────────────────┘         └─────────────────┘
                              ↓
                        reads from
                        scripts/settings.json
                        (updated by start-apphost.ps1)

How it works:

  1. AI agent connects to proxy via stdio (always the same configuration)
  2. Proxy reads scripts/settings.json to discover the current AppHost’s MCP port
  3. Proxy forwards MCP requests to the correct dynamic port via HTTP
  4. Responses flow back through the proxy to the agent

In .roo/mcp.json:

{
  "mcpServers": {
    "aspire-mcp": {
      "command": "dotnet",
      "args": ["scripts/aspire-mcp-proxy.cs", "--no-build"],
      "description": "Aspire Dashboard MCP stdio proxy"
    }
  }
}

Note: I’m using .NET 10’s single-file script feature—dotnet run app.cs runs a C# file directly without needing a project file. This makes the proxy incredibly simple: one 272-line file that’s both an MCP client (connecting to Aspire) and an MCP server (exposing tools to Roo), all using the official ModelContextProtocol@0.4.1-preview.1 NuGet package. It’s amazing to have a complete bidirectional MCP proxy in a single, self-contained script!

The proxy configuration is fixed—it doesn’t need to know which AppHost is running! The scripts/settings.json file bridges the gap:

{
  "port": "54775",
  "apiKey": "abc123...",
  "lastUpdated": "2025-11-15T10:30:00Z"
}

Every time you run start-apphost.ps1, it updates settings.json with the new ports, and the proxy reads it dynamically on each request. The script also sets the AppHost__McpApiKey environment variable for the AppHost, so we control the token that will be used for the Aspire MCP server.

Putting It All Together

  1. cd worktrees-example.worktrees/feature-auth

  2. ./scripts/start-apphost.ps1 → Finds free ports: 54772-54775 → Updates scripts/settings.json with port 54775 → Starts AppHost on those ports

  3. Roo connects to aspire-mcp-proxy (via .roo/mcp.json) → Proxy reads scripts/settings.json → Discovers AppHost MCP is on port 54775 → Forwards all MCP requests there

  4. Roo asks: “list_resources” → Goes through proxy → AppHost MCP on 54775 → Returns resource status

  5. Switch to different worktree: cd ../feature-payments ./scripts/start-apphost.ps1 → Finds free ports: 61447-61450 → Updates scripts/settings.json with port 61450

  6. Roo’s next request automatically goes to port 61450 → No configuration change needed!

This is the key insight: by adding the proxy layer, we decouple the AI agent’s configuration from the dynamic port allocation. The agent always talks to the same proxy, and the proxy figures out where the current AppHost is running.

Implementation: Using the Scripts

Let me show you how to set this up for the NoteTaker example application:

Step 1: Configure AppHost for Worktree Detection

In your Program.cs, detect the Git folder name and customize the dashboard name:

var gitFolderName = GitFolderResolver.GetGitFolderName();
var dashboardAppName = string.IsNullOrEmpty(gitFolderName)
    ? "NoteTaker"
    : $"NoteTaker-{gitFolderName}";

var builder = DistributedApplication.CreateBuilder(new DistributedApplicationOptions()
{
    Args = args,
    DashboardApplicationName = dashboardAppName,
});

var cache = builder.AddRedis("cache");
var db = builder.AddPostgres("db").AddDatabase("notetakerdb");
var messaging = builder.AddRabbitMQ("messaging");

var backend = builder.AddProject<Projects.Backend>("backend")
    .WithReference(cache)
    .WithReference(db)
    .WithReference(messaging)
    .WithHttpEndpoint(name: "http")  // ✅ No port = random allocation
    .WithExternalHttpEndpoints();

builder.Build().Run();

Key benefits:

By setting the DashboardApplicationName property on the DistributedApplicationOptions, we make it clear in the dashboard which worktree we are working in.

Dashboard Title
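
For reference, here is one possible, purely illustrative implementation of such a helper; the actual GitFolderResolver in the worktrees-example repository may differ.

using System;
using System.IO;

public static class GitFolderResolver
{
    public static string GetGitFolderName()
    {
        // Walk up from the app's base directory until we find a ".git" entry
        // (a directory in a normal clone, a plain file in a Git worktree),
        // then use that folder's name, e.g. "feature-auth".
        var dir = new DirectoryInfo(AppContext.BaseDirectory);
        while (dir is not null)
        {
            var gitPath = Path.Combine(dir.FullName, ".git");
            if (Directory.Exists(gitPath) || File.Exists(gitPath))
            {
                return dir.Name;
            }
            dir = dir.Parent;
        }
        return string.Empty;
    }
}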

Step 2: Start AppHost with Scripts

Never run dotnet run directly. Always use the management scripts:

PowerShell (Windows)

cd worktrees-example.worktrees\feature-auth
.\scripts\start-apphost.ps1

# Output shows:
# - Dashboard URL with unique port
# - MCP endpoint saved to settings.json
# - Process ID for cleanup

Bash (Linux/macOS or Git Bash)

cd worktrees-example.worktrees/feature-auth
./scripts/start-apphost.sh

The script:

  1. Finds 4 free ports
  2. Sets environment variables
  3. Updates scripts/settings.json with MCP port and API key
  4. Launches AppHost
  5. Returns Process ID for cleanup

Step 3: Configure MCP Proxy in .roo/mcp.json

Add the proxy to your .roo/mcp.json:

{
  "mcpServers": {
    "aspire-mcp": {
      "command": "dotnet",
      "args": ["scripts/aspire-mcp-proxy.cs", "--no-build"],
      "env": {},
      "description": "Aspire Dashboard MCP stdio proxy",
      "alwaysAllow": [
        "list_resources",
        "execute_resource_command",
        "list_traces",
        "list_trace_structured_logs",
        "list_console_logs",
        "list_structured_logs"
      ]
    }
  }
}

No ports to configure! The proxy reads scripts/settings.json dynamically.

Step 4: Run Multiple Worktrees Simultaneously

Now you can run as many worktrees as you need:

Terminal 1 – Feature Auth:

cd worktrees-example.worktrees\feature-auth
.\scripts\start-apphost.ps1
# Dashboard: https://localhost:54772
# MCP: port 54775 (saved to settings.json)
# Process ID: 12345

Terminal 2 – Feature Payments:

cd worktrees-example.worktrees\feature-payments
.\scripts\start-apphost.ps1
# Dashboard: https://localhost:61447
# MCP: port 61450 (saved to settings.json)
# Process ID: 67890

Terminal 3 – Feature UI:

cd worktrees-example.worktrees\feature-ui
.\scripts\start-apphost.ps1
# Dashboard: https://localhost:58229
# MCP: port 58232 (saved to settings.json)
# Process ID: 11223

All three run simultaneously with zero conflicts! Your AI agent automatically connects to whichever one you’re working in.

Step 5: Cleanup When Done

Quick Cleanup (Recommended)

.\scripts\kill-apphost.ps1 -All
./scripts/kill-apphost.sh --all

This terminates all AppHost processes from your repository.

Enabling AI Agents to Work Independently

One powerful benefit is that AI agents can now work completely autonomously, including testing their own changes. I’ve documented the AppHost management rules in .roo/rules/05-apphost-management.md, which instruct Roo (my AI coding agent) on how to manage AppHost instances.

With these rules, I can ask Roo to:

  1. Make code changes
  2. Start the AppHost to test
  3. Use MCP tools to verify functionality
  4. Check logs if something fails
  5. Clean up when done

The agent works completely autonomously, even running and testing multiple worktrees in parallel.

Script Reference

Script | Purpose
start-apphost.ps1 / .sh | Start AppHost with auto port allocation, update settings.json
aspire-mcp-proxy.cs | MCP proxy that reads settings.json and forwards to current AppHost
kill-apphost.ps1 / .sh | Kill AppHost instances (by PID or all)
list-apphosts.ps1 / .sh | List all running instances

Environment Variables Set by Scripts:

$env:ASPIRE_DASHBOARD_PORT = "54772"                              # Dynamic
$env:ASPIRE_DASHBOARD_OTLP_HTTP_ENDPOINT_URL = "http://localhost:54773"
$env:ASPIRE_RESOURCE_SERVICE_ENDPOINT_URL = "http://localhost:54774"
$env:ASPIRE_DASHBOARD_MCP_ENDPOINT_URL = "http://localhost:54775"  # Saved to settings.json

Benefits

1. True AI Agent Superpowers

Combine Aspire’s system orchestration with MCP observability:

  • ✅ Agents spawn entire distributed systems
  • ✅ Agents query resource status programmatically
  • ✅ Agents read logs and traces to debug
  • ✅ Agents work on whole systems, not just components

2. Zero Configuration MCP Connections

The proxy solves the dynamic port problem:

  • ✅ .roo/mcp.json is fixed (no per-worktree configuration)
  • ✅ Proxy automatically finds current AppHost
  • ✅ Works seamlessly across worktree switches
  • ✅ No manual port tracking needed

3. True Parallel Development

Multiple agents work simultaneously:

  • ✅ No “wait for Agent A to finish”
  • ✅ No manual port coordination
  • ✅ Each agent completely independent

4. Works with Any Aspire Project

  • ✅ Standard Aspire features only
  • ✅ No custom NuGet packages
  • ✅ Simple script-based approach

Aspire’s Built-in Distributed Testing Support

Beyond orchestration and observability, Aspire provides distributed testing capabilities that enable true end-to-end testing with automatic port isolation. Instead of just running the AppHost, your AI agent can now run comprehensive tests against the entire system.

Using DistributedApplicationTestingBuilder, you can spin up your full application stack—frontend, backend, databases, message queues—with automatically randomized ports for complete isolation:

var appHost = await DistributedApplicationTestingBuilder
    .CreateAsync<Projects.NoteTaker_AppHost>();

var app = await appHost.BuildAsync();
await app.StartAsync();

// Wait for resources to be healthy
await app.ResourceNotifications.WaitForResourceHealthyAsync("frontend");

// Get dynamically allocated endpoint
var frontendUrl = app.GetEndpoint("frontend");

Combine this with Playwright and you achieve true end-to-end tests:

// Get the dynamically allocated frontend URL
var frontendUrl = app.GetEndpoint("frontend").ToString();

// Use Playwright to interact with the UI
var page = await browser.NewPageAsync();
await page.GotoAsync(frontendUrl);

// Test the actual UI with all dependencies running
await page.FillAsync("#title", "Test Task");
await page.ClickAsync("button[type='submit']");
await page.WaitForSelectorAsync(".task-item");

In the NoteTaker example, tests interact with the actual frontend UI while all backend services, databases, and dependencies run in the background—all with isolated, randomly allocated ports.

This means your AI agent is now truly autonomous: it can modify code, run the full test suite with all system dependencies, and validate changes end-to-end without manual intervention. Read more about accessing resources in tests.

Under the Hood: How the MCP Proxy Works

The aspire-mcp-proxy.cs is the heart of the solution. Let me explain how it’s implemented.

The Dual Nature: MCP Client + MCP Server

The proxy is simultaneously:

  1. MCP Server (stdio) – Exposes tools to Roo via standard input/output
  2. MCP Client (HTTP) – Connects to Aspire’s MCP server to invoke tools

Here’s the architecture:

┌─────────────────────────────────────────────────────────────────┐
│ aspire-mcp-proxy.cs (272 lines, single file)                    │
│                                                                 │
│  ┌────────────────────┐         ┌─────────────────────┐         │
│  │ MCP Server (stdio) │◄────────┤ Roo Agent           │         │
│  │ - Exposes tools    │         │ (sends tool calls)  │         │
│  │ - Handles requests │         └─────────────────────┘         │
│  └────────┬───────────┘                                         │
│           │                                                     │
│           ▼                                                     │
│  ┌────────────────────┐                                         │
│  │ ProxyTool          │  For each tool:                         │
│  │ - Wraps downstream │  1. Read settings.json                  │
│  │ - Reads settings   │  2. Create HTTP client                  │
│  │ - Forwards calls   │  3. Forward request to Aspire           │
│  └────────┬───────────┘  4. Return response                     │
│           │                                                     │
│           ▼                                                     │
│  ┌────────────────────┐                                         │
│  │ MCP Client (HTTP)  │────────► Aspire Dashboard MCP           │
│  │ - Connects to port │         (dynamic port from settings)    │
│  │ - Invokes tools    │                                         │
│  └────────────────────┘                                         │
└─────────────────────────────────────────────────────────────────┘

Key Implementation Details

1. Dynamic Settings Loading

The proxy reads scripts/settings.json on every request to get the current AppHost’s port and API key:

async Task<McpClient> CreateClientAsync()
{
    var current = await LoadSettingsAsync(settingsPath);
    var transport = new HttpClientTransport(new()
    {
        Endpoint = new Uri($"http://localhost:{current.Port}/mcp"),
        AdditionalHeaders = new Dictionary<string, string>
        {
            ["x-mcp-api-key"] = current.ApiKey!,
            ["Accept"] = "application/json, text/event-stream"
        }
    });
    return await McpClient.CreateAsync(transport);
}

2. Tool Caching for Offline Mode

When the proxy starts, it attempts to connect to Aspire and cache the available tools. If Aspire isn’t running, it uses cached tool metadata:

try
{
    var client = await CreateClientAsync();
    await cache.RefreshAsync(client);
    // Online mode: use live tools
    tools = cache.GetTools().Select(t => new ProxyTool(CreateClientAsync, t));
}
catch (Exception ex)
{
    await Console.Error.WriteLineAsync($"[AspireMcpProxy] Connection failed: {ex.Message}");
    await Console.Error.WriteLineAsync("[AspireMcpProxy] Using cached tools");

    var cachedTools = await cache.LoadAsync();
    // Offline mode: create tools from cached metadata
    tools = cachedTools.Select(t => new ProxyTool(CreateClientAsync, t));
}

This means Roo can see Aspire tools even before starting AppHost, though they’ll fail if invoked while offline.

3. ProxyTool: The Forwarding Logic

Each tool exposed by the proxy is a ProxyTool that:

  • Accepts stdio requests from Roo
  • Reads current settings
  • Creates an HTTP client
  • Forwards to Aspire’s MCP server
  • Returns the response
public override async ValueTask<CallToolResult> InvokeAsync(
    RequestContext<CallToolRequestParams> request,
    CancellationToken ct = default)
{
    var args = request.Params?.Arguments?
        .ToDictionary(kv => kv.Key, kv => (object?)kv.Value)
        ?? new Dictionary<string, object?>();

    await Console.Error.WriteLineAsync($"[ProxyTool] Calling {_tool.Name}");

    try
    {
        var client = await _clientFactory();  // Reads settings.json
        var result = await client.CallToolAsync(_tool.Name, args, null);
        await Console.Error.WriteLineAsync($"[ProxyTool] {_tool.Name} completed");
        return result;
    }
    catch (HttpRequestException ex)
    {
        return Error($"Connection failed: {ex.Message}\n\nVerify Aspire is running.");
    }
}
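
The Error helper isn’t shown above. A minimal sketch might simply wrap the message in an error-flagged tool result; I’m assuming the CallToolResult/TextContentBlock shapes exposed by recent previews of the ModelContextProtocol C# SDK, which may differ in the exact version used here:

using ModelContextProtocol.Protocol;

// Hypothetical helper: return the failure as a tool result marked IsError,
// so Roo sees a readable message instead of a raw exception.
static CallToolResult Error(string message) => new()
{
    IsError = true,
    Content = [new TextContentBlock { Text = message }]
};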

4. Environment Variable Priority

Settings are resolved in priority order:

  1. ASPIRE_MCP_PORT environment variable (highest priority)
  2. settings.json file (updated by scripts)

Same for API key:

  1. ASPIRE_MCP_API_KEY environment variable
  2. settings.json file

This design allows flexibility: you can override settings via environment variables if needed, but the default behavior reads from the file updated by start-apphost.ps1.
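
A minimal sketch of that resolution order, reusing the hypothetical ProxySettings record from earlier (ResolveSettings is my naming, not the script’s; the environment variable names are the ones listed above):

static ProxySettings ResolveSettings(ProxySettings fromFile)
{
    // Environment variables win; the settings.json values (already loaded into
    // fromFile) are the fallback.
    var port = int.TryParse(Environment.GetEnvironmentVariable("ASPIRE_MCP_PORT"), out var p)
        ? p
        : fromFile.Port;

    var apiKey = Environment.GetEnvironmentVariable("ASPIRE_MCP_API_KEY") ?? fromFile.ApiKey;

    return fromFile with { Port = port, ApiKey = apiKey };
}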

Why a Single-File Script Is Perfect

Using .NET 10’s dotnet run app.cs feature makes this solution incredibly elegant:

# No project file needed - just run the .cs file!
dotnet run scripts/aspire-mcp-proxy.cs

The #:package directives at the top automatically restore NuGet packages:

#:package ModelContextProtocol@0.4.1-preview.1
#:package Microsoft.Extensions.Hosting@10.0.0
#:package Microsoft.Extensions.Logging@10.0.0

This is the power of modern .NET—a complete MCP proxy in a single, readable script file!

Conclusion

Aspire + MCP gives AI agents unprecedented capabilities: they can spawn entire distributed systems and interact with them programmatically. But this power only scales when you solve the port isolation problem.

The solution combines two layers:

  1. Port allocation scripts that automatically find free ports
  2. MCP proxy that provides indirection so AI agents don’t need to know which port to use

The key enablers are:

  1. Scripts that allocate unique ports and save to settings.json
  2. aspire-mcp-proxy.cs that reads settings.json dynamically
  3. GitFolderResolver for unique dashboard names
  4. WithHttpEndpoint() for random service ports

While I’ve implemented this with scripts, I encourage the Aspire team to add this as a built-in feature. Imagine:

# Future vision: Native Aspire CLI support
aspire run --isolated

# Aspire automatically:
# - Detects worktree context
# - Allocates unique ports
# - Updates MCP proxy configuration
# - Manages cleanup on exit

Until then, the scripts in the worktrees-example repository provide everything you need.

This approach transformed my AI agent workflow from sequential (one at a time) to truly parallel (four agents simultaneously). The combination of Git worktrees + Aspire orchestration + MCP observability + port isolation is a game-changer for AI-assisted development at scale.

Example Repository: Check out the complete implementation at worktrees-example with the NoteTaker application, all management scripts, and the MCP proxy.


Running multiple Aspire instances with AI agents? How are you handling MCP connections? Share your approach in the comments!

The post Scaling AI Agents with Aspire: The Missing Isolation Layer for Parallel Development appeared first on Aspire Blog.

A data model for Git (and other docs updates)

Hello! This past fall, I decided to take some time to work on Git’s documentation. I’ve been thinking about working on open source docs for a long time – usually if I think the documentation for something could be improved, I’ll write a blog post or a zine or something. But this time I wondered: could I instead make a few improvements to the official documentation?

So Marie and I made a few changes to the Git documentation!

a data model for Git

After a while working on the documentation, we noticed that Git uses the terms “object”, “reference”, or “index” in its documentation a lot, but that it didn’t have a great explanation of what those terms mean or how they relate to other core concepts like “commit” and “branch”. So we wrote a new “data model” document!

You can read the data model here for now. I assume at some point (after the next release?) it’ll also be on the Git website.

I’m excited about this because understanding how Git organizes its commit and branch data has really helped me reason about how Git works over the years, and I think it’s important to have a short (1600 words!) version of the data model that’s accurate.

The “accurate” part turned out to not be that easy: I knew the basics of how Git’s data model worked, but during the review process I learned some new details and had to make quite a few changes (for example how merge conflicts are stored in the staging area).

updates to git push, git pull, and more

I also worked on updating the introduction to some of Git’s core man pages. I quickly realized that “just try to improve it according to my best judgement” was not going to work: why should the maintainers believe me that my version is better?

I’ve seen this problem come up a lot when discussing open source documentation changes: two expert users of the software argue about whether an explanation is clear or not (“I think X would be a good way to explain it! Well, I think Y would be better!”).

I don’t think this is very productive (expert users of a piece of software are notoriously bad at being able to tell if an explanation will be clear to non-experts), so I needed to find a way to identify problems with the man pages that was a little more evidence-based.

getting test readers to identify problems

I asked for test readers on Mastodon to read the current version of documentation and tell me what they find confusing or what questions they have. About 80 test readers left comments, and I learned so much!

People left a huge amount of great feedback, for example:

  • terminology they didn’t understand (what’s a pathspec? what does “reference” mean? does “upstream” have a specific meaning in Git?)
  • specific confusing sentences
  • suggestions of things to add (“I do X all the time, I think it should be included here”)
  • inconsistencies (“here it implies X is the default, but elsewhere it implies Y is the default”)

Most of the test readers had been using Git for at least 5-10 years, which I think worked well – if a group of test readers who have been using Git regularly for 5+ years find a sentence or term impossible to understand, it makes it easy to argue that the documentation should be updated to make it clearer.

I thought this “get users of the software to comment on the existing documentation and then fix the problems they find” pattern worked really well and I’m excited about potentially trying it again in the future.

the man page changes

We ended up updating these 4 man pages:

The git push and git pull changes were the most interesting to me: in addition to updating the intro to those pages, we also ended up writing:

Making those changes really gave me an appreciation for how much work it is to maintain open source documentation: it’s not easy to write things that are both clear and true, and sometimes we had to make compromises, for example the sentence “git push may fail if you haven’t set an upstream for the current branch, depending on what push.default is set to.” is a little vague, but the exact details of what “depending” means are really complicated and untangling that is a big project.

on the process for contributing to Git

It took me a while to understand Git’s development process. I’m not going to try to describe it here (that could be a whole other post!), but a few quick notes:

  • Git has a Discord server with a “my first contribution” channel for help with getting started contributing. I found people to be very welcoming on the Discord.
  • I used GitGitGadget to make all of my contributions. This meant that I could make a GitHub pull request (a workflow I’m comfortable with) and GitGitGadget would convert my PRs into the system the Git developers use (emails with patches attached). GitGitGadget worked great and I was very grateful to not have to learn how to send patches by email with Git.
  • Otherwise I used my normal email client (Fastmail’s web interface) to reply to emails, wrapping my text to 80-character lines since that’s the mailing list norm.

I also found the mailing list archives on lore.kernel.org hard to navigate, so I hacked together my own git list viewer to make it easier to read the long mailing list threads.

Many people helped me navigate the contribution process and review the changes: thanks to Emily Shaffer, Johannes Schindelin (the author of GitGitGadget), Patrick Steinhardt, Ben Knoble, Junio Hamano, and more.

(I’m experimenting with comments on Mastodon, you can see the comments here)

Postgres vs tproc-c on a small server

This is my first post with results from tproc-c using HammerDB. This post has results for Postgres. 

tl;dr - across 8 workloads (low and medium concurrency, cached database to IO-bound)

  • there might be a regression for Postgres 14.20 and 15.15 in one workload
  • there are improvements, some big, for Postgres 17 and 18 in most workloads

Builds, configuration and hardware

I compiled Postgres from source using -O2 -fno-omit-frame-pointer for versions 12.22, 13.23, 14.20, 15.15, 16.11, 17.7 and 18.1.

The server is an ASUS ExpertCenter PN53 with an AMD Ryzen 7 7735HS CPU, 8 cores, SMT disabled, and 32G of RAM. Storage is one NVMe device for the database using ext-4 with discard enabled. The OS is Ubuntu 24.04. More details on it are here.

For versions prior to 18, the config file is named conf.diff.cx10a_c8r32 and the configs are as similar as possible across versions; they are here for versions 12, 13, 14, 15, 16 and 17.

For Postgres 18 the config file is named conf.diff.cx10b_c8r32 and adds io_method='sync', which matches the behavior of earlier Postgres versions.

Benchmark

The benchmark is tproc-c from HammerDB. The tproc-c benchmark is derived from TPC-C.

The benchmark was run for several workloads:
  • vu=1, w=100 - 1 virtual user, 100 warehouses
  • vu=6, w=100 - 6 virtual users, 100 warehouses
  • vu=1, w=1000 - 1 virtual user, 1000 warehouses
  • vu=6, w=1000 - 6 virtual users, 1000 warehouses
  • vu=1, w=2000 - 1 virtual user, 2000 warehouses
  • vu=6, w=2000 - 6 virtual users, 2000 warehouses
  • vu=1, w=4000 - 1 virtual user, 4000 warehouses
  • vu=6, w=4000 - 6 virtual users, 4000 warehouses
The benchmark is run by this script, which depends on scripts here. For these runs:
  • stored procedures are enabled
  • partitioning is used when the warehouse count is >= 1000
  • a 5 minute rampup is used
  • then performance is measured for 120 minutes

Results

All artifacts from the tests are here. A spreadsheet with the charts and numbers is here.

My analysis at this point is simple -- I only consider average throughput. Eventually I will examine throughput over time and efficiency (CPU and IO).

On the charts that follow, the y-axis starts at 0.9 to improve readability. The y-axis shows relative throughput. There might be a regression when the relative throughput is less than 1.0, and there might be an improvement when it is > 1.0. The relative throughput is:
(NOPM for a given version / NOPM for Postgres 12.22)

Results: vu=1, w=100

Summary:
  • no regressions, no improvements
Results: vu=6, w=100

Summary:
  • no regressions, no improvements
Results: vu=1, w=1000

Summary:
  • no regressions, improvements in Postgres 17 and 18
Results: vu=6, w=1000

Summary:
  • no regressions, improvements in Postgres 16, 17 and 18
Results: vu=1, w=2000

Summary:
  • possible regressions in Postgres 14 and 15, improvements in 13, 16, 17 and 18
Results: vu=6, w=2000

Summary:
  • no regressions, improvements in Postgres 13 through 18
Results: vu=1, w=4000

Summary:
  • no regressions, improvements in Postgres 13 through 18
Results: vu=6, w=4000

Summary:
  • no regressions, improvements in Postgres 16 through 18