Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Create Fillable PDFs from HTML Forms in C# ASP.NET Core Using a WYSIWYG Template

Learn how to generate PDFs from HTML forms in ASP.NET Core using a pixel-perfect WYSIWYG template. Extract form fields from a document, render a dynamic HTML form, and merge the data server-side to produce professional PDF documents.

Read the whole story
alvinashcraft
39 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

Blazor Basics: Implementing a Theme Switch in Blazor (Dark Mode)


Learn how to implement dark mode and a theme switch for Blazor web applications using standardized CSS features and no JavaScript code.

In this article, we’ll learn how to properly implement dark mode and a theme switch for Blazor web applications.

Introduction

We want to implement a minimal yet complete example of dark mode and a theme switch for Blazor web applications.

We will keep it simple and use pure CSS to style the application and Blazor to manage the state. We do not use any JavaScript code for this solution.

Hint: Although the code used in the example project is based on the .NET 10 Blazor Web Application project template, which uses Bootstrap for its sample pages, we do not use Bootstrap or any other CSS library to implement the dark mode or the theme switch.

You can access the code used in this example on GitHub.

The Concept

There are three core principles we want to follow:

  1. We define all colors as CSS variables.
  2. We use a .dark-theme CSS class to override/replace those base colors.
  3. We implement a Blazor component that toggles this CSS class at runtime.

This solution works for Blazor Server and Blazor WebAssembly.

Defining the CSS Color Variables

In the app.css file or any other CSS file referenced in the index.html or App.razor file of your Blazor web application, we add the following CSS variable definitions:

:root {
    --background-color: #FFFFFF;
    --text-color: #1F1F39;
}

.dark-theme {
    --background-color: #1F1F39;
    --text-color: #FAFAFB;
}

Hint: We keep it simple in this example and use only CSS variables for the background color and text color. In a real-world implementation, you might also want to define CSS variables for border colors, primary and secondary colors, etc.

Applying the Color Variables

Now that we have defined the CSS variables for our desired colors, we need to apply them in CSS rules. Remember that the definitions above only declare the variables; they do not apply them anywhere.

.theme {
    background-color: var(--background-color);
    color: var(--text-color);
}

Again, we keep it simple and apply the background color and the text color to the theme CSS class. We could go on and define colors for buttons, headings and other parts of the web application.
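For example, the same variables could be extended to other elements. The selectors below are a hypothetical sketch and not part of the sample project:

```css
/* Hypothetical extension: themed buttons and headings
   reusing the same CSS variables. */
.theme button {
    background-color: var(--background-color);
    color: var(--text-color);
    border: 1px solid var(--text-color);
}

.theme h1,
.theme h2 {
    color: var(--text-color);
}
```

Because the elements reference variables rather than hard-coded colors, toggling the .dark-theme class is enough to restyle all of them at once.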

Adding the Theme State to the Layout Component

Now that we have the basic building block for styling the component in place, we want to keep track of whether dark mode is in use.

A simple implementation alters the MainLayout component like this:

<div class=@($"page theme {ThemeCSSClass}")>
    <div class="sidebar">
        <NavMenu />
    </div>

    <main>
        <div class="top-row px-4">
            <button class="btn btn-primary" @onclick="ToggleTheme">
                Toggle Theme
            </button>
            <a href="https://learn.microsoft.com/aspnet/core/" target="_blank">About</a>
        </div>

        <article class="content px-4">
            @Body
        </article>
    </main>
</div>

We add two CSS classes to the class list of the outer div element. In addition to the page CSS class used in the default project template, we also want to add the previously defined theme class, followed by a class name managed by the component.

We also add a button to toggle the theme.

We add the following behavior to the code section of the component:

@code {
    private bool _darkModeEnabled = false;

    private string ThemeCSSClass => _darkModeEnabled ? "dark-theme" : "";

    private void ToggleTheme()
    {
        _darkModeEnabled = !_darkModeEnabled;
    }
}

The code defines a _darkModeEnabled variable that tracks the user’s theme choice. We conditionally return dark-theme or an empty string for the ThemeCSSClass property, which is then rendered as part of the CSS class name list for the Layout component’s outer div.

The ToggleTheme method is triggered when the user presses the button and wants to switch the theme. It flips the value of the _darkModeEnabled boolean variable.

Theming in Action

The user can now toggle the light and dark themes using the button in the page header.

[Screenshot: the Blazor web application in light mode, with dark text on a white background.]

[Screenshot: the Blazor web application in dark mode, with light text on a dark background.]

Why This Solution Works Well in Blazor

This solution leverages how Razor component rendering works. Whenever the state of a component changes, the component is re-rendered.

For the layout component, this means that whenever the user presses the button, the component’s state changes, so it will be re-rendered. As part of the rendering process, the correct CSS classes will be applied.

A declarative user interface definition combined with state-driven component rendering is the strength of Blazor, and our solution therefore fits it perfectly.

Improving the User Experience

There are two main areas that could be improved to take our simple solution to the next level:

  • The theme state is currently held by the MainLayout component, so if the user closes the browser tab or refreshes the page, the choice is lost. Persisting the theme choice in local storage keeps the user experience consistent. Learn more about accessing local storage in Blazor Server web applications or how to use local storage in Blazor WebAssembly applications.
  • You could respect the system preferences and apply them when a particular user starts the web application for the first time. CSS provides the prefers-color-scheme media feature. The code would look like this:
@media (prefers-color-scheme: dark) {
    :root {
        /* variable definitions */
    }
}

@media (prefers-color-scheme: light) {
    :root {
        /* variable definitions */
    }
}
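As a rough sketch of the first improvement above, Blazor Server applications can persist the choice with the built-in ProtectedLocalStorage service (Blazor WebAssembly needs a different mechanism, such as a community local storage package). The key name "dark-mode" is an arbitrary choice for illustration:

```razor
@* Sketch only: additions to MainLayout.razor, assuming Blazor Server. *@
@using Microsoft.AspNetCore.Components.Server.ProtectedBrowserStorage
@inject ProtectedLocalStorage LocalStorage

@code {
    private bool _darkModeEnabled;

    protected override async Task OnAfterRenderAsync(bool firstRender)
    {
        if (firstRender)
        {
            // Restore the persisted theme choice after the first render.
            var result = await LocalStorage.GetAsync<bool>("dark-mode");
            if (result.Success)
            {
                _darkModeEnabled = result.Value;
                StateHasChanged();
            }
        }
    }

    private async Task ToggleTheme()
    {
        _darkModeEnabled = !_darkModeEnabled;
        await LocalStorage.SetAsync("dark-mode", _darkModeEnabled);
    }
}
```

Note that ToggleTheme becomes async here; @onclick handles async Task handlers without further changes.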

The Limitations of This Approach

As the previous section on improving the user experience showed, our approach is basic and doesn’t scale well. The following inherent limitations come to mind:

  • Handling system preferences already adds complexity to our solution and requires more code to cleanly implement the expected behavior.
  • CSS variables and definitions only cascade down the DOM tree. In our example, we add the themed CSS classes to the MainLayout component. Components rendered outside the MainLayout component element tree are out of scope for theming and must be treated the same way (by adding the theme and dark-mode CSS classes).
  • Changing colors and adjusting the theme requires altering global CSS definitions, which is hard to test and can potentially have unexpected side effects. Also, making sure to apply the CSS variables in the correct applicable CSS classes doesn’t scale.
  • Third-party components will not automatically pick up the custom CSS classes defined in the application. This means that if you integrate third-party components, you will need additional code to connect them to your custom theming implementation.
  • Implementing multiple themes becomes even harder to maintain. Working with light and dark mode should not add too much complexity, but maintaining three or four, or even 10 different themes becomes much harder.
  • You have to verify the selected colors work together and meet accessibility requirements (e.g., color contrast). And let’s be honest—not all developers are good designers.

For a small website, the approach shown in this article will probably work well. For larger web applications consisting of hundreds of pages and components, it can become a maintenance nightmare if the implementation is not carefully architected to properly handle state changes and apply the correct CSS classes for each component during rendering.

How Telerik UI for Blazor Solves Theming at Scale

In modern large-scale web applications, theming is important and helps meet accessibility requirements and expected user experience standards.

The Progress Telerik UI for Blazor component library provides a Blazor ThemeBuilder tool that allows visual customization (color previews), SCSS-based design tokens and built-in dark and light themes.

And most importantly, the themes are applied consistently across all components.

It is still important to understand how theming works under the hood, but for a professional web application, using a professionally implemented theming system can prevent headaches and reduce the amount of custom code required. Plus, accessibility features are baked in.

Learn more about theming from Peter Vogel in the Themes Magic in Telerik UI for Blazor article.

Conclusion

We learned how to implement a simple solution for a theme switch and light and dark mode in a Blazor web application. This basic solution works for Blazor Server and Blazor WebAssembly because it uses standardized CSS features and no JavaScript.

We learned how to further improve the solution and what limitations it will face for a large-scale web application.

Professionally implemented user interface component libraries, such as Telerik UI for Blazor, implement complex theming systems that we can leverage to get around those limitations.


If you want to learn more about Blazor development, watch my free Blazor Crash Course on YouTube. And stay tuned to the Telerik blog for more Blazor Basics.


Lock Down Values in Pulumi ESC with fn::final


Pulumi ESC (Environments, Secrets, and Configuration) allows you to compose environments by importing configuration and secrets from other environments, but this also means a child environment can silently override a value set by a parent. When that value is a security policy or a compliance setting, an accidental override can cause real problems. With the new fn::final built-in function, you can mark values as final, preventing child environments from overriding them. If a child environment tries to override a final value, ESC raises a warning and preserves the original value.

How it works

Let’s say you have a parent environment that sets the AWS region for all deployments. You can use fn::final to ensure no child environment can change it:

# project/parent-env
values:
  aws-region:
    fn::final: us-east-1

If a child environment tries to override the final value, ESC raises a cannot override final value warning.

# project/child-env
imports:
  - project/parent-env
values:
  aws-region: eu-west-1 # raises a warning

This evaluates to:

{
 "aws-region": "us-east-1"
}

In this scenario, the ESC environment is still valid, but the final value remains unchanged.

When to use fn::final

Use fn::final for:

  • Security-sensitive values that shouldn’t be changed
  • Compliance or policy settings enforced by a platform team
  • Shared base environments where certain values must remain consistent

Getting started

The fn::final function is available now in all Pulumi ESC environments. For more information, check out the fn::final documentation!


RT.Assistant: A Multi-Agent Voice Bot Using .NET and OpenAI


This is a guest post by Faisal Waris, an AI strategist in the telecom industry. Faisal built RT.Assistant to explore how .NET, F#, and the OpenAI Realtime API can come together in a production-style, multi-agent voice application.

RT.Assistant is a voice-enabled, multi-agent assistant built entirely in .NET — combining the OpenAI Realtime API over WebRTC for low-latency, bidirectional voice; F# discriminated unions and async state machines for agent orchestration; .NET MAUI (via Fabulous) for cross-platform native UI on iOS, Android, macOS, and Windows; and Microsoft.Extensions.AI for portable LLM integration with both OpenAI and Anthropic models.

Under the hood, a custom RTFlow framework hosts multiple specialized agents — a Voice Agent, a CodeGen Agent, a Query Agent, and an App Agent — that communicate over a strongly-typed async bus, while a deterministic state-machine (the “Flow”) keeps the non-deterministic LLM behavior in check. The sample also showcases an unconventional RAG approach: instead of vector search, user queries are translated into Prolog and executed against a logic-programming knowledge base embedded in a .NET MAUI HybridWebView, yielding precise, hallucination-resistant answers.

Why telecom plan selection?

Faisal works in the telecom industry, where one of the most common customer pain points is choosing the right phone plan. Carriers offer dozens of bundled plans that mix voice, data, hotspot, streaming, and promotional pricing in ways that are genuinely difficult to compare — even for the people selling them. It’s the kind of domain where a conversational AI assistant can make a real difference: customers ask natural-language questions and get precise, verifiable answers instead of sifting through comparison matrices or waiting on hold.

RT.Assistant uses this domain as a realistic proving ground. The application maintains a mocked — but representative — catalog of plans modeled after a major US carrier’s actual offerings. Let’s look at what makes these plans hard to compare in the first place.

Plan offerings

Phone plans (like many other offerings these days) are bundled products and services, which makes it non-trivial to ascertain which plan will work best for one’s needs. Consider the following components of a typical contemporary plan:

  • Base plan with voice, text and data rates & limits
  • Mobile hotspot data rates & limits (20 GB, 50 GB, 100 GB?)
  • Premium data rates & limits (with premium, one’s data gets preference in a crowded/overloaded network)
  • Streaming services e.g. Netflix, Hulu, Apple TV, etc.
  • Taxes and fees may or may not be included in the plan price
  • Special discounts for Military Veterans, First Responders, Seniors, etc.
  • In-flight data rates & limits
  • Then there may be seasonal or campaign promotions

Additionally, the available features may be dependent on the number of lines (distinct phone numbers). For example, Netflix may be excluded for a single line but included for two or more lines.

Technologies Showcased

The system internally maintains a mocked—but representative—set of phone plans, modeled as Prolog facts, simulating offerings from a typical major telecom provider. From (a) capturing the voice input, to (b) querying the Prolog knowledge base, and finally (c) generating the results, multiple components work together seamlessly.

This sample highlights the integration of the following frameworks and technologies:

  • RTFlow: Multi-agent framework for realtime GenAI applications – written in F#
  • RTOpenAI: F# library for interfacing with the OpenAI realtime API via the WebRTC protocol
  • Fabulous for .NET MAUI: F# library for building native mobile applications (iOS, Android, and more) with Microsoft .NET MAUI
  • Tau: JavaScript-based Prolog engine and its use in a native mobile app via the .NET MAUI HybridWebView control.
  • Integration with the OpenAI & Anthropic ‘chat’ APIs for Prolog code generation with the Microsoft.Extensions.AI library

Overview

There is a lot going on here: generative AI, old-school symbolic AI, multi-agents, realtime voice and cross-platform native mobile apps, to name a few. The following explains how these are all stitched together into a comprehensive system.

  • Voice-enabled interaction: The assistant allows users to ask questions about phone plans through natural speech, making the experience conversational and intuitive.
  • Structured knowledge base: Plan details are represented as logically consistent Prolog facts, ensuring clarity and eliminating ambiguity in feature–price combinations.
  • Internally, multiple specialized agents work together to process and respond to user queries:
    • Voice Agent – Maintains a connection to the OpenAI realtime ‘voice’ API, enabling natural voice conversations about phone plans. It handles the steady stream of messages from the API, including tool calls containing natural language queries. These queries are then routed to other agents, and the resulting answers are returned to the voice model, which conveys them back to the user in audio form.
    • CodeGen Agent – Converts natural language queries into Prolog statements using another LLM API, then leverages the Tau Prolog engine to evaluate those statements against the knowledge base of facts.
    • Query Agent – Executes predefined (“canned”) Prolog queries directly against the Prolog engine for quick, structured lookups.
    • App Agent – Oversees communication among agents and reports activity back to the user interface for transparency and monitoring.

All agents are orchestrated by the RTFlow framework, which provides hosting and communication services. The diagram below illustrates the RTFlow agent arrangement for the RT.Assistant sample:

RTFlow

As there are multiple frameworks and technologies in play here, let’s briefly delve into each one of them, in the order of perceived importance.

1. RTFlow

RTFlow is a framework for building real-time, agentic applications. It is composed of three primary elements: Flow, Bus, and Agents.

Bus

The Bus provides the communication substrate that connects Agents to one another and to the Flow. It exposes two distinct logical channels:

  • Agent broadcast channel: All agent-intent messages published to this channel are broadcast to all agents.
  • Flow input channel: Messages published to this channel are delivered exclusively to the Flow. Agents do not receive these messages.

This separation allows agent collaboration to occur independently of system-level orchestration, while still enabling agents to explicitly signal the Flow when required.

Messages and State

Both Flow and Agents maintain private internal state and communicate exclusively via strongly typed, asynchronous messages. Message ‘schemas’ are defined as F# discriminated union (DU) types and are fixed at implementation time, providing:

  • Compile-time exhaustiveness checking
  • Explicit modeling of intent and system events
  • Clear separation between agent-level and flow-level concerns
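The post doesn’t show RT.Assistant’s actual message types; as a hedged illustration, such a DU-based message schema might look like this (the case names here are invented for the example):

```fsharp
// Hypothetical message schema; RT.Assistant's real DUs differ.
type AgentMsg =
    | M_Start
    | M_Started
    | M_UserQuery of text: string    // natural-language query from the Voice Agent
    | M_PrologQuery of code: string  // Prolog generated by the CodeGen Agent
    | M_Answer of result: string
    | M_Terminate
```

Because the compiler checks that every match over AgentMsg handles all cases, adding a new message type immediately surfaces every handler that must be updated.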

Flow

The Flow is an asynchronous, deterministic state machine. Its state transitions are triggered solely by messages arriving on the Flow input channel.

Depending on application requirements, the Flow can range from minimal to highly directive:

  • Minimal control: A simple lifecycle state machine (e.g., Start → Run → Terminate), where agents primarily interact with each other via the broadcast channel and the Flow plays a supervisory role.
  • Orchestrated control: A more granular state machine in which agents primarily communicate with the Flow, and the Flow explicitly coordinates agent behavior based on its current state.

This design allows system-level determinism and control to be introduced incrementally, without constraining agent autonomy where it is unnecessary.

Topology

From a multi-agent systems perspective, RTFlow employs a hybrid bus–star topology:

  • The Bus enables broadcast-based, peer-style agent communication.
  • The Flow acts as a central coordinating node when orchestration is required.

This hybrid model balances scalability and decoupling with deterministic system control.

The F# language offers a clean way to model asynchronous state machines (more precisely, Mealy machines) where the states are functions and transitions happen via pattern matching over messages (DUs) or with ‘active patterns’. In the snippet below, the s_* functions are states and the M_* values are messages that arrive on the Bus. The structure F packages the next state along with any output messages to be sent to agents.

let rec s_start msg = async {
  match msg with 
  | M_Start -> return F(s_run,[M_Started]) //transition to run
  | _       -> return F(s_start,[]) //stay in start state
  }

and s_run msg = async {
  match msg with 
  | M_DoSomething -> do! doSomething()
                     return F(s_run,[M_DidSomething])                     
  | M_Terminate   -> return F(s_terminate,[])
  | _             -> return F(s_run,[])
}

and s_terminate msg = async {
...

LLMs are inherently non-deterministic. RTFlow offers a way to control non-determinism to keep the overall system stable. As applications move from being human-centric to being more autonomous, we will need increasingly sophisticated methods to manage non-determinism. RTFlow’s approach is to inject a deterministic state machine in the mix to effect such control.

Given the relatively simple building blocks of RTFlow, we can construct rich agentic systems that support many realtime needs, with the ability to dial in the desired degree of control when needed.

2. RTOpenAI

RTOpenAI wraps the OpenAI realtime voice API for native mobile(+) apps. Its two key features are a) support for the WebRTC protocol; and b) strongly-typed realtime protocol messages. These are discussed next.

WebRTC

The OpenAI voice API can be used via WebSockets or WebRTC, and WebRTC has some key advantages:

  • WebRTC was designed for bidirectional, realtime communication. It has built-in resiliency for minor network disconnects, which WebSockets crucially lacks.
  • WebRTC has separate channels for voice and data (also video, which is not currently used). This means the application typically only needs to handle the data channel explicitly; the in/out audio channels are wired to the audio hardware by the underlying WebRTC libraries. With WebSockets, the application must explicitly handle in/out audio as well as the data.
  • WebRTC transmits audio via the Opus codec, which has excellent compression while retaining good audio quality. For WebSockets, multiple choices exist: high-quality audio is sent as uncompressed 24 kHz PCM, base64 encoded, which requires roughly 10× the bandwidth of Opus. Other telephony formats are available, but audio quality drops significantly.

Strongly-Typed Event Handling

The RTOpenAI.Events library attempts to define F# types for all OpenAI realtime API protocol messages (that are currently documented).

Additionally, the server (and client) messages are wrapped in DUs, which is convenient for consuming applications; incoming events can be handled with simple pattern matching. After the realtime connection is established, there is a steady flow of incoming events from the server that the application needs to accept and handle. The following snippet is an impressionistic version of how the Voice Agent handles server events:

let handleEvent (ev:ServerEvent) = async {
  match ev with
  | SessionCreated                                   -> ...
  | ResponseOutputItemDone ev when isFunctionCall ev -> ...
  | _                                                -> ... //choose to ignore
}

The RTOpenAI library is a cross-platform .NET MAUI (see next) library and as such supports realtime voice applications for iOS, macOS, Android and Windows.

3. Fabulous .NET MAUI Controls

Microsoft .NET MAUI is a technology for building cross-platform native apps. The F# library Fabulous.MauiControls enables building of .NET MAUI apps in F#.

Fabulous is a functional-reactive UI framework (influenced by Elm and React).

Fabulous is a joy to use. UIs can be defined declaratively in simple and understandable F#. UI ‘events’ are messages, again F# DU types, that are handled with pattern matching. In the simplest case, events update the application state, which Fabulous then renders to the screen.

Fabulous for .NET MAUI has a rich feature set, which cannot be fully covered here but the Counter App sample is replicated below to provide some sense of how the library works:

/// A simple Counter app

type Model =         //application state
    { Count: int }

type Msg =           //DU message types
    | Increment
    | Decrement

let init () =
    { Count = 0 }

let update msg model = //function to handle UI events/messages
    match msg with
    | Increment -> { model with Count = model.Count + 1 }
    | Decrement -> { model with Count = model.Count - 1 }

let view model =  
    Application(
        ContentPage(
            VStack(spacing = 16.) {                     //view
                Image("fabulous.png")

                Label($"Count is {model.Count}")

                Button("Increment", Increment)
                Button("Decrement", Decrement)
            }
        )
    )

RT.Assistant is a .NET MAUI application, so the project structure is defined by .NET MAUI. It’s a single project that targets multiple platforms. Components specific to each target platform live under the Platforms folder:

/RT.Assistant
  /Platforms
    /Android
    /IOS
    /MacCatalyst
    /Windows

The platform-specific folders contain the components the native app requires (plists, app manifests, etc.). For example, here is the iOS plist.

RT.Assistant application code is 90% shared across platforms. However, platform-specific libraries are required when interfacing with hardware that .NET MAUI does not cover. For WebRTC, RTOpenAI uses platform-native libraries via Native Library Interop: the iOS WebRTC binding library wraps the WebRTC.xcframework (written in C++), and for Android the native libwebrtc.aar Android Archive is wrapped.

Since most mobile apps have both iOS and Android versions, .NET MAUI makes a lot of sense: instead of maintaining multiple code bases and dev teams, one can maintain a single code base with 90% shared code across platforms. And unlike some other mobile frameworks (e.g., React Native), .NET MAUI apps are proper native apps. For example, it would be problematic to host a realtime multi-agent system like RTFlow in a JavaScript-based system like React Native.

4. Prolog for RAG

To make the sample somewhat fun and interesting, I decided to use Prolog-based ‘RAG’. Generative AI meets Symbolic AI.

Prolog is a language for logic programming that was created more than 50 years ago, and it has endured well to this day. The best-known open-source implementation is SWI-Prolog. However, here I am using the much lighter-weight Tau Prolog engine, which runs in the browser.

Fortunately, web content can easily be hosted in .NET MAUI apps via the HybridWebView control. In RT.Assistant, a hidden web view loads the Tau engine and the plan facts.

Prolog Representation

The typical phone plans from the major telecoms are ‘rich’ offerings. The interplay of base plans, number of lines, features and promotions suggest a rules engine based approach. This is precisely where Prolog excels. By representing valid combinations of plans, features, and pricing as logical facts, Prolog ensures consistency and removes ambiguity.

Prolog is a declarative language for first-order logic programming. A Prolog ‘database’ consists of facts (e.g. plans and their features) and rules (to derive new facts from existing ones). A Prolog implementation will find any and all possible solutions that satisfy a query, given a set of facts and rules.
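As a generic illustration of the fact/rule distinction (the predicates below are made up for this example and are not part of the sample’s knowledge base):

```prolog
% Facts: plan_price(Title, Lines, MonthlyPrice).
plan_price("Connect", 1, 20).
plan_price("Connect", 2, 40).

% Rule: a plan is affordable for N lines if its price fits a budget.
affordable(Title, Lines, Budget) :-
    plan_price(Title, Lines, Price),
    Price =< Budget.

% Query: ?- affordable(Title, 2, 50).
```

Given these facts, the query would succeed with Title bound to "Connect"; with a larger fact base, Prolog would enumerate every plan satisfying the constraint.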

The ‘schema’ for the plan and its features is in plan_schema.pl. The skeletal form is:

plan(title,category,prices,features)
% where each feature may have a different attribute set

A partial fact for the ‘Connect’ plan is given below:

plan(
    "Connect",
    category("all"),
    prices([
      line(1, monthly_price(20), original_price(25)),
      line(2, monthly_price(40), original_price(26)),
      ...
    ]),
    features([

        feature(
            netflix(
                desc("Netflix Standard with Ads On Us"),
                included(yes)
            ),
            applies_to_lines(lines(2, 2))
        ),

        feature(
            autopay_monthly_discount(
                desc("$5 disc. per line up to 8 lines w/AutoPay & eligible payment method."),
                discount_per_line(5),
                lines_up_to(8),
                included_in_monthly_price(yes)
            ),
            applies_to_lines(all)
        ),
        ...

    ])
).

Note

The full Prolog fact may seem complex; however, the same rules expressed in a relational database schema would be far harder to understand and query. The metadata (columns, tables, relations) required to represent the rules and facts would be far greater than what Prolog requires.

Query Processing

While we can obtain an answer by prompting the LLM with text descriptions of the plans along with the query, there is a sound reason for not doing so. LLMs are not perfect and can make mistakes. And here we desire a more precise answer. So, instead we transform the natural language user query into an equivalent Prolog query – with the help of an LLM. It is surmised that the reformulation of the question is easier for the LLM, i.e. the LLM is less likely to hallucinate compared to the case of generating the answer directly. For direct answer generation, the LLM will need to sift through a much larger context – the entire plan database as plain text. For query generation, the LLM need only look at the database ‘schema’ – which is much more compact, especially in the case of Prolog.

If query transformation goes awry then the Prolog query may fail entirely or produce strange results. Either way the user will be alerted and will not rely on the results to make a decision. If on the other hand, the answer is generated directly, a hallucination may subtly alter or miss facts. The user is likely to accept it without questioning because the answer looks plausible. This is a more egregious error.

Example Queries

Below are some typical questions that can be asked:

  • What categories of plans are available?
  • What is the lowest cost plan for 4 lines?
    • Follow up: Does this plan include taxes and fees in the price?
  • Are there any special plans for military veterans?
    • Follow up: What is the amount of mobile hotspot data available for this plan?

The RT.Assistant application shows the natural language query; the generated Prolog; and the Prolog query results on the UI in realtime.

Example:

Natural language query generated by voice model from conversation:

Find the plans in the category 'military_veteran' for 2 lines and list their costs.

Prolog query:

plan(Title,
     category(military_veteran),
     prices(Lines),
     _),
member(line(2, monthly_price(Price), _), Lines).

Note

In Prolog, uppercase-starting names are ‘free’ variables that can be bound to values. For example, ‘Title’ above will bind to each of the plan titles for the found solutions. A solution satisfies all constraints. One obvious constraint is ‘category=military_veteran’ so only Military Veteran plans will be considered.

Results:

Title = Connect Next Military, Lines =
[line(1,monthly_price(85),original_price(90)),
 line(2,monthly_price(130),original_price(140)),
 line(3,monthly_price(165),original_price(180)),
 line(4,monthly_price(200),original_price(220)),
 line(5,monthly_price(235),original_price(260))],
Price = 130

Title = Core, Lines = ...

If a Prolog error occurs, the system regenerates the Prolog query but this time includes the Prolog error message along with the original query. This cycle may be repeated up to a limit.

Prolog Code Generation and Results

For code generation, the application allows a choice between Claude Sonnet 4.5 and GPT-5.1 (via the app Settings). The GPT Codex model was also tested, but its latency is too high for realtime needs.

For this particular task, GPT-5.1 has the clear edge, generating code that produces concise and relevant output. See this analysis for more details.

For what it’s worth, both models generate syntactically correct Prolog 99% of the time. (A retry loop corrects generated errors, if any.)

For question-answering, the OpenAI realtime model generates satisfactory answers to user queries from the generated Prolog output. Note that for any real production system there should be a well-crafted ‘eval’ suite to truly gauge the performance.

The post RT.Assistant: A Multi-Agent Voice Bot Using .NET and OpenAI appeared first on .NET Blog.

Stop Paying for the Same Answer Twice: Agent Cache in Fiddler Everywhere

Agent Caching in Fiddler Everywhere allows you to iterate as you build an agent without having to pay for every response when it hasn’t changed.

If you have ever built a model-powered agent, you know the development loop. Write some code, fire it at the endpoint, check the response, tweak the parsing, fire it again. Repeat until the output looks right. It is a perfectly normal workflow—and it quietly drains your token budget with every single iteration.

The new Progress Telerik Fiddler Everywhere Agent Cache feature is designed to break that cycle. Once you capture a response from a model-provider endpoint, you can flip a single switch and have Fiddler software replay that response for every subsequent matching call—without the request ever leaving your machine. Same output, zero additional tokens consumed on the provider side.

This post walks through exactly how that works, using a small open-source demo project to make everything concrete.

The Hidden Cost of Agent Development

Building an agent that calls a completion endpoint involves a lot of repetition that has nothing to do with the model itself. You are iterating on:

  • How you construct the prompt
  • How you parse and validate the structured response
  • How you surface the result to the rest of your system
  • How your error handling behaves when the response is malformed

None of those iterations require a new, unique response from the model. You already have a good one from the first call. But unless you manually save the raw response and mock it yourself, every invocation sends a fresh request, and the provider charges for it.

Once agents move beyond demos, three pressures show up together and stay for the duration of development:

  • Cost – Repeated runs during development burn budget. For a simple agent that exchanges a few hundred tokens per call, this might feel negligible. But development sessions involve dozens of sequential runs, teams have multiple developers iterating in parallel, and the costs compound quickly.
  • Latency – Every round trip to the provider stretches the feedback loop. When you are tweaking prompt construction or adjusting response parsing, waiting on a live call each time slows everything down.
  • Determinism – The same input does not always produce the same output. That variability makes it harder to isolate whether a difference in behavior came from your code change or from the model.

This is especially visible in teams that build many small, task-specific agents rather than one large agent. Even small per-run costs compound when iteration is constant—and none of that spend actually improves the agent.
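A rough back-of-the-envelope estimate makes the compounding concrete. All of the numbers below are hypothetical; real rates and usage vary widely by provider, model, and team.

```python
# Back-of-the-envelope cost estimate with hypothetical numbers; actual
# provider pricing and team usage will differ.
runs_per_session = 50        # agent runs in one dev session
sessions_per_week = 5
developers = 4
calls_per_run = 10           # model calls a multi-step agent makes per run
tokens_per_call = 2_000
price_per_million = 3.0      # hypothetical blended $ per 1M tokens

weekly_tokens = (runs_per_session * sessions_per_week * developers
                 * calls_per_run * tokens_per_call)
weekly_cost = weekly_tokens / 1_000_000 * price_per_million
print(f"{weekly_tokens:,} tokens/week = ${weekly_cost:.2f}")
```

Even with these modest assumptions the team burns tens of millions of tokens a week on iteration alone, and none of that spend produces a response the team didn’t already have.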

What Teams Already Do

Most teams already compensate for this manually. Common patterns include separating development runs from real execution, validating agent wiring before triggering model calls, reusing mocked or previously captured responses, and avoiding live execution early to keep iteration fast.

These approaches work, but they are fragmented. Provider-level caching helps in some cases but is limited. Custom mocks and fixtures are costly to maintain. Replay logic often lives outside the main development flow, and different teams end up solving the same problem with different local tooling.

The problem is not a lack of solutions. It is the lack of a low-friction one that fits naturally into everyday iteration.

What Agent Cache Does

Fiddler Everywhere acts as a proxy that sits between your agent and the remote endpoint. When your agent makes an HTTPS call to, say, api.anthropic.com, Fiddler software intercepts it, forwards it and logs the full request-response pair in the Traffic pane.

The new Agent Calls tab is a focused view inside that pane. It automatically filters and displays HTTPS sessions that target supported model-provider endpoints—such as OpenAI, Anthropic and Gemini—so you are not wading through noise from other traffic. Every captured call gets a Caching toggle.

Enable the toggle, and Fiddler software starts intercepting any outbound call that matches that session’s request. Instead of forwarding the request, it immediately returns the cached response. The endpoint never receives the duplicate call. Your agent sees the exact same payload it would have received from a live call. Token count: zero.

Disable the toggle at any time and live traffic resumes, no restarts required.

How Agent Calls and Caching Behave

A few details that matter when you start using it:

  • Deterministic filtering: Sessions appear in Agent Calls automatically when Fiddler software detects traffic to a supported agentic endpoint. You do not need to configure which endpoints to watch.
  • First-match caching: If two or more sessions target the same endpoint (for example, https://api.anthropic.com/v1/messages) and both are cached, Fiddler software returns the response from the first cached session.
  • No rule interference: Fiddler rules are executed only for non-cached sessions. Cached responses are returned as-is, without rule evaluation.
  • Visibility split: After a session is cached, subsequent matching requests appear only in Live Traffic. The Agent Calls tab continues to show the original non-cached captures.
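Conceptually, the replay behavior resembles a first-match cache keyed on the request target. The sketch below is a simplified illustration of those semantics, not Fiddler’s actual implementation.

```python
# Simplified illustration of first-match replay caching; this is a
# conceptual model, not Fiddler's actual implementation.
class ReplayCache:
    def __init__(self):
        self._cached = []  # (method, url, response) in capture order

    def store(self, method, url, response):
        self._cached.append((method, url, response))

    def lookup(self, method, url):
        # First-match semantics: the earliest cached session for a
        # given endpoint wins, as described above.
        for m, u, resp in self._cached:
            if m == method and u == url:
                return resp
        return None

    def handle(self, method, url, forward):
        cached = self.lookup(method, url)
        if cached is not None:
            return cached                  # replayed locally, zero tokens spent
        return forward(method, url)        # live call to the provider
```

The important property is that a cached hit short-circuits before `forward` is ever called, which is why the provider never sees (or bills for) the duplicate request.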

Why It Matters During Development

Agent Cache is built around three practical benefits that matter most during active development.

  1. Faster iterations: Replaying a cached response is instant. Instead of waiting on a round trip to the provider on every run, you get a result back immediately—shortening the feedback loop so you can move through prompt and code changes without unnecessary delays.
  2. Lower execution costs: Each cached run consumes zero tokens on the provider side. During active development, where the same request may be triggered dozens of times, this directly reduces the token spend that accumulates before a feature is even complete.
  3. More predictable behavior: A cached response is fixed and repeatable. Running the same agent logic against the same response on every iteration makes it straightforward to verify that a code change had the intended effect, without having to account for variability in live model output.

Demo: Bug Report Analyzer

To make this tangible, let’s walk through the agent-cache-demo—a minimal Python agent that takes a fixed bug report and returns a structured analysis (severity, category, a plain-English summary and a suggested next step).

The input never changes between runs, which makes it a perfect showcase for Agent Cache: the model’s answer to an identical prompt is always reusable, so there is genuinely no reason to pay for it more than once.

What the Agent Does

The core of agent.py is straightforward:

message = client.messages.create(
    model=MODEL,
    max_tokens=256,
    system=SYSTEM_PROMPT,
    messages=[
        {"role": "user", "content": f"Analyze this bug report:\n\n{report}"}
    ],
)

It sends the bug report to the Claude API and expects a JSON response like this:

{
  "severity": "high",
  "category": "crash",
  "summary": "App crashes with a NullPointerException when attempting to log in under no network connectivity.",
  "suggested_next_step": "Add a null or connectivity check in NetworkManager.checkConnectivity() before network calls."
}

That response is then formatted and printed to the terminal:

── Bug Report Analysis ─────────────────────────────────────  
Severity  : HIGH  
Category  : crash  
Summary  : App crashes with a NullPointerException when attempting to  
log in under no network connectivity.  
Next step : Add a null or connectivity check in  
NetworkManager.checkConnectivity() before network calls.  
─────────────────────────────────────  
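The parsing and validation side of the agent can be sketched as follows. The field names match the demo’s JSON; the set of allowed severity values is an assumption for illustration, not taken from the demo repository.

```python
import json

# Minimal parsing/validation of the structured response shown above.
# Field names match the demo's JSON; the allowed severity values are
# an assumption for illustration.
ALLOWED_SEVERITIES = {"low", "medium", "high", "critical"}

def parse_analysis(raw: str) -> dict:
    data = json.loads(raw)
    for field in ("severity", "category", "summary", "suggested_next_step"):
        if field not in data:
            raise ValueError(f"missing field: {field}")
    if data["severity"] not in ALLOWED_SEVERITIES:
        raise ValueError(f"unexpected severity: {data['severity']}")
    return data
```

Validation like this is exactly the kind of code you iterate on constantly during development, and it never needs a fresh model response to test.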

Setup

Clone the repository and install dependencies:

git clone https://github.com/NickIliev/agent-cache-demo  
cd agent-cache-demo  
  
python -m venv .venv  
source .venv/bin/activate  # macOS / Linux  
.venv\Scripts\activate  # Windows  
  
pip install -r requirements.txt  
export ANTHROPIC_API_KEY=sk-ant-... # macOS / Linux (Git Bash)  
set ANTHROPIC_API_KEY=sk-ant-... # Windows (CMD)  

The demo supports routing traffic through the Fiddler proxy or running directly against the provider. It also covers SSL/TLS trust configuration for HTTPS interception. See the repository README for full details on proxy setup, environment variables and certificate options.

Step 1: The First Live Call

Start Fiddler Everywhere and run the agent:

python agent.py  

The terminal shows the result and, crucially, the token consumption:

[tokens] Input: 312  |  Output: 68  |  Total: 380  

Switch to Fiddler Everywhere and open Traffic > Agent Calls. You will see the captured call to api.anthropic.com with the full request and response visible.

Fiddler Everywhere Traffic Agent Calls shows the captured call to api.anthropic.com with the full request and response.

This is your baseline. You paid for 380 tokens. That is fair—you needed the live call to validate the end-to-end flow.

Step 2: Enable the Cache

In the Agent Calls grid, find the captured session and flip its Caching switch to on. That is the entire configuration step.

Fiddler Everywhere Agent Calls has caching toggled on

Step 3: All Subsequent Runs Are Free

Run the agent again:

python agent.py  

The output in the terminal is byte-for-byte identical to the first run, including the token count display. Because the Caching switch was on, Fiddler software served the stored response immediately and never forwarded the request to the provider. The endpoint never saw the call.

Fiddler caching stored response immediately and never forwarded the request to the provider.

You can now iterate on agent.py as many times as you need—refactor the display logic, adjust the JSON parsing, add logging—and none of those runs cost a single token.

On the Claude Console: Only the first call was received by the provider. All subsequent calls were served from the Fiddler cache and incurred no charges.

When to Use Agent Cache

Agent Cache is a development-stage tool. It is particularly valuable when:

  • Iterating on response handling: Your agent already returns a correct response from the model. You are now working on how your code handles that response—formatting, validation, error recovery. None of that work requires fresh model calls.
  • Sharing a working state with teammates: Cache a known-good response and share the Fiddler session. Everyone on the team can iterate against the same replay without burning tokens or depending on network access to the provider.
  • Working offline or in restricted environments: Once the cache is populated, your agent keeps working even without connectivity to the provider.

Things to Keep in Mind

  • Cache matching is request-based. If your agent changes the prompt, the model or any request headers, the cached session will no longer match. Capture and cache the updated variant separately.
  • The cache lives in the current Fiddler session. Closing and reopening Fiddler clears the cache state, so the next run after a restart will make a live call. Review cached sessions periodically to keep stored responses aligned with your current workflow.
  • Cache is for development, not production. Agent Cache is designed for development workflows where deterministic, repeatable responses are the goal. When you are ready to validate against a live endpoint, disable the cache and resume live calls.
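One way to picture the first caveat is as a request identity key: change any input and the key changes, so the cached response no longer applies. The function below is a conceptual sketch of that idea, not Fiddler’s actual matching algorithm.

```python
import hashlib
import json

# Conceptual sketch of request-based matching: if the model, prompt, or
# headers change, the key changes and a cached response no longer applies.
# This is an illustration, not Fiddler's matching algorithm.
def request_key(model, prompt, headers):
    canonical = json.dumps(
        {"model": model, "prompt": prompt, "headers": sorted(headers.items())},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode()).hexdigest()

k1 = request_key("claude-sonnet-4-5", "Analyze this bug report", {"x-api": "v1"})
k2 = request_key("claude-sonnet-4-5", "Analyze this bug report!", {"x-api": "v1"})
print(k1 != k2)  # a changed prompt yields a different key
```

In practice this means that after editing your prompt you should expect one live call to repopulate the cache before replays resume.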

Availability

Agent Cache is available on Fiddler Everywhere Trial, Pro and Enterprise tiers. The feature is not included in Lite licenses.

Try It Yourself

The full demo is on GitHub: github.com/NickIliev/agent-cache-demo. Clone it, set your Anthropic API key, and you can see the before-and-after token counts yourself in under five minutes.

The point is not really the 380 tokens saved in a single run. It is the dozens of runs you make in a typical development session, the parallel runs across a team—all of which can stop paying for answers they already have.

Agent Cache does not change how you build agents. It just removes the tax on iterating.

If you aren’t already using Fiddler Everywhere, it does come with a free trial:

Try Fiddler Everywhere

Leave Feedback

Agent development workflows are still evolving quickly, and your feedback shapes what comes next. If you try Agent Cache during development—or if there is something you wish it did differently—we want to hear about it.

With AI Writing Code, What Are Developers For?

As a software consultant, I’ve noticed a pattern play out at nearly every client over the last year. A team adopts Cursor or Claude Code or Copilot and their productivity, especially on greenfield tasks, jumps noticeably. And then, someone asks: “If the AI can do this, what are the developers for?”

It’s a valid question, and one I’ve been thinking about myself as AI has improved at many software development tasks over the last year or so. Using these tools daily on client projects, internal work, and side projects has made the answer clear to me. No matter how good AI gets at some of our daily tasks, developers will still be needed for their systems thinking, their setting of guardrails for the AI, and, most importantly, their human judgment.

Software Development Was Always Repetitive

For decades, a huge part of professional software development has been following the patterns that already exist in a codebase. You might stand up a new REST API, and that first endpoint is genuinely hard. You’re making real decisions about design patterns, systems architecture, URL structure, authentication/authorization, database access, caching, and error handling. But for endpoints two through twenty, you’re just following the recipe you already wrote. We run into this a lot with client teams. They might not have the experience to architect a well-designed system from scratch, but once we get them going on a good pattern, they can easily follow it for new features.

AI is also very good at following recipes. Point it at a codebase with established conventions and it’ll crank out the next endpoint, the next service method, or the next React component in the same shape as the ones before it. In that way it’s like a junior developer who reads the existing code before writing new code and carefully follows the established patterns.

So yes, a meaningful chunk of what we used to spend our days typing is now automatable. My fingers don’t hurt at the end of the day anymore, and I don’t think they’re going to again, especially as more developers leverage voice chat capabilities with their AI tools.

AI Doesn’t Know What Good Looks Like

Here’s what I keep running into. These models are trained on an internet’s worth of code. A lot of that code, most of it really, is mediocre. Developers who have tried to find solutions to their questions on Stack Overflow for the last decade already know this. Whether it is tutorial snippets, Reddit threads, Stack Overflow answers written in a hurry, or open source projects with no review process, a lot of the code on the internet (and in the world) is poor-to-mediocre. These models are fundamentally averaging machines that guess at the most likely next word or token. If you give them a vague prompt, you will most likely get back something that looks like the average of what’s out there on the internet. It might eventually compile, it might work, but it won’t reflect the specific decisions and tradeoffs your project needs.

I’ve never seen an AI look at a codebase and suggest a better architecture unless it’s specifically asked to by the developer running it. It probably won’t notice that your auth middleware has a subtle timing vulnerability. It won’t propose event sourcing because it picked up on a pattern of concurrency bugs in your shared state. It doesn’t know your deployment constraints, your team’s skill level, your future scalability needs, or the fact that your biggest customer hammers one particular endpoint at the same time every Monday morning.

Those are judgment calls that require human experience. In my experience, the developers getting the most out of AI right now are the ones whose judgment is already sharp. They can tell when the model missed and know exactly how to correct it.

The Specification Problem Remains

We have a joke in consulting: clients say they have “detailed specs” and then hand you the title of their project. I’ve been doing this long enough to know that the gap between what someone says they want and what they actually need is where most project risk can be found.

AI has the exact same problem. A quick, vague prompt by someone without experience can generate a lot of impressive-looking code fast. Then you spend three times as long iterating it into something that actually meets the requirements–requirements you should have pinned down before you started generating anything.

The teams I’ve seen get real traction with AI-assisted development aren’t the ones who figured out some magic prompt template. They’re the ones who already had their software fundamentals together: clear requirements, fast CI pipelines, strong automated test coverage, pull request reviews that actually catch things. Those aren’t AI skills. Those are engineering discipline. AI just raised the stakes on them.

If your feedback loops are slow (if you don’t know your code is broken until it’s in staging) then AI is only going to make you produce broken code faster, and that’s not a win.

What About the Next Generation?

This one’s harder to talk about, and I think our industry is being too quiet about it. Hiring managers at companies I work with are pausing junior roles. Not eliminating them altogether, just pausing hiring for them so they can see how far their current teams can scale with AI. Honestly, it’s rough for new graduates right now.

I lived through the outsourcing scare of the mid-2000s. Teams I worked on lost people to offshore replacements. For a while it felt like the bottom was falling out. But it didn’t. The work evolved, the value proposition shifted, and eventually it stabilized. Those who moved up the value chain survived. I think something similar is happening with AI, but I’m not going to pretend it’s happening on the same timeline. AI is a much more quickly moving disruption.

What I’ll say is this: the developers entering the field now need to lead with judgment earlier than my generation did. Writing decent algorithms or code that compiles are no longer differentiators when AI can do that. Understanding why certain decisions matter and being able to look at generated code and say “this won’t scale” or “this misses the actual requirement”—that’s what will stand out in the marketplace. It’s a higher bar, and the industry owes it to junior devs to be upfront about that instead of pretending nothing has changed. Recently, Mark Russinovich and Scott Hanselman from Microsoft published an excellent paper about what organizations can do to avoid some of the pitfalls of this era for junior engineering talent.

Developers Are For Judgment

After about a year of working with these tools, here’s where I’ve landed. Developers aren’t paid for typing code. The best ones never really were. It just felt that way because typing code took up so much of the day. Developers are paid for knowing which endpoint needs the cache and which one doesn’t. They are paid for catching that the generated migration will lock a production table for twenty minutes and slow down other critical tasks. They are paid for understanding that what the product owner thinks they need isn’t what they actually need, and for helping them see it.

AI now handles the mechanical translation of intent into code. Developers are the ones who make sure that intent is right in the first place and the ones who know how to fix it when it isn’t. And that’s not a new skill. It’s the skill that was always underneath the typing. We just get to spend more time on it now.

I recently went deep on the tactical side of all this with my guest Cory House on an extended edition of the Blue Blazes Podcast. We discussed choosing between AI harnesses, model selection, multi-agent workflows, and how to actually structure your prompts and feedback loops. If you’re adopting AI-assisted development on your team, give it a watch or listen.

The post With AI Writing Code, What Are Developers For? appeared first on Trailhead Technology Partners.
