Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

The performance dividend: Optimizing PostgreSQL on Azure directly in Visual Studio Code


Poor database performance is never just a database problem. In enterprise teams, it shows up as missed service-level agreements (SLAs), delayed releases, frustrated development teams, and rising operational risk. The problem then compounds into business impact: frustrated customers, retention and conversion risk, and lost revenue.

I have seen this repeatedly while working with enterprises building and running large‑scale data platforms, both as a customer and partner, and now with Microsoft. When teams are forced to jump between SQL editors, monitoring dashboards, cloud portals, and documentation just to diagnose a slow query, the real cost is not just technical. It’s also time, trust, and momentum lost across the business.

A more integrated way to run PostgreSQL on Azure

This is why I am optimistic about where PostgreSQL on Azure stands today. Microsoft’s investment in open source and PostgreSQL has matured significantly over the last several years. Azure Database for PostgreSQL has evolved into a fully managed, fully open-source, enterprise-ready platform. Azure HorizonDB has entered the conversation as the next-generation Postgres on Azure, delivering 3x faster performance than self-managed Postgres. And Microsoft is extending that value directly into the tools developers and database administrators (DBAs) already use. The PostgreSQL extension for Visual Studio Code is a clear example of that progress, especially with its new performance‑enhancing capabilities.

Most enterprise teams do not lack tooling. They lack integration. Performance work often breaks down because insights live in one place, actions live in another, and context is lost in between. Microsoft’s direction with the PostgreSQL extension for VS Code focuses on closing those gaps by bringing development, diagnostics, and tuning into a single workflow.

The extension is designed to help teams manage PostgreSQL across the full lifecycle, from authoring queries and exploring schemas to monitoring server health and optimizing performance. For organizations standardizing PostgreSQL on Azure, this creates a more coherent operating model that reduces friction between developers, DBAs, and platform teams.

Seeing performance clearly with the Server Metrics Dashboard

One of the most impactful additions is the server metrics dashboard. For DBAs and platform engineers, this dashboard brings key performance signals such as CPU, memory, storage, and connections directly into VS Code. Instead of switching contexts to investigate an issue, teams can view metrics where they already work.

Because the dashboard is integrated with Azure, it provides Azure‑specific telemetry and historical insights that help teams understand trends, not just snapshots. When performance issues arise, the time from detection to investigation is significantly reduced.

From insight to action with Azure Advisor in VS Code

Observability only matters if it leads to action. The PostgreSQL extension surfaces Azure Advisor recommendations directly in the editor, connecting performance insights with concrete guidance. These recommendations can include suggestions around configuration, indexing, and resource optimization based on Azure telemetry.

For enterprise teams, this shortens the feedback loop. Instead of manually correlating metrics with best practices, teams receive contextual recommendations aligned to their actual workloads. This improves operational confidence and helps standardize tuning practices across environments.

Faster diagnosis with Query Plan visualization and AI assistance

Performance tuning often comes down to understanding query behavior. Recent improvements to the extension enhance query plan visualization, making execution plans easier to interpret during troubleshooting and optimization.

Beyond visualization, Microsoft is embedding AI‑assisted query analysis and optimization directly into the workflow. Developers and DBAs can analyze query plans, understand potential bottlenecks, and explore optimization options without leaving VS Code. This does not replace deep PostgreSQL expertise, but it helps teams move faster and make better decisions earlier in the development cycle.

These capabilities are especially valuable for enterprise environments where not every developer is a PostgreSQL specialist, yet performance expectations remain high.
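To make the idea of reading a query plan concrete, here is a minimal Python sketch that walks a plan in the JSON shape PostgreSQL produces for EXPLAIN (ANALYZE, FORMAT JSON) and flags nodes worth a closer look. It is not how the extension or its AI assistance works. The sample plan, table names, and the `misestimate_factor` threshold are all made up for illustration.

```python
from typing import Iterator

# Hypothetical, hand-written plan in the shape returned by
# EXPLAIN (ANALYZE, FORMAT JSON); tables and numbers are invented.
SAMPLE_PLAN = [
    {
        "Plan": {
            "Node Type": "Hash Join",
            "Plan Rows": 1000,
            "Actual Rows": 950,
            "Actual Total Time": 48.2,
            "Plans": [
                {
                    "Node Type": "Seq Scan",
                    "Relation Name": "orders",
                    "Plan Rows": 500,
                    "Actual Rows": 120000,
                    "Actual Total Time": 41.7,
                },
                {
                    "Node Type": "Hash",
                    "Plan Rows": 200,
                    "Actual Rows": 200,
                    "Actual Total Time": 0.9,
                    "Plans": [
                        {
                            "Node Type": "Index Scan",
                            "Relation Name": "customers",
                            "Plan Rows": 200,
                            "Actual Rows": 200,
                            "Actual Total Time": 0.6,
                        }
                    ],
                },
            ],
        }
    }
]


def walk(node: dict) -> Iterator[dict]:
    """Yield a plan node and all of its descendants, depth first."""
    yield node
    for child in node.get("Plans", []):
        yield from walk(child)


def suspicious_nodes(plan: list, misestimate_factor: float = 10.0) -> list[dict]:
    """Return sequential scans and nodes whose row estimate is far off."""
    flagged = []
    for node in walk(plan[0]["Plan"]):
        planned = max(node.get("Plan Rows", 0), 1)
        actual = max(node.get("Actual Rows", 0), 1)
        ratio = max(planned / actual, actual / planned)
        if node["Node Type"] == "Seq Scan" or ratio >= misestimate_factor:
            flagged.append(node)
    return flagged


for node in suspicious_nodes(SAMPLE_PLAN):
    print(node["Node Type"], node.get("Relation Name", "-"), node["Actual Total Time"])
```

A sequential scan is not always a problem, and a large gap between planned and actual rows is only a hint that statistics may be stale, so treat the output as a list of places to look rather than a verdict.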

Better authoring experiences reduce performance issues upstream

Performance work does not start in production. It starts when schemas are designed and queries are written. The PostgreSQL extension improves this experience with schema‑aware IntelliSense, search_path‑aware query authoring, and reliable object explorer behavior for large and complex databases.

Developers can write, run, and refine SQL with better context, while DBAs benefit from more consistent and predictable interactions with large schema estates. Improvements to object explorer reliability also matter at enterprise scale, where long‑running sessions and frequent refreshes are common.

Combined with Microsoft Entra ID authentication and integrated Azure resource discovery, the extension provides a secure and governed way to work with PostgreSQL across development and production environments.

From tuning to performance payout

Taken together, these capabilities change the day‑to‑day experience of running PostgreSQL on Azure. Azure Database for PostgreSQL already delivers the managed fundamentals enterprises expect, including high availability, security, and best‑practice guidance. The PostgreSQL extension for VS Code extends that value into execution by making performance management part of the same workflow as development.

This integration is a practical differentiator. It reflects an understanding of how enterprise teams actually work and where time is lost today. Instead of adding more tools, Azure is tightening the loop between insight and action.

A look ahead: AI‑native PostgreSQL with Azure HorizonDB

As enterprises look toward AI‑native architectures, Microsoft is also introducing Azure HorizonDB in public preview. Azure HorizonDB is designed for cloud‑native, AI‑ready PostgreSQL‑compatible workloads that require advanced scalability and integrated AI capabilities.

For most production workloads today, Azure Database for PostgreSQL remains the recommended choice. Azure HorizonDB represents an adjacent, forward‑looking option for teams exploring what comes next for their AI‑powered applications.

Turning performance into a competitive advantage

The real advantage of these new capabilities is the way they come together to reduce friction, improve clarity, and help teams act faster. For enterprises managing PostgreSQL at scale, that translates directly into better reliability, faster delivery, and lower operational risk.

If you are running PostgreSQL on Azure today, now is a good time to see what this looks like in practice. Try the PostgreSQL extension for VS Code and connect it to your Postgres databases on Azure to diagnose issues faster, optimize performance with greater confidence, and keep critical workloads running the way your business and your customers expect.


The post The performance dividend: Optimizing PostgreSQL on Azure directly in Visual Studio Code appeared first on Microsoft Azure Blog.

Read the whole story
alvinashcraft
just a second ago
reply
Pennsylvania, USA
Share this story
Delete

ESLint v10.6.0 released


Highlights

New option checkRelationalComparisons in no-constant-binary-expression

ESLint v10.6.0 introduces a new option checkRelationalComparisons for the no-constant-binary-expression rule. When enabled, the rule reports relational comparisons using <, <=, >, or >= whose result is always constant based on their literal operands.

For example:

const value = "a" > "b"; // always `false`

while (0 <= 0) { // always `true`
    /* ... */
}
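As a rough analogy only (this is Python, not ESLint's implementation, and it inspects Python source rather than JavaScript), the idea of reporting relational comparisons whose operands are all literals can be sketched with the standard `ast` module. The function name and the sample source are hypothetical.

```python
import ast

RELATIONAL_OPS = (ast.Lt, ast.LtE, ast.Gt, ast.GtE)


def constant_relational_comparisons(source: str) -> list[tuple[int, str, bool]]:
    """Find comparisons such as `0 <= 0` whose operands are all literals."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Compare):
            continue
        operands = [node.left, *node.comparators]
        if not all(isinstance(operand, ast.Constant) for operand in operands):
            continue
        if not all(isinstance(op, RELATIONAL_OPS) for op in node.ops):
            continue
        try:
            # Only literals are involved, so evaluating the expression is safe.
            result = eval(compile(ast.Expression(body=node), "<comparison>", "eval"))
        except TypeError:
            continue  # e.g. comparing a string with a number
        findings.append((node.lineno, ast.unparse(node), bool(result)))
    return sorted(findings)


SOURCE = 'value = "a" > "b"\nwhile 0 <= 0:\n    break\nok = x < 10\n'
for line, text, result in constant_relational_comparisons(SOURCE):
    print(f"line {line}: {text} is always {result}")
```

The comparison `x < 10` is not reported because one operand is a variable, which mirrors the "literal operands" condition described above.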

Rule refinements

The following rules have been tweaked to improve correctness and ensure more consistent or intuitive behavior in edge cases:

Features

Bug Fixes

Documentation

  • a83683d docs: Update README (GitHub Actions Bot)
  • f5449f9 docs: document userland patterns for global assertionOptions in RuleT… (#20986) (playgirl)
  • bea49f7 docs: Update README (GitHub Actions Bot)
  • e5f70f9 docs: update code-path diagrams (#20984) (Tanuj Kanti)
  • 8890c2d docs: add TypeScript config guidance for MCP server (#20796) (Pierluigi Lenoci)
  • 3eb3d9b docs: Update README (GitHub Actions Bot)
  • c5bb59c docs: Update README (GitHub Actions Bot)
  • eb3c97c docs: fix grammar in prefer-const rule description (#20983) (lumir)

Chores

  • 6a42034 ci: run ecosystem tests on main branch (#20891) (sethamus)
  • 3dbacdb ci: bump actions/checkout from 6 to 7 (#21014) (dependabot[bot])
  • c3abfca chore: correct JSDoc param types in html formatter (#21018) (Minseon Kim)
  • a832320 ci: split ecosystem tests into separate jobs (#21001) (xbinaryx)
  • 27166e7 chore: update ecosystem plugins (#21005) (ESLint Bot)
  • 865d76e ci: bump pnpm/action-setup from 6.0.8 to 6.0.9 (#20989) (dependabot[bot])
  • 27a88c9 chore: update dependency markdown-it to v14 in root (#20994) (Milos Djermanovic)
  • 970cea6 chore: update dependency markdown-it to v14 (#20993) (Milos Djermanovic)
  • b482120 chore: update dependency prettier to v3.8.4 (#20990) (renovate[bot])
  • 6993fb3 chore: update ecosystem plugins (#20985) (ESLint Bot)

How to Build a Personal AI Web Research Agent with Ollama and Qwen


In this tutorial, I’ll show you how to build an AI web research agent using Ollama, Qwen, and Python. The agent searches the web for a topic, fetches relevant pages, and uses a local LLM to generate a concise digest.


Background

Most of us have used ChatGPT or Claude to send queries to a large language model. You've probably also seen hallucinations in the response when the model didn't know something, sometimes because its knowledge was out of date.

With the rise of tool calling, LLMs can now use tools to search the web for the latest information. They can then bring that information into context and use it to generate an output, summarize results, and extract key points from retrieved sources.

In this tutorial, I'll show you how I built a personal research agent that searches the internet for any topic and uses a local LLM to summarize what it finds. The model runs entirely on my own machine, so the page contents and the summary stay local, and the free Ollama tier means there are no API costs.

To follow this tutorial, you'll need Ollama installed on your machine and a free Ollama account. The tutorial works on macOS, Windows, and Linux. I'm using a MacBook Pro with 32 GB of RAM, but you can run this on a lower-memory machine by choosing a smaller Qwen model from Ollama.

Motivation and Architecture

The motivation behind this project is to have agents running on my machine that can handle a variety of tasks every day. I can spin off agents to create a daily digest of AI news, surface the latest world events, or look for new job postings.

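Running a local LLM also means the summarization step never leaves my machine: only the search query itself goes to Ollama's web search API. The fetched content and the generated digests stay on my machine, and there are no per-query API costs to worry about.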

For this project, we'll use Ollama web search for retrieval and a local Qwen LLM for summarization (rather than relying on hosted chat tools like ChatGPT or Claude). The system diagram below shows how the agent works.

When run in the terminal, the agent asks the user what they want to research. It then calls the Ollama web search API to fetch the top 5 results for the query, downloads each of those pages, and extracts the readable text.

The extracted content from all five pages is sent to the local Qwen model along with the user's prompt and a short instruction: "Use these web results and page contents to answer in Markdown format." The model's response is then saved as a Markdown file on disk.

Diagram of the process: user prompt, Ollama web search API, top 5 result URLs, requests + BeautifulSoup, clean page text, local Qwen model via Ollama, Markdown digest saved to disk.

Step 1: Install Ollama and Get an API Key

To get started, install the Ollama application and create an account to get an API key. The free tier of Ollama will suffice for this tutorial.

Once you have the key, place it in an environment variable:

export OLLAMA_API_KEY="paste-key-here"

Step 2: Pull the Qwen Model

We'll use Qwen for this tutorial, an open-weight model that's currently one of the best small models available.

I'm using the 4-billion-parameter variant because it follows structured prompts well and runs on a laptop without a dedicated GPU. There are other sizes like 2b or 9b available.

To use qwen3.5:4b locally, pull it with Ollama. The 4b model takes around 3.4 GB on my machine. If your machine has less RAM, you can use qwen3.5:0.8b instead of the 4b model.

ollama pull qwen3.5:4b

Step 3: Install Python Dependencies

python3 -m venv venv
source venv/bin/activate
pip install ollama requests beautifulsoup4

Step 4: Write the Agent Code

The Python code below does five things: it takes a research prompt from the terminal, calls Ollama's web search API for the top 5 results, downloads each page with Requests and extracts its text with BeautifulSoup, sends everything to a local Qwen model with an instruction to summarize in Markdown, and finally saves the result to a timestamped .md file.

Save the code in a file named research_agent.py.

The summarization prompt is intentionally basic. Feel free to tweak it to match the kind of output you want.

import os
import json
import requests
import ollama
from bs4 import BeautifulSoup
from datetime import datetime

API_KEY = os.getenv("OLLAMA_API_KEY")
SEARCH_URL = "https://ollama.com/api/web_search"
MODEL = "qwen3.5:4b"

# Search web using Ollama web search 
def search_web(query):
    response = requests.post(
        SEARCH_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"query": query, "max_results": 5},
        timeout=30,
    )
    response.raise_for_status()
    return response.json().get("results", [])

# Fetch full web page content
def fetch_text(url):
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
    except requests.RequestException:
        # Skip pages that fail to download; the search snippet is still kept
        return ""
    soup = BeautifulSoup(response.text, "html.parser")
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()
    return soup.get_text(separator="\n", strip=True)


def main():
    user_prompt = input("Enter your prompt: ").strip()
    if not user_prompt:
        print("Prompt cannot be empty.")
        return

    results = search_web(user_prompt)

    # For each url in web search result, fetch full content
    pages = []
    for item in results:
        url = item.get("url")
        if not url:
            continue

        print(f"Fetching: {url}")
        page_text = fetch_text(url)

        pages.append({
            "title": item.get("title", ""),
            "url": url,
            "snippet": item.get("content", ""),
            "page_text": page_text,
        })

    # Prompt to send to Qwen model with web data
    prompt = f"""
    User request:
    {user_prompt}

    Use these web results and page contents to answer in markdown format.

    Data:
    {json.dumps(pages, ensure_ascii=False)}
    """

    # Invoke local Qwen model 
    response = ollama.chat(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )

    digest = response.message.content

    # Build a unique filename using today's date and time
    timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    filename = f"digest-{timestamp}.md"

    # Save the digest to disk
    # Write as UTF-8 so non-ASCII characters in the digest are saved on any OS
    with open(filename, "w", encoding="utf-8") as f:
        f.write(digest)

    print("Saved to digest")

if __name__ == "__main__":
    main()
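One practical concern the script above does not address is prompt size: five full web pages can be far longer than a small local model handles well. As a minimal sketch, assuming a simple character budget is good enough, a helper like the following could trim the fetched pages before they are serialized into the prompt. The function name `clip_pages` and both default budgets are hypothetical and not part of the original script.

```python
def clip_pages(pages: list[dict], per_page_chars: int = 4000,
               total_chars: int = 16000) -> list[dict]:
    """Return copies of the page dicts with page_text trimmed to a budget.

    Pages are kept in search-result order. Once the total budget is used up,
    the remaining pages are dropped entirely.
    """
    clipped = []
    remaining = total_chars
    for page in pages:
        if remaining <= 0:
            break
        # Never take more than the per-page cap or what is left of the budget.
        text = page.get("page_text", "")[: min(per_page_chars, remaining)]
        remaining -= len(text)
        clipped.append({**page, "page_text": text})
    return clipped


# Hypothetical pages, shaped like the dicts built in main()
sample = [
    {"url": "https://example.com/a", "page_text": "x" * 10},
    {"url": "https://example.com/b", "page_text": "y" * 10},
    {"url": "https://example.com/c", "page_text": "z" * 10},
]
for page in clip_pages(sample, per_page_chars=6, total_chars=8):
    print(page["url"], len(page["page_text"]))
```

In the tutorial's script, this would be called on `pages` just before `json.dumps(pages, ensure_ascii=False)`. Counting characters is only a rough stand-in for tokens, so the budgets need tuning for the model you pull.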

Step 5: Run the Agent

python research_agent.py

The script will prompt you to enter the topic you'd like to research.

Sample Output

The summarized digest is saved as a timestamped Markdown file. The agent also prints the source URLs as it fetches them.

Before trusting the summary, skim it and spot-check a claim or two against the original source. Local models are smaller than hosted frontier models and tend to hallucinate more, so spot-checking helps catch errors.
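Spot-checking can be partly automated. The sketch below is a crude heuristic, not something the original script does: it lists numbers that appear in the digest but in none of the fetched pages, which are good candidates for a manual check. The function name, the regular expression, and the sample data are all hypothetical.

```python
import re

# Digits with optional decimal or thousands separators and an optional percent sign
NUMBER = re.compile(r"\d+(?:[.,]\d+)*%?")


def unsupported_numbers(digest: str, pages: list[dict]) -> list[str]:
    """List numbers found in the digest but in none of the source pages."""
    source_text = " ".join(
        page.get("page_text", "") + " " + page.get("snippet", "") for page in pages
    )
    source_numbers = set(NUMBER.findall(source_text))
    seen, missing = set(), []
    for number in NUMBER.findall(digest):
        if number not in source_numbers and number not in seen:
            seen.add(number)
            missing.append(number)
    return missing


# Made-up digest and page text for illustration
digest = "Scored 51 on the index, up 25% from 2025."
pages = [{"page_text": "The model scored 51 points in 2025."}]
print(unsupported_numbers(digest, pages))
```

A number that is missing from the sources is not necessarily wrong (the model may have reformatted it), and a number that is present can still be attached to the wrong claim, so this narrows down what to read rather than replacing the read.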

As a test run, I asked the research agent: "What's new in LLMs" and it fetched 5 web pages as seen below:

Enter your prompt: What's new in LLMs
Fetching: https://openai.com/nl-NL/index/chatgpt-memory-dreaming/
Fetching: https://pub.towardsai.net/tai-210-glm-5-2-closes-most-of-the-open-weight-gap-in-ten-weeks-2f970c5f1326
Fetching: https://www.globenewswire.com/news-release/2026/06/23/3315999/0/en/Multiverse-Computing-Launches-Pulsar-16B-in-collaboration-with-NVIDIA-Frontier-Grade-Reasoning-at-Half-the-Parameters.html
Fetching: https://thenextweb.com/news/anthropic-claude-tag-slack-always-on-ai-teammate
Fetching: https://www.aidoers.io/blog/claude-mythos-5-and-fable-5-explained-what-anthropic-actually-shipped

Saved to digest

The digest came out reasonably well-structured for a 4B local model. It's organized into sections with all the relevant data from the sources. I spot-checked the summary and it was accurate.

Here's what it produced:

# What's New in LLMs (June 2026)

The landscape of Large Language Models (LLMs) has evolved rapidly in June 2026, with significant updates in memory synthesis, new frontier models, enterprise integrations, and market dynamics.

## 1. Memory & Personalization: OpenAI’s "Dreaming" Update
OpenAI has deployed a new memory architecture for ChatGPT, referred to as **Dreaming V3**.
*   **Purpose:** Improves memory synthesis to optimize freshness, continuity, and relevance.
*   **Evolution:**
    *   **2024:** "Saved memories" (manual instruction-based).
    *   **2025:** "Dreaming V0" (background process curating memories from chat history).
    *   **2026:** **Dreaming V3** (significantly more capable and compute-efficient architecture).
*   **Impact:** Memory is now reviewable via a summary page, allowing users to update information and set instructions on topics to bring up.
*   **Availability:** Rolled out to ChatGPT Plus and Pro users in the US today, expanding to additional countries and Free/Go users over coming weeks.
*   **Capability:** The model now remembers specific user setups (e.g., photography gear preferences) and constraints (e.g., vegetarian diet, hotel AC preferences) without requiring explicit "remember" cues.

## 2. New Frontier Models & Benchmarks

### Claude Fable 5 & Mythos 5 (Anthropic)
*   **Classification:** Mythos-class tier, sitting above Opus in raw capability.
*   **Differentiation:** **Fable 5** is available to the public. **Mythos 5** is the identical model with cybersecurity safeguards removed, restricted to **Project Glasswing** partners only.
*   **Pricing:** $10 per million input tokens / $50 per million output tokens.
*   **Availability:** Included at no extra cost on Pro, Max, Team, and enterprise plans until June 22.
*   **Capabilities:** Significant jumps in **Knowledge work**, **Agentic coding**, **Vision**, **Legal reasoning**, and **Biology**.

### Z.ai GLM-5.2 (Open Weights)
*   **Release:** Z.ai (Z.AI) released GLM-5.2 under an MIT license on June 16, 2026.
*   **Performance:** Closed the open-weight gap in ten weeks. Scored **51** on the Artificial Analysis Intelligence Index.
    *   **Context:** Expanded from 200K to **1 million tokens**.
    *   **Architecture:** Utilizes "IndexShare" for long-context efficiency and "Compaction-aware reinforcement learning" for agents.
*   **Benchmarks:** Ranked third on the AA-Briefcase (91 held-out tasks), behind Fable and Opus 4.8 but ahead of GPT-5.5.
*   **Cost:** ~$0.52 per task (compared to $0.86 for GPT-5.5 and $1.80 for Opus 4.8).

### Multiverse Pulsar 16B (NVIDIA Collaboration)
*   **Parameters:** 16.15B total parameters (3.1B active).
*   **Performance:** Delivers 30B-class intelligence at half the parameter count.
*   **Validation:** Matches 30B-class architectures (e.g., Nemotron-3-Nano-30B-A3B) on reasoning, coding, and math.
*   **Deployment:** Available on Hugging Face under Apache 2.0 license. Optimized for lower-memory GPUs and single-node environments.

## 3. Enterprise Integration & Tools

*   **Claude Tag (Anthropic):**
    *   An "always-on AI teammate" available to **Claude Enterprise and Team** customers.
    *   **Features:** Lives inside Slack, follows conversations, learns context, and uses an **ambient mode** to proactively flag updates and tasks.
    *   **Scoping:** Identity-based permissions allow admins to restrict which channels/teams the AI can access.
*   **MCP Connectors (Anthropic):**
    *   Launched **Enterprise-Managed Authorization (EMA)**.
    *   Allows IT admins to provision connector access via identity providers (Okta) without individual OAuth flows.
*   **Perplexity Brain (Computer Agent):**
    *   Research preview for Max/Enterprise Max subscribers.
    *   Self-improving memory system that remembers what the agent *did* rather than user preferences.
    *   Results show 25% increase in answer correctness on repeated tasks.

## 4. Industry Trends & Personnel Moves

*   **Market Dynamics:** ChatGPT market share dropped below 50% (46.4% by May 2026). Claude leads in subscription conversion (13%).
*   **Talent Shifts:**
    *   **Noam Shazeer:** Co-inventor of Transformer (Google) joins OpenAI as Lead for Architecture Research.
    *   **John Jumper:** Nobel Laureate (DeepMind) joins Anthropic for AI-for-science infrastructure.
*   **Corporate M&A:**
    *   **SpaceX** acquires **Cursor** (Anysphere) for **$60 Billion** in a Q3 2026 deal to strengthen its AI coding division.
    *   **Alibaba** released the **Qwen-Robot Suite** (Qwen-RobotNav, Manip, World) for embodied intelligence and robotic control.

Conclusion

In this tutorial, you learned how to build a personal AI web research agent that searches the web, summarizes results with a local LLM, and saves a Markdown digest. All summarization runs on your own machine, with only the search query sent to Ollama's web search API. You have full control over the model and prompts without any API costs.

From here, you can try new prompts to research different topics, tweak the summarization prompt to change the output, swap in other local models like Qwen 3.6 or Mistral, or extend the script to fit your own workflow. Happy tinkering!

If you enjoyed this tutorial, you can find more of my writing on my blog (recent posts include a system design paper series), my work on my personal website, and updates on LinkedIn.




Getting Started With NATS JetStream in .NET

When .NET developers need a message queue, they reach for RabbitMQ, Azure Service Bus, or a Postgres table.

NATS almost never comes up. That's a shame: it's quietly become one of my favorite tools for this.

NATS is a messaging system written in Go that runs as a single binary with no external dependencies. JetStream, its durable layer, turns it into a real queue with at-least-once delivery. And the .NET client is a pleasure to work with.

Core NATS vs JetStream

NATS has two layers, and the difference matters.

Core NATS is fire-and-forget pub/sub. You publish to a subject, and whoever is subscribed at that moment gets it. If no one is listening, the message is gone, which suits live notifications but not a work queue.

JetStream is the persistence layer on top. It captures messages published to a subject into a stream on disk, so a consumer can read them later, even after a restart. That persistence is what turns a subject into a durable queue.

Core NATS drops a message when no subscriber is online; JetStream persists it to a file-backed stream and delivers it later
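The difference between the two layers can be shown with a toy, in-memory model. This is a Python sketch of the delivery semantics only, with hypothetical class names. It does not use a NATS client and ignores networking, acknowledgements, and disk storage.

```python
class CoreSubject:
    """Fire-and-forget: a message reaches only subscribers present at publish time."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        # With no subscribers this loop does nothing and the message is lost.
        for callback in self.subscribers:
            callback(message)


class Stream:
    """Persisted: messages are appended to a log and can be read later."""

    def __init__(self):
        self.log = []

    def publish(self, message):
        self.log.append(message)

    def read_from(self, position):
        return self.log[position:]


received = []
core = CoreSubject()
core.publish("job-1")            # nobody is listening yet
core.subscribe(received.append)
core.publish("job-2")
print(received)                  # ['job-2']

stream = Stream()
stream.publish("job-1")          # no consumer yet, but the message is stored
stream.publish("job-2")
print(stream.read_from(0))       # ['job-1', 'job-2']
```

The late subscriber on the core subject never sees job-1, while a consumer that starts reading the stream afterwards still gets both messages.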

Why It's Worth a Look

A few things stood out coming from the usual brokers:

  • Tiny. The official server image is about 18 MB, a single Go binary with no ZooKeeper or Erlang to babysit.
  • Fast. Core NATS pushes millions of small messages per second on a single node. JetStream adds disk persistence, so it's slower, but still comfortably in the hundreds of thousands per second.
  • Cheap to run. A server idles in tens of megabytes of RAM, so it runs right next to your app.
  • Flexible per stream. Each stream sets its own storage and retention, so one server can host a cache and a strict work queue side by side.

Set It Up

You need the server and two NuGet packages.

Run the server with JetStream enabled. -js turns it on, and -sd points it at a directory so streams survive a restart:

# docker-compose.yml
nats:
  image: nats:2.14-alpine
  command: ['-js', '-sd', '/data']
  ports: ['4222:4222']
  volumes:
    - nats-data:/data
  restart: unless-stopped

Add the client and its dependency-injection integration:

dotnet add package NATS.Net
dotnet add package NATS.Extensions.Microsoft.DependencyInjection

Then wire it into Program.cs. AddNatsClient registers one multiplexed, self-reconnecting connection, and the next line exposes a JetStream context to inject anywhere:

// Program.cs
builder.Services.AddNatsClient(nats =>
    nats.ConfigureOptions(opts => opts with { Url = "nats://localhost:4222" }));

builder.Services.AddSingleton(sp =>
    sp.GetRequiredService<INatsConnection>().CreateJetStreamContext());

Publish a Job

With the JetStream context in DI, a Minimal API endpoint publishes in one call. Job is a plain record, and NATS.Net serializes it to JSON for you, so you work with typed messages with no extra setup. EnsureSuccess throws if the stream didn't store the message:

app.MapPost("/jobs", async (CreateJob request, INatsJSContext js, CancellationToken ct) =>
{
    var job = new Job(Guid.NewGuid(), request.Payload);

    PubAckResponse ack = await js.PublishAsync("jobs.work", job, cancellationToken: ct);
    ack.EnsureSuccess();

    return Results.Accepted($"/jobs/{job.Id}");
});

A producer publishes to a work-queue stream, and a pool of workers competes on one durable pull consumer

Process Jobs in a Worker

A BackgroundService is the natural home for the consumer. It creates the stream and durable consumer on startup, then pulls messages in a loop. Every running instance shares the workers consumer, so they compete for jobs and each runs once:

public class JobWorker(INatsJSContext js) : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken ct)
    {
        await js.CreateStreamAsync(new StreamConfig("JOBS", ["jobs.work"])
        {
            Retention = StreamConfigRetention.Workqueue, // a queue: acked messages are removed
            Storage   = StreamConfigStorage.File         // durable: survives a restart
        }, ct);

        var consumer = await js.CreateOrUpdateConsumerAsync("JOBS", new ConsumerConfig("workers")
        {
            AckPolicy  = ConsumerConfigAckPolicy.Explicit,
            AckWait    = TimeSpan.FromSeconds(30), // must exceed your worst-case processing time
            MaxDeliver = 5                         // stop redelivering a poison message after 5 tries
        }, ct);

        await foreach (var msg in consumer.ConsumeAsync<Job>(cancellationToken: ct))
        {
            await ProcessAsync(msg.Data, ct);          // the side effect
            await msg.AckAsync(cancellationToken: ct); // then ack
        }
    }
}

Register it with builder.Services.AddHostedService<JobWorker>(). The worker is a singleton, so resolve scoped dependencies like DbContext through IServiceScopeFactory.

Two stream settings shape how the queue behaves.

Storage is File (on disk, survives restarts) or Memory (faster, but gone on restart).

Retention controls when a message leaves the stream:

  • Limits (the default) keeps every message until it hits an age, size, or count limit. The stream is a replayable log, and reading a message doesn't remove it.
  • Workqueue drops a message the moment a consumer acks it, so the stream itself is the queue. Messages are delivered in publish order, oldest first (FIFO).
  • Interest keeps a message only while a consumer still needs it, then drops it once every interested consumer acks.

For a job queue, use Workqueue retention with File storage, as in the worker above.
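The three retention policies differ only in when a stored message is removed, which a small in-memory model can illustrate. This is a simplified Python sketch with hypothetical names, not the NATS implementation. For Limits it models only a message-count limit, and for Interest it assumes a fixed set of consumers known up front.

```python
class ToyStream:
    def __init__(self, retention, max_msgs=None, consumers=()):
        self.retention = retention       # "limits", "workqueue", or "interest"
        self.max_msgs = max_msgs         # only used by "limits"
        self.consumers = set(consumers)  # only used by "interest"
        self.messages = {}               # sequence number -> payload
        self.acks = {}                   # sequence number -> consumers that acked
        self.next_seq = 1

    def publish(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.messages[seq] = payload
        self.acks[seq] = set()
        if self.retention == "limits" and self.max_msgs is not None:
            # Over the limit: discard the oldest messages first.
            while len(self.messages) > self.max_msgs:
                self._drop(min(self.messages))
        return seq

    def ack(self, seq, consumer):
        if seq not in self.messages:
            return
        self.acks[seq].add(consumer)
        if self.retention == "workqueue":
            self._drop(seq)              # one ack removes the message
        elif self.retention == "interest" and self.consumers <= self.acks[seq]:
            self._drop(seq)              # removed once every consumer has acked
        # "limits": an ack never removes anything

    def _drop(self, seq):
        del self.messages[seq]
        del self.acks[seq]

    def payloads(self):
        return [self.messages[seq] for seq in sorted(self.messages)]


limits = ToyStream("limits", max_msgs=2)
for job in ("a", "b", "c"):
    limits.publish(job)
limits.ack(2, "workers")
print(limits.payloads())    # ['b', 'c']

work = ToyStream("workqueue")
first = work.publish("a")
work.publish("b")
work.ack(first, "workers")
print(work.payloads())      # ['b']

interest = ToyStream("interest", consumers={"billing", "email"})
seq = interest.publish("a")
interest.ack(seq, "billing")
print(interest.payloads())  # ['a']
interest.ack(seq, "email")
print(interest.payloads())  # []
```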

Acknowledge After the Side Effect

Look closely at the worker loop: it processes first, then acks. That order is the rule that makes JetStream reliable, and most quickstarts skip it.

Acknowledge the message after the side effect, never before.

JetStream gives you at-least-once delivery. If a worker runs a job and crashes before acking, JetStream redelivers it. But ack before the work is finished, and a crash leaves the job marked done with nothing to show for it.

A worker fetches a job, runs it, persists the result, and only then acks; a crash before the ack causes a redelivery

The flip side is that a job can run more than once, so your handler has to be idempotent. The usual fix is to track the messages you've already handled and skip duplicates, in the same transaction as the side effect. I covered the full pattern in The Idempotent Consumer Pattern in .NET. At-least-once delivery only holds up when the handler reading the stream is idempotent.
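As a minimal sketch of that idea (in Python with SQLite rather than the .NET stack used above, and with hypothetical table and function names), the handler records the message ID and applies the side effect in one transaction, so a redelivered message is detected and skipped:

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE results (job_id TEXT, payload TEXT)")
    return db


def handle(db: sqlite3.Connection, message_id: str, payload: str) -> bool:
    """Apply the side effect once per message_id; return False for a duplicate."""
    try:
        with db:  # one transaction: commits on success, rolls back on error
            # The primary key rejects a message ID that was already recorded.
            db.execute("INSERT INTO processed (message_id) VALUES (?)", (message_id,))
            db.execute(
                "INSERT INTO results (job_id, payload) VALUES (?, ?)",
                (message_id, payload),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # already handled: skip the work, but still ack the message


db = open_db()
# Simulate at-least-once delivery: the same message arrives twice.
for message_id, payload in [("m1", "resize"), ("m1", "resize"), ("m2", "crop")]:
    print(message_id, "processed" if handle(db, message_id, payload) else "duplicate")

print(db.execute("SELECT COUNT(*) FROM results").fetchone()[0])  # 2
```

Because the duplicate check and the side effect share a transaction, a crash between them cannot leave the message marked as handled without its result, or the result stored without the marker.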

Summary

NATS JetStream gives you a durable, at-least-once work queue from a single 18 MB binary, and it slots into an ASP.NET Core app cleanly: publish from an endpoint, process in a BackgroundService, ack after the work is done.

I went in skeptical, half-expecting to miss RabbitMQ. It won me over: easy to operate, no surprises, and it clusters with Raft-based replication when a bigger load calls for it. It's now the first thing I reach for when I need a queue and don't want to think much about the broker.

If you haven't tried it, spin up the container and publish a message. That's all there is to getting started.

Thanks for reading.

And stay awesome!





Saying the obvious thing


Stating the obvious is surprisingly useful. Most of your knowledge lives below the threshold of conscious awareness, so it’s possible for a piece of writing to remind you of what you already know. It’s common to know you don’t like something without being quite sure why, and reading an obvious statement (such as “accuracy matters, even when you agree with the broad strokes”) can help clarify why you find certain things distasteful.

Sometimes you can see some obvious truth that nobody seems to be talking about, and reading it in someone else’s words can prompt an “oh god, I’m not crazy” moment of catharsis. For many junior engineers, it’s almost a rite of passage to notice that some percentage of software engineers do virtually no work. Since nobody talks about it (how would you even bring it up in the workplace?), they often feel like they’re losing their minds: surely this state of affairs wouldn’t be allowed to continue, so they must be completely misreading the situation. But in fact it’s true.

Stating the obvious is hard. It can even be dangerous: sometimes there’s a good reason nobody says the obvious thing. But I think the bigger reason it’s hard is the same reason it’s hard to draw what you actually see. When I look at a person and try to draw them, I’m not drawing the lines and shades my eye sees (like a printer or camera might). I’m drawing what I know the person looks like, which is a kind of stick-figure approximation. It takes time and effort to drop the layer of interpretation and draw what’s actually there [1].

Many of the posts I’m most proud of are times when I’ve managed to articulate something I think is obviously true: engineer reputation is determined by ratchet effects, good engineers are right most of the time, you shouldn’t just do JIRA tickets (or glue work), and so on. These are all things I’ve believed for a while, but have only (relatively) recently been able to notice that I believe them. Sometimes I’m helped along by reading something I vehemently disagree with (like “nobody gets promoted for doing simple work”, or “big egos have no place in tech”).

Stating the obvious doesn’t mean avoiding nuance. Every obvious claim carries with it a host of subtle, non-obvious claims. For example, I believe that having a big ego can be very useful as a software engineer. But why exactly is that, and what do I mean by ego? Obviously it’s not good to be constantly flexing your status on other people, or to be unable to tolerate the possibility of being wrong. However, I do think you need to be able to take firm technical positions even when the situation is uncertain, which means you have to be confident in your technical instincts. Teasing out that distinction (and its implications) is very interesting, but in order to do it you need to be able to first articulate the obvious part.

I’ve been talking about stating the obvious in technical blogging. But this principle applies just as well to other kinds of communication. When I write a technical design document at work, it’s very important to state the obvious. In fact, technical communication is so hard and general understanding is so poor that just getting people aligned on the obvious things is often enormously valuable. Much great literature and poetry aims to bring out some obvious but hard-to-articulate part of human experience.

Don’t avoid writing something down just because you think it’s obvious. The thing you think is obvious now might recede into your subconscious in an hour; get it written down while you can! And don’t avoid writing something down because you think it’s dangerous to say and everyone already knows it. For people new to the area, reading your words can help them feel like they’re not losing their minds. Finally, once you write down the obvious thing, you can go on to draw out the parts that are less obvious, in a way that you couldn’t if you tried to skip straight to the subtleties.


  1. Incidentally, this is why most people cannot draw a bicycle on their first attempt. Unless you’re a mechanical engineer, you probably do not have a stick-figure-level approximation of what a bicycle looks like in your head, so you begin confidently (after all, you’ve seen a thousand bicycles) and get stuck after the first few lines.


The case of the DLL that was not present in memory despite not being formally unloaded, part 2

Last time, we looked at crashes caused by a DLL being removed from memory behind everybody’s back: somebody tried to call into a DLL that everybody thought was still there, but it no longer was.

A colleague of mine who was looking at other crashes coming from this process found that most of those other crashes were also of the form “a data structure was corrupted because somebody wrote the single byte 01 into it.” That piece of information made everything fall into place for my side of the investigation.

We saw earlier that the bottom bit of the HMODULE is set for datafile module handles. Therefore, if one of these stray 01 bytes happens to overwrite the bottom byte of an existing HMODULE handle, that turns it into a (fake) datafile module handle. And then, when a component dutifully cleans up the DLLs it loaded by freeing them during process destruction (say, because they were stored in an RAII type like wil::unique_hmodule), the code will pass this (fake) datafile module handle to FreeLibrary. The FreeLibrary function sees the bottom bit set and says, “Oh, this must be the handle to a module that was loaded via LOAD_LIBRARY_AS_DATAFILE,” so it frees it as a datafile.
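
The bit arithmetic can be sketched in portable C++ without calling any Windows API. The helper names and the sample address are made up for illustration. The only rule taken from the text is that a set bottom bit marks a datafile handle.

```cpp
#include <cassert>
#include <cstdint>

// Rule taken from the text: a handle whose bottom bit is set is treated as
// a datafile module handle. The function name is hypothetical.
constexpr bool looks_like_datafile(std::uintptr_t handle) {
    return (handle & 1u) != 0;
}

// Model a stray write of the single byte 01 over the lowest byte of a handle.
// On little-endian Windows targets, that is the byte stored at the handle
// variable's own address.
constexpr std::uintptr_t overwrite_low_byte_with_01(std::uintptr_t handle) {
    return (handle & ~static_cast<std::uintptr_t>(0xFF)) | 0x01;
}

inline void stray_byte_example() {
    // Made-up value. A real HMODULE is the module's base address, which is
    // aligned, so its low byte is normally zero.
    const std::uintptr_t real_handle = 0x10000000;
    assert(!looks_like_datafile(real_handle));

    const std::uintptr_t corrupted = overwrite_low_byte_with_01(real_handle);
    assert(corrupted == 0x10000001);
    assert(looks_like_datafile(corrupted)); // now indistinguishable from a datafile handle
}
```

Because the low byte of a genuine handle is already zero, the one-byte write changes exactly one bit, which is why the corrupted value still points at the right module.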

Freeing a datafile module means undoing the steps that were taken when the module was loaded as a datafile: Unmapping the DLL from memory. In particular, loading a module as a datafile does not add the DLL to the list of DLLs that were loaded as code; therefore, unloading a datafile module doesn’t remove it from that list. As far as the DLL list is concerned, the DLL is still in memory.
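
The bookkeeping mismatch can be modeled with a toy loader. This is a deliberately simplified sketch, not how the Windows loader is implemented: `ToyLoader` and its members are hypothetical, reference counting is omitted, and "mapped" is just a set of base addresses.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>

// Toy model: one set for what is mapped in memory, one list for modules
// that were loaded as code. Real loader state is far more involved.
struct ToyLoader {
    std::set<std::uintptr_t> mapped;                 // base addresses currently mapped
    std::map<std::uintptr_t, std::string> code_list; // the "loaded as code" list

    std::uintptr_t load_as_code(std::uintptr_t base, std::string name) {
        mapped.insert(base);
        code_list[base] = std::move(name);
        return base; // handle is the base address
    }

    void free_library(std::uintptr_t handle) {
        if (handle & 1u) {
            // Datafile path: unmap only, never touch the code list.
            mapped.erase(handle & ~static_cast<std::uintptr_t>(1));
        } else {
            mapped.erase(handle);
            code_list.erase(handle);
        }
    }

    // True when the list still claims a module that is no longer mapped.
    bool listed_but_unmapped(std::uintptr_t base) const {
        return code_list.count(base) != 0 && mapped.count(base) == 0;
    }
};

inline void toy_loader_example() {
    ToyLoader loader;
    const std::uintptr_t handle = loader.load_as_code(0x10000000, "example.dll");
    loader.free_library(handle | 1u); // corrupted handle takes the datafile path
    assert(loader.listed_but_unmapped(0x10000000));
}
```

After the call, the list still names example.dll while nothing is mapped at its address, so any later call into it lands in unmapped memory.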

A one-bit error caused the code to lie and attempt to free a module handle that did not correspond to a LoadLibrary call, resulting in mass havoc.

The “DLL unmapped from memory” crash is just an alternate manifestation of the “somebody is writing 01 bytes to places they shouldn’t” bug. The original bug had a larger bucket spray than we initially thought.

The good news is that all of the crashes have funneled down to a single bug. The bad news is that you now have to debug this one memory corruption bug.

Unfortunately, at the time of this writing, the root memory corruption bug in the third-party program has yet to be identified. We don’t know whether it’s coming from an operating system component or from the program itself. However, the fact that it appears to occur only in one process, where it sprays across multiple modules, suggests that it’s a problem with that program, or that there’s something peculiar about how this specific process uses the system.

If you look at the original stack trace, you can see that the problem is occurring at process termination. That’s probably why the problem has lurked for so long: Crashes at exit often go unnoticed because there is no end-user loss of functionality. The user was finished with the program anyway. Whether it exits cleanly or with a crash doesn’t affect the user much.

Sorry. Not all stories have a happy ending.

The post The case of the DLL that was not present in memory despite not being formally unloaded, part 2 appeared first on The Old New Thing.
