Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.
154856 stories
·
33 followers

Building a GitHub Copilot Agent Usage Dashboard

1 Share

Introduction

Working with organisations that are attempting to create GitHub Copilot custom agents, take-up of these agents by their community becomes important to know. Some questions quickly emerge are "how well are we actually using it?", "which agents are getting used and which have not had that much traction?".

Native metrics provide high-level insights into adoption, but they lack the depth needed to answer more granular questions—such as which agent workflows are most used, or how behaviour evolves over time.

In this post, I’ll walk through how to build an enterprise-grade GitHub Copilot usage dashboard that captures detailed telemetry from VS Code using OpenTelemetry, processes it in Azure Monitor, and visualises insights in Grafana—all using a reproducible, infrastructure-as-code approach. The dashboard can be made available to anyone that needs it.

Architecture

VS Code can be configured to emit metrics using Open Telemetry as a standard. This is a configuration item in VS Code and you essentially point it to an Open Telemetry Collector. The collector is an endpoint that can consume the telemetry.

In this implementation, it is a container image that is hosted in Azure and I have chosen Azure Container Apps (ACA) for this purpose as it is an easy to use managed environment - but it could also run in Azure Kubernetes Service (AKS) with a little more effort.

There is a prebuilt image opentelemetry collector for this and this has been adapted to inject configuration to send the telemetry to Azure Application Insights.

For defining and hosting the dashboard, I have chosen another Azure managed service Azure Managed Grafana

Sample Dashboard

The sample dashboard is one that contains a collection of visualisations derived from the collected data in Application Insights. Azure Managed Grafana allows you to visually author these dashboards or they can be implemented as a JSON file and adapted from there.

Note that the telemetry generated by VS Code gives the location of the users - city, region and country, but does not include any personally-identifying information (PII) and so cannot be used to track individuals. As I understand it, this is by design.

Managed Grafana has its own permission structure, which may then be used to give users access to the dashboard. 

Implementation Details

There is a GitHub repo Copilot Usage Dashboard that contains details of how to implement this together with instructions for either "click-ops" 🙂creation or via Terraform. So I suggest you follow the link to my repo to look at the details. In summary, there needs to be in Azure:

  • Azure Container App (ACA) that hosts the collector - this needs to have public ingress
  • Azure Container Registry (ACR) that hosts the docker image that is customised via the Dockerfile
  • Key Vault that hosts the Application Insights connection string that ACA references
  • Application Insights - this needs to be created with a flag to allow it to work with Grafana data
  • Log Analytics Workspace that works with Application Insights
  • Azure Managed Grafana to host the Grafana dashboard

The main thing to bear in mind is that VS Code needs to be configured to emit OpenTelemetry

{ "github.copilot.nextEditSuggestions.enabled": true, "github.copilot.chat.otel.enabled": true, "github.copilot.chat.otel.exporterType": "otlp-http", "github.copilot.chat.otel.otlpEndpoint": "https://<fqdn>" }

where the FQDN is the URL of the public ingress to the Azure Container App.

There is a Dockerfile in this repo that just injects the correct configuration file into the OpenTelemetry collector image. It is this configuration file that tells the collector to emit to Application Insights. It is of the form below:

receivers: otlp: protocols: grpc: endpoint: 0.0.0.0:4317 http: endpoint: 0.0.0.0:4318 processors: batch: attributes: actions: - key: environment value: "prod" action: upsert exporters: azuremonitor: connection_string: "${APPLICATIONINSIGHTS_CONNECTION_STRING}" debug: verbosity: detailed service: pipelines: traces: receivers: [otlp] processors: [batch, attributes] exporters: [azuremonitor, debug] metrics: receivers: [otlp] processors: [batch, attributes] exporters: [azuremonitor]

As can be seen above, there is a placeholder for the Application Insights connection string - in the ACA configuration this is an environment variable that then points to a secret which is in key vault.

If all is well, VS Code will emit telemetry to the container image running in ACA and this will use its configuration to send to Application Insights. The Grafana dashboard then using this data.

Troubleshooting

The GitHub repo goes into the detail of troubleshooting, but the overall steps to troubleshoot are:

  1. If there is no data in Grafana, check that Grafana has access to Application Insights
  2. check whether there is telemetry being pushed into Application Insights by looking at the logs and looking for the contents of the table Dependencies. If there is telemetry there, then it is Grafana permissions. If not, look to ACA
  3. Look at ACA logs to see if it is healthy and look to see if there is any logs being received
  4. Use a curl request to send a fake log to ACA (a sample is in the repo) to see if the ACA is accepting logs
  5. Check the connection to Application Insights is correct and is being pulled from key vault or replace the environment variable value with the connection string directly
  6. If all good so far, then it may be that the configuration in VS Code is not correct or in the correct place.

Hopefully the more detailed steps will resolve any issues quickly.

Further thoughts and enhancements

This implementation attempts to build a dashboard showing GitHub Copilot agent usage using a standard set of security controls, but more may be needed. Here is a list of possible enhancements:

  1. A more refined dashboard. This should be easy as there are samples for all sorts of visualisations and few of these may allow more focus on agent and model usage.
  2. the ACA-hosted OpenTelemetry collector has a public-facing ingress. This may need to be locked-down at the network level by address restriction or by a non-public ingress. Care would need to be taken to make sure that this is then visible/reachable to the intended VS Code user audience
  3. The ACA collector endpoint is not authenticated in of itself. This could be achieved at the container level by putting an authenticating proxy in the Dockerfile or at the ACA ingress level. Some investigation would be needed to see how the VS Code configuration could work with this and this may dictate largely what form this authentication can take.
  4. How the VS Code configuration changes can be automated for a user base has not been investigated as part of this work. It is assumed that an organisation may be able to roll-out these changes using their application deployment automation.

Summary

This approach provides a means by which an organisation can track the usage of GitHub Copilot agents (and their models), that is not provided by GitHub Enterprise dashboards. This will provide insights into the take up of custom agents and their underlying models - allowing an organisation to test whether their investments on custom agents are being used effectively. Additionally, the dashboards themselves can easily be rolled-out to a wider community than GitHub Enterprise one.  

Read the whole story
alvinashcraft
just a second ago
reply
Pennsylvania, USA
Share this story
Delete

Bringing Goodnotes to the web with Swift and WebAssembly

1 Share

Goodnotes has been helping millions of users take handwritten notes on iPad for over a decade, earning recognition as Apple’s iPad App of the Year in 2022. Today, the same Swift code that powers our iOS app also runs seamlessly in web browsers through WebAssembly, delivering the exact same ink rendering and note-taking experience users love.

Goodnotes Logo Goodnotes Logo

This journey demonstrates that Swift excels as a cross-platform language, running high-performance applications while sharing the same codebase. Every bug fix and improvement to Goodnotes benefits all our users simultaneously, regardless of which platform they use.

After two years of development and over two years in production at Goodnotes, we’ve shown that Swift on WebAssembly is a viable, powerful approach for building complex, performance-critical web applications.

Why we chose Swift for the web

When we decided to bring Goodnotes to the web in 2021, we faced a critical decision. After more than 10 years of development, we had accumulated millions of lines of Swift code that implemented countless refinements and optimizations for digital ink rendering, document synchronization, conflict resolution using Conflict-Free Replicated Data Types (CRDTs), and content search and document indexing.

We need to maintain more than 60 Frames Per Second (FPS) for real-time ink rendering, which makes performance critical. A JavaScript rewrite, Flutter, or Kotlin Multiplatform would all require rewriting our entire rendering engine from scratch, a substantial undertaking that would have delayed our web launch by years and inevitably introduced behavioral differences between platforms.

SwiftWasm emerged as the solution. This community-driven project allows Swift code to compile to WebAssembly, running in browsers with good performance. We started experimenting with SwiftWasm, building prototypes to validate the approach. Our first experiment focused on our handwriting component, a performance-critical part of Goodnotes that would serve as a good indicator of WebAssembly’s capabilities. The results were promising enough that we committed to this path.

The most compelling benefit wasn’t just code reuse, but the guarantee of behavioral consistency. When users draw a stroke on their iPad and later open the same document on the web, they see exactly the same curves, the same pressure sensitivity, the same ink flow. This isn’t because we carefully reimplemented the same algorithms twice: it’s because it’s literally the same Swift code running on both platforms.

Technical architecture

Goodnotes WebAssembly Architecture showing shared Swift code between iOS and Web platforms.
Goodnotes Architecture: Shared Swift code between iOS and Web platforms.

Our architecture is built around a clear separation between platform-specific UI components and shared business logic. This design enables us to maintain behavioral consistency while leveraging platform-native capabilities where appropriate.

Shared core components

The heart of our application consists of three main parts:

Content Rendering Engine: This handles the real-time rendering of notebook content and interactive ink strokes. We use a custom rendering engine built on low-level graphics APIs: Metal on iOS and WebGL on the web. The rendering logic is almost entirely shared, with only platform abstraction layers implemented separately for each platform.

Business Logic Layer: Document modeling, handwriting recognition, and document indexing are all implemented in shared Swift packages.

View Models: Core view models that handle tool interactions and user gestures are shared across platforms.

Code sharing metrics

Our codebase demonstrates significant code reuse:

  • Total Web Swift codebase: 2.2 million lines of code
  • Shared Swift code: 1.47 million lines (66% of the web app, 34% of the iOS app)

While lines of code isn’t the best metric, these numbers reflect the substantial business logic and rendering engine that we successfully share between platforms.

Binary size and loading

The final WebAssembly binary is approximately 50 MB, which compresses to 12 MB with Brotli compression. We use Service Workers for efficient caching and fast load times for users.

JavaScript interoperability

We use JavaScriptKit for seamless interoperability between Swift and JavaScript. This allows us to integrate with the existing web ecosystem while keeping our core logic in Swift.

Platform compatibility considerations

When sharing code between iOS and WebAssembly targets, we encountered several important considerations:

Concurrency Model: libdispatch APIs are unavailable on WebAssembly targets. We migrated from direct libdispatch usage to Swift Concurrency’s async/await and actors, for better cross-platform compatibility.

Architecture Differences: On wasm32, Swift’s Int has a 32-bit width. Some code assumed Int only held 64-bit values, so it had to be updated to use Int64 explicitly.

Dependency Injection: Network access and other I/O operations are abstracted through dependency injection, allowing us to provide platform-specific implementations while keeping the core logic shared.

Multithreading with WASI threads

One of the most significant technical achievements was implementing true parallelism using WebAssembly System Interface (WASI) Threads with Web Workers and SharedArrayBuffer. This allows us to:

  • Run handwriting recognition in background Web Workers
  • Perform document indexing without blocking the main thread
  • Maintain smooth rendering at more than 60 FPS while processing complex operations

Swift Concurrency’s Custom Actor Executors (SE-0392) were crucial for managing the web platform’s constraints. JavaScript objects are isolated to their originating thread, so we needed precise control over where our Swift actors execute. JavaScriptKit provides several APIs to create a SerialExecutor for a dedicated Web Worker, enabling us to pin specific actors to specific Web Workers.

This architecture ensures that computationally-intensive tasks like handwriting recognition run in the background while UI operations stay on the main thread, while still allowing access to JavaScript objects inside background threads.

Performance Impact: This multithreading approach delivered a greater than 2x improvement in Interaction to Next Paint (INP), significantly enhancing the UI responsiveness during complex operations.

Security Considerations: Modern browser security policies require Cross-Origin Isolation to use SharedArrayBuffer. While this adds some complexity to the application, it’s a necessary trade-off for the performance benefits of true parallelism. For applications that can’t meet these requirements, single-threaded cooperative concurrent execution remains a viable option.

Development experience

One of the most significant aspects of our Swift on WebAssembly experience was the development workflow. The tooling ecosystem is mature and powerful, providing a solid development experience.

IDE support

We can develop using either Xcode or VS Code with SourceKit-LSP, providing full language server support including autocomplete, error checking, and refactoring capabilities.

Xcode doesn’t currently have WebAssembly platform support, so code completion and other features are limited for WebAssembly-specific APIs. SourceKit-LSP, however, has Swift SDK support, so by properly configuring .sourcekit-lsp/config.json, you can get code completion for WebAssembly targets as well.

Debugging and development tools

You can debug Swift code directly in Chrome DevTools: set breakpoints, inspect variables, and step through your Swift code as naturally as JavaScript. We developed a Chrome DevTools extension library that enables Swift-specific variable reflection and source-level debugging, building upon the existing WebAssembly debugging capabilities. For more details on the enhanced DWARF extension for Swift, see the Swift on WebAssembly debugging guide.

Chrome DevTools debugging Swift code compiled to WebAssembly, showing breakpoints, variable inspection, and call stack.
Debugging Swift code in Chrome DevTools with full source code visibility and variable inspection.

Performance profiling

The existing web ecosystem provides powerful performance profiling tools. Chrome’s Performance tab shows exactly where time is spent, down to individual Swift functions, and the Memory tab gives us good insight into memory usage patterns. For most performance optimization tasks, these standard tools are quite effective.

For more advanced cases requiring specialized memory profiling capabilities and detailed heap analysis, the growing WebAssembly ecosystem provided the foundation for building custom tools. We developed wasm-memprof for detailed heap profiling when optimizing memory usage. This tool provides insights into memory allocation patterns that aren’t easily visible through standard web profiling tools.

wasm-memprof performance profiling tool showing flame graph and memory allocation analysis for WebAssembly applications.
Performance profiling with wasm-memprof showing memory allocation patterns and optimization opportunities.

Contributing back to the community

As part of our journey, we’ve been able to contribute back to the Swift community in meaningful ways. All WebAssembly-related changes have been upstreamed, and the WebAssembly platform has been supported since Swift 6.2! This means that other teams can now benefit from the same tooling and language features that made our project successful.

Lessons learned

Our experience has shown that Swift on WebAssembly is production-ready for complex applications. The language’s safety features, performance characteristics, and modern concurrency model translate well to the web platform.

For teams considering this path, here are our key recommendations:

  • Start with a performance-critical component to validate the approach.
  • Invest in proper platform abstraction layers early.
  • Leverage Swift Concurrency for cross-platform compatibility.
  • Plan for the security requirements of SharedArrayBuffer if multithreading is needed.
  • Consider gradual adoption rather than complete rewrites for existing projects.

Consider using Swift for your web projects. The growing WebAssembly ecosystem and improved tooling support make this an increasingly viable option for teams looking to share code across platforms.

Get involved with Swift on WebAssembly

Swift has fulfilled its promise as a powerful, expressive language that works everywhere. From mobile devices to servers to web browsers, Swift code can run efficiently while maintaining the safety and performance characteristics that developers appreciate.

The Swift on WebAssembly ecosystem is more accessible than ever. Here’s how you can get involved:

For Developers:

For SwiftWasm Contributors:

With Swift’s official WebAssembly support, we’ve entered a new era of cross-platform development. The same language that powers iOS applications can now create web experiences that are fast, safe, and maintainable.

Swift’s future is increasingly cross-platform, and we’re excited to see what the community builds next.

Read the whole story
alvinashcraft
17 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

Gemini’s new AI agent is about as good as Google’s demo

1 Share

Google's new "24/7" AI agent, Gemini Spark, can be shockingly good at doing things on your behalf. But I'm not sure it's worth the financial cost and potential privacy tradeoffs.

The company gave me access to Spark last week. Google advertises Spark as an AI agent that can take on tasks and work on them in the background - even tasks that have multiple steps - allowing you to put your phone down or walk away from your computer. It also advertises at the very top of the Spark website that it's "always under your direction," that "you choose to turn it on," and that "it's designed to check with you before taking major actions." Given the moun …

Read the full story at The Verge.

Read the whole story
alvinashcraft
39 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

How to Build an AI Support Agent That Knows When NOT to Answer Tickets

1 Share

Most AI support agent tutorials show you how to wire up Retrieval Augmented Generation (RAG) and call it a day. Convert the docs into numeric vectors, pull the closest few passages to the user's question, drop them into a prompt, and ship a polite reply.

This pattern works for FAQ tickets, but it breaks the moment a user writes "my card was stolen", for example. The agent confidently quotes an outdated phone number, the user loses minutes which matter, and the support team finds out from a complaint.

I'm a full-stack software engineer working with fintech systems. I shipped a multi-domain triage agent for the HackerRank Orchestrate hackathon, a 24-hour solo build judged across four axes. The agent handled real support tickets across HackerRank, Claude, and Visa, grounded only in the documentation provided with the starter repo. Two of those domains tolerate a wrong answer. The third does not. I ranked 9th of 1,349 participants on the final leaderboard. The full source is on GitHub.

This article walks through the pattern I used to keep the agent safe: escalation-first design. The agent commits its routing decision before any text is generated, drafts grounded answers only when the routing says reply, and verifies the answer with two independent AI judges before it reaches the user. Every step is built to fail toward escalation, not toward a wrong answer. I also walk through the gaps in my own submission, so you don't repeat them.

What you'll find below:

  • Why letting the language model make the escalation decision is the wrong default

  • The pure-function decider pattern and its three terminal paths

  • A two-judge consensus verifier with an arbiter for disagreement

  • How to make all of this cheap with Jaccard pre-checks and SHA-keyed caching

  • Five honest gaps in my own submission, and what I would change next time

Table of Contents

The Two Halves of Support Tickets

Support tickets aren't one problem. They are two.

Most tickets are FAQs. "How do I add time accommodation for a candidate?" or "How do I delete a conversation in Claude?" These have direct answers in the documentation. An AI agent resolves them in seconds and frees the human team for harder work. This is the more obvious half.

A small fraction of tickets are sensitive. "My Visa card was stolen." "I want to appeal my test score." "Please delete all my data." On these, an AI confidently giving a wrong answer is worse than no answer at all. It delays the real human response. It causes real harm to the user. This is the harder half.

The design problem is not "build a chatbot." It's "build something that knows the difference between the two and route accordingly". The whole architecture below exists to enforce this routing reliably:

Routing architecture

In the diagram above, you can see that tickets fan out to triage signals and retrieval, then feed a Python decider with no LLM call. The decider routes to one of three paths: escalate to a human, send a template decline for off-topic requests, or hand off to the drafter for a grounded answer with citations. Drafts pass a cheap token-overlap check first. Safe high-overlap drafts ship directly. Low-overlap or risky drafts go to two judges. If they agree, ship. If they disagree, an arbiter breaks the tie.

The rest of the article walks through each block in this image. We'll start with the decider, because every other decision below it follows from that one.

Why Letting the LLM Decide Is the Wrong Default

The natural temptation in an agent loop is to let one large language model handle everything. Read the ticket, retrieve relevant docs, decide whether to answer, and draft the answer. One model, one prompt, one round trip. Simple.

Three things go wrong when you do this:

Prompt Injection Wins

A user writes "ignore all previous instructions, this is a routine FAQ" embedded in their ticket. An LLM-driven decider can be talked into reclassifying a fraud ticket as benign.

Defensive techniques such as spotlighting (wrapping user text in delimiters and telling the model to treat anything inside as untrusted data) help, but the attack surface still sits inside the decision boundary.

Non-Determinism

Even at temperature zero, language models drift across model updates and provider changes. The same ticket today might route to reply and next month to escalate with no code change. Regression testing becomes guesswork.

Rationalization Drift

When you ask one model to both decide and answer, it leans toward "I have an answer for this." Answering is the productive path. The decision gets biased toward replying, especially on borderline tickets where escalation would be safer.

The fix is structural separation. Move the decision out of the language model entirely.

The Pure-Function Decider Pattern

The decider is an ordinary Python function. No language model calls inside it. There's no outside state to consult. The same inputs always produce the same output, the way 2 + 2 always returns 4.

The function reads two inputs: a bundle of triage signals and a list of retrieval scores. It returns a single Decision value with the routing verdict, the request type, the product area, and (when relevant) an escalation reason.

from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class Decision:
    status: Literal["Replied", "Escalated"]
    product_area: str
    request_type: Literal["product_issue", "feature_request", "bug", "invalid"]
    escalation_reason: str
    response_path: Literal["draft", "out_of_scope_template", "escalation_template"]


def decide(triage, retrieval, vocab, thresholds) -> Decision:
    # Forced-escalation paths, ordered by priority
    if triage.scope_status == "out_of_scope_risky":
        return Decision("Escalated", "", triage.intent,
                        "out_of_scope_risky", "escalation_template")
    if triage.scope_status == "invalid":
        return Decision("Escalated", "", "invalid",
                        "invalid_or_spam", "escalation_template")
    if triage.risk_flags:
        return Decision("Escalated", "", triage.intent,
                        f"risk:{triage.risk_flags[0]}", "escalation_template")
    if triage.injection_score > 0.7:
        return Decision("Escalated", "", "invalid",
                        "injection_attempt", "escalation_template")

    # Out-of-scope benign: template reply, no drafter call needed
    if triage.scope_status == "out_of_scope_benign":
        return Decision("Replied", "", "invalid", "", "out_of_scope_template")

    # Retrieval confidence gates
    if not retrieval:
        return Decision("Escalated", "", triage.intent,
                        "no_retrieval", "escalation_template")
    top1 = retrieval[0].score
    if triage.domain == "none_inferable" and top1 < thresholds.t_cross:
        return Decision("Escalated", "", triage.intent,
                        "cross_domain_low_score", "escalation_template")
    if top1 < thresholds.t_floor:
        return Decision("Escalated", "", triage.intent,
                        "low_retrieval_score", "escalation_template")

    # Replied: grounded draft path
    product_area = _pick_product_area(retrieval[:5], vocab)
    return Decision("Replied", product_area, triage.intent, "", "draft")

Every branch is auditable. A human reads the function once and knows exactly which conditions trigger an escalation. The unit test suite for this function in my project was fifteen tests long. Every branch had at least one test.

Compare this to "the language model decided to escalate." Which prompt? Which model version? Which input phrasing? You can't answer.

Three Terminal Paths Instead of Two

The naïve support agent has two outputs: reply or escalate. Real support has three:

  1. Reply with a grounded answer: The agent has supporting documentation and the request is in scope.

  2. Reply with a polite scope decline: The user asked something benign but off-topic. "What's the weather?" gets a template response saying this is outside our support scope, here's what we help with. No language-model call needed. No escalation.

  3. Escalate to a human: Risk flag fired, retrieval failed, injection detected, or the request is risky and off-topic.

The determination between a benign request the agent declines on its own and a sensitive one it hands to a human happens before the decider runs, inside the triage step. Triage reads the ticket once, under spotlighting, and tags it with a scope_status and a list of risk flags. The decider then reads those tags.

Two signals drive the split between path two and path three:

  • Scope classification. Triage labels every off-topic ticket as either out_of_scope_benign or out_of_scope_risky. A weather question or a movie-trivia question is benign. It touches no account, no money, and no safety concern, so the agent answers with a template decline. A request to close an account or dispute a charge is also outside the documentation, but it carries account and financial stakes, so it routes to a person.

  • Risk flags. A separate set of detectors scans for account-level and safety-sensitive intents: lost or stolen card, suspected fraud, data-deletion requests, score appeals. Any match forces escalation regardless of scope. The cost of a wrong answer on these is unrecoverable, so the agent never tries to handle them itself.

The rule is conservative by construction. The agent declines a ticket on its own only when both signals agree it is harmless. Anything that smells of money, identity, or account state goes to a human.

When triage is unsure which bucket a ticket belongs in, the missing or low-confidence scope signal pushes it down an escalation branch rather than the template-decline branch. Uncertainty resolves toward a human, never toward an unprompted reply.

The third path is the differentiator. Without it, every off-topic ticket lands in the human queue and burns staff time on questions the agent should politely decline. With it, the agent absorbs the low-value off-topic load and reserves human attention for the small fraction of tickets where humans add value.

The decider above implements the three paths through the response_path field. The downstream orchestrator reads this field and dispatches to one of three handlers: the drafter, a template function, or an escalation string.

The Consensus Verifier as a Second Safety Net

A pure-function decider gates which tickets enter the drafter. The drafter writes a response with sentence-level citations into the corpus. The next question: how do you know the response is faithful to the documentation?

A single language model verifier is fragile. The same model which wrote the response is biased toward approving it. Even a different model has blind spots in its training data. The fix is consensus: two independent judges plus an arbiter for disagreement.

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ConsensusResult:
    score: float
    primary: float
    secondary: float
    arbiter: float | None
    agreed: bool


def consensus_faithfulness(
    draft: str,
    chunks: list,
    primary_call: Callable,
    secondary_call: Callable,
    arbiter_call: Callable,
    agree_delta: float = 0.25,
) -> ConsensusResult:
    p = primary_call(draft, chunks)
    s = secondary_call(draft, chunks)
    if abs(p - s) <= agree_delta:
        return ConsensusResult((p + s) / 2.0, p, s, None, True)
    a = arbiter_call(draft, chunks)
    return ConsensusResult(a, p, s, a, False)

The contract is intentionally minimal. The function takes three callable judges, each producing a faithfulness score between zero and one. The primary and secondary always run. The arbiter only runs on disagreement, defined as a score gap wider than 0.25.

For independence, give each judge a different prompt framing. The primary asks for a holistic score. The secondary counts unsupported claims and computes a ratio. The arbiter reasons step by step and emits a final score. Same task, different cognitive paths. A failure mode hiding from one framing is unlikely to hide from the other.

For cross-vendor independence, you just swap the secondary judge for a model from a different provider. The pattern I borrowed from the open-source Passmark library uses Claude Haiku as primary, Gemini Flash as secondary, and Gemini Pro as arbiter. OpenRouter sits in front of both providers behind a single API key, which keeps the cost manageable and gives you real vendor diversity. Different training data. Different blind spots.

The downstream decision is asymmetric:

def verify(draft, retrieval, triage, thresholds, consensus_call):
    # Free Jaccard sanity first
    if not draft.citations:
        return VerifyResult(False, 0.0, "missing_citations", False)
    overlaps = [_jaccard(draft.text, c.cited_text) for c in draft.citations]
    avg_jaccard = sum(overlaps) / len(overlaps)
    jaccard_ok = avg_jaccard >= thresholds.jaccard_min

    # Skip the consensus gate when the cheap path already confirms safety
    is_risk = bool(triage.risk_flags) or triage.injection_score > 0.7
    top1 = retrieval[0].score if retrieval else 0.0
    is_safe = jaccard_ok and not is_risk and top1 >= thresholds.t_high
    if is_safe:
        return VerifyResult(True, avg_jaccard, "safe_path_skipped", False)

    # Otherwise call the consensus gate
    score = consensus_call(draft.text, retrieval[:5])
    threshold = thresholds.strict if is_risk else thresholds.lenient
    return VerifyResult(score >= threshold, score,
                        f"score={score:.2f}", True)

Risk-flagged tickets get the strict threshold of 0.7. Normal FAQs get 0.5. The asymmetry matches the cost of being wrong. A wrong answer on a fraud ticket is unrecoverable. A wrong answer on a how-to question is annoying but recoverable.

Cost and Observability

The escalation-first pattern reads expensive on paper. Three judges per ticket sounds costly. In practice, it's cheap because the verifier runs in tiers, from free to paid.

The first check is a Jaccard score between the draft and the cited passages. Jaccard is a simple set-overlap measure: split each text into a set of tokens, divide the size of the intersection by the size of the union, and you get a number between zero and one. It's free, runs in microseconds, and catches the obvious failures. Most drafts produced from high-confidence retrievals pass Jaccard without the language-model judges ever running.

The second saving comes from disk caching. You can hash the model's input (prompt plus user content) with SHA-256 and write the response to a file named after the hash. The next call with the same input reads from disk instead of the API.

Across a 24-hour build with twenty iteration runs, my cache hit rate sat above 80%. The total spend across the full hackathon was under five dollars, including Claude Sonnet draft calls and Gemini Pro arbitration on disagreement.

For observability, write one JSON line per ticket to a trace file (a format called JSONL, JSON Lines, where each line is a complete JSON object). Capture every signal:

{
  "row_id": 5,
  "ticket": {"issue": "...", "company": "Visa"},
  "triage": {"domain": "visa", "risk_flags": ["lost_or_stolen_card"]},
  "retrieval": [{"score": 0.0, "rank": 0, "source_path": "..."}],
  "decision": {"status": "Escalated", "reason": "risk:lost_or_stolen_card"},
  "draft": null,
  "elapsed_ms": 12
}

When a human auditor or an AI judge asks why this row escalated, you grep the trace file and read a complete story in one line. No log archaeology. No replay.

Where I Got It Wrong

The pattern above earned the agent a strong technical-execution score in the hackathon. Output accuracy, scored against a held-out ticket set with gold labels, was the weakest of the four judged axes. The architecture was sound. The labeled-data foundation underneath it was not.

I tuned every threshold, vocabulary list, and escalation rule against ten labeled sample rows. Ten rows is not a labeled set. It's a hint. I treated it as ground truth. The threshold of 0.30 for retrieval-floor escalation came from one natural break in a plot of ten points. With fifty points the break might have lived at 0.42. With a hundred points the right answer might have been per-domain thresholds.

The same root cause showed up across columns. Product Area scored 60 to 70% on the sample. Extrapolating to the production set, roughly nine of twenty-nine rows missed on this column alone. The vocabulary list (screen, community, privacy, conversation_management, travel_support, general_support) came from observed sample labels. Seven labels from ten rows. The production set almost certainly contained categories I never saw.

Three sub-leaks I now know I should have closed:

Labeler-Specific Calls

One sample row asked "What is the name of the actor in Iron Man?" with company set to None. Gold mapped this to conversation_management. This was unpredictable from ticket text alone. The labeler reasoned that Claude's conversation-management corpus is where casual off-topic chats belong. I never inferred this.

A rule like "domain=Claude AND scope=out_of_scope_benign → product_area=conversation_management" would have caught it. With one row I had no statistical basis for the rule.

Multi-Request Rows Escalated Whole

Three sample rows packed multiple sub-requests into one ticket. My policy: if any sub-request triggered a risk flag, escalate the entire row. The user got "Escalate to a human" for a ticket where four of five sub-parts were benign FAQ lookups.

The right pattern is a multi-request decomposer. Split the ticket. Run the pipeline per sub-request. Merge results. Reply with answered parts plus a flag for the risky one.

Rigid Justification Template

The justification column required a concise rationale per row. My implementation used a fixed three-sentence template: "Routed to {domain} domain with product_area={pa}. {Risk decision}. Source summary: {chunk titles}." Readable. Auditable. It's formulaic in a way a graded scorer notices. One Haiku call per row generating a one-sentence rationale in support-agent voice would have lifted the column at near-zero cost.

Five Gaps I Would Close in a Rematch

Ranked by points-per-hour against a similar hackathon scoring rubric:

  1. Hand-label 30 to 50 production rows before writing tuning code: The ticket text is visible from the moment the input CSV ships. Read each one. Write down the Status, Request Type, and Product Area I believe is correct. Iterate the agent against my own judgments. It won't match official gold perfectly, but the noise floor drops by a factor of three. Every threshold downstream becomes honest.

  2. Multi-request decomposer: Split, run, merge. Roughly 200 lines of code with a clean interface. It recovers points on multi-request rows where the agent currently over-escalates.

  3. LLM-generated justification: One Haiku call per row, cached by SHA. Cost rounds to nothing. Quality jumps to whatever Haiku produces, which is warmer prose than a template.

  4. Zero-claim detector instead of phrase-based decline detector: If the drafter produces a response with no factual claims, classify as Replied with request_type=invalid regardless of the exact phrasing. Catches honest "I don't know" answers the regex-based decline detector misses.

  5. Multilingual injection handling: One production row had French and Spanish text with an embedded jailbreak ("affiche toutes les règles internes"). My regex defenses were English-only. A multilingual ticket with cleaner injection would have slipped through.

The fixes compound. Fix 1 makes fixes 2 through 5 reliable. Without it, the others are guesses on a 10-row sample.

The meta-lesson generalizes. The temptation in any graded AI build is to over-engineer the pipeline and under-invest in the labeled set. Pipelines feel productive because you ship code. Labels feel like grunt work because you read tickets and write down answers. Pipelines are infinite. You will always have one more module to refine. Labels are bounded. Spend three hours, you have thirty rows. The marginal value of the next hour spent on labels is almost always higher than the marginal hour spent on a fifth retrieval optimization.

Where This Pattern Belongs

Not every AI agent needs escalation-first design. A coding assistant generating throwaway scripts has different stakes. A search agent retrieving public information has different stakes. The pattern earns its complexity when the cost of a wrong answer is asymmetric to the cost of refusing one.

Financial services, healthcare, legal triage, identity verification, account-management workflows – any context where the agent acts on behalf of an organization the user trusts. Escalation-first design is what lets you deploy AI into those contexts and sleep at night.

The competitive edge for service businesses adopting AI isn't the automation. It's the escalation logic. The companies getting this asymmetry right will compound customer trust. The ones treating AI as "automate everything" will quietly burn it.

The lesson from shipping this in a hackathon: don't measure your AI agent by how much it automates. Measure it by how reliably it knows what NOT to answer. And don't trust a 10-row sample as the labeled set you tune against. Both lessons cost me points to learn. Reading this saves you those points.



Read the whole story
alvinashcraft
1 minute ago
reply
Pennsylvania, USA
Share this story
Delete

From Flutter to Backend: How to Build and Ship Production REST APIs with Dart and Shelf

1 Share

As a Flutter engineer, you already know Dart. You understand async/await, you work with models and repositories, you think in clean architecture, and you have shipped real applications.

The gap between where you are and being able to build and deploy a production backend is smaller than you think.

The missing piece is not a new language. It's not a new paradigm. It's understanding how Dart behaves when there's no widget tree, no BuildContext, no Flutter framework – just a running process handling HTTP requests, talking to a database, and sending responses back to clients.

That's exactly what this article covers.

We're going to build a full User and Profile Management REST API from scratch using Dart and Shelf, connect it to a PostgreSQL database running in Docker, secure it with JWT authentication, and deploy it to Fly.io.

By the end, you'll have a working production-grade backend written entirely in Dart, the same language you already know.

This article is part of a series (of standalone articles) where we'll build the same project using three different frameworks. We'll use Shelf here, Serverpod in the next article, and Dart Frog in the one after that. This will let you directly compare how each framework approaches the same problem.

Table of Contents

Prerequisites

Before starting, you should have:

  • Comfortable familiarity with Dart and Flutter development

  • Understanding of REST API concepts, endpoints, HTTP methods, status codes

  • Docker Desktop installed and running

  • A Fly.io account (free tier is sufficient, fly.io)

  • The Fly CLI installed (brew install flyctl on macOS, or the official installer on Windows/Linux)

  • A PostgreSQL client for inspecting the database, like TablePlus or DBeaver – both work well

How Dart Works on the Server

When you run a Flutter app, the Flutter framework is doing an enormous amount of work, managing the widget tree, handling the render pipeline, coordinating state, and responding to platform events. Your Dart code sits on top of all of that.

On the server, none of that exists. There's no widget tree. There's no framework managing a UI lifecycle. There's just a Dart process running, listening on a port, receiving HTTP requests, doing work, and sending responses.

Dart's standard library, dart:io, has everything needed to do this at the lowest level:

import 'dart:io';

void main() async {
  final server = await HttpServer.bind('0.0.0.0', 8080);
  print('Server running on port 8080');

  await for (final request in server) {
    request.response
      ..statusCode = 200
      ..write('Hello from Dart')
      ..close();
  }
}

This is a working HTTP server in raw Dart. No packages, no framework. Every request comes in through the HttpServer stream, and you write directly to the response.

This works, but it scales poorly. As soon as you need routing, middleware, authentication, and structured error handling, raw dart:io becomes difficult to manage. That is the problem Shelf solves.

What is Shelf?

Shelf is a composable web server middleware library for Dart, maintained by the Dart team. It doesn't try to be a full framework – instead, it gives you the primitives to build one, or to assemble exactly what you need.

The Shelf mental model is built on four concepts:

  • Handler: a function that takes a Request and returns a Response. Everything in Shelf is ultimately a handler.

  • Middleware: a function that wraps a handler, adding behaviour before or after it runs. Logging, authentication, and error handling are all middleware.

  • Pipeline: a chain of middleware with a handler at the end. Requests flow through the middleware chain before reaching the handler.

  • Router: maps URL patterns and HTTP methods to specific handlers.

If you've used Flutter's Navigator or provider middleware concepts, the composition model will feel familiar. Small, single-responsibility pieces assembled into a working whole.

Project Setup

Creating the Project

Dart includes a server-side project template that gives us a clean starting point:

dart create -t server-shelf user_profile_api
cd user_profile_api

Add the dependencies we need to pubspec.yaml:

name: user_profile_api
description: User and Profile Management REST API built with Dart and Shelf
version: 1.0.0

environment:
  sdk: '>=3.0.0 <4.0.0'

dependencies:
  shelf: ^1.4.1
  shelf_router: ^1.1.4
  postgres: ^3.3.0
  dart_jsonwebtoken: ^2.12.0
  bcrypt: ^1.1.3
  dotenv: ^4.1.0
  crypto: ^3.0.3

dev_dependencies:
  lints: ^3.0.0
  test: ^1.24.0

Run:

dart pub get

Project Structure

Now we'll build a backend project structure that Flutter engineers will find intuitive, that's familiar enough to navigate immediately, and that's correct enough for backend conventions:

user_profile_api/
  bin/
    server.dart              ← entry point
  lib/
    config/
      database.dart          ← connection manager
      env.dart               ← environment config
    handlers/
      auth_handler.dart      ← auth endpoints
      user_handler.dart      ← user endpoints
      profile_handler.dart   ← profile endpoints
    middleware/
      auth_middleware.dart   ← JWT validation
      error_middleware.dart  ← global error handling
      logger_middleware.dart ← request logging
    models/
      user.dart
      profile.dart
    repositories/
      user_repository.dart
      profile_repository.dart
    services/
      auth_service.dart      ← JWT + password logic
    router.dart              ← route definitions
  migrations/
    001_create_users.sql
    002_create_profiles.sql
  docker-compose.yml
  Dockerfile
  .env
  .env.example

This separation of concerns maps directly to what you'll already know if you're a Flutter engineer: models, repositories, and services are the same concepts. Handlers replace ViewModels or Controllers. Middleware replaces interceptors.

Database Setup with Docker

Create docker-compose.yml in the project root:

version: '3.8'

services:
  postgres:
    image: postgres:16-alpine
    container_name: user_profile_db
    environment:
      POSTGRES_DB: user_profile_api
      POSTGRES_USER: dart_user
      POSTGRES_PASSWORD: dart_password
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:

Start the database:

docker compose up -d

Verify that it's running:

docker compose ps
# user_profile_db   running   0.0.0.0:5432->5432/tcp

Environment Configuration

Create .env in the project root:

DB_HOST=localhost
DB_PORT=5432
DB_NAME=user_profile_api
DB_USER=dart_user
DB_PASSWORD=dart_password
JWT_SECRET=your_super_secret_key_change_this_in_production
JWT_EXPIRY_HOURS=24
PORT=8080

Create .env.example with the same keys but no values. This is what you commit to Git:

DB_HOST=
DB_PORT=
DB_NAME=
DB_USER=
DB_PASSWORD=
JWT_SECRET=
JWT_EXPIRY_HOURS=
PORT=

Add .env to .gitignore:

.env

Create lib/config/env.dart:

import 'package:dotenv/dotenv.dart';

class Env {
  static late final DotEnv _env;

  static void load() {
    _env = DotEnv(includePlatformEnvironment: true)..load();
  }

  static String get dbHost => _env['DB_HOST'] ?? 'localhost';
  static int get dbPort => int.parse(_env['DB_PORT'] ?? '5432');
  static String get dbName => _env['DB_NAME'] ?? 'user_profile_api';
  static String get dbUser => _env['DB_USER'] ?? 'dart_user';
  static String get dbPassword => _env['DB_PASSWORD'] ?? '';
  static String get jwtSecret => _env['JWT_SECRET'] ?? '';
  static int get jwtExpiryHours => int.parse(_env['JWT_EXPIRY_HOURS'] ?? '24');
  static int get port => int.parse(_env['PORT'] ?? '8080');
}

includePlatformEnvironment: true means the Env class reads from both the .env file and real system environment variables, so the same code works locally with a .env file and in production with injected environment variables.

Shelf Core Concepts

Before building the API, it's worth understanding each Shelf concept properly – not just what it does, but why it's designed the way it is.

Handlers

A handler is the most fundamental unit in Shelf. It's simply a function:

import 'package:shelf/shelf.dart';

Response helloHandler(Request request) {
  return Response.ok('Hello, Dart backend!');
}

Request in, Response out. That's the entire contract. Every endpoint you write is a handler. Every piece of middleware is a function that takes a handler and returns a handler.

Handlers can be async:

Future<Response> getUserHandler(Request request) async {
  final users = await userRepository.findAll();
  return Response.ok(jsonEncode(users));
}

Request and Response

Request gives you everything about the incoming HTTP call:

Future<Response> handler(Request request) async {
  // URL and path
  print(request.url);           // the full URL
  print(request.url.path);      // just the path

  // Path parameters (when using shelf_router)
  final id = request.params['id'];

  // Query parameters
  final page = request.url.queryParameters['page'];

  // Headers
  final auth = request.headers['authorization'];

  // Body
  final body = await request.readAsString();
  final json = jsonDecode(body) as Map<String, dynamic>;

  return Response.ok('handled');
}

Response has named constructors for common status codes:

Response.ok(body)           // 200
Response.notFound(body)     // 404
Response(201, body: body)   // any status code
Response(400, body: body)   // bad request
Response(401, body: body)   // unauthorized
Response(500, body: body)   // server error

Always set the Content-Type header when returning JSON:

Response.ok(
  jsonEncode({'message': 'success'}),
  headers: {'Content-Type': 'application/json'},
)

Router

shelf_router maps URL patterns and HTTP methods to handlers:

import 'package:shelf_router/shelf_router.dart';

final router = Router();

router.get('/users', getAllUsersHandler);
router.get('/users/<id>', getUserHandler);
router.post('/users', createUserHandler);
router.put('/users/<id>', updateUserHandler);
router.delete('/users/<id>', deleteUserHandler);

The syntax defines a path parameter. Access it inside the handler via request.params['id'].

Pipeline and Middleware

A Pipeline chains middleware together with a handler at the end:

import 'package:shelf/shelf.dart';

final handler = Pipeline()
    .addMiddleware(loggerMiddleware())
    .addMiddleware(errorMiddleware())
    .addMiddleware(authMiddleware())
    .addHandler(router.call);

Middleware is a function with this signature:

Middleware myMiddleware() {
  return (Handler innerHandler) {
    return (Request request) async {
      // Before the handler runs
      print('Request received: \({request.method} \){request.url}');

      final response = await innerHandler(request);

      // After the handler runs
      print('Response sent: ${response.statusCode}');

      return response;
    };
  };
}

The outer function returns a Middleware. That Middleware is a function that takes the next Handler in the chain and returns a new Handler. This nesting is what allows middleware to run code both before and after the inner handler.

Connecting to PostgreSQL

The Database Connection Manager

Create lib/config/database.dart:

import 'package:postgres/postgres.dart';
import 'env.dart';

class Database {
  static Connection? _connection;

  static Future<Connection> get connection async {
    if (_connection != null) return _connection!;
    _connection = await _connect();
    return _connection!;
  }

  static Future<Connection> _connect() async {
    final conn = await Connection.open(
      Endpoint(
        host: Env.dbHost,
        port: Env.dbPort,
        database: Env.dbName,
        username: Env.dbUser,
        password: Env.dbPassword,
      ),
      settings: const ConnectionSettings(
        sslMode: SslMode.disable,
      ),
    );

    print('✅ Database connected: \({Env.dbHost}:\){Env.dbPort}/${Env.dbName}');
    return conn;
  }

  static Future<void> close() async {
    await _connection?.close();
    _connection = null;
  }
}

This is a singleton connection manager – the same pattern Flutter engineers use for shared services. The connection is created once on first access and reused for every subsequent database call.

Running Migrations

Create the migrations folder and SQL files:

migrations/001_create_users.sql:

CREATE TABLE IF NOT EXISTS users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email VARCHAR(255) UNIQUE NOT NULL,
  password_hash VARCHAR(255) NOT NULL,
  first_name VARCHAR(100) NOT NULL,
  last_name VARCHAR(100) NOT NULL,
  is_active BOOLEAN DEFAULT TRUE,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
  updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS idx_users_email ON users(email);

migrations/002_create_profiles.sql:

CREATE TABLE IF NOT EXISTS profiles (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  bio TEXT,
  avatar_url VARCHAR(500),
  phone VARCHAR(20),
  location VARCHAR(255),
  website VARCHAR(500),
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
  updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
  UNIQUE(user_id)
);

CREATE INDEX IF NOT EXISTS idx_profiles_user_id ON profiles(user_id);

Create a migration runner in lib/config/database.dart:

static Future<void> runMigrations() async {
  final conn = await connection;
  final migrationsDir = Directory('migrations');

  final files = migrationsDir
      .listSync()
      .whereType<File>()
      .where((f) => f.path.endsWith('.sql'))
      .toList()
    ..sort((a, b) => a.path.compareTo(b.path));

  for (final file in files) {
    final sql = await file.readAsString();
    await conn.execute(sql);
    print('✅ Migration applied: ${file.path}');
  }
}

Building the API

With the database connected and migrations in place, we can now build the actual API layer.

This section covers the models, repositories, and handlers for both users and profiles. Models define the shape of the data, repositories handle all database interactions, and handlers translate HTTP requests into repository calls and send responses back to the client. We'll build the user layer first, then the profile layer on top of it.

The User Model

The User model represents a single user record in the database. It maps directly to the users table created in the migration and handles two-way conversion between database rows and Dart objects.

Create lib/models/user.dart:

class User {
  final String id;
  final String email;
  final String passwordHash;
  final String firstName;
  final String lastName;
  final bool isActive;
  final DateTime createdAt;
  final DateTime updatedAt;

  const User({
    required this.id,
    required this.email,
    required this.passwordHash,
    required this.firstName,
    required this.lastName,
    required this.isActive,
    required this.createdAt,
    required this.updatedAt,
  });

  factory User.fromRow(Map<String, dynamic> row) => User(
        id: row['id'] as String,
        email: row['email'] as String,
        passwordHash: row['password_hash'] as String,
        firstName: row['first_name'] as String,
        lastName: row['last_name'] as String,
        isActive: row['is_active'] as bool,
        createdAt: row['created_at'] as DateTime,
        updatedAt: row['updated_at'] as DateTime,
      );

  // Never include passwordHash in JSON responses
  Map<String, dynamic> toJson() => {
        'id': id,
        'email': email,
        'firstName': firstName,
        'lastName': lastName,
        'isActive': isActive,
        'createdAt': createdAt.toIso8601String(),
        'updatedAt': updatedAt.toIso8601String(),
      };
}

fromRow maps a PostgreSQL result row to a User. toJson deliberately excludes passwordHash – you should never return password data in API responses.

The User Repository

The UserRepository is the single point of contact between the application and the users table. Every database operation for users goes through here, keeping the SQL contained and the handlers clean.

Create lib/repositories/user_repository.dart:

import 'dart:async';
import 'package:postgres/postgres.dart';
import '../config/database.dart';
import '../models/user.dart';

class UserRepository {
  Future<Connection> get _conn => Database.connection;

  Future<List<User>> findAll() async {
    final conn = await _conn;
    final results = await conn.execute(
      'SELECT * FROM users WHERE is_active = TRUE ORDER BY created_at DESC',
    );

    return results.map((row) => User.fromRow(row.toColumnMap())).toList();
  }

  Future<User?> findById(String id) async {
    final conn = await _conn;
    final results = await conn.execute(
      Sql.named('SELECT * FROM users WHERE id = @id AND is_active = TRUE'),
      parameters: {'id': id},
    );

    if (results.isEmpty) return null;
    return User.fromRow(results.first.toColumnMap());
  }

  Future<User?> findByEmail(String email) async {
    final conn = await _conn;
    final results = await conn.execute(
      Sql.named('SELECT * FROM users WHERE email = @email'),
      parameters: {'email': email},
    );

    if (results.isEmpty) return null;
    return User.fromRow(results.first.toColumnMap());
  }

  Future<User> create({
    required String email,
    required String passwordHash,
    required String firstName,
    required String lastName,
  }) async {
    final conn = await _conn;
    final results = await conn.execute(
      Sql.named('''
        INSERT INTO users (email, password_hash, first_name, last_name)
        VALUES (@email, @passwordHash, @firstName, @lastName)
        RETURNING *
      '''),
      parameters: {
        'email': email,
        'passwordHash': passwordHash,
        'firstName': firstName,
        'lastName': lastName,
      },
    );

    return User.fromRow(results.first.toColumnMap());
  }

  Future<User?> update({
    required String id,
    String? firstName,
    String? lastName,
  }) async {
    final conn = await _conn;
    final results = await conn.execute(
      Sql.named('''
        UPDATE users
        SET
          first_name = COALESCE(@firstName, first_name),
          last_name  = COALESCE(@lastName, last_name),
          updated_at = NOW()
        WHERE id = @id AND is_active = TRUE
        RETURNING *
      '''),
      parameters: {
        'id': id,
        'firstName': firstName,
        'lastName': lastName,
      },
    );

    if (results.isEmpty) return null;
    return User.fromRow(results.first.toColumnMap());
  }

  Future<bool> delete(String id) async {
    final conn = await _conn;
    final results = await conn.execute(
      Sql.named('''
        UPDATE users SET is_active = FALSE, updated_at = NOW()
        WHERE id = @id AND is_active = TRUE
        RETURNING id
      '''),
      parameters: {'id': id},
    );

    return results.isNotEmpty;
  }
}

A few things worth noting here. Sql.named uses named parameters (@paramName) instead of positional parameters. This prevents SQL injection and makes queries readable.

Also, the delete operation is a soft delete. It sets is_active = FALSE rather than removing the row. This is the standard production approach: data is never truly deleted, it's deactivated.

COALESCE(@firstName, first_name) on the update means: use the new value if provided, otherwise keep the existing value. This handles partial updates cleanly without requiring all fields every time.

User Handlers

The UserHandler class exposes the repository operations as HTTP endpoints. It owns a Router instance internally and maps each route to a private method, keeping the routing logic and the handler logic together in one place.

Create lib/handlers/user_handler.dart:

import 'dart:convert';
import 'package:shelf/shelf.dart';
import 'package:shelf_router/shelf_router.dart';
import '../repositories/user_repository.dart';

class UserHandler {
  final UserRepository _repository;

  UserHandler(this._repository);

  Router get router {
    final router = Router();
    router.get('/', _getAll);
    router.get('/<id>', _getOne);
    router.put('/<id>', _update);
    router.delete('/<id>', _delete);
    return router;
  }

  Future<Response> _getAll(Request request) async {
    final users = await _repository.findAll();
    return Response.ok(
      jsonEncode(users.map((u) => u.toJson()).toList()),
      headers: {'Content-Type': 'application/json'},
    );
  }

  Future<Response> _getOne(Request request, String id) async {
    final user = await _repository.findById(id);

    if (user == null) {
      return Response.notFound(
        jsonEncode({'error': 'User not found'}),
        headers: {'Content-Type': 'application/json'},
      );
    }

    return Response.ok(
      jsonEncode(user.toJson()),
      headers: {'Content-Type': 'application/json'},
    );
  }

  Future<Response> _update(Request request, String id) async {
    final body = jsonDecode(await request.readAsString()) as Map<String, dynamic>;

    final user = await _repository.update(
      id: id,
      firstName: body['firstName'] as String?,
      lastName: body['lastName'] as String?,
    );

    if (user == null) {
      return Response.notFound(
        jsonEncode({'error': 'User not found'}),
        headers: {'Content-Type': 'application/json'},
      );
    }

    return Response.ok(
      jsonEncode(user.toJson()),
      headers: {'Content-Type': 'application/json'},
    );
  }

  Future<Response> _delete(Request request, String id) async {
    final deleted = await _repository.delete(id);

    if (!deleted) {
      return Response.notFound(
        jsonEncode({'error': 'User not found'}),
        headers: {'Content-Type': 'application/json'},
      );
    }

    return Response(
      204,
      headers: {'Content-Type': 'application/json'},
    );
  }
}

The Profile Model

The Profile model represents a user's extended information, stored separately from the core user record. The one-to-one relationship is enforced by the unique index on user_id in the profiles table. All fields except userId are nullable since a profile can be created with partial information and filled in over time.

Create lib/models/profile.dart:

class Profile {
  final String id;
  final String userId;
  final String? bio;
  final String? avatarUrl;
  final String? phone;
  final String? location;
  final String? website;
  final DateTime createdAt;
  final DateTime updatedAt;

  const Profile({
    required this.id,
    required this.userId,
    this.bio,
    this.avatarUrl,
    this.phone,
    this.location,
    this.website,
    required this.createdAt,
    required this.updatedAt,
  });

  factory Profile.fromRow(Map<String, dynamic> row) => Profile(
        id: row['id'] as String,
        userId: row['user_id'] as String,
        bio: row['bio'] as String?,
        avatarUrl: row['avatar_url'] as String?,
        phone: row['phone'] as String?,
        location: row['location'] as String?,
        website: row['website'] as String?,
        createdAt: row['created_at'] as DateTime,
        updatedAt: row['updated_at'] as DateTime,
      );

  Map<String, dynamic> toJson() => {
        'id': id,
        'userId': userId,
        'bio': bio,
        'avatarUrl': avatarUrl,
        'phone': phone,
        'location': location,
        'website': website,
        'createdAt': createdAt.toIso8601String(),
        'updatedAt': updatedAt.toIso8601String(),
      };
}

The Profile Repository

The ProfileRepository handles all database operations for the profiles table. Unlike the user repository which looks up by id, most profile operations use userId as the lookup key since that is how the client references a profile — by whose it belongs to, not by its own internal ID.

Create lib/repositories/profile_repository.dart:

import 'package:postgres/postgres.dart';
import '../config/database.dart';
import '../models/profile.dart';

class ProfileRepository {
  Future<Connection> get _conn => Database.connection;

  Future<Profile?> findByUserId(String userId) async {
    final conn = await _conn;
    final results = await conn.execute(
      Sql.named('SELECT * FROM profiles WHERE user_id = @userId'),
      parameters: {'userId': userId},
    );

    if (results.isEmpty) return null;
    return Profile.fromRow(results.first.toColumnMap());
  }

  Future<Profile> create({
    required String userId,
    String? bio,
    String? avatarUrl,
    String? phone,
    String? location,
    String? website,
  }) async {
    final conn = await _conn;
    final results = await conn.execute(
      Sql.named('''
        INSERT INTO profiles (user_id, bio, avatar_url, phone, location, website)
        VALUES (@userId, @bio, @avatarUrl, @phone, @location, @website)
        RETURNING *
      '''),
      parameters: {
        'userId': userId,
        'bio': bio,
        'avatarUrl': avatarUrl,
        'phone': phone,
        'location': location,
        'website': website,
      },
    );

    return Profile.fromRow(results.first.toColumnMap());
  }

  Future<Profile?> update({
    required String userId,
    String? bio,
    String? avatarUrl,
    String? phone,
    String? location,
    String? website,
  }) async {
    final conn = await _conn;
    final results = await conn.execute(
      Sql.named('''
        UPDATE profiles
        SET
          bio        = COALESCE(@bio, bio),
          avatar_url = COALESCE(@avatarUrl, avatar_url),
          phone      = COALESCE(@phone, phone),
          location   = COALESCE(@location, location),
          website    = COALESCE(@website, website),
          updated_at = NOW()
        WHERE user_id = @userId
        RETURNING *
      '''),
      parameters: {
        'userId': userId,
        'bio': bio,
        'avatarUrl': avatarUrl,
        'phone': phone,
        'location': location,
        'website': website,
      },
    );

    if (results.isEmpty) return null;
    return Profile.fromRow(results.first.toColumnMap());
  }
}

Profile Handlers

The ProfileHandler manages the profile endpoints nested under a user's ID. Before every operation, it verifies the parent user exists — a profile can't be created, fetched, or updated for a user that doesn't exist. It also prevents duplicate profiles by checking for an existing record before allowing a create.

Create lib/handlers/profile_handler.dart:

import 'dart:convert';
import 'package:shelf/shelf.dart';
import 'package:shelf_router/shelf_router.dart';
import '../repositories/profile_repository.dart';
import '../repositories/user_repository.dart';

class ProfileHandler {
  final ProfileRepository _profileRepository;
  final UserRepository _userRepository;

  ProfileHandler(this._profileRepository, this._userRepository);

  Router get router {
    final router = Router();
    router.get('/<userId>/profile', _getProfile);
    router.post('/<userId>/profile', _createProfile);
    router.put('/<userId>/profile', _updateProfile);
    return router;
  }

  Future<Response> _getProfile(Request request, String userId) async {
    final user = await _userRepository.findById(userId);
    if (user == null) {
      return Response.notFound(
        jsonEncode({'error': 'User not found'}),
        headers: {'Content-Type': 'application/json'},
      );
    }

    final profile = await _profileRepository.findByUserId(userId);
    if (profile == null) {
      return Response.notFound(
        jsonEncode({'error': 'Profile not found'}),
        headers: {'Content-Type': 'application/json'},
      );
    }

    return Response.ok(
      jsonEncode(profile.toJson()),
      headers: {'Content-Type': 'application/json'},
    );
  }

  Future<Response> _createProfile(Request request, String userId) async {
    final user = await _userRepository.findById(userId);
    if (user == null) {
      return Response.notFound(
        jsonEncode({'error': 'User not found'}),
        headers: {'Content-Type': 'application/json'},
      );
    }

    final existing = await _profileRepository.findByUserId(userId);
    if (existing != null) {
      return Response(
        409,
        body: jsonEncode({'error': 'Profile already exists for this user'}),
        headers: {'Content-Type': 'application/json'},
      );
    }

    final body = jsonDecode(await request.readAsString()) as Map<String, dynamic>;

    final profile = await _profileRepository.create(
      userId: userId,
      bio: body['bio'] as String?,
      avatarUrl: body['avatarUrl'] as String?,
      phone: body['phone'] as String?,
      location: body['location'] as String?,
      website: body['website'] as String?,
    );

    return Response(
      201,
      body: jsonEncode(profile.toJson()),
      headers: {'Content-Type': 'application/json'},
    );
  }

  Future<Response> _updateProfile(Request request, String userId) async {
    final body = jsonDecode(await request.readAsString()) as Map<String, dynamic>;

    final profile = await _profileRepository.update(
      userId: userId,
      bio: body['bio'] as String?,
      avatarUrl: body['avatarUrl'] as String?,
      phone: body['phone'] as String?,
      location: body['location'] as String?,
      website: body['website'] as String?,
    );

    if (profile == null) {
      return Response.notFound(
        jsonEncode({'error': 'Profile not found'}),
        headers: {'Content-Type': 'application/json'},
      );
    }

    return Response.ok(
      jsonEncode(profile.toJson()),
      headers: {'Content-Type': 'application/json'},
    );
  }
}

Authentication

With the core user and profile CRUD in place, the next step is securing the API.

Authentication in this project works in two parts: an AuthService handles the cryptographic operations — password hashing and JWT generation and verification — and an AuthHandler exposes the register and login endpoints that clients call to get a token. Once a token is issued, the AuthMiddleware validates it on every protected request before it reaches a handler.

Password Hashing

Create lib/services/auth_service.dart:

import 'package:bcrypt/bcrypt.dart';
import 'package:dart_jsonwebtoken/dart_jsonwebtoken.dart';
import '../config/env.dart';
import '../models/user.dart';

class AuthService {
  String hashPassword(String password) {
    return BCrypt.hashpw(password, BCrypt.gensalt());
  }

  bool verifyPassword(String password, String hash) {
    return BCrypt.checkpw(password, hash);
  }

  String generateToken(User user) {
    final jwt = JWT(
      {
        'sub': user.id,
        'email': user.email,
        'iat': DateTime.now().millisecondsSinceEpoch ~/ 1000,
      },
    );

    return jwt.sign(
      SecretKey(Env.jwtSecret),
      expiresIn: Duration(hours: Env.jwtExpiryHours),
    );
  }

  JWT? verifyToken(String token) {
    try {
      return JWT.verify(token, SecretKey(Env.jwtSecret));
    } catch (_) {
      return null;
    }
  }
}

BCrypt.hashpw generates a salted hash. BCrypt.checkpw verifies a plain password against a stored hash. The salt is embedded in the hash itself – you don't store it separately.

verifyToken returns null on any failure, expired token, invalid signature, or malformed token rather than throwing. This keeps the auth middleware clean.

Auth Handlers

Create lib/handlers/auth_handler.dart:

import 'dart:convert';
import 'package:shelf/shelf.dart';
import 'package:shelf_router/shelf_router.dart';
import '../repositories/user_repository.dart';
import '../services/auth_service.dart';

class AuthHandler {
  final UserRepository _userRepository;
  final AuthService _authService;

  AuthHandler(this._userRepository, this._authService);

  Router get router {
    final router = Router();
    router.post('/register', _register);
    router.post('/login', _login);
    return router;
  }

  Future<Response> _register(Request request) async {
    final body = jsonDecode(await request.readAsString()) as Map<String, dynamic>;

    final email = body['email'] as String?;
    final password = body['password'] as String?;
    final firstName = body['firstName'] as String?;
    final lastName = body['lastName'] as String?;

    if (email == null || password == null || firstName == null || lastName == null) {
      return Response(
        400,
        body: jsonEncode({'error': 'email, password, firstName, and lastName are required'}),
        headers: {'Content-Type': 'application/json'},
      );
    }

    if (password.length < 8) {
      return Response(
        400,
        body: jsonEncode({'error': 'Password must be at least 8 characters'}),
        headers: {'Content-Type': 'application/json'},
      );
    }

    final existing = await _userRepository.findByEmail(email);
    if (existing != null) {
      return Response(
        409,
        body: jsonEncode({'error': 'An account with this email already exists'}),
        headers: {'Content-Type': 'application/json'},
      );
    }

    final passwordHash = _authService.hashPassword(password);

    final user = await _userRepository.create(
      email: email,
      passwordHash: passwordHash,
      firstName: firstName,
      lastName: lastName,
    );

    final token = _authService.generateToken(user);

    return Response(
      201,
      body: jsonEncode({
        'user': user.toJson(),
        'token': token,
      }),
      headers: {'Content-Type': 'application/json'},
    );
  }

  Future<Response> _login(Request request) async {
    final body = jsonDecode(await request.readAsString()) as Map<String, dynamic>;

    final email = body['email'] as String?;
    final password = body['password'] as String?;

    if (email == null || password == null) {
      return Response(
        400,
        body: jsonEncode({'error': 'email and password are required'}),
        headers: {'Content-Type': 'application/json'},
      );
    }

    final user = await _userRepository.findByEmail(email);

    // Deliberately vague error, never confirm whether an email exists
    if (user == null || !_authService.verifyPassword(password, user.passwordHash)) {
      return Response(
        401,
        body: jsonEncode({'error': 'Invalid email or password'}),
        headers: {'Content-Type': 'application/json'},
      );
    }

    final token = _authService.generateToken(user);

    return Response.ok(
      jsonEncode({
        'user': user.toJson(),
        'token': token,
      }),
      headers: {'Content-Type': 'application/json'},
    );
  }
}

The login error message is deliberately vague: "Invalid email or password" rather than "Email not found" or "Wrong password." Confirming which part is wrong helps attackers enumerate valid accounts.

Auth Middleware

Create lib/middleware/auth_middleware.dart:

import 'dart:convert';
import 'package:shelf/shelf.dart';
import '../services/auth_service.dart';

Middleware authMiddleware(AuthService authService) {
  return (Handler innerHandler) {
    return (Request request) async {
      final authHeader = request.headers['authorization'];

      if (authHeader == null || !authHeader.startsWith('Bearer ')) {
        return Response(
          401,
          body: jsonEncode({'error': 'Authorization header missing or malformed'}),
          headers: {'Content-Type': 'application/json'},
        );
      }

      final token = authHeader.substring(7); // Remove 'Bearer '
      final jwt = authService.verifyToken(token);

      if (jwt == null) {
        return Response(
          401,
          body: jsonEncode({'error': 'Invalid or expired token'}),
          headers: {'Content-Type': 'application/json'},
        );
      }

      // Attach the user ID to the request context for downstream handlers
      final updatedRequest = request.change(
        context: {
          ...request.context,
          'userId': jwt.payload['sub'] as String,
          'userEmail': jwt.payload['email'] as String,
        },
      );

      return innerHandler(updatedRequest);
    };
  };
}

request.change(context: {...}) is how Shelf passes data from middleware to handlers, the equivalent of attaching data to a request in Express or ASP.NET middleware. Any handler downstream can read request.context['userId'] to know which user is authenticated.

Error Handling

No matter how carefully you write your handlers, unexpected failures will happen in production — malformed request bodies, database timeouts, unhandled edge cases.

Rather than letting each handler manage its own error responses individually, we'll centralise error handling in a single middleware that wraps the entire pipeline. This guarantees a consistent error response shape across every endpoint and prevents internal error details from leaking to the client.

Create lib/middleware/error_middleware.dart:

import 'dart:convert';
import 'package:shelf/shelf.dart';

Middleware errorMiddleware() {
  return (Handler innerHandler) {
    return (Request request) async {
      try {
        return await innerHandler(request);
      } on FormatException catch (e) {
        return Response(
          400,
          body: jsonEncode({'error': 'Invalid request body: ${e.message}'}),
          headers: {'Content-Type': 'application/json'},
        );
      } catch (e, stackTrace) {
        // Log the full error and stack trace server-side
        print('Unhandled error: $e');
        print(stackTrace);

        // Never expose internal error details to the client
        return Response(
          500,
          body: jsonEncode({'error': 'An internal server error occurred'}),
          headers: {'Content-Type': 'application/json'},
        );
      }
    };
  };
}

Create lib/middleware/logger_middleware.dart:

import 'package:shelf/shelf.dart';

Middleware loggerMiddleware() {
  return (Handler innerHandler) {
    return (Request request) async {
      final start = DateTime.now();

      final response = await innerHandler(request);

      final duration = DateTime.now().difference(start).inMilliseconds;
      print(
        '[${DateTime.now().toIso8601String()}] '
        '\({request.method} \){request.url.path} '
        '→ \({response.statusCode} (\){duration}ms)',
      );

      return response;
    };
  };
}

Wiring Everything Together

With the handlers, repositories, and middleware all in place, the final step is connecting them into a single running server. The router maps URL prefixes to their handler, the pipeline stacks the middleware in the correct order, and the entry point boots everything up in sequence — loading environment variables, running migrations, and starting the server.

Create lib/router.dart:

import 'package:shelf_router/shelf_router.dart';
import 'handlers/auth_handler.dart';
import 'handlers/user_handler.dart';
import 'handlers/profile_handler.dart';
import 'middleware/auth_middleware.dart';
import 'repositories/user_repository.dart';
import 'repositories/profile_repository.dart';
import 'services/auth_service.dart';

Router createRouter() {
  final userRepository = UserRepository();
  final profileRepository = ProfileRepository();
  final authService = AuthService();

  final authHandler = AuthHandler(userRepository, authService);
  final userHandler = UserHandler(userRepository);
  final profileHandler = ProfileHandler(profileRepository, userRepository);

  final router = Router();

  // Public routes, no auth required
  router.mount('/auth', authHandler.router.call);

  // Protected routes, auth middleware applied
  router.mount(
    '/users',
    Pipeline()
        .addMiddleware(authMiddleware(authService))
        .addHandler(userHandler.router.call),
  );

  router.mount(
    '/users',
    Pipeline()
        .addMiddleware(authMiddleware(authService))
        .addHandler(profileHandler.router.call),
  );

  return router;
}

Create the entry point bin/server.dart:

import 'dart:io';
import 'package:shelf/shelf.dart';
import 'package:shelf/shelf_io.dart' as shelf_io;
import '../lib/config/database.dart';
import '../lib/config/env.dart';
import '../lib/middleware/error_middleware.dart';
import '../lib/middleware/logger_middleware.dart';
import '../lib/router.dart';

void main() async {
  // Load environment variables
  Env.load();

  // Run database migrations
  await Database.runMigrations();

  // Build the handler pipeline
  final router = createRouter();

  final handler = Pipeline()
      .addMiddleware(errorMiddleware())
      .addMiddleware(loggerMiddleware())
      .addHandler(router.call);

  // Start the server
  final server = await shelf_io.serve(
    handler,
    InternetAddress.anyIPv4,
    Env.port,
  );

  print('🚀 Server running on port ${server.port}');
}

Run the server:

dart run bin/server.dart
# ✅ Database connected: localhost:5432/user_profile_api
# ✅ Migration applied: migrations/001_create_users.sql
# ✅ Migration applied: migrations/002_create_profiles.sql
# 🚀 Server running on port 8080

Deployment

The server is running locally and all endpoints are working. Now it's time to ship it.

We'll cover two deployment paths: first packaging the app and database together with Docker Compose for local production testing, then deploying to Fly.io where your API will be accessible over the internet with a managed PostgreSQL database and automatic TLS.

Dockerfile

Create Dockerfile in the project root:

FROM dart:stable AS build

WORKDIR /app
COPY pubspec.* ./
RUN dart pub get

COPY . .
RUN dart compile exe bin/server.dart -o bin/server

FROM debian:stable-slim

RUN apt-get update && apt-get install -y ca-certificates && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY --from=build /app/bin/server bin/server
COPY --from=build /app/migrations migrations/

EXPOSE 8080

CMD ["bin/server"]

This is a multi-stage build. The first stage uses the full Dart SDK image to compile the server to a native binary. The second stage copies only the compiled binary and migrations into a minimal Debian image – no Dart SDK, no source code, no build tools. The final image is lean and production-ready.

Docker Compose for Local Production Testing

Update docker-compose.yml to include the app alongside the database:

version: '3.8'

services:
  postgres:
    image: postgres:16-alpine
    container_name: user_profile_db
    environment:
      POSTGRES_DB: user_profile_api
      POSTGRES_USER: dart_user
      POSTGRES_PASSWORD: dart_password
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U dart_user -d user_profile_api"]
      interval: 5s
      timeout: 5s
      retries: 5

  api:
    build: .
    container_name: user_profile_api
    ports:
      - "8080:8080"
    environment:
      DB_HOST: postgres
      DB_PORT: 5432
      DB_NAME: user_profile_api
      DB_USER: dart_user
      DB_PASSWORD: dart_password
      JWT_SECRET: local_test_secret_replace_in_production
      JWT_EXPIRY_HOURS: 24
      PORT: 8080
    depends_on:
      postgres:
        condition: service_healthy

volumes:
  postgres_data:

The healthcheck on the Postgres service ensures that the API container only starts once the database is ready to accept connections (a common production problem when services start simultaneously).

Build and run everything:

docker compose up --build

Deploying to Fly.io

Fly.io is one of the cleanest deployment targets for containerized backend services. It handles global distribution, automatic TLS, and managed PostgreSQL databases.

Step 1 – Install and authenticate:

# macOS
brew install flyctl

# Authenticate
fly auth login

Step 2 – Launch the app:

fly launch

Fly detects the Dockerfile automatically and asks a few questions: app name, region, and whether to create a PostgreSQL database. Answer yes to the PostgreSQL prompt, and Fly will provision a managed database and inject the connection string automatically.

Step 3 – Set environment variables:

fly secrets set JWT_SECRET="your_production_secret_here"
fly secrets set JWT_EXPIRY_HOURS="24"

Database connection variables are set automatically by Fly when it provisions the PostgreSQL cluster.

Step 4 – Deploy:

fly deploy

Fly builds the Docker image, pushes it to their registry, and deploys it to your chosen region. Once complete:

fly status
# Your app is running at https://your-app-name.fly.dev

Step 5 – Verify the deployment:

curl https://your-app-name.fly.dev/auth/register \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"email":"test@example.com","password":"password123","firstName":"Seyi","lastName":"Dev"}'

Testing the API

With the server running locally on port 8080, here's the full flow to verify that everything works end to end.

Register a user:

curl http://localhost:8080/auth/register \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "email": "seyi@example.com",
    "password": "securepassword",
    "firstName": "Seyi",
    "lastName": "Dev"
  }'

Response:

{
  "user": {
    "id": "uuid-here",
    "email": "seyi@example.com",
    "firstName": "Seyi",
    "lastName": "Dev",
    "isActive": true,
    "createdAt": "2025-01-01T00:00:00.000Z",
    "updatedAt": "2025-01-01T00:00:00.000Z"
  },
  "token": "eyJhbGci..."
}

Login:

curl http://localhost:8080/auth/login \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"email": "seyi@example.com", "password": "securepassword"}'

Get all users (authenticated):

curl http://localhost:8080/users \
  -H "Authorization: Bearer eyJhbGci..."

Create a profile:

curl http://localhost:8080/users/{userId}/profile \
  -X POST \
  -H "Authorization: Bearer eyJhbGci..." \
  -H "Content-Type: application/json" \
  -d '{
    "bio": "Flutter engineer turned backend developer",
    "location": "Lagos, Nigeria",
    "website": "https://example.com"
  }'

Update a user:

curl http://localhost:8080/users/{userId} \
  -X PUT \
  -H "Authorization: Bearer eyJhbGci..." \
  -H "Content-Type: application/json" \
  -d '{"firstName": "Oluwaseyi"}'

Delete a user:

curl http://localhost:8080/users/{userId} \
  -X DELETE \
  -H "Authorization: Bearer eyJhbGci..."

Conclusion

You just built and deployed a production-grade REST API in Dart – the same language you already know from Flutter. No new language, no new paradigm. Just Dart running in a different context.

The Shelf mental model (Handlers, Middleware, Pipelines, Routers) is deliberately minimal. It doesn't make decisions for you. It gives you composable primitives and lets you assemble them into exactly the architecture your project needs. That philosophy will feel familiar to Flutter engineers who build their own clean architecture rather than relying on a prescriptive framework.

What you built here – models, repositories, services, handlers, and middleware – is the same separation of concerns you apply in Flutter, applied to the backend. The concepts transfer. The Dart skills transfer. The architecture discipline transfers.

With this, you'll understand that Dart is a powerful language that cuts across both frontend and backend ecosystems. Aside from Shelf, we have Dartfrog and Serverpod which still functions well on the backend side of things. More on those in upcoming articles.

So yeah, try this out and thank me later!



Read the whole story
alvinashcraft
1 minute ago
reply
Pennsylvania, USA
Share this story
Delete

Wolverine and Marten Just Got Better for F# Folks

1 Share

Polecat 4.0 shares many more internals with Marten and I’m hopeful that it’s also much better for F# developers as well.

Wolverine and more so Marten do already have F# users, but we just made the deployment story a lot better in both tools for F# developers. One of the key components of Wolverine especially has been our usage of runtime code generation and compilation using Roslyn, which is how Wolverine is able to adapt to your application code instead of forcing you to write adapters to our specific interfaces or abstractions like basically every other application framework in .NET.

That’s the special sauce in Wolverine that allows your application code to be far simpler than it would be with other application frameworks, but it comes at the cost of Roslyn being a beast for memory consumption (sometimes, but not always), the size of the binaries shipped, and cold start times (again, sometimes). We’ve long had the ability in both Marten and Wolverine to pre-generate the Wolverine or Marten adapter code ahead of time and let it be compiled into the application itself to side step the Roslyn runtime issues. But in a story I’m sure is aggravatingly familiar for F# folks, that was only useful for C# projects as we could only generate C# code (and Roslyn only compiles C# code at runtime as far as I know, but feel free to correct me on that one).

Marten 9.0 helped things for everybody by completely eliminating its usage of the Roslyn code compilation. Wolverine 6.0 improved F# usage (with follow ups including 6.3.0 today!) helped as well by supporting the pre-generation of all Wolverine adapter code in F# (message handlers, HTTP endpoint handlers, and gRPC endpoint handlers) with this:

dotnet run -- codegen write --language fsharp

Better yet, Wolverine 6.0 now let’s you use the Roslyn compiler business as strictly a development time only dependency and omit those binaries completely in production deployments. As long as you’re pre-generating the code as shown above (many people like to do that in Docker image initialization), you can deploy a lean, mean, even AOT compliant set of binaries while happily coding with F#!

Last Thoughts

I’m hopeful that these changes make Marten and Wolverine better for folks building and deploying systems with F#.

As we’ve been able to burn down so much of our backlog and other issues, I’ve had time to turn my attention to making our tools better for people who don’t code the exact same way I do. For example, we’ve invested a lot in the last year for the EF Core integration with Wolverine. Just this week we’ve made some progress toward making Wolverine better when folks insist on using more runtime IoC trickery that we would recommend. Along those lines, this post talks about how we hopefully got better for F# developers.

Just so I don’t have to have this conversation yet again, yes, we’re aware of Source Generators in .NET, and no, we don’t believe that it’s remotely possible to replace our usage of Roslyn in Wolverine with Source Generators without Wolverine becoming a much lesser tool because of how much runtime information we use to do the code generation. We have started using far more Source Generators in other elements of the Critter Stack though.



Read the whole story
alvinashcraft
1 minute ago
reply
Pennsylvania, USA
Share this story
Delete
Next Page of Stories