Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.
155138 stories
·
33 followers

From CLI to PR: Automating the path to merged code | BRK203

1 Share
From: Microsoft Developer
Duration: 46:50
Views: 13

Everyone talks about agents, but the real challenge is applying them to daily sprints. Moving beyond chat, we'll show how GitHub Copilot functions as an agentic partner in your workflow by live-coding a full cycleโ€”from planning in the terminal to delegating work to the cloud and automating PR reviews. No high-level abstractions here. Just technical mechanics: context management, advanced features with Copilot CLI, and the patterns that make agentic workflows actually stick.

Seating for this session is first-come, first-served. Add it to your schedule to plan your day and arrive early to secure a spot.

To learn more, please check out these resources:
* https://aka.ms/build26/BRK203

๐—ฆ๐—ฝ๐—ฒ๐—ฎ๐—ธ๐—ฒ๐—ฟ๐˜€:
* Cassidy Williams
* Evan Boyle

๐—ฆ๐—ฒ๐˜€๐˜€๐—ถ๐—ผ๐—ป ๐—œ๐—ป๐—ณ๐—ผ๐—ฟ๐—บ๐—ฎ๐˜๐—ถ๐—ผ๐—ป:
This is one of many sessions from the Microsoft Build 2026 event. View even more sessions on-demand and learn about Microsoft Build at https://build.microsoft.com

BRK203 | English (US) | Developer tools & frameworks

Breakout | (400) Expert

#MSBuild

Chapters:
0:00 - Discussion on AI tools and growth in code commits
00:03:15 - Adding skills and using built-in agents in the CLI
00:09:41 - Introduction of new commands: research and experimental including voice mode
00:10:57 - Introduction to speech-to-text functionality in the CLI
00:12:30 - Introduction of /every automation command for scheduling tasks
00:24:17 - Explaining Integration of Spec Kit with Plan Mode
00:26:10 - Selecting 'Always on Top Octocad' Idea for the Demo
00:34:34 - Exploring token budgeting and efficiency controls for AI tasks
00:39:46 - Audience introduces 'goblin mode' issue inspired by AI personality behavior

Read the whole story
alvinashcraft
just a second ago
reply
Pennsylvania, USA
Share this story
Delete

Scott and Mark learn...how agents reshape software engineering | BRK247

1 Share
From: Microsoft Developer
Duration: 46:52
Views: 79

AI is changing how software is createdโ€”and what it means to be a software engineer. Weโ€™ll explore how AI agents are reshaping development: where they accelerate progress, where they fall short, and whatโ€™s changing for the profession. Along the way, weโ€™ll share failure modes, lessons learned, and propose ways engineers and organizations can adapt. Real talk, no hype.

Seating for this session is first-come, first-served. Add it to your schedule to plan your day and arrive early to secure a spot.

To learn more, please check out these resources:
* https://aka.ms/build26/BRK247

๐—ฆ๐—ฝ๐—ฒ๐—ฎ๐—ธ๐—ฒ๐—ฟ๐˜€:
* Mark Russinovich
* Scott Hanselman

๐—ฆ๐—ฒ๐˜€๐˜€๐—ถ๐—ผ๐—ป ๐—œ๐—ป๐—ณ๐—ผ๐—ฟ๐—บ๐—ฎ๐˜๐—ถ๐—ผ๐—ป:
This is one of many sessions from the Microsoft Build 2026 event. View even more sessions on-demand and learn about Microsoft Build at https://build.microsoft.com

BRK247 | English (US) | Agents & apps

Breakout | (300) Advanced

#MSBuild

Chapters:
0:00 - Project Lobster and Aspire team demonstrate AI-augmented software practices
00:14:42 - AI Compared to an Intern: Limited Context and Learning Ability
00:15:33 - Examples of AIโ€™s Faulty Fixes and Benchmark Misinterpretations
00:19:55 - AI Challenges Demonstrated Through Zoomit Panorama Feature
00:22:33 - Discussion on complexities of Cleartype and pixel color challenges
00:27:12 - Observations on AI-generated code pitfalls and effects on early-career developers
00:39:38 - Historical perspectiveโ€”each technology wave prompts panic but skills evolution continues
00:42:00 - Preceptor demo analogyโ€”training through guided real experiences and safe mistakes
00:44:27 - Future outlookโ€”AI will not replace human oversight; focus shifts to learning, mentoring, and cognitive engagement

Read the whole story
alvinashcraft
17 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

Build context-aware agents: From data to decisions | BRK240

1 Share
From: Microsoft Developer
Duration: 44:30
Views: 63

High performance agents are built on intelligence that brings together context, enterprise data, orchestration, and governance. Learn how Foundry IQ, Fabric IQ, and Work IQ provide the enterprise intelligence layer for AI agents. Design agents that can search across organizational knowledge, reason over business data, and operate with awareness of people and work signals. Take action within trusted boundaries, providing a practical foundation for building scalable and reliable agents.

Seating for this session is first-come, first-served. Add it to your schedule to plan your day and arrive early to secure a spot.

To learn more, please check out these resources:
* https://aka.ms/build26/BRK240
* https://aka.ms/build/foundrydiscord

๐—ฆ๐—ฝ๐—ฒ๐—ฎ๐—ธ๐—ฒ๐—ฟ๐˜€:
* Amanda Silver
* Marco Casalaina

๐—ฆ๐—ฒ๐˜€๐˜€๐—ถ๐—ผ๐—ป ๐—œ๐—ป๐—ณ๐—ผ๐—ฟ๐—บ๐—ฎ๐˜๐—ถ๐—ผ๐—ป:
This is one of many sessions from the Microsoft Build 2026 event. View even more sessions on-demand and learn about Microsoft Build at https://build.microsoft.com

BRK240 | English (US) | Agents & apps

Breakout | (200) Intermediate

#MSBuild

Chapters:
0:00 - Challenges in agentic AI projects due to lack of context
00:09:38 - Introduction to Foundry IQ connecting agents with enterprise knowledge
00:12:59 - Transition to demonstration connecting refund agent with Web IQ and Foundry IQ
00:13:28 - Marco faces network issues and improvises demo setup
00:21:28 - Demonstration of Fabric IQ ontology generation and semantic integration
00:24:13 - Using ontology as an MCP server and connecting with Foundry IQ
00:25:39 - Explaining context delegation in data agents
00:27:25 - Introduction to Work IQ and its connection to organizational knowledge
00:41:42 - Building Secure, Governed, Scalable Agent Ecosystems

Read the whole story
alvinashcraft
22 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

#725 โ€“ The Secret Life of Circuits with lcamtuf / Michaล‚ Zalewski

1 Share

Welcome Michaล‚ Zalewski, AKA lcamtuf!





Download audio: https://traffic.libsyn.com/theamphour/TheAmpHour-725-The-Secret-Life-of-Circuits-with-lcamtuf.mp3
Read the whole story
alvinashcraft
36 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

Giving Developers Claude Code with Azure API Management and Claude Models in Microsoft Foundry

1 Share

Summary

You want to give your engineering org Claude Code without handing out Anthropic API keys, without per-developer billing sprawl, and without losing visibility into who is spending what. This post shows a battle-tested pattern:

  • Claude models run in Microsoft Foundry, billed through your Azure subscription โ€” no Anthropic contract or keys required.
  • Azure API Management (APIM) sits in front as an LLM gateway: it authenticates each developer with Entra ID, enforces per-user rate limits and token quotas, and emits per-user usage metrics for chargeback.
  • Foundry lives in its own Azure subscription, and APIM authenticates to it with a Foundry API key โ€” so there's no cross-subscription RBAC to untangle.
  • Developers hold only short-lived Entra tokens. The Foundry key never leaves APIM.

Everything below is grounded in the Claude Code LLM gateway requirements and Azure API Management's GenAI gateway policies. All command-line steps are shown in PowerShell for Windows developers.

 

The problem

Claude Code is a terminal- and IDE-native coding agent that talks to Claude over the Anthropic Messages API. Pointing it directly at Anthropic (or even directly at Foundry) creates three headaches for any organization beyond a handful of users:

  1. Key sprawl and billing. Direct API keys mean either a shared key (no per-user attribution, a rotation nightmare) or many keys (procurement and offboarding overhead).
  2. No throttle. Claude Code is token-heavy โ€” it reads files, plans, and edits in long loops. One runaway session or an over-enthusiastic team can produce a surprising bill with nothing standing between the developer and the model.
  3. No visibility. Finance wants to know cost per team. Security wants to know who is calling what. A raw key gives you neither.

The fix is a gateway that every request flows through โ€” one that knows who the developer is (Entra ID), enforces how much they can use (APIM GenAI policies), and records what they used (Azure Monitor). Claude Code supports exactly this through its gateway configuration.

 

Architecture

Claude Code on a developer laptop authenticates to Azure API Management with an Entra ID bearer token; APIM validates the token, applies per-user token and request limits, swaps in the Foundry API key, and forwards the Anthropic Messages request to Claude in Microsoft Foundry in a separate subscription; per-user token usage is emitted to Application Insights. 

 

The request path:

Developer laptop (Claude Code CLI / VS Code) | Authorization: Bearer <Entra access token for the APIM app> v Azure API Management (the LLM gateway) [Subscription A] | 1. validate-jwt confirm Entra identity, audience, app role | 2. extract oid per-user counter key | 3. llm-token-limit per-user tokens/min + monthly token quota | 4. rate-limit-by-key per-user requests/min | 5. strip Authorization; set api-key from secret named value | 6. llm-emit-token-metric per-user usage to App Insights v (forwards Anthropic Messages format; anthropic-* headers preserved) Microsoft Foundry https://{resource}.services.ai.azure.com/anthropic/v1/messages v [Subscription B] Claude deployments (Sonnet 4.6 / Haiku 4.5 / Opus 4.6)

The key idea: developer-facing auth and backend auth are independent. Developers always authenticate as themselves with Entra ID at the gateway. How the gateway authenticates to Foundry is a separate decision โ€” and you have two good options.

Choosing how the gateway authenticates to Foundry

Both options below are independent of the developer-facing Entra ID auth, and both work whether Foundry is in the same subscription as APIM or a different one. The only hard constraint for managed identity is that both resources live in the same Entra tenant.

 Option A โ€” Foundry API keyOption B โ€” Managed identity
How APIM authenticatesapi-key header from a secret named valueEntra token from APIM's managed identity, in the Authorization header
SetupRead the key once, store it in APIMEnable APIM's identity, assign Cognitive Services User on Foundry
Same subscriptionWorksWorks
Cross-subscriptionWorks โ€” no RBAC crosses the boundaryWorks โ€” role assignment spans subscriptions in the same tenant
Cross-tenantWorksNot supported โ€” use a key
Shared secret to rotateYesNone
Best forFastest start; cross-tenant; key-only environmentsProduction; eliminates the shared secret

This guide builds the key-based path end to end, then shows the managed-identity swap inline at each step (Parts 3 and 4). Pick one โ€” you don't need both.

What this design achieves

GoalHow it's met
Developers use Claude Code with no Anthropic billing or keysClaude runs in Microsoft Foundry, billed through your Azure subscription
Foundry can live in a different subscriptionAPIM reaches Foundry by URL + API key only โ€” no cross-subscription RBAC
Every developer authenticates as themselvesEntra ID tokens validated at the APIM gateway
Per-developer rate limits and quotasrate-limit-by-key + llm-token-limit keyed on the Entra oid claim
Per-developer usage and cost trackingllm-emit-token-metric โ†’ Application Insights / Log Analytics
No Foundry keys on developer laptopsThe Foundry key lives only inside APIM; developers hold short-lived Entra tokens

Prerequisites

  • Two Azure subscriptions, both pay-as-you-go. Subscription A holds APIM; Subscription B holds Foundry. (Foundry Claude does not run on free, trial, sponsored, or CSP subscriptions.)
  • Microsoft Foundry resource (Subscription B) in a region where Claude is available โ€” currently East US 2 or Sweden Central โ€” with Claude deployments created and at least one API key under Keys and Endpoint.
  • An API Management instance (Subscription A). Developer SKU is fine for a pilot; Standard v2 or Premium for production and VNet integration.
  • Permission to read the Foundry key in Subscription B, contributor on the APIM instance, and the ability to register Entra apps.
  • Developers on Windows 10/11 with PowerShell (5.1 built-in, or 7), the Azure CLI (winget install Microsoft.AzureCLI), and the Claude Code CLI installed.

Option A (key): no cross-subscription role assignment โ€” the only cross-subscription action is reading the Foundry key once (Part 3), which you can also do from the Foundry portal. Option B (managed identity): one cross-subscription role assignment (Cognitive Services User), supported as long as APIM and Foundry share an Entra tenant.

Part 1 โ€” Deploy Claude in Foundry (Subscription B)

  1. In the Foundry portal, open Model catalog, search Claude, and deploy the models Claude Code uses. Name each deployment to match its model ID so the gateway can pass the model field through unchanged:
    RoleDeployment name (recommended)
    Primary (general coding)claude-sonnet-4-6
    Fast (file reads, small edits, background tasks)claude-haiku-4-5
    Extended thinking (optional)claude-opus-4-6
  2. Pin versions โ€” select a specific version, not auto-update to latest. Without pinning, a new model release can break every developer at once.
  3. On the resource's Keys and Endpoint blade, copy the endpoint and one of the two API keys. The Anthropic endpoint base is:
https://{resource}.services.ai.azure.com/anthropic

Critical: Foundry's Claude endpoint is the Anthropic surface (/anthropic/v1/messages), not the OpenAI surface (/openai/deployments/.../chat/completions?api-version=...). When you build the APIM API, do not apply the OpenAI policy template, do not add an api-version query parameter, and do not rewrite to an /openai/... path. Any of these produces the "not supported" or "resource not found" errors people commonly hit.

 

โœ… Checkpoint: You now have Claude deployed in Foundry. Verify your deployment before continuing to Part 2.

Part 2 โ€” Entra ID app registration (developer-facing)

This registration lives in Subscription A's tenant. It defines the audience developers' tokens are issued for, and what APIM validates. It's unaffected by where Foundry lives.

  1. App registrations โ†’ New registration โ†’ name it e.g. Claude Code Gateway.
  2. Expose an API โ†’ set the Application ID URI, e.g. api://claude-code-gateway. Add a scope access_as_user (admin + user consent).
  3. (Optional, for tiering) App roles โ†’ add roles such as Claude.Standard and Claude.Premium. Assign developers or groups under Enterprise applications โ†’ this app โ†’ Users and groups.
  4. Note the Application (client) ID, the Application ID URI, and your Tenant ID.

Developers request tokens for this app's audience; APIM validates aud = api://claude-code-gateway.

Part 3 โ€” Provision the APIM API and Foundry backend (Subscription A)

3.1 Option A โ€” Store the Foundry API key in APIM

First read the key from Foundry in Subscription B (use --subscription so you don't have to switch your active context):

# Read a Foundry key from Subscription B $FOUNDRY_KEY = az cognitiveservices account keys list ` --name <foundry-resource> ` --resource-group <foundry-rg> ` --subscription <SUBSCRIPTION_B_ID> ` --query key1 -o tsv

Then store it as a secret named value in APIM (Subscription A). The policy references it as {{foundry-api-key}}:

# Create a secret named value in APIM holding the Foundry key az apim nv create -g <apim-rg> --service-name <apim-name> ` --named-value-id foundry-api-key ` --display-name foundry-api-key ` --value "$FOUNDRY_KEY" ` --secret true

Hardening: instead of the raw key in APIM, put it in Key Vault and create a Key Vault-backed named value, so rotation lives in one place. APIM needs a managed identity with Get/List secret access on that vault โ€” but the vault is in Subscription A alongside APIM, so this is still not a cross-subscription role assignment.

3.2 Option B โ€” Give APIM a managed identity instead

If you'd rather not manage a shared key, skip 3.1 and give APIM an identity that Foundry trusts. This works in the same subscription and across subscriptions alike, as long as both resources are in the same Entra tenant.

# Enable a system-assigned managed identity on APIM (Subscription A) az apim update -g <apim-rg> --name <apim-name> ` --set identity.type=SystemAssigned # Get the identity's principal (object) ID $APIM_MI = az apim show -g <apim-rg> --name <apim-name> ` --query identity.principalId -o tsv # Get the Foundry resource ID (Subscription B) $FOUNDRY_ID = az cognitiveservices account show ` --name <foundry-resource> --resource-group <foundry-rg> ` --subscription <SUBSCRIPTION_B_ID> ` --query id -o tsv # Grant Cognitive Services User on the Foundry resource (works cross-subscription) az role assignment create ` --assignee-object-id $APIM_MI ` --assignee-principal-type ServicePrincipal ` --role "Cognitive Services User" ` --scope $FOUNDRY_ID

The Cognitive Services User role (a97b65f3-24c7-4388-baec-2e87135dc908) grants data-plane access to call the model without key-management rights. Role assignments can take a few minutes to propagate. A user-assigned identity works too โ€” assign it to APIM and reference its client ID in the policy (Part 4, Option B). On this path there is no foundry-api-key named value to create or rotate.

3.3 Create the backend and API

# Named backend pointing at the Foundry Anthropic endpoint (Subscription B URL) az apim backend create -g <apim-rg> --service-name <apim-name> ` --backend-id foundry-claude ` --url "https://<foundry-resource>.services.ai.azure.com/anthropic" ` --protocol http # API with NO path suffix so callers hit /v1/messages at the gateway root az apim api create -g <apim-rg> --service-name <apim-name> ` --api-id claude-anthropic --display-name "Claude (Foundry)" ` --path="" --protocols https ` --service-url "https://<foundry-resource>.services.ai.azure.com/anthropic"

PowerShell + empty strings: write --path="" (joined with =), not --path "" as two tokens. PowerShell strips a bare "" before the az wrapper sees it, so the CLI reports argument --path: expected one argument. The = form keeps it a single token (--path=) that az reads as an empty string. The same trick applies to any empty-string value you pass to az from PowerShell.

Add the operations Claude Code calls (a wildcard covers them all):

  • POST /v1/messages
  • POST /v1/messages/count_tokens
  • GET /v1/models (only if you enable gateway model discovery โ€” see Part 5.3)

az apim can't apply XML policies. Apply the Part 4 policy via the portal (APIs โ†’ Claude (Foundry) โ†’ Inbound processing โ†’ policy editor) or via Bicep/ARM.

Part 4 โ€” The APIM policy (auth + rate limiting + metering)

Apply this at the API level. Replace the tenant ID and audience. The policy below is the key-based (Option A) version โ€” its step 6 removes the developer's Authorization header and sets the api-key header from the secret named value. For managed identity (Option B), swap step 6 as shown immediately after the policy; every other step is identical.

<policies> <inbound> <base /> <!-- On the client, Bearer token is generated and passed as x-api-key --> <set-header name="Authorization" exists-action="skip"> <value>@("Bearer " + context.Request.Headers.GetValueOrDefault("x-api-key",""))</value> </set-header> <!-- 1. Validate the developer's Entra ID token --> <validate-jwt header-name="Authorization" failed-validation-httpcode="401" failed-validation-error-message="Unauthorized: invalid or missing Entra token."> <openid-config url="https://login.microsoftonline.com/{{tenant-id}}/v2.0/.well-known/openid-configuration" /> <audiences> <audience>{{gateway-audience}}</audience> </audiences> <issuers> <issuer>https://login.microsoftonline.com/{{tenant-id}}/v2.0</issuer> <!-- This is needed as Claude Code's Foundry Mode is looking for scope as https://cognitiveservices.azure.com/.default and audience cannot be changed to APIM Audience (api://...) --> <issuer>https://sts.windows.net/{{tenant-id}}/</issuer> </issuers> <required-claims> <claim name="roles" match="any"> <value>Claude.Standard</value> <value>Claude.Premium</value> </claim> </required-claims> </validate-jwt> <!-- 2. Per-developer key from the stable object id --> <set-variable name="callerId" value="@{ var jwt = context.Request.Headers .GetValueOrDefault("Authorization","").Split(' ').Last().AsJwt(); return jwt.Claims.GetValueOrDefault("oid", "unknown"); }" /> <!-- 3. Tier from app role --> <set-variable name="tier" value="@{ var jwt = context.Request.Headers .GetValueOrDefault("Authorization","").Split(' ').Last().AsJwt(); return jwt.Claims.GetValueOrDefault("roles","").Contains("Claude.Premium") ? "premium" : "standard"; }" /> <set-variable name="modelName" value="@{ var body = context.Request.Body.As<JObject>(preserveContent: true); return body?["model"]?.ToString() ?? "unknown"; }" /> <!-- 4. Token-based throttle per developer (controls LLM cost) --> <choose> <when condition="@(((string)context.Variables["tier"]) == "premium")"> <llm-token-limit counter-key="@((string)context.Variables["callerId"])" tokens-per-minute="200000" estimate-prompt-tokens="true" remaining-tokens-header-name="x-tokens-remaining" token-quota="20000000" token-quota-period="Monthly" /> <rate-limit-by-key calls="300" renewal-period="60" counter-key="@((string)context.Variables["callerId"])" retry-after-header-name="retry-after" remaining-calls-header-name="x-ratelimit-remaining" /> </when> <otherwise> <llm-token-limit counter-key="@((string)context.Variables["callerId"])" tokens-per-minute="50000" estimate-prompt-tokens="true" remaining-tokens-header-name="x-tokens-remaining" token-quota="5000000" token-quota-period="Monthly" /> <rate-limit-by-key calls="100" renewal-period="60" counter-key="@((string)context.Variables["callerId"])" retry-after-header-name="retry-after" remaining-calls-header-name="x-ratelimit-remaining" /> </otherwise> </choose> <!-- 7. Emit per-developer token usage for tracking / chargeback --> <llm-emit-token-metric namespace="claudecode"> <dimension name="UserId" value="@((string)context.Variables["callerId"])" /> <dimension name="Tier" value="@((string)context.Variables["tier"])" /> <dimension name="Model" value="@(context.Request.Body?.As<JObject>(true)?["model"]?.ToString() ?? "unknown")" /> </llm-emit-token-metric> <!-- 5. Request-rate throttle per developer --> <llm-emit-token-metric namespace="claudecode"> <dimension name="UserId" value="@((string)context.Variables["callerId"])" /> <dimension name="Tier" value="@((string)context.Variables["tier"])" /> <dimension name="Model" value="@((string)context.Variables["modelName"])" /> </llm-emit-token-metric> <!-- 6. Authenticate to Foundry with its API key (secret named value) --> <!-- Strip the developer's Entra token so Foundry never sees it --> <set-header name="Authorization" exists-action="delete" /> <set-header name="x-api-key" exists-action="override"> <value>{{foundry-api-key}}</value> </set-header> <set-backend-service backend-id="foundry-claude" /> </inbound> <backend> <base /> </backend> <outbound> <base /> </outbound> <on-error> <base /> </on-error> </policies>

Option B โ€” authenticate to Foundry with managed identity

If you chose the managed-identity path (3.2), replace step 6 above with the block below. Instead of injecting an api-key, APIM acquires an Entra token for its own identity and forwards it as the Authorization bearer token. Token validation, rate limits, and metering are unchanged.

<!-- 6 (Option B). Authenticate to Foundry with APIM's managed identity --> <!-- Replace the developer's token with an MI token scoped to AI Services --> <authentication-managed-identity resource="https://cognitiveservices.azure.com" output-token-variable-name="msi-token" /> <set-header name="Authorization" exists-action="override"> <value>@("Bearer " + (string)context.Variables["msi-token"])</value> </set-header> <set-backend-service backend-id="foundry-claude" />

The token audience for Azure AI Services / Foundry is https://cognitiveservices.azure.com. For a user-assigned identity, add client-id="<uami-client-id>" to the authentication-managed-identity element. There's no api-key named value and no secret to rotate on this path โ€” which is exactly why it's the preferred production posture.

Policy notes

  • Stripping the developer's Authorization header before forwarding (step 6) matters: that Entra token is for APIM only. Foundry must receive only the api-key header.
  • {{tenant-id}}, {{gateway-audience}}, and {{foundry-api-key}} are APIM named values. Mark foundry-api-key as secret; the first two can be plain named values.
  • llm-token-limit and llm-emit-token-metric are APIM's GenAI gateway policies โ€” they understand the Anthropic/OpenAI message formats and parse token usage, so you meter tokens, not just requests. That's the right cost lever for token-heavy Claude Code.
  • These counters are per-region per-gateway. With multi-region APIM, limits are enforced per region.

Part 5 โ€” Configure Claude Code on developer machines

Developers point Claude Code at APIM (Anthropic Messages gateway mode) and authenticate with their own Entra token. The backend-auth swap is invisible to clients. 

5.1 Entra token helper (per-developer, auto-refreshing)

Create %USERPROFILE%\.claude\get-claude-gateway-token.ps1:

# Returns a short-lived Entra access token for the APIM gateway audience. az account get-access-token ` --resource "api://claude-code-gateway" ` --query accessToken -o tsv

PowerShell scripts need no chmod. If execution policy blocks the helper, allow local scripts for your user once:

Set-ExecutionPolicy -Scope CurrentUser -ExecutionPolicy RemoteSigned

5.2 Claude Code settings (%USERPROFILE%\.claude\settings.json)

In enabling configuration of below environment variables in settings.json under .claude folder allows its usage for all Claude Code sessions (VS Code, terminal CLI, JetBrains, etc.)

{ "env": { "ANTHROPIC_BASE_URL": "https://<apim-name>azure-api.net", "ANTHROPIC_MODEL": "claude-opus-4-8", "ANTHROPIC_DEFAULT_OPUS_MODEL": "claude-opus-4-8", "CLAUDE_CODE_API_KEY_HELPER_TTL_MS": "600000" }, "apiKeyHelper": "powershell -NoProfile -ExecutionPolicy Bypass -File C:\\Users\\<you>\\.claude\\get-claude-gateway-token.ps1" }

In JSON, backslashes must be doubled โ€” hence C:\\Users\\.... Use pwsh in place of powershell if you run PowerShell 7.

  • apiKeyHelper output is sent as the Authorization (and X-Api-Key) header, validated by APIM's validate-jwt. The developer never holds the Foundry key.
  • CLAUDE_CODE_API_KEY_HELPER_TTL_MS=3600000 refreshes the token hourly (Entra access tokens last ~60โ€“90 minutes).
  • Pinning the three ANTHROPIC_DEFAULT_*_MODEL IDs ensures Claude Code sends model names that match your Foundry deployment names, so the gateway passes model through untouched.
  • Other Anthropic models like Sonnet and Haiku can be provided. Default model to be used is provided with ANTHROPIC_MODEL.

Then developers run claude from their project folder.

5.3 Optional โ€” model discovery

To list gateway models in the /model picker, expose GET /v1/models on the API and set CLAUDE_CODE_ENABLE_GATEWAY_MODEL_DISCOVERY=1 (Claude Code v2.1.129+). Only IDs starting with claude or anthropic appear.

5.4 VS Code extension

Settings.json in .claude folder will control both VS Code Extension and Claude Code CLI. 

๐Ÿš€ Ready to test? Jump to Part 7 to validate your setup, or run claude from your project folder to try it live.

Part 6 โ€” Rate-limiting and usage-tracking design

Per-developer keying. Everything is keyed on the Entra oid claim โ€” stable and unique per user, unlike email or upn which can change. For service accounts or CI, key on appid instead.

Two enforcement layers:

  • llm-token-limit โ€” tokens/min plus a monthly token quota. The real cost control.
  • rate-limit-by-key โ€” requests/min. Guards against runaway loops.

Tiering is driven by Entra app roles (Claude.Standard / Claude.Premium) read from the JWT โ€” no separate APIM subscription management needed.

Usage tracking flows from llm-emit-token-metric into Application Insights with UserId, Tier, and Model dimensions. Example Log Analytics query for per-user monthly token spend:

 

customMetrics | where name == "Total Tokens" and customDimensions.namespace == "claudecode" | extend UserId = tostring(customDimensions.UserId), Model = tostring(customDimensions.Model) | summarize Tokens = sum(valueSum) by UserId, Model, bin(timestamp, 1d) | order by Tokens desc

Foundry doesn't return Anthropic's standard rate-limit headers, so manage and observe limits through APIM (the headers above) and Azure Monitor rather than relying on upstream headers.

Part 7 โ€” Test and validate

# 1. Get a token as a developer $TOKEN = az account get-access-token --resource "api://claude-code-gateway" --query accessToken -o tsv # 2. Call the gateway directly in Anthropic Messages format $body = @{ model = "claude-sonnet-4-6" max_tokens = 64 messages = @(@{ role = "user"; content = "Say hello in one word." }) } | ConvertTo-Json Invoke-RestMethod -Method Post ` -Uri "https://<apim-name>.azure-api.net/v1/messages" ` -Headers @{ "Authorization" = "Bearer $TOKEN" "anthropic-version" = "2023-06-01" "content-type" = "application/json" } ` -Body $body

Invoke-RestMethod returns the parsed body but hides response headers. To see x-tokens-remaining / x-ratelimit-remaining, use Invoke-WebRequest with -ResponseHeadersVariable resp (then read $resp), or call curl.exe -i (the real curl, not PowerShell's curl alias).

Validation checklist

  • No token / expired token โ†’ 401 from validate-jwt (confirm before trusting rate limits).
  • Valid token โ†’ 200 with a Claude completion; response carries x-tokens-remaining / x-ratelimit-remaining.
  • 401 from Foundry on a valid developer token โ†’ the api-key named value is wrong or not injected (see Troubleshooting).
  • Exceed the limit โ†’ 429 with retry-after.
  • App Insights โ†’ customMetrics shows token counts dimensioned by UserId.
  • Then run claude end to end from a project folder.

Part 8 โ€” Operations and hardening

  • Key rotation (Option A). Foundry gives you two keys. Rotate by updating the foundry-api-key named value to key2, then regenerating key1 โ€” zero downtime. A Key Vault-backed named value makes this a one-place change.
  • Prefer managed identity in production (Option B). If you started on the key path, switch to managed identity (Parts 3.2 and 4, Option B) to remove the shared secret entirely. Because the Cognitive Services User role assignment works across subscriptions in the same tenant, the cross-subscription topology doesn't block this upgrade โ€” and developers see no change, since their side of the contract is always authenticate to the gateway as yourself.
  • Private networking. Put APIM in internal VNet mode and reach Foundry over a Private Endpoint; disable Foundry public network access so the gateway is the only path in. Cross-subscription private endpoints are supported.
  • Resilience. Deploy Claude in two regions and use APIM's load-balanced backend pool with retry on 429 and 5xx.
  • Cost guardrails. Pair per-user llm-token-limit quotas with an Azure Budget and alert on the Foundry resource in Subscription B.

Troubleshooting

SymptomCause / fix
404 resource not found from FoundryBackend URL or path wrong, or an OpenAI-style rewrite applied. Backend must end in /anthropic; callers hit /v1/messages. Remove any /openai/... rewrite and api-version query param.
401 from Foundry (developer token is valid) โ€” Option AThe api-key header is missing/wrong, or the foundry-api-key named value wasn't saved as expected. Confirm the named value, and that the policy deletes the developer Authorization header and sets api-key.
401 / 403 from Foundry โ€” Option B (managed identity)The role assignment is missing or hasn't propagated yet, or the token audience is wrong. Confirm APIM's identity has Cognitive Services User on the Foundry resource, wait a few minutes, and ensure the policy requests resource="https://cognitiveservices.azure.com". For a user-assigned identity, confirm the client-id is set.
Managed identity works same-sub but not cross-subThe two resources are in different Entra tenants. Cross-tenant managed identity isn't supported โ€” use the API key (Option A) instead.
401 at the gateway even with a tokenaud or issuer mismatch. Confirm the token's aud = api://claude-code-gateway and you used the v2.0 OIDC config and issuer.
403 from FoundryThe key belongs to a different Foundry resource, or the resource disabled key auth. Re-copy a key from Keys and Endpoint, or re-enable local/key auth.
Reduced Claude Code functionalityGateway stripped anthropic-beta / anthropic-version. Ensure both headers pass through.
Model not availableClaude Code's model ID doesn't match the Foundry deployment name. Align names, or rewrite the body model field in policy.
ChainedTokenCredential authentication failed (client side)Developer not logged in. Run az login so the helper has a usable Azure credential.

 

Wrapping up

With about an afternoon of setup you get a gateway that every Claude Code request flows through: Entra ID proves who the developer is, APIM GenAI policies cap how much each person can spend, and Application Insights tells you exactly where the tokens went. For the APIM โ†’ Foundry hop you pick what fits: a Foundry API key held only inside APIM (fastest start, works cross-tenant) or a managed identity with no shared secret at all (the production posture). Either way Claude can live in its own subscription, and developers hold nothing more sensitive than a short-lived Entra token.

When you're ready to tighten the screws, the upgrade path is clean: if you started on the key, move it into Key Vault, then graduate to managed identity to eliminate the secret entirely, and put the whole path on a private network. 

'None of those steps disrupt developers, because their side of the contract โ€” authenticate to the gateway as yourself โ€” never changes.

Start your pilot today: Deploy a Developer-tier APIM instance, connect it to Foundry, and have your first developer running Claude Code through the gateway by end of day. The Prerequisites section has everything you need to begin.'

All command-line steps target Windows with PowerShell 5.1 or 7. Model IDs and Foundry regions reflect availability at time of writing; check the Foundry model catalog for current options.

 

Next Steps

Get started now: - Deploy Claude models in Microsoft Foundry โ€” browse the model catalog and create your first deployment - Create an API Management instance โ€” spin up a Developer SKU for your pilot

Go deeper: - Claude Code LLM gateway requirements โ€” full specification for gateway compatibility - APIM GenAI gateway policies reference โ€” all available token and rate limiting options

Get help: - Questions? Post in the Azure AI Community with tag #ClaudeCode - Found an issue with this guide? Open a GitHub issue

Read the whole story
alvinashcraft
47 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

REDB inside, part 1 โ€” the 13 tables the whole engine runs on (with the actual SQL, and why it's not EAV)

1 Share

REDB SQL

A couple of weeks ago I published the redb.Core intro post โ€” what RedBase is at the API level, why I wrote it, what production looks like, the LINQ surface, what generated SQL looks like for nested dictionary lookups. If you haven't read it, start there โ€” it's the wide-angle shot.

This post starts a new series โ€” "REDB inside" โ€” that drills down into the engine. One article per layer:

  • Part 1 (this post) โ€” the database schema. 13 tables, what each one does, why the design is what it is, and the SQL you'd run to dump any object flat.
  • Part 2 โ€” code-first schemes. How SyncSchemeAsync<T> walks a C# class and turns it into rows in _schemes + _structures, the _structure_hash mechanism, automatic onboarding via InitializeAsync.
  • Part 3 โ€” CRUD internals. SaveAsync, LoadAsync, the TreeDiff change-tracking algorithm, COPY-protocol bulk insert, lazy loading.
  • Part 4 โ€” LINQ-to-SQL. How Where(x => x.Salary > 80000) becomes WHERE _id_structure = X AND _Long > 80000, the pivot CTE patterns, dialect differences (Postgres array_agg FILTER vs MSSQL MAX CASE WHEN).
  • Part 5 โ€” trees. LoadTreeAsync, GetDescendantsAsync, WhereHasAncestor, closure-table vs recursive CTE.
  • Part 6 โ€” window functions. Win.RowNumber(), Win.Rank(), PartitionBy/OrderBy over REDB objects.

Each one stands alone. You don't need to read them in order โ€” but if you want to understand why anything in parts 2-6 works the way it does, you need this post. Everything else is built on the 13 tables.

The whole RedBase engine runs on 13 tables. No JSON blob hiding the schema, no NVARCHAR(MAX) catch-all column โ€” every C# type lands in its own typed column. Let me show you how that works, and why this isn't classical EAV even though it kind of looks like it at first glance.

"Wait, isn't this just EAV?"

It's the most common reaction I get, and it deserves a real answer before we look at any DDL.

Classical EAV (Entityโ€“Attributeโ€“Value) looks like this:

object_id | attribute_name | value
----------|----------------|---------
42        | "FirstName"    | "Alice"
42        | "Age"          | "28"
42        | "Salary"       | "85000"

Everything in one table. Types erased. Attribute names are strings. Any non-trivial query becomes a self-join or a giant PIVOT. The filter "employees earning over $80k" turns into:

SELECT object_id
FROM   attributes
WHERE  attribute_name = 'Salary'
  AND  value::numeric > 80000;   -- runtime cast, no usable index

Add three more conditions and you're looking at three self-joins on the same table. Explain plans get embarrassing.

REDB doesn't do that. Here's what _values actually looks like:

CREATE TABLE _values (
    _id              bigint NOT NULL,
    _id_structure    bigint NOT NULL,   -- FK โ†’ _structures (which field this is)
    _id_object       bigint NOT NULL,   -- FK โ†’ _objects
    -- typed value columns, exactly one is non-NULL per row:
    _String          text NULL,
    _Long            bigint NULL,
    _Guid            uuid NULL,
    _Double          float NULL,
    _DateTimeOffset  timestamptz NULL,
    _Boolean         boolean NULL,
    _ByteArray       bytea NULL,
    _Numeric         numeric(38,18) NULL,
    _ListItem        bigint NULL,       -- FK โ†’ _list_items
    _Object          bigint NULL,       -- FK โ†’ _objects (cross-object reference)
    -- relational collection storage:
    _array_parent_id bigint NULL,       -- FK โ†’ _values (parent element)
    _array_index     text NULL          -- '0','1','2' for arrays, key for dictionaries
);

The field identity is a foreign key (_id_structure), not a string. The value lives in a typed column chosen at write time based on the field's declared C# type. Think of it as runtime type information (RTTI) persisted into the schema: the engine always knows what type each field is, because that's recorded once in _structures._id_type and reused for every value of that field.

Reading values back is one CASE expression dispatched on type, not a self-join:

CASE c.db_type
  WHEN 'String'  THEN v._String
  WHEN 'Long'    THEN v._Long::text
  WHEN 'Guid'    THEN v._Guid::text
  ...
END

The practical differences:

Classical EAV REDB _values
Field identity string in attribute_name FK โ†’ _structures โ†’ _types
Where the value lives one text/varchar(max) column typed column per C# type
Filter Salary > 80000 WHERE attribute_name='Salary' AND value::numeric > 80000 WHERE _id_structure = $1 AND _Long > 80000
Index on the value string index + runtime cast partial B-tree index on the typed column
Arrays/dictionaries separate table or JSON blob _array_parent_id + _array_index in the same row
Schema metadata implicit in attribute names first-class rows in _schemes/_structures/_types, denormalized into a metadata cache

So yes, the row shape rhymes with EAV โ€” but the type system and the indexing story are completely different. That's why I've been pushing back on the EAV label.

The 13 tables, at a glance

_types          โ€” type catalog (~37 system rows)
_schemes        โ€” schemes (C# classes mapped to DB rows)
_structures     โ€” fields of schemes (with nesting and collection metadata)
_objects        โ€” objects (data rows, tree-shaped via self-FK)
_values         โ€” field values (typed columns + relational collections)
_lists          โ€” pick-list/dictionary catalog
_list_items     โ€” pick-list entries
_users          โ€” users (system IDs โˆ’1, 0, 1)
_roles          โ€” roles
_users_roles    โ€” M2M user โ†” role
_permissions    โ€” permissions on objects (inherited along the tree)
_links          โ€” M2M relations between objects
_functions      โ€” stored expressions attached to schemes
_dependencies   โ€” cross-scheme dependencies
โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
_scheme_metadata_cache   โ€” denormalized cache of structures ร— types
_migrations              โ€” history of props-schema migrations

The last two live in their own SQL files but matter just as much in practice. Let's walk through them layer by layer.

Layer 1 โ€” the type catalog: _types

CREATE TABLE _types (
    _id      bigint NOT NULL PRIMARY KEY,
    _name    text NOT NULL UNIQUE,
    _db_type text NULL,   -- which _values column to use: 'Long', 'String', 'Guid', ...
    _type    text NULL    -- C# type name: 'long', 'string', 'Guid', ...
);

System type IDs are negative constants near long.MinValue. A small sample:

_id _name _db_type C# type
-9223372036854775709 Boolean Boolean bool
-9223372036854775708 DateTime DateTimeOffset DateTime
-9223372036854775704 Long Long long
-9223372036854775700 String String string
-9223372036854775695 Decimal Numeric decimal
-9223372036854775675 Class โ€” nested class (marker only)
-9223372036854775668 Array โ€” T[] / List<T> (marker only)
-9223372036854775667 Dictionary โ€” Dictionary<K,V> (marker only)

The scary-looking numbers are just constants picked far from anything a user-generated key could ever hit (user IDs start at 1_000_000 via a global_identity sequence). Class, Array, and Dictionary have no column of their own in _values โ€” they're marker types; the actual leaves live in child rows.

There are ~37 built-in types total. Numeric ones (Int, Short, Byte, Float) physically store as Long/Double. Strings include semantic variants (Email, Url, Phone) that all use _String. Then DateOnly/TimeOnly/TimeSpan, geo (Latitude/Longitude), file metadata (FilePath/MimeType), Enum/EnumInt, and collection markers (Array/Dictionary/JsonDocument/XDocument).

Layer 2 โ€” schemes and fields: _schemes + _structures

CREATE TABLE _schemes (
    _id             bigint NOT NULL,
    _id_parent      bigint NULL,          -- nesting (namespaces)
    _name           text NOT NULL UNIQUE, -- e.g. 'MyApp.Models.EmployeeProps'
    _alias          text NULL,
    _structure_hash uuid NULL,            -- field hash for fast change detection
    _type           bigint NOT NULL       -- FK โ†’ _types (Class by default)
);

CREATE TABLE _structures (
    _id              bigint NOT NULL,
    _id_parent       bigint NULL,     -- nested props class
    _id_scheme       bigint NOT NULL, -- FK โ†’ _schemes
    _id_type         bigint NOT NULL, -- FK โ†’ _types
    _id_list         bigint NULL,     -- FK โ†’ _lists (for ListItem fields)
    _name            text NOT NULL,   -- C# property name
    _alias           text NULL,
    _order           bigint NULL,
    _collection_type bigint NULL,     -- NULL=scalar, Array_ID or Dictionary_ID
    _key_type        bigint NULL,     -- key type for Dictionary<K,V>
    _readonly        boolean NULL,
    _allow_not_null  boolean NULL,
    _is_compress     boolean NULL,
    _store_null      boolean NULL,
    _default_value   bytea NULL
);

When you write:

[RedbScheme("Employee")]
public class EmployeeProps
{
    public string FirstName            { get; set; } = "";
    public int    Age                  { get; set; }
    public decimal Salary              { get; set; }
    public string[]? Skills            { get; set; }
    public Address? HomeAddress        { get; set; }   // nested class
}

โ€ฆand await redb.SyncSchemeAsync<EmployeeProps>() fires for the first time, the engine:

  1. Inserts a row into _schemes with _name = "MyApp.EmployeeProps".
  2. Inserts one _structures row per public property.
  3. For Skills: sets _collection_type = Array_ID.
  4. For HomeAddress: sets _id_type = Class_ID and recursively creates child _structures rows whose _id_parent points back at the parent structure.

Then it computes a hash over all those structures and writes it to _schemes._structure_hash. Next time you call sync, comparing one UUID tells the engine whether anything actually changed โ€” no row-by-row diff needed.

There's a DB-level trigger that validates field names: no system-reserved (_id, _name, _date_create), no C# keywords (class, int, string), no leading digits. If you accidentally try to name a property int, the INSERT throws before the bad row ever lands.

Layer 3 โ€” objects: _objects

CREATE TABLE _objects (
    _id             bigint NOT NULL,
    _id_parent      bigint NULL,          -- tree parent (self-FK)
    _id_scheme      bigint NOT NULL,      -- FK โ†’ _schemes
    _id_owner       bigint NOT NULL,      -- FK โ†’ _users
    _id_who_change  bigint NOT NULL,      -- FK โ†’ _users
    _date_create    timestamptz NOT NULL,
    _date_modify    timestamptz NOT NULL,
    _date_begin     timestamptz NULL,
    _date_complete  timestamptz NULL,
    _key            bigint NULL,
    _name           text NULL,
    _note           text NULL,
    _hash           uuid NULL,
    -- value columns for RedbPrimitive<T>:
    _value_long     bigint NULL,
    _value_string   text NULL,
    _value_guid     uuid NULL,
    _value_bool     boolean NULL,
    _value_double   float NULL,
    _value_numeric  numeric(38,18) NULL,
    _value_datetime timestamptz NULL,
    _value_bytes    bytea NULL
);

Three things worth pointing out:

The tree via _id_parent is ON DELETE CASCADE. Drop a root, the whole subtree goes with it. Depth is unbounded. This is the primary organizational structure in REDB: sections, categories, folders, org charts, project trees โ€” they're all just _objects rows pointing at a parent.

The _value_* columns are for RedbPrimitive<T>. When an object is conceptually a single primitive (e.g. RedbObject<long> for a counter, RedbObject<string> for a token), there's no need to spin up _values rows โ€” the value rides in the object row itself. Eight columns, one per _db_type.

Soft delete is a scheme called @@__deleted (_id = -10). mark_for_deletion() walks the subtree via recursive CTE and atomically reparents everything under a trash container with _id_scheme = -10. Actual physical deletion is a separate, batched purge_trash(). This means you can offer "undelete" cheaply, and your data lake/CDC tools never see a destructive delete on the hot path.

Layer 4 โ€” the values: _values

This is the table that earns its keep. Everything else exists to make this one fast and consistent.

The DDL was up top. The interesting part is how collections fit into a flat row layout without a side table.

Scalar field

Age = 28 produces exactly one row:

_id_structure=struct_Age   _id_object=42   _Long=28   _array_parent_id=NULL   _array_index=NULL

Array of primitives

Skills = ["C#", "SQL", "React"] produces a marker row plus one row per element:

-- marker: "the array property exists" (without it, the property is NULL, not [])
_id=100   _id_structure=struct_Skills   _id_object=42   _array_index=NULL   _array_parent_id=NULL

-- elements; _array_parent_id points at the marker; _array_index is the position
_id=101   _id_structure=struct_Skills   _id_object=42   _String="C#"     _array_index='0'   _array_parent_id=100
_id=102   _id_structure=struct_Skills   _id_object=42   _String="SQL"    _array_index='1'   _array_parent_id=100
_id=103   _id_structure=struct_Skills   _id_object=42   _String="React"  _array_index='2'   _array_parent_id=100

That marker row matters: it's how the engine tells null (no marker, no elements) apart from [] (marker present, zero children). The same shape works for empty dictionaries too.

Dictionary

PhoneDir = { "home": "+7 999โ€ฆ", "work": "+7 495โ€ฆ" }:

-- marker
_id=200   _id_structure=struct_PhoneDir   _id_object=42   _array_index=NULL   _array_parent_id=NULL

-- entries; _array_index holds the dictionary key (as text)
_id=201   _id_structure=struct_PhoneDir   _id_object=42   _String="+7 999..."   _array_index='home'   _array_parent_id=200
_id=202   _id_structure=struct_PhoneDir   _id_object=42   _String="+7 495..."   _array_index='work'   _array_parent_id=200

_array_index is text precisely so dictionaries with string keys work without a separate table. Numeric dictionaries store keys as their string representation.

Nested class

HomeAddress.City = "Moscow" works the same way. The _structures rows for Address.City, Address.Street, etc. carry an _id_parent pointing at the parent structure (HomeAddress). The _values rows for those leaves carry an _array_parent_id pointing at the marker row for HomeAddress on this particular object.

Three unique indexes keep all of this consistent

CREATE UNIQUE INDEX UIX__values__structure_object
    ON _values (_id_structure, _id_object)
    WHERE _array_index IS NULL AND _array_parent_id IS NULL;

CREATE UNIQUE INDEX UIX__values__structure_object_parent
    ON _values (_id_structure, _id_object, _array_parent_id)
    WHERE _array_index IS NULL AND _array_parent_id IS NOT NULL;

CREATE UNIQUE INDEX UIX__values__structure_object_array_index
    ON _values (_id_structure, _id_object, _array_parent_id, _array_index)
    WHERE _array_index IS NOT NULL;

These three together guarantee: (a) at most one scalar/marker row per (structure, object), (b) at most one nested-class marker per (structure, object, parent), and (c) at most one element per (structure, object, parent, index). Try to insert a duplicate array element and the DB rejects it before any logic bug can corrupt the shape.

Layer 5 โ€” permissions: _permissions

CREATE TABLE _permissions (
    _id      bigint NOT NULL,
    _id_role bigint NULL,   -- XOR with _id_user (CHECK constraint)
    _id_user bigint NULL,
    _id_ref  bigint NOT NULL,  -- 0 = global, otherwise FK โ†’ _objects
    _select  boolean NULL,
    _insert  boolean NULL,
    _update  boolean NULL,
    _delete  boolean NULL
);

Permissions inherit along the object tree. A recursive CTE walks upwards from the target object up to 50 levels looking for the nearest ancestor that has an explicit permission row. _id_ref = 0 is the global fallback ("can this principal touch anything at all?"). Resolution priority is: user > role, specific object > global.

There's an automatic trigger that, when a child object is created without its own permission row, copies down the resolved permission from the nearest ancestor. The point isn't to materialize every permission โ€” it's to keep the recursive CTE short. After a few months of activity the depth the resolver has to climb stays roughly constant.

_scheme_metadata_cache โ€” why a cache table

Every query needs to know: "for an object with _id_scheme = X, which _structures rows exist, and what's the type of each?" That's a JOIN through _structures โ†’ _types that would otherwise fire on every single read.

So that JOIN is denormalized into a separate table that gets refreshed when the scheme changes:

CREATE TABLE _scheme_metadata_cache (
    _scheme_id           bigint NOT NULL,
    _structure_id        bigint NOT NULL,
    _parent_structure_id bigint,
    _name                text NOT NULL,
    type_name            text NOT NULL,   -- 'Long', 'String', 'Guid', ...
    db_type              text NOT NULL,   -- 'Long', 'String', 'Guid', ...
    type_semantic        text NOT NULL,   -- 'Object', '_RObject', 'Array', ...
    _collection_type     bigint,
    collection_type_name text,
    _key_type            bigint,
    key_type_name        text
    -- ... all the other _structures attributes inlined
);

A trigger on _schemes._structure_hash invalidates the cache for that scheme; on the next read, sync_metadata_cache_for_scheme(scheme_id) rebuilds it lazily. The big build_hierarchical_properties_optimized() function โ€” the one that materializes an object's full property tree into JSON โ€” never JOINs _structures or _types directly. It reads from this cache, and only from this cache.

Two SQL queries: dump any object flat

Here are two queries that show exactly what's in the box for a given object. The first uses raw JOINs (use this for ad-hoc debugging in DataGrip/SSMS). The second uses _scheme_metadata_cache โ€” what the engine actually runs at scale.

Query 1 โ€” flat pivot, no cache

-- PostgreSQL
SELECT
    s._name                                          AS scheme_name,
    st._name                                         AS field_name,
    t._name                                          AS type_name,
    CASE t._db_type
        WHEN 'String'         THEN v._String
        WHEN 'Long'           THEN v._Long::text
        WHEN 'Guid'           THEN v._Guid::text
        WHEN 'Double'         THEN v._Double::text
        WHEN 'Boolean'        THEN v._Boolean::text
        WHEN 'DateTimeOffset' THEN v._DateTimeOffset::text
        WHEN 'Numeric'        THEN v._Numeric::text
        WHEN 'ListItem'       THEN v._ListItem::text
        WHEN 'Object'         THEN v._Object::text
        WHEN 'ByteArray'      THEN encode(v._ByteArray, 'base64')
        ELSE                       NULL
    END                                              AS value_text,
    CASE
        WHEN v._array_parent_id IS NULL
         AND v._array_index     IS NULL
         AND t._name IN ('Array','Dictionary','Class')  THEN 'collection_marker'
        WHEN v._array_parent_id IS NULL
         AND v._array_index     IS NULL                 THEN 'scalar'
        WHEN v._array_parent_id IS NOT NULL             THEN 'element[' || v._array_index || ']'
        ELSE                                                 'scalar'
    END                                              AS slot,
    v._array_index,
    v._array_parent_id
FROM _values      v
JOIN _structures  st ON st._id = v._id_structure
JOIN _schemes     s  ON s._id  = st._id_scheme
JOIN _types       t  ON t._id  = st._id_type
WHERE v._id_object = :object_id
ORDER BY st._order NULLS LAST, v._array_index NULLS FIRST;
-- MS SQL Server
SELECT
    s._name                                          AS scheme_name,
    st._name                                         AS field_name,
    t._name                                          AS type_name,
    CASE t._db_type
        WHEN 'String'         THEN v._String
        WHEN 'Long'           THEN CAST(v._Long AS nvarchar(MAX))
        WHEN 'Guid'           THEN CAST(v._Guid AS nvarchar(MAX))
        WHEN 'Double'         THEN CAST(v._Double AS nvarchar(MAX))
        WHEN 'Boolean'        THEN CASE v._Boolean WHEN 1 THEN 'true' WHEN 0 THEN 'false' END
        WHEN 'DateTimeOffset' THEN CAST(v._DateTimeOffset AS nvarchar(MAX))
        WHEN 'Numeric'        THEN CAST(v._Numeric AS nvarchar(MAX))
        WHEN 'ListItem'       THEN CAST(v._ListItem AS nvarchar(MAX))
        WHEN 'Object'         THEN CAST(v._Object AS nvarchar(MAX))
        WHEN 'ByteArray'      THEN N'<binary, base64 in app code>'
        ELSE                       NULL
    END                                              AS value_text,
    CASE
        WHEN v._array_parent_id IS NULL AND v._array_index IS NULL THEN 'scalar'
        WHEN v._array_parent_id IS NOT NULL                        THEN 'element[' + ISNULL(v._array_index,'') + ']'
        ELSE                                                            'scalar'
    END                                              AS slot,
    v._array_index,
    v._array_parent_id
FROM [dbo].[_values]      v
JOIN [dbo].[_structures]  st ON st._id = v._id_structure
JOIN [dbo].[_schemes]     s  ON s._id  = st._id_scheme
JOIN [dbo].[_types]       t  ON t._id  = st._id_type
WHERE v._id_object = @object_id
ORDER BY st._order, v._array_index;

This is a diagnostic query โ€” paste it into psql/DataGrip/SSMS, plug in any object ID, and you see exactly what's stored: which fields, which slot (scalar / collection marker / array element), which type. The marker rows for arrays show up too, which is exactly what you want when you're hunting down a null vs [] regression.

Query 2 โ€” same result via _scheme_metadata_cache

-- PostgreSQL (via _scheme_metadata_cache)
SELECT
    c._scheme_id                                      AS scheme_id,
    c._name                                           AS field_name,
    c.type_name                                       AS type_name,
    c.db_type                                         AS db_type,
    c.collection_type_name                            AS collection_type,
    CASE c.db_type
        WHEN 'String'         THEN v._String
        WHEN 'Long'           THEN v._Long::text
        WHEN 'Guid'           THEN v._Guid::text
        WHEN 'Double'         THEN v._Double::text
        WHEN 'Boolean'        THEN v._Boolean::text
        WHEN 'DateTimeOffset' THEN v._DateTimeOffset::text
        WHEN 'Numeric'        THEN v._Numeric::text
        WHEN 'ListItem'       THEN v._ListItem::text
        WHEN 'Object'         THEN v._Object::text
        WHEN 'ByteArray'      THEN encode(v._ByteArray, 'base64')
        ELSE                       NULL
    END                                               AS value_text,
    v._array_index,
    v._array_parent_id,
    c._order
FROM _values                v
JOIN _scheme_metadata_cache c ON c._structure_id = v._id_structure
WHERE v._id_object = :object_id
ORDER BY c._order NULLS LAST, v._array_index NULLS FIRST;
-- MS SQL Server (via _scheme_metadata_cache)
SELECT
    c.[_scheme_id]                                    AS scheme_id,
    c.[_name]                                         AS field_name,
    c.[type_name]                                     AS type_name,
    c.[db_type]                                       AS db_type,
    c.[collection_type_name]                          AS collection_type,
    CASE c.[db_type]
        WHEN 'String'         THEN v._String
        WHEN 'Long'           THEN CAST(v._Long AS nvarchar(MAX))
        WHEN 'Guid'           THEN CAST(v._Guid AS nvarchar(MAX))
        WHEN 'Double'         THEN CAST(v._Double AS nvarchar(MAX))
        WHEN 'Boolean'        THEN CASE v._Boolean WHEN 1 THEN 'true' WHEN 0 THEN 'false' END
        WHEN 'DateTimeOffset' THEN CAST(v._DateTimeOffset AS nvarchar(MAX))
        WHEN 'Numeric'        THEN CAST(v._Numeric AS nvarchar(MAX))
        WHEN 'ListItem'       THEN CAST(v._ListItem AS nvarchar(MAX))
        WHEN 'Object'         THEN CAST(v._Object AS nvarchar(MAX))
        WHEN 'ByteArray'      THEN N'<binary>'
        ELSE                       NULL
    END                                               AS value_text,
    v._array_index,
    v._array_parent_id,
    c.[_order]
FROM [dbo].[_values]                v
JOIN [dbo].[_scheme_metadata_cache] c ON c.[_structure_id] = v.[_id_structure]
WHERE v.[_id_object] = @object_id
ORDER BY c.[_order], v.[_array_index];

Why the second one is the production query:

  1. Two heavy JOINs (_structures, _types) gone. The cache row carries everything those joins would have produced.
  2. The cache already has a B-tree on _structure_id and a stable _order for ordering โ€” no extra sorts.
  3. The cache is refreshed only when the scheme changes (_structure_hash flips), not on every read. Steady-state queries pay zero cost for it.
  4. build_hierarchical_properties_optimized() goes one step further: it slurps all _values rows for one object into a _values[] array in a single SELECT, then walks the property tree purely in memory using unnest(). Recursion into nested classes and array elements never touches the table again. For deeply nested object graphs this is a big deal โ€” you can materialize a 40-row, 8-level-deep object with two SELECTs total.

How this connects to the C# API

For context โ€” what those tables look like from a SaveAsync/LoadAsync perspective. The full API tour is in the intro post; here's just the mapping back to the tables we just looked at:

[RedbScheme("Employee")]
public class EmployeeProps
{
    public string  FirstName { get; set; } = "";
    public string  LastName  { get; set; } = "";
    public int     Age       { get; set; }
    public decimal Salary    { get; set; }
    public string[]? Skills  { get; set; }
}

// InitializeAsync scans the assembly โ†’
//   - inserts/updates rows in _schemes + _structures for each [RedbScheme]
//   - refreshes _scheme_metadata_cache on changed schemes
await redb.InitializeAsync(typeof(EmployeeProps).Assembly);

// SaveAsync โ†’
//   - one row into _objects
//   - one row per scalar into _values (using the typed column for the C# type)
//   - for Skills: one marker row + one row per element, linked via _array_parent_id
var employee = new RedbObject<EmployeeProps>
{
    name  = "Alice Johnson",
    Props = new EmployeeProps
    {
        FirstName = "Alice",
        LastName  = "Johnson",
        Age       = 28,
        Salary    = 120_000m,
        Skills    = ["C#", "PostgreSQL", "Redis"]  // โ†’ 1 marker + 3 element rows
    }
};
long id = await redb.SaveAsync(employee);

// LoadAsync โ†’
//   - SELECT _objects WHERE _id = id
//   - SELECT * FROM _values WHERE _id_object = id  (one shot, into a _values[])
//   - build_hierarchical_properties_optimized() materializes the C# graph
var loaded = await redb.LoadAsync<EmployeeProps>(id);

SaveAsync reads _scheme_metadata_cache to know which column to write each value to, mints IDs from the global_identity sequence, and writes _objects + _values. LoadAsync reads _objects first, then one SELECT pulls every _values row for that object into a memory array, and the recursive materializer never goes back to the database for that load.

A few design choices that aren't obvious

Negative constants for system IDs. User-generated keys come from a sequence starting at 1_000_000. System types, schemes, users live near long.MinValue. The two ranges can never collide. This means _types._id = -10 for the @@__deleted scheme isn't a special-case in any query โ€” it's just a perfectly normal FK that happens to be negative.

_structure_hash on _schemes. Without it, every SyncSchemeAsync<T> call would have to re-read the schema structure and compare row-by-row. With it, comparing one UUID tells you whether anything changed. The cache-invalidation trigger fires only on real changes, so steady-state operation pays nothing.

Marker rows. The trickiest design choice in _values. There's no separate "collections" table โ€” instead, a row with _array_index = NULL AND _array_parent_id = NULL and a collection-typed _id_structure is the marker, and child rows fan out from it via _array_parent_id. This is what makes null vs [] distinguishable, lets dictionaries with string keys work without a side table, and keeps nested-class hierarchies in the same physical structure as flat fields.

_Numeric NUMERIC(38, 18). Deliberate. double for money quietly loses pennies; nobody wants a 0.0000000001-off invoice total in production. The 38/18 precision/scale is more than enough for currency, percentage, weight, even small molar quantities. The cost is storage size (16 bytes vs 8), which on the kind of property volume REDB sees is noise.

Postgres vs MSSQL โ€” the cascade-delete story. On Postgres, _values has ON DELETE CASCADE on its FK to _structures. On MSSQL the same constraint would create multiple cascade paths and SQL Server refuses to compile that. The workaround: an INSTEAD OF DELETE trigger on _structures that manually cascades into _values. Same observable behavior, different machinery. There are a handful of places in the codebase where the dialect abstraction exists specifically to paper over this kind of thing.

Series links

  • Intro post โ€” An EF Core alternative for .NET apps with complex object graphs (published)
  • Part 1 (this post) โ€” the 13 tables, RTTI vs EAV, _values, collection storage, _scheme_metadata_cache, two diagnostic queries
  • Part 2 โ€” code-first schemes: SyncSchemeAsync<T>, _structure_hash, automatic onboarding
  • Part 3 โ€” CRUD internals: SaveAsync/LoadAsync, TreeDiff change tracking, COPY-protocol bulk insert
  • Part 4 โ€” LINQ โ†’ SQL: pivot CTEs, dialect splits, the OfficeLocations["HQ"].City walkthrough
  • Part 5 โ€” trees: LoadTreeAsync, WhereHasAncestor, the closure-table vs recursive-CTE story
  • Part 6 โ€” window functions: Win.RowNumber(), Win.Rank(), PartitionBy/OrderBy over REDB objects

Where to look

Questions/critique very welcome in the comments โ€” especially if you've built something similar and have war stories about indexing strategies on the values table, that's exactly the discussion I want to have.

Read the whole story
alvinashcraft
1 minute ago
reply
Pennsylvania, USA
Share this story
Delete
Next Page of Stories