Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.
154993 stories
·
33 followers

1.0.59


2026-06-02

  • Add the /voice command to dictate prompts using local speech-to-text models

Build smarter document workflows: What’s new in Azure Content Understanding at Build 2026


Azure Content Understanding in Foundry Tools is Microsoft’s comprehensive content AI service. It ingests diverse data types — documents, audio, images, and video — and extracts the most critical information to power well-grounded, reliable generative AI and agentic solutions. Azure Content Understanding brings together Azure Document Intelligence’s proven traditional AI with advanced LLM-based content reasoning, enabling both structured and unstructured content extraction, as well as multimodal understanding to address your full spectrum of processing needs.

Accelerating customer momentum

Leading organizations are already using Content Understanding to move from unstructured content to production-scale automation.

DataSnipper is bringing AI-powered document analysis directly into Excel workflows

DataSnipper is embedding Content Understanding into everyday financial and audit workflows, allowing professionals to work directly with structured data derived from unstructured documents. As Vidya Peters, CEO of DataSnipper, shares, “By building with Azure Content Understanding, DataSnipper is turning unstructured documents into structured, actionable data, directly inside Excel. Together, we are enabling faster reviews, reliable evidence, and AI you can trust.”

FinHero is advancing from traditional document processing to LLM-powered understanding

FinHero is evolving from traditional document processing approaches with Azure Document Intelligence to more advanced, LLM-powered contextual reasoning using Content Understanding. By leveraging structured outputs across more complex document types and workflows, they are expanding automation beyond basic extraction into richer, end-to-end processing scenarios that support analytics and agent-driven applications.

Wolters Kluwer is automating tax workflows at scale with CU

Wolters Kluwer, for example, is applying CU across tax and financial workflows to provide measurable business outcomes. Adam Orentlicher, SVP CTO at Wolters Kluwer, noted “By integrating Content Understanding into our solutions, our customers turn complex, unstructured data into actionable insights—faster and more accurately. The result is streamlined workflows, less manual effort, and clear, measurable business value from AI.”

The signal from enterprise customers is clear: Azure Content Understanding is how enterprises operationalize unstructured content—at scale, across modalities, and in production.

Azure Content Understanding is advancing across the full developer workflow—from higher-quality extraction with GPT 5.2, to a more unified experience in Microsoft Foundry, to broader native file support and new integrations for agent and Markdown workflows. With SDKs for Python, Java, .NET, JavaScript, and TypeScript that are now generally available, these capabilities are ready to put into practice today across automation, RAG, and document processing scenarios. We’re also sharing an early look at what’s next in July, including new capabilities enabled by the next Content Understanding API version.

Improve extraction quality with the GPT 5 model family

Analyzers in Content Understanding are powered by LLM and embedding models you deploy in Microsoft Foundry. At Build, we’re expanding LLM support to include the latest GPT 5 model family (GPT 5.x), starting with GPT 5.2 (available now). With GPT 5.2, custom field extraction is enhanced, avoiding the need for prompt engineering gymnastics. Whether it’s mixed layouts, domain-specific language, or multiple languages, extraction is more accurate right out of the box. Existing analyzers built on GPT 4.1 continue to run unchanged.

Try it out

The upgrade is a two-step path you can typically complete in under 5 minutes:

  1. Deploy GPT 5.2 in Microsoft Foundry. In the Foundry portal, open your Foundry resource and go to Deployments → Deploy model → Deploy base model. Search for “gpt-5.2”, click Confirm, and click Deploy. (Learn more about this in the models and deployments guide.)
  2. Create new custom analyzers with your new deployment in the Content Understanding Studio. Click the Create custom project button. Enter a name for your custom project, and open the Advanced settings panel. In the Model for analysis dropdown menu, select the name of your GPT 5 model deployment. (Custom analyzers are not available in Microsoft Foundry.)

[Image: GPT 5.2 in Customization]

As always, we recommend running side-by-side against your existing eval set before flipping production traffic, as confidence scores, latency, and output accuracy can all shift with a new model.
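A side-by-side comparison can be as simple as scoring each analyzer's extracted fields against the same expected values. The following is a minimal sketch under stated assumptions: the field names, the sample values, and the `field_accuracy` helper are all hypothetical, and the two output lists stand in for results you would collect from your existing analyzer and the new GPT 5.2-backed one.

```python
from typing import Dict, List

Fields = Dict[str, str]


def field_accuracy(predictions: List[Fields], expected: List[Fields]) -> float:
    """Return the fraction of expected fields whose predicted value matches exactly."""
    total = 0
    correct = 0
    for pred, gold in zip(predictions, expected):
        for name, value in gold.items():
            total += 1
            if pred.get(name) == value:
                correct += 1
    return correct / total if total else 0.0


# Hypothetical eval set: expected values plus the output of each analyzer.
expected = [
    {"VendorName": "CONTOSO LTD.", "InvoiceDate": "2019-11-15"},
    {"VendorName": "FABRIKAM", "InvoiceDate": "2020-01-02"},
]
baseline_output = [
    {"VendorName": "CONTOSO LTD.", "InvoiceDate": "2019-11-15"},
    {"VendorName": "FABRIKAM INC", "InvoiceDate": "2020-01-02"},
]
candidate_output = [
    {"VendorName": "CONTOSO LTD.", "InvoiceDate": "2019-11-15"},
    {"VendorName": "FABRIKAM", "InvoiceDate": "2020-01-02"},
]

baseline_score = field_accuracy(baseline_output, expected)
candidate_score = field_accuracy(candidate_output, expected)
print(f"baseline={baseline_score:.2f} candidate={candidate_score:.2f}")
```

Exact-match scoring is the strictest choice; dates and amounts may need normalization before comparison, and latency and confidence scores would be tracked separately.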

Build and run Content Understanding directly in Microsoft Foundry

Microsoft Foundry brings all of your AI tools into a single, unified environment for building modern AI applications. We’re excited to announce that Content Understanding is now a first-class citizen in the new Microsoft Foundry portal. Instead of stitching together multiple tools and services, developers can now access Foundry models, prebuilt analyzers, and agentic integrations in one place, reducing the friction from experimentation to production.

With Content Understanding prebuilt analyzers now integrated into Foundry, you can:

  • Work with the latest AI models and services in one environment. Deploy and use advanced GPT models alongside Content Understanding analyzers, and combine them with capabilities like translation, PII detection, and search without switching tools.
  • Move faster from idea to working solution. Foundry provides a streamlined experience for building agentic document-processing workflows, eliminating the need to jump between separate portals or tools.
  • Prototype and validate in real time. The built-in playground experience allows you to upload documents and immediately inspect structured output side by side, reducing iteration time.
  • Deep link to Content Understanding Studio. For building custom analyzers for unique use cases, one click takes you into CU Studio with your project context preserved to get started with schema design and evaluation.

Content Understanding in Foundry New, with step by step click instructions

Try Content Understanding in Microsoft Foundry

  1. Open the Foundry portal and ensure the New Foundry experience is enabled from the top navigation.
  2. Select Build, then open Deployments from the left navigation. Under AI Services, choose Content Understanding (Read and Layout analyzers are also available here).
  3. Choose a category of prebuilt analyzers in the middle pane, then use the dropdown menu to select the analyzer that best matches your document type, such as invoices, tax forms, or general layout extraction.
  4. You can run a sample document or drag and drop your own file to see results instantly. The output appears alongside the document, making it easy to understand what is extracted and how it is structured.

You’re now ready to move to production. Once you are satisfied with the results, select the key icon to retrieve your endpoint and API key, and use the provided code snippets to integrate the analyzer into your application.

[Image: CU Build 2026 invoice demo]

Interested in building a custom analyzer? Click Customize in CU Studio from the resource page.

Learn more: Foundry vs. Content Understanding Studio · Create a Microsoft Foundry resource

Unlock deep understanding across more file types

The shortest path from “I have a document” to “I have structured data” is the one where you don’t have to convert the file first. Azure Content Understanding now ingests a wider set of file types directly, capturing the context of each file without an upstream conversion step.

What’s new

  • Email and message formats — .eml and .msg are now supported.
  • Legacy and open Office formats — .doc, .xls, .ppt, and the OpenDocument family (.odt, .ods, .odp) are now supported directly, without an upstream conversion step.
  • Extract embedded images in Office files — Customers can now extract figures (images, charts, diagrams) embedded in Office documents like .docx, .pptx, and .xlsx. The figureId for each figure is referenced in the Markdown output for the document, and each figure can be retrieved with the following request:

GET /contentunderstanding/analyzerResults/{operationId}/files/figures/{figureId}
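As a rough illustration, the request URL can be assembled from the resource endpoint, the operation ID, and a figure ID. This sketch only builds the URL from the path shown above; the `figure_url` helper and the placeholder IDs are hypothetical, and the query parameters and authentication headers a real call needs are left out because the post does not show them.

```python
from urllib.parse import quote


def figure_url(endpoint: str, operation_id: str, figure_id: str) -> str:
    """Build the GET URL for one extracted figure from the path shown above."""
    base = endpoint.rstrip("/")
    return (
        f"{base}/contentunderstanding/analyzerResults/"
        f"{quote(operation_id, safe='')}/files/figures/{quote(figure_id, safe='')}"
    )


# Placeholder values; real IDs come from the analyze operation and its Markdown output.
url = figure_url("https://my-resource.cognitiveservices.azure.com/", "op-123", "1.1")
print(url)
```

Percent-encoding the two IDs keeps the path valid if an ID ever contains a reserved character.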

Learn more: Supported document formats

Accelerate agentic workflows with Content Understanding

We’re excited to announce that Content Understanding is now integrated with some of the most popular ways developers are building today, including Microsoft Agent Framework, Foundry IQ (Standard mode), LangChain, and MarkItDown. CU meets developers inside their favorite frameworks to make multimodal building easier. With the Microsoft Agent Framework integration, for example, an agent can hand off a PDF or image mid-turn and get back structured fields or layout-aware Markdown without your code needing to orchestrate the call. We’re also bringing CU to open-source tools like MarkItDown, the converter for turning any document into clean Markdown for LLM consumption. By bringing the power of Content Understanding Layout into MarkItDown, developers can generate layout-aware Markdown that preserves key structures like tables, headings, and figure descriptions. CU is also integrated into LangChain, for easily transforming unstructured content into structured Document objects, and into Foundry IQ (Standard mode) for built-in content extraction in Microsoft’s retrieval and agent workflows.

How to use Content Understanding from Microsoft Agent Framework

Register Content Understanding as a tool on your agent. The agent’s planner will call it whenever it needs to read a document:

# pip install agent-framework-azure-contentunderstanding azure-identity

from agent_framework import Agent  # assumed import path for the Agent class used below
from azure.identity import DefaultAzureCredential
from agent_framework_azure_contentunderstanding import (
    ContentUnderstandingContextProvider,
    AnalysisSection,
    ContentLimits,
)

# Minimal setup (uses prebuilt-read analyzer by default)
cu = ContentUnderstandingContextProvider(
    endpoint="https://my-resource.cognitiveservices.azure.com/",
    credential=DefaultAzureCredential(),
)

# Full configuration with a custom analyzer
cu = ContentUnderstandingContextProvider(
    endpoint="https://my-resource.cognitiveservices.azure.com/",
    credential=DefaultAzureCredential(),
    analyzer_id="my-custom-analyzer",
    max_wait=10.0,
    output_sections=[
        AnalysisSection.MARKDOWN,
        AnalysisSection.FIELDS,
        AnalysisSection.FIELD_GROUNDING,
    ],
    content_limits=ContentLimits(max_pages=50, max_file_size_mb=50),
)

# Snippet for use with an agent (run inside an async function);
# llm_client is a chat client you have already configured
async with cu:
    agent = Agent(client=llm_client, context_providers=[cu])
    response = await agent.run(...)

The agent decides when to call analyze_document; you don’t have to. For domain-specific extraction, swap the default analyzer for one of the prebuilt analyzers or your own custom analyzer ID.

Learn more: Microsoft Agent Framework overview · Tool calling patterns

How to use Content Understanding with MarkItDown

Install MarkItDown and configure it to use Content Understanding as the extraction backend. From there, any file you pass through convert() goes through CU and comes out as clean, layout-aware Markdown:

# pip install 'markitdown[az-content-understanding]'

from markitdown import MarkItDown

# Zero-config: auto-selects analyzer per file type
md = MarkItDown(cu_endpoint="<content_understanding_endpoint>")
result = md.convert("report.pdf")   # documents → prebuilt-documentSearch
result = md.convert("meeting.mp4")  # video → prebuilt-videoSearch
result = md.convert("call.wav")     # audio → prebuilt-audioSearch
print(result.markdown)

# Full configuration with a custom analyzer
md = MarkItDown(
    cu_endpoint="<content_understanding_endpoint>",
    cu_analyzer_id="my-invoice-analyzer",
)

result = md.convert("invoice.pdf")
print(result.markdown)

# Output includes YAML front matter with extracted fields:
# ---
# contentType: document
# fields:
#   VendorName: CONTOSO LTD.
#   InvoiceDate: '2019-11-15'
# ---
# <!-- page 1 -->
# ...

The result is Markdown with headings, tables, and figure descriptions inline — exactly the shape downstream chunkers and embedding models prefer.
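To illustrate why heading-structured Markdown suits downstream chunkers, here is a minimal sketch of a heading-based splitter. The `chunk_by_heading` function is hypothetical and not part of MarkItDown or Content Understanding; it also ignores YAML front matter and fenced code, which a production chunker would need to handle.

```python
import re
from typing import List, Tuple


def chunk_by_heading(markdown: str) -> List[Tuple[str, str]]:
    """Split Markdown into (heading, body) pairs at ATX headings (#, ##, ...)."""
    chunks: List[Tuple[str, str]] = []
    heading = ""
    body: List[str] = []
    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            # Close the previous chunk before starting a new one.
            if heading or any(text.strip() for text in body):
                chunks.append((heading, "\n".join(body).strip()))
            heading = match.group(2).strip()
            body = []
        else:
            body.append(line)
    if heading or any(text.strip() for text in body):
        chunks.append((heading, "\n".join(body).strip()))
    return chunks


sample = (
    "# Invoice\n"
    "Vendor: CONTOSO LTD.\n"
    "\n"
    "## Line items\n"
    "| Item | Qty |\n"
    "| --- | --- |\n"
    "| Widget | 2 |\n"
)
chunks = chunk_by_heading(sample)
for heading, body in chunks:
    print(heading, "->", len(body), "chars")
```

Because tables stay inside the chunk of the heading that introduces them, each chunk keeps the context an embedding model needs.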

Learn more: MarkItDown on GitHub · Build a RAG solution with Content Understanding

Everything above is available for you to try out today, but we have even more exciting features coming in July.

Coming in July: even more capabilities for Content Understanding

Here’s a sneak peek at what’s landing in July 2026:

  • Real-time results, with a new Content Understanding synchronous API — Get results instantly without managing asynchronous workflows, making it easy to power responsive, user-facing experiences, like ID verification and live image capture.
  • An agentic understanding mode for complex documents — A new mode that doesn’t shy away from deep reasoning for your most complex documents. We’ll show this end-to-end in our session at Build and ship it for everyone to try in July.
  • Flexible processing, built for global scale with data zone and global zone support — Choose how and where your data is processed to meet performance, scale, and residency needs, while simplifying capacity management across regions.
  • Improved training for custom analyzers — Improve extraction quality using your own examples and domain context, reducing manual review and enabling more reliable automation for real-world workflows.
  • Labeled training data no longer stored in CU for privacy-first training — Keep full control of your data by using your own storage for training inputs, helping you meet compliance requirements without sacrificing model performance.
  • Better grounding and field normalization
  • Even broader support for the GPT-5 family models
  • New prebuilt analyzers and digital-only variants

 

If you’re attending Build 2026, join us at Session BRK242 — “Turn your agents into action” (recorded), where we’ll go deep on the agentic understanding mode demo and the Foundry IQ integration. If you’re not at Build, the recording will be online right after the session, and we’ll publish a follow-up dev blog when the July release ships, including more working code, region availability, and migration guidance for every customer currently on GPT 4.x.

The post Build smarter document workflows: What’s new in Azure Content Understanding at Build 2026 appeared first on Microsoft Foundry Blog.


A harness for every task: dynamic workflows in Claude Code


Microsoft's new MAI models


Microsoft announced two new text LLMs this morning - MAI-Thinking-1 (reasoning, 35B parameters, available to "select early partners") and MAI-Code-1-Flash (5B parameters, "purpose-built for GitHub Copilot and VS Code to deliver high performance and lower cost [...] rolling out to GitHub Copilot individual users in Visual Studio Code"). I've not been able to try either of them just yet.

It's very interesting to see Microsoft releasing models with such low parameter counts, especially given how expensive larger models are to access right now. They claim MAI-Thinking-1 "is preferred to Sonnet 4.6 in our blind human side-by-side evaluations", which is impressive for a 35B model seeing as I frequently run models larger than that on my own laptop.

Tags: llm-release, generative-ai, ai, microsoft, llms


Introducing azure-functions-skills: An AI-Era Workspace for Azure Functions (Preview)


AI coding agents connected to Azure Functions skills

Today we’re announcing azure-functions-skills in public preview: a one-command way to give your favorite coding agent (GitHub Copilot CLI, Claude Code, Codex CLI, VS Code) the skills, agent definition, MCP servers, hooks, and instructions it needs to ship secure-by-default, scale-ready Azure Functions — end-to-end.

AI coding agents now write the first draft of your function, scaffold the infrastructure, and run the deploy command. But ask a general-purpose agent to build for Azure Functions and the output is usually a step behind. It leans on older programming models that have been superseded, and it has no knowledge of newer capabilities: the serverless agents runtime, Flex Consumption defaults, the new Azure MCP template service, the latest binding shapes, this week’s runtime improvements, or Go language support. Worse, the code it produces often leaves hardcoded keys, connection strings, and other secrets sitting in your function for you to clean up later, picks patterns that don’t scale (client-per-invocation, blocking I/O on the hot path), and skips identity-based access entirely. The code compiles, but it isn’t secure, isn’t current, and isn’t using what Azure Functions offers today.

azure-functions-skills closes that gap. The skills steer the agent toward managed identity, Key Vault references, Flex Consumption, and the binding and concurrency patterns that scale — and the built-in doctor catches the rest before deploy.

Try it now: npx @azure/functions-skills install

In about 5 minutes you’ll have a working Functions project scaffolded with managed identity, a deploy-ready workflow, and a doctor HTML report you can wire into CI.

Requirements: Node 18+, an Azure subscription, and one of: GitHub Copilot CLI, Claude Code, Codex CLI, or VS Code.

Availability: azure-functions-skills is in public preview on npm as @azure/functions-skills and on the GitHub Copilot CLI / Claude Code / Codex plugin marketplaces. The skill set is intentionally small at launch and will grow with each Azure Functions release.

What is azure-functions-skills?

azure-functions-skills is a plugin for AI coding agents. It builds on the broader azure-skills plugin for cross-Azure scenarios, and it ships:

  • Skills. Task-focused playbooks the agent loads on demand (setup, create, deploy, diagnostics, best-practices, health-status, inventory, doctor, feedback).
  • An agent definition (functions-copilot) that routes user requests to the right skill and proposes the next workflow when one finishes.
  • MCP server configuration, hooks, and instruction files (copilot-instructions.md, CLAUDE.md, AGENTS.md). Everything the agent needs to behave consistently across hosts.
  • A companion CLI, @azure/functions-skills, that installs all of the above with one command, lets you run the agent (chat), and validates your project before deployment (doctor).

Names you’ll see in this post: @azure/functions-skills — the npm package and CLI you run. azure-functions-skills — the plugin (skills + instructions) the CLI installs. functions-copilot — the agent definition that routes you to the right skill.

Two design choices shape every feature:

  1. Skill discovery is a first-class product surface. Skill names and granularity are tuned so the agent picks the right one at the moment a developer asks for it, and so a developer browsing the catalog can recognize what each skill is for. Where a request belongs to the broader Azure surface, we route into the azure-skills plugin rather than reinvent it.
  2. The agent responds in the language you write in. Ask in Japanese, get Japanese. Ask in English, get English. The instruction files are wired so the host agent honors the conversation language consistently.

What ships in the preview

Skill catalog

The azure-functions-agents skill is included from launch and supports the Azure Functions serverless agents runtime that just launched at Build 2026.

Skill | What it does
azure-functions-setup | Detects Azure CLI / azd / Core Tools / language runtimes / the azure-skills plugin on your machine and walks you through installing what’s missing.
azure-functions-create | Scaffolds new Functions projects, or adds functions to an existing project, using the Azure MCP template service so you always start from the latest templates.
azure-functions-agents 🚀 | Scaffolds, extends, deploys, and troubleshoots event-driven AI agents on the Azure Functions serverless agents runtime (azurefunctions-agents-runtime) that just launched at Build 2026. Picks the best deployable GPT model based on subscription / region quota, wires Microsoft Foundry, Connector Namespaces, and remote MCP servers, and offloads code execution or web browsing to Azure Container Apps dynamic sessions.
azure-functions-deploy | Hands off to the azure-skills prepare → validate → deploy workflow with Functions-specific guidance (Flex Consumption, functionAppConfig, private networking, identity).
azure-functions-best-practices | Reviews an existing Function App against current best practices and proposes prioritized, approval-gated remediations.
azure-functions-diagnostics | Investigates deployment failures, runtime errors, trigger / binding issues, and logging gaps.
azure-functions-health-status | Collects the current running state, metrics, Application Insights signals, Resource Health, and Activity Log.
azure-functions-inventory | Collects static specifications: SKU, runtime, networking, identity, settings, functions, and trigger inventory.
azure-functions-doctor | Pre-deployment validation, used by the doctor CLI command below.
azure-functions-feedback | Turns observations from a session into a previewed GitHub issue or PR against this repo.

The set is intentionally small at launch. It already includes azure-functions-agents so you can scaffold and deploy on the Azure Functions serverless agents runtime that just launched at Build 2026. A skill to assist migrating worker code to Go is next.

Have a skill you’d like to see? Open an issue at https://github.com/Azure/azure-functions-skills/issues, or just run azure-functions-feedback mid-session and the skill itself will prepare the issue draft for you.

The CLI: install, chat, doctor

install: one command for every host

Each AI coding agent has its own plugin install flow, and several of them spread the work across multiple steps. The GitHub Copilot CLI plugin, in particular, can only be installed at user scope. That’s useful for skills, but not what you want for project-specific MCP servers, hooks, or instruction files that should live with your repository.

install collapses all of that into one command and applies the right split by default:

  • Plugin (skills) → user scope. Available to every project on your machine.
  • Workspace artifacts (MCP, agent definition, hooks, CLAUDE.md / AGENTS.md) → the current directory. Committable alongside your code.

This keeps your user-scope agent context clean and makes the Azure Functions skills findable every time you open the workspace. If you want everything in the project, add --local:

# GitHub Copilot CLI (default: plugin user-scope, workspace artifacts here)
npx @azure/functions-skills install --agent ghcp

# Everything in the project
npx @azure/functions-skills install --agent ghcp --local

Use --agent claude for Claude Code or --agent codex for Codex CLI. The CLI also absorbs future plugin-flow changes so the command stays stable for users.

chat: start the agent with the right context

chat launches your installed agent of choice, already wired into the functions-copilot agent definition.

npx @azure/functions-skills chat

A typical first message looks like this:

“Create a Python HTTP trigger that reads from Cosmos DB using managed identity, and add a Service Bus output binding.”

The agent picks the right skills (create, then best-practices), uses the Azure MCP template service for the latest scaffold, and wires identity-based access by default. No keys in your repo.

The first time you run chat in a workspace, the setup skill auto-fires. It walks through prerequisites (Azure CLI, Azure Developer CLI, Core Tools, language runtimes, the azure-skills plugin) and offers to install anything missing, so a developer brand-new to Azure Functions can get to a working environment without bouncing between docs.

After setup, the agent suggests the most useful next skill based on your project state, which makes the rest of the catalog easy to discover.

chat launches the functions-copilot agent; on first run, the setup skill auto-fires to verify prerequisites

Everything after -- is passed through to the underlying agent, so any agent-native flag you rely on still works. Subsequent chat runs skip setup because the per-workspace state lives under .azure-functions-skills/.

VS Code users get the same experience: open the workspace, pick the functions-copilot agent, and run the setup skill from there.

Selecting the functions-copilot custom agent from the GitHub Copilot Chat agent picker in VS Code

doctor: shift-left for the two biggest incident causes

Do you know the top two causes of Azure Functions support incidents reported to our team?

  1. User code defects
  2. Function App misconfiguration

Together, they account for roughly half of the Azure Functions support incidents we see internally — based on our analysis of Customer Reported Incidents (CRIs) in Q1 CY2026, about 53% were related to customer code or configuration issues. Preventing this class of issue before deploy time eliminates a large fraction of the problems customers report.

doctor checks a workspace for exactly those issues. It runs in two tiers:

  • Tier 1 (deterministic, no LLM): host.json shape, runtime version, trigger configuration, extension bundle range, deprecated settings, lockfile presence, tracked .env files, and a set of supply-chain checks (lifecycle scripts, unpinned production dependencies, install-script dependencies, and more) informed by the recent npm / PyPI compromises.
  • Tier 2 (semantic, LLM via --deep): Uses your coding agent to find issues that need to read the code: client-per-invocation patterns, blocking I/O on the hot path, hardcoded secrets, Durable Functions non-determinism (Date.now(), Math.random(), network calls in orchestrators), credential collection patterns, and more.
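To make the idea of a deterministic Tier 1 check concrete, here is a minimal sketch of one such rule: flagging production dependencies in a package.json that are not pinned to an exact version. This is an illustration only, not the doctor implementation; the `unpinned_dependencies` function and the sample manifest are hypothetical.

```python
import json
import re
from typing import Dict, List

# A version counts as pinned only if it is an exact x.y.z (optionally with a pre-release tag).
EXACT_VERSION = re.compile(r"^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$")


def unpinned_dependencies(package_json_text: str) -> List[str]:
    """Return 'name@range' for each production dependency that is not an exact version."""
    manifest: Dict = json.loads(package_json_text)
    deps: Dict[str, str] = manifest.get("dependencies", {})
    return [
        f"{name}@{spec}"
        for name, spec in sorted(deps.items())
        if not EXACT_VERSION.match(spec)
    ]


# Hypothetical manifest: one ranged production dependency, one pinned, one dev-only.
sample = json.dumps({
    "dependencies": {"semver": "^7.0.0", "left-pad": "1.3.0"},
    "devDependencies": {"typescript": "~5.4.0"},
})
findings = unpinned_dependencies(sample)
print(findings)
```

A check like this needs no LLM and gives the same answer on every run, which is what makes this class of rule safe to apply to untrusted pull request content.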

Run it locally and get a self-contained HTML report (the --deep --accept-deep-risk flags opt into Tier 2 LLM checks; safe to run locally, see the CI note below before using in pipelines):

npx @azure/functions-skills doctor --dir . \
  --deep --accept-deep-risk \
  --agent github-copilot \
  --format html --output doctor-report.html

A representative run looks like this:

Tier 1 (deterministic)
  ✓ host.json shape ok
  ✓ runtime version pinned (~4)
  ⚠ extension bundle range too broad   host.json:5
  ⚠ unpinned production dependency      semver:^7.0.0 → pin to 7.5.4
  ✗ tracked .env file with secret keys  .env:3

Tier 2 (semantic, via --deep)
  ⚠ blocking I/O on hot path            app/orders.py:42  (use async client)
  ✗ hardcoded connection string         app/cosmos.py:11  (use Key Vault reference)
  ⚠ client-per-invocation pattern       app/blob.py:18    (hoist client to module scope)

Summary: 2 critical, 4 warnings — see doctor-report.html
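The "hoist client to module scope" hint in the sample run above refers to a common remediation, sketched below. `FakeBlobClient` is a hypothetical stand-in for a real SDK client so the example runs without Azure dependencies; it only counts how many clients get constructed.

```python
class FakeBlobClient:
    """Stand-in for an SDK client; counts how many times one is constructed."""

    instances = 0

    def __init__(self) -> None:
        FakeBlobClient.instances += 1

    def read(self, name: str) -> str:
        return f"contents of {name}"


def handler_per_invocation(name: str) -> str:
    # Anti-pattern: a new client (and its connections) on every call.
    client = FakeBlobClient()
    return client.read(name)


# Preferred: create the client once at module scope and reuse it across invocations.
_shared_client = FakeBlobClient()


def handler_shared(name: str) -> str:
    return _shared_client.read(name)


for i in range(3):
    handler_per_invocation(f"blob-{i}")
per_invocation_clients = FakeBlobClient.instances - 1  # exclude the shared client

before = FakeBlobClient.instances
for i in range(3):
    handler_shared(f"blob-{i}")
shared_clients_created = FakeBlobClient.instances - before
print(per_invocation_clients, shared_clients_created)
```

With a real client, each construction typically means new connections and authentication work, so reuse matters more as invocation volume grows.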

Doctor HTML report showing Tier 1 host.json and dependency findings plus Tier 2 semantic findings from the coding agent

The same command can run in CI. Wire it into your deployment pipeline and you have shift-left for the configuration and code-quality issues that drive the majority of incidents, caught while the developer (or the agent acting for them) can still fix the diff cheaply.

A word on running --deep in CI

--deep runs the coding agent with file-write and shell-execution permissions, so any input the agent sees becomes a potential prompt-injection surface. We default to refusing --deep on pull_request events. You can opt in with AZURE_FUNCTIONS_DOCTOR_TRUST_PR=1 for trusted mirror pipelines.

The recommended pattern:

  • PR validation: --no-deep (Tier 1 only). Fast, deterministic, safe to run on untrusted PR content.
  • Post-merge / release: --deep on push: main, ideally gated behind a GitHub Environment with required reviewers and a scoped secret for the agent token.
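The recommended split can be expressed as a small decision function that a pipeline wrapper might use to choose doctor flags. This is a sketch under stated assumptions: `doctor_flags` is a hypothetical helper, and the event and ref values follow GitHub Actions naming (`pull_request`, `push`, `refs/heads/main`). The flags and the `AZURE_FUNCTIONS_DOCTOR_TRUST_PR` variable are the ones named in this post.

```python
from typing import List, Mapping


def doctor_flags(event_name: str, ref: str, env: Mapping[str, str]) -> List[str]:
    """Choose doctor flags following the recommended PR / post-merge split."""
    deep = ["--deep", "--accept-deep-risk"]
    if event_name == "pull_request":
        # Tier 1 only on PRs unless the pipeline is explicitly marked as trusted.
        trusted = env.get("AZURE_FUNCTIONS_DOCTOR_TRUST_PR") == "1"
        return deep if trusted else ["--no-deep"]
    if event_name == "push" and ref == "refs/heads/main":
        return deep
    # Anything else falls back to the deterministic tier.
    return ["--no-deep"]


print(doctor_flags("pull_request", "refs/pull/7/merge", {}))
print(doctor_flags("push", "refs/heads/main", {}))
```

Defaulting every unrecognized event to Tier 1 keeps the agent away from content nobody has reviewed.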

See docs/doctor-guide.md and SECURITY.md for the full security model.

Where each skill fits

When you want to… | Use
Get your local environment ready for Functions development | azure-functions-setup
Start a new project or add a function | azure-functions-create
Build a scheduled or event-driven AI agent (daily briefing, inbox digest, connector-triggered workflow) | azure-functions-agents
Deploy to Azure | azure-functions-deploy
Catch problems before deployment | doctor CLI (or azure-functions-doctor)
Review an existing app against current best practices | azure-functions-best-practices
Investigate a failing or misbehaving Function App | azure-functions-diagnostics
Check the live health of a running app | azure-functions-health-status
Send us feedback or a feature request | azure-functions-feedback

functions-copilot routes your request to the appropriate skill, and proposes the next step after each workflow.

Getting started

Pick the agent you already use; the rest of the flow is the same.

# 1. Install the plugin (default: skills at user scope, workspace artifacts here)
npx @azure/functions-skills install --agent ghcp     # GitHub Copilot CLI
npx @azure/functions-skills install --agent claude   # Claude Code
npx @azure/functions-skills install --agent codex    # Codex CLI

# 2. Launch the agent (setup skill auto-fires on first run)
npx @azure/functions-skills chat

# 3. Validate before deploy (--deep enables Tier 2 LLM checks; safe locally, see CI note)
npx @azure/functions-skills doctor --deep --accept-deep-risk \
  --agent github-copilot \
  --format html --output doctor-report.html

VS Code: after step 1, open the workspace in VS Code, select the functions-copilot agent in GitHub Copilot Chat, and run the setup skill. Same first-run experience as chat, just inside the IDE.

Prefer the skills scoped to the current project only? Add --local to step 1.

Full docs, CI recipes, and the supply-chain check reference live at https://github.com/Azure/azure-functions-skills.

We want your feedback

azure-functions-skills is open source, MIT licensed, and developed in the open. The repository is the right place to:

  • Ask for skills you wish were there: open an issue, or run azure-functions-feedback mid-session and have the skill prepare the draft for you.
  • Report bugs or suggest improvements. Every issue is read.
  • Contribute a skill or doc. See CONTRIBUTING.md.

Repository: https://github.com/Azure/azure-functions-skills

We’re building the AI-era developer experience for Azure Functions in the open. Star the repo, open an issue, or run azure-functions-feedback mid-session and have the skill draft the issue for you. Tell us what to ship next.

The post Introducing azure-functions-skills: An AI-Era Workspace for Azure Functions (Preview) appeared first on Azure SDK Blog.


Introducing Open-Source Skills for AWS SDK Best Practices


We released a set of AWS SDK Skills as part of the open-source Agent Toolkit for AWS. These are AI skills that teach coding agents how to follow AWS SDK best practices. The project is available on GitHub under the Apache-2.0 license.

The problem

AI coding agents know the general shape of AWS SDK usage, but they get the details wrong. They generate incorrect API names, use incorrect parameter types, and miss SDK-specific patterns like paginators, waiters, and high-level APIs such as the transfer manager for Amazon Simple Storage Service (Amazon S3). These errors are especially common for newer SDKs like the AWS SDK for Swift, where agents generate code that looks plausible but fails to compile.

As developers increasingly rely on AI agents to write AWS SDK code, we need to make sure those agents produce code that compiles, follows best practices, and uses each SDK the way it was intended to be used.

What’s in a skill

Skills are modular packages that give AI coding agents specialized SDK knowledge. Each skill is authored by the SDK team that owns the language, so it reflects the things agents consistently get wrong for that specific SDK. A skill includes:

  • SKILL.md — core instructions with SDK-specific patterns and concrete examples
  • references/ — on-demand documentation for deeper topics, loaded only when needed
  • scripts/ — automation for build, test, and validation workflows

Skills are agent-agnostic. They work with any coding agent that supports the open skills format.

Common mistakes skills help prevent

Code that doesn’t compile. This is the most common failure mode for newer SDKs where the agent’s training data is thin or out of date. The AWS SDK for Swift uses Swift concurrency throughout. Operations are async-throwing, and so are the convenience client constructors. Agents frequently miss this and produce code that looks reasonable but doesn’t build:

// What agents tend to write. Does not compile.
let client = S3Client()
let response = client.listBuckets(input: ListBucketsInput())

Both lines are wrong: the S3Client() convenience initializer is async throws, and so is listBuckets, so neither can be called without try await. With the Swift skill installed, the agent writes the modern Swift concurrency form:

let config = try await S3Client.S3ClientConfig(region: "us-west-2")
let client = S3Client(config: config)
let response = try await client.listBuckets(input: ListBucketsInput())

The first version sends the developer back to the docs to figure out why a plausible-looking line won’t build. The second one runs.

Code that runs but performs poorly or costs more. Agents often skip SDK features that exist precisely to make AWS calls efficient: paginators for ListObjects and similar APIs, waiters for resource-state polling, and the SDK’s high-level file methods like upload_file / download_file for large transfers. A handwritten loop that calls ListObjects without pagination silently drops results past the first page, polling code without waiters burns API calls and risks throttling, and manual file I/O for S3 transfers gives up multipart uploads and parallelism. The code compiles and often appears to work in small tests, but breaks once you’re dealing with real data volumes. With a skill installed, the agent reaches for the right SDK feature for the job: paginators for list operations, waiters for state polling, and the high-level transfer methods for files.
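The pagination failure is easy to reproduce. The sketch below is a minimal Python illustration, assuming boto3-style calls (list_objects_v2, get_paginator, paginate). FakePaginator and FakeS3Client are hypothetical in-memory stand-ins written for this example so it runs without AWS credentials; they are not part of any SDK. A real S3 client returns at most 1,000 keys per ListObjectsV2 call, while the stand-in uses a page size of 2 to make the effect visible.

```python
from typing import Any, Dict, Iterator, List


def list_keys_first_page_only(client: Any, bucket: str) -> List[str]:
    """Anti-pattern: one call returns one page, so later keys are silently lost."""
    response = client.list_objects_v2(Bucket=bucket)
    return [obj["Key"] for obj in response.get("Contents", [])]


def list_keys_paginated(client: Any, bucket: str) -> List[str]:
    """Preferred: let the SDK paginator follow continuation tokens."""
    paginator = client.get_paginator("list_objects_v2")
    keys: List[str] = []
    for page in paginator.paginate(Bucket=bucket):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys


class FakePaginator:
    """Hypothetical stand-in that mimics the shape of a boto3 paginator."""

    def __init__(self, client: "FakeS3Client") -> None:
        self._client = client

    def paginate(self, Bucket: str) -> Iterator[Dict[str, Any]]:
        token = "0"
        while True:
            page = self._client.list_objects_v2(
                Bucket=Bucket, ContinuationToken=token
            )
            yield page
            if not page["IsTruncated"]:
                return
            token = page["NextContinuationToken"]


class FakeS3Client:
    """Hypothetical in-memory stand-in for an S3 client (illustration only)."""

    def __init__(self, keys: List[str], page_size: int = 2) -> None:
        self._keys = keys
        self._page_size = page_size

    def list_objects_v2(
        self, Bucket: str, ContinuationToken: str = "0"
    ) -> Dict[str, Any]:
        # The token is just a stringified offset in this stand-in.
        start = int(ContinuationToken)
        end = start + self._page_size
        page: Dict[str, Any] = {
            "Contents": [{"Key": key} for key in self._keys[start:end]],
            "IsTruncated": end < len(self._keys),
        }
        if page["IsTruncated"]:
            page["NextContinuationToken"] = str(end)
        return page

    def get_paginator(self, operation_name: str) -> FakePaginator:
        return FakePaginator(self)
```

With five keys and a page size of 2, list_keys_first_page_only returns only the first two keys and raises no error, while list_keys_paginated returns all five. Against a real bucket, the same two functions should accept a client created with boto3.client("s3").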

Code that runs but has subtle bugs. Manually marshalling DynamoDB types like {"S": "value"} is easy to get slightly wrong in ways that fail only on certain inputs. Catching a generic Exception instead of typed exceptions like ConditionalCheckFailedException makes retry logic swallow real failures. With a skill installed, the agent reaches for the document client (which handles the conversion correctly) and uses typed exceptions tied to the actual operations it’s calling.
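One way hand-written marshalling can fail only on certain inputs, sketched in Python: bool is a subclass of int, so a marshaller that checks for numbers first encodes True as a number. The helper names naive_marshal and careful_marshal are hypothetical and exist only for this illustration. In real code the SDK's own conversion layer (for example, boto3's TypeSerializer) is the safer choice.

```python
from typing import Any, Dict


def naive_marshal(value: Any) -> Dict[str, Any]:
    """Hand-rolled marshaller with a subtle bug in the order of its checks."""
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, int):
        # Bug: True and False are ints in Python, so booleans land here.
        return {"N": str(value)}
    if isinstance(value, bool):
        return {"BOOL": value}  # never reached
    raise TypeError(f"unsupported type: {type(value).__name__}")


def careful_marshal(value: Any) -> Dict[str, Any]:
    """Same idea with bool checked before int, so booleans keep their type."""
    if isinstance(value, bool):
        return {"BOOL": value}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, int):
        return {"N": str(value)}
    if value is None:
        return {"NULL": True}
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(naive_marshal("value"))   # {'S': 'value'}
print(naive_marshal(True))      # {'N': 'True'}
print(careful_marshal(True))    # {'BOOL': True}
```

Strings and integers pass through naive_marshal correctly, which is why a small test suite can miss the problem. Only boolean attributes expose it.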

Measuring the impact

We evaluate each skill against a benchmark of real SDK tasks (Amazon S3 operations, Amazon DynamoDB queries, client configuration, presigned URL generation, credential management) and grade the generated code on whether it compiles, passes lint, and actually does what the task asked for (judged by an LLM). Every task runs twice: once with no skill installed, and once with the relevant skill loaded.

Across our test suite, code generated with a skill installed consistently passed more checks than code generated without one.

Available skills

The following table summarizes the skills available at launch:

Skill                | SDK                       | What it covers
aws-sdk-swift-usage  | AWS SDK for Swift         | Async patterns, struct-based config types, client initialization
aws-sdk-js-v3-usage  | AWS SDK for JavaScript v3 | Package structure, client styles, middleware, runtime validation
aws-sdk-python-usage | Boto3 / botocore          | Client vs. resource interfaces, paginators, waiters, error handling

Get started

You’ll need a coding agent that supports the open skills format. To install a skill from the Agent Toolkit for AWS, run:

npx skills add aws/agent-toolkit-for-aws/skills --skill <skill>

Replace <skill> with the one you want:

  • aws-sdk-swift-usage
  • aws-sdk-js-v3-usage
  • aws-sdk-python-usage

Or pass --skill multiple times to install more than one.

If your favorite SDK is missing or you’ve seen agents make mistakes that aren’t covered yet, open an issue or submit a skill. Visit the repository on GitHub to try it out.
