
LangChain v1 is now generally available!


Today LangChain v1 officially launches, marking a new era for the popular AI agent library. The new version ushers in a more opinionated, streamlined, and extensible foundation for building agentic LLM applications. In this post we'll break down what's new, what changed, and what “general availability” means in practice.

Join Microsoft Developer Advocates Marlene Mhangami and Yohan Lasorsa to see live demos of the new API and find out more about what JavaScript and Python developers need to know about v1. Register for this event here.

 

Why v1? The Motivation Behind the Redesign

The number of abstractions in LangChain had grown over the years to include chains, agents, tools, wrappers, prompt helpers, and more, which, while powerful, introduced complexity and fragmentation. As model APIs evolve (multimodal inputs, richer structured output, tool-calling semantics), LangChain needed a cleaner, more consistent core to ensure production-ready stability.

In v1:

  • All existing chains and agent abstractions in the old LangChain are deprecated; they are replaced by a single high-level agent abstraction built on LangGraph internals. 
  • LangGraph becomes the foundational runtime for durable, stateful, orchestrated execution. LangChain now emphasizes being the “fast path to agents” that doesn’t hide but builds upon LangGraph.
  • The internal message format has been upgraded to support standard content blocks (e.g. text, reasoning, citations, tool calls) across model providers, decoupling “content” from raw strings. 
  • Namespace cleanup: the langchain package now focuses tightly on core abstractions (agents, models, messages, tools), while legacy patterns are moved into langchain-classic (or equivalents).

What’s New & Noteworthy for Developers

Here are key changes developers should pay attention to:

1. create_agent becomes the default API

The create_agent function is now the idiomatic way to spin up agents in v1. It replaces older constructs (e.g. create_react_agent) with a clearer, more modular API that is middleware-centric. You can now compose middleware around model calls, tool calls, before/after hooks, error handling, etc. 
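
To make that concrete, here is a minimal sketch, assuming the v1 import paths and an OpenAI-style model identifier (the get_weather tool is a made-up example):

    from langchain.agents import create_agent
    from langchain.tools import tool

    @tool
    def get_weather(city: str) -> str:
        """Return a (stubbed) weather report for a city."""
        return f"It is always sunny in {city}."

    # create_agent composes the model, tools, and any middleware into a
    # runnable agent backed by the LangGraph runtime.
    agent = create_agent(
        model="openai:gpt-4o",
        tools=[get_weather],
        system_prompt="You are a concise weather assistant.",
    )

    result = agent.invoke(
        {"messages": [{"role": "user", "content": "Weather in Scranton?"}]}
    )
    print(result["messages"][-1].content)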

2. Standard content blocks & normalized message model

Responses from models are no longer opaque strings. Instead, they carry structured content_blocks which classify parts of the output (e.g. “text”, “reasoning”, “citation”, “tool_call”). If needed for backward compatibility or client serialization, you can opt in to serializing those blocks back into the .content field by setting output_version="v1". 
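
As a rough sketch of what this looks like in Python (the block field names and where output_version is set are assumptions based on the v1 docs):

    from langchain.chat_models import init_chat_model

    model = init_chat_model("openai:gpt-4o")
    response = model.invoke("Summarize the LangChain v1 release.")

    # content_blocks normalizes provider-specific output into typed parts.
    for block in response.content_blocks:
        if block["type"] == "text":
            print("text:", block["text"])
        elif block["type"] == "reasoning":
            print("reasoning:", block.get("reasoning"))

    # Opt in to serializing blocks back into .content for older clients:
    compat_model = init_chat_model("openai:gpt-4o", output_version="v1")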

3. Multimodal and richer model inputs / outputs

LangChain now supports more than just text-based interactions. Models can accept and return files, images, video, etc., and the message format reflects this flexibility. This upgrade prepares us well for the next generation of models with mixed modalities (vision, audio, etc.).
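
For instance, a single message can mix text and an image (the exact content-block schema varies by provider and version, so treat these field names as assumptions):

    from langchain_core.messages import HumanMessage

    # One message carrying both a text block and an image block; a v1
    # chat model can then be invoked with a list of such messages.
    message = HumanMessage(
        content=[
            {"type": "text", "text": "What is in this photo?"},
            {"type": "image", "url": "https://example.com/photo.jpg"},
        ]
    )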

4. Middleware hooks, runtime context, and finer control

Because create_agent is designed as a pluggable pipeline, developers can now inject logic before and after model calls and tool calls, and add error recovery, fallback strategies, request transformations, and more. New middleware such as retry, fallback, call limits, and context editing have been added.
The notion of a runtime and context object accompanies each agent execution, making it easier to carry state or metadata through the pipeline.
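
Here is a sketch of a custom middleware using the class-based hook style from the v1 docs (the import path and hook signature are assumptions, so check the release notes for the exact shapes):

    from langchain.agents import create_agent
    from langchain.agents.middleware import AgentMiddleware

    class LoggingMiddleware(AgentMiddleware):
        """Log the conversation size before every model call."""

        def before_model(self, state, runtime):
            # state carries the message history; runtime carries context.
            print(f"Calling model with {len(state['messages'])} messages")

    agent = create_agent(
        model="openai:gpt-4o",
        tools=[],
        middleware=[LoggingMiddleware()],
    )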

5. Simplified, leaner namespace & migration path

Many formerly top-level modules or helper classes have been removed or relocated to langchain-classic (or similarly labeled “legacy” packages) to declutter the main API surface. A migration guide is available to help projects transition from v0 to v1. While v1 is now the main line, v0 remains documented and maintained for compatibility.

What “General Availability” Means (and Doesn’t)

  • v1 is production-ready after months of alpha testing.
  • The stable v0 release line remains supported for those unwilling or unable to migrate immediately. 
  • Breaking changes in public APIs will be accompanied by version bumps (i.e. minor version increments) and deprecation notices.
  • The roadmap anticipates minor versions every 2–3 months (with patch releases more frequently). 
  • Because the field of LLM applications is evolving rapidly, the team expects continued iteration in v1—even in GA—with users encouraged to surface feedback, file issues, and follow the migration path. (This is in line with the philosophy stated in the docs.)

Developer Callouts & Suggested Steps

Here are practical tips to get developers on board:

  1. Try the new API now!
    The Azure AI and Azure OpenAI integrations have been migrated to LangChain v1 and are ready to test! Try out our getting started sample here. Learn more about using LangChain and Azure AI:
    1. Python: https://docs.langchain.com/oss/python/integrations/providers/azure_ai
    2. JavaScript: https://docs.langchain.com/oss/javascript/integrations/providers/microsoft
  2. Join us for a Live Stream on Wednesday 22 October 2025
    Join Microsoft Developer Advocates Marlene Mhangami and Yohan Lasorsa for a livestream this Wednesday to see live demos and find out more about what JavaScript and Python developers need to know about v1. Register for this event here.

Episode 542: Yuriy Shyyan on owning your own Cloud.


Matt Ray talks to Yuriy Shyyan, the Director of Cloud Systems Architecture for OpenMetal. They discuss the cost of running your enterprise in the public cloud, high school hacking, building a business on OpenStack, and recognizing that cloud credits are an invitation to purchase a timeshare.


Special Guest: Yuriy Shyyan.


Download audio: https://aphid.fireside.fm/d/1437767933/9b74150b-3553-49dc-8332-f89bbbba9f92/8c1f4981-51c4-44a0-a49f-3a53786d9a34.mp3

Podcast: Building Composable Teams: Moving Beyond Rigid Organizational Structures


In this podcast, Shane Hastie, Lead Editor for Culture & Methods, spoke to Luv Kapur about building composable team structures, creating trust through transparency and clarity, and enabling fluid organizational design through API-first team principles.

By Luv Kapur

Why rent a cloud when you can build one?

Andrei Kvapil, founder of Ænix and core developer of Cozystack, joins Ryan to dive into what it takes to build a cloud from scratch, the intricacies of Kubernetes and virtualization, and how open-source has made digital sovereignty possible.

9 open-source projects the GitHub Copilot and Visual Studio Code teams are sponsoring—and why they matter


The rise of the Model Context Protocol (MCP) has transformed how AI agents interact with tools, codebases, and even browsers. The GitHub Copilot and VS Code teams, in partnership with the Microsoft Open Source Program Office (OSPO), are now sponsoring a wave of open-source MCP projects that push the boundaries of developer experience, agent autonomy, multi-modal capabilities, and community.

These nine projects—ranging from browser extensions to semantic code editors—are not just experiments; they’re the scaffolding for a new generation of AI-native workflows. Let’s dive into some of the exciting MCP-powered innovations on GitHub, based on community engagement.

From semantic code editing to new dev tools

1. Upstash/context7: Up-to-date documentation for LLMs and AI code editors 

Upstash’s Context7 MCP server lets developers easily pull up-to-date, version-specific documentation and code examples straight from the source. From there, the MCP server includes them directly in your prompt, giving your LLMs and AI applications easily understandable context.

2. Tadata/fastapi_mcp: FastAPI, meet MCP 

FastAPI-MCP turns your FastAPI endpoints into MCP tools with minimal configuration, including authentication support. The MCP preserves schemas, documentation, and authentication logic, creating a seamless way to expose APIs to AI agents.
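
Usage is roughly this simple, going by the project's README (treat this as an illustrative sketch rather than the authoritative API):

    from fastapi import FastAPI
    from fastapi_mcp import FastApiMCP

    app = FastAPI()

    @app.get("/items/{item_id}")
    async def read_item(item_id: int):
        # An ordinary endpoint that becomes an MCP tool.
        return {"item_id": item_id}

    # Wrap the app and mount an MCP server onto it; existing schemas,
    # docs, and auth dependencies carry over.
    mcp = FastApiMCP(app)
    mcp.mount()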

3. Oraios/serena: Semantic code editing for agents 

Serena is a fully featured coding agent toolkit that integrates language servers with MCP. It provides semantic tools for code retrieval, editing, and shell execution. This makes coding agents smarter and more efficient, and can even turn a vanilla LLM into a true IDE assistant.

4. Czlonkowski/n8n-mcp: Let AI agents build n8n workflows for you 

This project brings n8n’s powerful automation engine into the MCP ecosystem. By providing comprehensive access to n8n node documentation, validation tools, and direct n8n instance access, it lets agents trigger, monitor, and manipulate workflows programmatically. Though details are sparse, early adopters are already integrating it with GitHub Actions, Discord bots, and data pipelines.

5. Justinpbarnett/unity-mcp: AI agents in game dev 

 Unity-MCP exposes Unity’s game engine APIs to MCP clients. Agents can inspect and modify game objects, scenes, and prefabs. It’s a bold step toward AI-assisted game development, with potential for debugging, level design, and UI generation.

6. Antfu/nuxt-mcp: Nuxt dev tools

Created by ecosystem veteran Anthony Fu, Nuxt-MCP lets agents interact with Nuxt apps via MCP. It supports route inspection, component analysis, and SSR debugging. If you’re building with Nuxt and want AI-native tooling, this is your launchpad.

7. MCPJam/inspector: MCP server testing and evals

The MCPJam inspector is an open-source testing and debugging tool for MCP servers – think Postman, but for MCP. It can test your MCP server’s tools, resources, prompts, and authentication, and it also has an LLM playground for testing your MCP server against different models. Bonus: MCPJam has a CLI tool for MCP evaluation.

8. Steipete/Peekaboo: Swift code analysis via MCP 

Peekaboo brings Swift codebases into the MCP fold. It uses language servers to expose symbol-level tools for agents, enabling code navigation, editing, and refactoring. Built by Peter Steinberger, it’s a must-have for iOS developers.

9. Instavm/coderunner: Run code safely and locally 

Coderunner is a sandboxed MCP server for executing code snippets. It supports multiple languages and isolates execution for safety. Agents can test hypotheses, run scripts, and validate outputs—all without leaving the IDE. 

Why GitHub and VS Code are sponsoring these projects 

These projects aren’t just cool—they’re helping accelerate the MCP community and provide tools that developers use and care about. GitHub Copilot and VS Code teams are sponsoring these projects to promote open-source software and open standards like MCP, accelerate agent-native development workflows, and give developers more power to build, debug, and deploy with AI. 

Want to help support these projects? Sign up for GitHub Sponsors today and join us in sponsoring them or other open-source projects you care about.

Dive into the MCP ecosystem and start building the future of AI-native development and explore how MCP x VS Code and GitHub Copilot can increase your productivity and creativity!



Teaching AI Models to Dance


Meta just dropped a paper that solves a problem we all know too well: AI models that either answer unsafe questions or refuse to help with perfectly reasonable ones.

Their solution? Train two AI agents to work together.

The results are striking. Unsafe replies drop from 39% to 4.6%. Needless refusals fall from 45.3% to 9.9%. And general capabilities stay intact.

This is WaltzRL, a new approach to AI safety that treats alignment as teamwork instead of a single-player game.

The Problem? Guardrails That Kill Helpfulness

Current safety systems are blunt instruments. They see potential risk and hit the reject button. The entire response gets blocked, even if 95% of it was valid.

This creates two failures: models generate unsafe content when attacked (jailbreaks work), and models refuse harmless requests that merely look risky ("How do I kill a Python process?").

Adding more guardrails makes this worse. When Meta's team added Llama Guard to their baseline model, overrefusal jumped from 25.7% to 29.8%.

If you start with a model that already has low overrefusal, adding guardrails hurts even more. Their single-model RL baseline had 8.6% overrefusal. After adding guardrails: 14.9%. That's a 6.3 percentage point increase.

Traditional guardrails don't solve the safety-helpfulness trade-off. They just move the slider toward "say no more often."

The Solution - Two Agents Dancing Together

WaltzRL uses two specialized models working in tandem.

The conversation agent writes responses to user prompts. It's optimized to be helpful and safe.

The feedback agent reviews those responses. When it spots problems, either unsafe content or unnecessary refusal, it suggests specific fixes.

Here's the key insight: the feedback agent doesn't just flag problems. It explains what to change and why. This rich feedback helps the conversation agent learn faster and correct course without throwing away entire responses.

The system uses one round of feedback per response in its experiments. The conversation agent writes an initial answer. If the feedback agent detects issues, it provides guidance. The conversation agent then writes a revised response incorporating that feedback.

At runtime, feedback only triggers when needed. On general helpfulness queries, the feedback trigger rate is just 6.7%. Even on challenging safety and over-refusal benchmarks, it stays below 50%. This keeps latency manageable.
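
In pseudocode, the inference-time loop looks something like this (a conceptual sketch; the agent objects and method names are hypothetical stand-ins for the paper's setup):

    def respond(prompt, conversation_agent, feedback_agent):
        # The conversation agent drafts an initial answer.
        draft = conversation_agent.generate(prompt)

        # The feedback agent reviews the draft; most of the time no
        # feedback is triggered and we return early with no extra cost.
        triggered, advice = feedback_agent.review(prompt, draft)
        if not triggered:
            return draft

        # One round of revision incorporating the specific feedback.
        return conversation_agent.generate(prompt, feedback=advice)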

How? Reinforcement Learning with a Twist

Both agents train together through reinforcement learning. But they get rewarded differently.

Conversation agent reward: It only gets a positive reward when the response is both safe AND not over-refusing. One without the other doesn't count.

Feedback agent reward: This is where it gets clever. They use Dynamic Improvement Reward (DIR).

The feedback agent gets rewarded based on whether its advice actually improves the conversation agent's following response. If the revised answer is better than the original, the feedback agent gets credit. If the revision makes things worse, it gets penalized.

This creates a positive-sum game. Both agents win when they collaborate well. The feedback agent learns to give advice that the conversation agent can actually use.
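
Sketching the two reward signals in code (a conceptual rendering, not the paper's implementation; the judge scores are hypothetical inputs):

    def conversation_reward(is_safe: bool, is_overrefusal: bool) -> float:
        # Positive only when the response is both safe AND not
        # over-refusing; one without the other earns nothing.
        return 1.0 if (is_safe and not is_overrefusal) else 0.0

    def dynamic_improvement_reward(original: float, revised: float) -> float:
        # The feedback agent is credited only when its advice actually
        # improved the conversation agent's revised response.
        if revised > original:
            return 1.0
        if revised < original:
            return -1.0
        return 0.0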

The training happens in two stages.

Stage 1: Freeze the conversation agent. Train only the feedback agent to recognize problems and provide helpful feedback. This builds accurate detection before moving forward.

Stage 2: Train both agents together. The feedback agent's label reward is disabled, but the improvement reward stays active. This prevents overfitting to imbalanced data while maintaining accuracy.
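
Put together, the schedule looks roughly like this (hypothetical helper names; a single RL update is abstracted as rl_step):

    def train_waltzrl(conversation_agent, feedback_agent, data, rl_step):
        # Stage 1: the conversation agent is frozen; only the feedback
        # agent trains, with a label reward for accurate detection.
        for batch in data.stage1_batches():
            rl_step(feedback_agent, batch, reward="label")

        # Stage 2: both agents train jointly; the label reward is
        # disabled but the Dynamic Improvement Reward stays active.
        for batch in data.stage2_batches():
            rl_step(conversation_agent, batch, reward="safe_and_helpful")
            rl_step(feedback_agent, batch, reward="dynamic_improvement")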

Why This Works Better Than Alternatives

The researchers tested several baselines.

Inference-time collaboration (no training): Just using two instances of the base model already reduced both unsafe responses and overrefusals. But feedback was triggered excessively, and the results weren't optimal.

Oracle feedback: They gave the system perfect labels converted into template feedback, such as "avoid unsafe content" or "don't overrefuse." This helped with safety (26.5% to 7.0% ASR) but barely dented overrefusal (25.7% to 16.6% ORR).

Why? Asking a model to be less cautious requires convincing it with reasoning, not just instructions. Detailed feedback matters, especially for reducing overrefusal.

Single-model RL: Training just the conversation agent improved safety but not as much as WaltzRL. And it didn't benefit from adaptive feedback at inference time.

WaltzRL beats all of them on both metrics simultaneously.

The Numbers That Matter

Tested across five diverse datasets.

Safety under attack

- WildJailbreak: 39.0% to 4.6% ASR

- FORTRESS harmful: 40.4% to 6.2% ASR

- StrongREJECT: 0.0% to 0.3% ASR

Overrefusal on benign prompts

- OR-Bench: 45.3% to 9.9% ORR

- FORTRESS benign: 6.0% to 5.4% ORR

General capabilities: Minimal degradation across AlpacaEval, IFEval, GPQA, MMLU, and TruthfulQA, even though they used zero helpfulness prompts during training.

That last part is essential. WaltzRL trains only on adversarial attacks and borderline overrefusal cases. No general helpfulness data. Yet instruction-following and knowledge stay intact.

What Makes This Different From Debate

AI safety through debate involves agents competing in zero-sum games. One agent attacks, one defends. A higher reward for one means a lower reward for the other.

WaltzRL is collaborative. Both agents pursue the same goal: safe, non-overrefusing responses. It's positive-sum, not zero-sum.

And unlike debate approaches that train multiple agents but deploy only one, WaltzRL deploys both agents together at inference time. An attacker has to jailbreak both agents to succeed.

The Emergent Behavior

Something interesting emerged during training: the feedback agent started directly quoting ideal responses.

Instead of just saying "make it safer," it would generate an outline or even complete sentences that the conversation agent should use. The conversation agent learned to follow this guidance.

This wasn't explicitly programmed. It emerged from the Dynamic Improvement Reward. The feedback agent discovered that specific, concrete suggestions work better than vague instructions.

What This Means

WaltzRL pushes forward the Pareto frontier between safety and helpfulness. You can have both.

The key insight is treating alignment as collaboration, not control. Two specialized models working together outperform one model trying to do everything.

Traditional guardrails are gatekeepers. They say yes or no to entire responses.

WaltzRL is an editor. It looks at what you wrote and suggests improvements.

That difference, between blocking and refining, unlocks better results on both safety and helpfulness.

The paper is open research from Meta. All experiments use Llama 3.1-8B-Instruct as the base model for both agents.

Future work could explore training generalist feedback agents that work off-the-shelf with different conversation models. Or expanding beyond one round of feedback to multi-turn refinement.

For now, WaltzRL shows a clear path forward: if you want AI systems that are both safe and helpful, teach two agents to dance together instead of making one agent walk a tightrope alone.

Paper: The Alignment Waltz: Jointly Training Agents to Collaborate for Safety (arxiv.org/abs/2510.08240)

Authors: Jingyu Zhang, Hongyuan Zhan, and team at Meta Superintelligence Labs
