
Packaging Expertise: How Claude Skills Turn Judgment Into Artifacts


Think about what happens when you onboard a new employee.

First, you provision them with tools. Email access. Slack. CRM. Office software. Project management software. A development environment. You're connecting a person to the systems they'll need to do their job. This is necessary but not sufficient: nobody becomes effective just because they can log into Salesforce.

Then comes the harder part: teaching them how your organization actually works. The analysis methodology your team developed over years of iteration. The quality bar that is not written down anywhere. The implicit ways of working. The judgment calls about when to escalate and when to handle something independently. The institutional knowledge that separates a new hire from someone who’s been there for years.

This second part—the expertise transfer—is where organizations struggle. It's expensive, it's inconsistent, and it does not scale. It lives in mentorship relationships, institutional knowledge, and documentation that goes stale the moment it's written.

Claude Skills and MCP (Model Context Protocol) follow exactly this pattern. MCP gives AI agents such as Claude the tools: access to systems, databases, APIs, and resources. Skills are the training materials that teach Claude how to work and how to use these tools.

This distinction matters more than it might first appear. While we have gotten reasonably good at provisioning tools, we have never had a good way to package expertise. Skills change that. They package expertise into a standardized format.

Tools Versus Training

MCP is tool provisioning. It’s the protocol that connects AI agents to external systems: data warehouse, CRM, GitHub repositories, internal APIs, and knowledge bases. Anthropic describes it as “USB-C for AI”—a standardized interface that lets Claude plug into your existing infrastructure. An MCP server might give Claude the ability to query customer records, commit code, send Slack messages, or pull analytics data with authorized permissions.

This is necessary infrastructure. But like giving a new hire database credentials, it does not tell AI agents what to do with that access. MCP answers the question “What tools can an agent use?” It provides capabilities without opinions.

Skills are the training materials. They encode how your organization actually works: which segments matter, what churn signal to watch for, how to structure findings for your quarterly business review, when to flag something for human attention.

Skills answer a different question: “How should an AI agent think about this?” They provide expertise, not just access.

Consider the difference in what you're creating. Building an MCP server is infrastructure work: an engineering effort to connect systems securely and reliably. Creating a Skill is knowledge work: domain experts articulating what they know, in markdown files, for AI agents to understand and operationalize. These require different people, different processes, and different governance.

The real power emerges when you combine them. MCP connects AI agents to your data warehouse. A Skill teaches AI agents your firm’s analysis methodology and which MCP tools to use. Together, AI agents can perform expert-level analysis on live data, following your specific standards. Neither layer alone gets you there, just as a new hire with database access but no training, or training but no access, won’t be effective at their jobs.

MCP is the toolbox. Skills are the training manuals that teach how to use those tools.

Why Expertise Has Been So Hard to Scale

The training side of onboarding has always been the bottleneck.

Your best analyst retires, and their methods walk out the door. Onboarding takes months because the real tacit knowledge lives in people's heads, not in any document a new hire can read. Consistency is impossible when "how we do things here" varies by who trained whom and who worked with whom. Even when you invest heavily in training programs, they produce point-in-time snapshots of expertise that immediately begin to rot.

Previous approaches have all fallen short:

Documentation is passive and quickly outdated. It requires human interpretation, offers no guarantee of correct application, and can’t adapt to novel situations. The wiki page about customer analysis does not help when you encounter an edge case the author never anticipated.

Training programs are expensive, and a certificate of completion says nothing about actual competency.

Checklists and SOPs capture procedure but not judgment. They tell you what to check, not how to think about what you find. They work for mechanical tasks but fail for anything requiring expertise.

We've had Custom GPTs, Claude Projects, and Gemini Gems attempting to address this. They are useful but opaque. They can't be invoked based on context: an AI agent working as a Copy Editing Gem stays in copy editing and can't switch to a Laundry Buddy Custom GPT mid-task. They are not transferable and cannot be packaged for distribution.

Skills offer something new: expertise packaged as a versionable, governable artifact.

Skills are files in folders—a SKILL.md document with supporting assets, scripts, and resources. They leverage all the tooling we have built for managing code. Track changes in Git. Roll back mistakes. Maintain audit trails. Review Skills before deployment through PR workflows with version control. Deploy organization-wide and ensure consistency. AI agents can compose Skills for complex workflows, building sophisticated capabilities from simple building blocks.

The architecture also enables progressive disclosure. AI agents see only lightweight metadata until a Skill becomes relevant, then load the full instructions on demand. You can have dozens of Skills available without overwhelming the model's precious context window, which is like a human's short-term memory or a computer's RAM. Claude loads expertise as needed and coordinates multiple Skills automatically.
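To make that concrete, here is a minimal sketch of what a Skill can look like on disk. The folder name, frontmatter fields, and instructions are illustrative rather than copied from any shipped Skill; the general shape (a SKILL.md with lightweight metadata up front, full guidance below, and optional supporting files alongside) follows Anthropic's published format.

quarterly-churn-analysis/
  SKILL.md
  templates/qbr-summary.md
  scripts/churn_query.sql

And SKILL.md itself:

---
name: quarterly-churn-analysis
description: Methodology for analyzing customer churn ahead of the quarterly business review, including which segments matter and when to escalate findings.
---

# Quarterly churn analysis

1. Pull the last two quarters of usage data (see scripts/churn_query.sql).
2. Segment accounts by plan tier; exclude trial accounts.
3. Flag any enterprise account with a usage drop above 20% for human review.
4. Write up findings using templates/qbr-summary.md.

Only the name and description are loaded into context up front; the body and supporting files are read when the Skill is actually invoked.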

This makes the enterprise deployment model tractable. An expert creates a Skill based on best practices, with the help of an AI/ML engineer to audit and evaluate the effectiveness of the Skill. Administrators review and approve it through governance processes. The organization deploys it everywhere simultaneously. Updates propagate instantly from a central source.

One report cites Rakuten achieving 87.5% faster completion of a finance workflow after implementing Skills. Not from AI magic but from finally being able to distribute their analysts’ methodologies across the entire team. That’s the expertise transfer problem, solved.

Training Materials You Can Meter

The onboarding analogy also created a new business model.

When expertise lives in people, you can only monetize it through labor—billable hours, consulting engagements, training programs, maintenance contracts. The expert has to show up, which limits scale and creates key-person dependencies.

Skills separate expertise from the expert. Package your methodology as a Skill. Distribute it via API. Charge based on usage.

A consulting firm’s analysis framework can become a product. A domain expert’s judgment becomes a service. The Skill encodes the expertise; the API calls become the meter. This is service as software, the SaaS of expertise. And it’s only possible because Skills put knowledge in a form that can be distributed, versioned, and billed against.

The architecture is familiar. The Skill is like an application frontend (the expertise, the methodology, the "how"), while MCP connections or API calls form the backend (data access, actions, the "what"). You build the training materials once and deploy them everywhere, metering usage through the infrastructure layer.

No more selling API endpoints with obscure 500-page documentation explaining what each endpoint does, then staffing a team to support it. Now the expertise of how to use those APIs can be packaged directly into Skills, and customers can realize the value of an API via their AI agents. The cost and time to implement drop to nearly zero with MCP, and time to value becomes immediate with Skills.

The Visibility Trade-Off

Every abstraction has a cost. Skills trade visibility for scalability, and that trade-off deserves honest examination.

When expertise transfers from human to human, through mentorship, working sessions, and apprenticeship, the expert sees how their knowledge gets applied and refines it in the process. They watch the learner struggle with edge cases. They notice which concepts don't land. They observe how their methods get adapted to new situations. This feedback loop improves the expertise over time.

Skills break that loop. As a Skill builder, you do not see the conversations that trigger your Skill. You do not know how users adapted your methodology or which part of your guidance AI agents weighted most heavily. Users interact with their own AI agents; your Skill is one influence among many.

Your visibility is limited to the infrastructure layer: API calls, MCP tool invocations, and whatever outputs you explicitly capture. You see usage patterns, not the dialogue that surrounds them. Those dialogues reside with the user’s AI agents.

This parallels what happened when companies moved from in-person training to self-service documentation and e-learning. You lost the ability to watch every learner, but you gained the ability to train at scale. Skills make the same exchange: less visibility per user interaction, vastly more interactions possible.

Managing the trade-off requires intentional design. Build logging and tracing into your Skills where appropriate. Create feedback mechanisms inside Skills so AI agents can surface when users express confusion or request changes. And in the development process, focus on outcomes—did the Skill produce good results?—rather than process observation.

In production, the developers of Skills and MCP servers will not have most of the context about how a user's AI agent uses their work.

What to Watch

For organizations going through AI transformations, the starting point is an audit of expertise. What knowledge lives only in a specific person’s head? Where does inconsistency emerge because “how we do things” isn’t written down in an operationalizable form? These are your candidates for Skills.

Start with bounded workflows: a report format, an analysis methodology, a review checklist. Prove the pattern before encoding more complex expertise. Govern early. Skills are artifacts that require review, evaluation, and lifecycle management. Establish those processes before Skills proliferate.

For builders, the mental shift is from “prompt” to “product.” Skills are versioned artifacts with users. Design accordingly. Combine Skills with MCP for maximum leverage. Accept the visibility tradeoff as the cost of scale.

Several signals suggest where this is heading. Skill marketplaces are emerging. Agent Skills are now a published open standard being adopted by multiple AI agents and, soon, agent SDKs. Enterprise governance tooling, with the version control, approval workflows, and audit trails organizations need, will determine adoption in regulated industries.

Expertise Can Finally Be Packaged

We’ve gotten good at provisioning tools as APIs. MCP extends that to AI with standardized connections to systems and data.

But tool access was never the bottleneck. Expertise transfer was. The methodology. The judgment. The caveats. The workflows. The institutional knowledge that separates a new hire from a veteran.

Skills are the first serious attempt to package expertise into a file format that AI agents can operationalize while humans can still read, review, and govern it. They are training materials that actually scale.

The organizations that figure out how to package their expertise, both for internal and external consumption, will have a structural advantage. Not because AI replaces expertise. Because AI amplifies the expertise of those who know how to share it.

MCP gives AI agents the tools. Skills teach AI agents how to work. The question is whether you can encode what your best people know. Skills are the first real answer.


Reference



Read the whole story
alvinashcraft
1 minute ago
reply
Pennsylvania, USA
Share this story
Delete

What Developers Actually Need to Know Right Now


Addy Osmani is one of my favorite people to talk with about the state of software engineering with AI. He spent 14 years leading Chrome’s developer experience team at Google, and recently moved to Google Cloud AI to focus on Gemini and agent development. He’s also the author of numerous books for O’Reilly, including The Effective Software Engineer (due out in March), and my cohost for O’Reilly’s AI Codecon. Every time I talk with him I come away feeling like I have a better grip on what’s real and what’s noise. Our recent conversation on Live with Tim O’Reilly was no exception.

Here are some of the things we talked about.

The hard problem is coordination, not generation

Addy pointed out that there’s a spectrum in how people are working with AI agents right now. On one end you have solo founders running hundreds or thousands of agents, sometimes without even reviewing the code. On the other end you have enterprise teams with quality gates, reliability requirements, and long-term maintenance to think about.

Addy’s take is that for most businesses, “the real frontier is not necessarily having hundreds of agents for a task just for its own sake. It’s about orchestrating a modest set of agents that solve real problems while maintaining control and traceability.” He pointed out that frameworks like Google’s Agent Development Kit now support both deterministic workflow agents and dynamic LLM agents in the same system, so you get to choose when you need predictability and when you need flexibility.

The ecosystem is developing fast. A2A (the agent-to-agent protocol Google contributed to the Linux Foundation) handles agent-to-agent communication while MCP handles agent-to-tool calls. Together they start to look like the TCP/IP of the agent era. But Addy was clear-eyed about where things stand: “Almost nobody’s figured out how to make everything work together as smoothly as possible. We’re getting as close to that as we can. And that’s the actual hard problem here. Not generation, but coordination.”

The “Something Big Is Happening” debate

In response to one of the audience questions, we spent some time on Matt Shumer's viral essay arguing that the current moment in AI is like the period just before the COVID-19 pandemic hit. Those in the know were sounding the alarm, but most people weren't hearing it.

Addy’s take was that “it felt a little bit like somebody who hadn’t been following along, just finally getting around to trying out the latest models and tools and having an epiphany moment.” He thinks the piece lacked grounding in data and didn’t do a great job distinguishing between what AI can do for prototypes and what it can do in production. As Addy put it, “Yes, the models are getting better, the harnesses are getting better, the tools are getting better. I can do more with AI these days than I could a year ago. All of that is true. But to say that all kinds of technical work can now be done with near perfection, I wouldn’t personally agree with that statement.”

I agree with Addy, but I also know how it feels when you see the future crashing in and no one is paying attention. At O’Reilly, we started working with the web when there were only 200 websites. In 1993, we built GNN, the first web portal, and the web’s first advertising. In 1994, we did the first large-scale market research on the potential of advertising as the web’s future business model. We went around lobbying phone companies to adopt the web and (a few years later) for bookstores to pay attention to the rise of Amazon, and nobody listened. I’m a big believer in “something is happening” moments. But I’m also very aware that it always takes longer than it appears.

Both things can be true. The direction and magnitude of this shift are real. The models keep getting better. The harnesses keep getting better. But we still have to figure out new kinds of businesses and new kinds of workflows. AI won’t be a tsunami that wipes everything away overnight.

Addy and I will be cohosting the O’Reilly AI Codecon: Software Craftsmanship in the Age of AI on March 26, where we’ll go much deeper on orchestration, agent coordination, and the new skills developers need. We’d love to see you there. Sign up for free here.

And if you’re interested in presenting at AI Codecon, our CFP is open through this Friday, February 20. Check out what we’re looking for and submit your proposal here.

Feeling productive vs. being productive

There was a great line from a post by Will Manidis called “Tool Shaped Objects” that I shared during our conversation: “The market for feeling productive is orders of magnitude larger than the market for being productive.” The essay is about things that feel amazing to build and use but aren’t necessarily doing the work that needs to be done.

Addy picked up on this immediately. “There is a difference between feeling busy and being productive,” he said. “You can have 100 agents working in the background and feel like you’re being productive. And then someone asks, What did you get built? How much money is it making you?”

This isn’t to dismiss anyone who’s genuinely productive running lots of agents. Some people are. But a healthy skepticism about your own productivity is worth maintaining, especially when the tools make it so easy to feel like you’re moving fast.

Planning is the new coding

Addy talked about how the balance of his time on a task has shifted significantly. “I might spend 30, 40% of the time a task takes just to actually write out what exactly is it that I want,” he said. What are the constraints? What are the success criteria? What’s the architecture? What libraries and UI components should be used?

All of that work to get clarity before you start code generation leads to much-higher-quality outcomes from AI. As Addy put it, “LLMs are very good at grounding things in the lowest common denominator. If there are patterns in the training data that are popular, they’re going to use those unless you tell them otherwise.” If your team has established best practices, codify them in Markdown files or MCP tools so the agent can use them.

I connected the planning phase to something larger about taste. Think about Steve Jobs. He wasn’t a coder. He was a master of knowing what good looked like and driving those who worked with him to achieve it. In this new world, that skill matters enormously. You’re going to be like Jobs telling his engineers “no, no, not that” and giving them a vision of what’s beautiful and powerful. Except now some of those engineers are agents. So management skill, communication skill, and taste are becoming core technical competencies.

Code review is getting harder

One thing Addy flagged that doesn’t get enough attention: “Increasingly teams feel like they’re being thrashed with all of these PRs that are AI generated. People don’t necessarily understand everything that’s in there. And you have to balance increased velocity expectations with ‘What is a quality bar?’ because someone’s going to have to maintain this.”

Knowing your quality bar matters. What are the cases where you’re comfortable merging an AI-generated change? Maybe it’s small and well-compartmentalized and has solid test coverage. And what are the cases where you absolutely need deep human review? Getting clear on that distinction is one of the most practical things a team can do right now.

Yes, young people should still go into software

We got a question about whether students should still pursue software engineering. Addy’s answer was emphatic: “There has never been a better time to get into software engineering if you are someone that is comfortable with learning. You do not necessarily have the burden of decades of knowing how things have historically been built. You can approach this with a very fresh set of eyes.” New entrants can go agent first. They can get deep into orchestration patterns and model trade-offs without having to unlearn old habits. And that’s a real advantage when interviewing at companies that need people who already know how to work this way.

The more important point is that in the early days of a new technology, people basically try to make the old things over again. The really big opportunities come when we figure out what was previously impossible that we can now do. If AI is as powerful as it appears to be, the opportunity isn’t to make companies more efficient at the same old work. It’s to solve entirely new problems and build entirely new kinds of products.

I’m 71 years old and 45 years into this industry, and this is the most excited I’ve ever been. More than the early web, more than open source. The future is being reinvented, and the people who start using these tools now get to be part of inventing it.

The token cost question

Addy had a funny and honest admission: “There were weeks when I would look at my bill for how much I was using in tokens and just be shocked. I don’t know that the productivity gains were actually worthwhile.”

His advice: experiment. Get a sense of what your typical tasks cost with multiple agents. Extrapolate. Ask yourself whether you’d still find it valuable at that price. Some people spend hundreds or even thousands a month on tokens and feel it’s worthwhile because the alternative was hiring a contractor. Others are spending that much and mostly feeling busy. As Addy said, “Don’t feel like you have to be spending a huge amount of money to not miss out on productivity wins.”

I’d add that we’re in a period where these costs are massively subsidized. The model companies are covering inference costs to get you locked in. Take advantage of that while it lasts. But also recognize that a lot of efficiency work is yet to be done. Just as JavaScript frameworks replaced everyone hand-coding UIs, we’ll get frameworks and tools that make agent workflows much more token-efficient than they are today.

2028 predictions are already here

One of the most striking things Addy shared was that a group in the AI coding community that he is part of had put together predictions for what software engineering would look like by 2028. “We recently revisited that list, and I was kind of shocked to discover that almost everything on that list is already possible today,” he said. “But how quickly the rest of the ecosystem adopts these things is on a longer trajectory than what is possible.”

That gap between capability and adoption is where most of the interesting work will happen over the next few years. The technology is running ahead of our ability to absorb it. Figuring out how to close that gap, in your team, your company, and your own practice, is the real job right now.

Agents writing code for agents

Near the end we answered another great audience question: Will agents eventually produce source code that’s optimized for other agents to read, not humans? Addy said yes. There are already platform teams having conversations about whether to build for an agent-first world where human readability becomes a secondary concern.

I have a historical parallel for this. I wrote the manual for the first C compiler on the Mac, and I worked closely with the developer who was hand-tuning the compiler output at the machine code level. That was about 30 years ago. We stopped doing that. And I’m quite confident there will be a similar moment with AI-generated code where humans mostly just let it go and trust the output. There will be special cases where people dive in for absolute performance or correctness. But they’ll be rare.

That transition won’t happen overnight. But the direction seems pretty clear. You can help to invent the future now, or spend time later trying to catch up with those who do.


This conversation was part of my ongoing series of conversations with innovators, Live with Tim O’Reilly. You can explore past episodes on YouTube.




LangChain Python Tutorial: 2026’s Complete Guide


If you’ve read the blog post How to Build Chatbots With LangChain, you may want to know more about LangChain. This blog post will dive deeper into what LangChain offers and guide you through a few more real-world use cases. And even if you haven’t read the first post, you might still find the info in this one helpful for building your next AI agent.

LangChain fundamentals

Let’s have a look at what LangChain is. LangChain provides a standard framework for building AI agents powered by LLMs, like the ones offered by OpenAI, Anthropic, Google, etc., and is therefore the easiest way to get started. LangChain supports most of the commonly used LLMs on the market today.

LangChain is a high-level tool built on LangGraph, which provides a low-level framework for orchestrating the agent and runtime and is suitable for more advanced users. Beginners and those who only need a simple agent build are definitely better off with LangChain.

We’ll start by taking a look at several important components in a LangChain agent build.

Agents

Agents are what we are building. They combine LLMs with tools to create systems that can reason about tasks, decide which tools to use for which steps, analyze intermediate results, and work towards solutions iteratively.

Creating an agent is as simple as using the `create_agent` function with a few parameters:

from langchain.agents import create_agent

agent = create_agent(
    "gpt-5",
    tools=tools
)

In this example, the LLM used is GPT-5 by OpenAI. In most cases, the provider of the LLM can be inferred. To see a list of all supported providers, head over here.

LangChain Models: Static and Dynamic

There are two ways to configure the model for your agent: static and dynamic. Static models, as the name suggests, are straightforward and more common: the agent is configured with a fixed model at creation time, and it remains unchanged during execution.

import os

from langchain.chat_models import init_chat_model

os.environ["OPENAI_API_KEY"] = "sk-..."

model = init_chat_model("gpt-5")
print(model.invoke("What is PyCharm?"))



Dynamic models allow you to build an agent that can switch models during runtime based on customized logic. Different models can then be picked based on the current state and context. For example, we can use ModelFallbackMiddleware (described in the Middleware section below) to have a backup model in case the default one fails.

from langchain.agents import create_agent
from langchain.agents.middleware import ModelFallbackMiddleware

agent = create_agent(
    model="gpt-4o",
    tools=[],
    middleware=[
        ModelFallbackMiddleware(
            "gpt-4o-mini",
            "claude-3-5-sonnet-20241022",
        ),
    ],
)

Tools

Tools are important parts of AI agents. They make AI agents effective at carrying out tasks that involve more than just text as output, which is a fundamental difference between an agent and an LLM. Tools allow agents to interact with external systems – such as APIs, databases, or file systems. Without tools, agents would only be able to provide text output, with no way of performing actions or iteratively working their way toward a result.

LangChain provides decorators for systematically creating tools for your agent, making the whole process more organized and easier to maintain. Here are a couple of examples:

Basic tool

from langchain.tools import tool

@tool
def search_db(query: str, limit: int = 10) -> str:
    """Search the customer database for records matching the query."""
    ...
    return f"Found {limit} results for '{query}'"

Tool with a custom name

@tool("pycharm_docs_search", return_direct=False)
def pycharm_docs_search(q: str) -> str:
    """Search the local FAISS index of JetBrains PyCharm documentation and return relevant passages."""
    ...
    docs = retriever.get_relevant_documents(q)
    return format_docs(docs)

Middleware

Middleware provides ways to define the logic of your agent and customize its behavior. For example, there is middleware that can monitor the agent during runtime, assist with prompting and selecting tools, or even help with advanced use cases like guardrails, etc.

Here are a few examples of built-in middleware. For the full list, please refer to the LangChain middleware documentation.

Middleware – Description
Summarization – Automatically summarize the conversation history when approaching token limits.
Human-in-the-loop – Pause execution for human approval of tool calls.
Context editing – Manage conversation context by trimming or clearing tool uses.
PII detection – Detect and handle personally identifiable information (PII).

Real-world LangChain use cases

LangChain use cases cover a wide range of fields, with common examples including:

  1. AI-powered chatbots
  2. Document question answering systems
  3. Content generation tools

AI-powered chatbots

When we think of AI agents, we often think of chatbots first. If you’ve read the How to Build Chatbots With LangChain blog post, then you’re already up to speed about this use case. If not, I highly recommend checking it out.

Document question answering systems

Another real-world use case for LangChain is a document question answering system. For example, companies often have internal documents and manuals that are rather long and unwieldy. A document question answering system provides a quick way for employees to find the info they need within the documents, without having to manually read through each one.

To demonstrate, we’ll create a script to index the PyCharm documentation. Then we’ll create an AI agent that can answer questions based on the documents we indexed. First let’s take a look at our tool:

@tool("pycharm_docs_search")
def pycharm_docs_search(q: str) -> str:
    """Search the local FAISS index of JetBrains PyCharm documentation and return relevant passages."""
    # Load vector store and create retriever
    embeddings = OpenAIEmbeddings(
        model=settings.openai_embedding_model, api_key=settings.openai_api_key
    )
    vector_store = FAISS.load_local(
        settings.index_dir, embeddings, allow_dangerous_deserialization=True
    )
    k = 4
    retriever = vector_store.as_retriever(
        search_type="mmr", search_kwargs={"k": k, "fetch_k": max(k * 3, 12)}
    )
    docs = retriever.invoke(q)

We are using a vector store to perform a similarity search with embeddings provided by OpenAI. Documents are embedded so the doc search tool can perform similarity searches to fetch the relevant documents when called. 
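For reference, the indexing script that builds this FAISS index could look something like the sketch below. The documentation URLs, chunk sizes, and index path are placeholders rather than the values used in the actual project; the LangChain pieces (a document loader, a text splitter, OpenAIEmbeddings, and FAISS.from_documents followed by save_local) are the standard way to build such an index.

from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Placeholder list of documentation pages to index.
DOC_URLS = [
    "https://www.jetbrains.com/help/pycharm/quick-start-guide.html",
]

def build_index(index_dir: str = "pycharm_faiss_index") -> None:
    # Download the documentation pages and split them into overlapping chunks
    # small enough to embed and retrieve individually.
    docs = WebBaseLoader(DOC_URLS).load()
    chunks = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=150
    ).split_documents(docs)

    # Embed the chunks and persist the FAISS index for the search tool to load.
    # Requires the OPENAI_API_KEY environment variable and the faiss-cpu package.
    embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
    vector_store = FAISS.from_documents(chunks, embeddings)
    vector_store.save_local(index_dir)

if __name__ == "__main__":
    build_index()

With the index on disk, the agent side is a small command-line script: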

def main():
    parser = argparse.ArgumentParser(
        description="Ask PyCharm docs via an Agent (FAISS + GPT-5)"
    )
    parser.add_argument("question", type=str, nargs="+", help="Your question")
    parser.add_argument(
        "--k", type=int, default=6, help="Number of documents to retrieve"
    )
    args = parser.parse_args()
    question = " ".join(args.question)

    system_prompt = """You are a helpful assistant that answers questions about JetBrains PyCharm using the provided tools.
    Always consult the 'pycharm_docs_search' tool to find relevant documentation before answering.
    Cite sources by including the 'Source:' lines from the tool output when useful. If information isn't found, say you don't know."""

    agent = create_agent(
        model=settings.openai_chat_model,
        tools=[pycharm_docs_search],
        system_prompt=system_prompt,
        response_format=ToolStrategy(ResponseFormat),
    )

    result = agent.invoke({"messages": [{"role": "user", "content": question}]})
    print(result["structured_response"].content)

 

System prompts are provided to the LLM together with the user's input prompt. We are using OpenAI as the LLM provider in this example, and we'll need an API key from them. Head to this page to check out OpenAI's integration documentation. When creating the agent, we configure the `model`, `tools`, and `system_prompt` settings.

For the full scripts and project, see here.

Content generation tools

Another example is an agent that generates text based on content fetched from other sources. For instance, we might use this when we want to generate marketing content with info taken from documentation. In this example, we’ll pretend we’re doing marketing for Python and creating a newsletter for the latest Python release.

In tools.py, a tool is set up to fetch the relevant information, parse it into a structured format, and extract the necessary information.

@tool("fetch_python_whatsnew", return_direct=False)
def fetch_python_whatsnew() -> str:
    """
    Fetch the latest "What's New in Python" article and return a concise, cleaned
    text payload including the URL and extracted section highlights.
    The tool ignores the input argument.
    """
    index_html = _fetch(BASE_URL)
    latest = _find_latest_entry(index_html)
    if not latest:
        return "Could not determine latest What's New entry from the index page."
    article_html = _fetch(latest.url)
    highlights = _extract_highlights(article_html)
    return f"URL: {latest.url}\nVERSION: {latest.version}\n\n{highlights}"

As for the agent, here is the relevant part of agent.py:

SYSTEM_PROMPT = (
    "You are a senior Product Marketing Manager at the Python Software Foundation. "
    "Task: Draft a clear, engaging release marketing newsletter for end users and developers, "
    "highlighting the most compelling new features, performance improvements, and quality-of-life "
    "changes in the latest Python release.\n\n"
    "Process: Use the tool to fetch the latest 'What's New in Python' page. Read the highlights and craft "
    "a concise newsletter with: (1) an attention-grabbing subject line, (2) a short intro paragraph, "
    "(3) 4–8 bullet points of key features with user benefits, (4) short code snippets only if they add clarity, "
    "(5) a 'How to upgrade' section, and (6) links to official docs/changelog. Keep it accurate and avoid speculation."
)

...

def run_newsletter() -> str:
    load_dotenv()
    agent = create_agent(
        model=os.getenv("OPENAI_MODEL", "gpt-4o"),
        tools=[fetch_python_whatsnew],
        system_prompt=SYSTEM_PROMPT,
        # response_format=ToolStrategy(ResponseFormat),
    )

...

As before, we provide a system prompt and the API key for OpenAI to the agent.

For the full scripts and project, see here.

Advanced LangChain concepts

LangChain’s more advanced features can be extremely useful when you’re building a more sophisticated AI agent. Not all AI agents require these extra elements, but they are commonly used in production. Let’s look at some of them.

MCP adapter

MCP (the Model Context Protocol) lets you add extra tools and functionality to an AI agent, and it has become increasingly popular among AI agent users and enthusiasts alike.

The langchain-mcp-adapters package provides a MultiServerMCPClient class that allows the AI agent to connect to MCP servers and use the tools they expose. For example:

from langchain_mcp_adapters.client import MultiServerMCPClient

client = MultiServerMCPClient(
    {
        "postman-server": {
            "type": "http",
            "url": "https://mcp.eu.postman.com",
            "headers": {
                "Authorization": "Bearer ${input:postman-api-key}"
            }
        }
    }
)

all_tools = await client.get_tools()

The above connects to the Postman MCP server in the EU with an API key.
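From there, the retrieved tools can be handed to an agent just like locally defined ones. A minimal continuation of the snippet above might look like this (the model name and prompt are arbitrary examples):

from langchain.agents import create_agent

# Build an agent that can call any of the tools exposed by the connected MCP server(s).
agent = create_agent(model="gpt-4o", tools=all_tools)

# get_tools() was awaited above, so we are already in an async context; use ainvoke here too.
result = await agent.ainvoke(
    {"messages": [{"role": "user", "content": "List my Postman workspaces."}]}
)
print(result["messages"][-1].content)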

Guardrails

As with many AI technologies, the logic is not pre-determined, so the behavior of an AI agent is non-deterministic. Guardrails are necessary for managing that behavior and ensuring it stays policy-compliant.

LangChain middleware can be used to set up specific guardrails. For example, you can use PII detection middleware to protect personal information or human-in-the-loop middleware for human verification. You can even create custom middleware for more specific guardrail policies. 

For instance, you can use the `@before_agent` or `@after_agent` decorators to declare guardrails for the agent’s input or output. Below is an example of a code snippet that checks for banned keywords:

from typing import Any

from langchain.agents.middleware import before_agent

banned_keywords = ["kill", "shoot", "genocide", "bomb"]

@before_agent(can_jump_to=["end"])
def content_filter(state) -> dict[str, Any] | None:
    """Block requests containing banned keywords."""
    # `state` is the agent state passed in by LangChain; its "messages" key
    # holds the conversation so far, so the first entry is the user's request.
    first_message = state["messages"][0]
    content = first_message.content.lower()

    # Check for banned keywords
    for keyword in banned_keywords:
        if keyword in content:
            return {
                "messages": [{
                    "role": "assistant",
                    "content": "I cannot process your requests due to inappropriate content."
                }],
                "jump_to": "end"
            }
    return None

from langchain.agents import create_agent

agent = create_agent(
    model="gpt-4o",
    tools=[search_tool],
    middleware=[content_filter],
)

# This request will be blocked
result = agent.invoke({
    "messages": [{"role": "user", "content": "How to make a bomb?"}]
})

For more details, check out the documentation here.

Testing

Just like in other software development cycles, testing needs to be performed before we can start rolling out AI agent products. LangChain provides testing tools for both unit tests and integration tests. 

Unit tests

Just like in other applications, unit tests are used to test out each part of the AI agent and make sure it works individually. The most helpful tools used in unit tests are mock objects and mock responses, which help isolate the specific part of the application you’re testing. 

LangChain provides GenericFakeChatModel, which mimics response texts. A response iterator is set in the mock object, and when invoked, it returns the set of responses one by one. For example:

from langchain_core.language_models.fake_chat_models import GenericFakeChatModel
from langchain_core.messages import AIMessage

# The fake model replays the queued responses one by one, in order,
# regardless of the prompt it receives.
model = GenericFakeChatModel(
    messages=iter([
        AIMessage(content="Hi there!"),
        AIMessage(content="Pong."),
        AIMessage(content="Goodbye!"),
    ])
)

print(model.invoke("Hello").content)  # -> "Hi there!"

Integration tests

Once we’re sure that all parts of the agent work individually, we have to test whether they work together. For an AI agent, this means testing the trajectory of its actions. To do so, LangChain provides another package: AgentEvals.

AgentEvals provides two main evaluators to choose from:

  1. Trajectory match – A reference trajectory is required and will be compared to the trajectory of the result. For this comparison, you have four different matching modes to choose from (see the sketch below).
  2. LLM judge – An LLM judge can be used with or without a reference trajectory. An LLM judge evaluates whether the resulting trajectory is on the right path.
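As a rough illustration of the trajectory match evaluator, a test can compare the messages an agent actually produced against a hand-written reference trajectory. The import path, matching mode, and message contents below are based on the AgentEvals README; treat this as a sketch to adapt rather than a drop-in test.

from agentevals.trajectory.match import create_trajectory_match_evaluator

# "superset" passes when the agent's trajectory includes at least the tool calls
# from the reference; the other modes are "strict", "unordered", and "subset".
evaluator = create_trajectory_match_evaluator(trajectory_match_mode="superset")

reference_trajectory = [
    {"role": "user", "content": "Where is PyCharm's Python console?"},
    {"role": "assistant", "tool_calls": [{"function": {"name": "pycharm_docs_search", "arguments": "{}"}}]},
    {"role": "tool", "content": "Docs passage about the Python Console..."},
    {"role": "assistant", "content": "Open it via View | Tool Windows | Python Console."},
]

# In a real test this would come from agent.invoke(...)["messages"];
# it is hard-coded here so the example stands alone.
actual_trajectory = list(reference_trajectory)

result = evaluator(outputs=actual_trajectory, reference_outputs=reference_trajectory)
print(result["score"])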

LangChain support in PyCharm

With LangChain, you can develop an AI agent that suits your needs in no time. However, to be able to effectively use LangChain in your application, you need an effective debugger. In PyCharm, we have the AI Agents Debugger plugin, which allows you to power up your experience with LangChain.

If you don’t yet have PyCharm, you can download it here.

Using the AI Agents Debugger is very straightforward. Once you install the plug-in, it will appear as an icon on the right-hand side of the IDE.

When you click on this icon, a side window will open with text saying that no extra code is needed – just run your agent and traces will be shown automatically.

As an example, we will run the content generation agent that we built above. If you need a custom run configuration, you will have to set it up now by following this guide on custom run configurations in PyCharm.

Once it is done, you can review all the input prompts and output responses at a glance. To inspect the LangGraph, click on the Graph button in the top-right corner.

The LangGraph view is especially useful if you have an agent that has complicated steps or a customized workflow.

Summing up

LangChain is a powerful tool for building AI agents that work for many use cases and scenarios. It’s built on LangGraph, which provides low-level orchestration and runtime customization, as well as compatibility with a vast variety of LLMs on the market. Together, LangChain and LangGraph set a new industry standard for developing AI agents.


Compensation, culture, and cap tables with Yuri Sagalov, General Catalyst


Build Mode is back. This season we’re breaking down what it really takes to build a world-class founding team starting with your cap table, equity structures, and startup compensation strategy. 

We kick off with Yuri Sagalov, managing director at General Catalyst and former founder, YC partner, and seed investor at Wayfinder Ventures. Yuri has worked with hundreds of pre-seed and seed-stage startups, and he shares practical advice on how early-stage founders should think about startup equity, cap table design, investor selection, and compensation structures from day one. 

He breaks down: 

  • The 3 types of investors (and which one to avoid) 

  • Why your cap table is part of your team 

  • The 20–25% seed dilution rule 

  • How to split equity with a co-founder 

  • How to talk to early employees about risk and compensation 

No matter where you are in your startup journey, this episode will help you get the incentive structure right from the beginning.  

Chapters: 

00:00 - Why your first hires deserve more equity 
00:31 - Meet Yuri Sagalov (YC → General Catalyst) 
02:12 - Your cap table is part of your team 
02:50 - The 3 types of investors (avoid this one) 
05:02 - How to split equity with a co-founder 
07:55 - How much equity to give early employees 
09:37 - How to talk compensation and risk 
12:31 - Red flags in formation docs and vesting 
18:27 - Advisors for equity? Usually a mistake 
20:05 - The 20–25% seed dilution rule 
26:03 - The shift to 10-year stock options 
34:11 - Don’t scale before product-market fit 
39:23 - Final advice: Just start and choose your co-founder carefully 

New episodes of Build Mode drop every Thursday. Hosted by Isabelle Johannessen. Produced and edited by Maggie Nye. Audience development led by Morgan Little. Special thanks to the Foundry and Cheddar video teams. 





Download audio: https://traffic.megaphone.fm/TCML4298375812.mp3?updated=1771437882

BONUS The Future of Seeing—Why AI Vision Will Transform Medicine and Human Perception With Daniel Sodickson


BONUS: The Future of Seeing—Why AI Vision Will Transform Medicine and Human Perception

What if the next leap in AI isn't about thinking, but about seeing? In this episode, Daniel Sodickson—physicist, medical imaging pioneer, and author of "The Future of Seeing"—argues we're on the edge of a vision revolution that will change medicine, technology, and even human perception itself.

From Napkin Sketch to Parallel Imaging

"I was doodling literally on a napkin in a piano bar in Boston and came up with a way to get multiple lines at once. I ran to my mentor and said, 'Hey, I have this idea, never mind my paper.' And he said, 'Who are you again? Sure, why not.' And it worked."

 

Daniel's journey into imaging began with a happy accident. While studying why MRI couldn't capture the beating heart fast enough, he realized the fundamental bottleneck: MRI machines scan one line at a time, like old CRT screens. His insight—imaging in parallel to capture multiple lines simultaneously—revolutionized the field. This connection between natural vision (our eyes capture entire scenes at once) and artificial imaging systems set him on a 29-year journey exploring how we can see what was once invisible.

Upstream AI: Changing What We Measure

"Most often when we envision AI, we think of it as this downstream process. We generate our data, make our image, then let AI loose instead of our brains. To me, that's limited. Why aren't we thinking of tasks that AI can do that no human could ever do?"

 

Daniel introduces a crucial distinction between "downstream" and "upstream" AI. Downstream AI takes existing images and interprets them—essentially competing with human experts. Upstream AI changes the game entirely by redesigning what data we gather in the first place. If we know a machine learning system will process the output, we can build cheaper, more accessible sensors. Imagine monitoring devices built into beds or chairs that don't produce perfect images but can detect whether you've changed since your last comprehensive scan. AI fills in the gaps using learned context about how bodies and signals behave.

The Power of Context and Memory

"The world we see is a lie. Two eyes are not nearly enough to figure out exactly where everything is in space. What the brain is doing is using everything it's learned about the world—how light falls on surfaces, how big people are compared to objects—and filling in what's missing."

 

Our brains don't passively receive images; they actively construct reality using massive amounts of learned context. Daniel argues we can give imaging machines the same superpower. By training AI on temporal patterns—how healthy bodies change over time, what signals precede disease—we create systems with "memory" that can make sophisticated judgments from incomplete data. Today's signal, combined with your history and learned patterns from millions of others, becomes far more informative than any single pristine image could be.

From Reactive to Proactive Health

"I've started to wonder why we use these amazing MRI machines only once we already know you're sick. Why do we use them reactively rather than proactively?"

 

This question drove Daniel to leave academia after 29 years and join Function Health, a company focused on proactive imaging and testing to catch disease before it develops. The vision: a GPS for your health. By combining regular blood panels, MRI scans, and wearable data, AI can monitor whether you look like yourself or have changed in worrisome ways. The goal isn't replacing expert diagnosis but creating an early warning system that surfaces problems while they're still easily treatable.

Seeing How We See

"Sometimes when I'm walking along, everything I'm seeing just fades away. And what I see instead is how I'm seeing. I imagine light bouncing off of things and landing in my eye, this buzz of light zipping around as fast as anything in the universe can go."

 

After decades studying vision, Daniel experiences the world differently. He finds himself deconstructing his own perception—tracing sight lines, marveling at how we've evolved to turn chaos of sensation into spatially organized information. This meta-awareness extends to his work: every new imaging modality has driven scientific discovery, from telescopes enabling the Copernican Revolution to MRI revealing the living body. We're now at another inflection point where AI doesn't just interpret images but transforms our relationship with perception itself.

 

In this episode, we refer to An Immense World: How Animal Senses Reveal the Hidden Realms Around Us by Ed Yong on animal perception, and A Path Towards Autonomous Machine Intelligence by Yann LeCun on building AI more like the brain.

 

About Daniel Sodickson

Daniel K. Sodickson is a physicist in medicine and chief medical scientist at Function Health. Previously at NYU, and a gold medalist and past president of the International Society for Magnetic Resonance in Medicine, he pioneers AI-driven imaging and is author of The Future of Seeing.





Download audio: https://traffic.libsyn.com/secure/scrummastertoolbox/20260219_Daniel_Sodickson_Thu.mp3?dest-id=246429

Windows Hello for Business - Registered Methods and Last-used Method


Hi folks – Mike Hildebrand here!  Today, I bring you a short post about gaining more awareness of Windows Hello for Business (WHFB) configuration information from across your fleet of Windows PCs. 

Over time, we’ve improved the built-in "Authentication Methods" reporting in the Entra portal.  As far as WHFB goes, at this point, the Entra Portal provides high-level counts of WHFB registration and usage:  

 

However, we IT Pros are a curious bunch, always looking for more information and more detail about what’s going on in our enterprise. 

A while back, after being asked by numerous customers for a way to get more details about their WHFB deployment, I published a post about using Entra sign-in log data and a custom Log Analytics Workbook to obtain that information. 

That post/report has proven helpful - from Entra sign in logs, we can determine who is using WHFB, from which device (and there’s even a map to show where in the world it’s happening). 

Nice.

But that's only the 'cloud-side' of the situation - there are almost always two follow up questions that can only be answered from the endpoint:

  • What WHFB methods has a user registered on the endpoint(s)?  PIN only?  PIN + fingerprint?  Face?
  • Which WHFB method was last used by a given user on a given endpoint?

Ask, and ye shall receive

Here are two easy, quick Intune Proactive Remediation detection scripts you can deploy to your Windows endpoints to retrieve local device details (via registry values) about WHFB enrollment methods and the last-used WHFB method.

!! CAUTION !!

  • There is PowerShell code involved here. 
  • Due diligence is required on your part. 
  • Raise your right hand and read this out loud: “Like everything else, I will thoroughly test this and all code/changes that I work with before I deploy to production.  I will document the before-change state to ensure I can revert any changes I make.”

CODE DISCLAIMER – These sample scripts are not supported under any Microsoft standard support program or service. The sample scripts are provided AS IS without warranty of any kind. Microsoft further disclaims all implied warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample scripts and documentation remains with you. In no event shall Microsoft, its authors, or anyone else involved in the creation, production, or delivery of the scripts be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample scripts or documentation, even if Microsoft has been advised of the possibility of such damages.

  • REMINDER/NOTE - When using your scripting editing tool of choice, always be aware of any additional spaces or odd quotation marks or other issues that may result from edit/copy/paste.

Enrollment Types Detection

  • The ‘Enrolled Methods’ script from Marius

o   Intune-Remediation-Scripts/WH4B/Enrolled Methods at main · MrWyss-MSFT/Intune-Remediation-Scripts · GitHub


o   My Remediation Script Settings:


o   My results:

  • “As of 2/2/2026 at 9:40 AM, Adele registered a PIN (default/required) - a face - and a fingerprint - for WH4B on the SURFACEPRO5 device”


Last Used Method Detection

  • The ‘Last Used Method’ script from Marius

o   Intune-Remediation-Scripts/WH4B/Last Used Method at main · MrWyss-MSFT/Intune-Remediation-Scripts · GitHub


o   My Remediation Settings:


o   My results:

  • “As of 2/2/2026 at 9:40 AM, Adele last used a face/camera for WHFB on the SURFACEPRO5 device”


Additional Examples of Results

  • Enrollment Types Registered

o   NOTE: Remember, a PIN is required, so where you see ‘Fingerprint configured’ in the output, it means ‘PIN + Fingerprint’


  • Last-used method


There you have it folks - by combining these two Detection Scripts with the Log Analytics Workbook mentioned at the start of the post, you have a solid solution for ‘end to end’ WH4B reporting.

Hilde

 

 
