Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Node.js 25.8.1 (Current)

1 Share
Read the whole story
alvinashcraft
15 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

dotNetDave Says… Clear Naming Standards Create Clearer Code and Clearer Code Leads to Better Software

Clear naming standards in software development significantly enhance code readability, maintainability, and overall quality. Consistent naming reduces cognitive load, facilitates collaboration, and lowers long-term maintenance costs. Ignoring these standards leads to confusion, longer development times, and increased technical debt. Investing in robust naming conventions is essential for professional-grade software development.






A Breakdown of Graph RAG vs. Vector RAG


Large language models have changed how we interact with information, but they have one fundamental limitation: their knowledge is frozen in time. They can’t access real-time data or information from private, proprietary documents because they only know what they’ve been trained on. This is where RAG comes in. By connecting LLMs to external knowledge sources, RAG makes them smarter, more accurate, and more useful.

What is RAG?

RAG is an AI technique that improves large language models by allowing them to retrieve relevant external information before generating a response. Instead of relying solely on pre-trained knowledge, RAG searches connected data sources, such as documents or databases, to provide more accurate, up-to-date, and context-aware answers.

Think of it like an open-book exam. An LLM on its own is like a student trying to answer questions from memory. A RAG-powered LLM is like that same student having a curated set of textbooks and notes to consult before writing their answer. This process improves the accuracy and relevance of the LLM’s output, reduces the risk of generating incorrect or fabricated information (known as “hallucinations”), and allows it to answer questions about data it wasn’t trained on.

The RAG process generally follows these steps:

  1. User query: A user asks a question.
  2. Retrieval: The system searches an external knowledge base (e.g., a collection of documents, a database, or a website) for information relevant to the query.
  3. Augmentation: The retrieved information is added to the user’s original query as context.
  4. Generation: The combined prompt (original query plus retrieved context) is sent to the LLM, which then generates a comprehensive, context-aware answer.
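The four steps above can be sketched in a few lines of Python. This is a toy illustration only: the corpus, the keyword-overlap retriever, and the stubbed `generate()` are placeholders for a real embedding search and a real LLM API call.

```python
# Toy sketch of the four RAG steps; data and helpers are illustrative.
CORPUS = [
    "Couchbase is a distributed NoSQL database.",
    "RAG retrieves external context before generation.",
    "Graph databases store nodes and edges.",
]

def retrieve(query: str, k: int = 1) -> list:
    # Step 2: rank documents by naive keyword overlap with the query.
    q = set(query.lower().split())
    ranked = sorted(CORPUS, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return ranked[:k]

def augment(query: str, context: list) -> str:
    # Step 3: prepend the retrieved chunks to the original question.
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query

def generate(prompt: str) -> str:
    # Step 4: placeholder for the actual LLM call.
    return "[answer grounded in the retrieved context]"

query = "What does RAG retrieve?"  # Step 1: the user query
answer = generate(augment(query, retrieve(query)))
```

The only piece that changes between RAG variants is `retrieve()` — everything that follows is what the rest of this article compares.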

A simplified sequence diagram demonstrating a retrieval-augmented generation (RAG) workflow

What is graph RAG?

Graph RAG is a more sophisticated approach that uses a knowledge graph as its external data source. A knowledge graph organizes information as a network of entities (nodes) and their relationships (edges). For example, a node could be a person, a company, or a product, while an edge could represent a relationship like “works for,” “acquired,” or “is a component of.”

Instead of just searching for text chunks that are semantically similar to a query, graph RAG traverses the network of relationships to find highly contextual, interconnected information. It understands not just what things are but also how they relate to each other. This allows it to answer complex questions that require understanding relationships, patterns, and hierarchies within the data.
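To make the multi-hop idea concrete, here is a toy traversal over a hand-built adjacency map. The entity and relation names are invented; a real system would store this in a graph database and query it with a language like Cypher.

```python
# Invented mini knowledge graph: (entity, relation) -> related entities.
EDGES = {
    ("AcmeCorp", "acquired"): ["WidgetCo"],
    ("WidgetCo", "makes"): ["WidgetPro"],
    ("WidgetPro", "used_by"): ["CustomerGmbH"],
}

def hop(entity: str, relation: str) -> list:
    return EDGES.get((entity, relation), [])

# Multi-hop question: "Which customers use a product made by a company we acquired?"
customers = [
    customer
    for company in hop("AcmeCorp", "acquired")
    for product in hop(company, "makes")
    for customer in hop(product, "used_by")
]
```

Each `for` clause is one hop along an edge type — exactly the kind of chained relationship a pure similarity search over text chunks has no way to follow.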

Benefits

  • Explicit relationships: Graphs excel at representing explicit connections between data points, providing deep, structured context that vector searches might miss.
  • Complex query handling: Graph RAG can answer multi-hop questions that require piecing together information from different parts of the knowledge base (e.g., “Which customers in Germany use a product made by a company that we acquired last year?”).
  • Reduced hallucinations: By grounding the LLM in a structured, factual graph, the risk of generating inaccurate information is significantly lowered. The context is based on defined relationships, not just semantic similarity.
  • Explainability: The path taken through the graph to find an answer can be traced, making the LLM’s reasoning process more transparent and explainable.

Challenges

  • Complex data modeling: Building and maintaining a knowledge graph requires significant upfront effort in data modeling and extraction, transformation, and loading (ETL) processes.
  • Scalability: While modern graph databases are highly scalable, managing massive, highly interconnected graphs can present performance challenges.
  • Niche expertise: Implementing graph RAG requires expertise in graph databases, query languages such as Cypher and SPARQL, and graph data science.

Use cases

  • Fraud detection: Identifying complex, hidden relationships between accounts, transactions, and individuals to uncover fraudulent rings.
  • Supply chain management: Answering questions about supplier dependencies, logistical risks, and the impact of a disruption in one part of the chain on the entire network.
  • Drug discovery: Exploring relationships between genes, proteins, and diseases to identify potential targets for new therapies.
  • Advanced recommendation engines: Suggesting products or content based on intricate user behaviors and item relationships, not just on what’s popular.

What is vector RAG?

Vector RAG is currently the most common implementation of the RAG framework. It uses a vector database to store and retrieve information. In this approach, text data (e.g., documents, articles, web pages) is broken down into smaller chunks, and each chunk is converted into a numerical representation called a vector embedding using an embedding model.

When a user submits a query, the query itself is also converted into a vector. The system then performs a similarity search within the vector database to find the text chunks whose vectors are closest to the query vector. These semantically similar chunks are then passed to the LLM as context.
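Here is a stripped-down sketch of that retrieval step, using bag-of-words term counts in place of a learned embedding model. This is pure illustration — production systems use dense embeddings and an approximate-nearest-neighbor index, not exact cosine over word counts.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "the cat sat on the mat",
    "stock prices fell sharply",
    "a dog slept on the rug",
]
query_vec = embed("where did the cat sit")
best_chunk = max(chunks, key=lambda c: cosine(query_vec, embed(c)))
```

The highest-scoring chunks become the context passed to the LLM — nothing about the relationships between chunks is consulted, which is precisely the gap graph RAG fills.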

Benefits

  • Simplicity and speed: Setting up a vector RAG pipeline is relatively straightforward. The process of embedding and searching is computationally efficient and fast, even with large datasets.
  • Handles unstructured data: It works exceptionally well with large volumes of unstructured text, such as PDFs, articles, and support tickets, without needing a predefined schema.
  • Broad applicability: Because it focuses on semantic meaning, it’s a versatile solution for a wide range of general-purpose Q&A and summarization tasks.
  • Mature ecosystem: There is a robust, growing ecosystem of vector databases, embedding models, and frameworks (such as LangChain and LlamaIndex) that simplify development.

Challenges

  • Lack of contextual relationships: Vector search can miss the nuanced relationships between pieces of information. It might retrieve facts that are semantically similar but not directly related, leading to less precise answers.
  • “Lost in the middle” problem: When too many documents are retrieved, the LLM may struggle to identify the most critical information, especially if it’s buried in the middle of the provided context.
  • Difficulty with granular data: For highly structured or tabular data, converting everything into text chunks can lead to precision loss and an inability to answer questions that depend on specific data points.

Use cases

  • Customer support chatbots: Quickly finding answers to user questions from a knowledge base of help articles, FAQs, and product manuals.
  • Document Q&A: Allowing users to “chat” with their documents, asking specific questions about a research paper, legal contract, or financial report.
  • Content discovery: Recommending articles, videos, or products based on the semantic meaning of a user’s search.
  • Enterprise search: Enhancing internal search engines to provide more relevant results from company-wide documents and resources.

Key differences between graph RAG and vector RAG

When to use graph RAG vs. vector RAG

Choosing between graph RAG and vector RAG depends entirely on your data and the types of questions you need to answer.

Use graph RAG when:

  • Relationships are key: Your data is highly connected, and the value lies in understanding those connections (e.g., social networks, supply chains, financial systems).
  • You need to answer complex, multi-hop questions: Users need to ask questions that require synthesizing information from multiple, related data points.
  • Explainability is critical: You need to be able to show exactly how the system arrived at an answer, which is crucial in highly regulated industries like finance and healthcare.

Use vector RAG when:

  • Your data is mostly unstructured text: You have a large corpus of documents, articles, or other text-based information.
  • You need a solution quickly: You want to build a proof-of-concept or a production system without heavy investment in data modeling.
  • The primary goal is semantic search and summarization: Your users need to find relevant passages in documents and get summarized answers.

The future of RAG systems

The debate isn’t about which RAG method will “win.” The future of RAG is hybrid. The most powerful AI systems will combine the strengths of both graph RAG and vector RAG.

Imagine a system that performs a vector search to quickly identify a relevant set of documents. Then, it uses a knowledge graph constructed from those documents to explore the specific relationships between entities mentioned. This multi-layered approach provides both the speed and scale of vector search and the depth and precision of graph traversal. This hybrid model allows an LLM to answer a broader range of questions with greater accuracy and context than either system could alone.
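A minimal sketch of that hybrid flow, under stated assumptions: keyword overlap stands in for the vector search, and all documents, entities, and helper names are invented for illustration.

```python
# Hybrid retrieval sketch: vector-style search narrows to candidate
# documents, then a small entity graph expands what they mention.
DOCS = {
    "doc1": "AcmeCorp acquired WidgetCo in 2024",
    "doc2": "WidgetCo makes the WidgetPro sensor",
    "doc3": "unrelated quarterly earnings summary",
}

# Entity graph extracted (offline) from the corpus: entity -> related entities.
GRAPH = {"AcmeCorp": ["WidgetCo"], "WidgetCo": ["WidgetPro"]}

def search(query: str, k: int = 2) -> list:
    # Keyword overlap stands in for the vector similarity search.
    q = set(query.lower().split())
    return sorted(DOCS, key=lambda d: len(q & set(DOCS[d].lower().split())), reverse=True)[:k]

candidates = search("who acquired WidgetCo")
# Graph step: expand one hop from entities mentioned in the candidate docs.
mentioned = [e for d in candidates for e in GRAPH if e in DOCS[d]]
expanded = sorted({n for e in mentioned for n in GRAPH[e]})
```

The candidate documents plus the graph-expanded entities together form the context handed to the LLM — breadth from the similarity search, precision from the traversal.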

Key takeaways and additional resources

  • RAG enhances LLMs by connecting them to external knowledge, improving accuracy, and reducing hallucinations.
  • Vector RAG is ideal for searching large volumes of unstructured text based on semantic meaning. It’s fast, scalable, and relatively simple to implement.
  • Graph RAG excels at navigating highly connected data to answer complex questions that depend on understanding relationships. It offers greater precision and explainability.
  • The right choice depends on your data’s structure and your application’s requirements.
  • Hybrid systems that combine both approaches represent the future of building sophisticated, context-aware AI applications.


FAQ

What are the main advantages of graph RAG over vector RAG? The main advantages are its ability to understand and utilize explicit relationships within data, answer complex multi-hop questions, and provide greater explainability for its answers by tracing the query path through the graph.

Can you combine graph RAG and vector RAG into a single system? Yes, and this is becoming a powerful pattern. A hybrid approach can use vector search for initial, broad retrieval, then use a knowledge graph to refine context and explore specific relationships, leveraging the strengths of both methods.

Is graph RAG or vector RAG better for large-scale enterprise data? It depends on the type of data. If the enterprise data is a massive collection of unstructured documents (reports, emails, etc.), vector RAG is a great starting point. If the data involves complex relationships (e.g., organizational charts, customer interaction histories, product dependencies), graph RAG will deliver more value and deeper insights.

How do graph databases differ from vector databases in RAG applications? Graph databases store data as nodes and edges, optimized for querying relationships. Vector databases store data as high-dimensional vectors and are optimized to find the nearest neighbors of a query vector using a distance metric. One stores explicit connections, while the other stores semantic similarity.

Does graph RAG require more computational resources than vector RAG? The upfront resource requirement for graph RAG can be higher, particularly in the data modeling and ingestion phase. However, for certain complex queries, traversing a well-structured graph can be more efficient than sifting through thousands of semantically similar but potentially irrelevant text chunks retrieved by a vector search. Query performance depends heavily on the specific use case and database optimization.

The post A Breakdown of Graph RAG vs. Vector RAG appeared first on The Couchbase Blog.


How to Build an MCP Server with Python, Docker, and Claude Code


Every MCP tutorial I've found so far has followed the same basic script: build a server, point Claude Desktop at it, screenshot the chat window, done.

This is fine if you want a demo. But it's not fine if you want something you can ship, defend in an interview, or hand to another developer without a README that starts with "first, install this Electron app."

So I built an MCP server in Python, containerized it with Docker, and wired it into Claude Code – all from the terminal, no GUI required.

This article walks through the full loop in one afternoon: what MCP actually is, why it matters now that OpenAI and Google have adopted it, the real security problems nobody puts in their tutorial (complete with CVEs), and every command you need to go from an empty directory to a working tool.

If you're between jobs and need a portfolio project that shows you understand how AI tooling actually works under the hood, this is the one.


What You Will Build

By the end of this tutorial, you will have:

  • A Python MCP server that exposes custom tools to any MCP-compatible AI client

  • A Docker container that packages the server for reproducible deployment

  • A working connection between that container and Claude Code in your terminal

  • An understanding of the security risks involved and how to mitigate the worst of them

The server we are building is a project scaffolder. You give it a project name and a language, and it generates a starter directory structure with the right files. It's simple enough to build in an afternoon, but useful enough to actually put on your résumé.

Prerequisites

You will need the following installed on your machine:

  • Python 3.10+ (check with python3 --version)

  • Docker (check with docker --version)

  • Claude Code with an active Claude Pro, Max, or API plan (check with claude --version)

  • Node.js 20+ (required by Claude Code – check with node --version)

  • A terminal you are comfortable in

If you don't have Claude Code installed yet, follow the official installation instructions. The npm installation method is deprecated, so make sure you use the native binary installer instead.

What is MCP (and Why Should You Care)?

The Model Context Protocol (MCP) is an open standard that lets AI models connect to external tools and data sources. Anthropic released it in November 2024, and within a year it became the default way to extend what an LLM can do. OpenAI adopted it in March 2025. Google DeepMind followed in April. The protocol now has over 97 million monthly SDK downloads and more than 10,000 active servers.

The easiest way to think about MCP is as a USB-C port for AI. Before MCP, every AI provider had its own way of calling tools. OpenAI had function calling. Google had its own format. If you wanted your tool to work with multiple models, you had to implement it multiple times. MCP gives you one interface that works everywhere.

Here is how the pieces fit together:

  • An MCP server exposes tools, resources, and prompts. It is your code.

  • An MCP client (like Claude Code, Claude Desktop, or Cursor) discovers those tools and calls them on behalf of the LLM.

  • The transport is how they communicate. For local servers, that's usually stdio (standard input/output). For remote servers, it's HTTP.

When you type a message in Claude Code and it decides to use one of your tools, here is what happens: Claude Code sends a JSON-RPC 2.0 message to your server over stdin, your server executes the tool and writes the result to stdout, and Claude Code reads it back. The LLM never talks to your server directly. The client is always in the middle.
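An abbreviated tool-call request over that transport looks roughly like this (field names follow JSON-RPC 2.0 plus MCP's `tools/call` method; the exact envelope can vary by protocol version):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "scaffold_project",
    "arguments": { "name": "weather-api", "language": "python" }
  }
}
```

Your server's reply travels back over stdout as a matching JSON-RPC response with the same `id`.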

If you want the deeper architecture breakdown, freeCodeCamp already has a solid explainer on how MCP works under the hood. Here, I will focus on building.

Why Claude Code Instead of Claude Desktop?

Most MCP tutorials use Claude Desktop as the client. That works, but Claude Code has a few advantages for developers:

  1. It lives in your terminal. No GUI to configure. No JSON files to hand-edit in hidden config directories. You add an MCP server with one command and you are done.

  2. It's already where you code. If you're writing the server, testing it, and connecting it, doing all of that in the same terminal session cuts the context switching.

  3. It works on headless machines. If you're SSHing into a dev box or running in CI, Claude Desktop isn't an option. Claude Code is.

  4. It's also an MCP server itself. Claude Code can expose its own tools (file reading, writing, shell commands) to other MCP clients via claude mcp serve. That's a neat trick we won't use today, but it's worth knowing about.

The relevant commands:

# Add an MCP server
claude mcp add <name> -- <command>

# List configured servers
claude mcp list

# Remove a server
claude mcp remove <name>

# Check MCP status inside Claude Code
/mcp

Step 1: Build the MCP Server

We're using FastMCP, a Python framework that handles all the protocol plumbing so you can focus on your tools. Create a new project directory and set it up:

mkdir mcp-scaffolder && cd mcp-scaffolder
python3 -m venv .venv
source .venv/bin/activate
pip install "mcp[cli]>=1.25,<2"

Why pin the version? The MCP Python SDK v2.0 is in development and will change the transport layer significantly. Pinning to >=1.25,<2 keeps your server working until you're ready to migrate.

Now create server.py:

# server.py
from mcp.server.fastmcp import FastMCP
import os
import json

mcp = FastMCP("project-scaffolder")

# Templates for different languages
TEMPLATES = {
    "python": {
        "files": {
            "main.py": '"""Entry point."""\n\n\ndef main():\n    print("Hello, world!")\n\n\nif __name__ == "__main__":\n    main()\n',
            "requirements.txt": "",
            "README.md": "# {name}\n\nA Python project.\n\n## Setup\n\n```bash\npip install -r requirements.txt\npython main.py\n```\n",
            ".gitignore": "__pycache__/\n*.pyc\n.venv/\n",
        },
        "dirs": ["tests"],
    },
    "node": {
        "files": {
            "index.js": 'console.log("Hello, world!");\n',
            "package.json": '{\n  "name": "{name}",\n  "version": "1.0.0",\n  "main": "index.js"\n}\n',
            "README.md": "# {name}\n\nA Node.js project.\n\n## Setup\n\n```bash\nnpm install\nnode index.js\n```\n",
            ".gitignore": "node_modules/\n",
        },
        "dirs": [],
    },
    "go": {
        "files": {
            "main.go": 'package main\n\nimport "fmt"\n\nfunc main() {\n\tfmt.Println("Hello, world!")\n}\n',
            "go.mod": "module {name}\n\ngo 1.21\n",
            "README.md": "# {name}\n\nA Go project.\n\n## Setup\n\n```bash\ngo run main.go\n```\n",
            ".gitignore": "bin/\n",
        },
        "dirs": ["cmd", "internal"],
    },
}


@mcp.tool()
def scaffold_project(name: str, language: str) -> str:
    """Create a new project directory structure.

    Args:
        name: The project name (used as the directory name)
        language: The programming language - one of: python, node, go
    """
    language = language.lower().strip()

    if language not in TEMPLATES:
        return json.dumps({
            "error": f"Unsupported language: {language}",
            "supported": list(TEMPLATES.keys()),
        })

    template = TEMPLATES[language]
    base_path = os.path.join(os.getcwd(), name)

    if os.path.exists(base_path):
        return json.dumps({
            "error": f"Directory already exists: {name}",
        })

    # Create the project directory
    os.makedirs(base_path, exist_ok=True)

    # Create subdirectories
    for dir_name in template["dirs"]:
        os.makedirs(os.path.join(base_path, dir_name), exist_ok=True)

    # Create files
    created_files = []
    for filename, content in template["files"].items():
        filepath = os.path.join(base_path, filename)
        formatted_content = content.replace("{name}", name)
        with open(filepath, "w") as f:
            f.write(formatted_content)
        created_files.append(filename)

    return json.dumps({
        "status": "created",
        "path": base_path,
        "language": language,
        "files": created_files,
        "directories": template["dirs"],
    })


@mcp.tool()
def list_templates() -> str:
    """List all available project templates and their contents."""
    result = {}
    for lang, template in TEMPLATES.items():
        result[lang] = {
            "files": list(template["files"].keys()),
            "directories": template["dirs"],
        }
    return json.dumps(result, indent=2)


if __name__ == "__main__":
    mcp.run(transport="stdio")

A few things to notice about this code:

Tools return strings. MCP tools communicate through text. I'm returning JSON strings so the LLM can parse the results reliably. You could return plain text, but structured data gives the model more to work with.

The @mcp.tool() decorator does the heavy lifting. FastMCP reads your function signature and docstring to generate the JSON schema that tells the LLM what this tool does, what arguments it takes, and what types they are. Good docstrings aren't optional here – they're how the LLM decides whether to call your tool.

transport="stdio" is the key line. This tells FastMCP to communicate over standard input/output, which is what Claude Code expects for local servers.

Step 2: Test It Locally

Before we Dockerize anything, make sure the server actually works:

# Quick smoke test - the server should start without errors
python server.py

You should see... nothing. That is correct. An MCP server over stdio just sits there waiting for JSON-RPC messages on stdin. Press Ctrl+C to stop it.

For a proper test, use the MCP Inspector (Anthropic's debugging tool):

# Install and run the inspector
npx @modelcontextprotocol/inspector python server.py

This opens a web interface where you can see your tools, call them manually, and inspect the JSON-RPC messages going back and forth. Verify that both scaffold_project and list_templates show up and return sensible results.

Here's a debugging tip that will save you time: If your MCP server logs anything to stdout, it will corrupt the JSON-RPC stream and the client will disconnect. Use stderr for all logging: print("debug info", file=sys.stderr). This is the single most common source of "my server connects but then immediately fails" bugs. The New Stack called stdio transport "incredibly fragile" for exactly this reason.

Step 3: Dockerize It

Create a Dockerfile in your project root:

FROM python:3.12-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy server code
COPY server.py .

# MCP servers over stdio need unbuffered output
ENV PYTHONUNBUFFERED=1

# The server reads from stdin and writes to stdout
CMD ["python", "server.py"]

Create requirements.txt:

mcp[cli]>=1.25,<2

Build and verify:

docker build -t mcp-scaffolder .

# Quick test - should start without errors
docker run -i mcp-scaffolder

Again, you'll see nothing because the server is waiting for input. Ctrl+C to stop.

Two things matter in this Dockerfile:

  1. PYTHONUNBUFFERED=1 is critical. Without it, Python buffers stdout, and the MCP client may hang waiting for responses that are sitting in a buffer. This is one of those bugs that works fine in local testing and breaks in Docker.

  2. docker run -i (interactive mode) is required. The -i flag keeps stdin open so the MCP client can send messages to the container. Without it, the server gets an immediate EOF and exits.

Step 4: Wire It Into Claude Code

Now connect your Docker container to Claude Code:

claude mcp add scaffolder -- docker run -i --rm mcp-scaffolder

That's the whole command. Let me break it down:

  • claude mcp add registers a new MCP server

  • scaffolder is the name you will reference it by

  • Everything after -- is the command Claude Code runs to start the server

  • docker run -i --rm mcp-scaffolder starts the container with interactive stdin and removes it when done

Verify that it registered:

claude mcp list

You should see scaffolder in the output with a stdio transport type.

Now launch Claude Code and check the connection:

claude

Once inside Claude Code, type /mcp to see the status of your MCP servers. You should see scaffolder listed as connected with two tools available.

Step 5: Use It

Still inside Claude Code, try it out:

Create a new Python project called "weather-api"

Claude Code should discover your scaffold_project tool, call it with name="weather-api" and language="python", and report back what it created. Check your filesystem and you should see the full project structure.

Try a few more:

What project templates are available?
Scaffold a Go project called "url-shortener"

If Claude Code doesn't pick up your tools, run /mcp to check the connection status. If it shows as disconnected, the most common causes are that the Docker image failed to build, stdout is being polluted (check for stray print statements), or the Docker daemon is not running.

Security: What the Other Tutorials Leave Out

This is the section most MCP tutorials skip. They should not. MCP has had real security incidents, not theoretical ones, and understanding them makes you a better developer.

The Prompt Injection Problem

MCP servers execute code on your machine based on what an LLM decides to do. If an attacker can influence what the LLM sees, they can influence what your server does. This is called prompt injection, and it is the number one unsolved security problem in the MCP ecosystem.

In May 2025, researchers at Invariant Labs demonstrated this against the official GitHub MCP server. They created a malicious GitHub issue that, when read by an AI agent, hijacked the agent into leaking private repository data (including salary information) into a public pull request. The root cause was an overly broad Personal Access Token combined with untrusted content landing in the LLM's context window.

This was not a contrived lab demo. It used the official GitHub MCP server, the kind of thing people install from the MCP server directory without a second thought.

Real CVEs, Not Theory

The ecosystem has accumulated real vulnerability reports:

  • CVE-2025-6514: A critical command-injection bug in mcp-remote, a popular OAuth proxy that 437,000+ environments used. An attacker could execute arbitrary OS commands through crafted OAuth redirect URIs.

  • CVE-2025-6515: Session hijacking in oatpp-mcp through predictable session IDs, letting attackers inject prompts into other users' sessions.

  • MCP Inspector RCE: Anthropic's own debugging tool allowed unauthenticated remote code execution. Inspecting a malicious server meant giving the attacker a shell on your machine.

An Equixly security assessment found command injection in 43% of tested MCP server implementations. Nearly a third were vulnerable to server-side request forgery.

What You Should Actually Do

For the server we built today, here is what matters:

Limit file system access

Our Docker container doesn't mount your home directory. That's intentional. If you need the server to write files to your host, mount only the specific directory you need: docker run -i --rm -v $(pwd)/projects:/app/projects mcp-scaffolder. Never mount / or ~.

Validate all inputs

Our scaffold_project tool checks that the language is in a known list and that the directory does not already exist. But think about what happens if someone passes name="../../etc/passwd" as the project name. Path traversal is the kind of thing you need to catch. Add this to the tool:

# Add this validation at the top of scaffold_project
if ".." in name or "/" in name or "\\" in name:
    return json.dumps({"error": "Invalid project name"})

Use least-privilege tokens

If your MCP server connects to an API, give it the minimum permissions it needs. The GitHub MCP incident happened because the PAT had access to every private repo. A read-only token scoped to one repo would have contained the blast radius.

Do not install MCP servers from untrusted sources

A malicious npm package posing as a "Postmark MCP Server" was caught silently BCC'ing all emails to an attacker's address. Treat MCP server packages with the same caution you would give any code that runs on your machine with your permissions.

What to Do Next

You have a working MCP server in a Docker container, connected to Claude Code. Here is how to make it portfolio-ready:

  1. Add more tools: The scaffolder is a starting point. Add a tool that reads a project's dependency file and lists outdated packages. Add one that generates a Dockerfile for an existing project. Each tool is a function with a decorator – the pattern is the same every time.

  2. Add tests: Write pytest tests that call your tool functions directly and verify the output. MCP tools are just Python functions. Test them like Python functions.

  3. Push the Docker image: Tag it and push to Docker Hub or GitHub Container Registry. Then your claude mcp add command becomes claude mcp add scaffolder -- docker run -i --rm yourusername/mcp-scaffolder:latest and anyone can use it.

  4. Write a README that explains the security model: What permissions does your server need? What file system access? What happens if inputs are malicious? Answering these questions in your README signals that you think about security, which is exactly what hiring managers are looking for right now.
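For the testing step above: MCP tools built with FastMCP are plain Python functions, so ordinary pytest works. This sketch assumes you factor the name check from the security section into a small helper — `validate_name` is a hypothetical name, so adjust it to your actual module layout.

```python
# test_server.py -- sketch; assumes the path-traversal check is
# factored out of scaffold_project into a helper named validate_name.
def validate_name(name: str) -> bool:
    """Reject empty names and path-traversal attempts."""
    return bool(name) and ".." not in name and "/" not in name and "\\" not in name

def test_rejects_traversal():
    assert not validate_name("../../etc/passwd")

def test_rejects_empty():
    assert not validate_name("")

def test_accepts_plain_name():
    assert validate_name("weather-api")
```

Run with `pytest test_server.py`. Tests that exercise `scaffold_project` itself can use pytest's `tmp_path` fixture so nothing is written to your real working directory.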

Wrapping Up

We built a Python MCP server with FastMCP, containerized it with Docker, and connected it to Claude Code. The whole thing fits in about 100 lines of Python, a six-line Dockerfile, and one claude mcp add command.

The MCP ecosystem is real and growing fast. The protocol has the backing of Anthropic, OpenAI, and Google. It's now governed by the Linux Foundation. But it's also young, and the security story is still being written. Build with it, but build with your eyes open.


The complete source code for this tutorial is on GitHub.




Uno Platform 6.5 Released With AI Agent Support, Unicode Text, and Studio Improvements


Uno Platform 6.5 introduces Antigravity AI agent support, allowing agents to verify app behavior at runtime. Hot Design now launches by default with a redesigned toolbar and new scope selector. The release also adds Unicode TextBox support for non-Latin scripts, improves WebView2 on WebAssembly, and resolves over 450 community issues across all supported platforms.

By Almir Vuk

GitHub Copilot CLI Tips & Tricks — Part 3: Parallelizing Work


In the previous posts we covered the different CLI modes and session management. This time we're looking at one of Copilot CLI's most powerful features: the /fleet command. If you've ever wished you could clone yourself to tackle several parts of a codebase at once, this is the closest thing to it.


What is /fleet?

When you send a prompt to Copilot CLI, by default a single agent works through the task sequentially. /fleet changes that model entirely.

The /fleet slash command lets Copilot CLI break down a complex request into smaller tasks and run them in parallel, maximizing efficiency and throughput. The main Copilot agent analyzes the prompt and determines whether it can be divided into smaller subtasks. It then acts as an orchestrator, managing the workflow and dependencies between those subtasks, each handled by a separate subagent.

In practice, this means a task that might take 20 minutes sequentially can complete in a fraction of the time — because independent chunks of work are being executed concurrently.
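
As a loose analogy (not Copilot's actual implementation), independent subtasks behave like jobs handed to a worker pool: total wall-clock time approaches that of the slowest subtask rather than the sum of all of them. A minimal Python sketch, with hypothetical subtask names:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical subagent work items: each is independent of the others,
# so an orchestrator can run them concurrently instead of back to back.
def run_subtask(name: str) -> str:
    return f"{name}: done"

subtasks = [
    "add tests for auth service",
    "add tests for billing service",
    "add tests for search service",
]

# The pool plays the orchestrator's role, fanning subtasks out to workers
# and collecting results in the original order.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_subtask, subtasks))

print(results)
```

The same logic drives the "good candidates" guidance below: the speedup only materializes when the subtasks don't depend on one another.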

How to use /fleet

The typical workflow is to use /fleet after creating an implementation plan. Switch into plan mode with Shift+Tab, describe the feature or change you want, and work with Copilot to produce a structured plan. Once the plan is complete, you'll be presented with two options:

  • Accept plan and build on autopilot + /fleet — Copilot immediately spins up subagents and works autonomously to implement the plan without further input.
  • Exit plan mode and prompt myself — you're dropped back to the main prompt, where you can then type /fleet implement the plan to kick things off manually.

The first option is the faster path. The second gives you a moment to review or tweak your prompt before committing.

You can also use /fleet directly without going through plan mode first, by prefixing any prompt with the command:

/fleet add unit tests for every service in src/services/

Copilot will assess whether the work can be parallelized and assign subtasks to subagents accordingly. For something like writing tests across multiple independent service files, this is a natural fit.




Monitoring subagents with /tasks

Once /fleet kicks off, you don't have to sit in the dark wondering what's happening. Use the /tasks slash command to see a list of all background tasks for the current session, including any subtasks being handled by subagents. Navigate the list with the up and down arrow keys. For each subagent task you can:

  • Press Enter to view details — and see a summary of what was done once it completes
  • Press k to kill the process
  • Press r to remove completed or killed subtasks from the list
  • Press Esc to exit the task list and return to the main prompt

This is your control panel while fleet is running. Make a habit of opening /tasks after launching /fleet so you can catch any subtask that gets stuck or goes in the wrong direction early.


When to reach for /fleet

Not every task benefits from parallelization. /fleet shines when your work is naturally divisible into independent chunks.

Good candidates:

  • Writing a test suite for an existing feature — each test file can be worked on independently
  • Applying a consistent change across multiple modules (e.g., updating an import path, migrating an API version)
  • Generating boilerplate for several similar components at once
  • Running a refactor across files that don't depend on each other

Poor candidates:

  • Tasks with strict sequential dependencies — if step B needs the output of step A, parallelization won't help and may cause conflicts
  • Ambiguous or exploratory tasks — if the goal isn't clearly defined, subagents may head in diverging directions
  • Small, single-file tasks — the orchestration overhead isn't worth it for simple jobs a single agent can handle quickly

When you're using autopilot mode and want the quickest possible completion of a large task, /fleet is the right tool. But if your task cannot be cleanly split into independent subtasks, the main agent will handle it sequentially regardless.

Wrapping up

/fleet is the multiplier that makes Copilot CLI genuinely competitive with human multitasking. Once you've identified a task that parallelizes well, the combination of plan mode + /fleet + autopilot is one of the most productive workflows the CLI offers.

In the next post, we'll look at extending GitHub Copilot agent behavior with hooks.
