Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

The Two Best Books on AI Aren't About AI


When people discuss the classic texts that define our current moment in artificial intelligence, the usual suspects always crop up. You’ll hear about Ray Kurzweil’s The Singularity Is Near, predicting our merger with machines, or Nick Bostrom’s Superintelligence, warning of the existential risks of powerful AI. These are important books, certainly. But I’ve found that two other books — neither of which is usually considered a classic AI text — have done far more to shape my understanding of where we are and where we’re going.

They might not be the obvious choices, but I reckon they are essential reading for anyone trying to make sense of the current moment.

The Economics of Brain Emulation

The first recommendation is The Age of Em by Robin Hanson.

Hanson published this book in 2016, just as deep learning was gaining traction but well before the large language model explosion. His premise: instead of a superintelligence built from new machine learning algorithms, imagine a future where we simply emulate human brains in silicon. These “Ems” (emulations) aren’t alien gods; they are copy-paste versions of human minds, running on fast hardware.

The book is a strange beast. It is technically speculative fiction, but it is not a novel. Instead, Hanson writes it like an economics textbook beamed back from the future. He consistently applies the principles of physics and economics to this hypothetical world.

What happens to wages when you can copy a worker a thousand times? What does retirement look like when you can run a simulation of yourself at 100x speed?

While our current path — building powerful intelligence from scratch via transformers — looks different from Hanson’s brain emulation scenario, the logic of his arguments is still relevant. We are entering an era where intelligence is becoming a commodity: cheap, copyable, and deployable at scale. Hanson’s analysis of armies of cloned brains operating the economy might be the best mental model we have for a future of millions of autonomous AI agents. He teaches us to think about the constraints on technology — energy, bandwidth, physical space — rather than just the magic of it.

What Minds Are Really Like

My second pick is The Mind Is Flat by Nick Chater.

Chater, a professor of behavioural science, published this in 2018. It is a book about psychology and cognitive science that never mentions generative AI; it appeared just as that technology was starting to work and has nothing to do with it. Yet the mind it describes sounds a lot like a large language model.

We tend to believe that our minds have immense depth — that beneath our conscious thoughts lies a vast reservoir of memories, beliefs, motives, and stable personality traits. Chater argues that this is a spectacular illusion. He presents a wealth of empirical evidence to suggest that the mind is actually a serial generative device. We don’t excavate deep truths from our subconscious; we generate and improvise thoughts on the fly, moment by moment (if not token by token).

In other words, the human mind Chater presents to us is not all that different from a large language model, and there’s plenty of evidence to suggest that this is indeed the mind we all possess.

When critics say that AI is “just a stochastic parrot” or “merely predicting the next word”, they often imply that human cognition is something far more profound. Chater demonstrates the uncomfortable opposite: we are all stochastic parrots. The mental depth we imagine is a trick of the light. Understanding this not only demystifies the human mind, it also helps us appreciate that an improvising, surface-level intelligence can still be incredibly powerful and creative.




Anthropic Safety Researcher Quits, Warning 'World is in Peril'

An anonymous reader shares a report: An Anthropic safety researcher quit, saying the "world is in peril" in part over AI advances. Mrinank Sharma said the safety team "constantly [faces] pressures to set aside what matters most," citing concerns about bioterrorism and other risks. Anthropic was founded with the explicit goal of creating safe AI; its CEO Dario Amodei said at Davos that AI progress is going too fast and called for regulation to force industry leaders to slow down. Other AI safety researchers have left leading firms, citing concerns about catastrophic risks.

Read more of this story at Slashdot.


Teaching AI Development Through Gamification


Teaching AI Development Through Gamification: Building an Interactive Learning Platform with Foundry Local

Introduction

Learning AI development can feel overwhelming. Developers face abstract concepts like embeddings, prompt engineering, and workflow orchestration, topics that traditional tutorials struggle to make tangible. How do you teach someone what an embedding "feels like" or why prompt engineering matters beyond theoretical examples?

The answer lies in experiential learning through gamification. Instead of reading about AI concepts, what if developers could play a game that teaches these ideas through progressively challenging levels, immediate feedback, and real AI interactions? This article explores exactly that: building an educational adventure game that transforms AI learning from abstract theory into hands-on exploration.

We'll dive into Foundry Local Learning Adventure, a JavaScript-based game that teaches AI fundamentals through five interactive levels. You'll learn how to create engaging educational experiences, integrate local AI models using Foundry Local, design progressive difficulty curves, and build cross-platform applications that run both in browsers and terminals. Whether you're an educator designing technical curriculum or a developer building learning tools, this architecture provides a proven blueprint for gamified technical education.

Why Gamification Works for Technical Learning

Traditional technical education follows a predictable pattern: read documentation, watch tutorials, attempt exercises, struggle with setup, eventually give up. The problem isn't content quality; it's engagement and friction. Gamification addresses both issues simultaneously. By framing learning as progression through levels, you create intrinsic motivation. Each completed challenge feels like unlocking a new ability in a game, triggering the same dopamine response that keeps players engaged in entertainment experiences. Progress is visible, achievements are celebrated, and setbacks feel like natural parts of the journey rather than personal failures.

More importantly, gamification reduces friction. Instead of "install dependencies, configure API keys, read documentation, write code, debug errors," learners simply start the game and begin playing. The game handles setup, provides guardrails, and offers immediate feedback. When a concept clicks, the game celebrates it. When learners struggle, hints appear automatically. For AI development specifically, gamification solves a unique challenge: making probabilistic, non-deterministic systems feel approachable. Traditional programming has clear right and wrong answers, but AI outputs vary. A game can frame this variability as exploration rather than failure, teaching developers to evaluate AI responses critically while maintaining confidence.

Architecture Overview: Dual-Platform Design for Maximum Reach

The Foundry Local Learning Adventure implements a clever dual-platform architecture that runs identically in web browsers and command-line terminals. This design maximizes accessibility: learners can start playing instantly in a browser, then graduate to CLI mode with real AI when they're ready to go deeper.

The web version prioritizes zero-friction onboarding. Open web/index.html directly in any browser: no server, no build step, no dependencies. The game loads immediately with simulated AI responses that teach concepts without requiring Foundry Local installation. Progress saves to localStorage, badges unlock as you complete challenges, and the entire experience works offline after the first load. This version is perfect for classrooms, conference demos, and learners who want to try before committing to local AI setup.

The CLI version provides the full experience with real AI interactions. Built on Node.js, this version connects to Foundry Local for authentic model responses. Instead of simulated answers, learners get actual AI outputs, see real latency measurements, and experience how prompt quality affects results. The terminal interface adds a nostalgic hacker aesthetic that appeals to developers while teaching command-driven AI interaction patterns.

Both versions share the same core game logic, level progression, and learning objectives. The abstraction layer looks like this:

// Shared game core (game/src/levels.js)
export const LEVELS = [
  {
    id: 1,
    title: "Meet the Model",
    objective: "Send your first message to an AI",
    challenge: "Start a conversation with the AI model",
    successCriteria: (response) => response.length > 10,
    hints: ["Just say hello!", "Any friendly greeting works"],
    points: 100
  },
  // ... 4 more levels
];

// Platform-specific AI adapters
// Web version (web/game-web.js)
async function getAIResponse(prompt, level) {
  // Simulated responses that teach concepts
  return simulateResponse(prompt, level);
}

// CLI version (src/game.js)
async function getAIResponse(prompt, level) {
  // Real Foundry Local API calls
  const response = await foundryClient.chat.completions.create({
    model: 'phi-4',
    messages: [{ role: 'user', content: prompt }]
  });
  return response.choices[0].message.content;
}

This architecture demonstrates several key principles for educational software:

  • Progressive disclosure: Start simple (web), add complexity optionally (CLI with real AI)
  • Consistent learning outcomes: Both platforms teach the same concepts, just with different implementation details
  • Zero barriers to entry: Requiring no installation eliminates the #1 reason learners abandon technical tutorials
  • Clear upgrade path: Web learners naturally want "the real thing" after completing simulated levels

Level Design: Teaching AI Concepts Through Progressive Challenges

The game's five levels form a carefully designed curriculum that builds AI understanding incrementally. Each level introduces one core concept, provides hands-on practice, and validates learning before proceeding.

Level 1: Meet the Model teaches the fundamental request-response pattern. Learners send their first message to an AI and see it respond. The challenge is deliberately trivial (just say hello) because the goal is building confidence. The level succeeds when the learner realizes "I can talk to an AI and it understands me." This moment of agency sets the foundation for everything else.

The implementation focuses on positive reinforcement:

// Level 1 success handler
function completeLevel1(userPrompt, aiResponse) {
  console.log("\n๐ŸŽ‰ SUCCESS! You've made contact with the AI!");
  console.log(`\nYou said: "${userPrompt}"`);
  console.log(`AI responded: "${aiResponse}"`);
  console.log("\nโœจ You earned the 'Prompt Apprentice' badge!");
  console.log("๐Ÿ† +100 points");
  
  // Show what just happened
  console.log("\n๐Ÿ“š What you learned:");
  console.log("  โ€ข AI models communicate through text messages");
  console.log("  โ€ข You send a prompt, the AI generates a response");
  console.log("  โ€ข This pattern works for any AI-powered application");
  
  // Tease next level
  console.log("\n๐ŸŽฏ Next up: Level 2 - Prompt Mastery");
  console.log("   Learn why some prompts work better than others!");
  
  updateProgress(1, true, 100);
}

This celebration pattern repeats throughout: explicit acknowledgment of success, an explanation of what was learned, and a preview of what's next. It transforms abstract concepts into concrete achievements.

Level 2: Prompt Mastery introduces prompt quality through comparison. The game presents a deliberately poor prompt: "tell me stuff about coding." Learners must rewrite it to be specific, contextual, and actionable. The game runs both prompts, displays results side-by-side, and asks learners to evaluate the difference.

// Level 2 challenge
async function runLevel2() {
  console.log("\n๐Ÿ“ Level 2: Prompt Mastery\n");
  
  const badPrompt = "tell me stuff about coding";
  console.log("โŒ Poor prompt example:");
  console.log(`   "${badPrompt}"`);
  console.log("\n   Problems:");
  console.log("   โ€ข Too vague - what about coding?");
  console.log("   โ€ข No context - skill level? language?");
  console.log("   โ€ข Unclear format - list? tutorial? examples?");
  
  console.log("\nโœ๏ธ  Your turn! Rewrite this to be clear and specific:");
  const userPrompt = await getUserInput();
  
  console.log("\nโš–๏ธ  Comparing results...\n");
  
  const badResponse = await getAIResponse(badPrompt, 2);
  const goodResponse = await getAIResponse(userPrompt, 2);
  
  console.log("๐Ÿ“Š Bad Prompt Result:");
  console.log(`   ${badResponse.substring(0, 150)}...`);
  console.log(`   Length: ${badResponse.length} chars\n`);
  
  console.log("๐Ÿ“Š Your Prompt Result:");
  console.log(`   ${goodResponse.substring(0, 150)}...`);
  console.log(`   Length: ${goodResponse.length} chars\n`);
  
  // Success criteria: longer response + specific keywords
  const success = assessPromptQuality(userPrompt, goodResponse);
  
  if (success) {
    console.log("โœ… Your prompt was much better!");
    console.log("   Notice how specificity generated more useful output.");
    completeLevel2();
  } else {
    console.log("๐Ÿ’ก Hint: Try adding these elements:");
    console.log("   โ€ข What programming language?");
    console.log("   โ€ข What's your skill level?");
    console.log("   โ€ข What format do you want? (tutorial, examples, etc.)");
  }
}

This comparative approach is powerful: learners don't just read about prompt engineering, they experience its impact directly. The before/after comparison makes quality differences undeniable.

Level 3: Embeddings Explorer demystifies semantic search through practical demonstration. Learners search a knowledge base about Foundry Local using natural language queries. The game shows how embedding similarity works by returning relevant content even when exact keywords don't match.

// Level 3 knowledge base
const knowledgeBase = [
  {
    id: 1,
    content: "Foundry Local runs AI models entirely on your device without internet",
    embedding: [0.23, 0.87, 0.12, ...] // Pre-computed for demo
  },
  {
    id: 2,
    content: "Use embeddings to find semantically similar content",
    embedding: [0.45, 0.21, 0.93, ...]
  },
  // ... more entries
];

async function searchKnowledge(query) {
  console.log(`\n🔍 Searching for: "${query}"\n`);
  
  // In real version, get embedding from Foundry Local
  // In web version, use pre-computed embeddings
  const queryEmbedding = await getEmbedding(query);
  
  // Calculate similarity to all knowledge base entries
  const results = knowledgeBase.map(item => ({
    ...item,
    similarity: cosineSimilarity(queryEmbedding, item.embedding)
  }))
  .sort((a, b) => b.similarity - a.similarity)
  .slice(0, 3);
  
  console.log("๐Ÿ“‘ Top matches:\n");
  results.forEach((result, index) => {
    console.log(`${index + 1}. (${(result.similarity * 100).toFixed(1)}% match)`);
    console.log(`   ${result.content}\n`);
  });
  
  return results;
}

Learners query things like "How do I run AI offline?" and discover content about Foundry Local's offline capabilities, even though the word "offline" appears nowhere in the result. (Under the hood, cosine similarity is just the dot product of the two embedding vectors divided by the product of their lengths, so scores near 1.0 mean the query and the entry point in nearly the same semantic direction.) This concrete demonstration of semantic understanding beats any theoretical explanation.

Level 4: Workflow Wizard teaches AI pipeline composition. Learners build a three-step workflow: summarize text → extract keywords → generate questions. Each step uses the previous output as input, demonstrating how complex AI tasks decompose into chains of simpler operations.

// Level 4 workflow execution
async function runWorkflow(inputText) {
  console.log("โš™๏ธ  Starting 3-step workflow...\n");
  
  // Step 1: Summarize
  console.log("๐Ÿ“ Step 1: Summarizing text...");
  const summary = await getAIResponse(
    `Summarize this in 2 sentences: ${inputText}`,
    4
  );
  console.log(`   Result: ${summary}\n`);
  
  // Step 2: Extract keywords (uses summary output)
  console.log("๐Ÿ”‘ Step 2: Extracting key terms...");
  const keywords = await getAIResponse(
    `Extract 5 important keywords from: ${summary}`,
    4
  );
  console.log(`   Keywords: ${keywords}\n`);
  
  // Step 3: Generate questions (uses keywords)
  console.log("โ“ Step 3: Generating study questions...");
  const questions = await getAIResponse(
    `Create 3 quiz questions about these topics: ${keywords}`,
    4
  );
  console.log(`   Questions:\n${questions}\n`);
  
  console.log("โœ… Workflow complete!");
  console.log("\n๐Ÿ’ก Notice how each step built on the previous output.");
  console.log("   This is how production AI applications work!");
}

This level bridges the gap between "toy examples" and real applications. Learners see firsthand how combining simple AI operations creates sophisticated functionality.

Level 5: Build Your Own Tool challenges learners to create a custom function that AI can invoke. This introduces agentic AI patterns where models don't just generate text; they take actions.

// Level 5 tool creation
async function createTool() {
  console.log("🔧 Level 5: Build Your Own Tool\n");
  console.log("Create a JavaScript function the AI can use.");
  console.log("Example: A calculator, weather lookup, or data formatter\n");
  
  console.log("Template:");
  console.log(`
function myTool(param1, param2) {
  // Your code here
  return result;
}
  `);
  
  const userCode = getUserToolCode();
  
  // Register tool with AI system
  registerTool({
    name: "user_custom_tool",
    description: "A tool created by the learner",
    function: eval(userCode) // Sandboxed in real version
  });
  
  // Give AI a task that requires the tool
  console.log("\n๐Ÿค– AI is now trying to use your tool...");
  const response = await getAIResponseWithTools(
    "Use the custom tool to solve this problem: ...",
    availableTools
  );
  
  console.log(`\nAI called your tool and got: ${response}`);
  console.log("๐ŸŽ‰ Congratulations! You've extended AI capabilities!");
}

Completing this level marks true understanding: learners aren't just using AI, they're shaping what it can do. This empowerment is the ultimate goal of technical education.

Building the Web Version: Zero-Install Educational Experience

The web version demonstrates how to create educational software that requires absolutely zero setup. This is critical for workshops, classroom settings, and casual learners who won't commit to installation until they see value.

The architecture is deliberately simple, vanilla JavaScript with no build tools and no package managers:

<!-- game/web/index.html -->
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Foundry Local Learning Adventure</title>
  <link rel="stylesheet" href="styles.css">
</head>
<body>
  <div class="game-container">
    <header>
      <h1>🎮 Foundry Local Learning Adventure</h1>
      <div class="progress-bar">
        <span id="level-indicator">Level 1 of 5</span>
        <span id="points-display">0 points</span>
      </div>
    </header>
    
    <main id="game-content">
      <!-- Level content loads here dynamically -->
    </main>
    
    <div class="controls">
      <button id="hint-btn">๐Ÿ’ก Hint</button>
      <button id="progress-btn">๐Ÿ“Š Progress</button>
    </div>
  </div>
  
  <script type="module" src="game-web.js"></script>
</body>
</html>

The JavaScript uses ES6 modules for clean organization without requiring a build step:

// game/web/game-web.js
import { LEVELS } from './game-data.js';
import { simulateAI } from './ai-simulator.js';

class LearningAdventure {
  constructor() {
    this.currentLevel = 1;
    this.progress = this.loadProgress();
    this.initializeUI();
  }
  
  loadProgress() {
    const saved = localStorage.getItem('learning-adventure-progress');
    return saved ? JSON.parse(saved) : {
      completedLevels: [],
      totalPoints: 0,
      badges: []
    };
  }
  
  saveProgress() {
    localStorage.setItem(
      'learning-adventure-progress',
      JSON.stringify(this.progress)
    );
  }
  
  async startLevel(levelNumber) {
    const level = LEVELS[levelNumber - 1];
    this.renderLevel(level);
    
    // Listen for user input
    document.getElementById('submit-btn').addEventListener('click', async () => {
      const userInput = document.getElementById('user-input').value;
      await this.handleUserInput(userInput, level);
    });
  }
  
  async handleUserInput(input, level) {
    // Show loading state
    this.showLoading(true);
    
    // Simulate AI response (web version)
    const response = await simulateAI(input, level.id);
    
    // Display response
    this.displayResponse(response);
    
    // Check success criteria
    if (level.successCriteria(response)) {
      this.completeLevel(level);
    } else {
      this.showHint(level.hints[0]);
    }
    
    this.showLoading(false);
  }
  
  completeLevel(level) {
    // Update progress
    this.progress.completedLevels.push(level.id);
    this.progress.totalPoints += level.points;
    this.progress.badges.push(level.badge);
    this.saveProgress();
    
    // Show celebration
    this.showSuccess(level);
    
    // Unlock next level
    if (level.id < 5) {
      this.unlockLevel(level.id + 1);
    } else {
      this.showGameComplete();
    }
  }
  
  showSuccess(level) {
    const modal = document.createElement('div');
    modal.className = 'success-modal';
    modal.innerHTML = `
      <div class="modal-content">
        <h2>🎉 Level Complete!</h2>
        <p>${level.title} - <strong>${level.points} points</strong></p>
        <div class="badge-earned">
          ${level.badge.emoji} ${level.badge.name}
        </div>
        <h3>What You Learned:</h3>
        <ul>
          ${level.learnings.map(l => `<li>${l}</li>`).join('')}
        </ul>
        <button onclick="this.closest('.success-modal').remove()">
          Continue to Next Level →
        </button>
      </div>
    `;
    document.body.appendChild(modal);
  }
}

// Start the game
const game = new LearningAdventure();
game.startLevel(1);

This architecture teaches several patterns for web-based educational tools:

  • LocalStorage for persistence: Progress survives page refreshes without requiring accounts or databases
  • ES6 modules for organization: Clean separation of concerns without build complexity
  • Simulated AI for offline operation: Scripted responses teach concepts without requiring API access
  • Progressive enhancement: Basic functionality works everywhere, enhanced features activate when available
  • Celebration animations: Visual feedback reinforces learning milestones

Implementing the CLI Version with Real AI Integration

The CLI version provides the authentic AI development experience. This version requires Node.js and Foundry Local, but rewards setup effort with genuine model interactions.

Installation uses a startup script that handles prerequisites:

#!/bin/bash
# scripts/start-game.sh

echo "๐ŸŽฎ Starting Foundry Local Learning Adventure..."

# Check Node.js
if ! command -v node &> /dev/null; then
  echo "โŒ Node.js not found. Install from https://nodejs.org/"
  exit 1
fi

# Check Foundry Local
if ! command -v foundry &> /dev/null; then
  echo "โŒ Foundry Local not found."
  echo "   Install: winget install Microsoft.FoundryLocal"
  exit 1
fi

# Start Foundry service
echo "๐Ÿš€ Starting Foundry Local service..."
foundry service start

# Wait for service
sleep 2

# Load model
echo "๐Ÿ“ฆ Loading Phi-4 model..."
foundry model load phi-4

# Install dependencies
echo "๐Ÿ“ฅ Installing game dependencies..."
npm install

# Start game
echo "โœ… Launching game..."
npm start

The game logic integrates with Foundry Local using the official SDK:

// game/src/game.js
import { FoundryLocalClient } from 'foundry-local-sdk';
import readline from 'readline/promises';

const client = new FoundryLocalClient({
  endpoint: 'http://127.0.0.1:5272' // Default Foundry Local port
});

async function getAIResponse(prompt, level) {
  try {
    const startTime = Date.now();
    
    const completion = await client.chat.completions.create({
      model: 'phi-4',
      messages: [
        {
          role: 'system',
          content: `You are Sage, a friendly AI mentor teaching ${LEVELS[level-1].title}.`
        },
        {
          role: 'user',
          content: prompt
        }
      ],
      temperature: 0.7,
      max_tokens: 300
    });
    
    const latency = Date.now() - startTime;
    
    console.log(`\n⏱️  AI responded in ${latency}ms`);
    return completion.choices[0].message.content;
    
  } catch (error) {
    console.error('❌ AI error:', error.message);
    console.log('💡 Falling back to demo mode...');
    return getDemoResponse(prompt, level);
  }
}

async function playLevel(levelNumber) {
  const level = LEVELS[levelNumber - 1];
  
  console.clear();
  console.log(`\n${'='.repeat(60)}`);
  console.log(`   Level ${levelNumber}: ${level.title}`);
  console.log(`${'='.repeat(60)}\n`);
  console.log(`🎯 ${level.objective}\n`);
  console.log(`📚 ${level.description}\n`);
  
  const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout
  });
  
  const userPrompt = await rl.question('Your prompt: ');
  rl.close();
  
  console.log('\n🤖 AI is thinking...');
  const response = await getAIResponse(userPrompt, levelNumber);
  
  console.log(`\n📨 AI Response:\n${response}\n`);
  
  // Evaluate success
  if (level.successCriteria(response, userPrompt)) {
    celebrateSuccess(level);
    updateProgress(levelNumber);
    
    if (levelNumber < 5) {
      const playNext = await askYesNo('Play next level?');
      if (playNext) {
        await playLevel(levelNumber + 1);
      }
    } else {
      showGameComplete();
    }
  } else {
    console.log(`\n💡 Hint: ${level.hints[0]}\n`);
    const retry = await askYesNo('Try again?');
    if (retry) {
      await playLevel(levelNumber);
    }
  }
}

The CLI version adds several enhancements that deepen learning:

  • Latency visibility: Display response times so learners understand local vs cloud performance differences
  • Graceful fallback: If Foundry Local fails, switch to demo mode automatically rather than crashing
  • Interactive prompts: Use readline for natural command-line interaction patterns
  • Progress persistence: Save to JSON files so learners can pause and resume
  • Command history: Log all prompts and responses for learners to review their progression

Key Takeaways and Educational Design Principles

Building effective educational software for technical audiences requires balancing several competing concerns: accessibility vs authenticity, simplicity vs depth, guidance vs exploration. The Foundry Local Learning Adventure succeeds by making deliberate architectural choices that prioritize learner experience.

Key principles demonstrated:

  • Zero-friction starts win: The web version eliminates all setup barriers, maximizing the chance learners will actually begin
  • Progressive challenge curves build confidence: Each level introduces exactly one new concept, building on previous knowledge
  • Immediate feedback accelerates learning: Learners know instantly if they succeeded, with clear explanations of why
  • Real tools create transferable skills: CLI version uses professional developer tools (Node, real APIs) that apply beyond the game
  • Celebration creates emotional investment: Badges, points, and success animations transform learning into achievement
  • Dual platforms expand reach: Web attracts casual learners, CLI converts them to serious practitioners

To extend this approach for your own educational projects, consider:

  • Domain-specific challenges: Adapt level structure to your technical domain (e.g., API design, database optimization, security practices)
  • Multiplayer competitions: Add leaderboards and time trials to introduce social motivation
  • Adaptive difficulty: Track learner performance and adjust challenge difficulty dynamically
  • Sandbox modes: After completing the curriculum, provide free-play areas for experimentation
  • Community sharing: Let learners share custom levels or challenges they've created

The complete implementation with all levels, both web and CLI versions, comprehensive tests, and deployment guides is available at github.com/leestott/FoundryLocal-LearningAdventure. You can play the web version immediately at leestott.github.io/FoundryLocal-LearningAdventure or clone the repository to experience the full CLI version with real AI.

Resources and Further Reading


Building a Local Research Desk: Multi-Agent Orchestration


Introduction

Multi-agent systems represent the next evolution of AI applications. Instead of a single model handling everything, specialised agents collaborate, each with defined responsibilities, passing context to one another, and producing results that no single agent could achieve alone. But building these systems typically requires cloud infrastructure, API keys, usage tracking, and the constant concern about what data leaves your machine.

What if you could build sophisticated multi-agent workflows entirely on your local machine, with no cloud dependencies? The Local Research & Synthesis Desk demonstrates exactly this. Using Microsoft Agent Framework (MAF) for orchestration and Foundry Local for on-device inference, this demo shows how to create a four-agent research pipeline that runs entirely on your hardware, no API keys, no data leaving your network, and complete control over every step.

This article walks through the architecture, implementation patterns, and practical code that makes multi-agent local AI possible. You'll learn how to bootstrap Foundry Local from Python, create specialised agents with distinct roles, wire them into both sequential and concurrent orchestration patterns, and implement tool calling for extended functionality. Whether you're building research tools, internal analysis systems, or simply exploring what's possible with local AI, this architecture provides a production-ready foundation.

Why Multi-Agent Architecture Matters

Single-agent AI systems hit limitations quickly. Ask one model to research a topic, analyse findings, identify gaps, and write a comprehensive report, and you'll get mediocre results. The model tries to do everything at once, with no opportunity for specialisation, review, or iterative refinement.

Multi-agent systems solve this by decomposing complex tasks into specialised roles. Each agent focuses on what it does best:

  • Planners break ambiguous questions into concrete sub-tasks
  • Retrievers focus exclusively on finding and extracting relevant information
  • Critics review work for gaps, contradictions, and quality issues
  • Writers synthesise everything into coherent, well-structured output

This separation of concerns mirrors how human teams work effectively. A research team doesn't have one person doing everything; it has researchers, fact-checkers, editors, and writers. Multi-agent AI systems apply the same principle to AI workflows, with each agent receiving the output of previous agents as context for its own specialised task.

The Local Research & Synthesis Desk implements this pattern with four primary agents, plus an optional ToolAgent for utility functions. Here's how user questions flow through the system:

User question
     │
     ▼
  Planner          ← sequential (must run first)
     │
     ├──► Retriever ┐
     │              ├─► merge   ← concurrent (independent tasks)
     └──► ToolAgent ┘
              │
              ▼
           Critic          ← sequential (needs retriever output)
              │
              ▼
           Writer          ← sequential (needs everything above)
              │
              ▼
        Final Report

This architecture demonstrates two essential orchestration patterns: sequential pipelines where each agent builds on the previous output, and concurrent fan-out where independent tasks run in parallel to save time.

The Technology Stack: Microsoft Agent Framework + Foundry Local

Before diving into implementation, let's understand the two core technologies that make this architecture possible and why they work so well together.

Microsoft Agent Framework (MAF)

The Microsoft Agent Framework provides building blocks for creating AI agents in Python and .NET. Unlike frameworks that require specific cloud providers, MAF works with any OpenAI-compatible API, which is exactly what Foundry Local provides.

The key abstraction in MAF is the ChatAgent. Each agent has:

  • Instructions: A system prompt that defines the agent's role and behaviour
  • Chat client: An OpenAI-compatible client for making inference calls
  • Tools: Optional functions the agent can invoke during execution
  • Name: An identifier for logging and observability

MAF handles message threading, tool execution, and response parsing automatically. You focus on designing agent behaviour rather than managing low-level API interactions.

Foundry Local

Foundry Local brings Azure AI Foundry's model catalog to your local machine. It automatically selects the best hardware acceleration available (GPU, NPU, or CPU) and exposes models through an OpenAI-compatible API. Models run entirely on-device with no data leaving your machine.

The foundry-local-sdk Python package provides programmatic control over the Foundry Local service. You can start the service, download models, and retrieve connection information, all from your Python code. This is the "control plane" that manages the local AI infrastructure.

The combination is powerful: MAF handles agent logic and orchestration, while Foundry Local provides the underlying inference. No cloud dependencies, no API keys, complete data privacy:

┌───────────────────────────────────────────────────────────────┐
│                          Your Machine                         │
│                                                               │
│  ┌──────────────────┐    Control Plane     ┌────────────────┐ │
│  │  Python App      │─(foundry-local-sdk)─►│ Foundry Local  │ │
│  │  (MAF agents)    │                      │    Service     │ │
│  │                  │    Data Plane        │                │ │
│  │ OpenAIChatClient │──(OpenAI API)───────►│  Model (LLM)   │ │
│  └──────────────────┘                      └────────────────┘ │
└───────────────────────────────────────────────────────────────┘

Bootstrapping Foundry Local from Python

The first practical challenge is starting Foundry Local programmatically. The FoundryLocalBootstrapper class handles this, encapsulating all the setup logic so the rest of the application can focus on agent behaviour.

The bootstrap process follows three steps: start the Foundry Local service if it's not running, download the requested model if it's not cached, and return connection information that MAF agents can use. Here's the core implementation:

from dataclasses import dataclass

@dataclass
class FoundryConnection:
    """Holds endpoint, API key, and model ID after bootstrap."""
    endpoint: str
    api_key: str
    model_id: str
    model_alias: str

This dataclass carries everything needed to connect MAF agents to Foundry Local. The endpoint is typically http://localhost:<port>/v1 (the port is assigned dynamically), and the API key is managed internally by Foundry Local.

import os

class FoundryLocalBootstrapper:
    def __init__(self, alias: str | None = None) -> None:
        self.alias = alias or os.getenv("MODEL_ALIAS", "qwen2.5-0.5b")

    def bootstrap(self) -> FoundryConnection:
        """Start service, download & load model, return connection info."""
        from foundry_local import FoundryLocalManager
        
        manager = FoundryLocalManager()
        model_info = manager.download_and_load_model(self.alias)
        
        return FoundryConnection(
            endpoint=manager.endpoint,
            api_key=manager.api_key,
            model_id=model_info.id,
            model_alias=self.alias,
        )

Key design decisions in this implementation:

  • Lazy import: The foundry_local import happens inside bootstrap() so the application can provide helpful error messages if the SDK isn't installed
  • Environment configuration: Model alias comes from MODEL_ALIAS environment variable or defaults to qwen2.5-0.5b
  • Automatic hardware selection: Foundry Local picks GPU, NPU, or CPU automatically, no configuration needed

The qwen2.5 model family is recommended because it supports function/tool calling, which the ToolAgent requires. For higher quality outputs, larger variants like qwen2.5-7b or qwen2.5-14b are available via the --model flag.
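Putting the pieces together, a minimal usage sketch of the bootstrapper and connection dataclass defined above might look like this (the printed message is illustrative only):

# Minimal usage sketch of the classes defined above (illustrative only)
if __name__ == "__main__":
    bootstrapper = FoundryLocalBootstrapper(alias="qwen2.5-0.5b")
    conn = bootstrapper.bootstrap()  # starts the service and loads the model if needed
    print(f"Foundry Local ready at {conn.endpoint} (model: {conn.model_id})")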

Creating Specialised Agents

With Foundry Local bootstrapped, the next step is creating agents with distinct roles. Each agent is a ChatAgent instance with carefully crafted instructions that focus it on a specific task.

The Planner Agent

The Planner receives a user question and available documents, then breaks the research task into concrete sub-tasks. Its instructions emphasise structured output: a numbered list of specific tasks rather than prose:

from agent_framework import ChatAgent
from agent_framework.openai import OpenAIChatClient

def _make_client(conn: FoundryConnection) -> OpenAIChatClient:
    """Create an MAF OpenAIChatClient pointing at Foundry Local."""
    return OpenAIChatClient(
        api_key=conn.api_key,
        base_url=conn.endpoint,
        model_id=conn.model_id,
    )

def create_planner(conn: FoundryConnection) -> ChatAgent:
    return ChatAgent(
        chat_client=_make_client(conn),
        name="Planner",
        instructions=(
            "You are a planning agent. Given a user's research question and a list "
            "of document snippets (if any), break the question into 2-4 concrete "
            "sub-tasks. Output ONLY a numbered list of tasks. Each task should state:\n"
            "  โ€ข What information is needed\n"
            "  โ€ข Which source documents might help (if known)\n"
            "Keep it concise โ€” no more than 6 lines total."
        ),
    )

Notice how the instructions are explicit about output format. Multi-agent systems work best when each agent produces structured, predictable output that downstream agents can parse reliably.
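To make that contract concrete, here is a small sketch (not from the demo repository; the helper name and regex are assumptions) of how a downstream step could split the Planner's numbered list into individual tasks:

import re

def parse_plan(plan_text: str) -> list[str]:
    """Split a numbered-list plan ("1. ...", "2. ...") into task strings (hypothetical helper)."""
    tasks: list[str] = []
    for line in plan_text.splitlines():
        match = re.match(r"\s*\d+[.)]\s+(.*\S)", line)
        if match:
            tasks.append(match.group(1))
    return tasks

# parse_plan("1. Find key features\n2. Compare on-device vs cloud")
# -> ["Find key features", "Compare on-device vs cloud"]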

The Retriever Agent

The Retriever receives the Planner's task list plus raw document content, then extracts and cites relevant passages. Its instructions emphasise citation format: a specific pattern that the Writer can reference later:

def create_retriever(conn: FoundryConnection) -> ChatAgent:
    return ChatAgent(
        chat_client=_make_client(conn),
        name="Retriever",
        instructions=(
            "You are a retrieval agent. You receive a research plan AND raw document "
            "text from local files. Your job:\n"
            "  1. Identify the most relevant passages for each task in the plan.\n"
            "  2. Output extracted snippets with citations in the format:\n"
            "     [filename.ext, lines X-Y]: \"quoted textโ€ฆ\"\n"
            "  3. If no relevant content exists, say so explicitly.\n"
            "Be precise โ€” quote only what is relevant, keep each snippet under 100 words."
        ),
    )

The citation format [filename.ext, lines X-Y] creates a consistent contract. The Writer knows exactly how to reference source material, and human reviewers can verify claims against original documents.

The Critic Agent

The Critic reviews the Retriever's work, identifying gaps and contradictions. This agent serves as a quality gate before the final report:

def create_critic(conn: FoundryConnection) -> ChatAgent:
    return ChatAgent(
        chat_client=_make_client(conn),
        name="Critic",
        instructions=(
            "You are a critical review agent. You receive a plan and extracted snippets. "
            "Your job:\n"
            "  1. Check for gaps โ€” are any plan tasks unanswered?\n"
            "  2. Check for contradictions between snippets.\n"
            "  3. Suggest 1-2 specific improvements or missing details.\n"
            "Output a short numbered list of issues (or say 'No issues found')."
        ),
    )

Critics are essential for production systems. Without this review step, the Writer might produce confident-sounding reports with missing information or internal contradictions.

The Writer Agent

The Writer receives everything (original question, plan, extracted snippets, and critic review) and produces the final report:

def create_writer(conn: FoundryConnection) -> ChatAgent:
    return ChatAgent(
        chat_client=_make_client(conn),
        name="Writer",
        instructions=(
            "You are the final report writer. You receive:\n"
            "  โ€ข The original question\n"
            "  โ€ข A plan, extracted snippets with citations, and a critic review\n\n"
            "Produce a clear, well-structured answer (3-5 paragraphs). "
            "Requirements:\n"
            "  โ€ข Cite sources using [filename.ext, lines X-Y] notation\n"
            "  โ€ข Address any gaps the critic raised (note if unresolvable)\n"
            "  โ€ข End with a one-sentence summary\n"
            "Do NOT fabricate citations โ€” only use citations provided by the Retriever."
        ),
    )

The final instruction, "Do NOT fabricate citations", is crucial for responsible AI. The Writer has access only to citations the Retriever provided, preventing hallucinated references that plague single-agent research systems.
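That constraint can also be checked in code. The sketch below is not part of the demo (the regex and helper names are assumptions); it shows one way to flag citations in the Writer's report that never appeared in the Retriever's output:

import re

# Matches the citation contract used above, e.g. [notes.md, lines 12-18]
CITATION_RE = re.compile(r"\[([\w.\-]+),\s*lines\s*(\d+)-(\d+)\]")

def extract_citations(text: str) -> set[str]:
    """Return citations found in a block of text, normalised to 'file:start-end'."""
    return {f"{m.group(1)}:{m.group(2)}-{m.group(3)}" for m in CITATION_RE.finditer(text)}

def unverified_citations(report: str, snippets: str) -> set[str]:
    """Citations the Writer used that the Retriever never produced (likely fabricated)."""
    return extract_citations(report) - extract_citations(snippets)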

Implementing Sequential Orchestration

With agents defined, the orchestrator connects them into a workflow. Sequential orchestration is the simpler pattern: each agent runs after the previous one completes, passing its output as input to the next agent.

The implementation uses Python's async/await for clean asynchronous execution:

import asyncio
import time
from dataclasses import dataclass, field

@dataclass
class StepResult:
    """Captures one agent step for observability."""
    agent_name: str
    input_text: str
    output_text: str
    elapsed_sec: float

@dataclass
class WorkflowResult:
    """Final result of the entire orchestration run."""
    question: str
    steps: list[StepResult] = field(default_factory=list)
    final_report: str = ""

async def _run_agent(agent: ChatAgent, prompt: str) -> tuple[str, float]:
    """Execute a single agent and measure elapsed time."""
    start = time.perf_counter()
    response = await agent.run(prompt)
    elapsed = time.perf_counter() - start
    return response.content, elapsed

The StepResult dataclass captures everything needed for observability: what went in, what came out, and how long it took. This information is invaluable for debugging and optimisation.
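As one example of how that data might be used, a small helper (a sketch, not part of the demo) could print a per-agent timing breakdown after a run:

def print_timing_summary(result: WorkflowResult) -> None:
    """Print a per-agent timing breakdown from the captured StepResults (illustrative sketch)."""
    total = sum(step.elapsed_sec for step in result.steps)
    for step in result.steps:
        share = (step.elapsed_sec / total * 100) if total else 0.0
        print(f"{step.agent_name:<10} {step.elapsed_sec:6.2f}s  ({share:4.1f}% of agent time)")
    print(f"{'Total':<10} {total:6.2f}s")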

The sequential pipeline chains agents together, building context progressively:

async def run_sequential_workflow(
    question: str,
    docs: LoadedDocuments,
    conn: FoundryConnection,
) -> WorkflowResult:
    wf = WorkflowResult(question=question)
    doc_block = docs.combined_text if docs.chunks else "(no documents provided)"
    
    # Step 1 — Plan
    planner = create_planner(conn)
    planner_prompt = f"User question: {question}\n\nAvailable documents:\n{doc_block}"
    plan_text, elapsed = await _run_agent(planner, planner_prompt)
    wf.steps.append(StepResult("Planner", planner_prompt, plan_text, elapsed))
    
    # Step 2 — Retrieve
    retriever = create_retriever(conn)
    retriever_prompt = f"Plan:\n{plan_text}\n\nDocuments:\n{doc_block}"
    snippets_text, elapsed = await _run_agent(retriever, retriever_prompt)
    wf.steps.append(StepResult("Retriever", retriever_prompt, snippets_text, elapsed))
    
    # Step 3 — Critique
    critic = create_critic(conn)
    critic_prompt = f"Plan:\n{plan_text}\n\nExtracted snippets:\n{snippets_text}"
    critique_text, elapsed = await _run_agent(critic, critic_prompt)
    wf.steps.append(StepResult("Critic", critic_prompt, critique_text, elapsed))
    
    # Step 4 — Write
    writer = create_writer(conn)
    writer_prompt = (
        f"Original question: {question}\n\n"
        f"Plan:\n{plan_text}\n\n"
        f"Extracted snippets:\n{snippets_text}\n\n"
        f"Critic review:\n{critique_text}"
    )
    report_text, elapsed = await _run_agent(writer, writer_prompt)
    wf.steps.append(StepResult("Writer", writer_prompt, report_text, elapsed))
    wf.final_report = report_text
    
    return wf

Each step receives all relevant context from previous steps. The Writer gets the most comprehensive prompt (original question, plan, snippets, and critique), enabling it to produce a well-informed final report.
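One detail the snippets above assume is the LoadedDocuments container, which the repository defines elsewhere. As a stand-in, a minimal version with the two fields used here (chunks and combined_text) could look like this:

from dataclasses import dataclass, field
from pathlib import Path

@dataclass
class LoadedDocuments:
    """Minimal stand-in for the demo's document container (fields as used above)."""
    chunks: list[str] = field(default_factory=list)
    combined_text: str = ""

def load_documents(folder: str, max_chars: int = 4000) -> LoadedDocuments:
    """Read .md/.txt files and split them into fixed-size chunks (simplified sketch)."""
    chunks: list[str] = []
    for path in sorted(Path(folder).glob("*")):
        if path.suffix.lower() not in {".md", ".txt"}:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        chunks.extend(text[i:i + max_chars] for i in range(0, len(text), max_chars))
    return LoadedDocuments(chunks=chunks, combined_text="\n\n".join(chunks))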

Adding Concurrent Fan-Out

Sequential orchestration works well but can be slow. When tasks are independent (neither needs the other's output), running them in parallel saves time. The demo implements this with asyncio.gather.

Consider the Retriever and ToolAgent: both need the Planner's output, but neither depends on the other. Running them concurrently cuts the wait time roughly in half:

async def run_concurrent_retrieval(
    plan_text: str,
    docs: LoadedDocuments,
    conn: FoundryConnection,
) -> tuple[str, str]:
    """Run Retriever and ToolAgent in parallel."""
    retriever = create_retriever(conn)
    tool_agent = create_tool_agent(conn)
    
    doc_block = docs.combined_text if docs.chunks else "(no documents)"
    
    retriever_prompt = f"Plan:\n{plan_text}\n\nDocuments:\n{doc_block}"
    tool_prompt = f"Analyse the following documents for word count and keywords:\n{doc_block}"
    
    # Execute both agents concurrently
    (snippets_text, r_elapsed), (tool_text, t_elapsed) = await asyncio.gather(
        _run_agent(retriever, retriever_prompt),
        _run_agent(tool_agent, tool_prompt),
    )
    
    return snippets_text, tool_text

The asyncio.gather function runs both coroutines concurrently and returns when both complete. If the Retriever takes 3 seconds and the ToolAgent takes 1.5 seconds, the total wait is approximately 3 seconds rather than 4.5 seconds.
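The timing intuition is easy to verify in isolation. The self-contained sketch below uses asyncio.sleep as a stand-in for the two agent calls and prints a wall-clock time close to the slower task rather than the sum:

import asyncio
import time

async def fake_agent(name: str, seconds: float) -> str:
    # Stand-in for an agent call that takes `seconds` to respond
    await asyncio.sleep(seconds)
    return f"{name} done"

async def main() -> None:
    start = time.perf_counter()
    results = await asyncio.gather(fake_agent("Retriever", 3.0), fake_agent("ToolAgent", 1.5))
    print(results, f"~{time.perf_counter() - start:.1f}s elapsed")  # roughly 3.0s, not 4.5s

asyncio.run(main())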

The full workflow combines both patterns: sequential where dependencies require it, concurrent where independence allows it:

async def run_full_workflow(
    question: str,
    docs: LoadedDocuments,
    conn: FoundryConnection,
) -> WorkflowResult:
    """
    End-to-end workflow that showcases BOTH orchestration patterns:
      1. Planner runs first (sequential — must happen before anything else).
      2. Retriever + ToolAgent run concurrently (fan-out on independent tasks).
      3. Critic reviews (sequential — needs retriever output).
      4. Writer produces final report (sequential — needs everything above).
    """
    wf = WorkflowResult(question=question)
    doc_block = docs.combined_text if docs.chunks else "(no documents provided)"
    
    # Step 1: Planner (sequential)
    planner_prompt = f"User question: {question}\n\nAvailable documents:\n{doc_block}"
    plan_text, elapsed = await _run_agent(create_planner(conn), planner_prompt)
    wf.steps.append(StepResult("Planner", planner_prompt, plan_text, elapsed))
    
    # Step 2: Concurrent fan-out (Retriever + ToolAgent)
    snippets_text, tool_text = await run_concurrent_retrieval(plan_text, docs, conn)
    
    # Step 3: Critic (sequential — needs retriever output)
    critic_prompt = f"Plan:\n{plan_text}\n\nSnippets:\n{snippets_text}\n\nStats:\n{tool_text}"
    critique_text, elapsed = await _run_agent(create_critic(conn), critic_prompt)
    
    # Step 4: Writer (sequential — needs everything)
    writer_prompt = (
        f"Original question: {question}\n\n"
        f"Plan:\n{plan_text}\n\n"
        f"Snippets:\n{snippets_text}\n\n"
        f"Stats:\n{tool_text}\n\n"
        f"Critique:\n{critique_text}"
    )
    report_text, elapsed = await _run_agent(create_writer(conn), writer_prompt)
    wf.final_report = report_text
    
    return wf

This hybrid approach maximises both correctness and performance. Dependencies are respected, but independent work happens in parallel.

Implementing Tool Calling

Some agents benefit from deterministic tools rather than relying entirely on LLM generation. The ToolAgent demonstrates this pattern with two utility functions: word counting and keyword extraction.

MAF supports tool calling through function declarations with Pydantic type annotations:

from typing import Annotated
from pydantic import Field

def word_count(
    text: Annotated[str, Field(description="The text to count words in")]
) -> int:
    """Count words in a text string."""
    return len(text.split())

def extract_keywords(
    text: Annotated[str, Field(description="The text to extract keywords from")],
    top_n: Annotated[int, Field(description="Number of keywords to return")] = 5
) -> list[str]:
    """Extract most frequent words (simple implementation)."""
    words = text.lower().split()
    # Filter common words, count frequencies, return top N
    word_counts = {}
    for word in words:
        if len(word) > 3:  # Skip short words
            word_counts[word] = word_counts.get(word, 0) + 1
    sorted_words = sorted(word_counts.items(), key=lambda x: x[1], reverse=True)
    return [word for word, count in sorted_words[:top_n]]

The Annotated type with Field descriptions provides metadata that MAF uses to generate function schemas for the LLM. When the model needs to count words, it invokes the word_count tool rather than attempting to count in its response (which LLMs notoriously struggle with).

The ToolAgent receives these functions in its constructor:

def create_tool_agent(conn: FoundryConnection) -> ChatAgent:
    return ChatAgent(
        chat_client=_make_client(conn),
        name="ToolHelper",
        instructions=(
            "You are a utility agent. Use the provided tools to compute "
            "word counts or extract keywords when asked. Return the tool "
            "output directly โ€” do not embellish."
        ),
        tools=[word_count, extract_keywords],
    )

This pattern, combining LLM reasoning with deterministic tools, produces more reliable results. The LLM decides when to use tools and how to interpret results, but the actual computation happens in Python where precision is guaranteed.

Running the Demo

With the architecture explained, here's how to run the demo yourself. Setup takes about five minutes.

Prerequisites

You'll need Python 3.10 or higher and Foundry Local installed on your machine. Install Foundry Local by following the instructions at github.com/microsoft/Foundry-Local, then verify it works:

foundry --help

Installation

Clone the repository and set up a virtual environment:

git clone https://github.com/leestott/agentframework--foundrylocal.git
cd agentframework--foundrylocal

python -m venv .venv

# Windows
.venv\Scripts\activate

# macOS / Linux
source .venv/bin/activate

pip install -r requirements.txt
copy .env.example .env    # Windows (use "cp .env.example .env" on macOS/Linux)

CLI Usage

Run the research workflow from the command line:

python -m src.app "What are the key features of Foundry Local?" --docs ./data

You'll see agent-by-agent progress with timing information:

┌─ Local Research & Synthesis Desk ───────────────┐
│ Multi-Agent Orchestration • MAF + Foundry Local │
│ Mode: full                                      │
└─────────────────────────────────────────────────┘

  Model : qwen2.5-0.5b-instruct-cuda-gpu:4  (alias: qwen2.5-0.5b)
  Documents: 3 file(s), 4 chunk(s) from ./data

┌─────────────────────────────────────────┐
│ 🗂  Planner — breaking the question …   │
└─────────────────────────────────────────┘
  1. Identify key features of Foundry Local …
  2. Compare on-device vs cloud inference …
  ⏱  2.3s

⚡ Concurrent fan-out — Retriever + ToolAgent running in parallel …
  Retriever finished in 3.1s
  ToolAgent finished in 1.4s

┌─────────────────────────────────────────┐
│ ✍️  Writer — composing the final report │
└─────────────────────────────────────────┘
  (Final synthesised report with citations)
  ⏱  4.2s

✅ Workflow complete — Total: 14.8s, Steps: 5

Web Interface

For a visual experience, launch the Flask-based web UI:

python -m src.app.web

Open http://localhost:5000 in your browser. The web UI provides real-time streaming of agent progress, a visual pipeline showing both orchestration patterns, and an interactive demos tab showcasing tool calling capabilities.

CLI Options

The CLI supports several options for customisation:

  • --docs: Folder of local documents to search (default: ./data)
  • --model: Foundry Local model alias (default: qwen2.5-0.5b)
  • --mode: full for sequential + concurrent, or sequential for simpler pipeline
  • --log-level: DEBUG, INFO, WARNING, or ERROR

For higher quality output, try larger models:

python -m src.app "Explain multi-agent benefits" --docs ./data --model qwen2.5-7b

Interactive Demos: Exploring MAF Capabilities

Beyond the research workflow, the web UI includes five interactive demos showcasing different MAF capabilities. Each demonstrates a specific pattern with suggested prompts and real-time results.

Weather Tools demonstrates multi-tool calling with an agent that provides weather information, forecasts, city comparisons, and activity recommendations. The agent uses four different tools to construct comprehensive responses.

Math Calculator shows precise calculation through tool calling. The agent uses arithmetic, percentage, unit conversion, compound interest, and statistics tools instead of attempting mental math, eliminating the calculation errors that plague LLM-only approaches.

Sentiment Analyser performs structured text analysis, detecting sentiment, emotions, key phrases, and word frequency through lexicon-based tools. The results are deterministic and verifiable.

Code Reviewer analyses code for style issues, complexity problems, potential bugs, and improvement opportunities. This demonstrates how tool calling can extend AI capabilities into domain-specific analysis.

Multi-Agent Debate showcases sequential orchestration with interdependent outputs. Three agents (one arguing for a position, one against, and a moderator) debate a topic. Each agent receives the previous agent's output, demonstrating how multi-agent systems can explore topics from multiple perspectives.

Key Takeaways

  • Multi-agent systems decompose complex tasks: Specialised agents (Planner, Retriever, Critic, Writer) produce better results than single-agent approaches by focusing each agent on what it does best
  • Local AI eliminates cloud dependencies: Foundry Local provides on-device inference with automatic hardware acceleration, keeping all data on your machine
  • MAF simplifies agent development: The ChatAgent abstraction handles message threading, tool execution, and response parsing, letting you focus on agent behaviour
  • Sequential and concurrent orchestration serve different needs: Sequential pipelines maintain dependencies; concurrent fan-out parallelises independent work
  • Tool calling adds precision: Deterministic functions for counting, calculation, and analysis complement LLM reasoning for more reliable results
  • The same patterns scale to production: This demo architecture (bootstrapping, agent creation, orchestration) applies directly to real-world research and analysis systems

Conclusion and Next Steps

The Local Research & Synthesis Desk demonstrates that sophisticated multi-agent AI systems don't require cloud infrastructure. With Microsoft Agent Framework for orchestration and Foundry Local for inference, you can build production-quality workflows that run entirely on your hardware.

The architecture patterns shown here (specialised agents with clear roles, sequential pipelines for dependent tasks, concurrent fan-out for independent work, tool calling for precision) form a foundation for building more sophisticated systems. Consider extending this demo with:

  • Additional agents for fact-checking, summarisation, or domain-specific analysis
  • Richer tool integrations connecting to databases, APIs, or local services
  • Human-in-the-loop approval gates before producing final reports
  • Different model sizes for different agents based on task complexity

Start with the demo, understand the patterns, then apply them to your own research and analysis challenges. The future of AI isn't just cloud models; it's intelligent systems that run wherever your data lives.

Resources


Writing for ‘civic clarity’ (plus, the power of short sentences), with Roy Peter Clark


1159. This week, we look at "civic clarity" with writing instructor Roy Peter Clark in a newly edited version of our 2020 conversation. We look at the ethical code of clear communication and why "civic clarity" is more important now than ever. We also discuss the strategy of "writing short" for social media and how to navigate the difficult process of cutting a draft to find your focus.

Poynter Institute

Roy Peter Clark's Facebook

🔗 Join the Grammar Girl Patreon.

🔗 Share your familect recording in Speakpipe or by leaving a voicemail at 833-214-GIRL (833-214-4475)

🔗 Watch my LinkedIn Learning writing courses.

🔗 Subscribe to the newsletter.

🔗 Take our advertising survey.

🔗 Get the edited transcript.

🔗 Get Grammar Girl books.

| HOST: Mignon Fogarty

| Grammar Girl is part of the Quick and Dirty Tips podcast network.

  • Audio Engineer: Dan Feierabend, Maram Elnagheeb
  • Director of Podcast: Holly Hutchings
  • Advertising Operations Specialist: Morgan Christianson
  • Marketing and Video: Nat Hoopes, Rebekah Sebastian
  • Podcast Associate: Maram Elnagheeb

| Theme music by Catherine Rannus.

| Grammar Girl Social Media: YouTube. TikTok. Facebook. Threads. Instagram. LinkedIn. Mastodon. Bluesky.


Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.





Download audio: https://dts.podtrac.com/redirect.mp3/media.blubrry.com/grammargirl/stitcher.simplecastaudio.com/e7b2fc84-d82d-4b4d-980c-6414facd80c3/episodes/27ba2c30-3fd5-495b-93b9-39ad77151299/audio/128/default.mp3?aid=rss_feed&awCollectionId=e7b2fc84-d82d-4b4d-980c-6414facd80c3&awEpisodeId=27ba2c30-3fd5-495b-93b9-39ad77151299&feed=XcH2p3Ah

The Role of AI in Secure Software with Ben Dechrai

How does Artificial Intelligence impact our approach to building secure software? Carl and Richard talk to Ben Dechrai about his experiences working with AI tooling and building AI apps, and how that impacts security. Ben talks about the concerns organizations have about using AI tools - what these tools might do with the code they are exposed to, as well as the code the tools generate. The conversation steers to local AI as a solution, although so far, the equipment and tools are very limited. Ben also talks about how AI tools are being used to both attack and secure software and the challenges of this arms race - hopefully the good guys win!



Download audio: https://dts.podtrac.com/redirect.mp3/api.spreaker.com/download/episode/70004695/dotnetrocks_1989_the_role_of_ai_in_secure_software.mp3