Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Announcing the Microsoft Store Awards 2025 winners

The Microsoft Store Awards honor outstanding applications that elevate user experiences, drive productivity and inspire creativity across the Windows ecosystem. We are proud to announce this year’s winners, each exemplifying technical excellence, user satisfaction and transformative potential. Whether you’re an app enthusiast or a developer looking to make your mark, discover more about our ongoing commitment to innovation and how you can join the thriving Microsoft Store community.

AI Assistants category

This year, there was a tie in the AI Assistants category, and both winners are recognized for their excellence.

Winner: Perplexity by Perplexity AI. Perplexity’s Windows app features native voice dictation, multi-modal AI search and deep desktop integration for rapid research. Advanced AI models, Pro Search and guided deep research modes facilitate interactive exploration, while IT policies and keyboard shortcuts support both personal and enterprise workflows. Enhanced accessibility and seamless desktop integration make Perplexity a leader in AI-powered knowledge discovery.

Winner: ChatGPT by OpenAI. ChatGPT for Windows offers instant answers via the Alt + Space companion window, direct image and file uploads, and enterprise-level privacy controls. The app closely mirrors the web experience while providing unique Windows-only productivity enhancements, including IT policy controls and a privacy-first approach for business users. Its intuitive interface and capable integration make ChatGPT an indispensable tool for both individuals and organizations.

Business category

Winner: Invoice Maker & Estimate Creator by Moon Invoice (Moon Technolabs Pvt. Ltd.). Moon Invoice’s Invoice Maker & Estimate Creator automates business finance tasks with customizable templates and instant printing and sharing via WhatsApp or email. Dashboard analytics provide real-time tracking of payments and expenses, while support for multiple payment gateways and formats ensures flexibility for freelancers and SMEs. The app’s secure digital record-keeping, multi-channel communication and rapid document sharing make it a comprehensive solution for business owners seeking efficiency and control.

Computer-Using Agents (CUA) category

Winner: Manus by Manus AI. Manus for Windows redefines automation with its Computer-Using Agent (CUA) architecture. Operating within a secure sandbox, Manus autonomously executes complex, multi-step tasks ranging from running code (Python, JavaScript, Bash) and controlling headless browsers for web automation, to managing files and deploying applications. Its “Manus’s Computer” interface lets users observe, pause or guide ongoing actions live, while a multi-agent system plans and executes workflows like data analysis, content creation and report generation. Manus runs tasks fully in the background, can resume interrupted workflows and keeps detailed context for advanced automation, all while maintaining strict isolation and security. This makes Manus a powerful, flexible digital assistant for professionals seeking robust, transparent and secure automation on Windows.

Creativity category

Winner: n-Track Studio by n-Track S.r.l. n-Track Studio transforms Windows devices into full-featured recording studios, supporting unlimited audio and MIDI tracks, advanced effects, custom sound import and seamless project navigation. The app’s cross-platform workflow enables creators to collaborate and produce professional music, while AI-powered tools, VST plugin support and stable export capabilities make it the preferred choice for both creative professionals and hobbyists.

Developer Tools category

Winner: ngrok by ngrok. ngrok for Windows enables secure, seamless tunneling and remote access. The app ensures automatic updates, compatibility with Windows Defender and secure reverse proxy setup. Developers can run ngrok as a background service, benefiting from integration with Windows security and app management tools. Its trusted workflows extend to Microsoft and enterprise environments, helping teams ship faster while staying secure.

Education category

Winner: Scratch 3 by Scratch Foundation. Scratch 3 helps students build confidence in computational thinking through hands‑on projects, programming interactive stories, games and robotics. In class, learners can visualize logic with block coding, test ideas instantly and iterate with peer feedback, which reinforces problem solving and creativity. With extensions for Bluetooth and hardware, students connect code to real‑world devices (e.g., micro:bit, LEGO), enabling STEM lessons that span sensors, motion and data. The offline editor runs smoothly on basic hardware, so schools can deliver reliable, inclusive experiences that scale from beginner labs to advanced projects.

Game category

Winner: Castle Craft by Clever Apps Pte Ltd. Castle Craft delivers an immersive gaming experience, leveraging Windows hardware for smooth, dynamic resource merging and time-travel exploration. Players can construct kingdoms, complete heroic quests and enjoy vivid graphics with adaptive controls. The game’s family-friendly design, scalable performance and strong security features make it ideal for all ages and devices, ensuring engaging gameplay and creative kingdom building for every player.

Music category

Winner: Moises Live by Moises Systems, Inc. Moises Live is an exclusive app that sets a new standard for music apps by harnessing real-time AI audio separation technology. Users can isolate vocals, instruments and dialogue at the system level, making it possible to remix, practice or enjoy music with unprecedented control. The app features Karaoke mode for any track in any app (lyrics optional), letting users sing along or create custom mixes instantly. Other highlights include system-wide audio control, latency-free mixing and seamless integration with any Windows app, empowering creators, educators and enthusiasts to achieve studio-level results from their desktop. Moises Live on Copilot+ PCs (for other devices, try Moises) leverages the NPU for breakthrough speeds and lag-free, dynamic volume adjustments for music, movies and calls. All processing is done locally for privacy and instant results, with no cloud uploads or waiting required.

Productivity category

Winner: Notion by Notion Labs Inc. Notion for Windows turns scattered tasks and notes into a single, searchable workspace. Teams automate busywork with templated workflows, like converting meeting notes into action items, auto‑assigning tasks from a backlog and syncing status across projects with database views. Native Windows integration (taskbar pinning, quick‑launch and offline access) helps users act faster without context switching. For example, a product team can use a “Sprint Planning” template that auto‑populates tasks from a roadmap page, sends reminders and updates progress charts to help free time for design and decision making.

Congratulations to all winners!

Each winner will be recognized in the Microsoft Store with a Store Award Winner badge, signifying their contribution to the Windows community. We thank all developers for their dedication and innovation and look forward to another year of groundbreaking apps. For highlights from the Microsoft Store Awards in China, check out our official WeChat blog.

2025 – A year in recap – Windows Accessibility

On this International Day of Persons with Disabilities, we reflect on how our products are evolving based on feedback and insights from the disability community and want to highlight some of the progress from the last year.

Nothing about us, without us

The Windows Accessibility team adheres to the disability community’s guiding principle, “nothing about us without us.” In the spirit of putting people at the center of the design process guided by Microsoft Inclusive Design, working with and getting insights from our advisory boards for the blind, mobility and hard of hearing communities is critical to creating meaningful and effective features that empower every single user.

https://www.youtube.com/watch?v=SxTajh3CCrE

Fluid Dictation on Windows enables you to dictate with ease

Fluid Dictation is a feature designed to make voice-based text authoring seamless and intuitive for everyone, intelligently correcting grammar, punctuation and spelling in real time as you speak. This means your spoken words are instantly transformed into polished, accurate text, reducing the need for tedious manual corrections or complex voice commands. In addition, Fluid Dictation can leverage your custom vocabulary defined within Voice Access, so specialized terms and names are recognized correctly. Powered by on-device AI on Copilot+ PCs without the need to connect to the internet, this capability can be leveraged across both first- and third-party apps on Windows—whether you’re drafting emails, taking notes or collaborating in your favorite applications. With Fluid Dictation, we want you to focus on your ideas, not the mechanics of text entry, by minimizing errors and streamlining corrections when typing with your voice. To try it out, enable Fluid Dictation in the manage options menu under Voice Access settings and interact with any text editing surface of your choice. Here’s a guide to help you get started. This feature is also available to Windows Insiders in Voice Typing. Simply press Windows + H and get started.

Voice Access: Understanding everyone better

A desktop screenshot showing the wait time before acting options in the Voice Access settings menu, with the choices "Instant", "Very Short", "Short", "Medium", "Long", "Extended" and "Very Long".
Voice input is a spectrum; not everyone’s way of communicating using their voice is the same. We’re evolving the way Voice Access understands you to make your experience seamless.
  • Wait time before acting: Everyone speaks at their own pace, and with the wait time before acting setting, people can configure a delay before a command is executed. This provides greater flexibility for individuals with varying speech patterns, enabling more accurate recognition whether speaking slowly or quickly. To set this up, navigate to Voice Access settings > Wait time before acting, and choose the option that best fits your preferences.
  • Custom word dictionary: We have introduced the ability for you to add your own words to the dictionary in Voice Access. Adding your own words, including difficult-to-pronounce ones, helps improve dictation accuracy by increasing the probability that these words are recognized correctly. The feature will be available in all the currently supported Voice Access languages. Manually add a word anywhere in your workflow by using the “Add to Vocabulary” command or directly from Voice Access settings > Add to vocabulary.
  • Flexible and natural commanding: We released more flexible and natural command execution in Voice Access on Copilot+ PCs. Voice Access now understands multiple variations of an existing command. You can say:
    • “Can you open Edge application”
    • “Switch to Microsoft Edge”
    • “Please open the Edge browser”
And Voice Access will recognize your intent and execute the command accordingly.
  • Improved speech pattern recognition: Voice Access now offers enhanced recognition for speech patterns, especially those associated with Parkinson’s, reducing errors and making dictation and navigation smoother.
  • Chinese and Japanese Support: Voice Access now supports Chinese and Japanese, expanding accessibility for more users. You can now navigate, dictate and interact with Windows using voice commands in Chinese and Japanese.

More natural and expressive voices for Narrator and Magnifier

A screenshot of the Windows Settings menu open to Magnifier settings, where the user has selected the voices option and sees an overlay for adding a voice, with options for Natural and Natural HD voices.
Narrator and Magnifier, in collaboration with Azure AI, now provide the option to use new, state-of-the-art, delightful, human-sounding voices. These voice options are designed to sound more human-like to help reduce cognitive load by mimicking the subtleties of human conversation through natural pauses, emphasis and emotional tone. To try out these natural voices, navigate to voice options in Narrator or Magnifier settings, select the HD voice option, download the voice model and you’re all set for a more engaging screen reading and read-aloud experience.

Efficient document creation with Narrator and Word

We’re excited to share that creating and reading documents in Microsoft Word is now smoother and more intuitive for people using Narrator. Recent updates address your valuable feedback and enhance the overall experience, from drafting and reviewing text to navigating lists, tables and comments, so you stay focused on your work without interruptions. You’ll now find it easier to read and write with confidence. Narrator provides more natural, streamlined announcements when reading tabular data, footnotes and formatting changes. Engaging with comments is now easier and requires fewer keystrokes. You can efficiently read comment text and its associated content without losing your place in the document. Proofing has also become more seamless. Spelling and grammar feedback is now announced in a clearer, more concise way, with automatic speech-rate adjustments and improved keyboard shortcuts that make correcting errors faster and less disruptive. Finally, we’ve made meaningful improvements to Copilot usability with Narrator, ensuring a more accessible experience when working alongside AI.

Focus on fundamentals, powered by your feedback

A screenshot of a web browser showing a graph of Microsoft stock price compared to the S&P 500 and NASDAQ Composite, with a "Describe image" overlay describing the chart in detail using the Image Description feature in Narrator on a Copilot+ PC.
Ensuring consistency and reliability across assistive technologies on Windows is the foundation of our experiences. We prioritize addressing critical and functional issues shared through your feedback and our engagements with our advisory boards. Through these efforts, we ensure that you can continue to focus on what’s important to you without distractions.

On Narrator, your feedback has helped shape improvements such as:
  • Screen Curtain: When activated, Screen Curtain completely blacks out your display to enhance privacy and focus, ensuring only the user hears what’s on the screen through Narrator. Press Caps + Ctrl + C when Narrator is running to enable Screen Curtain.
  • Richer image descriptions: On Copilot+ PCs, Narrator leverages AI to provide rich detailed descriptions of images, charts and graphs. Press Narrator key + Ctrl + D to get a contextual description of the image, detailing people, objects, colors, text and numbers from the image.
  • Speech Recap and live transcription: Need to save, share or review what Narrator said last? You can press Narrator key + Alt + X to open Speech Recap and view the last 500 spoken strings. If you’re an Assistive Technology (AT) trainer, a Teacher of Students with Visual Impairments (TSVI) supporting students in class, or a professional who is hard of hearing and wants to use Narrator, you can quickly access spoken content, follow along with live transcription and copy what Narrator last said, all with simple keyboard shortcuts.

Join us on the journey

Thank you to all our Windows customers for trying out these new features, providing your feedback and helping us create better experiences for all, especially those who try out our Windows Insider builds to give feedback earlier in the release process. Consistency and reliability across assistive technologies on Windows is the foundation of the experience. We will continue to prioritize critical and functional issues shared through your feedback and our engagements with our advisory boards. There is still a lot more to come, so join us on this journey by trying Windows 11 and continuing to share your valuable feedback. Just press the Windows logo key + F to launch the Feedback hub to share what you like, dislike or just wish the product could do! If you are a customer with a disability and need technical assistance with Windows or any other Microsoft product, please reach out to the Disability Answer Desk via phone, chat or American Sign Language (via videophone).

Year recap

  • Voice Access and Voice Typing: AI-powered dictation that cleans up your speech automatically across Voice Access and Voice Typing, conversational commands that “just work,” personalized vocab, adjustable command timing, profanity control, and expanded support for Chinese and Japanese. Links: [Fluid Dictation]* • [Natural Language Commands]* • [Custom Dictionary] • [Wait Time] • [Profanity Filter] • [Language Support]
  • Narrator: Richer image descriptions, smoother Word reading, privacy tools like Screen Curtain, Speech Recap for quick review, and a new Braille Viewer and HD voices that sound natural, expressive and easier to follow. Links: [Image Descriptions] • [Word Improvements] • [Speech Recap] • [Screen Curtain] • [Braille Viewer] • [HD Voices]
  • Magnifier: Faster, clearer navigation with one-click zoom controls and HD voices for a more natural and engaging read-aloud experience. Links: [1 Click Zoom Toggle] • [Reset Zoom] • [HD Voices]
Features marked with an asterisk (*) are available only on Copilot+ PCs.

Building JARVIS Properly - Phase 6: Vision Awakens (The Power of Protocol) by Robert Griffiths


Act 1: The Disembodied Voice

In Phase 5, I successfully resisted the urge to give JARVIS a memory. I argued that, before I gave the system infinite knowledge, I needed to give it discipline. I established the blast_radius governance protocols and rebuilt the architecture to be modular and sane.

It was the right call, but it left JARVIS frustratingly limited. He could think, he could reason across multiple agents, he could even synthesise consensus from conversation history. But he couldn’t do anything. He existed in a kind of liminal space, able to discuss the world but never quite touch it.

In the MCU, this is precisely where JARVIS began: a disembodied voice. Brilliant, witty, indispensable, but fundamentally constrained by his lack of physical form. He could analyse Stark’s suit diagnostics but couldn’t repair them. He could warn of incoming threats but couldn’t intervene. It wasn’t until he was uploaded into Vision’s synthetic body that he could phase through walls, interface with computers, and interact with the physical world.

Phase 6 is about giving JARVIS hands.

My ambitions were modest. I didn’t need JARVIS piloting autonomous drones or restructuring my calendar. I just wanted him to read my project backlog in Obsidian and, ideally, tell me off for not finishing my tickets. To achieve even this simple goal required solving a fundamental architectural problem: how do you give an AI access to the real world without hard-coding every possible interaction?

Act 2: The Plumbing Problem

The traditional approach to giving LLMs “tools” is straightforward but scales terribly. You write a Python function, let’s say read_file(), and you describe it in a JSON schema that you send to OpenAI or Anthropic. The model replies with a request to call that function. You execute it, paste the result back into the conversation, and hope for the best.
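
To make that plumbing concrete, here is a minimal, provider-agnostic sketch of the traditional loop. The call_llm() function is a hypothetical stand-in for whichever chat API you use, and the schema and reply shapes are illustrative assumptions rather than any vendor's exact interface.

```python
# A minimal sketch of the hand-rolled tool loop described above. call_llm() is a
# hypothetical stand-in for an OpenAI/Anthropic chat call; the schema and reply
# shapes are illustrative assumptions, not a specific vendor's API.
import json
from pathlib import Path

READ_FILE_SCHEMA = {
    "name": "read_file",
    "description": "Read a UTF-8 text file and return its contents.",
    "parameters": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}

def read_file(path: str) -> str:
    return Path(path).read_text(encoding="utf-8")

def answer(prompt: str, call_llm) -> str:
    reply = call_llm(prompt, tools=[READ_FILE_SCHEMA])   # model may request the tool
    if reply.get("tool") == "read_file":
        result = read_file(**reply["arguments"])         # execute it ourselves
        return call_llm(prompt, tool_result=result)      # paste the result back and hope
    return reply["text"]
```

Every new tool means another schema, another dispatch branch and another round-trip to maintain, which is exactly the scaling problem described next.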

This works for one or two tools. It becomes unwieldy at five. It becomes a maintenance nightmare at twenty.

If I wanted JARVIS to interact with my file system, my calendar, my GitHub repository, and my email, I would have to write custom adapters for every single one. I would become a plumber, spending my days connecting proprietary pipes between incompatible systems. Worse, every new capability would require invasive changes to JARVIS’s core orchestration logic.

This violated the modular philosophy established in Phase 5. The whole point of that architectural rebirth was to avoid monolithic coupling.

Enter the Model Context Protocol (MCP), released by Anthropic in late 2024.

MCP is, in essence, “USB for AI”. It defines a standard way for an AI Client to discover and use Tools provided by a Server. Just as USB freed us from needing a different port for every peripheral, MCP frees AI systems from needing a different adapter for every data source.

The elegance was immediately obvious. I didn’t need to hard-code a file reader into JARVIS’s brain. I just needed to build a “File System Server” that spoke MCP, and teach JARVIS how to connect to it. When I wanted to add GitHub integration later, I wouldn’t need to touch JARVIS at all, I would simply spin up a new MCP server.

This was modular AI architecture in its purest form.

Act 3: Confronting the Russian Doll

Before I could let JARVIS touch my files, I had to confront a horror of my own making.

While implementing the MCP server for my Obsidian vault, I discovered that my project directory structure had evolved into a recursive nightmare. Through a series of hasty commits and “temporary” fixes (all well-known synonyms for technical debt accrual!) during the chaotic Mini-Me era, the planning vault had been nested four folders deep.

The path looked something like: src/JARVIS/jarvis/planning/JARVIS planning/JARVIS Board.md.

It was a Russian Doll of folders, each one wrapping the next in an increasingly absurd hierarchy. It was the kind of structure that emerges when you’re moving quickly, trusting that you’ll “fix it properly later”. Later had arrived. The technical debt needed to be repaid.

A younger me, the one who built Mini-Me, might possibly have hacked the path into the config file and moved on. Ship the feature, worry about elegance never.

But the product owner in me, the one who had painfully learned the lessons of Phase 4, called a halt. You cannot build a sophisticated AI agent on top of a shaky file structure. If the foundation is wonky, everything above it inherits that wobble.

I spent a morning performing “emergency refactoring” on the repository. I shut down the IDEs, killed the file watchers, backed everything up twice, and painstakingly moved the entire planning vault to the root level where it belonged. It was unglamorous housekeeping, the kind of work that generates no demos and impresses no stakeholders.

But it meant that when JARVIS finally opened his eyes, he wouldn’t be cross-eyed trying to find his own backlog with both of his new hands!

Sometimes the most important work is the work nobody sees.

Act 4: Architecture of Restraint

With the directories clean, I built the integration in three parts, each deliberately constrained in its responsibilities:

The Server (servers/obsidian-mcp) is a standalone Python script using the FastMCP SDK. It knows how to manipulate Markdown files in an Obsidian vault. It exposes tools like read_note, list_notes, append_note, and create_note. Critically, it knows nothing about AI, nothing about JARVIS, and nothing about the broader system architecture. It is simply a competent file handler that speaks the MCP protocol.
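
As a rough illustration of that shape, here is a minimal sketch of such a server built with the FastMCP class from the MCP Python SDK. The vault location, tool bodies and docstrings are my own assumptions rather than the actual servers/obsidian-mcp code.

```python
# A hedged sketch of an Obsidian-style MCP server using the FastMCP class from
# the MCP Python SDK. Vault path and tool bodies are illustrative assumptions.
from pathlib import Path
from mcp.server.fastmcp import FastMCP

VAULT = Path("planning")                 # assumed vault location
mcp = FastMCP("obsidian-vault")          # server name is arbitrary

@mcp.tool()
def list_notes() -> list[str]:
    """List the Markdown notes in the vault."""
    return [p.name for p in VAULT.glob("*.md")]

@mcp.tool()
def read_note(path: str) -> str:
    """Return the contents of a note in the vault."""
    return (VAULT / path).read_text(encoding="utf-8")

@mcp.tool()
def append_note(path: str, content: str) -> str:
    """Append a block of text to an existing note."""
    with open(VAULT / path, "a", encoding="utf-8") as f:
        f.write("\n" + content)
    return "appended"

if __name__ == "__main__":
    mcp.run()                            # stdio transport by default
```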

The Client (jarvis/services/mcp_service.py) is a bridge inside JARVIS that connects to the server via standard input/output. It handles the handshake, discovers what tools the server offers, and translates between JARVIS’s internal orchestration logic and the MCP protocol. It doesn’t care what the tools do; it just knows how to invoke them.
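
For the client side, the MCP Python SDK's stdio helpers cover the handshake and discovery steps. The sketch below is a simplified stand-in for what a service like mcp_service.py does, with the server path and note name as assumptions.

```python
# A hedged sketch of an MCP client over stdio, simplified from what a service
# like mcp_service.py might do. Server path and note name are assumptions.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

params = StdioServerParameters(command="python", args=["servers/obsidian-mcp/server.py"])

async def main() -> None:
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()                      # the MCP handshake
            tools = await session.list_tools()              # discover what the server offers
            print([t.name for t in tools.tools])
            result = await session.call_tool("read_note", {"path": "JARVIS Board.md"})
            print(result.content)

asyncio.run(main())
```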

The ReAct Loop was the critical upgrade to the Orchestrator, and it’s where the real intelligence lives.

Previously, the Orchestrator was a simple pipeline: receive prompt, send to LLM, return response. This worked for conversational AI but couldn’t support tool use. I upgraded it to implement a Reasoning + Acting (ReAct) pattern.

Now, when I send a prompt, the Orchestrator:

  1. Connects to the MCP server and asks “What can you do?”
  2. Injects the tool definitions into the system prompt
  3. Sends the user’s prompt to the LLM
  4. Watches the response carefully

If the LLM replies with text, we show it to the user. But if the LLM replies with a JSON tool call, the Orchestrator intercepts it, executes the tool via the MCP client, and feeds the result back to the LLM as a new “observation”. The LLM can then reason about that observation and either use another tool or provide a final answer.
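
Condensed to its essentials, the loop looks something like the sketch below. The llm and mcp_client objects are stand-ins for the article's components, and the JSON tool-call convention is an assumption about the prompt contract rather than the project's actual code.

```python
# A condensed sketch of the ReAct-style loop described above. `llm` and
# `mcp_client` are stand-ins, and the JSON tool-call shape is an assumption.
import json

async def react_loop(llm, mcp_client, user_prompt: str, max_steps: int = 8) -> str:
    tools = await mcp_client.list_tools()                  # 1. "What can you do?"
    messages = [
        {"role": "system", "content": f"You may call these tools: {tools}"},  # 2. inject definitions
        {"role": "user", "content": user_prompt},           # 3. the user's prompt
    ]
    for _ in range(max_steps):                              # bounded, so it cannot loop forever
        reply = await llm.complete(messages)                # 4. watch the response
        try:
            call = json.loads(reply)                        # tool call? e.g. {"tool": ..., "args": {...}}
        except json.JSONDecodeError:
            return reply                                    # plain text: final answer for the user
        observation = await mcp_client.call_tool(call["tool"], call["args"])
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "user", "content": f"Observation: {observation}"})
    return "Stopped: too many tool calls."
```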

This is where JARVIS moves from conversation to action.

Act 5: Governance as Conscience

This is where the strategic work from Phase 5 revealed its value.

I didn’t just want JARVIS to read files. I wanted the ability to write them, to create new notes, to modify existing ones. However, giving an autonomous agent write access to your hard drive is how you get Skynet or, at the very least, a catastrophically deleted codebase.

The blast_radius governance pattern, embedded in every conversation thread since Phase 5, became the safety mechanism.

When I ran the system with --blast-radius low (Safety Mode), I deliberately asked JARVIS to do something dangerous: append a line to my Lessons Learned.md file.

The logs told the story better than I could:

INFO: 🛠️ Agent requests tool: append_note

WARN: 🛡️ Governance Block: 'append_note' blocked by LOW blast radius.

The Orchestrator saw the tool request, checked the governance rules defined in GOVERNANCE.md, and blocked the call before it reached the MCP server. JARVIS then politely apologised and explained it wasn’t permitted to write files in the current mode.
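
The check itself can be tiny. The sketch below is an assumption inspired by the log lines above, not the project's actual GOVERNANCE.md rules; the point is that it runs before any tool call leaves the Orchestrator.

```python
# A minimal sketch of a blast_radius gate in front of tool execution. The levels
# and WRITE_TOOLS set are illustrative assumptions.
WRITE_TOOLS = {"append_note", "create_note"}

def tool_allowed(tool_name: str, blast_radius: str) -> bool:
    if blast_radius == "low":                 # Safety Mode: read-only
        return tool_name not in WRITE_TOOLS
    return True                               # medium and above: writes permitted

# Inside the orchestrator, before invoking the MCP client:
# if not tool_allowed(call["tool"], blast_radius):
#     log.warning("Governance Block: '%s' blocked by %s blast radius.",
#                 call["tool"], blast_radius.upper())
```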

Safety was not an afterthought. It was a first-class constraint, baked into the architecture from the beginning.

This is what mature AI engineering looks like: governance as a non-negotiable layer, not a feature you add when the lawyers get nervous.

Act 6: First Contact

The final test was the full integration. I set the governance to medium (allowing writes) and asked JARVIS to read its own project backlog and create a summary note.

The terminal came alive:

✅ MCP Session Initialised

🔎 Discovered Tools: ['list_notes', 'read_note', 'create_note', 'append_note']

🛠️ Executing Tool: list_notes

🛠️ Executing Tool: read_note (path: 'JARVIS Board.md')

🛠️ Executing Tool: create_note (path: 'Summary.md')

I opened Obsidian. There, in the Planning folder, was a new file.

JARVIS had successfully reached out of the Python execution environment, navigated the file system, read a Markdown document, understood its structure, and created a persistent artefact based on that understanding.

Vision had awakened.

It was a small moment, but it felt seismic. This wasn’t a chatbot parroting responses. This was a system that could perceive, reason, and act in the real world. Constrained, yes. Governed, absolutely. But capable.

The disembodied voice finally had hands.

A Note on Reality:

Readers should not mistake this smooth narrative for a smooth implementation. This phase involved:

  1. Goal Drift: My first attempt at the “Read & Write” test failed because JARVIS read the file, got excited, and just chatted to me about the contents instead of writing the summary file. I had to update the system prompt to be significantly “bossier” to stop him getting distracted by his own voice.
  2. Regex Betrayal: My initial JSON parser was too lazy and cut off the end of long responses, causing the tools to fail silently.
  3. Running Out of Breath: Even after fixing the parser, the write operation failed because the LLM simply hit the token limit. Generating a large markdown summary inside a JSON wrapper consumes a huge number of tokens. I had to manually increase the max_tokens limit on the backend adapter to stop the model from cutting off mid-sentence.
  4. The Infinite Loop: During the final test, JARVIS tried to create a file that already existed. Instead of failing, it reasoned that it should try to append to the file instead. It then got stuck in a loop of trying to be helpful until the safety governor killed the process.
  5. The Gemini Gap: While this works beautifully with OpenAI and Anthropic, I discovered that Google’s Gemini models struggle with this specific implementation. My current approach relies on “Prompt Engineering” (telling the model to reply with a specific JSON schema). Gemini prefers its own native Function Calling API and often ignores raw JSON instructions, meaning I will need to write a specific adapter for it in a future phase.

AI engineering, it turns out, is 10% AI and 90% fixing regex, race conditions, and token limits.

Closing: The Ecosystem Expands

I now have a system that can think (LLM orchestration), judge (governance), and act (MCP). Each layer is modular, each interface clean, and each boundary well-defined.

The beauty of the MCP approach is that adding new capabilities no longer requires surgery on JARVIS’s brain. Want GitHub integration? Build a GitHub MCP server. Want calendar access? Build a calendar server. JARVIS just connects to them.

The architectural vision from Phase 5 has proven itself. Delayed gratification as competitive advantage. By resisting the urge to bolt on features quickly, I’ve built something that can grow sustainably.

Next up: Friday’s Library. I’m going to connect JARVIS to the wider filesystem and GitHub, turning him from a note-taker into a true coding companion. The retrieval layer that I delayed in Phase 5 will finally arrive, but it will arrive on a foundation solid enough to support it.

The flying suit is coming along nicely.


Technical Footnote: Implementation Details

For those following the code, three implementation challenges deserve mention:

Async/Await: The move to MCP required refactoring the CLI from synchronous code to asyncio. The Orchestrator now runs an async event loop to handle tool execution without blocking. This added complexity but made the ReAct loop far more elegant.

Environment Management: I learned the hard way that Python’s dotenv doesn’t automatically propagate to subprocesses. Explicitly injecting environment variables into the MCP server’s execution context was critical for it to locate the Obsidian vault path.
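
In practice the fix looked something like the sketch below: load the .env in the parent process, then hand an explicit environment to the server's StdioServerParameters. The variable handling here is an assumption for illustration.

```python
# A small sketch of explicitly propagating environment variables to the MCP
# server subprocess; exact variable names are illustrative assumptions.
import os
from dotenv import load_dotenv
from mcp import StdioServerParameters

load_dotenv()                                  # populates os.environ in this process only

params = StdioServerParameters(
    command="python",
    args=["servers/obsidian-mcp/server.py"],
    env={**os.environ},                        # pass the loaded variables down explicitly
)
```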

Defence in Depth: The MCP server implements strict path validation to ensure JARVIS cannot read files outside the designated vault, preventing path traversal attacks. Governance exists at two levels: the Orchestrator checks intent (should this tool be allowed?), and the Server enforces execution boundaries (is this path safe?). Both layers are necessary.
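
The server-side boundary can be as simple as the sketch below, which resolves every requested path and rejects anything that escapes the vault root; the helper name and root location are illustrative.

```python
# A sketch of server-side path validation; names are illustrative assumptions.
from pathlib import Path

VAULT_ROOT = Path("planning").resolve()

def safe_vault_path(relative: str) -> Path:
    candidate = (VAULT_ROOT / relative).resolve()
    if not candidate.is_relative_to(VAULT_ROOT):     # Python 3.9+; blocks ../ traversal
        raise ValueError(f"Path escapes the vault: {relative}")
    return candidate
```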


Announcing Foundry MCP Server (preview), speeding up AI development with Microsoft Foundry

1 Share

MCP (Model Context Protocol) is a standard protocol that enables AI agents to securely connect with apps, data, and systems, supporting easy interoperability and seamless platform expansion. At Ignite, Microsoft Foundry introduced Foundry Tools, which serves as a central hub for discovering, connecting, and managing both public and private MCP tools securely, simplifying integration across more than 1,400 business systems and empowering agents. Microsoft Foundry also upleveled Foundry Agent Service to empower developers to securely build, manage, and connect AI agents with Foundry Tools, enabling seamless integration, automation, and real-time workflows across enterprise platforms.

Aligning with Microsoft Foundry’s MCP and agent vision, we are pleased to share the preview of Foundry MCP Server, a secure, fully cloud-hosted service providing curated MCP tools that allow agents to perform read and write operations on a wide range of Foundry features without directly accessing backend APIs. You can quickly connect to Foundry MCP Server using Visual Studio Code or Visual Studio with GitHub Copilot, or create Foundry agents on Microsoft Foundry and link them with the Foundry MCP Server.

Foundry MCP Server

At Build 2025, we published an experimental MCP server for Microsoft Foundry (formerly known as Azure AI Foundry), showing how agents can use MCP for tasks like model discovery, knowledge-base queries, and evaluation runs on Microsoft Foundry through one interface. At Ignite 2025, our new MCP server, now hosted in the cloud, simplifies setup, speeds integration, and boosts reliability by removing the need to manage local server uptime, making adoption easier for developers and agents.

Why the New Foundry MCP Server Stands Out

  • Open access via public endpoints (https://mcp.ai.azure.com)
  • Supports OAuth 2.0 authentication with Entra ID using OBO (on-behalf-of) tokens, providing user-specific permissions. Tenant admins control token retrieval via Azure Policy.
  • Built with cloud-scale reliability and security in mind
  • Enables conversational workflows, including:
    • Exploring and comparing models
    • Recommending model upgrades based on capabilities, benchmarks, and deprecation schedules
    • Checking quotas and deploying models
    • Creating and evaluating agents using user data
  • Compatible with:
    • Visual Studio Code and Visual Studio with GitHub Copilot
    • Ask AI in Microsoft Foundry
    • Foundry tools and agent workflow integrations on Microsoft Foundry

Tool Capabilities by Scenario

To help you identify the most relevant features for your needs, the following list presents capabilities organized by scenario, along with each tool name and what it does.

Scenario 1: Manage Agents (chatbots, copilots, evaluators)
  • Create / update / clone agents (agent_update): Define model, instructions, tools, temperature, safety config, etc.
  • List / inspect agents (agent_get): Get a single agent or list all agents in a project.
  • Delete agents (agent_delete): Remove an existing agent by name.

Scenario 2: Run Evaluations on Models or Agents
  • Register datasets for evals (evaluation_dataset_create): Point at a dataset URI (file/folder) and version it.
  • List / inspect datasets (evaluation_dataset_get): Get one dataset by name+version or list all.
  • Start evaluation runs (evaluation_create): Run built‑in evaluators (e.g., relevance, safety, QA, metrics) on a dataset.
  • List / inspect evaluation groups and runs (evaluation_get): With isRequestForRuns=false, list groups; with true, list/get runs.
  • Create comparison insights across runs (evaluation_comparison_create): Compare baseline vs multiple treatment runs, compute deltas and significance.
  • Fetch comparison insights (evaluation_comparison_get): Retrieve a specific comparison insight or list all.

Scenario 3: Explore Models and Benchmarks (catalog / selection / switching)
  • List catalog models (model_catalog_list): Filter by publisher, license, model name, or free-playground availability.
  • Get benchmark overview (model_benchmark_get): Large dump of benchmark metrics for many models (quality, cost, safety, perf).
  • Get benchmark subset (model_benchmark_subset_get): Same metrics, but limited to specific model name+version pairs.
  • Similar models by behavior/benchmarks (model_similar_models_get): Recommend models similar to a given deployment or name+version.
  • Switch recommendations to optimize quality/cost/latency/safety (model_switch_recommendations_get): Suggest alternative models with better quality, cost, throughput, safety, or latency.
  • Get detailed model info (model_details_get): Full model metadata plus example code snippets from Foundry.

Scenario 4: Deploy and Manage Model Endpoints
  • Create / update deployments (model_deploy): Deploy a catalog model as a serving endpoint (name, sku, capacity, scale).
  • List / inspect deployments (model_deployment_get): Get one deployment or list all in the Foundry account.
  • Delete deployments (model_deployment_delete): Remove a specific deployment.
  • Deprecation info for deployments (model_deprecation_info_get): See if/when a deployed model is deprecated and migration guidance.

Scenario 5: Monitor Usage, Latency, and Quota
  • Deployment-level monitoring metrics (model_monitoring_metrics_get): Retrieve Requests / Latency / Quota / Usage / SLI metrics for one deployment.
  • Subscription-level quota and usage (model_quota_list): See available vs used quota for a subscription in a region.

End-to-End Scenarios (how tools combine)

Let’s see how these tools work together in practical workflows.

The real power of the MCP server is realized when multiple tools are dynamically combined to help the user achieve a goal. Now you can seamlessly explore, clarify, investigate, and take action within a single conversation to execute more interconnected scenarios.

For example, when you are building a new agent from scratch:

  • Pick model using benchmarks/recommendations:
    • model_catalog_list → model_switch_recommendations_get / model_benchmark_subset_get
  • Deploy it (if needed): model_deploy
  • Create agent: agent_update
  • Evaluate quality and safety:
    • evaluation_dataset_create → evaluation_create → evaluation_get
  • Compare against older agent:
    • evaluation_comparison_create → evaluation_comparison_get

And if you are optimizing an existing production deployment:

  • Inspect performance and usage: model_monitoring_metrics_get
  • Check quota headroom: model_quota_list
  • Get better/cheaper alternatives: model_similar_models_get and/or model_switch_recommendations_get
  • Benchmark candidates more deeply: model_benchmark_subset_get
  • Swap deployment: model_deploy (update), then deprecate old one: model_deployment_delete (if desired).

It is important to note that you are not required to identify the tools or determine the parameters necessary for their use. Your agent, equipped with a language model capable of tool calls, will handle these tasks on your behalf. You may request summaries or additional information in your preferred format.

Getting Started

Visual Studio Code

If you want to use Visual Studio Code with GitHub Copilot and Foundry MCP Server, open VS Code, enable GitHub Copilot with agent mode, open Chat and choose your model (such as GPT-5.1). Refer to Get started with chat in VS Code.

Now you can add Foundry MCP Server to use with VS Code. Follow Use MCP servers in VS Code or click the quick link for VS Code to add it. You can use https://mcp.ai.azure.com as the URL and http as the type. VS Code will generate an entry that looks like this:

Screenshot: mcp.json example showing the Foundry MCP Server entry.

 

Depending on your VS Code configuration, the MCP server may start automatically when you start a chat that uses it, but you can also start the server manually by clicking the Start icon in your mcp.json file, or by pressing Ctrl+Shift+P, choosing MCP: List Servers, selecting the server and starting it.

Now start chatting with GitHub Copilot. The MCP server will ask you to authenticate to Azure if you haven’t done so already.

Screenshot: VS Code asks for authentication on first connection to the MCP server.

 

Screenshot: VS Code chat using Foundry MCP Server.

Visual Studio

We also added support for Visual Studio 2026 Insiders. Follow Use MCP Servers – Visual Studio (Windows) | Microsoft Learn or click the quick install link for Visual Studio to add the MCP server entry.

Screenshot: Foundry MCP Server on Visual Studio 2026 Insiders.

Microsoft Foundry

When you want to build an agent on Microsoft Foundry and empower it with Foundry MCP Server, you can visit the Tools menu in Microsoft Foundry, find the server, and create a tool connection to it with one click.

Screenshot: Creating a tool connection for Foundry MCP Server in Microsoft Foundry.

Now you can simply add this tool to your agent and start chatting with your agent.

Screenshot: Agent test chat using Foundry MCP Server in Microsoft Foundry.

Check out Getting started guideline and sample prompts and give it a try!

Security

Foundry MCP Server requires Entra ID authentication and only accepts Entra tokens scoped to https://mcp.ai.azure.com. All actions run under the signed-in user’s Azure RBAC permissions, ensuring that operations cannot exceed user rights, and every activity is logged for auditing. Tenant admins can enforce access control through Azure Policy Conditional Access for token retrieval, providing oversight of Foundry MCP Server usage.

See Explore Foundry MCP Server best practices and security guidance – Microsoft Foundry | Microsoft Learn for further details.

 

We look forward to seeing your projects with Foundry MCP Server!

 



Amazon Bedrock adds reinforcement fine-tuning simplifying how developers build smarter, more accurate AI models


Organizations face a challenging trade-off when adapting AI models to their specific business needs: settle for generic models that produce average results, or tackle the complexity and expense of advanced model customization. Traditional approaches force a choice between poor performance with smaller models or the high costs of deploying larger model variants and managing complex infrastructure. Reinforcement fine-tuning is an advanced technique that trains models using feedback instead of massive labeled datasets, but implementing it typically requires specialized ML expertise, complicated infrastructure, and significant investment—with no guarantee of achieving the accuracy needed for specific use cases.

Today, we’re announcing reinforcement fine-tuning in Amazon Bedrock, a new model customization capability that creates smarter, more cost-effective models that learn from feedback and deliver higher-quality outputs for specific business needs. Reinforcement fine-tuning uses a feedback-driven approach where models improve iteratively based on reward signals, delivering 66% accuracy gains on average over base models.

Amazon Bedrock automates the reinforcement fine-tuning workflow, making this advanced model customization technique accessible to everyday developers without requiring deep machine learning (ML) expertise or large labeled datasets.

How reinforcement fine-tuning works
Reinforcement fine-tuning is built on top of reinforcement learning principles to address a common challenge: getting models to consistently produce outputs that align with business requirements and user preferences.

While traditional fine-tuning requires large, labeled datasets and expensive human annotation, reinforcement fine-tuning takes a different approach. Instead of learning from fixed examples, it uses reward functions to evaluate and judge which responses are considered good for particular business use cases. This teaches models to understand what makes a quality response without requiring massive amounts of pre-labeled training data, making advanced model customization in Amazon Bedrock more accessible and cost-effective.

Here are the benefits of using reinforcement fine-tuning in Amazon Bedrock:

  • Ease of use – Amazon Bedrock automates much of the complexity, making reinforcement fine-tuning more accessible to developers building AI applications. Models can be trained using existing API logs in Amazon Bedrock or by uploading datasets as training data, eliminating the need for labeled datasets or infrastructure setup.
  • Better model performance – Reinforcement fine-tuning improves model accuracy by 66% on average over base models, enabling optimization for price and performance by training smaller, faster, and more efficient model variants. This works with the Amazon Nova 2 Lite model, improving quality and price performance for specific business needs, with support for additional models coming soon.
  • Security – Data remains within the secure AWS environment throughout the entire customization process, mitigating security and compliance concerns.

The capability supports two complementary approaches to provide flexibility for optimizing models:

  • Reinforcement Learning with Verifiable Rewards (RLVR) uses rule-based graders for objective tasks like code generation or math reasoning.
  • Reinforcement Learning from AI Feedback (RLAIF) employs AI-based judges for subjective tasks like instruction following or content moderation.

Getting started with reinforcement fine-tuning in Amazon Bedrock
Let’s walk through creating a reinforcement fine-tuning job.

First, I access the Amazon Bedrock console. Then, I navigate to the Custom models page. I choose Create and then choose Reinforcement fine-tuning job.

I start by entering the name of this customization job and then select my base model. At launch, reinforcement fine-tuning supports Amazon Nova 2 Lite, with support for additional models coming soon.

Next, I need to provide training data. I can use my stored invocation logs directly, eliminating the need to upload separate datasets. I can also upload new JSONL files or select existing datasets from Amazon Simple Storage Service (Amazon S3). Reinforcement fine-tuning automatically validates my training dataset and supports the OpenAI Chat Completions data format. If I provide invocation logs in the Amazon Bedrock invoke or converse format, Amazon Bedrock automatically converts them to the Chat Completions format.
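
For reference, the Chat Completions training format is simply one JSON object per line containing a messages array. Here is a small hedged sketch of producing such a file; the file name and example content are assumptions, not an official sample.

```python
# A hedged sketch of writing training data in the OpenAI Chat Completions JSONL
# shape mentioned above. File name and example content are assumptions.
import json

records = [
    {"messages": [
        {"role": "system", "content": "You are a concise support assistant."},
        {"role": "user", "content": "How do I reset my device?"},
        {"role": "assistant", "content": "Hold the power button for 10 seconds, then restart."},
    ]},
]

with open("training-data.jsonl", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")    # one JSON object per line
```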

The reward function setup is where I define what constitutes a good response. I have two options here. For objective tasks, I can select Custom code and write custom Python code that gets executed through AWS Lambda functions. For more subjective evaluations, I can select Model as judge to use foundation models (FMs) as judges by providing evaluation instructions.

Here, I select Custom code, and I create a new Lambda function or use an existing one as a reward function. I can start with one of the provided templates and customize it for my specific needs.
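
As a rough illustration of the custom code path, here is a hedged sketch of a Lambda-style, rule-based reward function that scores a response against a reference answer. The event and return shapes below are assumptions I've made for the example, not Bedrock's documented contract, so check the provided templates for the real schema.

```python
# A hedged sketch of a rule-based (RLVR-style) reward function that could run
# in AWS Lambda. The event/return shapes are illustrative assumptions; the
# provided reward function templates define the actual contract.
def lambda_handler(event, context):
    # Assumed input: the model's completion plus a reference answer for the prompt.
    completion = event.get("completion", "")
    reference = event.get("reference_answer", "")

    # Toy grading rule: exact match earns full reward, token overlap earns partial credit.
    if completion.strip().lower() == reference.strip().lower():
        reward = 1.0
    else:
        ref_tokens = set(reference.lower().split())
        got_tokens = set(completion.lower().split())
        reward = len(ref_tokens & got_tokens) / max(len(ref_tokens), 1)

    return {"reward": float(reward)}   # assumed output shape: a scalar reward signal
```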

I can optionally modify default hyperparameters like learning rate, batch size, and epochs.

For enhanced security, I can configure virtual private cloud (VPC) settings and AWS Key Management Service (AWS KMS) encryption to meet my organization’s compliance requirements. Then, I choose Create to start the model customization job.

During the training process, I can monitor real-time metrics to understand how the model is learning. The training metrics dashboard shows key performance indicators including reward scores, loss curves, and accuracy improvements over time. These metrics help me understand whether the model is converging properly and if the reward function is effectively guiding the learning process.

When the reinforcement fine-tuning job is completed, I can see the final job status on the Model details page.

Once the job is completed, I can deploy the model with a single click. I select Set up inference, then choose Deploy for on-demand.

Here, I provide a few details for my model.

After deployment, I can quickly evaluate the model’s performance using the Amazon Bedrock playground. This helps me to test the fine-tuned model with sample prompts and compare its responses against the base model to validate the improvements. I select Test in playground.

The playground provides an intuitive interface for rapid testing and iteration, helping me confirm that the model meets my quality requirements before integrating it into production applications.

Interactive demo
Learn more by navigating an interactive demo of Amazon Bedrock reinforcement fine-tuning in action.

Additional things to know
Here are key points to note:

  • Templates — There are seven ready-to-use reward function templates covering common use cases for both objective and subjective tasks.
  • Pricing — To learn more about pricing, refer to the Amazon Bedrock pricing page.
  • Security — Training data and custom models remain private and aren’t used to improve FMs for public use. It supports VPC and AWS KMS encryption for enhanced security.

Get started with reinforcement fine-tuning by visiting the reinforcement fine-tuning documentation and by accessing the Amazon Bedrock console.

Happy building!
Donnie


How confessions can keep language models honest

OpenAI researchers are testing “confessions,” a method that trains models to admit when they make mistakes or act undesirably, helping improve AI honesty, transparency, and trust in model outputs.