
Windows 11 is adding another Copilot button nobody asked for


Have enough Copilot buttons in your life? No you don’t — have another one! This one pops up in the latest Windows 11 Insider Preview when mousing over an open app in your taskbar; it lets you share the contents with Copilot Vision.

Want to know who is celebrating in that dogpile on the mound, or dig a little deeper on that sculpture you took a photo of? Just click that “Share with Copilot” button that pops up in the window preview. Copilot Vision scans what’s on your screen, analyzes it, and lets you discuss the contents of the window with Microsoft’s AI chatbot to get more context, including offering tutorials. 

Of course, it doesn’t exactly seem like Windows users are clamoring for more Copilot in their lives. There are already Copilot buttons popping up in Microsoft Paint, Notepad, and the taskbar, on your keyboard, and right on the front of some PCs. There’s even another, more useful Copilot feature in the same Windows 11 preview that translates on-screen text. Microsoft does say it’s just “trying out this taskbar capability,” so don’t be shocked if it gets axed in an upcoming build before it actually ships to regular users.


Record-Low 35% in US Satisfied With K-12 Education Quality

Gallup: A record-low 35% of Americans are satisfied with the quality of education that K-12 students receive in the U.S. today, marking an eight-percentage-point decline since last year. This is one point below the previous historical low recorded in 2000 and 2023 for this Gallup question that dates back to 1999. Several other ratings of the U.S. K-12 education system provide a similarly bleak assessment. Only about one-quarter of Americans think K-12 schools are headed in the right direction, while just one in five rate them as "excellent" or "good" at preparing students for today's jobs and one in three say the same for college. Yet, parents of current K-12 students are nearly twice as satisfied with their own child's education as they are with education in the U.S. K-12 parents are also slightly more likely than U.S. adults in general to rate different aspects of education positively, including the direction of education in the U.S. and schools' preparation of students for the workforce and for college. Still, none of these ratings is near the majority level.

Read more of this story at Slashdot.


Building Healthier Social Media: Updated Guidelines and New Features


Public discourse on social media has grown toxic and divisive. Traditional social platforms drive polarization and outrage because they feed users content through a single, centralized algorithm that is optimized for ad revenue and engagement. Unlike those platforms, Bluesky is building a social web that empowers people instead of exploiting them.

Bluesky started as a project within Twitter in 2019 to reimagine social from the ground up — to be an example of “bluesky” thinking that could reinvent how social worked. With the goal of building a healthier, less toxic social media ecosystem, we spun out as a public benefit corporation in 2022 to develop technologies for open and decentralized conversation. We built the Authenticated Transfer Protocol (atproto) so Twitter could interoperate with other social platforms, but when Twitter decided not to use it, we built an app to showcase the protocol.

When we built the app, we first gave users control over their feed: In the Bluesky app, users have algorithmic choice — you can choose from a marketplace of over 100k algorithms, built by other users, giving you full control over what you see. There is also stackable moderation, allowing people to spin up independent moderation services, and giving users a choice in what moderation middleware they subscribe to. And of course there is the open protocol, which lets you migrate between apps with your data and identity, creating a social ecosystem with full data portability. Just today, we announced that we are taking the next step in decentralization.

Although we focused on building these solutions to empower users, we still inherited many of the problems of traditional social platforms. We’ve seen how harassment, vitriol, and bad-faith behavior can degrade overall conversation quality. But innovating on how social works is in our DNA, and we’ve been continuously working toward healthier conversations. Quote-posts used to let harassers take a post out of context, so we gave users the ability to disable them. Reply sections often filled up with unwanted replies, so we gave users the ability to control their interaction settings.

Our upcoming product changes are designed to strengthen the quality of discourse on the network, give communities more customized spaces for conversation, and improve the average user’s experience. One of the features we are workshopping is a “zen mode” that sets new defaults for how you experience the network and interact with people. Another is including prompts for how to engage in more constructive conversations. We see this as part of our goal to make social more authentic, informative, and human again.

We’ve also been working on a new version of our Community Guidelines for over six months, and in the process of updating them, we’ve asked for community feedback. We looked at all of the feedback you gave and incorporated some of your suggestions into the new version. Most significantly, we added details so everyone understands what we do and do not allow. We also better organized the rules by putting them into categories. We chose an approach that respects the human rights and fundamental freedoms outlined in the UN Guiding Principles on Business and Human Rights. The new Guidelines take effect on October 15.

In the meantime, we’re going to adjust how we enforce our moderation policies to better cultivate a space for healthy conversations. Posts that degrade the quality of conversations and violate our guidelines are a small percentage of the network, but they draw a lot of attention and negatively impact the community. Going forward, we will more quickly escalate enforcement actions towards account restrictions. We will also be making product changes that clarify when content is likely to violate our community guidelines.

We were built to reimagine social from the ground up by opening up the freedom to experiment and letting users choose. Social media has been dominated by a few platforms that have closed off their social graph and squashed competition, leaving users few alternatives. Bluesky is the first platform in a decade to challenge these incumbents. Every day, more people set up small businesses and create new apps and feeds on the protocol. We are continuing to invest in the broader protocol ecosystem, laying a foundation for the next generation of social media developers to build upon.


Today’s Community Guidelines Updates

In January, we started down the path of updating our rules. Part of that process was to ask for your thoughts on our updated Community Guidelines. More than 14,000 of you shared feedback, suggestions, and examples of how these rules might affect your communities. We especially heard from community members who shared concerns about how the guidelines could impact creative expression and traditionally marginalized voices.

After considering this feedback, and in a return to our experimental roots, we are going to bring a greater focus to encouraging constructive dialogue and enforcing our rules against harassment and toxic content. For starters, we are going to increase our enforcement efforts. Here is more information about our updated Community Guidelines.

What Changed Based on Your Feedback

  • Better Structure: We organized individual policies according to our four principles – Safety First, Respect Others, Be Authentic, and Follow the Rules. Each section now better explains what's not allowed and consolidates related policies that were previously scattered across different sections.
  • More Specific Language: Where you told us terms were too vague or confusing, we added more detail about what these policies cover.
  • Protected Expression: We added a new section for journalism, education, advocacy, and mental health content that aims to reduce uncertainty about enforcement in those areas.

Our Approach: Foundation and Choice

We maintain baseline protections against serious harms like violence, exploitation, and fraud. These foundational Community Guidelines are designed to keep Bluesky safe for everyone.

Within these protections, our architecture lets communities layer on different labeling services and moderation tools that reflect their specific values. This gives users choice and control while maintaining essential safety standards.

People will always disagree about whether baseline policies should be tighter or more flexible. Our goal is to provide more detail about where we draw these boundaries. Our approach respects human rights and fundamental freedoms as outlined in the UN Guiding Principles on Business and Human Rights, while recognizing we must follow laws in different jurisdictions.

Looking Forward

Adding clarity to our Guidelines and improving our enforcement efforts are just the beginning. We also plan to experiment with changes to the app that will improve the quality of your experience by reducing rage bait and toxicity. We may not get it right with every experiment, but we will continue to stay true to our purpose and to listen to our community as we go.

These updated guidelines take effect on October 15, and will continue to evolve as we learn from implementation and feedback. Thank you for sharing your perspectives and helping us build better policies for our community.


The Future of AI: Evaluating and optimizing custom RAG agents using Azure AI Foundry


The Future of AI blog series is an evolving collection of posts from the AI Futures team in collaboration with subject matter experts across Microsoft. In this series, we explore tools and technologies that will drive the next generation of AI. Explore more at: Collections | Microsoft Learn.

 

AI agents can be powerful productivity enablers. They can understand business context, plan, make decisions, execute actions, and interact with human stakeholders or other agents to create complex workflows for business needs. 

For example, AI agents can perform retrieval-augmented generation (RAG) over enterprise documents to ground their responses for relevance. However, the “black-box” nature of agents presents significant challenges for developers who need to create and manage them effectively. Developers require tools to assess the quality of an agent’s workflow, and enabling observability into RAG quality is important for building trustworthy AI agents.

At a high-level, a RAG system aims to generate the most relevant answer consistent with grounding documents in response to a user's query. When a query is submitted, the system retrieves relevant content from a corpus and uses that context to generate an informed response.
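To make that flow concrete, here is a deliberately naive sketch of the pattern; `search` and `generate` are toy stand-ins (not any particular SDK) that show where retrieval feeds generation:

```python
# Toy RAG flow: retrieve relevant chunks for a query, then build a generation
# prompt grounded in them. Both helpers are illustrative stand-ins only.

def search(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Naive retriever: rank chunks by keyword overlap with the query."""
    terms = set(query.lower().split())
    ranked = sorted(corpus, key=lambda c: len(terms & set(c.lower().split())), reverse=True)
    return ranked[:k]

def generate(query: str, context: list[str]) -> str:
    """Stand-in for an LLM call: packs retrieved chunks into a grounded prompt."""
    return "Answer using only this context:\n" + "\n".join(context) + "\n\nQ: " + query

corpus = [
    "Refunds are accepted within 30 days of purchase.",
    "Shipping takes 3-5 business days.",
]
context = search("what is the refund window", corpus)
print(generate("what is the refund window", context))  # the prompt a chat model would receive
```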

 

To support RAG quality evaluation, it’s important to evaluate the following aspects using the RAG triad metrics (a code sketch follows the list):

(Figure: a typical RAG pattern whose quality can be evaluated with three distinct metrics: Retrieval, Groundedness, and Relevance.)
  • Retrieval: Is the search output relevant and useful for resolving the user's query? Strong retrieval is critical for providing accurate context.
  • Groundedness: Is the generated response supported by the retrieved documents (e.g., output of a search tool)? The consistency of the response generated with respect to the grounding sources is important.
  • Relevance: After agentic retrieval and generation, does the response fully address the user’s query? This is key to delivering a satisfying experience for the end user.
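These three checks can also be scripted. Below is a minimal sketch assuming the `azure-ai-evaluation` Python package and its reference-free RAG triad evaluators; the constructor and parameter names reflect one version of that package and may differ in yours, so treat it as illustrative:

```python
# Minimal sketch: scoring one RAG turn with the triad evaluators from the
# azure-ai-evaluation package (assumed API; verify against your installed version).
from azure.ai.evaluation import (
    GroundednessEvaluator,
    RelevanceEvaluator,
    RetrievalEvaluator,
)

# Assumed model configuration for the LLM judge (all values are placeholders).
model_config = {
    "azure_endpoint": "https://<your-resource>.openai.azure.com",
    "api_key": "<your-key>",
    "azure_deployment": "gpt-4o",
}

query = "What is our refund window?"
context = "Policy doc: refunds are accepted within 30 days of purchase."
response = "You can request a refund within 30 days of purchase."

retrieval = RetrievalEvaluator(model_config)        # is the retrieved context useful?
groundedness = GroundednessEvaluator(model_config)  # is the answer supported by the context?
relevance = RelevanceEvaluator(model_config)        # does the answer address the query?

print(retrieval(query=query, context=context))
print(groundedness(query=query, context=context, response=response))
print(relevance(query=query, response=response))
```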

Through this blog, you will learn two key best practices for evaluating and optimizing the quality of your custom Retrieval-Augmented Generation (RAG) agent before deployment. The TLDR:

  1. Evaluate and optimize the end-to-end response of your RAG agent using reference-free RAG triad evaluators, focusing on the Groundedness and Relevance evaluators.
  2. Optimize search parameters for advanced scenarios that require ground-truth data and precise retrieval quality by applying golden metrics such as XDCG and max relevance with the Document Retrieval evaluator.

Best Practice 1: Evaluate your RAG App

Complex queries are a common scenario for RAG-powered agents. In both principle and practice, agentic RAG is a more advanced pattern than traditional RAG in agentic scenarios. By using the Agentic Retrieval API (Public Preview) in Azure AI Search in Azure AI Foundry, we observe up to 40% better relevance for complex queries than our baselines. This video walks through the first best practice: use agentic retrieval, then evaluate and optimize the end-to-end quality of the retrieval parameters using the Groundedness and Relevance evaluators:

What Can Agentic Retrieval Do?

Agentic retrieval engines, like Azure AI Search in Azure AI Foundry, are designed to extract grounding information from your knowledge sources. Using conversation history and retrieval parameters, the agent performs the following steps:

  1. Analyzes the entire conversation to understand the user’s information need.
  2. Breaks down complex queries into smaller, focused subqueries.
  3. Executes subqueries concurrently across the configured knowledge sources.
  4. Applies semantic ranking to re-rank and filter retrieved results.
  5. Merges and synthesizes top results into a unified output.

Synthesizing results supports more than search-style result lists; it also enables end-to-end question answering. Configured as an “answer synthesis” knowledge agent, the retrieval pipeline can handle complex, contextual queries within a conversation.
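As a mental model, those five steps can be sketched in plain Python; `decompose`, `search`, and the scores below are toy placeholders rather than the Azure AI Search SDK:

```python
# Conceptual sketch of the agentic retrieval steps: decompose, run subqueries
# concurrently, re-rank, and merge top results. All logic is illustrative.
from concurrent.futures import ThreadPoolExecutor

def decompose(conversation: list[str]) -> list[str]:
    """Steps 1-2: analyze the conversation and split it into focused subqueries."""
    return [part.strip() for part in conversation[-1].split(" and ")]

def search(subquery: str) -> list[tuple[str, float]]:
    """Toy knowledge-source lookup returning (chunk, score) pairs."""
    return [(f"chunk about {subquery}", float(len(subquery)))]

def retrieve(conversation: list[str], top_k: int = 5) -> list[str]:
    subqueries = decompose(conversation)
    with ThreadPoolExecutor() as pool:           # step 3: execute subqueries concurrently
        result_sets = list(pool.map(search, subqueries))
    hits = [hit for results in result_sets for hit in results]
    hits.sort(key=lambda h: h[1], reverse=True)  # step 4: re-rank (semantic ranking in the real service)
    return [chunk for chunk, _ in hits[:top_k]]  # step 5: merge top results into one output

print(retrieve(["compare premium plans and list cancellation fees"]))
```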

How to Evaluate and Optimize Agentic Retrieval?

After you have onboarded your agentic workflow to the agentic retrieval pipeline, the next best practice is to measure its end-to-end response quality and fine-tune the parameters, a practice also known as a "parameter sweep". To fine-tune these parameters, AI development teams should evaluate the quality of the parameters of interest using the end-to-end RAG evaluators for groundedness and relevance.

Common parameters to fine-tune include re-ranker thresholds, the target index, and knowledge-source parameters. These parameters influence how aggressively the agent re-ranks and which sources it queries. Teams should inspect activity and references to validate grounding and build traceable citations.

Teams can use the Azure AI Foundry portal to visualize batch evaluation results and assess the answer quality of their knowledge agents. These evaluation results provide clear pass/fail indicators along with supporting reasoning for each response. After evaluating one set of parameters for the knowledge agent, simply repeat the exercise with another set of parameters of interest, such as an adjusted re-ranker threshold. By A/B testing different parameter sets, development teams can optimize the knowledge agent for their enterprise data.
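Such a sweep can be scripted as a simple loop. In the sketch below, `run_knowledge_agent` is a placeholder for your agentic retrieval call and `score` fakes the Groundedness/Relevance judgments shown earlier:

```python
# Hypothetical A/B parameter sweep over re-ranker thresholds. Replace the two
# placeholder functions with real knowledge-agent calls and evaluator scores.
from statistics import mean

def run_knowledge_agent(query: str, reranker_threshold: float) -> dict:
    # Placeholder: call the knowledge agent here; higher thresholds generally
    # keep fewer but more relevant chunks.
    return {"response": f"answer at threshold {reranker_threshold}", "context": "..."}

def score(result: dict) -> float:
    # Placeholder: in practice, combine GroundednessEvaluator and
    # RelevanceEvaluator scores for the response/context pair.
    return 4.0

eval_queries = ["query one", "query two"]
for threshold in (1.0, 1.5, 2.0, 2.5):
    results = [run_knowledge_agent(q, threshold) for q in eval_queries]
    print(threshold, mean(score(r) for r in results))
```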

For a complete walkthrough, check out this end-to-end example notebook: https://aka.ms/knowledge-agent-eval-sample.

Best Practice 2: Optimize your RAG Search Parameters

Document retrieval quality is a common bottleneck in RAG workflows. To address it, one best practice is to optimize your RAG search parameters for your enterprise data. For advanced scenarios where you can curate ground-truth relevance labels for document retrieval results (commonly called qrels), it is best practice to “sweep” and optimize the parameters by evaluating document retrieval quality with golden metrics such as XDCG and max relevance. This video covers another best practice: curating ground truths for retrieval quality measurement and optimizing against them in advanced scenarios:

What are the Document Retrieval Metrics?

The Document Retrieval evaluator includes the following golden information-retrieval metrics, which specifically target retrieval quality measurement in a RAG system (a toy computation follows the list):

  • Fidelity (higher is better): Measures how well the top n retrieved chunks reflect the content for a given query; calculated as the number of good documents returned out of the total known good documents in a dataset.
  • NDCG (higher is better): Evaluates how close the ranking is to an ideal order in which all relevant items appear at the top of the list.
  • XDCG (higher is better): Assesses the quality of results within the top-k documents, regardless of the scoring of other index documents.
  • Max Relevance N (higher is better): Captures the maximum relevance score in the top-k chunks.
  • Holes (lower is better): Counts the number of documents missing query relevance judgments (ground truth).
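For intuition, here is a toy computation of three of these metrics, using one common DCG-style gain formula; the evaluator’s exact weighting may differ, so treat the numbers as illustrative:

```python
# Toy NDCG/XDCG/max-relevance computation over a ranked list of relevance
# labels. The gain formula (2^rel - 1) / log2(rank + 1) is one common choice.
import math

def dcg(labels: list[int]) -> float:
    return sum((2**rel - 1) / math.log2(i + 2) for i, rel in enumerate(labels))

def ndcg_at_k(labels: list[int], k: int) -> float:
    ideal = dcg(sorted(labels, reverse=True)[:k])
    return dcg(labels[:k]) / ideal if ideal else 0.0

def xdcg_at_k(labels: list[int], k: int) -> float:
    # XDCG only looks at the top-k positions, ignoring the rest of the index.
    return dcg(labels[:k])

def max_relevance_at_k(labels: list[int], k: int) -> int:
    return max(labels[:k])

ranked = [3, 2, 0, 1]  # ground-truth labels of retrieved docs, in ranked order
print(ndcg_at_k(ranked, 3), xdcg_at_k(ranked, 3), max_relevance_at_k(ranked, 3))
```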

Using Golden Metrics for Parameter Optimization

With these golden metrics in the Document Retrieval evaluator, you can make more precise measurements and turbo-charge the parameter sweep scenario for any search engine that returns relevance scores.

For illustration purposes, we will use the Azure AI Search API as the search engine, but the same approach applies to other solutions, from agentic retrieval in Azure AI Search to LlamaIndex.

  1. Prepare test queries and generate retrieval results with relevance scores, using a retrieval engine that returns relevance scores such as the Azure AI Search API, agentic retrieval in Azure AI Search, or LlamaIndex (see https://learn.microsoft.com/en-us/azure/ai-foundry/agents/how-to/tools/azure-ai-search?tabs=azurecli).
  2. Label relevance for each result as ground truths:
    • Human judgment: Typically performed by a subject matter expert.
    • LLM-based judgment: Use an AI-assisted evaluator as an alternative. For example, you can reuse the Relevance evaluator mentioned earlier to score each text chunk.

By combining curated ground truth with automated evaluation, you can systematically sweep parameters (e.g., top-k values in search algorithms, or chunk size and overlap size when you create the index) and identify the configuration that delivers the best retrieval quality for your enterprise data. Other search engines expose similar parameters (as long as they return relevance scores); your team should fine-tune them according to your enterprise dataset.
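Putting qrels and a golden metric together, the sweep reduces to a loop like the one below; `run_search`, the configuration names, and the qrels are hypothetical placeholders:

```python
# Sketch: compare search configurations against curated qrels with a simple
# xdcg@3. Swap run_search for a real search call that returns ranked doc IDs.
import math

qrels = {("q1", "docA"): 3, ("q1", "docB"): 1}  # curated ground-truth labels

def run_search(query: str, config: str) -> list[str]:
    # Placeholder for text/vector/semantic/hybrid search with your parameters.
    return ["docA", "docB"] if config == "hybrid" else ["docB", "docA"]

def xdcg_at_3(query: str, doc_ids: list[str]) -> float:
    labels = [qrels.get((query, d), 0) for d in doc_ids[:3]]  # unlabeled docs count as holes
    return sum((2**rel - 1) / math.log2(i + 2) for i, rel in enumerate(labels))

for config in ("text", "vector", "semantic", "hybrid"):
    print(config, xdcg_at_3("q1", run_search("q1", config)))
```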

After you have curated the ground truths, you can submit multiple evaluation runs on Azure AI Foundry, each based on a different search parameter setting. For illustration purposes, we evaluated four different search algorithms (text search, vector search, semantic search, and hybrid search) for the Azure AI Search API on a demo dataset using the Document Retrieval evaluator. You can then use the visualization of document retrieval metrics in Azure AI Foundry Observability to find the optimal search parameters as follows:

  1. Select the evaluation runs corresponding to multiple retrieval parameters for comparison in Azure AI Foundry and select "Compare":
  2. View the tabular results for all evaluation runs:
  3. Find the best parameter in the charts for each metric: as an illustration, the `xdcg@3` chart below suggests hybrid search is the best RAG parameter for this particular dataset:

Applying the Optimal Parameters

Once you’ve identified the optimal parameters for your retrieval engine, you can confidently integrate them into your RAG agent, fine-tuned to your enterprise data. Keep in mind that this evaluator works with any search engine that returns relevance ranking scores, including agentic retrieval in Azure AI Search and LlamaIndex.

For a complete end-to-end example with Azure AI Search, check out this notebook: https://aka.ms/doc-retrieval-sample.

Read our recent blogs on this topic:

Now it’s your turn to create with Azure AI Foundry

  1. Get started with Azure AI Foundry, and jump directly into Visual Studio Code
  2. Download the Azure AI Foundry SDK
  3. Join the Azure AI Foundry Learn Challenge
  4. Review the Azure AI Foundry documentation
  5. Keep the conversation going in GitHub and Discord

Disciplined Guardrail Development in Enterprise Applications with GitHub Copilot


What Is Disciplined Guardrail-Based Development?

In AI-assisted software development, approaches like Vibe Coding—which prioritize momentum and intuition—often fail to ensure code quality and maintainability. To address this, Disciplined Guardrail-Based Development introduces structured rules ("guardrails") that guide AI systems during coding and maintenance tasks, ensuring consistent quality and reliability.

To get AI (LLMs) to generate appropriate code, developers must provide clear and specific instructions. Two key elements are essential:

  1. What to build – Clarifying requirements and breaking down tasks
  2. How to build it – Defining the application architecture

How these two elements are handled depends on the development methodology or process being used; examples follow.

How to Set Up Disciplined Guardrails in GitHub Copilot

To implement disciplined guardrail-based development with GitHub Copilot, two key configuration features are used:

1. Custom Instructions (.github/copilot-instructions.md): This file allows you to define persistent instructions that GitHub Copilot will always refer to when generating code.
  • Purpose: Establish coding standards, architectural rules, naming conventions, and other quality guidelines.
  • Best Practice: Instead of placing all instructions in a single file, split them into multiple modular files and reference them accordingly. This improves maintainability and clarity.
  • Example Use: You might define rules like using camelCase for variables, enforcing error boundaries in React, or requiring TypeScript for all new code.
  • https://docs.github.com/en/copilot/how-tos/configure-custom-instructions/add-repository-instructions
2. Chat Modes (.github/chatmodes/*.chatmode.md): These files define specialized chat modes tailored to specific tasks or workflows.
  • Purpose: Customize Copilot’s behavior for different development contexts (e.g., debugging, writing tests, refactoring).
  • Structure: Each .chatmode.md file includes metadata and instructions that guide Copilot’s responses in that mode.
  • Example Use: A debug.chatmode.md might instruct Copilot to focus on identifying and resolving runtime errors, while a test.chatmode.md could prioritize generating unit tests with specific frameworks.
  • https://code.visualstudio.com/docs/copilot/customization/custom-chat-modes

The files to be created and their relationships are as follows.

The next sections introduce how to create each of these files.

#1: Custom Instructions

With custom instructions, you can define instructions that are always provided to GitHub Copilot. The prepared files are always referenced during chat sessions and passed to the LLM (this can also be confirmed from the chat history). One important note: split the content into several files and include links to those files within the .github/copilot-instructions.md file, because a single file that contains everything can become too long.

There are mainly two types of content that should be described in custom instructions:

 A: Development Process (≒ outcome + Creation Method)

  • What documents or code will be created: requirements specification, design documents, task breakdown tables, implementation code, etc.
  • In what order and by whom they will be created: for example, proceed in the order of requirements definition → design → task breakdown → coding.

B: Application Architecture

  • How will the outcomes defined in A be created?
  • What technology stack and component structure will be used?

A concrete example of copilot-instructions.md is shown below.

```markdown
# Development Rules

## Architecture

- When performing design and coding tasks, always refer to the following architecture documents and strictly follow them as rules.

### Product Overview
- Document the product overview in `.github/architecture/product.md`

### Technology Stack
- Document the technologies used in `.github/architecture/techstack.md`

### Coding Standards
- Document coding standards in `.github/architecture/codingrule.md`

### Project Structure
- Document the project directory structure in `.github/architecture/structure.md`

### Glossary (Japanese-English)
- Document the list of terms used in the project in `.github/architecture/dictionary.md`

## Development Flow

- Follow a disciplined development flow and execute the following four stages in order (proceed to the next stage only after completing the current one):
  1. Requirement Definition
  2. Design
  3. Task Breakdown
  4. Coding

### 1. Requirement Definition
- Document requirements in `docs/[subsystem_name]/[business_name]/requirement.md`
- Use `requirement.chatmode.md` to define requirements
- Focus on clarifying objectives, understanding the current situation, and setting success criteria
- Once requirements are defined, obtain user confirmation before proceeding to the next stage

### 2. Design
- Document design in `docs/[subsystem_name]/[business_name]/design.md`
- Use `design.chatmode.md` to define the design
- Define UI, module structure, and interface design
- Once the design is complete, obtain user confirmation before proceeding to the next stage

### 3. Task Breakdown
- Document tasks in `docs/[subsystem_name]/[business_name]/tasks.md`
- Use `tasks.chatmode.md` to define tasks
- Break down tasks into executable units and set priorities
- Once task breakdown is complete, obtain user confirmation before proceeding to the next stage

### 4. Coding
- Implement code under `src/[subsystem_name]/[business_name]/`
- Perform coding task by task
- Update progress in `docs/[subsystem_name]/[business_name]/tasks.md`
- Report to the user upon completion of each task
```

Note: The only file that is always sent to the LLM is `copilot-instructions.md`.  Documents linked from there (such as `product.md` or `techstack.md`) are not guaranteed to be read by the LLM. That said, a reasonably capable LLM will usually review these files before proceeding with the work.  

If the LLM does not properly reference each file, you may explicitly add these architecture documents to the context. Another approach is to instruct the LLM to review these files in the **chat mode settings**, which will be described later.   

There are various “schools of thought” regarding application architecture, and it is still an ongoing challenge to determine exactly what should be defined and what documents should be created. The choice of architecture depends on factors such as the business context, development scale, and team structure, so it is difficult to prescribe a one-size-fits-all approach. That said, as a general guideline, it is desirable to summarize the following:

  • Product Overview: Overview of the product, service, or business, including its overall characteristics
  • Technology Stack: What technologies will be used to develop the application?
  • Project Structure: How will folders and directories be organized during development?
  • Module Structure: How will the application be divided into modules?
  • Coding Rules: Rules for handling exceptions, naming conventions, and other coding practices

Writing all of this from scratch can be challenging. A practical approach is to create template information with the help of Copilot and then refine it. Specifically, you can:

  • Use tools like M365 Copilot Researcher to create content based on general principles
  • Analyze a prototype application and have the architecture information summarized (using Ask mode or Edit mode, feed the solution files to a capable LLM for analysis)

However, in most cases, the output cannot be used as-is.

  • The structure may not be analyzed correctly (hallucinations may occur)
  • Project-specific practices and rules may not be captured

Use the generated content as a starting point, and then refine it to create architecture documentation tailored to your own project.

When creating architecture documents for enterprise-scale application development, a useful approach is to distinguish between the foundational parts and the individual application parts. Disciplined guardrail-based development is particularly effective when building multiple applications in a “cookie-cutter” style on top of a common foundation. A clear example of this is Data-Oriented Architecture (DOA). In DOA, individual business applications are built on top of a shared database that serves as the overall common foundation.

In this case, the foundational parts (the database layer) should not be modified arbitrarily by individual developers. Instead, focus on standardizing the development of the individual application parts while ensuring consistency. Architecture documentation should be organized with this distinction in mind, emphasizing the uniformity of application-level development built upon the stable foundation.

#2 Chat Mode

By default, GitHub Copilot provides three chat modes: Ask, Edit, and Agent. However, by creating files under .github/chatmodes/*.chatmode.md, you can customize the Agent mode to create chat modes tailored for specific tasks.

Specifically, you can configure the following three aspects. Functionally, this allows you to perform a specific task without having to manually change the model or tools, or write detailed instructions each time:

  • model: Specify the default LLM to use
    (Note: The user can still manually switch to another LLM if desired)
  • tools: Restrict which tools can be used
    (Note: The user can still manually select other tools if desired)
  • custom instructions: Provide custom instructions specific to this chat mode

A concrete example of .github/chatmodes/*.chatmode.md is shown below.

```markdown
---
description: This mode is used for requirement definition tasks.
model: Claude Sonnet 4
tools: ['changes', 'codebase', 'editFiles', 'fetch', 'findTestFiles', 'githubRepo', 'new', 'openSimpleBrowser', 'runCommands', 'search', 'searchResults', 'terminalLastCommand', 'terminalSelection', 'usages', 'vscodeAPI', 'mssql_connect', 'mssql_disconnect', 'mssql_list_servers', 'mssql_show_schema']
---

# Requirement Definition Mode

In this mode, requirement definition tasks are performed. Specifically, the project requirements are clarified, and necessary functions and specifications are defined. Based on instructions or interviews with the user, document the requirements according to the format below. If any specifications are ambiguous or unclear, Copilot should ask the user questions to clarify them.

## File Storage Location

Save the requirement definition file in the following location:

- Save as `requirement.md` under the directory `docs/[subsystem_name]/[business_name]/`

## Requirement Definition Format

While interviewing the user, document the following items in the Markdown file:

- **Subsystem Name**: The name of the subsystem to which this business belongs
- **Business Name**: The name of the business
- **Overview**: A summary of the business
- **Use Cases**: Clarify who uses this business, when/under what circumstances, and for what purpose, using the following structure:
  - **Who (Persona)**: User or system roles
  - **When/Under What Circumstances (Scenario)**: Timing when the business is executed
  - **Purpose (Goal)**: Objectives or expected outcomes of the business
- **Importance**: The importance of the business (e.g., High, Medium, Low)
- **Acceptance Criteria**: Conditions that must be satisfied for the requirement to be considered met
- **Status**: Current state of the requirement (e.g., In Progress, Completed)

## After Completion

- Once requirement definition is complete, obtain user confirmation and proceed to the next stage (Design).
```
Tips for Creating Chat Modes

Here are some tips for creating custom chat modes:

  • Align with the development process: Create chat modes based on the workflow and the deliverables.
  • Instruct the LLM to ask the user when unsure: Direct the LLM to request clarification from the user if any information is missing.
  • Clarify what deliverables to create and where to save them: Make it explicit which outputs are expected and their storage locations.

The second point is particularly important. Many LLMs tend to respond to user prompts in an overly agreeable manner (a behavior known as sycophancy).
As a result, they may fill in unspecified requirements or perform tasks that were not requested, often with the intention of being helpful.

The key difference between Ask/Edit modes and Agent mode is that Agent mode allows the LLM to proactively ask questions and engage in dialogue with the user. However, unless the user explicitly includes a prompt such as “ask if you don’t know,” the AI rarely initiates questions on its own.
By creating a custom chat mode and instructing the LLM to “ask the user when unsure,” you can fully leverage the benefits of Agent mode.

About Tools

You can easily check tool names from the list of available tools in the command palette.

Alternatively, as shown in the diagram below, it can be convenient to open the custom chat mode file and specify the tool configuration.
You can specify not only the MCP server functionality but also built-in tools and Copilot Extensions.

Example of Actual Operation

An example interaction when using this chat mode is as follows:

  • The LLM behaves according to the custom instructions defined in the chat mode.
  • When you answer questions from GitHub Copilot, the LLM uses that information to reason and proceed with the task.
  • However, the output is not guaranteed to be correct (hallucinations may occur) → A human should review the output and make any necessary corrections before committing.

The basic approach to disciplined guardrail-based development has been covered above. In actual business application development, it is also helpful to understand the following two points:

  1. Referencing the database schema
  2. Integrated management of design documents and implementation code
(Important) Reading the Database Schema

In business application development, requirements definition and functional design are often based on the schema information of entities.
There are two main ways to allow the system to read schema information:

  1. Dynamically read the schema from a development/test DB server using MCP or similar tools.
  2. Include a file containing schema information within the project and read from it.

For the first approach, a development/test database is prepared, and schema information is read via an MCP server or Copilot Extensions. For SQL Server or Azure SQL Database, an MCP server is available, but its setup can be cumbersome, so Copilot Extensions are often the easier route. This dynamic approach is often seen online, but it is not recommended for the following reasons:

  • Setting up MCP Server or Copilot Extensions can be cumbersome (installation, connection string management, etc.)
  • It is time-consuming (the LLM needs schema information → reads the schema → writes code based on it)

Connecting to a DB server via MCP or similar tools is useful for scenarios such as “querying a database in natural language” for non-engineers performing data analysis. However, if the goal is simply to obtain the schema information of entities needed for business application development, the method described below is much simpler.

Storing Schema Information Within the Project

Place a file containing the schema information inside the project. Any of the following formats is recommended. Write custom instructions so that development refers to this file:

  • DDL (full CREATE DATABASE scripts)
  • O/R mapper files (e.g., Entity Framework context files)
  • Text files documenting schema information, etc.

DDL files are difficult for humans to read, but AI (LLMs) can easily read and accurately understand them. In .NET + SQL development, it is recommended to include both the DDL and EF O/R mapper files. Additionally, if you include links to these files in your architecture documents and chat mode instructions, the LLM can generate code while understanding the schema with high accuracy.
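One low-effort way to produce such an in-repo schema file is a small script run against the development database. This is a sketch assuming `pyodbc` and a local SQL Server; the connection string and output path are placeholders:

```python
# Export the dev-database schema into a text file the LLM can read from the repo.
import pyodbc
from pathlib import Path

out = Path(".github/architecture/schema.md")
out.parent.mkdir(parents=True, exist_ok=True)

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};SERVER=localhost;"
    "DATABASE=DevDb;Trusted_Connection=yes;"  # placeholder connection string
)
rows = conn.cursor().execute(
    "SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE, IS_NULLABLE "
    "FROM INFORMATION_SCHEMA.COLUMNS ORDER BY TABLE_NAME, ORDINAL_POSITION"
).fetchall()

lines, current = [], None
for table, column, dtype, nullable in rows:
    if table != current:
        lines.append(f"\n## {table}")  # one section per table
        current = table
    lines.append(f"- {column}: {dtype} (nullable: {nullable})")
out.write_text("\n".join(lines), encoding="utf-8")
```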

Integrated Management of Design Documents and Implementation Code

Disciplined guardrail-based development with LLMs has made it practical to synchronize and manage design documents and implementation code together—something that was traditionally very difficult. In long-standing systems, it is common for old design documents to become largely useless.

  • During maintenance, code changes are often prioritized.
  • As a result, updating and maintaining design documents tends to be neglected, leading to a significant divergence between design documents and the actual code.

For these reasons, the following have been considered best practices (though often not followed in reality):

  • Limit requirements and external design documents to the minimum necessary.
  • Do not create internal design documents; instead, document within the code itself.
  • Always update design documents before making changes to the implementation code.

When using LLMs, guardrail-based development makes it easier to enforce a “write the documentation first” workflow. Following the flow of defining specifications, updating the documents, and then writing code also helps the LLM generate appropriate code more reliably. Even if code is written first, LLM-assisted code analysis can significantly reduce the effort required to update the documentation afterward. However, the following points should be noted when doing this:

  • Create and manage design documents as text files, not Word, Excel, or PowerPoint.
  • Use text-based technologies like Mermaid for diagrams.
  • Clearly define how design documents correspond to the code.

The last point is especially important. It is crucial to align the structure of requirements and design documents with the structure of the implementation code. For example:

  • Place design documents directly alongside the implementation code.
  • Align folder structures, e.g., /doc and /src.

Information about grouping methods and folder mapping should be explicitly included in the custom instructions.
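This folder correspondence can even be checked mechanically, for example in CI. Here is a minimal sketch assuming the `docs/[subsystem_name]/[business_name]` and `src/[subsystem_name]/[business_name]` layout used earlier in this article:

```python
# Guardrail check: every src/[subsystem]/[business] module must have a
# matching design.md under docs/, mirroring the folder mapping.
from pathlib import Path

missing = []
for code_dir in Path("src").glob("*/*"):
    if code_dir.is_dir():
        design = Path("docs") / code_dir.relative_to("src") / "design.md"
        if not design.exists():
            missing.append(str(design))

if missing:
    raise SystemExit("Design docs missing for:\n" + "\n".join(missing))
print("docs/ and src/ structures are in sync.")
```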

Conclusion of Disciplined Guardrail-Based Development with GHC

Formalizing and Applying Guardrails

  • Define the development flow and architecture documents in .github/copilot-instructions.md using split references.
  • Prepare .github/chatmodes/* for each development phase, enforcing “ask the AI if anything is unclear.”

Synchronization of Documents and Implementation Code

  • Update docs first → use the diff as the basis for implementation (Doc-first).
  • Keep docs in text format (Markdown/Mermaid). Fix folder correspondence between /docs and /src.

Handling Schemas

  • Store DDL/O-R mapper files (e.g., EF) in the repository and have the LLM reference them.
  • Minimize dynamic DB connections, prioritizing speed, reproducibility, and security.

This disciplined guardrail-based development technique is an AI-assisted approach that significantly improves the quality, maintainability, and team efficiency of enterprise business application development. Adapt it appropriately to each project to maximize productivity in application development.


RNR 344 - React Native 0.81


This week, Mazen is joined by Infinite Red teammates Frank Calise and Tyler Williams to unpack everything included in the huge React Native 0.81 release. They cover Android 16 support, precompiled iOS builds, and many other updates! 


Show Notes

 

Connect With Us!

 

This episode is brought to you by Infinite Red!

Infinite Red is an expert React Native consultancy located in the USA. With nearly a decade of React Native experience and deep roots in the React Native community (hosts of Chain React and the React Native Newsletter, core React Native contributors, creators of Ignite and Reactotron, and much, much more), Infinite Red is the best choice for helping you build and deploy your next React Native app.





Download audio: https://cdn.simplecast.com/audio/2de31959-5831-476e-8c89-02a2a32885ef/episodes/7c32cb3c-55fc-405b-899a-be01f0818951/audio/56c6796a-e271-4ee1-ac06-bda949cf4bb9/default_tc.mp3?aid=rss_feed&feed=hEI_f9Dx