Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.
152600 stories
·
33 followers

Apple picks Google’s Gemini AI for its big Siri upgrade

1 Share

Apple will use Google's Gemini AI model to power a more personalized Siri coming this year. "After careful evaluation, Apple determined that Google's Al technology provides the most capable foundation for Apple Foundation Models and is excited about the innovative new experiences it will unlock for Apple users," Google and Apple announced on Monday.

CNBC first reported on the multi-year partnership, which will allow Apple to use Google Gemini and the company's cloud technology for its future models. "These models will help power future Apple Intelligence features, including a more personalized Siri coming this year," Apple and Google say, a …

Read the full story at The Verge.

Read the whole story
alvinashcraft
51 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

Leaked Windows 11 Feature Shows Copilot Moving Into File Explorer

1 Share

Microsoft is testing a hidden 'Chat with Copilot' button in Windows 11 File Explorer, signaling deeper AI search and a coming Agent Launchers framework.

The post Leaked Windows 11 Feature Shows Copilot Moving Into File Explorer appeared first on TechRepublic.

Read the whole story
alvinashcraft
1 minute ago
reply
Pennsylvania, USA
Share this story
Delete

Want better AI outputs? Try context engineering.

1 Share
Editor’s note: A version of this blog was originally published in The GitHub Insider newsletter. Sign up now for more technical content >

If you’ve ever felt like GitHub Copilot could be even stronger with just a little more context, you’re right. Context engineering is quickly becoming one of the most important ways developers shape, guide, and improve AI-assisted development.

What is context engineering?

Context engineering is the evolution of prompt engineering. It’s focused less on clever phrasing and more, as Braintrust CEO Ankur Goyal puts it, on “bringing the right information (in the right format) to the LLM.”

At GitHub Universe this past fall, Harald Kirschner—principal product manager at Microsoft and longtime VS Code and GitHub Copilot expert—outlined three practical ways developers can apply context engineering today:

  • Custom instructions
  • Reusable prompts
  • Custom agents

Each technique gives Copilot more of the information it needs to produce code matching your expectations, your architecture, and your team’s standards.

Let’s explore all three, so you can see how providing better context helps Copilot work the way you do.

1. Custom instructions: Give Copilot the rules it should follow

Custom instruction files help Copilot understand your:

  • Coding conventions
  • Language preferences
  • Naming standards
  • Documentation style

You can use:

For example, you might define how React components should be structured, how errors should be handled in a Node service, or how you want API documentation formatted. Copilot then applies those rules automatically as Copilot works.

Learn how to set up custom instructions 👉 

2. Reusable prompts: Standardize your common workflows

Reusable prompt files let you turn frequent tasks—like code reviews, scaffolding components, generating tests, or initializing projects—into prompts that you can call instantly and consistently.

Use:

  • Prompt files: .github/prompts/*.prompts.md
  • Slash commands such as /create-react-form to trigger structured tasks

This helps teams enforce consistency, speed up onboarding, and execute repeatable workflows the same way every time.

See examples of reusable prompt files 👉 

3. Custom agents: Create task-specific AI personas

Custom agents allow you to build specialized AI assistants with well-defined responsibilities and scopes. For example:

  • An API design agent to review interfaces
  • A security agent that performs static analysis tasks
  • A documentation agent that rewrites comments or generates examples

Agents can include their own tools, instructions, constraints, and behavior models. And yes, you can even enable handoff between agents for more complex workflows.

Learn how to create and configure custom agents 👉 

Why context engineering matters

The goal isn’t just better outputs, it’s better understanding by Copilot. When you provide Copilot with clearer context:

  • You get more accurate and reliable code.
  • You reduce back-and-forth prompting.
  • You increase consistency across files and repositories.
  • You stay in flow longer instead of rewriting or correcting results.

And the more you experiment with context engineering, the more you’ll discover how deeply it can shape your development experience.

Get started with context engineering in GitHub Copilot >

More resources

The post Want better AI outputs? Try context engineering. appeared first on The GitHub Blog.

Read the whole story
alvinashcraft
1 minute ago
reply
Pennsylvania, USA
Share this story
Delete

If You’re Going to Vibe Code, Vibe Responsibly!

1 Share

Thanks to LLMs, developers are getting a serious productivity boost. With vibe coding – telling AI what you want in plain language and letting it generate the code while you review and tweak it – we’re producing more code than ever.

It only makes sense that, compared to just a few years ago, we’re now expected to maintain far more repositories.

It used to be that developers wrote their own code and, through debugging and researching solutions, learned and memorized how it worked in depth. With AI-generated code, this happens less often. Since we’re not the ones writing it from scratch, it’s easy to forget where things are and how they function.

So what’s the issue?

After some time has passed, when we need to revisit a service or repository, it takes a while to remember how the details actually work.

Making AI code easy to work with

To make life easier, we’ve relied on design principles that improve code readability and make it more scalable and maintainable. These include:

  • KISS (Keep It Simple, Stupid)
  • DRY (Don’t Repeat Yourself)
  • YAGNI (You Aren’t Gonna Need It)
  • Separation of Concerns (SoC)

In a year or two, when most of our code is AI-generated, debugging and expanding the codebase will become significantly harder. That’s because we likely never explored the generated code in depth – we focused on making sure it worked, ran efficiently, and looked right.

If you can’t explain it, it’s too complex

So, how do we address this?

In my opinion, two principles are becoming more important than the rest: Readable Code is Better than Clever Code and KISS (Keep It Simple, Stupid).

Being able to open a codebase and start working immediately is going to be crucial for developers. So how do we make our services more accessible? When reviewing pull requests, one key question should guide every new feature or component:

Is this simple enough that I can read it like a notebook, without having to dig through the code?

And as always, documentation and testing remain essential. One perhaps controversial opinion:

Don’t use AI to generate docs or tests, at least not the first draft.

Rephrasing is fine, but if you can’t explain the code in your own simple words or write straightforward unit and integration tests, the service is either too complex, or you’ve “vibed” a little too hard.

Ask the LLM to generate meaningful metrics and logs

Additionally, when generating code, make sure to prompt the LLM to include meaningful metrics and logs, ones that let you pinpoint issues without even diving into the codebase. Use logs sparingly, but make debug logs detailed and informative.

Imagine opening a repository you’ve never touched before, needing to implement a change or fix a bug. What kind of logs and graphs would make it easy for you to debug the service quickly?

When balancing code simplicity and performance, remember this: while keeping code performant is important, overly complex solutions can hurt future developers (especially juniors) who may need to maintain or fix it later.

Less code, more value

Another key principle is the Pareto principle, or the 80/20 rule, which is widely used across industries, including software development. The idea is that 20% of a developer’s time produces 80% of the value.

Applied to code, you could say that 80% of the code delivers only 20% of the value. So how does this tie into vibe coding?

Sometimes an LLM might “hallucinate” a call to a library that doesn’t exist. A natural next step would be to implement that library yourself – but by then, you may have already gone too far.

I like to think of the Pareto principle in this context, as the Grug-Brained Developer puts it: “80 want with 20 code.” The solution might not have every bell-and-whistle the project manager imagined; it might even be a little rough around the edges, but it works, delivers most of the value, and keeps unnecessary complexity in check. Avoiding extra code like this helps keep the codebase more readable and maintainable in the long run.

Code fast, secure… faster!

With the rise of LLM usage, security leaks have surged – but why?

When coding with LLMs, are you giving them the real context of your application’s deployment – just letting them write code in a test playground? Most developers aren’t.

We’ve seen a YoY increase in shared API keys and other sensitive data on GitHub, and it’s pretty clear there’s a correlation.

Also, be wary of the comments you’re leaving behind.

And for the grand finale – how should we approach this?

  • Ask the AI you’re using to perform a security analysis on the codebase.
  • Use Sonar, Snyk, or another code analysis tool.

In short: If you’re going to vibe code, vibe responsibly!

The post If You’re Going to Vibe Code, Vibe Responsibly! appeared first on ShiftMag.

Read the whole story
alvinashcraft
1 minute ago
reply
Pennsylvania, USA
Share this story
Delete

MCP Sampling: When Your Tools Need to Think

1 Share

The following article originally appeared on Block’s blog and is being republished here with the author’s permission.

If you’ve been following MCP, you’ve probably heard about tools which are functions that let AI assistants do things like read files, query databases, or call APIs. But there’s another MCP feature that’s less talked about and arguably more interesting: sampling.

Sampling flips the script. Instead of the AI calling your tool, your tool calls the AI.

Let’s say you’re building an MCP server that needs to do something intelligent like summarize a document, translate text, or generate creative content. You have three options:

Option 1: Hardcode the logic. Write traditional code to handle it. This works for deterministic tasks, but falls apart when you need flexibility or creativity.

Option 2: Bake in your own LLM. Your MCP server makes its own calls to OpenAI, Anthropic, or whatever. This works, but now you’ve got API keys to manage and costs to track, and you’ve locked users into your model choice.

Option 3: Use sampling. Ask the AI that’s already connected to do the thinking for you. No extra API keys. No model lock-in. The user’s existing AI setup handles it.

How Sampling Works

When an MCP client like goose connects to an MCP server, it establishes a two-way channel. The server can expose tools for the AI to call, but it can also request that the AI generate text on its behalf.

Here’s what that looks like in code (using Python with FastMCP):

Using Python with FastMCP sampling

The ctx.sample() call sends a prompt back to the connected AI and waits for a response. From the user’s perspective, they just called a “summarize” tool. But under the hood, that tool delegated the hard part to the AI itself.

A Real Example: Council of Mine

Council of Mine is an MCP server that takes sampling to an extreme. It simulates a council of nine AI personas who debate topics and vote on each other’s opinions.

But there’s no LLM running inside the server. Every opinion, every vote, every bit of reasoning comes from sampling requests back to the user’s connected LLM.

The council has nine members, each with a distinct personality:

  • 🔧 The Pragmatist – “Will this actually work?”
  • 🌟 The Visionary – “What could this become?”
  • 🔗 The Systems Thinker – “How does this affect the broader system?”
  • 😊 The Optimist – “What’s the upside?”
  • 😈 The Devil’s Advocate – “What if we’re completely wrong?”
  • 🤝 The Mediator – “How can we integrate these perspectives?”
  • 👥 The User Advocate – “How will real people interact with this?”
  • 📜 The Traditionalist – “What has worked historically?”
  • 📊 The Analyst – “What does the data show?”

Each personality is defined as a system prompt that gets prepended to sampling requests.

When you start a debate, the server makes nine sampling calls, one for each council member:

Council of members 1

That temperature=0.8 setting encourages diverse, creative responses. Each council member “thinks” independently because each is a separate LLM call with a different personality prompt.

After opinions are collected, the server runs another round of sampling. Each member reviews everyone else’s opinions and votes for the one that resonates most with their values:

The council has voted

The server parses the structured response to extract votes and reasoning.

One more sampling call generates a balanced summary that incorporates all perspectives and acknowledges the winning viewpoint.

Total LLM calls per debate: 19

  • 9 for opinions
  • 9 for voting
  • 1 for synthesis

All of those calls go through the user’s existing LLM connection. The MCP server itself has zero LLM dependencies.

Benefits of Sampling

Sampling enables a new category of MCP servers that orchestrate intelligent behavior without managing their own LLM infrastructure.

No API key management: The MCP server doesn’t need its own credentials. Users bring their own AI, and sampling uses whatever they’ve already configured.

Model flexibility: If a user switches from GPT to Claude to a local Llama model, the server automatically uses the new model.

Simpler architecture: MCP server developers can focus on building a tool, not an AI application. They can let the AI be the AI, while the server focuses on orchestration, data access, and domain logic.

When to Use Sampling

Sampling makes sense when a tool needs to:

  • Generate creative content (summaries, translations, rewrites)
  • Make judgment calls (sentiment analysis, categorization)
  • Process unstructured data (extract info from messy text)

It’s less useful for:

  • Deterministic operations (math, data transformation, API calls)
  • Latency-critical paths (each sample adds round-trip time)
  • High-volume processing (costs add up quickly)

The Mechanics

If you’re implementing sampling, here are the key parameters:

Sampling parameters

The response object contains the generated text, which you’ll need to parse. Council of Mine includes robust extraction logic because different LLM providers return slightly different response formats:

Council of Mine robust extraction logic

Security Considerations

When you’re passing user input into sampling prompts, you’re creating a potential prompt injection vector. Council of Mine handles this with clear delimiters and explicit instructions:

Council of Mine delimiters and instructions

This isn’t bulletproof, but it raises the bar significantly.

Try It Yourself

If you want to see sampling in action, Council of Mine is a great playground. Ask goose to start a council debate on any topic and watch as nine distinct perspectives emerge, vote on each other, and synthesize into a conclusion all powered by sampling.



Read the whole story
alvinashcraft
2 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

Shelf-control: Make a fresh start and declutter with Copilot

1 Share

Hello, Insiders! Let’s face it: After the holiday whirlwind, January’s clean slate beckons us to rescue our spaces from the grips of last year’s clutter. With Microsoft Copilot, you can make your home and workspace shine without any heavy (organizational) lifting!

Celebrate “Clean Your Desk” day

January 12 is National Clean Your Desk Day, and is all about starting the new year with a clear mind and a clean, organized desk. This year, ask Copilot for tips for tackling mess or organizing piles of paperwork.

Sample prompt

Create a 30-minute detox for my desk I can do right now, plus suggest simple cleaning supplies it makes sense to have within reach and a file organizer I can buy online for less than $50.

Copilot’s response

Reset your digital space

There’s something undeniably refreshing about an organized desktop and browser. Have Copilot set you up for success with suggestions for doing a quick and painless digital detox.

Sample prompt

My New Year's resolution for work this year is to better organize my to-dos, digital files, and notes. Give me 5 easy apps or techniques I can use to stay organized and know where to find things when I need them.

Copilot’s response

Get inspired to refresh your home

After the chaos of the holiday, you may want to start the new year off with a fresh look for your living room. Lean on Copilot for a reasonable and motivating checklist and ideas for new furniture or aesthetics on a budget.

Sample prompt

I want my living room décor to reflect every season, but I'm not big on flashy seasonal pieces. Suggest a few elements I can introduce that make it feel seasonally trendy, modern, and fun. My budget is $400.

Copilot’s response

 

 

This January, make room for more joy (and less stuff) with Copilot by your side. Organization has never sounded so fun – or so easy!

 

Learn about the Microsoft 365 Insider program and sign up for the Microsoft 365 Insider newsletter to get the latest information about Insider features in your inbox once a month!

Read the whole story
alvinashcraft
2 minutes ago
reply
Pennsylvania, USA
Share this story
Delete
Next Page of Stories