Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Prompt Engineering for Spec-Driven Development with SpecKit


Introduction

Charlotte Yeo, UCL MEng Computer Science https://www.linkedin.com/in/charlotte-yeo-627476294/

Supervisors: Janaina Mourao-Miranda (UCL) and Lee Stott (Microsoft).

For my final-year MEng project at UCL, I investigated how to get the best results out of SpecKit, a spec-driven AI development framework, by systematically testing different prompt strategies.

Here's what I found.

Project Overview

LLMs are powerful coding assistants, but they struggle to maintain context over long development sessions, leading to hallucinations and inconsistent outputs. SpecKit addresses this by using persistent, structured specification documents as memory throughout the development process. The developer writes a natural language spec; SpecKit builds the software from it.

The problem is that no one has established best practices for writing those specs. This project aimed to fill that gap.

Experiments

I ran 10 experiments, each using SpecKit to build the same target system, a multi-agent AI code verification tool, from a different prompt formulation. The variables I tested were prompt authority, prompt format, level of detail, and output format. Keeping the target software constant isolates the effect of each prompt change on SpecKit's performance.

The target system itself used Microsoft Agent Framework, Azure Cosmos DB for RAG, and Microsoft Foundry to access GPT-5.2, all orchestrated via a Python codebase. This covered a wide range of real-world engineering challenges: multi-agent coordination, cloud service integration, and working with a library new enough that the model hadn't been trained on it.

Technical Details

SpecKit runs as a series of commands inside GitHub Copilot in VS Code, powered here by Claude Sonnet 4.5. The workflow moves through seven stages: /constitution → /specify → /clarify → /plan → /tasks → /analyze → /implement. At each stage, SpecKit writes and updates Markdown files that serve as persistent memory, so the session can be paused and resumed without losing context.
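The pause-and-resume idea can be illustrated with a small Python sketch. This is not SpecKit's implementation: the `progress.md` checklist file and the `mark_done` and `next_stage` helpers are hypothetical, and only the stage order is taken from the workflow described above.

```python
from pathlib import Path
from typing import Optional

# Stage order as listed in the article.
STAGES = ["constitution", "specify", "clarify", "plan", "tasks", "analyze", "implement"]

# Hypothetical checklist line format; not a SpecKit file convention.
DONE_PREFIX = "- [x] /"


def mark_done(progress_file: Path, stage: str) -> None:
    """Append a completed stage to a Markdown checklist file."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    with progress_file.open("a", encoding="utf-8") as fh:
        fh.write(f"{DONE_PREFIX}{stage}\n")


def next_stage(progress_file: Path) -> Optional[str]:
    """Return the first stage not yet recorded, or None when all are done."""
    done = set()
    if progress_file.exists():
        for line in progress_file.read_text(encoding="utf-8").splitlines():
            if line.startswith(DONE_PREFIX):
                done.add(line[len(DONE_PREFIX):].strip())
    for stage in STAGES:
        if stage not in done:
            return stage
    return None
```

Because the state lives in a file rather than in the chat context, a new session can call `next_stage` and continue where the previous one stopped.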

Key tools used:

  • Microsoft Agent Framework — agent orchestration
  • Microsoft Foundry — access to LLMs (GPT-5.2, Text Embedding 3)
  • Azure Cosmos DB — code example database for RAG
  • Claude Sonnet 4.5 — model powering SpecKit via GitHub Copilot

Results

These were the key findings:

  • Natural language outperforms machine-readable formats. The JSON prompt (Case 1) took 40% longer and generated significantly more issues than the natural language control.
  • Authority is necessary. Removing the authoritative framing from the prompt (Case 3) caused SpecKit to treat specifications as optional, resulting in the multi-agent system not being built at all until manually corrected. Total time: 4h 53m vs. 2h 24m for the control.
  • Omit what the model already knows. Removing the scoring rubrics (Case 8) saved 34 minutes with no loss in output quality, because the model inferred the rubric from context. However, omitting the Cosmos DB schema or agent architecture descriptions caused major implementation errors.
  • The model must be able to read its own outputs. Changing the output to PDF (Case 9), which Claude Sonnet 4.5 cannot read in Copilot, stretched the implementation stage to 7h 38m and required 33 interventions, because the model couldn't verify whether its code was working.
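To put the authority finding in proportion, the two total times reported for Case 3 and the control can be compared directly. The snippet below is a small worked calculation; the `to_minutes` helper is introduced here for illustration, and only the two durations come from the results above.

```python
import re


def to_minutes(duration: str) -> int:
    """Convert a string such as '2h 24m' into whole minutes."""
    match = re.fullmatch(r"(?:(\d+)h)?\s*(?:(\d+)m)?", duration.strip())
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {duration!r}")
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return hours * 60 + minutes


control = to_minutes("2h 24m")  # natural language control, total time
case_3 = to_minutes("4h 53m")   # non-authoritative prompt, total time

extra = case_3 - control
ratio = case_3 / control
print(f"Case 3 took {extra} extra minutes ({ratio:.2f}x the control)")
# prints: Case 3 took 149 extra minutes (2.03x the control)
```

In other words, dropping the authoritative framing roughly doubled the total build time in this single run.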

Best Practices Found

The biggest insight is that prompt design has as much impact on SpecKit's performance as prompt content. A complete specification written non-authoritatively or in JSON will produce worse results than a slightly shorter specification written in clear, authoritative natural language.

There is also a trade-off between token count and manual intervention. Shorter prompts are faster, but only when the omitted information is something the model can reliably infer. Leaving out details about unfamiliar libraries or project-specific architectures leads to longer debugging later.

Future Development

These are directions for future work in this area: 

  • Running each experiment multiple times to account for model non-determinism
  • Repeating experiments with newer or different LLMs to test generalisability
  • Testing with different target systems beyond code verification
  • Supplying SpecKit with tools (e.g. Playwright MCP) to read outputs it currently cannot access, like live webpages or PDFs

Conclusion

Spec-driven development with SpecKit is a useful approach for building complex software with LLMs, but the quality of your prompt determines the quality of your outcome. For the most effective results, write in natural language, keep the whole prompt authoritative, include detail on novel or library-specific components, design your system's outputs to be readable by the model building them, and leave out only what the model can confidently infer.

If you want to explore the tools used in this project, here are some useful starting points:

Read the whole story
alvinashcraft
49 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

Pierce Boggan: AI Workflows - Episode 398


https://clearmeasure.com/developers/forums/

Pierce Boggan is the PM Lead for Visual Studio Code and GitHub Copilot at Microsoft, where he guides the product direction of the world's most popular code editor as it evolves into an AI-native development platform. He joined Microsoft through the Xamarin acquisition more than a decade ago and has worked across mobile tools, Visual Studio, and the Teams Toolkit before taking the helm of the VS Code team in late 2024. Pierce co-hosts the VS Code Insiders Podcast, presented in the GitHub Universe 2025 keynote, and recently helped his team make the historic shift from monthly to weekly releases -- powered by AI. He is also the creator of Primer, an open-source CLI that prepares codebases for AI-assisted development.

--------------------------------------------

Mentioned in This Episode

Website 
Twitter / X 
GitHub
Podcast 
Primer

Recent projects / posts:

Agent HQ in VS Code announced (Dec 2025) -- unified view for managing local, background, and cloud AI agents
GitHub Universe 2025 keynote presenter (Nov 2025)
VS Code Insiders Podcast: "VS Code -- 2025 Wrapped" (Dec 2025)
Primer CLI -- prepares repos for AI-assisted development (423 stars)
nano-banana-mcp -- MCP server enabling image creation in GitHub Copilot
VS Code team moved from monthly to weekly releases (Mar 2026 interview) 

----------------------------------------
Want to Learn More?
Visit AzureDevOps.Show for show notes and additional episodes.





Download audio: https://traffic.libsyn.com/clean/secure/azuredevops/Episode_398.mp3?dest-id=768873

511: Terminals, Remote Sessions, No More Watches!


In episode 511 James and Frank dig into audio gear (why impedance matters and the pros/cons of hardware DSP and headphone high‑power modes), explore GitHub Copilot CLI's new Remote Sessions for mobile access to your local dev environment, and reveal a clever hack: MIDI‑over‑USB turns modern iPhones into reliable wired controllers for embedded hardware and robotics. Tune in for practical tips, safety notes on power/latency, and real‑world developer workflows.

Follow Us

⭐⭐ Review Us ⭐⭐

Machine transcription available on http://mergeconflict.fm

Support Merge Conflict





Download audio: https://aphid.fireside.fm/d/1437767933/02d84890-e58d-43eb-ab4c-26bcc8524289/d0679d95-c8cc-4041-900f-6722c44e19db.mp3

Remove sign-up from Entra External ID user flows


This article shows how to remove the sign-up flow from Entra External ID user flows. This is necessary because SMS and phone validation can be abused by bots to run up costs on the tenant. The bots create accounts and start a phone or SMS validation, which is charged to the tenant. The sole intent of this attack is to cause costs.

SMS or Phone verification should not be used in an unauthenticated flow.

Any IAM or user management system that does not support at least passkeys or authenticator apps should not be used. 2FA and MFA should be possible without incurring a usage cost.

Graph authentication using OAuth

An Azure App registration is required with the Graph application permission EventListener.ReadWrite.All granted. A client secret must be added to the registration, and its application (client) ID and the tenant ID are required. The following script uses this Azure App registration.

PowerShell script

The following script is used to disable the sign-up process on an Entra External ID tenant. Thanks to Marc Rufer who supported me in creating the PowerShell script.

#Requires -Version 7.0
#Requires -Modules @{ ModuleName="Microsoft.Graph.Authentication"; ModuleVersion="2.35.1" }
#Requires -Modules @{ ModuleName="Microsoft.Graph.Identity.SignIns"; ModuleVersion="2.35.1" }

# Create an App registration for the client credentials flow
# with the Graph application permission EventListener.ReadWrite.All
param
(
    [Parameter(Mandatory = $true, Position = 0, HelpMessage = "Id of the Entra External ID tenant")]
    [string] $tenantId,

    [Parameter(Mandatory = $true, Position = 1, HelpMessage = "Application (Client) Id of the application linked to the user flow")]
    [string] $applicationId,

    [Parameter(Mandatory = $true, Position = 2, HelpMessage = "Client secret for the app registration with the graph permissions")]
    [string] $clientSecret,

    [Parameter(Mandatory = $true, Position = 3, HelpMessage = "Client Id for the app registration with the graph permissions")]
    [string] $clientId
)

# Authenticate with the client credentials flow (client ID + client secret)
$cred = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $clientId, (ConvertTo-SecureString -String $clientSecret -AsPlainText -Force)
Connect-MgGraph -TenantId $tenantId -Credential $cred

# Find the sign-up user flow that includes the given application
$response = Get-MgIdentityAuthenticationEventFlow -Filter "microsoft.graph.externalUsersSelfServiceSignUpEventsFlow/conditions/applications/includeApplications/any(appId:appId/appId eq '$applicationId')"

$userFlowId = $response.Id

# Request body that switches off self-service sign-up
$body = @{
    "@odata.type" = "#microsoft.graph.externalUsersSelfServiceSignUpEventsFlow"
    "onInteractiveAuthFlowStart" = @{
        "@odata.type"   = "#microsoft.graph.onInteractiveAuthFlowStartExternalUsersSelfServiceSignUp"
        "isSignUpAllowed" = $false
    }
}

# Apply the change to the user flow
Update-MgIdentityAuthenticationEventFlow -AuthenticationEventsFlowId $userFlowId -BodyParameter $body
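
For readers who call Microsoft Graph without the PowerShell SDK, the final update in the script corresponds roughly to a PATCH request against the user flow. The Python sketch below only builds that request and sends nothing. The v1.0 endpoint path and the `build_disable_signup_request` helper are assumptions made for illustration and do not come from the article; the body mirrors the `$body` hashtable in the script.

```python
import json
from typing import Tuple

# Assumed Graph base URL for the authenticationEventsFlows resource.
GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_disable_signup_request(user_flow_id: str) -> Tuple[str, str, str]:
    """Return (method, url, json_body) for disabling sign-up on a user flow."""
    url = f"{GRAPH_BASE}/identity/authenticationEventsFlows/{user_flow_id}"
    body = {
        "@odata.type": "#microsoft.graph.externalUsersSelfServiceSignUpEventsFlow",
        "onInteractiveAuthFlowStart": {
            "@odata.type": (
                "#microsoft.graph."
                "onInteractiveAuthFlowStartExternalUsersSelfServiceSignUp"
            ),
            "isSignUpAllowed": False,
        },
    }
    return "PATCH", url, json.dumps(body)
```

Sending the request would additionally need an access token for the app registration, which is left out here.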
 

Using the script

The PowerShell script can be used by setting the correct parameters.

$tenantId = "Entra-External-ID-tenant-id"
$appId = "Application-(Client)-ID-from-user-flow"
$clientSecret = "Azure-App-Registration-Client-Secret"
$clientId = "Azure-App-Registration-Application-(Client)-ID"

.\Disable-SignUpInExternalIdUserFlow.ps1 -tenantId $tenantId -applicationId $appId -clientSecret $clientSecret -clientId $clientId

Note

Once the script has been run, delete the Azure App registration on the tenant.

Links

https://learn.microsoft.com/en-us/entra/external-id/customers/how-to-disable-sign-up-user-flow




The Identity Governance Checklist You Wish You Had Six Months Ago


Let me guess how this goes. You build a login page. You wire up registration. You hash passwords, set up MFA, ship it, and move on to the features that actually make your product interesting. Then, three months later, someone from legal walks over (or pings you on Slack, because nobody walks anywhere anymore) and asks: "Can you show me an audit trail of every admin action on user accounts for the last 90 days?"

And you stare at your screen, because you don't have one.

I have been that developer. Most of us have. Identity governance sounds like something that belongs in a compliance PDF, not in your sprint backlog. But here is the thing: when a regulation says "immutable audit trail," a developer writes that code. When GDPR says "consent must be versioned, purpose-specific, and withdrawable," a developer designs the schema accordingly. When data residency rules say "EU user data stays in the EU," a developer builds that routing logic.

Governance is a development problem. And if you are building a user management system, the checklist below is what separates you from a very uncomfortable audit.


The simplest way to secure a Minimal API (With Swagger)


Secure your .NET Minimal API quickly using API key authentication, with full Swagger support for testing and protecting endpoints.

The page The simplest way to secure a Minimal API (With Swagger) appeared on Round The Code.
