
In the era of agentic AI, the bottleneck for scaling projects has shifted from code volume to code integrity. To move at the speed AI allows without introducing systemic friction, we must stop treating AI as a magic wand and start treating it as a high-performance engine that requires a perfectly calibrated track.
Recently, I’ve been reflecting on what it truly means to scale an open-source project for a new wave of contributors. We are moving into an era where agentic AI can generate code so rapidly that sheer volume is no longer the bottleneck; instead, the real challenge is ensuring this influx doesn’t compromise our codebases, or our shared values as teammates. Many teams are splitting into a dumbbell shape: traditional manual coding on one end and high-velocity agentic power-users on the other.
For large apps, our infrastructure must serve as the handle that connects these two ends of the spectrum, creating a shared space where traditional rigor and agentic speed don’t just coexist, but actively harden each other. To scale effectively, we must lean into the basics by anchoring our workflows in the fundamentals of reproducible software development. To move at the speed these tools allow without drowning in technical debt or unreadable codebases, our environments must provide a definitive, high-signal feedback loop.
In this agentic context, source code health is measured strictly by passing tests, deterministic coverage metrics, and clean analysis. If these fundamentals are missing, it’s like asking a high-performance engine to drive at breakneck speed through the dark. It is perfectly reasonable — and often responsible — to feel the need to push back against these new tools if the environment isn’t ready for them.
The goal here isn’t to force a tool onto a project for the sake of speed; it’s to build the digital sensors that allow us to adopt that new velocity safely. More importantly, this transition is about bringing our teammates along for the ride. Infrastructure shouldn’t be a barrier that separates the “AI power users” from the rest of the team; it should be the common ground where we all feel confident. Being a good teammate in the agentic era means ensuring that even as our repositories move faster, our culture remains supportive. We build these systems so that everyone can move with confidence, ensuring that “agentic speed” is a win for the whole team, not just a new risk to an already functioning product.
However, these digital sensors are only as reliable as the environment in which they live. This is why we must anchor our trust in our CI systems. In a truly reproducible workflow, the CI is the only version of reality that matters; it is the final arbiter for an agent that lacks human intuition. If a change doesn’t pass the automated gates, it effectively doesn’t exist. By treating the CI as a high-signal feedback loop, we remove the human bottleneck of manual verification and replace it with a reproducible gate. This absolute source of truth allows us to scale more responsibly — enabling agents to fail fast, iterate, and ultimately succeed without compromising the integrity of the project or the peace of mind of the team.
A one-off prompting success is just a fluke. True scale comes from creating a flywheel: every time you or a teammate returns to the repository, the agent should be more capable than it was the last time because knowledge debt has been paid forward into a shared repository of prompts. You are building a shared institutional memory that isn’t trapped in a single developer’s head.
To make this repeatable for any developer on your team, you need a library of prompt instructions that acts as executable documentation. While many teams rush to create global “config” and “rules” files, I find these often lead to inconsistent results — an agent might “forget” a rule or misapply it to the wrong context. As scientifically minded engineers, we should prefer to remove variables. By controlling the exact prompting for a specific task, you ensure the agent has exactly what it needs for that particular task within your repository. Doing so in an intentional manner in a project is what helps both individuals and teams build genuine confidence in these new tools.
In late 2025, Anthropic introduced Agent Skills — an open standard designed to provide AI agents with specialized, modular expertise. Rather than just generic tools, Skills package instructions and resources together to ensure complex tasks are performed consistently. This is moving fast; as of early 2026, many tools are already adding support for this standard to enable cross-platform portability.
See your tool’s documentation for where to add these files.
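As one illustration, tools that implement the Agent Skills standard generally expect each skill to live in its own folder containing a `SKILL.md` file with YAML frontmatter, like the examples below. The layout here follows Claude Code’s project-level convention at the time of writing; treat the exact paths as an assumption and confirm them against your tool’s documentation:

```
your-flutter-app/
  .claude/
    skills/
      pr-prep/
        SKILL.md
      single-file-test-coverage/
        SKILL.md
      migrate-to-modern-dart-features/
        SKILL.md
```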
To ensure your Skills library remains a “Runbook for Robots” rather than just a list of suggestions, every prompt in this collection follows a strict set of best practices:
---
name: pr-prep
description: Prepare current work for PR
---
## Prepare current work for PR: Verify Tests Passing and Cleanup
**Objective:** Verify the current test suite status with `flutter test`, clean up any temporary modifications, and harden test coverage for active files.
**Instructions:**
1. **Baseline:**
* Run `dart fix --apply` to apply automated fixes.
* Run `flutter analyze` to ensure no analysis issues.
* Run `flutter test` to establish the current passing state.
* Run `flutter test integration_test/app_test.dart` to verify integration integrity.
2. **Fix Failures:** If there are any test failures or analysis errors, investigate and resolve them. Prioritize fixing code over deleting tests.
3. **Cleanup:** Review any currently modified files (run `git status` or check the diff). Remove any:
* `print` / `debugPrint` statements.
* Unused imports.
* Commented-out code blocks.
* Temporary "hack" fixes that should be replaced with proper solutions.
4. **Verify & Expand:**
* For the files you touched or cleaned up, check if there are obvious edge cases missing from their unit tests. Add tests to cover these cases.
* Run `flutter analyze` again to ensure clean code.
* Run `flutter test` again to ensure cleanup didn't break anything.
* Repeat this step as necessary.
5. **Report & Review:**
* Summarize the cleanup status (e.g., "Tests passing, removed 3 debug prints").
* **Action:** Ask the user to review the changes closely to ensure no intended code was accidentally removed.
* **Do not commit or push.**
* Provide a suggested Git commit message (e.g., "Prepare for PR: Fix tests and remove debug code").
---
name: single-file-test-coverage
description: Write a new test or modify an existing test
---
## Single File Test Coverage Improvement
**Objective:** Write a new single test file, or modify an existing file, to improve coverage for a specific target class.
**Instructions:**
1. **Identify Target:** Choose a single source file (Dart) in `lib/` that has low or no test coverage and is suitable for unit testing (e.g., utility classes, logic helpers).
2. **Establish Baseline:**
* Run `flutter analyze` to ensure validity.
* Run `flutter test` to ensure the project is stable.
* Run `flutter test --coverage` and check `coverage/lcov.info`.
3. **Implement/Update Test:** Create a new test file in `test/` or update the existing one. Focus on:
* Edge cases (null inputs, empty strings, boundary values).
* Branch coverage (ensure if/else paths are exercised).
* Mocking dependencies where necessary (using `mockito` or `mocktail`).
4. **Verify & Iterate:**
* Run the tests to ensure they pass.
* Run `flutter analyze` to ensure no regressions.
* If coverage is still low, **iterate a few times**: analyze missed lines/branches and add targeted test cases.
5. **Report & Review:**
* Summarize what was fixed/covered and report coverage progress (e.g., `X% -> Y%` for `<filename>`).
* **Action:** Ask the user to review the new tests closely.
* **Do not commit or push.**
* Provide a suggested Git commit message (e.g., "Improve test coverage for [Class Name]").
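To make step 3 of this skill concrete, here is the kind of small, targeted test it tends to produce. This is a hypothetical sketch: the `PriceFormatter` class, the `RatesApi` interface, and the scenario are invented for illustration, and it assumes `flutter_test` and `mocktail` are available as dev dependencies.

```dart
import 'package:flutter_test/flutter_test.dart';
import 'package:mocktail/mocktail.dart';

// Invented production code under test.
abstract class RatesApi {
  double rateFor(String currency);
}

class PriceFormatter {
  PriceFormatter(this.api);
  final RatesApi api;

  String format(double amount, String currency) {
    if (amount < 0) throw ArgumentError('amount must be non-negative');
    final rate = api.rateFor(currency);
    return (amount * rate).toStringAsFixed(2);
  }
}

// Mock the dependency so the unit test stays isolated.
class MockRatesApi extends Mock implements RatesApi {}

void main() {
  late MockRatesApi api;
  late PriceFormatter formatter;

  setUp(() {
    api = MockRatesApi();
    formatter = PriceFormatter(api);
  });

  test('formats using the mocked rate', () {
    when(() => api.rateFor('EUR')).thenReturn(1.1);
    expect(formatter.format(10, 'EUR'), '11.00');
  });

  test('rejects negative amounts (edge case / branch coverage)', () {
    expect(() => formatter.format(-1, 'EUR'), throwsArgumentError);
  });
}
```

Each test exercises a distinct branch of `format`, which is exactly the kind of missed-line coverage the iterate step asks the agent to chase down.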
---
name: migrate-to-modern-dart-features
description: Migrate to modern Dart features (Dart 3+)
---
## Migrate to Modern Dart Features
**Objective:** Optimize consistency and conciseness by migrating to modern Dart features (Dart 3+).
**Candidates for Migration:**
* `if-else` chains -> `switch` expressions.
* Data classes with manual `==`/`hashCode` -> `Records` or `equatable` (or class modifiers).
* Null checks -> pattern matching.
**Instructions:**
1. **Baseline:** Run `flutter test` and `flutter analyze`.
2. **Select Target:** Identify a *single* migration opportunity.
3. **Constraint:** Keep the change extremely small (**max 50 lines**).
4. **Migrate:** Refactor to use the new feature.
5. **Verify:**
* Run `flutter analyze`.
* Run `flutter test` to ensure no regressions.
6. **Report & Review:**
* Summarize the migration.
* **Action:** Ask the user to review the changes closely.
* **Test Location:** Explicitly state where in the app the user should go to manually test the change (e.g., "Click the bottom button after the app opens").
* **Do not commit or push.**
* Provide a suggested Git commit message (e.g., "Refactor: Use switch expression in [Class Name]").
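As a concrete illustration of the first migration candidate in this skill (if-else chains to switch expressions), here is a hypothetical before/after; the `OrderStatus` enum and function names are invented for the example:

```dart
enum OrderStatus { pending, shipped, delivered }

// Before: an if-else chain that maps a status to a display label.
String statusLabelLegacy(OrderStatus status) {
  if (status == OrderStatus.pending) {
    return 'Pending';
  } else if (status == OrderStatus.shipped) {
    return 'Shipped';
  } else {
    return 'Delivered';
  }
}

// After: a Dart 3 switch expression, checked for exhaustiveness by the analyzer.
String statusLabel(OrderStatus status) => switch (status) {
      OrderStatus.pending => 'Pending',
      OrderStatus.shipped => 'Shipped',
      OrderStatus.delivered => 'Delivered',
    };
```

A change of this shape stays well under the 50-line constraint and gives the analyzer an exhaustiveness check for free.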
The long game of the agentic era isn’t about the volume of code you produce today; it’s about responsibly architecting the infrastructure and cultural expectations for the repositories of tomorrow. By building the handle of the dumbbell, we ensure that as our tools move faster, they remain anchored to our shared values and technical rigor.
In the very near future, agents will move from reactive assistants to proactive contributors — checking in on your repository and resolving issues while you sleep. By hardening your high-signal feedback loops and documenting your workflows in a version-controlled suite of Skills, you are doing more than just saving time; you are training the agents of the future to respect your specific architecture, your vision, and your standards. Most importantly, you are ensuring that as the engine of development accelerates, your entire team has the digital sensors and shared knowledge to move forward with collective confidence and a stronger, more resilient engineering culture.
Jaime’s build context: Prompt engineering as infrastructure was originally published in Flutter on Medium.
This episode of The Modern .NET Show is supported, in part, by RJJ Software's Strategic Technology Consultation Services. If you're an SME (Small to Medium Enterprise) leader wondering why your technology investments aren't delivering, or you're facing critical decisions about AI, modernization, or team productivity, let's talk.
"Another thing which I also observed is that there is some benefit to be able to run your load test in your native... using your native platform, libraries, protocol access; those type of things. Because in our case, for example, we use Orleans and it's a proprietary protocol which doesn't exist in in Java in Scala language. The same about, almost the same, was about Signal R: Microsoft released SignalR for Java, but the quality of this library was different."— Anton Moldovan
Hey everyone, and welcome back to The Modern .NET Show; the premier .NET podcast, focusing entirely on the knowledge, tools, and frameworks that all .NET developers should have in their toolbox. I'm your host Jamie Taylor, bringing you conversations with the brightest minds in the .NET ecosystem.
Today, we're joined by Anton Moldovan to talk about load testing, advice for testing strategies, and how NBomber can help you to load test your applications. Are you sure that your application can handle 4 million users at once? Better load test it before you start boasting.
"We call this type of test, like, "user journey." Like, end-to-end simulating user journey across entire applications. So end-to-end, end-to-end flow, end-to-end tests. But this type of test they they have some downsides."— Anton Moldovan
Along the way, we talked about the different types of testing involved in getting your application ready for production, the many different ways that NBomber (or other load testing suites) can help you prepare for that, and Anton helps us understand a little more about functional programming.
Before we jump in, a quick reminder: if The Modern .NET Show has become part of your learning journey, please consider supporting us through Patreon or Buy Me A Coffee. Every contribution helps us continue bringing you these in-depth conversations with industry experts. You'll find all the links in the show notes.
Anyway, without further ado, let's sit back, open up a terminal, type in `dotnet new podcast` and we'll dive into the core of Modern .NET.
The full show notes, including links to some of the things we discussed and a full transcription of this episode, can be found at: https://dotnetcore.show/season-8/from-chaos-to-control-anton-moldovan-on-load-testing-with-nbomber/
Remember to rate and review the show on Apple Podcasts, Podchaser, or wherever you find your podcasts; this will help the show's audience grow. Or you can just share the show with a friend.
And don't forget to reach out via our Contact page. We're very interested in your opinion of the show, so please get in touch.
You can support the show by making a monthly donation on the show's Patreon page at: https://www.patreon.com/TheDotNetCorePodcast.
Music created by Mono Memory Music, licensed to RJJ Software for use in The Modern .NET Show.
Editing and post-production services for this episode were provided by MB Podcast Services.
This week, we discuss the end of Cloud 1.0, AI agents fixing old apps, and Chainguard vs. Docker images. Plus, the mystery of Dutch broth is finally solved.
Watch the YouTube Live Recording of Episode 556
I've created pr-shadow with vibe coding, a tool that maintains a shadow branch for GitHub pull requests (PRs) that never requires force-pushing. This addresses pain points I described in Reflections on LLVM's switch to GitHub pull requests#Patch evolution.
GitHub structures pull requests around branches, enforcing a branch-centric workflow. When you force-push a branch after a rebase, the UI displays "force-pushed the BB branch from X to Y". Clicking "compare" shows git diff X..Y, which includes unrelated upstream commits—not the actual patch difference. For a project like LLVM with 100+ commits daily, this makes the comparison essentially useless.
Inline comments suffer too: they may become "outdated" or misplaced after force pushes.
Additionally, if your commit message references an issue or another PR, each force push creates a new link on the referenced page, cluttering it with duplicate mentions. (You can work around this by adding backticks around the link text, but it is not ideal.)
Due to these difficulties, some recommendations suggest less flexible workflows that only append new commits and discourage rebases. However, this means working with an outdated base, and switching between the main branch and PR branches causes numerous rebuilds, which is especially painful for large repositories like llvm-project.
In a large repository, avoiding rebases isn't realistic—other commits frequently modify nearby lines, and rebasing is often the only way to discover that your patch needs adjustments due to interactions with other landed changes.
pr-shadow maintains a separate PR branch (e.g., pr/feature) that only receives commits—never force-pushed. You work freely on your local branch (rebase, amend, squash), then sync to the PR branch using git commit-tree to create a commit with the same tree but parented to the previous PR HEAD.
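A minimal sketch of that sync step in plain git (this is not pr-shadow's actual CLI, only an illustration of the mechanism described above, assuming a local branch `feature` and a shadow branch `pr/feature`):

```
# Take the tree of the local branch, but parent the new commit on the
# current tip of the PR branch, so the PR branch only ever moves forward.
new=$(git commit-tree 'feature^{tree}' -p refs/heads/pr/feature -m "sync feature -> pr/feature")
git update-ref refs/heads/pr/feature "$new"
# A plain (non-force) push is now sufficient.
git push origin pr/feature
```

After a rebase, the same idea applies, except that (as described below) the new PR commit also gets the updated merge-base as a second parent.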
[Diagram: commit history of the local branch (feature) shown alongside the shadow PR branch (pr/feature)]
Reviewers see clean diffs between C1 and C2, even though the underlying commits were rewritten.
When a rebase is detected (git merge-base with main/master changed), the new PR commit is created as a merge commit with the new merge-base as the second parent. GitHub displays these as "condensed" merges, preserving the diff view for reviewers.
[Code example: initializing pr-shadow and creating a PR; commands elided in this extract]
The tool supports both fork-based workflows (pushing to your fork) and same-repo workflows (for branches like user/<name>/feature). It also works with GitHub Enterprise, auto-detecting the host from the repository URL.
The name "prs" is a tribute to spr, which implements a similar shadow branch concept. However, spr pushes user branches to the main repository rather than a personal fork. While necessary for stacked pull requests, this approach is discouraged for single PRs as it clutters the upstream repository. pr-shadow avoids this by pushing to your fork by default.
I owe an apology to folks who receive users/MaskRay/feature branches (if they use the default fetch = +refs/heads/*:refs/remotes/origin/* to receive user branches). I had been abusing spr for a long time after LLVM's GitHub transition to avoid unnecessary rebuilds when switching between the main branch and PR branches.
Additionally, spr embeds a PR URL in commit messages (e.g., Pull Request: https://github.com/llvm/llvm-project/pull/150816), which can cause downstream forks to add unwanted backlinks to the original PR.
If I need stacked pull requests, I will probably use pr-shadow with the base patch and just rebase the stacked ones; it's unclear how spr handles stacked PRs.
In this podcast, Shane Hastie, Lead Editor for Culture & Methods, spoke to Nick Gillian about building cross-functional teams for physical AI innovation, growing engineering culture through positive tensions, and navigating the journey from technical execution to organizational influence.
By Nick Gillian