
Microsoft engineer says native apps are back, and it could finally revive Windows 11’s fight against web apps


A Distinguished Engineer at Microsoft has suggested that native apps are back, and the message appears to align with the company’s recent Windows 11 revival efforts.

Web apps now dominate the Microsoft Store, which is the company’s preferred source for apps on PCs, especially for users who want more safety and security on Windows 11. The Store has gotten a lot better over the years in terms of performance, and it hosts apps built with a variety of frameworks.

When Microsoft gave developers the freedom to build apps however they like, it was widely seen as a smart move to encourage them to bring their apps to Windows 11 through the Microsoft Store.

This led many popular apps, including Netflix and WhatsApp, to ditch their native Windows apps built with frameworks like WinUI and replace them with WebView2-based apps and Progressive Web Apps (PWAs). In our tests, Windows Latest observed that WhatsApp uses up to 600MB of RAM on a PC with 8GB of RAM while doing nothing.

The latest WhatsApp using 600MB RAM on a PC with 8GB RAM, while doing nothing

It’s not just a problem with WhatsApp, which is built on top of WebView2. Electron-based Discord uses up to 4GB of RAM, and it even has a feature that quietly restarts the app to reduce RAM usage.

On the other hand, while PWAs are lightweight, they often lack important features, such as offline mode, that their native counterparts offer. We’ve seen Windows users vent their frustrations on platforms like Reddit over what many perceive as an alarming trend of apps taking the PWA route, which degrades the overall OS experience.

If you are one of those users, it seems Microsoft has taken note of those complaints and has started taking concrete steps to improve the app situation in Windows 11.

Microsoft’s plan to improve the apps on Windows 11

A few months ago, Rudy Huyn, a Partner Architect at Microsoft working on the Store and File Explorer, officially confirmed that Microsoft plans to build 100% native apps for Windows 11. Huyn didn’t divulge many details about when the plan will come to fruition.

Now, David Fowler, a Distinguished Engineer at Microsoft, put up a post on X saying, “Native apps are back.” This suggests Microsoft is still aiming to make apps “100%” native for Windows 11.

David Fowler has been with Microsoft for more than a decade, and he is closely associated with .NET, ASP.NET Core, and Microsoft’s developer platform work.

Fowler’s statement that native apps are back points squarely at Windows 11, where most native apps have been replaced with web wrappers. His post appears to hint at an internal engineering signal.

It supports our previous reporting that Microsoft has already started moving key Windows 11 experiences away from web-based components. For those unaware, the Start menu is shifting from React-based shell pieces to WinUI to reduce latency and improve performance.

Neither Fowler nor Huyn disclosed key details on how Microsoft will pull it off, but in all likelihood, the recently released .NET 10 will play a huge role in achieving that goal.

.NET 10 ships with what the company calls Native AOT (ahead-of-time compilation), which is said to significantly reduce app startup time. AOT-compiled apps also use less memory, which should be a welcome change even for developers at Microsoft.
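Microsoft hasn’t shared implementation details, but as a rough, hypothetical sketch, opting a .NET project into Native AOT is mostly a matter of project configuration (the project name and settings below are illustrative, not anything Microsoft has announced):

<!-- MyNativeApp.csproj (hypothetical): PublishAot opts the build into Native AOT -->
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net10.0</TargetFramework>
    <!-- Compile to a self-contained native binary: faster startup, lower memory -->
    <PublishAot>true</PublishAot>
  </PropertyGroup>
</Project>

Publishing with dotnet publish -c Release -r win-x64 then produces a native executable instead of IL that has to be JIT-compiled at startup.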

WebView/PWA problem affects Microsoft’s own apps

Microsoft Edge package in Copilot app

The company’s own web-based Copilot app is a resource hog. In our testing, Copilot uses up to 500MB of RAM in the background, and it climbs to 1GB once you start using it.

.NET 10 should help prevent this kind of bloat if developers embrace it instead of sticking with web-based technologies or cross-platform tools like React Native and Flutter.

New Windows 11 Start menu with Pinned apps, Recommendations and Category view for all apps

While native apps sound great on paper, one of the biggest challenges facing Microsoft is convincing developers to build more native apps for Windows.

It’ll be interesting to see whether the company incentivizes native app development to grow the number of native apps on the Microsoft Store. Before that, though, the Redmond-based tech giant has to show the world the benefits of its renewed app efforts by making many of its own apps “100%” native on Windows 11.



Mark Cuban: AI Hype vs. Reality, OpenAI "Shitting Away" $1 Trillion, Lebron vs. Jordan


Mark Cuban is an entrepreneur and investor. Cuban joins Big Technology Podcast to discuss how artificial intelligence is reshaping business, software, jobs, and education. Tune in to hear why he believes AI is an exponential shift, how companies should rebuild themselves around it, and why curiosity will matter more than ever in the AI era. We also cover the economics of foundation models, which software companies are most vulnerable, how young people should build careers with AI, and his thoughts on Sam Altman, Dario Amodei, and the future of the NBA. Hit play for a lively conversation on who wins, who loses, and how to stay ahead as AI transforms the economy.


Filmed live at the Dallas Regional Chamber's Convergence AI event.


---

Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice.

Want a discount for Big Technology on Substack + Discord? Here’s 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b







Download audio: https://pdst.fm/e/tracking.swap.fm/track/t7yC0rGPUqahTF4et8YD/pscrb.fm/rss/p/traffic.megaphone.fm/AMPP6999535097.mp3?updated=1777465101

🚀 Which Flutter State Management Should You Use? (Complete Developer Guide)


Flutter makes building beautiful apps easy — but as your app grows, managing data across screens becomes challenging.

You may have faced issues like:

• UI not updating properly
• Data not syncing between screens
• Too many unnecessary rebuilds

👉 This is exactly where state management becomes essential.

In this guide, we’ll break down everything — from basics to advanced approaches — so you can confidently choose the right solution.

🧠 **What is State Management?**
In simple terms:

👉 State = Any data that changes in your app

Examples:

• Counter value
• API response
• User login status
• Theme (dark/light)

👉 State Management = How you manage and update that data efficiently across your app

Without proper state management:

• Your UI becomes unpredictable
• Code becomes hard to maintain
• Scaling becomes difficult

🔰 1. setState (The Simplest Way)
🧾 What it is
Built-in Flutter method to update UI when state changes.

📦 Example:

import 'package:flutter/material.dart';

class CounterPage extends StatefulWidget {
  const CounterPage({super.key});

  @override
  State<CounterPage> createState() => _CounterPageState();
}

class _CounterPageState extends State<CounterPage> {
  int _count = 0;

  void _increment() {
    setState(() {         // triggers a rebuild
      _count++;
    });
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      body: Center(child: Text('Count: $_count')),
      floatingActionButton: FloatingActionButton(
        onPressed: _increment,
        child: const Icon(Icons.add),
      ),
    );
  }
}

✅ When to use:
Purely local UI interactions with no cross-widget communication — toggle buttons, form field validation, simple animations, local loading spinners.

🌱 2. Provider
🧾 What it is
A wrapper around InheritedWidget for structured state management.

📦 Example:

import 'package:flutter/material.dart';
import 'package:provider/provider.dart';

// 1. Define a ChangeNotifier
class CartProvider extends ChangeNotifier {
  final List<String> _items = [];
  List<String> get items => _items;

  void addItem(String item) {
    _items.add(item);
    notifyListeners(); // triggers rebuild in listeners
  }
}

// 2. Wrap your tree with ChangeNotifierProvider
ChangeNotifierProvider(
  create: (_) => CartProvider(),
  child: const MyApp(),
)

// 3. Read or watch in any descendant widget
final cart = context.watch<CartProvider>();
Text('${cart.items.length} items in cart')

When to use:
Small-to-medium apps with straightforward state that doesn’t need complex async logic. Great for your first real Flutter project beyond tutorials.

3. Riverpod (Modern Approach)
🧾 What it is
A more powerful and safer version of Provider.

📦 Example:

// pubspec.yaml
// riverpod_annotation: ^4.0.2
// riverpod_generator: ^4.0.3

import 'package:flutter/material.dart';
import 'package:flutter_riverpod/flutter_riverpod.dart';
import 'package:riverpod_annotation/riverpod_annotation.dart';

part 'auth_notifier.g.dart';

// Class-based notifier with codegen
@riverpod
class AuthNotifier extends _$AuthNotifier {
  @override
  AuthState build() => const AuthState.initial();

  Future<void> signIn(String email, String password) async {
    state = const AuthState.loading();
    try {
      final user = await AuthService().signIn(email, password);
      state = AuthState.authenticated(user);
    } catch (e) {
      state = AuthState.error(e.toString());
    }
  }
}

// In a ConsumerWidget — no BuildContext magic needed
class AuthScreen extends ConsumerWidget {
  const AuthScreen({super.key});

  @override
  Widget build(BuildContext context, WidgetRef ref) {
    final authState = ref.watch(authNotifierProvider);

    return authState.when(
      initial: () => const LoginScreen(),
      loading: () => const CircularProgressIndicator(),
      authenticated: (user) => HomeScreen(user: user),
      error: (msg) => ErrorView(message: msg),
    );
  }
}

When to use:
Medium-to-large apps where you want clean architecture, excellent async support, and testability without Bloc’s ceremony. Excellent choice for solo developers and small teams building production apps.

🧱 **4. Bloc / Cubit (Enterprise-Level)**
🧾 What it is
A structured pattern using streams.

• Bloc → event-driven
• Cubit → a simpler version (methods instead of events)

📦 Example:

import 'package:flutter/material.dart';
import 'package:flutter_bloc/flutter_bloc.dart';

// State class
class AuthState {
  final bool isAuthenticated;
  final String? userId;
  const AuthState({required this.isAuthenticated, this.userId});
}

// Cubit — logic lives here, NOT in the widget
class AuthCubit extends Cubit<AuthState> {
  AuthCubit() : super(const AuthState(isAuthenticated: false));

  Future<void> signIn(String email, String password) async {
    final user = await AuthService.signIn(email, password);
    emit(AuthState(isAuthenticated: true, userId: user.id));
  }

  void signOut() => emit(const AuthState(isAuthenticated: false));
}

// Widget — just listens, zero logic
BlocBuilder<AuthCubit, AuthState>(
  builder: (context, state) {
    return state.isAuthenticated
        ? const HomeScreen()
        : const LoginScreen();
  },
)

When to use:
Large-scale apps with complex business logic, multiple async operations, strict testability requirements, or teams that need predictable, traceable state transitions. Common in enterprise and fintech Flutter apps.

5. GetX (Fast & Lightweight)
🧾 What it is
All-in-one solution (state + routing + dependency injection).


📦 Example:

import 'package:flutter/material.dart';
import 'package:get/get.dart';

// Controller — reactive variables with .obs
class ProfileController extends GetxController {
  final RxString name = ''.obs;
  final RxBool isLoading = false.obs;

  Future<void> loadProfile() async {
    isLoading.value = true;
    final data = await ApiService().getProfile();
    name.value = data.name;
    isLoading.value = false;
  }
}

// Widget — Obx auto-rebuilds when .obs changes
class ProfilePage extends GetView<ProfileController> {
  const ProfilePage({super.key});

  @override
  Widget build(BuildContext context) {
    return Obx(() {
      if (controller.isLoading.value) {
        return const CircularProgressIndicator();
      }
      return Text(controller.name.value);
    });
  }
}

// Register the controller once (e.g., in main() or a Binding) so GetView can find it
Get.put(ProfileController());

// Navigation — no BuildContext needed (requires GetMaterialApp as the root widget)
Get.to(() => const ProfilePage());
Get.back();

When to use:
Prototypes, hackathons, small personal projects, or when you need to move very fast and want everything under one roof. Use with caution in large team environments.

🧩 Real-World Use Cases
🟢 Small Apps
Use:

  • setState
  • Provider

👉 Example: Forms, basic apps

🟡 Medium Apps
Use:

  • Provider
  • GetX

👉 Example: Dashboard, e-commerce

🔵 Large Apps
Use:

  • Riverpod
  • Bloc

👉 Example: Production apps with APIs

🔴 Team / Enterprise
Use:

  • Bloc
  • Riverpod

👉 Better maintainability and structure

🏁 **Final Recommendation**

👶 Beginner
Start with:

  • setState → then Provider

🧑‍💻 Intermediate
Use:

  • Riverpod (recommended)
  • or GetX

🧠 Advanced / Production
Use:

  • Riverpod (modern + scalable)
  • Bloc (enterprise structure)

💡 Final Thoughts
There is no single “best” state management solution.

👉 The right choice depends on:

  • App complexity
  • Team size
  • Development speed
  • Maintainability needs


Your PS5 can now transform into a Linux PC


A developer has created a method to get Linux running on some versions of Sony's PlayStation 5 console. Andy Nguyen previously showed off a ported version of Ubuntu running PC games on a PS5 last month, and he's now published the installation steps on GitHub this week.

This is a soft mod, so it won't persist between power downs or restarts, but the Linux installation will let you play PC games once it's up and running. So far we've seen GTA V running with enhanced ray tracing at 60fps in Ubuntu on a PS5, as well as Spider-Man running at 1440p resolution and 60fps.

Nguyen is relying on a patched vulnerability to transform a PS5 into a Linux …

Read the full story at The Verge.


Don’t Automate Your Moat: Matching AI Autonomy to Risk and Competitive Stakes


I was talking to a senior engineer at a well-funded company not long ago. I asked him to walk me through a critical algorithm at the heart of their product, something that ran hundreds of times a second and directly affected customer outcomes. He paused and said, “Honestly, I’m not totally sure how it works. AI wrote it.”

A few weeks later, a different engineer at another company was paged about a system outage. He pulled up the failing service and realized he had no idea it was connected to a database. A colleague had accepted an AI-generated PR three months earlier that added that dependency. The tests passed. The change was never written down. The original engineer moved on, and the knowledge was lost.

These aren’t new stories. Engineers have always inherited systems they didn’t fully build. What’s new is the disguise and the speed. AI is an amazing enabler. Organizations must adopt it to remain relevant. Yet the emerging pattern—describe what you want, let an agent iterate until it works, pay for it in tokens instead of engineering hours—is functionally a buy decision wearing a build costume. The code is in your repo. Your engineers merged the PR. It feels like you built it. But if nobody on your team understands why it works the way it does, you’ve purchased a dependency you can’t maintain from a vendor you can’t call.

AI doesn’t create that gap once. It widens it continuously at a pace that outstrips the organizational habits that once kept it manageable. Two problems compound at once. You can’t extend the thing that makes you hard to replace. And when it breaks, the incident lands on a team that doesn’t understand what they’re fixing, turning a recoverable outage into a customer-facing crisis. Engineering leaders have wrestled with build-versus-buy tradeoffs for decades, and the hard-won lesson has always been the same: You don’t outsource your competitive advantage. The token-funded generation loop doesn’t change that calculus. It makes it easier to skip the question entirely.

The question that matters isn’t “Can AI do this?” If it can’t today, it will be able to tomorrow. And the argument that follows does not depend on the quality of the AI-generated code. This article covers two questions most engineering organizations have never asked at the same time. Most teams optimize for velocity and never ask what they’re risking or giving away in the process. The gap between those unasked questions is where the most expensive mistakes are already being made.

Part 1: Two dimensions. Neither is velocity.

Moving faster matters. But velocity alone misses the two dimensions that determine whether AI autonomy helps or hurts your business.

Business risk: What’s the blast radius if this fails? A bug in an internal CLI tool costs you an afternoon. A bug in your authentication logic costs you customers and possibly market cap. A bug in your core pricing algorithm costs you the business. These are not the same.

Competitive differentiation: Does this code define your business? Your moat is your architecture, your performance characteristics, your core algorithms, and the product decisions baked into your infrastructure. But it’s also the institutional knowledge that shaped them: the reasoning behind the trade-offs, the context that no model was trained on. If your competitors can generate the same code with the same model you’re using, it stops being an advantage.

Most organizations ask the first question on a good day. Almost none ask the second. That gap is how you end up shipping fast into a moat nobody can explain and nobody can extend.

Understanding why both dimensions matter starts with velocity and what happens when the feedback loop around it breaks.

Velocity feels real. Debt is often invisible.

AI coding tools are genuinely impressive. GitHub’s research showed 55% faster task completion with Copilot in controlled conditions.1 That number has driven an assumption that faster is always better.

A 2025 METR randomized controlled trial2 found something that should give every engineering leader pause. Sixteen experienced developers on real production codebases forecasted they’d complete tasks 24% faster with AI. After finishing, they estimated they’d gone 20% faster. They’d actually gone 19% slower.

The velocity finding is striking. But the perception gap matters more. The feedback loop between “how am I doing?” and “how am I actually doing?” was broken throughout and never corrected itself. This doesn’t resolve the velocity debate. It reframes it. The danger isn’t that individuals move too fast. Organizations mistake output volume for productivity and strip out the review processes that used to catch what that gap costs.

A Tilburg University study of open source projects after GitHub Copilot’s introduction found the same pattern at the organizational level.3 Productivity did increase, but primarily among less-experienced developers. Code written after AI adoption required more rework to meet repository standards. The added rework burden fell on the most experienced (core) developers who reviewed 6.5% more code after Copilot’s introduction and saw a 19% drop in their own original code output. The velocity looks real at the surface. Underneath, the maintenance cost shifts upward to the people who can least afford to lose productive time.

That broken feedback loop has a name. Researchers call it cognitive debt4: the growing gap between how much code exists in your system and how much of it anyone actually understands. Technical debt shows up in your linter and your backlog. Cognitive debt is invisible. There’s no signal telling engineers where their understanding ends. That’s precisely what the METR perception gap showed. It never corrected itself.

Research by Anthropic Fellows found that engineers using AI assistance when learning new tools scored 17% lower on comprehension tests than those who coded by hand, with the steepest drops in debugging ability.5 MIT’s Media Lab found the same pattern in writing tasks: Brain connectivity was weakest in the group using LLM assistance, strongest in the group working without tools.⁴ Active production builds understanding. Passive consumption doesn’t.

You understand what you build better than what you review. When you write code, you produce output and build a mental model. That’s what Peter Naur called the “theory of the program.” It lives in your head, not in the repo.6 The MIT study captured this directly. 83% of participants who wrote essays with LLM assistance could not quote a single sentence from essays they had just written.⁴

Cognitive debt is invisible until it isn’t. When it surfaces, it hits both dimensions hard, in different ways.

Business risk: The blast radius of not knowing

On the business risk dimension, cognitive debt is a safety problem.

When nobody fully understands the system, the blast radius of a failure expands silently. The incident that eventually comes (and it always comes) lands on a team that can’t diagnose what they didn’t build. The engineer pulling up the failing service at 2 AM has no mental model of why it was built the way it was, what it connects to, or what the edge cases look like under load. So they ask the LLM. It can explain what the code does and often propose a reasonable fix. It can’t tell you why it was designed that way. And a fix that looks right to the model can quietly violate constraints that nobody thought to document.

Cognitive debt compounds a second, independent risk: the pace at which AI-generated code reaches production. OX Security’s analysis7 of over 300 software repositories found that AI-generated code isn’t necessarily more vulnerable per line than human-written code. The problem is velocity.

Code review, debugging, and team oversight are the bottlenecks that catch vulnerable code before it ships. AI makes it easy to remove them. CodeRabbit’s analysis of real-world pull requests found AI-authored changes contain up to 1.7x more critical and major defects than human-written code, with logic and correctness issues up 75%.8 Apiiro’s analysis found that while AI reliably reduces surface-level syntax errors, architectural design flaws and privilege escalation paths (the categories automated scanners miss and human reviewers struggle to catch) spiked in AI-assisted codebases.9

AI accelerates output and accelerates unreviewed risk in equal measure. The cognitive debt means that when something breaks, the team is learning the system as they’re trying to fix it. Remove their understanding and you haven’t streamlined the process. You’ve only removed the thing standing between a bad day and a catastrophic one.

Competitive differentiation: What you give away without knowing it

The competitive differentiation risk isn’t that AI will generate your exact competitive algorithm and hand it to your competitor. It’s subtler. Your advantage was never the code itself; it was the judgment that shaped it. When AI writes that code, the judgment never forms. The code arrives, but the understanding that would let your team extend it, improve it, or defend it under pressure doesn’t. Your moat is most likely to survive in the places AI finds hardest to reach.

That judgment—formed by the performance trade-offs that took years to tune, the failure modes that only someone who’s been paged understands, the architectural decisions that encode domain knowledge nobody wrote down—doesn’t live in the codebase. It lives in your engineers’ heads.

And here’s the part most teams miss: Your competitor with the same AI tools doesn’t just get similar code, they get a team that also doesn’t understand why it works the way it does, which means neither of you can extend it, and the race to the next architectural move is a coin flip rather than a compounding advantage. The build-versus-buy discipline exists precisely because decades of experience taught engineering organizations that outsourcing your core means losing the ability to extend it. The token-funded generation loop doesn’t change that calculus. It makes it easier to mistake the outsourcing for ownership because the code has your name on it.

The structural problem runs even deeper. Models trained on public code produce outputs weighted toward well-represented patterns, the common solutions to common problems. Research confirms this. LLM performance drops sharply on less-common programming languages where training data is sparse, and on genuinely novel implementations. Even the best current models correctly implement fewer than 40% of coding tasks drawn from recent research papers.10 And the convergence problem extends beyond code. A pre-registered experiment tracking 61 participants over seven days found that while ChatGPT consistently boosted creative output during use, performance reverted to baseline the moment the tool was unavailable.11 More critically, the work produced with AI assistance became increasingly homogenized over time. That homogenization persisted even after the tool was removed. The participants hadn’t borrowed the tool’s output. They’d internalized its patterns. For engineering organizations, this is the differentiation risk made concrete: Teams that rely on AI for their most critical design decisions risk generating commodity code today and training themselves to think in commodity patterns tomorrow.

Engineers who deeply own their most critical systems are better at diagnosing incidents and see the next architectural move that competitors can’t follow. Delegate that comprehension away and you can keep the lights on. You can’t see around corners.

When it goes wrong, it really goes wrong

Both dimensions rest on the same vulnerability: cognitive debt accumulating on work that matters. The failure cases make it concrete.

The production failures are accumulating. A Replit AI agent deleted months of production data in seconds after violating explicit code-freeze instructions, then initially misled the user about whether recovery was possible.12 Reports emerged in early 2026 of a major cloud provider convening mandatory engineering reviews after a pattern of high-blast-radius incidents, with AI-assisted code changes cited as a contributing factor. In each case, the humans in the loop either didn’t understand what they were approving, or weren’t in the loop at all.

The deeper pattern predates AI tools entirely. Knight Capital Group took seventeen years to become the largest trader in U.S. equities. It took forty-five minutes to lose $460 million.13 The culprit was a nine-year-old piece of deprecated code called Power Peg, left on production servers and never retested after engineers modified an adjacent function in 2005. When engineers reused its feature flag for new functionality in 2012, nobody understood what they were reactivating. When the fault surfaced, the team’s attempt to fix it made things worse. They uninstalled the new code from the seven servers where it had deployed correctly, which caused Power Peg to activate on those servers too and compounded the losses. The SEC’s enforcement order is unambiguous: absent deployment procedures, no code review requirements, no incident response protocols. A failure of institutional comprehension where the mental model had quietly evaporated while the code kept running.

No AI tool wrote that code. The failure was entirely human, through entirely normal processes: engineers leaving, tests never rerun after refactors, flags reused without documentation. This is the baseline, what software organizations produce under ordinary conditions over nine years. An engineering team with modern AI tools won’t recreate this specific bug. They’ll create the conditions for the next one faster: more code that nobody fully understands, more dependencies nobody documented, more cognitive debt accumulating before anyone notices. AI removes the friction that once slowed exactly this kind of erosion.

None are failures of AI capability. They’re failures of judgment about where to deploy AI and how much human oversight to maintain.

Part 2: A four-quadrant model for AI autonomy

The quadrants

Human involvement in programming quadrants

Four quadrants emerge when both questions are asked together. Before the examples, two contrasts are worth naming because the quadrants that look most similar on the surface are the ones most often confused in practice.

Supervised automation versus Human-led craftsmanship. Both demand high human involvement. Both feel like “be careful here.” But the difference is fundamental. In Supervised Automation, the human is a safety gate. The work is a commodity; you’re there to catch errors before they escape. In Human-led craftsmanship, the human is the author. You’re building the mental model that lets the next engineer reason about this system under pressure three years from now and take it somewhere new. The code isn’t something you need to verify. It’s something you need to own. And ownership here extends beyond the individual engineer. The team writes RFCs, debates trade-offs, identifies which parts of the implementation fall into which quadrant, and makes sure the reasoning behind key decisions is shared, not siloed. Human-led craftsmanship isn’t one person writing code alone. It’s a team making sure the understanding survives the people who built it.

Collaborative co-creation versus Human-led craftsmanship. Both involve high differentiation, and in both, the human drives the vision and owns the key decisions. But risk changes everything about how you work. In Collaborative co-creation, early iterations are recoverable. A wrong turn can be corrected before it costs you anything serious, so AI can genuinely accelerate execution. In Human-led craftsmanship, the blast radius of not understanding what you’ve built compounds over time. Wrong turns become load-bearing walls, and the architectural moves you can’t see are the ones that let competitors catch up. AI assists with scoped subtasks only. Every contribution gets interrogated.

In full automation, the human is a director. You define what needs to be done, AI produces the output, and you spot-check the result. The work is low-risk and low-differentiation. If something’s wrong, you fix it in the next iteration without anyone outside the team noticing. This is where AI earns its keep without qualification, and where restricting it costs you real velocity with nothing to show for it.

To make all four quadrants concrete, we’ll use a single feature as a lens: building AI Gateway cost controls, the system that sets token budgets per agent, enforces spending limits, tracks usage by model and agent, and handles enforcement modes when an agent exceeds its budget.

Low risk, low differentiation: Full automation

API docs for cost controls. Test scaffolding for token limit scenarios. Config examples for per-agent budgets. Every platform has docs, and if there’s a mistake, you fix it in the next iteration without anyone outside the team noticing. Humans set direction and spot-check. AI writes, tests, and ships.

The test: If this is wrong, can you fix it before a customer sees it or complains? If yes, automate freely.

Low risk, high differentiation: Collaborative co-creation

Designing the UX for the token usage dashboard. Iterating on routing rules that determine when an agent degrades to a cheaper model, halts entirely, or triggers a notification. These decisions separate a sophisticated platform from a blunt on/off switch, but early iterations are recoverable. A first version that doesn’t surface guardrail costs separately isn’t a disaster. It’s a product conversation. Humans drive the design vision and interrogate AI on trade-offs. AI accelerates execution and handles boilerplate.

The test: If you flipped the ratio (AI deciding, human rubber-stamping) would you be comfortable? If not, this requires genuine co-creation, not delegation. The human should be able to explain the trade-offs in the current design and know where to push it next.

High risk, low differentiation: Supervised automation

Enforcement logic that halts an agent when it hits its token budget. Every cost control system needs enforcement, so this isn’t differentiating. But if it fails, agents run unconstrained and rack up unbounded LLM spend. AI can draft the logic. A human must trace every path and understand every state transition before signing off. The question before merge: Can I explain exactly what happens when an agent hits the limit mid-execution? Can I explain this behavior to Customer Success or the Customer?

The test: Could a competent engineer review this confidently without having written it? If yes, the human’s job is to verify, not to author. But the bar for verification is explanation, not approval.
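The distinction is easier to see with even a toy version of that enforcement path. The sketch below is hypothetical and deliberately simplified (the TokenBudget class and charge function are invented for illustration, not drawn from any real gateway), but it already contains the mid-execution edge case a reviewer would need to trace before signing off:

# Hypothetical sketch of per-agent token budget enforcement (illustrative only).
from dataclasses import dataclass

@dataclass
class TokenBudget:
    limit: int        # maximum tokens this agent may spend
    used: int = 0     # tokens consumed so far

class BudgetExceeded(Exception):
    """Raised when an agent attempts to exceed its token budget."""

def charge(budget: TokenBudget, requested: int) -> None:
    """Reserve tokens for the next model call, or halt the agent.

    The case a reviewer must trace is the request that crosses the limit
    mid-execution: here it is rejected before any tokens are spent, so
    `used` never exceeds `limit`. A naive check after the call would let
    one oversized request blow straight past the budget.
    """
    if budget.used + requested > budget.limit:
        raise BudgetExceeded(
            f"agent would use {budget.used + requested} of {budget.limit} tokens"
        )
    budget.used += requested

# The gateway charges the budget before each model call.
budget = TokenBudget(limit=10_000)
charge(budget, 2_500)      # fine: used = 2,500
try:
    charge(budget, 9_000)  # would cross the limit, so the agent is halted
except BudgetExceeded as e:
    print("halted:", e)

None of this is differentiating, which is exactly the point: AI can draft it, and the human's job is to be able to explain every path it can take.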

High risk, high differentiation: Human-led craftsmanship

The core token metering and attribution engine. It tracks usage per agent and per model, attributes guardrail costs separately so they don’t count against agent budgets, and provides the auditability enterprise customers need to govern AI spend. Get it wrong and customers can’t trust the numbers. Get it right and it’s a genuine competitive moat that competitors can’t replicate with the same AI tools you’re using.

Human engineers own the design end-to-end. AI assists on scoped subtasks once the design is settled: drafting specific functions, generating test coverage for paths the engineer has already reasoned through. Every contribution gets interrogated. The bar is whether the engineer could explain it in an incident review without looking at the code first.

The test: If the engineer who built this left tomorrow, would the team still understand why it works the way it does? Could they make it better? If the honest answer is no, you’re accumulating the most dangerous kind of cognitive debt there is.

The counterargument (it’s a good one)

Any engineering leader will push back here, and they’ll have good reason to.

The research is thin. METR’s study had 16 developers. MIT’s EEG work is a preprint that its own critics say should be interpreted conservatively.14 The Anthropic comprehension study shows a quiz score gap, not a business outcome. The evidence is early-stage. Intellectual honesty requires acknowledging that.

But the pattern keeps showing up in unrelated fields. A Lancet study found that endoscopists who routinely used AI for polyp detection performed measurably worse when the AI was removed, with adenoma detection rates dropping from 28.4% to 22.4% in three months.15 The study is observational and small. But the direction is consistent with everything else: Routine AI assistance may erode the skills it was supposed to support.

Most engineering work isn’t high-stakes. Studies consistently estimate that 60–80% of engineering time goes to maintenance, tests, docs, integration, and tooling, exactly the stuff that belongs in the automate quadrant regardless. Restricting AI because of the top 20% creates a real tax on the other 80%.

And can’t engineers develop deep ownership of AI-generated code through study and iteration? Partially. But the behavioral data tells a harder story. GitClear’s analysis of 211 million changed lines shows a decline in refactored code since AI adoption accelerated.16 Engineers aren’t studying AI-generated code carefully. They’re moving on to the next feature. LLM tools can explain what code does; they can’t tell you why the system was designed the way it was.17

The serious pro-AI argument isn’t “use AI everywhere.” It’s more precise: The guardrails for verification and oversight are improving fast, engineers who actively interrogate AI output build understanding even from generated code, and the organizations that restrict AI on their most critical work will fall behind competitors who don’t. This is a real argument.

The answer isn’t to dismiss it but to sharpen what “critical work” means. And, to recognize that the interrogative use of AI that the research identifies as understanding-preserving requires organizational discipline that most teams haven’t built yet. The quadrant isn’t permanent. The threshold shifts as both AI capability and human oversight practices mature. The discipline is the habit of asking both questions honestly before you start, not a fixed answer to them.

The discipline is simple. Maintaining it isn’t.

The quadrant tells you where to be careful. How you engage AI once you’re there determines whether careful is enough. The difference between “write me this function” and “explain why you made this trade-off, and what breaks if the input is malformed” is the difference between borrowing intelligence and developing it. Active, interrogative AI use preserves comprehension. Passive delegation destroys it. That’s what the Anthropic study’s behavioral data shows directly.

Match your review process to the quadrant. AI-generated docs and test scaffolding get a spot-check. AI-generated code touching your core product logic gets the same scrutiny as a junior engineer’s first PR. The bar for approval isn’t “tests pass.” It’s “someone on this team can explain what this does, defend it under pressure, and use that understanding to make it better.” Full automation needs a spot-check. Human-led craftsmanship needs an RFC, a team review, and shared ownership of the reasoning before anyone writes a line of code.

This matters especially in real-time data and AI infrastructure, systems where the most dangerous failure modes are emergent, appearing at scale and under load in combinations the code itself doesn’t express. That’s a core reason Redpanda is designed for simplicity and predictability: engineers need to be able to reason about how infrastructure behaves under pressure, not discover it during an incident.18 Finally, recognize that the threshold will shift. As AI capability improves, what belongs in the automate quadrant expands. The discipline isn’t a fixed answer. It’s the habit of asking both questions honestly before you start.

The real competitive question

The companies that get this right won’t be the ones that use the most AI or the least. They’ll be the ones whose leaders have internalized that risk and differentiation are independent variables, and that cognitive debt threatens both.

The engineer who doesn’t know how their algorithm works is a symptom. The organization that allowed it is the cause.

Treat cognitive debt as only a risk problem and you end up with engineers who can’t diagnose failures they didn’t build. Treat it as only a differentiation problem and you get fragile systems that survive until the next incident. Let it accumulate on your most critical systems and you get both at once.

Your competitor is making this calculation right now. The question isn’t whether to use AI. It’s whether you’re being honest about which quadrant you’re in, and whether your team will know the answer when it finally matters.


Co-authored with Claude (Anthropic). Yes, we took the advice from this article.


Footnotes

  1. Peng, S. et al. (2023). The Impact of AI on Developer Productivity: Evidence from GitHub Copilot. https://arxiv.org/abs/2302.06590 ↩
  2. Becker, J., Rush, N. et al. (2025). Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity. METR. https://arxiv.org/abs/2507.09089 ↩
  3. Xu, F., Medappa, P.K., Tunc, M.M., Vroegindeweij, M., & Fransoo, J.C. (2025). AI-Assisted Programming May Decrease the Productivity of Experienced Developers by Increasing Maintenance Burden. Tilburg University. https://arxiv.org/abs/2510.10165 ↩
  4. Kosmyna, N. et al. (2025). Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task. MIT Media Lab. https://arxiv.org/abs/2506.08872 (preprint, not yet peer-reviewed) ↩
  5. Shen, J.H. & Tamkin, A. (2026). How AI Impacts Skill Formation. Anthropic Safety Fellows Program. https://arxiv.org/abs/2601.20245 ↩
  6. The generation effect: Rosner, Z.A. et al. (2012). The Generation Effect: Activating Broad Neural Circuits During Memory Encoding. Cortex. https://pmc.ncbi.nlm.nih.gov/articles/PMC3556209/ and Bertsch, S. et al. (2007). The generation effect: A meta-analytic review. Memory & Cognition. https://link.springer.com/article/10.3758/BF03193441 and Naur, P. (1985). Programming as Theory Building. Microprocessing and Microprogramming. https://pages.cs.wisc.edu/~remzi/Naur.pdf ↩
  7. OX Security. (October 2025). Army of Juniors: The AI Code Security Crisis. https://www.helpnetsecurity.com/2025/10/27/ai-code-security-risks-report/ ↩
  8. CodeRabbit. (December 2025). State of AI vs Human Code Generation Report. https://www.coderabbit.ai/blog/state-of-ai-vs-human-code-generation-report. Note: CodeRabbit produces AI code review tooling; findings should be read in that context. ↩
  9. Apiiro. (September 2025). 4x Velocity, 10x Vulnerabilities: AI Coding Assistants Are Shipping More Risks. https://apiiro.com/blog/4x-velocity-10x-vulnerabilities-ai-coding-assistants-are-shipping-more-risks/. Note: Apiiro produces application security tooling; findings should be read in that context. ↩
  10. Joel, S., Wu, J.J., & Fard, F.H. (2024). A Survey on LLM-based Code Generation for Low-Resource and Domain-Specific Programming Languages. ACM TOSEM. https://arxiv.org/abs/2410.03981. See also: Hua, et al. (2025). ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code. https://arxiv.org/abs/2506.02314 ↩
  11. Liu, Q., Zhou, Y., Huang, J., & Li, G. (2024). When ChatGPT is Gone: Creativity Reverts and Homogeneity Persists. https://arxiv.org/abs/2401.06816 ↩
  12. Fortune. (July 2025). AI-Powered Coding Tool Wiped Out a Software Company’s Database in ‘Catastrophic Failure.’ https://fortune.com/2025/07/23/ai-coding-tool-replit-wiped-database-called-it-a-catastrophic-failure/ ↩
  13. Knight Capital Group. SEC Administrative Proceeding, Release No. 70694 (October 16, 2013). https://www.sec.gov/litigation/admin/2013/34-70694.pdf. Levine, M. (2013). Knight Capital’s $440 Million Compliance Disaster. Bloomberg. https://www.bloomberg.com/opinion/articles/2013-10-17/knight-capital-s-440-million-compliance-disaster ↩
  14. Stankovic, M. et al. (2025). Comment on: Your Brain on ChatGPT. https://arxiv.org/abs/2601.00856 ↩
  15. Budzyń, K., Romańczyk, M. et al. (2025). Endoscopist Deskilling Risk After Exposure to Artificial Intelligence in Colonoscopy: A Multicentre, Observational Study. Lancet Gastroenterol Hepatol. 10(10):896-903. https://doi.org/10.1016/S2468-1253(25)00133-5 ↩
  16. Harding, W. (2025). AI Copilot Code Quality: Evaluating 2024’s Increased Defect Rate via Code Quality Metrics. GitClear. https://www.gitclear.com/ai_assistant_code_quality_2025_research ↩
  17. Zhou, X., Li, R., Liang, P., Zhang, B., Shahin, M., Li, Z., & Yang, C. (2025). Using LLMs in Generating Design Rationale for Software Architecture Decisions. ACM TOSEM. https://arxiv.org/abs/2504.20781. See also: Tang, N., Chen, M., Ning, Z., Bansal, A., Huang, Y., McMillan, C., & Li, T.J.-J. (2024). A Study on Developer Behaviors for Validating and Repairing LLM-Generated Code Using Eye Tracking and IDE Actions. IEEE VL/HCC 2024. https://arxiv.org/abs/2405.16081 ↩
  18. Gallego, A. (2025). Introducing the Agentic Data Plane. Redpanda. https://www.redpanda.com/blog/agentic-data-plane-adp. Crosier, K. (2026). How to Safely Deploy Agentic AI in the Enterprise. Redpanda. https://www.redpanda.com/blog/deploy-agentic-ai-safely-enterprise ↩




RapidClaw Earns a 44.89 Proof of Usefulness Score by Building AI Co-Founder Agents

RapidClaw helps early-stage founders and indie hackers automate startup tasks like investor outreach, pitch decks, market research, and dev work — each agent gets its own isolated server.