
Two different tricks for fast LLM inference


Anthropic and OpenAI both recently announced “fast mode”: a way to interact with their best coding model at significantly higher speeds.

These two versions of fast mode are very different. Anthropic’s offers up to 2.5x tokens per second (so around 170, up from Opus 4.6’s 65). OpenAI’s offers more than 1000 tokens per second (up from GPT-5.3-Codex’s 65 tokens per second, so 15x). So OpenAI’s fast mode is six times faster than Anthropic’s [1].

However, Anthropic’s big advantage is that they’re serving their actual model. When you use their fast mode, you get real Opus 4.6, while when you use OpenAI’s fast mode you get GPT-5.3-Codex-Spark, not the real GPT-5.3-Codex. Spark is indeed much faster, but is a notably less capable model: good enough for many tasks, but it gets confused and messes up tool calls in ways that vanilla GPT-5.3-Codex would never do.

Why the differences? The AI labs aren’t advertising the details of how their fast modes work, but I’m pretty confident it’s something like this: Anthropic’s fast mode is backed by low-batch-size inference, while OpenAI’s fast mode is backed by special monster Cerebras chips. Let me unpack that a bit.

How Anthropic’s fast mode works

The tradeoff at the heart of AI inference economics is batching, because the main bottleneck is memory. GPUs are very fast, but moving data onto a GPU is not. Every inference operation requires copying all the tokens of the user’s prompt [2] onto the GPU before inference can start. Batching multiple users up thus increases overall throughput at the cost of making users wait for the batch to be full.

A good analogy is a bus system. If you had zero batching for passengers - if, whenever someone got on a bus, the bus departed immediately - commutes would be much faster for the people who managed to get on a bus. But obviously overall throughput would be much lower, because people would be waiting at the bus stop for hours until they managed to actually get on one.

Anthropic’s fast mode offering is basically a bus pass that guarantees that the bus immediately leaves as soon as you get on. It’s six times the cost, because you’re effectively paying for all the other people who could have got on the bus with you, but it’s way faster [3] because you spend zero time waiting for the bus to leave.

edit: I want to thank a reader for emailing me to point out that the “waiting for the bus” cost is really only paid for the first token, so that won’t affect streaming latency (just latency per turn or tool call). It’s thus better to think of the performance impact of batch size as being mainly that smaller batches require fewer flops and thus execute more quickly. In my analogy, maybe it’s “lighter buses drive faster”, or something.
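To make the tradeoff concrete, here is a deliberately crude, roofline-style toy model of a single decode step. Every constant is an assumption chosen for illustration, not a measurement of Anthropic’s (or anyone’s) serving stack:

```python
# Toy roofline model of one decode step. All constants are illustrative
# assumptions, not measurements of any real deployment.

WEIGHT_BYTES = 140e9     # assumed weights streamed per step (~70B params at fp16)
HBM_BANDWIDTH = 3e12     # assumed memory bandwidth, bytes/s
FLOPS_PER_TOKEN = 140e9  # assumed ~2 FLOPs per parameter per generated token
PEAK_FLOPS = 1e15        # assumed usable compute, FLOP/s

def step_time(batch_size: int) -> float:
    """A decode step is limited by whichever is slower: streaming the weights
    once for the whole batch, or doing the per-token math for every user."""
    memory_time = WEIGHT_BYTES / HBM_BANDWIDTH
    compute_time = batch_size * FLOPS_PER_TOKEN / PEAK_FLOPS
    return max(memory_time, compute_time)

for batch in (1, 8, 64, 512):
    t = step_time(batch)
    print(f"batch={batch:4d}  per-user tok/s={1 / t:7.1f}  aggregate tok/s={batch / t:9.1f}")
```

In this sketch, small batches stay memory-bound, so each user gets close to the full weight-streaming rate; very large batches become compute-bound, which pushes aggregate throughput way up while slowing every individual user down.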

Obviously I can’t be fully certain this is right. Maybe they have access to some new ultra-fast compute that they’re running this on, or they’re doing some algorithmic trick nobody else has thought of. But I’m pretty sure this is it. Brand new compute or algorithmic tricks would likely require changes to the model (see below for OpenAI’s system), and “six times more expensive for 2.5x faster” is right in the ballpark for the kind of improvement you’d expect when switching to a low-batch-size regime.

How OpenAI’s fast mode works

OpenAI’s fast mode does not work anything like this. You can tell that simply because they’re introducing a new, worse model for it. There would be absolutely no reason to do that if they were simply tweaking batch sizes. Also, they told us in the announcement blog post exactly what’s backing their fast mode: Cerebras.

OpenAI announced their Cerebras partnership a month ago in January. What’s Cerebras? They build “ultra low-latency compute”. What this means in practice is that they build giant chips. An H100 chip (fairly close to the frontier of inference chips) is just over a square inch in size. A Cerebras chip is 70 square inches.

[Image: a Cerebras wafer-scale chip]

You can see from pictures that the Cerebras chip has a grid-and-holes pattern all over it. That’s because silicon wafers this big are supposed to be broken into dozens of chips. Instead, Cerebras etches a giant chip over the entire thing.

The larger the chip, the more internal memory it can have. The idea is to have a chip with SRAM large enough to fit the entire model, so inference can happen entirely in-memory. Typically GPU SRAM is measured in the tens of megabytes. That means that a lot of inference time is spent streaming portions of the model weights from outside of SRAM into the GPU compute [4]. If you could stream all of that from the (much faster) SRAM, inference would get a big speedup: fifteen times faster, as it turns out!
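A rough back-of-envelope comparison shows why that matters; both bandwidth figures below are assumptions in the right ballpark rather than vendor specs, and real decode steps are never purely weight-streaming-bound:

```python
# Back-of-envelope only: time per decode step if the step were purely bound by
# streaming the model weights once. Both bandwidth figures are assumptions.

WEIGHT_BYTES = 40e9  # e.g. a ~20B-parameter model at fp16

sources = [
    ("HBM on a modern GPU (assumed ~3 TB/s)", 3e12),
    ("on-chip SRAM on a wafer-scale part (assumed ~1 PB/s)", 1e15),
]

for name, bandwidth in sources:
    step = WEIGHT_BYTES / bandwidth
    print(f"{name}: {step * 1e3:6.2f} ms/step, upper bound ~{1 / step:,.0f} tok/s")
```

Even with a deliberately conservative figure for on-chip SRAM bandwidth, the weight-streaming ceiling jumps by a couple of orders of magnitude, which is plenty of headroom for a 1000+ tokens-per-second model.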

So how much internal memory does the latest Cerebras chip have? 44GB. This puts OpenAI in kind of an awkward position. 44GB is enough to fit a small model (~20B params at fp16, ~40B params at int8 quantization), but clearly not enough to fit GPT-5.3-Codex. That’s why they’re offering a brand new model, and why the Spark model has a bit of “small model smell” to it: it’s a smaller distil of the much larger GPT-5.3-Codex model [5].
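The capacity arithmetic is easy to sanity-check. Taking the 44GB figure at face value, and ignoring KV-cache, activations and runtime overhead (which all eat into the budget):

```python
# How many parameters fit in 44 GB of on-chip SRAM at different precisions?
# Ignores KV-cache, activations and runtime overhead, which all shrink the budget.

SRAM_BYTES = 44e9

for precision, bytes_per_param in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{precision}: ~{SRAM_BYTES / bytes_per_param / 1e9:.0f}B parameters")
```

That lines up with the ~20B/~40B ballpark above, and makes it clear why a full-size frontier model simply does not fit, hence the smaller Spark distil.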

OpenAI’s version is much more technically impressive

It’s interesting that the two major labs have two very different approaches to building fast AI inference. If I had to guess at a conspiracy theory, it would go something like this:

  • OpenAI partner with Cerebras in mid-January, obviously to work on putting an OpenAI model on a fast Cerebras chip
  • Anthropic have no similar play available, but they know OpenAI will announce some kind of blazing-fast inference in February, and they want to have something in the news cycle to compete with that
  • Anthropic thus hustle to put together the kind of fast inference they can provide: simply lowering the batch size on their existing inference stack
  • Anthropic (probably) wait until a few days before OpenAI are done with their much more complex Cerebras implementation to announce it, so it looks like OpenAI copied them

Obviously OpenAI’s achievement here is more technically impressive. Getting a model running on Cerebras chips is not trivial, because they’re so weird. Training a 20B or 40B param distil of GPT-5.3-Codex that is still kind-of-good-enough is not trivial. But I commend Anthropic for finding a sneaky way to get ahead of the announcement that will be largely opaque to non-technical people. It reminds me of OpenAI’s mid-2025 sneaky introduction of the Responses API to help them conceal their reasoning tokens.

Is fast AI inference the next big thing?

Seeing the two major labs put out this feature might make you think that fast AI inference is the new major goal they’re chasing. I don’t think it is. If my theory above is right, Anthropic don’t care that much about fast inference, they just didn’t want to appear behind OpenAI. And OpenAI are mainly just exploring the capabilities of their new Cerebras partnership. It’s still largely an open question what kind of models can fit on these giant chips, how useful those models will be, and if the economics will make any sense.

I personally don’t find “fast, less-capable inference” particularly useful. I’ve been playing around with it in Codex and I don’t like it. The usefulness of AI agents is dominated by how few mistakes they make, not by their raw speed. Buying 6x the speed at the cost of 20% more mistakes is a bad bargain, because most of the user’s time is spent handling mistakes instead of waiting for the model [6].

However, it’s certainly possible that fast, less-capable inference becomes a core lower-level primitive in AI systems. Claude Code already uses Haiku for some operations. Maybe OpenAI will end up using Spark in a similar way.


  1. This isn’t even factoring in latency. Anthropic explicitly warns that time to first token might still be slow (or even slower), while OpenAI thinks the Spark latency is fast enough to warrant switching to a persistent websocket (i.e. they think the 50-200ms round trip time for the handshake is a significant chunk of time to first token).

  2. Either in the form of the KV-cache for previous tokens, or as some big tensor of intermediate activations if inference is being pipelined through multiple GPUs. I write a lot more about this in Why DeepSeek is cheap at scale but expensive to run locally, since it explains why DeepSeek can be offered at such cheap prices (massive batches allow an economy of scale on giant expensive GPUs, but individual consumers can’t access that at all).

  3. Is it a contradiction that low-batch-size means low throughput, but this fast pass system gives users much greater throughput? No. The overall throughput of the GPU is much lower when some users are using “fast mode”, but those users’ throughput is much higher.

  4. Remember, GPUs are fast, but copying data onto them is not. Each “copy these weights to GPU” step is a meaningful part of the overall inference time.

  5. Or a smaller distil of whatever more powerful base model GPT-5.3-Codex was itself distilled from. I don’t know how AI labs do it exactly, and they keep it very secret. More on that here.

  6. On this note, it’s interesting to point out that Cursor’s hype dropped away basically at the same time they released their own “much faster, a little less-capable” agent model. Of course, much of this is due to Claude Code sucking up all the oxygen in the room, but having a very fast model certainly didn’t help.


Evolving AI Plans


The world of AI is moving faster than most can keep up with; by the time you have mastered one pattern, several others have cropped up. If you do not keep a check on yourself, you could be left behind. If you thought it was tough in the 80s/90s (I am old, I get it) with how fast things were evolving, then buckle up: we are now reaching lightspeed.

[!NOTE]

Even my older AI blog posts are already starting to feel dated, but I am enjoying the challenge!

New patterns for a new age

The Bueller quote heading this article has never been more true. It seems almost daily that I am reviewing the news and latest posts from ALL of the major partners involved with Agents and LLMs for new patterns, ideas and suggestions. Thankfully, everyone is riffing off each other in a good way: someone suggests or posts something, then someone else builds on that and goes that bit further, whether it is:

  • A new LLM logic.
  • Improved MCP interfaces and tools.
  • Patterns, plans and guidance.
  • Token reuse or conservation.

The list goes on. In some ways, it is like the best parts of collaborative science (in others, a dark dystopian nightmare), ever creeping forward.

This post was sparked after I followed up on some of these reports and started to see REAL IMPROVEMENT in several key factors in my own research, namely:

  • Reduced hallucinations (not gone completely, but significantly reduced).
  • Improved memory (yes, you can have memory if you do it right).
  • Improved outcomes.

It almost sounds too good to be true, but my workflow has improved, especially with some dedicated research projects I continually use to test new patterns.

[!WARNING] This is NOT a silver bullet. To get more out, you MUST put more in, and thus the cost, depending on what you are using, will INCREASE.

My recommendation, as is becoming more of the goal in today's market, is to HOST your own LLM infrastructure (they have it running on Raspberry Pis now). It is possible with a little effort, and you can reduce your downstream costs to all but the critical path.

Sadly, this is out of scope for this article, but I will try to add more later.

So, what are these patterns I tease?

Evolving patterns

The updated patterns I have started employing recently fall into two categories with a single shared theme:

Shared theme - The living document

The core of the recent updates is to create LIVING DOCUMENTS at the core of any process. This is a SINGULAR document (and you MUST be explicit, as some agents, looking at you Claude, will randomly go off and create masses of documents you did not ask for) which ALL agent calls must update with their state, or which you instruct the agent to update at critical junctures.

I have found this works far better than using the Memory MCP, or constantly updating the instructions guide for the agent, and as a separate document, you can create references to it from all other material, including the instructions. In testing, this has been far more efficient and rigorously followed than instructions or memory alone.

  • You start the session with the instructions pointing to the living document
  • The agent reads the current state with suggested next steps
  • It validates any failures or bad paths from previous attempts
  • It then plans ahead!

Your mileage may vary, but this singular change VASTLY improves my outcomes. Granted, at the cost of the additional tokens needed to read the document, hence the additional cost. You “can” summarise the document in Agent format if you wish, but I have found this actually degrades its performance and use.
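As a concrete illustration, here is a minimal Python sketch of the idea; the file name, entry format and helper are hypothetical examples, not something prescribed by any particular agent tool:

```python
# A minimal sketch of the "living document" pattern: one append-only status file
# that every agent session reads first and updates last. The file name and entry
# format here are hypothetical, not taken from any specific framework.
from datetime import datetime, timezone
from pathlib import Path

LIVING_DOC = Path("LIVING_DOCUMENT.md")

def append_entry(agent: str, intent: str, outcome: str) -> None:
    """Record what an agent session set out to do and how it ended."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    entry = (
        f"\n## {stamp} ({agent})\n"
        f"- Intent: {intent}\n"
        f"- Outcome: {outcome}\n"
        f"- Suggested next step: (fill in before ending the session)\n"
    )
    with LIVING_DOC.open("a", encoding="utf-8") as f:
        f.write(entry)

append_entry(
    "planning-agent",
    "Draft the asset-migration plan",
    "Plan drafted; two open questions recorded for human review",
)
```

The point is less the code than the convention: one file, appended to in the same shape every time, that any fresh session (or a different agent entirely) can read to recover the current state.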

Patterns

Now to the other meat on the bone: the revised patterns that consume the living document. I have long stated that LLMs are more tuned to small tasks, a quick fix or a diagnosis, but where you need to dig deeper, you have to keep a better handle on the outcome. These are the methods I have now turned to using:

It all starts with a defined and reviewed plan

By far my biggest improvement through LLM use has come from focusing and narrowing its path, making it think and constantly challenging its assumptions. It is NOT foolproof, but it greatly improves your chances of a positive outcome. The whole approach (which is token-heavy) is an almost constant cycle of review, approval, implementation and review, which, when implemented, looks something like this:

  • First Agent session - Create the PRD (Product Requirements Document) from a detailed set of requirements; this ensures a list of preferences and outcomes.
  • Build an instruction document using the PRD as a reference, guiding the steps for planning, with KEY steps to avoid creating documents unless requested (this stops the multi-document scenario). This might sound counterintuitive; however, the two documents serve very distinct purposes. One directs how to think, the other tells the LLM what to think about.

    [!IMPORTANT] Ensure you leave off with a note about the aforementioned Living Document: instruct the agent to check the Living Document at the start of a prompt, update it with its intent, and finish the query with a statement in the Living Document as to the outcome. This trains and helps the Agent learn, and gives you something to pick up on if you start a completely fresh session.

  • Next, start a completely new session (even with another agent if you wish), define the session as a reviewer / architect actor and have it review and improve the plan. Take note of what has changed and follow its thinking. Make corrections where YOU disagree. It is essential you are part of this review process, as key decisions that were maybe not clear in the design become apparent (as often happens in life, poor design leads to poor outcomes).

    [!NOTE] You can repeat the previous step a few times with different defined actors, different personas, just as you would in real life. The best plans involve a team, and in this case, it is a team of agents AND YOU!

  • At the end, ensure the plan begins with a Living Document record, so the document is updated as the implementation is delivered.

Once you are happy with the plan, the next step begins; either of the following two patterns can be taken:

Plan big but follow your own path

With the plan in hand, instruct the Agent to document and detail the plan for implementation. ALWAYS ensure you finish with “If you have questions, ask them before beginning” (if you do not, it will not, and it ABSOLUTELY SHOULD). The choice of outcome is up to you:

  • A singular implementation document.
  • Several documents, one for each stage or component, ordered for implementation.
  • A mix, depending on your style, backend first, then frontend and UX.

Ultimately, this is a plan YOU will follow. This is my preferred mode, as it gives more opportunity to question as you implement, or even change your choices. All the agent has effectively done is ratify the architectural state based on your inputs; it is arguably the most human way to use the tools to achieve your goals.

If anything is unclear, ask the LLM to clarify, make changes or explain something, “BUT ONLY TO THE PLAN”; the Agent should NOT touch the code - that is your domain.

[!NOTE] In my experience, when instructing the planning phase, I also interject to ask the Agent to explain each section or block of code: its intention and what it is meant to achieve. This helps you judge whether the implementation is the best thing to do. If you disagree, get it to update its plan, or make your own implementation and then get the Agent to update the spec, then review for ancillary impacts.

This approach feels more like the Agent working cooperatively than running the show and seeing what it comes out with.

[!IMPORTANT] At each stage, with each change, ensure to instruct the Agent to update the living document. It might be recorded in the Instructions, but I feel safer double-checking at critical points.

At all points, question everything; it is simply good for the soul and makes the challenge more fun. All that is left is to continue to the end and test everything, and I have found the results to be vastly improved.

[!TIP] Another fun step, but at the cost of tokens, is to ask the Agent from time to time to review what you have actually implemented against the plan and give a status report (this helps to reduce human error), marking the document with AI’s favourite thing: icons and emojis. This provides a summary as well as a visible checkpoint of your progress.

Automated plan, automated deployment

The automated plan follows a similar track to the manual implementation path, except you ensure the plan is broken out into repeatable and testable sections. It is not as efficient as the manual path, but for shorter and more throwaway experiments, it can be beneficial.

[!NOTE] In some cases, I am actually running the two approaches in parallel! I use the fast route to test theories in advance before incorporating them into plans, similar to prototyping.

Rather than the human doing the implementation, the agent is running the show, but CRITICALLY, not all at once. To avoid hallucinations and dreams of code, each section must be complete, testable and ultimately HUMAN VERIFIABLE.

An example of this was a total conversion of an old game sample in C++ using some REALLY legacy assets. The plan broke down as follows (granted after so many failed attempts in the past):

  • Review the project and break up / document the systems and content of the sample.
  • Research old asset formats and create detailed documentation of their makeup (critical for migration).
  • Build out sample sets of each content type and define a migration strategy.
  • For each asset type, build individual pipelines, with human verification at each step.
  • Then implement the migration in individual sessions or phases, each phase does NOT complete until the human in the process signs off on it.
  • Then plan the implementation, in a stackable way, each implementation building on the last, again with human signoff.

Doing this phased approach with signoff greatly reduces mass delusion and avoids wasting lots of tokens on something that can never ultimately work.

[!IMPORTANT] The human as part of the process is essential, not just for the quality of the output, but also for efficiency around costs and tokens. Yes, it is more effort than vibe-coding a website or app, but the results are FAR superior.

Conclusion

Thus ends this page in our journey, but the destination is still far from sight.

Learning should never end and we should always strive for improvement; in my humble view, this means working in cooperation with an agent, not just handing over the keys on a prompt, no matter how detailed, as that approach is ultimately flawed. We learn more along the journey, take those understandings and evolve a better plan. It is strange: in my many years as a project manager, designer, QA and even analyst, this is ultimately how we humans work better. We plan, we question, we revise and ultimately deliver a better outcome. It might not be perfect, but from that, we continue to build better.

Enough musing; back to the code, which is where I feel most at home, with my new pal sitting on my shoulder. (Still not sure whether it is a little angel or a devil, but let us see where this leads.)


Weekly Roundup – February 14, 2026


Welcome to our Healthcare IT Today Weekly Roundup. Each week, we’ll be providing a look back at the articles we posted and why they’re important to the healthcare IT community. We hope this gives you a chance to catch up on anything you may have missed during the week.

Why Health IT Still Struggles to Move As One System. OntarioMD CEO Robert Fox connected with Colin Hung to explain why integration complexity increases when care delivery shifts beyond the solo physician model – and how AI can enable collaboration and bring care teams together. Read more…

Guidance from The Sequoia Project on Computable Consent and Privacy. Mel Soliz and Kevin Day, co-chairs of the organization’s Privacy and Consent Workgroup, outlined why laws written from a policy perspective are hard to translate into clinical and technical terms, and how The Sequoia Project is trying to help. Read more…

Timely and Insightful Value-Based Care Data. Value-based care partners need to agree on a small set of shared metrics, Shweta Shanbhag at PointClickCare told John Lynn. That makes it easier to share data and glean insights from it in real-time, not weeks after the fact. Read more…

Epic Hosting in the Public Cloud. Former health system CTO Dr. Tim Calahan, now at EHC Consulting, joined John to spell out how transitioning the EHR to the cloud transforms IT teams from infrastructure caretakers to strategic enablers. He also talked about Epic’s evolving thoughts on public cloud, as well as using AI in the cloud. Read more…

Is It a Tech Problem or a Policy Problem in Value-Based Care? We asked the Healthcare IT Today community to weigh in on this question. While opinions were divided, one theme emerged: The problem is primarily misalignment between policy and technology. Read more…

How Ambient AI Helped Two Clinics Break the Cycle of Pajama Time. Colin caught up with Dr. Derrick Hamilton at Juniper Health and Kathy Halcomb at White House Clinics to learn about using NextGen Healthcare’s Ambient Assist to help clinicians reclaim nights and weekends – and boost staff morale. Read more…

Life Sciences Today Podcast: Increasing Productivity in Clinical Research. Yendou CEO Zina Sarif chatted with Danni Lieberman about creating value for trial sponsors and research organizations through thoughtful site selection and relationship management. Read more…

CIO Podcast: Implementing Oracle. Michael Archuleta at Colorado’s Mt. San Rafael Hospital and Clinics joined John to discuss what drew the health system to Oracle Health and its Clinical AI Agent, as well as how they’re preparing for EHR implementation success. Read more…

Revolutionizing Healthcare with Agentic AI: The Breakthroughs Hospitals and Health Plans Can’t Afford to Overlook. We all know that AI agents are all the rage. Everyone is discussing the future of agents and how they’re going to change the world. This piece looks at agentic AI from a healthcare perspective. Read more…

AI Is Already Practicing Medicine. Is Pharma Ready? Effective use of AI for life sciences organizations means balancing governance and competency development, said William Soliman at ACMA – and it works best when it’s treated as a cross-functional priority. Read more…

Health Plan AI Has a Costly Data Problem. Unreliable provider data inflates administrative expenses and undermines ROI for digital transformation, according to Megan Schmidt at Madaket. The ideal fix is an infrastructure approach that prevents data fragmentation at its source. Read more…

How Modularity Is Rebuilding the Healthcare AI Stack. Danish engineering has long favored modular parts that interlock cleanly (the country is the home of LEGO, after all). Composable AI systems would let organizations focus on specific use cases instead of rebuilding the same foundations repeatedly, said Andreas Cleve at Corti. Read more…

When Lean Six Sigma Meets AI: How Hospitals Redefine Process Excellence. AI is transforming Lean Six Sigma from a periodic improvement exercise into a system of continuous intelligence, said Albert Adusei Brobbey at SigmaSenseAI. In this sense, AI is a natural evolution of Lean Six Sigma and not an outright replacement. Read more…

Data Analytics and Predictive Modeling’s Role in Identifying High Risk Patients and Optimizing Care Plans. The community comments on how you can leverage data analytics and predictive modeling as part of your value-based care efforts. Hear from leading experts on how to identify high-risk patients and improve care plans. Read more…

This Week’s Health IT Jobs for February 11, 2026: UCLA Health seeks a CIO, and data management company Harmony Healthcare IT seeks a CTO. Read more…

Bonus Features for February 8, 2026: 60% of healthcare employees say ChatGPT reduces burnout; meanwhile, hospital operating margins hovered at 1.2% in 2025. Read more…

Funding and M&A Activity:

Thanks for reading and be sure to check out our latest Healthcare IT Today Weekly Roundups.

Happy Valentine’s Day! My son got my wife a box of Cinnamon Toast Crunch because he knows that’s her favorite cereal. A true gift from the heart.




Review: Framework Laptop 13

Framework's Laptop 13 isn't just a capable machine, but one you can repair and keep using indefinitely.

The post Review: Framework Laptop 13 appeared first on Make: DIY Projects and Ideas for Makers.


Launching Interop 2026

Jake Archibald reports on Interop 2026, the initiative between Apple, Google, Igalia, Microsoft, and Mozilla to collaborate on ensuring a targeted set of web platform features reach cross-browser parity over the course of the year.

I hadn't realized how influential and successful the Interop series has been. It started back in 2021 as Compat 2021 before being rebranded to Interop in 2022.

The dashboards for each year can be seen here, and they demonstrate how wildly effective the program has been: 2021, 2022, 2023, 2024, 2025, 2026.

Here's the progress chart for 2025, which shows every browser vendor racing towards a 95%+ score by the end of the year:

Line chart showing Interop 2025 browser compatibility scores over the year (Jan–Dec) for Chrome, Edge, Firefox, Safari, and Interop. Y-axis ranges from 0% to 100%. Chrome (yellow) and Edge (green) lead, starting around 80% and reaching near 100% by Dec. Firefox (orange) starts around 48% and climbs to ~98%. Safari (blue) starts around 45% and reaches ~96%. The Interop line (dark green/black) starts lowest around 29% and rises to ~95% by Dec. All browsers converge near 95–100% by year's end.

The feature I'm most excited about in 2026 is Cross-document View Transitions, building on the successful 2025 target of Same-Document View Transitions. This will provide fancy SPA-style transitions between pages on websites with no JavaScript at all.

As a keen WebAssembly tinkerer I'm also intrigued by this one:

JavaScript Promise Integration for Wasm allows WebAssembly to asynchronously 'suspend', waiting on the result of an external promise. This simplifies the compilation of languages like C/C++ which expect APIs to run synchronously.

Tags: browsers, css, javascript, web-standards, webassembly, jake-archibald


How Generative and Agentic AI Shift Concern from Technical Debt to Cognitive Debt

This piece by Margaret-Anne Storey is the best explanation of the term cognitive debt I've seen so far.

Cognitive debt, a term gaining traction recently, instead communicates the notion that the debt compounded from going fast lives in the brains of the developers and affects their lived experiences and abilities to “go fast” or to make changes. Even if AI agents produce code that could be easy to understand, the humans involved may have simply lost the plot and may not understand what the program is supposed to do, how their intentions were implemented, or how to possibly change it.

Margaret-Anne expands on this further with an anecdote about a student team she coached:

But by weeks 7 or 8, one team hit a wall. They could no longer make even simple changes without breaking something unexpected. When I met with them, the team initially blamed technical debt: messy code, poor architecture, hurried implementations. But as we dug deeper, the real problem emerged: no one on the team could explain why certain design decisions had been made or how different parts of the system were supposed to work together. The code might have been messy, but the bigger issue was that the theory of the system, their shared understanding, had fragmented or disappeared entirely. They had accumulated cognitive debt faster than technical debt, and it paralyzed them.

I've experienced this myself on some of my more ambitious vibe-code-adjacent projects. I've been experimenting with prompting entire new features into existence without reviewing their implementations and, while it works surprisingly well, I've found myself getting lost in my own projects.

I no longer have a firm mental model of what they can do and how they work, which means each additional feature becomes harder to reason about, eventually leading me to lose the ability to make confident decisions about where to go next.

Via Martin Fowler

Tags: definitions, ai, generative-ai, llms, ai-assisted-programming, vibe-coding
