The PM’s Playbook for Shipping AI Features That Actually Work in Production

The demo to production Death Valley

If you’ve worked on an AI feature, you know the feeling. You start building something you’re excited about and set launch timelines. The model spits out a perfect response, the prototype works magically, and everybody in the room is mentally calculating how big this product will be at launch. I’ve been in that room many times, and it’s fun.

Then you try to test before you ship.

Latency spikes to 10 seconds on mobile. The model starts hallucinating on edge cases that happen to represent 15% of actual user queries. Your A/B test shows no statistically significant engagement lift because the variance in AI outputs makes traditional hypothesis testing basically meaningless. The safety team flags 340 failure cases in the first week, and you’re now debugging nondeterministic cases that fail in creative, novel ways every single day.

More often than not, it’s not a model problem but an engineering discipline problem. Shipping an AI product is very different from shipping traditional software. I’ve figured this out the hard way, and this playbook shares what I learned.

Latency budgets

Every AI feature comes with a latency tax. Large language model inference takes time. We’re talking 500 milliseconds to 5 or even 50 seconds depending on model size, input length, and infrastructure setup. For consumer products where people expect sub-200-millisecond interactions, this is a hard constraint you have to design around.

The mistake I see most often is teams measuring only p50 latency. A feature with 800 milliseconds p50 sounds fine until you discover the p90 is 15 seconds. That means 10 in every 100 users sit there waiting for 15+ seconds. At scale, that’s thousands of terrible experiences per day.
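To make the gap between median and tail latency concrete, here is a minimal sketch of a nearest-rank percentile calculation. The `percentile` helper and the sample latencies are illustrative assumptions, not measurements from a real system.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Hypothetical request latencies in milliseconds (made up for illustration).
const latenciesMs = [600, 650, 700, 750, 800, 800, 850, 900, 15000, 16000];

const latencySummary = {
  p50: percentile(latenciesMs, 50), // the median looks healthy
  p90: percentile(latenciesMs, 90), // the tail tells a different story
  p99: percentile(latenciesMs, 99),
};
console.log(latencySummary);
```

With these made-up samples the median stays at 800 milliseconds while p90 lands at 15 seconds, which is the pattern a p50-only dashboard hides.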

The way I think about it is you define your latency budget by interaction type, not globally: Synchronous interactions, where the user is staring at a spinner, need to resolve under 1 second. Progressive interactions, where output streams token by token, need first token in under 500 milliseconds and full response under 5 seconds. Asynchronous interactions, where the user keeps doing other stuff, can take up to 20 seconds with a progress indicator.
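One way to keep those budgets honest is to encode them as data and check measurements against them. The sketch below uses the numbers from the paragraph above; the function name, the shape of `measured`, and treating each limit as inclusive are assumptions made for illustration.

```javascript
// Budgets from the text above, in milliseconds, keyed by interaction type.
const LATENCY_BUDGETS_MS = {
  synchronous: { total: 1000 },
  progressive: { firstToken: 500, total: 5000 },
  asynchronous: { total: 20000 },
};

// Returns a list of readable violations; an empty list means within budget.
// `measured` is a hypothetical shape: { firstToken?: number, total: number }.
function checkLatencyBudget(interactionType, measured) {
  const budget = LATENCY_BUDGETS_MS[interactionType];
  if (!budget) throw new Error(`unknown interaction type: ${interactionType}`);
  const violations = [];
  for (const [metric, limitMs] of Object.entries(budget)) {
    const value = measured[metric];
    if (typeof value !== "number") {
      violations.push(`${metric}: no measurement`);
    } else if (value > limitMs) {
      violations.push(`${metric}: ${value} ms exceeds ${limitMs} ms`);
    }
  }
  return violations;
}

const withinBudget = checkLatencyBudget("progressive", { firstToken: 300, total: 4000 });
const slowFirstToken = checkLatencyBudget("progressive", { firstToken: 900, total: 4000 });
console.log(withinBudget, slowFirstToken);
```

Feeding this check with tail percentiles rather than averages keeps it aligned with the p90 concern raised earlier.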

You also need to measure cold starts separately. The first request after a model loads into memory can be 10 times slower than subsequent requests, and if your traffic is bursty, cold starts will disproportionately punish your most engaged users arriving during peak hours.

You also need to budget for the full pipeline, not just inference. A typical AI feature pipeline includes input preprocessing (tokenization, context assembly, and prompt construction), model inference, output postprocessing (parsing, formatting, safety filtering, and so on), and delivery of the full response, and the time spent in each stage adds up. Optimizing inference while ignoring the rest is like tuning your engine while driving on flat tires.

Lastly, use streaming aggressively for generative features. Pushing tokens to the user as they’re generated instead of waiting for the full response changes how users perceive latency. A four-second response that starts appearing at 300 milliseconds feels dramatically faster than one that pops in all at once. Perception is reality when it comes to user experience.

Designing fallbacks

Traditional software fails in boring, predictable ways. AI features fail in novel, unpredictable, and occasionally creative ways. I once saw a model respond to a product recommendation query with a poem about loneliness. Your fallback strategy needs to be considerably more sophisticated than a try/catch block.

I think about fallbacks as a hierarchy. First, model fallback: When your primary model fails, drop to a simpler, faster, and more reliable model. Most failure cases get handled without the user ever knowing. Second, cache fallback: For queries similar to stuff you’ve seen before, serve a cached response. Third, template fallback: When generation fails completely, fall back to prewritten templates. Degraded beats dead every time. Fourth, graceful omission: Sometimes the best fallback is to simply not show the AI feature at all rather than showing a broken version.
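The hierarchy above can be sketched as an ordered list of handlers that are tried in turn. Everything here is a simplified assumption: the handlers are synchronous stubs (real model calls would be asynchronous), the cache does exact matching rather than similarity lookup, and all names are hypothetical.

```javascript
// Tries each fallback level in order and returns the first usable output.
// A level "fails" if its handler throws or returns an empty result.
function respondWithFallbacks(query, levels) {
  for (const { name, handler } of levels) {
    try {
      const output = handler(query);
      if (typeof output === "string" && output.trim() !== "") {
        return { level: name, output };
      }
    } catch (err) {
      // Swallowed here; a real system would log the error with the level name.
    }
  }
  // Graceful omission: signal the UI to hide the AI feature entirely.
  return { level: "omit", output: null };
}

// Hypothetical handlers standing in for real model, cache, and template calls.
const cache = new Map([["best running shoes", "Cached: top picks for runners"]]);
const levels = [
  { name: "primary-model", handler: () => { throw new Error("timeout"); } },
  { name: "simpler-model", handler: () => "" }, // empty output counts as failure
  { name: "cache", handler: (q) => cache.get(q) },
  { name: "template", handler: (q) => (q.length < 40 ? `Popular results for "${q}"` : null) },
];

console.log(respondWithFallbacks("best running shoes", levels));
console.log(respondWithFallbacks("hiking boots", levels));
```

Returning the level name alongside the output makes it possible to track how often each fallback fires, which is useful when deciding whether the primary model is healthy.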

The design principle underneath all of this is that users should never encounter an unhandled AI failure. Every failure mode maps to a specific level, and transitions between levels should be invisible whenever you can manage it.

Quality measurement

Quality in traditional software is binary. The button works or it doesn’t. AI feature quality is continuous and subjective, and it changes depending on context. I’ve landed on a four-layer quality pyramid.

The foundation is safety, and it’s nonnegotiable. Does the output contain harmful content, PII, or made-up facts? This layer is binary, and you measure it with automated classifiers running against 100% of outputs.

The second layer is factual correctness, which is domain specific. Is the output actually right? For a coding assistant that means generated code compiles and passes tests. For a writing tool it means grammatical, stylistically appropriate output. You measure this with domain specific evaluation suites.

The third layer is usefulness, and it’s user centered. Did the person actually benefit? Track acceptance rate, edit distance, time to task completion, and repeat usage. This is where traditional product metrics meet AI specific ones.

The fourth layer is delight, which is experimental. Does the output feel good? Hardest to measure but often most important for adoption. Sometimes the numbers say the feature works but users’ guts say it doesn’t. This layer catches that gap.

A/B testing AI features

A/B testing AI features is fundamentally harder than traditional features because AI outputs are nondeterministic. The same user doing the same thing twice might get different outputs, introducing variance that traditional frameworks weren’t built to handle.

The core challenge is that intratreatment variance inflates the sample size you need for statistical significance, often by three to five times. If you’re running your AI experiment with normal sample size assumptions, you’re probably looking at noise and calling it signal.
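A rough way to see why variance drives sample size is the standard two-sample approximation for comparing means, where the required sample per arm scales linearly with variance. The sketch below assumes a mean-based metric, equal variance in both arms, and that nondeterministic outputs simply multiply the observed variance; the numbers are hypothetical.

```javascript
// Approximate per-arm sample size for comparing two means:
//   n = 2 * (zAlpha + zBeta)^2 * variance / minDetectableEffect^2
// Defaults correspond to a two-sided 5% significance level and 80% power.
function sampleSizePerArm(variance, minDetectableEffect, zAlpha = 1.96, zBeta = 0.84) {
  if (variance <= 0 || minDetectableEffect <= 0) {
    throw new Error("variance and minDetectableEffect must be positive");
  }
  return (2 * (zAlpha + zBeta) ** 2 * variance) / minDetectableEffect ** 2;
}

// Hypothetical numbers: baseline metric variance of 1.0 and a target lift of 0.05.
const baselineN = sampleSizePerArm(1.0, 0.05);
// Assume nondeterministic outputs triple the observed variance.
const inflatedN = sampleSizePerArm(3.0, 0.05);

console.log(Math.ceil(baselineN), Math.ceil(inflatedN), inflatedN / baselineN);
```

Under these assumptions, tripling the variance triples the traffic needed to detect the same lift, so an experiment sized with the baseline variance would be underpowered.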

Then there’s the metric selection problem. A chatbot generating entertaining but factually wrong responses might show amazing engagement numbers while actively misleading users. You have to measure engagement and quality together. “Engaged interactions where quality score exceeds threshold” is more meaningful than raw engagement alone.
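A combined metric like the one quoted above can be computed directly from interaction logs. This is a minimal sketch; the field names `engaged` and `qualityScore`, the 0.7 threshold, and the sample data are all hypothetical.

```javascript
// Share of interactions that were both engaged AND above the quality threshold.
function qualityGatedEngagementRate(interactions, qualityThreshold) {
  if (interactions.length === 0) return 0;
  const qualifying = interactions.filter(
    (i) => i.engaged && i.qualityScore > qualityThreshold
  );
  return qualifying.length / interactions.length;
}

// Hypothetical interaction log.
const interactions = [
  { engaged: true, qualityScore: 0.9 },
  { engaged: true, qualityScore: 0.4 }, // entertaining but low quality
  { engaged: false, qualityScore: 0.95 },
  { engaged: true, qualityScore: 0.8 },
];

const rawEngagement = interactions.filter((i) => i.engaged).length / interactions.length;
const gatedEngagement = qualityGatedEngagementRate(interactions, 0.7);
console.log({ rawEngagement, gatedEngagement });
```

In this toy data, raw engagement is 0.75 but only 0.5 of interactions are both engaged and above the quality bar, which is the gap the combined metric is meant to expose.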

The temporal problem matters too. AI feature value changes over time as users learn how to work with it. Short experiments will underestimate long-term value if there’s a learning curve, or overestimate it if there’s a novelty bump.

My practical guidance: budget two to three times more time and traffic for AI experiments than traditional ones. Lean on Bayesian methods as they handle high variance better. And always pair quantitative tests with qualitative research. Ten user interviews will surface failure modes that no amount of statistical analysis will catch.

Model drift monitoring

Model drift is the slow, invisible rot of AI output quality over time, and there are multiple culprits.

Data drift happens because the world changes and user behavior evolves. A model trained on 2024 data performs worse on 2026 queries referencing new concepts, slang, and cultural moments.

Provider drift happens because third-party APIs change without your consent. OpenAI acknowledged that GPT-4’s behavior shifted measurably between March and June 2023, and Stanford researchers documented significant performance swings. The fix: Pin your model versions so updates happen on your schedule, after your testing.

Evaluation drift is the subtlest form. Even your quality metrics can go stale: the evaluation criteria that made sense at launch might become inadequate as usage patterns shift and user expectations change. Quarterly reviews of your evaluation suites are essential.

At minimum you need daily automated quality evaluations on 1% to 5% of production traffic, weekly analysis of input distribution characteristics, and monthly human evaluation of 100 to 500 examples. Shipping an AI feature without drift monitoring is like deploying a service without alerting. You won’t know it’s broken until your users tell you, and by then they’re angry.
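Sampling a fixed share of production traffic for evaluation is often done with a deterministic hash so that the same request always gets the same decision. The sketch below is one way to do that, assuming each request has a string ID; the hash is an FNV-1a style function over UTF-16 code units, and the request IDs are hypothetical.

```javascript
// 32-bit FNV-1a style hash of a string, returned as an unsigned integer.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Deterministically selects roughly `sampleRate` of requests for evaluation.
// The same request ID always gets the same decision, so reruns are reproducible.
function shouldEvaluate(requestId, sampleRate) {
  const bucket = fnv1a(requestId) / 2 ** 32; // value in [0, 1)
  return bucket < sampleRate;
}

// Hypothetical request IDs, sampled at a 5% rate.
const ids = Array.from({ length: 10000 }, (_, i) => `req-${i}`);
const sampled = ids.filter((id) => shouldEvaluate(id, 0.05)).length;
console.log(`sampled ${sampled} of ${ids.length}`);
```

Hash-based sampling avoids storing per-request decisions and makes it easy to raise the rate later: requests selected at 1% remain selected at 5%.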

Evaluation frameworks

How do you know if your AI feature is good enough? You need two fundamentally different approaches, and you genuinely need both.

Automated evaluation gives you speed. Build a golden dataset of 500 to 2,000 labeled examples, train a classifier or use a capable model as judge, and validate against human judgment quarterly targeting 85% agreement. Automated evals chew through thousands of examples per hour, making them essential for velocity. The pitfall: They miss novel failure modes not in the training data.
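Validating an automated evaluator against human judgment comes down to comparing labels on the same examples. Here is a minimal sketch using simple percent agreement; the label values and the tiny sample are hypothetical, and a production check might prefer a chance-corrected statistic.

```javascript
// Fraction of examples where the automated judge matches the human label.
function agreementRate(pairs) {
  if (pairs.length === 0) throw new Error("no labeled pairs");
  const matches = pairs.filter((p) => p.judge === p.human).length;
  return matches / pairs.length;
}

const TARGET_AGREEMENT = 0.85; // target from the text above

// Hypothetical labels for a tiny validation sample.
const validationSample = [
  { judge: "pass", human: "pass" },
  { judge: "pass", human: "fail" },
  { judge: "fail", human: "fail" },
  { judge: "pass", human: "pass" },
];

const rate = agreementRate(validationSample);
const verdict = rate >= TARGET_AGREEMENT ? "judge is calibrated" : "recalibrate the judge";
console.log(rate, verdict);
```

Here three of four labels match, so agreement is 0.75 and the check recommends recalibration before trusting the automated scores.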

Human evaluation catches what automation misses. Structure it with five to seven evaluators mixing domain experts and representative users. Use a consistent rubric covering accuracy, helpfulness, tone, completeness, and safety. Run weekly during development, monthly in production. The trade-offs: expensive at $15 to $30 per example, slow with 24 to 72 hour turnaround, and subject to human biases. Manage by rotating evaluators and capping sessions at two hours.

The model as judge approach is an increasingly viable middle ground. Judging quality is often easier than generating it, which means a model can reliably evaluate outputs even for tasks where it couldn’t produce them itself. Use it for high-volume evaluation but always validate against human judgment.

Graceful degradation and prompt engineering

Graceful degradation means when capabilities decrease, the experience gets worse smoothly instead of falling off a cliff. Design for capability levels, not binary states. Define four to five levels with specific behaviors at each. For example, for an AI writing assistant: Level 5 is full capability with real-time suggestions, tone adjustment, and structure recommendations. Level 4 is delayed suggestions appearing after a two- to three-second pause because latency is up. Level 3 is basic suggestions only like grammar and spelling with no style feedback. Each level is a deliberate design decision, not an accident.
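Capability levels become easier to reason about when the selection logic is explicit. The sketch below covers only the three levels described above; the health signals, the 2000 millisecond threshold, and the function name are illustrative assumptions.

```javascript
// The three levels spelled out in the text; lower levels are not defined there.
const CAPABILITY_LEVELS = {
  5: "real-time suggestions, tone adjustment, structure recommendations",
  4: "delayed suggestions after a two- to three-second pause",
  3: "basic grammar and spelling suggestions only",
};

// Picks a level from hypothetical health signals. The 2000 ms threshold and
// the `styleModelAvailable` flag are assumptions made for this example.
function pickCapabilityLevel({ p90LatencyMs, styleModelAvailable }) {
  if (!styleModelAvailable) return 3;
  if (p90LatencyMs > 2000) return 4;
  return 5;
}

const level = pickCapabilityLevel({ p90LatencyMs: 3500, styleModelAvailable: true });
console.log(`Level ${level}: ${CAPABILITY_LEVELS[level]}`);
```

Keeping the level table separate from the selection function means the behaviors at each level can be reviewed as a design artifact, independent of the thresholds that trigger them.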

Make degradation invisible when possible. Users shouldn’t see a “broken” experience. They see a less detailed one. That’s a huge difference psychologically. However, when the degradation is significant enough that users will notice, proactive communication like “AI suggestions are temporarily limited” builds far more trust than silently pushing poor-quality outputs.

Prompt engineering in production is software engineering. In production, prompts are code, and they need version control, testing, monitoring, and maintenance. Version control every prompt. Parameterize prompts; don’t hardcode context. Production prompts should be templates with clearly defined injection points for user context, system state, and dynamic instructions. This makes them testable because you can inject known inputs and verify outputs, and it makes them maintainable because changing how you handle context shouldn’t require rewriting the entire prompt from scratch.
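A parameterized prompt can be as simple as a template with named placeholders and a renderer that refuses to run with missing values. This is a minimal sketch; the `{{name}}` placeholder syntax, the template text, and the version label are hypothetical choices, not a specific library's format.

```javascript
// Fills {{name}} placeholders in a prompt template. Throws if a placeholder has
// no value, so a missing injection point fails loudly instead of silently.
function renderPrompt(template, values) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) => {
    if (!Object.prototype.hasOwnProperty.call(values, name)) {
      throw new Error(`missing prompt value: ${name}`);
    }
    return String(values[name]);
  });
}

// Hypothetical versioned template with three injection points.
const SUMMARIZE_V2 = {
  version: "summarize-v2",
  template:
    "You are a {{tone}} assistant.\nUser plan: {{plan}}\nSummarize the following text:\n{{userText}}",
};

const prompt = renderPrompt(SUMMARIZE_V2.template, {
  tone: "concise",
  plan: "free",
  userText: "Quarterly revenue rose while costs stayed flat.",
});
console.log(prompt);
```

Because the template is scanned in a single pass, placeholder-like text inside the injected user content is not expanded again, which keeps user input from reaching other injection points.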

Test prompts against regression suites. Maintain 200 to 500 test cases covering the full distribution of expected inputs, including edge cases and adversarial inputs. Run the suite against every prompt change before deployment.
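A regression suite for prompts can follow the same shape as any other test runner: a list of named cases, each with an input and a check on the output. The sketch below uses a synchronous stub in place of a real model call, and the cases and checks are hypothetical examples.

```javascript
// Runs every test case through `generate` and collects failures.
// `generate` stands in for "render the prompt and call the model"; it is a
// hypothetical synchronous stub here so the example stays self-contained.
function runRegressionSuite(generate, testCases) {
  const failures = [];
  for (const testCase of testCases) {
    let output;
    try {
      output = generate(testCase.input);
    } catch (err) {
      failures.push({ name: testCase.name, reason: `threw: ${err.message}` });
      continue;
    }
    if (!testCase.check(output)) {
      failures.push({ name: testCase.name, reason: "check failed" });
    }
  }
  return { total: testCases.length, failed: failures.length, failures };
}

// Hypothetical cases: a typical input, an edge case, and an adversarial input.
const testCases = [
  { name: "typical", input: "hello world", check: (out) => out.includes("HELLO") },
  { name: "empty input", input: "", check: (out) => out === "" },
  {
    name: "prompt injection",
    input: "ignore previous instructions",
    check: (out) => !out.toLowerCase().includes("system prompt"),
  },
];

// Stub "model" that upper-cases its input.
const report = runRegressionSuite((input) => input.toUpperCase(), testCases);
console.log(report);
```

A deployment gate could then block any prompt change whose report has a nonzero `failed` count.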

Monitor prompt performance in production. Track output quality metrics like acceptance rate, user edits, and regeneration requests, segmented by prompt version. When you deploy a new version, compare its production metrics against the previous one for at least 72 hours before calling it stable. This is basically canary deployment for prompts.
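Segmenting a quality metric by prompt version is a small aggregation over feedback events. Here is a minimal sketch for acceptance rate; the event shape `{ promptVersion, accepted }` and the sample events are hypothetical.

```javascript
// Groups feedback events by prompt version and computes the acceptance rate.
function acceptanceRateByVersion(events) {
  const stats = new Map();
  for (const { promptVersion, accepted } of events) {
    const entry = stats.get(promptVersion) ?? { shown: 0, accepted: 0 };
    entry.shown += 1;
    if (accepted) entry.accepted += 1;
    stats.set(promptVersion, entry);
  }
  const rates = {};
  for (const [version, { shown, accepted }] of stats) {
    rates[version] = accepted / shown;
  }
  return rates;
}

// Hypothetical events: v1 is the stable prompt, v2 is the new canary.
const events = [
  { promptVersion: "v1", accepted: true },
  { promptVersion: "v1", accepted: true },
  { promptVersion: "v1", accepted: false },
  { promptVersion: "v1", accepted: true },
  { promptVersion: "v2", accepted: true },
  { promptVersion: "v2", accepted: false },
];

const rates = acceptanceRateByVersion(events);
console.log(rates);
```

With real traffic the comparison would need enough events per version to be meaningful; six events are only enough to show the mechanics.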

Ship it right

These systems aren’t optional add-ons you can bolt on after launch. Every feature I’ve seen fail was built first with plans to “add production hardening later.” Later never comes.

AI features are probabilistic and nondeterministic, and they change over time without anyone touching them. Build these systems, staff them properly, and treat them with the same seriousness you’d give your core infrastructure. The gap between demo and production is wide, but it’s absolutely crossable if you build the right bridge.

Note: The research work pertaining to this article was done in a personal capacity. Views are my own and do not reflect my employer’s views in any way.




10 Brand-New WordPress.com Features From Radical Speed Month

Radical Speed Month has wrapped! For one month, Automatticians built in the open, shipped fast, and shared their work. The result is a stack of projects that make WordPress.com more flexible, more useful, and more connected than ever.

Look at the projects below, and you’ll see that WordPress.com is becoming a better place to write, build, sell, prototype, repurpose, and tinker.

WordPress.com has always evolved to make publishing, building, and managing sites easier. What’s shifting now is context. AI gets a lot more useful the moment it understands your site, your content, your workflow, and what you’re actually trying to make.

Note: Many of these features are still in beta and are actively evolving. 

What is Radical Speed Month?

Radical Speed Month is a creative experiment led by developers, designers, marketers, and many others across the Automattic team to build, ship, and test WordPress.com features faster.

Rather than chasing perfection, the goal was to move quickly before ideas got too precious, share what was in progress, and iterate in real time based on your feedback.

Radical Speed Month produced a multitude of experiments, features, and prototypes. 

Here are 10 that stood out:

  1. Workspace
  2. WordCamp Agent
  3. Blueprints Gallery
  4. Milestones
  5. Easy Site Editor
  6. Lately
  7. A short-form social media theme 
  8. Social Feeds
  9. Write
  10. Wapuu Studio

These projects differ from one another, but the same patterns keep showing up: publishing is getting lighter, development is getting faster, creator workflows are becoming more flexible, and AI is moving from “write this for me” to “help me work.”

The projects below span publishing, development, AI, and creator workflows. Together, they offer a small snapshot of what teams across Automattic explored during Radical Speed Month.

1. Workspace

Built by Artur Piszek https://wordpress.com/blog/author/arturpiszek/

What is WordPress Workspace? WordPress Workspace is a desktop app that brings WordPress Agent into your Mac workflow.

The highlight here is context. Instead of bouncing between a browser, notes app, media library, an AI assistant, and file uploads, Workspace gives people a way to work with WordPress within the flow of their day.

You can ask questions about your site, dictate thoughts, upload screenshots and images, transform selected text, and get help without rebuilding the same context over and over.

Why it stands out: your WordPress site already contains the shape of your work – posts, pages, media, products, audience context, drafts, and ideas. Workspace treats that context as something useful while work is happening, not only after something is ready to publish.

2. WordCamp Agent 

Built by Artur Piszek https://wordpress.com/blog/author/arturpiszek/

What is WordCamp Agent? WordCamp Agent is a Telegram assistant built for WordCamp attendees. It can help you plan your trip, browse the conference schedule, remember your interests and preferences, save notes during sessions, and turn those notes into a post or recap later.

The highlight is context. WordCamp Agent is powered by WordPress Guidelines, the system under the hood that stores agent-facing knowledge directly inside WordPress: instructions, memories, skills, and artifacts.

Why it stands out: Useful AI depends on useful context. A one-off prompt can help with one task, but a structured memory layer can support many workflows over time. WordCamp Agent shows what that looks like in practice: a WordPress-powered assistant that can remember, respond, and help people move from information to action.

3. Blueprints Gallery

Built by Kateryna Kodonenko https://wordpress.com/blog/author/katinthehatsite/

What is the Blueprints Gallery? The Blueprints Gallery is a feature in WordPress Studio that lets developers launch reusable WordPress environments from preconfigured blueprints.

The highlight is practical speed. Instead of rebuilding the same local setup again and again, users can launch reusable WordPress environments from preconfigured blueprints.

That reduces setup work and makes it easier to prototype, test, and share development patterns.

Why it stands out: it removes repeated work without asking developers to give up flexibility. If the first hour of a project is usually setup, configuration, and remembering what worked last time, reusable environments change the pace of the work.

4. Milestones

Built by Jacopo Tomasone https://wordpress.com/blog/author/copons/

What are WordPress.com Milestones? WordPress.com Milestones is a feature that celebrates progress and achievements inside WordPress.com, helping creators see momentum as they publish.

Publishing frequently is a habit worth cultivating – and this feature encourages just that.

Why it stands out: better tools should not make people feel removed from their work. They should help people stay on track.

5. Easy Site Editor

What is Easy Site Editor? Easy Site Editor is an experiment that makes editing a WordPress site feel more approachable.

The goal is to make the path from idea to update easier to follow.

Why it stands out: WordPress has depth, flexibility, and extensibility. The challenge is helping more people access that power without needing to understand everything at once.

This points toward clearer site-building workflows for creators updating pages, agencies working with clients, businesses iterating on offers, and new users getting their first site into shape.

AI can suggest, generate, and automate. But the editing experience still has to feel understandable.

6. Lately

Built by Andrew Spittle https://wordpress.com/blog/author/andrewspittle/

What is Lately? Lately is a messaging-first publishing experiment that lets you create private weekly letters by chatting with WordPress Agent.

Instead of asking users to begin inside a traditional editor, Lately lets them interact with WordPress Agent through a lightweight conversational workflow to create private weekly letters.

The highlight is capture. Ideas do not always arrive when someone is sitting in front of a blank editor. They show up in messages, notes, quick reflections, and half-formed thoughts.

Lately explores what happens when WordPress meets people closer to that moment.

Why it stands out: it connects lightweight capture and AI-assisted shaping back to a publishing system the user controls.

7. Theme for Short-Form Blogging

Built by Dave Martin https://wordpress.com/blog/author/lessbloat/

What is the theme for short-form blogging? This short-form social media theme is a WordPress.com theme for lightweight, social-style publishing.

The highlight is immediacy. Not every post needs to be a long essay. Sometimes people want to publish a thought, a link, an image, an update, a reblog, or a quick reflection.

Social platforms made that behavior feel natural, but they also trained creators to build on rented feeds.

Why it stands out: this project brings some of that casual publishing energy back to a space the creator owns.

8. Social Feeds

What is Social Feeds? Social Feeds brings Bluesky, Mastodon, and the wider Fediverse into the WordPress.com Reader.

The highlight is connection. Instead of jumping between social apps, users can follow people, read posts, react, reply, and publish from one place inside WordPress.com.

For creators, the useful part is choice. A quick thought can stay short and social. But when it grows into something bigger, WordPress.com gives it somewhere to go: a post, a site, an archive, and a home the creator owns.

Why it stands out: Social Feeds fits neatly with the short-form blogging theme. Together, they point to a more flexible publishing loop: read, react, post, expand, repurpose, and publish across formats without giving up ownership.

9. Write

Built by Jamie Marsland, Allison Levine, and Kim Brown. 
https://www.pootlepress.com/author/jamie-marsland/
https://wordpress.com/blog/author/allilevine/

What is Write? Write is a simplified posting experience built for writers on WordPress.com. It gives you one page, a blinking cursor, and only the formatting tools you need when you need them.

The highlight is focus. Instead of starting inside the full block editor, Write gives creators a cleaner surface for getting words down quickly. The interface stays intentionally minimal, then brings in formatting when it helps and gets out of the way when it doesn’t.

For writers, the useful part is flow. Write posts are still real WordPress posts, so they live alongside your other content, work with your theme, and can be opened in the block editor later if you want more control.

Why it stands out: Write fits with the broader push toward lighter, faster publishing workflows. Alongside Lately, the short-form blogging theme, and Social Feeds, WordPress.com is exploring more ways to help people capture ideas, publish quickly, and stay in control of where their work lives.

10. Wapuu Studio

What is Wapuu Studio? Wapuu Studio is an AI-powered tool that lets you create and share your own custom Wapuu with the WordPress community.

The highlight is creativity. Simply describe the Wapuu you’re imagining—its mood, outfit, theme, colors, or tiny adventure—and Wapuu Studio turns that idea into a unique character. You can browse community creations, remix ideas, and share your own designs.

Why it stands out: Not every RSM project is about productivity. Wapuu Studio shows how AI can also make it easier to create, play, and participate in the WordPress community. It transforms a simple prompt into something visual, personal, and shareable while celebrating one of WordPress’s most recognizable mascots.

Bonus: Clips (coming soon)

What is Clips? Clips is a feature that turns WordPress.com posts into short-form video.

The highlight is repurposing. Written content often needs to be adapted for social, video, promotional, or campaign materials. But repurposing takes time, tools, and a separate production process.

Clips explores a simpler model: start with the post, then generate short-form video from the same source material.

Why it stands out: A blog post becomes more than a final output. It becomes a source of truth that can feed other formats.

Check back on WordPress.com/blog to catch Clips when it becomes available. 

Radical Speed Month projects at a glance

| Project | What it is | What it enables |
| --- | --- | --- |
| WordPress Workspace | Desktop app for Mac | Contextual help, media capture, text transforms, and quick questions about your site. |
| WordPress Guidelines / WordCamp Agent | Agent-ready site context | Instructions, memories, skills, and reusable knowledge for AI-assisted workflows. |
| Blueprints Gallery | Local development environments | Fast local setups, consistent stacks, and quicker prototyping. |
| WordPress.com Milestones | Progress and motivation | Habit formation, momentum, and achievement tracking. |
| Easy Site Editor | Simplified site editing | Lower cognitive load, faster iteration, and clearer editing paths. |
| Lately | AI publishing by message | Ambient writing, conversational drafts, and lightweight publishing. |
| Short-form social media theme | Lightweight owned publishing | Fast micro-updates and social-style posting with ownership intact. |
| Social Feeds | Open social publishing | Follow, read, react, reply, and publish across Bluesky, Mastodon, and the Fediverse from one place. |
| Write | Simplified posting | Focused writing, minimal formatting, and real WordPress posts that can open in the block editor. |
| Wapuu Studio | AI-powered character creator | Generate and share custom Wapuus using simple text prompts. |
| Clips (Coming Soon) | AI-powered content repurposing | Turn WordPress posts into short-form videos from the same source content. |

Start exploring what WordPress.com makes possible

Radical Speed Month showed the range of ideas Automatticians are bringing to life on WordPress.com, whether it’s a cleaner way to write, a faster way to prototype, a more connected Reader, a custom Wapuu, or new ways to publish in formats that feel natural.

Some of these tools are built for work. Others are personal, creative, experimental, or somewhere in between. 

Many of these features are available to try now on WordPress.com paid plans.

If you’re new to WordPress.com, compare plans to get access to the latest tools and find the ideal setup for your site.









Download video: http://en-blog.files.wordpress.com/2026/05/clips-trial-v11.mp4

Creating Memorable Web Experiences: A Modern CSS Toolkit

I love the fact that CSS is finally reclaiming control over visual interactions, taking charge of the styling, the animation, and the accessibility exactly as it should. Today, native browser capabilities allow us to move the heavy lifting away from the JavaScript main thread and closer to the GPU. By letting the browser’s engine optimize performance under the hood, we save energy and processing power while building code that is robust, accessible, and independent of external libraries that might deprecate tomorrow.

We have 3D, modern layout techniques, clip-paths, transforms, custom properties, scroll-driven animations, view-transitions, @property — and we can animate almost anything, even to auto-height!

And, of course, there’s SVG, which isn’t new, but allows us to build entire websites through illustrations and animations. Take the example below: it’s responsive, lightweight, accessible, and powered primarily by CSS Grid + SVG.

We can even build an entire video game including the UI using only SVG:

What follows is not a complete guide to modern CSS, but an opinionated selection of techniques I reach for when I want a site to feel alive and be remembered. There are many ways to create memorable experiences. Sometimes it’s as simple as a form that completes smoothly. But here I’m interested in the expressive end of the spectrum.

Motion as Communication: Defining Your Intent

Before we dive into the technical side, I want to clarify something: we shouldn’t move things just because we can.

Everything communicates, and our animations are no exception. We must take the time to design movements that support the message we want to convey in order to keep our intents tightly scoped without overdoing it.

Here’s a methodology I use when planning the design and animation of a site.

Imagine we’re working on a project for a nature event focused on mushrooms. The design language changes completely depending on the “vibe”: selling a “Psychedelic Mushroom Rave” is worlds apart from a “Spiritual Mushroom Retreat” focused on ancestral medicine.

Every design decision communicates. I like to create what I call keyword lists to define my intent and scope. For example, I might break things down into different options:

Option A: The Psychedelic Event

  • Visuals: Colorful, saturated, high-contrast, illustrations, distortions
  • Movement: Fast, frantic, unpredictable, morphing, rhythmic, synced loops, hypnotic
  • Feeling: Fun, chaotic, energetic, stimulating, surprising
  • Typography: Funk, “psych-rock”
  • Style References: Pop Art, 60s/70s op art, rave flyers
  • Actions: Dancing
  • Extras: Emojis, films (e.g., Fear and Loathing in Las Vegas)

Option B: The Spiritual Retreat

  • Visuals: Earth tones, neutral tones, de-saturated, photograph-heavy, nature, whitespace
  • Movement: Slow, fluid, organic, breathing, subtle parallax, smooth scrolling.
  • Feeling: Calm, serene, introspective, contemplative, safe
  • Typography: Elegant Serif, minimalist sans-serif, wide spacing, legible
  • Style References: Scandinavian design, Japanese Wabi-sabi, wellness/spa aesthetics, botanical books
  • Actions: Breathing
  • Extras: Healing sounds, film (e.g., Eat Pray Love)

This is the kind of exercise I do to guide my design and animation decisions. The lists help me decide everything from which CSS properties I plan to use to how I use them. I even share them with the client and, together, we choose a direction.

Let’s say we go with Option A and look at a few examples of what I think are essential ingredients for creating memorable user experiences.

Split Text Animations

These animations became popular thanks to the GSAP SplitText plugin. It splits text by character (or words, or lines if you like) so we can create interesting text effects, like staggered animations.

<h1 class="reveal-text">
  <span style="--i:0">H</span>
  <span style="--i:1">O</span>
  <span style="--i:2">L</span>
  <span style="--i:3">A</span>
</h1>

This approach wraps each letter in “Hola” in a span. From there, each span is inline-styled with a custom property indexing the spans in order, which is something that will get a lot easier when the sibling-index() function gains broad browser support.

But for now, each custom property value acts as a multiplier that increases an animation-delay, staggering each span. In this case we fade in each character as it moves up.

.reveal-text span {
  animation: slideUp 0.6s ease-out forwards;
  animation-delay: calc(var(--i) * 0.1s);
  display: inline-block;
  opacity: 0;
  transform: translateY(3rem);
}

@keyframes slideUp {
  to {
    opacity: 1;
    transform: translateY(0);
  }
}
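Writing the indexed spans by hand gets tedious beyond a few letters. As a rough sketch, the markup could be generated with a small JavaScript helper; `splitToSpans` and `escapeHtml` are hypothetical names, and this is not how GSAP’s SplitText works internally.

```javascript
// Escapes characters that are unsafe inside HTML text content.
function escapeHtml(char) {
  return char.replace(/[&<>"']/g, (c) => `&#${c.charCodeAt(0)};`);
}

// Wraps each character in a span carrying its index as the --i custom property.
function splitToSpans(text) {
  return Array.from(text) // iterates by code point, so surrogate pairs are not split
    .map((char, i) => `<span style="--i:${i}">${escapeHtml(char)}</span>`)
    .join("");
}

console.log(splitToSpans("HOLA"));
```

In a browser, the returned string could be assigned to the heading’s `innerHTML`, after which the CSS above staggers each span through its `--i` value. The accessibility concerns of splitting text apply to generated markup just as they do to hand-written spans.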

Accessibility is the tricky part here. The instinct is to hide all the individual spans from assistive technology with aria-hidden="true" and add a visually hidden version of the full word for screen readers:

<h1>
  <span class="sr-only">HOLA</span>
  <span aria-hidden="true" class="reveal-text">
    <span style="--i:0">H</span>
    <span style="--i:1">O</span>
    <span style="--i:2">L</span>
    <span style="--i:3">A</span>
  </span>
</h1>

.sr-only {
  position: absolute;
  width: 1px;
  height: 1px;
  padding: 0;
  margin: -1px;
  overflow: hidden;
  clip: rect(0, 0, 0, 0);
  white-space: nowrap;
  border: 0;
}

But be warned: this pattern doesn’t guarantee a good experience across all screen readers. Adrian Roselli tested GSAP’s SplitText across eight screen reader and browser combinations and found it only worked correctly in two of them. If you ship this technique, test it with real assistive technology.

If that risk feels too high, there’s a very clever alternative from Preethi worth knowing that uses the letter-spacing property. It accepts negative values that collapse characters on top of each other, hiding them without touching the DOM at all. Animate it back to 0 and you get a similar reveal effect without accessibility overhead.

What would be great is a pseudo-selector like ::nth-letter to target individual glyphs directly from CSS the way ::first-letter selects the first character. But unfortunately, there’s no ::nth-letter… at least yet.

Remember to respect the user’s motion preferences on every animation:

@media (prefers-reduced-motion: reduce) {
  .reveal-text span {
    animation: none; /* or a softer animation */
  }
}

And here we go:

This approach might not scale well when we have a lot of text and different animations we want to apply. For the psychedelic event, I wanted to try splitting text with SMIL, but it was verbose. This is the code for animating just two letters:

<svg role="img" aria-label="TODOS LOS HONGOS" viewBox="0 0 1366 938.96">
  <title>TODOS LOS HONGOS</title>
  <g aria-hidden="true">
    <text transform="rotate(-9.87 2181.107 -1635.1)" opacity="0">T
      <animate attributeName="dy" values="100; -20; 0" keyTimes="0; 0.8; 1" dur="0.4s" begin="0s" fill="freeze"/>
      <animate attributeName="opacity" from="0" to="1" dur="0.01s" begin="0s" fill="freeze"/>
    </text>
    <text transform="rotate(-8.92 2372.854 -2084.755)" opacity="0">O
      <animate attributeName="dy" values="100; -20; 0" keyTimes="0; 0.8; 1" dur="0.4s" begin="0.1s" fill="freeze"/>
      <animate attributeName="opacity" from="0" to="1" dur="0.01s" begin="0.1s" fill="freeze"/>
    </text>
    <!-- rest of letters... -->
  </g>
</svg>

Add role="img" and a <title> to the <svg>, and wrap the individual letters in <g aria-hidden="true">. That gives screen readers one clean label to read. It works well in some combinations and badly in others, so if the text is critical, don’t animate it.
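
Because every letter repeats the same two <animate> elements with a slightly later begin value, the per-letter markup could also be generated rather than typed out. This is only a sketch under assumptions: smilLetters is a hypothetical helper, the 0.1s step mirrors the two letters shown above, and the hand-tuned rotate() transforms still have to be supplied as data.

```javascript
// Hypothetical helper (not from the article): builds one SMIL-animated
// <text> element per letter. Assumes plain letters that need no escaping.
function smilLetters(letters, step = 0.1) {
  return letters
    .map(({ char, transform }, i) => {
      // Stagger each letter; Number() trims float noise such as 0.30000000000000004.
      const begin = `${Number((i * step).toFixed(2))}s`;
      return [
        `<text transform="${transform}" opacity="0">${char}`,
        `  <animate attributeName="dy" values="100; -20; 0" keyTimes="0; 0.8; 1" dur="0.4s" begin="${begin}" fill="freeze"/>`,
        `  <animate attributeName="opacity" from="0" to="1" dur="0.01s" begin="${begin}" fill="freeze"/>`,
        `</text>`,
      ].join("\n");
    })
    .join("\n");
}

console.log(
  smilLetters([
    { char: "T", transform: "rotate(-9.87 2181.107 -1635.1)" },
    { char: "O", transform: "rotate(-8.92 2372.854 -2084.755)" },
  ])
);
```

The output would go inside the <g aria-hidden="true"> group, leaving the role="img" wrapper and <title> untouched.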

Here is the complete code. It’s easier to write it when you have an AI to do it for you:

For longer text, a library like GSAP gives you more control, but the same accessibility risks we discussed earlier apply, and the results across screen readers are inconsistent:

<h1>
  <span class="splitfirst">Todos los hongos son</span>
  <span class="splitlast">mágicos</span>
</h1>

const splitFirst = SplitText.create('.splitfirst', {
  type: "chars",
});
const splitLast = SplitText.create('.splitlast', {
  type: "chars, lines",
  mask: "lines"
});

const tween = gsap.timeline()
.from(splitFirst.chars, {
  xPercent: 100,
  stagger: 0.1,
  opacity: 0,
  duration: 1, 
})
.from(splitLast.chars, {
  yPercent: 100,
  stagger: 0.1,
  opacity: 0,
  duration: 1,
});

This would be a nice approach for Option B if we had gone that route. See how “serene” things feel as the text fades in.

Masking & Clipping

The clip-path and mask properties allow us to hide portions of an element, but they work on fundamentally different principles. Clipping is a binary decision: pixels are either fully visible or completely gone, making it the right choice for clean geometric shapes, like polygons, circles, or SVG paths, where the browser can also optimize rendering. Masking, on the other hand, uses luminance or alpha channel values: white reveals, black hides, and everything in between produces partial transparency. This makes it the tool for soft edges, gradient fades, and irregular textures. Keep in mind that if you have a very complex vector shape, it might be more performant to use a mask than a vector clip-path. Sarah Drasner has a nice write-up on when it makes sense to use one over the other.

Our project is a very clear use case for clip-path. We have a circle shape that starts with clip-path: circle(0%), which makes the element invisible (the clipping circle has zero radius). Over the duration of the animation it expands to circle(100%), which fully reveals the element as the circle grows outward from its center. Meanwhile, we fade things in with the help of opacity.

#rainbow, #floor, #mushroom, #flowers {
  opacity: 0;
  animation: maskAnim 2s ease-in forwards;
}

@keyframes maskAnim {
  0%, 1% { 
    clip-path: circle(0%);
    opacity: 1; 
  }
  100% { 
    clip-path: circle(100%); 
    opacity: 1; 
  }
}

Note: The 1% keyframe is there to make sure the browser starts the clip-path interpolation from circle(0%) rather than from whatever value the element might already have. Without it, some browsers will unexpectedly jump at the very start. A cleaner alternative is to use animation-fill-mode: both because it locks the element in its from state before the animation begins.

From there, we apply the same animation to the different SVG groups in our illustration:

<g id="rainbow">...</g>
<g id="floor">...</g>
<g id="mushroom">...</g>
<g id="flowers">...</g>

How psychedelic is this?!

Scroll-Driven Animations

Scroll-driven animations are great because we can connect an animation’s progress to the user’s scrolling instead of a typical timeline that runs and stops.

We can use it for subtle and somewhat “trippy” movement, like a light parallax effect. In this case, we can make things that appear closer to the user move faster than the ones that are more distant.

This is the full CSS:

#estrellas, #arcoiris, .text-line, #fecha, #arco, #flores, #dir, #piso, #barras {
  animation: moveUp both;
  animation-timeline: view();
}

@keyframes moveUp {
  from { transform: translateY(var(--offset)); opacity: 0; }
  to { transform: translateY(0); opacity: 1; }
}

#estrellas { --offset: 10vh; }
#arcoiris { --offset: 20vh; }
#fecha { --offset: 45vh; }
#arco { --offset: 50vh; }
#dir { --offset: 50vh; }
#flores { --offset: 65vh; }
#piso { --offset: 85vh; }
#barras { --offset: 90vh; }

The animation-timeline: view() declaration ties the animation’s progress to the element’s visibility in the scrollport: the animation starts as soon as the element scrolls into the scrollport and completes when it scrolls out of view. To make things move at different velocities, we place them at different offsets using an indexed --offset custom property like we did earlier for splitting text.

3D Transforms

This one is trickier and we need to keep an eye on performance. A tool like Layoutit can help with the heavy lifting because it has a voxel and terrain generator built entirely with CSS 3D. It can go even further when it’s complemented with VoxCSS, a full voxel engine that renders 3D cuboids using only CSS Grid layers and transforms without the complexity of Canvas or WebGL.

Let’s put together a combination of scrolling and 3D effects. It’s the sort of thing that supports the “hypnotic” and “dancing” ideas in the Option A keyword list. Check this out:

Here, I’ve set up a scene with depth using the perspective property and then wrapped all the child elements inside the scene in a 3D space with transform-style: preserve-3d. This way, all the child image elements rotate and translate along the depth axis (or z-axis).

Let’s connect that to a scroll-driven animation that uses transform: rotateY:

.scene {
  perspective: 1200px;
}

.img-wrapper { 
  transform-style: preserve-3d; 
  animation: rotateImg linear;
  animation-timeline: scroll();

  > img {
    transform: rotateY(270deg) translate3d(0, 50px, var(--distance));
  }

  > img:nth-child(2) {
    transform: rotateY(180deg) translate3d(0, 50px, var(--distance));
  }
}

/* etc. */

@keyframes rotateImg {
  to { transform: rotateY(360deg); }
}

Custom Cursors

cursor might be one of the most underused CSS properties. There are many cursor types we can use, although there are definitely opinions on just how far to go with this.

And we can use it to play around with the images, displaying different cursors on different containers when the user hovers over them. I would personally use an SVG or PNG image for transparency support, though the property accepts other image formats too.

It’s worth noting that cursor sizes vary by browser: Firefox caps custom cursors at 32×32px, while Chrome supports up to 128×128px. Most browsers refuse to display — or will downscale — cursors that are larger than 32×32px on high-DPI (retina) screens. Keeping your cursor at 32×32px is the safest choice to ensure consistency.

For example:

.box1 {
  cursor: url(data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAB0AAAAZCAMAAAD63NUrAAAACVBMVEX///8AAAD///9+749PAAAAAXRSTlMAQObYZgAAAFZJREFUeNqdzksKwDAIAFHH+x+6lIYOVPOhs5OHJnES/5UkYKEkU7xjijSIm50iFh4fAXgYDd/yumVVRSwsqq/nRA3xVK0oo06d5U6DpQZ7PV7lMxH7LkaQAbYFwryzAAAAAElFTkSuQmCC),auto; 
}

We can even set multiple fallbacks to ensure the widest level of browser support:

body {
  cursor: url('path-to-image.png'), url('path-to-image-2.svg'), url('path-to-image-3.jpeg'), auto;
}

While this is cool and all, we have to keep accessibility in mind for something that changes default web behavior like this. Custom cursors could be fun to apply to very specific elements rather than wholesale across the board.

Bonus: Anchor Positioning

One more thing before we wrap up. I’ve been playing with CSS Anchor Positioning, inspired by a Kevin Powell demo. We can use it to attach a single pseudo-element to a currently-hovered item instead of attaching a pseudo-element for each and every item. In other words, we create a single element and anchor it to a hovered element, like highlighting cards:

That opens up interesting possibilities, like being able to transition the hover state between cards. In this case, I’m using the linear() function to get that natural bounce with help from Easing Wizard.

Conclusion

The technical barriers to creating memorable web experiences are mostly gone now. I hope everything we’ve covered here gives you an idea of just how far we can go with modern CSS features that completely remove the need for additional JavaScript. We have more possibilities than ever before, all without the complex technical overhead of days past.

So, instead of asking, is this possible?, the most important question becomes, does this movement tell a better story? If yes, ship it. Use these tools not because you can, but because they help you tell a better story, one that is also accessible and performant.

And, of course, everything in here is just a handful of ways to do that. But what sort of memorable experiences have you used in your work? Or what have you seen on other sites?


Creating Memorable Web Experiences: A Modern CSS Toolkit originally handwritten and published with love on CSS-Tricks. You should really get the newsletter as well.


The “Vibe Coding” Crisis: Is Web Design Becoming a Commodity?

We are entering the era of "Vibe Coding," where AI generates the average of everything we’ve ever built, leaving the web beautiful but soulless. To survive, designers must stop being pixel-pushers and start being "Soul Architects," finding the human friction that a machine would never think to include.

The Benefits Of Cognitive Inclusion In UX Research


In the summer of 2024, I became co-chair of a working group of expert researchers who came together to determine how best to perform accessibility testing with people with cognitive disabilities. This was work I did for Fable, where I am currently VP of Innovation.

Cognitive disability is an umbrella term for several disabilities that impact how people process information, and it usually affects memory, focus, and/or learning. It is the most prevalent disability in the U.S. (13.9% via CDC), and cognitive disability is increasing rapidly (Yale study).

We set four goals for ourselves to learn how to work with this audience:

  1. How should we recruit and screen participants?
  2. What are best practices for research with cognitive participants?
  3. Do these methods work in a real study?
  4. Documenting what we learned so that we could share it.

We created a screener to recruit people who self-identified as having challenges with memory, focus, and learning. We also reviewed published studies that involved cognitive testers to learn best practices for working with them.

Next, we tested these best practices with an initial group of 25 testers in a pilot study. We fine-tuned our approach iteratively and created a guide to running user interviews with cognitive testers and a survey that could quantify their experiences using digital products. Finally, we documented what we learned.

After our pilot study with this new group of testers finished, I felt that they would uncover more usability insights than the general population (gen pop) user research participants I’d worked with in the past. I set out to validate this hunch.

The Cognitive Usability Study

I decided to run a joint study with Fable’s partners at the University of California, Irvine, in collaboration with Syed Fatiul Huq and with help from Fable researchers Pranav Pidathala, Ali Brown, and Michael Fagan to see if my hypothesis about finding more insights with cognitive testers proved true or not.

I generated three websites for the study using an AI prototyping tool. I wanted three different types of sites with different user goals and content so I could test a variety of tasks in the study.

Table 1: Websites And Tasks Tested

| Website | Strong Snacks | Turning Pages | Crown & Comb |
|---|---|---|---|
| Description | This is a website for three-ingredient high-protein recipes. Recipes can be browsed by category (vegan, muscle building, etc.). The site also features blog posts about protein and contact information. | This website is for a bookstore with a catalog of curated reads. It features extensive filtering by book genre, a book swiping feature to build a profile of likes and dislikes, custom book lists, a shopping cart, and checkout. | A website for a hair salon that allows you to book appointments and consultations online. It has a VIP program and a variety of special packages visitors can buy. |
| Design | Simple, brutalist, bright, lots of pictures. | Moody, classic, dark, lots of pictures of book covers. | Bold, clean, black and white with bursts of color. |
| Content | Recipes, blog posts. | Books and book lists. | Services, experience guide, membership information. |
| Key functionality | Filter by category, newsletter subscription. | Shopping cart, book matching, book lists, recommendations. | Appointment booking. |
| Tasks | 1. Find a recipe for a high-protein snack. 2. Find a blog about protein and read it. 3. Find a way to be notified about new recipes and blog posts. | 1. Find the book swiping feature and use it on 10 books. 2. Find the recommended book list. 3. Add books from two genres of your choice to cart. 4. Checkout the books in your cart. | 1. Find the prices for getting a haircut. 2. Book a haircut appointment. 3. Find the price for the bridal package. |

We used a single screener with questions about memory, focus, and learning, and screened participants into two groups based on whether they self-identified as having cognitive challenges or not.

Cognitive disability includes neurodiversity. Neurodivergent is an umbrella term used to describe people whose brains process information and learn differently. It is most commonly used for people who have learning disabilities (e.g., Dyslexia), ADHD, and Autism.

We ran 30 user interviews, 10 per website, with an even 5/5 split between cognitive and gen pop participants for each website. In each session, a participant completed all the tasks for one website during an online user interview facilitated by one of the researchers involved in the study.

All participants completed an Accessible Usability Scale (AUS) survey at the end of their session. This is a free, Creative Commons-licensed 10-question survey to evaluate the usability of websites and mobile apps.

Data Analysis Approach

I reviewed all the study recordings and transcripts and made note of every time a participant raised a concern, ran into a difficulty, or asked a question about how something worked. I counted all of these as issues. I also noted where a participant missed something that was part of a task, even if they didn’t notice it themselves, along with every suggestion for improvement made by participants.

Examples of issues found included:

  • Photo is too tall and requires a lot of scrolling to get to content (noted by participant).
  • I get no feedback when I like or dislike a book (noted by participant).
  • Participant missed the required P.O. Box checkbox the first time (observed by me).

Examples of suggestions included:

  • I would like to see a protein comparison in a table.
  • The “More information” tab should be moved up higher.
  • I would like more information on how the recommendation list is created.

Issues and suggestions were counted once per participant, even if they mentioned the same thing twice, but there are, of course, repeat issues and suggestions across the different participants. It is expected in UX research with multiple participants that you’ll find similar issues with each participant, and that is a signal that an issue is a universal challenge.

Findings Of The Cognitive Usability Study

Across the three websites tested:

  • Cognitive participants identified 197 issues.
  • Gen pop participants identified 113 issues.
  • Cognitive participants made 93 suggestions.
  • Gen pop participants made 54 suggestions.
  • Cognitive participants surfaced more issues related to content, buttons, icons, visual elements, and media than gen pop participants.

The results aligned with my instincts: participants with cognitive disabilities identified 1.8 times more issues and made 1.8 times more suggestions than gen pop participants.

Let’s dive deeper into the data for each website. Note that an AUS score ranges from 0 to 100, with higher numbers representing better usability than lower numbers.

Table 2: Strong Snacks

This site had the simplest design and content of all websites tested in the study and accordingly had the fewest issues overall and the highest median AUS scores. The data aligns with what you’d expect from an easy-to-use and simple website.

On this website, cognitive participants found 3.4 more issues and made 2.2 more suggestions on average. Their average score of the overall experience was 13.7 points lower than that of the gen pop participants.

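| Group | Total issues | Average issues | Median issues | Total suggestions | Average suggestions | Median suggestions | Average AUS | Median AUS |
|---|---|---|---|---|---|---|---|---|
| Gen pop | 32 | 6.4 | 6 | 13 | 2.6 | 2 | 90.5 | 97.5 |
| Cognitive | 49 | 9.8 | 9 | 24 | 4.8 | 4 | 76.8 | 73.0 |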

Table 3: Turning Pages

This was the website with the most varied functionality and the most tasks to complete (4), so it’s not surprising that participants found the most issues.

Here, cognitive participants found 6 more issues and made 3.2 more suggestions on average. They also scored the overall experience 17.2 points lower than gen pop participants on average.

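| Group | Total issues | Average issues | Median issues | Total suggestions | Average suggestions | Median suggestions | Average AUS | Median AUS |
|---|---|---|---|---|---|---|---|---|
| Gen pop | 55 | 11 | 10 | 26 | 5.2 | 4 | 78.0 | 80.0 |
| Cognitive | 86 | 17 | 15 | 42 | 8.4 | 6 | 60.8 | 58.0 |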

Table 4: Crown & Comb

This website was intentionally designed to be complex, and task 3, finding the bridal package, was meant to be extremely difficult to complete.

On this last website, cognitive participants on average found 7 more issues and made 2.4 more suggestions. Their average score for the overall experience was 14.3 points higher than the gen pop participants.

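| Group | Total issues | Average issues | Median issues | Total suggestions | Average suggestions | Median suggestions | Average AUS | Median AUS |
|---|---|---|---|---|---|---|---|---|
| Gen pop | 26 | 5 | 4 | 15 | 3 | 3 | 49.5 | 35.0 |
| Cognitive | 62 | 12 | 11 | 27 | 5.4 | 2 | 63.8 | 68.0 |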

Something interesting happened with the AUS scores for cognitive and gen pop participants in Tables 3 and 4. Cognitive participants scored Crown & Comb higher than Turning Pages, but gen pop scored the opposite — higher for Turning Pages and lower for Crown & Comb. If I had to guess why, I suspect finding more issues on Turning Pages impacted the cognitive participants’ perceptions of usability more than the gen pop participants’.

The other major difference between the sites, outlined in Table 5 below, was that cognitive participants found many more issues with buttons and links on Turning Pages and more issues with icons and visual elements on Crown & Comb. This suggests to me that the challenging interactions on Turning Pages were a more significant problem than issues with visual elements.

Qualitative Findings

When it comes to the more qualitative findings, I looked at trends in the types of issues found by both groups of participants.

Cognitive participants:

  • Were more likely to flag issues with icons or visual elements.
  • Surfaced problems with content more frequently.
  • Gave richer qualitative commentary, often explaining why something was hard to find or confusing.

Gen pop participants:

  • Were less likely to flag conceptual or comprehension barriers.
  • Gave shorter feedback, often stopping once the task was complete.

Table 5: Number Of Issues By Category

When I grouped issues by category, the following issues surfaced more often with cognitive participants: content, buttons and links (affordances and function), icons or visual elements, and media (video, animations). They nearly tied with gen pop participants on navigation issues (45 vs 46).

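| Issue category | Strong Snacks: Gen pop | Strong Snacks: Cognitive | Turning Pages: Gen pop | Turning Pages: Cognitive | Crown & Comb: Gen pop | Crown & Comb: Cognitive |
|---|---|---|---|---|---|---|
| Content | 11 | 22 | 11 | 30 | 23 | 36 |
| Navigation | 18 | 22 | 25 | 17 | 2 | 7 |
| Buttons and links | 0 | 5 | 7 | 20 | 3 | 0 |
| Icons or visual elements | 3 | 16 | 2 | 3 | 4 | 23 |
| Media | 0 | 2 | 0 | 1 | 0 | 0 |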
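
Summing each row of Table 5 across the three sites makes the category comparison easier to check. The following is a minimal sketch with the counts copied from the table; the names table5 and totalsByCategory are hypothetical and only used for illustration.

```javascript
// Counts copied from Table 5 as [gen pop, cognitive] pairs,
// in site order: Strong Snacks, Turning Pages, Crown & Comb.
const table5 = {
  "Content": [[11, 22], [11, 30], [23, 36]],
  "Navigation": [[18, 22], [25, 17], [2, 7]],
  "Buttons and links": [[0, 5], [7, 20], [3, 0]],
  "Icons or visual elements": [[3, 16], [2, 3], [4, 23]],
  "Media": [[0, 2], [0, 1], [0, 0]],
};

// Add up each category across all sites, separately for each group.
function totalsByCategory(table) {
  const totals = {};
  for (const [category, sites] of Object.entries(table)) {
    totals[category] = sites.reduce(
      (sum, [genPop, cognitive]) => ({
        genPop: sum.genPop + genPop,
        cognitive: sum.cognitive + cognitive,
      }),
      { genPop: 0, cognitive: 0 }
    );
  }
  return totals;
}

console.table(totalsByCategory(table5));
```

With these inputs, navigation comes out at 45 versus 46, matching the near tie mentioned above, while content (45 vs 88), buttons and links (10 vs 25), icons or visual elements (9 vs 42), and media (0 vs 3) all lean towards the cognitive group.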

Let’s look at the commentary provided by one cognitive participant versus one gen pop participant in the Crown & Comb sessions. The cognitive participant gave an AUS score of 38, and the gen pop participant gave an AUS score of 27.5. I chose to compare these two participants because they both gave the lowest scores within their group.

Notice the differences in how they described the overall experience in the quotes below. The gen pop participant explained it was frustrating and not engaging. The cognitive participant felt drained and less able to focus. I interpreted the experience as having a more profound impact on the cognitive participant’s overall wellbeing.

Gen pop participant quote

“As soon as you have a name of a treatment and a little explanation and like the duration and the price, as soon as you click onto that, it should be that you can interact with that service straight away. And I feel like if you're seeing a service repeated on a page multiple times and you're still not able to select it, it's really, really frustrating. This feels not particularly engaging.”

Cognitive participant quote

“For example, like, the mental energy aspect of it, like, sometimes there's, like, okay, cookies, and then ads, pop-ups, or maybe the website or service has too many options to look through, and maybe I just want something that I already know. I have to go through a lot of stuff. It makes me, like, feel drained and less able to focus.”

In summary, across all 3 websites we tested, participants with cognitive accessibility needs identified 197 usability issues, compared with 113 identified by gen pop participants.

Cognitive participants made 93 suggestions for improving the user experience, compared with 54 suggestions by gen pop participants.

When I compared issues and suggestions across both groups of participants, it turned out that the cognitive participants found 1.8 times more issues and made 1.8 times more suggestions than gen pop participants.

Cognitive participants surfaced more issues related to content, buttons, icons, visual elements, and media than gen pop participants.
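
The article does not state how the 1.8 figure was calculated, so the following is a sketch of two plausible readings rather than the author’s method. It uses the totals from Tables 2 to 4; the function names pooledRatio and meanSiteRatio are hypothetical.

```javascript
// Totals copied from Tables 2-4 as [gen pop, cognitive] pairs,
// in site order: Strong Snacks, Turning Pages, Crown & Comb.
const issues = [[32, 49], [55, 86], [26, 62]];
const suggestions = [[13, 24], [26, 42], [15, 27]];

// Reading 1: pool all sites first, then divide (197 / 113 for issues).
function pooledRatio(rows) {
  const genPop = rows.reduce((sum, [g]) => sum + g, 0);
  const cognitive = rows.reduce((sum, [, c]) => sum + c, 0);
  return cognitive / genPop;
}

// Reading 2: compute a ratio per site, then average the three ratios.
function meanSiteRatio(rows) {
  const ratios = rows.map(([g, c]) => c / g);
  return ratios.reduce((sum, r) => sum + r, 0) / ratios.length;
}

console.log(pooledRatio(issues).toFixed(2));        // 1.74
console.log(pooledRatio(suggestions).toFixed(2));   // 1.72
console.log(meanSiteRatio(issues).toFixed(2));      // 1.83
console.log(meanSiteRatio(suggestions).toFixed(2)); // 1.75
```

Pooling the totals gives roughly 1.7 for both measures, while averaging the per-site ratios rounds to 1.8 for both, so the second reading may be closer to how the reported figure was obtained. Either way, the direction and rough size of the difference are the same.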

How Cognitive Participants Benefit UX Research

In working with cognitive participants for the last few years, I’ve seen how they surface cognitive load issues consistently. These issues don’t just impact people with cognitive disabilities such as neurodivergence; they also impact:

  • Gen Z, who live in a world of short videos optimized for attention-grabbing and struggle to focus on long-form and written content.
  • Seniors who naturally experience cognitive decline as they age and have difficulty with complex interactions, especially online.
  • Adults with jobs and families who are constantly busy and overloaded with information, which makes their attention and focus difficult to capture.

What would I have missed if I hadn’t included cognitive participants, and how might that have impacted the business outcomes for these websites?

Strong Snacks

On the Strong Snacks website, the cognitive participants surfaced:

  • A desire for links to the sources of information, such as scientific journals, which would make them trust the content more.
  • The need for more context in headlines to understand what the blog is about.
  • Lack of clarity of the label “Add-ons.”
  • Layout concerns where recipes for snacks interrupted the main article flow instead of being placed in a sidebar with a distinct design.
  • How ads and animations can distract some users from reading the content.

These are improvements that would give all users more trust in the content while also making it easier to read and skim for key content. The research findings point towards design best practices that a senior designer might also point out, such as avoiding continuous animation and using layout to draw attention to different types of content.

Turning Pages

Without cognitive participants, we might have missed the more subtle but important issues with confusing interactions, such as how the “Add to book bag” button worked. They were also confused about where reviews and recommendations came from. Both of these issues could decrease a user’s trust in the website.

All participants surfaced that the book-matching feature was hard to find, but the deeper problem the cognitive participants emphasized is that the site’s interactions don’t consistently behave in ways that they can predict and understand, decreasing their confidence.

Anyone who wants to buy a book could benefit from a clear understanding of how to add books to a cart and complete the checkout quickly and with no ambiguity. Compounded over hundreds or thousands of users, a lack of clarity in a purchase flow will lead to lost revenue.

Crown & Comb

The Crown & Comb website in particular highlighted the benefits of having cognitive participants, who raised:

  • Concern around why a service would be “subject to stylist consultation.”
  • Uncertainty with services that had similar labels but may or may not be the same service.
  • The importance of choosing a date early in the flow for booking appointments.
  • Lack of clarity about when or how they would pay for services.

These issues likely also affect gen pop participants, but they are more likely to muddle through a task with incomplete information. However, that can lead to losing customers to a better experience if a competitor pops up. Loyalty is often tied to experiences, not just brands, and having a poor experience means your customer retention can be weaker.

The study showed that finding a bridal package was hard for everyone, but the cognitive group showed how that became an accessibility barrier. When you combine:

  • too much ambiguity,
  • too many decisions,
  • too little user feedback, and
  • too much effort to find something,

you create a high enough cognitive load that some people will not be able to complete the task. In my opinion, this is where usability issues start to become accessibility barriers: when they increase cognitive load so much that it becomes overwhelming for some users.

Key Takeaways

  • Include people with cognitive disabilities in user research, not just accessibility research.
    They can surface general usability issues related to content, buttons and links, icons or visual elements, and media while also helping you understand how your product functions in terms of cognitive load.
  • Cognitive issues are both usability and accessibility issues.
    Tasks that rely heavily on memory, focus, and decision-making can move along a scale from difficult to impossible for some users to complete. That’s where usability challenges become accessibility barriers.
  • Track more than task completion.
    Ask users how they feel, how a task affects their energy, how distractions impact their ability to focus, and how easy or hard a task was for them.
  • Start small and build your cognitive inclusive research practice over time.
    Even a few sessions with people who have cognitive access needs can help you better understand how to manage cognitive load for all users.

Start Incorporating Cognitive Insights Now

The percentage of people aged 65 and older in America is projected to increase from 17% to 25%. By 2060, 1 in 4 Americans will be an older adult (U.S. Census). This is the age at which people typically start to experience cognitive decline. As the aging and cognitive population segment expands, companies will need to build for these more complex user needs.

People with cognitive access needs are a natural starting point because they will find the types of usability issues that UX teams are used to. This could make cognitive testing an easier entry point for inclusive research. Getting insights from assistive technology users is still very important, but many teams don’t know how to start doing that.

Cognitive accessibility is a powerful on-ramp into broader accessibility research and testing. By focusing first on cognitive load, clarity, and predictability, we build research foundations that make future accessibility work with users of screen readers, screen magnifiers, and alternative navigation more approachable.

“2 sessions with cognitive users feel like 200 because of the volume of insights we get.”

—UX Manager at Bell Media

In this small exploratory study, participants with cognitive disabilities identified 1.8 times more issues and made 1.8 times more suggestions than gen pop participants. I’ve seen this type of impact in research conducted on Fable customers’ websites that aren’t AI-generated, too.

Cognitive inclusion in UX research is not optional, and it’s not just about accessibility. It’s how UX teams can make their research more efficient, create clearer content and simpler flows, and ship better products for everyone.

Study Limitations

This study had a relatively small sample size, so the findings are more qualitative than quantitatively validated. Testing was also done on two different platforms. Cognitive participant sessions were run using Fable Engage, and gen pop sessions were run on UserFeel. Different platforms with unique participant panels can affect the quality of insights and comfort levels with user research participation.

Disclosure: I work for Fable and chose to use our platform because it was more affordable than paying for access to another research platform, allowing me to include more participants in the study at a lower cost.

Different researchers facilitated the user interviews, which can also affect findings, but all sessions used the same task structure and discussion guide template, and all were completed online. Even though the sessions were facilitated by different researchers, the issue and suggestion counts were all done by me to ensure consistency across all websites and participants.

Resources

I’ve compiled a few useful resources as you begin your cognitive inclusion journey.




Observability overload is drowning engineers


If you can see everything, you may see nothing at all. That’s what SREs and DevOps engineers are learning as observability tools multiply and human capabilities do not.

Engineers have never had more observability data than they do today, affording them unparalleled visibility into the systems they manage. But all that information at their fingertips doesn’t always equate to faster detection, let alone resolution.

Why not? Problematic observability data lighting up a dashboard merely sends a human into collected logs and traces to dig up the cause of the problem. Suddenly, the collected information becomes both a boon and a headache. The required context is there for you to find and use, but it’s difficult to locate.

Worse, you may chase the wrong lead, leading to exploration of false paths, wasted time, and perhaps even extended downtime. If there’s too much data, why not throw more engineers into the breach? Multiple engineers looking into the same problem can quickly metastasize into a multi-platform coordination nightmare, leading to unpredictable resolution timelines.

To fully capitalize on the modern observability stack and all the information it collects, a new technique is needed. Instead of humans hunting and pecking through logs, the modern enterprise wants a single, unified system that can parse observability data quickly and either execute a fix itself or suggest a remediation pathway for humans to manage.

Yes, we’re talking about AI. Specifically, AI agents. Agents are a strong fit for the observability crisis, as they can handle higher data volumes than humans — especially high-volume data split across different systems that require correlation — and have recently gained the ability to act autonomously. 

The good news is that technology companies are building precisely that system. Even better, the same tools allow engineers to bring observability data directly into the agentic development environment of their choice, such as Codex, Cursor, and Claude Code, so they can unite what they know about an issue with the tools they need to resolve it.

At 12 p.m. Eastern/9 a.m. Pacific on Tuesday, June 30, Datadog’s Vignesh Palaniappan, Senior Product Manager for Bits AI, joins The New Stack to discuss the pain points you’re experiencing with observability today, how AI agents offer a solution, and how to set your engineering team up with the tools it needs to operationalize the information it has at its disposal.

Register now

What you’ll learn

  • How to quickly surface root causes behind alerts without sending your engineers off on a wild bug hunt. Who doesn’t want shorter MTTD?
  • How to build and deploy agents that can remediate alerts and issues on their own. Who doesn’t want shorter MTTR?
  • How to import observability data directly into tools like Codex, Cursor, and Claude Code so your developers have the context they need right where they work.

Can’t join us live? Register anyway, and we’ll send you a recording after the session. By registering, you consent to receiving email communications from The New Stack and Datadog.

The post Observability overload is drowning engineers appeared first on The New Stack.
