Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.
152767 stories · 33 followers

EPUB and HTML - Survey results and next steps

1 Share

Photo by Spencer

Over the Northern Hemisphere summer of 2025, the Publishing Maintenance Working Group (PMWG) ran a survey in the publishing community to ask a question about a topic that has been lurking in our backlog for several years: should we allow HTML in EPUB?

After reviewing and discussing those results we have decided that we will not add HTML to EPUB 3.4.

The survey results were invaluable in helping us come to this decision, and we deeply appreciate everyone who took the time to share and respond to our survey. We received compelling feedback both for and against the change. We know that our decision is going to make some people very happy and some people very unhappy, and we wanted to share how the feedback influenced our thinking, and what we want to do next.

Before diving into the survey results, one thing we want to make clear is that the door is not closing on HTML and digital publications. In fact, the responses provided important information that helped us realize we need to take a new approach, one that we hope the community will consider supporting.

We received over 100 responses from publishers, tool developers, reading system developers, independent authors, ebook retailers, social DRM providers, conversion vendors, distributors, and readers. The diversity of responses was instrumental in understanding how this change would impact various parts of the ecosystem.


The primary arguments for the change included:

  • Improved alignment with the web platform
  • Opportunities for new tool and reading system developments
  • Access to new toolsets or frameworks
  • Lower barriers to understanding for a new generation of ebook developers
  • Future-proofing EPUB, ensuring long-term support for files

The arguments against the change included:

  • Increased complexity for tooling, reading system, and distribution platforms
  • The change could break workflows for publishers, distributors, ingestion systems, or watermarking providers
  • Industry investment has focused on XML/XSLT technologies; adopting HTML would be expensive or prohibitive for some
  • Interoperability concerns: older books might stop working on newly developed reading systems, or newer books might not work on older systems

In addition to the arguments for and against this change, we learned a great deal about the needs of the community, and it is these learnings that we want to take away to consider for the future.

One of the most common requests in the feedback was for accompanying best practices, sample files, test suites, and supplemental information on what HTML might look like in EPUB. In addition to these asks, we heard that people wanted to know more about the features that would be implemented and supported by reading systems, or whether we might consider defining a subset of HTML and CSS instead of adopting HTML wholesale.

We also learned that a great deal of misinformation persists about EPUB's current relationship with HTML. For over a decade, EPUB 3 has supported what is formally known as the XML serialization of HTML, not XHTML 1.1. This means EPUB already supports most of the current HTML specification, including elements like `<video>`. For more information, we invite you to read about EPUB's relationship to HTML.

The publishing ecosystem is different from the web. There are more technologies and stakeholders between an ebook and its reader than there are between a website and its visitor. For instance, there are specialized tools that serve authors and publishers who don't write code. Additionally, ebooks must be compatible with a myriad of libraries and booksellers. Therefore, making this change has a greater potential to break tools and workflows than a similar change in the web environment.

EPUB must maintain backwards compatibility. This makes sense in the book world, where content is less dynamic than a website. A book's content is more likely to stay the same, or very similar, throughout the lifetime of the book. Consistency is an expected feature of books. This consistency is important for archiving and preservation. Backwards compatibility makes adopting HTML in EPUB more difficult, too.

What happens next? We do not want to close the door on this discussion entirely, especially because the gaps identified by a number of respondents are important to acknowledge. There are feature gaps in EPUB that HTML integration could have addressed, but in discussing these gaps, we realized that simply adding HTML would not address the underlying problems. Much of the excitement expressed for this change came from people looking to do more than EPUB currently allows or has broad support for, like scripting or interactivity, or from use cases like web publications.

We encourage anyone interested in the future of digital publications and EPUB to join the Publishing Community Group and share their use cases and issues with us. It's important for us to understand what people want to do, or what they are already doing but having challenges with, so we can look for solutions that become specifications. We look forward to learning and working with people interested in what digital publications can be!

Read the whole story
alvinashcraft
3 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

Better Context Will Always Beat a Better Model

Green lizard blending in with green leaves.

AI leaders have spent the last couple of years obsessing over benchmarks, debating whether GPT-4, Claude or Gemini holds the crown. The enterprise AI conversation has been dominated by a single metric: model performance. But this fixation on raw “intelligence” overlooks the most critical factor in successful deployment.

As these models converge in capability, the battleground is shifting. The differentiator for the next generation of enterprise applications won’t be the model itself; it will be the context. In a landscape where every enterprise has access to the same frontier models, the “intelligence” of the model is no longer a sustainable moat. Instead, the winner will be the organization that can most effectively ground that intelligence in its proprietary reality.

“Context is the new source code,” says Srinivasan Sekar, director of engineering at TestMu AI (formerly LambdaTest), an AI-native software testing platform. He argues that while the industry is fixated on model size, the real challenge lies in data delivery.

“We are finding that a model’s intelligence is only as good as the environment we build for it,” he explains. “If the context is cluttered, even the most advanced model will fail.”

Feeding enterprise data into these models is therefore proving to be far more dangerous and complex than initially thought. It is not just about piping in documents; it is about preventing the AI from “choking” on the noise.

This requires a shift from viewing AI as a “know-it-all” oracle to viewing it as a reasoning engine that requires a high-fidelity information environment to produce business value.

I spoke to notable engineering leaders who shared their perspectives about how the future of AI is found in the precision of the architecture surrounding it. In essence, the model acts as the processor, while the architecture serves as the fuel that determines speed, accuracy and enterprise-grade reliability.

The Rise of ‘White Coding’ and the Governance Gap

The stakes of this transition are high because the role of AI has fundamentally changed. We have moved beyond simple auto-complete into a paradigm that Brian Sathianathan, cofounder of Iterate.ai, calls “white coding.” In this environment, tools don’t just complete a line of code; they generate entire architectures, multi-file edits and complex logic from a single prompt. A task that once required days of human effort is now accomplished in 20 minutes.

However, this unprecedented speed creates a terrifying governance gap. When a human writes code, they govern it line by line. When an AI generates 5,000 lines in a single session, that granular oversight vanishes.

Sathianathan warns that if developers do not have the right context and security guardrails in place from the start, they risk generating technical debt at machine speed. Without intentional context, a model might introduce frameworks with known vulnerabilities or create fundamentally insecure logic flows. These are risks that may not be discovered until it is too late.

To address this, engineering teams must move away from retrospective code reviews toward “pre-emptive context governing.” This involves embedding security standards directly into the environment the AI “sees,” ensuring that generated logic remains within safe, predefined boundaries.
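As a rough illustration of what "pre-emptive context governing" could look like in practice, the sketch below embeds policy rules into the prompt the model sees and runs a cheap deny-list check over generated code before it lands anywhere. The policy text and banned patterns are purely illustrative assumptions, not a real security standard.

```python
# Hypothetical sketch: make the security policy part of the context the
# model "sees", then reject generations that violate a simple deny-list.
# POLICY and BANNED_PATTERNS are illustrative stand-ins for real guardrails.

POLICY = [
    "Never use string formatting to build SQL queries.",
    "Never disable TLS certificate verification.",
]
BANNED_PATTERNS = ["verify=False", "eval("]

def governed_prompt(task: str) -> str:
    """Embed the policy in the context so it is visible before generation."""
    rules = "\n".join(f"- {rule}" for rule in POLICY)
    return f"Security policy:\n{rules}\n\nTask: {task}"

def passes_guardrails(generated_code: str) -> bool:
    """Cheap pre-merge check: block known-bad constructs outright."""
    return not any(pattern in generated_code for pattern in BANNED_PATTERNS)

print(passes_guardrails("requests.get(url, verify=False)"))  # False
print(passes_guardrails("requests.get(url, timeout=5)"))     # True
```

A real deployment would replace the substring deny-list with static analysis, but the shape is the same: the standards travel with the context, and checks run before review rather than after.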

The Fallacy of ‘More is Better’

The natural instinct for most developers is to solve inaccuracy by providing the AI with more information. If the AI understands the entire codebase, the logic goes, it cannot make mistakes. Neal Patel, cofounder and CEO of Scaledown, warns that this is a dangerous fallacy. His research into context engineering reveals that across enterprise workloads, roughly 30% to 60% of tokens sent to models add no value.

“People think more tokens mean more accuracy, but in practice, the opposite often happens,” Patel says. “When a model is overloaded with loosely related or irrelevant context, its attention mechanisms get diluted.”

This isn’t just a theoretical concern; it is backed by empirical research. Patel cites the “Lost in the Middle” study (Stanford/Berkeley), which showed that model accuracy drops when relevant details are buried in the center of a long prompt. Furthermore, research from Mila/McGill found that adding unrelated text caused 11.5% of previously correct AI answers to become wrong.

This creates a phenomenon Patel calls “context rot.” As a system serves a user over months or years, it accumulates history and metadata. The same use case becomes exponentially heavier, slower and more expensive.

“The goal isn’t to stuff the window; it’s to extract the signal,” Patel notes. Smarter, high-fidelity context, achieved by isolating only what is truly needed for the query, consistently beats larger, noisier context.
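The "extract the signal" idea can be sketched as ranking candidate context chunks against the query and keeping only the top few, instead of sending everything. The lexical-overlap scorer below is a deliberately crude stand-in for an embedding-based ranker; the chunk texts are invented for illustration.

```python
# Hypothetical sketch: keep only the context chunks most relevant to the
# query rather than stuffing the whole corpus into the prompt window.
# The overlap scorer is a toy stand-in for a real embedding-based ranker.

def relevance(query: str, chunk: str) -> float:
    """Crude lexical-overlap (Jaccard) score between query and chunk."""
    q_words = set(query.lower().split())
    c_words = set(chunk.lower().split())
    if not c_words:
        return 0.0
    return len(q_words & c_words) / len(q_words | c_words)

def build_context(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """Return only the top_k most relevant chunks, dropping the noise."""
    ranked = sorted(chunks, key=lambda c: relevance(query, c), reverse=True)
    return ranked[:top_k]

chunks = [
    "Invoices are archived nightly to cold storage.",
    "The billing service retries failed charges three times.",
    "The cafeteria menu rotates weekly.",
]
print(build_context("Why did a failed charge retry?", chunks, top_k=1))
```

Even this toy version shows the trade: a smaller, higher-fidelity context window in exchange for an up-front retrieval step.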

Fighting ‘Context Poisoning’ With Structure

This is where the rubber hits the road: How do you build a system that gives the AI exactly what it needs, and nothing more? Sekar identifies the root issue as “opaque systems.” When an engineer dumps an entire codebase or schema into context, the AI is forced to search through a haystack of data to find the needle that matters, often losing sight of security constraints in the process.

To overcome this, teams should adopt a structured retrieval approach. Sai Krishna V, also a director of engineering at TestMu AI, who works alongside Sekar, describes a method of “flattening” complex data structures before they ever reach the AI. Instead of feeding the model deep, nested objects that increase its cognitive load, TestMu AI normalizes data into single layers.
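A minimal sketch of the flattening idea, assuming dotted key paths as the target shape (TestMu AI's actual normalization scheme is not described in detail): collapse a nested record into one level of plain key/value facts before it reaches the model.

```python
# Hypothetical sketch of "flattening": collapse a deeply nested record
# into a single layer of dotted keys, so the prompt carries flat
# key/value facts instead of nested structure the model must traverse.

def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into one level using dotted key paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

record = {"user": {"name": "Ada", "plan": {"tier": "pro", "seats": 5}}}
print(flatten(record))
# {'user.name': 'Ada', 'user.plan.tier': 'pro', 'user.plan.seats': 5}
```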

Implementing this requires a mindset of “curating the memory” of the AI. By using intelligent retrieval to fetch only the specific notes or logic required for a current problem, engineers can create a clean environment for the AI’s reasoning process. This ensures the model stays focused on the task at hand without being “poisoned” by distant, unrelated data structures.

Context Caching and ‘The Notebook’

The final piece of the puzzle is operational efficiency. If an AI agent has to re-read and re-analyze the same project context for every single query, the system becomes prohibitively expensive and slow. Patel of Scaledown points out that this inefficiency has a human cost as well: every redundant token increases latency, leading to abandoned searches and slower product flows. To solve this bottleneck, Sekar advocates a technique called context caching.

Sekar describes this with a practical analogy. Think of the agent as a student with a notebook. The first time the agent solves a complex architectural problem, it shouldn’t just output the code; it should cache its “understanding” of that problem, essentially taking a note. The next time a similar request comes in, the agent retrieves that cached context rather than deriving the solution from scratch.
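The notebook analogy can be sketched as a cache of derived "notes" keyed by a normalized form of the request. Everything here is an illustrative assumption: `derive_understanding` stands in for the expensive model call, and real systems would key on semantic similarity rather than an exact hash.

```python
# Hypothetical sketch of the "notebook": cache the agent's derived
# understanding, keyed by a normalized request, so a similar request
# reuses the note instead of re-deriving it from scratch.
import hashlib

notebook: dict[str, str] = {}  # cache of derived context ("notes")

def cache_key(request: str) -> str:
    """Normalize whitespace and case, then hash into a stable key."""
    normalized = " ".join(request.lower().split())
    return hashlib.sha256(normalized.encode()).hexdigest()

calls = 0  # counts how often the expensive derivation actually runs

def derive_understanding(request: str) -> str:
    """Stand-in for an expensive model call that builds the 'note'."""
    global calls
    calls += 1
    return f"plan for: {request}"

def solve(request: str) -> str:
    key = cache_key(request)
    if key in notebook:                      # cache hit: reuse the note
        return notebook[key]
    note = derive_understanding(request)     # expensive path, done once
    notebook[key] = note
    return note

solve("Split the billing service")
solve("split the  billing service")  # normalizes to the same key
print(calls)  # the expensive derivation ran only once
```

Exact-match hashing only catches trivially similar requests; the point of the sketch is the control flow, where the note, not the raw transcript, is what gets remembered.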

So while Patel highlights the necessity of reducing token waste to maintain responsiveness, Sekar’s approach provides the technical blueprint for how enterprises can actually “curate the memory” of their systems. This shift ensures that the AI is not just repeating calculations, but building a persistent, efficient knowledge base over time.

Cognition for Humans and AI

Context is more than just an architectural mechanism for efficiency; it is an active layer in the workflow that helps people work with AI more deliberately. Bhavana Thudi, founder and CEO of Magi, a context-aware operating system for AI native marketing teams, describes this as designing “moments of pause” into the human–machine loop. These pauses create space to reflect, reconsider and learn as part of the flow of work, forming a shared loop of reasoning that makes both humans and machines better at the task.

When AI systems are designed around context and deliberate pauses in the workflow, teams intentionally build a thoughtful work environment, and cognition emerges across the human-machine system. Thudi notes that these moments of pause are not just cognitive but cumulative, allowing work to carry memory forward rather than resetting with each interaction. That is the future of work.

The implication for those building AI systems is clear: Progress will not come from removing humans from the loop, but from designing systems that preserve intent, memory and judgment over time. Systems built with context at the core make better work possible, and compound in value.

Filtering as the Competitive Advantage

As enterprises move from experimenting with chatbots to deploying autonomous agents, the focus must shift from the model to the data pipeline. The companies building the most reliable systems are not necessarily those with the most sophisticated AI models. They are the ones that have done the hard work of redesigning their data foundations to speak the language that machines understand.

As Krishna concludes, in an era of infinite noise, the ability to filter is the ultimate competitive advantage. That filtering does not happen at the model level; it happens at the architecture level, specifically in how you structure data, retrieve context and validate outcomes. The message for the next year of AI development is clear: The model provides the reasoning, but the engineer must provide the context.

The post Better Context Will Always Beat a Better Model appeared first on The New Stack.


AI Agent & Copilot Podcast: Taylor Dorward Defines Impact of Accessibility in Business, Unlocking Employee Potential


In this episode of the AI Agent & Copilot Podcast, John Siefert welcomes Taylor Dorward, Project Team Lead, Podcast Host, Accessible Community. The two discuss the positive impacts of accessibility on businesses and how companies can grasp the value of accessibility in the workplace. Taylor also references a new podcast series that aims to educate the community on disabilities, while highlighting community voices.

Key Takeaways

  • Overview: Taylor defines accessibility as the “design and provision of products, services, environments, and information that can be easily used, accessed, and understood by everyone, especially those with disabilities.” At its core, it’s about creating equitable experiences for all.
  • Accessibility impacts: When noting the impacts of accessibility for organizations and businesses, Taylor explains that it offers ethical, financial, legal, and even SEO-related advantages, highlighting an overlap between accessibility best practices and search optimization. “The more accessible you’re making your stuff to humans, the more accessible you’re making it to machines.”
  • Understanding the value: Helping companies understand the value of accessibility can take time, so Taylor explains that he approaches it from multiple angles, including the significant legal benefits. By ensuring products and websites comply with laws like the U.S. Rehabilitation Act and the European Accessibility Act, organizations can avoid fines, reduce legal risk, and maintain access to international markets.
  • Accommodations: Workplace accommodations can greatly improve employee morale, productivity, and retention by giving people what they need to do their best work. As Taylor puts it, even simple adjustments can “unlock so much potential” and help organizations avoid the high costs of turnover.
  • Learn more: Taylor describes a new podcast project that aims to educate the community about diverse disabilities while giving people a platform to share their lived experiences. The series will highlight the spectrum of experiences when it launches publicly later this month.

AI Agent & Copilot Summit is an AI-first event to define opportunities, impact, and outcomes with Microsoft Copilot and agents. Building on its 2025 success, the 2026 event takes place March 17-19 in San Diego. Get more details.

The post AI Agent & Copilot Podcast: Taylor Dorward Defines Impact of Accessibility in Business, Unlocking Employee Potential appeared first on Cloud Wars.


Satya Nadella Outlines the Next Chapter in AI: Real-World Systems


Welcome to the Cloud Wars Minute — your daily cloud news and commentary show. Each episode provides insights and perspectives around the “reimagination machine” that is the cloud.

In today’s Cloud Wars Minute, I examine Satya Nadella’s call to shift AI focus from capabilities to societal contributions.

Highlights

00:21 — One of the leading voices in the AI Revolution, Microsoft CEO Satya Nadella, has outlined his vision for AI in a blog post. Nadella states that 2026 will be a pivotal year for AI. He says we are now past the initial discovery phase and entering a phase of widespread diffusion. Now is the time to home in on real-world impact, to emphasize what needs to be done.

AI Agent & Copilot Summit is an AI-first event to define opportunities, impact, and outcomes with Microsoft Copilot and agents. Building on its 2025 success, the 2026 event takes place March 17-19 in San Diego. Get more details.

01:10 — Nadella focuses on three areas that require more attention. First, he suggests that we should move beyond the notion of AI slop versus AI sophistication. Instead, we need to view AI capabilities as, and I quote, “scaffolding for human potential,” rather than a substitute.

01:40 — Secondly, Nadella explains that we need to develop more sophisticated engineering that shifts the focus from specific AI models to broader systems. This involves orchestrating multimodal architectures and, crucially, implementing agents. Finally, Nadella emphasizes that for AI to gain social acceptance, these systems must be evaluated based on their real-world impact.

02:10 — This statement is Nadella’s most explicit reference to how Microsoft has positioned itself, particularly through the strong statements made by Microsoft AI CEO, Mustafa Suleyman, regarding the company’s commitment to human-centered AI. It’s very encouraging to see this sentiment reinforced by leadership.


The post Satya Nadella Outlines the Next Chapter in AI: Real-World Systems appeared first on Cloud Wars.


Solve for One, Extend to Many


A single red dot in a usability test changed everything. For one participant, it was enough to stop the task entirely—a reminder that small moments can create big cognitive barriers. That insight sparked Inclusive Design for Cognition, an effort born in the Microsoft Garage Hackathon 2022 to design with, not just for, neurodivergent people. Through fast loops and co‑design sessions…

The post Solve for One, Extend to Many appeared first on Microsoft Garage.


If you're a Zoomer, this one's for you: Everything Gen Z needs to know about the 2025 tech landscape

Here's the lowdown on all the tech from 2025 that you, dear Zoomer, should know about.