
The Best Risk Mitigation Strategy in Data? A Single Source of Truth


Every data leader has a version of this story. A regulatory audit surfaces a metric that doesn’t match across systems. A board member catches conflicting revenue numbers in two reports presented back-to-back. An AI tool generates a recommendation based on data that hasn’t been governed since the analyst who built it left the company two years ago. The specifics change, but the pattern doesn’t: Somewhere in the stack, data risk turned into business risk, and nobody saw it coming.

In my first article, I covered what a semantic layer is and why it matters. In my second, I spoke with early adopters about what happens when you actually build one. This piece tackles a different angle: The semantic layer as a risk mitigation strategy. Not risk in the abstract, compliance-framework sense, but the practical, operational risk that quietly drains organizations every day—bad numbers reaching decision-makers, sensitive data reaching the wrong people, and metric changes that never fully propagate.

Three risks hiding in plain sight

Data risk tends to concentrate in three areas, and most organizations are exposed in all of them simultaneously.

The first is accuracy. Inaccurate data leading to bad decisions is the oldest problem in analytics, and it hasn’t gone away. It’s gotten worse. As organizations add more tools, more dashboards, and more AI-powered applications, the surface area for error expands. A revenue metric defined one way in a Tableau workbook, another way in a Power BI model, and a third way in a Python notebook isn’t just an inconvenience. It’s a liability. When leadership makes a strategic decision based on a number that turns out to be wrong—or, more commonly, based on a number that’s one version of right—the downstream consequences are real: misallocated resources, missed targets, eroded trust in the data team.

The second is governance and access. Most organizations have some framework for controlling who sees what data. In practice, those controls are scattered across warehouses, BI tools, individual dashboards, shared drives, and cloud storage buckets. Each system has its own permissions model, its own admin interface, and its own gaps. The result is a patchwork that’s expensive to maintain and nearly impossible to audit with confidence. Sensitive data finds its way into a dashboard it shouldn’t be in—not because someone acted maliciously, but because the governance surface area is too large to manage consistently.

The third is change management. A CFO decides that ARR should exclude trial customers starting next quarter. In theory, that’s a single metric change. In practice, it’s a scavenger hunt. That ARR calculation lives in a warehouse view, two Tableau workbooks, a Power BI model, an Excel report that someone on the FP&A team maintains manually, and now the new AI analytics tool that pulls directly from the data lake. Some of those get updated. Some don’t. Three months later, someone notices the numbers don’t match and the cycle starts again. The risk isn’t that the change was wrong—it’s that the change was never fully implemented.

These three risks—accuracy, governance, and change management—aren’t independent. They compound. An ungoverned metric that’s defined inconsistently and can’t be updated in one place is a ticking clock. The question isn’t whether it causes a problem, it’s when.

The legacy approach: more people, more tools, more problems

The traditional response to data risk has been to throw structure at it—and structure usually means people and process.

The most common pattern is the BI analyst as gatekeeper. Critical metrics, reports, and dashboards are managed by a centralized team. Need a new report? Submit a request. Need a metric change? Submit a request. Need to understand why two numbers don’t match? Submit a request and wait. This model exists because organizations don’t trust their data enough to let people self-serve, and for good reason—without a governed foundation, self-service creates chaos. But the gatekeeper model has its own costs. It’s slow. It creates bottlenecks. It’s expensive to staff. And performance is inconsistent—the quality of the output depends entirely on which analyst picks up the ticket and which tools they prefer.

Governance gets its own layer of complexity. Organizations deploy access controls across their data warehouse, BI platforms, file storage, and application layer—each with different permission models, administrators, and audit capabilities. Quality reporting, lineage, and business ownership tracking create additional tooling, complexity, and management overhead. Maintaining consistency across all of these systems is resource-intensive, and the more tools you add, the harder it gets. Most organizations know their governance has gaps. They just can’t find them all.

The combination of centralized BI teams and sprawling governance frameworks produces a predictable outcome: large, slow-moving data organizations that spend more time fixing and maintaining the infrastructure than actually delivering data or insight. When everything is managed manually across dozens of tools, problems don’t grow linearly—they grow exponentially. Every new dashboard, data source, or BI tool adds another surface to govern, another place where logic can diverge, another potential point of failure. The legacy approach doesn’t scale. It just gets more expensive.

The semantic approach: govern once, access everywhere

The semantic layer offers a fundamentally different model for managing data risk. Instead of distributing control across every tool in the stack, it consolidates it.

Start with accuracy and change management, because the semantic layer addresses both with the same mechanism: A single location for all metric definitions, business logic, and calculations. When ARR is defined once in the semantic layer, it’s defined once everywhere. Tableau, Power BI, Excel, Python, your AI chatbot—they all reference the same governed definition. When the CFO decides to exclude trial customers, that change happens in one place and propagates automatically to every downstream tool. No scavenger hunt. No version that got missed. No analyst discovering three months later that their workbook is still running the old logic. And when that same CFO wants to know how the metric was calculated several years ago? Semantic layers are version-controlled by default, allowing for seamless versioning of key metrics.

This same centralization transforms governance. Instead of managing access controls across a warehouse, three BI platforms, a shared drive, and an application layer, organizations can align governance around the semantic layer itself. It becomes the single access point for governed data. Users connect to the semantic layer and pull data into the tool of their choice, but the permissions, definitions, and business logic are all managed in one place. The governance surface area shrinks from dozens of systems to one.

But the semantic layer does something else that the legacy approach can’t: it makes data self-documenting. In a traditional environment, the context around data—what a metric means, why certain records are excluded, how a calculation works—lives in the heads of analysts, in scattered documentation, or nowhere at all. The semantic layer captures that context as structured metadata alongside the models, columns, and metrics themselves. Field descriptions, metric definitions, relationship mappings, business rules—all of it is documented where the data lives, not in a wiki that nobody updates. This is what makes genuine self-service possible. When the data carries its own context, users don’t need to submit a ticket to understand what they’re looking at (and AI agents can read it in for contextual understanding at scale).

The practical result is a shift from centralized gatekeeping to federated, hub-and-spoke delivery. The semantic layer is the hub: governed, documented, consistent. The spokes are the teams and tools that consume it. A finance analyst pulls data into Excel. A data scientist queries it in Python. An AI agent accesses it via MCP. They all get the same numbers, the same definitions, and the same governance—without a centralized BI team manually ensuring consistency across every output.

Risk reduction, not risk elimination

The semantic layer doesn’t eliminate data risk. The underlying data still needs to be clean, well-structured, and maintained—as every practitioner I’ve spoken with has confirmed, garbage in still produces garbage out. And organizational alignment around metric definitions requires leadership commitment that no software can substitute for.

But the semantic layer changes the economics of data risk. Instead of scaling risk management by adding more people and more governance tools, you reduce the surface area that needs to be managed. Fewer places where logic can diverge. Fewer systems to audit. Fewer opportunities for a metric change to get lost in translation. The problems don’t disappear, but they become containable—manageable in one place rather than scattered across the entire stack.

For organizations serious about AI-driven analytics, this matters more than ever. AI tools need governed, contextualized data to produce trusted outputs. The semantic layer provides that foundation—not just as a nice-to-have for consistency, but as critical risk infrastructure for an era where the cost of bad data is accelerating.

One definition. One access point. One place to govern. That’s not just a better architecture. It’s a better risk strategy.




How to Make Code Highlighting-Friendly


This article introduces the notion of highlighting complexity and provides recipes for making your code highlighting-friendly, resulting in faster, more efficient highlighting.

Code style is not just for style – it impacts the physical world! The benefits of highlighting-friendly code include:

  1. Better responsiveness
  2. Optimized CPU usage
  3. Efficient memory usage
  4. Cooler system temperatures
  5. Quieter operation
  6. Longer battery life

While monads are burritos, you shouldn’t be frying eggs on your laptop!

Consider highlighting complexity

Imagine you’ve written this function to compute Fibonacci numbers using naive recursion:

def fib(n: Int): Int =
  if (n <= 1) n
  else fib(n - 1) + fib(n - 2)

It is predictably slow, but you wouldn’t blame Scala for that. The issue is more fundamental and not specific to the programming language. However, this doesn’t mean that the function cannot be made fast. There is a way to adjust the code so it outputs exactly the same sequence much more efficiently.
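For example, one possible adjustment (a sketch, not the only option) is a tail-recursive version that produces the same sequence in linear rather than exponential time:

import scala.annotation.tailrec

def fib(n: Int): Int = {
  // Carry the last two values forward instead of recomputing the call tree.
  @tailrec
  def loop(i: Int, prev: Int, curr: Int): Int =
    if (i <= 0) prev
    else loop(i - 1, curr, prev + curr)

  loop(n, 0, 1)
}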

The same is true for highlighting code. If highlighting is slow, the IDE is not always to blame. Some code is inherently difficult to analyze. However, this doesn’t mean that highlighting cannot be fast. Minor code tweaks can make highlighting significantly more efficient, even if the code stays essentially the same.

So far, so good. However, while algorithmic complexity is “CS 101”, developers rarely think about highlighting complexity. (The two differ: Code might run slow but be easy to highlight, or run fast but be difficult to highlight.) Even if you study compiler construction, it’s not primarily about performance, and the parts that are about performance refer to compilers rather than source code. Furthermore, batch-compiling code is not the same as editing code.

Following software engineering best practices can often speed up highlighting, and it’s useful in general: keeping your classes and methods small and focused, preferring clarity over cleverness, etc. However, these principles are mostly about cognitive complexity. In contrast to algorithmic complexity, cognitive complexity often correlates with highlighting complexity. Still, they are not the same and can sometimes differ significantly.

When writing code, you should also consider highlighting complexity. If you ignore algorithmic complexity, your code will perform poorly. If you ignore cognitive complexity, your code will be difficult to understand. If you ignore highlighting complexity, your code will take a long time to compile or highlight and will consume excessive resources in the process.

Good code should be good in all respects. Fortunately, the principles for making your code highlighting-friendly are simple and easy to apply in practice. (Most of the recipes are not Scala-specific and can be useful for other languages as well.)

Separate code into modules

Most Scala programmers divide code into packages, but fewer divide code into modules, even though the reason for doing both is the same.

In contrast to a language like C, Scala supports packages, and most Scala projects naturally use them. Modules, however, are a concept of IDEs and build tools rather than the programming language, so they are used less often. Even the Java Platform Module System is mostly about compiled classes and JARs rather than source code.

Modules limit the scopes of bindings and introduce an explicit graph of dependencies – otherwise, any source file could, in principle, depend on any other source file. This limits the scope of incremental compilation and analysis, which makes compilation faster, reduces peak resource consumption, and allows modules to compile in parallel.

Likewise, modules improve the performance of highlighting – an IDE can search for entities and invalidate caches more efficiently. Moreover, this improves the UX by making autocomplete and auto-import more relevant, reducing clutter. Another benefit is that you can compile (or recompile) only part of a project when running an application or a unit test in one of the modules (even if other modules don’t compile cleanly).

Packages are often natural boundaries for modules. If there’s only a single module in your project, or if some modules are too large, consider extracting one or more packages into a separate module. Since the refactoring doesn’t affect packages as such, this should be backward-compatible. Furthermore, you can still package the classes into a single JAR – the refactoring is for the source code, but not necessarily the bytecode.

Note that you must use true modules – using multiple directories or multiple source roots is not the same thing. (See multi-project builds for sbt.)
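For illustration, a minimal multi-project build.sbt might look like this (the module names are invented for the example):

lazy val core = project
  .in(file("core"))

lazy val api = project
  .in(file("api"))
  .dependsOn(core) // an explicit edge in the module dependency graph

lazy val root = project
  .in(file("."))
  .aggregate(core, api)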

Put classes in separate files

The Scala compiler doesn’t limit how many classes you can add to a source file (or how you name that file). This can be useful, but you shouldn’t overuse this capability.

If you modify only one class in a source file, the Scala compiler cannot compile that class separately – it has to compile the entire source file. The same is generally true for IDEs: you open a file, rather than a class, in an editor tab, and the IDE analyzes the entire file. (However, you can use incremental highlighting to overcome this limitation.)

Furthermore, when each class has a file with a dedicated name, it’s easier to find classes and navigate around the project, even without an IDE. You should put classes into corresponding files the same way you put packages into corresponding directories.

Another reason is import statements. While each class requires its own set of imports, defining multiple classes in a single file merges these imports and makes them common. This can slow down the resolution of references. (If there are many imports and imported entities that, in turn, depend on many imports, then there could be a combinatorial explosion.)

If you notice many relatively large classes in a single file, consider extracting classes into separate source files. It’s easy to do and doesn’t affect backward compatibility. (Obviously, companion classes and sealed class hierarchies should remain in the same file.)

Define classes in packages rather than objects

In Scala, packages and objects are similar, and there are even package objects! This makes it possible to put classes in objects rather than packages. However, there are good reasons to avoid that.

First, since each object is contained in a single source file, multiple classes in an object imply multiple classes in a file, which, as we’ve already seen, is not ideal.

Second, this also affects compiled code, not just source files. While every class is compiled to a separate JVM .class file, as if they were defined in a package, there’s only one outline for the object – pickles or TASTy. As a result, both the compiler and IDE have to process multiple classes even if they need to access only one.

Thus, you should normally define classes in packages rather than objects. Leave objects for methods, variables, and types. (And in Scala 3, even top-level definitions can reside in a package.)
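As a sketch (with invented names), the difference looks like this:

// Not ideal: classes nested in an object share one source file and one outline
object model {
  class User
  class Order
}

// Better: classes in a package, one per source file
// model/User.scala
package model
class User

// model/Order.scala
package model
class Order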

Favor small classes and methods

Yes, yes, you already know this. But there’s a twist. When you normally think of “small”, you often think of “simple”. For example, if a class contains only a few methods with descriptive names, the class looks simple, and you don’t have to analyze the code of these methods to understand what they do.

This luxury, however, doesn’t apply to compilers or IDEs. If you open the file, the entire contents will be analyzed, and if the methods (and consequently the class) are large, the analysis will consume time and resources.

Consider splitting large classes and methods into smaller ones, even if they are simple. For highlighting, “lines of code” matter; even a single class or method can be too much if it’s very large.

This also applies to generated sources: If a source file is generated and other sources depend on it, you don’t need to look into that code, but IDEs and compilers still do. When generating code, divide the output into smaller parts – files, classes, and methods; don’t mix everything into one blob.

Depend on interfaces rather than classes

It’s good to “program to an interface” in general, and this can also help with highlighting.

Suppose there is a large class with a few methods that comprise its API. Even if you access only the API, reading the source file requires parsing the entire class, including all the implementation details. And even if you specify the types explicitly, resolving the corresponding references requires processing many imports.

Therefore, if a class is very large, consider extracting an interface instead of referencing the class directly.
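A sketch of what that extraction can look like (all names here are invented):

case class User(id: Long, name: String)

// Clients parse and resolve only this small trait...
trait UserService {
  def find(id: Long): Option[User]
}

// ...while the large implementation lives in its own file.
class DatabaseUserService extends UserService {
  def find(id: Long): Option[User] = None // implementation details elided
}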

Avoid wildcard imports

Using named imports rather than wildcard imports is a well-known best practice. It makes code more readable – you can clearly see where symbols come from. It also makes your code more robust. (Otherwise, code might stop compiling after a library adds a class that conflicts with another imported class.) And there’s less clutter – autocomplete will show only relevant symbols that are actually in use.

Furthermore, named imports can speed up code analysis. When resolving identifiers, each wildcard import has to be checked, and import expressions might, in turn, depend on wildcard imports above. There might be imports from objects, which themselves depend on imports elsewhere. All of that is not limited to the file being highlighted. Even if your code depends only on signatures in other files, because paths in the type annotations are not absolute, the analysis still has to process imports in those files.

Wildcard imports are especially problematic for implicits. Because implicits are, well, implicit, and might require other implicits, searching for them can be computationally intensive. And if implicits are imported using a wildcard, then both the usage and the import are implicit. This complicates the task even more – not only does the analysis need to find some vague entity, but it also has to look in a blurry scope.

Therefore, prefer specific imports to wildcard imports. Convert existing wildcards to named imports. In Scala 2, consider importing implicits by name. Although given imports in Scala 3 are an improvement, they are effectively wildcard imports and thus rely on good library design. To be on the safe side, prefer by-type imports to plain given imports. (And if you’re designing a library, define implicits in a separate package or object.)
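For illustration (MyInstances is a hypothetical object defining implicits/givens), the options look roughly like this, from slowest to fastest to resolve:

// Wildcard: resolving any identifier has to consider this entire scope
import scala.collection.immutable._

// Named: the resolver checks exactly these bindings
import scala.collection.immutable.{SortedMap, SortedSet}

// Scala 3: a by-type given import, narrower than a plain `given` wildcard
import MyInstances.{given Ordering[Int]}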

Prefer imports to mixins

It’s possible to use inheritance instead of imports. We can see this even in Java: Every TestCase is also Assert, so you can access methods such as assertEquals without having to import them. This might seem convenient. However, this is effectively a forced wildcard import, with all the usual drawbacks. It’s better to import Assert.assertEquals selectively (or import Assert.*, as an option).

Furthermore, the approach with subclassing or mixing in traits is slower compared to regular wildcard imports. Analysis has to take inheritance and linearization, as well as overloading and overriding, into account. And if you modify the trait, classes that use it have to be recompiled.

If some definitions are effectively static, put them in an object rather than a trait, so that clients import rather than inherit them.
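A sketch of the two styles (the assertion helpers here are invented, not the JUnit API):

// Mixin style: a forced wildcard import, plus linearization to analyze
trait AssertionsMixin {
  def assertEquals(expected: Any, actual: Any): Unit = assert(expected == actual)
}
class MyTest extends AssertionsMixin

// Import style: static helpers live in an object, imported by name
object Assertions {
  def assertEquals(expected: Any, actual: Any): Unit = assert(expected == actual)
}
class MyOtherTest {
  import Assertions.assertEquals
}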

Declare classes and methods private

There are many good reasons to minimize the accessibility of classes and methods: to distinguish between API and implementation, to maintain source and binary compatibility, to prevent clutter in autocomplete, and to reduce cognitive load.

What’s less known is that declaring classes and methods private, whenever possible, improves the performance of compilation and highlighting. Incremental compilers don’t include private members when determining APIs and thus don’t need to store and compare them. In the process of resolving references, IDEs can skip inaccessible elements faster. When you write “Foo”, you already know which Foo is implied. However, you might be surprised by how much computation resolving a reference often involves. Declaring unsuitable Foos inaccessible helps make analysis faster.

The Scala plugin can help by automatically detecting declarations that can be private.

Specify types of public or complex definitions

Each non-local definition should either be private or have a type annotation. Definitions that are accessible to clients comprise an API. APIs are boundaries of abstraction and thus must be explicit; clients shouldn’t have to study the implementation – the right-hand side – to understand the signature – the left-hand side. In contrast to implementations, APIs must be stable and must not depend on the contents of the right-hand side. Type annotations make APIs both explicit and stable.

Type annotations greatly help incremental computations. When signatures are stable, fewer classes need to be recompiled after a code modification. Likewise, more caches can be reused when you edit code in an IDE, making highlighting faster and reducing resource consumption.

Thus, it’s best to always specify the types of non-private members explicitly. Note that you should specify the type even if there’s overriding because the inferred type might be more specific, at least in Scala 2. (For example, if a superclass method returns Seq[Int] and the subclass method is just = List(1), the type of the latter would be List[Int], which might affect clients that use the subclass directly.) You should also specify the types of protected members, not just public ones – subclasses are also clients. (As an exception, you may omit types when the right-hand side is both simple and stable, e.g., a literal. That said, having the type spelled out explicitly is often better, both for humans and compilers.)
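A sketch of the Seq[Int] example just mentioned:

class Base {
  def xs: Seq[Int] = Vector(1)
}

// Without an annotation, the inferred type is List[Int], not Seq[Int],
// so clients of Sub can come to depend on List-specific behavior.
class Sub extends Base {
  override def xs = List(1)
}

// An explicit annotation keeps the API stable:
class StableSub extends Base {
  override def xs: Seq[Int] = List(1)
}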

Furthermore, explicit types can benefit even private and local definitions. While an incremental compiler recompiles the entire file, an IDE can invalidate caches more gradually and within a narrower scope. Thus, add type annotations to private members if they are complex – this can make editing code more efficient. Also, specify the types of complex local variables. (Sometimes you may first need to extract a method or introduce a variable to specify the type.)

Code Style | Type Annotation in the Scala plugin requires type annotations for public and protected members – they are automatically added by refactorings and code generation, and are checked by the corresponding inspection. However, there are exceptions for simple expressions, and they are not required for private or local definitions, regardless of complexity. You can make these settings stricter to be on the safe side.

Favor standard language features over macros

The concept behind macros might seem tempting – you do computations at compile time rather than at runtime. However, “compile time” is also “highlighting time”, which is true regardless of whether you use a compiler or an IDE when editing code… unless you always write everything in one go, without any assistance. So, macros might interfere with writing and editing code, making feedback slower and consuming more resources. Note that this applies not just to defining a macro, which requires a feature flag, but also to using macros, which doesn’t require a feature flag.

Macros are rarely actually needed. Take, for example, Lisp: its syntax is very limited, and the language is dynamic, so no static analysis is performed anyway; there, macros add expressive power at little cost. Scala, however, is a very expressive language as it is, and it’s statically typed. In Scala, the standard language features are sufficient for most tasks, and macros only make static analysis, as well as understanding code, more difficult. Thus, when writing code, reach for the standard language features first: type parameters, implicit parameters, etc. Macros are supposed to be a last resort, not a go-to solution.

This can be generalized: Don’t use complex language features just “because you can”, only when they are really needed; prefer the least powerful solution that solves the problem. For more details on this topic, see Lean Scala by Martin Odersky.

Apply these principles to AI-generated code

Even if you use AI to generate 100% of your code, you still read that code. (Right?) Therefore, producing highlighting-friendly code is as relevant as ever – the code is generated in a data center but is highlighted on your machine. This also improves incremental compilation, reducing system load when using agents. Moreover, it prevents context stuffing (when a model loads irrelevant information), which improves accuracy and reduces costs.

The first thing you can do is lead AI by example, because models tend to propagate existing conventions and coding styles. In a new project, you can explicitly add recommendations to AGENTS.md. Last but not least, you can always refactor your code, whether it’s written by a human or AI.

Summary

That said, the performance of your IDE is also important. We’re constantly working on improving the performance of both IntelliJ IDEA and the Scala plugin, and there are tips for improving performance that you can apply in practice. However, just as no amount of compiler optimizations can fix the example with naive recursion, highlighting may sometimes require assistance from your side.

As with everything, highlighting complexity is not the only factor; you need to balance different considerations. But often, there’s no contradiction: clean code tends to reduce highlighting complexity, and reducing highlighting complexity results in cleaner code. In any case, it’s useful to always consider highlighting complexity and to keep these recipes at hand.

For more details, see the corresponding ticket in YouTrack. It also lists features that can help you apply the refactorings more easily. If you find them useful, vote for the tickets so we know there is demand.

If you have any questions, feel free to ask us on Discord.

Happy developing!

The Scala team at JetBrains


AgentCon Around the World: How MVPs Brought AI Agents to Life


Over the past months, AgentCon has taken shape as a series of grassroots events organized by Microsoft MVPs across the globe—and events are still being added to the calendar. Throughout 2026, AgentCon events are bringing together thousands of developers across multiple cities worldwide, offering community-led, free learning experiences focused on AI agents and real-world applications.

Discussing AI agents… with a side of amazing food in Casablanca!

From hands-on workshops to deep technical dives and community-led conversations, AgentCon is more than just an event brand—it’s a shared movement. MVPs take complex topics such as AI agents, orchestration, and real-world scenarios and translate them into practical, approachable learning experiences tailored to the needs of their local communities.

A Community-Driven Take on AI Agents 

AI agents are no longer a future concept. They are already reshaping how developers, architects, and organizations design and build solutions. 

What makes AgentCon special is how MVPs approach the topic: 

  • Making advanced concepts approachable through live demos and hands-on labs
  • Connecting theory with real-world use cases, from productivity to automation
  • Creating safe, inclusive spaces for learning, asking questions, and experimenting

Each AgentCon reflects the unique needs and interests of its local audience. Together, these events form a shared global learning journey.

Voices from the Community  

"AgentCon in Cologne made it a very important day for our local community. We hosted the event at BarmeniaGothaer, one of Germany’s largest insurance companies actively creating and using AI agents. This allowed the community to learn directly from realworld scenarios and practitioners.” MVP and RD Raphael Köllner 

“AgentCon Chennai was special because it gave our local community a chance to explore AI agents together in a very hands-on way. What excited me most was seeing students, professionals, and researchers all learning side by side, and the energy in the room was incredible. My biggest takeaway was how quickly people realized that agentic systems are not just theory—they’re tools we can start building with today.” – MVP Saravanan Ganesan

“The best thing about exploring AI agents together is the interest and the ‘wow’ feeling of the audience when AI agent demos were presented by different speakers in different contexts.” – MVP Hassan Fadili

MVP Impact in Action 

AgentCon events showcase what MVP impact looks like on the ground: 

  • Technical leadership: MVPs share concrete implementations, best practices, and lessons learned from working with AI agents and related technologies. 
  • Knowledge sharing: Sessions often combine code walkthroughs, architecture discussions, and live Q&A, encouraging curiosity and dialogue. 
  • Community building: Many events are organized in collaboration with user groups, meetups, and local tech communities—extending reach beyond the MVP Program itself. 

A Truly Global Effort 

One of the most inspiring aspects of AgentCon is its global and inclusive nature. MVPs host events across different regions, languages, and formats—online and in-person—helping ensure access for a broad and diverse audience. 

Some communities focus on introductory concepts, while others explore advanced scenarios. Some events center on development, others on architecture or applied AI. All of them share a common goal: empowering people to learn and grow together. 

Thank You to the MVPs Who Made It Happen 

AgentCon is a testament to the passion, generosity, and leadership of the MVP community. Every agenda created, every session delivered, and every question answered contributes to a shared success—so far, and as the series continues.  

To all the MVPs who organized, spoke at, and supported AgentCon events—thank you for investing your time, expertise, and energy to uplift others. AgentCon events are still taking place – find one near you: AgentCon - Global AI Community 

Join the Conversation 

Did you attend or organize an AgentCon event? What did you learn about AI agents? What inspired you most from your local community? 

Share your experience in the comments, or tell us your MVP story. If you’re passionate about technology, community, and knowledge sharing, this might be your moment to take the next step: Nominate or become a Microsoft MVP. 

Want to Learn More About the MVP Program? 

To find an MVP and learn more about the MVP Program, visit the MVP Communities website and follow our updates on LinkedIn or #mvpbuzz.

Join us for a future live session through the Microsoft Reactor where we walk through what the MVP Program is about, what we look for, and how nominations work. These sessions are designed to help you connect the dots between the work you’re already doing and the impact the MVP Program recognizes — with time for questions, examples, and real conversations. 


Random.Code() - Adding Replacement Capabilities to Rocks, Part 1

From: Jason Bock
Duration: 0:00
Views: 15

In this stream, I start the work to add a feature to Rocks that will let developers change expectations if needed.

https://github.com/JasonBock/Rocks/issues/410

#dotnet #csharp #roslyn


Leadership as a Service — Why Scrum Masters Should Work Themselves Out of a Job and How Quality Circles Make Learning Flow | Peter Merel


Peter Merel: Leadership as a Service — Why Scrum Masters Should Work Themselves Out of a Job and How Quality Circles Make Learning Flow

Read the full Show Notes and search through the world's largest audio library on Agile and Scrum directly on the Scrum Master Toolbox Podcast website: http://bit.ly/SMTP_ShowNotes.

 

"A Scrum Master is a self-defeating role. If you have worked yourself out of a job, then you've succeeded." - Peter Merel

 

Peter Merel challenges the very notion of the Scrum Master as a permanent organizational role. He argues that calling someone a "master" makes everyone else a servant — the opposite of what agile teams need. Instead, Peter advocates for leadership as a service, where every team member provides leadership to their team and every member of a swarm provides leadership to their swarm. He points to the Haudenosaunee Confederacy — the successful direct democratic republic that existed in North America before the USA, and which influenced the American founding fathers — as a model for distributed leadership. The protocol is simple enough to apply universally, regardless of organizational structure. Peter's practical approach to success measurement is equally compelling: build a thin steel thread of alignment, prove it works in 8 to 12 weeks, then split it and backfill with the most progressive people in the organization. He describes growing a group of 300 in just 9 months using this approach. The key insight is that coaches should not think of themselves as change agents, but rather as people who transform change participants into change leaders. Once a team can self-organize without you, your job is to move on to the next challenge — and that's what success looks like.

 

In this episode, we refer to the concept of leadership as a service and the XScale Alliance.

 

Self-reflection Question: If you stepped away from your team tomorrow, could they self-organize effectively — and if not, what's the one thing you could teach them this week that would bring them closer to not needing you?

Featured Retrospective Format for the Week: Quality Circles

Peter Merel recommends quality circles as a cross-team retrospective format drawn from the Toyota Production System. The concept is simple but powerful: take three teams of six people and break them into six quality circles of three — one person from each team in each circle. These circles meet regularly for 10 to 30 minutes, ideally before team planning sessions, to share problems, ideas, and ways they can help each other. The magic of three people is that while one person explains, another listens, and the third is already thinking about where the conversation goes next — creating what Peter calls "a beautiful hum." Each circle brings two kinds of ideas back to their team: proposals for work that would benefit the teams as a whole, and treaties — working agreements between teams. The teams remain autonomous and can decide how to respond. Peter emphasizes that this approach scales naturally — representatives from groups of teams can form quality circles at higher levels, keeping face-to-face communication alive across entire organizations. As Peter puts it, "Learnings flow across the organization — and that's more valuable than anything you can come up with in a retrospective by yourself."

 

[The Scrum Master Toolbox Podcast Recommends]

🔥In the ruthless world of fintech, success isn't just about innovation—it's about coaching!🔥

Angela thought she was just there to coach a team. But now, she's caught in the middle of a corporate espionage drama that could make or break the future of digital banking. Can she help the team regain their mojo and outwit their rivals, or will the competition crush their ambitions? As alliances shift and the pressure builds, one thing becomes clear: this isn't just about the product—it's about the people.

 

🚨 Will Angela's coaching be enough? Find out in Shift: From Product to People—the gripping story of high-stakes innovation and corporate intrigue.

 

Buy Now on Amazon

 

[The Scrum Master Toolbox Podcast Recommends]

 

About Peter Merel

 

Credited in the first agile book (XP Embraced), Peter keynoted the first agile conference, invented the first agile training game, founded the XScale Alliance, and authored The Agile Way. He developed software by hand for forty years, coached agile in person for twenty years, and is now working to revolutionize the AI alignment landscape.

 

You can link with Peter Merel on LinkedIn. You can also find his work at agile.way.pm.





Download audio: https://traffic.libsyn.com/secure/scrummastertoolbox/20260507_Peter_Merel_Thu.mp3?dest-id=246429

Comparing Different Approaches to Sandboxing


AI agents will become the primary way we interact with computers in the future. They will be able to understand our needs and preferences, and proactively help us with tasks and decision making.

Satya Nadella

CEO of Microsoft

Whether you are a software engineer, a product manager, or a designer, this quote should fundamentally change how we approach our daily routine. We are no longer just building interfaces; we are creating environments where agents can operate autonomously with minimal human interaction. What could be the fundamental requirement for such an environment?

In a single word: Isolation.

A user interacting with traditional software is constrained by the actions it allows. But agents are non-deterministic, and therefore prone to hallucination and prompt injection. Once you give an AI write access to your systems, there is nothing stopping it from running rm -rf and deleting all your data. Of course, there are different ways to solve this problem, one approach being sandboxing: an isolated, controlled environment used for experimentation and testing without affecting the surrounding system.

So, I started exploring different strategies to sandbox agents, starting with a bare-minimum setup and going all the way to setting up a VM. Here is what I learned at each step.

1. Let’s Start with the Baseline

Chroot has been the traditional way to achieve file system isolation. It works well when you want the process to think that a specific, restricted directory is the absolute root of the machine.

[Image: chroot making a restricted directory appear as the file system root]

However, there are two major caveats.

  1. If the process inside the chroot has root privileges, it could break out.
  2. While it offers file isolation, process isolation is still a problem. A malicious agent can still see other processes running on your system and try to kill them.
[Screenshot: ls /proc inside the chroot jail, listing all host processes]

As you can see above, doing an ls /proc still shows all the processes running on the host.
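To reproduce this on a Linux host (a sketch; the jail path is illustrative, and the jail needs a minimal root filesystem with /proc mounted):

# expose the process table inside the jail
sudo mount -t proc proc /srv/jail/proc
# /srv/jail now appears as / to the shell...
sudo chroot /srv/jail /bin/sh
# ...but listing /proc still reveals host PIDs
ls /proc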

This is when I learned about systemd-nspawn, also called “chroot on steroids”. The difference between chroot and systemd-nspawn is that the latter provides isolation at the network and process levels in addition to the file system.

[Screenshot: launching a container named mybox with systemd-nspawn]

Now, when I run the same ls /proc in the systemd-nspawn mybox container, I see only the processes inside the mybox container, achieving process-level isolation.

[Screenshot: ls /proc inside the mybox container, showing only the container’s processes]
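A minimal way to try this on a Debian-style host (the bootstrap tool and machine name are just one option):

# build a minimal root filesystem for the container
sudo debootstrap stable /var/lib/machines/mybox
# boot into it with file-system and process isolation
sudo systemd-nspawn -D /var/lib/machines/mybox
# add network isolation as well
sudo systemd-nspawn -D /var/lib/machines/mybox --private-network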

Pros

  1. Lightweight compared to other container runtimes like Docker, with faster startup times.
  2. Native support in Linux.

Caveats

  1. systemd-nspawn is not very popular in the developer community unless you are deep into Linux.
  2. While this works for Linux, what if you need to run your agents on Windows? You will have to find alternatives depending on the platform.

2. Are Containers Enough?

Another technology that comes to mind when thinking about isolated environments is Docker. And unlike the previous concepts we discussed, Docker has a broader ecosystem and a strong community.

With containers, you also get isolated file systems, network interfaces, and process trees. They also come with cross-platform support across Mac, Windows, and Linux. With all these advantages, creating and running agents across different platforms becomes very easy, which makes containers an obvious choice.

However, the model becomes more complex when containers become a dev platform for agents. More often than not, agents need to execute generated code in separate environments, which in practice means spinning up new Docker containers on demand. This introduces a container-in-container pattern (Docker-in-Docker), where an agent running inside a container needs to build and run other containers. 

To make Docker-in-Docker work, we would have to run the container in privileged mode (--privileged), which gives the container’s processes elevated permissions and dramatically weakens the isolation guarantees. As a result, achieving complete isolation for agents using only containers becomes tricky.
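For example, a typical Docker-in-Docker setup looks roughly like this, and the --privileged flag is exactly the problem:

# the official dind image cannot run an inner daemon without privileged mode
docker run --privileged -d --name inner docker:dind
# the inner daemon now has broad access to host devices,
# largely negating the isolation of the outer container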

3. Do Virtual Machines Help?

As you might have already predicted, Virtual Machines (VMs) offer the strongest isolation. With a VM, you get an entire OS, file system, and network of your own. For example, I currently run macOS and use Lima, a Linux VM, to run Linux-specific workloads.

[Screenshot: a Lima Linux VM running on macOS]
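If you want to try Lima yourself, a quick sketch (assuming Homebrew):

brew install lima
limactl start default   # boots a full Linux VM; takes tens of seconds
lima uname -a           # runs a command inside the guest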

However, the tradeoff is that spinning up a VM is expensive, and if this needs to be done for every agent, it is not scalable. Here are some rough stats comparing the cost of a VM with that of systemd-nspawn and chroot:

Approach          Per Agent Cost      Boot Time   10 Agents
VM (Lima)         ~4GB RAM + 4 CPU    30-60s      ~40GB RAM
systemd-nspawn    ~10MB RAM           < 1s        ~100MB RAM
chroot            1MB RAM             instant     ~10MB RAM

For example, the screenshot below shows the resource cost of running a Lima VM.

[Screenshot: resource usage of a running Lima VM]

4. MicroVMs to the rescue

A MicroVM (Micro Virtual Machine) felt like the perfect answer to the isolation story. So what is a MicroVM, and what makes it better?

A MicroVM is a lightweight virtualisation technology that provides the strong security and isolation of a traditional VM, along with the speed of a container.

  1. Strong security and isolation are enabled because a MicroVM gets its own kernel, aka the Guest Kernel, unlike containers, which use a shared kernel. Because of this, any compromise inside the Guest OS does not directly affect the host or the other VMs.
  2. Speed: unlike traditional VMs, it is provisioned with minimal hardware (no USB or PCI buses) and bypasses BIOS/UEFI boot, significantly reducing device emulation overhead and startup latency.
[Diagram: MicroVM architecture, with a guest kernel per VM]

Amazon open-sourced Firecracker in 2018, which was the earliest adoption of the MicroVM architecture. While this helped catalyze the MicroVM architecture, Firecracker was restricted to Linux environments. And most agentic orchestration tends to happen on developers’ laptops, which run macOS and Windows as well.

Docker addressed this gap with its Sandbox offering. The best part is their MicroVM-based architecture, which runs natively across macOS, Windows, and Linux, delivering better isolation, faster startup times, and a smoother developer experience. We will learn about this in a bit.

5. gVisor

gVisor takes a unique approach to solving the isolation problem. While the previous strategies use the OS kernel, gVisor implements its own kernel, called the “application kernel”, which runs in user space.

When a standard containerized app wants to do something like open a file, allocate memory, or send network traffic, it makes a “system call” (syscall) directly to the host’s Linux kernel.

With gVisor, your app is bundled with a component called the Sentry.

  1. The Sentry intercepts every single syscall your application makes.
  2. It processes that request in user-space using its own implementation of Linux networking, file systems, and memory management.
  3. If the Sentry absolutely needs the host kernel to do something (like actual disk I/O), it translates the request into an extremely restricted, heavily filtered, safe call to the host.
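In practice, gVisor ships as an OCI runtime called runsc. Once it is installed and registered with Docker, you opt a container into it per run (a sketch):

# assumes runsc is registered as a runtime in /etc/docker/daemon.json
docker run --rm --runtime=runsc ubuntu ls /proc   # syscalls now go through the Sentry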

However, it suffers from the same problems as systemd-nspawn: not much broader community support, and it only supports Linux.

Docker Sandbox

With Docker Sandboxes, AI coding agents run in isolated microVM environments. The performance is as seamless as it can be, identical to running on the host, but with significantly stronger isolation and security. This means you can run your autonomous agents without worrying about host compromise or unintended access to your local environment. 

Sandbox achieves this level of security through three layers of isolation:

[Diagram: the three isolation layers of Docker Sandboxes]

Hypervisor Isolation: Every Sandbox has its own Linux kernel, so anything that affects one sandbox’s kernel will not affect the host or the other sandboxes.

Network Isolation

  • Each Sandbox has its own isolated network, meaning sandboxes cannot communicate with each other or with the host.
  • In addition, network policies can be enforced to allow or disallow traffic from a source.

Docker Engine Isolation

  • This is what made me fall in love with this new architecture. Every Sandbox gets its own Docker Engine. As a result, whenever the agent runs docker pull or docker compose, those commands are executed against the internal engine rather than the external Docker daemon.
  • Because of this, agents running inside can only see Docker services within their sandbox and nothing else, adding an additional layer of security.

Attribute        Traditional VM              Container               Docker MicroVM
Isolation        Strong (dedicated kernel)   Weak (shared kernel)    Strong (dedicated kernel)
Boot time        Minutes                     Milliseconds            Seconds (after the first image pull)
Attack Surface   Large                       Medium                  Minimal

To demonstrate Docker Engine isolation, I created two Sandbox sessions, ran the Docker hello-world container image in one, and then ran docker ps -a in both.

[Screenshot: two Sandbox sessions, with hello-world run in only one of them]

As you can see from the screenshot below, one session has the hello-world container and the other does not. This is possible because the two sessions are running separate Docker Engine daemons.

[Screenshot: docker ps -a output differing between the two sessions]

More on the Sandbox architecture here: https://www.docker.com/blog/why-microvms-the-architecture-behind-docker-sandboxes/

Conclusion

If there is one takeaway, it’s this: isolation plays a major role when building autonomous AI agents because the blast radius of a security mistake is significant.

Each approach we explored solves a different piece of the isolation puzzle. Containers improve portability and developer experience, but inherit the risks of a shared kernel. Virtual Machines deliver strong isolation, but the overhead doesn’t scale when you’re spinning up dozens of agents. gVisor sits in an interesting middle ground, though compatibility and community trade-offs might slow you down.

Among all these, what makes Docker Sandbox with MicroVMs compelling is how it unifies these dimensions: VM-level security, container-like startup speed, and a workflow developers already know. Per-sandbox Docker Engines and strict network boundaries make it a strong foundation for running untrusted, autonomous workloads at scale.

So, what are you waiting for? Go ahead and try it out today.

For macOS: brew install docker/tap/sbx

For Windows: winget install Docker.sbx
