In logs as in life, the relationships are the most important part. AI doesn’t fix this. It makes it worse.
After twenty years of devops, most software engineers still treat observability like a fire alarm: something you check when things are already on fire.
Not a feedback loop you use to validate every change after shipping. Not the essential, irreplaceable source of truth on product quality and user experience.
This is not primarily a culture problem, or even a tooling problem. It's a data problem. The dominant model for telemetry collection stores each type of signal in a different "pillar", which rips the fabric of relationships apart, irreparably.
The three pillars model works fine for infrastructure[1], but it is catastrophic for software engineering use cases, and will not serve for agentic validation.
But why? It's a flywheel of compounding factors, not just one thing, but the biggest one by far is this:
Your data does not become linearly more powerful as you widen the dataset; it becomes exponentially more powerful. Or, if you really want to get technical, it becomes combinatorially more powerful as you add more context.
I made a little Netlify app here where you can enter how many attributes you store per log or trace, to see how powerful your dataset is.
When you add another attribute to your structured log events, it doesn't just give you "one more thing to query". It gives you new combinations with every other field that already exists.
Note that this math is exclusively concerned with attribute keys. Once you account for values, the precision of your tooling goes higher still, especially if you handle high cardinality data.
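That combinatorial growth is easy to sketch in code. This is a toy illustration, not the Netlify app mentioned above; the `max_group` cutoff of three keys per query is an arbitrary assumption.

```python
from math import comb

def query_combinations(n_attrs: int, max_group: int = 3) -> int:
    """Count the distinct groupings of up to `max_group` attribute
    keys you could filter or break down by together."""
    return sum(comb(n_attrs, k) for k in range(1, max_group + 1))

# Each added attribute key multiplies the useful query surface,
# because it pairs (and triples) with every key already present.
for n in (5, 10, 20, 40):
    print(n, query_combinations(n))
```

Going from 5 attributes to 40 takes you from a couple dozen groupings to over ten thousand, before you even consider values.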
"Data is made valuable by context" is another way of saying that the relationships between attributes are the most important part of any data set.
This should be intuitively obvious to anyone who uses data. How valuable is the string "Mike Smith", or "21 years old"? Stripped of context, they hold no value.
By spinning your telemetry out into siloes based on signal type, the three pillars model ends up destroying the most valuable part of your data: its relational seams.
I posted something on LinkedIn yesterday, and got a pile of interesting comments. One came from Kyle Forster, founder of an AI-SRE startup called RunWhen, who linked to an article he wrote called "Do Humans Still Read Logs?"

In his article, he noted that <30% of their AI SRE tool's queries were to "traditional observability data", i.e. metrics, logs and traces. Instead, they used the instrumentation generated by other AI tools to wrap calls and queries. His takeaway:
Good AI reasoning turns out to require far less observability data than most of us thought when it has other options.
My takeaway is slightly different. After all, the agent still needed instrumentation and telemetry in order to evaluate what was happening. That's still observability, right?
But as Kyle tells it, the agents went searching for a richer signal than the three pillars were giving them. They went back to the source to get the raw telemetry, before it was digested into pillars, with all its connective tissue intact. That's how important it was to them.
Huh.
I've been hearing a lot of "AI solves this", and "now that we have MCPs, AI can do joins seamlessly across the three pillars", and "this is a solved problem".
Mmm. Joins across data siloes can be better than nothing, yes. But they don't restore the relational seams. They don't get you back to the mathy good place where every additional attribute makes every other attribute exponentially more valuable. At agentic speed, that reconstruction becomes a bottleneck and a failure surface.

Our entire industry is trying to collectively work out the future of agentic development right now. The hardest and most interesting problems (I think) are around validation. How do we validate a change rate that is 10x, 100x, 1000x greater than before?
I don't have all the answers, but I do know this: agents are going to need production observability with speed, flexibility, TONS of context, and some kind of ontological grounding via semantic conventions.
In short: agents are going to need precision tools. And context (and cardinality) are what feed precision.
Production is a noisy, rowdy place of chaos, particularly at scale. If you are trying to do anomaly detection with no a priori knowledge of what to look for, the anomaly has to be fairly large to be detected. (Or else you're detecting hundreds of "anomalies" all the time.)
But if you do have some knowledge of intent, along with precision tooling, these anomalies can be tracked and validated even when they are exquisitely minute. Like even just a trickle of requests[2] out of tens of millions per second.
Let's say you work for a global credit card provider. You're rolling out a code change to partner payments, which are "only" tens of thousands of requests per second, a fraction of your total request volume of tens of millions of req/sec, but an important one.
This is a scary change, no matter how many tests you ran in staging. To test this safely in production, you decide to start by rolling the new build out to a small group of employee test users, and oh, what the hell: you make another feature flag that lets any user opt in, and flip it on for your own account.
You wait a few days. You use your card a few times. It works (thank god).
On Monday morning you pull up your observability data and select all requests containing the new build_id or commit hash, as well as all of the feature flags involved. You break down by endpoint, then start looking at latency, errors, and distribution of request codes for these requests, comparing them to the baseline.

Hm, something doesn't seem quite right. Your test requests aren't timing out, but they are taking longer to complete than the baseline set. Not for all requests, but for some.
Further exploration lets you isolate the affected requests to a set with a particular query hash. Oops… how'd that n+1 query slip in undetected?
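The investigation above can be sketched as a couple of filters over wide events. Everything here is hypothetical: the build_id, flag name, query hashes, latencies, and the 3x-median threshold are all invented for illustration.

```python
from statistics import median

# Hypothetical wide events: one dict per request, with the build,
# feature flags, endpoint, query hash, etc. all on the same event.
events = [
    {"build_id": "b42", "flag_partner_pay_v2": True,
     "endpoint": "/charge", "query_hash": "qh_9f", "duration_ms": 310},
    {"build_id": "b42", "flag_partner_pay_v2": True,
     "endpoint": "/charge", "query_hash": "qh_1a", "duration_ms": 45},
    {"build_id": "b41", "flag_partner_pay_v2": False,
     "endpoint": "/charge", "query_hash": "qh_1a", "duration_ms": 48},
]

# Step 1: isolate canary traffic by build_id + feature flag.
canary = [e for e in events
          if e["build_id"] == "b42" and e["flag_partner_pay_v2"]]
baseline = [e for e in events if e["build_id"] != "b42"]

# Step 2: compare canary latency to the baseline, broken down by
# query_hash; flag anything well above the baseline median.
base_median = median(e["duration_ms"] for e in baseline)
suspects = {e["query_hash"] for e in canary
            if e["duration_ms"] > 3 * base_median}
print(suspects)
```

The point is that this query is only expressible at all because build_id, feature flags, and query_hash live on the same events; split them across pillars and there is nothing to filter on.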
You quickly submit a fix, ship a new build_id, and roll your change out to a larger group: this time, itâs going out to 1% of all users in a particular region.
The anomalous requests may have been only a few dozen per day, spread across many hours, in a system that served literally billions of requests in that time.

Precision tooling makes them findable. Imprecise tooling makes them unfindable.
How do you expect your agents to validate each change, if the consequences of each change cannot be found?[3]
Well, one might ask, how have we managed so far? The answer is: by using human intuition to bridge the gaps. This will not work for agents. Our wisdom must be encoded into the system, or it does not exist.
In the past, excruciatingly precise staged rollouts like these have been mostly the province of your Googles and Facebooks. Progressive deployments have historically required a lot of tooling and engineering resources.
Agentic workflows are going to make these automated validation techniques much easier and more widely used; at the exact same time, agents developing to spec are going to require a dramatically higher degree of precision and automated validation in production.
It is not just the width of your data that matters when it comes to getting great results from AI. There's a lot more involved in optimizing data for reasoning, attribution, or anomaly detection. But capturing and preserving relationships is at the heart of all of it.
In this situation, as in so many others, AI is both the sickness and the cure[4]. Better get used to it.
1 — Infrastructure teams use the three pillars for one extremely good reason: they have to operate a lot of code they did not write and cannot change. They have to slurp up whatever metrics or logs the components emit and store them somewhere.
2 — Yes, there are some complications here that I am glossing past, ones that start with "s" and rhyme with "ampling". However, the rich data + sampling approach to the cost-usability balance is generally satisfied by dropping the least valuable data. The three pillars approach to the cost-usability problem is generally satisfied by dropping the MOST valuable data: cardinality and context.
3 — The needle-in-a-haystack is one visceral illustration of the value of rich context and precision tooling, but there are many others. Another example: wouldn't it be nice if your agentic task force could check up on any diffs that involve cache key or schema changes, say, once a day for the next 6-12 months? These changes famously take a long time to manifest, by which time everyone has forgotten that they happened.
4 — One sentence I have gotten a ton of mileage out of lately: "AI, much like alcohol, is both the cause of and solution to all of life's problems."
I was thinking about some of the points from the Polyglot Conf list of predictions for Gen AI, titled “Second Order Effects of AI Acceleration: 22 Predictions from Polyglot Conference Vancouver“. One thing that stands out to me, and I’m sure many of you have read about the scenario, is misplaced keys, tokens, passwords and usernames, or whatever other security collateral left in a repo. It’s been such an issue that orgs like AWS have set up triggers: when they find keys on the internet, they trace them back and try to alert their users (i.e. if a user of theirs has stuck account keys in a repo). It’s wild how big of a problem this is.
Once you've spent any serious amount of time inside corporate IT, you eventually come to a slightly uncomfortable realization. Exponentially so if you focus on InfoSec or other security-related things. Security, broadly speaking, is not in a particularly great state.
That might sound dramatic, but it's not really. It is the standard modus operandi of corporate IT. The cost of really good security is too high for most corporations to focus where they should, and when corporations do focus on security, they often miss the forest for the trees. There are absolutely teams doing excellent security work, so don't get the idea I'm saying there aren't solid people doing the work to secure systems and environments. There are organizations that invest heavily in it. There are people in security roles who take the mission extremely seriously and do very good engineering.
A lot of what passes for security is really just a mixture of documentation, policy, and a little bit of obscurity. Systems are complicated enough that people assume things are protected. Access is restricted mostly because people don't know where to look. Credentials are hidden in configuration files or environment variables that nobody outside the team sees.
And that becomes the de facto security posture.
Not deliberate protection.
Just… quiet obscurity.
I've lost count of the number of times I've been pulled into a system review, or some troubleshooting session, where a secret shows up in a place it absolutely shouldn't be. An API key sitting in a script. A database password in a config file. An environment file committed to a repository six months ago that nobody noticed.
That sort of thing happens constantly. Not out of malice. Out of convenience. But now we've introduced something new into the environment.
Generative AI.
More importantly though, the agentic tooling built around it. Tooling that literally takes actions on your behalf. Tools that can read entire repositories, analyze logs, scan infrastructure configuration, generate code, and help debug systems in seconds. Tools that engineers increasingly rely on as a kind of external thinking partner while they work through problems.
All that benefit comes with the AI tools. However, AI doesn't care about the secret; it's just processing text. But the act of pasting it there matters, because the moment that secret leaves your controlled environment, you no longer know exactly where it goes, how it's stored, or how long it persists in the LLM.
The mental model a lot of people are using right now is wrong. They treat AI like a scratch pad or an extension of their own thoughts.
It isn't.
The more accurate model is this: an AI tool is another resource participating in your workflow. Another staff member, effectively.
Except instead of being a person sitting at the desk next to you, it's a system operated by someone else, running on infrastructure you don't control, processing information you send to it. Including keys and secrets.
Once you start looking at it that way, a few things become obvious. You wouldn't casually hand a contractor your production API keys while asking them to help debug something. You wouldn't drop a full .env file containing service credentials into a conversation with someone who doesn't actually need those values.
Yet that is exactly the pattern that is quietly emerging with generative AI tools. Especially among new users of said tools! Developers paste configuration files, snippets of infrastructure code, environment variables, connection strings, and logs directly into prompts because it's the fastest way to get an answer.
It feels harmless. But secrets have a way of spreading through systems once they start moving.
The real issue here is that generative AI doesn't create security problems. It amplifies the ones that already exist. Problems that the industry has failed (miserably, might I add) at solving. If an organization already has sloppy credential management, AI just gives those credentials another place to leak. If engineers already pass secrets around informally to get work done, AI becomes another convenient channel for that behavior.
And because AI tools accelerate everything, they accelerate the consequences too. What used to take hours of searching through documentation can now happen instantly. A repository full of configuration files can be analyzed in seconds. Systems that were once opaque are now far easier to reason about.
The Takeaway (Including secrets!)
The practical takeaway here isn't that people should stop using AI tools. That's not realistic and frankly a career limiting maneuver at this point. The tools are genuinely useful and they're going to become a permanent part of how software gets built.
What needs to change – desperately – is operational discipline.
Secrets should never be treated casually, and that includes interactions with generative systems. API keys, tokens, passwords, certificates, environment files, connection strings: none of those belong in prompts or screenshots or debugging sessions with external tools.
If you need to ask an AI for help, scrub the sensitive pieces first. Replace real values with placeholders. Remove anything that grants access to a system. Set up ignore rules for your env files, and don't let production env values (or vault values, or whatever you're using) leak into your generative AI systems.
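A minimal sketch of that scrubbing step, assuming regex-based redaction is acceptable for your case. The patterns below are illustrative, not exhaustive; real scrubbers need provider-specific rules, and the sample strings are invented.

```python
import re

# Hypothetical patterns: one provider-shaped key format plus a
# generic key=value / key: value catch-all. Extend per provider.
SECRET_PATTERNS = [
    (re.compile(r"AKIA[0-9A-Z]{16}"), "<AWS_ACCESS_KEY>"),
    (re.compile(r"(?i)(api[_-]?key|token|password)\s*[=:]\s*\S+"),
     r"\1=<REDACTED>"),
]

def scrub(text: str) -> str:
    """Replace likely secrets with placeholders before the text
    goes anywhere near a prompt."""
    for pattern, replacement in SECRET_PATTERNS:
        text = pattern.sub(replacement, text)
    return text

snippet = "DB_PASSWORD=hunter2\naws_key: AKIAABCDEFGHIJKLMNOP"
print(scrub(snippet))
```

Even a crude filter like this, run as a pre-commit hook or a clipboard step, catches the most common accidents; the point is that the scrubbing happens before the secret leaves your environment, not after.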
Treat every AI interaction the same way you would treat a conversation with an engineer outside your team, or better yet outside the company (or government, etc.) altogether.
But not someone you hand the keys to the kingdom. Don't hand them to your AI tooling either.