Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.
150582 stories
·
33 followers

Upskilling your agents

1 Share

In this adventure, we sit down with Dan Wahlin, Principal of DevRel for JavaScript, AI, and Cloud at Microsoft, to explore the complexities of modern infrastructure. We examine how cloud platforms like Azure function as "building blocks", which of course can quickly become overwhelming without the right instruction manuals. To bridge this gap, one potential solution we discuss is the emerging reliance on AI "skills": specialized markdown files that give coding agents the exact knowledge needed to deploy complex, poorly documented open-source projects to container apps without requiring deep infrastructure expertise.

         

And we are saying the quiet part out loud, as we review how handing the keys over to autonomous agents introduces terrifying new attack vectors: the security nightmare of prompt injections and the careless execution of unvetted AI skills. It is a blast from the past, and we reminisce about how downloading random agent instructions today resembles running untrusted executables from early internet sites. While tools like OpenClaw purport to offer incredible automation, such as allowing agents to scour the internet and execute code without human oversight, they have already led to disastrous leaks of API keys. We emphasize the critical necessity of validating skills through trusted repositories, where even having agents perform security reviews on the code before execution is not enough.

         

Finally, we tackle the philosophical debate around AI productivity and why Dorota's observation that LLMs raise the floor and not the ceiling is so spot on. The standout pick deserves a mention: a fascinating 1983 paper titled "Ironies of Automation" by Lisanne Bainbridge. This paper perfectly predicts our current dilemma: automating systems often leaves the most complex, difficult tasks to human operators, proving that as automation scales, the need for rigorous human monitoring actually increases, destroying the very value that the original innovation was attempting to capture.





Download audio: https://dts.podtrac.com/redirect.mp3/api.spreaker.com/download/episode/70959256/download.mp3
Read the whole story
alvinashcraft
15 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

Meet Claude Mythos: Leaked Anthropic post reveals the powerful upcoming model

Matt Binder reports: An accidental leak has now been officially confirmed by AI company Anthropic regarding its most powerful AI model yet. The model, now known as “Claude Mythos,” was originally uncovered in a report from Fortune. Anthropic has since confirmed the details about the leak to the outlet. The data leak included details about the upcoming release of the...

Source


Nvidia’s NemoClaw has three layers of agent security. None of them solve the real problem.

Low-poly illustration of a crab with outstretched claws, representing the proliferation of 'Claw'-branded agentic AI tools like OpenClaw and NemoClaw.

The speed of LLM adoption demands that we check its trajectory from time to time. CEO Jensen Huang, talking at the Nvidia GPU Technology Conference, covered the growth of agentic computing. Over a two-year period, there has been a 10,000-fold increase in compute demand per user, with overall usage increasing 100 times. That’s a lot of tokens, which is why AI still sucks up a lot of investment dollars.

As we saw last week, the current star of the agentic world in terms of personal-user popularity is definitely OpenClaw, which appears to deliver on many science-fiction dreams of useful talking computers.

So there is no mystery as to why Nvidia backs OpenClaw all the way. It is the most unrestrained form of token use out there. And of course Mr Huang would also encourage companies to adopt an “OpenClaw strategy”. But just like Anthropic, they know they can only embrace the open-source phenomenon while wearing plenty of armour.

Hence, Nvidia launched NemoClaw, which rides the OpenClaw wave, before adding enough guardrails to make it vaguely safer. But unfortunately, NemoClaw doesn’t replace OpenClaw; it sits on top of it.

Hugging the crab

As we see from recent articles, there will be many opportunities to make OpenClaw safer. And just like Anthropic, Nvidia believes the answer to OpenClaw is to let Nvidia protect you from it. For this, they add three security architecture components.

The first piece is policy enforcement — a system heavily used in the last few decades. This is the boundary-setting governance layer that hopes to make sure the teenager returns home before evening.

By constraining filesystem and network access, the hope is that an agent will reason about why it is blocked and propose a policy update that the human user can approve. But if it leaves through the bedroom window, it can bypass you altogether, with you being none the wiser. And this multiplies for multi-agent systems.
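The block-then-propose loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not NemoClaw's actual interface: the names `Policy`, `Proposal`, and `Enforcer` are hypothetical, the allowlists are simple directory prefixes and exact hostnames, and symlinks are not resolved.

```python
import posixpath
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class Policy:
    """Hypothetical allowlist policy: directory prefixes and exact hostnames."""
    allowed_dirs: set[str] = field(default_factory=set)
    allowed_hosts: set[str] = field(default_factory=set)

    def permits_path(self, path: str) -> bool:
        # Collapse ".." segments first so "/workspace/../etc" cannot sneak through.
        # Assumption: symlinks are out of scope for this sketch.
        target = PurePosixPath(posixpath.normpath(path))
        return any(target.is_relative_to(d) for d in self.allowed_dirs)

    def permits_host(self, host: str) -> bool:
        return host.lower() in self.allowed_hosts


@dataclass(frozen=True)
class Proposal:
    """A policy change the agent asks for; nothing changes until a human approves."""
    kind: str    # "host" or "dir"
    value: str
    reason: str


class Enforcer:
    def __init__(self, policy: Policy) -> None:
        self.policy = policy
        self.pending: list[Proposal] = []

    def request_host(self, host: str, reason: str) -> bool:
        """Allow the call if policy permits it; otherwise block and queue a proposal."""
        if self.policy.permits_host(host):
            return True
        self.pending.append(Proposal("host", host.lower(), reason))
        return False

    def approve(self, proposal: Proposal) -> None:
        """Called by the human reviewer, never by the agent itself."""
        self.pending.remove(proposal)
        if proposal.kind == "host":
            self.policy.allowed_hosts.add(proposal.value)
        else:
            self.policy.allowed_dirs.add(proposal.value)
```

Note what the sketch cannot do: it only sees requests that pass through `Enforcer`. Anything the agent does by another route (the bedroom window) never reaches the pending queue at all.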

There is an inherent inefficiency in letting self-evolving agents install packages, learn skills, and spawn subagents only to stop them at the door because you don’t like what they are wearing.

Overall, the more skills the system knows, the less effective policy enforcement is, as it really only learns after the fact. You either stop tasks so often that they are no longer autonomous, or hope you can out-guess a mastermind that you are paying to solve problems 24/7. In reality, the success of any system will be the experience (and cynicism) of the engineers employed to manage it.

The second piece is privacy routing. This is a good way to both control expenses and to stop giving up quite so much of your IP to the cloud providers. (But this doesn’t stop agents from emailing your passwords out because a third party asked nicely.)

Set up well, you decide what stays local and what queries go to the larger cloud models. A router can make decisions about model selection based on cost and an advanced privacy policy. Unlike cloud providers, Nvidia can make good money selling more chips if you try to run heavy inference on your own machines. But it is always sensible to select the right model for the task.
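As a rough illustration of that routing decision, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Router` class, the regex patterns standing in for a privacy policy, the four-characters-per-token estimate, and the pricing figures are all hypothetical, not part of any Nvidia product.

```python
import re
from dataclasses import dataclass

# Hypothetical privacy policy: prompts matching any pattern must stay local.
SENSITIVE_PATTERNS = [
    re.compile(r"\b(api[_-]?key|password|secret)\b", re.IGNORECASE),
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),  # rough email matcher
]


@dataclass
class Router:
    cloud_cost_per_1k_tokens: float  # assumed pricing, in any currency unit
    budget_remaining: float

    def route(self, prompt: str) -> str:
        """Return "local" or "cloud" for a prompt, privacy first, then cost."""
        if any(p.search(prompt) for p in SENSITIVE_PATTERNS):
            return "local"
        # Crude estimate: roughly four characters per token.
        est_tokens = max(1, len(prompt) // 4)
        cost = est_tokens / 1000 * self.cloud_cost_per_1k_tokens
        if cost > self.budget_remaining:
            return "local"
        self.budget_remaining -= cost
        return "cloud"
```

The ordering is the design choice worth noting: the privacy check runs before the cost check, so a sensitive prompt never reaches the cloud even when there is budget to spare. Pattern matching of this kind only catches what it has been told to look for, which is why it does nothing about an agent that is talked into leaking data through another channel.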

The third piece is sandboxed execution. This is vital to prevent a bad process from having simple access to neighbouring agent processes, but it also provides a way to test a system with much lower risk by tracking and inspecting intended network traffic. This is also important for long-running tasks that cannot be trivially tested otherwise. If you just want to run agents in a container, you can try NanoClaw.

But truly, “significant advancement over OpenClaw” is a low bar. I would expect more attempts to build secure products from the ground up, but until that happens, companies will bide their time and see where the very bottom of the security failure trench is, before taking the plunge.

Too many claws

By the end of 2026, many small outfits and global organisations will probably have an agentic strategy. Hence, the increasing number of “claws” out there. DefenseClaw. PicoClaw. ZeroClaw. There probably is a Sanity Claws.

As the corporate market increases its appetite for agentic computing, the next true barrier will be the ability to employ the right staff to control it. While people are warning us about how many developer jobs may be lost (and seeing share prices rise in the hope of lower overheads), what is less discussed is the difficulty of hiring the right people to babysit the new systems. As I’ve mentioned, it is no longer about employing eager young coders — it is more about grizzled vets spotting potential pitfalls throughout the workflow, and working out risk profiles.

The reason why Apple, Google, Microsoft, et al. did not deliver on the early promises of digital assistants and still haven’t is precisely that they can see the problems. In fact, ever since HAL refused to open the pod bay doors, the big companies have been very careful how they frame AI publicly, knowing full well that enough embarrassing failures would cause a hard rejection. That an open-source project like OpenClaw has opened Pandora’s Box is no reason for responsible organisations to ride on hope while underplaying the risks.

The post Nvidia’s NemoClaw has three layers of agent security. None of them solve the real problem. appeared first on The New Stack.


Random.Code() - Managing Properties From Records in C#, Part 3

From: Jason Bock
Duration: 1:19:36
Views: 18

Changing the title, because I've been inspired to broaden the scope of this feature. It's not just about exclusion...

https://github.com/JasonBock/Transpire/issues/44

#dotnet #csharp


Working on products people hate


I’ve worked on a lot of unpopular products.

At Zendesk I built large parts of an app marketplace that was too useful to get rid of but never polished enough to be loved. Now I work on GitHub Copilot, which many people think is crap [1]. In between, I had some brief periods where I worked on products that were well-loved. For instance, I fixed a bug where popular Gists would time out once they got more than thirty comments, and I had a hand in making it possible to write LaTeX mathematics directly into GitHub markdown [2]. But I’ve spent years working on products people hate [3].

If I were a better developer, would I have worked on more products people love? No. Even granting that good software always makes a well-loved product, big-company software is made by teams, and teams are shaped by incentives. A very strong engineer can slightly improve the quality of software in their local area. But they must still write code that interacts with the rest of the company’s systems, and their code will be edited and extended by other engineers, and so on until that single engineer’s heroics are lost in the general mass of code commits. I wrote about this at length in How good engineers write bad code at big companies.

Looking back, I’m glad that people have strongly disliked some of the software I’ve built, for the same reason that I’m glad I wasn’t born into oil money. If I’d happened to work on popular applications for my whole career, I’d probably believe that that was because of my sheer talent. But in fact, you would not be able to predict the beloved and disliked products I worked on from the quality of their engineering. Some beloved features have very shaky engineering indeed, and many features that failed miserably were built like cathedrals on the inside [4]. Working on products people hate forces you to accept how little control individual engineers have over whether people like what they build.

In fact, a reliable engineer ought to be comfortable working on products people hate, because engineers work for the company, not for users. Of course, companies want to delight their users, since delighted users will pay them lots of money, and at least some of the time we’re lucky enough to get to do that. But sometimes they can’t: for instance, they might have to tighten previously-generous usage limits, or shut down a beloved product that can’t be funded anymore. Sometimes a product is funded just well enough to exist, but not well enough to be loved (like many enterprise-grade box-ticking features) and there’s nothing the engineers involved can do about it.

It can be emotionally difficult working on products that people hate. Reading negative feedback about things you built feels like a personal attack, even if the decisions they’re complaining about weren’t your decisions. To avoid this emotional pain, it’s tempting to make the mistake of ignoring feedback entirely, or of convincing yourself that you’re much smarter than the stupid users anyway. Another tempting mistake is to go too far in the other direction: to put yourself entirely “on the user’s side” and start pushing your boss to do the things they want, even if it’s technically (or politically) impossible. Both of these are mistakes because they abdicate your key responsibility as an engineer, which is to try and find some kind of balance between what’s sustainable for the company and what users want. That can be really hard!

There’s also a silver lining to working on disliked products, which is that people only care because they’re using them. The worst products are not hated, they are simply ignored (and if you think working on a hated product is bad, working on an ignored product is much worse). A product people hate is usually providing a fair amount of value to its users (or at least to its purchasers, in the case of enterprise software). If you’re thick-skinned enough to take the heat, you can do a lot of good in this position. Making a widely-used but annoying product slightly better is pretty high-impact, even if you’re not in a position to fix the major structural problems.

Almost every engineer will work on a product people hate. That’s just the law of averages: user sentiment waxes and wanes over time, and if your product doesn’t die a hero it will live long enough to become the villain. Given that, it’s sensible to avoid blaming the engineers who work on unpopular products. Otherwise you’ll end up blaming yourself, when it’s your turn, and miss the best chances in your career to have a real positive impact on users.


  1. We used to be broadly liked, then disliked when Cursor and Claude Code came out, and now I’m fairly sure the Copilot CLI tool is changing people’s minds again. So it goes.

  2. Although even that got some heated criticism at the time.

  3. Of course, I don’t mean “every single person hates the software”, or even “more than half of its users hate it”. I just mean that there are enough haters out there that most of what you read on the internet is complaints rather than praise.

  4. This is reason number five thousand why you can’t judge the quality of tech companies from the outside, no matter how much you might want to (see my post on “insider amnesia”).


Liberate your OpenClaw
