Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Random.Code() - General Refactorings in Rocks, Part 1

1 Share
From: Jason Bock
Duration: 0:00
Views: 12

In this stream, I'll start doing some clean-up code in Rocks that I've been meaning to do for a while.

https://github.com/JasonBock/Rocks/issues/408

#csharp #dotnet #roslyn

Read the whole story
alvinashcraft
36 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

The design process is dead. Here’s what’s replacing it. | Jenny Wen (head of design at Claude)


Jenny Wen leads design for Claude at Anthropic. Prior to this, she was Director of Design at Figma, where she led the teams behind FigJam and Slides. Before that, she was a designer at Dropbox, Square, and Shopify.

We discuss:

1. Why the classic discovery → mock → iterate design process is becoming obsolete

2. What a day in the life of a designer at Anthropic looks like, including her AI tool stack

3. Whether AI will eventually surpass humans in taste and judgment

4. Why Jenny left a director role at Figma to return to IC work at Anthropic

5. The three archetypes Jenny is hiring for now

6. Why chatbot interfaces may be more durable than most people expect

Brought to you by:

Mercury—Radically different banking: https://mercury.com/?utm_source=lennys&utm_medium=sponsored_newsletter&utm_campaign=26q1_brand_campaign

Orkes—The enterprise platform for reliable applications and agentic workflows: https://www.orkes.io/

Omni—AI analytics your customers can trust: https://omni.co/lenny

Episode transcript: https://www.lennysnewsletter.com/p/the-design-process-is-dead

Archive of all Lenny's Podcast transcripts: https://www.dropbox.com/scl/fo/yxi4s2w998p1gvtpu4193/AMdNPR8AOw0lMklwtnC0TrQ?rlkey=j06x0nipoti519e0xgm23zsn9&st=ahz0fj11&dl=0

Where to find Jenny Wen:

• X: https://x.com/jenny_wen

• LinkedIn: https://www.linkedin.com/in/jennywen

• Substack: https://jennywen.substack.com

• Website: https://jennywen.ca

Where to find Lenny:

• Newsletter: https://www.lennysnewsletter.com

• X: https://twitter.com/lennysan

• LinkedIn: https://www.linkedin.com/in/lennyrachitsky/

In this episode, we cover:

(00:00) Introduction to Jenny Wen

(04:23) Why the traditional design process is dead

(06:33) The two new types of design work

(10:00) How widespread this shift will be

(13:00) Day-to-day life as a designer at Anthropic

(18:45) Jenny’s AI stack

(20:03) Why Figma still matters for exploration

(22:25) Advice for working with engineers

(24:19) How to maintain craft, quality, and trust in the AI era

(27:35) Will AI ever have “taste”?

(31:38) The future of chatbot interfaces

(35:33) Moving from director back to IC

(41:00) The 10-day build of Claude Cowork

(46:06) Hiring: the three archetypes

(50:44) Advice for new and senior designers

(54:42) The value of “low leverage” tasks for managers

(57:52) Why the best teams roast each other

(01:01:45) The legibility framework

(01:07:22) Lightning round and final thoughts

Referenced:

• Figma: https://www.figma.com

• Anthropic: https://www.anthropic.com

• v0: https://v0.app

• Navigating a Design Career with Jenny Wen | Figma at Waterloo: https://www.youtube.com/watch?v=OHcBPMh2ivk

• Claude Cowork: https://claude.com/product/cowork

• Use Claude Code in VS Code: https://code.claude.com/docs/en/vs-code

• Claude Code in Slack: https://code.claude.com/docs/en/slack

• Lex Fridman’s website: https://lexfridman.com

• Head of Claude Code: What happens after coding is solved | Boris Cherny: https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens

• OpenClaw: https://openclaw.ai

• OpenAI’s CPO on how AI changes must-have skills, moats, coding, startup playbooks, more | Kevin Weil (CPO at OpenAI, ex-Instagram, Twitter): https://www.lennysnewsletter.com/p/kevin-weil-open-ai

• Marc Andreessen: The real AI boom hasn’t even started yet: https://www.lennysnewsletter.com/p/marc-andreessen-the-real-ai-boom

• Socratica: https://www.socratica.info

• Anthropic’s CPO on what comes next | Mike Krieger (co-founder of Instagram): https://www.lennysnewsletter.com/p/anthropics-cpo-heres-what-comes-next

• Radical Candor: From theory to practice with author Kim Scott: https://www.lennysnewsletter.com/p/radical-candor-from-theory-to-practice

• Evan Tana’s ‘legibility matrix’ on X: https://x.com/evantana/status/1927404374252269667

• How to spot a top 1% startup early: https://www.lennysnewsletter.com/p/how-to-spot-a-top-1-startup-early

• Palantir: https://www.palantir.com

• Stripe: https://stripe.com

• Linear: https://linear.app

• Notion: https://www.notion.com

• Julie Zhuo’s website: https://www.juliezhuo.com

• Sentimental Value: https://www.imdb.com/title/tt27714581

• The Pitt on Prime Video: https://www.amazon.com/The-Pitt-Season-1/dp/B0DNRR8QWD

• Noah Wyle: https://en.wikipedia.org/wiki/Noah_Wyle

• ER on Prime Video: https://www.amazon.com/gp/video/detail/B0FWZSDYRP

• Retro: https://retro.app

• Granola: https://www.granola.ai

Recommended books:

Radical Candor: Be a Kick-Ass Boss Without Losing Your Humanity: https://www.amazon.com/Radical-Candor-Kick-Ass-Without-Humanity/dp/1250103509

The Power Broker: Robert Moses and the Fall of New York: https://www.amazon.com/Power-Broker-Robert-Moses-Fall/dp/0394480767

Insomniac City: New York, Oliver Sacks, and Me: https://www.amazon.com/Insomniac-City-New-York-Oliver/dp/162040494X

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@lennyrachitsky.com.

Lenny may be an investor in the companies discussed.



To hear more, visit www.lennysnewsletter.com



Download audio: https://api.substack.com/feed/podcast/188846384/24716a31c4c42fdb82ccbe05ad197054.mp3

AWS Just Told Every Hospital CIO That Microsoft Was Right About Agents


Why Amazon's OpenAI Deal Is the Biggest Admission of Infrastructure Defeat in the AI Wars, and What It Means for Healthcare

On February 27, Amazon CEO Andy Jassy went on CNBC with Sam Altman and announced a "Stateful Runtime Environment" that AWS and OpenAI will co-create. A system where AI agents can maintain context, remember prior work, call tools, and access compute. He described it as the next generation of how developers will build AI applications.

Then he said something remarkable. "There's nothing else like that today."

I've been building healthcare AI systems for over 30 years. I built symbolic AI triage engines in 2001, migrated 50+ HIPAA workloads to Azure, and today I advise health systems on mission-critical AI infrastructure. So when the CEO of Amazon makes a claim like that on national television, I pay attention.

And I can tell you he's wrong.

Microsoft Shipped This Nine Months Ago

Azure AI Foundry Agent Service went generally available in May 2025. Not as a concept. Not as a preview. As a production-ready, enterprise-grade platform that over 10,000 customers have already built on.

Everything Jassy described as groundbreaking is already running in production on Azure. The Foundry runtime automatically manages state across messages, tool calls, agent configurations, and metadata. Conversations persist across sessions. Agents can pick up where they left off without developers manually threading context back in.

But Microsoft went further than what AWS is even promising. Foundry supports multi-agent workflows with a structured orchestration layer that coordinates multiple agents across complex, multi-step processes. It handles context management, error recovery, and long-running durability out of the box. Then at Ignite in November 2025, Microsoft added managed long-term memory to the service. A persistent state layer that automatically extracts, consolidates, and retrieves user context across sessions and devices. Microsoft's own AI Research director called it turning memory from "a demo feature into an enterprise primitive."

And here's the part that should really get your attention. Azure is the only cloud that offers both Anthropic's Claude and OpenAI's GPT models under one governance umbrella. AWS just locked itself into a single-vendor model dependency with this deal. Microsoft gives you choice.

Why This Matters More in Healthcare Than Anywhere Else

If you're a CMIO or CIO at a health system, you already know that infrastructure decisions in healthcare are essentially irreversible. You don't get to run a six-month pilot on one cloud, decide you don't like it, and casually migrate your HIPAA workloads somewhere else. The switching costs are brutal. The compliance burden is real. And the downstream impact on clinical workflows, provider experience, and patient safety is too high to get wrong.

That's why the timing of this announcement matters so much. AWS is telling the market, right now, that it doesn't have competitive agentic infrastructure. And it's asking you to wait "a few months" while it catches up.

Meanwhile, the clinical use cases for AI agents aren't waiting.

Think about a patient discharge workflow. You need agents coordinating across clinical documentation, pharmacy reconciliation, insurance verification, follow-up scheduling, and transport. That's not a single model call. That's a multi-step, multi-agent orchestration problem that requires persistent state, error recovery, and real-time coordination across systems. Foundry handles this today. AWS is announcing the concept.

Or consider prior authorization. An agent that monitors submissions, tracks status, escalates denials, and coordinates between payer and provider systems needs to maintain context across days or weeks of interaction. That requires the kind of managed long-term memory that Microsoft is already shipping. Not the kind that AWS is promising to build.

And then there's governance. In a hospital, an AI agent can't operate in a vacuum. It needs identity management through Entra ID, data governance through Purview, HIPAA-grade audit trails, and role-based access controls. Foundry ships with all of that baked into the runtime. AWS is going to have to build or bolt on that entire trust layer around whatever OpenAI delivers. For a regulated industry, that gap matters enormously.

Most large health systems are also already deep in the Microsoft ecosystem. M365, Teams, SharePoint, Fabric. Foundry agents can natively reach into all of it. An AWS-hosted OpenAI agent trying to access the same data requires integration work that adds months and risk to any deployment.

What AWS Is Really Telling Us

Let's call this what it is. This announcement is not a leap forward for AWS. It's an admission that Bedrock's agent infrastructure wasn't competitive, and that Amazon needed OpenAI to close the gap. The partnership is a fast-follow strategy, not an innovation play.

It also creates a strange dynamic. Amazon is betting its agentic future on a company that is simultaneously deepening its relationship with Microsoft. OpenAI's models already run on Azure. The "co-training on Trainium" angle is interesting from a compute economics perspective, but it doesn't solve the developer experience and enterprise governance problems that actually determine platform adoption in healthcare.

For regulated industries, "coming soon" has never been an acceptable answer.

The Decision Framework for Hospital Leadership

If you're already on Azure, you have access to production-grade agentic infrastructure today. Use it. Start building the clinical agent workflows that will define the next era of care delivery.

If you're evaluating platforms, understand that the gap between Microsoft and AWS on agentic infrastructure is widening, not narrowing. Every month that passes deepens the ecosystem advantage.

If you're on AWS, ask your account team a direct question. When will you have parity with what Azure AI Foundry is shipping today? And what does the migration path look like if the answer isn't satisfying?

If you believe, as I do, that AI agents are the future of clinical operations, then the platform you choose now is the platform your agents will run on for years. Agents are the new cloud lock-in. The cloud wars have shifted from compute and storage to agentic infrastructure, and the governance and compliance layer is what separates a real healthcare platform from a general-purpose cloud with a few AI services bolted on.

Microsoft understood that early. AWS just admitted it.

Paul J. Swider is CEO & Chief AI Officer at RealActivity, a Microsoft Partner specializing in mission-critical AI for healthcare systems. He has 30+ years in healthcare technology, has trained over 3,000 engineers across GE, IDX, and Microsoft, and is the founder of BOSHUG, the Boston Healthcare Cloud & AI Community spanning 50+ countries.




What I DON'T want to happen to YOU, and HDRP is dead!


Hello and Welcome, I’m your Code Monkey!

March is here! Hopefully that means spring with more sun and less cold.

The February Next Fest is about to end. I did a quick browse of the Top Games and it's insane how much creativity and sheer quality these games have. I am particularly looking forward to Tabletop Tavern, which is being made by a developer who started working on the game by following my DOTS RTS course and then massively expanded upon it. I love it when students do that! I hope the game finds tons of success; so far it's looking likely! (60k wishlists)

  • Game Dev: 5 Year Flop; HDRP Dead

  • Gaming: Phil Spencer out

  • Fun: Skip the Tip!



Game Dev

What I DON'T want to happen to YOU!

I often give the advice of "make small games" for many reasons. One is that small games are excellent for learning quickly; you will learn a lot more by making 10 games than just one massive game. They are also great for finding success, since the game idea itself is one of the biggest factors in a game's success: if you try 10 ideas you have better odds of hitting a home run than with just one shot.

But most of all, I really don't want you to spend 5 years working on a game with hopes of finding success, only to then launch it and have that game sell a dozen copies, just like what happened with this developer.

The post is a very detailed breakdown of the game Sacred Earth - Reverie which launched in late November after 5 years in development, and currently sits at just 11 reviews, and about $1000 in revenue.

There is some very accurate self-reflection in the post about what caused this game to fail: the genre mix (JRPG + Visual Novel), which makes it insanely niche, and the lack of a unique, compelling hook.

The dev mentions how they did everything in their power to market the game but nothing worked, and how it was a "slow crawl to build mild interest". That should have been a big warning sign not to spend such a long time on a game idea that was clearly not appealing to players.

The good news is that the developer only seems mildly disappointed the game did not sell; it does not seem like they're going to go bankrupt and homeless. So like I always say, you should make small games and pick a good marketable idea IF your goal is financial success, but if you're making things just for fun then there are no rules: make whatever you want to make! But don't lie to yourself saying you're making something just for fun while secretly hoping for a million copies sold.

However, in the "what's next" section of the post the developer does not mention the most important lesson: Don't spend 5 years working on your next game.

I constantly give that advice because I really don't want this to happen to you. Even if you're making games just for fun, it is sad and frustrating to work so long on something you're really passionate about and in the end realize no one else cares. If you're really going to go down that road then at least limit your scope; I imagine this release wouldn't hurt as much if it were a 6-month project.


Affiliate

FREE VFX, Unity Tools 97% OFF!

Get some awesome Game Audio and Music at 99% OFF!

This HumbleBundle has it all, Fantasy, Sci-Fi, Casual, Horror and much more. If you don’t have any audio pack then this massive one at a super deep discount is a great pick up.

Get it HERE for 99% OFF!

The Publisher of the Week this time is D.F.Y. STUDIO, publisher with a lot of awesome UI packs for all sorts of games and themes.

Get the FREE Cyberpunk RPG GUI Pack which like the name implies is perfect for any dystopian sci-fi cyberpunk game.

Get it HERE and use coupon DFYSTUDIO at checkout to get it for FREE!


Game Dev

Unity HDRP is dead! (kinda)

Unity has just written a blog post talking about the future of Unity Render Pipelines in 2026, and in it they detail the next steps on their plan to merge the render pipelines (which they announced back at Unite) and those steps involve continuing to improve URP while only maintaining HDRP.

This does NOT mean HDRP is deprecated. If you're using it in your projects you can keep using it, and you can even start new projects with it; it is still officially supported. It is just not getting any new features in the future (other than Switch 2 support).

Whereas URP is continuing to improve, and now hopefully at a faster rate. At Unite they showcased a really awesome-looking Real-Time Global Illumination tool coming to URP in the future; you can see it in action in this video at 55:27. It will also be getting Physical Light Units support, Physical Sky, Auto Exposure, Screen Space Reflections and more.

Another piece of news is that the Built-in Render Pipeline is finally being deprecated in 6.5, with a removal date sometime after that. However, if you really like the BiRP and use it, you can technically keep using it until 2028 with 6.7 LTS.

Many people have considered the splitting of the render pipelines to have been a mistake since it doubles the amount of work for so many people, so hopefully this direction is something people are excited about.

I hope this will end up being a positive thing for Unity, at least they will have all their graphical manpower working on just one render pipeline instead of being split across multiple projects. Hopefully that means faster feature development to bring URP to feature parity with HDRP very quickly.



Gaming

Big shake up of Xbox execs!

Phil Spencer has been in charge of Xbox since 2014, and now that time has come to an end as both he and Sarah Bond (ex-president of Xbox) have left the company.

The Xbox brand had certainly recovered after the failure of the Xbox One (TV, TV, TV!): they bought a ton of awesome studios and published some great games. In more recent years, however, it has slipped back into a downtrend, with many game cancellations and massive price hikes to both Game Pass and the consoles themselves.

Replacing Spencer is Asha Sharma, who was president of Microsoft's CoreAI division and previously worked at Meta. Right away this concerned many people who fear or dislike AI, so she made a point of saying "First, great games. Everything begins here. We must have great games beloved by players before we do anything" and speaking against soulless AI slop, while also continuing the plan to make Xbox "expand across PC, mobile, and cloud."

Summer Game Fest (the successor to E3) is happening in a few months; I guess that's the first place where we will see what this new direction for Xbox looks like.

I was pretty surprised to see this news. Phil Spencer has been synonymous with Xbox for over a decade at this point, and he was one of the few recognizable execs that seemed focused on gaming as a whole rather than just a business. But of course the business needs to make money, which is why Game Pass had those drastic price hikes that really soured a lot of people. I wonder if that strategy will continue, or if they will flip somewhat and perhaps introduce a cheaper tier supported by ads. That would make it a lot more affordable for a lot more people.



Fun

Skip the tip!

Tipping culture in the US sounds insane to basically every other country. You buy something that costs $10 but at the end you're expected to pay $12. It seems crazy how it's legal for a business to pay below minimum wage to their employees, and have the customer pay the remaining 20%.

And if you don't want to tip, usually you have to get past a bunch of dark patterns to be able to skip it.

So here is a fun mini-game all about those dark patterns! You have moving buttons, pre-selected options, some require you to accept giant terms and conditions, others you have to do math, or just straight up psychological manipulation.

Try to see how many of those dark patterns you recognize (which are also widely used in many other places like canceling subscriptions or mobile games), I survived 26 rounds!

I am actually off to the US next week for GDC, and I will have to constantly remind myself that whatever price is on the menu isn't really the final price. I really wish they would pay people a living wage and just raise prices 20%; it's so annoying to think you're paying X amount only to find out at the end you're expected to pay X+20%.




Get Rewards by Sending the Game Dev Report to a friend!

(please don’t try to cheat the system with temp emails, it won’t work, just makes it annoying for me to validate)

Thanks for reading!

Code Monkey


Neural Networks for Developers - XOR (Part 1)


The full source code for the neural network and the interactive demonstrations can be found on GitHub.

Increasing numbers of us are turning to solutions based on neural networks to help us with a wide variety of tasks. From accelerating software development through to getting advice on personal issues, they are becoming almost the go-to tool for many people.

But how do they actually work?

I figured it would be fun, and perhaps helpful to others, to explore this through practical, interactive, explained examples of increasing complexity over a fairly short (5-part) series. And I promise that by part 5 we’ll have something pretty cool!

The first thing we’re going to see is that they are, in fact, fundamentally really simple and have absolutely nothing to do with actual neurons at all. The cynic in me can’t help but think that the AI folk love to dress up what they do in humanistic and biological language to convince people they are doing more profound work than they are. Also: it sells.

As a simple example we’re going to look at a neural network that can XOR two numbers together, but before we get to the interactive example a little bit of background is useful.

XOR

XOR is a simple bitwise operation that, if you’re a crusty old developer like me, you might remember as being a handy and performant way of drawing and removing sprites on 8-bit machines so that they didn’t erase the background or require a load of CPU time.

Given two input bits the output of the XOR operation is shown in the table below:

Input A  Input B  Output
0        0        0
0        1        1
1        0        1
1        1        0

In this example we’re going to set up and train a neural network to handle an XOR calculation. Gross overkill, sure, and I don’t think this would be an effective way of drawing sprites on an 8-bit machine, but it makes for a great first example.
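For reference, before we reach for a neural network at all, the operation itself is a single bitwise operator in TypeScript:

```typescript
// XOR as a plain bitwise operator, matching the truth table above
const xor = (a: number, b: number): number => a ^ b;

console.log(xor(0, 0), xor(0, 1), xor(1, 0), xor(1, 1)); // 0 1 1 0
```

The fun here is in making a network learn this behaviour rather than writing it directly.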

Neurons

Firstly, neurons. Neurons are basically nodes in the network that have one or more weighted inputs and themselves have a bias. The inputs, the weights and the bias are all numbers. The neuron works by multiplying each input by its weight, adding the bias and then squashing all this into a number in the range of 0 to 1. Without that squashing step the neuron is just doing basic arithmetic. When we look at the network we’ll see that we stack layers of neurons, and if we just stack layers of basic arithmetic we achieve nothing more than a single layer could. The squash is what gives the network its power.

For our simple XOR example we’re going to use a squashing function called sigmoid. These squashing functions are called activation functions and that’s how we’ll refer to them from here on. And so for a neuron with two inputs all it does is run this formula:

output = sigmoid((input1 * weight1) + (input2 * weight2) + bias)

While this isn’t going to become a maths fest, going forwards we’ll use mathematical notation, which would express the above like this:

$$output = \sigma\left((input_1 \times weight_1) + (input_2 \times weight_2) + bias\right)$$

At this point you might be asking: what’s this sigmoid function? It’s this:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

Don’t worry too much about the formula — all it does is take any number, no matter how large or small, and map it to a value between 0 and 1. Large positive inputs give values close to 1, large negative inputs give values close to 0, and zero maps to exactly 0.5. It has a characteristic shape that you can see below:

In code the entire neuron is surprisingly compact:

// Sigmoid: squash any number into 0..1
function sigmoid(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

// A neuron: multiply each input by its weight, add bias, squash
function neuron(inputs: number[], weights: number[], bias: number): number {
  let sum = 0;
  for (let i = 0; i < inputs.length; i++) {
    sum += inputs[i] * weights[i];
  }
  sum += bias;
  return sigmoid(sum);
}

That’s it. Every neuron in every neural network, no matter how large, does this same operation.
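As a quick sanity check of that behaviour, here is the neuron run with a few hand-picked weights and biases (the values are arbitrary illustrations, not trained values):

```typescript
// Sigmoid: squash any number into 0..1
function sigmoid(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

// A neuron: multiply each input by its weight, add bias, squash
function neuron(inputs: number[], weights: number[], bias: number): number {
  let sum = 0;
  for (let i = 0; i < inputs.length; i++) {
    sum += inputs[i] * weights[i];
  }
  sum += bias;
  return sigmoid(sum);
}

// A large positive weighted sum is squashed close to 1...
const high = neuron([1, 1], [10, 10], -5); // sigmoid(15), very close to 1
// ...a negative one close to 0...
const low = neuron([0, 0], [10, 10], -5); // sigmoid(-5), very close to 0
// ...and a weighted sum of exactly zero lands on 0.5
const mid = neuron([1, 1], [1, -1], 0); // sigmoid(0) = 0.5
```

Whatever the inputs, the output always lands somewhere in that 0 to 1 range.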

The Network

Ok. So that’s the neural part - so what about the network part?

Essentially neurons are arranged in layers: an input layer, one or more hidden layers, and an output layer, with every neuron in one layer connected to every neuron in the next layer. Hidden layers sound very mystical but there’s nothing much hidden about them unless you are treating the network as a black box. They are simply the layers of neurons between the inputs and the output.

Our input neurons are really just numbers - they have no biases - and for our XOR example we have two input neurons, one for each XOR input. And we only need a single output neuron that, when trained, should give us 0 or 1 as our answer. And we’re going to have a single hidden layer of 4 neurons.

This gives us our network topology - 2 input neurons, 4 hidden layer neurons each connected to both inputs, and a single output neuron connected to the 4 hidden layer neurons. When the network runs we run the calculation from earlier for each neuron in the layer and then, when all neurons in the layer have calculated their output, we move on to the next layer.

In code we represent this as arrays of neurons organised into layers. Each neuron stores its weights, bias, and its most recent output. The network is created with random weights — we’ll see why that matters shortly:

interface Neuron {
  weights: number[];
  bias: number;
  output: number;
  net: number; // the weighted sum before sigmoid — we need this for backprop later
  delta: number; // the blame assigned during backprop (set in the backward pass)
}

type Layer = Neuron[];

interface Network {
  layers: Layer[];
  inputCount: number;
}

function createNeuron(inputCount: number): Neuron {
  return {
    weights: Array.from({ length: inputCount }, () => Math.random() * 2 - 1),
    bias: Math.random() * 2 - 1,
    output: 0,
    net: 0,
    delta: 0,
  };
}

function createNetwork(topology: number[]): Network {
  const layers: Layer[] = [];
  for (let i = 1; i < topology.length; i++) {
    const inputCount = topology[i - 1];
    const layer = Array.from({ length: topology[i] }, () =>
      createNeuron(inputCount)
    );
    layers.push(layer);
  }
  return { layers, inputCount: topology[0] };
}

// Create our XOR network: 2 inputs, 4 hidden, 1 output — 17 parameters total
const network = createNetwork([2, 4, 1]);

The forward pass — running inputs through the network — is just applying the neuron formula to each layer in sequence:

function forward(network: Network, inputs: number[]): number[] {
  let current = inputs;
  for (const layer of network.layers) {
    const next: number[] = [];
    for (const neuron of layer) {
      let sum = neuron.bias;
      for (let i = 0; i < neuron.weights.length; i++) {
        sum += current[i] * neuron.weights[i];
      }
      neuron.net = sum;
      neuron.output = sigmoid(sum);
      next.push(neuron.output);
    }
    current = next;
  }
  return current;
}

If you walk through the example above you will see each neuron running the formula we looked at earlier, and the output neuron giving us the answer 0.53. Which is probably the worst case we could have hoped for: the output should be 0 or 1 and we’re square in the middle. This is because the weights are all random at the moment, so we are literally just pushing numbers through random multipliers.

Before the network will be able to give us credible results we need to train it and it needs to learn. That’s what we’ll cover next.

Back propagation - the learning

The network learns by pushing the error back through the network and adjusting the weights and biases. Conceptually it does this by distributing blame proportionally - the neurons that had the biggest impact on the output have their weights and biases adjusted the most. In our examples this means the weights will be adjusted the most on the connections represented by the thickest lines.

The network also has a learning rate: a multiplier applied to the apportioned blame that scales how big an adjustment we make to the weights and biases. Too big and our corrections will overshoot; too small and the network will take longer to converge on accurate answers.

Working this through the network is known as back propagation, and the idea is that by tweaking these numbers over many runs (each run is known as an epoch) the error delta, the loss, should converge towards 0.
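To make the update rule concrete before the full implementation, here is a minimal sketch of a single gradient step for a lone one-input neuron chasing a target of 1 (the starting weight, bias and learning rate are arbitrary illustration values):

```typescript
function sigmoid(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

// One gradient-descent step for a single neuron with a single input.
// delta is the error scaled by the sigmoid's slope at the operating point;
// the learning rate then scales how far we nudge the weight and bias.
function step(
  weight: number,
  bias: number,
  input: number,
  target: number,
  learningRate: number
): { weight: number; bias: number } {
  const net = input * weight + bias;
  const out = sigmoid(net);
  const delta = (out - target) * out * (1 - out); // error * sigmoid'(net)
  return {
    weight: weight - learningRate * delta * input,
    bias: bias - learningRate * delta,
  };
}

// Starting from weight 0 and bias 0 the output is sigmoid(0) = 0.5.
// One step with learning rate 0.5 moves the output towards the target of 1.
const before = sigmoid(0);
const updated = step(0, 0, 1, 1, 0.5);
const after = sigmoid(1 * updated.weight + updated.bias);
```

Run repeatedly, these little nudges are exactly what drives the loss towards 0; the full backward pass below does the same thing across every neuron in every layer.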

The code for backpropagation is the most involved part but the structure mirrors the forward pass — just working backwards. We need one extra piece: the sigmoid derivative, which tells us how steep the S-curve is at a given neuron’s operating point. Where the curve is steep the neuron is sensitive to changes and absorbs more blame. Where it’s flat — near 0 or 1 — the neuron is saturated and barely learns:

function sigmoidDerivative(net: number): number {
  const s = sigmoid(net);
  return s * (1 - s);
}

function backward(
  network: Network,
  inputs: number[],
  targets: number[],
  learningRate: number
): number {
  const { layers } = network;

  // Step 1: How wrong is the output, and how sensitive is it to changes?
  const outputLayer = layers[layers.length - 1];
  for (let i = 0; i < outputLayer.length; i++) {
    const neuron = outputLayer[i];
    const error = neuron.output - targets[i];
    neuron.delta = error * sigmoidDerivative(neuron.net);
  }

  // Step 2: Propagate blame backward through hidden layers
  for (let l = layers.length - 2; l >= 0; l--) {
    const layer = layers[l];
    const nextLayer = layers[l + 1];
    for (let i = 0; i < layer.length; i++) {
      let downstreamBlame = 0;
      for (const nextNeuron of nextLayer) {
        downstreamBlame += nextNeuron.delta * nextNeuron.weights[i];
      }
      layer[i].delta = downstreamBlame * sigmoidDerivative(layer[i].net);
    }
  }

  // Step 3: Nudge every weight and bias
  for (let l = 0; l < layers.length; l++) {
    const layerInputs =
      l === 0 ? inputs : layers[l - 1].map((n) => n.output);
    for (const neuron of layers[l]) {
      for (let w = 0; w < neuron.weights.length; w++) {
        neuron.weights[w] -= learningRate * neuron.delta * layerInputs[w];
      }
      neuron.bias -= learningRate * neuron.delta;
    }
  }

  // Return the loss so we can track progress
  let loss = 0;
  for (let i = 0; i < outputLayer.length; i++) {
    const diff = outputLayer[i].output - targets[i];
    loss += diff * diff;
  }
  return loss / outputLayer.length;
}

In the simulation below you can see this being applied from where we left off above, and if you’re interested in the mathematics that’s included too.

The simulation

At this point we have everything we need to train and then use our XOR neural network. Training is just running the forward pass and backward pass on every XOR input, thousands of times:

const xorData = [
  { inputs: [0, 0], targets: [0] },
  { inputs: [0, 1], targets: [1] },
  { inputs: [1, 0], targets: [1] },
  { inputs: [1, 1], targets: [0] },
];

for (let epoch = 0; epoch < 20000; epoch++) {
  for (const sample of xorData) {
    forward(network, sample.inputs);
    backward(network, sample.inputs, sample.targets, 0.5);
  }
}

That’s the entire training loop. Each epoch feeds all four XOR cases through, adjusting weights after each one. If you run the simulation below you’ll see the model train itself over 20,000 epochs with a learning rate of 0.5.

What you’ll probably immediately notice is that at the end of this process the neural network does not give us perfect answers. What we’re seeing is something like this:

Input A  Input B  Target  Output
0        0        0       0.0094
0        1        1       0.9890
1        0        1       0.9872
1        1        0       0.0121

If you’re used to thinking in more classical modes of computation then I think this, particularly, is a key takeaway about neural networks: they provide approximations. Or perhaps what might be best called probabilistic answers.
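When you need a crisp bit rather than a probability, the usual trick is simply to threshold the output at 0.5. Using the sample outputs from the table above:

```typescript
// Round a network output to a crisp 0/1 answer
function toBit(output: number): number {
  return output >= 0.5 ? 1 : 0;
}

// The trained outputs from the table above
const trainedOutputs = [0.0094, 0.989, 0.9872, 0.0121];
const bits = trainedOutputs.map(toBit); // [0, 1, 1, 0], the XOR truth table
```

The approximation only has to be good enough for the threshold to fall on the right side.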

If you zoom in on the interesting part of the loss curve you might notice it looks like an inverted sigmoid — but it’s coincidental rather than causal. The loss curve isn’t a sigmoid, it just has a similar shape. This pattern of slow start, rapid progress, then diminishing returns shows up across all kinds of optimisation problems, not just neural networks.

It’s interesting to play with the learning rate and the number of epochs — you can get the network to converge on a more accurate result but it will never land on exact values. These are fundamentally approximation machines.

And you’re probably starting to see why these systems can be so expensive to train: as the number of neurons multiplies the number of calculations required grows quickly and you need vast numbers of epochs to converge over really large training sets. You can probably also see why GPUs, and similar architectures, are so good at this. It’s basically multiplication at a massive scale.

In the next part we’re going to build on these basics and get a neural network to do something a bit more complicated - but the concepts will be exactly the same.


When to Use Factory Method Pattern in C#: Decision Guide with Examples


When to use Factory Method pattern in C#: decision criteria, code examples, and scenarios to determine if Factory Method is the right choice.
