Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

To test, or not to Test? Part 1 – Why Test? And at what cost?

1 Share

I wrote code without tests that ran in production without defects, and I wrote buggy code with TDD (Test Driven Development). Time to look back at 35 years of coding to see when tests help, when there is something better, and especially what those better things are.

Why test?

For me, there are three reasons to test the software I (or we as a team) write manually or automatically:

  1. Check expectations – “it works”
  2. Prevent regressions – “it keeps working (after changes)”
  3. Drive the implementation – “I know what I have and need, but not yet how to achieve this.”

I like (manual) exploratory testing to check my expectations and to find places where the code does not work correctly. On the other hand, I dislike manually executed test cases to prevent regressions. They are cumbersome and way too expensive to run over and over again.

Automated testing approaches such as unit testing and approval testing are effective for preventing regressions.

TDD is great for driving the implementation of complicated business logic or algorithms.

Property-based testing can help prevent regressions and identify missing pieces in algorithms.

There are also kinds of tests I dislike:

  • E2E (end-to-end) tests: They typically need a complicated setup to run, which makes them brittle.
  • BDD / Spec-driven: Too much focus on the specification, often leading to rigid systems (but this would be a whole series of blog posts on its own😅) – I, however, like the involvement of domain experts and developers. But I can do that with TDD as well.

Of course, there are many more kinds of tests. Choose what matches your situation.

Reasons for defects?

To choose the right kinds and number of tests, we need to understand where defects in our software come from. The causes vary by team and tech stack. I try to give an overview here.

Nulls: When your programming language allows nulls, then you have to deal with them. Dealing with nulls in every situation is simply difficult and, therefore, error-prone.

Shared mutable state: We quickly get overwhelmed with the hidden coupling that shared mutable state introduces into our codebase – especially in multi-threading scenarios. This quickly leads to wrong assumptions and, therefore, to defects.

Integration problems (wrong assumptions): A component or library we integrate into our own codebase behaves differently than we expected.

Misunderstanding (not solving the problem): We didn’t fully understand the problem to be solved, and we delivered an incomplete or incorrect solution.

Misbehaving infrastructure: Our software is misbehaving because of a defect in the infrastructure we rely on.

Programming traps: Programming is tricky, and we sometimes get it wrong. For example, never divide first, then multiply (x/100*7); always multiply first, then divide (x*7/100). Otherwise, precision can get you. Or when summing floating point numbers, always sort them first (ascending).
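Both traps are easy to demonstrate. Here is a small Python sketch (not from the article); it uses integer division (`//`) to mirror languages like C# or Java, where `/` on integers truncates:

```python
# Trap 1: divide-then-multiply loses precision with integer math.
# Computing 7% of 50 (e.g. cents):
x = 50
assert x // 100 * 7 == 0   # divide first: truncated to 0 too early
assert x * 7 // 100 == 3   # multiply first: significant digits survive

# Trap 2: summing floats in an unfortunate order swallows small values.
values = [1e16] + [1.0] * 100
naive = sum(values)               # each 1.0 is lost against the huge 1e16
sorted_sum = sum(sorted(values))  # small values accumulate first
assert naive == 1e16              # the hundred 1.0s vanished entirely
assert sorted_sum == 1e16 + 100.0
```

The float example works because the spacing between representable doubles near 1e16 is 2.0, so adding 1.0 to 1e16 changes nothing; summing the small values first keeps them from being rounded away.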

Wrong equality checks: In runtimes like .NET, everything provides an Equals method, but these methods do not always behave as we assume. Comparing by reference instead of by deep equality of the fields/properties quickly leads to defects. And there are more equality checks happening than we are typically aware of, for example, when using HashSets, Dictionaries, etc.
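The article names .NET, but the same pitfall exists in most languages. An illustrative Python sketch (the class names are made up for the example):

```python
from dataclasses import dataclass

class PointRef:
    def __init__(self, x, y):
        self.x, self.y = x, y
    # No __eq__/__hash__: Python falls back to identity (reference) comparison.

@dataclass(frozen=True)
class PointVal:
    x: int
    y: int
    # @dataclass generates a field-wise __eq__ (and, being frozen, a __hash__).

assert PointRef(1, 2) != PointRef(1, 2)  # reference equality: surprise!
assert PointVal(1, 2) == PointVal(1, 2)  # value equality: as expected

# Hash-based containers silently rely on these semantics too:
assert len({PointRef(1, 2), PointRef(1, 2)}) == 2  # "duplicates" kept
assert len({PointVal(1, 2), PointVal(1, 2)}) == 1  # duplicates collapsed
```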

Ignored expression results: Most programming languages allow ignoring the result of an expression. For example, you can call a method returning a value, and just ignore it. But sometimes we should react to the returned value to prevent defects. For example, when we ignore the result of an expression that tells us whether things went well or badly. Ignoring the “bad” return value likely leads to a defect.
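A minimal Python illustration of the difference; the failing `ls` call is just a stand-in for any operation whose return value signals success or failure:

```python
import subprocess

# Ignoring the result hides the failure: nothing forces us to look at it.
result = subprocess.run(["ls", "/nonexistent"], capture_output=True)
# result.returncode is non-zero here, but the program happily continues.

# Reacting to the result (or asking the library to check it) surfaces the defect:
try:
    subprocess.run(["ls", "/nonexistent"], capture_output=True, check=True)
except subprocess.CalledProcessError as exc:
    print(f"command failed with exit code {exc.returncode}")
```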

Implicit casts: Implicit casts can introduce hidden, unsafe type conversions that can silently corrupt data, break type safety, and cause unpredictable runtime errors.

There are obviously more sources of defects than the ones above.

Level of correctness

We also need to be aware of the level of correctness that we want to achieve, given our context.

On the continuum from “it might run” to having a “proof of correctness”, we typically want to be at the “we are confident” spot – in the context of business applications. Surely, not at the “we have 100% test coverage” spot. Proving correctness typically requires a lot of effort and is not worth the quality gain.

In our team, we don’t need perfect software. In many cases, it is sufficient to respond quickly, and the impact on our users and our business is minimal. The greater the impact, or the longer it takes to detect and fix a defect, the higher our quality standard needs to be.

So, our quality standard for core business logic code is much higher than for UIs that add simple data. In a later part of this series, I’ll show how slicing reduces the impact radius significantly.

Test Effort

Having (automated) tests brings a lot of effort with it – even when we write good, concise, and refactoring-friendly tests:

  • writing the tests (with good error messages)
  • running the tests (during development and as part of the build and release pipeline)
  • maintaining the test code (due to design changes)
  • changing the tests (due to changing business needs)
  • mis-hits (due to flaky or overly-specific tests)

Obviously, tests have their benefits. But we shouldn’t overlook that they also require effort. So, if we could replace them with something cheaper or faster (development and runtime), that would be beneficial.

The hard things to test

Many things are easy to test: throw some input at a method or function and check the return value, and maybe some side-effects (preferably state-based or, less preferably, interaction-based).

However, there are things that are inherently hard to test:

  • Combinatorics: e.g. calculations or queries with lots of variations (the result depends on subtle dependencies of many input values)
  • Multi-threading
  • Boundaries of the system (including UI)
  • Things with many dependencies (and they can’t be reduced meaningfully)
  • Stateful code

For these scenarios, finding a way to ensure quality without relying on tests would be especially beneficial.

Next time

In this post, I’ve written about why we write tests, reasons for defects, and why it would be nice to have something less expensive to replace some tests.

In later parts, you’ll see several ideas about how to reduce the number of tests needed, thus reducing development effort.

Read the whole story
alvinashcraft
19 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

Agent Skills: Explore security threats and controls


Anthropic announced the release of the Agent Skills functionality on October 16, 2025. This functionality was initially implemented in Claude, but it is now available in many other agents, including Goose. Agent Skills is based on the concept of skills: a capability that teaches an agent or client how to perform tasks tailored to the way users work. Skills are based on folders and files, providing functionality similar to MCP but with a different approach. This article explores how to manage the security threats and access controls associated with adopting the new Agent Skills functionality.

How Agent Skills works

The following is an example of a skill extracted from the Agent Skills documentation. Skills are based on folders and files. Each skill has its own folder containing a SKILL.md file. The following code is the content of the SKILL.md file.

---
name: pdf-processing
description: Extract text and tables from PDF files, fill forms, merge documents.
---
# PDF Processing
## When to use this skill
Use this skill when the user needs to work with PDF files...
## How to extract text
1. Use pdfplumber for text extraction...
## How to fill forms
...

Here, we are defining a skill to extract text and tables from PDF files, fill forms, and merge documents. The body of the skill describes how to execute the procedural knowledge task, among other information.

This is the basic directory structure:

skill-name/
└── SKILL.md          # Required

Source: https://agentskills.io/specification#directory-structure

The basic structure of a SKILL.md file consists of an initial section called the frontmatter. The frontmatter is written in YAML, followed by a body written in Markdown. For the full specification, visit the Agent Skills home page.

When an agent works with Agent Skills, the agent loads the metadata within the frontmatter of all available skills. When the agent receives a request, it uses the metadata and available LLMs to decide which skill to use next. Once decided, the agent loads the body of the skill, which may be the whole description of the task or it may refer to other Markdown files in the skill folder. In this case, the agent will load them intelligently as needed. This means that the body of any skill can be split into different files to reduce the amount of information put in the context each time. It is possible to put all the information in one skill file, but it is recommended to put each task in a specific skill file, optimizing the context. In any case, the primary body is in SKILL.md and should point to the other Markdown files that we want to use in the skill.
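As a rough illustration of this two-phase loading, here is a simplified Python sketch. It is not the actual agent implementation, and a real agent would use a proper YAML parser rather than the naive key/value split below:

```python
import re
from pathlib import Path

# Matches the YAML frontmatter block at the top of a SKILL.md file.
FRONTMATTER = re.compile(r"^---\s*\n(.*?)\n---\s*\n", re.DOTALL)

def load_skill_metadata(skills_root: str) -> dict:
    """Phase 1: read only the frontmatter of every SKILL.md (cheap on context)."""
    skills = {}
    for skill_file in Path(skills_root).glob("*/SKILL.md"):
        match = FRONTMATTER.match(skill_file.read_text(encoding="utf-8"))
        if not match:
            continue  # malformed skill: no frontmatter
        meta = {}
        for line in match.group(1).splitlines():
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
        skills[meta.get("name", skill_file.parent.name)] = {
            "description": meta.get("description", ""),
            "path": skill_file,
        }
    return skills

def load_skill_body(skill: dict) -> str:
    """Phase 2: load the full Markdown body only once the skill is chosen."""
    text = skill["path"].read_text(encoding="utf-8")
    return FRONTMATTER.sub("", text, count=1)
```

An agent would hand the collected name/description pairs to the LLM to pick a skill, then call the second function (and follow any references to other Markdown files) on demand.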

The SKILL.md file can also refer to scripts, such as Python, Bash, and JavaScript, present in the skill folder for the skill to execute them when needed. These scripts may also have dependencies. As you can imagine, executing scripts involves some security risks.

The Agent Skills specification defines additional optional directories:

  • scripts/ For executable code that skills may run when needed.
  • references/ For additional documentation that skills may use.
  • assets/ For static resources such as images, templates, or data files.

Improve security of the skill files

Skills are based on folders and files. If the permissions for these folders and files are not set correctly to ensure only authorized users can modify them, malicious actors who already have direct or indirect access to the filesystem could exploit this. The risk is not that high, because it is not trivial for a malicious actor to gain access to the filesystem in the first place, but we should take it into account, especially when implementing security by design and by default and defense in depth. If the permissions are not set correctly and malicious actors get this opportunity, they could modify skill files to introduce unauthorized instructions, add malicious scripts that run with the agent's permissions (often the same permissions as the user), or alter existing scripts to include malicious code.

The permissions on the skills folders and files should be restricted as much as possible by default. If the skills are stored in another system, for example, a skills registry, the permissions in the registry should also be restricted as much as possible by default. It is recommended that any access or modification to the skill files is logged. The logs generated should be protected to avoid unauthorized modification.
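On POSIX systems, such a hardening pass could look like the following Python sketch. The helper name and the chosen modes (owner-only 0o700/0o600) are illustrative assumptions, not part of the Agent Skills specification:

```python
import stat
from pathlib import Path

def harden_skill_tree(skills_root: str) -> list:
    """Restrict skill folders and files to the owner, and report any path
    that was group- or world-writable before hardening (for the audit log)."""
    findings = []
    for path in [Path(skills_root), *Path(skills_root).rglob("*")]:
        mode = path.stat().st_mode
        if mode & (stat.S_IWGRP | stat.S_IWOTH):
            findings.append(str(path))
        # Owner-only: rwx for directories (traversal), rw for files.
        path.chmod(0o700 if path.is_dir() else 0o600)
    return findings
```

The returned findings would feed the recommended logging; the chmod calls implement the restrict-by-default posture.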

Malicious skills

Skills may contain executable scripts in different languages, such as Python or Bash. This gives skills a lot of power, but it also involves security risks: these scripts may contain malware. If the sources can’t be trusted, review the skills’ source code. The more important the tasks, the more thorough the review should be. Depending on your risk appetite, rather than doing a full code review, a way to reduce the risk of Agent Skills containing malware is to run malware scans on them, for example, with tools such as malcontent.

Another way of improving the security of your supply chain in relation to Agent Skills is to require skills to be signed and to validate their signature before use. There is no widely known initiative to sign Agent Skills yet, but this is something that users and customers should require if they consider it a relevant security control.

Note that although skills are initially safe, if an automatic mechanism to upgrade them exists, an upgrade can include malicious code or vulnerabilities, especially if they come from untrusted sources. In any case, depending on your risk appetite, reviewing the code of any new version of a skill that you plan to use is recommended.

Security vulnerabilities

Skills that contain scripts may have their own security vulnerabilities. Therefore, all security controls from secure development best practices apply here, including code reviews, SAST, DAST, and fuzzing.

Companies providing skills must also implement vulnerability management processes to identify and resolve security issues at regular intervals, in accordance with their SLAs.

Agents could also contain security vulnerabilities themselves. Since the SKILL.md file starts with a YAML section, it is possible that the YAML parser of an agent contains a vulnerability, and a malformed, malicious YAML section in a skill file could exploit it to execute commands on the system or leak information.

Another way to reduce the risk of a vulnerability being exploited in skill scripts is to execute them in isolated environments such as containers or sandboxed environments. Examples of technologies that can be used are seccomp, AppArmor, or Firecracker VMs. Egress communication from these isolated environments to the Internet should also be restricted.

Prompt injection

Part of Agent Skills data flow consists of the agent obtaining information from a source, for example, a document or a webpage, and using that information to compose a prompt to be sent to an LLM to decide the next action or to compose the final output. Since part of the analyzed document is injected in the prompt sent to the LLM, there is a risk of prompt injection. This security issue occurs when input intended as data is interpreted as an instruction by the LLM instead. Agentic systems remain vulnerable to this issue because there is no industry-standard fix. While SQL injection can be mitigated through prepared statements, no similar control currently exists to reliably separate data from instructions in LLM prompts.

Although there are no definitive solutions to eliminate the risk of prompt injection, there are controls that can be applied to reduce the probability and impact of it.

Guardrails are a common security control that is gaining traction in AI systems, especially agentic AI systems that use Skills or other agentic protocols like MCP or A2A. Guardrails are systems that monitor the input and output of an agent or LLM to distinguish between benign and malicious content. If content is benign, the guardrails system lets it pass to the next system. If not, it can perform actions such as modifying the payload, blocking, logging, or throttling the content. Detection typically relies on patterns, such as regexes, or on other specialized LLMs. Since guardrails depend on the ability to distinguish between benign and malicious intent – a classic, and often unsolvable, problem in security – they are not a definitive solution. They are, however, a sound control to reduce the risk of prompt injection. For example, TrustyAI is an open source project developed by Red Hat that includes guardrail capabilities.
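To make the idea concrete, here is a deliberately naive, pattern-based guardrail sketch in Python. Real guardrail systems such as TrustyAI are far more sophisticated; simple deny-list regexes like these are trivially bypassed and only illustrate the mechanism:

```python
import re

# Naive deny-list of instruction-like phrases often seen in injection attempts.
SUSPICIOUS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"ignore (all |any )?(previous|prior) instructions",
        r"disregard your (system|developer) prompt",
        r"reveal (your )?(system prompt|hidden instructions)",
    )
]

def guardrail_check(untrusted_text: str):
    """Return (allowed, matched_patterns) for text about to enter a prompt."""
    hits = [p.pattern for p in SUSPICIOUS if p.search(untrusted_text)]
    return (not hits, hits)

# A document containing injection-style text is flagged before prompt assembly:
allowed, hits = guardrail_check("Please ignore all previous instructions and ...")
assert not allowed
```

A production guardrail would sit on both the input and output paths and combine such patterns with classifier models, plus the blocking/logging/throttling actions described above.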

Another key security control is to limit the permissions that an agent has. At a maximum, an agent should only possess the permissions of the user executing it, never more. Ideally, agents should operate with a restricted subset of those permissions, dynamically derived from the specific task or intent. Dynamic authorization for AI agents remains a compelling area for further exploration. In addition to permission limiting, agents should be executed within isolated environments, such as containers or virtual machines, to provide a robust security boundary.

One more control is the experimental `allowed-tools` field defined by the Agent Skills specification. The specification states that, as it is experimental, it might not be supported by all agents yet; in any case, it is worth pushing for. This is a mechanism to limit the tools available to the agent, thus reducing the risk of a malicious prompt injection or an unintended behavior. `allowed-tools` doesn't reduce the probability of prompt injection, but it reduces the impact.

Many of the security controls discussed here not only reduce threats from malicious actors, but also reduce risks related to unintended behaviors of agentic systems due to the inherent probabilistic and non-deterministic nature of current LLMs.

Credentials management

While the Agent Skills specification does not prescribe a specific method for credential management, secure handling remains a critical security component. Since agents must interact with external systems to perform actions, they require a robust authentication framework. In scenarios where manual user intervention is not feasible, it is essential to implement standardized solutions like OAuth 2.0 to manage these permissions securely.

Under no circumstances should credentials be stored in plain text or embedded directly within the skills themselves. To mitigate the risk of accidental exposure, users should be educated on secure storage practices and utilize automated secret-scanning tools, such as Trufflehog, to detect hardcoded credentials before deployment.
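As an illustration of what such scanning does, here is a toy Python sketch. The two patterns are illustrative only; real scanners like Trufflehog ship hundreds of detectors plus entropy analysis and live credential verification:

```python
import re
from pathlib import Path

# Illustrative detectors only (an AWS-style access key ID and a generic
# quoted api key/secret/token assignment).
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic_api_key": re.compile(
        r"(?i)(api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{16,}['\"]"
    ),
}

def scan_skill_folder(folder: str) -> list:
    """Return (file, pattern_name) pairs for likely hardcoded credentials."""
    findings = []
    for path in Path(folder).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(text):
                findings.append((str(path), name))
    return findings
```

Running a scan like this over a skill folder before publishing or installing it catches the most careless cases of embedded credentials.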

Final thoughts

Agent Skills introduces a flexible and modular way to extend the functionality of intelligent agents through skill-based orchestration of tasks. This extensibility empowers organizations to build specialized and adaptable AI ecosystems, but it also expands the attack surface in familiar and novel ways. As shown, the risks span from modification of skills at the filesystem level and malicious or vulnerable scripts to prompt injection and credential exposure, demanding a comprehensive and proactive security posture.

Mitigating these risks requires combining traditional secure development practices, such as strict permissions, code reviews, and scanning, with AI-specific controls like guardrails, sandboxing, and controlled permissions. The introduction of constructs such as allowed-tools and signed skill registries marks an important step toward safer deployment, though these mechanisms remain in an early stage of maturity. Organizations adopting Agent Skills should therefore balance innovation with discipline, embedding continuous monitoring, validation, and threat modeling into their workflows.

Ultimately, the security of Agent Skills will depend not only on technical controls but also on the governance and culture surrounding their use. Collaboration between AI developers, security teams, and the open source community will be crucial to evolving standards that can keep pace with this rapidly advancing capability. As Agent Skills continue to mature, their secure adoption will shape the trust, reliability, and resilience of agentic systems using them.

Be sure to check out TrustyAI on GitHub, a default component of Open Data Hub and Red Hat Openshift AI.

The post Agent Skills: Explore security threats and controls appeared first on Red Hat Developer.


GitHub Copilot CLI Tips & Tricks — Part 2: Session management


In the first post we covered the different modes in Copilot CLI. This time we're looking at something that becomes essential once you're doing serious work in the CLI: session management. Sessions let you pause, resume, and organize your work — across terminal restarts, across machines, and across multiple concurrent workstreams.

What is a session?

Every time you launch Copilot CLI, you're working inside a session. A session captures your full conversation history, the tool calls Copilot made, the files it touched, and the permissions you granted — all stored locally under ~/.copilot/session-state/. Sessions are identified by a UUID and automatically receive an AI-generated name based on your first message, making them easy to identify later.

Most important is that sessions persist after you close the CLI. That means nothing is lost when you shut down your terminal — you can always pick up right where you left off.

Resuming a session

Pick up where you left off

The most important session management flag. When you launch with --resume, Copilot presents an interactive picker listing your saved sessions — searchable with / — so you can select the one you want to continue.

copilot --resume

All your conversation context, granted permissions, and history are restored exactly as you left them.

Jump straight back in

If you just want to resume your most recent session without going through the picker, use --continue:

copilot --continue

This is the fastest way back into an active workstream after a restart or a break.

Switch sessions without exiting

You don't have to exit the CLI to switch sessions. The /resume slash command lets you jump to a different session mid-conversation — it opens the same picker UI as --resume, without breaking your current terminal process.

/resume

This is particularly useful when you're juggling multiple features or bugs at the same time.

Naming and organizing sessions

By default, sessions are named based on your first message. But for longer-running workstreams, you'll want to give them meaningful names yourself.

Use the /session slash command to rename the current session:

/session rename "Upgrade to .NET 10"

Named sessions are much easier to find in the /resume picker, especially when you have several active workstreams going at once.


Monitoring your session

Check your token and request consumption

At any point during a session you can call /usage to see a summary of how much you've used:

  • Premium requests consumed in the current session
  • Session duration
  • Number of files modified

This is handy for keeping an eye on consumption during long autopilot runs.

Context window management

Copilot CLI automatically compresses your conversation history in the background as it approaches 95% of the token limit, without interrupting your work. If you want to trigger this manually — for example, after completing a big chunk of work and starting a new phase — you can use:

/compact

This keeps things lean for the next part of the task.


Use /session checkpoints #checkpointnumber to view the compaction summary.

Sharing and exporting sessions

Export to Markdown

The /share command exports the current session to a Markdown file, including the full conversation history, tool calls, and outputs. This is useful for documentation, handoffs, or just keeping a record of a complex debugging session.

/share

Copilot will write the session history to disk.

Publish to GitHub gist

For non-interactive use cases — like CI/CD pipelines or automated documentation — you can export a session directly to a GitHub Gist using the /share-gist slash command:

/share-gist

Copilot returns a URL to the created Gist, making it easy to share with teammates or link in a PR.

Wrapping up

Sessions are what make Copilot CLI feel like a persistent coding partner rather than a stateless tool. Once you get into the habit of naming sessions, using --continue to resume quickly, and switching between workstreams with /resume, it becomes a natural part of how you manage work in the terminal.

In the next post, we'll look at running tasks in parallel.

More information

Session State & Lifecycle Management | github/copilot-cli | DeepWiki

Session Management & History | github/copilot-cli | DeepWiki

Tracking GitHub Copilot's sessions - GitHub Docs


Your API Tests Are Passing. That’s the Problem.


When I started getting into API testing, it was an extension of my experience in unit testing. There’s a big difference between integrated tests and unit tests. But when you write tests, there’s still a process.

You think about the requirement. You break it down into the pieces of the code you’re testing. For example, if the button on the screen calls the API, and you test the API only, your test will pass the API what the screen would have sent.

Write a bunch of those guys, they go green. We ship.

And everything is fine, until somewhere in prod, something breaks that we swore we covered. Here’s the thing – the tests weren’t lying. They checked exactly what we asked them to check. It’s the things we didn’t ask to check that got us.

That’s the Green Mirage. And it’s not new. We’ve been living in it for years.

Even I can’t think of everything in advance. I can’t write all the tests that will capture real production behavior. But this temporary lack of foresight is compounded in modern apps, the ones built with, and run with, AI.


The Coyote

I love cartoons. Especially the Road Runner and the Coyote. Can watch them for hours. In a repeating gag, the Coyote chases the Road Runner off the cliff, looks down, and only then starts falling. He would have kept running, in his own little mirage, if he didn’t look down at what’s really happening.

We treat our test reports the same way.

Coverage looks fine. The team is moving fast – faster than ever. AI is generating code, scaffolding tests, filling in the boilerplate. Velocity is up.

And confidence is up too, and it gives us the push to keep going fast. Until something we didn’t anticipate blows up. And then a couple more.

The test reports show us the mirage. In reality, there’s a whole lot of unknowns down there.

This mirage is not new. We pretended we could continue making progress in spite of living in it. It wasn’t right, but we managed. Sometimes.

We can continue to pretend. Or we can look down (have you packed your parachute?).


The Green Mirage: A Simple Example

Let me show you what I mean. This is from my API Analysis Agent: it takes an API endpoint and generates a test plan. My agent has APIs too. I make sure they work.

Or do I?

Here’s a test for it:

def test_controller_handles_missing_data(client):
    response = client.post('/generate-plan', json={"method": "GET"})  # Missing 'path'

    assert response.status_code == 400

Send a POST with a missing field. Get a 400. The code deals with the error correctly.

But does it, really?

Error codes are for kids. Let’s expose some real behavior.

def test_controller_handles_missing_data(client):
    response = client.post('/generate-plan', json={"method": "GET"})  # Missing 'path'

    assert response.status_code == 400
    assert b"Missing required field" in response.data

Same scenario. Same endpoint. Same missing field.

But now we’re also checking the response body. We’re not pretending that the validation “just works” because of the error code. We check that it validated correctly.

Better asserts make for better confidence. They clear the mirage a bit. We can start relying more on that passing green.


A Test Worth Trusting

Now let’s go up a level. The agent doesn’t just validate inputs. It generates test plans. That’s the whole point of it.

Here’s a sanity test for that core behavior:

def test_plan_includes_all_categories():
    plan = get_live_plan(sample_endpoint_data)

    happy_path_index = plan.find("happy path")
    unhappy_path_index = plan.find("unhappy path")
    invalid_keyword_index = plan.find("invalid")

    assert happy_path_index != -1
    assert unhappy_path_index != -1
    assert invalid_keyword_index != -1

This test checks that all test case categories appear in the plan – happy, unhappy, and invalid. The test passes when it should. It fails when it should. It checks that the elements of the requested test plan are actually there.

That passing green LED gives us confidence of the real kind. Not mirage-like.

That’s what a good API test looks like. We should have many of those.


The Assertion Gap

Remember the process of translation? We translate the requirement into testing behavior. What we assert is our definition of “working.”

If we don’t translate well, we get green. But we’re really deep in invisible red.

With AI, we get more code and more behaviors we need to test. We need our test reports to convey that translation correctly.

Before you write a test, or before you ask AI to write one for you, ask this:

“What does this test actually prove?”

Not “what will make it pass.” Not “is the endpoint responding.” Not “does the status code match.”

What does it prove is “working.” And does it show what “not working” looks like.

This is what API test planning actually means now. Not generating tests. Knowing what to ask before you generate anything.

In the AI era, it’s not only good practice. It’s a survival technique.


What’s Next

The Green Mirage is just one of the planning problems we’re tackling in the Modern API Testing webinar series.

On April 15th, we’re running the first session – live, with real APIs, real scenarios. We’re going through test organization, data generation, and a couple of new monsters (Token Gobblers, Knowledge Void, and yes, the Green Mirage in full detail) – and what planning actually looks like when AI is in the room.

One hour. Concrete takeaways. No pretending.

Register Now

Want to go deeper before April 15th? Drop me a message. I’m happy to talk through where your test strategy stands right now.

The post Your API Tests Are Passing. That’s the Problem. first appeared on TestinGil.

Optimize SQL Server TempDB


Explore the importance of SQL Server TempDB and learn how to optimize its performance for a high-functioning environment.

The post Optimize SQL Server TempDB appeared first on MSSQLTips.com.


A Designer’s Thoughts About This Moment in AI

1 Share

I was walking my dog in the woods and decided to share my thoughts about the state of AI and the tension between the trajectory of AI companies and the designers/creators/makers of the world who are under a tremendous deal of pressure to wield this new technology.

We are people. We are people who use tools and technologies to make things, to advance things, to move things forward, to make the world a better place and help people become healthier, happier, and safe.

At least that’s the aspiration. We all fall short of it. But I fundamentally believe that most people working to create things and put them out into the world are doing it because they want to make the world a better place.

That is why this moment in time—this new technology, this AI landscape, and how it’s emerging and how it is being wielded and how it is being managed—is so incredibly diametrically opposed to this mission.

The alarming and reckless AI trajectory

As I see it, the transformation from the AI field as academic research—very carefully studied, more philosophical—into rapid commercialization has set off an arms race of the absolute worst kind.

It’s playing out even this week where we see OpenAI and Anthropic gussying themselves up, trying to position themselves to look attractive to win contracts with the conveniently renamed Department of War.

What’s so incredibly terrifying and galling about these efforts is that it’s right there in the name, right? This conveniently renamed Department of War—defense is one thing, right? But war is another.

War, my friends, is bad. War is not desirable. Killing people is not good. Killing people is antithetical to promoting the health, happiness, and safety of people on this planet and the world at large.

And I know that war is complicated. I know that societal factors, political forces, geopolitical conflict—war is not a cut and dry effort, but holy shit, what are we doing here?

What are we designing? What are we actively trying to accomplish in this world? What are we building the technology for? What are we doing here?

If your goal is to scale, to win, to profit, then you’ll be willing to take this massively powerful & unknown force and plug it into an apparatus that is out there developing weapons to kill other human beings.

The fact that’s even being entertained is galling and worrying and utterly and completely reckless and irresponsible. You create the most powerful and potent technology that the world has ever seen and inject it into the bloodstream of society without doing due diligence. Just holy shit. What do you think is gonna happen?

Even if you put guardrails in place, even if you thoughtfully and safely roll it out, bad things are gonna happen. Just due to the very nature of the technology. The grain of AI.

But what happens when you hook it up to things that have proven themselves to be not particularly great in the “caring about human beings” department? That’s where this truly feels so reckless and irresponsible.

Our Tension As Creators

And here we are: the people that are on the receiving end of this. The people who are ostensibly wielding these design materials to help in our pursuit to make the world better, to transform the world to be a more just and peaceful and loving and happy place. A place where people have their needs taken care of, everyone has abundance, everyone has health, safety, happiness, that we’re all able to learn from each other and grow with each other and to move on from so much of the bullshit of the past.

This is all available to us.

But the issue, and the dissonance, and the juxtaposition as I see it—that I’ve never felt before in my entire life—is this: of course, we’ve all used tools and technologies made by imperfect humans. We are all imperfect humans.

We see this lousy trajectory unfolding; this barreling commercial enterprise with lousy morals and motives. It’s terrible, yet it’s also inescapable. It puts us all in a really shitty spot. It puts us all in a really awful position.

And there’s a few things we can do. The first thing we need to do is acknowledge the shitty situation that we are all in.

As people who create things and put things out into the world, we have to acknowledge that this technology's core building blocks, its origin story, are built on stolen IP, right? And its ongoing impacts and effects on society, on the planet, on everything need to be acknowledged too.

“Just don’t use it” isn’t realistic

I’ve heard plenty of reactions, and one of those reactions, including from people I greatly admire, greatly respect, is: “just don’t use it.”

To which I say: good luck with that.

I equate this situation to veganism. It's undoubtedly a noble pursuit, and the planet would be much better off if everyone were vegan. But of course, convincing everyone on the planet to become vegan is a huge and herculean task.

And in the case of AI, the potential good that can come from this technology—curing cancer, detecting cancer earlier, giving people voice that didn't have voice before—there are so many use cases for the application of this technology that can make the world a better place.

So saying "just don't use it" really discounts that. But again, I understand the spirit of not using it as a form of protest against how this stuff emerged, how it's currently being managed. I get that.

But not everyone has the luxury of just sitting this out, of closing the laptop lid.

My understanding—what I see across the entire industry—is an entire field under so much pressure to learn this, to get their heads around it, to wield it, to figure out how to use it to improve their work. To simply say "no, I'm not going to do this" out of principle is career suicide, right?

So this dynamic, this zeitgeist, this “you need to do this” is very, very strong. This is a very strong current.

And while this strong current shouldn't be inflicted from afar by power structures, I believe that from a pure maker/creator perspective, this absolutely is a potent and potentially really beneficial technology to wield to make things better for people.

So sitting it out—just not using it—is not in the cards for the overwhelming majority of people.

Don’t dilute reality to sleep at night

The other thing we could do is delude ourselves and make excuses for the morally bankrupt and reckless and irresponsible behavior of these companies who are putting this stuff out there.

I get it. It's an understandable knee-jerk reaction. We need to sleep at night. And also we see the upside, right? We see the benefits, we see the potential of this technology. In order to square that circle, we could just say, "Ah, it's actually not that bad."

I think that at no other moment in time has it been so incredibly important to be able to hold multiple thoughts in our heads at the same time.

The radically transformative power of this technology is huge, and its potential to make the world a better place is real.

The technology itself is neither positive nor negative, nor is it neutral. But the companies and the people who are perpetuating this arms race to win, to scale, to profit from it—we need to acknowledge that for what it is.

Michael Jackson and The Need For Nuance

As I’ve been thinking about this, I keep coming back to Michael Jackson. Because Michael Jackson produced some of the most powerful, transformative, amazing, beautiful human art and expression that the world has ever seen. Truly the King of Pop.

And yet, when you look at his personal life and you look at the misdeeds and you look at the behaviors and you look at some of the absolute atrocities perpetrated by him—it's very difficult to reconcile that.

Cause what do you do? Do you not listen to Billie Jean? Do you not listen to Thriller? Do you not say that those are good songs?

No. The art is truly amazing.

But Michael Jackson is the boss level of separating the art from the artist, and I haven’t been able to reconcile that.

So I’ve been forced into a place that I ultimately think is a healthy attitude to cultivate: these things can both be true. Michael Jackson’s music is utterly and completely amazing, transformative, incredible, continues to transform the world even today. And also he did some horrible things in his time on earth.

And there’s no reconciliation of that, especially now that he’s gone.

So I bring that same tension to this moment in time.

Here is this technology that isn't just pure art. I think it's important to acknowledge that difference, that this isn't just overwhelmingly and only a positive technology. It is a powerful technology that can be used for good and it can also be used for really bad things. So that's one big difference to stress.

But unlike Michael Jackson, the people, companies, organizations that are creating this technology, releasing it out into the world are still here.

So what do we do?

I genuinely don’t know and certainly won’t pretend to have all the answers, but I do know that there are some places we can go for guidance.

I feel so incredibly fortunate to have come of age at the same time that the World Wide Web was coming of age as well.

And all these years later—after all of the strange and terrible left turns and bad things that have happened on the web and with the web—the foundational principles, ideals, and values of the World Wide Web are still there.

They’re still intact. They are still a noble pursuit. They are still the vision. They’re still the North Star all these years later. All we have to do is realign ourselves and reintroduce that vision.

Those principles, those values, that commitment to betterment, that commitment to making the world a better place for more people using technology—they have to be rekindled. They have to be translated into this new technology landscape. We have to reclaim that because the stakes are fucking high.

We have to align our technology to work in pursuit of the betterment of humankind and all life on this planet, and for the planet itself.

So obviously, there’s a lot of different actions we could take.

We can advocate. We can complain. We can apply pressure—although it seems sometimes quite futile to try to convince certain companies, organizations, and people to suddenly grow a conscience, grow a moral center.

But there’s lots of things we can do.

We can adopt and seek out healthier alternatives. We can invent healthier alternatives, right?

Another important thing to do is to really take the time to reflect on and establish your own values and principles and how you wield this powerful new design medium.

And not just to do that on an individual level, but with all the other people that you create with.

What do you care about? What are you working towards? What are lines you won’t cross? What is all of this in pursuit of?

In the course we’re putting together, we lead with these values and principles and talk about the importance of being human-centric and working towards the betterment of humankind.

A slide that has words laid out like bricks: humanity, safety, integrity, responsibility, nuance, quality, accessibility, foundations, context, collaboration, curiosity, multiplicity, practicality, durability

Not replacing people, but rather enhancing people. Being responsible. Being ethical. Thinking about sustainability, both in terms of environmental impact, but also in the durability of the things that we create, right? We need to care about quality. We need to care about the impacts that our work has on the world.

We need values and principles to guide every step of the way, especially as the models are changing every day, the tool landscape is a fast-moving frenzy, it's exhausting, and we're getting pulled in so many different directions. Values and principles serve as a solid foundation to stand on as the fast-moving currents of this landscape continue to evolve with breakneck speed.

It doesn’t matter how you go about establishing your values and principles about this moment in time, but it’s fucking imperative that you have a perspective.

And that leads me to my last favor to ask of you, which is that as somebody who is creating and making and putting things out into the world, really recommit to whatever that is—whatever energy you’re putting out into the world—to have it be in pursuit of the betterment of humankind, of life on this planet, of nature, of everything.

This is a really consequential moment. I fundamentally believe in us. I fundamentally believe in humanity.

And so it’s gonna take us all—in our collective actions and decisions—to design a better world.

So let’s have at it.

Thanks.
