If you want to win in AI - and I mean win in the biggest, most lucrative, most shape-the-world-in-your-image kind of way - you have to do a bunch of hard things simultaneously. You need to have a model that is unquestionably one of the best on the market. You need the nearly infinite resources required to continue to improve that model and deploy it at massive scale. You need at least one AI-based product that lots of people use, and ideally more than one. And you need access to as much of your users' other data - their personal information, their online activity, even the files on their computer - as you can possibly get.
We’re excited to announce a new GeekWire event for 2026: “Agents of Transformation: Inside the AI Shift.” This half-day summit will be held the afternoon of Tuesday, March 24, in Seattle, exploring how agentic AI is reshaping work, creativity, and leadership.
The event, presented by Accenture, features fireside chats, expert panels, and real-world stories from technology leaders, business execs, and others navigating how AI is changing the way we work and lead, from copilots and automation to the rise of intelligent agents.
Tickets are available now, with discounted early bird rates set to end Feb. 24. Speakers will be announced in the coming weeks.
AI agents are the tech industry’s obsession right now, but there can be a big gap between the pitch and the reality. We’re bringing together people who are in the thick of it to talk candidly about what they’re seeing: breakthroughs, challenges, and what comes next.
The event is part of GeekWire’s longstanding tradition of convening tech, business, and policy leaders for insights and new connections. Hosted at one of our favorite Seattle venues, Block 41, the afternoon will include networking opportunities before, during, and after the program, bringing together founders, executives, and technologists from across the region.
It builds on an ongoing GeekWire editorial series, underwritten by Accenture, spotlighting how startups, developers and tech giants are using intelligent agents to innovate.
Since its founding in 2019, GitHub Security Lab has had one primary goal: community-powered security. We believe that the best way to improve software security is by sharing knowledge and tools, and by using open source software so that everybody is empowered to audit the code and report any vulnerabilities that they find.
Six years later, a new opportunity has emerged to take community-powered security to the next level. Thanks to AI, we can now use natural language to encode, share, and scale our security knowledge, which will make it even easier to build and share new security tools. And under the hood, we can use Model Context Protocol (MCP) interfaces to build on existing security tools like CodeQL.
As a community, we can eliminate software vulnerabilities far more quickly if we share our knowledge of how to find them. With that goal in mind, our team has been experimenting with an agentic framework called the GitHub Security Lab Taskflow Agent. We’ve been using it internally for a while, and we also recently shared it with the participants of the GitHub Secure Open Source Fund. Although it’s still experimental, it’s ready for others to use.
Demo: Variant analysis
It takes only a few steps to get started with seclab-taskflow-agent:
Create a personal access token.
Add codespace secrets.
Start a codespace.
Run a taskflow with a one-line command.
Please follow along and give it a try!
Note: This demo will use some of your token quota, and it’s possible that you’ll hit rate limits, particularly if you’re using a free GitHub account. But I’ve tried to design the demo so that it will work on a free account. The quotas will refresh after one day if you do hit the rate limits.
For security reasons, it’s not a good idea to save the PAT that you just created in a file on disk. Instead, I recommend saving it as a “codespace secret” named GH_TOKEN, which means it’ll be available as an environment variable when you start a codespace in the next step.
Now go back to your codespaces settings and create a second secret named AI_API_TOKEN. You can use the same PAT for both secrets.
We use two secrets so that GH_TOKEN is used to access GitHub’s API and do things like read the code, while AI_API_TOKEN is used to access the AI API. Only one PAT is needed for this demo because it uses the GitHub Models API, but the framework also supports using other (non-GitHub) APIs for the AI requests.
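Once the codespace is running, both secrets should be exposed as environment variables. If you want a quick sanity check before running the demo, a couple of lines of Python are enough; this snippet is just an illustration, not part of the framework:

```python
# Quick sanity check (illustrative only): confirm the two codespace
# secrets are visible as environment variables inside the codespace.
import os

for name in ("GH_TOKEN", "AI_API_TOKEN"):
    print(f"{name}: {'set' if os.environ.get(name) else 'MISSING'}")
```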
Answer “yes” when it asks for permission to run memcache_clear_cache; this is the first run so the cache is already empty. The demo downloads and analyzes a security advisory from the repository (in this example, GHSA-c944-cv5f-hpvr from cmark-gfm). It tries to identify the source code file that caused the vulnerability, then it downloads that source code file and audits it for other similar bugs. It’s not a sophisticated demo, and (thankfully) it has not found any new bugs in cmark-gfm 🫣. But it’s short and simple, and I’ll use it later to explain what a taskflow is. You can also try it out on a different repository, maybe one of your own, by changing the repo name at the end of the command.
Other ways to run
I recommend using a codespace because it’s a quick, reliable way to get started. It’s also a sandboxed environment, which is good for security. But there are other ways to run the framework if you prefer.
Running in a Linux terminal
These are the commands to install and run the demo locally on a Linux system:
These commands download our latest release from PyPI. Note that some of the toolboxes included with the framework may not work out-of-the-box with this approach because they depend on other software being installed. For example, the CodeQL toolbox depends on CodeQL being installed. You can copy the installation instructions from the devcontainer configuration that we use to build our codespaces environment.
Running in Docker
We publish a Docker image with tools like CodeQL pre-installed. You can run it with this script. Be aware that this Docker image only includes seclab-taskflow-agent. We are planning to publish a second “batteries included” image that also includes seclab-taskflows in the future. Note: I’ll explain the relationship between seclab-taskflow-agent and seclab-taskflows in the section about the collaboration model.
Taskflows
A taskflow is a YAML file containing a list of tasks for the framework to execute. Let’s look at the taskflow for my demo (source):
```yaml
seclab-taskflow-agent:
  filetype: taskflow
  version: 1
globals:
  repo:
  ghsa:
taskflow:
  - task:
      must_complete: true
      agents:
        - seclab_taskflow_agent.personalities.assistant
      toolboxes:
        - seclab_taskflow_agent.toolboxes.memcache
      user_prompt: |
        Clear the memory cache.
  - task:
      must_complete: true
      agents:
        - seclab_taskflow_agent.personalities.assistant
      toolboxes:
        - seclab_taskflows.toolboxes.ghsa
        - seclab_taskflows.toolboxes.gh_file_viewer
        - seclab_taskflow_agent.toolboxes.memcache
      user_prompt: |
        Fetch the details of the GHSA {{ GLOBALS_ghsa }} of the repo {{ GLOBALS_repo }}.

        Analyze the description to understand what type of bug caused
        the vulnerability. DO NOT perform a code audit at this stage, just
        look at the GHSA details.

        Check if any source file is mentioned as the cause of the GHSA.
        If so, identify the precise file path and line number.
        If no file path is mentioned, then report back to the user that
        you cannot find any file path and end the task here.

        The GHSA may not specify the full path name of the source
        file, or it may mention the name of a function or method
        instead, so if you have difficulty finding the file, try
        searching for the most likely match.

        Only identify the file path for now, do not look at the code or
        fetch the file contents yet.

        Store a summary of your findings in the memcache with the GHSA
        ID as the key. That should include the file path and the function that
        the file is in.
  - task:
      must_complete: true
      agents:
        - seclab_taskflow_agent.personalities.assistant
      toolboxes:
        - seclab_taskflows.toolboxes.gh_file_viewer
        - seclab_taskflow_agent.toolboxes.memcache
      user_prompt: |
        Fetch the GHSA ID and summary that were stored in the memcache
        by the previous task.

        Look at the file path and function that were identified. Use the
        get_file_lines_from_gh tool to fetch a small portion of the file instead of
        fetching the entire file.

        Fetch the source file that was identified as the cause of the
        GHSA in repo {{ GLOBALS_repo }}.

        Do a security audit of the code in the source file, focusing
        particularly on the type of bug that was identified as the
        cause of the GHSA.
```
You can see that it’s quite similar in structure to a GitHub Actions workflow. There’s a header at the top, followed by the body, which contains a series of tasks. The tasks are executed in order by the agent framework. Let’s go through the sections, focusing on the most important bits:
Header
The first part of the header defines the file type. The most frequently used file types are:
taskflow: Describes a sequence of tasks for the framework to execute.
personality: It’s often useful to ask the agent to assume a particular personality while executing a task. For example, we have an action_expert personality that is useful for auditing GitHub Actions workflows.
toolbox: Contains instructions for running an MCP server. For example, the demo uses the gh_file_viewer toolbox for downloading source code files from GitHub.
The globals section defines global variables named “repo” and “ghsa,” which we initialized with the command-line arguments -g repo=github/cmark-gfm and -g ghsa=GHSA-c944-cv5f-hpvr. It’s a crude way to parameterize a taskflow.
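To make that concrete, here is a minimal sketch of the kind of substitution the globals mechanism performs. It is not the framework’s actual implementation; it only shows how the -g key=value arguments could end up replacing the {{ GLOBALS_key }} placeholders in a prompt:

```python
# Minimal sketch (not the framework's real code) of substituting
# -g key=value arguments into {{ GLOBALS_<key> }} placeholders.
def render_prompt(template: str, global_args: dict[str, str]) -> str:
    for key, value in global_args.items():
        template = template.replace("{{ GLOBALS_" + key + " }}", value)
    return template

prompt = "Fetch the details of the GHSA {{ GLOBALS_ghsa }} of the repo {{ GLOBALS_repo }}."
print(render_prompt(prompt, {"repo": "github/cmark-gfm", "ghsa": "GHSA-c944-cv5f-hpvr"}))
```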
Task 1
Tasks always specify a “personality” to use. For non-specialized tasks, we often just use the assistant personality.
Each task starts with a fresh context, so the only way to communicate a result from one task to the next is by using a toolbox as an intermediary. In this demo, I’ve used the memcache toolbox, which is a simple key-value store. We find that this approach is better for debugging, because it means that you can rerun an individual task with consistent inputs when you’re testing it.
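Conceptually, the memcache toolbox is just a key-value store that one task writes to and a later task reads from. The sketch below illustrates that handoff pattern; it is not the toolbox’s actual code, and the stored values are placeholders:

```python
# Illustration of the task-to-task handoff pattern (not the real
# memcache toolbox): one task stores a summary under the GHSA ID,
# and the next task, starting from a fresh context, fetches it back.
cache: dict[str, str] = {}

def memcache_store(key: str, value: str) -> None:
    cache[key] = value

def memcache_fetch(key: str) -> str | None:
    return cache.get(key)

# Task 2 would store something like this (placeholder values):
memcache_store("GHSA-xxxx-xxxx-xxxx", "bug type, suspected file path, and function name")
# Task 3 would fetch it back by the same key:
print(memcache_fetch("GHSA-xxxx-xxxx-xxxx"))
```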
This task also demonstrates that toolboxes can ask for confirmation before doing something potentially destructive, which is an important protection against prompt injection attacks.
Task 2
This task uses the ghsa toolbox to download the security advisory from the repository and the gh_file_viewer toolbox to find the source file that’s mentioned in the advisory. It creates a summary and uses the memcache toolbox to pass it to the next task.
Task 3
This task uses the memcache toolbox to fetch the results from the previous task and the gh_file_viewer toolbox to download the source code and audit it.
Often, the wording of a prompt is more subtle than it looks, and this third task is an example of that. Previous versions of this task tried to analyze the entire source file in one go, which used too many tokens. So the second paragraph, which asks the agent to analyze only a “small portion of the file,” is essential to making this task work successfully.
Taskflows summary
I hope this demo has given you a sense of what a taskflow is. You can find more detailed documentation in README.md and GRAMMAR.md. You can also find more examples in this subdirectory of seclab-taskflow-agent and this subdirectory of seclab-taskflows.
Collaboration model
We would love for members of the community to publish their own suites of taskflows. To make collaboration easy, we have built on top of Python’s packaging ecosystem. Our own two repositories are published as packages on PyPI: seclab-taskflow-agent and seclab-taskflows.
The reason we have two repositories is that we want to separate the “engine” from the suites of taskflows that use it. Also, seclab-taskflows is intended to be an easy-to-copy template for anybody who would like to publish their own suite of taskflows. To get started on your own package, we recommend using the hatch new command to create the initial project structure. It will generate things like the pyproject.toml file, which you’ll need for uploading to PyPI. Next, we recommend creating a directory structure like ours, with sub-directories for taskflows, toolboxes, etc. Feel free to also copy other parts of seclab-taskflows, such as our publish-to-pypi.yaml workflow, which automatically uploads your package to PyPI when you push a tag with a name like “v1.0.0.”
An important feature of the collaboration model is that it is also easy to share MCP servers. For example, check out the MCP servers that are included with the seclab-taskflows package. Each MCP server has a corresponding toolbox YAML file (in the toolboxes directory) which contains the instructions for running it.
The import system
Taskflows often need to refer to other files, like personalities or toolboxes. And for the collaboration model to work well, we want you to be able to reuse personalities and toolboxes from other packages. We are leveraging Python’s importlib to make it easy to reference a file from a different package. To illustrate how it works, here’s an example in which seclab-taskflows is using a toolbox from seclab-taskflow-agent:
The implementation splits the name seclab_taskflow_agent.toolboxes.memcache into a directory (seclab_taskflow_agent.toolboxes) and a filename (memcache). Then it uses Python’s importlib.resources.files to locate the directory and loads the file named memcache.yaml from that directory. The only quirk of this system is that names always need to have at least two parts, which means that your files always need to be stored at least one directory deep. But apart from that, we’re using Python’s import system as is, which means that there’s plenty of documentation and advice available online.
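Here is a small sketch of that lookup using importlib.resources. It mirrors the behavior described above, although the framework’s own code may differ in details such as error handling:

```python
# Sketch of the dotted-name lookup described above (details such as
# error handling may differ from the framework's implementation).
from importlib.resources import files

def load_yaml_resource(name: str) -> str:
    # "seclab_taskflow_agent.toolboxes.memcache" splits into the package
    # "seclab_taskflow_agent.toolboxes" and the file stem "memcache".
    package, _, stem = name.rpartition(".")
    if not package:
        raise ValueError("resource names need at least two dotted parts")
    return files(package).joinpath(stem + ".yaml").read_text()

toolbox_yaml = load_yaml_resource("seclab_taskflow_agent.toolboxes.memcache")
```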
Project vision
We have two main goals with this project. The first is to encourage community-powered security. Many of the agentic security tools that are currently popping up are closed-source black boxes, which is the antithesis of what we stand for as a team. We want people to be able to look under the hood and see how the taskflows work. And we want people to be able to easily create and share their own taskflows. As a community, we can eliminate software vulnerabilities far more quickly if we share our knowledge of how to find them. We’re hoping that taskflows can be an effective tool for that.
The second is to create a tool that we want to use ourselves. As a research team, we want a tool that’s good for rapid experimentation. We need to be able to quickly create a new security rule and try it out. With that in mind, we’re not trying to create the world’s most polished or efficient tool, but rather something that’s easy to modify.
TL;DR: Today, we’re releasing a new episode of our podcast AI & I, where Dan Shipper sits down with Nir Zicherman, the CEO and cofounder of AI learning platform Oboe, to talk about how to use LLMs to teach yourself anything. Watch on X or YouTube, or listen on Spotify or Apple Podcasts.
Was this newsletter forwarded to you? Sign up to get it in your inbox.
LLMs have made it absurdly easy to go deep on almost any topic. So why haven’t we all used ChatGPT to earn the college degrees we wish we’d majored in, or pursued a niche interest, like learning how to name the trees in our neighborhood? I know I’m not the only one to feel guilty about well-intentioned attempts at autodidacticism that inevitably peter out.
Entrepreneur Nir Zicherman has a reason for this disconnect: LLMs can answer most of your questions, but they won’t notice when you’re lost or pull you back in when your motivation starts to fade.
As the CEO and cofounder of Oboe, a platform that uses AI to generate personalized courses about everything from the history of snowboarding to JavaScript fundamentals, Zicherman has thought deeply about why the ability to access information does not automatically lead to understanding a concept. In this episode of AI & I, he talks to Dan Shipper about everything he’s learned about learning with LLMs.
Enterprise developers know the grind: wrestling with legacy code, navigating complex dependency challenges, and waiting on security reviews that stall releases. OpenAI’s GPT-5.2-Codex flips that equation and helps engineers ship faster without cutting corners. It’s not just autocomplete; it’s a reasoning engine for real-world software engineering.
Generally available starting today through Azure OpenAI in Microsoft Foundry Models, GPT-5.2-Codex is built for the realities of enterprise codebases: large repos, evolving requirements, and security constraints that can’t be overlooked. As OpenAI’s most advanced agentic coding model, it brings sustained reasoning and security-aware assistance directly into the workflows enterprise developers already rely on, backed by Microsoft’s secure and reliable infrastructure.
GPT-5.2-Codex at a Glance
GPT-5.2-Codex is designed for how software gets built in enterprise teams. You start with imperfect inputs (legacy code, partial docs, screenshots, diagrams) and work through multi-step changes, reviews, and fixes. The model helps keep context, intent, and standards intact across that entire lifecycle, so teams can move faster without sacrificing quality or security.
What it enables
Work across code and artifacts: Reason over source code alongside screenshots, architecture diagrams, and UI mocks, so implementation stays aligned with design intent.
Stay productive in long-running tasks: Maintain context across migrations, refactors, and investigations, even as requirements evolve.
Build and review with security in mind: Get practical support for secure coding patterns, remediation, reviews, and vulnerability analysis, where correctness matters as much as speed.
Feature Specs (quick reference)
Context window: 128K tokens (approximately 100K lines of code)
Supported languages: 50+ including Python, JavaScript/TypeScript, C#, Java, Go, Rust
Multimodal inputs: Code, images (UI mocks, diagrams), and natural language
API compatibility: Drop-in replacement for existing Codex API calls
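As a rough illustration of what that drop-in usage can look like, here is a minimal call through the openai Python SDK’s Azure client. The endpoint, API version, and deployment name are placeholders; check your Foundry resource for the values that apply to you, and note that your deployment may expose a different API surface:

```python
# Illustrative only: a minimal Azure OpenAI chat call. The endpoint,
# API version, and deployment name are placeholders for this sketch.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",  # placeholder; use the version your resource supports
)

response = client.chat.completions.create(
    model="gpt-5.2-codex",  # your deployment name may differ
    messages=[
        {"role": "system", "content": "You are a careful code reviewer."},
        {"role": "user", "content": "Review this function for unsafe error handling: ..."},
    ],
)
print(response.choices[0].message.content)
```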
Use cases where it really pops
Legacy modernization with guardrails: Safely migrate and refactor “untouchable” systems by preserving behavior, improving structure, and minimizing regression risk.
Large-scale refactors that don’t lose intent: Execute cross-module updates and consistency improvements without the typical “one step forward, two steps back” churn.
AI-assisted code review that raises the floor: Catch risky patterns, propose safer alternatives, and improve consistency, especially across large teams and long-lived codebases.
Defensive security workflows at scale: Accelerate vulnerability triage, dependency/path analysis, and remediation when speed matters, but precision matters more.
Lower cognitive load in long, multi-step builds: Keep momentum across multi-hour sessions of planning, implementing, validating, and iterating with context intact.
Pricing
| Model | Input Price/1M Tokens | Cached Input Price/1M Tokens | Output Price/1M Tokens |
|---|---|---|---|
| GPT-5.2-Codex | $1.75 | $0.175 | $14.00 |
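As a back-of-the-envelope example of what those rates mean per request (illustrative arithmetic only; actual billing depends on your Azure agreement and usage):

```python
# Illustrative cost arithmetic at the list prices above (USD per 1M tokens).
INPUT, CACHED_INPUT, OUTPUT = 1.75, 0.175, 14.00

def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    return (input_tokens * INPUT + cached_tokens * CACHED_INPUT + output_tokens * OUTPUT) / 1_000_000

# e.g. 50K fresh input tokens and 10K output tokens:
print(f"${request_cost(50_000, 10_000):.2f}")  # roughly $0.23
```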
Security-Aware by Design, Not as an Afterthought
For many organizations, AI adoption hinges on one non-negotiable question: Can this be trusted in security-sensitive workflows?
GPT-5.2-Codex meaningfully advances the Codex lineage in this area. As models grow more capable, we’ve seen that general reasoning improvements naturally translate into stronger performance in specialized domains, including defensive cybersecurity.
With GPT-5.2-Codex, this shows up in practical ways:
Improved ability to analyze unfamiliar code paths and dependencies
Stronger assistance with secure coding patterns and remediation
More dependable support during code reviews, vulnerability investigations, and incident response
At the same time, Microsoft continues to deploy these capabilities thoughtfully, balancing access, safeguards, and platform-level controls, so enterprises can adopt AI responsibly as capabilities evolve.
Why Run GPT-5.2-Codex on Microsoft Foundry?
Powerful models matter, but where and how they run matters just as much for enterprises.
Organizations choose Microsoft Foundry because it combines Foundry’s frontier AI with Azure’s enterprise-grade fundamentals:
Integrated security, compliance, and governance: Deploy GPT-5.2-Codex within existing Azure security boundaries, identity systems, and compliance frameworks without reinventing controls.
Enterprise-ready orchestration and tooling: Build, evaluate, monitor, and scale AI-powered developer experiences using the same platform teams already rely on for production workloads.
A unified path from experimentation to scale: Foundry makes it easier to move from proof of concept to real deployment without changing platforms, vendors, or operating assumptions.
Trust at the platform level: For teams working in regulated or security-critical environments, Foundry and Azure provide assurances that go beyond the model itself.
Together with GitHub Copilot, Microsoft Foundry provides a unified developer experience, from in-IDE assistance to production-grade AI workflows, backed by Azure’s security, compliance, and global scale. This is where GPT-5.2-Codex becomes not just impressive, but adoptable.
Get Started Today
Explore GPT-5.2-Codex in Microsoft Foundry today. Start where you already work: try GPT-5.2-Codex in GitHub Copilot for everyday coding, and scale the same model to larger workflows using Azure OpenAI in Microsoft Foundry. Let’s build what’s next with speed and security.