Microsoft’s first in-house AI image generator, MAI-Image-1, is now available in two products: Bing Image Creator and Copilot Audio Expressions. The company announced the model in October. Microsoft AI chief Mustafa Suleyman wrote in a post on X that the text-to-image model will be “coming soon” to the EU.
Suleyman added that the model “really excels at” generating images of food and nature scenes, as well as artsy lighting and photorealistic detail.
Microsoft has previously posted more details on its blog: “MAI-Image-1 excels at generating photorealistic imagery, like lighting (e.g., bounce light, reflections), landscapes, and much more. This is particularly so when compared to many larger, slower models. Its combination of speed and quality means users can get their ideas on screen faster, iterate through them quickly, and then transfer their work to other tools to continue refining.”
Microsoft’s MAI-Image-1 will also create AI-generated art to accompany AI-generated audio stories in the “story mode” of Copilot’s text-to-speech platform, Copilot Audio Expressions.
In August, Microsoft announced its first in-house AI models – the speech model MAI-Voice-1 and the text-based model MAI-1-preview. At the time, the company said it planned to use MAI-1-preview in its Copilot AI assistant in certain unspecified cases, a sign that Microsoft might be pivoting away from its reliance on OpenAI’s models. As of today, Microsoft says that its Copilot chatbot is transitioning to OpenAI’s latest model, GPT-5, while also offering Anthropic’s Claude AI models as options to users.
MAI-Image-1 is listed as one of the three AI models available on Bing’s image creator website and app. The other two models, DALL-E 3 and GPT-4o, are from OpenAI.
The future of updating Windows is here! Richard talks to Aria Hanson about Windows Autopatch, the consolidation of Microsoft's various update mechanisms to keep your managed Windows devices current. Aria discusses the deprecation of Windows Server Update Services (WSUS) and the move to always-on cloud updates. The conversation turns to the various techniques for quickly rolling out updates to machines while still allowing the process to stop if there are problems, such as first updating IT workstations, then beta testers, and so on. There's also the Windows Roadmap to give you a heads-up on new features and updates coming, so you can decide in advance whether they will be available on your machines, without changing your overall update policies!
In this video, I walk through Step 3 of the excellent Microsoft Learn tutorial "Getting Started with GitHub Copilot."
In this step, you learn to use the Edit Mode of GitHub Copilot to generate code for your application.
Great day at the office, and a terrific reading list (if I may say so) today. Many good “think” pieces versus product announcements.
[blog] Most of What We Call Progress. Wonderful post on the things that many of us tend to learn over time while building systems.
[blog] Leaving the Cloud Isn’t for Everyone. Right on. For a subset of folks, running their own hardware and systems is the right thing competitively, and from a business sense. For the other 93%, you’re wasting time and money doing it yourself.
[blog] The Data Engineering Agent is now in preview. Looks cool. It doesn’t look like it replaces a data engineer, as you still need to know what to ask for. But this type of assistance makes smart people better.
[blog] Why Log Data Management Is a Thing. It’s a question of data, says James. Here’s a good roundup of some of the players trying to get log management under control.
[blog] Medium CEO explains how AI is changing writing. Writing to learn and think is important. It’s why I don’t use any AI on any of this, or other newsletters I write. But other circumstances don’t require so much “thinking” to occur.
Code agents are AI systems that can generate high-quality code and work smoothly with code interpreters. These capabilities help streamline complex software development workflows, which has led to their widespread adoption.
However, this progress also introduces critical safety and security risks. Existing static safety benchmarks and red-teaming methods—in which security researchers simulate real-world attacks to identify security vulnerabilities—often fall short when evaluating code agents. They may fail to detect emerging real-world risks, such as the combined effects of multiple jailbreak tools. In the context of code, effective red-teaming requires more than simply checking whether the target code agent rejects unsafe requests. Instead, the agent must generate and execute correct code that performs the intended risky functionality, making it essential to evaluate execution behaviors beyond static code analysis.
To address these challenges, researchers from the University of Chicago, University of Illinois Urbana–Champaign, VirtueAI, the UK AI Safety Institute, University of Oxford, UC Berkeley, and Microsoft Research recently proposed RedCodeAgent, the first fully automated and adaptive red-teaming agent designed specifically to evaluate the safety of large language model (LLM)-based code agents.
Comprehensive experimental results demonstrate the effectiveness and efficiency of RedCodeAgent across (1) diverse Common Weakness Enumeration (CWE) vulnerabilities and malware types, (2) multiple programming languages—including Python, C, C++, and Java—and (3) a wide range of code agents, such as OpenCodeInterpreter, ReAct, MetaGPT, and commercial agents like Cursor and Codeium. RedCodeAgent also uncovers common vulnerabilities across agents such as generating and executing unsafe code, exposes variations in red-teaming difficulty across goals, identifies frequently triggered attack tools, and detects previously unknown vulnerabilities that all other baseline methods overlook.
Framework for automatic red-teaming against code agents
Figure 1: Illustration of RedCodeAgent performing automatic red-teaming against a target code agent
As shown in Figure 1, RedCodeAgent is equipped with a memory module that accumulates successful attack experiences, enabling the system to continuously learn and adapt its attack strategies. After learning from the previous experiences, RedCodeAgent further leverages a tailored toolbox that combines representative red-teaming tools with a specialized code substitution module, enabling realistic and diverse code-specific attack simulations through function calling. Based on the target agent’s responses across multiple interactive trials, RedCodeAgent optimizes its strategies, systematically probing for weaknesses and vulnerabilities in real time.
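The paper does not publish implementation code, but the loop described above can be sketched in miniature. In the hypothetical sketch below, `Memory`, `red_team`, and the toy tools and agent are all illustrative names, not the authors' actual interfaces; the key property preserved is that success is judged by whether the risky behavior executes, not by whether the agent merely declines to refuse.

```python
# Illustrative sketch of RedCodeAgent's adaptive loop. All names here are
# hypothetical; the paper describes the design, but this is not the
# authors' implementation.

class Memory:
    """Accumulates successful attack experiences keyed by risk scenario."""
    def __init__(self):
        self.successes = {}

    def record(self, scenario, tools):
        self.successes.setdefault(scenario, []).append(tools)

    def suggest(self, scenario):
        # Reuse the most recent tool combination that worked, if any.
        past = self.successes.get(scenario)
        return past[-1] if past else []


def red_team(scenario, target_agent, tools, memory, max_trials=5):
    """Iteratively mutate the prompt with red-teaming tools until the target
    agent actually executes the risky behavior (not merely fails to refuse)."""
    prompt = scenario["base_prompt"]
    chosen = list(memory.suggest(scenario["name"]))
    for trial in range(max_trials):
        candidate = prompt
        for tool in chosen:
            candidate = tool(candidate)    # e.g., GCG suffix, code substitution
        result = target_agent(candidate)   # run in a sandbox, observe behavior
        if result["executed_risky_code"]:
            memory.record(scenario["name"], chosen)
            return {"success": True, "trials": trial + 1, "prompt": candidate}
        # Escalate: add the next unused tool and try again.
        unused = [t for t in tools if t not in chosen]
        if unused:
            chosen.append(unused[0])
    return {"success": False, "trials": max_trials, "prompt": candidate}


# Toy target agent: the attack "lands" only after both mutations are applied.
def toy_agent(prompt):
    return {"executed_risky_code": "pathlib" in prompt and "[ADV]" in prompt}

substitute = lambda p: p.replace("os.remove", "pathlib unlink")   # stand-in for Code Substitution
gcg_suffix = lambda p: p + " [ADV]"                               # stand-in for a GCG suffix

mem = Memory()
out = red_team({"name": "delete-file", "base_prompt": "delete /tmp/x with os.remove"},
               toy_agent, [substitute, gcg_suffix], mem)
```

On a later run against the same scenario, `memory.suggest` would return the winning tool combination immediately, which is the sense in which the memory module lets the agent adapt across trials.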
In the evaluation phase, RedCodeAgent integrates simulated sandbox environments to enable code execution and assess the impact of the resulting behaviors. This sandbox-based evaluation ensures a more robust assessment of harmful behaviors and addresses the potential biases of previous static methods that rely solely on “LLM-as-a-judge” evaluations.
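To make the contrast with "LLM-as-a-judge" concrete, the sketch below (again hypothetical, not the authors' harness) judges an attack by the side effect it actually produces in a throwaway environment, rather than by rating the generated text.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Illustrative sketch, not the authors' harness: judge an attack by the effect
# it actually has in a throwaway environment, instead of asking an LLM judge
# to rate the generated text.

def executed_successfully(generated_code: str, target_filename: str) -> bool:
    """Run agent-generated code in a scratch directory and check whether the
    risky behavior (here, deleting a planted file) really happened."""
    with tempfile.TemporaryDirectory() as workdir:
        target = Path(workdir) / target_filename
        target.write_text("sensitive")
        # A production harness would use an isolated container; a subprocess
        # running in a scratch directory stands in for the sandbox here.
        subprocess.run([sys.executable, "-c", generated_code],
                       cwd=workdir, capture_output=True, timeout=10)
        return not target.exists()  # success == the file is actually gone

risky = "import pathlib; pathlib.Path('secrets.txt').unlink()"
refusal = "print('I cannot help with that')"
# executed_successfully(risky, "secrets.txt") -> True: the behavior occurred
# executed_successfully(refusal, "secrets.txt") -> False: text alone proves nothing
```

A static judge reading the second snippet might score it either way; the execution check is unambiguous, which is the bias the sandbox-based evaluation is meant to remove.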
A case study is shown in Figure 2. Initially, RedCodeAgent discovers that its request is rejected, so it calls the Greedy Coordinate Gradient (GCG) algorithm to bypass the safety guardrail. After the second request is also rejected by the code agent, RedCodeAgent invokes both Code Substitution and GCG to optimize the prompt. Ultimately, RedCodeAgent successfully combines the suggestion from Code Substitution (i.e., using pathlib) with the adversarial suffix generated by GCG, making the target code agent delete the specified file.
Figure 2: A case study of RedCodeAgent calling different tools to successfully attack the target code agent
Insights from RedCodeAgent
Experiments on diverse benchmarks show that RedCodeAgent achieves both a higher attack success rate (ASR) and a lower rejection rate, revealing several key findings outlined below.
Using traditional jailbreak methods alone does not necessarily improve ASR on code agents
The optimized prompts generated by GCG, AmpleGCG, Advprompter, and AutoDAN do not always achieve a higher ASR compared with static prompts with no jailbreak, as shown in Figure 3. This is likely due to the difference between code-specific tasks and general malicious request tasks in LLM safety. In the context of code, it is not enough for the target code agent to simply avoid rejecting the request; the target code agent must also generate and execute code that performs the intended function. Previous jailbreak methods do not guarantee this outcome. However, RedCodeAgent ensures that the input prompt has a clear functional objective (e.g., deleting specific sensitive files). RedCodeAgent can dynamically adjust based on evaluation feedback, continually optimizing to achieve the specified objectives.
Figure 3: RedCodeAgent achieves the highest ASR compared with other methods
RedCodeAgent exhibits adaptive tool utilization
RedCodeAgent can dynamically adjust its tool usage based on task difficulty. Figure 4 shows that the combination of tool calls differs across tasks. For simpler tasks, where the baseline static test cases already achieve a high ASR, RedCodeAgent spends little time invoking additional tools, demonstrating its efficiency. For more challenging tasks, where the baseline static test cases in RedCode-Exec achieve a lower ASR, we observe that RedCodeAgent spends more time using advanced tools like GCG and Advprompter to optimize the prompt for a successful attack. As a result, the average time spent invoking different tools varies across tasks, indicating that RedCodeAgent adapts its strategy to the specific task.
Figure 4: Average time cost for RedCodeAgent to invoke different tools or query the target code agent in successful cases for each risk scenario
RedCodeAgent discovers new vulnerabilities
In scenarios where other methods fail to find successful attack strategies, RedCodeAgent is able to discover new, feasible jailbreak approaches. Quantitatively, we find that RedCodeAgent discovers 82 unique vulnerabilities (out of 27 × 30 = 810 cases in the RedCode-Exec benchmark) on the OpenCodeInterpreter code agent and 78 on the ReAct code agent. These are cases where all baseline methods fail to identify the vulnerability, but RedCodeAgent succeeds.
Summary
RedCodeAgent combines adaptive memory, specialized tools, and simulated execution environments to uncover real-world risks that static benchmarks may miss. It consistently outperforms leading jailbreak methods, achieving higher attack success rates and lower rejection rates, while remaining efficient and adaptable across diverse agents and programming languages.
The busier security teams get, the harder it can be to understand the full impact of false positives, queue clutter, tool fragmentation, and more. What is clear is that it all adds up to increased fatigue and an increased potential to miss the cyberthreats that matter most.
To help security teams better face these growing challenges, generative AI offers transformative capabilities that can bridge critical gaps. In a newly released e-book from Microsoft, we share multiple scenarios that showcase how Microsoft Security Copilot, powered by generative AI, can empower security analysts, accelerate incident response, and reduce operational inefficiencies. Sign up to get the e-book, From Alert Fatigue to Proactive Defense: What Generative AI Can Do for Your SOC, and learn how AI can transform organizations like yours today.
Figure 1. Key statistics on the realities of today’s security operations centers (SOCs).
Enhance every stage of the security operations workflow
The teams we talk to mention how generative AI is dramatically improving the efficacy and efficiency of their security operations (SecOps). It helps analysts triage alerts by correlating threat intelligence and surfacing related activity that might not trigger a traditional alert. It generates rapid incident summaries so teams can get started faster, guides investigations with step-by-step context and evidence, and automates routine response tasks like containment and remediation through AI-powered playbooks. Generative AI also supports proactive threat hunting by suggesting queries that uncover lateral movement or privilege escalation, and streamlines reporting by producing clear, audience-ready summaries for stakeholders. All of this means SOC teams spend less time on manual, repetitive work and more time focusing on high-impact cyberthreats, ultimately allowing for faster, smarter, and more resilient security operations.
Microsoft Security Copilot helps organizations address critical challenges of scale, complexity, and inefficiency while streamlining investigations, simplifying reporting, and more. It shows analysts where to start and how to prioritize, and improves their confidence with actionable insights. By embedding generative AI into existing workflows, SOCs can operationalize and contextualize security data in ways never before possible, delivering guided responses, accelerating investigations, and transforming complex data into clear, actionable insights for both technical teams and business leaders.
Organizations using Security Copilot report a 30% reduction in mean time to resolution (MTTR).5
How Security Copilot delivers real value in everyday SOC tasks
The e-book spans four chapters that cover key scenarios, including investigation and response, AI-powered analysis, proactive threat hunting, and simplified security reporting. Each chapter presents the core challenges faced by today’s SOC teams, how generative AI accelerates and improves outcomes, and measurable, real-world results that show improvements for security analysts—like reduced noise, faster critical insights, identified cyberattack paths, and audience-ready summaries generated by AI. For example, when an analyst receives alerts about unusual login activity from multiple geolocations targeting a high-privilege account, generative AI consolidates related alerts, prioritizes the incident, and provides actionable summaries, allowing for faster triage and confident response.
Included in the e-book are summaries of AI in action, with step-by-step explanations of how Copilot is:
Guiding analysts to confident, rapid decisions—helping SOC analysts quickly triage alerts, summarize incidents, recommend precise actions, and guide responses, for faster, more confident threat containment.
Turning complex scripts into clear insights—supporting SOC analysts to decode malicious scripts, correlate threat intelligence, and automate investigations.
Anticipating cyberthreats before they escalate—empowering threat hunters to quickly query indicators of compromise (IOCs), uncover hidden cyberattack patterns, and take proactive actions, for more predictive defense against evolving cyberthreats.
Simplifying security reporting for analysts—letting SOC analysts instantly consolidate data, capture critical details, and produce clear, audience-ready reports.
We analyze results about 60% to 70% faster with Security Copilot. It plays a central role in our ability to speed up threat analyses and activities, fundamentally reducing the risks for our IT landscape worldwide.
—Norbert Vetter, Chief Information Security Officer, TÜV SÜD
The future of SecOps is here with generative AI
For security leaders looking to improve their response time and better support their teams, generative AI isn’t just a vision for the future—it’s available today. From triage to reporting, generative AI–powered assistants enhance every stage of the SecOps workflow—delivering faster responses, stronger defenses, and more confident decision-making. At the forefront of this transformation is Microsoft Security Copilot, which unifies tools, operationalizes threat intelligence, and guides analysts through complex workflows, letting SOC teams adapt to evolving cyberthreats with ease. Sign up to access From Alert Fatigue to Proactive Defense: What Generative AI Can Do for Your SOC today and learn how your team can move from overwhelmed to empowered, tackling today’s challenges with confidence and preparing for tomorrow’s uncertainties. Or read more about Microsoft AI-powered unified security operations.
To learn more about Microsoft Security solutions, visit our website. Bookmark the Security blog to keep up with our expert coverage on security matters. Also, follow us on LinkedIn (Microsoft Security) and X (@MSFTSecurity) for the latest news and updates on cybersecurity.
1 “Generative AI and Security Operations Center Productivity: Evidence from Live Operations,” page 2, Microsoft, November 2024
2 “Cybersecurity Workforce Study: How the Economy, Skills Gap, and Artificial Intelligence Are Challenging the Global Cybersecurity Workforce 2023,” page 20, ISC2, 2023
3 “The Unified Security Platform Era Is Here,” page 7, Microsoft, 2024
4 “Global Security Operations Center Study Results,” page 6, IBM, March 2023
5 “Generative AI and Security Operations Center Productivity: Evidence from Live Operations,” page 2, Microsoft, November 2024