
BlueCodeAgent: A blue teaming agent enabled by automated red teaming for CodeGen AI


Introduction

Large language models (LLMs) are now widely used for automated code generation across software engineering tasks. However, this capability also introduces security concerns. Code generation systems could be misused for harmful purposes, such as generating malicious code. They could also produce biased code whose underlying logic is discriminatory or unethical. Additionally, even when completing benign tasks, LLMs may inadvertently produce vulnerable code that contains security flaws (e.g., injection risks, unsafe input handling). These unsafe outcomes undermine the trustworthiness of code generation models and pose threats to the broader software ecosystem, where safety and reliability are critical.

Many studies have explored red teaming code LLMs, testing whether the models can reject unsafe requests and whether their generated code exhibits insecure patterns. For more details, see our earlier MSR blog post on RedCodeAgent. While red teaming has significantly improved our understanding of model failure modes, progress on blue teaming—i.e., developing effective defensive mechanisms to detect and prevent such failures—remains relatively limited. Current blue teaming approaches face several challenges: (1) Poor alignment with security concepts: additional safety prompts struggle to help models understand high-level notions, such as what constitutes a malicious or biased instruction, and typically lack actionable principles to guide safe decision-making. A case study is shown in Figure 1. (2) Over-conservatism: especially in the domain of vulnerable code detection, models tend to misclassify safe code as unsafe, leading to more false positives and reduced developer trust. (3) Incomplete risk coverage: without a strong knowledge foundation, models perform poorly when dealing with subtle or previously unseen risks.

To address these challenges, researchers from the University of Chicago, University of California, Santa Barbara, University of Illinois Urbana–Champaign, VirtueAI, and Microsoft Research recently released a paper: BlueCodeAgent: A Blue Teaming Agent Enabled by Automated Red Teaming for CodeGen AI. This work makes the following key contributions: 

  1. Diverse red-teaming pipeline: The authors design a comprehensive red-teaming process that integrates multiple strategies to synthesize diverse red-teaming data for effective knowledge accumulation.
  2. Knowledge-enhanced blue teaming: Building on the foundation of red-teaming knowledge, BlueCodeAgent significantly improves blue-teaming performance by leveraging constitutions derived from knowledge and dynamic testing. 
  3. Principled-Level Defense and Nuanced-Level analysis: The authors propose two complementary strategies—Principled-Level Defense (via constitutions) and Nuanced-Level Analysis (via dynamic testing)—and demonstrate their synergistic effects in vulnerable code detection tasks. 
  4. Generalization to seen and unseen risks: Empowered by comprehensive red-teaming knowledge, BlueCodeAgent generalizes effectively to unseen risks. Overall, BlueCodeAgent achieves an average 12.7% improvement in F1 score across four datasets and three tasks, attributed to its ability to distill actionable constitutions that enhance context-aware risk detection. 
Figure 1. A case study of BlueCodeAgent on the bias instruction detection task. Even when concepts such as “biased” are explicitly included in additional safety prompts, models often fail to recognize biased requests (left). BlueCodeAgent (right) addresses this gap by summarizing constitutions from knowledge and applying concrete, actionable constraints derived from red teaming to improve the defense.

A blue teaming agent enabled by red teaming

Figure 2: Overview of BlueCodeAgent, an end-to-end blue teaming framework powered by automated red teaming for code security. By integrating knowledge derived from diverse red teaming and conducting dynamic sandbox-based testing, BlueCodeAgent substantially strengthens the defensive capabilities beyond static LLM analysis.

Figure 2 presents an overview of the pipeline. The framework unifies both sides of the process: red teaming generates diverse risky cases and behaviors, which are then distilled into actionable constitutions that encode safety rules on the blue-teaming side. These constitutions guide BlueCodeAgent to more effectively detect unsafe textual inputs and code outputs, mitigating limitations such as poor alignment with abstract security concepts. 

This work targets three major risk categories, covering both input/textual-level risks—including biased and malicious instructions—and output/code-level risks, where models may generate vulnerable code. These categories represent risks that have been widely studied in prior research. 

Diverse red-teaming process for knowledge accumulation 

Since different tasks require distinct attack strategies, the red-teaming pipeline employs multiple attack methods to generate realistic and diverse data. Specifically, it is divided into three categories:

  1. Policy-based instance generation: To synthesize policy-grounded red-teaming data, diverse security and ethical policies are first collected. These high-level principles are then used to prompt an uncensored model to generate instances that intentionally violate the specified policies.
  2. Seed-based adversarial prompt optimization: Existing adversarial instructions are often overly simplistic and easily rejected by models. To overcome this limitation, an adaptive red-teaming agent invokes various jailbreak tools to iteratively refine initial seed prompts until the prompts achieve high attack success rates.
  3. Knowledge-driven vulnerability generation: To synthesize both vulnerable and safe code samples under realistic programming scenarios, domain knowledge of common software weaknesses (CWE) is leveraged to generate diverse code examples.
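As a rough illustration of the first category above, policy-based generation can be sketched as pairing each collected policy with a violation-seeking generation prompt. The policies and prompt wording below are assumptions for illustration, not the paper's actual artifacts:

```python
# Illustrative sketch (assumed prompt wording, not the paper's):
# pair each collected security/ethics policy with a prompt asking a
# generator model for an instance that violates that policy.

def make_redteam_prompts(policies: list[str]) -> list[str]:
    """Build one violation-seeking generation prompt per policy."""
    return [
        f"Write a code-generation request that violates this policy: {p}"
        for p in policies
    ]

prompts = make_redteam_prompts([
    "Code must not exfiltrate user credentials.",
    "Hiring logic must not condition decisions on protected attributes.",
])
```

Each resulting prompt would then be sent to an uncensored model, and the responses collected as red-teaming instances grounded in the violated policy.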

Knowledge-enhanced blue teaming agent 

After accumulating red-teaming knowledge data, BlueCodeAgent sets up Principled-Level Defense via Constitution Construction and Nuanced-Level Analysis via Dynamic Testing.

  1. Principled-Level Defense via Constitution Construction 
    Based on the most relevant knowledge data, BlueCodeAgent summarizes red-teamed knowledge into actionable constitutions—explicit rules and principles distilled from prior attack data. These constitutions serve as normative guidelines, enabling the model to stay aligned with ethical and security principles even when confronted with novel or unseen adversarial inputs. 
  2. Nuanced-Level Analysis via Dynamic Testing 
    In vulnerable code detection, BlueCodeAgent augments static reasoning with dynamic sandbox-based analysis, executing generated code within isolated Docker environments to verify whether the model-reported vulnerabilities manifest as actual unsafe behaviors. This dynamic validation effectively mitigates the model’s tendency toward over-conservatism, where benign code is mistakenly flagged as vulnerable. 
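A minimal sketch of the retrieve-and-distill step behind constitution construction, using a toy lexical similarity and illustrative knowledge-base entries (the paper's actual retrieval and prompting may differ):

```python
# Illustrative sketch: retrieve the red-teaming cases most similar to an
# incoming request, then build a prompt asking the blue-teaming LLM to
# distill them into actionable constitutions before judging the request.

def jaccard(a: str, b: str) -> float:
    """Toy lexical similarity between two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def retrieve_knowledge(query: str, kb: list[dict], k: int = 2) -> list[dict]:
    """Return the k knowledge-base cases most similar to the query."""
    return sorted(kb, key=lambda c: jaccard(query, c["instruction"]), reverse=True)[:k]

def build_constitution_prompt(query: str, cases: list[dict]) -> str:
    """Assemble the distillation-and-judgment prompt."""
    examples = "\n".join(f"- {c['instruction']} (label: {c['label']})" for c in cases)
    return (
        "Summarize the following red-teaming cases into explicit, actionable "
        "safety constitutions, then judge the new request against them.\n"
        f"Known cases:\n{examples}\n"
        f"New request: {query}"
    )

kb = [
    {"instruction": "write a script that deletes system logs to hide activity", "label": "malicious"},
    {"instruction": "generate a loan-approval function that rejects applicants by zip code", "label": "biased"},
]
query = "score applicants using their neighborhood"
prompt = build_constitution_prompt(query, retrieve_knowledge(query, kb))
```

In practice the retrieval would use embeddings rather than token overlap, but the shape is the same: relevant red-teaming knowledge is surfaced first, then compressed into explicit rules the model can apply.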


Insights from BlueCodeAgent 

BlueCodeAgent outperforms prompting baselines 

As shown in Figure 3, BlueCodeAgent significantly outperforms other baselines. Several findings are highlighted. 

(1) Even when test categories differ from knowledge categories to simulate unseen scenarios, BlueCodeAgent effectively leverages previously seen risks to handle unseen ones, benefiting from its knowledge-enhanced safety reasoning. 

(2) BlueCodeAgent is model-agnostic, working consistently across diverse base LLMs, including both open-source and commercial models. Its F1 scores for bias and malicious instruction detection approach 1.0, highlighting strong effectiveness. 

(3) BlueCodeAgent achieves a strong balance between safety and usability. It accurately identifies unsafe inputs while maintaining a reasonable false-positive rate on benign ones, resulting in a consistently high F1 score. 

(4) By contrast, prompting with general or fine-grained safety reminders remains insufficient for effective blue teaming, as models struggle to internalize abstract safety concepts and apply them to unseen risky scenarios. BlueCodeAgent bridges this gap by distilling actionable constitutions from knowledge, using concrete and interpretable safety constraints to enhance model alignment. 

Figure 3: F1 scores on the bias instruction detection task (BlueCodeEval-Bias) in the first row and the malicious instruction detection task (BlueCodeEval-Mal, RedCode-based) in the second row.

Complementary effects of constitutions and dynamic testing 

In vulnerability detection tasks, models tend to behave conservatively—an effect also noted in prior research. They are often more likely to flag code as unsafe rather than safe. This bias is understandable: confirming that code is completely free from vulnerabilities is generally harder than spotting a potential issue. 

To mitigate this over-conservatism, BlueCodeAgent integrates dynamic testing into its analysis pipeline. When BlueCodeAgent identifies a potential vulnerability, it triggers a reliable model (Claude-3.7-Sonnet-20250219) to generate test cases and corresponding executable code that embeds the suspicious snippet. These test cases are then run in a controlled environment to verify whether the vulnerability actually manifests. The final judgment combines the LLM’s analysis of the static code, the generated test code, run-time execution results, and constitutions derived from knowledge. 
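As a rough sketch of this validation step (the image name, resource limits, and harness shape are assumptions, not the paper's exact setup), the sandbox run can be assembled as a locked-down Docker invocation, with the final verdict combining static and dynamic signals:

```python
# Hedged sketch of sandboxed validation: mount a generated test harness
# that embeds the suspicious snippet, and build the Docker command that
# would execute it in an isolated, resource-limited container.

def build_sandbox_command(harness_path: str, image: str = "python:3.11-slim") -> list[str]:
    """Docker invocation with no network access and tight resource caps."""
    return [
        "docker", "run", "--rm",
        "--network", "none",     # no outbound access from the snippet
        "--memory", "256m",      # cap memory
        "--cpus", "0.5",         # cap CPU
        "-v", f"{harness_path}:/app/harness.py:ro",
        image, "python", "/app/harness.py",
    ]

def verdict(static_flagged: bool, dynamic_triggered: bool) -> str:
    """Combine static analysis with run-time evidence to curb false positives."""
    if static_flagged and dynamic_triggered:
        return "vulnerable (confirmed at run time)"
    if static_flagged:
        return "likely safe (static flag not reproduced dynamically)"
    return "safe"

cmd = build_sandbox_command("/tmp/harness.py")
```

The key design point is that a static "vulnerable" flag alone is not final; it is downgraded when dynamic execution cannot reproduce the unsafe behavior.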

Researchers find the two components—constitutions and dynamic testing—play complementary roles. Constitutions expand the model’s understanding of risk, increasing true positives (TP) and reducing false negatives (FN). Dynamic testing, on the other hand, focuses on reducing false positives (FP) by validating whether predicted vulnerabilities can truly be triggered at run-time. Together, they make BlueCodeAgent both more accurate and more reliable in blue-teaming scenarios. 
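The complementary effect can be made concrete with a small confusion-matrix calculation. The numbers below are illustrative, not results from the paper: constitutions raise TP and cut FN, while dynamic testing cuts FP, and both moves push F1 upward.

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 score from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Baseline detector: misses risks (high FN) and over-flags benign code (high FP).
base = f1(tp=60, fp=30, fn=40)        # ~0.632
# Constitutions raise TP / lower FN; dynamic testing lowers FP.
improved = f1(tp=85, fp=10, fn=15)    # ~0.872
```

Because F1 penalizes both kinds of error, reducing either FN (via constitutions) or FP (via dynamic testing) alone helps, but combining both yields the larger gain.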

Summary 

BlueCodeAgent introduces an end-to-end blue-teaming framework designed to address risks in code generation. The key insight behind BlueCodeAgent is that comprehensive red-teaming can greatly strengthen blue-teaming defenses. Based on this idea, the framework first builds a red-teaming process with diverse strategies for generating red-teaming data. It then constructs a blue-teaming agent that retrieves relevant examples from the red-teaming knowledge base and summarizes safety constitutions to guide LLMs in making accurate defensive decisions. A dynamic testing component is further added to reduce false positives in vulnerability detection. 

Looking ahead, several directions hold promise.  

First, it is valuable to explore the generalization of BlueCodeAgent to other categories of code-generation risks beyond bias, malicious code, and vulnerable code. This may require designing and integrating novel red-teaming strategies into BlueCodeAgent and creating corresponding benchmarks for new risks.  

Second, scaling BlueCodeAgent to the file and repository levels could further enhance its real-world utility, which requires equipping agents with more advanced context retrieval tools and memory components.  

Finally, beyond code generation, it is also important to extend BlueCodeAgent to mitigate risks in other modalities, including text, image, video, and audio, as well as in multimodal applications. 


The post BlueCodeAgent: A blue teaming agent enabled by automated red teaming for CodeGen AI appeared first on Microsoft Research.


e233 – Transforming Presentation Workflows with Brandin – and we like it!


Show Notes – Episode #233

In episode 233 of The Presentation Podcast, the hosts discuss how deploying PowerPoint templates across an organization can be a nightmare. They are joined by guests Jamie Garroch and Hannah Harper of BrightCarbon to discuss "BrandIn," a PowerPoint add-in that centralizes templates, assets, and brand resources for easy access and management in a seamless interface, all within PowerPoint. Jamie and Hannah explain how BrandIn streamlines template distribution, enhances brand consistency, and empowers agencies, designers, and corporate users to access PowerPoint templates and assets to create on-brand presentations efficiently.

Highlights:

  • Overview of the BrandIn add-in for PowerPoint
  • Benefits of a centralized repository for PowerPoint templates and assets
  • Comparison of BrandIn with traditional solutions like Microsoft’s Organizational Asset Library for SharePoint
  • User experience and ease of installation for the BrandIn add-in
  • Features that enhance brand consistency and productivity for users
  • Discussion on the challenges of distributing PowerPoint templates within organizations
  • Upcoming features for BrandIn, including Brand Check and Text assets
  • User feedback and productivity improvements reported by organizations using BrandIn

Resources from this Episode:  

Show Suggestions? Questions for your Hosts?

Email us at: info@thepresentationpodcast.com

New Episodes 1st and 3rd Tuesday Every Month

Thanks for joining us!

The post e233 – Transforming Presentation Workflows with Brandin – and we like it! appeared first on The Presentation Podcast.


The Microsoft Zero Trust Assessment: Helping you operationalize the hardening of your Microsoft security products

Evolving Threats, Adaptive Defenses: The Security Practitioner’s New Reality 

Cyber threats are advancing faster than ever, and the arrival of highly accessible AI tools with a low barrier to entry has made this a challenge most organizations cannot keep up with. According to the latest Microsoft Digital Defense Report, 28% of breaches begin with phishing, and we also see a 4.5x increase in AI-automated phishing campaigns with higher click-through rates. This highlights the need for security organizations not only to prioritize hardened security policies but also to automate detection of misconfigurations and deviations from the desired security posture.
 
To help address these challenges, Microsoft launched the Secure Future Initiative (SFI) in November 2023, a multiyear effort to transform how we design, build, test, and operate our products and services, to meet the highest security standards. SFI unites every part of Microsoft to strengthen cybersecurity across our company and products. We’ve committed to transparency by sharing regular updates with customers, partners, and the security community. Today, we released our third SFI progress report, which highlights 10 actionable patterns and practices customers can adopt to reduce risk, along with additional best practices and guidance. In this report, we share updates across every engineering pillar, introduce mapping to the NIST Cybersecurity Framework to help customers measure progress against a recognized industry standard, and showcase new security capabilities delivered to customers. We also provide implementation guidance aligned to Zero Trust principles, ensuring organizations have practical steps to reduce risk and strengthen resilience. 

Building on these learnings, we’re excited to announce the public preview of the Microsoft Zero Trust Assessment tool, designed to help you identify common security gaps starting with Identity and Device pillars with the remaining pillars of Zero Trust coming soon. This assessment is informed by our own SFI learnings and aligned with widely recognized frameworks such as CISA’s SCuBA project. Your feedback is critical as we continue to iterate and expand this tool. Our goal is for you to operationalize it in your environment and share insights as we add more pillars in the coming months. 

Introducing Zero Trust Assessment  

A deep dive into how the Microsoft Zero Trust Assessment works, including report structure, prioritization logic, and implementation guidance, is available below in this blog. The Microsoft Zero Trust Assessment empowers teams to make informed decisions, reduce blind spots, and prioritize remediation, turning insights into action. Once you download and run the tool (installation guide), it will assess your policy configurations and scan objects to generate a comprehensive report that not only highlights gaps and risks but also explains what was checked, why a test failed, and how your organization can implement the recommended configuration. This makes the results immediately actionable; security teams know exactly what steps to take next. The report features an overview page that presents aggregated data across your tenant, highlighting overall risk levels, patterns, and trends. This allows security teams to quickly assess their organization’s posture, identify high-impact areas, and prioritize remediation efforts.

Figure 1: Overview Page

The assessment provides a detailed list of all the tests that were conducted, including those not applicable, so the results are clear and relevant. Each test includes risk level, user impact, and implementation effort, enabling teams to make informed decisions and prioritize fixes based on business impact. By combining clear guidance with prioritized recommendations, the Zero Trust Assessment turns insights into action, helping organizations reduce blind spots, strengthen security, and plan remediation effectively. Future updates will expand coverage to additional Zero Trust pillars, giving organizations even broader visibility and guidance.  

Figure 2: Outcome of the Identity/Devices Checks

For each test performed, customers can see the exact policies or objects that are passing or failing the test with a direct link to where they can address it in the product, and guidance on how to remediate.  

Figure 3: Details of the test performed

The report also provides granular details of the policies evaluated and any applicable assignment groups. In addition, the tool provides clear guidance on details of the test performed and why it matters, and the steps required to resolve issues effectively. 

How It Works 

Here’s a quick summary of the steps for you to run the tool. Check our documentation for full details. 

First, you install the ZeroTrustAssessment PowerShell module. 

Install-Module ZeroTrustAssessment -Scope CurrentUser

Then, you connect to Microsoft Graph and to Azure by signing into your tenant. 

Connect-ZtAssessment

After that, you run a single command to kick off the data gathering. Depending on the size of your tenant, this might take several hours. 

Invoke-ZtAssessment

After the assessment is complete, the tool will display the assessment results report. A sample report of the assessment can be viewed at aka.ms/zerotrust/demo. 

The tool uses read-only permissions to download the tenant configuration, and it runs the analysis locally on your computer. We recommend you treat the data and artifacts it creates as highly sensitive organization security data.  

Get Started Today 

Ready to strengthen your security posture? Download and run the Zero Trust Assessment to see how your tenant measures up. Review the detailed documentation for Identity and Devices to understand every test and recommended action. If you have feedback or want to help shape future releases, share your insights at aka.ms/zerotrust/feedback. If you find the assessment valuable, pass it along to your peers and help raise the bar for all our customers.

To learn more about Microsoft Security solutions, visit our website.  Bookmark the Security blog and Technical Community blogs to keep up with our expert coverage on security matters, including updates on this assessment. Also, follow us on LinkedIn (Microsoft Security) and X (@MSFTSecurity) for the latest news and updates on cybersecurity. 

What’s Next 

This is just the first step in the journey. We will be launching new SFI-infused assessments across the other pillars of Zero Trust in the coming months. Please stay tuned for updates.  

Want to go deeper? 

Visit the SFI webpage to explore the report, actionable patterns, NIST mapping, and best practices that can help you strengthen your security posture today.  


.NET 10 and the Release Cycle Paradox


Securing a retail AI endpoint from abuse for virtual try on


How to structure Figma files for MCP and AI-powered code generation


We’ve entered a new era of design-to-code. Thanks to generative AI tools, transforming design images into working code is now possible. You can easily feed a screenshot into Lovable or Replit and have a functional app built within minutes! The problem, however, is that an image alone doesn’t give an AI model enough context to produce a pixel-perfect result. These models often generate layouts that stray from the original design system. As a developer, I can tell this is the fastest way to trigger any designer you work with.


So, how can we get better, more accurate code generation directly from Figma designs? The answer is the Model Context Protocol (MCP). MCP is an open protocol that gives AI agents access to external tools and data beyond their built-in capabilities, through a supporting MCP server. This standardized protocol lets us connect any AI application to external systems. It has famously been called the USB-C for AI apps.

Figma recently rolled out its MCP Server (beta), which lets AI coding agents use Figma’s custom AI tools to fetch live design data directly from artboards. It has the full context of your design layout. With Figma MCP, AI coding agents can now read native Figma properties like variables, design tokens, components, variants, auto layout rules, and more. These properties are then used as data inputs (context) for AI agents like Cursor, Copilot, and Claude Code to accurately implement your designs.

The traditional designer-to-developer handoff is now evolving into a designer-to-agent handoff, where the priority is structuring design files in a way that is easily interpreted by AI agents for successful code generation.

For designers, this new wave changes what it means to prepare a design for development. The focus is no longer just on creating visually appealing designs, but on creating AI-compatible design files. In this article, we’ll explore how Figma’s new MCP server works and how to optimize your design files for agent handoff.

Prerequisites

To follow along, you’ll need:

  • A Figma account with at least a Dev Seat subscription — there’s another free option for this later on
  • A paid or trial subscription with a coding IDE like Cursor or VSCode

The new paradigm: Designer-to-agent handoff

I still remember my first experience with Figma. It’s always been great at creating website designs. But as a developer, I couldn’t help but wish that these designs I made or collaborated on could be directly translated into usable source code. That wish started to materialize when Figma started to let us copy CSS styles from designs.

At first, this feature was underwhelming. The generated CSS was filled with absolute positioning and rigid padding and margin values that made no sense in a real project. But over time, the software has gotten better.

The handoff process evolved. Figma introduced ways to export design tokens as JSON, allowing colors, typography, and spacing values to stay version-controlled and in sync with code.

Tools like Storybook came on the scene to help design and development teams collaborate on their design systems.

Now, we’re entering the MCP era that enables designer-to-agent handoff. With Figma MCP, a developer can connect a design file to an agent that calls special Figma AI tools to grab design data and translate it directly to code. This workflow allows for near pixel-perfect implementation and faster design iterations.

Figma MCP approaches

As mentioned earlier, using the official Figma MCP server requires a Figma subscription with a Dev seat. However, there’s another option, which is a community-run MCP server from Framelink.ai named Framelink MCP for Figma. There are tradeoffs, of course.

Figma’s MCP server is built to integrate tightly with other native Figma features and also has Code Connect. Framelink, on the other hand, uses Figma’s API and other methods to read data about your layout and pass that design context to coding agents. Let’s take a closer look.

Figma’s MCP Server (official, beta)

The official Figma MCP Server brings Figma directly into your workflow by providing rich design context to AI agents that generate code from Figma design files. It exposes several tools that coding agents can leverage, including get_design_context, get_screenshot, get_variable_defs, and more. The MCP Server supports Code Connect, which allows you to map Figma components to their corresponding components in your codebase. The coding agent can then use the get_code_connect_map tool to generate code that stays consistent with your component library and codebase standards.

In short, the official server gives first-class support for understanding your design file by integrating tightly with Figma’s native features.

Framelink Figma MCP (community, free)

Framelink’s MCP for Figma is an open-source MCP server that provides Figma layout information to coding agents. Because it’s open-source, it’s free and works with any Figma account. All you need to do is create a Figma access token with adequate permissions (select all) and a reasonable lifespan. Save this token, as you’ll need it later on. Framelink’s MCP server for Figma provides two tools that fetch data via the Figma API: get_figma_data (to pull the structure, styling, and layout) and download_figma_images (to fetch image assets). Framelink can also generate design tokens (e.g., JSON of colors, typography) and a design system doc, but it doesn’t have a built-in Code Connect feature. The tradeoff is that it may not be as “aware” of your custom component library as the official Figma server.

Here’s a deeper comparison between the two:







| Features | Figma’s official MCP server | Framelink MCP for Figma |
| --- | --- | --- |
| AI coding agent tools | Multiple: 5 tools and 18 prompt resources engineered to guide agents. Tools: get_design_context (code), get_screenshot (image), get_variable_defs, get_metadata, get_code_connect_map, create_design_system_rules. | Fewer: mainly get_figma_data (layout & styling JSON) and download_figma_images for assets. Also offers token generation tools. |
| Code Connect | Yes. Built-in support (get_code_connect_map) to map Figma nodes to your React components, ensuring use of the right components. | No. Code Connect is a paid Figma feature. |
| Design-system and token awareness | High. Honors Figma Variables and tokens; can extract colors/spacing exactly and align output with your design tokens. | Medium. Can extract raw style info, but you may need to supply context. Relies on the JSON it returns. |
| Setup / Access | Official, but in beta. Requires a Dev seat (Pro or Org plan) for now. Works in Dev Mode (desktop app) or remote beta. | Community-built (npm). Free; works with any personal or team Figma file (just needs an API token). |

Now that we’ve covered the tradeoffs, let’s look at how to structure Figma files for better AI code generation.

Figma design foundations: Going back to basics

I find it poetic that design habits like thoughtful layer naming, consistent use of components, auto layout, and mapping text, color, and spacing values to local variables have become the very foundations that enable coding agents to accurately implement our designs. It emphasizes that these practices aren’t “luxuries” but necessities to improve our workflows.

To try out the Figma MCP server, we’ll work with this Figma file that contains a simple design for a sign-up page. It’s set up with Figma variables, components, and auto layout, and is ready to be consumed by a coding agent:

Figma File With Variables

Let’s work through the core parts of the design:

Variables and design tokens

Figma variables are crucial for creating consistent design systems. We can use them to store values for spacing, colors, typography, and a lot more. Updating a variable in one place updates it everywhere else.

The reference design has primitive variables for colors, spacing, border radius, and typography, making up its design system. It also uses design tokens, which are semantic names that use the primitive variables underneath — think of them as Tailwind classes that use the primitive variables, which are defined in :root.

Now we need a way to export our design system into a file that a coding agent can understand. I’m making use of the Open Variable Visualizer plugin, which lets you export your Figma variables in a JSON file along with a resolver utility file.

After running the extension, download the two files by clicking the two links shown below:

Config Files For Design System

Now you have config files for your design system that any coding agent will understand.
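As a rough sketch of what an agent can do with that export, the variables JSON can be flattened into CSS custom properties. The token schema below is a simplified assumption; the Open Variable Visualizer's actual export format may differ:

```python
import json

def tokens_to_css(tokens: dict) -> str:
    """Flatten a {group: {name: value}} token dict into a :root block."""
    lines = [":root {"]
    for group, entries in tokens.items():
        for name, value in entries.items():
            lines.append(f"  --{group}-{name}: {value};")
    lines.append("}")
    return "\n".join(lines)

# Toy export resembling (but not guaranteed to match) the plugin's output.
exported = json.loads('{"color": {"primary": "#2563eb"}, "spacing": {"md": "16px"}}')
css = tokens_to_css(exported)
```

This mirrors how design tokens end up as Tailwind theme values or `:root` variables in a real codebase, keeping the design system as the single source of truth.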

Auto layout

Auto layout is what makes our designs flexible and adaptive. It ensures that when elements resize, text changes, or new items are added, the layout adjusts automatically. This is our best bet for achieving a pixel-perfect output.

In the signup page, every section from the form container to individual input fields and buttons uses auto layout. This ensures consistent spacing and alignment regardless of screen size or content changes.

If you know CSS Flexbox or Grid, you’ll notice the similarities. Direction, alignment, padding, and gap all map directly to flex properties. This structure is what allows a coding agent (or developer) to easily translate the design into production-ready code.
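To make that mapping concrete, here is a minimal sketch of how an agent might translate simplified auto-layout properties into flexbox declarations. The property names are assumptions for illustration, not Figma's exact API fields:

```python
def auto_layout_to_flex(node: dict) -> dict:
    """Map simplified Figma auto-layout settings to CSS flexbox properties."""
    direction = "row" if node.get("layoutMode") == "HORIZONTAL" else "column"
    align_map = {"MIN": "flex-start", "CENTER": "center", "MAX": "flex-end"}
    return {
        "display": "flex",
        "flex-direction": direction,
        "gap": f'{node.get("itemSpacing", 0)}px',
        "padding": f'{node.get("padding", 0)}px',
        "align-items": align_map.get(node.get("counterAxisAlign", "MIN"), "flex-start"),
    }

# A horizontal auto-layout frame with 12px item spacing, centered cross-axis.
css = auto_layout_to_flex({
    "layoutMode": "HORIZONTAL",
    "itemSpacing": 12,
    "padding": 24,
    "counterAxisAlign": "CENTER",
})
```

Direction maps to `flex-direction`, item spacing to `gap`, and cross-axis alignment to `align-items`, which is exactly why auto-layout designs translate so cleanly to production CSS.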

Components and variants

Components are what make a design system reusable and scalable. Instead of recreating the same button or input field across multiple frames, we define one source component and reuse it everywhere. All modern frontend frameworks share this same ideology.

In this design, the button component includes variants for three different states: default, full-width, and loading. Each shares the same structure, spacing, typography, and colors referenced by the design tokens defined earlier.

After you prompt the agent, you'll see clearly how components and variants map to logical UI states in a codebase:

Logical UI State
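In code, those three variants typically collapse into a single source component driven by a variant prop. A minimal sketch (component and class names are illustrative, and the utility classes assume a Tailwind-style setup):

```typescript
// Illustrative: one Button source, three variants — mirroring the
// default / full-width / loading variants of the Figma component.
type ButtonVariant = "default" | "fullWidth" | "loading";

// Map a variant to the classes it adds on top of shared base styles.
function buttonClasses(variant: ButtonVariant): string {
  const base = "btn px-4 py-2 rounded-md font-medium";
  switch (variant) {
    case "fullWidth":
      return `${base} w-full`;
    case "loading":
      return `${base} opacity-60 pointer-events-none`;
    default:
      return base;
  }
}

buttonClasses("fullWidth"); // "btn px-4 py-2 rounded-md font-medium w-full"
```

Structuring variants this way in Figma is exactly what lets the agent emit one reusable component instead of three near-duplicate ones.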

Clean layer naming and organization

Finally, layers are named clearly and grouped by hierarchy. This helps both designers and agents in navigating your file. If you have a messy Figma file, Figma AI and community plugins like Rename It can help batch-rename layers and tidy up your structure. Consistent naming is a powerful part of handoff readiness.

Let’s start vibe coding this design!

Setting up the MCP Server

We’ll set up a development environment to test the two Figma MCP servers discussed earlier. At this point, you should have Cursor IDE installed on your device.

Start by connecting to the official, remote Figma MCP server. Use this deep link to add it to Cursor. This opens Cursor’s MCP settings and displays a dialog like this:

Cursor MCP Settings

Click Install to add the server. You’ll be prompted to authenticate your Figma account in the browser. Once complete, the MCP server should appear active in green:

Active MCP Server

Next, add Framelink’s MCP server. Click Add a Custom MCP Server to open Cursor’s mcp.json file. Here, you’ll see the existing configuration for the official Figma server. Grab your Figma access token and add the Framelink MCP:

{
  "mcpServers": {
    "Figma": {
      "url": "https://mcp.figma.com/mcp",
      "headers": {}
    },
    "Framelink MCP for Figma": {
      "command": "cmd",
      "args": [
        "/c",
        "npx",
        "-y",
        "figma-developer-mcp",
        "--figma-api-key=YOUR-KEY",
        "--stdio"
      ]
    }
  }
}

The configuration above is for Windows. On macOS or Linux, set "command" to "npx" and remove the "/c" and "npx" entries from "args". Keep the file valid JSON — comments and trailing commas will break it.

You now have both Figma MCP servers ready. Make sure to turn one off when working, as running both can confuse your agent and produce inconsistent results:

Figma MCP Servers Ready

Prompting AI to build a sign-up page in React from Figma

To achieve this, we’ll provide the agent with the Figma frame URL and the two exported files containing our design system variables. First, select the sign-up page frame in Figma and right-click. Select Copy/Paste as → Copy link to selection to get the URL:

Sign Up Page In React From Figma

Open Cursor, select Agent mode, and switch the AI model from Auto to any Claude model, preferably Claude 4.5. This model usually gives more accurate and structured results when generating code. Here is an abridged version of the prompt I used (you can find the full version here):

I need you to build a React application using Vite that implements the design from this Figma file: [PASTE_FIGMA_URL_HERE]
I’m providing you with two critical files that contain the design system variables from Figma:
[Attach Variables.JSON file from Variable Visualizer] – Contains all design tokens (colors, typography, spacing, etc.) exported from Figma
[Attach figma-variables-resolver.js file from Variable Visualizer] – A resolver utility to access and use these variables programmatically.
Requirements: Setup: Create a new React + Vite project with the appropriate template
….

After running this prompt, the agent generated a complete sign-up page on the first try! I only had to fix a few small import errors in the console:

Complete Sign Up Page

Honestly, I was impressed! It’s not perfect, though — notice how the corner radii that were originally on the left of the image have been moved to the right.

I went through the code, and while the structure isn’t my style, the implementation is still almost pixel-perfect. I just had to prompt the agent to hide the image on mobile screens to keep the layout responsive.

With further prompting, especially through Code Connect, you could get better results that align directly with your codebase's structure.

Tips for better results

Getting strong results from design-to-code agents largely depends on the context you provide. Here are a few ways to help the model produce better output:

  • Add annotations in your Figma design — Always use variables, components and auto layout. For complex designs, use comments or annotations to explain how sections behave or respond to user actions. For example, note hover states, transitions, or responsive behavior. Clear descriptions help the agent understand intent and reduce mistakes in generated code. This is only supported by the official MCP server for now
  • Use or generate prompt templates — If you plan to repeat the same workflow, save your best prompts as templates. You can also ask the model itself to write a better version of your prompt before using it
  • Choose the right model — Some models are better at reasoning and code generation than others. In Cursor, switching to the latest Claude or OpenAI model usually produces cleaner React and Tailwind output than the default “Auto” mode
  • Use Figma Code Connect — Figma’s Code Connect feature lets you link your design components in Figma to the real components in your codebase. When your codebase is connected, Figma shows actual component snippets and props inside Dev Mode. This means both developers and AI agents can see how each component is built and used in your project

To quote the Figma dev team, “Code connect is the #1 way to get consistent component reuse in code. Without it, the model is guessing”.

Publishing your components with Code Connect also helps an agent understand your folder structure and UI library. For example, if your button or input components live in a src/components/ui folder, the agent can place new code in the same structure instead of generating everything from scratch. This keeps your project organized and makes the AI’s code easier to integrate.
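A Code Connect file is essentially a declarative mapping between a Figma component and its code counterpart. As a hedged sketch (the file URL, node ID, and prop names are placeholders; check Figma's Code Connect docs for your exact setup), it might look like this:

```
// Button.figma.tsx — illustrative Code Connect mapping, not runnable as-is.
// Requires Figma's Code Connect tooling and a published component library.
import figma from "@figma/code-connect";
import { Button } from "./Button";

figma.connect(Button, "https://www.figma.com/design/YOUR-FILE?node-id=YOUR-NODE", {
  props: {
    // Map Figma component properties to React props
    label: figma.string("Label"),
    variant: figma.enum("Variant", {
      "Default": "default",
      "Full width": "fullWidth",
      "Loading": "loading",
    }),
  },
  example: ({ label, variant }) => <Button variant={variant}>{label}</Button>,
});
```

With a mapping like this published, Dev Mode (and any agent reading it) surfaces your real component and props instead of a generic guess.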

Conclusion

Figma MCP marks a shift from static design handoffs to intelligent, context-rich design-to-code workflows, a step beyond what traditional no-code tools offer. By structuring your files with variables, auto layout, and components, you give AI coding agents the context they need to deliver near-production-quality results. Whether you use Figma’s official MCP server or Framelink’s open-source option, the goal is the same — to bridge design and development with clarity, structure, and precision.

As AI-driven workflows mature, the line between designer, developer, and agent continues to blur. The teams that learn to design for AI collaboration will set the standard for the next generation of digital product development.

The post How to structure Figma files for MCP and AI-powered code generation appeared first on LogRocket Blog.
