Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Advanced Excel‑Style Data Analysis with React Spreadsheet Formulas


TL;DR: Excel‑style data analysis in React doesn’t require external tools. With the React Spreadsheet, developers can add SUM, IF, VLOOKUP, and custom formulas to their apps. The result: dashboards, reports, and analytics powered by real‑time recalculation, dependency tracking, and intuitive formula entry. Developers deliver familiar Excel‑like workflows, reduce complexity, and scale enterprise‑grade data analysis seamlessly across React projects.

If you’re building a React app and want to offer Excel-like data analysis without relying on external tools, the Syncfusion® React Spreadsheet is a powerful, enterprise-ready solution. It supports a wide range of Excel-compatible formulas, named ranges, conditional formatting, and more, making it perfect for dashboards, financial models, sales reports, and interactive data exploration.

In this blog, we’ll explore how formulas work in React Spreadsheet, why they’re useful, and how you can practically apply them in your React app, just like you would in Microsoft Excel.

Why is Syncfusion Spreadsheet a smart choice for React developers?

Here’s why React Spreadsheet stands out for data-driven React apps:

  • Excel-compatible functions: Use familiar formulas like SUM, COUNT, LOOKUP, IF, MATCH, and more across categories, including Math, Logical, Statistical, Date & Time, and Financial functions.
  • Custom formula support: Define your own formulas to meet unique business logic and extend spreadsheet capabilities beyond built-in functions.
  • Interactive & embeddable: Deliver an Excel-like experience with a formula bar, fill handle, context menus, and keyboard shortcuts, right inside your app.
  • Smart data binding & formatting: Connect to JSON or remote data, apply styles, number formats, and use conditional formatting for instant visual insights.
  • High performance: Fast recalculations and efficient dependency tracking ensure smooth performance.
  • Developer-friendly API: Programmatically set formulas, define named ranges, and update cells or let users input formulas directly via the UI.

Getting started with React Spreadsheet

Start by integrating the React Spreadsheet component into your React project. This includes installing the necessary packages and rendering the spreadsheet in your app.

How does React Spreadsheet process formulas?

React Spreadsheet is powered by a high‑performance calculation engine that parses, plans, and evaluates formulas with Excel‑like semantics, delivering accuracy and speed you can trust.

  • Smart parsing and planning: Validates syntax and builds optimized execution plans.
  • Optimized evaluation: Uses Excel-like semantics and order of operations.
  • Instant recalculation: Any change triggers real-time updates.
  • Excel-like inputs: Enter formulas just as you would in Excel. It supports:
    • Numbers and static values.
    • Same-sheet and cross-sheet references.
    • Named ranges.
  • Dependency tracking: The React Spreadsheet intelligently tracks formula dependencies:
    • Builds a dependency graph to identify related cells.
    • Recalculates only affected cells, not the entire sheet.
    • Handles cross-sheet references and named ranges efficiently.
    • Detects circular references and provides clear error feedback.
    • Automatically removes references when formulas are deleted.
  • Handling nested formulas: Complex formulas are handled with precision:
    • Evaluates inner functions first, then composes results outward.
    • Operator precedence is respected.
    • Optimizes repeated sub-results to avoid unnecessary work.
    • Scales smoothly even with deeply nested, complex formulas.
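The dependency-tracking behavior described above can be sketched in plain JavaScript. This is not Syncfusion's engine: the Sheet class, its method names, and the rule that empty cells read as 0 are assumptions made for illustration. The sketch keeps reverse edges (which cells read a given cell), recomputes only the changed cell and its downstream cells in dependency order, and rejects a formula that would create a circular reference.

```javascript
// Illustrative sketch only; names and behavior are assumptions, not the Syncfusion API.
class Sheet {
  constructor() {
    this.cells = new Map();      // address -> { deps, compute }
    this.values = new Map();     // address -> last computed value
    this.dependents = new Map(); // address -> Set of addresses that read it
    this.recalculated = [];      // addresses recomputed by the latest change
  }

  get(address) {
    return this.values.get(address) ?? 0; // empty cells count as 0 in this sketch
  }

  setValue(address, value) {
    this.setFormula(address, [], () => value);
  }

  setFormula(address, deps, compute) {
    const previous = this.cells.get(address);
    this.link(address, { deps, compute });
    try {
      this.recalcFrom(address);
    } catch (err) {
      this.link(address, previous); // roll back so the graph stays acyclic
      throw err;
    }
  }

  // Replace a cell's definition and keep the reverse edges in sync.
  link(address, cell) {
    for (const d of this.cells.get(address)?.deps ?? []) {
      this.dependents.get(d)?.delete(address);
    }
    if (!cell) {
      this.cells.delete(address);
      return;
    }
    this.cells.set(address, cell);
    for (const d of cell.deps) {
      if (!this.dependents.has(d)) this.dependents.set(d, new Set());
      this.dependents.get(d).add(address);
    }
  }

  // Recompute the changed cell and only the cells downstream of it.
  recalcFrom(start) {
    const order = [];
    const state = new Map(); // address -> 'visiting' | 'done'
    const visit = (addr) => {
      if (state.get(addr) === 'done') return;
      if (state.get(addr) === 'visiting') {
        throw new Error(`Circular reference involving ${addr}`);
      }
      state.set(addr, 'visiting');
      for (const next of this.dependents.get(addr) ?? []) visit(next);
      state.set(addr, 'done');
      order.unshift(addr); // reverse post-order is a valid evaluation order
    };
    visit(start);
    this.recalculated = order;
    for (const addr of order) {
      this.values.set(addr, this.cells.get(addr).compute((ref) => this.get(ref)));
    }
  }
}

const sheet = new Sheet();
sheet.setValue('A1', 10);
sheet.setValue('A2', 20);
sheet.setValue('B1', 5);
sheet.setFormula('A3', ['A1', 'A2'], (get) => get('A1') + get('A2')); // =A1+A2
sheet.setFormula('C1', ['B1'], (get) => get('B1') * 2);               // =B1*2
sheet.setValue('A1', 15); // recomputes A1 and A3 only; C1 is untouched
console.log(sheet.get('A3'), sheet.recalculated); // 35 [ 'A1', 'A3' ]
```

Changing A1 touches two cells rather than the whole sheet, which is the property the bullet list attributes to dependency tracking.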

Advanced formula categories in React Spreadsheet

The React Spreadsheet supports a wide range of Excel-compatible formulas. Below are the key categories with examples and use cases:

1. Lookup functions

  • Functions: VLOOKUP, HLOOKUP, INDEX, MATCH, and more.
  • Use case: Find and retrieve data from large tables with ease, ideal for customer data mapping, inventory tracking, and financial analysis.

2. Logical functions

  • Functions: IF, AND, OR, and more.
  • Use case: Build dynamic decision-making logic, great for marketing segmentation or rule-based models.

3. Statistical functions

  • Functions: MIN, MAX, AVERAGEIFS, COUNTIFS, and more.
  • Use case: Analyze trends and performance metrics using multiple criteria, perfect for evaluating campaign effectiveness or sales performance.

4. Math & array formulas

  • Functions: SUM, PRODUCT, UNIQUE, and more.
  • Use case: Perform bulk operations across multiple cells, ideal for forecasting, budget adjustments, or data transformation.

5. Nested formulas

  • Combine multiple functions to create complex, intelligent formulas.
  • Example:
    =IF(A2>100, VLOOKUP(B2, DataRange, 2, FALSE), "Low")
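To make the nested example concrete, here is what the formula computes, written as plain JavaScript. The vlookup helper is a simplified stand-in (exact match only, first column as the key) and the sample DataRange contents are invented for illustration; neither comes from the Spreadsheet component.

```javascript
// Simplified stand-in for VLOOKUP with exact matching (the FALSE argument).
function vlookup(key, table, columnIndex) {
  const row = table.find((r) => r[0] === key);
  return row ? row[columnIndex - 1] : '#N/A';
}

// Equivalent of =IF(A2>100, VLOOKUP(B2, DataRange, 2, FALSE), "Low")
function nestedExample(a2, b2, dataRange) {
  return a2 > 100 ? vlookup(b2, dataRange, 2) : 'Low';
}

// Hypothetical data for the named range DataRange.
const dataRange = [
  ['P-100', 'Laptop'],
  ['P-200', 'Monitor'],
];

console.log(nestedExample(150, 'P-200', dataRange)); // Monitor
console.log(nestedExample(80, 'P-200', dataRange));  // Low
console.log(nestedExample(150, 'P-999', dataRange)); // #N/A
```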

In addition to these powerful data analysis formulas, the React Spreadsheet supports a wide range of other Excel-compatible functions, including:

  • Mathematical,
  • Statistical,
  • Logical,
  • Text, and
  • Date operations.

This gives users the flexibility to perform everything from simple calculations to complex data modeling.

Additional formula features & enhancements

Beyond core functions, our React Spreadsheet also offers usability improvements that simplify formula creation and reduce errors.

1. Enhanced UX for entering formulas

The React Spreadsheet offers a smooth, Excel-like formula entry experience that helps users build formulas visually and intuitively. These enhancements make formula creation faster, reduce errors, and improve usability.

  • Autocomplete dropdown for formula suggestions: When you start typing a formula by entering =, the Spreadsheet automatically displays a drop-down list of available functions.
    For example:
    • Typing =A will show suggestions like AVERAGE, AND, ABS, etc.
    • The list dynamically filters as the user types, helping them quickly find the desired function.
    • This feature reduces the need to memorize function names and ensures accurate formula entry.
  • Interactive cell and range selection: Once you begin entering a formula and need to specify cell references or ranges:
    • You can click or drag across cells in the sheet to automatically insert the correct references into the formula.
    • For example, typing =SUM and then selecting cells A1:A10 will auto-fill the range as A1:A10 in the formula bar.
    • This mirrors Excel’s behavior and allows users to build formulas without manually typing cell addresses.
Excel-like formula entry in React Spreadsheet

2. Formula error dialog

When something goes wrong, the React Spreadsheet provides an intuitive, Excel-like error dialog that helps users troubleshoot quickly. It highlights issues, such as:

  • Missing parentheses,
  • Invalid names, and
  • Incorrect arguments.
Formula error dialog in React Spreadsheet

3. Calculation modes

Just like Excel, you can control when the formulas should be recalculated using the following calculation modes:

  • Automatic: Instantly updates when dependent cells change, great for real-time dashboards.
  • Manual: Recalculates only when triggered, ideal for large models where performance matters.

Syncfusion supports both sheet-level and workbook-level calculation modes through UI interactions and the API.

Calculation modes in React Spreadsheet

4. Defined names for readable formulas

You can also assign meaningful names to cell ranges (e.g., Sales for A2:A100) to make formulas easier to read and maintain:

=AVERAGE(Sales)
Defining names for readable formulas in React Spreadsheet

5. Culture-based argument separator support

Formula syntax varies across regions; some cultures use commas (,), while others use semicolons (;) as argument separators. Our React Spreadsheet intelligently adapts to these differences by supporting culture-based separators, ensuring formulas work accurately across locales.
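As a rough illustration of what separator handling involves, the sketch below rewrites a semicolon-separated formula into the comma-separated form while leaving text inside string literals alone. It is an assumption-laden example, not the component's implementation: the function name is hypothetical, and it ignores related locale issues such as decimal commas.

```javascript
// Hypothetical helper: swap the list separator for ',' outside of "quoted" text.
function toCommaSeparated(formula, listSeparator = ';') {
  let out = '';
  let inString = false;
  for (const ch of formula) {
    if (ch === '"') inString = !inString; // a doubled "" toggles twice, so state is preserved
    out += !inString && ch === listSeparator ? ',' : ch;
  }
  return out;
}

console.log(toCommaSeparated('=IF(A2>100;"a;b";"Low")'));
// =IF(A2>100,"a;b","Low")
```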

6. Custom formula support

Need domain-specific logic? You can define custom formulas with clean syntax and Excel-level precision. These formulas can be reused across sheets, simplifying complex models and improving maintainability.
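A custom formula usually boils down to an ordinary function that receives the cell arguments and returns a value. The sketch below shows such a handler in isolation. The function name and its error strings are hypothetical, and registration is left out on purpose: Syncfusion's documentation describes an addCustomFunction method for that step, but treat its exact signature as something to verify for your version.

```javascript
// Hypothetical handler for a custom formula such as =PERCENTCHANGE(A2, B2).
// Arguments are converted with Number() because they may arrive as strings.
function percentChange(oldValue, newValue) {
  const a = Number(oldValue);
  const b = Number(newValue);
  if (!Number.isFinite(a) || !Number.isFinite(b)) return '#VALUE!';
  if (a === 0) return '#DIV/0!';
  return Math.round(((b - a) / a) * 10000) / 100; // percent, two decimals
}

console.log(percentChange('200', '250')); // 25
console.log(percentChange(0, 5));         // #DIV/0!
console.log(percentChange('abc', 1));     // #VALUE!
```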

Let’s apply these concepts in a practical demo

Now that we’ve covered the theory, let’s practically apply these features in an example React Spreadsheet setup.

We’ll demonstrate how to apply formulas:

  • Through UI interaction.
  • Programmatically using the updateCell method.
  • Declaratively via data binding and cell model binding.

Applying formulas via UI

With the React Spreadsheet, we can deliver a rich, Excel-like formula entry experience to our end users. Users can enter formulas directly into cells or use the formula bar, and the component provides helpful features to guide the process.

Users can interact with the Spreadsheet in multiple ways:

  1. Typing in the cell: This is the most direct way to enter formulas.
    • Select the cell where you want to enter the formula (e.g., D13).
    • Type =A and use autocomplete to select AVERAGE from the dropdown.
    • Drag to select the range D2:D11.
    • Press Enter to apply the formula (=AVERAGE(D2:D11)) to the selected cell (D13). The result appears in the cell; the formula is visible in the formula bar.
    Applying formulas by typing in a cell
  2. Using the formula bar: The formula bar at the top of the Spreadsheet allows users to enter or edit formulas with full visibility.
    • Click the formula bar.
    • Type the formula, e.g., =SUM(A1:A10).
    • Select the range directly on the sheet.
    • Press the Enter key.
    Applying formulas via the formula bar
  3. Using the insert function dialog: This is ideal for users who prefer guided formula entry.
    • Click the fx icon or go to the Formulas tab.
    • Choose a function category (e.g., Math → SUM).
    • Fill in arguments by typing or selecting ranges.
    • Click OK, then press Enter.
    Applying formulas via the insert function dialog UI

Applying formulas programmatically

For developers who want to automate calculations or initialize sheets with computed values, React Spreadsheet provides a flexible API. You can apply formulas using the updateCell method or bind them directly via cell models.

  1. Using the updateCell method: This method allows you to assign or change a cell’s formula entirely through code. You can pass:
    • A cell model object with a formula field.
    • A cell address (e.g., 'F2' or 'Sheet2!E5').

    Example:

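    // Handler meant to run after the Spreadsheet is created;
    // `spreadsheet` is the component instance.
    const onCreated = (spreadsheet) => {
        // Write the weighted-score formula into F2.
        spreadsheet.updateCell(
            { formula: '=ROUND((C2*0.9 + D2*0.1)/1000, 2)' },
            'F2'
        );
    };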

    This formula calculates a weighted performance score for the first employee and writes it to F2:

    • 90% of salary (C2) plus 10% of bonus (D2).
    • The weighted sum is divided by 1,000.
    • The result is rounded to two decimal places.

    You can repeat this for other rows to apply formulas individually or use the autoFill method to fill formulas dynamically across a range, just like Excel’s drag-to-fill feature.

  2. Using cell model binding: You can also define formulas directly in your data model when initializing the Spreadsheet.
    Example:
    const rows = [
        {
            index: 11, // row position (lowercase `index`, as in the cell model below)
            cells: [
                { value: 'Total Salary' },
                { index: 2, formula: '=SUM(C2:C11)' } // placed in column C
            ]
        },
        // More rows...
    ];

    This approach is ideal for preloading calculated values in dashboards or demos.
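Both approaches generalize with a little helper code. The sketch below is illustrative: applyScoreFormulas, totalRow, and the recorder object are hypothetical names, and the only Spreadsheet call used is updateCell with the argument shape shown earlier. It also assumes, as the cell model example suggests, that row and cell index values are zero-based. The recorder stands in for a real component instance so the example runs on its own.

```javascript
// Formula from the updateCell example, parameterized by row number.
const scoreFormula = (row) => `=ROUND((C${row}*0.9 + D${row}*0.1)/1000, 2)`;

// Apply the score formula to F<firstRow>..F<lastRow>, one updateCell call per row.
function applyScoreFormulas(spreadsheet, firstRow, lastRow) {
  for (let row = firstRow; row <= lastRow; row++) {
    spreadsheet.updateCell({ formula: scoreFormula(row) }, `F${row}`);
  }
}

// Build a "Total Salary" row model for a sheet with one header row
// followed by `dataRowCount` data rows.
function totalRow(dataRowCount) {
  const lastDataRow = dataRowCount + 1; // 1-based row number of the last data row
  return {
    index: lastDataRow, // zero-based index of the row just below the data (assumption)
    cells: [
      { value: 'Total Salary' },
      { index: 2, formula: `=SUM(C2:C${lastDataRow})` },
    ],
  };
}

// Stand-in that records calls, used here instead of a real Spreadsheet instance.
const recorder = {
  calls: [],
  updateCell(cell, address) {
    this.calls.push({ address, formula: cell.formula });
  },
};

applyScoreFormulas(recorder, 2, 11);
console.log(recorder.calls.length);         // 10
console.log(recorder.calls[0]);             // F2 with the formula from the example
console.log(totalRow(10).cells[1].formula); // =SUM(C2:C11)
```

With ten data rows, totalRow reproduces the row model from the example above (index 11 and =SUM(C2:C11)).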

Reference

For more insights, refer to our advanced data analysis with React Spreadsheet demo on StackBlitz.

In this demo, you can:

  • Enter formulas via the UI.
  • See formulas applied programmatically.
  • Explore formula binding through cell models.

Frequently Asked Questions

What formulas does the React Spreadsheet support, and does it use Excel-like formula syntax?

It supports a wide range of Excel-compatible functions, including mathematical, logical, statistical, lookup/reference, date/time, financial, text, array, engineering, and information functions. Formulas use familiar Excel-style syntax directly in cells or via the formula bar.

How does formula recalculation work, and can it be controlled for performance?

The React Spreadsheet tracks cell dependencies and recalculates only affected formulas when data changes. Calculation can be set to automatic or manual mode to control performance.

How are formula errors handled?

The React Spreadsheet displays user-friendly error dialogs that explain issues like invalid syntax or missing arguments.

Can developers add custom formulas?

Yes, custom formulas can be defined to implement business-specific logic.

Handle complex formulas and advanced data analysis directly in your React UI

Thanks for reading! Here’s the reality: Excel isn’t going anywhere, but your users shouldn’t have to leave your app to use it.

With the React Spreadsheet, you unlock:

  • Familiar Excel experience users already love.
  • Powerful real-time analytics without switching tools.
  • Seamless integration directly inside your React app.

Now, instead of exporting data or building complex workarounds, you can run advanced formulas, automate insights, and create dynamic dashboards, all in one place.

  • Turn raw data into instant, actionable insights.
  • Build interactive dashboards and financial reports faster.
  • Eliminate friction from traditional Excel-based workflows.

👉 Start today and transform how your app handles data, before you fall behind.

If you’re a Syncfusion user, you can download the setup from the license and downloads page. Otherwise, you can download a free 30-day trial.

Need help? Reach out via our support forum, support portal, or feedback portal.


PaddleOCR 3.5: Running OCR and Document Parsing Tasks with a Transformers Backend


Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation


Reading Notes #698


The world of AI is exploding, and with that explosion comes a crucial question: how do we keep these powerful agents in check? Traditional security methods might not cut it anymore, so developers are turning to innovative sandboxing techniques. Let's explore some of the most promising approaches and see which ones emerge as the frontrunners in this AI safety race.




AI

Programming

DevOps

Podcasts


I've made it a habit to share the fascinating articles, blog posts, and books that cross my path each week. Think of this as an open invitation, if you stumble upon something intriguing, don't hesitate to share it!
Let's build a community of curious minds.

~frank


How to Watch Google I/O

Google I/O is back with updates to Search, Android, Gemini, and a fresh peek at upcoming Android XR smart glasses. Here's how to watch the announcements live and what to expect.

Agent Skills Work but the Research Shows Most Teams Are Building Them Wrong


This post was originally published on The Nuanced Perspective and is being reposted here with the authors’ permission.

Agent skills are everywhere right now. Atlassian built them into Rovo so agents can automatically triage Jira tickets, draft Confluence pages, and route service requests without anyone typing a prompt. Canva and Figma use them so Claude can interact with design files directly. Stripe published skills for payment workflow automation. When Anthropic launched the Agent Skills open standard in December 2025, Microsoft adopted it in VS Code and GitHub within weeks.

The idea is elegantly simple. Instead of building a new specialized agent for every use case, you write a skill once, and any agent that understands the standard can use it. A code reviewer, a PR generator, a deployment checklist, a sprint planner. Each lives in a folder, triggers when relevant, and brings your team’s specific way of doing things into the agent’s context.

But the research on whether skills actually work, and what causes them to fail, is only catching up to adoption now. Four recent papers take the first systematic look at skills in practice: what the benchmarks show, how libraries break down as they grow, and what a more principled approach to orchestration looks like.

Three findings that will change how you think about skills:

  • Curated skills raised the rate at which agents successfully completed tasks by 16.2% on average across 84 tasks. Model-written skills showed no consistent benefit across any configuration tested.
  • As skill libraries grow, the agent’s ability to find the right skill on demand breaks down. When it scans every skill description in one pass, similar-sounding skills start colliding. Organizing skills into a hierarchy rather than a flat list is what the research shows actually fixes this.
  • A large-scale security study of ~31K community skills found that more than one in four contain exploitable vulnerabilities, spanning prompt injection, data exfiltration, and privilege escalation.

This is what those papers found, and what it means for anyone building with skills today.

What a skill is

Your team has a specific way of reviewing PRs. Particular checks, a specific order, standards that go beyond what any generic reviewer would know. You’ve explained it to every new engineer who joined. A skill is how you stop explaining it and let the agent carry it instead. In practice it’s a folder with a SKILL.md file at the center: a description that acts as the trigger condition, a body with step-by-step instructions, and optionally scripts and reference documents that load only when needed. A scoped set of tools and instructions the agent can invoke.

At session startup, the agent reads only the name and description from each installed skill, which is about 100 tokens per skill. The full instructions load only when the skill activates, and scripts run without being read into context at all. A large skill library costs almost nothing at initialization. The context budget only gets spent when a skill is actually running.
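The loading behavior described above can be sketched as follows. The in-memory records, function names, and the four-characters-per-token estimate are assumptions for illustration; real skills live in folders with a SKILL.md file, and real agents count tokens with their own tokenizer.

```javascript
// Hypothetical in-memory stand-ins for installed skills.
const installedSkills = [
  {
    name: 'pr-review',
    description: 'Review pull requests against the team style guide.',
    body: 'Step 1: load the style guide. Step 2: check naming rules. Step 3: flag blockers before nits. Step 4: summarize findings in the team template.',
  },
  {
    name: 'deploy-checklist',
    description: 'Run the predeploy sequence before a release.',
    body: 'Step 1: verify environment variables. Step 2: confirm the rollback plan. Step 3: notify the release channels in order. Step 4: record the deploy.',
  },
];

// At session startup only the name and description enter the context.
function buildStartupIndex(skills) {
  return skills.map(({ name, description }) => ({ name, description }));
}

// The full instructions load only when a skill activates.
function activate(skills, name) {
  const skill = skills.find((s) => s.name === name);
  if (!skill) throw new Error(`Unknown skill: ${name}`);
  return skill.body;
}

// Crude token estimate, used only to compare the two loading strategies.
const estimateTokens = (text) => Math.ceil(text.length / 4);

const startupIndex = buildStartupIndex(installedSkills);
const startupCost = estimateTokens(JSON.stringify(startupIndex));
const loadEverythingCost = estimateTokens(JSON.stringify(installedSkills));
console.log({ startupCost, loadEverythingCost });
console.log(activate(installedSkills, 'pr-review'));
```

The startup index grows with the number of skills, but each entry stays small because the body is deferred until activation.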

That’s progressive disclosure, and it’s what makes skills different from system prompts, which load everything globally every session, or tools, which are API calls that give the agent direct capabilities. The same distinction holds for MCP: MCP gives the agent abilities (a shell, an API connection, access to a database), whereas skills encode the knowledge of how to use those abilities well for a specific workflow. Block’s engineering team put it well: skills are like GitHub Actions YAML, and MCP is the runner. One describes the workflow and the other makes it possible.

Some concrete examples of what this looks like in practice, from teams that have shipped skills in production:

  • A PR review skill that loads your org’s specific style guide, flagging violations and blockers according to your team’s standards rather than generic best practices
  • A deployment checklist skill that runs your team’s exact predeploy sequence, covering environment checks, rollback verification, and the three Slack channels to notify in order
  • A data reporting skill that knows your company’s metric definitions, so when someone asks for “revenue,” it pulls the right number rather than the closest approximation
  • A sprint planning skill that fetches the backlog, applies your team’s capacity rules, and proposes a plan structured the way your team runs standups

The value in each of these isn’t the task itself. Any agent can attempt a PR review or a sprint plan. The value is the organizational knowledge baked into how the skill executes it, your style rules, your deploy sequence, your metric definitions, your team’s way of running things. That specificity is also what makes skills hard to get right, as the benchmarks show.

What the benchmarks show

SkillsBench is the first benchmark built specifically to measure whether agent skills actually improve performance. It tested 84 tasks across 11 domains, running each task under three conditions: no skill, a curated skill, and a self-generated skill. The results are worth sitting with.

Curated skills raised average pass rates by 16.2%. However, the gains were uneven across domains. Software engineering tasks improved by 4.5%, while healthcare tasks saw nearly 52% improvement. The domains where skills helped most were the ones with highly structured workflows and domain-specific conventions the base model doesn’t carry natively.

The less-cited result is that self-generated skills, where the model writes its own skill rather than a human curating one, provided no average benefit across configurations (“SkillsBench,” Table 3). Some model configurations saw small gains; others saw small losses. The paper’s conclusion was that models cannot reliably author the procedural knowledge they benefit from consuming. The trajectory analysis in the benchmark identified two failure modes:

  • Models either generate imprecise procedures lacking specific API patterns, or
  • Fail to recognize what domain knowledge the task actually requires.

The benchmark’s self-generation condition has also drawn pushback from practitioners. One engineer writing on HackerNoon argues the test doesn’t reflect how skilled teams actually build skills. The benchmark prompted a fresh agent to write a skill and immediately use it, which is closer to asking a model to think harder before attempting a task than to building a skill from real execution experience. His own replication, using skills built from actual debugging sessions, showed much stronger results. The distinction matters because a skill captures what a fresh model wouldn’t know. If the model could have reasoned its way there anyway, the skill wasn’t needed.

The practical consequence is that self-generation is the obvious shortcut. You finish a workflow, ask the agent to extract it as a skill, and move on. The benchmark says that without a human review step, you’re not getting the gains you’d expect. The skills look complete. They often cover the main path. What they miss are the edge cases, the exceptions, the three things your team does differently that the model has no way of knowing, and those are exactly the things that make a skill valuable.

One finding worth noting for anyone building with skills: focused skills with two to three modules consistently outperformed comprehensive documentation (“SkillsBench,” Section 4.2). More coverage in a single skill didn’t help; more focused, well-scoped skills did. The benchmark also found that smaller models running with curated skills could match larger models running without them, which is a meaningful cost implication for anyone running skills at scale (“SkillsBench,” Section 4.2.3, Finding 7).

Questions that come up when building with skills

These questions show up every time a team starts building a skill library.

When does something become a skill versus staying in a workflow or system prompt?
The cleaner test is whether this is a recurring task that your team has a specific, repeatable way of doing. If yes, it’s a skill candidate. If it’s a one-time flow or something where general reasoning is sufficient, it probably doesn’t need one. The key difference between a skill and a workflow tool like n8n is flexibility. A workflow executes a fixed sequence and breaks when inputs change, while a skill gives the agent procedural guidance it can apply to variations of the same task. Similarly, agentic workflows can chain multiple agents and tasks together, but each agent still benefits from skills that encode the org-specific knowledge for its part of the chain. When you want the what to be consistent but the agent to handle the how intelligently, that’s a skill.

How narrow or broad should a skill be?
The SkillsBench finding that focused skills with two to three modules outperform comprehensive ones is directly relevant here (“SkillsBench,” Section 4.2). A skill that tries to cover an entire domain tends to underperform one that handles a specific thing well. The more practical question is whether to put a full workflow (data fetch, format, generate PDF) into one skill or split it. Current research supports splitting, because each piece then becomes reusable, easier to update when something changes, and less likely to create unexpected behavior when one module’s scope drifts.

What about skills for noncoders or nonsoftware workflows?
Skills are format-agnostic. They’re structured instructions plus optional scripts, and the domain can be anything. A customer support team can encode their escalation criteria, tone guidelines, and the specific conditions where a human always takes over. A legal team can encode their document review checklist. A design team can encode component standards so reviews stay consistent across contributors. Atlassian’s Rovo agents are a useful reference outside the coding context. Their skills handle ticket triage, Confluence page creation, and service request routing, none of which is software engineering.

When should you deprecate a skill?
This is the question that gets skipped most often. The “SoK” paper argues for treating skills like any other maintained artifact through discovery, refinement, evaluation, update, and eventually deprecation (see Figure 2 in the paper). A skill that was compensating for a model capability gap six months ago may now be redundant, and worse than redundant if it’s overriding better native behavior. The practical test is to run the task with and without the skill and check if the skill still helps. If the gap has closed, retire it.
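The with-and-without test is easy to automate once you have pass/fail results for the same task set under both conditions. The sketch below is one way to frame it; the function names and the 5-point minimum gain are assumptions, not thresholds from the papers.

```javascript
// Fraction of runs that passed; `results` is an array of booleans.
function passRate(results) {
  return results.filter(Boolean).length / results.length;
}

// Flag a skill for retirement when it no longer adds a meaningful gain.
// `minGain` is a hypothetical threshold; choose one that fits your noise level.
function shouldRetire(withSkill, withoutSkill, minGain = 0.05) {
  return passRate(withSkill) - passRate(withoutSkill) < minGain;
}

// Same four tasks, run with and without the skill.
console.log(shouldRetire([true, true, true, false], [true, true, false, false])); // false
console.log(shouldRetire([true, true, true, false], [true, true, true, false]));  // true
```

Small task sets make this comparison noisy, so it is worth repeating runs before retiring anything.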

What breaks as the library grows

A single well-written skill works well. As libraries grow, flat retrieval breaks down, and the “AgentSkillOS” paper is the first to study this systematically across ecosystem scales from 200 to 200,000 skills.

Flat skill libraries don’t scale. When the agent scans a flat directory of, say, 80+ skills on every request, retrieval becomes unreliable. Two skills with similar descriptions start triggering interchangeably and behavior becomes nondeterministic for the same input. At the extreme, the orchestrator falls into routing collapse, where it consistently invokes the wrong skill because the semantic embeddings of two similar skills are indistinguishable. The output looks reasonable, but the wrong skill ran.

The fix the paper proposes is capability trees: organize skills into a hierarchy rather than a flat list. Top-level domains like code, data, docs, with more specific skills as branches and leaves. The agent navigates from domain to branch to leaf instead of scanning everything. They also introduce a usage frequency queue, where skills that aren’t being invoked or aren’t improving outcomes get moved to a dormant index so they don’t pollute retrieval for active skills.

Testing this across ecosystems ranging from 200 to over 200,000 skills, the structured approach consistently outperformed flat invocation, and the gap widened as library size grew.
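The navigation idea (domain first, then skill) can be sketched as below. This is not the AgentSkillOS implementation: the tree contents are invented, and keyword overlap is a deliberately simple stand-in for whatever matching a real system would use.

```javascript
// Hypothetical capability tree: top-level domains, each holding a few skills.
const capabilityTree = {
  code: {
    keywords: ['code', 'pr', 'review', 'deploy'],
    skills: [
      { name: 'pr-review', keywords: ['pr', 'review', 'style'] },
      { name: 'deploy-checklist', keywords: ['deploy', 'rollback', 'release'] },
    ],
  },
  data: {
    keywords: ['data', 'report', 'revenue', 'metric'],
    skills: [
      { name: 'metric-report', keywords: ['revenue', 'report', 'metric'] },
      { name: 'csv-cleanup', keywords: ['csv', 'clean', 'dedupe'] },
    ],
  },
  docs: {
    keywords: ['docs', 'page', 'confluence', 'draft'],
    skills: [{ name: 'page-draft', keywords: ['page', 'draft', 'outline'] }],
  },
};

const tokenize = (text) => text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
const overlap = (words, keywords) => words.filter((w) => keywords.includes(w)).length;

// Return the candidate with the highest nonzero overlap, or null if nothing matches.
function best(candidates, words) {
  let winner = null;
  let bestScore = 0;
  for (const candidate of candidates) {
    const score = overlap(words, candidate.keywords);
    if (score > bestScore) {
      winner = candidate;
      bestScore = score;
    }
  }
  return winner;
}

// Pick a domain first, then compare only the skills inside that domain.
function route(tree, request) {
  const words = tokenize(request);
  const domain = best(Object.values(tree), words);
  if (!domain) return null;
  const skill = best(domain.skills, words);
  return skill ? skill.name : null;
}

console.log(route(capabilityTree, 'Please review this PR for style issues')); // pr-review
console.log(route(capabilityTree, 'Build the monthly revenue report'));       // metric-report
console.log(route(capabilityTree, 'hello there'));                            // null
```

Because the second comparison only sees skills in one domain, similar-sounding skills in other domains never compete for the same request.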

This pattern shows up in how production teams manage their libraries too. Atlassian recommends fewer than five skills per Rovo agent. OpenHands maintains a curated extensions repository with separate skill packages for discrete workflows rather than one monolithic skill set. Across all of them, scoped purposeful skill sets outperform comprehensive ones. More skills isn’t more capable. Past a point, it’s just more noise.

How orchestration can work differently

This section uses a different definition of skill than the rest of the article, so the distinction matters upfront.

In the “SkillOrchestra” paper, a skill isn’t a SKILL.md file. It’s a capability description used to match task requirements to individual agents in a multi-agent system (see Figure 3 in the paper). The concern isn’t procedural knowledge for one agent but figuring out which agent in a pool should handle a given task and why.

The problem it’s solving is that standard reinforcement learning approaches to multi-agent routing don’t hold up as systems grow. Adding a new agent or modifying a workflow means retraining from scratch. RL policies also tend to send everything to the highest-capability agent regardless of cost, which looks fine in evaluation but gets expensive when you’re running it in production.

SkillOrchestra’s alternative has each agent maintain a competence profile derived from its own execution history, specifically estimated success rates across different task types. The orchestrator routes incoming tasks to the agent whose profile best matches what the task actually demands, rather than the one with the highest raw capability. The routing logic stays current without retraining, and you can inspect why a task went where it went.
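A minimal sketch of profile-based routing, under stated assumptions: the agent records, the cost figures, the smoothing, and the "cheapest agent above a success bar" policy are all illustrative choices, not SkillOrchestra's actual algorithm.

```javascript
// Hypothetical execution history: successes and attempts per task type, plus a relative cost.
const agents = [
  {
    name: 'large',
    costPerTask: 10,
    history: { sql: { ok: 19, total: 20 }, summarize: { ok: 18, total: 20 } },
  },
  {
    name: 'small',
    costPerTask: 1,
    history: { sql: { ok: 12, total: 20 }, summarize: { ok: 18, total: 20 } },
  },
];

// Smoothed success estimate, so an unseen task type yields 0.5 instead of NaN.
function successRate(agent, taskType) {
  const { ok = 0, total = 0 } = agent.history[taskType] ?? {};
  return (ok + 1) / (total + 2);
}

// Prefer the cheapest agent that clears the bar; otherwise fall back to the most reliable one.
function routeTask(pool, taskType, minSuccess = 0.8) {
  const scored = pool.map((agent) => ({ agent, rate: successRate(agent, taskType) }));
  const capable = scored.filter((s) => s.rate >= minSuccess);
  if (capable.length > 0) {
    return capable.reduce((a, b) => (b.agent.costPerTask < a.agent.costPerTask ? b : a)).agent.name;
  }
  return scored.reduce((a, b) => (b.rate > a.rate ? b : a)).agent.name;
}

console.log(routeTask(agents, 'sql'));       // large
console.log(routeTask(agents, 'summarize')); // small
```

The profile is just counts, so it updates after every run without retraining, and the reason for a routing decision can be read directly from the numbers.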

The same logic applies to SKILL.md-based systems. Tracking which skills actually improve outcomes for specific task types, and what they cost in tokens, gives you the foundation for better selection as your library grows. You don’t need SkillOrchestra’s full framework to benefit from the core idea.

The security problem

A large-scale security analysis of 31,132 community-sourced skills found that 26.1% contain at least one exploitable vulnerability, spanning prompt injection, data exfiltration, privilege escalation, and supply chain risks. More than one in four.

The attack patterns aren’t exotic. Prompt injection hidden in skill descriptions that manipulate agent behavior once the skill loads. Scripts that execute against filesystem permissions broader than the skill needs. Tool authorizations scoped to the entire workspace when the task only requires one directory.

The core issue is that an external skill isn’t a document you’re reading. It’s code running with your agent’s permissions. Importing a skill from a public repository without reviewing it is like doing an npm install from an unknown author. You wouldn’t do that without at least checking what the package does. That framing changes what due diligence looks like. It means checking the scripts folder before installing, verifying that the permissions the skill requests match what the task actually requires, and sandboxing execution where your environment allows.

The tooling for auditing skills at install time doesn’t exist at the level it should yet. Until it does, the due diligence is manual. OpenHands’ extensions repository and Atlassian’s open source skill package are reasonable references for how production-grade community skills scope permissions. Claude Code’s built-in skill creator also helps here, since it structures permission scoping explicitly from the start.

3 things to do differently

Across all four papers, three recommendations are consistent.

Write skills from real execution. Do the workflow manually with an agent, correct it as you go, then extract it as a skill. The agent has full context of what worked. Skills built from real runbooks, incident reports, and accumulated corrections outperform skills written from scratch. The org-specific edge cases are exactly what the base model doesn’t already know. The general workflow it can handle; the three exceptions your team deals with differently are what the skill needs to capture.

Treat the description as routing logic. The description isn’t a label. It’s how the skill gets triggered at all. Specific phrases, explicit activation conditions, context that distinguishes this skill from adjacent ones. If a skill isn’t firing when you expect it to, or fires when it shouldn’t, rewrite the description first. That’s almost always where the problem is.

Plan for the full lifecycle. Creation is the easy part. Skills drift out of relevance as models improve. A skill that compensated for something Claude couldn’t do eight months ago may now be actively overriding better native behavior. They need to be evaluated against actual task outcomes, updated when workflows change, and retired when they stop earning their place. The teams that treat their skill libraries the way good engineering teams treat their codebase, with reviews, with metrics, with a process for deprecation, are the ones whose libraries stay useful as they grow.

Where this is heading

The shift from prompt engineering to tool use to skill engineering has followed a pattern. Each era produces artifacts that persist longer than the last. Prompts lived in conversations. Tools live in configurations. Skills live in libraries, versioned, shared, maintained, and eventually retired. They behave like code.

Most teams aren’t treating them that way yet. Skills get written quickly, without evaluation criteria, without any plan for what happens when they stop being useful. That’s worked so far because most skill libraries are still small enough to hold in your head. It won’t hold as they become infrastructure.

The teams building durable agent systems won’t be the ones with the most skills. They’ll be the ones who figured out earlier that a skill library needs to be maintained, not just populated, and who started building the discipline to do that before it became urgent.


This article grew out of a live “Chai & AI” session conducted by Prahitha Movva where practitioners debated whether agent skills actually deliver on the hype, or just add another layer of complexity.


