Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Generative AI has broken the subject matter expert/editor relationship


Until recently, if a draft was sent to me by a subject matter expert (SME), it might need significant edits, but I could generally assume the technical content was good. At Chrome, my team of experienced writers has a good pipeline for taking those drafts, shaping them up into readable docs, blog posts, and articles and publishing them. Even when I was in the business of commissioning articles from external sources, there were pretty obvious signs that content was plagiarised, or that the SME wasn’t quite the expert they thought they were. You get a good nose for this over time. Over the last year everything has changed.

There used to be an implicit contract between SME and editor. We receive technically accurate content, and we use our skills in developer communication to ensure the information lands well. In general the questions writers ask are clarifying ones; we're essentially customer zero for the content, working through the tutorial and ensuring each step is as clear as can be. However, other than obvious typos, we could assume the SME knows what they are talking about.

Generative AI has broken that contract. Increasingly writers receive content that looks polished, yet contains inaccuracies. This can be because the SME, while polishing their content using AI tools, has missed the fact that the tool has also modified some code or changed the meaning of text. It can also be that the drive for productivity with these tools has meant that people are being asked to cover broader subject areas, so are relying on AI tools for research rather than their own knowledge. AI can be very confidently wrong, and if the text seems clear, it’s possible to miss that it’s clearly nonsense.

This places a greater burden on the team editing and producing the content. Even with content handed to us from a known SME, we now need to review things with the assumption that they may be wrong. Does that interface really have those methods? Is that diagram inventing a brand new language? Can those quotes be attributed to those people? This relies on having a writing team who also have a level of expertise that allows them to catch these things. It also relies on having enough people in that writing team to deal with the increased workload.

I mention the GitHub flow example, not to take a dig at a fellow writing team, but as something we all need to learn from. I'm thankful that we've not had a similar thing happen so far at Chrome; credit is due to my excellent team and the care the broader developer relations team is taking as it adopts AI. But things are moving fast, and writers in giant companies are having to work out how to deal with it as much as anyone else. Separately, the back story of that Ars Technica article is wild.

The problem becomes bigger if you are relying on vendors and external contributors. You can put as many requirements into your contracts as you like, and reject obvious slop, but the level at which you have to treat what comes in as suspect is like nothing we’ve seen before.

If you are doing content operations at scale, it's your job to put in place processes to deal with this new reality. People will be putting AI-generated content through your pipeline. Even if it's not completely generated, they may be unaware of how much AI polishing has changed their original words. How are you verifying things? The assumptions that were generally true two years ago don't work now. Even in smaller operations, you can't just rely on an experienced editor spotting issues; AI has broken much of the internal knowledge I've been able to rely on for years.

I’m not anti-AI, I’m increasingly using AI in my content operations pipeline, and will share some of that on this blog in future. However, as with any new technology, there’s the potential for positive and negative impacts. In this case a seemingly positive thing for the SMEs—help in drafting their content—is resulting in additional work for another team. But that’s how change happens, it doesn’t happen all at once, you have to work down the chain of problems, and understand where old patterns are no longer serving you. I imagine that we’ll see more unfortunate things shipped by content teams as we work through this. I’m dreading the point at which it’s my turn to be the person who LGTM’d the slop! We’re in a transitional time though, and I’m encouraged by the amount of discussion I’m seeing from other writers, as we work to redefine how we do content operations in a world of generative AI.


AI-Era Employability and Job Security for Software Engineers - Mental Models for Finding a Competitive Advantage Without Selling Out


I've been delaying this episode for a long time because the topic is genuinely difficult and, for many of us, scary. AI threatens not just our livelihood, but our sense of self-worth as creators.

In this episode, I don't offer false guarantees about job security. Instead, I frame the problem through the lens of microeconomics and rational incentives to help you understand how to remain employable. We discuss why you must separate your ego from your current skill set and how to position yourself not as a competitor to AI, but as a force multiplier.

The Hard Truth: I explain why the "abstinence" approach—hoping the industry rejects AI or that it turns out to be a bubble—is a high-risk gamble that is unlikely to succeed.

Ego vs. Employability: We discuss the difficult mental shift required to disconnect your self-worth from the act of writing code manually, allowing you to adopt new tools without feeling like you are losing your identity.

The Microeconomics of Your Job: Understand the cold reality that a rational market only pays you if you generate more value than you cost; if AI can do the same task with less risk or cost, the market will choose AI.

The Non-Zero Sum Game: Learn why the economy isn't a fixed pie. The goal isn't just to survive, but to recognize that the combination of Human + AI can generate more total value than either can alone.

Multiplicative Value: I challenge you to stop thinking about linear skill acquisition and start thinking like a manager: how can you use AI to multiply your output and become indispensable?

Accepting Atrophy: We confront the reality that your core coding skills may degrade over time as you rely on AI, and why accepting this trade-off might be necessary for your career survival.

🙏 Today's Episode is Brought to You by:

If you are building an application that needs real-time search results—especially if you are working with LLMs—you know that stale data is a problem. SerpApi is the live web search API for your application.

• Get real-time search results fast, directly in your app as JSON.

• Bridge the gap for LLMs that are locked to a training date.

• Trusted by companies like NVIDIA, Adobe, and Shopify. Get started with a free tier to build your full integration before you commit. Go to serpapi.com

📮 Ask a Question

If you enjoyed this episode and would like me to discuss a question that you have on the show, drop it over at: developertea.com.

📮 Join the Community

If you want to be a part of a supportive community of engineers (non-engineers welcome!) working to improve their lives and careers, join us on the Developer Tea Discord community today!

🧡 Leave a Review

If you're enjoying the show and want to support the content, head over to iTunes and leave a review!

Download audio: https://dts.podtrac.com/redirect.mp3/cdn.simplecast.com/audio/c44db111-b60d-436e-ab63-38c7c3402406/episodes/86554efb-82a2-4e81-bde4-e2a4e5a9a3a7/audio/9c80651c-a8cb-4ea1-9b56-c1fcaa5d2df9/default_tc.mp3?aid=rss_feed&feed=dLRotFGk

T4 templates on modern .NET


T4 is a popular .NET-based templating language. Originally, it could use only .NET Framework, but in 2023, Microsoft added a version of the template tool that could use .NET 6.0. At some later point they added support for .NET 8.0. (As I write this in February 2026, there is not yet support for .NET 10.0.)

However, this modern .NET support is minimal, and is not used by default. The Visual Studio integration continues to use the old .NET Framework implementation. To use the new modern .NET support, you have to run the command line tool manually, or adapt your project files to invoke the tool for you.

The Rx.NET and Reaqtor codebases make extensive use of T4. Up until now we've relied on the built-in Visual Studio support, but the inability to use modern .NET features is starting to become a problem. This post explains what it takes to move projects that use the old .NET Framework T4 support in Visual Studio over to using T4 with modern .NET.

A quick introduction to T4

The documentation seems coy about what the name T4 means, but some say it stands for Text Template Transformation Toolkit. If you've not used T4 before, it's a bit like a Razor page: it can contain a mixture of plain text and C# code. For example:

<#@ template language="C#" #>
This is some plain text that will be emitted verbatim.
<#
  // This code is executed, so it won't appear in the output, but it
  // changes how the output that follows is produced.
  for (int i = 0; i < 5; ++i)
  {
#>
This is also emitted. It's in a loop, so we get many copies.
<#
    // This is another code block.
  }
#>

If I put that in a file called SimpleTemplate.tt and then run this command:

TextTransformCore SimpleTemplate.tt

it produces a file called SimpleTemplate.txt with this content:

This is some plain text that will be emitted verbatim.
This is also emitted. It's in a loop, so we get many copies.
This is also emitted. It's in a loop, so we get many copies.
This is also emitted. It's in a loop, so we get many copies.
This is also emitted. It's in a loop, so we get many copies.
This is also emitted. It's in a loop, so we get many copies.

I've made my template emit plain text in this example to clarify the fact that T4 is fundamentally text-oriented. You could use it to generate C#, F#, VB.NET, markdown, HTML, Cucumber specs, or, as in this case, just plain text containing natural language.

In Rx.NET and Reaqtor we use T4 to generate repetitive code. For example, the Min and Max operators have multiple versions of what are essentially the same code, just for different numeric types. (Since .NET 7, there has been a better way to solve this particular problem: it introduced new ways of defining interfaces, and the associated generic math feature. However, we still target .NET Framework in Rx.NET, so we can't use that.) We also often use templates driven by reflection to generate code whose structure is determined by other code.
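To make that pattern concrete, here is a minimal sketch (not the actual Rx.NET template; the type list, namespace, and class name are made up for illustration): a template loops over a set of numeric type names and emits one Min/Max pair per type.

<#@ template language="C#" #>
<#@ output extension=".cs" #>
// <auto-generated />
namespace Example
{
    public static partial class MinMax
    {
<#
    // Hypothetical list of numeric types; a real template would cover more.
    var types = new[] { "int", "long", "float", "double" };
    foreach (var type in types)
    {
#>
        public static <#= type #> Min(<#= type #> x, <#= type #> y) => x < y ? x : y;
        public static <#= type #> Max(<#= type #> x, <#= type #> y) => x > y ? x : y;
<#
    }
#>
    }
}

Running TextTransformCore over a file like this produces one Min/Max pair per type in the list, which is the essence of how we avoid hand-maintaining near-identical overloads.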

Aren't we supposed to be using source generators now?

In theory the introduction of source generators renders T4 unnecessary for the ways we use it in Rx.NET and Reaqtor. Now there is direct support in the .NET SDK for generating code at build time.

However, having written a couple of source generators I find them to be a major step up in complexity from T4.

They enable developers to create really useful tools. For example, our Corvus.JsonSchema libraries offer Corvus.Json.SourceGenerator, which is now my go-to solution when I want to deal with JSON in C#. But while source generators can be great to use, they are a bit of a nightmare to write. So I think there is still a place for T4.

Tooling changes

To understand how to migrate an existing project to using modern .NET for T4, it's important to understand the differences in tooling support between T4 on .NET FX and T4 on .NET.

Visual Studio's existing support for .NET FX

Visual Studio has offered support for T4 for many years. You enable it by adding this to your project file:

<ItemGroup>
    <Service Include="{508349b6-6b84-4df5-91f0-309beebad82d}" />
</ItemGroup>

You could then tell Visual Studio that certain source files were T4 templates, and normally you would also tell it about the association between the T4 template and its generated output, e.g.:

  <ItemGroup>
    <None
        Update="Example.tt"
        Generator="TextTemplatingFileGenerator"
        LastGenOutput="Example.cs"
        />

    <Compile
        Update="Example.cs"
        DesignTime="True"
        AutoGen="True"
        DependentUpon="Example.tt"
        />
  </ItemGroup>

The <None> element here sets the Generator attribute to TextTemplatingFileGenerator, and this makes Visual Studio offer a couple of additional options on the file's context menu in Solution Explorer:

Visual Studio T4 context menu showing the Run Custom Tool and Debug T4 Template options

Selecting Run Custom Tool causes the T4 template to execute, generating its output. The Debug T4 Template runs it in the debugger so you can step through the template code.

T4 on .NET

The more recently added support for T4 on .NET provides one thing: the TextTransformCore command line tool. There is no Visual Studio integration. There is no supported way to tell Visual Studio to execute a template using modern .NET: VS today only offers the old .NET Framework-based T4 execution that it has had for years.

So the new .NET support is all very bare bones. We get almost nothing compared to the support available when running a T4 template on .NET Framework. The old context menu items are still available; it's just that they can only invoke the old .NET Framework T4 tooling.

Changes required when migrating T4 from .NET Framework to .NET

Note that if you are using assembly directives in your template you might need to change them, because some .NET runtime library types are in different assemblies. For example, if a template written to run on .NET FX includes this line:

<#@ assembly name="System.Core" #>

you will probably need to change it to this to get it working on .NET:

<#@ assembly name="System.Linq" #>

You might also find that you are getting errors such as these:

Compiling transformation: CS1069: The type name 'Stack<>' could not be found in the namespace 'System.Collections.Generic'. This type has been forwarded to assembly 'System.Collections, Version=0.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' Consider adding a reference to that assembly.

You may need to add this:

<#@ assembly name="System.Collections" #>

A more subtle problem is that the T4 tooling does not understand the distinction between reference assemblies and runtime assemblies. It always uses the latter, which can cause some surprises. For example, you might get an error of this form when trying to use the types in System.Xml.Linq:

error CS1069: Compiling transformation: CS1069: The type name 'XElement' could not be found in the namespace 'System.Xml.Linq'. This type has been forwarded to assembly 'System.Private.Xml.Linq, Version=8.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51' Consider adding a reference to that assembly.

You can resolve this by adding another assembly directive:

<#@ assembly name="System.Private.Xml.Linq" #>

but this is somewhat unsatisfactory: the fact that .NET 8.0 happens to put this type in this assembly is an implementation detail that could easily change from one version of .NET to the next. But for now this seems to be the only way to work around this. I've submitted a bug report at https://developercommunity.visualstudio.com/t/TextTransformCore-uses-runtime-not-ref/11013312? if you're having the same problem and want to add your support for this being fixed.

Better project support for T4 on .NET

Although there is no built in tooling, it's actually relatively straightforward to make an existing project use the newer tooling, once you know how. We can do this with some modifications to project files. The basic process is:

  • Define an ItemGroup for all your T4 templates
  • Automatically set the DependentUpon item metadata on the generated code (to ensure generated files go underneath their T4 files)
  • Define a custom Target that runs the T4 templates if the templates are newer than the generated outputs

Defining an item group for templates

I put this in a Directory.Build.props file at the root of my solution, so that .tt files anywhere in any project in my solution are added to the item group:

<ItemGroup>
  <TextTemplates Include="**\*.tt">
    <GeneratedOutput>%(Filename).cs</GeneratedOutput>
    <GeneratedOutputRelativePath>%(RelativeDir)%(GeneratedOutput)</GeneratedOutputRelativePath>
  </TextTemplates>
</ItemGroup>

The Include="**\*.tt" is a glob that adds all files with a .tt extension anywhere in any project to the TextTemplates item group.

We then set two item metadata values:

  • GeneratedOutput: the filename of the output that the template will generate
  • GeneratedOutputRelativePath: the path of the template output relative to the project folder

In fact, in the Reaqtor codebase, we do something slightly more complex:

<ItemGroup>
  <TextTemplates Include="**\*.tt">
    <GeneratedOutput Condition="Exists('%(RootDir)%(Directory)%(Filename).generated.cs')">%(Filename).generated.cs</GeneratedOutput>
    <GeneratedOutput Condition="%(GeneratedOutput) == ''">%(Filename).cs</GeneratedOutput>
    <GeneratedOutputRelativePath>%(RelativeDir)%(GeneratedOutput)</GeneratedOutputRelativePath>
  </TextTemplates>
</ItemGroup>

The reason for this is that historically the Reaqtor codebase has used two different conventions. In some cases, a template called, say, ByteArray.tt generates a file with the same name but a .cs extension, e.g. ByteArray.cs. However, in some places the T4 includes this directive:

<#@ output extension=".generated.cs" #>

For example, that appears in the LetOptimizerTests.tt template, and the effect is that the generated file is called LetOptimizerTests.generated.cs. (In this case, that's because the generated code is adding extra methods to a partial class, so there's already a non-generated LetOptimizerTests.cs file. The generated code needs to go into a file with a different name.) Just to confuse matters further, some templates have Generated in their name, e.g. PooledObjects.Generated.tt. Obviously in this case we don't want the generated file to be PooledObjects.Generated.generated.cs, so this one is really an example of the first convention in which the .tt becomes .cs in the generated output.

The more complex XML shown above takes this into account: it looks to see if a file with that .generated.cs extension exists, and if so, selects that filename as the target for the template. But if it's not present, it just picks the other name.

Note that it's actually the template itself that determines what the output file name is with that output extension directive. This project file content just looks at what files exist, and infers from that which convention was used.

Correct Solution Explorer behaviour with DependentUpon

To ensure that the source file that a template generates appears nested inside that template in Solution Explorer, I put this in the Directory.Build.targets:

<ItemGroup>
  <Compile Update="@(TextTemplates->'%(GeneratedOutputRelativePath)')">
    <DesignTime>true</DesignTime>
    <AutoGen>true</AutoGen>
    <DependentUpon>%(TextTemplates.Filename).tt</DependentUpon>
  </Compile>    
</ItemGroup>

We put this in the Directory.Build.targets file so that it can run after everything in the .NET SDK's various .props files. Those will set up the Compile item group, which we just want to update. If we put this in the Directory.Build.props file, the Compile item group wouldn't exist yet so there would be nothing for us to Update.

With this in place, you can now remove all entries of this form from your project files:

<ItemGroup>
  <None
      Update="Example.tt"
      Generator="TextTemplatingFileGenerator"
      LastGenOutput="Example.cs"
      />
  <Compile
      Update="Example.cs"
      DesignTime="True"
      AutoGen="True"
      DependentUpon="Example.tt"
      />
</ItemGroup>

These are no longer necessary because the preceding ItemGroup automatically sets the item group metadata correctly for all templates.

Custom target to execute templates

Finally, also in the Directory.Build.targets we define this custom target:

<Target
  Name="_TransformTextTemplates"
  BeforeTargets="PreBuildEvent"
  Condition="@(TextTemplates) != '' and $(DevEnvDir) != ''"
  Inputs="@(TextTemplates)"
  Outputs="@(TextTemplates->'%(GeneratedOutputRelativePath)')">

  <Exec
    WorkingDirectory="$(ProjectDir)"
    Command='"$(DevEnvDir)TextTransformCore.exe" "%(TextTemplates.Identity)"' />

</Target>

We've set this to execute before the PreBuildEvent, meaning that all T4 generation occurs before the main build work happens.

The Condition here ensures that this target only attempts to run when the build is in a Visual Studio environment. (Either we are using Visual Studio to run the build, or the build was run from a Visual Studio developer prompt.) This is necessary because the TextTransformCore tool is not part of the .NET SDK, so it's not universally available. It's part of Visual Studio.

Generated source files will be checked into source control, so we only ever need to run the T4 tool if the template changes. So in cases where someone just clones a repository and builds it, it won't matter if they don't have Visual Studio available because all the generated files will be present anyway. (But anyone wishing to modify a template, and to get the corresponding modified output, will need Visual Studio, because that's the only official way to get the TextTransformCore tool today.)

This target uses the Inputs and Outputs to ensure that it only runs templates when the .tt file's timestamp is newer than the generated source file. (This conditional timestamp-based execution is built into MSBuild. You just have to tell it how a target's inputs and outputs are related.)

Conclusions

We are no longer constrained to using .NET Framework inside T4 files. Although this means abandoning the built-in Visual Studio tool support, with some relatively simple project file modifications it's possible to get your T4 files using a modern .NET runtime in a straightforward way. We do lose the ability to debug the templates, but we get automated re-execution of the templates as part of the build.


How I built a custom agent skill to configure Application Insights


If you've ever found yourself repeating the same Azure setup ritual — adding the Application Insights SDK, wiring up telemetry, configuring sampling rules — you already know the pain. It's not hard, but it's tedious. Every new service needs the same scaffolding. Every new team member has to learn the same conventions.

That's exactly what I solved with a custom skill. Now, when I need to instrument a service, I just tell Copilot to configure Application Insights, and it does everything exactly the way our team expects. No extra prompting, no re-explaining our conventions. It just works.

This post explains what Skills are, how they work inside VS Code, and how to build one for your own team — using my Application Insights skill as a hands-on example.

What is an agent skill?

An agent skill is a folder of instructions, scripts, and reference files that teaches your AI agent how to handle a specific task. Think of it as institutional knowledge made executable. Instead of pasting a wall of context into every conversation, you define it once in a SKILL.md file, and Copilot (or Claude) loads it automatically when the task is relevant.

Skills can be as simple as a few lines of instructions or as complex as multi-file packages with executable code. The best skills encode your team's standards in a reusable, shareable package — essentially turning Claude from a general-purpose assistant into a specialized expert for a specific workflow.

Skills were originally introduced in Claude but are now also available in VS Code through GitHub Copilot's Agent Skills integration. The format is the same across all of these tools, so a skill you write once is portable.

How agent skills work in VS Code

Agent skills is an open standard that works across multiple AI agents, including GitHub Copilot in VS Code, GitHub Copilot CLI, and GitHub Copilot coding agent.

In practice, here's what happens when you use a skill in VS Code:

  1. You open the Chat panel and start a new conversation.
  2. VS Code detects which skills are available and their descriptions.
  3. When you describe a task, the agent automatically loads the most relevant skill based on those descriptions.
  4. The agent follows your skill's instructions — running bundled scripts if needed — and produces output consistent with your defined standards.

Skills are stored inside .github/skills/ and use a SKILL.md file that defines the skill's behavior.

The anatomy of a SKILL.md file

Every skill starts with a SKILL.md file containing two parts: YAML frontmatter and a Markdown body.

---
name: your-skill-name
description: One sentence explaining what the skill does and when to use it.
---

# Skill Title

Instructions, context, examples, and anything else an agent needs to do this job well.

The YAML frontmatter holds the required name and description fields. The Markdown body is the second level of detail — your agent accesses it when performing the task.

Building the Application Insights skill

Here's how I structured my Application Insights skill. It handles three things: adding the right NuGet packages, generating the correct configuration boilerplate, and applying our team's conventions around dependency tracking, log levels, and so on.

Start by creating a skill directory

mkdir -p ~/.github/skills/application-insights

Create a SKILL.md markdown file:
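The full file isn't reproduced in this excerpt, so here is only a minimal sketch of what an application-insights SKILL.md could contain, following the anatomy described earlier (the description wording, the steps, and the referenced file names are my assumptions, not the author's actual skill):

---
name: application-insights
description: Use when adding the Application Insights SDK, wiring up telemetry, or configuring sampling for an Angular, ASP.NET Core, or Worker app.
---

# Configure Application Insights

1. Detect the project type (Angular, ASP.NET Core, or Worker).
2. Follow the matching instructions file (for example, aspnetcore.md).
3. Fetch the connection string using the bundled PowerShell script.
4. Apply the team conventions for dependency tracking, sampling, and log levels.
5. Work through the validation checklist before finishing.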

We mainly support Angular, ASP.NET Core, and Worker apps, so we isolated the framework-specific instructions in separate files:
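The exact files aren't listed in this excerpt; a plausible layout for the skill folder (the file names are assumptions) looks like this:

application-insights/
  SKILL.md
  angular.md
  aspnetcore.md
  worker.md
  Get-AppInsightsConnectionString.ps1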

I also included a PowerShell script to fetch the Application Insights connection string from Azure:
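The script itself isn't reproduced here; the following is a minimal sketch of what it might look like, assuming the Az.ApplicationInsights PowerShell module and placeholder resource names rather than the author's hardcoded values:

# Hypothetical sketch -- not the author's actual script.
# Requires the Az.ApplicationInsights module and an authenticated session (Connect-AzAccount).
param(
    [string]$ResourceGroupName = "rg-observability",  # placeholder
    [string]$ResourceName      = "appi-my-service"    # placeholder
)

# Look up the Application Insights component in Azure.
$component = Get-AzApplicationInsights -ResourceGroupName $ResourceGroupName -Name $ResourceName

# Emit the connection string so the agent can drop it into configuration.
$component.ConnectionString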

Remark: you will have to tweak this PowerShell script to your needs, as I hardcoded some configuration values for our setup.

Tips & tricks

Write descriptions like trigger conditions. The description isn't a tagline — it's what the agent reads to decide when to load your skill. "Use when setting up telemetry, adding the Application Insights SDK, or instrumenting a new service" is far more effective than "Application Insights helper."

Keep skills focused. Creating separate skills for different workflows is better than a single skill meant to do everything. Focused skills compose better than large ones. I have a separate skill for configuring logging with Serilog, and they work well together automatically.

Include a validation checklist. Adding a checklist at the end of your instructions gives the agent a concrete definition of done. It dramatically reduces the chance of skipped steps.
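For example, a hypothetical checklist (adjust it to your own conventions) might close out the SKILL.md like this:

## Validation checklist
- [ ] Application Insights packages added and restored
- [ ] Connection string read from configuration, not hardcoded
- [ ] Sampling configured per team convention
- [ ] Telemetry gated or disabled for the development environment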

Use examples. Include two or three concrete code examples in your SKILL.md. This shows what success looks like and improves consistency.

Start simple, then add scripts. Your instructions should be structured, scannable, and actionable. Start with basic instructions in Markdown before adding complex scripts. My Application Insights skill worked well as pure Markdown for weeks before I added the project-detection script.

Test with different phrasings. After uploading, try triggering the skill with different prompts: "add telemetry," "instrument this service," "configure observability." If the skill doesn't activate reliably, broaden the description and add more use cases.

And a last tip:

Use the Skill-Creator skill: The skill-creator skill guides you through creating well-structured skills. It asks clarifying questions, suggests description improvements, and helps format instructions properly.

Putting it together

I started using skills only recently, but we already see the advantage for our teams in automating common tasks.

The Application Insights skill I built means I never have to think about whether the sampling config is right, or whether someone remembered to gate telemetry for the development environment. Claude knows our conventions, and it applies them every time.

If you have workflows that follow a consistent pattern (configuring infrastructure, generating boilerplate, enforcing code standards), that's a candidate for a skill. The time spent writing a good SKILL.md for your team pays for itself within the first week.

More information

About Agent Skills - GitHub Docs

agentskills/agentskills: Specification and documentation for Agent Skills

How to create Skills for Claude: steps and examples | Claude


MariaDB innovation: binlog_storage_engine, small server, Insert Benchmark


 MariaDB 12.3 has a new feature enabled by the option binlog_storage_engine. When enabled it uses InnoDB instead of raw files to store the binlog. A big benefit from this is reducing the number of fsync calls per commit from 2 to 1 because it reduces the number of resource managers from 2 (binlog, InnoDB) to 1 (InnoDB).

My previous post had results for sysbench with a small server. This post has results for the Insert Benchmark with a similar small server. Both servers use an SSD that has high fsync latency. This is probably a best-case comparison for the feature. If you really care, then get enterprise SSDs with power loss protection. But you might encounter high fsync latency on public cloud servers.

tl;dr for a CPU-bound workload

  • Enabling sync on commit for InnoDB and the binlog has a large impact on throughput for the write-heavy steps -- l.i0, l.i1 and l.i2.
  • When sync on commit is enabled, then also enabling the binlog_storage_engine is great for performance as throughput on the write-heavy steps is 1.75X larger for l.i0 (load) and 4X or more larger on the random write steps (l.i1, l.i2)
tl;dr for an IO-bound workload
  • Enabling sync on commit for InnoDB and the binlog has a large impact on throughput for the write-heavy steps -- l.i0, l.i1 and l.i2. It also has a large impact on qp1000, which is the most write-heavy of the query+write steps.
  • When sync on commit is enabled, then also enabling the binlog_storage_engine is great for performance as throughput on the write-heavy steps is 4.74X larger for l.i0 (load), 1.50X larger for l.i1 (random writes) and 2.99X larger for l.i2 (random writes)
Builds, configuration and hardware

I compiled MariaDB 12.3.0 from source.

The server is an ASUS ExpertCenter PN53 with an AMD Ryzen 7 7735HS CPU, 8 cores, SMT disabled, and 32G of RAM. Storage is one NVMe device for the database using ext-4 with discard enabled. The OS is Ubuntu 24.04. More details on it are here. The storage device has high fsync latency.

I used 4 my.cnf files (a sketch of the relevant settings follows the list):
  • z12b
    • my.cnf.cz12b_c8r32 is my default configuration. Sync-on-commit is disabled for both the binlog and InnoDB so that write-heavy benchmarks create more stress.
  • z12c
    • my.cnf.cz12c_c8r32 is like cz12b except it enables binlog_storage_engine, so the binlog is stored in InnoDB.
  • z12b_sync
    • my.cnf.cz12b_sync_c8r32 is like cz12b except it enables sync-on-commit for both InnoDB and the binlog.
  • z12c_sync
    • my.cnf.cz12c_sync_c8r32 is like cz12c except it enables sync-on-commit for InnoDB. Note that InnoDB is used to store the binlog so there is nothing else to sync on commit.
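As a rough guide to what these configurations toggle, here is a sketch of the relevant my.cnf settings. This is an assumption on my part, not the author's published config files: innodb_flush_log_at_trx_commit and sync_binlog are the standard sync-on-commit knobs, while the exact value syntax for binlog_storage_engine is assumed.

# Hypothetical my.cnf fragment
[mysqld]
# Sync on commit (the *_sync configs): flush the InnoDB redo log and the binlog on every commit.
innodb_flush_log_at_trx_commit = 1
sync_binlog = 1

# New in MariaDB 12.3 (the z12c configs): store the binlog in InnoDB rather than raw files,
# so a commit needs only one fsync. Value syntax assumed here.
binlog_storage_engine = innodb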
The Benchmark

The benchmark is explained here. It was run with 1 client for two workloads:
  • CPU-bound - the database is cached by InnoDB, but there is still much write IO
  • IO-bound - most, but not all, benchmark steps are IO-bound
The benchmark steps are:

  • l.i0
    • insert XM rows per table in PK order. The table has a PK index but no secondary indexes. There is one connection per client. X is 30M for CPU-bound and 800M for IO-bound.
  • l.x
    • create 3 secondary indexes per table. There is one connection per client.
  • l.i1
    • use 2 connections/client. One inserts XM rows per table and the other does deletes at the same rate as the inserts. Each transaction modifies 50 rows (big transactions). This step is run for a fixed number of inserts, so the run time varies depending on the insert rate. X is 40M for CPU-bound and 4M for IO-bound.
  • l.i2
    • like l.i1 but each transaction modifies 5 rows (small transactions) and YM rows are inserted and deleted per table. Y is 10M for CPU-bound and 1M for IO-bound.
    • Wait for S seconds after the step finishes to reduce MVCC GC debt and perf variance during the read-write benchmark steps that follow. The value of S is a function of the table size.
  • qr100
    • use 3 connections/client. One does range queries and performance is reported for this. The second does 100 inserts/s and the third does 100 deletes/s. The second and third are less busy than the first. The range queries use covering secondary indexes. If the target insert rate is not sustained then that is considered to be an SLA failure. If the target insert rate is sustained then the step does the same number of inserts for all systems tested. This step is frequently not IO-bound for the IO-bound workload. This step runs for 1800 seconds.
  • qp100
    • like qr100 except uses point queries on the PK index
  • qr500
    • like qr100 but the insert and delete rates are increased from 100/s to 500/s
  • qp500
    • like qp100 but the insert and delete rates are increased from 100/s to 500/s
  • qr1000
    • like qr100 but the insert and delete rates are increased from 100/s to 1000/s
  • qp1000
    • like qp100 but the insert and delete rates are increased from 100/s to 1000/s
Results: summary

The performance reports are here for:
  • CPU-bound
    • all-versions - results for z12b, z12c, z12b_sync and z12c_sync
    • sync-only - results for z12b_sync vs z12c_sync
  • IO-bound
    • all-versions - results for z12b, z12c, z12b_sync and z12c_sync
    • sync-only - results for z12b_sync vs z12c_sync
The summary sections from the performance reports have 3 tables. The first shows absolute throughput by DBMS tested X benchmark step. The second has throughput relative to the version from the first row of the table. The third shows the background insert rate for benchmark steps with background inserts. The second table makes it easy to see how performance changes over time. The third table makes it easy to see which DBMS+configs failed to meet the SLA.

I use relative QPS to explain how performance changes. It is: (QPS for $me / QPS for $base), where $me is the result for some configuration and $base is the result for the base configuration (cz12b for the all-versions comparison, cz12b_sync for the sync-only comparison).

When relative QPS is > 1.0 then performance is better than the base. When it is < 1.0 then there are regressions relative to the base. The Q in relative QPS measures:
  • insert/s for l.i0, l.i1, l.i2
  • indexed rows/s for l.x
  • range queries/s for qr100, qr500, qr1000
  • point queries/s for qp100, qp500, qp1000
Below I use colors to highlight the relative QPS values with yellow for regressions and blue for improvements.

I often use context switch rates as a proxy for mutex contention.

Results: CPU-bound

The summaries are here for all-versions and sync-only.
  • Enabling sync on commit for InnoDB and the binlog has a large impact on throughput for the write-heavy steps -- l.i0, l.i1 and l.i2.
  • When sync on commit is enabled, then also enabling the binlog_storage_engine is great for performance as throughput on the write-heavy steps is 1.75X larger for l.i0 (load) and 4X or more larger on the random write steps (l.i1, l.i2)
The second table from the summary section has been inlined below. That table shows relative throughput which is:
  • all-versions: (QPS for my config / QPS for z12b)
  • sync-only: (QPS for z12c_sync / QPS for z12b_sync)
For all-versions
dbms                                     l.i0   l.x    l.i1   l.i2   qr100  qp100  qr500  qp500  qr1000 qp1000
ma120300_rel_withdbg.cz12b_c8r32         1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
ma120300_rel_withdbg.cz12c_c8r32         1.03   1.01   1.00   1.03   1.00   0.99   1.00   1.00   1.01   1.00
ma120300_rel_withdbg.cz12b_sync_c8r32    0.04   1.02   0.07   0.01   1.01   1.01   1.00   1.01   1.00   1.00
ma120300_rel_withdbg.cz12c_sync_c8r32    0.08   1.03   0.28   0.06   1.02   1.01   1.01   1.02   1.02   1.01

For sync-only
dbms                                     l.i0   l.x    l.i1   l.i2   qr100  qp100  qr500  qp500  qr1000 qp1000
ma120300_rel_withdbg.cz12b_sync_c8r32    1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
ma120300_rel_withdbg.cz12c_sync_c8r32    1.75   1.01   3.99   6.83   1.01   1.01   1.01   1.01   1.03   1.01

Results: IO-bound

The summaries are here for all-versions and sync-only.
  • Enabling sync on commit for InnoDB and the binlog has a large impact on throughput for the write-heavy steps -- l.i0, l.i1 and l.i2. It also has a large impact on qp1000, which is the most write-heavy of the query+write steps.
  • When sync on commit is enabled, then also enabling the binlog_storage_engine is great for performance as throughput on the write-heavy steps is 4.74X larger for l.i0 (load), 1.50X larger for l.i1 (random writes) and 2.99X larger for l.i2 (random writes)
The second table from the summary section has been inlined below. That table shows relative throughput which is:
  • all-versions: (QPS for my config / QPS for z12b)
  • sync-only: (QPS for z12c_sync / QPS for z12b_sync)
For all-versions
dbms                                     l.i0   l.x    l.i1   l.i2   qr100  qp100  qr500  qp500  qr1000 qp1000
ma120300_rel_withdbg.cz12b_c8r32         1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
ma120300_rel_withdbg.cz12c_c8r32         1.01   0.99   0.99   1.01   1.01   1.01   1.01   1.07   1.01   1.04
ma120300_rel_withdbg.cz12b_sync_c8r32    0.04   1.00   0.55   0.10   1.02   0.97   1.00   0.80   0.95   0.55
ma120300_rel_withdbg.cz12c_sync_c8r32    0.18   1.00   0.83   0.31   1.02   1.01   1.02   0.96   1.02   0.86

For sync-only
dbms                                     l.i0   l.x    l.i1   l.i2   qr100  qp100  qr500  qp500  qr1000 qp1000
ma120300_rel_withdbg.cz12b_sync_c8r32    1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
ma120300_rel_withdbg.cz12c_sync_c8r32    4.74   1.00   1.50   2.99   1.00   1.04   1.02   1.20   1.08   1.57


Zero Knowledge Proof (ZKP) in SQL Server


Learn about zero knowledge proof and how to implement it in SQL Server to ensure privacy in blockchain and transaction systems.

The post Zero Knowledge Proof (ZKP) in SQL Server appeared first on MSSQLTips.com.
