
Building a Modern Track Changes Review Workflow in ASP.NET Core C#

In this article, we will explore how to build a modern track changes review workflow in ASP.NET Core C#. We will leverage the powerful features of TX Text Control .NET Server to create an efficient and user-friendly review process for document editing.


Safely creating projects with the AI Assistant


AI in DevOps faces an interesting problem.

According to Octopus Deploy’s AI Pulse report, 44% of developers say they’re frustrated with AI outputs that are “almost right, but not quite.”

You’ve likely experienced it yourself: a large language model generates a deployment config missing a critical variable, or suggests a step that works in staging but breaks in production. That’s why many engineers will use AI to whip up a script, but fewer will let it configure a production pipeline unsupervised.

So far in the Easy Mode series, we’ve walked through how to use the Octopus AI Assistant for specific, bounded tasks, from adding manual interventions to setting environment-scoped variables.

Each post focused on a task where you could verify the result before it hit production. This post continues that pattern. We’ll cover how to create projects and what to do if you need to roll back a change you are not satisfied with.

What the AI Assistant does (and what we’re building)

If this is your first Easy Mode article, it is worth explaining what the Octopus AI Assistant is and exactly what we will be building.

The Octopus AI assistant is a Chrome extension that integrates AI directly into Octopus Deploy. It can answer questions about your instance, help you build projects, and guide you through configuration, all without leaving the browser tab you are already in.

A common question users have is, “Does Octopus use my data for training?” To get that out of the way: no, the extension does NOT train AI models on your data.

Demo: Creating a project with the AI Assistant

In this walkthrough, we will create a project using the Octopus AI assistant and simulate responding to a request to revert a change.

Prerequisites

In order to follow along with this portion of the tutorial, you will need the following:

  • An Octopus Cloud account: If you don’t have one, you can sign up for a free trial.
  • The Octopus AI Assistant Chrome extension: You can install it from the Chrome Web Store.

Upon installation, you should see a new icon in the bottom right.

Create a project

With the extension installed, the next step is to create a project.

Paste the following prompt into the Octopus AI Assistant and run it:

Create a Script project called "15. Git backed Project".

Upon hitting enter, the Octopus AI assistant will process your request and generate the OpenTofu code required. This is important because using OpenTofu makes your results much more deterministic.

Generating the OpenTofu.

Many engineers have expressed doubts when using generative AI, as the results tend to vary. Generating the OpenTofu required to fulfill your requests helps the assistant achieve greater accuracy, as there is only one way to create a project via the OpenTofu provider.

Additionally, if you were to do this manually, it would take anywhere from a couple of minutes upwards, depending on how many projects you intend to create and whether you need to modify an existing OpenTofu module.

Your safety net: Version-controlled projects

So far, the Octopus AI assistant has created your project. The obvious question is: what if something is wrong?

Thankfully, Octopus Deploy lets you back any project with a Git repository. When you enable this, every change to that project, whether you made it or the assistant did, becomes a Git commit.

To set this up, go to your project settings and connect a Git repository. Once connected, Octopus stores your project configuration in a .octopus folder in that repo. Deployment processes, variables, triggers, all of it lives in version-controlled files.

Now look at what happens after the AI assistant creates your project. Open the Git history for the repo. You’ll see the commits the assistant generated, and you can inspect exactly what it configured.

Visual image of the committed files.

Clicking into any commit gives you the full diff, and if something looks off, you have two options: fix it in the next commit or revert to a known-good state. This is the same workflow you already use for application code.

Visual image of the diffs.
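
If something does look wrong, backing out the assistant’s change is plain Git. Here is a minimal sketch; the commit hash is a placeholder you would copy from your own history:

# Revert the commit the AI Assistant created in the project's Git repository.
# <commit-sha> is a placeholder; copy it from the Git history shown above.
git revert <commit-sha>

# Push the revert so Octopus picks up the restored configuration.
git push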

This matters for a few reasons: changes are auditable, you can use the assistant with more confidence, and relying on a config-as-code approach means there is only one way to do things.

Tips for getting the most out of AI-Assisted setup

At this stage, if you’re looking to try out an AI-Assisted workflow, here are some quick tips on how to get better returns on your new workflow:

  • Write short, specific prompts: The more detail you pack into a single prompt, the longer the assistant takes to process it. Keep prompts scoped to a single task.
  • Include platform and environment details: The assistant produces better results when you tell it what you’re deploying to. Specify the target (Kubernetes, Azure App Service, AWS ECS), name your environments, and indicate your preferred deployment strategy (rolling, blue/green, canary) if applicable.
  • Plan to customize after generation: The assistant handles straightforward project scaffolding well. Complex variable scoping, multi-tenant configurations, and custom scripts are better handled manually after the initial setup. Use the AI to get the structure in place, then fine-tune. Git tracks everything either way.

Beyond project creation, the AI Assistant can also help with troubleshooting failed deployments and surfacing best-practice recommendations.

Build Fast, Ship Safe

In this post, we used the Octopus AI Assistant to create a project from a prompt and then backed it with Git via Config as Code.

While the conversation around AI in DevOps evolves, having the ability to use it on your own terms is what makes the Octopus AI assistant a good choice for teams looking to dip their toes in the water.

If you want to try this yourself, install the Octopus AI Assistant Chrome extension and give it a shot.

Happy deployments!


Smooth Scrolling Performance for Image‑Heavy PDFs: Proven Rendering Techniques


TL;DR: Image‑heavy PDFs often suffer from scroll lag due to large page images, high memory consumption, and inefficient page rendering in browser‑based viewers. The article examines how rendering scope, image processing behavior, and memory handling impact scrolling performance in large PDF documents.

Your application loads a 200‑page scanned product catalog. The user scrolls for a few seconds, and suddenly frames drop, pages go blank, memory spikes, and the browser tab crashes.

This is not a hardware issue. It is a rendering architecture failure.

In 2026, users expect desktop‑grade PDF performance in the browser. Scroll lag is not a minor inconvenience; it breaks trust. For applications where document viewing is a core workflow, every freeze and delay pushes users away.

Most web PDF viewers were designed for lightweight, text‑based documents. Feed them a 300‑page scanned catalog, medical record, or engineering drawing, and the cracks appear quickly: unresponsive scrolling, blank pages, runaway memory usage, and crashed tabs.

Syncfusion® JavaScript PDF Viewer is built to handle exactly these large, image-heavy PDFs that demand consistent, smooth scrolling performance in the browser.

This article explains:

  • Why scrolling fails on image‑heavy PDFs
  • What production‑ready smooth scrolling actually requires
  • How modern PDF rendering techniques solve the problem at scale

Why scrolling fails on image-heavy PDFs

Image‑rich PDFs like product catalogs, scanned contracts, medical records, and engineering drawings are fundamentally different from text documents.

Each page contains high‑resolution images that must be:

  • Decoded
  • Allocated in memory
  • Processed by the GPU
  • Rendered before anything appears on screen

Most PDF viewers collapse under this load because they share three architectural flaws:

  • Eager rendering: Pages are rendered all at once, including pages the user may never reach.
  • Lack of viewport awareness: Page 1 and page 180 compete for resources simultaneously.
  • Uncontrolled render queues: Fast scrolling floods the render pipeline, stalling the UI.

The result is predictable: memory exhaustion, dropped frames, and unstable scrolling.

The fix is not file compression or faster hardware. It requires a rendering architecture built for large, image‑heavy documents.

What smooth scrolling really requires

Smooth scrolling is not a single optimization; it is the result of multiple systems working together consistently.

Before evaluating any PDF viewer, developers need a clear definition of what production-ready scroll performance means. Four dimensions determine whether a viewer holds up under real document loads: Render speed, Scroll smoothness, Memory stability, and Responsiveness.

Meeting all four requires specific architectural decisions:

  • On-demand rendering
    Renders visible pages immediately while the rest loads in the background.
  • Tile-based rendering
    Large image pages must be divided into smaller, independent tiles that load progressively as they enter the viewport.
  • Virtual scrolling
    Keeps only current viewport pages in memory and immediately releases memory for pages the user has scrolled past.
  • Pre-fetch and scroll debouncing
    A viewer pre-fetches pages just ahead of the scroll position so content is ready before the user reaches it. Scroll debouncing rate-limits render requests during fast scrolling, preventing the render queue from flooding.

If a viewer cannot satisfy all four, it is not ready for image-heavy documents at production scale.

How Syncfusion JavaScript PDF Viewer is built for smooth scrolling

Syncfusion JavaScript PDF Viewer is a purpose-built, enterprise-grade component designed specifically for large, complex document rendering in the browser, with smooth scrolling performance as a core architectural requirement.

Rendering engine foundation: PDFium via WebAssembly

At the core of modern high‑performance PDF viewing is PDFium, Google’s native PDF rendering engine, compiled to WebAssembly and executed directly in the browser.

This approach delivers:

  • Near‑native rendering speed
  • Accurate image and font rendering
  • Predictable memory allocation per page
  • Consistent behavior across all major browsers

Because each page’s memory lifecycle is tightly controlled, memory is allocated only when needed and released immediately when pages leave the viewport, preventing cumulative memory growth on long documents.

Before we walk through each rendering technique, explore the full engine capability in the JavaScript PDF Viewer documentation.

Viewport-only rendering and progressive loading

Two issues destroy scroll performance before the user even interacts:

  • Waiting for the entire PDF to load
  • Holding every page in memory at once

The Syncfusion PDF Viewer solves both with:

  • Progressive loading
    Visible pages render immediately while the rest of the document loads in the background.
  • Viewport‑only rendering
    Only pages currently visible on screen exist in memory. As the user scrolls, off‑screen pages are released instantly.

In practice, even a 500‑page scanned PDF typically keeps only 2–4 pages in memory at any moment.

Developers can further tune this behavior using APIs such as scrollSettings and initialRenderPages. Here’s how you can do it in code:

var pdfviewer = new ej.pdfviewer.PdfViewer({
    documentPath: 'https://cdn.syncfusion.com/content/pdf/pdf-succinctly.pdf',
    resourceUrl: 'https://cdn.syncfusion.com/ej2/32.2.3/dist/ej2-pdfviewer-lib',
    // Change the page request delay on scroll.
    scrollSettings: { delayPageRequestTimeOnScroll: 150 },
    // Specifies the maximum number of pages that should be rendered on document loading.
    initialRenderPages: 4,
});

Outcome: A 500‑page image‑heavy document scrolls with the same perceived responsiveness as a 10‑page file, as demonstrated in the GIF below.

Smooth scrolling in JavaScript PDF Viewer

Want to see viewport-only rendering handle a 500-page image PDF live? Try the Syncfusion PDF Viewer Live demo.

Tile-based rendering for image pages

Viewport rendering controls which pages load. Tile‑based rendering controls how they render.

Instead of processing an entire page image at once, each page is divided into smaller tiles that render independently.

  • Without tile rendering: Users see a blank white page until rendering completes.
  • With tile rendering: Tiles appear progressively as they are ready.

This ensures:

  • Faster perceived load time
  • No full‑page white screens
  • Stable rendering for large, high‑DPI pages

Zoom optimization

Tile rendering also makes zooming efficient.

When users zoom:

  • Only tiles inside the zoomed viewport are rendered.
  • Cached tiles are reused during transitions.
  • Extreme zoom levels can be capped to avoid rendering instability.

The tileRenderingSettings, enableZoomOptimization, minZoom, and maxZoom APIs work together to keep zoom fast and controlled. Here’s how you can do it in code:

var pdfviewer = new ej.pdfviewer.PdfViewer({
    documentPath: 'https://cdn.syncfusion.com/content/pdf/pdf-succinctly.pdf',
    resourceUrl: 'https://cdn.syncfusion.com/ej2/32.2.3/dist/ej2-pdfviewer-lib',
    tileRenderingSettings: {
        enableTileRendering: true,
        x: 3,  // tile columns
        y: 3   // tile rows
    },
    enableZoomOptimization: true,
    minZoom: 10,   // minimum zoom percentage
    maxZoom: 400   // maximum zoom percentage
});

The result is smooth, responsive zooming even on dense image documents like engineering drawings and scanned blueprints.

Zoom optimization in JavaScript PDF Viewer

Feature module injection: Load only what your app needs

Every PDF viewer feature has a cost: bundle size, memory usage, and initialization time. Syncfusion’s Inject() pattern removes that overhead entirely.

Instead of loading everything by default, modern viewers allow feature‑level injection:

  • Only explicitly enabled modules are initialized
  • Unused features consume zero memory and add no startup overhead

For example, a read‑only viewer can exclude annotations, forms, and printing entirely while enabling only text selection and search.

ej.pdfviewer.PdfViewer.Inject(
    ej.pdfviewer.TextSelection,
    ej.pdfviewer.TextSearch,
    ej.pdfviewer.Magnification
    // Annotation, FormFields, Print excluded — not needed for read-only viewer
);

The outcome: A leaner, faster-initializing viewer that carries zero overhead for features the application does not use.

Want to see the full injectable module list? View the Module Injection documentation.

Real-world impact: Where these techniques matter most

Great rendering architecture is invisible to users; they just know the document feels fast and reliable. Here is how each technique translates into real-world document workflows.

Product catalog applications

Hundreds of image‑heavy pages, non‑linear scrolling, and frequent zooming demand:

  • Progressive page display
  • Stable memory usage
  • Fast jump navigation without intermediate page renders
  • Smooth touch-based zoom via the viewer API

Smooth document rendering with JavaScript PDF Viewer

Medical imaging and healthcare document workflows

Clinical workflows require:

  • Immediate partial visibility of large diagnostic images
  • Smooth zooming into specific regions
  • Stable interaction during search and review

Legal and compliance document review

Reviewers search, scroll, and navigate simultaneously.

  • Async operations prevent UI freezes
  • Viewport rendering keeps the large document review stable
  • Feature injection simplifies read‑only enforcement

Want to see Syncfusion PDF Viewer running on real enterprise document types? Try the live demo with your own document type.

Frequently Asked Questions

How can I load a 100-page PDF starting at page 60 on initial render?

You can call the goToPage(60) API of the Syncfusion PDF Viewer at the documentLoad event to initially render the PDF at page number 60. You can also navigate by entering 60 at the page number toolbar after the document is loaded into the viewer.
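
Here is a minimal sketch of that pattern. It assumes the Navigation module is injected and exposes goToPage as described above; verify the exact API against the documentation for your version:

// Sketch: open the document and jump to page 60 once it finishes loading.
// Assumes the Navigation module exposes goToPage as described in the answer above.
ej.pdfviewer.PdfViewer.Inject(ej.pdfviewer.Navigation);

var pdfviewer = new ej.pdfviewer.PdfViewer({
    documentPath: 'https://cdn.syncfusion.com/content/pdf/pdf-succinctly.pdf',
    resourceUrl: 'https://cdn.syncfusion.com/ej2/32.2.3/dist/ej2-pdfviewer-lib',
    documentLoad: function () {
        pdfviewer.navigation.goToPage(60); // navigate after the document is ready
    }
});
pdfviewer.appendTo('#PdfViewer');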

Does rendering performance differ across Chrome, Firefox, Edge, and Safari?

No. Because Syncfusion’s PDF Viewer uses PDFium compiled to WebAssembly, rendering executes at the engine level, independent of each browser’s native PDF handling. Image rendering quality, font accuracy, and scroll performance are consistent across browsers.

How do I support simultaneous search and scroll without freezing the reviewer's session in a Legal document?

Use the async findText() method for all text search operations that run independently of the rendering pipeline. Reviewers can initiate a full-document search across a multiple-page scanned document while continuing to scroll, and the UI remains fully interactive throughout.
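
As a rough sketch of that pattern, something like the following should work; the textSearchModule property and the findText signature are assumptions here, so check the text search API documentation for your version:

// Sketch: kick off a full-document search without blocking scrolling.
// The module name and findText signature below are assumptions based on the answer above.
ej.pdfviewer.PdfViewer.Inject(ej.pdfviewer.TextSearch);

async function searchWhileReviewing(pdfviewer, term) {
    // The search runs independently of the rendering pipeline, so the reviewer can keep scrolling.
    var matches = await pdfviewer.textSearchModule.findText(term); // assumed async search API
    console.log('Matches found for "' + term + '":', matches);
}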

Can I load PDFs from storage services like AWS S3, Azure Blob, or cloud storage?

Yes. You can configure your application to open PDFs from various storage options, including AWS S3, Azure Blob, and DropBox Cloud storage, using the documentPath property.
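
As a quick sketch, the viewer just needs a URL it can reach; the bucket URL below is a hypothetical example, and in practice it would need to be publicly readable or pre-signed:

// Sketch: point documentPath at a PDF served from cloud storage.
// The URL is a hypothetical placeholder, not a real bucket.
var pdfviewer = new ej.pdfviewer.PdfViewer({
    documentPath: 'https://your-bucket.s3.amazonaws.com/contracts/sample.pdf',
    resourceUrl: 'https://cdn.syncfusion.com/ej2/32.2.3/dist/ej2-pdfviewer-lib'
});
pdfviewer.appendTo('#PdfViewer');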

Easily build real-time apps with Syncfusion’s high-performance, lightweight, modular, and responsive JavaScript UI components.

Conclusion

Thank you for reading! Scroll lag in image‑heavy PDFs is not a browser limitation or a file‑size issue. It is a rendering architecture problem, and it is entirely solvable.

Syncfusion JavaScript PDF Viewer achieves smooth scrolling by combining:

  • Viewport‑only rendering
  • Progressive loading
  • Tile‑based page display
  • Zoom optimization
  • Controlled feature initialization

The result is faster documents, lower memory usage, higher user trust, and fewer support issues across every document‑heavy workflow.

If your application depends on large PDFs, smooth scrolling is not optional. It is the foundation of the entire document experience.

If you’re a Syncfusion user, you can download the setup from the license and downloads page. Otherwise, you can download a free 30-day trial.

You can also contact us through our support forum, support portal, or feedback portal for queries. We are always happy to assist you!


New features in Git 2.54: easier rebasing, hooks, and statistics


In this post I describe some of the nice new features released in Git 2.54, including easier simple rebases, hooks defined in config, and some stats about your git repo. I learned about these from other posts, and these are the things that caught my eye.

Easier simple rebases with git history

I'm a big fan of interactive rebasing with git rebase -i, particularly when using a tool like Rider which makes working out exactly what you need to do that much easier:

Performing an interactive rebase with Rider

But the reality is that rebase is often daunting to people. You can mess it up, and if you end up with merge conflicts on the way, things can easily get very confusing. And sometimes, you don't really need all the power of a full rebase.

I've written a lot about rebasing in the past, including stacked branches, git absorb and --update-refs. If you don't know about these tools, I highly recommend checking them out!

If you don't need to do anything fancy with git rebase then the new git history command might be for you. In Git 2.54, git history supports two commands:

  • git history reword <commit> lets you change the commit message for a specific commit.
  • git history split <commit> lets you split a specific commit in two.

Those are obviously a tiny subset of things that you can do with an interactive rebase, but they're also things that you might want to do relatively often. The other nice thing is that you can run these without having to check out the branch they're associated with first.

Rewording commits with git history reword

For example, imagine you have this small set of branches, where we currently have master checked out, and we're working on that, but there's a separate branch, issue-83:

A master branch, and a sub-branch issue-83

That wip commit at the base of the issue-83 branch doesn't have a good commit message; it should describe what the change does, and it was probably meant to be tidied up later. Previously, this is the flow we'd need to take:

git checkout issue-83        # checkout the branch
git rebase -i origin/master  # start an interactive rebase

This would open up an editor, and we'd need to find the commit we want and change the action from pick to reword:

reword 055db13 # wip
pick f44696a # Fix the underlying cause # empty

# Rebase 69d0f46..f44696a onto 69d0f46 (2 commands)
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# ...

After closing the editor, Git would start rebasing, and then pause and open our editor so we could reword the commit. That obviously works, and if you have the branch already checked out it's not a big deal, especially if you're using an IDE like Rider that makes it even easier.

However, with git history reword you can do the same thing in one shot, from anywhere, without having to checkout the branch first. You simply pass in the commit that you want to change, and Git opens the editor, waits for the update, and rewrites the commit message and fixes the rest of the history:

git history reword 055db134ac326766b1566a64cd81873c69b1dc58   # this is the only step

The whole operation is very fast as well, because Git isn't walking through the branch and updating the working directory as it replays commits; it's just rewriting the commit and then fixing up the descendant hashes.

The wip commit has been rewritten

Splitting a commit with git history split

In general, I like to create self-contained commits in my branches, and to sequence them in such a way that it makes it easier for reviewers that review commit-by-commit. Sometimes, however, I accidentally make a commit which, in hindsight, is too big, and which I want to split. Another scenario is where I accidentally include a file in a commit which was meant to be in a different commit.

Typically, I would handle this by using git rebase -i to start an interactive rebase, pause on the problematic commit, do a git reset HEAD~ to "erase" the commit from history (while keeping its changes in the working directory), and then make my two separate commits, before continuing with git rebase --continue. This is a workflow that I'm very comfortable with, but I'm sure many people wouldn't be.
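
For anyone who hasn't used that flow, here's roughly what those steps look like (a sketch of the workflow described above):

git rebase -i origin/master   # mark the oversized commit with 'edit'
git reset HEAD~               # drop the commit, keeping its changes in the working directory
git add -p                    # stage just the hunks for the first, smaller commit
git commit -m "First part"
git add .                     # stage the rest
git commit -m "Second part"
git rebase --continue         # replay the remaining commits on top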

git history split essentially does the same thing in one hit, though just like git history reword, it doesn't require you to have the branch you're editing checked out. Instead, you use the built-in hunk selector to choose which parts of the commit should be pulled out into a "parent" commit.

This feature is apparently inspired by Jujutsu's jj split command. jj is a tool I keep feeling like I should look at, but with my git muscle-memory as it is, I just don't have much of an incentive. But if you don't like git, maybe take a look!

For example, let's imagine the commit we just reworded also needs to be split. We initiate the split using:

git history split 1153957368717fbe4dd19866315fbf53b17a0993

Git immediately starts showing you diffs, and you need to decide whether they should be included in the parent commit, or kept in the existing commit:

diff --git a/src/NetEscapades.Configuration.Yaml/NetEscapades.Configuration.Yaml.csproj b/src/NetEscapades.Configuration.Yaml/NetEscapades.Configuration.Yaml.csproj
index 4a249b0..953865b 100644
--- a/src/NetEscapades.Configuration.Yaml/NetEscapades.Configuration.Yaml.csproj
+++ b/src/NetEscapades.Configuration.Yaml/NetEscapades.Configuration.Yaml.csproj
@@ -21,7 +21,7 @@

   <ItemGroup>
     <PackageReference Include="Microsoft.SourceLink.GitHub" Version="1.0.0-*" PrivateAssets="all" />
-    <PackageReference Include="YamlDotNet" Version="13.0.1" />
+    <PackageReference Include="YamlDotNet" Version="16.3.0" />
     <PackageReference Include="Microsoft.Extensions.Configuration" Version="2.0.0" />
     <PackageReference Include="Microsoft.Extensions.Configuration.FileExtensions" Version="2.0.0" />
   </ItemGroup>
(1/1) Stage this hunk [y,n,q,a,d,?]?

At the end, you can see the question (1/1) Stage this hunk [y,n,q,a,d,?]? This is showing you the valid options. If you type ? and press Enter, you can see what each of the options does:

y - stage this hunk
n - do not stage this hunk
q - quit; do not stage this hunk or any of the remaining ones
a - stage this hunk and all later hunks in the file
d - do not stage this hunk or any of the later hunks in the file
? - print help

If you stage the hunk, then it's added to the parent commit, otherwise it stays in the existing commit.

Once you've staged (or not) all of the hunks in the commit, Git opens your editor twice, once for each commit. The editor is pre-populated with the existing commit message in both cases, but you can change both of them. After the editor closes, the split is complete.

The commit has been split in two

Again, being able to do this without having to check out the branch makes this command both convenient and fast!

Limitations with git history

The main limitation with git history is that it can't be used on any segments of history that contain merge commits; it will just refuse if you try:

$ git history reword a626aa2b9296ed0530356de98fb94bbd78802f5b
error: replaying merge commits is not supported yet!

Also if you're not used to using the interactive hunk staging (e.g. using git add -p) then you might find working with git history split a little tricky. As much as I use the command line for many Git operations, I much prefer using a GUI whenever I need to partially stage files, and that's just not possible with git history split.

The other main limitation is that these are the only things you can do. For me, I don't know how often I would end up doing just these operations and not need to do anything else that would require a full git rebase. I can see myself occasionally using git history reword, but that's probably about it.

The other thing to be aware of is that the git history command is currently marked experimental, so it may well change in the future.

Setting up Git hooks in repository configuration

Git hooks are, as the name implies, hook points that let you run scripts automatically when Git performs certain actions. The most common hooks are probably "pre-commit" hooks, which run just as you create a commit, and "pre-push" hooks, which run just as you push to a remote.

These hooks are a great way to, for example, enforce that code is always run through a linter before it's committed. I've seen people add pre-push hooks that automatically run all the unit tests, to ensure you're never pushing broken code.

The main downside with hooks was that they were often a bit tricky to set up. In Git 2.54 you can now configure hooks using "normal" config instead! For example, let's say I want to ensure I run dotnet format just before I commit. I could add a pre-commit hook to do this by running the following:

git config set hook.formatter.event pre-commit
git config set hook.formatter.command "dotnet format"

This adds a section to the local git config for the repository that looks like the following:

[hook "formatter"]
	event = pre-commit
	command = dotnet format

The "formatter" name is arbitrary, but this config shows the event that triggers the hook and the command that will run. With this, any time you create a commit, the hook will kick in and run dotnet format.
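
The same pattern presumably works for other events too. For example, a pre-push hook that runs the tests (like the ones mentioned earlier) might look something like this:

git config set hook.tests.event pre-push
git config set hook.tests.command "dotnet test"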

For what it's worth, I don't tend to use hooks that much, mostly because I find the slowdown they add to be too disruptive to my flow. But I suspect it's something that will become increasingly common in the era of AI agents, where you can make sure that you're really enforcing the rules on your agents!

You can add multiple hooks for the same event using this approach, as well as using the "traditional" style. If you want to see all of the hooks that are going to run, you can use the git hook list <event> command to see them:

$ git hook list pre-commit
formatter

When I originally saw this feature, I thought that it implied that you could finally share config in the repository itself, but that's not the case. It's still not possible for anyone who clones the repo to have the hooks enabled by default, and it likely never will be, as this would by definition provide an easy way to get remote code execution on anyone who clones your repo!

Getting some git repository stats with git repo structure

The final tool I'm calling out in this post is the git repo structure command, which can give you some statistics about the size and layout of your Git repository. This seems like something which is not going to be an issue for many people, but if you're working on a high-velocity repository, then these details could be very important, as they affect how well your CI and repo hosting is going to perform!

Performance-wise, the size of your repository on disk and its inflated size are important factors, as are the number of commits and the directory structure. All those stats are available in the git repo structure command:

$ git repo structure
Counting objects: 245390, done.
| Repository structure      | Value      |
| ------------------------- | ---------- |
| * References              |            |
|   * Count                 |   1.63 k   |
|     * Branches            |      1     |
|     * Tags                |    241     |
|     * Remotes             |   1.39 k   |
|     * Others              |      0     |
|                           |            |
| * Reachable objects       |            |
|   * Count                 | 245.39 k   |
|     * Commits             |  15.21 k   |
|     * Trees               | 121.16 k   |
|     * Blobs               | 109.01 k   |
|     * Tags                |      3     |
|   * Inflated size         |   3.19 GiB |
|     * Commits             |  13.85 MiB |
|     * Trees               | 273.41 MiB |
|     * Blobs               |   2.91 GiB |
|     * Tags                |    491 B   |
|   * Disk size             | 406.63 MiB |
|     * Commits             |   8.52 MiB |
|     * Trees               |  14.77 MiB |
|     * Blobs               | 383.34 MiB |
|     * Tags                |    438 B   |
|                           |            |
| * Largest objects         |            |
|   * Commits               |            |
|     * Maximum size    [1] |  66.30 KiB |
|     * Maximum parents [2] |      2     |
|   * Trees                 |            |
|     * Maximum size    [3] | 238.24 KiB |
|     * Maximum entries [4] |   2.09 k   |
|   * Blobs                 |            |
|     * Maximum size    [5] |  86.33 MiB |
|   * Tags                  |            |
|     * Maximum size    [6] |    191 B   |

[1] 5dda78ddb94fa922091fa7ecea007d944b41af05
[2] 473cbd76eb66679abaabd17046b469e172bbe386
[3] f857ca54b07540dfb53b88be29e58d6c98686d39
[4] f857ca54b07540dfb53b88be29e58d6c98686d39
[5] cba4d794d0ea43222038ce1df62e63b2a88ef52c
[6] 62b75ef5f602aa209bc278bfea67b173c966a083

If, like me, these numbers don't really mean a lot to you, then this is just interesting numbers for the sake of it 😅 I'm sure they're really important if you work at GitHub or GitLab though 😀

There are many more small features in Git 2.54; these were just the ones that caught my eye, so go update!

Summary

In this post I described some of the new features released in Git 2.54. Most interesting to me was the introduction of git history for simplified rebasing. You can now reword commits using git history reword without having to check out the branch first or do a full rebase. Similarly, you can split a commit in two using git history split. Support was also added for config-based hooks and viewing statistics about your repository using git repo structure.


DiskANN Vector Index Improvements



Full Transcript

Erik Darling here, Darling Data. And I finally have something to be excited about in the vector area. It would figure that I just finished, you know, wrapping up my sort of YouTube expose into vector search in SQL Server 2025 and saying, man, like, if Microsoft doesn’t make some fixes here, like, I don’t know where this story’s going. But lo and behold, at, well, I guess, SQL Server, SQL Con, uh, this, uh, last week in Atlanta, uh, it was announced that a lot of the problems that I had with DiskANN indexes, uh, are, are gone now. So congratulations, a round of applause to, uh, everyone who worked on that. This is wonderful news because now Microsoft actually has a pretty good story around, uh, vector search in SQL Server that it just didn’t have before. So the, the, the two main things were, uh, one, uh, the, when you added a vector index, uh, to a table, the whole table became read only. That has been, that has been fixed now. That has been worked out. So you’re, you can write to your table. So people can both write to, to your, to your tables and like do normal stuff. And, and, and, and the vector index doesn’t stop that. Uh, so that, that is wonderful. That is fantastic news. Uh, this, this feature finally has a strong pair of legs under it. Uh, they’ve also done some other stuff, um, where, uh, along the way. Um, I think the other main thing in here, I haven’t had a chance to test any of this out. It’s rolling out pretty slowly to some of the, um, some of the Azure, uh, regions, but I have, I’m using my robot friends to probe them. I haven’t found one where, uh, this is available yet. So maybe it’s still a little too soon, but I just haven’t found it yet. Maybe, maybe I just missed it. I don’t know. You can never trust those robots. They are, they’re kind of lazy sometimes. They’re like, yeah, I checked all that. Sorry, nothing there. And you’re like, but I see it. And they’re like, oh, ah, sorry. I missed that one. But, uh, anyway, uh, some of the other cool stuff that they did.
Um, was speed up, uh, the, the creation of vector indexes. Uh, if you remember some of my videos where I showed you, uh, how slow it was and the insane amount of code that ran behind the scenes on that. Uh, apparently that’s all gone. I have, again, I have not yet tested it. So I don’t know what the improvement is or if that weird code still happens, but just runs faster now. We’re going to wait and see. But it seems like the way, um, it seems like fundamentally the way that, um, like the vector indexes get created now is just, totally, uh, different in storage engine and behind the scenes. And there’s not like 3000 lines of strange code with bizarre use hints running. So this is, this is a very, this is very good news for us here in vector land. Um, I guess there’s an important note about migrating existing indexes, but if you were crazy enough to use a preview feature and create indexes, Oh, I mean, I guess read the, read the warning there. Um, of course, as soon as I start recording this, it becomes the noisiest day in the world. I had a plane fly by, there’s ambulances going. I can’t win sometimes.
Uh, but the other thing that they did that I think was really cool is, um, let me get, scroll down to this part. Um, the query syntax and, uh, filtering bits. If ZoomIt will cooperate. I’m going to get me Mark Russinovich’s number. Uh, can file a complaint about zoom, about ZoomIt here. Uh, but it used to be that you use, like when you wrote a query, like, uh, the one on, well, I guess further, right. Uh, you, you, you had to ask for a much higher top end number, uh, sometimes because you didn’t know like how many things it would find. So if you wanted like the top 20, uh, from like the outer query, but you asked for the top end in the inner query, uh, you might not get as many back as you asked for in the inner query.
And so your outer top 20 would not be 20. So you had to sometime ask for like the top end 100 or 200 in order to make sure that you got 20 back. But all that has apparently been improved. Uh, the top syntax has apparently been extended. So top with approximate, that’s going to be fun to mess with. Uh, I can’t wait to get my hands on that one. See what, see what I can see again. I wonder if it’s only applicable with vector searches or if top with approximate is, uh, is, is usable in other, uh, non-vector index, uh, non-vector, non-vector searches, but we’ll, we’ll see.
Um, maybe, maybe that’s said in the post. I don’t know. I haven’t read all of it too closely. I just got so excited. But anyway, uh, if ZoomIt will unzoom now, now that I’m done with you. Thank you. Uh, apparently there’s also some cool optimizer stuff in here, um, where the optimizer will choose depending on, hello, ZoomIt. Uh, the optimizer will choose between when to do a vector search, when to do an exact search, uh, based on, I guess, some various factors here. So, uh, again, very good job, um, everyone who worked on this. This is very exciting stuff for those of us who, um, have an interest in vector search and SQL Server 20, well, I guess not just 2025.
So I suppose it’s all in Azure as well. Uh, not, not, not being a huge disappointing stink bomb. So, uh, this, this all looks great to me. It all sounds great to me. As soon as I get my hands on it and I get to start messing with it, I will, I will of course report back. And, um, what do you call it? What was the other thing? Uh, I don’t know. So, uh, I, I tried to ask about when this might make its way to us, uh, earthly denizens who you, who still use on-prem SQL, uh, what cumulative update it might land in, but not sure on that yet. Um, so, anyway, uh, exciting news. Very happy about this. Again, good job to all involved and, uh, I cannot wait to get my hands on it.
Alright. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

The post DiskANN Vector Index Improvements appeared first on Darling Data.


There Are Days When I Feel Like Giving Up on the Plan Cache and Query Store.


In theory, SQL Server performance monitoring is pretty simple:

  1. Review the server’s top wait types
  2. Find the queries causing those wait types
  3. Fix those queries, or improve the way the server reacts to them (indexes, settings, etc.)

But in practice, step 2 is awful because:

  • Apps send unparameterized strings to the database server
  • Entity Framework users build queries with FromSqlRaw or string.Format()
  • Entity Framework users write queries with .Contains, which builds an unparameterized IN list, even when they’re only looking for a single value (which got better in EF9)
  • People write sloppy dynamic SQL that just concatenates values directly into the query string
  • SaaS developers put each client in their own database, and plans aren’t reused across databases

So as a result, the plan cache and Query Store are damn near useless because every query that comes in is seen as a “new” query. I wrote about this back in 2018, and since then, the problem seems like it’s gotten continuously worse. I’ve been tracking client servers over the last couple of years, and these days, 1 in 3 servers I face has these issues. My lens might be distorted since maybe people who aren’t having this problem are all solving their performance issues with conventional tools like sp_BlitzCache and Query Store, but even if that’s true, there’s a problem bigger than human query troubleshooting.

Modern versions of SQL Server are increasingly reliant on properly parameterized queries. Tools like Automatic Plan Correction (aka Automatic Tuning), Adaptive Memory Grants, Adaptive Joins, CE Feedback, DOP Feedback, Parameter-Sensitive Plan Optimization, and Optional Parameter Plan Optimization all rely on proper parameterization so that they can adapt to the same query over time. If every query comes in wearing a disguise, these features just don’t work.

There’s a database-level switch that’s supposed to help: Forced Parameterization. Turn it on, and SQL Server and Azure SQL DB examine every incoming query, and if it isn’t fully parameterized, the literals are stripped out and replaced with variables. The problem is that it doesn’t work in a lot of situations:

I'M MAD AS HELL AND I'M NOT ... actually I'm going to keep taking it because I get paid to take it

  • Partially parameterized queries – if a query has any parameters, Microsoft assumes the whole thing is parameterized, which is especially problematic for EF’s .Contains
  • Literals in the SELECT list, like SELECT 1 AS ClientId, …  – which always blows me away when I see them, but strangely it’s a commonly used technique for reasons I will never understand
  • Literals in HAVING, GROUP BY, and ORDER BY
  • And more, as explained in this SQL Server 2008 R2 documentation page that has never been updated

Even when Forced Parameterization does work, turning it on suddenly causes parameter sniffing emergencies. Queries that used to get their own hand-crafted plans suddenly get reusable plans, and while that’s great for performance monitoring, it’s not so great for end user performance in some scenarios. Your application might have 1,000 queries, and 990 of them might be just fine with reusable plans – but those 10 others represent 10 different parameter sniffing emergency situations that are going to strike out of nowhere, and keep striking if you don’t fix the queries for good.
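
For reference, the switch itself is a single database-level command. A minimal sketch, with a placeholder database name:

-- Enable forced parameterization for one database (YourDatabase is a placeholder).
ALTER DATABASE [YourDatabase] SET PARAMETERIZATION FORCED;

-- The escape hatch back to the default behavior:
ALTER DATABASE [YourDatabase] SET PARAMETERIZATION SIMPLE;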

I don’t have any answers.

It just feels like there are two kinds of shops:

  • The ones who properly parameterize everything, and can leverage the plan cache and Query Store, but suffer from parameter sniffing emergencies. So they have good monitoring, but they need it, because they face performance emergencies from time to time.
  • The ones who don’t parameterize at least some of their code, so the plan cache & Query Store are largely useless at best, misleading at worst – but they don’t have parameter sniffing emergencies. So they have bad monitoring, but … they don’t care as much.

Some days, I look at that latter group and say, I get it. Not all the time! Most of the time, I want the cool features built into modern versions of SQL Server, Azure SQL DB, and Intelligent Query Processing. But some days… some days, I want to embed literals into all my queries.
