
Static Allocation For Compilers


TigerBeetle famously uses “static allocation”. Infamously, the use of the term is idiosyncratic: what is meant is not static arrays, as found in embedded development, but rather a weaker “no allocation after startup” form. The amount of memory a TigerBeetle process uses is not hard-coded into the ELF binary. It depends on the runtime command line arguments. However, all allocation happens at startup, and there’s no deallocation. The long-lived event loop goes round and round happily without alloc.

I’ve wondered for years if a similar technique is applicable to compilers. It seemed impossible, but today I’ve managed to extract something actionable from this idea?

Static Allocation

Static allocation depends on the physics of the underlying problem. And distributed databases have surprisingly simple physics, at least in the case of TigerBeetle.

The only inputs and outputs of the system are messages. Each message is finite in size (1MiB). The actual data of the system is stored on disk and can be arbitrarily large. But the diff applied by a single message is finite. And, if your input is finite, and your output is finite, it’s actually quite hard to need to allocate extra memory!

This is worth emphasizing — it might seem like doing static allocation is tough and requires constant vigilance and manual accounting for resources. In practice, I learned that it is surprisingly compositional. As long as inputs and outputs of a system are finite, non-allocating processing is easy. And you can put two such systems together without much trouble. routing.zig is a good example of such an isolated subsystem.

The only issue here is that there isn’t a physical limit on how many messages can arrive at the same time. Obviously, you can’t process arbitrarily many messages simultaneously. But in the context of a distributed system over an unreliable network, a safe move is to drop a message on the floor if the required processing resources are not available.
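
To make that concrete, here is a minimal sketch (my own illustration, not TigerBeetle's code) of the “drop instead of allocate” pattern, with a hypothetical fixed-size pool of message buffers carved out at startup:

# Sketch: a fixed pool of message buffers, allocated once at startup.
# If no buffer is free when a message arrives, the message is dropped and
# the sender's retry/timeout machinery is left to recover.
class MessagePool:
    def __init__(self, count: int, size: int):
        self.free = [bytearray(size) for _ in range(count)]

    def acquire(self):
        return self.free.pop() if self.free else None  # None means "drop it"

    def release(self, buf: bytearray) -> None:
        self.free.append(buf)

def process(buf: bytearray, length: int) -> None:
    ...  # whatever the event loop does with a message

pool = MessagePool(count=64, size=1 << 20)  # 64 one-MiB buffers, fixed for the process lifetime

def on_message(raw: bytes) -> None:
    buf = pool.acquire()
    if buf is None or len(raw) > len(buf):
        return  # out of buffers (or oversized message): drop it on the floor
    try:
        buf[:len(raw)] = raw
        process(buf, len(raw))
    finally:
        pool.release(buf)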

Counter-intuitively, not allocating is simpler than allocating, provided that you can pull it off!

For Compilers

Alas, it seems impossible to pull it off for compilers. You could say something like “hey, the largest program will have at most one million functions”, but that will lead to both wasted memory and poor user experience. You could also use a single yolo arena of a fixed size, like I did in Hard Mode Rust, but that isn’t at all similar to “static allocation”. With arenas, the size is fixed explicitly, but you can OOM. With static allocation it is the opposite — no OOM, but you don’t know how much memory you’ll need until startup finishes!

The “problem size” for a compiler isn’t fixed — both the input (source code) and the output (executable) can be arbitrarily large. But that is also the case for TigerBeetle — the size of the database is not fixed, it’s just that TigerBeetle gets to cheat and store it on disk, rather than in RAM. And TigerBeetle doesn’t do “static allocation” on disk, it can fail with ENOSPC at runtime, and it includes a dynamic block allocator to avoid that for as long as possible by re-using no-longer-relevant sectors.

So what we could say is that a compiler consumes arbitrarily large input, and produces arbitrarily large output, but those “do not count” for the purpose of static memory allocation. At the start, we set aside an “output arena” for storing finished, immutable results of the compiler’s work. We then say that this output is accumulated after processing a sequence of chunks, where chunk size is strictly finite. While limiting the total size of the code-base is unreasonable, limiting a single file to, say, 4 MiB (runtime-overridable) is fine. Compiling then essentially becomes a “stream processing” problem, where both inputs and outputs are arbitrarily large, but the filter program itself must execute in O(1) memory.

With this setup, it is natural to use indexes rather than pointers for “output data”, which then makes it easy to persist it to disk between changes. And it’s also natural to think about “chunks of changes” not only spatially (compiler sees a new file), but also temporally (compiler sees a new version of an old file).
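
Here is a toy sketch of that shape (all names are made up for illustration, and this is not a real compiler): a bounded per-chunk scratch area, an append-only output arena, and indexes rather than pointers into it.

# Toy illustration of the "O(1) scratch, O(N) output" split; not a real compiler.
from dataclasses import dataclass, field
from typing import List

MAX_CHUNK_SIZE = 4 * 1024 * 1024  # per-file limit, imagined as runtime-overridable

@dataclass
class OutputArena:
    # Finished, immutable results accumulate here and are referred to by index,
    # which makes them straightforward to persist and reload between changes.
    functions: List[str] = field(default_factory=list)

    def push(self, item: str) -> int:
        self.functions.append(item)
        return len(self.functions) - 1  # an index, not a pointer

def compile_chunk(source: bytes, arena: OutputArena) -> List[int]:
    # Intermediate state for one chunk is bounded by MAX_CHUNK_SIZE.
    if len(source) > MAX_CHUNK_SIZE:
        raise ValueError("chunk exceeds the per-file limit")
    scratch = source.decode().splitlines()  # O(chunk) scratch, gone when we return
    return [arena.push(line) for line in scratch if line.startswith("fn ")]

arena = OutputArena()
ids = compile_chunk(b"fn main\nfn helper\nconst X = 1\n", arena)
print(ids, [arena.functions[i] for i in ids])  # [0, 1] ['fn main', 'fn helper']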

Are there any practical benefits here? I don’t know! But it seems worth playing around with! I feel that a strict separation between O(N) compiler output and O(1) intermediate processing artifacts can clarify a compiler’s architecture, and I won’t be too surprised if O(1) processing in compilers leads to simpler code the same way it does for databases.


LEDs, Tabs, and Open Source Fun!

From: Fritz's Tech Tips and Chatter

Made with Restream. Livestream on 30+ platforms at once via https://restream.io

Let's recap the holiday, look at some of your code and get a few updates out the door for TAML and CodeMedic


What Does a Database for SSDs Look Like?


Maybe not what you think.

Over on X, Ben Dicken asked:

What does a relational database designed specifically for local SSDs look like? Postgres, MySQL, SQLite and many others were invented in the 90s and 00s, the era of spinning disks. A local NVMe SSD has ~1000x improvement in both throughput and latency. Design decisions like write-ahead logs, large page sizes, and buffering table writes in bulk were built around disks where I/O was SLOW, and where sequential I/O was order(s)-of-magnitude faster than random. If we had to throw these databases away and begin from scratch in 2025, what would change and what would remain?

How might we tackle this question quantitatively for the modern transaction-oriented database?

But first, the bigger picture. It’s not only SSDs that have come along since databases like Postgres were first designed. We also have the cloud, with deployments to excellent datacenter infrastructure, including multiple independent datacenters with great network connectivity between them, available to all. Datacenter networks offer 1000x (or more) increased throughput, along with latency in the microseconds. Servers with hundreds of cores and thousands of gigabytes of RAM are mainstream.

Applications have changed too. Companies are global, businesses are 24/7. Down time is expensive, and that expense can be measured. The security and compliance environment is much more demanding. Builders want to deploy in seconds, not days.

Approach One: The Five Minute Rule

Perhaps my single favorite systems paper, The 5 Minute Rule… by Jim Gray and Franco Putzolu, gives us a very simple way to answer one of the most important questions in systems: how big should caches be? The five minute rule is that, back in 1986, if you expected to read a page again within five minutes you should keep it in RAM. If not, you should keep it on disk. The basic logic is that you look at the page that’s least likely to be re-used. If it’s cheaper to keep it around until its next expected re-use, then you should keep more. If it’s cheaper to reload it from storage than to keep it around, then you should keep less [1]. Let’s update the numbers for 2025, assuming that pages are around 32kB [2] (this becomes important later).

The EC2 i8g.48xlarge delivers about 1.8 million read iops of this size, at a price of around $0.004576 per second, or \(10^{-9}\) dollars per transfer (assuming we’re allocating about 40% of the instance price to storage). About one dollar per billion reads. It also has enough RAM for about 50 million pages of this size, costing around \(3 \times 10^{-11}\) dollars to store a page for one second.
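
As a back-of-the-envelope check (a sketch using the figures above; the split of the instance price between storage and RAM is the rough allocation assumed in this post, not a published price):

# Keep a page cached if its expected re-read interval is shorter than
# (cost of one re-read from SSD) / (cost of holding the page in RAM for one second).
instance_price_per_s = 0.004576                          # i8g.48xlarge, dollars per second
read_iops = 1.8e6                                        # ~32kB reads per second
cost_per_read = 0.40 * instance_price_per_s / read_iops  # ~1e-9 dollars per transfer
cost_per_page_second = 3e-11                             # dollars to hold one 32kB page for 1s

print(round(cost_per_read / cost_per_page_second))       # ~34 -> cache for roughly 30 seconds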

So, on this instance type, we should size our RAM cache to store pages for about 30 seconds. Not too different from Gray and Putzolu’s result 40 years ago!

That’s answer number one: the database should have a cache sized so that the hot set contains pages expected to be accessed in the next 30 seconds, for optimal cost. For optimal latency, however, the cache may want to be considerably bigger.

Approach Two: The Throughput/IOPS Breakeven Point

The next question is what size accesses we want to send to our storage devices to take best advantage of their performance. In the days of spinning media, the answer to this was surprisingly big: a 100MB/s disk could generally do around 100 seeks a second, so if your transfers were less than around 1MB you were walking away from throughput. Give or take a factor of 2. What does it look like for modern SSDs?

SSDs are much faster on both throughput and iops. They’re less sensitive than spinning drives to workload patterns, but read/write ratios and the fullness of the drives still matter. Absent benchmarking on the actual hardware with the real workload, my rule of thumb is that SSDs are throughput limited for transfers bigger than 32kB, and iops limited for transfers smaller than 32kB.

Making transfers bigger than 32kB doesn’t help throughput much, reduces IOPS, and probably makes the cache less effective because of false sharing and related effects. This is especially important for workloads with poor spatial locality.

So that’s answer number two: we want our transfers to disk not to be much smaller than 32kB on average, or we’re walking away from throughput.
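
The break-even size is simply sequential throughput divided by IOPS. A tiny sketch (the specific drive numbers here are illustrative assumptions, not measurements from this post):

# Transfers much smaller than (throughput / iops) leave throughput on the table;
# transfers much larger than that leave IOPS on the table.
def breakeven_transfer_bytes(throughput_bytes_per_s: float, iops: float) -> float:
    return throughput_bytes_per_s / iops

print(breakeven_transfer_bytes(100e6, 100) / 1024)    # 1990s spinning disk: ~977 KiB, i.e. ~1MB
print(breakeven_transfer_bytes(7e9, 220_000) / 1024)  # assumed NVMe SSD: ~31 KiB, i.e. ~32kB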

Approach Three: Durability and Replication

Building reads on local SSDs is great: tons of throughput, tons of iops. Writes on local SSDs, on the other hand, have the distinct problem of only being durable on the local box, which is unacceptable for most workloads. Modern hardware is very reliable, but thinking through the business risks of losing data on failover isn’t very fun at all, so let’s assume that our modern database is going to replicate off-box, making at least one more synchronous copy. Ideally in a different availability zone (AZ).

That i8g.48xlarge we were using for our comparison earlier has 100Gb/s (or around 12GB/s) of network bandwidth. That puts a cap on how much write throughput we can have for a single-leader database. Cross-AZ latency in EC2 varies from a couple hundred microseconds to a millisecond or two, which puts a minimum on our commit latency.

That gives us answer number three: we want to incur cross-AZ latency only at commit time, and not during writes.

Which is where we run into one of my favorite topics: isolation. The I in ACID. A modern database design will avoid read-time coordination using multiversioning, but to offer isolation stronger than READ COMMITTED it will need to coordinate either on each write or at commit time. It can do that the way, say, Aurora Postgres does: having a single leader at a time running in a single AZ. This means great latency for clients in that zone, and higher latency for clients in different AZs. Given that most applications are hosted in multiple AZs, this can add up for latency-sensitive applications which make a lot of round trips to the database. The alternative approach is the one Aurora DSQL takes, doing the cross-AZ round trip only at COMMIT time, saving round-trips.

Here’s me talking about the shape of that trade-off at re:Invent this year:

There’s no clear answer here, because there are real trade-offs between the two approaches. But do make sure to ask your database vendor whether those impressive latency benchmarks were run where your application actually runs. In the spirit of the original question, though, the incredible bandwidth and latency available in modern datacenter networks is as transformative as SSDs in database designs. Or should be.

While we’re incurring the latency cost of synchronous replication, we may as well get strongly consistent scale-out reads for free. In DSQL, we do this using high-quality hardware clocks that you can use too. Another nice win from modern hardware. There are other approaches too.
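
One generic way clocks buy you this (a simplified sketch of the well-known commit-wait/read-timestamp idea; not necessarily how Aurora DSQL implements it, and the error bound is an assumption):

# With a bounded clock error, a replica can serve a read at timestamp t once its
# own clock has certainly passed t, so no still-in-flight write can commit earlier.
import time

CLOCK_ERROR_BOUND_S = 0.001  # assumed bound provided by "high-quality hardware clocks"

def can_serve_consistent_read(read_ts: float, applied_up_to_ts: float) -> bool:
    clock_definitely_past = time.time() - CLOCK_ERROR_BOUND_S
    return applied_up_to_ts >= read_ts and clock_definitely_past >= read_ts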

That’s answer number four for me: The modern database uses high-quality clocks and knowledge of actual application architectures to optimize for real-world performance (like latency in multiple availability zones or regions) without compromising on strong consistency.

Approach Four: What about that WAL?

Design decisions like write-ahead logs, large page sizes, and buffering table writes in bulk were built around disks where I/O was SLOW, and where sequential I/O was order(s)-of-magnitude faster than random.

WALs, and related low-level logging details, are critical for database systems that care deeply about durability on a single system. But the modern database isn’t like that: it doesn’t depend on commit-to-disk on a single system for its durability story. Commit-to-disk on a single system is both unnecessary (because we can replicate across storage on multiple systems) and inadequate (because we don’t want to lose writes even if a single system fails).

That’s answer number five: the modern database commits transactions to a distributed log, which provides multi-machine multi-AZ durability, and might provide other services like atomicity. Recovery is a replay from the distributed log, on any one of a number of peer replicas.
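
As a deliberately simplified sketch of what “commit to a distributed log” means operationally (the replica count, quorum size, and endpoint names are assumptions for illustration, not a description of any particular engine):

# Commit means the log record is durable on a quorum of replicas in different AZs,
# not fsynced on one machine; recovery is replay from any surviving peer replica.
from concurrent.futures import ThreadPoolExecutor

REPLICAS = ["az1-log", "az2-log", "az3-log"]  # hypothetical log replica endpoints
QUORUM = 2

def append_to_replica(replica: str, record: bytes) -> bool:
    ...  # placeholder for an RPC that persists `record` on `replica` and acks
    return True

def commit(record: bytes) -> bool:
    with ThreadPoolExecutor(max_workers=len(REPLICAS)) as pool:
        acks = sum(pool.map(lambda r: append_to_replica(r, record), REPLICAS))
    return acks >= QUORUM  # acknowledge the transaction only once a quorum has it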

What About Data Structures?

B-Trees versus LSM-trees versus B-Tree variants versus LSM variants versus other data structures are trade-offs that have a lot to do with access patterns and workload patterns. Picking a winner would be a whole series of blog posts, so I’m going to chicken out and say it’s complicated.

Conclusion

If we had to throw these databases away and begin from scratch in 2025, what would change and what would remain?

I’d keep the relational model, atomicity, isolation (but would probably pick SNAPSHOT as a default), strong consistency, SQL, interactive transactions, and the other core design decisions of relational databases. But I’d move durability, read and write scale, and high availability into being distributed rather than single system concerns. I think that helps with performance and cost, while making these properties easier to achieve. I’d mostly toss out local durability and recovery, and all the huge history of optimizations and data structures around that [3], in favor of getting better properties in the distributed setting. I’d pay more attention to internal strong isolation (in the security sense) between clients and workloads. I’d size caches for a working set of between 30 seconds and 5 minutes of accesses. I’d optimize for read transfers around that 32kB sweet spot from local SSD, and the around 8kB sweet spot for networks.

Probably more stuff too, but this is long enough as-is.

Other topics worth covering include avoiding copies on IO, co-design with virtualization (e.g. see our Aurora Serverless paper), trade-offs of batching, how the relative performance of different isolation levels changes, what promises to give clients, encryption and authorization of data at rest and in motion, dealing with very hot single items, new workloads like vector, verifiable replication journals, handing off changes to analytics systems, access control, multi-tenancy, forking and merging, and even locales.

Footnotes

  1. The reasoning is slightly smarter, thinking about the marginal page and marginal cost of memory, but this simplification works for our purposes here. The marginal cost of memory is particularly interesting in a provisioned system, because it varies between zero (you’ve paid for it already) and huge (you need a bigger instance size). One of the really nice things about serverless (like DSQL) and dynamic scaling (like Aurora Serverless) is that it makes the marginal cost constant, greatly simplifying the task of reasoning about cache size.
  2. Yes, I know that pages are typically 4kB or 2MB, but bear with me here.
  3. Sorry ARIES.

Performance regressions in MySQL 8.4 and 9.x with sysbench


I have been claiming that I don't find significant performance regressions in MySQL 8.4 and 9.x when I use sysbench. I need to change that claim. There are regressions for write-heavy tests; they are larger for tests with more concurrency, and larger when gtid support is enabled.

By "gtid support is enabled" I mean that these options are set to ON:
  • gtid_mode
  • enforce_gtid_consistency

Both of these are ON by default in MySQL 9.5.0 and were OFF by default in earlier releases. I just learned about the performance impact from these, and in future tests I will probably repeat tests with them set to ON and OFF.

This blog post has results from the write-heavy tests with sysbench for MySQL 8.0, 8.4, 9.4 and 9.5 to explain my claims above.

tl;dr

  • Regressions are most likely and larger on the insert test
  • There are regressions for write-heavy workloads in MySQL 8.4 and 9.x
    • Throughput is typically 15% less in MySQL 9.5 than in 8.0 for tests with 16 clients on the 24-core/2-socket server
    • Throughput is typically 5% less in MySQL 9.5 than 8.0 for tests with 40 clients on the 48-core server
  • The regressions are larger when gtid_mode and enforce_gtid_consistency are set to ON
    • Throughput is typically 5% to 10% less with the -gtid configs vs the -nogtid configs with 40 clients on the 48-core server. But this is less of an issue on other servers.
    • There are significant increases in CPU, context switch rates and KB written to storage for the -gtid configs relative to the same MySQL version using the -nogtid configs
  • Regressions might be larger for the insert and update-inlist tests because they have larger transactions relative to other write-heavy tests. Performance regressions are correlated with increases in CPU, context switches and KB written to storage per transaction.
What changed?

I use diff to compare the output from SHOW GLOBAL VARIABLES when I build new releases, and from that it is obvious that the default values for gtid_mode and enforce_gtid_consistency changed in MySQL 9.5, but I didn't appreciate the impact of that change.

Builds, configuration and hardware

I compiled MySQL from source for versions 8.0.44, 8.4.6, 8.4.7, 9.4.0 and 9.5.0.

The versions that I tested are named:
  • 8.0.44-nogtid
    • MySQL 8.0.44 with gtid_mode and enforce_gtid_consistency =OFF
  • 8.0.44-gtid
    • MySQL 8.0.44 with gtid_mode and enforce_gtid_consistency =ON
  • 8.4.7-nogtid
    • MySQL 8.4.7 with gtid_mode and enforce_gtid_consistency =OFF
  • 8.4.7-gtid
    • MySQL 8.4.7 with gtid_mode and enforce_gtid_consistency =ON
  • 9.4.0-nogtid
    • MySQL 9.4.0 with gtid_mode and enforce_gtid_consistency =OFF
  • 9.4.0-gtid
    • MySQL 9.4.0 with gtid_mode and enforce_gtid_consistency =ON
  • 9.5.0-nogtid
    • MySQL 9.5.0 with gtid_mode and enforce_gtid_consistency =OFF
  • 9.5.0-gtid
    • MySQL 9.5.0 with gtid_mode and enforce_gtid_consistency =ON

The servers are:
  • 8-core
    • The server is an ASUS ExpertCenter PN53 with an AMD Ryzen 7 7735HS CPU, 8 cores, SMT disabled, 32G of RAM. Storage is one NVMe device for the database using ext-4 with discard enabled. The OS is Ubuntu 24.04.
    • my.cnf for the -nogtid configs are here for 8.0, 8.4, 9.4, 9.5
    • my.cnf for the -gtid configs are here for 8.0, 8.4, 9.4, 9.5
    • The benchmark is run with 1 thread, 1 table and 50M rows per table
  • 24-core
    • The server is a SuperMicro SuperWorkstation 7049A-T with 2 sockets, 12 cores/socket, 64G RAM, one m.2 SSD (2TB,  ext4 with discard enabled). The OS is Ubuntu 24.04. The CPUs are Intel Xeon Silver 4214R CPU @ 2.40GHz.
    • my.cnf for the -nogtid configs are here for 8.0, 8.4, 9.4, 9.5
    • my.cnf for the -gtid configs are here for 8.0, 8.4, 9.4, 9.5
    • The benchmark is run with 16 threads, 8 tables and 10M rows per table
  • 48-core
    • The server is ax162s from Hetzner with an AMD EPYC 9454P 48-Core Processor with SMT disabled and 128G of RAM. Storage is 2 Intel D7-P5520 NVMe devices with RAID 1 (3.8T each) using ext4. The OS is Ubuntu 22.04 running the non-HWE kernel (5.15.0-118-generic).
    • my.cnf for the -nogtid configs are here for 8.0, 8.4, 9.4, 9.5
    • my.cnf for the -gtid configs are here for 8.0, 8.4, 9.4, 9.5
    • The benchmark is run with 40 threads, 8 tables and 10M rows per table
Benchmark

I used sysbench and my usage is explained here. I now run 32 of the 42 microbenchmarks listed in that blog post. Most test only one type of SQL statement. Benchmarks are run with the database cached by InnoDB. While I ran all of the tests, I only share results from a subset of the write-heavy tests.

The read-heavy microbenchmarks are run for 600 seconds and the write-heavy for 900 seconds. 

The purpose is to search for regressions from new CPU overhead and mutex contention. The workload is cached -- there should be no read IO, but there will be some write IO.

Results

The microbenchmarks are split into 4 groups -- 1 for point queries, 2 for range queries, 1 for writes. Here I only share results from a subset of the write-heavy tests.

I provide charts below with relative QPS. The relative QPS is the following:
(QPS for some version) / (QPS for MySQL 8.0.44)
When the relative QPS is > 1 then some version is faster than MySQL 8.0.44.  When it is < 1 then there might be a regression. When the relative QPS is 1.2 then some version is about 20% faster than MySQL 8.0.44.
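
Or, as a trivial helper (the numbers here are hypothetical; the real ones are in the spreadsheet linked below):

def relative_qps(qps: float, baseline_qps: float) -> float:
    # relative QPS = QPS(version under test) / QPS(MySQL 8.0.44 baseline)
    return qps / baseline_qps

print(relative_qps(8500.0, 10000.0))  # 0.85 -> about 15% slower than 8.0.44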

Values from iostat and vmstat divided by QPS are here for the 8-core, 24-core and 48-core servers. These can help to explain why something is faster or slower because they show how much HW is used per request, including CPU overhead per operation (cpu/o) and context switches per operation (cs/o) which are often a proxy for mutex contention.

The spreadsheet and charts are here and in some cases are easier to read than the charts below. The y-axis doesn't start at 0 to improve readability.

Results: 8-core

Summary
  • For many tests there are small regressions from 8.0 to 8.4 and 8.4 to 9.x
  • There are small improvements (~5%) for the -gtid configs vs the -nogtid result for update-index
  • There is a small regression (~5%) for the -gtid configs vs the -nogtid result for insert
  • There are small regressions (~1%) for the -gtid configs vs the -nogtid result for other tests
From vmstat metrics for the insert test where perf decreases with the 9.5.0-gtid result
  • CPU per operation (cpu/o) increases by 1.10X with the -gtid config
  • Context switches per operation (cs/o) increases by 1.45X with the -gtid config
  • KB written to storage per commit (wKB/o) increases by 1.16X with the -gtid config
From vmstat metrics for the update-index test where perf increases with the 9.5.0-gtid result
  • CPU per operation (cpu/o) decreases by ~3% with the -gtid config
  • Context switches per operation (cs/o) decrease by ~2% with the -gtid config
  • KB written to storage per commit (wKB/o) decreases by ~3% with the -gtid config
  • This result is odd. I might try to reproduce it in the future
Results: 24-core

Summary
  • For many tests there are regressions from 8.0 to 8.4 and 8.4 to 9.x and throughput is typically 15% less in 9.5.0 than 8.0.44
  • There are large regressions in 9.4 and 9.5 for update-inlist
  • There is usually a small regression (~5%) for the -gtid configs vs the -nogtid result
From vmstat metrics for the insert test comparing 9.5.0-gtid with 9.5.0-nogtid
  • Throughput is 1.15X larger in 9.5.0-nogtid
  • CPU per operation (cpu/o) is 1.15X larger in 9.5.0-gtid
  • Context switches per operation (cs/o) are 1.23X larger in 9.5.0-gtid
  • KB written to storage per commit (wKB/o) is 1.24X larger in 9.5.0-gtid
From vmstat metrics for the update-inlist test comparing both 9.5.0-gtid and 9.5.0-nogtid with 8.0.44-nogtid
  • The problems here look different than most other tests as the regressions in 9.4 and 9.5 are similar for the -gtid and -nogtid configs. If I have time I will get flamegraphs and PMP output. The server here has two sockets and can suffer more from false-sharing and real contention on cache lines.
  • Throughput is 1.43X larger in 8.0.44-nogtid
  • CPU per operation (cpu/o) is 1.05X larger in 8.0.44-nogtid
  • Context switches per operation (cs/o) are 1.18X larger in 8.0.44-nogtid
  • KB written to storage per commit (wKB/o) is ~1.12X larger in 9.5.0
Results: 48-core

Summary
  • For many tests there are regressions from 8.0 to 8.4
  • For some tests there are regressions from 8.4 to 9.x
  • There is usually a large regression for the -gtid configs vs the -nogtid result and the worst case occurs on the insert test
From vmstat metrics for the insert test comparing 9.5.0-gtid with 9.5.0-nogtid
  • Throughput is 1.17X larger in 9.5.0-nogtid
  • CPU per operation (cpu/o) is 1.13X larger in 9.5.0-gtid
  • Context switches per operation (cs/o) are 1.26X larger in 9.5.0-gtid
  • KB written to storage per commit (wKB/o) is 1.24X larger in 9.5.0-gtid

Performance for RocksDB 9.8 through 10.10 on 8-core and 48-core servers


This post has results for RocksDB performance using db_bench on 8-core and 48-core servers. I previously shared results for RocksDB performance using gcc and clang, and then for RocksDB on a small Arm server.

tl;dr

  • RocksDB is boring: there are few performance regressions.
  • There was a regression in write-heavy workloads with RocksDB 10.6.2. See bug 13996 for details. That has been fixed.
  • I will repeat tests in a few weeks.

Software

I used RocksDB versions 9.8 through 10.10.

I compiled each version with clang version 18.3.1, with link-time optimization (LTO) enabled. The build command line was:

flags=( DISABLE_WARNING_AS_ERROR=1 DEBUG_LEVEL=0 V=1 VERBOSE=1 )

# for clang+LTO
AR=llvm-ar-18 RANLIB=llvm-ranlib-18 CC=clang CXX=clang++ \
    make "${flags[@]}" static_lib db_bench

Hardware

I used servers with 8 and 48 cores, both running Ubuntu 22.04:

  • 8-core
    • Ryzen 7 (AMD) CPU with 8 cores and 32G of RAM.
    • storage is one NVMe SSD with discard enabled and ext-4
    • benchmarks are run with 1 client, 20M KV pairs for byrx and 400M KV pairs for iobuf and iodir
  • 48-core
    • an ax162s from Hetzner with an AMD EPYC 9454P 48-Core Processor with SMT disabled, 128G of RAM
    • storage is 2 SSDs with RAID 1 (3.8T each) and ext-4.
    • benchmarks are run with 36 clients, 200M KV pairs for byrx and 2B KV pairs for iobuf and iodir

Benchmark

Overviews on how I use db_bench are here and here.

Most benchmark steps were run for 1800 seconds and all used the LRU block cache. I try to use Hyperclock on large servers but forgot that this time.

Tests were run for three workloads:

  • byrx - database cached by RocksDB
  • iobuf - database is larger than RAM and RocksDB used buffered IO
  • iodir - database is larger than RAM and RocksDB used O_DIRECT

The benchmark steps that I focus on are:
  • fillseq
    • load RocksDB in key order with 1 thread
  • revrangeww, fwdrangeww
    • do reverse or forward range queries with a rate-limited writer. Report performance for the range queries
  • readww
    • do point queries with a rate-limited writer. Report performance for the point queries.
  • overwrite
    • overwrite (via Put) random keys and wait for compaction to stop at test end

Relative QPS

Many of the tables below (inlined and via URL) show the relative QPS which is:
    (QPS for my version / QPS for RocksDB 9.8)

The base version varies and is listed below. When the relative QPS is > 1.0 then my version is faster than RocksDB 9.8. When it is < 1.0 then there might be a performance regression or there might just be noise.

The spreadsheet with numbers and charts is here. Performance summaries are here.

Results: cached database (byrx)

From 1 client on the 8-core server

  • Results are stable except for the overwrite test where there might be a regression, but I think that is noise: after repeating this test 2 more times, it looks like the base case (the result from 9.8) was an outlier. I will revisit this.

From 36 clients on the 48-core server

  • Results are stable

Results: IO-bound with buffered IO (iobuf)

From 1 client on the 8-core server

  • Results are stable except for the overwrite test where there might be a large improvement. But I wonder if this is from noise in the result for the base case from RocksDB 9.8, just as there might be noise in the cached (byrx) results.
  • The regression in fillseq with 10.6.2 is from bug 13996

From 36 clients on the 48-core server
  • Results are stable except for the overwrite test where there might be a large improvement. But I wonder if this is from noise in the result for the base case from RocksDB 9.8, just as there might be noise in the cached (byrx) results.
  • The regression in fillseq with 10.6.2 is from bug 13996
Results: IO-bound with O_DIRECT (iodir)

From 1 client on the 8-core server

  • Results are stable
  • The regression in fillseq with 10.6.2 is from bug 13996

From 36 clients on the 48-core server

  • Results are stable
  • The regression in fillseq with 10.6.2 is from bug 13996



TIL: Downloading archived Git repositories from archive.softwareheritage.org


Back in February I blogged about a neat Python library called sqlite-s3vfs for accessing SQLite databases hosted in an S3 bucket, released as MIT licensed open source by the UK government's Department for Business and Trade.

I went looking for it today and found that the github.com/uktrade/sqlite-s3vfs repository is now a 404.

Since this is taxpayer-funded open source software I saw it as my moral duty to try and restore access! It turns out a full copy had been captured by the Software Heritage archive, so I was able to restore the repository from there. My copy is now archived at simonw/sqlite-s3vfs.

The process for retrieving an archive was non-obvious, so I've written up a TIL and also published a new Software Heritage Repository Retriever tool which takes advantage of the CORS-enabled APIs provided by Software Heritage. Here's the Claude Code transcript from building that.

Via Hacker News comment

Tags: archives, git, github, open-source, tools, ai, til, generative-ai, llms, ai-assisted-programming, claude-code
