TIL: Downloading archived Git repositories from archive.softwareheritage.org
Back in February I blogged about a neat Python library calledsqlite-s3vfs for accessing SQLite databases hosted in an S3 bucket, released as MIT licensed open source by the UK government's Department for Business and Trade.
I went looking for it today and found that the github.com/uktrade/sqlite-s3vfs repository is now a 404.
Since this is taxpayer-funded open source software I saw it as my moral duty to try and restore access! It turns out a full copy had been captured by the Software Heritage archive, so I was able to restore the repository from there. My copy is now archived at simonw/sqlite-s3vfs.
The process for retrieving an archive was non-obvious, so I've written up a TIL and also published a new Software Heritage Repository Retriever tool which takes advantage of the CORS-enabled APIs provided by Software Heritage. Here's the Claude Code transcript from building that.
Tags: archives, git, github, open-source, tools, ai, til, generative-ai, llms, ai-assisted-programming, claude-code


