Chromaforge is a Go CLI that reconstructs the AcoustID fingerprint SQLite database used by Chromakopia. It is designed for the one-time initial build on an Azure L16s_v3 VM with local NVMe scratch space and a managed disk for the finished database.
Incremental updates are handled by Chromakopia, not this repository.
chromaforge build
- Replays the AcoustID daily JSON update archive from https://data.acoustid.org/
- Builds a fresh SQLite database with the libSQL driver
- Uses a local cache directory beside --db unless --cache-dir is set
- Places SQLite temp files under the cache path by default; --temp-dir overrides it
- Prefetches upcoming AcoustID archive files in background download workers while replay/index work is running
- --download-workers controls that background download concurrency
- --gomaxprocs, --decode-workers, and --workers let you tune CPU/core usage explicitly
- --cache-size and --mmap-size tune replay/write memory, while --index-cache-size and --index-mmap-size tune the later index-build phase
- On first Ctrl+C, finishes the current day, saves resume progress beside --db, and exits cleanly; a second Ctrl+C aborts immediately
- Supports --soft-heap-limit to cap SQLite heap usage for the process
- Uses unsafe bulk-load mode with journaling disabled during replay/index builds, then finalizes the database back to WAL
- Defers the final acoustid unique index and idx_hash until bulk inserts complete
- Supports --skip-validate so build completion is not blocked on validation
- Optionally rsyncs the final .db to the configured output path
- Optionally triggers Azure VM self-deallocation
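
The two-stage Ctrl+C behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the code chromaforge ships: `interruptTracker`, the action constants, the placeholder day keys, and the exit code 130 are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
)

// interruptAction says what the build loop should do after a signal.
type interruptAction int

const (
	actionFinishDay interruptAction = iota + 1 // first Ctrl+C: finish the day, save progress
	actionAbort                                // second Ctrl+C: stop immediately
)

// interruptTracker counts received interrupts (hypothetical helper).
type interruptTracker struct{ seen int }

// Next records one interrupt and reports the action it should trigger.
func (t *interruptTracker) Next() interruptAction {
	t.seen++
	if t.seen == 1 {
		return actionFinishDay
	}
	return actionAbort
}

func main() {
	sigs := make(chan os.Signal, 2)
	signal.Notify(sigs, os.Interrupt)
	defer signal.Stop(sigs)

	stop := make(chan struct{})
	tracker := &interruptTracker{}
	go func() {
		for range sigs {
			if tracker.Next() == actionAbort {
				os.Exit(130) // assumed exit code for an interrupted process
			}
			close(stop) // ask the loop to stop once the current day is done
		}
	}()

	days := []string{"day-1", "day-2", "day-3"} // placeholder day keys
	for _, day := range days {
		fmt.Println("replaying", day) // stand-in for the real replay work
		select {
		case <-stop:
			fmt.Println("saving resume progress after", day)
			return
		default:
		}
	}
}
```

Checking the stop channel only between days is what lets the current day finish before progress is saved.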
chromaforge validate
- Verifies the final tables and indexes exist
- Performs sampled acoustid and hash spot checks without ORDER BY RANDOM()
- Skips PRAGMA quick_check by default for speed
- Supports --quick-check when you want the slower SQLite consistency pass
- Supports --full-integrity-check when you want the slowest full PRAGMA integrity_check
- Supports --count-rows when you want exact COUNT(*) scans instead of the fast default
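
One common way to sample rows without ORDER BY RANDOM() is to pick random ids between the smallest and largest rowid and probe each one with an indexed lookup. The sketch below shows that idea only; whether chromaforge samples this way is an assumption, and `sampleIDs` and `probeQuery` are hypothetical names.

```go
package main

import (
	"fmt"
	"math/rand"
)

// probeQuery finds the first row at or after a sampled id, so gaps in the
// rowid sequence still yield a row. Using rowid here is an assumption.
const probeQuery = `SELECT rowid FROM fingerprints WHERE rowid >= ? ORDER BY rowid LIMIT 1`

// sampleIDs returns n pseudo-random ids in [minID, maxID], seeded for
// repeatability. It returns nil when the range or n is empty.
func sampleIDs(seed, minID, maxID int64, n int) []int64 {
	if n <= 0 || maxID < minID {
		return nil
	}
	rng := rand.New(rand.NewSource(seed))
	span := maxID - minID + 1
	ids := make([]int64, n)
	for i := range ids {
		ids[i] = minID + rng.Int63n(span)
	}
	return ids
}

func main() {
	// In a real check, minID and maxID would come from MIN(rowid) and MAX(rowid).
	for _, id := range sampleIDs(1, 1, 1_000_000, 5) {
		fmt.Printf("probe id %d with: %s\n", id, probeQuery)
	}
}
```

Each probe is a primary-key seek, so the cost grows with the sample size instead of the table size.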
chromaforge backfill-metadata
- Replays archive metadata into an existing database without rebuilding sub_fingerprints
- Fills missing mb_id and duration values in place
- Uses --decode-workers to parallelize fingerprint JSON decode/filter work while keeping SQLite writes sequential
- Uses a separate resume file beside --db so interrupted backfills can continue later
- Leaves existing fingerprint hashes and indexes intact
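
The "parallel decode, sequential write" split can be sketched as a worker pool that fills an ordered result slice, followed by a single loop that would perform the writes. The `record` shape and its JSON field names are assumptions for illustration; the real archive layout may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// record is a hypothetical decoded archive line.
type record struct {
	AcoustID string `json:"acoustid"`
	MBID     string `json:"mb_id"`
	Duration int    `json:"duration"`
}

// decodeAll decodes lines on `workers` goroutines and returns results in
// input order. Slots stay nil for lines that fail to decode or that carry
// no metadata worth backfilling.
func decodeAll(lines []string, workers int) []*record {
	if workers < 1 {
		workers = 1
	}
	out := make([]*record, len(lines))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				var r record
				if err := json.Unmarshal([]byte(lines[i]), &r); err != nil {
					continue
				}
				if r.MBID == "" && r.Duration == 0 {
					continue // filter: nothing to backfill
				}
				out[i] = &r // each index is written by exactly one worker
			}
		}()
	}
	for i := range lines {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return out
}

func main() {
	lines := []string{
		`{"acoustid":"a","mb_id":"m1","duration":10}`,
		`{"acoustid":"b"}`,
	}
	for _, r := range decodeAll(lines, 4) {
		if r == nil {
			continue
		}
		// Single writer: stand-in for the sequential SQLite update.
		fmt.Printf("update %s: mb_id=%q duration=%d\n", r.AcoustID, r.MBID, r.Duration)
	}
}
```

Keeping one writer avoids contention on the SQLite write lock while the CPU-bound JSON work still scales with the worker count.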
chromaforge match
- Accepts a raw Chromaprint fingerprint with --fingerprint or --fingerprint-file
- Accepts fpcalc -raw output directly, including DURATION=...
- Uses the same sampled sub-fingerprint hashing the builder stored in SQLite
- Applies a small duration filter by default when query duration is known
- Returns the top local candidate matches ranked by aligned hash hits
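
For reference, fpcalc -raw prints a DURATION= line and a FINGERPRINT= line holding comma-separated integers. A parser for that input, which also accepts a bare comma-separated list, might look like the sketch below. `parseFpcalcRaw` is a hypothetical helper, and accepting signed values (wrapped to unsigned 32-bit) is an assumption made to tolerate both output styles.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// parseFpcalcRaw extracts the duration (seconds, 0 if absent) and the raw
// 32-bit fingerprint values from fpcalc -raw output or a bare list.
func parseFpcalcRaw(text string) (duration int, hashes []uint32, err error) {
	for _, line := range strings.Split(text, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		if v, ok := strings.CutPrefix(line, "DURATION="); ok {
			secs, perr := strconv.ParseFloat(v, 64)
			if perr != nil {
				return 0, nil, fmt.Errorf("bad duration %q: %w", v, perr)
			}
			duration = int(secs)
			continue
		}
		line = strings.TrimPrefix(line, "FINGERPRINT=")
		for _, field := range strings.Split(line, ",") {
			n, perr := strconv.ParseInt(strings.TrimSpace(field), 10, 64)
			if perr != nil {
				return 0, nil, fmt.Errorf("bad hash %q: %w", field, perr)
			}
			if n < math.MinInt32 || n > math.MaxUint32 {
				return 0, nil, fmt.Errorf("hash %d does not fit in 32 bits", n)
			}
			hashes = append(hashes, uint32(n)) // negative values wrap to unsigned
		}
	}
	if len(hashes) == 0 {
		return 0, nil, fmt.Errorf("no fingerprint values found")
	}
	return duration, hashes, nil
}

func main() {
	d, h, err := parseFpcalcRaw("DURATION=215\nFINGERPRINT=123,456,789,101112\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("duration=%d hashes=%d\n", d, len(h))
}
```

When no DURATION= line is present the duration stays 0, which a caller could treat as "unknown" and skip the duration filter.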
chromaforge version
- Prints version metadata injected at build time
- Go 1.24+
- Network access to https://data.acoustid.org/
- CGO-enabled builds
rsync is only required when using --output.
go build ./cmd/chromaforge

Example:
chromaforge build \
--db /mnt/nvme/chromakopia.db \
--gomaxprocs 12 \
--download-workers 12 \
--temp-dir /mnt/nvme/.chromaforge-tmp \
--cache-size 4294967296 \
--mmap-size 4294967296 \
--index-cache-size 2147483648 \
--index-mmap-size 2147483648 \
--workers 16 \
--decode-workers 16 \
--batch-size 500 \
--skip-validate \
--soft-heap-limit 2147483648

Azure VM example with copy + self-deallocate:
chromaforge build \
--db /mnt/nvme/chromakopia.db \
--output /mnt/disk/chromakopia.db \
--gomaxprocs 12 \
--download-workers 12 \
--temp-dir /mnt/nvme/.chromaforge-tmp \
--cache-size 4294967296 \
--mmap-size 4294967296 \
--index-cache-size 2147483648 \
--index-mmap-size 2147483648 \
--workers 16 \
--decode-workers 16 \
--batch-size 500 \
--soft-heap-limit 2147483648 \
--self-deallocate

chromaforge validate --db /mnt/disk/chromakopia.db

Quick check example:
chromaforge validate \
--db /mnt/disk/chromakopia.db \
--quick-check

Full validation example:
chromaforge validate \
--db /mnt/disk/chromakopia.db \
--full-integrity-check \
--count-rows \
--timeout 0

chromaforge backfill-metadata \
--db /mnt/disk/chromakopia.db \
--gomaxprocs 32 \
--decode-workers 32 \
--download-workers 16

Raw fingerprint example:
chromaforge match \
--db /mnt/disk/chromakopia.db \
--fingerprint '123,456,789,101112'

fpcalc -raw example:
fpcalc -raw song.mp3 | chromaforge match \
--db /mnt/disk/chromakopia.db \
--fingerprint-file -

Disable duration filtering:
fpcalc -raw song.mp3 | chromaforge match \
--db /mnt/disk/chromakopia.db \
--fingerprint-file - \
--duration-window 0

Deploy only the build path from this repo:
- Create the resource group.
- Create the managed disk that will persist chromakopia.db.
- Create a user-assigned managed identity for the build VM.
- Grant that identity Virtual Machine Contributor scoped to the VM or an appropriate parent scope.
- Create the L16s_v3 VM.
- Attach the managed disk.
- Paste deploy/cloud-init.yaml into the VM Custom data field.
The build VM downloads the latest chromaforge binary, mounts the managed disk and local NVMe, runs the build, copies the resulting database with rsync, and then asks Azure to deallocate the VM.
The included Dockerfile provides a reproducible build image:
docker build -t chromaforge:latest .

- The final database contains only fingerprints and sub_fingerprints, plus idx_hash.
- Build-time replay state is held outside the final schema.
- track_meta-update files are ignored because title and artist are no longer stored in the database.
- Metadata backfill and duplicate-acoustid ingest only fill missing metadata fields; they do not overwrite existing non-empty values.
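
The "fill only missing fields" rule in the last note can be expressed as a small merge function. This is an illustrative sketch: the `meta` type is hypothetical, and treating an empty mb_id or a zero duration as "missing" is an assumption about how absent values are represented.

```go
package main

import "fmt"

// meta holds the two metadata fields the backfill touches (hypothetical type).
type meta struct {
	MBID     string
	Duration int
}

// fillMissing copies incoming values into fields that are still empty and
// leaves every existing non-empty value untouched.
func fillMissing(existing, incoming meta) meta {
	if existing.MBID == "" {
		existing.MBID = incoming.MBID
	}
	if existing.Duration == 0 {
		existing.Duration = incoming.Duration
	}
	return existing
}

func main() {
	stored := meta{MBID: "", Duration: 180}
	update := meta{MBID: "mbid-1", Duration: 200}
	fmt.Printf("%+v\n", fillMissing(stored, update))
}
```

Here the stored duration of 180 survives and only the empty mb_id is filled, so replaying the same archive twice leaves the row unchanged after the first pass.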
Apache License 2.0. See LICENSE.