A Linux FUSE filesystem that encodes files as DNA sequences (FASTA) for DNA storage research.
POSIX tools (cp, cat, rm) work on a normal mount point. On flush, file data is compressed (Zstd), encrypted (AES-256-GCM), Reed-Solomon encoded, translated into homopolymer-safe DNA oligos, and written to RAM-backed storage (tmpfs). Wet-lab synthesis is not implemented yet; output is DNPX1 .fasta suitable for oligo ordering APIs.
- An open-source research prototype: an OS-level bridge between files and DNA-encoded storage
- DNPX1 container format with Reed-Solomon FEC (8+4 shards, ~40% parity)
- Inspired by Goldman et al. (Nature, 2013) homopolymer constraints
- Reproducible benchmarks, CI, mock synthesizer API, grant documentation
- Not a replacement for SSDs or cloud storage
- Not connected to physical DNA synthesizers in the current release
- Not validated by independent wet-lab round-trip experiments yet (playbook included)
graph TD
A[POSIX write] --> B[Zstd compress]
B --> C[AES-256-GCM encrypt]
C --> D[RS 8+4 FEC]
D --> E[DNA encode DNPX1]
E --> F[tmpfs pool A]
E --> G[tmpfs pool B]
- Compress — reduces oligo synthesis cost for repetitive data
- Encrypt — Argon2id-derived key, random salt in journal
- FEC — Reed-Solomon erasure coding per 256-byte stripe
- Encode — homopolymer-safe trinary state machine (A/C/G/T)
- Store — mirrored `.fasta` files on two local `tmpfs` mounts
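
To make the Encode step concrete, here is a minimal sketch of a homopolymer-safe ternary code in the style of Goldman et al.: each base-3 digit picks one of the three bases that differ from the previous base, so no base ever repeats. This is an illustration only, not the DNPX1 codec. The function names, the fixed 6-trits-per-byte mapping, and the A/C/G/T candidate order are assumptions made for the example; see docs/CODEC.md for the real format.

```rust
/// The four nucleotides, in the fixed order used by the rotation rule below
/// (the order is an assumption for this sketch).
const BASES: [u8; 4] = [b'A', b'C', b'G', b'T'];

/// Convert bytes to base-3 digits, 6 trits per byte (3^6 = 729 >= 256).
/// A fixed width is a simplification; real codecs may pack more tightly.
fn bytes_to_trits(data: &[u8]) -> Vec<u8> {
    let mut trits = Vec::with_capacity(data.len() * 6);
    for &byte in data {
        let mut v = byte as u16;
        let mut digits = [0u8; 6];
        // Fill from the least significant digit backwards.
        for d in digits.iter_mut().rev() {
            *d = (v % 3) as u8;
            v /= 3;
        }
        trits.extend_from_slice(&digits);
    }
    trits
}

/// Map trits (each 0, 1, or 2) to bases. Each trit indexes into the bases
/// that differ from the previous base, so two equal bases are never adjacent.
fn trits_to_dna(trits: &[u8]) -> String {
    let mut prev: Option<u8> = None;
    let mut out = String::with_capacity(trits.len());
    for &t in trits {
        let next = BASES
            .iter()
            .copied()
            .filter(|&b| Some(b) != prev)
            .nth(t as usize)
            .expect("trit must be 0, 1, or 2");
        out.push(next as char);
        prev = Some(next);
    }
    out
}

/// Invert `trits_to_dna`. Returns `None` on an invalid character or a
/// repeated base, which this code can never produce.
fn dna_to_trits(dna: &str) -> Option<Vec<u8>> {
    let mut prev: Option<u8> = None;
    let mut trits = Vec::with_capacity(dna.len());
    for b in dna.bytes() {
        let idx = BASES
            .iter()
            .copied()
            .filter(|&c| Some(c) != prev)
            .position(|c| c == b)?;
        if idx > 2 {
            return None; // only reachable for a leading 'T'
        }
        trits.push(idx as u8);
        prev = Some(b);
    }
    Some(trits)
}

/// Rebuild bytes from groups of 6 trits; `None` if the input is malformed.
fn trits_to_bytes(trits: &[u8]) -> Option<Vec<u8>> {
    if trits.len() % 6 != 0 {
        return None;
    }
    trits
        .chunks(6)
        .map(|chunk| {
            let v = chunk.iter().fold(0u16, |acc, &t| acc * 3 + t as u16);
            if v <= 255 { Some(v as u8) } else { None }
        })
        .collect()
}

/// Bytes -> DNA string with no homopolymer runs.
fn encode(data: &[u8]) -> String {
    trits_to_dna(&bytes_to_trits(data))
}

/// DNA string -> bytes, or `None` if the strand violates the code.
fn decode(dna: &str) -> Option<Vec<u8>> {
    trits_to_bytes(&dna_to_trits(dna)?)
}
```

With these definitions, `encode(b"A")` yields `ACTCAT`, and `decode` rejects any strand containing a repeated base such as `AAC`. The sketch costs 6 nucleotides per byte; it ignores GC balance, primers, and error correction, which the real pipeline handles in other stages.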
| Component | Description |
|---|---|
| FUSE VFS | create, read, write, unlink via standard shell tools |
| DNPX1 codec | RS(8+4) FEC + cross-stripe fountain + GC-aware DNA (100% recovery @ 2% dropout) |
| Pluggable codec | StorageCodec trait + ExternalCodec adapter (DNA-Aeon, DNA Fountain, DNA-RS, NOREC4DNA presets) |
| Codec leaderboard | neutral arena comparing all codecs on one channel (docs/LEADERBOARD.md) |
| Channel profiles | DT4DDS-inspired Illumina / photolitho / decay / nanopore models |
| PCR random access | retrieve one file from a multi-file pool by primer (pool.rs) |
| Sequencing sim | coverage consensus + indel-aware resync decode |
| Partial read | decode only stripes covering a byte range |
| Journal | manifest.json for file metadata across remounts |
| Encryption | AES-256-GCM; key zeroized on unmount |
| Benchmarks | make reproduce → CSV + break-even table |
| Mock API | HTTP synthesizer backend (crates/dna_api_mock) |
| Grant pack | docs/grants/, wet-lab playbook, preprint draft |
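
The partial-read row comes down to simple index arithmetic. The sketch below shows which stripes a byte range touches, assuming the 256-byte stripe size mentioned in the pipeline description and stripes numbered from 0. The names `STRIPE_BYTES`, `stripes_for_range`, and `offset_in_first_stripe` are hypothetical and are not taken from `dna_vfs_core`.

```rust
/// Stripe size in bytes. 256 matches the FEC description in this README;
/// treating it as a fixed constant is an assumption of this sketch.
const STRIPE_BYTES: u64 = 256;

/// Return the inclusive range `(first, last)` of stripe indices that cover
/// a read of `len` bytes at `offset` in a file of `file_len` bytes.
/// Reads are clamped to end-of-file; an empty or out-of-range read gives `None`.
fn stripes_for_range(offset: u64, len: u64, file_len: u64) -> Option<(u64, u64)> {
    if len == 0 || offset >= file_len {
        return None;
    }
    // Exclusive end of the read, clamped to EOF; saturating_add avoids overflow.
    let end = offset.saturating_add(len).min(file_len);
    Some((offset / STRIPE_BYTES, (end - 1) / STRIPE_BYTES))
}

/// Position of the first requested byte inside the first decoded stripe.
fn offset_in_first_stripe(offset: u64) -> usize {
    (offset % STRIPE_BYTES) as usize
}
```

For example, a 2-byte read at offset 255 spans stripes 0 and 1, while a 256-byte read at offset 256 needs only stripe 1. Only those stripes have to be sequenced and decoded, which is what makes the partial read cheaper than decoding the whole file.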
- Rust (`dna_vfs_core`)
- libfuse via `fuser`
- `reed-solomon-erasure` for FEC
- Linux or WSL2
sudo apt install fuse libfuse-dev pkg-config
# rustup: https://rustup.rs
git clone https://github.com/thesnmc/DNA-POSIX.git
cd DNA-POSIX
./launch_lab.sh

Enter a mount password when prompted. Files go under dna_vfs/bio_drive/.

make reproduce
cat benchmarks/results.csv
cat benchmarks/break_even.csv

Compare DNPX1 against reference codecs (Church 2012, Goldman 2013, 2-bit ceiling) on the same corpus and channel, or plug in an external encoder:

cargo run --no-default-features --bin dna_leaderboard -- --corpus benchmarks/corpus
cat benchmarks/leaderboard.csv

See docs/LEADERBOARD.md.

See docs/SPEC.md, docs/CODEC.md, docs/ARCHITECTURE.md.
cargo run --release -p dna_api_mock
curl http://127.0.0.1:8787/v1/health

dna_vfs_core/ Rust library, FUSE daemon, benchmark binary
crates/dna_api_mock/ Reference synthesizer HTTP API
benchmarks/ Reproducible benchmark script
docs/ Spec, codec, grants, preprint
wetlab/ Physical round-trip playbook
dna_vfs/ Mount point and cache directories (runtime)
| Project | Relationship |
|---|---|
| Goldman et al. 2013 | Encoding inspiration |
| OligoArchive (CIDR '19) | DB archival tier; wet-lab round-trip |
| LiqSD (FAST '25) | DNA block device; RS + FUSE stack |
| VibeDNA | Similar “DNA filesystem” concept |
What sets this project apart: an open, reproducible POSIX bridge, the DNPX1 spec, and an India/sovereign archival narrative.
See CONTRIBUTING.md.
See LICENSE.