DNA-POSIX

A Linux FUSE filesystem that encodes files as DNA sequences (FASTA) for DNA storage research.

POSIX tools (cp, cat, rm) work on a normal mount point. On flush, file data is compressed (Zstd), encrypted (AES-256-GCM), Reed-Solomon encoded, translated into homopolymer-safe DNA oligos, and written to RAM-backed storage (tmpfs). Wet-lab synthesis is not implemented yet; output is DNPX1 .fasta suitable for oligo ordering APIs.

What this is

  • An open-source research prototype: an OS-level bridge between ordinary files and DNA-encoded storage
  • DNPX1 container format with Reed-Solomon FEC (8+4 shards: 4 of every 12 shards are parity, i.e. 50% overhead on the data shards)
  • Inspired by Goldman et al. (Nature, 2013) homopolymer constraints
  • Reproducible benchmarks, CI, mock synthesizer API, grant documentation

What this is not

  • Not a replacement for SSDs or cloud storage
  • Not connected to physical DNA synthesizers in the current release
  • Not validated by independent wet-lab round-trip experiments yet (playbook included)

Pipeline

graph TD
    A[POSIX write] --> B[Zstd compress]
    B --> C[AES-256-GCM encrypt]
    C --> D[RS 8+4 FEC]
    D --> E[DNA encode DNPX1]
    E --> F[tmpfs pool A]
    E --> G[tmpfs pool B]
  1. Compress — reduces oligo synthesis cost for repetitive data
  2. Encrypt — Argon2id-derived key, random salt in journal
  3. FEC — Reed-Solomon erasure coding per 256-byte stripe
  4. Encode — homopolymer-safe trinary state machine (A/C/G/T)
  5. Store — mirrored .fasta files on two local tmpfs mounts
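
Step 4 can be illustrated with a minimal sketch of a Goldman-style rotating ternary code, in which each base-3 digit selects one of the three nucleotides that differ from the previous one, so no base ever repeats. This is not the DNPX1 implementation: the function names, the lookup table, the six-trits-per-byte framing, and the virtual start base `T` are all assumptions made for the example, and the real mapping may differ.

```rust
/// The three nucleotides allowed after `prev`, in a fixed order.
/// Hypothetical table: any ordering that excludes `prev` would work.
fn choices(prev: u8) -> [u8; 3] {
    match prev {
        b'A' => [b'C', b'G', b'T'],
        b'C' => [b'G', b'T', b'A'],
        b'G' => [b'T', b'A', b'C'],
        _ => [b'A', b'C', b'G'], // prev is 'T' (also the virtual start base)
    }
}

/// Split a byte into six base-3 digits, most significant first.
/// Six trits are enough because 3^6 = 729 >= 256.
fn byte_to_trits(byte: u8) -> [u8; 6] {
    let mut trits = [0u8; 6];
    let mut v = byte as u16;
    for slot in trits.iter_mut().rev() {
        *slot = (v % 3) as u8;
        v /= 3;
    }
    trits
}

/// Encode bytes as DNA in which no two adjacent bases are equal.
fn encode(data: &[u8]) -> String {
    let mut prev = b'T'; // assumed start state
    let mut out = String::with_capacity(data.len() * 6);
    for &byte in data {
        for trit in byte_to_trits(byte) {
            let next = choices(prev)[trit as usize];
            out.push(next as char);
            prev = next;
        }
    }
    out
}

/// Invert `encode`. Returns None on a bad length, an unknown base,
/// a repeated base, or a six-trit group that exceeds 255.
fn decode(dna: &str) -> Option<Vec<u8>> {
    if dna.len() % 6 != 0 {
        return None;
    }
    let mut prev = b'T';
    let mut out = Vec::with_capacity(dna.len() / 6);
    for chunk in dna.as_bytes().chunks(6) {
        let mut v: u16 = 0;
        for &base in chunk {
            let trit = choices(prev).iter().position(|&c| c == base)?;
            v = v * 3 + trit as u16;
            prev = base;
        }
        out.push(u8::try_from(v).ok()?);
    }
    Some(out)
}
```

In this sketch the previous base carries over between bytes, so the no-repeat property also holds across byte boundaries. A repeated base in the input is rejected by `decode`, which gives a cheap sanity check before any Reed-Solomon recovery is attempted.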

Features

| Component | Description |
| --- | --- |
| FUSE VFS | create, read, write, unlink via standard shell tools |
| DNPX1 codec | RS(8+4) FEC + cross-stripe fountain + GC-aware DNA (100% recovery @ 2% dropout) |
| Pluggable codec | StorageCodec trait + ExternalCodec adapter (DNA-Aeon, DNA Fountain, DNA-RS, NOREC4DNA presets) |
| Codec leaderboard | neutral arena comparing all codecs on one channel (docs/LEADERBOARD.md) |
| Channel profiles | DT4DDS-inspired Illumina / photolitho / decay / nanopore models |
| PCR random access | retrieve one file from a multi-file pool by primer (pool.rs) |
| Sequencing sim | coverage consensus + indel-aware resync decode |
| Partial read | decode only stripes covering a byte range |
| Journal | manifest.json for file metadata across remounts |
| Encryption | AES-256-GCM; key zeroized on unmount |
| Benchmarks | make reproduce → CSV + break-even table |
| Mock API | HTTP synthesizer backend (crates/dna_api_mock) |
| Grant pack | docs/grants/, wet-lab playbook, preprint draft |
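
The partial-read feature comes down to mapping a byte range onto the stripes that contain it. The sketch below shows that arithmetic. `STRIPE_BYTES` and `stripes_for_range` are hypothetical names, and treating the 256-byte stripe from the pipeline description as the pre-parity payload size is an assumption; the real layout may differ.

```rust
use std::ops::Range;

/// Assumed payload bytes per stripe (pre-parity). Taken from the
/// "256-byte stripe" in the pipeline description; an assumption here.
const STRIPE_BYTES: u64 = 256;

/// Half-open range of stripe indices that cover `len` bytes starting
/// at byte `offset`. An empty read maps to an empty stripe range.
fn stripes_for_range(offset: u64, len: u64) -> Range<u64> {
    if len == 0 {
        return 0..0;
    }
    let first = offset / STRIPE_BYTES;
    // Index of the stripe holding the last requested byte.
    let last = offset.saturating_add(len - 1) / STRIPE_BYTES;
    first..last + 1
}
```

A read of 10 bytes at offset 300 touches only stripe 1, so only that stripe's shards would need to be decoded, while a 2-byte read at offset 255 straddles stripes 0 and 1.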

Tech stack

  • Rust (dna_vfs_core)
  • libfuse via fuser
  • reed-solomon-erasure for FEC
  • Linux or WSL2

Getting started

Prerequisites

sudo apt install fuse libfuse-dev pkg-config
# rustup: https://rustup.rs

Build and run

git clone https://github.com/thesnmc/DNA-POSIX.git
cd DNA-POSIX
./launch_lab.sh

Enter a mount password when prompted. Files go under dna_vfs/bio_drive/.

Reproduce all numbers

make reproduce
cat benchmarks/results.csv
cat benchmarks/break_even.csv

Neutral codec leaderboard

Compare DNPX1 against reference codecs (Church 2012, Goldman 2013, 2-bit ceiling) on the same corpus and channel — or plug in an external encoder:

cargo run --no-default-features --bin dna_leaderboard -- --corpus benchmarks/corpus
cat benchmarks/leaderboard.csv

See docs/LEADERBOARD.md.
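
For orientation, here is a rough sketch of how a "2-bit ceiling" relates to a homopolymer-free code once erasure-coding overhead is included. It is illustrative arithmetic only, not what `dna_leaderboard` computes: both function names are hypothetical, and the model ignores compression, encryption framing, primers, and indexes.

```rust
/// Upper bound on bits per nucleotide for any code that never repeats
/// the previous base: three choices per position, so log2(3).
fn no_repeat_bits_per_nt() -> f64 {
    3.0_f64.log2()
}

/// Scale a raw density by the erasure-code rate k / (k + m).
fn net_bits_per_nt(raw: f64, data_shards: u32, parity_shards: u32) -> f64 {
    raw * f64::from(data_shards) / f64::from(data_shards + parity_shards)
}
```

Under these assumptions, an 8+4 layout has a code rate of 2/3, so the unconstrained 2-bit ceiling drops to about 1.33 bits per nucleotide and a no-repeat code to about 1.06.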

For the format specification, codec, and architecture, see docs/SPEC.md, docs/CODEC.md, and docs/ARCHITECTURE.md.

Mock synthesizer API

cargo run --release -p dna_api_mock
curl http://127.0.0.1:8787/v1/health

Project layout

dna_vfs_core/          Rust library, FUSE daemon, benchmark binary
crates/dna_api_mock/   Reference synthesizer HTTP API
benchmarks/            Reproducible benchmark script
docs/                  Spec, codec, grants, preprint
wetlab/                Physical round-trip playbook
dna_vfs/               Mount point and cache directories (runtime)

Prior art (we compare honestly)

| Project | Relationship |
| --- | --- |
| Goldman et al. 2013 | Encoding inspiration |
| OligoArchive (CIDR '19) | DB archival tier; wet-lab round-trip |
| LiqSD (FAST '25) | DNA block device; RS + FUSE stack |
| VibeDNA | Similar “DNA filesystem” concept |

Our wedge: open, reproducible POSIX bridge + DNPX1 spec + India/sovereign archival narrative.

Contributing

See CONTRIBUTING.md.

License

See LICENSE.

About

A POSIX-to-DNA FUSE filesystem. It combines Reed-Solomon (8+4) erasure coding, AES-256-GCM encryption, and primer-based random access to translate digital files into error-correcting, synthesis-ready oligos.
