Lightweight immutable key-value store using S3 versioning
ImmuKV is a simple, serverless immutable key-value store that uses only S3 versioning - no DynamoDB, no background jobs, no complex infrastructure.
- Maximum simplicity - Just S3, no background repair jobs, no status tracking
- Global ordering - All changes recorded in versioned global log
- Fast key access - Single S3 read for latest value
- Automatic orphan repair - ETag-based conditional writes handle failures inline
- Cryptographic integrity - SHA-256 hash chain prevents tampering
- Global log (`_log.json`) - Single versioned object containing all changes
- Key objects (`keys/{key}.json`) - One versioned object per key for fast access
- Two-phase writes - Log first (never lost), then key object (may be orphaned temporarily)
- Inline repair - Orphaned entries automatically repaired during normal operations
Gains:
- Extreme simplicity (just S3, no background jobs)
- Fast key lookups (single S3 read)
- Lower cost (no DynamoDB)
- Lambda/serverless friendly
Limitations:
- Must read log versions sequentially (no random access by entry number)
- S3 version IDs are opaque strings (not sequential integers)
- Orphans exist temporarily (repaired within configurable interval, default 5 minutes)
pip install immukv
npm install immukv

from immukv import ImmuKVClient, Config

config = Config(
    s3_bucket="your-bucket",
    s3_region="us-east-1",
    s3_prefix=""
)

# Identity functions for JSON values (use custom encoders/decoders for complex types)
def identity(x): return x

with ImmuKVClient(config, identity, identity) as client:
    # Write
    entry = client.set("sensor-012352", {"alpha": 0.15, "beta": 2.8})
    print(f"Committed: {entry.version_id}")

    # Read (single S3 request)
    latest = client.get("sensor-012352")
    print(f"Latest: {latest.value}")

    # History
    history, _ = client.history("sensor-012352", None, None)
    for entry in history:
        print(f"Seq {entry.sequence}: {entry.value}")

import { ImmuKVClient, Config } from 'immukv';
const config: Config = {
  s3Bucket: 'your-bucket',
  s3Region: 'us-east-1',
  s3Prefix: ''
};

// Identity functions for JSON values (use custom encoders/decoders for complex types)
const identity = <T>(x: T): T => x;
const client = new ImmuKVClient(config, identity, identity);

// Write
const entry = await client.set('sensor-012352', { alpha: 0.15, beta: 2.8 });
console.log(`Committed: ${entry.versionId}`);

// Read (single S3 request)
const latest = await client.get('sensor-012352');
console.log('Latest:', latest.value);

await client.close();

Every write operation happens in two phases:
- Phase 1: Log Write (always succeeds or throws)
  - Append entry to `_log.json` using optimistic locking
  - S3 creates new version automatically
  - Entry is now durable and will never be lost
- Phase 2: Key Object Write (may fail temporarily)
  - Write/update `keys/{key}.json` with full entry data
  - If this fails, entry is "orphaned" (exists in log but not in key object)
  - Orphans are automatically repaired on next activity
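The two phases can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions, not ImmuKV's implementation: `FakeBucket`, `ConflictError`, `set_value`, and the `fail_phase2` switch are hypothetical names, the ETag is modelled as a simple version counter, and each log version is assumed to hold one entry.

```python
import json


class ConflictError(Exception):
    """Raised when a conditional write sees a stale ETag."""


class FakeBucket:
    """In-memory stand-in for a versioned bucket (hypothetical, not the S3 API)."""

    def __init__(self):
        self.versions = {}  # object name -> list of bodies, oldest first

    def latest(self, name):
        """Return (etag, latest body), or (None, None) if the object is absent."""
        bodies = self.versions.get(name, [])
        if not bodies:
            return None, None
        return f"v{len(bodies)}", bodies[-1]

    def put_if_match(self, name, body, expected_etag):
        """Add a new version only if the caller saw the current ETag."""
        current_etag, _ = self.latest(name)
        if current_etag != expected_etag:
            raise ConflictError(name)
        self.versions.setdefault(name, []).append(body)


def set_value(bucket, key, value, fail_phase2=False, max_retries=5):
    # Phase 1: append to the log under optimistic locking, retrying on conflict.
    for _ in range(max_retries):
        etag, latest = bucket.latest("_log.json")
        sequence = json.loads(latest)["sequence"] + 1 if latest else 0
        entry = {"sequence": sequence, "key": key, "value": value}
        try:
            bucket.put_if_match("_log.json", json.dumps(entry), etag)
            break
        except ConflictError:
            continue  # another writer won the race; re-read and try again
    else:
        raise RuntimeError("log write failed after retries")

    # Phase 2: write the key object. A simulated failure here leaves the entry
    # orphaned: present in the log but not yet visible under keys/{key}.json.
    if fail_phase2:
        return entry
    key_name = f"keys/{key}.json"
    key_etag, _ = bucket.latest(key_name)
    bucket.put_if_match(key_name, json.dumps(entry), key_etag)
    return entry
```

Calling `set_value(bucket, "sensor-012352", {...}, fail_phase2=True)` after a normal write leaves two log versions but an unchanged key object, which is the orphan state the repair logic has to resolve.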
- Pre-flight check: Every write operation repairs any existing orphan first
- Conditional reads: Read operations check for orphans at configurable intervals
- ETag-based repair: Uses stored previous ETag for idempotent conditional writes
- No background jobs: All repair happens inline during normal operations
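The repair behaviour described above could look roughly like the following. It is a minimal sketch with hypothetical names (`RepairGate`, `repair_orphan`), and it substitutes a sequence-number comparison for the ETag-conditional write so that it runs without S3; the point is only that reads check for orphans at most once per interval and that repeating a repair is harmless.

```python
import time


class RepairGate:
    """Decides whether a read should spend a request checking for orphans."""

    def __init__(self, interval_ms=300_000, clock=time.monotonic):
        self.interval_s = interval_ms / 1000
        self.clock = clock  # injectable so the gate can be tested without sleeping
        self.last_check = None

    def should_check(self):
        now = self.clock()
        if self.last_check is None or now - self.last_check >= self.interval_s:
            self.last_check = now
            return True
        return False


def repair_orphan(latest_log_entry, key_objects):
    """Copy the newest log entry into its key object if the key object is behind.

    Returns True if a repair was performed, False if nothing needed doing.
    """
    key = latest_log_entry["key"]
    current = key_objects.get(key)
    if current is not None and current["sequence"] >= latest_log_entry["sequence"]:
        return False  # already consistent, so a repeated repair is a no-op
    key_objects[key] = latest_log_entry
    return True
```

In this sketch a write would call `repair_orphan` unconditionally before doing its own work, while a read would call it only when `should_check()` returns True.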
Each entry includes:
- `hash` - SHA-256 hash of entry data
- `previous_hash` - Hash from previous entry
This creates a tamper-evident chain where modifying any past entry breaks all subsequent hashes.
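To make the chaining concrete, here is a generic SHA-256 hash chain in Python. It is an illustration only: the fields that ImmuKV actually hashes and its genesis value are not specified here, so `GENESIS`, `entry_hash`, `append`, and `verify_chain` are hypothetical.

```python
import hashlib
import json

GENESIS = "genesis"  # hypothetical previous_hash for the first entry


def entry_hash(key, value, previous_hash):
    """Hash a canonical JSON encoding of the entry data plus the previous hash."""
    payload = json.dumps(
        {"key": key, "value": value, "previous_hash": previous_hash},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(chain, key, value):
    """Append an entry whose hash covers the hash of the entry before it."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {
        "key": key,
        "value": value,
        "previous_hash": previous_hash,
        "hash": entry_hash(key, value, previous_hash),
    }
    chain.append(entry)
    return entry


def verify_chain(chain):
    """Recompute every hash in order; any edited entry fails verification."""
    previous_hash = GENESIS
    for entry in chain:
        if entry["previous_hash"] != previous_hash:
            return False
        expected = entry_hash(entry["key"], entry["value"], entry["previous_hash"])
        if entry["hash"] != expected:
            return False
        previous_hash = entry["hash"]
    return True
```

Because each hash covers the previous one, recomputing the hash of an edited entry to hide the change would alter the `previous_hash` expected by every later entry, so the whole tail of the chain would have to be rewritten.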
aws s3api put-bucket-versioning \
--bucket your-bucket \
--versioning-configuration Status=Enabled

s3://your-bucket/
├── _log.json (versioned)
│ ├── Version: xxx (latest)
│ ├── Version: yyy
│ └── Version: zzz (first)
└── keys/
├── sensor-012352.json (versioned)
├── sensor-012353.json (versioned)
└── ...
# Python
from immukv import Config, S3Overrides, S3Credentials
config = Config(
    s3_bucket="bucket-name",
    s3_region="us-east-1",
    s3_prefix="",
    kms_key_id=None,  # Optional KMS encryption
    repair_check_interval_ms=300000,  # 5 minutes
    read_only=False,  # Set True to disable writes
    overrides=S3Overrides(
        endpoint_url=None,  # Custom S3 endpoint
        credentials=None,  # S3Credentials or async CredentialProvider
        force_path_style=False,  # Required for MinIO
    )
)

// TypeScript
const config: Config = {
  s3Bucket: 'bucket-name',
  s3Region: 'us-east-1',
  s3Prefix: '',
  // kmsKeyId: 'optional-key-id',
  repairCheckIntervalMs: 300000, // 5 minutes
  readOnly: false,
  // overrides: {
  //   endpointUrl?: string,
  //   credentials?: StaticCredentials | CredentialProvider,
  //   forcePathStyle?: boolean,
  // }
};

Both clients support pluggable credential providers for dynamic credential refresh (e.g., OIDC federation, custom STS flows).
# Python - async credential provider
from immukv import S3Credentials, Config, S3Overrides

async def my_credential_provider() -> S3Credentials:
    # Fetch credentials from your identity provider
    return S3Credentials(
        aws_access_key_id="AKIA...",
        aws_secret_access_key="...",
        aws_session_token="...",  # Optional
        expires_at=some_datetime,  # Optional (defaults to 1 hour from now)
    )

config = Config(
    s3_bucket="bucket-name",
    s3_region="us-east-1",
    s3_prefix="",
    overrides=S3Overrides(credentials=my_credential_provider),
)

// TypeScript - async credential provider
import { Config, CredentialProvider } from 'immukv';

const myCredentialProvider: CredentialProvider = async () => ({
  accessKeyId: 'AKIA...',
  secretAccessKey: '...',
  sessionToken: '...', // Optional
});

const config: Config = {
  s3Bucket: 'bucket-name',
  s3Region: 'us-east-1',
  s3Prefix: '',
  overrides: {
    credentials: myCredentialProvider,
  },
};

Creates new immutable entry in log and key object.
entry = client.set("key1", {"data": "value"})

Retrieves latest value for key (single S3 read).
entry = client.get("key1")

Retrieves all versions of a key (newest first).
entries, oldest_version = client.history("key1", None, 10)

Retrieves entries from global log across all keys (newest first).
entries = client.log_entries(None, 100)

Lists all keys in lexicographic order.
keys = client.list_keys(None, 100)

Lists keys matching the given prefix (lexicographic order). Filtering is done server-side.
keys = client.list_keys_with_prefix("sensor-", None, 100)
const keys = await client.listKeysWithPrefix('sensor-', undefined, 100);

Verifies hash integrity of single entry.
is_valid = client.verify(entry)

Verifies hash chain integrity.
is_valid = client.verify_log_chain(100)

Guarantees:
- No concurrent write conflicts (optimistic locking with retry)
- Log is always updated first (data never lost)
- Log is immutable and append-only
- Hash chain integrity (tampering breaks subsequent hashes)
- Global ordering (log versions provide chronological order)
- Eventual consistency (orphans repaired automatically)
- Bounded repair time (within `repair_check_interval_ms` of activity)
Not guaranteed:
- Immediate consistency (key object write can fail temporarily)
- Transactional semantics (not ACID - log + key are separate writes)
- Latest entry always consistent (most recent may be orphaned briefly)
Good fit for:
- Audit logs needing global ordering
- Configuration management with history
- Calibration parameters for IoT devices
- Lambda/serverless environments
- Simple compliance logging
- Applications tolerating eventual consistency
Not recommended for:
- Workloads that need guaranteed immediate consistency
- Workloads that need sub-second repair guarantees
- High-frequency writes (>100/sec per key)
- Applications requiring ACID transactions
Based on 1M write operations:
| Component | Cost |
|---|---|
| S3 PUT requests (log) | 1M x $0.005/1K = $5.00 |
| S3 PUT requests (keys) | 1M x $0.005/1K = $5.00 |
| S3 GET requests | 1M x $0.0004/1K = $0.40 |
| S3 storage (1KB/entry) | 1GB x $0.023/GB-month = $0.023/month |
| Total | ~$10.42 |
DynamoDB equivalent: ~$1.25/M writes + ~$5/M reads = $6.25+ (plus storage)
ImmuKV is cost-effective for audit log patterns with occasional reads.
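The arithmetic behind the table can be reproduced with a few lines of Python. The unit prices are the ones quoted in the table; the function name `write_cost_usd` and its parameters are illustrative, and it assumes one GET per write and 1 GB = 1,000,000 KB, as the table does.

```python
def write_cost_usd(
    writes,
    put_per_1k=0.005,      # USD per 1,000 PUT requests (from the table)
    get_per_1k=0.0004,     # USD per 1,000 GET requests (from the table)
    storage_per_gb=0.023,  # USD per GB stored (from the table)
    entry_kb=1,            # assumed entry size in KB
):
    """Estimate request and storage cost for a given number of writes."""
    log_puts = writes / 1000 * put_per_1k  # one PUT to the log per write
    key_puts = writes / 1000 * put_per_1k  # one PUT to the key object per write
    gets = writes / 1000 * get_per_1k      # assumes one GET per write
    storage = writes * entry_kb / 1_000_000 * storage_per_gb
    return {
        "log_puts": log_puts,
        "key_puts": key_puts,
        "gets": gets,
        "storage": storage,
        "total": log_puts + key_puts + gets + storage,
    }


if __name__ == "__main__":
    for name, usd in write_cost_usd(1_000_000).items():
        print(f"{name}: ${usd:.3f}")
```

Changing `writes` or `entry_kb` shows how the total is dominated by the two PUT requests per write rather than by storage.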
immukv/
├── cdk/ # CDK construct package
│ ├── src/
│ │ ├── index.ts
│ │ └── immukv.ts # CDK construct implementation
│ ├── test/
│ ├── tsconfig.test.json
│ └── package.json
├── python/ # Python package
│ ├── src/immukv/
│ │ ├── _internal/ # Internal implementation details
│ │ │ ├── __init__.py
│ │ │ ├── json_helpers.py # Internal JSON utilities
│ │ │ ├── s3_client.py # S3 client (aiobotocore with sync bridge)
│ │ │ ├── s3_helpers.py # S3 helper functions
│ │ │ ├── s3_types.py # S3-related type definitions
│ │ │ └── types.py # Internal type definitions
│ │ ├── __init__.py
│ │ ├── client.py # Main client implementation
│ │ ├── json_helpers.py # JSON serialization helpers
│ │ ├── py.typed # PEP 561 marker for type hints
│ │ └── types.py # Type definitions
│ ├── stubs/
│ │ └── wrapt/ # Type stubs for wrapt
│ │ ├── __init__.pyi
│ │ └── proxies.pyi
│ ├── tests/
│ └── pyproject.toml
├── typescript/ # TypeScript package
│ ├── src/
│ │ ├── internal/ # Internal implementation details
│ │ │ ├── jsonHelpers.ts # Internal JSON utilities
│ │ │ ├── s3Client.ts # S3 client implementation
│ │ │ ├── s3Helpers.ts # S3 helper functions
│ │ │ ├── s3Types.ts # S3-related type definitions
│ │ │ └── types.ts # Internal type definitions
│ │ ├── index.ts
│ │ ├── client.ts # Main client implementation
│ │ ├── jsonHelpers.ts # JSON serialization helpers
│ │ └── types.ts # Type definitions
│ ├── tests/
│ ├── tsconfig.json
│ ├── tsconfig.test.json
│ └── package.json
└── README.md
# Python
cd python
pip install -e ".[dev]"
pytest
# TypeScript
cd typescript
npm install
npm test

MIT License
Contributions welcome! Please open an issue or pull request.