Cypherrum Vault Format Specification — Version 1
Status: Public specification. Version: 1.0. Date: 2026-04-04.
This document describes the on-disk format of Cypherrum encrypted vaults. It is intended for interoperability, security auditing, and third-party implementations.
Overview
A Cypherrum vault encrypts files transparently using AES-256-GCM with Argon2id key derivation and optional ML-KEM-768 post-quantum recovery. Files are split into fixed-size chunks, each independently encrypted with a per-file key derived from the master key via HKDF. Directory metadata is encrypted separately with per-directory keys.
Directory Structure
- vault_root/
  - .<16_hex_chars>/: hidden internal directory
    - vault.json: vault configuration (plaintext)
    - keys/
      - master.key: encrypted master key
      - recovery.key: ML-KEM-768 encapsulated master key
      - recovery.pub: ML-KEM-768 public key (1184 bytes)
      - hybrid.priv: encrypted ML-KEM-768 ephemeral private key
    - d/: encrypted directory metadata
      - <2_hex>/<62_hex>: SHA-256(dir_id) → 2-level hash path
    - f/: encrypted file chunks
      - <2_hex>/<62_hex>.<N>: SHA-256(file_id).chunk_index
The internal directory name is . followed by 16 random hex characters (8 random bytes). This hides the vault internals from casual browsing.
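As an illustration of the naming rule above, here is a minimal Python sketch. The function name `new_internal_dir_name` is hypothetical, and lowercase hex is an assumption; the specification only requires 16 hex characters.

```python
import secrets


def new_internal_dir_name() -> str:
    """Return '.' followed by 16 hex characters (8 random bytes)."""
    # secrets.token_hex(8) draws 8 bytes from the OS CSPRNG and
    # returns them as 16 lowercase hex characters.
    return "." + secrets.token_hex(8)
```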
vault.json
Plaintext JSON. All cryptographic material is in the keys/ directory.
{
  "version": 1,
  "cipher_suite": "AES-256-GCM",
  "chunk_size": 1048576,
  "compression": "zstd",
  "key_protection": "hybrid-pq",
  "kdf": {
    "algorithm": "argon2id",
    "memory": 262144,
    "iterations": 3,
    "parallelism": 4,
    "salt": "<base64, 32 bytes>"
  },
  "root_dir_id": "<64-char hex>",
  "hmac": "<64-char hex>"
}
| Field | Type | Description |
|---|---|---|
| version | int | Must be 1 |
| cipher_suite | string | "AES-256-GCM" (fixed) |
| chunk_size | int | Plaintext chunk size in bytes: 262144 (256 KiB), 1048576 (1 MiB), or 4194304 (4 MiB) |
| compression | string | "" (none) or "zstd" (Zstandard) |
| key_protection | string | "" (classical) or "hybrid-pq" (ML-KEM-768 + AES) |
| kdf.algorithm | string | "argon2id" (fixed) |
| kdf.memory | uint32 | Memory cost in KiB (default: 262144 KiB = 256 MiB) |
| kdf.iterations | uint32 | Time cost (default: 3) |
| kdf.parallelism | uint8 | Thread count (default: 4) |
| kdf.salt | base64 | 32-byte random salt |
| root_dir_id | hex | Root directory identifier (32 random bytes → 64 hex) |
| hmac | hex | HMAC-SHA256(master_key, root_dir_id) for integrity verification |
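The field constraints in the table can be checked before any key derivation is attempted. The following is a minimal Python sketch, not a reference implementation: the function name `validate_config` is hypothetical, and treating every field as required is an assumption, since the specification does not say whether fields may be omitted.

```python
import base64
import json

ALLOWED_CHUNK_SIZES = {262144, 1048576, 4194304}


def validate_config(raw: str) -> dict:
    """Parse vault.json text and check the constraints from the field table."""
    cfg = json.loads(raw)
    if cfg.get("version") != 1:
        raise ValueError("unsupported vault version")
    if cfg.get("cipher_suite") != "AES-256-GCM":
        raise ValueError("unsupported cipher suite")
    if cfg.get("chunk_size") not in ALLOWED_CHUNK_SIZES:
        raise ValueError("unsupported chunk size")
    if cfg.get("compression") not in ("", "zstd"):
        raise ValueError("unsupported compression")
    if cfg.get("key_protection") not in ("", "hybrid-pq"):
        raise ValueError("unsupported key protection")
    kdf = cfg.get("kdf", {})
    if kdf.get("algorithm") != "argon2id":
        raise ValueError("unsupported KDF")
    # b64decode(validate=True) raises binascii.Error, a ValueError subclass.
    if len(base64.b64decode(kdf.get("salt", ""), validate=True)) != 32:
        raise ValueError("salt must be 32 bytes")
    for field in ("root_dir_id", "hmac"):
        value = cfg.get(field, "")
        # bytes.fromhex raises ValueError on non-hex input.
        if len(value) != 64 or len(bytes.fromhex(value)) != 32:
            raise ValueError(f"{field} must be 64 hex characters")
    return cfg
```

Checking the HMAC value itself requires the master key, so it is out of scope for this purely structural check.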
Key Derivation
Password → Password Key
password_key = Argon2id(
    password,
    salt        = kdf.salt,
    memory      = kdf.memory (KiB),
    iterations  = kdf.iterations,
    parallelism = kdf.parallelism,
    output      = 32 bytes
)
Password Key → Master Key Decryption
Classical mode (key_protection: ""):
master_key = AES-256-GCM-Decrypt(
    key        = password_key,
    ciphertext = keys/master.key
)
Hybrid mode (key_protection: "hybrid-pq"):
- Read keys/master.key: version(1) + kem_ct_len(2) + kem_ct + aes_gcm_data
- Read keys/hybrid.priv: AES-256-GCM encrypted with password_key
- Decrypt hybrid private key with password_key
- Decapsulate: shared_secret = ML-KEM-768.Decaps(private_key, kem_ct)
- Derive hybrid key: HKDF-SHA256(password_key || shared_secret, info="cypherrum-hybrid-v1") → 32 bytes
- Decrypt master key with hybrid key via AES-256-GCM
Master Key → File/Directory Keys
file_key = HKDF-SHA256(master_key, file_id_bytes, info="cypherrum-file-key-v1") → 32 bytes
dir_key = HKDF-SHA256(master_key, dir_id_bytes, info="cypherrum-dir-key-v1") → 32 bytes
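The two derivations above can be sketched with only the Python standard library. This is an illustrative sketch under stated assumptions: `hkdf_sha256`, `derive_file_key`, and `derive_dir_key` are hypothetical names, and passing the ID bytes as the HKDF salt is an assumption, because the notation above does not name the role of the second argument.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand."""
    if not salt:
        salt = b"\x00" * 32  # RFC 5869 default salt: HashLen zero bytes
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_file_key(master_key: bytes, file_id_hex: str) -> bytes:
    # Assumption: file_id_bytes (the 32-byte decoded ID) is the HKDF salt.
    return hkdf_sha256(master_key, bytes.fromhex(file_id_hex), b"cypherrum-file-key-v1")


def derive_dir_key(master_key: bytes, dir_id_hex: str) -> bytes:
    # Assumption: dir_id_bytes (the 32-byte decoded ID) is the HKDF salt.
    return hkdf_sha256(master_key, bytes.fromhex(dir_id_hex), b"cypherrum-dir-key-v1")
```

Because the info strings differ, a file and a directory that happened to share the same ID would still receive different keys.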
Verification
On unlock, verify: HMAC-SHA256(master_key, root_dir_id_hex) == vault.json.hmac
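This detects a wrong master key (for example, from a wrong password) and a tampered or corrupted root_dir_id or hmac field. Other vault.json fields are not covered by this HMAC.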
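The unlock check can be sketched in Python as follows. The function name `verify_vault_hmac` is hypothetical, and feeding the ASCII hex string (rather than the decoded bytes) into the HMAC is an assumption based on the name `root_dir_id_hex` in the formula above.

```python
import hashlib
import hmac


def verify_vault_hmac(master_key: bytes, root_dir_id_hex: str, stored_hmac_hex: str) -> bool:
    """Return True if vault.json's hmac matches the recomputed value."""
    # Assumption: the MAC input is the 64-character ASCII hex string.
    expected = hmac.new(
        master_key, root_dir_id_hex.encode("ascii"), hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected, stored_hmac_hex.lower())
```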
Key Files
keys/master.key
Classical (60 bytes):
| Offset | Size | Field |
|---|---|---|
| 0 | 12 | GCM nonce |
| 12 | 32 | Encrypted master key |
| 44 | 16 | GCM tag |
Hybrid (variable):
| Offset | Size | Field |
|---|---|---|
| 0 | 1 | Version (0x01) |
| 1 | 2 | KEM ciphertext length (big-endian) |
| 3 | varies | ML-KEM-768 ciphertext |
| 3+len | 12 | GCM nonce |
| … | 32 | Encrypted master key |
| … | 16 | GCM tag |
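Both master.key layouts can be split into their fields with a few lines of Python. This is a sketch, not a reference parser: `MasterKeyFile` and `parse_master_key` are hypothetical names, and the caller is assumed to pick the layout from key_protection in vault.json.

```python
import struct
from typing import NamedTuple, Optional

NONCE_LEN, KEY_LEN, TAG_LEN = 12, 32, 16


class MasterKeyFile(NamedTuple):
    kem_ciphertext: Optional[bytes]  # None in classical mode
    nonce: bytes
    encrypted_key: bytes
    tag: bytes


def parse_master_key(data: bytes, hybrid: bool) -> MasterKeyFile:
    """Split keys/master.key into its fields without decrypting anything."""
    kem_ct = None
    offset = 0
    if hybrid:
        if len(data) < 3:
            raise ValueError("master.key too short")
        # 1-byte version, then 2-byte big-endian KEM ciphertext length.
        version, kem_len = struct.unpack_from(">BH", data, 0)
        if version != 0x01:
            raise ValueError("unknown master.key version")
        kem_ct = data[3:3 + kem_len]
        offset = 3 + kem_len
    body = data[offset:]
    if len(body) != NONCE_LEN + KEY_LEN + TAG_LEN:
        raise ValueError("unexpected master.key length")
    return MasterKeyFile(kem_ct, body[:12], body[12:44], body[44:])
```

The final length check also catches a declared KEM ciphertext length that runs past the end of the file.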
keys/recovery.key
| Offset | Size | Field |
|---|---|---|
| 0 | 2 | KEM ciphertext length (big-endian) |
| 2 | varies | ML-KEM-768 ciphertext |
| 2+len | 12 | GCM nonce |
| … | 32 | Encrypted master key |
| … | 16 | GCM tag |
Recovery: decapsulate with the ML-KEM-768 private key (held by user offline), then decrypt master key with the shared secret.
keys/recovery.pub
Raw ML-KEM-768 public key, 1184 bytes (NIST FIPS 203).
keys/hybrid.priv
| Offset | Size | Field |
|---|---|---|
| 0 | 12 | GCM nonce |
| 12 | ~2400 | Encrypted ML-KEM-768 private key |
| … | 16 | GCM tag |
Encrypted with the password key. Only present in hybrid mode.
Directory Metadata
Path: d/{SHA256(dir_id_hex)[0:2]}/{SHA256(dir_id_hex)[2:64]}
Format:
| Offset | Size | Field |
|---|---|---|
| 0 | 12 | GCM nonce |
| 12 | varies | Encrypted JSON |
| … | 16 | GCM tag |
Key: HKDF-SHA256(master_key, dir_id_bytes, "cypherrum-dir-key-v1")
Plaintext JSON:
{
  "id": "<64-char hex dir ID>",
  "entries": [
    {
      "name": "filename.txt",
      "type": "file",
      "fileId": "<64-char hex>",
      "size": 1024,
      "modTime": "2026-01-01T00:00:00Z",
      "chunkCount": 1
    },
    {
      "name": "subdir",
      "type": "dir",
      "dirId": "<64-char hex>",
      "modTime": "2026-01-01T00:00:00Z"
    }
  ]
}
File Chunks
Path: f/{SHA256(file_id_hex)[0:2]}/{SHA256(file_id_hex)[2:64]}.{chunk_index}
Key: HKDF-SHA256(master_key, file_id_bytes, "cypherrum-file-key-v1")
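The hashed storage paths for chunks, and the analogous d/ paths for directory metadata, can be computed as in the Python sketch below. The helper names are hypothetical. Two assumptions are made: the SHA-256 input is the ASCII hex string of the ID (as the names `file_id_hex` and `dir_id_hex` suggest), and the chunk index is written in decimal.

```python
import hashlib


def _hashed_path(prefix: str, id_hex: str) -> str:
    # Assumption: hash the 64-character ASCII hex string, not the decoded bytes.
    digest = hashlib.sha256(id_hex.encode("ascii")).hexdigest()
    # First 2 hex characters name the subdirectory, remaining 62 the entry.
    return f"{prefix}/{digest[:2]}/{digest[2:]}"


def dir_metadata_path(dir_id_hex: str) -> str:
    """Path of a directory's metadata, relative to the internal directory."""
    return _hashed_path("d", dir_id_hex)


def chunk_path(file_id_hex: str, chunk_index: int) -> str:
    """Path of one file chunk, relative to the internal directory."""
    # Assumption: the chunk index suffix is a decimal integer.
    return f"{_hashed_path('f', file_id_hex)}.{chunk_index}"
```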
Chunk format (version 0x01 — uncompressed)
| Offset | Size | Field |
|---|---|---|
| 0 | 1 | Version (0x01) |
| 1 | 4 | Chunk index (big-endian uint32) |
| 5 | 12 | GCM nonce |
| 17 | varies | AES-256-GCM ciphertext |
| … | 16 | GCM tag |
Chunk format (version 0x02 — zstd compressed)
Same layout as 0x01, but the plaintext was compressed with Zstandard before encryption. If compression didn’t reduce size, version 0x01 is used instead (per-chunk decision).
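A chunk file can be split into its fields before decryption, as in this Python sketch. The names `Chunk` and `parse_chunk` are hypothetical; decryption and Zstandard decompression are left out.

```python
import struct
from typing import NamedTuple

HEADER_LEN = 1 + 4 + 12  # version + chunk index + GCM nonce
TAG_LEN = 16


class Chunk(NamedTuple):
    compressed: bool  # True for version 0x02 (zstd before encryption)
    index: int
    nonce: bytes
    ciphertext: bytes
    tag: bytes


def parse_chunk(data: bytes) -> Chunk:
    """Split a chunk file into header fields, ciphertext, and GCM tag."""
    if len(data) < HEADER_LEN + TAG_LEN:
        raise ValueError("chunk too short")
    # 1-byte version followed by a big-endian uint32 chunk index.
    version, index = struct.unpack_from(">BI", data, 0)
    if version not in (0x01, 0x02):
        raise ValueError("unknown chunk version")
    return Chunk(
        compressed=(version == 0x02),
        index=index,
        nonce=data[5:17],
        ciphertext=data[17:-TAG_LEN],
        tag=data[-TAG_LEN:],
    )
```

A reader would typically also compare the parsed index with the index in the file name before attempting decryption.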
AAD (Additional Authenticated Data)
AAD = file_id_bytes (32 bytes) || chunk_index (big-endian uint32, 4 bytes)
This prevents:
- Reading chunks from the wrong file
- Reordering chunks within a file
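The AAD construction above is a plain concatenation, shown here as a short Python sketch. The function name `chunk_aad` is hypothetical.

```python
import struct


def chunk_aad(file_id_hex: str, chunk_index: int) -> bytes:
    """Build the 36-byte AAD: 32-byte file ID || big-endian uint32 index."""
    file_id_bytes = bytes.fromhex(file_id_hex)
    if len(file_id_bytes) != 32:
        raise ValueError("file ID must decode to 32 bytes")
    return file_id_bytes + struct.pack(">I", chunk_index)
```

Since the AAD is authenticated by the GCM tag, a chunk moved to another file or another index fails tag verification during decryption.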
Recovery Key (User-Facing)
The recovery key shown to users is the ML-KEM-768 private key, base64-encoded and formatted as 8-character groups separated by dashes:
abcd1234-efgh5678-ijkl9012-mnop3456-...
This key can recover the vault if the password is forgotten:
- Parse and base64-decode the private key
- Decapsulate keys/recovery.key → shared secret
- Decrypt master key with shared secret
- Re-encrypt master key with a new password
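The user-facing formatting and the first recovery step (parse and base64-decode) can be sketched in Python as follows. The function names are hypothetical, and the standard base64 alphabet is an assumption; it contains no '-', so the dashes can be removed unambiguously.

```python
import base64


def format_recovery_key(private_key: bytes) -> str:
    """Base64-encode the key and split it into dash-separated 8-character groups."""
    # Assumption: standard base64 alphabet (A-Z, a-z, 0-9, '+', '/').
    encoded = base64.b64encode(private_key).decode("ascii")
    return "-".join(encoded[i:i + 8] for i in range(0, len(encoded), 8))


def parse_recovery_key(text: str) -> bytes:
    """Reverse format_recovery_key, tolerating whitespace from copy and paste."""
    compact = "".join(text.split()).replace("-", "")
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(compact, validate=True)
```

The decapsulation and re-encryption steps need an ML-KEM-768 and AES-256-GCM implementation and are not shown.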
Security Properties
- Confidentiality: All file content and directory metadata encrypted with AES-256-GCM
- Integrity: GCM authentication tags on every chunk and metadata file; HMAC on vault config
- Key isolation: Per-file and per-directory keys via HKDF (compromise of one key doesn’t expose others)
- Anti-reordering: Chunk index in AAD prevents shuffling attacks
- Anti-swapping: File ID in AAD prevents cross-file chunk substitution
- Post-quantum recovery: ML-KEM-768 (NIST FIPS 203) protects against harvest-now-decrypt-later
- Password protection: Argon2id with a default memory cost of 256 MiB resists GPU/ASIC brute force
- Compression oracle resistance: Compression is applied before encryption at the chunk level; the per-chunk compress-or-not decision and random nonces limit information leakage
Compatibility
This specification describes version 1 of the Cypherrum vault format. No other versions exist. The version field in vault.json must be 1.
Implementations MUST reject vaults with unknown versions, unsupported cipher suites, or invalid HMAC values.