zvault/README.md

1.4 KiB

ZVault Backup solution

Goals

  • Blazingly fast backup runs
  • Space-efficient storage
  • Independent backups

Design

  • Use rolling checksum to create content-dependent chunks
  • Use sha3-shake128 to hash chunks
  • Use mmapped hashtable to find duplicate chunks
  • Serialize metadata into chunks
  • Store small file data within metadata
  • Store directory metadata to avoid calculating checksums of unchanged files (same mtime and size)
  • Store full directory tree in each backup (use cached metadata and checksums for unchanged entries)
  • Compress data chunks in blocks of ~10MB to improve compression ("solid archive")
  • Store metadata in separate data chunks to enable metadata caching on client
  • Encrypt archive
  • Sort new files by file extension to improve compression

Configurable parameters

  • Rolling chunker algorithm
  • Minimal chunk size [default: 1 KiB]
  • Maximal chunk size [default: 64 KiB]
  • Maximal file size for inlining [default: 128 Bytes]
  • Block size [default: 10 MiB]
  • Block compression algorithm [default: Brotli 6]
  • Encryption algorithm [default: chacha20+poly1305]

TODO

  • Remove old data
  • Locking / Multiple clients

Modules

  • Rolling checksum chunker
    • Also creates hashes
  • Mmapped hashtable that stores existing chunks hashes
  • Remote block writing and compression/encryption
  • Inode data serialization
  • Recursive directory scanning, difference calculation, new entry sorting