File Systems
There are several common ways to store binary information:
- Database or key-value store (e.g. PostgreSQL, SQLite) - good for small files, or for a limited number of files that fit within the limits of the database.
- Object store (e.g. S3) - similar to a key-value store but typically designed to scale to large numbers of files across multiple drives and hosts.
- File systems (e.g. EXT4) - good for files where certain operations benefit from a hierarchical data structure, e.g. list, delete. File systems typically come with metadata such as permissions and owners.
- Block storage - you get raw disk access but need to lay out your binary data manually and in fixed block sizes.
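As a rough illustration of the block storage bullet, the sketch below lays out a byte string in fixed-size blocks and zero-pads the final block. The 4096-byte block size and the helper names `write_blocks` and `read_blocks` are assumptions made for this example, and an in-memory buffer stands in for a raw device.

```python
import io

BLOCK_SIZE = 4096  # assumed block size; real devices report their own


def write_blocks(device, start_block, data):
    """Write data starting at start_block, zero-padding the last block.

    Returns the number of blocks used.
    """
    padded_len = -(-len(data) // BLOCK_SIZE) * BLOCK_SIZE  # round up to a block
    device.seek(start_block * BLOCK_SIZE)
    device.write(data.ljust(padded_len, b"\x00"))
    return padded_len // BLOCK_SIZE


def read_blocks(device, start_block, length):
    """Read back `length` bytes; the caller has to remember the length."""
    device.seek(start_block * BLOCK_SIZE)
    return device.read(length)


device = io.BytesIO()  # stand-in for a raw block device
payload = b"hello" * 1000  # 5000 bytes, so it needs two blocks
used = write_blocks(device, start_block=3, data=payload)
restored = read_blocks(device, start_block=3, length=len(payload))
```

Note that nothing on the "device" records where the payload starts or how long it is. Tracking that is exactly the bookkeeping a file system or database would otherwise do for you.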
Standard File Systems
Name | Snapshots | RAID | Checksumming | Compression | CoW | Erasure coding | Encryption |
---|---|---|---|---|---|---|---|
BTRFS[1] | Yes, Writable | Yes[2] | Yes | Yes | Yes | Unstable | No |
ZFS | Yes[3], Writable[4] | Drive-level | Yes | Yes | Yes | Yes | Yes |
EXT4 | No | No | No | No | No | - | No |
XFS | No | No | No | No | No | - | No |
bcachefs[5] | Yes | Yes | Yes | Yes | Yes | Unstable | Yes |
Notes:
- For single drives/blocks, I prefer BTRFS for its feature set.
- For multiple drives, ZFS is often preferred for its reliability.
- bcachefs is still under development.
For multi-drive deployment, my preferences are:
- If you have multiple same-sized disks and no need for expansion, use ZFS raid.
- If you have different-sized disks, want maximum storage, and don't need live parity, use snapraid + btrfs + mergerfs.
- If you need real-time parity and expansion and don't mind slow rebuilds, use mdraid + btrfs.
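The preferences above can be read as a small first-match decision table. The sketch below encodes them that way. The function name and its three boolean parameters are made up for illustration, and combinations the list does not cover return `None` rather than guessing.

```python
def recommend_layout(same_sized_disks, needs_expansion, needs_realtime_parity):
    """Return the preferred multi-drive setup, or None if no rule applies."""
    if same_sized_disks and not needs_expansion:
        return "ZFS raid"
    if not same_sized_disks and not needs_realtime_parity:
        return "snapraid + btrfs + mergerfs"
    if needs_realtime_parity and needs_expansion:
        return "mdraid + btrfs"
    return None  # not covered by the preferences listed above


choice = recommend_layout(
    same_sized_disks=False,
    needs_expansion=True,
    needs_realtime_parity=False,
)
```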
Windows
- NTFS
- ReFS
Mac
- APFS
- Mac OS Extended
Overlay File Systems
- MergerFS - a union file system to combine multiple folders on a single computer.
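To make the idea of a union file system concrete, here is a minimal sketch of merging directory listings and resolving a path across several folders ("branches"). It is not how MergerFS is implemented. The helper names are hypothetical, and the "first branch that has the file wins" rule is a simplification, since MergerFS offers configurable policies for choosing a branch.

```python
import os
import tempfile


def union_listdir(branches, rel_path="."):
    """Return the merged, de-duplicated listing of rel_path across branches."""
    names = set()
    for branch in branches:
        full = os.path.join(branch, rel_path)
        if os.path.isdir(full):
            names.update(os.listdir(full))
    return sorted(names)


def union_resolve(branches, rel_path):
    """Return the real path of rel_path in the first branch that has it."""
    for branch in branches:
        full = os.path.join(branch, rel_path)
        if os.path.exists(full):
            return full
    return None


# Two temporary folders stand in for two separate disks.
with tempfile.TemporaryDirectory() as disk1, tempfile.TemporaryDirectory() as disk2:
    for folder, name in [(disk1, "a.txt"), (disk2, "b.txt"), (disk2, "a.txt")]:
        with open(os.path.join(folder, name), "w") as f:
            f.write(folder)
    merged = union_listdir([disk1, disk2])
    winner = union_resolve([disk1, disk2], "a.txt")
    first_branch_wins = winner == os.path.join(disk1, "a.txt")
    missing = union_resolve([disk1, disk2], "nope.txt")
```

Both folders contribute to the listing, while a file present in both is served from the first branch.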
Block Overlays
These create a view of one or more block devices, typically built on top of one or more other block devices.
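One simple kind of block overlay concatenates several devices into a single larger logical device. The sketch below shows only the address translation for that case, mapping a logical block number to a device and an offset on it. The class name `LinearOverlay` and the device sizes are assumptions for the example, and no real I/O is performed.

```python
import bisect


class LinearOverlay:
    """Present several block devices as one contiguous range of blocks."""

    def __init__(self, device_sizes):
        # starts[i] is the first logical block that lives on device i
        self.starts = []
        total = 0
        for size in device_sizes:
            self.starts.append(total)
            total += size
        self.total_blocks = total

    def locate(self, logical_block):
        """Map a logical block number to (device_index, block_on_device)."""
        if not 0 <= logical_block < self.total_blocks:
            raise IndexError("block out of range")
        device = bisect.bisect_right(self.starts, logical_block) - 1
        return device, logical_block - self.starts[device]


overlay = LinearOverlay([100, 50, 200])  # sizes in blocks, made up for the example
first_on_second = overlay.locate(100)
last_block = overlay.locate(349)
```

Striped or mirrored overlays use a different mapping, but the principle is the same: the upper layer sees one device, and the overlay translates each request to the devices underneath.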
Object Stores
- Minio - S3-compatible object store
- Ceph - joins drives across multiple computers. Has block, file, and object storage APIs.
- Rook - deployment of Ceph using Kubernetes
- SeaweedFS - joins drives across multiple computers and exposes them through object storage APIs (incl. S3). Supports file storage when paired with a database using the SeaweedFS Filer.
Distributed File Systems
- GlusterFS - joins filesystem directories across multiple computers
- Ceph - joins drives across multiple computers. Has block, file, and object storage APIs.
- JuiceFS - creates POSIX-compatible file storage using S3 object storage and a metadata database.
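The split between an object store for data and a database for metadata can be sketched in a few lines. This is a toy model of the general idea, not JuiceFS's actual format: a dict stands in for the S3 bucket, an in-memory SQLite database stands in for the metadata database, and the chunk size, table layout, and key scheme are all assumptions.

```python
import sqlite3

CHUNK_SIZE = 4  # tiny on purpose; an assumption for the demo

objects = {}  # stand-in for an S3 bucket: key -> bytes
meta = sqlite3.connect(":memory:")  # stand-in for the metadata database
meta.execute(
    "CREATE TABLE chunks (path TEXT, idx INTEGER, key TEXT, PRIMARY KEY (path, idx))"
)


def write_file(path, data):
    """Split data into chunks, store each as an object, record them in order."""
    # Old objects are left behind on overwrite; a real system would collect them.
    meta.execute("DELETE FROM chunks WHERE path = ?", (path,))
    for idx, offset in enumerate(range(0, len(data), CHUNK_SIZE)):
        key = f"{path}#{idx}"  # hypothetical key scheme
        objects[key] = data[offset:offset + CHUNK_SIZE]
        meta.execute("INSERT INTO chunks VALUES (?, ?, ?)", (path, idx, key))
    meta.commit()


def read_file(path):
    """Look up the chunk keys in the database, then fetch and join the objects."""
    rows = meta.execute("SELECT key FROM chunks WHERE path = ? ORDER BY idx", (path,))
    return b"".join(objects[key] for (key,) in rows)


def list_dir(prefix):
    """List paths under a prefix using only the metadata database."""
    rows = meta.execute(
        "SELECT DISTINCT path FROM chunks WHERE path LIKE ? ORDER BY path",
        (prefix + "%",),
    )
    return [path for (path,) in rows]


write_file("/docs/a.bin", b"0123456789")
write_file("/docs/b.bin", b"xyz")
listing = list_dir("/docs/")
```

Listing a directory touches only the database, which is why this design can offer file-system-style operations on top of a flat object store.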
Databases
SQL
- PostgreSQL
- MySQL
- SQLite
NoSQL
- MongoDB
Cloud Providers
See Cloud Providers