ZFS
How to use ZFS:
Background
There are three levels to understand (datasets and zvols sit at the same level):
- zpools stripe data across one or more vdevs, similar to a JBOD of vdevs
- vdevs are groups of drives, typically in raidz (or raidz2, raidz3) or mirror
- datasets are filesystems stored on a zpool, similar to partitions
- zvols are virtual block devices on a zpool, without a filesystem
Usage
# Create a zpool with a mirror vdev.
# Note: compression is a filesystem property, so it needs -O (capital O), not -o.
zpool create -f -o ashift=12 -O compression=zstd $zpool_name mirror \
    ata-diskA \
    ata-diskB

# Create a dataset.
zfs create -o encryption=aes-256-gcm -o keyformat=passphrase $zpool_name/$dataset_name
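After a reboot or an export/import, an encrypted dataset's key must be loaded before the dataset can be mounted. A minimal sketch:

```shell
# Load keys for all encrypted datasets (prompts for the passphrase),
# then mount every dataset that is not yet mounted.
sudo zfs load-key -a
sudo zfs mount -a
```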
- Notes
  - You should always use the ID under /dev/disk/by-id/
    - E.g. /dev/disk/by-id/ata-diskA
Alerts
First, set up Postfix so the machine can send email.
Then set up ZED notifications.
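ZED reads its settings from /etc/zfs/zed.d/zed.rc; a minimal sketch (the email address is a placeholder):

```shell
# /etc/zfs/zed.d/zed.rc
ZED_EMAIL_ADDR="root@example.com"  # where alerts are delivered
ZED_NOTIFY_INTERVAL_SECS=3600      # rate-limit repeated notifications
ZED_NOTIFY_VERBOSE=1               # also notify on successful scrubs/resilvers
```

Restart the daemon after editing, e.g. sudo systemctl restart zfs-zed.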
Automatic Scrubs
By default, ZFS on Ubuntu will automatically scrub every month (via /etc/cron.d/zfsutils-linux, which runs on the second Sunday).
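A scrub can also be triggered by hand and its progress checked (the pool name is a placeholder):

```shell
sudo zpool scrub $pool   # start a scrub in the background
zpool status $pool       # shows scrub progress and any errors found
```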
Automatic Snapshots
See sanoid for policy-driven automatic snapshots.
To list existing snapshots:
zfs list -t snapshot
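A minimal sanoid policy sketch, assuming sanoid reads /etc/sanoid/sanoid.conf and using a placeholder dataset name:

```
[tank/data]
        use_template = production
        recursive = yes

[template_production]
        hourly = 36
        daily = 30
        monthly = 3
        autosnap = yes
        autoprune = yes
```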
Caching
ZFS has two read caches:
- ARC - this is enabled by default and will grow to use up to half of your RAM. This memory is released under memory pressure.
- L2ARC - you can enable additional caching by adding an L2ARC drive for ARC to overflow to.
For writes:
- SLOG - a separate log device, typically an SSD-backed mirror, that holds the ZFS intent log (ZIL). It only benefits synchronous writes.
In general, an Intel Optane SSD is a good choice for these roles, since Optane has higher write endurance and lower latency than typical NAND SSDs.
A 16GB Optane stick can be had for ~$12.
ARC
arc_summary or arcstat will show you the memory used by ARC. This memory does not appear in htop.
If you want to reduce ARC memory usage, you can set limits by creating /etc/modprobe.d/zfs.conf:

/etc/modprobe.d/zfs.conf
# Set max ARC size => 4 GB == 4294967296 bytes
options zfs zfs_arc_max=4294967296
# Set min ARC size => 1 GB == 1073741824 bytes
options zfs zfs_arc_min=1073741824
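Changes under /etc/modprobe.d/ only take effect when the zfs module is reloaded (usually at the next reboot). The limit can also be changed at runtime through the module parameter under /sys:

```shell
# Apply a 4 GiB ARC cap immediately, without reloading the zfs module
echo 4294967296 | sudo tee /sys/module/zfs/parameters/zfs_arc_max
```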
L2ARC
L2ARC costs about 80 bytes of ARC memory per record. Historically this was 320 bytes, but it is now mostly negligible.
At the default 128K record size, 1 GiB holds 8192 records, requiring approx. 640 KiB of memory.
At 4K record size, you will need approx. 20 MiB of RAM per GiB.
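The header overhead for a given cache size can be estimated with shell arithmetic (the 80-bytes-per-record figure is the assumption above):

```shell
# RAM (MiB) consumed by L2ARC headers for a 100 GiB cache device
# at the default 128 KiB recordsize, assuming ~80 bytes per record.
l2arc_gib=100
recordsize_kib=128
records=$(( l2arc_gib * 1024 * 1024 / recordsize_kib ))
echo $(( records * 80 / 1024 / 1024 ))   # prints 62 (MiB of ARC used)
```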
To add an L2ARC:
sudo zpool add $pool cache $device
SLOG
sudo zpool add $pool log $device
# or
# sudo zpool add $pool log mirror $device1 $device2
Expanding
You can only expand by adding vdevs or replacing all drives in a vdev with larger ones.
See [1]
After replacing all drives in a vdev, you need to run:
sudo zpool online -e $pool $disk
on any one disk in that vdev.
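Alternatively, setting the pool's autoexpand property before swapping drives makes the pool grow on its own once the last drive is replaced:

```shell
sudo zpool set autoexpand=on $pool   # grow automatically after all drives are replaced
```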
Pros and Cons
VS Snapraid + btrfs + mergerfs
- Pros
- ZFS has real-time parity.
- ZFS can keep working while degraded.
- ZFS has snapshots with send and receive.
- ZFS has per-dataset encryption.
- ZFS handles everything in one layer, including parity protection of metadata and permissions.
- Cons
- The main con is that ZFS is less expandable.
- You can only expand by replacing every drive or adding entire vdevs.
- If more drives die than the parity level allows, e.g. >2 in a raidz2 vdev, you lose the entire pool.