ZFS
There are a few levels to understand:
* zpools are a JBOD of one or more vdevs
* vdevs are groups of drives, typically in raidz (or raidz2, raidz3) or mirror
* datasets are filesystems stored on a zpool, similar to partitions
* zvols are virtual block devices on a zpool without a filesystem
<pre>
# Create a zpool with a mirror vdev.
zpool create -f -o ashift=12 -O compression=zstd $zpool_name mirror \
    ata-diskA \
    ata-diskB

# Create an encrypted dataset.
zfs create -o encryption=aes-256-gcm -o keyformat=passphrase $zpool_name/$dataset_name
</pre>
;Notes
* You should always refer to drives by their ids under <code>/dev/disk/by-id/</code> rather than <code>/dev/sdX</code> names, which can change between boots (see the snippet below).
** E.g. <code>/dev/disk/by-id/ata-diskA</code>
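A quick way to map the stable by-id names to their current <code>/dev/sdX</code> devices before creating the pool:
<syntaxhighlight lang="bash">
# List by-id symlinks and hide partition entries.
ls -l /dev/disk/by-id/ | grep -v part
</syntaxhighlight>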
==Alerts==
First [https://askubuntu.com/questions/1332219/send-email-via-gmail-without-other-mail-server-with-postfix set up postfix to send emails].<br>
Then [https://askubuntu.com/questions/770540/enable-zfs-zed-email-notifications-on-16-04 set up ZED email notifications].
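A minimal sketch of the relevant settings in <code>/etc/zfs/zed.d/zed.rc</code>. The address is a placeholder and the option names are taken from the stock zed.rc, so verify them against your version:
<pre>
# Where ZED sends notifications.
ZED_EMAIL_ADDR="you@example.com"
# Also notify on successful scrubs/resilvers, not just failures.
ZED_NOTIFY_VERBOSE=1
</pre>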
==Automatic Scrubs==
By default, ZFS on Ubuntu automatically scrubs every pool once a month.
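You can also trigger a scrub by hand and check its progress (<code>$pool</code> is your pool name):
<syntaxhighlight lang="bash">
# Start a scrub and watch its progress.
sudo zpool scrub $pool
zpool status $pool
</syntaxhighlight>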
==Automatic Snapshots==
See [https://github.com/jimsalterjrs/sanoid sanoid] for policy-based automatic snapshots.

To list existing snapshots:
<pre>
zfs list -t snapshot
</pre>
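As a rough sketch, a sanoid policy lives in <code>/etc/sanoid/sanoid.conf</code>; the dataset name and retention counts below are placeholders, so check the sanoid README for the full option list:
<pre>
[tank/important]
        use_template = production

[template_production]
        hourly = 36
        daily = 30
        monthly = 3
        autosnap = yes
        autoprune = yes
</pre>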
==Caching==
ZFS has two read caches:
* ARC - this is enabled by default and uses up to half of your memory. The memory is released if the system comes under memory pressure.
* L2ARC - you can enable additional caching by adding an L2ARC drive for ARC to overflow to.
For writes:
* SLOG - a separate log device, typically an SSD-backed mirror, that holds the ZFS intent log (ZIL).
In general, you will want to use an Intel Optane SSD for caching, as they tend to last longer and have lower latency.<br>
A 16GB Optane stick can be had for ~$12.
===ARC===
<code>arc_summary</code> or <code>arcstat</code> will show you the memory used by ARC. This does not appear in <code>htop</code>.

If you want to reduce ARC memory usage, you can set limits by creating <code>/etc/modprobe.d/zfs.conf</code>:
{{hidden | <code>/etc/modprobe.d/zfs.conf</code> |
<pre>
# Set max ARC size => 4 GiB == 4294967296 bytes
options zfs zfs_arc_max=4294967296
# Set min ARC size => 1 GiB == 1073741824 bytes
options zfs zfs_arc_min=1073741824
</pre>
}}
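The module options take effect after a reboot or after reloading the zfs module. To change the limit at runtime instead, you can write to the module parameter directly (value in bytes):
<syntaxhighlight lang="bash">
# Cap ARC at 4 GiB without rebooting.
echo 4294967296 | sudo tee /sys/module/zfs/parameters/zfs_arc_max
</syntaxhighlight>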
===L2ARC===
L2ARC costs about 80 bytes of ARC memory per record. Historically this was 320 bytes, but now it's mostly negligible.<br>
At the default 128K record size, 1 GiB of L2ARC holds 8192 records, requiring approx. 640 KiB of memory.<br>
At a 4K record size, you will need approx. 20 MiB of RAM per GiB of L2ARC.

To add an L2ARC device:
<syntaxhighlight lang="bash">
sudo zpool add $pool cache $device
</syntaxhighlight>
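To confirm the cache device was added and see how much of it is in use:
<syntaxhighlight lang="bash">
# Cache devices show up under a "cache" section with their alloc/free space.
zpool iostat -v $pool
</syntaxhighlight>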
===SLOG===
<syntaxhighlight lang="bash">
sudo zpool add $pool log $device
# or
# sudo zpool add $pool log mirror $device1 $device2
</syntaxhighlight>
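Note that the SLOG only benefits synchronous writes; <code>zpool status</code> will show the device under a <code>logs</code> section. For benchmarking it, you can force a dataset to treat all writes as synchronous:
<syntaxhighlight lang="bash">
# Verify the log vdev is attached.
zpool status $pool
# Push every write on this dataset through the ZIL (and hence the SLOG).
sudo zfs set sync=always $pool/$dataset_name
</syntaxhighlight>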
==Expanding==
You can only expand a pool by adding vdevs or by replacing all drives in a vdev with larger ones.<br>
See [https://docs.oracle.com/cd/E19253-01/819-5461/githb/index.html the Oracle documentation].<br>
After replacing all drives in a vdev, run <code>sudo zpool online -e $pool $disk</code> on any disk in the vdev to make the extra space available.
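A rough sketch of the whole procedure, replacing one disk at a time (<code>$old</code> and <code>$new</code> are by-id device names; wait for each resilver to finish before starting the next replace):
<syntaxhighlight lang="bash">
# Optionally let the pool grow automatically once all disks are larger.
sudo zpool set autoexpand=on $pool

# Replace each disk in the vdev, one at a time.
sudo zpool replace $pool $old $new
zpool status $pool   # wait until resilvering completes

# After every disk has been replaced, expand using any disk in the vdev.
sudo zpool online -e $pool $new
</syntaxhighlight>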
==Pros and Cons==
* ZFS handles everything together, including parity over permissions
;Cons
* The main con is that ZFS is less expandable.
** You can only expand by replacing every drive in a vdev or by adding entire vdevs.
* If too many drives in a vdev die (e.g. more than 2 for raidz2), you lose all your data.