# 28.6 ZFS Tuning

## Technical Potential and Practical Challenges

ZFS is not a typical out-of-the-box file system: its performance advantages and advanced features are fully realized only after targeted parameter tuning. The right strategy depends on the environment and must be tailored to the storage hardware, workload characteristics, and usage scenario. The main tuning areas are ARC cache size, record size, and compression algorithm selection.

## Tuning

Adjust tunable parameters to make ZFS perform optimally under different workloads. The following values can be adjusted at any time via sysctl(8), and can also be permanently set in **/boot/loader.conf** or **/etc/sysctl.conf**.

* `vfs.zfs.arc.max` - The maximum size of the ARC. The default value is `0`, which on FreeBSD resolves to the larger of (total memory minus 1 GiB) and 5/8 of total memory. An explicitly set value must be at least 64 MiB. If the system runs other daemons or processes that need memory, set a smaller value.

  > **Note**
  >
  > You can dynamically modify `vfs.zfs.arc.max` using sysctl(8). However, once the system is running, it cannot be changed back to `0`; furthermore, if the set value is lowered below the current ARC usage, the ARC will not proactively shrink and will only trigger reclamation when the system experiences memory pressure.
* `vfs.zfs.arc.min` - The minimum size of the ARC. The default value is `0`, which resolves to the larger of 32 MiB and 1/32 of total memory. Raise this value to keep memory pressure from other applications from evicting the entire ARC.
* `vfs.zfs.vdev.min_auto_ashift` - The minimum `ashift` automatically used when creating a pool. The value is the base-2 logarithm of the sector size. The default value is `9`, meaning `2^9 = 512`, i.e., a 512-byte sector size. To avoid **write amplification** and achieve optimal performance, set this value to match the largest sector size used by any device in the pool. Common drives have 4 KB sectors. Using the default `ashift` value of `9` with these drives causes write amplification: the data in a single 4 KB write is instead written as eight 512-byte writes. Setting `vfs.zfs.vdev.min_auto_ashift` to `12` (`2^12 = 4096`) before creating the pool forces ZFS to use 4 KB blocks, which gives the best performance on these drives.

  > **Tip**
  >
  > For FreeBSD ZFS systems installed via bsdinstall, the `vfs.zfs.vdev.min_auto_ashift` value defaults to `12`; see the source code file **usr.sbin/bsdinstall/scripts/zfsboot**.

  Forcing 4 KB blocks is also very useful in pools where disk upgrades are planned. Current disks use 4 KB sectors, and the `ashift` value cannot be changed after the pool is created.

  In some situations, a smaller 512-byte block size may be more appropriate. For example, when disks with 512-byte sectors are used for databases or virtual machine storage, smaller blocks transfer less data during small random reads. This can provide better performance when using smaller ZFS record sizes.
* `vfs.zfs.prefetch.disable` - Disables predictive prefetch. A value of `0` enables prefetch, and a value of `1` disables it. The default is `0` (predictive prefetch enabled). Prefetch reads more data into the ARC than was requested, anticipating that the additional data will be needed soon. If the workload consists largely of random reads, disabling prefetch may improve performance by reducing unnecessary reads. Note that this parameter only disables predictive prefetch. It leaves prescient prefetch (such as the prefetch used by `zfs send`) intact, because prescient prefetch never issues I/O that ends up not being needed and therefore cannot hurt performance.
* `vfs.zfs.l2arc.write_max` - Limits the maximum amount of data written per second to each L2ARC device. The default value is `67108864` bytes (64 MiB). Total L2ARC throughput grows linearly with the number of cache devices in the pool.
* `vfs.zfs.l2arc.noprefetch` - Controls whether buffers that were prefetched but never used by applications are kept out of L2ARC. The default value is `1`, meaning such prefetched data is not written to L2ARC. Setting this value to `0` allows it to be written, so that sequential reads from disk are cached to L2ARC and subsequently served from L2ARC. This can be beneficial when L2ARC devices are much faster than pool disks for sequential reads.
* `vfs.zfs.l2arc.mfuonly` - Controls what content is cached from ARC to L2ARC. The default value is `0`, meaning both MRU and MFU data and metadata are cached to L2ARC. When set to `1`, only MFU data and metadata are cached, suitable for scenarios involving reading and writing large amounts of data that will not be accessed again, to avoid wasting L2ARC space. When set to `2`, all metadata (MRU+MFU) is cached but only MFU data, suitable for scenarios where you want to cache as much metadata as possible during high data turnover.
* `vfs.zfs.l2arc.dwpd_limit` - The Drive Writes Per Day limit for L2ARC devices, expressed as a percentage, with a default value of `100`. `100` equals 1.0 DWPD, meaning each L2ARC device writes at most its own capacity once per day. Lower values support fractional DWPD (50 = 0.5 DWPD, 30 = 0.3 DWPD, suitable for QLC SSDs). Higher values allow more writes (300 = 3.0 DWPD). The actual write rate is always limited by `vfs.zfs.l2arc.write_max`. A value of `0` disables the DWPD rate limit. The DWPD limit only takes effect after the initial fill phase is complete and the total L2ARC capacity is at least twice `arc_c_max`.
* `vfs.zfs.txg.timeout` - The maximum number of seconds between transaction groups, i.e., the maximum interval for flushing dirty data to disk. The default value is `5` seconds. If this amount of time has elapsed since the previous transaction group, the current transaction group is written to the pool and a new transaction group is started. If enough data has been written, a transaction group may be triggered earlier. Larger values may improve read performance by delaying asynchronous writes, but this can cause uneven performance while transaction groups are being written.
* `vfs.zfs.vdev.scrub_min_active` - The minimum number of concurrent I/Os per device during scrub operations. The default value is `1`. When the vdev is idle, the concurrency automatically increases to `vfs.zfs.vdev.scrub_max_active`.
* `vfs.zfs.vdev.scrub_max_active` - The maximum number of concurrent I/Os per device during scrub operations. The default value is `3`. Increasing this value can speed up scrub completion, at the cost of higher latency and lower throughput for other reads and writes.
* `vfs.zfs.vdev.rebuild_min_active` - The minimum number of concurrent I/Os per device during sequential rebuild (distinct from traditional resilver) operations. The default value is `1`.
* `vfs.zfs.vdev.rebuild_max_active` - The maximum number of concurrent I/Os per device during sequential rebuild operations. The default value is `3`. Increasing this value can speed up rebuild completion, at the cost of higher latency for other reads and writes.
* `vfs.zfs.vdev.nia_delay` - For non-interactive I/O (scrub, resilver, remove, initialize, and rebuild), the concurrent I/O count is limited to each queue's `min_active` unless the vdev is in an "idle" state. A vdev is considered "idle" when there is no interactive I/O activity and `vfs.zfs.vdev.nia_delay` non-interactive operations have completed since the last interactive operation, at which point the concurrency for non-interactive operations increases to each queue's `max_active`. The default value is `5`.
* `vfs.zfs.vdev.nia_credit` - Some mechanical hard drives process sequential I/O at higher priority, causing concurrent random I/O latency to reach several seconds. To prevent non-interactive I/O (such as scrub) from monopolizing the device, when there are outstanding interactive I/Os, at most `vfs.zfs.vdev.nia_credit` non-interactive operations can be issued. This forced wait ensures that mechanical hard drives process interactive I/O within a reasonable time. The default value is `5`.
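
Several of the defaults above are simple arithmetic on memory size, sector size, or device capacity. The following Python sketch works through that arithmetic as described in this section. The function names are hypothetical helpers written for illustration and are not part of FreeBSD or OpenZFS; the formulas are taken only from the descriptions above and do not query a running system.

```python
from typing import Optional

MIB = 1024 ** 2
GIB = 1024 ** 3


def default_arc_max(total_memory: int) -> int:
    """Effective ARC ceiling on FreeBSD when vfs.zfs.arc.max is 0:
    the larger of (total memory - 1 GiB) and 5/8 of total memory."""
    return max(total_memory - GIB, total_memory * 5 // 8)


def default_arc_min(total_memory: int) -> int:
    """Effective ARC floor when vfs.zfs.arc.min is 0:
    the larger of 32 MiB and 1/32 of total memory."""
    return max(32 * MIB, total_memory // 32)


def sector_size(ashift: int) -> int:
    """Sector size in bytes implied by an ashift value (2 ** ashift)."""
    return 1 << ashift


def min_ashift_for(sector_bytes: int) -> int:
    """Smallest ashift whose sector size is at least sector_bytes."""
    return (sector_bytes - 1).bit_length()


def writes_per_block(device_sector: int, ashift: int) -> int:
    """Number of ashift-sized writes needed to cover one device sector."""
    return max(1, device_sector // sector_size(ashift))


def l2arc_daily_write_budget(device_bytes: int, dwpd_limit: int) -> Optional[int]:
    """Bytes per day allowed by vfs.zfs.l2arc.dwpd_limit (a percentage).
    Returns None when dwpd_limit is 0, which disables the DWPD limit."""
    if dwpd_limit == 0:
        return None
    return device_bytes * dwpd_limit // 100


if __name__ == "__main__":
    ram = 16 * GIB  # assumed example machine with 16 GiB of memory
    print("default ARC max (GiB):", default_arc_max(ram) / GIB)
    print("default ARC min (MiB):", default_arc_min(ram) / MIB)
    print("ashift for 4096-byte sectors:", min_ashift_for(4096))
    print("512-byte writes per 4 KB sector:", writes_per_block(4096, 9))
    print("daily budget, 1 TB device at 0.3 DWPD:",
          l2arc_daily_write_budget(10 ** 12, 30))
```

On small-memory systems the 5/8 term dominates `default_arc_max`, while on larger systems the "total memory minus 1 GiB" term does, which is why an explicit `vfs.zfs.arc.max` is often worth setting on machines that also run memory-hungry services.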

## References

Documentation related to ZFS tuning includes:

* The official documentation of the OpenZFS project at <https://openzfs.github.io/openzfs-docs/index.html>, which includes dedicated chapters on performance and tuning, covering module parameters, workload tuning, and more.
* *The Design and Implementation of the FreeBSD Operating System (2nd Edition)*: describes the design principles of ZFS.
* *FreeBSD Mastery: ZFS* and *FreeBSD Mastery: Advanced ZFS*: of limited value.
* [Oracle Solaris Administration: ZFS File System](https://docs.oracle.com/cd/E26926_01/html/E25826/index.html): this document was written before the OpenZFS project was launched and does not include OpenZFS development progress from the past fifteen years; for reference only.

