TL;DR: Building a new NAS, working with ZoL on Ubuntu 19.04. Write speeds are reasonable (~300MB/s), but read speeds are awful (wildly fluctuating, averaging around 150MB/s), even in configurations like a 4-way mirror that should just blast out data. (Desired final configuration is a stripe of mirrors or RAIDZ2.) I'm at my wit's end. I've tried nearly everything I can find, and I can't see what's wrong or where to go next. I'm hoping somebody can at least point me in the right direction.
Hardware: Gigabyte C246-based motherboard with an i3-9100, 32GB ECC memory, SATA SSD boot drive, an Intel X520-DA2 NIC with a 10Gb connection to the switch (and back to my other test client machine), 4x HGST HUH721010ALE604 10TB drives attached to motherboard SATA ports.
I've tried single disks, 2-way mirrors, 4-way mirrors, a 2x2 stripe of mirrors, and RAIDZ2. I get about the same results on all of them: reasonably constant and appropriate write speeds for the configuration (200-300MB/s), but wildly oscillating read speeds that are always far under what the configuration should put out (typically 100-150MB/s). And it's frighteningly consistent no matter what the disk configuration is.
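For the curious, these layouts were created with the standard zpool create forms, roughly like this (pool name and device paths are placeholders rather than my exact by-id names; each layout was built fresh, tested, then destroyed before the next):

```
# 4-way mirror
zpool create -o ashift=12 tank mirror \
    /dev/disk/by-id/ata-HGST_A /dev/disk/by-id/ata-HGST_B \
    /dev/disk/by-id/ata-HGST_C /dev/disk/by-id/ata-HGST_D

# 2x2 stripe of mirrors (the layout I actually want to end up with)
zpool create -o ashift=12 tank \
    mirror /dev/disk/by-id/ata-HGST_A /dev/disk/by-id/ata-HGST_B \
    mirror /dev/disk/by-id/ata-HGST_C /dev/disk/by-id/ata-HGST_D

# RAIDZ2 across all four disks
zpool create -o ashift=12 tank raidz2 \
    /dev/disk/by-id/ata-HGST_A /dev/disk/by-id/ata-HGST_B \
    /dev/disk/by-id/ata-HGST_C /dev/disk/by-id/ata-HGST_D
```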
(Edit: I should mention that the use case here is that this is my own personal NAS. I know everybody wants to think in IOPS, but the use case is a huge file tree of mostly large image files, accessed by me from one machine at a time. So it's a single client and large individual files, fed through SMB once I get speeds where I want them. That's why I'm working with real example datasets and measuring things in raw throughput.)
Test setup: Two directories, one with ~12GB of large (25MB) RAW image files, the other with 44GB of RAW image files. I basically use the first one for benchmarking and the second one as an ARC flush. Copy the first on, copy the second on, copy the first off with timing... It doesn't seem to matter whether I copy locally to the boot SSD or over the network; the bottleneck is ZFS disk reads.
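The benchmark itself is just timed copies, roughly like this (paths are placeholders for my actual mount points):

```
# Write the 12GB set, write the 44GB set to push the first one out of
# the ARC, then time reading the 12GB set back off the pool.
cp -a /ssd/raw-12g /tank/test/
cp -a /ssd/raw-44g /tank/test/            # pushes the first set out of ARC
time cp -a /tank/test/raw-12g /ssd/readback/
```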
zfs pool and dataset parameters are pretty much the typical recommendations (applied roughly as in the sketch after this list):
ashift=12
compression=lz4 (and off - no difference)
atime=off
xattr=sa
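Applied in the usual way; the dataset name here is a placeholder, and ashift=12 is set at pool creation time (zpool create -o ashift=12):

```
zfs create tank/photos
zfs set compression=lz4 tank/photos    # also tested with compression=off
zfs set atime=off tank/photos
zfs set xattr=sa tank/photos
```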
The Linux module has the ARC size capped at 24GB rather than the default of half of RAM. I've tried messing with the various other tunables that might affect things (zfs_vdev_async_read_max_active, zfs_vdev_sync_read_max_active, etc.) to absolutely no effect.
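Concretely, the ARC cap goes in /etc/modprobe.d/zfs.conf and the tunables can be poked at runtime through /sys (the queue-depth values below are just example settings for illustration, not what I'm recommending):

```
# /etc/modprobe.d/zfs.conf
# options zfs zfs_arc_max=25769803776    # 24GB in bytes

# Live experimentation with the vdev read queue depths:
echo 12 > /sys/module/zfs/parameters/zfs_vdev_async_read_max_active
echo 12 > /sys/module/zfs/parameters/zfs_vdev_sync_read_max_active
```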
zpool iostat shows that it's reading off all drives roughly equally, at ~40-50MB/s each. Waaaaay under what I should get. System load never even gets to 1, and typically idles at near 0. There's nothing else running on this box.
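That's from watching the per-vdev numbers during the read-back, something like:

```
zpool iostat -v tank 5    # "tank" is a placeholder pool name
```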
The *only* thing I've found that makes a difference is recordsize. At 1M, the read speed evens out (no oscillations), but even then it's only ~240MB/s on a 4-way mirror that should be kicking ass. A single disk can almost hit that. Four of them in parallel should be nearly unstoppable.
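That's just the dataset property (placeholder dataset name again); recordsize only applies to files written after the change, so the test data has to be re-copied onto the dataset afterwards:

```
zfs set recordsize=1M tank/photos
# recordsize only affects newly written files, so re-copy the test set
cp -a /ssd/raw-12g /tank/photos/
```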
I don't think it's the disks or disk controllers or such. I tried BTRFS just for fun and could routinely get 500-600MB/s (basically limited by the copy destination), either locally or over the LAN. But BTRFS for very important data terrifies me. Likewise for Linux md RAID-1 with EXT4 - it also seems to be limited only by how fast my destination can suck up data. Also, if I pull the 12GB dataset back right after I write it, it happily serves it from the ARC as fast as it can. It really seems to be something to do with ZFS fetching from disk.
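For anyone who wants to reproduce the cold-read case directly: exporting and re-importing the pool empties the ARC for that pool, so the next copy has to come off the disks (a sketch with placeholder names):

```
zpool export tank
zpool import tank
time cp -a /tank/test/raw-12g /ssd/readback/
```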
Honestly, if I could get 300-400MB/s out of ZFS, I'd be happy. Anybody have suggestions on where to go next?