ZFS write IOPS

Hi guys, I am running out of space on my 2 TB NVMe SSD server and want to upgrade to 4 TB, but 4 TB of SSD storage is getting out of my budget. I use it as a file server for a service with ~10k concurrent users, so IOPS reach around 1k at times (7k during backups, but a slow backup is not a problem). Would ZFS on a server with something like 16 GB RAM, a 512 GB SSD and a 4 TB HDD be able to achieve the same IOPS (around 1k)? That would be much cheaper, as this configuration is available on Hetzner auctions. Thanks!

Comments

  • dfroe Member, Provider

    It depends. Of course you cannot expect similar IOPS from an HDD array compared to an SSD.

    If you use a couple of GB of the SSD as ZIL and the remaining space as L2ARC, you may see similar results in certain use cases.

    However, for read performance I doubt that all of your regularly accessed data will fit into 500 GB of L2ARC, and obviously reading data from HDD will be much slower than from SSD.

    Regarding write performance, it depends on whether you are optimizing for async or sync writes. Async writes always go to RAM first without involving the ZIL, but for large write volumes your sustainable rate will be limited by HDD throughput. For sync operations a ZIL on SSD will help, but remember to mirror it - and of course use ECC RAM, especially with ZFS, if your data is important to you.

    Designing storage systems can be quite complex and depends on a lot of factors. But if your question is whether a system with a 4 TB HDD + 512 GB SSD can typically reach the performance of a 2 TB NVMe system, then the answer is: most likely not.
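
    If you do go the ZFS route, a minimal sketch of that split could look like the lines below (pool and device names are placeholders, and a production SLOG should really be a mirror of two devices):

    # assuming a pool named "tank" and an SSD at /dev/nvme0n1, partitioned
    # into a small SLOG partition and a large L2ARC partition
    zpool add tank log /dev/nvme0n1p1     # ZIL/SLOG; ~10-20 GB is usually plenty
    zpool add tank cache /dev/nvme0n1p2   # remaining space as L2ARC
    zpool status tank                     # verify the log and cache vdevs show up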

  • Falzo Member
    edited November 18

    I agree with @dfroe but want to add that, besides all these options, your achievable IOPS also depend heavily on the block size, which in the end also means on the file size.

    Technically your SSD has an IOPS limit and a bandwidth limit, of course. With a large block size you obviously won't need many IOPS to reach the bandwidth limit, while with tons of small files / a small block size you might not even get to the bandwidth limit before you run out of IOPS...

    So it also depends on your specific workload in terms of the number and size of files. You can see the effect with a quick fio sweep over block sizes - see the sketch below.

    PS: practically speaking, I would guess you won't be satisfied with ZFS.
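
    Something along these lines shows the effect (purely illustrative - the file path, size and block sizes are placeholders, adjust them to your setup):

    # hypothetical fio runs against one test file; compare the IOPS reported
    # at a small block size with the bandwidth ceiling at a large block size
    fio --name=bs4k --filename=/path/to/fio.test --size=1G --rw=randrw --rwmixread=50 \
        --bs=4k --ioengine=libaio --iodepth=64 --runtime=30 --time_based
    fio --name=bs1m --filename=/path/to/fio.test --size=1G --rw=randrw --rwmixread=50 \
        --bs=1m --ioengine=libaio --iodepth=64 --runtime=30 --time_based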

  • SplitIce Member, Provider
    edited November 18

    Moving from a 2 TB NVMe drive to a 4 TB HDD is a considerable IOPS decrease.

    Typical IOPS for spinning rust is in the range of ~50/s, so you aren't going to be achieving 1-7k regardless of how you use your 512 GB drive (write cache, ZIL, etc.) unless you perhaps only have a very small active dataset.

    Although your backup job will mean you don't...

  • rcxb Member

    @SplitIce said:
    Typical IOPS for spinning rust is in the range of ~50/sec

    That seems extremely low. Here's what a quick search turned up:

    The HGST Ultrastar He6 averages 204 IOPS at QD256, while the 7K4000 delivers 215 IOPS.

    Source: https://www.tweaktown.com/reviews/6211/hgst-ultrastar-he6-6tb-helium-enterprise-hdd-review/index.html

  • @rcxb said: That seems extremely low.

    Take note of the HDD cache and overall system RAM. Large HDDs tend to have massive caches, plus RAID cards with tons of caching.

  • SplitIce Member, Provider

    @rcxb I measured some drives to get that number (got 45-55 on average). Of course fast enterprise drives will do more; it's the scale that matters (a factor of 100!).

  • @dfroe @Falzo Thanks for the reply! The average file size is around 3 MB and the maximum (99th-percentile) file size is 130 MB. The write volume is around 50 GB per day at most, so there is plenty of time to flush to disk (practically no users from 12 PM to 6 AM). Also, frequently accessed data will be around 100 GB at most, so I am optimistic that I can pull this off. I am thinking of trying lvmcache (rough sketch below); I will update here once that is done.
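
    For context, a generic lvmcache setup along these lines is what I have in mind (all device, VG and LV names plus the sizes are placeholders, not the final setup):

    # HDD as the origin volume, a free SSD partition as the cache
    pvcreate /dev/sda /dev/sdb3
    vgcreate vg0 /dev/sda /dev/sdb3
    lvcreate -L 3.5T -n data vg0 /dev/sda                      # origin LV on the HDD
    lvcreate --type cache-pool -L 400G -n cpool vg0 /dev/sdb3  # cache pool on the SSD
    lvconvert --type cache --cachepool vg0/cpool vg0/data      # attach the cache
    # default cache mode is writethrough; --cachemode writeback helps write IOPS,
    # but unflushed writes are lost if the (unmirrored) SSD dies
    lvs -a vg0                                                 # verify the cached LV
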

  • @nowthisisfun I actually use SSD-cached ZFS on my home server, and though it is a quite old 120 GB SSD, it caches a 4x4 TB striped-mirror zpool. I just ran fio via yabs on it for you:

    fio Disk Speed Tests (Mixed R/W 50/50):
    ---------------------------------
    Block Size | 4k            (IOPS) | 64k           (IOPS)
      ------   | ---            ----  | ----           ---- 
    Read       | 7.58 MB/s     (1.8k) | 118.95 MB/s   (1.8k)
    Write      | 7.62 MB/s     (1.9k) | 119.57 MB/s   (1.8k)
    Total      | 15.21 MB/s    (3.8k) | 238.52 MB/s   (3.7k)
               |                      |                     
    Block Size | 512k          (IOPS) | 1m            (IOPS)
      ------   | ---            ----  | ----           ---- 
    Read       | 218.53 MB/s    (426) | 217.69 MB/s    (212)
    Write      | 230.14 MB/s    (449) | 232.19 MB/s    (226)
    Total      | 448.68 MB/s    (875) | 449.88 MB/s    (438)
    

    So as you can see, this is more or less all cached: for large block sizes the bandwidth clearly limits the available IOPS, but for smaller block sizes it manages to hit more than 1k.

    Keep in mind this is just a single test file that easily fits into the cache (probably even into ARC)... also, the underlying quasi RAID-10 might reach 200-250 IOPS on its own as well.

    The workload of your use case is much more stressful and constant, so it will be more problematic; you will probably only find out through thorough testing - something like the run below against a dataset larger than your cache should be more telling.
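
    A rough starting point for such a test could be the following (path and size are only placeholders; the mixed 50/50 profile mimics what yabs runs):

    # hypothetical mixed read/write test against the cached volume;
    # make --size larger than the SSD cache so misses actually hit the HDD
    fio --name=cache-test --directory=/mnt/cached --size=600G \
        --rw=randrw --rwmixread=50 --bs=4k --ioengine=libaio \
        --iodepth=64 --runtime=120 --time_based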

    I would be very interested in your findings with LVM cache...

