
SSD-cached storage at VPS node

Alexei Member

Hello LET!

I am going to rent a dedicated server for several VMs with Proxmox + KVM on Ubuntu.

Storage: 4 x 4TB hard drives in hardware RAID10 (random controller) from Hetzner.

The question is how to speed up I/O. The workload is a fairly large MongoDB database with concurrent writes.

The idea is to add one (or two?) NVMe cards as cache and use software such as bcache, Flashcache, EnhanceIO, dm-cache, etc.

Two questions:

1. Does it make sense to add NVMe for caching?
2. Which cache software would be the most efficient and reliable in this case?

Thanks
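
For a rough feel of what a cache can buy on reads, here is a back-of-envelope sketch in Python. The latency figures (0.1 ms for NVMe, 8 ms for a random HDD access) and the hit ratios are placeholder assumptions, not measurements from this setup, and the model ignores write-back behaviour and queueing entirely.

```python
def effective_latency_ms(hit_ratio, cache_ms, disk_ms):
    """Average access latency when a fraction `hit_ratio` is served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


# Assumed placeholder latencies; substitute values measured on real hardware.
NVME_MS = 0.1
HDD_MS = 8.0

for ratio in (0.0, 0.5, 0.9, 0.99):
    avg = effective_latency_ms(ratio, NVME_MS, HDD_MS)
    print(f"hit ratio {ratio:.2f}: about {avg:.2f} ms per access")
```

The takeaway from this simple model is that the benefit depends almost entirely on the hit ratio, so the size of the hot working set relative to the cache matters more than the cache software.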

Comments

  • willie Member

    Don't ssd cache. Get rid of the hdd's entirely and use a pure ssd system, with fast ssds and enormous amounts of ram. How big is your mongodb? Remember that mongodb works by mmap'ing the entire disk area and then updating with pure memory operations. What is the application if I can ask?

  • PieNotEvenEaten

    Zfs with ssd cache

  • pbgben Member, Provider

    How large? The DB is TB?

  • Alexei Member

    @willie said: Don't ssd cache. Get rid of the hdd's entirely and use a pure ssd system, with fast ssds and enormous amounts of ram. How big is your mongodb? Remember that mongodb works by mmap'ing the entire disk area and then updating with pure memory operations. What is the application if I can ask?

    Unfortunately, a pure SSD setup is not in the budget.

    Right now we have around 1TB of data on that Mongo VM, using the WiredTiger storage engine.

    The application just collects raw data, then aggregates and searches records. For faster search queries, prepared collections are exported to Elasticsearch (which also consumes space).

    @PieNotEvenEaten said: Zfs with ssd cache

    I am afraid of possible failure and data loss. The setup is pretty complex.

    If it is much more efficient and just as reliable as the alternatives, then OK.

    But there is very little info on the net about such setups. Are there any comparative tests?
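
    In the absence of published comparisons, one option is to measure each candidate setup directly. Below is a minimal Python sketch that times small synced random writes in a given directory. The function name, file size, block size and operation count are all illustrative assumptions, and a dedicated tool such as fio would give far more trustworthy numbers.

```python
import os
import random
import tempfile
import time


def random_write_iops(directory, file_size=64 * 1024 * 1024,
                      block_size=4096, n_ops=2000, seed=0):
    """Return synced random-write operations per second in `directory`."""
    rng = random.Random(seed)
    block = os.urandom(block_size)
    n_blocks = file_size // block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.ftruncate(fd, file_size)  # size the test file (sparse on most filesystems)
        start = time.perf_counter()
        for _ in range(n_ops):
            offset = rng.randrange(n_blocks) * block_size
            os.pwrite(fd, block, offset)  # write one block at a random offset
            os.fsync(fd)  # ask the OS to flush the write to the device
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)
    return n_ops / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Point this at a directory on the storage under test.
    print(f"{random_write_iops(tempfile.gettempdir(), n_ops=200):.0f} synced writes/s")
```

    Running the same script against a directory on the plain RAID10 volume and on each cached variant gives a like-for-like number, although a test file this small will mostly exercise the cache rather than the disks behind it.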

    @pbgben said: How large? The DB is TB?

    Now around 1TB of data. Since WiredTiger compresses it, it takes less space on disk.

  • willie Member

    I think you're better off not using Mongo in this case. Mongo is very storage inefficient and it sounds like you're not using its capabilities much. Are you storing and indexing complicated documents with no regular structure? Are you using replication? For what you're doing I'd probably just log the incoming records to disk files and feed them into elastic search in the background.

  • Alexei Member

    @willie said: I think you're better off not using Mongo in this case.

    Maybe in future, because:

    • We had a problem where, after a server reboot, the WiredTiger-backed storage became corrupted and the data was ultimately unrecoverable.

    • We are running into the 16MB object limit.

    @willie said: Are you storing and indexing complicated documents with no regular structure?

    Yes and No

    @willie said: Are you using replication?

    Nope

    @willie said: For what you're doing I'd probably just log the incoming records to disk files and feed them into elastic search in the background.

    Mongo is good for dropping something in.

    Elastic can't fully replace Mongo for us. It has its own limitations:

    • Can't store arrays of arrays.

    • Disk space inefficient. It does not really delete documents, it just "hides" them. So the index size grows fast with concurrent updates.

    • Storage reliability. We simply don't want to use it as the primary data store.

  • willie Member

    The primary data store would be an append-only log: Elastic would just be the index, and you could rebuild it if you had to. Old and rewritten records get deleted when you merge/compress indexes, but yeah, that's a high load if you're updating existing records a lot. Elastic is much more disk efficient than Mongo in typical cases (hmm, at least with Mongo 2.x--I haven't used WiredTiger).
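
    A minimal Python sketch of the append-only idea, assuming one JSON record per line and an `id` field on every record (both are assumptions for illustration). The in-memory dict stands in for the search index, which in a real deployment would be fed into Elasticsearch instead.

```python
import json
import os


def append_record(log_path, record):
    """Append one record as a JSON line and sync it to disk."""
    line = json.dumps(record, separators=(",", ":")) + "\n"
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(line)
        f.flush()
        os.fsync(f.fileno())  # make the record durable before returning


def rebuild_index(log_path):
    """Replay the whole log; the last version of each id wins."""
    index = {}
    with open(log_path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            index[record["id"]] = record
    return index
```

    Because updates are appended as new versions rather than rewritten in place, the log only ever sees sequential writes, and the index can be thrown away and replayed from the log at any time.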

    How much of the TB of data is actually documents, vs. media and other non-searchable stuff? Maybe you could store those separately. 16MB objects sound like they might want this treatment.

    Mongo 2.x data recovery from crashed DBs is doable though messy. There's a ton of redundancy in the extent files so you have good chances of finding all or most of the documents. No idea about WiredTiger.
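
    One way to sketch "store those separately" in Python: large binary fields go to plain files named by their SHA-256 hash, and the document keeps only a small reference. The function names, the `blob_sha256` key and the 1MB threshold are hypothetical choices for illustration; MongoDB's own GridFS is the built-in alternative for oversized files.

```python
import hashlib
import os


def store_blob(blob_dir, data):
    """Write `data` to a file named by its SHA-256 digest; return the digest."""
    digest = hashlib.sha256(data).hexdigest()
    path = os.path.join(blob_dir, digest)
    if not os.path.exists(path):  # identical content is stored only once
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic rename into place
    return digest


def load_blob(blob_dir, digest):
    """Read a previously stored blob back by its digest."""
    with open(os.path.join(blob_dir, digest), "rb") as f:
        return f.read()


def split_document(doc, blob_dir, limit=1024 * 1024):
    """Replace top-level bytes fields larger than `limit` with a blob reference."""
    slim = {}
    for key, value in doc.items():
        if isinstance(value, (bytes, bytearray)) and len(value) > limit:
            digest = store_blob(blob_dir, bytes(value))
            slim[key] = {"blob_sha256": digest, "size": len(value)}
        else:
            slim[key] = value
    return slim
```

    The slimmed-down document stays well under the size limit and remains searchable, while the bulky payload sits on ordinary disk files that need no database at all.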

  • @Alexei 2 x NVMe P3700 400GB for L2ARC and ZIL. 9-12 x 1TB 7200 RPM SAS drives for data in RAID 50, with 3 drives per RAID 5 group.
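
    For comparing the layouts mentioned in this thread, usable capacity can be worked out with a small Python sketch. The function names are made up for illustration, and the arithmetic ignores filesystem overhead and hot spares.

```python
def raid10_usable_tb(n_drives, drive_tb):
    """RAID 10 mirrors pairs of drives, so half the raw capacity is usable."""
    if n_drives < 4 or n_drives % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    return n_drives // 2 * drive_tb


def raid50_usable_tb(n_drives, drive_tb, group_size=3):
    """RAID 50 stripes across RAID 5 groups; each group loses one drive to parity."""
    if group_size < 3 or n_drives % group_size or n_drives // group_size < 2:
        raise ValueError("need at least 2 full RAID 5 groups of 3 or more drives")
    groups = n_drives // group_size
    return groups * (group_size - 1) * drive_tb


print(raid10_usable_tb(4, 4))   # 4 x 4TB in RAID 10 -> 8
print(raid50_usable_tb(9, 1))   # 9 x 1TB, groups of 3 -> 6
print(raid50_usable_tb(12, 1))  # 12 x 1TB, groups of 3 -> 8
```

    With 3-drive groups, RAID 50 keeps two thirds of the raw capacity, against one half for RAID 10.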
