
Distributed Storage Technologies (Like CEPH) - Performance, setup and cost.

randvegeta Member, Host Rep

Hello All,

I am wondering how many providers here use distributed cloud storage technologies in their VPS/Cloud services, and why?

We use Virtuozzo for our cloud infrastructure, but it is not really suitable for LET pricing, given that licensing costs would make up most of an LET budget. That said, we get fairly good performance, high levels of redundancy and, in my opinion, ease of setup.

Our setup includes a cluster of nodes, each with 3 SATA HDDs (7,200RPM), 1 SSD for the OS and 1 SSD for the cache/journal. All components are Enterprise / Data Center grade with features like ECC and Power-Loss-Protection built in.

I would be very interested to know how many other providers here use some sort of cloud storage, what they use, how it performs and what the costs (if any) are. I am particularly interested in free and open-source options like CEPH and how they compare in the real world to commercial products like Virtuozzo.

Are there good free cloud storage solutions? Do they perform well? Are they easy to set up and maintain? Do they work well with commodity hardware?

Or perhaps everyone here is really just deploying regular VPS?

Comments

  • AshleyUk Member
    edited October 2016

    We have just set up a CEPH cluster to look at migrating away from the standard VPS setup via SolusVM.

    It does require some decent hardware and network to get performance: we are running 20Gbps on the storage network and 20Gbps on the client network per node, with SSDs for the journals and an SSD hot cache tier in front of the spinning disks.

    It is all based on and running with inbuilt CEPH features. It does require an understanding of how CEPH works and good command-line skills; there is no pretty point-and-click GUI (not that I have used the commercial versions you mentioned, but I can imagine they are friendlier in that respect).
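
    The cache tier itself is wired up with the standard CEPH CLI. A rough sketch of the commands involved is below (pool names, PG counts and the cache size target are placeholders, not our exact config):

        # Rough sketch: SSD cache pool in writeback mode in front of an HDD-backed pool.
        # Pool names, PG counts and target_max_bytes are placeholders.
        import subprocess

        def ceph(*args):
            # Thin wrapper around the stock `ceph` CLI (admin keyring assumed).
            subprocess.check_call(["ceph"] + list(args))

        # Backing pool on the spinning disks, plus a small SSD pool in front of it.
        ceph("osd", "pool", "create", "hdd-backing", "128", "128")
        ceph("osd", "pool", "create", "ssd-cache", "64", "64")

        # Attach the SSD pool as a writeback cache tier and route client I/O through it.
        ceph("osd", "tier", "add", "hdd-backing", "ssd-cache")
        ceph("osd", "tier", "cache-mode", "ssd-cache", "writeback")
        ceph("osd", "tier", "set-overlay", "hdd-backing", "ssd-cache")

        # Minimal flushing/eviction settings so the cache tier does not fill up.
        ceph("osd", "pool", "set", "ssd-cache", "hit_set_type", "bloom")
        ceph("osd", "pool", "set", "ssd-cache", "target_max_bytes", str(200 * 1024 ** 3))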

    Any questions let me know.

  • DETio Member
    edited October 2016

    @randvegeta said:

    Hello All,

    I am wondering how many providers here use distributed cloud storage technologies in their VPS/Cloud services, and why?

    We use Virtuozzo for our cloud infrastructure, but it is not really suitable for LET pricing, given that licensing costs would make up most of an LET budget. That said, we get fairly good performance, high levels of redundancy and, in my opinion, ease of setup.

    Our setup includes a cluster of nodes, each with 3 SATA HDDs (7,200RPM), 1 SSD for the OS and 1 SSD for the cache/journal. All components are Enterprise / Data Center grade with features like ECC and Power-Loss-Protection built in.

    I would be very interested to know how many other providers here use some sort of cloud storage, what they use, how it performs and what the costs (if any) are. I am particularly interested in free and open-source options like CEPH and how they compare in the real world to commercial products like Virtuozzo.

    Are there good free cloud storage solutions? Do they perform well? Are they easy to set up and maintain? Do they work well with commodity hardware?

    Or perhaps everyone here is really just deploying regular VPS?

    Access to open technologies is very limited in the hosting market; the majority of providers here use SolusVM, which unfortunately doesn't support HA and never will.

    We've actually been building an open-source cloud platform, VirtEngine (https://github.com/virtengine/dash), which supports CEPH in order to build highly available and redundant setups.

    Almost all open-source platforms (OpenStack, OpenNebula and CloudStack) support some form of HA, through Ceph or other storage solutions.

    We don't find CEPH to be costly at all: for one, it's open source and free of course, and secondly it's possible to set up CEPH without the need for SANs. If you look at our documentation (https://docs.virtengine.com/#high-availability), it is possible to set up redundancy with only the following minimum requirements:

    • 2 Drives per Compute Node
    • Private Network

    One drive is automatically converted into cloud storage and can then hold other compute nodes' VMs automatically, which is efficient as most hypervisor nodes already carry two drives.

    This means no expensive SAN is required to get redundancy set up. We do recommend CEPH and it has always been reliable.
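
    In plain CEPH terms, that is the usual hyperconverged layout: one OSD on the spare drive of every hypervisor node. A minimal bootstrap sketch with ceph-deploy is below (hostnames and device names are placeholders, and this is generic CEPH rather than anything VirtEngine-specific):

        # Minimal hyperconverged CEPH bootstrap via ceph-deploy.
        # Hostnames and devices are placeholders; each hypervisor donates its spare drive (sdb).
        import subprocess

        NODES = ["hv1", "hv2", "hv3"]

        def run(cmd):
            subprocess.check_call(cmd.split())

        run("ceph-deploy new " + " ".join(NODES))        # write the initial cluster config / monitor list
        run("ceph-deploy install " + " ".join(NODES))    # push CEPH packages to the nodes
        run("ceph-deploy mon create-initial")            # bring up the monitor quorum
        for node in NODES:
            run("ceph-deploy osd create %s:sdb" % node)  # one OSD per spare data drive
        run("ceph-deploy admin " + " ".join(NODES))      # distribute the admin keyring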

  • Foul Member

    DETio said: Access to open technologies

    You literally spam the same exact shit in every thread,

    Poor marketing skills 101.

    Learn to market better instead of spamming the same shit in every thread.

    Thanked by 3: rm_, k0nsl, Infinity
  • AshleyUk Member
    edited October 2016

    @Foul said:

    DETio said: Access to open technologies

    You literally spam the same exact shit in every thread,

    Poor marketing skills 101.

    Learn to market better instead of spamming the same shit in every thread.

    Exactly what I was thinking, and as someone who's actually in a position to maybe be a customer, it puts you 100% on my no-go list.

    Thanked by 1: Foul
  • Foul Member
    edited October 2016

    AshleyUk said: Exactly what I was thinking, and as someone who's actually in a position to maybe be a customer, it puts you 100% on my no-go list.

    Exactly, it's just making him look desperate.

    @randvegeta:

    Have you checked out OpenStack or OpenNebula?

  • DETio Member
    edited October 2016

    Foul said: You literally spam the same exact shit in every thread,

    Poor marketing skills 101.

    Learn to market better instead of spamming the same shit in every thread.

    Thanks for your feedback,

    What I have said in this thread might have been said before in other discussions related to different things (for example, discussions dedicated to SolusVM), but it is still relevant here because it is:

    • True
    • Related to this discussion - Hosting Providers & the use of distributed cloud storage technologies.
  • Foul Member
    edited October 2016

    DETio said: What I have said in this thread might have been said before on different discussions but it is:

    It's the same thing: you repeating your failed marketing attempts. You say you're launching November 1st, yet there's no alpha or general beta to test it out. You're just in shambles.

    You keep badmouthing SolusVM because you seem to think that's your only marketing strategy. I'd suggest going to a trade school and learning some business administration skills.

    You're driving away future clientele slowly.

    // Not trying to be rude, just being blunt.

    Thanked by 1: rm_
  • DETio Member
    edited October 2016

    Foul said: It's the same thing: you repeating your failed marketing attempts. You say you're launching November 1st, yet there's no alpha or general beta to test it out. You're just in shambles.

    You keep badmouthing SolusVM because you seem to think that's your only marketing strategy. I'd suggest going to a trade school and learning some business administration skills.

    You're driving away future clientele slowly.

    // Not trying to be rude, just being blunt.

    We're simply pointing out the issues with SolusVM, which are themselves entirely true. Pointing out the flaws in competing products isn't such a bad marketing strategy, FYI, unless it's abused or is misleading or untrue.

    We have been beta testing our software for the last 3+ weeks, and have been deploying our commercial cloud platform on-premise for a variety of hosting providers for testing purposes.

    Our open-source edition can be tested by anyone, though you might need some technical experience to get it working.

    We've recently purchased hardware (around 10 compute nodes), colocated at Psychz.net, in order to set up a general public beta test; at this stage we're simply waiting for our PDU to arrive to get it set up.

  • randvegeta Member, Host Rep
    edited October 2016

    @DETio, not sure what SolusVM has to do with the topic of this thread. I have to agree with @Foul and @AshleyUK; this kind of looks like shameless self-promotion, particularly since you did not mention anything about cloud storage options other than that your own system will make use of CEPH.

    @AshleyUK, how is the performance of CEPH? How would you compare it to say a locally attached SATA drive?

    Actually, with regard to Virtuozzo Cloud Storage, there is no fancy GUI; it is all set up via CLI. But it is well documented and the installation was, for me, very easy to understand. It also integrates seamlessly with their VMs and containers (obviously). CEPH is completely separate from the virtualization servers (or am I wrong?), so I think the setup would be more complex either way.

    Our Virtuozzo cloud also uses pretty substantial hardware, running over 10G. I don't see the need for 20G given we only have 3 disks per node and they run at 3Gbit/s each. Unless the cluster is rebuilding, the network load is very low, pushing just a few tens of Mbit/s at any given time.

    I am generally happy with the performance in terms of IOPS. Benchmark results below:

    • Sequential read performance, 4 threads: 620.18134 MB/s
    • Sequential write performance, 4 threads: 134.34494 MB/s
    • Random read performance, 16 threads: 11287.41602 iops
    • Random write performance, 16 threads: 1017.76062 iops

    So it's pretty good IMO. But at $3 /100GB in licensing, it is not exactly cheap. So I'm really interested in alternative options.
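
    For anyone who wants to compare like for like on another backend, a rough fio equivalent of those four tests is sketched below (block sizes and the test file path are guesses/placeholders, not the exact tool Virtuozzo ships):

        # Rough fio equivalents of the four benchmark lines above.
        # Block sizes and the test file path are placeholders.
        import subprocess

        COMMON = ["fio", "--ioengine=libaio", "--direct=1", "--group_reporting",
                  "--time_based", "--runtime=60", "--size=4G",
                  "--filename=/mnt/cloudstorage/fio.test"]

        JOBS = [
            ("seq-read",   ["--rw=read",      "--bs=1M", "--numjobs=4"]),
            ("seq-write",  ["--rw=write",     "--bs=1M", "--numjobs=4"]),
            ("rand-read",  ["--rw=randread",  "--bs=4k", "--numjobs=16"]),
            ("rand-write", ["--rw=randwrite", "--bs=4k", "--numjobs=16"]),
        ]

        for name, extra in JOBS:
            subprocess.check_call(COMMON + ["--name=" + name] + extra)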

    @Foul - No, I have not checked out either OpenStack or OpenNebula. What can they offer in terms of distributed cloud storage?

  • DETio Member
    edited October 2016

    randvegeta said: not sure what SolusVM has to do with the topic of this thread.

    To be honest, an innocent mistake: I misread "I am wondering how many providers here use distributed cloud storage technologies in their VPS/Cloud services, and why?" as:

    "Why do barely any providers here use DRBD technologies to offer HA?" Hence I started off with how many providers use SolusVM, which is limited to LVM storage.

    I've since edited my comment, now that I've noticed. :-)

    @Foul, now I understand where you are coming from. Apologies for seeming like a jerk.

  • randvegeta Member, Host Rep

    @DETio,

    Do you support any other distributed storage tech? If so, what else?

    Also, if you are supporting them on your platform, have you done any performance benchmarks?

  • William Member

    DETio said: which is limited to LVM due to the fact that Ceph or other RBD technologies bring in High Availability.

    You... what? Why should this be a limit? The fuck are you even talking about? I can mount Ceph/GlusterFS/ZFS via iSCSI PERFECTLY FINE and run SolusVM via LVM on it, and you know this as much as I do, so don't talk crap.

    DETio said: Why barely any providers here use DRBD technologies to offer HA?

    Because DRBD, the original implementation, is horrible crap. If you've ever used it - I did, for many years in hosting - you would know that.

    DETio said: it's possible to set up CEPH without the need for SANs

    You might want to note, though, that your 2-node cluster is useless then, and the performance... useless as well. Ceph works by scale, not by having 2 nodes.

    randvegeta said: Our setup includes a cluster of nodes, each with 3 SATA HDDs (7,200RPM)

    Do not use SATA for storage at scale; SAS can write and read to cache at the same time, while SATA is limited to one operation at a time. Ideally you want NVMe via PCIe anyway, but that is SSD only.

    randvegeta said: I would be very interested to know how many other providers here use some sort of cloud storage, what they use, how it performs and what the costs (if any) are.

    We have ~100TB in Ceph (over ~30 nodes) and about 150TB as ZFS (far fewer, larger nodes). Ceph performs better but requires much more backend work, plus the redundancy works differently from a traditional RAID setup. Unlike what the guy above states, at scale Ceph works fine on Gbit, especially if you only have 2 HDDs per server.
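
    If you want to keep an eye on raw vs. used capacity from a script, the librados Python binding exposes it. Rough sketch, assuming a readable ceph.conf and client keyring on the box you run it from:

        # Print cluster-wide raw vs. used capacity via python-rados (totals are reported in KB).
        import rados

        cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
        cluster.connect()
        try:
            stats = cluster.get_cluster_stats()
            total_gb = stats['kb'] / 1024.0 / 1024.0
            used_gb = stats['kb_used'] / 1024.0 / 1024.0
            print("raw: %.1f GB total, %.1f GB used" % (total_gb, used_gb))
        finally:
            cluster.shutdown()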

    randvegeta said: Are they easy to set up and maintain?

    No, not really.

    randvegeta said: Do they work well with commodity hardware?

    Ceph can take an awful lot of failure and still rebuild. It is, however, not the holy grail of HA everyone says it is; we see failures constantly at our size that probably do not pop up on a 3-server cluster in some basement of @DETio's.

  • DETio Member
    edited October 2016

    randvegeta said: Do you support any other distributed storage tech? If so, what else?

    https://www.drbd.org/en/

    https://ceph.com - we recommend CEPH

  • DETio Member
    edited October 2016

    William said: You might want to note, though, that your 2-node cluster is useless then, and the performance... useless as well. Ceph works by scale, not by having 2 nodes.

    A 2-node cluster with a failover node is definitely not 'useless'. Ceph does gain its advantage on larger deployments, but we are simply noting what the minimum requirements are. We don't really recommend a 2-node cluster for production; it is a proof of concept showing how it functions.
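
    For a cluster that small, the pool replication just has to match the node count. A rough sketch with the standard CEPH CLI is below (the pool name is a placeholder, and min_size 1 trades safety for availability):

        # Shrink a pool's replication to fit a 2-node cluster.
        # Pool name is a placeholder; the default size of 3 with a host-level
        # failure domain cannot place all replicas on only 2 hosts.
        import subprocess

        def ceph(*args):
            subprocess.check_call(["ceph"] + list(args))

        ceph("osd", "pool", "set", "vm-pool", "size", "2")      # two copies, one per host
        ceph("osd", "pool", "set", "vm-pool", "min_size", "1")  # keep serving I/O with a host down,
                                                                # at the cost of a single copy while degraded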

  • randvegeta Member, Host Rep

    William said: Do not use SATA for storage at scale; SAS can write and read to cache at the same time, while SATA is limited to one operation at a time. Ideally you want NVMe via PCIe anyway, but that is SSD only.

    SAS is not worth the money. We will either use SATA or just go straight for SSDs. Our current deployment has acceptable performance for what we need, as our SSD cache improves overall performance. Nowhere near as good as pure SSD storage, but better than plain SATA.

    SAS drives, I'm sure, are much better than SATA, but the price (at least in HK) doesn't seem much lower than SSDs, and SSDs are much, much faster.

    I'm also wondering if the SSD cache/journal makes up for any shortfalls of SATA versus SAS?

    What is the replica ratio on CEPH systems? With Virtuozzo, everything is triplicated, so only 1/3 of the raw disk is actually usable. Multiple replicas do, however, improve the read/write performance and, I think, like CEPH, mean the more nodes in the cluster, the better!
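
    Back-of-the-envelope, here is what that replica overhead means in usable space (placeholder numbers, and assuming CEPH also defaults to 3 replicas, which I gather it does):

        # Usable capacity of a replicated cluster: raw capacity divided by the replica count.
        def usable_tb(nodes, drives_per_node, drive_tb, replicas):
            raw = nodes * drives_per_node * drive_tb
            return raw / float(replicas)

        print(usable_tb(5, 3, 2.0, 3))   # 3 replicas: 10.0 TB usable out of 30 TB raw
        print(usable_tb(5, 3, 2.0, 2))   # 2 replicas: 15.0 TB usable, but less safety margin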

    Do you have any experience with Virtuozzo? If yes, what do you think of the performance compared to CEPH?

    My company has a number of old machines that are being decommissioned, as they really don't make any sense to sell as dedicated servers (in HK) and are not suitable/required for VPS or shared hosting. Some of our old machines are being shipped to Lithuania, where they make more sense, but with whatever we don't ship I am considering building inexpensive cloud storage.

    I am thinking about taking our older servers (mostly i3-3220 or similar), filling them with our older HDDs (mostly 1-2TB), sticking with a gigabit network and using them as general-purpose cloud storage instead of commercial NAS/SAN boxes. How long would it take to set up a small 5-node cluster?

    Is ZFS easier to set up/manage?
