Clustered filesystems between VPS in distant datacenters?
Can anyone share their experiences and preference for a clustered/replicated filesystem that will run between VPS in different datacenters around the world?
I could run rsyncs, but would prefer automatic, masterless replication. Maybe GlusterFS or XtreemFS? But do providers enable the fuse module? Anyone still using UnisonFS?
Mucho mahalos.
Comments
Yes, but keep in mind that even 10ms of latency, for example, makes booting containers nearly unusable.
Not to mention KVMs.
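To put a rough number on why a few milliseconds hurts so much, here is a back-of-the-envelope sketch. It assumes every small synchronous file operation waits for one full round trip to the remote storage, and the figure of 5,000 operations per boot is made up purely for illustration.

```python
def added_wait_seconds(sync_ops: int, rtt_ms: float) -> float:
    """Extra wall-clock time if each synchronous operation waits one round trip."""
    return sync_ops * rtt_ms / 1000.0


# Hypothetical workload: 5,000 small synchronous file operations during a boot.
HYPOTHETICAL_OPS = 5000

for rtt_ms in (0.2, 10.0, 13.0):
    wait = added_wait_seconds(HYPOTHETICAL_OPS, rtt_ms)
    print(f"{rtt_ms:>5} ms RTT -> {wait:.1f} s spent waiting on the network")
```

Real filesystems batch and cache, so this is an upper-bound style estimate rather than a measurement, but it shows how the wait scales linearly with round-trip time.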
I tested a Nextcloud installation with the server running at Netcup (DE) and the storage on a GlusterFS volume spread across HostHatch (NL) and UltraVPS (DE). Latency between sites was never more than 13ms. It was usable, but that was enough to cause some split-brain problems after a while.
Maybe a 3rd node would have mitigated the split-brain problem, but that solution would have been a bit expensive for my needs.
Always use at least 3 nodes, otherwise you may run into problems.
If you use just 2 nodes and they lose each other, you are done.
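A minimal sketch of the reasoning, assuming a simple strict-majority quorum rule (the function names here are hypothetical and not taken from any particular filesystem): a partition may keep accepting writes only if it can see more than half of the nodes.

```python
from typing import List


def has_quorum(reachable: int, total: int) -> bool:
    """A partition may accept writes only if it sees a strict majority of nodes."""
    return reachable > total // 2


def writable_partitions(partition_sizes: List[int]) -> List[bool]:
    """For a network split into groups of the given sizes, say which groups keep quorum."""
    total = sum(partition_sizes)
    return [has_quorum(size, total) for size in partition_sizes]


# 2 nodes that lose each other: neither side has a majority.
print(writable_partitions([1, 1]))
# 3 nodes split 2/1: exactly one side has a majority and can keep writing.
print(writable_partitions([2, 1]))
```

With two nodes you either stop writing on both sides or let both sides write and risk split-brain. With three, one side of any split can always carry on safely.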
@vimalware has cluster experience.
So does Colocrossing.
Francisco
and so does the magnificent pony.
Are you talking about this? https://www.cis.upenn.edu/~bcpierce/unison/
That is a file synchronizer (like rsync), not a filesystem. I never heard anyone call it UnisonFS before :P (usually just "Unison")
I use it every day to synchronize files between my devices (home desktop, work desktop, work laptop). I wrote a "blog post" about why it was better for my needs than anything else: http://blog.perennate.com/20160531_unison.html
I run it manually when I log in to the computer and again when I log out. I don't think it does automatic synchronization (which is not something that I want).
I use rsync for this to avoid clustered FS issues
Regarding DFS, I've tried GlusterFS and XtreemFS (also MooseFS and its successor LizardFS) over WAN before; I don't think any of them really work well with high latencies. But GlusterFS is much more mature than XtreemFS/LizardFS.
What would work much better over WAN is a weakly synchronized filesystem that synchronizes in the background. On conflicts it could rename one file and overwrite with the latest version. I'm not sure if there's any production software that does this though.
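The conflict rule described above can be sketched roughly like this. It is only an illustration of the policy, not an existing tool: the `Version` type, the `resolve` function and the `.conflict-<node>` naming are all made up, and it assumes both replicas changed the same file since the last sync and that their clocks are comparable.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Version:
    node: str     # which replica produced this version
    mtime: float  # last-modified time, seconds since the epoch
    digest: str   # content hash, used to detect identical copies


def resolve(path: str, a: Version, b: Version) -> Dict[str, Version]:
    """Return the file names each version should end up under after a sync."""
    if a.digest == b.digest:
        return {path: a}  # same content on both sides, nothing to resolve
    # Latest version keeps the original name; the older one is renamed, not lost.
    winner, loser = (a, b) if a.mtime >= b.mtime else (b, a)
    return {path: winner, f"{path}.conflict-{loser.node}": loser}


plan = resolve(
    "notes.txt",
    Version(node="de", mtime=100.0, digest="aaa"),
    Version(node="nl", mtime=200.0, digest="bbb"),
)
for name, version in plan.items():
    print(name, "<-", version.node)
```

Because the losing copy is kept under a new name, nothing is silently discarded, which is the main attraction of this approach over strict locking across a slow link.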
If you really want to use DFS, I think it might be best to run a three-way-replicated GlusterFS cluster on three close-by VMs, and then mount that cluster from all the rest of your VMs.
With KVM, FUSE is not an issue, since your provider has no control over whether you enable FUSE in your VM. On OpenVZ, most decent providers have a button you can press to enable FUSE, or at least if you open a ticket they can do it pretty quickly. If your provider replies "wut fuse??" or "no" then you should cancel your VPS, since the provider will more likely than not be out of business this time next year.
Edit: sounds like AFS (e.g. openafs) might be a good fit for this (weak consistency). Of course it depends on the semantics you want from your filesystem; if you need to write to a file in parallel from multiple VMs then AFS certainly wouldn't work.
It's common advice not to use GlusterFS installs with two nodes for anything but geo-replication, where you could follow a master-slave hierarchy as well. Unless you want to spread bricks unevenly, it doesn't make sense to opt for a different use according to the n = k + m formula; and if that's the case, your redundancy is purely virtual.
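A small sketch of what the formula implies, assuming it refers to dispersed (erasure-coded) volumes where n is the total number of bricks, k the data bricks and m the redundancy bricks. The constraint n > 2m reflects my understanding of GlusterFS's rule for dispersed volumes, and the function name is hypothetical.

```python
def dispersed_layout(n: int, m: int) -> dict:
    """Split n bricks into k data and m redundancy bricks, with n = k + m."""
    if m < 1 or n <= 2 * m:
        # Assumed rule: redundancy must be at least 1 and less than half of n.
        raise ValueError(f"no valid layout for n={n}, m={m}")
    k = n - m
    return {
        "data_bricks": k,
        "redundancy_bricks": m,
        "usable_fraction": k / n,   # share of raw capacity that holds data
        "tolerated_failures": m,    # bricks that can be lost without data loss
    }


print(dispersed_layout(3, 1))
print(dispersed_layout(6, 2))

try:
    dispersed_layout(2, 1)  # two bricks, one on each node
except ValueError as err:
    print("rejected:", err)
```

Under that assumption, two evenly used nodes cannot satisfy the constraint at all, so the smallest layout with real redundancy needs three bricks on three separate nodes.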