Distributed Filesystem over Multiple Servers

BigB Member
edited March 2020 in Help

Hello

I want to know what the best solution would be to create a distributed file system.

Example of what I want:

I have 6 Servers with 1TB storage each.

I want a filesystem with 3TB storage distributed over these Servers so that one or more Servers can go offline and the files are still accessible.

The Servers are all connected over WAN and the solution should be free or cheap.

The Filesystem should be expandable dynamically.

Is this possible?

How good / bad are the IOPS?

Thanks

Comments

  • cociu Member

    Yes, the free one is Proxmox with clusters.

  • Ceph

  • Lunar Member

    @Diamondz said:
    Ceph

    No.. Ceph over WAN? You need Ceph on a 10G LAN minimum. Ceph is very latency sensitive.

  • @Lunar said:

    @Diamondz said:
    Ceph

    No.. Ceph over WAN? You need Ceph on a 10G LAN minimum. Ceph is very latency sensitive.

    What about GlusterFS?

  • Lunar Member
    edited March 2020

    @greattomeetyou said:

    @Lunar said:

    @Diamondz said:
    Ceph

    No.. Ceph over WAN? You need Ceph on a 10G LAN minimum. Ceph is very latency sensitive.

    What about GlusterFS?

    Haven't had much experience with it, but I don't believe it's designed for higher latency WAN.

    I know https://tahoe-lafs.org/trac/tahoe-lafs is designed for more distributed setups though.

  • Distributed storage over WAN, why would you ever want to do that?

    You can do it over different DCs with a private 10G link.

  • jfrac Member, Host Rep

    Some time ago I tried GlusterFS on a small gigabit LAN for some workstations; it wasn't very stable or fast. In the end we went with the usual big 16x disk RAID10 server with bonded interfaces.

  • Neoon Community Contributor, Veteran

    GlusterFS, but in general, the higher the latency, the crappier the performance.

  • MooseFS will work if the locations are close enough.

    The more latency between the various components, the worse the performance will be.
    Also, unless it's a private WAN circuit, use a VPN; don't expose any of these clustered filesystems directly to the internet.

  • I think storpool[dot]com will fit your needs. This is not a free solution, but for your capacity it will be affordable.

  • If you just want to store stuff and don't care much about the logistics behind it or how to access it... => take a look at drftpd.

    https://drftpd.org/

  • telimp Member
    edited April 2020

    CephFS over InfiniBand (IB), LizardFS (or MooseFS, which is like LizardFS but not free) over InfiniBand, or GlusterFS over a 10G low-latency network, for cloud or virtual servers

    BeeGFS or HadoopFS (HDFS) for HPC

    If you work on Microsoft, the file system in Windows Server 2019 is very nice

  • How would MinIO S3 play in this?

  • @Terenas said:
    If you just want to store stuff and don't care much about the logistics behind it or how to access it... => take a look at drftpd.

    https://drftpd.org/

    Agree this way works!

  • chx Member
    edited April 2020

    drftpd.org is dead, links are broken, etc. Today it lives at https://github.com/drftpd-ng/drftpd

    I guess you could use http://www.net2ftp.com/ or http://www.smoothftp.com/ as a web frontend.

  • Terenas Member
    edited April 2020

    Does ftpfs support the pret-list thing?
    I have fond memories of using drftpd a long time ago.

  • jsg Member, Resident Benchmarker

    @telimp said:
    ... LizardFS (or MooseFS, which is like LizardFS but not free) ...

    No, LizardFS is based on MooseFS, which also offers a free version, but some good stuff is only available in the commercial version. On the other hand, MooseFS is well tested and established, while LizardFS is relatively new and far less tested.
    Sad story, I know.

    @BigB

    Forget about a simple answer, because there are many factors in play, some of which aren't immediately recognizable. Plus, your priorities aren't clear enough.

    I'm afraid you will have to look at the candidates and check them against your requirements and priorities. Possibly the hardest issue is your "one or more Servers can go offline and the files are still accessible" requirement hand in hand with "as cheap as possible".
    Reason: redundancy is still a hard problem, at least beyond 2 times. One of the main reasons is that not all "ECC" algorithms are able to also correct errors (detecting an error is one thing, being able to also correct it is another thing). Plus, "correction capable error coding" beyond 2 errors gets expensive in terms of overhead. Some systems, for example, need a minimum of 8 (or even more) servers, and you might need triple (or even more) of your net capacity.
    One good solution (in fact the only one I know of) is Mojette Transform erasure coding, which however isn't widely available yet (and what is available isn't extensively tested and well established yet).

    So, my advice is to clearly define your needs and to make a hard priority list ("hard" meaning that some lower priorities can be ignored if the highest priorities are met). And be prepared to not get all of what you want. Oh, and also keep the CAP theorem in mind (or look it up); it might also serve you as a guide when making your priority list.
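
To put rough numbers on the trade-offs discussed in this thread for the original 6 x 1TB example, here is a minimal back-of-the-envelope sketch. It is not tied to any of the filesystems named above. The helper names (`replication`, `erasure_coding`, `max_serial_ops_per_second`) are made up for illustration, the erasure-coding figures assume an MDS code such as Reed-Solomon with one chunk per server, and the IOPS figure assumes a single client issuing synchronous operations one at a time, each costing at least one network round trip.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    name: str
    usable_tb: float         # net capacity visible to clients
    raw_tb: float            # disk space consumed across all servers
    tolerated_failures: int  # servers that can be lost in the worst case


def replication(servers: int, tb_per_server: float, copies: int) -> Layout:
    """Every file is stored in full on `copies` different servers."""
    raw = servers * tb_per_server
    return Layout(f"{copies}x replication", raw / copies, raw, copies - 1)


def erasure_coding(tb_per_server: float, data: int, parity: int) -> Layout:
    """Each file is split into `data` chunks plus `parity` chunks, one per server.

    Assumption: an MDS code (e.g. Reed-Solomon), where any `data` of the
    `data + parity` chunks are enough to rebuild the file.
    """
    raw = (data + parity) * tb_per_server
    return Layout(f"EC {data}+{parity}", data * tb_per_server, raw, parity)


def max_serial_ops_per_second(round_trip_ms: float) -> float:
    """Upper bound for one client doing synchronous operations back to back,
    assuming each operation needs at least one network round trip."""
    return 1000.0 / round_trip_ms


# Candidate layouts for six servers with 1TB each.
for layout in (
    replication(6, 1.0, copies=2),
    replication(6, 1.0, copies=3),
    erasure_coding(1.0, data=3, parity=3),
    erasure_coding(1.0, data=4, parity=2),
):
    print(
        f"{layout.name}: {layout.usable_tb:.1f}TB usable of "
        f"{layout.raw_tb:.1f}TB raw, survives {layout.tolerated_failures} "
        f"server failure(s)"
    )

# Hypothetical round-trip times: LAN-like, nearby sites, distant sites.
for rtt_ms in (0.5, 10.0, 50.0):
    print(f"RTT {rtt_ms}ms: at most {max_serial_ops_per_second(rtt_ms):.0f} serial ops/s")
```

Under these assumptions, both 2x replication and a 3+3 erasure code give 3TB usable out of 6TB raw, but replication is only guaranteed to survive one lost server while the 3+3 code survives three. The sketch ignores metadata servers, rebuild traffic, and chunk placement, all of which differ between real systems, and the latency bound says nothing about parallel or asynchronous workloads.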
