How to set up a storage network for 5000 TB of space
DevOpsYouKnow
Member
Hello Guys,
This is a hypothetical question, but I just want to know how people set up a storage network (in a data center) that can serve around 5000 TB of space (actual data). If it's RAID 10, how much total raw storage would that be? What equipment do they actually need in real life?
In short, how do you set up a cloud storage system for huge capacity like this?
Thanks
Ben
Comments
5000 / (8TB HDD * 4)
625 HDDs, 157 arrays.
7 of these chassis http://www.supermicro.com/products/chassis/4u/
Send me my $50 usd consultation fee
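For what it's worth, the tally above counts raw 8 TB drives (5000/8 ≈ 625); with RAID 10's mirroring the drive count doubles. A rough sketch, with the mirror factor and 4-drive arrays as stated above:

```python
# Drive-count arithmetic for 5000 TB usable in RAID 10.
# RAID 10 mirrors everything, so usable space is half the raw space.
USABLE_TB = 5000
DRIVE_TB = 8
MIRROR_FACTOR = 2          # RAID 10
DRIVES_PER_ARRAY = 4

raw_tb = USABLE_TB * MIRROR_FACTOR        # 10000 TB raw for 5000 TB usable
drives = -(-raw_tb // DRIVE_TB)           # ceiling division -> 1250 drives
arrays = -(-drives // DRIVES_PER_ARRAY)   # 313 four-drive RAID 10 arrays

print(drives, arrays)
```

So if the 5000 TB is meant as *usable* space, the 625-drive figure is the raw capacity and you'd need roughly twice that.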
glusterfs+zfs.
Wow, and how many servers do we need to manage these HDDs? (sorry if that qualifies as a noob question)
The answer to that is in his reply. 7 is the magic number.
While you could potentially just use one server to manage all this, at this scale it would be moronic to have a single point of failure.
Given arrays of 16 HDDs each in RAID-6 with 2 hot spares each, you could fill that 90-bay server with 5 such arrays (5 × 18 = 90 drives). Given that an 8 TB HDD gives just over 7 TB of usable space, you'd end up with 5 × 14 × 7 = 490 TB/server, say 500 TB/server.
That would mean you need 10 servers, but I'd build in some redundancy at the server level, so +2 servers.
That would net you 12 servers with 90 drives each, so just over 1,000 drives required. The cheapest 8 TB drive is $250+ USD, so you'd end up with about $750k worth of drives. I think a fair bet is that you'd spend another $250k on servers and networking equipment, so your 5000 TB storage should come in at around $1M USD.
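A minimal sketch of the arithmetic above (all figures are the poster's ballpark estimates, not quotes):

```python
# Capacity model for the 12-server build: 90-bay servers,
# 16-drive RAID-6 arrays (14 data + 2 parity) plus 2 hot spares each.
BAYS_PER_SERVER = 90
ARRAY_SIZE = 16            # RAID-6: 14 data + 2 parity drives
HOT_SPARES = 2             # per array
USABLE_TB_PER_DRIVE = 7    # an "8 TB" drive formats to just over 7 TB

arrays_per_server = BAYS_PER_SERVER // (ARRAY_SIZE + HOT_SPARES)            # 5
tb_per_server = arrays_per_server * (ARRAY_SIZE - 2) * USABLE_TB_PER_DRIVE  # 490

servers = 10 + 2                          # 10 for ~5000 TB + 2 for redundancy
total_drives = servers * BAYS_PER_SERVER  # 1080 drives
# Note: at the quoted $250/drive this is ~$270k of drives; the $750k
# figure above implies closer to $700/drive (enterprise pricing).
print(tb_per_server, servers, total_drives)
```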
Copy Backblaze?
They claim $118k per PB, so multiply by 5.
Similar to netomx, you could use 500 × 10 TB WD Gold drives in this case:
http://www.supermicro.com/products/chassis/4U/847/SC847DE1C-R2K04JBOD
That's a total of $295,000 for the HDDs alone, and 6 of those cases, one fewer than netomx. Keep in mind this is in RAID 0, so I'd be careful and go RAID 10, doubling the cost of everything.
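The WD Gold numbers above work out as follows (the per-drive price is backed out from the quoted total, and the chassis is assumed to be ~90-bay like the linked Supermicro JBOD):

```python
# Sketch of the 500 x 10 TB WD Gold option.
DRIVES = 500
DRIVE_TB = 10
PRICE_USD = 590            # assumption: $295,000 / 500 drives
BAYS_PER_CHASSIS = 90      # assumed bay count for the linked JBOD chassis

raw_tb = DRIVES * DRIVE_TB                  # 5000 TB, but only in RAID 0
cost = DRIVES * PRICE_USD                   # $295,000
chassis = -(-DRIVES // BAYS_PER_CHASSIS)    # 6 chassis
raid10_drives = DRIVES * 2                  # mirroring doubles drive count
print(raw_tb, cost, chassis, raid10_drives)
```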
What are you going to store?
XXX archived in zips, and backups of the backups × 1000?
1s and 0s. Always in demand.
Assuming they have that much hardware available. And without overselling, I don't think they'd even recover the hardware costs, let alone the network/power!
5 000 TB? Peanuts!
In theory you could build a 5 000 TB storage system, but there are a few points besides the massive price:
Time needed to transfer such an amount of data in/out of the storage cluster. The recovery rate works out to around 47 days to fully dump 5000 TB over a 10 Gbps line;
Maintenance. Dear God, you will curse the day you decided to set up such a monstrosity; you'll have nightmares about HDD replacement and RAID rebuilds;
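The ~47-day figure above checks out, assuming a single fully saturated 10 Gbps link and ignoring protocol overhead:

```python
# Time to move 5000 TB over one 10 Gbps link.
TB = 5000
LINK_GBPS = 10

bits = TB * 1e12 * 8                 # decimal terabytes -> bits
seconds = bits / (LINK_GBPS * 1e9)   # 4,000,000 seconds
days = seconds / 86400               # ~46.3 days, before overhead
print(days)
```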
Better look at this:
simplify your life
I'll just leave this here... walks away
In theory? 5PB is big but not that big.
I think it'd be nuts to build this on commodity hardware unless you have the time and money to engineer it like BackBlaze or Google does.
Everyone I know who has >1PB of data is doing it with purpose-designed storage arrays.
You're already thinking about it wrong: at this scale you don't run simple 'RAID' anymore. You run Ceph or similar systems, which work differently. For a few large-scale nodes you'd use ZFS; for many small ones, Ceph. If you want InfiniBand (which provides speed and RDMA) you'd use GlusterFS.
Large-scale HW RAID (e.g. external SAS chassis like the HP P2000 or D2000, or older MSA30/60) is mostly useless by now. It still has a use as HBA-like expanders for ZFS, but even that is limited: you often get 24-48 disks on a single 4× 3-6 Gbit SAS port (12-24 Gbit total, 0.5-1 Gbit per HDD, not enough at full usage; dual ports are usually used for redundancy, not bonding/separation).
You should just use more nodes with fewer HDDs each. Ceph's ideal config is 4-8 HDDs per node, meaning 40U × 4 HDDs × 10 TB = 1600 TB per rack, or 800 TB usable in a mirror (essentially RAID 10) config.
Interconnect these by 10G (2 switches × 40 ports, LACP, 4× 10G between them, 2× 40G as uplinks), which is plenty for 4 HDDs per node; those cap out, non-mirrored, at 4-5 Gbit anyway.
As master nodes (3 at least) you should use high-speed RAM (DDR4 or many sticks) with 2× 40GbE+ (or bonded 10G ports, which DOES work, but not perfectly) to the 10G switches, then uplink your stuff to these systems.
This comes out to about 800TB mirrored storage in 45U + PDUs.
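A quick sketch of the per-rack math above, with node count and replica factor as described (the rack count for 5 PB is my extrapolation):

```python
# Ceph-style rack: 40 x 1U nodes, 4 x 10 TB HDDs each, 2x replication.
NODES = 40
HDDS_PER_NODE = 4
DRIVE_TB = 10
REPLICAS = 2               # mirror-like, essentially RAID 10

raw_tb = NODES * HDDS_PER_NODE * DRIVE_TB   # 1600 TB raw per rack
usable_tb = raw_tb // REPLICAS              # 800 TB usable per rack
racks_for_5pb = -(-5000 // usable_tb)       # ceiling: 7 racks for 5000 TB
print(raw_tb, usable_tb, racks_for_5pb)
```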
If I needed 5PB of storage I would just call up EMC and have them send out an engineer to do everything for me. Simple as that.
But what about the pleasure of creating an infrastructure and learning something new? When you get a custom-made PC, do you ask engineers to build it for you, or do you build it yourself?
You're talking about something completely different: $1k of hardware you can shrug off if you make a mistake; $1M is a bit different.
Get 5000 1TB disks and run them in a big RAID-0 array. Who wants their data to be available next week anyway?
Get 5001 1TB disks and run them in a big RAID-5 array.
Bummer when one fails and it takes a decade to rebuild, but...
It'll only take a few hours (cough) I mean days.