Is Raid 5 acceptable again now with SSDs?
Let's get this out of the way first: RAID 5 isn't acceptable on modern high-capacity spindle drives. It's not, don't argue that it is, because it's well documented that it's not. The high failure rate during rebuilds, along with slow drive access and horrendously slow rebuild times, makes it clearly not acceptable. RAID 6 is perhaps a little better since you get a second drive of redundancy, but it's still plagued by the slow, risky rebuild and subpar write speeds.
With that said, what about RAID 5 using SSDs? Personally I haven't done it, but SSDs are stupid cheap right now and the thought is tempting. I'm sure a number of providers here have tinkered with it or run their whole "ship" off of it. What has your testing shown?
What's a typical rebuild time vs. spindles, and what read/write speed gains have you seen?
Thanks!
Comments
RAID-1 is much better than RAID-5 with SSDs
RAID5 kills SSDs faster, due to the amount of writes RAID5 performs during parity operations. Generally you'll want an SLC SSD to mitigate that.
Go with RAID 1
And if you have 4 disks, then go for RAID 10
Of course RAID 1 is better than RAID 5, and a 4-drive RAID 10 is better than RAID 5. But what about 5-8 drive arrays? Does anyone have real-world experience with larger SSD arrays in RAID 5?
I cannot think of a single use case for RAID 5 on SSDs
When you consider that chassis with 3 drive bays essentially do not exist (it's 2, 4, then more), the only possible reason I can imagine is that someone has spent all their pocket money, can only afford 3 drives, needs a bit more storage than RAID 1 provides, and is not brave enough for RAID 0.
RAID 5 for spinners is fine if the use case is high storage within a limited number of drive bays, e.g. 4, when the cost of going up a U (plus additional power and a significant increase in chassis cost) does not outweigh the benefits, especially in a market where everyone wants 1TB for $3.
So I am not arguing with you, I am just saying your analysis of spinner use in RAID 5 is wrong, although as I type this I also appreciate you probably just did not want to discuss that at all, sorry.
tl;dr the gains and losses will be the same as spinners in percentage terms, but if you're going for a 4-bay server just buy bigger drives and use RAID 10. RAID 5 on SSDs is only for those who cannot afford 4 drives but need more storage than RAID 1 provides.
https://www.lowendtalk.com/discussion/160169/its-my-birthday-crazy-offers-lifetimes-directadmin-hosting-in-germany-and-los-angeles
Ask Mike how it works out for him...
Well, he rents the machines, so he does not pay for replacements if he wears them out faster. The result is more storage for him.
I suspect RAID 5 with decent SSDs, given that storage bonus, is not bad for shared hosting.
Sorry to spoil the mass of the church of "everyone knows".
Raid 6 is not Raid 5 plus another XOR. It is Raid 5 plus a considerably more compute-intensive algorithm (Galois field arithmetic), so Raid 6 will rebuild slower than Raid 5. And btw, even for the simple Raid 5 XOR, the processors usually used on (not ultra-cheap) Raid cards have hardware support.
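To make the XOR mechanism concrete, here's a minimal Python sketch (byte values invented purely for illustration) of how Raid 5 parity works: the parity block is the XOR of the data blocks, so any single lost block can be rebuilt by XORing whatever survives.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-sized byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Three data blocks of one stripe on a hypothetical 4-disk Raid 5 array
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\x0f\x0e"
parity = xor_blocks([d0, d1, d2])

# "Lose" d1 and rebuild it from the remaining blocks plus parity
rebuilt = xor_blocks([d0, d2, parity])
assert rebuilt == d1
```

The same XOR recovers any one of the four blocks, which is exactly why a single additional drive failure (or unreadable sector) during a rebuild is fatal.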
The difference between spindles and SSDs mainly comes down to price, size, and the nature of SSDs (and NVMe drives), which however can be addressed properly. Now to price... looking for a reasonably good-quality drive (2.5 million hrs MTBF, etc.), one finds that the options boil down to paying about 10 times the price of a decent spindle. Example: a 12 TB spindle is about 400€, a 12 TB SSD about 4000€.
Probably the biggest error I see being made again and again is to play that game without context. What are you after? A large general storage array? An array for your database? or ...?
For one of the classical use cases, large general (non-specific) storage of say 100 TB, one needs a very, very well-filled purse to go the SSD route.
Other relevant questions are how serious your need is to have your storage array online at full speed, how you do your backup, etc. Raid 5 + hot spare can be an excellent and reliable solution and Raid 6 is not per se better, nor is Raid 10 always the best answer.
So, my answer to OP is: Raid 5 with SSD was never not acceptable but it was rarely the best solution.
One of three SSDs in a dedi, RAID 5 on an LSI MegaRAID, moderate workload, only hosting.
~120TB written in about 6 years :-D
Back in the day it was cheap for nearly 600GB of hardware-cached, RAIDed SSD ;-)
Huh? RAID5 has lower write overhead than RAID1. The worst case is a 3-disk array, where the overhead is 50%, and it goes down with larger arrays.
Although technically correct, I am unsure if this matters in this day and age, with CPUs being as advanced as they are. On NVMe I can see this making a difference (it might become CPU-bottlenecked), but on spinning rust it's most likely IO-limited. I have no clue about "normal" SATA SSDs.
You'll be fine with raid 5 and some proper SSDs. But sure if you're after consumer hardware, then I agree - raid 5 might not be the best choice.
Any decent enterprise-grade drive has enough endurance to run in a raid 5 setup, for 5-6 years with even rather decent workloads on them.
In X company we did 20x 960GB (or 1920GB in some cases) SSDs in RAID 6, and we'd still throttle on IO when a drive would rebuild... which, I guess, makes sense :')
With that said - I can't remember when I last saw a raid 5 environment.
I'm thinking about either 3, 5, or 6+ consumer 256GB or 512GB SSDs in a RAID 5 with a hot spare that backs up to a spindle drive nightly. Run simple Linux software RAID (I find modern CPUs have no problem with RAID 5). Couple that with a cheap 10Gbps PCI card for a pretty decent and quick NAS box.
Since SSDs are dirt cheap right now and incredibly easy to just tack inside a case, it seems actually viable these days. This wouldn't be for production or business use. I've found that high-quality KVMs can be had pretty cheaply these days, and object storage solutions fill the gap nicely for big storage needs. Running a dedicated server is all but unnecessary unless you have a specific security need.
However, I'm finding that my local NAS solution is lacking the I/O speeds I need locally to satisfy my changing use cases. SSDs are a good fit, but I want to be able to keep tossing more in and keep growing as needed.
use zfs.
I have about five only slightly used 250GB ssds laying around right now, interested? 😁
though shipping can be costly depending on your location and may be a dealbreaker...
The processor is just one problem; the other is memory. The processor is a problem because the Galois field GF(2^8) calculation is significantly more compute-intensive than the simple XOR operation. And memory is a problem because the block sizes are considerably larger than the cache lines. One classical approach to solving that is to use ASICs, which typically are but "bent" standard processors (typically Arm, sometimes PowerPC) with a different cache control and structure, e.g. in the form of fewer, larger lines.
A modern x86 can of course do those calculations too, but that's a waste of both computing and electrical power, the latter of which also creates BBU power backup problems.
@Zerpy @sureiam
Let me introduce the other enemies, URE and BER, about both of which there is lots of talk, lots of misunderstanding, and little tangible, let alone reliable, information. Typical numbers that get thrown around are 10^14 for consumer spindles, 10^15 for enterprise spindles, 10^15 for consumer SSDs, and 10^16 for enterprise SSDs.
But there are ugly buts, one important one being the fact that those numbers are (a) highly likely wrong and misunderstood, and (b) statistical values.
So no, an enterprise spindle is not likely to have a URE about every 125 TB. Its actual URE rate is more likely to be good for 1250 TB - but an error may also happen after just 1.25 TB.
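That 125 TB figure is just the expected value implied by a 10^15 rate. A common back-of-envelope model (a per-bit Poisson process; the drive sizes below are invented for illustration) shows what a rebuild actually faces at the quoted rate versus a more realistic one:

```python
import math

def rebuild_survival_probability(bytes_read, ure_rate_bits=1e15):
    """Probability of reading `bytes_read` with no URE, treating the quoted
    rate (one error per `ure_rate_bits` bits read) as a per-bit Poisson rate."""
    bits = bytes_read * 8
    return math.exp(-bits / ure_rate_bits)

# Rebuilding a 5x12TB Raid 5 array means reading the 4 surviving drives: 48 TB
tb = 1e12
print(f"at 10^15: {rebuild_survival_probability(48 * tb):.1%}")        # ~68%
print(f"at 10^17: {rebuild_survival_probability(48 * tb, 1e17):.1%}")  # ~99.6%
```

Taking the 10^15 spec at face value makes big rebuilds look like coin flips; at 10^17 the "Raid 5 always dies during rebuild" story largely evaporates, which is exactly the point.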
Another issue is the story about Raid 5 completely failing during a rebuild due to URE. That story is highly likely wrong and based on a few worst case experiences (keep in mind, it's based on statistics).
Plus, of course, drive manufacturers have come up with their own solutions, which unfortunately are usually proprietary. The important takeaway is that one can considerably enhance protection against the much-feared "Raid 5 rebuild doesn't work beyond 12.5 TB" story. The sad part is that there is also a legal reality in which enterprises are much more likely to sue a drive manufacturer, and to pull it through, than the average Joe Consumer is. The result is that enterprise-grade drives are in fact really more reliable than consumer drives: in the former, the stated URE is highly likely a worst case (read: the flat end of a bell curve), while in the latter it's highly likely a positive outlook (read: the 85% center of a bell curve).
Translation into reality: actual UREs of enterprise spindles are highly likely more like 10^17, and the total loss of a whole Raid 5 array is extremely unlikely (under normal load and with a proper controller).
I've seen excited stories about SSDs being about 100 times more reliable than spindles. That may or may not be the case but it's largely theory because (enterprise grade) SSDs (to not even talk about NVMes) also are 10+ times more expensive than spindles - and keep in mind what we were talking about in the first place: large storage (You'll probably not run your local 1 TB drive in a Raid 5 or 6).
But we are told, there is a saviour, Raid 6. Well, sorry no, not really. Raid 6 doesn't protect you from UREs and BER problems plus Raid 6 comes with "embedded Raid 5" as one of its mechanisms.
I personally almost always run Raid 5 - plus - I do backups. And that, backups, is the real protection against Raid failure plus it's considerably cheaper than throwing disks at URE (which Raid 6 does).
As for ZFS, I'm looking at it with the eyes of a security developer: ZFS goes against KISS in a big way. Throwing additional layers and complexity at problems that are about reliability and availability is a big no-no.
I've done RAID 6 across 4 smaller spindle drives where data storage was more important than speed. Where RAID 10 gives 1 drive failure, RAID 6 gave 6. In hindsight RAID 10 would have been better, but it's also lasted 5 years now off of used Hitachi drives, soooo who's to say what was better.
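For a 4-drive array the trade-off is easy to tabulate. This sketch (the 4 TB drive size is illustrative) compares usable capacity against the number of failures each level is guaranteed to survive:

```python
def array_properties(level, disks, disk_tb):
    """Usable capacity (TB) and guaranteed failure tolerance per RAID level.
    RAID 10 can sometimes survive 2 failures if they hit different mirror
    pairs, but only 1 is guaranteed."""
    if level == "raid5":
        return (disks - 1) * disk_tb, 1   # one disk's worth of parity
    if level == "raid6":
        return (disks - 2) * disk_tb, 2   # two disks' worth of parity
    if level == "raid10":
        return disks // 2 * disk_tb, 1    # half the disks are mirrors
    raise ValueError(level)

for level in ("raid5", "raid6", "raid10"):
    cap, tol = array_properties(level, disks=4, disk_tb=4)
    print(f"{level}: {cap} TB usable, survives {tol} failure(s) guaranteed")
```

On 4 drives, RAID 6 and RAID 10 give the same usable capacity, but RAID 6 guarantees surviving any 2 drive failures where RAID 10 only guarantees 1.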
ZFS confuses and frustrates me. I feel like with that kind of effort I would consider minio s3 before ZFS.
+1 for @jsg comment.
RAID should never be your backup plan, just get an HDD for backups.
I wouldn't use raid 5....
Watched a raid 5 nuke itself during rebuild after another drive had failed.
Minio is great, especially when using distributed mode. I currently have Minio in distributed mode across 4 servers and 4 drives. Couple that with the Minio mc client, which has the ability to "watch" a folder for changes and additions. I'll take Minio's redundancy for backup purposes any day over RAID 5 or ZFS.
GF(2^8) computation doesn't require much memory at all. For a 4 disk RAID6, only multiply by 2 is needed, which is only slightly slower than XOR.
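For the curious, here is roughly what that multiply-by-2 looks like in Python, using the 0x11d polynomial the Linux RAID 6 code uses. It really is just a shift plus a conditional XOR:

```python
def gf_mul2(x):
    """Multiply a byte by 2 in GF(2^8), RAID 6 polynomial 0x11d."""
    x <<= 1
    if x & 0x100:       # overflow past 8 bits: reduce by the polynomial
        x ^= 0x11d
    return x & 0xff

def raid6_pq(data_bytes):
    """P (XOR) and Q (GF(2^8)) parity for one byte position across data disks,
    computed with Horner's rule: Q = d0 + 2*d1 + 2^2*d2 + ..."""
    p = q = 0
    for d in reversed(data_bytes):
        p ^= d
        q = gf_mul2(q) ^ d
    return p, q

p, q = raid6_pq([0x01, 0x02])   # two data disks -> a 4-disk RAID 6
assert (p, q) == (0x03, 0x05)   # P = d0^d1, Q = d0 ^ gf_mul2(d1)
```

With only a multiply-by-2 per data byte, the inner loop is barely heavier than the plain XOR that produces P.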
I don't get your cache line point. Cache lines are only 64 bytes wide. Any I/O block size should be significantly larger, regardless of RAID or not, otherwise you're doing something horribly wrong. GF(2^8) is byte granular, so cannot possibly cross cache lines or otherwise be affected by it.
Intel's Ice Lake and Tremont processors include the GFNI instruction set, which should make GF(2^8) roughly as fast as XOR.
This is what people often forget while considering a RAID setup for their server.
You need to know what you are going to do with the server.
GF(2^8) calculations are multiply and mod (addition and subtraction are basically xor, btw).
A 4-disk Raid 6 array is quite unrealistic. Usually Raid 6 arrays are considerably larger (typically 8+ disks), and then GF(2^8) is computationally expensive.
As for cache lines, you simply misunderstood. Yes, the calculation itself is 8-bit granular, but the amount of data to be dealt with is usually n x sector_size, and those data come over the bus (PCIe) and are temporarily stored in memory. But the calculation doesn't happen in memory, so an optimized engine (a) computes along the current cache line, and (b) loads new lines and stores processed lines. For that, one wants less granularity and longer lines. The trick is basically to have computing time and load-store time nicely balanced (and fast, of course).
Yay, so let's use those power-hungry, rather expensive Intel processors as disk controllers. Brilliant. While we're at it, let's also use 40-ton trucks when we need some bread from the bakery.
Lol, whoops, I meant 2 drive failures. But yeah, the system has been extremely reliable. Taking it offline soon because it's been running for 5 years and upgrades are required, but VPSes have gotten so affordable that it's just not worth having my own dedicated server anymore.
Right, I've heard great things, and the ability to keep growing with more servers (or mini PCs, in my mind) is really tempting. I haven't actually set one up yet, though; everything I've read makes it seem very viable. Also, every backup solution and worthwhile application works with S3 these days, so that's a huge plus.
We have run in-house servers on RAID 5 for 10+ years. We were lucky, as many are. The server was needed for small database-driven software at a small company. RAID 5 is perfect in this scenario - low cost, enough fault tolerance, and enough I/O throughput.
For a shared hosting server with NVMe, software RAID 1 is good; for HDD, RAID 1 is no longer an option, and I would prefer RAID 10. Different scenario, different RAID. You get the idea.
This is no different from software RAID5.
CPUs can pull from memory way faster than disks can DMA to it.
I think you may be confused. Cache line size is not important, but an optimized GF(2^8) implementation does have to care about cache blocking (aka loop tiling). This is mostly because the computation is often fast enough to saturate the L2 cache bandwidth. The cache blocking size is completely selectable by the algorithm, so as long as the sector size is larger than it (which it typically will be), it's not an issue.
Ignoring the unnecessary snark, you'll find that most servers today already have an Intel CPU in them. Also, in all practical cases, all data has to pass through the CPU at one point anyway, so it's not like it's making unnecessary round trips.
As for hardware controllers, I suspect many controllers have rather poor implementations of RAID6. As such, I wouldn't be surprised if a hardware RAID6 implementation is much more of a bottleneck than a software implementation would be.
So? Raid Controller do both.
That's BS. In fact, some modern processors use a variant of PCIe even for inter-core communication. Moreover, your assumptions are questionable. Not every disk read translates to a real disk read; it may well be a disk cache read. Plus, and more importantly, Raid controllers hardly ever (wastefully) use x86. Usually they use Arm- or PowerPC-based MCUs, for various reasons, one of which is the difference between $5 and $100.
Well, frankly, it seems that actually you are out of your depth and talking from a pure x86 perspective. The reality of MCUs, however, is quite different.
You miss the point again. The problem is this: there are, say, 4 KB of data which (a) need to be striped (which is virtually cost-free) and (b) pushed through 2 algorithms. How fast that can be done depends on diverse factors, one important one of which is keeping L1 load/store in balance with computing. If you compute faster than you can read from/write back to memory, you are wasting; if you compute slower, you are also wasting. So you want a good balance. Btw, many MCUs have just 1 level of cache, or even none. Considering that memory access is about 100 times slower than cache access, keeping the balance is a significant part of the whole mechanism.
No. Look up DMA and the PCIe bus.
Yes and no. I agree with your suspicion that Raid controllers might have suboptimal implementations or even errors. But no regarding the software implementation, because Raid controllers do have software implementations. It's just that they have processors or MCUs that are optimized for the job, e.g. by cache design or by hardware optimized for mul-mod.
I wasn't aware that people actually still use hardware RAID controllers with SSDs considering the myriad of issues they have, like these. My apologies.
Isn't the issue that RAID 5 requires an erase operation before writing?
It's the other way around: we use hardware RAID exclusively, precisely because of the myriad of issues software RAID implementations have. Not to mention that a cached hardware RAID will provide much, much higher performance than any software solution ever will.
I believe cached hardware RAID with battery backup is very reliable. With that said, I've heard of about an equal number of RAID controllers failing vs. drives, even in RAID 1. But if you are running anything other than a reliable Linux-based platform, RAID controllers are the way to go.
Linux software RAID, though, is IMO extremely reliable and a great option.