Comments
2 is fairly low.
I replace all drives that show a count above zero, regardless of whether they are in a RAID.
It's not in a RAID, @GenjiSwitchPls, but it's this server's OS drive.
It's hard to judge based on the data provided since it could be a bathtub-curve defect that occurred early on. I'd monitor it and if it starts incrementing, or reallocated sector count starts increasing too, ask them to replace it.
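A minimal sketch of the "monitor it" advice, assuming you save the attribute table printed by `smartctl -A` from time to time and compare the raw values. The two sample tables and the helper names `parse_raw_values` and `grew` are made up for illustration; only the column layout follows the usual smartctl ATA attribute table.

```python
# Attributes worth watching for growth (names as smartctl prints them).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

def parse_raw_values(smartctl_output: str) -> dict:
    """Map attribute name -> integer raw value from `smartctl -A` text output."""
    values = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            values[fields[1]] = int(fields[9])
    return values

def grew(old: dict, new: dict, names=WATCHED) -> dict:
    """Return {name: (old, new)} for watched attributes whose raw value increased."""
    return {n: (old[n], new[n])
            for n in names
            if n in old and n in new and new[n] > old[n]}

# Hypothetical snapshots, not taken from any drive in this thread.
LAST_WEEK = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
"""
TODAY = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
"""

changes = grew(parse_raw_values(LAST_WEEK), parse_raw_values(TODAY))
print(changes)
```

If the result is non-empty, that is the point where asking for a replacement makes sense.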
You're taking backups of your data, right? Otherwise....
@Damian thank you for the tip. I'm a backup freak... but this server is just a TestLAB server with Hyper-V VMs. They are stored on the second disk.
If the OS disk dies, or if I have to change it, I just DD my customized Windows Server 2016 template onto the server, reboot, import the VMs into Hyper-V, and I'm up and running in no time at all.
Then keep the drive. It is fine for your needs.
There is a program you can run that will try to write to those sectors. If the write fails, they get marked as permanently bad (offline uncorrectable). I have seen these heal themselves, or go into offline uncorrectable while the drive still works fine. I have also seen these as a precursor to more and more failures and the drive eventually dying.
I would run a bad-sector finding program and flush this stuff out. You want to write to and then read from each sector to verify all of them. There are programs that can do that non-destructively, but it might take a while.
You only get those flags when it unsuccessfully tries to write to a sector. If you have more bad ones it won't necessarily warn you until it tries to write to them.
You should run a smartctl long test. If it passes, no need to worry. If it fails, just submit the result and they will give you a new drive.
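A rough sketch of that workflow, assuming smartmontools is installed and the script runs as root. `smartctl -t long` starts the extended self-test and `smartctl -l selftest` prints the self-test log; the helper names and the sample log lines are made up for illustration.

```python
import re
import subprocess

def start_long_test(device: str) -> None:
    """Kick off the extended (long) self-test; it runs inside the drive."""
    subprocess.run(["smartctl", "-t", "long", device], check=True)

def read_selftest_log(device: str) -> str:
    """Return the self-test log text. smartctl's exit status is a bit mask,
    so a non-zero status is not treated as a hard failure here."""
    result = subprocess.run(["smartctl", "-l", "selftest", device],
                            capture_output=True, text=True)
    return result.stdout

def latest_selftest_passed(log_text: str):
    """True/False for the newest log entry (# 1), None if missing or still running."""
    for line in log_text.splitlines():
        if re.match(r"#\s*1\s", line):
            if "Completed without error" in line:
                return True
            if "in progress" in line.lower():
                return None
            return False
    return None

# Hypothetical log lines in the usual smartctl layout.
GOOD = "# 1  Extended offline    Completed without error       00%     21034         -"
BAD = "# 1  Extended offline    Completed: read failure       90%     21034         123456"
print(latest_selftest_passed(GOOD), latest_selftest_passed(BAD))
```

On a real box you would call `start_long_test("/dev/sda")`, wait for the estimated time smartctl reports, and then pass `read_selftest_log("/dev/sda")` to `latest_selftest_passed`. The device path is only an example.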
Started a smartctl long test now - ETA 10 hours. So I will see tomorrow. Thank you @supick (lol, off topic, but your name sounds really funny in my language. It sounds like "suck di**" but in Norwegian)
Sorry for hijacking, but I have a drive on Hetzner too which is part of a RAID1 showing some errors, wondering if it's bad enough that I should ask for a replacement:
1) Boot into the rescue system
2) Run their hardware check (hwcheck) command on the drives
3) It will tell you whether the drive falls within Hetzner's replacement service or not
In Portuguese too...
Device Model: ST3000DM001-9YN166
Ladies and gentlemen, the ST3000DM001, which even got its own Wikipedia article.
Long test only does read/verify. To fully test you want write/read/verify. For non-destructive it would do read/write/read/verify.
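To make the read/write/read/verify idea concrete, here is a conceptual sketch that runs against an in-memory buffer instead of a disk. It is an illustration only: the function name, block size and test pattern are assumptions, and a real tool (for example `badblocks -n`) also has to bypass the OS cache and must not be run on a mounted filesystem.

```python
import io

def nondestructive_check(dev, block_size=4096, pattern=0xAA):
    """Return offsets of blocks that fail a write/read-back check.

    For each block: read the original data, write a test pattern,
    read it back and compare, then restore the original data.
    """
    bad = []
    offset = 0
    while True:
        dev.seek(offset)
        original = dev.read(block_size)          # 1) read and keep the data
        if not original:
            break                                # end of device
        test = bytes([pattern]) * len(original)
        dev.seek(offset)
        dev.write(test)                          # 2) write the test pattern
        dev.flush()
        dev.seek(offset)
        if dev.read(len(original)) != test:      # 3) read back and 4) verify
            bad.append(offset)
        dev.seek(offset)
        dev.write(original)                      # restore the original contents
        dev.flush()
        offset += len(original)
    return bad

# Stand-in "device": 10 KiB of known data in memory.
buf = io.BytesIO(bytes(range(256)) * 40)
before = buf.getvalue()
bad_blocks = nondestructive_check(buf)
print(bad_blocks, buf.getvalue() == before)
```

Every block is written twice and read twice, which is why the non-destructive mode takes much longer than a plain read-only scan.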
What do "1 Raw_Read_Error_Rate" and "7 Seek_Error_Rate" mean? These two values on my Hetzner server are extremely high. I contacted customer support. They said it is okay and has nothing to do with the disk being defective.
That's correct, those being high is okay for Seagate's version of SMART. Any Seagate drive, even a brand new one, will show high values there. What you should look out for instead, is:
Yeah, this drive needs replacing, @wwabbit. It has a serious number of errors; on top of that, it's a drive for domestic use and the particular model is known to have issues. I'm surprised they even sold it to you that way in the first place.
hetzner probably bought thousands of those drives back in the day, so you see them a lot especially on their older ranges offered through their auctions. they never hid the fact they are using desktop grade hardware instead of enterprise models...
@wwabbit drive is fine, the errors logged seem to be related to power problems at a reboot or something with the SATA cable, and the UDMA_CRC_Error_Count reading also matches this situation. one could assume that happened because the disk has served in other boxes before and was transferred to another server or whatever, maybe it has been hot-plugged or something like that - there are a lot of possibilities ;-)
Really, 1382 uncorrectable sectors are a-OK with you? Even the drive itself considers that not good, as the "normalized" value for that reading has fallen all the way down to 1 (from 100).
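For anyone unsure what "normalized" versus raw means here, a small sketch that splits one row of the `smartctl -A` table into its parts. Only the normalized value of 1 and the raw count of 1382 come from this thread; the flag, worst and threshold columns in the sample row are assumptions, and `describe_attribute` is a made-up helper name.

```python
def describe_attribute(row: str) -> dict:
    """Split one smartctl attribute row into its normalized and raw parts.

    Columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    f = row.split()
    value, worst, thresh = int(f[3]), int(f[4]), int(f[5])
    return {
        "name": f[1],
        "value": value,            # normalized health score, higher is better
        "worst": worst,            # lowest normalized score seen so far
        "thresh": thresh,          # vendor threshold for this attribute
        "raw": f[9],               # vendor-specific raw counter
        "margin": value - thresh,  # how far the score sits above the threshold
    }

# Sample row: VALUE and RAW_VALUE as quoted in the thread, the rest assumed.
ROW = "187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       1382"
info = describe_attribute(ROW)
print(info["name"], "normalized:", info["value"], "raw:", info["raw"],
      "margin over threshold:", info["margin"])
```

A normalized value that has dropped from 100 to 1 means the drive's own scoring has almost nothing left to give for that attribute, whatever the raw counter is taken to mean.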
In what way? I would agree it "matches" if there were 1382 of those, but with only 4, kindly enlighten me as to how four UDMA errors lead to 1382 bad sectors.
I have to admit, I totally missed that reading - most of my hetzner servers have the 3TB toshiba drives where such number is not even available at all.
so what I said was about the errors logged below all those numbers in the smart log, for specific times when obviously something else occurred - those match the UDMA error count...
I agree that the Reported_Uncorrect number doesn't help to build trust at all - yet those might just be a follow-up to the four logged incidents and may be related to a bad sata cable or power failure or something like that. I'd guess the number might simply be higher because the drive probably tried more than once to (re)read a sector before logging the whole incident.
as long as there are no reallocated or pending sectors I'd say those are more likely errors related to an external issue and not the disk health itself.
BUT that's just my pov. @wwabbit of course could simply open a ticket and see what they do or say about it ;-)
PS: but as long as it's in a raid1 like he stated and the other disk has different readings, one might want to avoid rebuilding/resyncing the raid and such ^^
Not with a drive that has its own Wikipedia article for unreliability. This is not a case where you would want to "give it more chances".
yeah might be correct... I have three of those drives in a personal nas at home. a fourth died on me right at the beginning, yet I was too much of a cheapskate to replace the other ones. they've been working reliably ever since
and conditions down in my basement probably aren't even comparable to a nice cold dc ;-)
just saying, while statistics obviously show that this particular drive type is far from being up to par with other drives in terms of reliability, I'd try not to jump to conclusions that easily in every single case.
after all I can't tell how hetzner handles this. personally I never had them replace a disk in one of my servers, I only cancelled a server I grabbed from the auctions once (during trial) because of reallocated sectors and runtime_badblock on one drive and simply picked another one.