Monitoring the health of SSDs

SteveMC · November 2018

Hi,

Until now, all my servers have used HDDs; in the near future, I should be able to get a server with SSDs. Which S.M.A.R.T. parameters should I watch carefully to track the health of these drives over time?

Thank you,

Falzo · November 2018

Usually there is an attribute like total blocks written. Depending on the block size, you can calculate the TB written. Every model comes with a guaranteed number of TB written (read the vendor's tech specs), so you can check how far along the drive is in its real lifespan.
Also, SSDs usually have reserved blocks, and S.M.A.R.T. should show whether, and how many of them, are already in use.

Of course, those numbers are not set in stone; they should just help you get a better idea than looking at power-on hours alone ;-)
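
As a rough illustration of the calculation described above, here is a minimal Python sketch. It assumes the raw counter counts 512-byte blocks, which is not true for every vendor (some report larger units), so check your model's documentation. The function names and the example numbers (a drive rated for 150 TBW) are made up for illustration.

```python
def terabytes_written(blocks_written: int, bytes_per_block: int = 512) -> float:
    """Convert a raw 'blocks written' counter to decimal terabytes.

    bytes_per_block=512 is an assumption; the unit varies by vendor.
    """
    return blocks_written * bytes_per_block / 1e12


def endurance_used_percent(tb_written: float, rated_tbw: float) -> float:
    """Share of the vendor's rated TB-written figure already consumed."""
    if rated_tbw <= 0:
        raise ValueError("rated_tbw must be positive")
    return 100.0 * tb_written / rated_tbw


# Illustrative numbers only, not taken from a real drive.
tb = terabytes_written(58_593_750_000)      # raw counter from S.M.A.R.T.
used = endurance_used_percent(tb, 150.0)    # 150 TBW from a hypothetical spec sheet
print(f"{tb:.1f} TB written, {used:.1f}% of rated endurance")
```

As noted above, the rated figure is a guarantee rather than a hard limit, so treat the percentage as a trend to watch, not a countdown.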

SteveMC · November 2018

Thank you very much @Falzo !

eol · November 2018

Wear_Leveling_Count
Uncorrectable_Error_Cnt
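
For anyone who wants to track these two attributes automatically, here is a small Python sketch that parses the attribute table as printed by `smartctl -A`. The sample text is invented for illustration (IDs, values and raw numbers differ between models), and `parse_attributes` is a hypothetical helper, not part of smartmontools.

```python
# Invented sample in the column layout that `smartctl -A` prints for ATA drives.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
177 Wear_Leveling_Count     0x0013   098   098   000    Pre-fail  Always       -       37
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
"""


def parse_attributes(text: str) -> dict:
    """Return {attribute_name: {"value", "worst", "thresh", "raw"}}."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # this also skips the header row and any blank lines.
        if len(parts) >= 10 and parts[0].isdigit():
            attrs[parts[1]] = {
                "value": int(parts[3]),       # normalized value, higher is better
                "worst": int(parts[4]),
                "thresh": int(parts[5]),
                "raw": " ".join(parts[9:]),   # raw field may contain spaces
            }
    return attrs


attrs = parse_attributes(SAMPLE)
for name in ("Wear_Leveling_Count", "Uncorrectable_Error_Cnt"):
    a = attrs[name]
    print(f"{name}: value={a['value']} thresh={a['thresh']} raw={a['raw']}")
```

In practice you would feed it the real command output (for example via `subprocess.run`) instead of `SAMPLE`, and alert when the normalized value drops toward the threshold or the uncorrectable error count starts rising.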

SteveMC · November 2018

Thank you @eol .

datanoise · November 2018

You often have two values regarding the data written on the disk: data sent to the drive and data written to the NAND. The one to take into account is the second one. Some controllers are very good at compressing stuff (like the SandForce SF-2281).

Available_Reservd_Space will tell you how much of the extra NAND is still available. 100 means nothing has been used yet.

Don't forget to use trim!
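
To make the difference between the two write counters concrete, here is a minimal Python sketch that computes the ratio of NAND writes to host writes (often called write amplification). It assumes both counters have already been converted to the same unit; the function name and the numbers are invented for illustration.

```python
def write_amplification(nand_written: float, host_written: float) -> float:
    """Ratio of data written to the NAND to data sent by the host.

    Both arguments must be in the same unit (for example TB).
    A compressing controller can bring the ratio below 1.0.
    """
    if host_written <= 0:
        raise ValueError("host_written must be positive")
    return nand_written / host_written


# Illustrative numbers only: 40 TB sent by the host, 30 TB actually written to NAND.
ratio = write_amplification(nand_written=30.0, host_written=40.0)
print(f"write amplification: {ratio:.2f}")
```

Since NAND wear depends on what actually reaches the flash, the NAND figure is the one to compare against the endurance rating when the drive exposes both.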

zllovesuki · November 2018

Don't bother. It dies when it decides to die. No amount of monitoring can help you escape this.

Our department recently had an incident where an entire storage node became unresponsive. It was caused by a failed hard drive (the same applies to SSDs), which made the kernel lock up. A coworker had to wake up in the middle of the night to investigate at the datacenter.

Although monitoring will mitigate other risks, ultimately you can't monitor your way to freedom.

SteveMC · November 2018

Thank you @datanoise and @zllovesuki for your additional valuable information.

saibal · November 2018

The Golden Rule: keep backups. The more, the better.

Zerpy · November 2018

@datanoise said:
Don't forget to use trim!

Or just use a proper disk with a decent amount of over-provisioning; most enterprise-grade drives will do.
