Monitoring the health of SSDs

Hi,

So far I have only had servers with HDDs, but in the near future I should be getting a server with SSDs. So I was wondering which S.M.A.R.T. parameters I should watch carefully to track the health of these drives?

Thank you,

Comments

  • Usually there is something like total blocks written. Depending on the block size, you can calculate the TB written. Every model comes with a guaranteed number (read the vendor's tech specs), so you can check how far along it is in its rated lifespan.
    Also, SSDs usually have reserved blocks, and S.M.A.R.T. should show whether, and how many of them, are already in use.

    Of course those numbers are not set in stone; they should just help you get a better idea than looking at power-on hours alone ;-)

    Thanked by 1SteveMC
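
The calculation described above can be sketched roughly as follows. This is only an illustration: the function names are made up, and the 512-byte unit is an assumption, since the size of one "block" behind the raw counter varies by vendor and model (check the drive's documentation).

```python
def terabytes_written(units_written: int, bytes_per_unit: int = 512) -> float:
    """Convert a raw 'blocks written' counter into decimal terabytes.

    bytes_per_unit is an assumption: 512 is common for LBA counters,
    but some drives count in much larger units.
    """
    return units_written * bytes_per_unit / 1e12


def rated_life_used(tb_written: float, rated_tbw: float) -> float:
    """Fraction of the vendor's rated endurance (TBW) consumed so far."""
    if rated_tbw <= 0:
        raise ValueError("rated_tbw must be positive")
    return tb_written / rated_tbw
```

For example, a counter of 100,000,000,000 units at 512 bytes each works out to 51.2 TB written; against a hypothetical 150 TBW rating that is roughly a third of the rated endurance.
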
  • Thank you very much, @Falzo!

  • Wear_Leveling_Count
    Uncorrectable_Error_Cnt

    Thanked by 1SteveMC
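
To keep an eye on attributes like these, one option is to parse the attribute table that `smartctl -A` prints for ATA drives. The sketch below assumes the usual ten-column layout of that table; the helper names and the sample values are made up for illustration, and attribute names and IDs differ between vendors.

```python
from typing import Dict, List, NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    value: int   # normalized value (higher is healthier)
    worst: int   # lowest normalized value seen so far
    thresh: int  # vendor failure threshold
    raw: str     # raw counter, kept as text since formats vary


# Illustrative sample only, not output from a real drive.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       12
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
"""


def parse_smart_attributes(text: str) -> Dict[str, SmartAttribute]:
    """Parse the attribute rows of a `smartctl -A` style table."""
    attrs: Dict[str, SmartAttribute] = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attrs[fields[1]] = SmartAttribute(
            attr_id=int(fields[0]),
            value=int(fields[3]),
            worst=int(fields[4]),
            thresh=int(fields[5]),
            raw=" ".join(fields[9:]),
        )
    return attrs


def at_or_below_threshold(attrs: Dict[str, SmartAttribute]) -> List[str]:
    """Names of attributes whose normalized value has reached the threshold."""
    return [name for name, a in attrs.items() if a.value <= a.thresh]
```

In practice the text would come from running `smartctl -A` on the device (for example via `subprocess.run`), and the raw value of the uncorrectable error counter is worth watching for any increase over time.
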
  • Thank you, @eol.

  • datanoise Member
    edited November 2018

    You often have two values regarding the data written on the disk: data sent to the drive and data written to the NAND. The one to take into account is the second one, since that is what actually wears the flash. The two can differ a lot, because some controllers are very good at compressing stuff (like the SandForce SF-2281).

    Available_Reservd_Space will tell you how much of the extra NAND is still available. 100 means nothing has been used yet.

    Don't forget to use trim!

    Thanked by 1SteveMC
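
The relationship between the two write counters, and the reading of Available_Reservd_Space, can be sketched like this. The function names are made up, and treating the normalized value as a straight percentage is an assumption that may not hold for every drive.

```python
def write_amplification(nand_bytes_written: int, host_bytes_written: int) -> float:
    """Ratio of data written to the NAND versus data sent to the drive.

    Values above 1 mean the flash absorbs more writes than the host issued;
    a compressing controller can push the ratio below 1.
    """
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return nand_bytes_written / host_bytes_written


def reserve_used_percent(available_reserved_value: int) -> int:
    """Spare NAND consumed, assuming 100 means nothing has been used yet."""
    if not 0 <= available_reserved_value <= 100:
        raise ValueError("expected a normalized value between 0 and 100")
    return 100 - available_reserved_value
```

For instance, if 100 GB were sent to the drive but only 50 GB reached the NAND, `write_amplification(50, 100)` gives 0.5, and a normalized reserve value of 100 maps to 0 percent used.
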
  • Don't bother. It dies when it decides to die. No amount of monitoring can help you escape this.

    Our department recently had an incident where an entire storage node became unresponsive. It was caused by a failed hard drive (the same applies to SSDs) that made the kernel lock up. A coworker had to wake up in the middle of the night and go to the datacenter to investigate.

    Although monitoring will mitigate other risks, ultimately you can't monitor your way to freedom.

    Thanked by 1SteveMC
  • Thank you @datanoise and @zllovesuki for your additional valuable information.

  • The Golden Rule: Keep backups. The more the better :)

    Thanked by 3eol SteveMC datanoise
  • @datanoise said:
    Don't forget to use trim!

    Or just use a proper disk with a decent amount of over-provisioning. Most enterprise-grade drives will do :)

    Thanked by 2SteveMC vimalware