How are NVME drives holding up in long term?

Considering replacing all our SSDs and spinning disks with NVMe, but concerned about how these drives will hold up over several years. Anyone been running NVMe for over 2-3 years? Anyone had any NVMe drives 'crash' after being in service for over 6 months?

Thanks!

Comments

  • WebProject Member, Provider

    NVMe should be more reliable than any SSD or HDD. I managed to burn out my first SSD on my home PC within 6 months, but that was a manufacturer fault, so I got a replacement.

    VPS Price Match Guarantee on: All our range of DDOS protected XEN-HVM VPS Plans
    Are you looking for best price for self-managed VPS? See WebProVPS website for more details.
  • vfuse Member, Provider

    We've had 2 NVMe drives give out after about 220TB written to them (~9 months of usage) at Hetzner on our logging cluster. We now replace the servers after ~200TB written.

    NIXStats monitoring Web, Server(Linux, Windows - $9.95/m), Logging (Free!) and Blacklists (start at 512 for $3.75/m) - Uptime Report - API Docs

  • greattomeetyou
    edited March 20

    vfuse said: We've had 2 NVMe drives give out after about 220TB written to them (~9 months of usage) at Hetzner

    What models are they? 220TB seems low unless they are 256GB drives.
    Did you run a SMART tool to check them regularly?
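
For anyone wanting to check wear regularly, here is a minimal Python sketch of pulling the relevant numbers out of `smartctl --json` output. It assumes smartmontools 7.x and that the JSON key names below match what your version emits, so verify them against a real report first. The device path `/dev/nvme0` and both function names are hypothetical. NVMe reports writes in "data units" of 1000 × 512 bytes each.

```python
import json
import subprocess

# NVMe SMART counters use "data units" of 1000 * 512 bytes each.
BYTES_PER_DATA_UNIT = 512_000


def summarize_nvme_health(report: dict) -> dict:
    """Extract wear-related fields from a parsed `smartctl --json` report.

    Key names are assumed from smartmontools 7.x JSON output.
    """
    log = report["nvme_smart_health_information_log"]
    return {
        "terabytes_written": log["data_units_written"] * BYTES_PER_DATA_UNIT / 1e12,
        "percentage_used": log["percentage_used"],
        "temperature_c": log["temperature"],
        "media_errors": log["media_errors"],
    }


def read_nvme_health(device: str = "/dev/nvme0") -> dict:
    """Run smartctl on a device (hypothetical path) and summarize the result."""
    # smartctl uses a bitmask exit status, so a non-zero code is not
    # necessarily a failure; we parse stdout regardless.
    result = subprocess.run(
        ["smartctl", "--json", "-a", device],
        capture_output=True,
        text=True,
        check=False,
    )
    return summarize_nvme_health(json.loads(result.stdout))
```

Running `read_nvme_health()` from cron and logging the result would give a rough wear and temperature history per drive; `summarize_nvme_health` can also be fed a saved JSON report directly.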

  • PureVoltage Member, Provider

    No issues yet other than one DOA.
    Using enterprise U.2 drives however.

    PureVoltage Colocation with 6 Global locations, Seattle, LA, New York, Dallas, Chicago, and Amsterdam

  • They went to 5-year warranties faster than SSDs did. Lower component count typically leads to higher MTBF.

  • vfuse Member, Provider

    @greattomeetyou said:

    vfuse said: We've had 2 NVMe drives give out after about 220TB written to them (~9 months of usage) at Hetzner

    What models are they? 220TB seems low unless they are 256GB drives.
    Did you run a SMART tool to check them regularly?

    They're mainly SAMSUNG MZVLB512HAJQ-00000 (consumer NVMe drives). It could also have to do with the temperature (sensor 1 averages 60°C, sensor 2 averages 95°C).
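
To put the 220TB figure in perspective, here is a rough Python sketch of the arithmetic. It assumes the drives are 512GB (as the "512" in the model number suggests) and treats "~9 months" as 270 days; both are assumptions for illustration, and the function names are made up.

```python
def full_drive_writes(tb_written: float, capacity_tb: float) -> float:
    """How many times the whole drive capacity has been written."""
    return tb_written / capacity_tb


def drive_writes_per_day(tb_written: float, capacity_tb: float, days: float) -> float:
    """Average full-drive writes per day (DWPD) over the period."""
    return full_drive_writes(tb_written, capacity_tb) / days


TB_WRITTEN = 220.0   # reported in the thread
CAPACITY_TB = 0.512  # assumed: 512GB drive
DAYS = 9 * 30        # assumed: "~9 months" as 270 days

print(f"full drive writes: {full_drive_writes(TB_WRITTEN, CAPACITY_TB):.0f}")
print(f"average DWPD: {drive_writes_per_day(TB_WRITTEN, CAPACITY_TB, DAYS):.2f}")
```

Under those assumptions that is roughly 430 full drive writes, or about 1.6 drive writes per day, which is a heavy workload for a consumer drive.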

  • mehargags Member
    edited March 20

    NVMe drives tend to misbehave or go faulty faster if temperature is not optimal. @vfuse you might want to report it to Hetzner to check the rack cooling

  • vfuse Member, Provider

    We already reported it when they failed. The only thing we noticed is that the servers in Helsinki are much cooler than the ones in Falkenstein. They're all really hot in Falkenstein, even though they're in different DCs for HA.

  • PulsedMedia Member, Provider

    vfuse said: They're mainly SAMSUNG MZVLB512HAJQ-00000 (consumer NVMe drives). It could also have to do with the temperature (sensor 1 averages 60°C, sensor 2 averages 95°C).

    I have always been under the impression that SSDs are not that sensitive to high temperatures; to cold, yes, but not to heat. There seems to be a consensus that the flash chips themselves work better at higher temperatures, which makes cooling solutions difficult when the controller needs to be kept cool but the actual flash hotter.

    vfuse said: We already reported it when they failed. The only thing we noticed is that the servers in Helsinki are much cooler than the ones in Falkenstein. They're all really hot in Falkenstein, even though they're in different DCs for HA.

    It's actually located in Tuusula ;) It's newer, probably at a lower utilization %, and the Finnish climate is typically rather cold year-round: not many 30C+ days, but lots around the 0C mark :)
    We have a DC in Helsinki, and for something like 7 months of the year outside air circulation is pretty much all that is needed. It's only during those 2-3 midsummer months that we need to crank up the AC.

  • Francisco Top Provider

    @mehargags said:
    NVMe drives tend to misbehave or go faulty faster if temperature is not optimal. @vfuse you might want to report it to Hetzner to check the rack cooling

    We put double sided heatsinks on all of ours just to be safe.

    Lots of air moving over them too.

    Francisco

    BuyVM - Free DirectAdmin, Softaculous, & Blesta! / Anycast Support! / Windows 2008, 2012, & 2016! / Unmetered Bandwidth!
    BuyShared - Shared & Reseller Hosting / cPanel + Softaculous + CloudLinux / Pure SSD! / Free Dedicated IP Address
  • PulsedMedia Member, Provider

    Francisco said: We put double sided heatsinks on all of ours just to be safe.

    We typically use one-sided ones with the typical chinesium "rubber band" mount, which we replace with a zip tie so the pressure is on the controller chip, not the flash chips. What kind of heatsinks are you using?

  • Francisco Top Provider

    @PulsedMedia said:

    Francisco said: We put double sided heatsinks on all of ours just to be safe.

    We typically use one-sided ones with the typical chinesium "rubber band" mount, which we replace with a zip tie so the pressure is on the controller chip, not the flash chips. What kind of heatsinks are you using?

    https://www.amazon.com/gp/product/B07PS9S2DZ/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&psc=1

    Each pack includes one of each. One 'band' based if you want, and one 'full double sided enclosure' based.

    These kits have 2 thermal pads, one for each side, so it squeezes it together like a sandwich.

    Francisco

  • PulsedMedia Member, Provider

    Thanks. I've seen those, but have not got any so far. Tried many of the cheaper ones tho. I'll buy some and do metrics on them too :)

  • webguyz Member
    edited March 20

    We use a heatsink brand called Warship, which you can find on eBay. We also turn up the fans a bit on the Supermicros we use them in. Average temp is 28C; it might hit 30C if the disk is very busy. Kind of pricey at around 5 bucks, but a worthwhile investment I think. Everything I have read suggests excessive heat is a real killer and causes throttling. We have about 25 VMs on each Hyper-V server and they are very fast. Really like the Sabrent 2TB NVMe models.

  • PulsedMedia Member, Provider

    @webguyz said:
    We use a heatsink brand called Warship, which you can find on eBay. We also turn up the fans a bit on the Supermicros we use them in. Average temp is 28C; it might hit 30C if the disk is very busy. Kind of pricey at around 5 bucks, but a worthwhile investment I think. Everything I have read suggests excessive heat is a real killer and causes throttling. We have about 25 VMs on each Hyper-V server and they are very fast. Really like the Sabrent 2TB NVMe models.

    https://www.ebay.com/itm/WARSHIP-M-2-NGFF-PCIE-NVMe-2280-SSD-Heatsink-Cooling-Fin-Radiator-Thermal-Pads/273027268702?epid=14009596637&hash=item3f91b1805e:g:OKMAAOSwFOZbIShb

    this one? Could not find for 5$.

    anyone tried those full copper ones?

  • PulsedMedia Member, Provider

    Cool, and I see quantity discounts too. I'll buy a few of those as well! :)
    We're planning to ramp up the ZEN MiniDedis by the end of this year once we can automate them. These might make for a good standardized solution if the thermals hold up.

  • NDTN Member, Provider

    Use U.2 enterprise NVMe drives like the Intel P4610. We have been deploying a lot of NVMe servers over the past years and none have given us issues. For example, the rated endurance of the Intel P4610 1.6TB is 12.25PBW, while a consumer NVMe drive like the Intel 660P 2TB is rated for only 400TBW.
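
The endurance gap above is easier to compare as drive writes per day (DWPD). The Python sketch below uses the two endurance ratings quoted in this post. The 5-year warranty period and the write rate of about 220TB per 9 months (borrowed from the logging cluster mentioned earlier in the thread) are assumptions for illustration, and the function names are hypothetical.

```python
def rated_dwpd(endurance_tbw: float, capacity_tb: float, warranty_years: float = 5.0) -> float:
    """Rated drive writes per day, spreading the TBW rating over the warranty."""
    return endurance_tbw / (capacity_tb * warranty_years * 365)


def months_to_exhaust(endurance_tbw: float, tb_per_month: float) -> float:
    """Months until the rated endurance is used up at a constant write rate."""
    return endurance_tbw / tb_per_month


# Endurance ratings as quoted in the thread; warranty length is assumed.
DRIVES = {
    "Intel P4610 1.6TB": {"tbw": 12_250.0, "capacity_tb": 1.6},
    "Intel 660P 2TB": {"tbw": 400.0, "capacity_tb": 2.0},
}

# Assumed workload: about 220TB written in 9 months.
TB_PER_MONTH = 220.0 / 9

for name, spec in DRIVES.items():
    dwpd = rated_dwpd(spec["tbw"], spec["capacity_tb"])
    months = months_to_exhaust(spec["tbw"], TB_PER_MONTH)
    print(f"{name}: {dwpd:.2f} DWPD rated, ~{months:.0f} months at this write rate")
```

Under these assumptions the P4610 works out to roughly 4.2 DWPD against roughly 0.11 DWPD for the 660P, so a write-heavy workload would burn through the consumer drive's rating in well under two years. A rating is not a failure point, so treat these as planning numbers only.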

  • Hetzner_OL Member, Provider, Top Provider

    As a general rule of thumb, if you ever think there is a recurring issue with the performance of the hardware, please communicate with our team about it. In some situations, our team may need to document potential hardware issues to see if they are part of a larger problem. --Katie

    We (Katie and Helena) will do our best to answer your Hetzner questions and pass on your feedback. Hetzner Online's not liable for any corny jokes that we make. (https://www.hetzner.com)

  • TimboJones Member
    edited March 23

    @PulsedMedia said:

    vfuse said: They're mainly SAMSUNG MZVLB512HAJQ-00000 (consumer NVMe drives). It could also have to do with the temperature (sensor 1 averages 60°C, sensor 2 averages 95°C).

    I have always been under the impression that SSDs are not that sensitive to high temperatures; to cold, yes, but not to heat. There seems to be a consensus that the flash chips themselves work better at higher temperatures, which makes cooling solutions difficult when the controller needs to be kept cool but the actual flash hotter.

    I'd be curious where you heard that consensus from, that sounds like nonsense.

    (on a side note, I'd think that'd make cooling solutions easier, when you have a side that needs cooling and a side that can take the heat).

    Edit: Found an article that likely refers to the high temperature for NAND controllers you were talking about. https://www.eeweb.com/profile/eli-tiomkin/articles/industrial-temperature-and-nand-flash-in-ssd-products

  • PulsedMedia Member, Provider

    TimboJones said:

    I'd be curious where you heard that consensus from, that sounds like nonsense.

    Higher than the controller chip.

    I did not save links, but I've seen this multiple times in regard to M.2 cooling, especially with the new PCIe Gen 4 drives, the difficulty of cooling them, and why most drives don't come with coolers: NAND chips that are too cold are no good either, and they will wear out faster at colder temps.

  • TimboJones Member

    @PulsedMedia said:

    TimboJones said:

    I'd be curious where you heard that consensus from, that sounds like nonsense.

    Higher than the controller chip.

    I did not save links, but I've seen this multiple times in regard to M.2 cooling, especially with the new PCIe Gen 4 drives, the difficulty of cooling them, and why most drives don't come with coolers: NAND chips that are too cold are no good either, and they will wear out faster at colder temps.

    Last paragraph of the link I posted:

    The best way to optimize the data retention of a NAND-based SSD is to limit the temperature at which the NAND flash is stored. When the drive has reached or is approaching its end of life, limiting the time of exposure to high temperature will also help extend the data retention.

  • bacloud Member, Provider

    Started NVMe VPS in July 2017 with Intel P3600 drives. All the NVMe drives are OK; none have died.

    High quality VPS just from $2.80 LT/NL/USA
    Need Custom server? Please contact us [email protected] or Skype Andrius.Bacoud

  • PulsedMedia Member, Provider

    TimboJones said: The best way to optimize the data retention of a NAND-based SSD is to limit the temperature at which the NAND flash is stored. When the drive has reached or is approaching its end of life, limiting the time of exposure to high temperature will also help extend the data retention.

    That talks about data retention and does not define high temp. Is high temperature in this regard 40C? 100C? 200C? 300C?

    I was talking about overall write cycles. Data retention usually refers to the number of years the data is safe on the device while unpowered.

  • pluush Member
    edited March 26

    I have used NVMe for almost 2 years, but on a desktop PC, not in a server environment. I bought a (supposedly) non-retail SM961, which has 2LC (instead of 3LC). No problems so far, and IIRC never a disk-related crash. I expect it to last longer than TLC SATA drives. Would kinda be disappointed if it gave up at 200TB...

    My 4-year-old tablet PC's SATA M.2 SSD has even written 15TB to NAND without my actively abusing it.
