Proxmox & NVME huge data written numbers in SMART - anyone else experiencing this?
Hello,
Anyone else seeing huge numbers in SMART? I'm running Proxmox on a server with 4 Debian virtual machines. I checked the NVMe SMART data and the Data Units Written figure is way too high. I have two Intel SSDPE2MX450G7 drives running in RAID 1.
SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning: 0x00
Temperature: 26 Celsius
Available Spare: 97%
Available Spare Threshold: 10%
Percentage Used: 20%
Data Units Read: 27,836,670 [14.2 TB]
Data Units Written: 724,552,635 [370 TB]
Host Read Commands: 320,954,705
Host Write Commands: 7,553,830,582
Controller Busy Time: 50
Power Cycles: 24
Power On Hours: 5,576
Unsafe Shutdowns: 3
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Is this a firmware bug? When I divide the data written by the number of power-on seconds it comes out at 180 MB/s on average, which isn't possible since the VMs are mostly idle.
edit:
zpool iostat rpool 60
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
rpool        342G  74.3G      0     91  10.0K  1.95M
rpool        342G  74.3G      0     90  7.80K  1.95M
rpool        342G  74.3G      0    107  7.60K  2.91M
rpool        342G  74.3G      0     85  22.1K  2.15M
rpool        342G  74.3G      0     92  8.47K  2.16M
rpool        342G  74.3G      0     90  6.67K  1.71M
Comments
Did you own them from the start? Did they show zero written right after purchase?
Also, is the number still increasing at the same rate right now? You can check with
iotop
whether there's any disk activity.
Which Proxmox version?
How? Where does this output come from, and does it show that [370 TB] number directly, or did you add it?
However, it is simply wrong. According to this https://www.intel.com/content/dam/support/us/en/documents/solid-state-drives/Intel_SSD_Smart_Attrib_for_PCIe.pdf 'data units written' is the number of 512-byte units written. 724,552,635 * 512 bytes makes it 370 GB instead. Either you or the software used to calculate that data got it wrong by a factor of 1000 ;-)
I may be too stupid to read my own reference.
Still, I'd doubt the numbers are correct. Maybe it comes down to part of the filesystem using another block size, and the conversion for that number, which should be done by the controller, doesn't work as intended...
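For what it's worth, the NVMe spec defines one "data unit" as 1,000 units of 512 bytes, which reconciles the two figures. A quick sketch redoing the arithmetic both ways (not from the thread, just a sanity check):

```python
units = 724_552_635  # Data Units Written as reported by smartctl

# Naive reading: one data unit = 512 bytes
naive_tb = units * 512 / 1e12
print(f"{naive_tb:.2f} TB")  # ~0.37 TB (the '370 GB' figure)

# NVMe spec: one data unit = 1,000 * 512 bytes
spec_tb = units * 512 * 1000 / 1e12
print(f"{spec_tb:.0f} TB")   # ~371 TB, matching smartctl's [370 TB]
```

So smartctl's [370 TB] is consistent with the spec definition; the factor-of-1000 confusion comes from reading a data unit as a single 512-byte sector.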
Yes, both disks were brand new. The SMART values are pretty similar on both drives:
Data Units Read: 27,836,766 [14.2 TB]
Data Units Written: 724,543,634 [370 TB]
Data Units Read: 27,838,911 [14.2 TB]
Data Units Written: 724,576,211 [370 TB]
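Assuming the NVMe convention of 1,000 × 512 bytes per data unit, the delta between those two snapshots converts like this (the time between snapshots isn't stated in the thread, so this only gives the amount, not a rate):

```python
before = 724_543_634  # Data Units Written, first snapshot
after = 724_576_211   # Data Units Written, second snapshot

delta_units = after - before               # 32,577 data units
delta_gb = delta_units * 512 * 1000 / 1e9  # NVMe: 1 data unit = 1,000 * 512 B
print(f"{delta_gb:.1f} GB")                # ~16.7 GB written between snapshots
```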
iotop looks pretty normal for a low-usage server: around 500 kB/s with 10-15 MB/s spikes. Nothing even close to a 180 MB/s average.
I don't know how it grows over time; I only noticed it today.
It has been Proxmox 5 since the beginning, 5.3-7 currently.
The output comes from
smartctl -a /dev/nvme[0-1]n1
On some forums I've found this:
Back in the day it was a bug, but the firmware update came out in 2015, so I don't think it affects my server.
Doesn't mean your drive got that firmware update.
Well, this model launched in Q3 2016, so I think it's safe to assume the firmware is already patched.
Nope.
Yep. Both NVMe drives have firmware MDV10290, which is the latest available for this model.
Nice.
Sorry, I'm not familiar with the forum's clowns. I answered you because I thought you could help me with the issue. My bad!
Me neither.
Where did you get 180 MB/s? Seems to be 20 MB/s average.
370 TB * (1024^2 MB/TB) / (5576 power-on hours) / (3600 s/hour) ≈ 20 MB/s
One of us sucks at maths; we just need to figure out whether it's you or me.
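Redoing that arithmetic with the thread's numbers (the mix of binary and decimal TB barely matters at this precision):

```python
tb_written = 370        # from smartctl's Data Units Written line
power_on_hours = 5576   # from SMART Power On Hours

# TB -> MB, then divide by powered-on seconds
mb_per_s = tb_written * 1024**2 / power_on_hours / 3600
print(f"{mb_per_s:.1f} MB/s")  # ~19.3 MB/s, i.e. roughly 20 MB/s average
```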
Yup, my bad. Anyway, it's still way more than it should be according to
zpool iostat
2 MB/s is still a lot if you just have 4 idling VMs. That means the VMs are writing 1 TB every six days. Might make sense to try to find the source of the high I/O.
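That "1 TB every six days" figure checks out; here is the back-of-envelope calculation, assuming 2 MB/s sustained:

```python
rate_mb_s = 2  # sustained write rate seen in zpool iostat
days = 6

tb = rate_mb_s * 1e6 * 86400 * days / 1e12  # MB/s * seconds -> TB
print(f"{tb:.2f} TB")                       # ~1.04 TB written in six days
```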
But otherwise, check the SMART info again tomorrow and see what the rate of change is, as rm_ suggested with iotop.