
Slow KVM I/O speeds when host node much faster (virtio)

OttoYiu Member
edited February 2013 in Help

I'm running a single CentOS 6 virtual machine on a SolusVM KVM hostnode using virtio.

On the hostnode, I can get (elevator:deadline):

# dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 5.38596 s, 199 MB/s

but on the VM, I only get (elevator: noop):

root@grasshopper [~]# dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 18.4232 s, 58.3 MB/s

I can actually feel the sluggishness within the VM itself...

The hostnode has 4 x 1TB in RAID-10 with hardware raid + bbu. From your experience, does 199MB/s sound reasonable for a hard-drive only array?

Does anyone know why this is, and what I can do to fix it? Perhaps the LVM is misaligned? If so, how can I tackle that problem?
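
For what it's worth, I guess one way to sanity-check the alignment (a rough sketch - it assumes the LVM PV sits on /dev/sda3 and that I know the controller's stripe size) would be:

# partition start sectors
fdisk -ul /dev/sda
# offset of the first physical extent on the PV, in 512-byte sectors
pvs -o +pe_start --units s /dev/sda3
# both should land on multiples of the stripe size (e.g. a 64 KB stripe = 128 sectors)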

Thanks in advance,


Comments

  • MonsteR Member
    edited February 2013

    Check your RAID to see if it's OK, and maybe rebuild the RAID mirrors.

  • @MonsteR said: Check your RAID to see if it's OK, and maybe rebuild it.

    Yea that makes no sense at all

  • OttoYiu Member
    edited February 2013

    @MonsteR said: Check your RAID to see if it's OK, and maybe rebuild the RAID mirrors.

    The host node array is healthy. It has ~200MB/s writes, which I think is reasonable? I'm not too sure though, as I usually deal with RAID-6 arrays with 6+ drives. Perhaps someone can correct me, or comment on that.

    @Spencer said: Yea that makes no sense at all

    And yes, I'm not sure why I'm seeing such a big discrepancy when there's a SINGLE virtual machine running on the host node with no load, as I'm still testing the performance.

  • MonsteR Member
    edited February 2013

    @Spencer it's 5am here, so I might have worded it wrong.

  • I created a LV and mounted it directly to the host node itself, and ran the test:

    16384+0 records in
    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 12.9223 s, 83.1 MB/s
    

    So it looks like the bottleneck is the LVM/PV, but I'm not sure how to fix it :(
    The partition table looks aligned to me:

    # fdisk -ul /dev/sda
    
    Disk /dev/sda: 1998.0 GB, 1997998653440 bytes
    255 heads, 63 sectors/track, 242909 cylinders, total 3902341120 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x0009ce3c
    
       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *        2048   335546367   167772160   83  Linux
    /dev/sda2       335546368   352323583     8388608   82  Linux swap / Solaris
    /dev/sda3       352323584  3902341119  1775008768   8e  Linux LVM
    

    data_alignment_detection = 1
    is also set in /etc/lvm/lvm.conf.
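
    To take the filesystem and page cache out of the picture, I could also write straight to a throwaway LV with O_DIRECT - a rough (and destructive) sketch, with placeholder VG/LV names:

    # create a scratch LV, hit it directly with O_DIRECT, then drop it
    lvcreate -L 4G -n scratch_test my_vg
    dd if=/dev/zero of=/dev/my_vg/scratch_test bs=64k count=16k oflag=direct
    lvremove -f my_vg/scratch_test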

  • Software raid?

  • Is your LVM's PE Size set to 32M?

  • @George_Fusioned said: Is your LVM's PE Size set to 32M?

    +1 on this

  • Patrick Member
    edited February 2013

    Check 'vgdisplay' and check PE size like others mentioned!

    What RAID card & drives are you using? Maybe the BBU is charging? 199 MB/s still seems low with a RAID controller and the BBU up.
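
    If it's an LSI/MegaRAID card, something along these lines (assuming MegaCli64 is installed; adjust the adapter selector) will show the BBU state and the cache policy actually in effect:

    # BBU state: charging, relearn cycle, replacement needed, etc.
    MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL
    # default vs. current cache policy for each logical drive
    MegaCli64 -LDInfo -Lall -aALL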

  • 4 x 1TB in HW RAID = 200MB/s is quite bad. You can achieve that with SW RAID.

    HW RAID has consistency, which is its main advantage; the controller you're using doesn't look like a performance controller.

    But I have never used KVM so it may just be KVM, or your partitioning on the HN.

    Tell us more and we can probably give better answers.

  • I would say something must be afoul in your setup. I have an old Intel 5450 (32GB mem) with a degraded hardware RAID10 array of 3x 10k RPM SAS disks (one drive died last week), with Xen as a hypervisor. Currently, on an Ubuntu 12 VM (shared with 6 other VMs), I am seeing about 120-130MB/s using dd on this degraded RAID10 array. The host does about 300+MB/s with dd.

    I also have an older 5340 (8GB mem) machine with a single 7200 RPM drive, running CentOS 6 and Qemu/KVM. I just installed a Debian 5 virtual machine, and dd is showing about 115-125MB/s. That host machine shows 450MB/s via dd, with a single disk and one VM running.

    I am thinking that a RAID10 setup with 7200RPM drives should be speedier than 199MB/s on the host and 60MB/s on a VM -- I have seen vibration cause MAJOR I/O issues. The company I worked for bought cheap servers and was having VERY slow I/O on some RAID10 and RAID6 setups. After lots of troubleshooting, it was found that the cage that holds the disks was flimsy and needed reinforcement. The company that sold us these cheap servers came out and drilled holes and added screws to keep the disk cage from vibrating so much.

    This helped a lot - it more than doubled the I/O of the machines to acceptable levels. If you're using CHEAP servers, this is something to think about - sometimes these machines are not tested too well and you end up being the beta tester for a shoddy chassis design.

    Video that I saw once about vibration and IO performance.

    Otherwise, I would explore your raid controller and disk firmware to see if there is anything that can be done to solve the root cause of your slow disk i/o.

  • What is your default caching? Change it to write back.
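
    If that means the controller's cache policy, on a MegaRAID card it can be flipped with something like this (syntax from memory - double-check against your MegaCli version):

    # enable write-back on all logical drives (drops back to write-through if the BBU is bad)
    MegaCli64 -LDSetProp WB -Lall -aAll
    # riskier: keep write-back even with a bad or missing BBU
    MegaCli64 -LDSetProp ForcedWB -Lall -aAll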

  • OttoYiu Member
    edited February 2013

    Oh no, it's set to 4MB.

    vgdisplay
      --- Volume group ---
      VG Name               kvm_vg1
      System ID
      Format                lvm2
      Metadata Areas        1
      Metadata Sequence No  299
      VG Access             read/write
      VG Status             resizable
      MAX LV                0
      Cur LV                6
      Open LV               6
      Max PV                0
      Cur PV                1
      Act PV                1
      VG Size               1.65 TiB
      PE Size               4.00 MiB
      Total PE              433351
      Alloc PE / Size       194560 / 760.00 GiB
      Free  PE / Size       238791 / 932.78 GiB
      VG UUID               Zt02Z7-r8D5-i8wC-iGee-gajc-BAD6-rfmJCk
    

    I tried changing it, but it's not letting me:

    # vgchange -s 32 kvm_vg1
      New extent size is not a perfect fit
    

    Also, this server has an older LSI 8888-ELP:

                        Versions
                    ================
    Product Name    : MegaRAID SAS 8888ELP
    Serial No       : P066970608
    FW Package Build: 11.0.1-0048
    
    Adapter 0 -- Virtual Drive Information:
    Virtual Disk: 0 (Target Id: 0)
    Name:
    RAID Level: Primary-1, Secondary-0, RAID Level Qualifier-0
    Size:1.816 TB
    State: Optimal
    Stripe Size: 64 KB
    Number Of Drives per span:2
    Span Depth:2
    Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
    Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
    Access Policy: Read/Write
    Disk Cache Policy: Disabled
    Encryption Type: None
    
    BatteryType: iBBU
    Voltage: 4007 mV
    Current: 0 mA
    Temperature: 30 C
    
    BBU Firmware Status:
    
      Charging Status              : None
      Voltage                      : OK
      Temperature                  : OK
      Learn Cycle Requested        : No
      Learn Cycle Active           : No
      Learn Cycle Status           : OK
      Learn Cycle Timeout          : No
      I2c Errors Detected          : No
      Battery Pack Missing         : No
      Battery Replacement required : Yes
      Remaining Capacity Low       : Yes
      Periodic Learn Required      : No
    

    Probably a bad BBU that's disabling the write-back cache...

    Also, thank you to everyone for the help so far - I should have mentioned up front that this server is only for personal use and that I'm not going to be providing VMs to anyone (I'm not a competitor!)

  • NHRoel Member
    edited February 2013

    @OttoYiu The guest cache, not the LVM. Use writeback caching.
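
    Outside the SolusVM panel, that roughly corresponds to the cache attribute on the guest's virtio disk in the libvirt XML (the domain name below is just a guess at yours):

    virsh edit grasshopper
    #   <driver name='qemu' type='raw' cache='writeback'/>
    # the change only takes effect after a full stop/start of the guest, not a reboot from inside it
    virsh shutdown grasshopper
    virsh start grasshopper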

  • @Jacob said: 4 x 1TB in HW RAID = 200MB/s is quite bad. You can achieve that with SW RAID.

    How much do you achieve with hardware RAID?

  • jh Member
    edited February 2013

    delete

  • @ftpit said: How much you achieve with hardware raid?

    If caching is enabled, you are looking at 250+ MB/s, but with regular RAID, nothing special.

  • @Jacob said: the controller you're using doesn't look like a performance controller.

    It's not; the sales/marketing point of the controller he says he's using is for large external SATA connections. It's not the best controller to use for internal drives, but it's still much better than software raid.

  • OttoYiu Member
    edited February 2013

    I asked the datacenter to replace the BBU for this particular server.

    I'm still having trouble setting the PE size, though. Do I have to recreate the VG? Also, how does the PE size affect performance?

    @Damian said: It's not; the sales/marketing point of the controller he says he's using is for large external SATA connections. It's not the best controller to use for internal drives, but it's still much better than software raid.

    I use this card for several large RAID-6 arrays, and they work pretty well. This is my first time using this card for a small RAID-10 however.

  • Yes, it's a great card for large arrays, but it's kinda unspectacular for smaller arrays. But the price was probably right, and it still works better than software RAID :)

  • OttoYiu Member
    edited February 2013

    So I was reading up on the PE size, and I couldn't find any references regarding performance hits with small PE and LVM2. Can anyone point me in the right direction?

  • Your speed will improve once you update your guest caching to writeback.

  • So, the BBU has finished its learning cycle and writeback is now enabled on the hostnode:

    [root@lax-kvm1 ~]# dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
    16384+0 records in
    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 3.23221 s, 332 MB/s
    

    On a LV mounted on the host node:

    [root@lax-kvm1 backup]# dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
    16384+0 records in
    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 3.84068 s, 280 MB/s
    

    On the VM (with guest writeback enabled, as @NHRoel suggested):

    root@grasshopper [~]# dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
    16384+0 records in
    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 6.46885 s, 166 MB/s
    

    Why is there still such a big discrepancy between the 3? :(

  • Did you ever manage to get a handle on this?

  • @NHRoel said: Did you ever manage to get a handle on this?

    Nope, I still have no idea what's causing the discrepancy. :(

  • What do you have the Guest Disk Cache set as on the node settings?

  • @XFS_Brian said: What do you have the Guest Disk Cache set as on the node settings?

    I have set it as 'Default'. I overrode it to writeback in the VM settings of the specific VM I was using to test the speeds. Would it matter in this case?

  • I was having the same issue you are. I set mine to None, which helped. You may want to give this a try. If it works on that VM, then I would make it a server-wide setting.
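
    As I understand it, cache = None makes qemu open the backing LV with O_DIRECT, so writes skip the host page cache instead of being buffered twice. A quick, hacky way to check what a running guest actually got (the process name may differ on your node):

    # look for cache=none (or writeback) on the guest's -drive arguments
    ps aux | grep [q]emu | tr ',' '\n' | grep cache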

  • OttoYiu Member
    edited March 2013

    @XFS_Brian said: I was having the same issue you are. I set mine to None, which helped. You may want to give this a try. If it works on that VM, then I would make it a server-wide setting.

    Wow. You're a life saver! First the internal IP problem that I was facing, then this one.

    [root@centos6-min ~]# dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
    16384+0 records in
    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 4.50717 s, 238 MB/s
    [root@centos6-min ~]# dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
    16384+0 records in
    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 3.54242 s, 303 MB/s
    [root@centos6-min ~]# dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
    16384+0 records in
    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 3.58158 s, 300 MB/s
    

    Tried this with a fresh VM with guest cache set to 'None', and bam - faster speeds all around.

    I guess it was double-caching...

    Thanks again Brian and everyone who helped!

    Edit: It seems that there is a correlation between the size of the logical volume and write speeds...

    A fresh VM that is 160G in size:

    [root@centos6-min ~]# dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
    16384+0 records in
    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 4.91814 s, 218 MB/s
    [root@centos6-min ~]# dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
    16384+0 records in
    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 5.23405 s, 205 MB/s
    [root@centos6-min ~]# dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
    16384+0 records in
    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 4.97634 s, 216 MB/s
    

    vs 40G in size:

    [root@centos6-min ~]# dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
    16384+0 records in
    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 4.50717 s, 238 MB/s
    [root@centos6-min ~]# dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
    16384+0 records in
    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 3.54242 s, 303 MB/s
    [root@centos6-min ~]# dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
    16384+0 records in
    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 3.58158 s, 300 MB/s
    

    Both VMs are deployed with the same template.

    That's weird :O
