help me understand if there is any problem with i/o

vanarpvanarp Member
edited June 2012 in Help

Let me say upfront that I am new to working with a VPS, and most of my learning has been through this forum. I am observing an interesting issue with my VPS which I wanted to clarify with you.

Any time I run the dd command on my VPS after a while (say, an hour after the previous run), its output shows something like this:

vanarp@vps:~$ dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync;rm test
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 95.857 s, 11.2 MB/s

Now any immediate runs of the same dd command (tested within five minutes of the above run) show much improved speeds:

vanarp@vps:~$ dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync;rm test

16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 8.38861 s, 128 MB/s
vanarp@vps:~$ dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync;rm test
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 11.3696 s, 94.4 MB/s
vanarp@vps:~$ dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync;rm test
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 7.27529 s, 148 MB/s
vanarp@vps:~$ dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync;rm test
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 5.73212 s, 187 MB/s
vanarp@vps:~$ dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync;rm test
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 5.74627 s, 187 MB/s
vanarp@vps:~$ dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync;rm test
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 9.25248 s, 116 MB/s

Please help me understand whether this is expected behavior or whether I should suspect something wrong with the disk I/O.

If it helps, I am trying to use this as a web server with a LAMP stack. I started paying attention to the dd output when I observed slow response on my site (WordPress) once in a while.
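
In case it is useful to reproduce this, here is roughly how the cold-vs-warm pattern could be captured in a single scripted run (the loop count and sleep interval are just examples):

for i in 1 2 3; do
    date
    # dd prints its timing summary on stderr; keep only the last line
    dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync 2>&1 | tail -1
    rm test
    sleep 3600    # wait long enough for caches to go cold again
done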

Comments

  • yomeroyomero Member

    I recommend trying ioping (google it) instead of stressing the server with lots of dd runs.
    It will give you a better idea of how stable the I/O performance is. But according to these results, it sounds like some customers are running heavy cron jobs or P2P, or a badly optimized DB.
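
    If it helps, roughly how that would look (assuming a Debian/Ubuntu guest; on distros that don't package ioping you would build it from source):

    sudo apt-get install ioping
    ioping -c 10 .    # latency of 10 requests against the current directory
    ioping -R .       # seek-rate test (random reads)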

  • HalfEatenPieHalfEatenPie Veteran
    edited June 2012

    Correct me if I'm wrong, but I think the results are being cached (that's why you're getting improved speeds on runs right after one another).

    Of course, I'm in the same boat as you (learning through this forum and a few other articles), so this is just my speculation.
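
    One rough way to test the caching theory (needs root, and it only clears the guest's own page cache on a Xen/KVM VPS; host-side caches are out of your reach, and it won't work inside an OpenVZ container):

    # flush the guest page cache, then re-run the dd test "cold"
    sync
    echo 3 | sudo tee /proc/sys/vm/drop_caches
    dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync; rm test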

  • taiprestaipres Member

    It's definitely caching; that's why the drastic speed-up. Really though, 11.2 MB/s cold is HORRIFIC. You really should open a ticket with your provider; they either have really bad hardware, or you have a bad neighbor who's pounding the disk.

  • prometeusprometeus Member, Host Rep

    The subsequent runs are helped by some sort of caching. Is the first slow run consistent after 15-20 minutes?
    As far as you know, are you on a busy node? Some disks have aggressive energy-saving features that some providers don't disable, so disks that are not spinning require a slow spin-up, but after that the speed should improve until the next idle sleep...
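
    For illustration, this is roughly how a provider would check and disable that on the physical disks (host-node commands, not something you can run inside the VPS; /dev/sda is an example device):

    sudo hdparm -B /dev/sda        # show the APM level; 1-127 permits spin-down
    sudo hdparm -B 254 /dev/sda    # favour performance over power saving
    sudo hdparm -S 0 /dev/sda      # disable the standby (spin-down) timeout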

  • FranciscoFrancisco Top Host, Host Rep, Veteran

    Might be caching, but might be hourly crons too.

    I know when hourly crons hit, we see a spike of 800+ IOPS on some of our nodes :S
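
    From inside a guest you can watch for that kind of spike with iostat (a sketch, assuming the sysstat package is installed; you only see your own VM's devices, but await and %util climbing while your own workload is flat hints at node-level contention):

    # extended per-device stats every 5 seconds; r/s + w/s = IOPS
    iostat -x 5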

    Francisco

  • vanarpvanarp Member

    Thank you for such quick responses!

    When you say it could be due to caching, is that good or bad?

    @yomero said: I recommend trying ioping

    Will run ioping in a while and share the results here.

    @prometeus said: Is the first slow run consistent after 15-20 minutes?

    I just ran it again and here it is.

    vanarp@vps:~$ dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync;rm test

    16384+0 records in
    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 8.92858 s, 120 MB/s

    Maybe I need to give it more time?

    @prometeus said: Some disks have aggressive energy-saving features that some providers don't disable, so disks that are not spinning require a slow spin-up, but after that the speed should improve until the next idle sleep...

    This is what I have been suspecting. But how do I resolve this when the host is not ready to acknowledge the issue? I am thinking maybe I need to run dd from cron ;-)
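
    Something like this in the crontab, maybe (the paths and schedule are just examples):

    # log a timed dd run at the top of every hour
    0 * * * * cd /tmp && { date; dd if=/dev/zero of=ddtest bs=64k count=16k conv=fdatasync; rm -f ddtest; } >> "$HOME/ddtest.log" 2>&1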

  • vanarpvanarp Member

    @taipres said: you really should open a ticket with your provider; they either have really bad hardware, or you have a bad neighbor who's pounding the disk.

    I want to be sure that there is actually a serious problem. If it is a bad neighbor, why do you think subsequent runs of dd do not exhibit the issue?

  • prometeusprometeus Member, Host Rep
    edited June 2012

    dd shows only one face of the I/O, one that usually isn't exercised so often in real-life computing :-)

    However, 10 MB/s is for sure a low result for that kind of test. What are the ioping results?

  • vanarpvanarp Member

    Here are the results of ioping commands:

    vanarp@vps:~$ ioping -c 10 .

    4096 bytes from . (ext3 /dev/xvda1): request=1 time=0.3 ms

    4096 bytes from . (ext3 /dev/xvda1): request=2 time=0.6 ms
    4096 bytes from . (ext3 /dev/xvda1): request=3 time=0.3 ms
    4096 bytes from . (ext3 /dev/xvda1): request=4 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=5 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=6 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=7 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=8 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=9 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=10 time=4.1 ms
    --- . (ext3 /dev/xvda1) ioping statistics ---
    10 requests completed in 9019.3 ms, 1316 iops, 5.1 mb/s
    min/avg/max/mdev = 0.3/0.8/4.1/1.1 ms

    vanarp@vps:~$ ioping -R .

    --- . (ext3 /dev/xvda1) ioping statistics ---

    5341 requests completed in 2990.3 ms, 3420 iops, 13.4 mb/s
    min/avg/max/mdev = 0.1/0.3/259.8/3.7 ms

    vanarp@vps:~$ ioping -RL .

    --- . (ext3 /dev/xvda1) ioping statistics ---

    3094 requests completed in 3000.1 ms, 1586 iops, 396.5 mb/s
    min/avg/max/mdev = 0.4/0.6/30.5/0.7 ms

  • prometeusprometeus Member, Host Rep

    @vanarp said: ioping -c 10 .

    make this a bit longer (-c 30), then rerun it after a few hours.

    Also, let us see the output of
    vmstat 1 20

  • yomeroyomero Member

    Pretty good IMHO.

    Maybe it's the cron jobs, or some bad users :|

  • vanarpvanarp Member

    @prometeus said: make this a bit longer (-c 30), then rerun it after a few hours.
    Also, let us see the output of vmstat 1 20

    Sure, I will run the commands after a few hours and post the results.

    @yomero said: Pretty good IMHO

    I feel the same too. Only the slow speed after a break worries me, and I want to be sure whether it is normal or not.

  • klikliklikli Member

    Actually, is it ok to use /dev/urandom instead?

  • FranciscoFrancisco Top Host, Host Rep, Veteran

    @klikli said: Actually, is it ok to use /dev/urandom instead?

    no!

    /dev/urandom has a very very small pool size.

    Francisco

  • yomeroyomero Member
    edited June 2012

    @Francisco said: /dev/urandom has a very very small pool size.

    I don't think that is the biggest problem. Edit: well, it is related...

    If you do it, your first bottleneck will be the CPU trying to generate more random data to write. You will barely get ~10 MB/s or less.
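
    Easy to see for yourself: time the random source alone, with no disk involved at all (a quick sketch):

    # how fast can /dev/urandom produce 1 GB? (pure CPU, output discarded)
    dd if=/dev/urandom of=/dev/null bs=64k count=16k
    # compare: /dev/zero costs almost nothing to read
    dd if=/dev/zero of=/dev/null bs=64k count=16k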

  • vanarpvanarp Member

    After many hours (of idle time on the VPS), here are the latest test results. All commands were run one after the other.

    vanarp@vps:~$ ioping -c 30 .

    4096 bytes from . (ext3 /dev/xvda1): request=1 time=0.3 ms

    4096 bytes from . (ext3 /dev/xvda1): request=2 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=3 time=0.3 ms
    4096 bytes from . (ext3 /dev/xvda1): request=4 time=0.3 ms
    4096 bytes from . (ext3 /dev/xvda1): request=5 time=0.3 ms
    4096 bytes from . (ext3 /dev/xvda1): request=6 time=0.0 ms
    4096 bytes from . (ext3 /dev/xvda1): request=7 time=0.5 ms
    4096 bytes from . (ext3 /dev/xvda1): request=8 time=0.5 ms
    4096 bytes from . (ext3 /dev/xvda1): request=9 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=10 time=0.5 ms
    4096 bytes from . (ext3 /dev/xvda1): request=11 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=12 time=0.3 ms
    4096 bytes from . (ext3 /dev/xvda1): request=13 time=7.0 ms
    4096 bytes from . (ext3 /dev/xvda1): request=14 time=13.3 ms
    4096 bytes from . (ext3 /dev/xvda1): request=15 time=0.6 ms
    4096 bytes from . (ext3 /dev/xvda1): request=16 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=17 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=18 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=19 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=20 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=21 time=0.3 ms
    4096 bytes from . (ext3 /dev/xvda1): request=22 time=0.6 ms
    4096 bytes from . (ext3 /dev/xvda1): request=23 time=0.3 ms
    4096 bytes from . (ext3 /dev/xvda1): request=24 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=25 time=8.9 ms
    4096 bytes from . (ext3 /dev/xvda1): request=26 time=0.3 ms
    4096 bytes from . (ext3 /dev/xvda1): request=27 time=0.5 ms
    4096 bytes from . (ext3 /dev/xvda1): request=28 time=0.4 ms
    4096 bytes from . (ext3 /dev/xvda1): request=29 time=0.3 ms
    4096 bytes from . (ext3 /dev/xvda1): request=30 time=0.3 ms

    --- . (ext3 /dev/xvda1) ioping statistics ---

    30 requests completed in 29115.1 ms, 762 iops, 3.0 mb/s
    min/avg/max/mdev = 0.0/1.3/13.3/2.9 ms

    vanarp@vps:~$ vmstat 1 20

    procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----

    r b swpd free buff cache si so bi bo in cs us sy id wa
    0 0 3504 253180 59208 94644 0 0 7 15 10 16 1 1 98 0
    0 0 3504 253180 59208 94644 0 0 0 0 14 24 0 0 100 0
    0 0 3504 253180 59208 94644 0 0 0 0 13 23 0 0 100 0
    0 0 3504 253180 59208 94644 0 0 0 0 11 20 0 0 100 0
    0 0 3504 253180 59216 94636 0 0 0 16 20 44 0 0 100 0
    0 0 3504 253180 59216 94636 0 0 0 0 10 20 0 0 100 0
    0 0 3504 253180 59216 94644 0 0 0 0 11 20 0 0 100 0
    0 0 3504 253180 59216 94644 0 0 0 0 11 20 0 0 100 0
    0 0 3504 253180 59216 94644 0 0 0 0 10 22 0 0 100 0
    0 0 3504 253180 59216 94644 0 0 0 0 11 20 0 0 100 0
    0 0 3504 253180 59216 94644 0 0 0 0 10 20 0 0 100 0
    0 0 3504 253180 59216 94644 0 0 0 0 6 18 0 0 99 0
    0 0 3504 253180 59216 94644 0 0 0 28 11 24 0 0 100 0
    0 0 3504 253180 59216 94644 0 0 0 0 9 19 0 0 100 0
    0 0 3504 253180 59216 94644 0 0 0 0 10 21 0 0 100 0
    0 0 3504 253180 59216 94644 0 0 0 0 8 17 0 0 99 0
    0 0 3504 253180 59216 94644 0 0 0 0 10 21 0 0 100 0
    0 0 3504 253180 59216 94644 0 0 0 0 10 21 0 0 100 0
    0 0 3504 253180 59216 94644 0 0 0 0 9 19 0 0 100 0
    0 0 3504 253180 59216 94644 0 0 0 0 9 19 0 0 100 0

    vanarp@vps:~$ dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync;rm test

    16384+0 records in

    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 150.739 s, 7.1 MB/s

    vanarp@vps:~$ dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync;rm test

    16384+0 records in

    16384+0 records out
    1073741824 bytes (1.1 GB) copied, 5.49874 s, 195 MB/s

  • prometeusprometeus Member, Host Rep

    Latency is good; vmstat shows an almost idle machine...

  • vanarpvanarp Member
    edited June 2012

    So, can it be concluded that the issue is due to one of the below?

    1. Excessive caching, so that slow disks show better performance on subsequent operations

    2. Aggressive energy saving enabled on the disks, so that they need a wake-up call before they can perform up to speed

    EDIT: 3. Very I/O-intensive stuff being run by the neighbors on the node

    What would you do if you were in this situation?

  • Ignore the dd command; ioping shows the real story, which is that the disks are working mostly fine. I say mostly because the dd command shows that fetching cold data is a little slow. It could be anything, but I suspect it's more likely that the server is RAID-1 rather than RAID-10, or (it can happen sometimes) a SATA disk dropped out of SATA 3G and is running at SATA 1.5G. That can cause all sorts of fun issues, and a RAID array is only as fast as its slowest disk (a couple of quick checks are sketched below).

    That's one reason I'm looking into CacheCade technology from LSI: it's a hybrid SSD solution without all the expense, and it greatly improves I/O for customers.
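
    For what it's worth, roughly how one would check those two things (these run on the host node, so this is for the provider rather than the VPS user; exact output varies by kernel):

    # did any disk negotiate a slower SATA link?
    dmesg | grep -i 'sata link up'   # look for "1.5 Gbps" vs "3.0 Gbps"
    # software RAID level and member health, if md RAID is in use
    cat /proc/mdstat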

  • yomeroyomero Member

    Another possibility... a hard drive dying, maybe...

  • vanarpvanarp Member

    @yomero said: Another possibility... a hard drive dying, maybe...

    Ohh noooo...

    I hope the host recognizes the issue before I act on it :(

  • yomeroyomero Member

    @vanarp said: I hope the host recognizes the issue before I act on it :(

    It's just an idea.
    Maybe it's something else.
    The problem is that you see the issue in production more than in synthetic tests: slow loading and so on.
