Is anyone else experiencing high rates of Linux server crashes today?


corpus Member
edited June 2012 in General

apt-get -a beer

Thanked by 1: djvdorp

Comments

  • wow, what is that? leap second causing so much headache :P ?

    I use http://tuxlite.com to configure all my VPSes and I love it!

  • Corey Member

    Haven't had any problems here yet... but running OS based on RHEL.

    BitAccel - OpenVZ VPS / IRC,VPN,Anything Legal & Unrivaled Support!
  • I'd heard before that a leap second could crash Linux. I actually considered it a joke at first, but now it has come true...

  • Yes, 2 servers :-(

    IperWeb & Prometeus, Hosting Provider since 1997. iwStack cloud infrastructure
  • Jar Member

    Thankful for no issues here. CentOS on everything mission critical, not particularly a preference but just how it ended up.

  • subigo Member
    edited June 2012

    lol Debian lol

    No, no issues here.

    edit: I guess I should have said "lol ntpd lol". I don't put that on production servers either.

  • It's sad for the IT industry that such a thing still causes problems. I'm really afraid of the year Unix time will overflow 32-bit integers. (2039 or smth)

    Thanked by 1: klikli
  • Corey Member

    @subigo why don't you use ntpd?

    BitAccel - OpenVZ VPS / IRC,VPN,Anything Legal & Unrivaled Support!
  • subigo Member

    @Corey said: @subigo why don't you use ntpd?

    See: All the people having issues right now.

    I never install a service for something a simple cron can do.

  • PAD Member

    NTPD isn't needed. A cron can do its job. @subigo has a point - it's a useless service. Lol.

  • vedran Moderator

    @gsrdgrdghd said: I'm really afraid of the year Unix time will overflow 32-bit integers. (2039 or smth)

    Everyone will be using 64 bit by then?

  • @vedran said: Everyone will be using 64 bit by then?

    Well, let's hope so :P

  • PAD Member

    We'd have a problem if we're still using 32-bit/64-bit in 2039. LOL.

  • Well, 64-bit shouldn't be a problem; there's plenty of space to expand :D

    However, I don't like to make any assumptions about 2039; just think back to what the world thought about computing in 1985.
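
For reference, the 32-bit overflow discussed above falls in January 2038 rather than 2039. A minimal sketch to check the date, assuming GNU `date` (for the `-d @SECONDS` syntax) and a bash with 64-bit arithmetic; the helper name `epoch_to_utc` is made up for illustration:

```shell
#!/bin/bash
# Largest value a signed 32-bit time_t can hold.
MAX_32BIT_TIME=$(( (1 << 31) - 1 ))

# Convert epoch seconds to an ISO 8601 UTC timestamp (GNU date syntax).
epoch_to_utc() {
    date -u -d "@$1" +%Y-%m-%dT%H:%M:%SZ
}

echo "$MAX_32BIT_TIME"          # 2147483647
epoch_to_utc "$MAX_32BIT_TIME"  # 2038-01-19T03:14:07Z
```

One second after that timestamp, a signed 32-bit counter wraps to a negative value, which is the failure people worry about.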

  • Jacob Member

    No crashes, just 2 failed drives.

  • @PAD said: NTPD isn't needed. A cron can do its job. @subigo has a point - it's a useless service. Lol.

    That is one of the dumbest things I've ever read here, which is saying a lot.

    My Advice: : VPS Advice | My Blog: : raindog308.com
  • subigo Member

    @raindog308 said: That is one of the dumbest things I've ever read here, which is saying a lot.

    Back that statement up. I say putting NTP on production servers is one of the dumbest things ever.

    Thanked by 1: Taylor
  • @subigo said: Back that statement up. I say putting NTP on production servers is one of the dumbest things ever.

    I say there are environments where having closely synchronized clocks is a requirement. Kerberos-based environments for example. Coordination with external partners where time is significant is another. Not all environments like a brute-force daily time change where the clock jumps forward a half-second or two.

    PAD said it is "useless" which means there are absolutely no uses whatsoever.

    My Advice: : VPS Advice | My Blog: : raindog308.com
  • subigo Member

    @raindog308 said: I say there are environments where having closely synchronized clocks is a requirement. Kerberos-based environments for example. Coordination with external partners where time is significant is another. Not all environments like a brute-force daily time change where the clock jumps forward a half-second or two.

    PAD said it is "useless" which means there are absolutely no uses whatsoever.

    I think he's just saying there's no reason to use a service to get and set time, when you already have a service installed that can do it for you (cron).

    I've had so many issues with NTP in the past that I refuse to ever use it on an important production server again. Instead, I have a series of pages across the Internet like this one: http://94.249.244.181/date.php

    ...then on production servers I just set up a cron to grab that page and set the date every ten minutes.

  • NickM Member

    ntpd does the job, and it does it well. Using cron for this, on the other hand, is a hackish solution, at best.

  • sleddog Member
    edited July 2012

    4 Debian Squeeze boxes here running the standard kernel & ntpd (2 hardware boxes, 2 KVM). No issues.

    If I'm reading correctly, the leap second gets inserted sometime during the day on June 30, UTC. It's now July 1 UTC....

  • subigo Member
    edited July 2012

    @NickM said: ntpd does the job, and it does it well. Using cron for this, on the other hand, is a hackish solution, at best.

    Bullshit. NTP has had so many bugs over the years it's retarded. One of my first VPS nodes crashed daily for about a week and it ended up being a fast clock that NTP choked on while trying to sync. You realize NTP is a cron that does nothing more than grab the time from other servers, correct?

    edit: I'm just going to leave this here: buglist

  • BuzzPoet Member
    edited July 2012

    Well, this is how Ubuntu 12.04 with NTP installed handled it on my laptop (from syslog):

    656 Jun 30 19:59:59 laptop kernel: [72245.253499] Clock: inserting leap second 23:59:60 UTC

    Went off without a hitch. I don't know why people think Debian is more stable than Ubuntu. Ubuntu, even in non-LTS releases, has always been more stable for me, and they accomplish this in 2 months of beta testing compared to Debian's 6-7 months.

    • Oh yeah, I should point out that one of the links on that ServerFault page said that :60 is not POSIX compliant. That's why Red Hat counts 23:59:59 twice. Notice that Ubuntu ignores the POSIX specification and makes a system that just works.
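
The POSIX point can be seen directly: POSIX time counts every day as exactly 86400 seconds, so 23:59:60 has no timestamp of its own. A small sketch, assuming GNU `date` and the standard (non-"right") time zone data; the variable name `BOUNDARY` is only illustrative:

```shell
#!/bin/bash
# Epoch second of 2012-07-01 00:00:00 UTC; the leap second was inserted just before it.
BOUNDARY=1341100800

# Print the wall-clock time for the two consecutive epoch seconds around the boundary.
for ts in $((BOUNDARY - 1)) "$BOUNDARY"; do
    date -u -d "@$ts" +%H:%M:%S
done
```

Under those assumptions the two consecutive timestamps print as 23:59:59 and 00:00:00, with nothing in between for :60, which is why a system either repeats a second or steps outside the specification.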
  • fly Member
    edited July 2012

    @subigo what are you trying to say by linking the buglist? software has bugs.... surprised?

    if you think your php solution is better than ntp.... that's kinda funny.

    https://bugs.php.net/bug.php?id=50696

  • If you don't want to use ntp, rdate out of cron is probably simpler than parsing php.

    My Advice: : VPS Advice | My Blog: : raindog308.com
  • subigo Member

    @kbar said: @subigo what are you trying to say by linking the buglist? software has bugs.... surprised?

    if you think your php solution is better than ntp.... that's kinda funny.

    https://bugs.php.net/bug.php?id=50696

    Yeah, but software with hundreds of open bugs, most of which are over a year old, is retarded when that software is nothing more than a time sync.

    And don't be stupid. Did you even read that PHP bug report (yeah, I saw it on Hacker News too)? Are you going to tell me that using date(); and pulling that number via a bash script is somehow more complex than NTP? I do in 10 lines of code what NTP does in tens of thousands.

  • @subigo said: I do in 10 lines of code what NTP does in tens of thousands.

    10 lines of code excluding the webserver and php code.

    FreeVPS.us - The oldest post to host VPS provider
  • subigo Member

    @dmmcintyre3 said: 10 lines of code excluding the webserver and php code.

    Including the PHP code... it's just "date()".

    And who says you have to use your own server to pull data from? There's a million places to pull official atomic time and NTP time (like, I don't know, maybe the NTP pool servers)... or if you don't like that: http://tycho.usno.navy.mil/cgi-bin/timer.pl

    Seriously people... do you even know what NTPd is doing? It's pinging one of the external pool servers and then setting the OS time. That's it. Nothing else.

  • NickM Member

    @subigo said: Seriously people... do you even know what NTPd is doing? It's pinging one of the external pool servers and then setting the OS time. That's it. Nothing else.

    Actually, it's a little more complicated than that. It measures the delay between your server and the ntp server, and a bunch of other stuff. ntpd is almost always going to be more accurate than anything you're going to claim is "just as good".

    Thanked by 2: raindog308, yomero
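
The delay measurement mentioned above can be sketched with the standard NTP on-wire arithmetic, which uses four timestamps: T1 (client sends), T2 (server receives), T3 (server replies), T4 (client receives). This is only an illustration in integer milliseconds with made-up function names; a real ntpd additionally filters many samples and slews the clock gradually.

```shell
#!/bin/bash
# Arguments for both functions: T1 T2 T3 T4, all in milliseconds.
# T1 and T4 are read from the client clock, T2 and T3 from the server clock.

# Estimated offset of the client clock relative to the server.
ntp_offset_ms() {
    echo $(( (($2 - $1) + ($3 - $4)) / 2 ))
}

# Round-trip network delay, excluding the server's processing time.
ntp_delay_ms() {
    echo $(( ($4 - $1) - ($3 - $2) ))
}

# Example: a client running about half a second behind the server.
ntp_offset_ms 1000 1530 1540 1060   # 505
ntp_delay_ms  1000 1530 1540 1060   # 50
```

Fetching a timestamp over HTTP and applying it directly cannot separate the offset from the network delay in this way, which is the accuracy difference being argued about.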
  • subigo Member

    @NickM said: Actually, it's a little more complicated than that. It measures the delay between your server and the ntp server, and a bunch of other stuff. ntpd is almost always going to be more accurate than anything you're going to claim is "just as good".

    Right... it pings the NTP pool and takes the average of the most similar times in the pool and then sets the OS time accordingly.

    So you're installing an entire service package so you can sync your system up with a pool... great. And then if your clock happens to be out of sync too much, ntpd will refuse to set the time (unless you override it) and sometimes goes into a loop, crashing your server. Or one of the other hundreds of bugs that happen when ntpd doesn't feel like running correctly.

    I run a simple cron to keep my network in sync and it has worked for years. Over the course of a year, my time might (MIGHT) be off by 1-2 minutes when compared to the NTP pool. And if I really cared about syncing up with the NTP pool (which I do not), I could simply run ntpd on the server that my crons pull from. If you did that, you'd be in sync with the pool and you wouldn't even have to run the ntpd service.

    I don't care who you are or what you say... running an entire buggy service to keep in sync with another network pool will never be as safe and as simple as:

    #!/bin/bash

    # Fetch the current time string from the reference page, then set the system clock to it.
    TIME=$(wget -q -O - http://node5.zensix.com:1111/status/date.php)
    date $TIME
    
  • NickM Member

    I've used ntpd on all of my servers, for years, with 0 problems. And you're acting like ntpd is some huge daemon that eats your RAM and disk and CPU, but it's not. It's less than a megabyte installed, and uses 1.8MB of RAM. You may not need or want your servers to have a reliable date and time, but I do, and most other people do. In fact, as mentioned before, it's required for many services.

    Thanked by 1: TheHackBox
  • It was fixed in 2.6.32 and later, but anyone who gets backports should be fine as well. So far so good on all servers.

  • @NickM agreed. I have customers who expect the time on their VMs to be correct, and some programs require it. Exim will occasionally crash if the time goes backwards, and, strangely, running NTPD fixed that issue on one of our email servers. We noticed the clock was drifting by nearly a minute over a 24-hour period, so we sync the clock every 15 minutes on that node. Onboard clocks are not designed to be terribly accurate, hence the need for frequent syncing. I have some servers that are required to sync every minute to an installed GPS system, to make sure we're within ±5 ms of GPS time so that logs are accurate. Those of us with Cisco gear using msec timestamps in our logs also require this when working with Cisco TAC to resolve certain complicated problems that crop up in the software.

  • yomero Member

    @NickM said: You may not need or want your servers to have a reliable date and time, but I do, and most other people do. In fact, as mentioned before, it's required for many services.

    +1

  • subigo Member

    @NickM said: I've used ntpd on all of my servers, for years, with 0 problems. And you're acting like ntpd is some huge daemon that eats your RAM and disk and CPU, but it's not. It's less than a megabyte installed, and uses 1.8MB of RAM.

    Not everyone has flood insurance, but everyone who has had their houses washed away in a flood does.

    @NickM said: You may not need or want your servers to have a reliable date and time, but I do, and most other people do. In fact, as mentioned before, it's required for many services.

    Maybe you missed the part where my servers are always just as on time as any server running ntpd. And I certainly haven't come across any services that require ntpd to be running... on OpenVZ nodes or shared servers.

    You guys can all run ntpd for the rest of your lives. I don't care what services you run, at all. I'm just telling you that ntpd is buggy, has always been buggy, and you're lucky it's never caused you problems.

  • rds100 Member

    Yes, I don't trust ntpd too much. That's why I run it on separate small boxes; my other servers sync from the ntpd boxes via ntpdate run from cron.

  • subigo Member

    Also, I love the fact that people are in a thread saying how stable ntpd is, when that thread is about how ntpd crashed a shit ton of servers today.

  • rds100 Member

    @subigo all software has bugs, including ntpd. But a server crashing because of ntpd is not really ntpd's fault; it's the fault of other software (i.e. the kernel).

    Thanked by 2: TheHackBox, yomero
  • Oliver Member

    No issues here and FWIW I live in the future (as far as timezones go).

    Ransom IT | ɹǝpun uʍop sdʌ | vps down under | AU/NZ VPS Provider | KVM in Sydney, Adelaide and Auckland | OpenVZ in Sydney and Melbourne
  • MikHo Member

    @oliver I'll pm you for tomorrow's lottery numbers, ok? 50-50 split?

    http://www.lowendguide.com/ - the guides to administer your lowend vps | Make money writing tutorials
    Free CPanel Shared Hosting Locations: Miami (US) | Rotterdam (NL)
  • dmmcintyre3 Member
    edited July 2012

    I have used this to keep my Xen/KVM/dedicated systems' clocks in sync for over a year without issues.

    # cat /etc/rc.local|grep ntpdate
    ntpdate -u time.server.ip.addr
    # cat /etc/crontab|grep ntpdate
    23 */4 * * * root ntpdate -u time.server.ip.addr >/dev/null
    FreeVPS.us - The oldest post to host VPS provider
  • Zigga Member

    @MikHo said: @oliver I'll pm you for tomorrow's lottery numbers, ok? 50-50 split?

    I live in the future too; if @Oliver won't take you up on it, I will. Make sure it's tatts lotto with the huge powerball thingo

    google that will ya?

  • Oliver Member
    edited July 2012

    I lied, seems one of my nodes had decided the leap second was a problem.

    Fixed it by running the following then restarting a few VPS containers that were hogging CPU...

    date -s "`date -u`"
    


    Ransom IT | ɹǝpun uʍop sdʌ | vps down under | AU/NZ VPS Provider | KVM in Sydney, Adelaide and Auckland | OpenVZ in Sydney and Melbourne
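
A hedged sketch of how the fix above might be wrapped: check whether the kernel logged a leap second insertion (using the wording of the syslog line quoted earlier in the thread) and only then reset the clock. The function names and the log path are assumptions for illustration, and `reset_clock` needs root.

```shell
#!/bin/bash
# Succeeds if the given log file contains the kernel's leap second notice.
leap_second_logged() {
    grep -q 'inserting leap second' "$1"
}

# Set the clock to its current value, as in the command above.
# Wrapped in a function so nothing runs until it is called explicitly.
reset_clock() {
    date -s "$(date -u)"
}

# Hypothetical usage (the log path varies by distribution):
#   leap_second_logged /var/log/syslog && reset_clock
```

Several later replies report that setting the time this way was enough to bring CPU usage back down, although some nodes still needed a reboot.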
  • @Oliver said: I lied, seems one of my nodes had decided the leap second was a problem.

    Fixed it by running the following then restarting a few VPS containers that were hogging CPU...

    Java and MySQL were at 100% CPU... I had to fix it the same way. No need to restart applications/containers.

    IperWeb & Prometeus, Hosting Provider since 1997. iwStack cloud infrastructure
  • efball Member

    My debian and RHEL machines were all fine, but all three of my Ubuntu boxes (desktop machines, 2 Lucid, 1 Precise) went wonkers. The CPU was crazy busy. I tried killing and restarting things but they just wouldn't run right. A reboot fixed everything.

  • Had a few nodes go absolutely bonkers on CentOS6

    Had some Ubuntu nodes go a little weird.

    Rebooted all nodes anyways for good measure. Only one out of 100+ failed to come back up! Joy ;) All good now

  • Robert Member

    In my day job we had about 30-40 Ubuntu boxes go crazy. We rebooted quite a few before someone worked out that setting the time fixed it.

  • The time fixes we applied fixed some nodes; however, some just didn't want to drop their loads.

  • Wtf, yeah mine is down.. Just noticed after 2 days -_-

  • huluwa Member

    Kernel too old; upgrade it. haha

    Do...dudu
