Host Issues with IPMI and hardware

edited January 2014 in General

I recently signed up with a host that I won't disclose for now.

I was assigned hardware and found that the server was missing 4GB of RAM: 20GB instead of 24GB. I brought it to their attention, and they told me this was a rare instance in which the wrong setup had been delivered. Okay. They offered me either $3 off my monthly bill or a move to a machine with 24GB. I opted for the latter. They did tell me there would be downtime during the move (and that I would still be billed for that time). Again, I didn't have much of an issue with that, as long as it wasn't more than a few hours.

The server is in a Dell chassis and has a MegaRac IPMI interface (with integrated KVM). I had loaded ESXi on the machine and was fine with it, but I decided I wanted to load Server 2012 and just use Hyper-V for my pet projects. In the middle of the install (using remote media), MegaRac completely died on me. I could not access the interface at all. I tried accessing it from IPMITool to reboot the MC using a cold reset. No dice.

My host advised me that since the server was unmanaged, I would get no assistance with this. They offered me three options: rent a KVM at $30, pay $20 for a cold rack reboot (remote hands scheduled with other work, a 'discounted rate' as they referred to it), or wait anywhere from 1-4 weeks to be moved to different hardware, with no firm ETA. Unfortunately, my hands are tied, so I paid the $20 and opted for the cold rack reboot (in the hope that this would fix IPMI, with no guarantee).

So, we're about 24 hours out with no IPMI or access to the server, on hardware that was delivered incorrectly. I understand that things might be harder back east due to the storms and such, but that's a lot of things not going right. It makes me worry that, down the road, if I have a RAM module or hard drive failure, I will be charged for the replacement time even though it's not my hardware. They've been somewhat responsive, but I feel like I've gotten the short end of the stick, paying $75 and using the server for no more than about 2 days.

Do I stick it out or cut my losses and run?

Comments

  • We have had this issue before on our Dell servers where the IPMI just stops for no apparent reason. After messing with the jumpers and CMOS on the motherboard it seems the only way to fix it is to replace the board which is what we eventually did. The issue was resolved after that and it seems your provider will have to do the same.

  • Just when you thought Dell Drac couldn't get worse - AMI MegaRac came along to trump it all. I have had experience of Megarac on a Dell server, it was NOT pleasant.

  • edited January 2014

    @fizzyjoe908 said:
    We have had this issue before on our Dell servers where the IPMI just stops for no apparent reason. After messing with the jumpers and CMOS on the motherboard it seems the only way to fix it is to replace the board which is what we eventually did. The issue was resolved after that and it seems your provider will have to do the same.

    They'll most likely charge me to do that replacement, which I don't feel I have to pay since it's a hardware issue. I've rented for years now and never once had a provider charge me for equipment failures on their end. Let's hope it's not the same.

    Maybe once they move me to the new machine, the problem will fizzle out...

  • Isn't there some sort of money back guarantee? If it's faulty hardware, you shouldn't be charged for replacing it. After all it's not your own server coloed, it's their server, you pay for a server with functional hardware and they have to provide it.

  • If you rent the server, then unless it's something you have caused, the liability to repair it is with the company renting it to you.

    They can't charge you - it's like when you rent a car: if it breaks down, they pay to have it fixed. Either they fix it.

    We have some issues with some Dell Megarac based servers, we're working around it by giving the customer a dedicated KVM and hardware power cycler and just forgetting the Megarac ever existed.

  • edited January 2014

    @rds100 said:
    Isn't there some sort of money back guarantee? If it's faulty hardware, you shouldn't be charged for replacing it. After all it's not your own server coloed, it's their server, you pay for a server with functional hardware and they have to provide it.

    That's my understanding, but apparently IPMI isn't covered under hardware (and some are built into the chassis)?

  • If IPMI is advertised as part of the package then they should provide you a comparable solution without cost.

  • @daxterfellowes said:
    That's my understanding, but apparently IPMI isn't covered under hardware (and some are built into the chassis)?

    That isn't correct. IPMI is always either built in to the motherboard or a PCI addon card.

  • qps Member, Host Rep

    We have quite a few of the Dells with the MegaRac BMC. When they work, they usually work pretty well. Generally, when they stop responding, they need the firmware reflashed to be brought back to life.

  • @MarkTurner the HP iLO 2 that's on the server with Delimiter works great! Never had an issue with it responding. Although remote virtual media is wonky slow, but that's just virtual media in general. Your staff even said they'll remote mount for me if I ever needed to again.

    The company has since contacted me to let me know that the $20 is for un-sledding and re-sledding the server to do a cold reboot in the hopes IPMI will come back. I'm still not quite sure why that's been passed on to me.

  • qps Member, Host Rep
    edited January 2014

    daxterfellowes said: The company has since contacted me to let me know that the $20 is for un-sledding and re-sledding the server to do a cold reboot in the hopes IPMI will come back. I'm still not quite sure why that's been passed on to me.

    I agree that you shouldn't be charged for this.

    What model server is this?

  • @daxterfellowes - they can't charge you for their hardware failure. Is this a real company or a back bedroom operation?

  • @qps - the key point 'when they work' but suddenly they drop connections, go into hibernation mode or die.

    These AMI derived IPMIs are dreadful

  • qps Member, Host Rep
    edited January 2014

    MarkTurner said: @qps - the key point 'when they work' but suddenly they drop connections, go into hibernation mode or die.

    These AMI derived IPMIs are dreadful

    You make it sound like this happens all the time. In our experience, this doesn't happen very often. Most of them work just fine.

    I should note that we have hundreds of servers in service with this type of BMC, so we speak from experience.

  • edited January 2014

    The host has contacted me through private message (could they ever guess it was me from...).

    They informed me that my billing period would be extended once the originally ordered hardware is delivered. I'm still probing regarding the 'remote hands' fee.

    I appreciate all the input from you guys, and thanks additionally to @MarkTurner for having awesome staff over at Delimiter on my other machine.

  • @qps - I've found the opposite with both Supermicro and Dell (especially Dell) Megarac IPMIs. Thankfully we run 99% HP, and their iLO has been rock solid.

  • @daxterfellowes said:
    MarkTurner the HP iLO 2 that's on the server with Delimiter works great! Never had an issue with it responding. Although remote virtual media is wonky slow, but that's just virtual media in general. Your staff even said they'll remote mount for me if I ever needed to again.

    The company has since contacted me to let me know that the $20 is for un-sledding and re-sledding the server to do a cold reboot in the hopes IPMI will come back. I'm still not quite sure why that's been passed on to me.

    Unless they have an amazing network or a unique location that you require, DUMP THEM!

    I wouldn't be paying $20 remote hands for reboot on a dedicated server, especially if IPMI broke.

    $3 off for 4GB of missing RAM sounds like a joke. How much are you paying for this server? To me, it seems like they're charging you colo-esque fees for a rented dedi. If something on their server broke, it's their job to fix it, unmanaged or not.

  • @nunim said:
    $3 off for 4GB of missing RAM sounds like a joke. How much are you paying for this server? To me, it seems like they're charging you colo-esque fees for a rented dedi. If something on their server broke, it's their job to fix it, unmanaged or not.

    It's cheap for the hardware, $53/month for 24GB DDR3 and Dual Intel Xeon L5520.
    I don't need a managed server, all I require is that I have working IPMI. I don't even care about routes or network blend. I just want functioning.

  • @daxterfellowes said:
    It's cheap for the hardware, $53/month for 24GB DDR3 and Dual Intel Xeon L5520. I don't need a managed server, all I require is that I have working IPMI. I don't even care about routes or network blend. I just want functioning.

    That's not a bad deal, but there are cheaper providers out there. Mind naming the provider? I still feel that 4GB of RAM is worth more than $3, and I don't know why it's taking them this long to correct it.

  • edited January 2014

    It's @Fliphost. I'm sure I'm shooting myself in the foot by doing this (for any future support I may need), but I'm sure they can chime in and give their side of the story if they wish (since there are always two sides to things).

  • They likely had to pay for the datacenter's remote hands technicians to do the reboot for you which they then passed the cost on to you. For this reason, it is smart to work out a deal with your facility to provide either free simple remote hands such as that or a limited number of free hours per month.

  • I had the same problem; I just downloaded Supermicro IPMIView to manage it and cold reset the IPMI. This server's IPMI doesn't look very stable.

    I thought they could do a better job. They attached the only SSD to the RAID card at SATA 1 link speed... It cost me a lot of time to find the problem and a solution, just because it's an 'unmanaged server'.

    In fact, not too bad, but should be better.

  • edited January 2014

    @fizzyjoe908 said:
    They likely had to pay for the datacenter's remote hands technicians to do the reboot for you which they then passed the cost on to you. For this reason, it is smart to work out a deal with your facility to provide either free simple remote hands such as that or a limited number of free hours per month.

    They've told me the remote hands fee that I paid could not be refunded, which I understand. I've colocated before and know how costly things like that are. However, it appears that they have their own technician doing the work since someone was "not able to do it today" due to inclement weather. Which was strange since Dallas, TX was not having any inclement weather to my knowledge and all traffic cams for Dallas show clear weather (just cold, like all places are).

    @hepochen - I've already tried every method in the book to revive the IPMI via all available IPMI tools. Unfortunately, it won't respond to them anymore. It pings perfectly fine, but won't respond to login at all.

    EDIT: Well butter my rump and call me Thanksgiving dinner; either the host just fixed it or the Supermicro SMCIPMITool http://www.supermicro.com/products/nfo/ipmi.cfm was able to connect and reset the BMC...

  • hepochen said: I had the same problem; I just downloaded Supermicro IPMIView to manage it and cold reset the IPMI. This server's IPMI doesn't look very stable.

    Correct, it's basically old as hell gear that's not update-able -- thus the issues most of us who had the displeasure of buying into Dell's C6100 XS-TY3 dumping spree have noted.

    Anyway, the issue here is that the web management daemon (Probably some terrible HTTPd like thttpd executing fake asp as cgi) can't keep itself alive to save its own life, but the underlying KVM daemon keeps functioning fine.

    To make this a bit more 'reliable' -- you're welcome to use any remote management IPMI toolset that bypasses the HTTP daemon completely, be it IPMIView, ipmitool, freeipmi-tools, or openipmi; the list goes on.

    Most of these tools will be able to reset the unit even remotely if it's responding to pings but the http daemon is dead -- and as long as suitable ipmi drivers (ipmi_devintf, ipmi_si) have been loaded, both of these commands will fix it:

    ipmitool bmc reset warm  # Instant, usually no delay, but doesn't fix a lot of things.
    ipmitool bmc reset cold  # Cold reset: the ARP cache will flush out and the entire device will reboot.

    Either way, please treat the IPMI on these boxes as something that will probably not be as reliable as most people expect IPMI devices to be, at least not with the HTTP access method, certainly.
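
    The two commands above talk to the BMC in-band through the local kernel drivers. When the OS on the box is not reachable, the same reset can usually be sent over the network instead. A minimal sketch, assuming ipmitool is installed on another machine, the BMC accepts IPMI-over-LAN, and the password is exported as IPMI_PASSWORD (which ipmitool's -E flag reads). The function name, host address, and user name are placeholders, not anything from this thread:

```shell
#!/bin/sh
# Sketch: build an ipmitool command line that resets a BMC over the network.
# bmc_reset_cmd is a hypothetical helper; the host and user are placeholders.
bmc_reset_cmd() {
    host=$1
    user=$2
    mode=${3:-cold}          # "cold" or "warm", as in the commands above
    case "$mode" in
        cold|warm) ;;
        *) echo "mode must be cold or warm" >&2; return 1 ;;
    esac
    # -I lanplus selects IPMI v2.0 (RMCP+); try -I lan if the BMC only speaks v1.5.
    # -E makes ipmitool read the password from the IPMI_PASSWORD environment variable.
    printf 'ipmitool -I lanplus -H %s -U %s -E bmc reset %s\n' "$host" "$user" "$mode"
}

# Print the command for review first; pipe the output to sh to actually run it.
bmc_reset_cmd 192.0.2.10 admin cold
```

    Printing the command rather than running it directly leaves a chance to check the target address before rebooting a management controller you cannot easily get back.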

    -- Paul

  • @Wintereise I had tried IPMITool and freeipmi on both Windows and Linux, and neither was able to connect or authenticate with the daemon. However, once I tried the SuperMicro one, it was able to successfully reset the HTTP daemon.

    Even though this is a Dell...

    Any ideas on that for future references?

  • IPMI is an open standard, so all devices basically act the same and support the same bare minimum set of commands.

    Dell has their extra set of OEM commands, as do Supermicro and HP. Basically, most of these companies have modified versions of ipmitool available with that extra feature set.

    Either way, glad to know that you sorted it out. Wasn't aware of the Supermicro trick, but good find.
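
    To illustrate the "bare minimum set of commands" point: the reset and identify commands are defined in the IPMI specification under the Application network function (NetFn 0x06), so any compliant client can send them to any vendor's BMC. A small sketch that maps a few of them to the bytes accepted by `ipmitool raw`; the helper name raw_bytes is made up for this example:

```shell
#!/bin/sh
# Sketch: standard IPMI "App" commands (NetFn 0x06) as raw bytes.
# raw_bytes is a hypothetical helper, not part of ipmitool.
raw_bytes() {
    case "$1" in
        get-device-id) echo "0x06 0x01" ;;   # what "bmc info" asks for
        cold-reset)    echo "0x06 0x02" ;;   # same effect as "bmc reset cold"
        warm-reset)    echo "0x06 0x03" ;;   # same effect as "bmc reset warm"
        *) echo "unknown command: $1" >&2; return 1 ;;
    esac
}

# Example (prints the command instead of running it):
echo "ipmitool raw $(raw_bytes cold-reset)"
```

    Vendor-specific extras live in separate OEM network functions, which is where the Dell, Supermicro, and HP tools differ.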

  • qps Member, Host Rep

    Wintereise said: Correct, it's basically old as hell gear that's not update-able -- thus the issues most of us who had the displeasure of buying into Dell's C6100 XS-TY3 dumping spree have noted.

    They are absolutely update-able; there was a new firmware released August 23 for the BMC. I believe they were manufactured through 2012, so some of them are still relatively new.

    daxterfellowes said: Even though this is a Dell...

    Interesting - we've had no issues with the standard ipmitool in Linux on these BMCs.

  • @qps said:

    It could have been me, although I believe I was utilizing the correct syntax.
    Dell's IPMI tool keeps giving me:

    Activate Session error: Invalid data field in request
    Error: Unable to establish LAN session
    Error getting BMC time info.

    And still does. Odd.
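
    One way to narrow down a "Unable to establish LAN session" error is to check whether the BMC accepts a session on either of ipmitool's two network interfaces, `lan` (IPMI v1.5) and `lanplus` (IPMI v2.0). This is only a sketch under assumptions: it does not diagnose the cause, the function name and the IPMITOOL override variable are invented for the example, and the password is expected in IPMI_PASSWORD:

```shell
#!/bin/sh
# Sketch: try both ipmitool network interfaces and report which one works.
# IPMITOOL is a hypothetical override so the binary path can be changed.
IPMITOOL=${IPMITOOL:-ipmitool}

probe_interfaces() {
    host=$1
    user=$2
    for iface in lan lanplus; do
        # "bmc info" is a harmless read-only request; -E reads IPMI_PASSWORD.
        if "$IPMITOOL" -I "$iface" -H "$host" -U "$user" -E bmc info >/dev/null 2>&1; then
            echo "$iface: session ok"
        else
            echo "$iface: failed"
        fi
    done
}

# Usage (placeholder address and user):
#   IPMI_PASSWORD=secret probe_interfaces 192.0.2.10 admin
```

    If one interface works and the other does not, that at least shows the BMC's network stack is alive and the problem is in session setup rather than connectivity.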

  • qps Member, Host Rep

    daxterfellowes said: It could have been me, although I believe I was utilizing the correct syntax. Dell's IPMI tool keeps giving me:

    Activate Session error: Invalid data field in request
    Error: Unable to establish LAN session
    Error getting BMC time info.

    And still does. Odd.

    Do you know which version of the BMC firmware you have loaded on there?

  • qps Member, Host Rep

    daxterfellowes said: Assigned hardware and found out that the server was missing 4GB of RAM. 20GB instead of 24GB.

    We found a few of these systems that were only showing 20 GB RAM instead of 24 GB RAM even though 24 GB was installed. All it took to fix was to re-seat the RAM. Apparently it had come a little loose in shipping.
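
    A quick way to tell a loose DIMM from a module that was never installed is to compare what the firmware reports per slot with what the OS sees. A minimal sketch for a Linux host, assuming dmidecode is available and run as root; the function name sum_dimm_mb is made up for this example:

```shell
#!/bin/sh
# Sketch: sum the sizes of populated DIMMs from `dmidecode -t 17` output on stdin.
# Empty slots report "Size: No Module Installed" and are skipped.
sum_dimm_mb() {
    awk '
        $1 == "Size:" && $3 == "MB" { total += $2 }
        $1 == "Size:" && $3 == "GB" { total += $2 * 1024 }
        END { print total + 0 }
    '
}

# Typical use (needs root):
#   dmidecode -t 17 | sum_dimm_mb      # MB the firmware detected
#   grep MemTotal /proc/meminfo        # what the kernel can use
```

    If the detected total is 20480 on a box sold with 24 GB, one slot is probably reporting no module, which fits the re-seating fix described above.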
