Convincing OVH something's wrong

jh Member
edited September 2015 in General

Does anyone have any experience talking to OVH about broken hardware? One of our clients got one of their higher end servers and paid the extra for hardware RAID (about £500 total). I log on, install CentOS 7 in the CP and after a few hrs it goes read only. The RAID controller says everything's ok - VD, PD, controller, battery etc but it's trying to rebuild. I reboot, same all over again.

I spoke to them this morning. I was polite, showed them evidence and asked if they could check it physically and maybe replace the controller. They promised me a solution today. Just before the close of business, I contacted them again to ask what they're doing and they told me that the RAID controller reports itself as healthy.

Comments

  • @jh said:
    Does anyone have any experience talking to OVH about broken hardware? One of our clients got one of their higher end servers and paid the extra for hardware RAID (about £500 total). I log on, install CentOS 7 in the CP and after a few hrs it goes read only. The RAID controller says everything's ok - VD, PD, controller, battery etc but it's trying to rebuild. I reboot, same all over again.

    I spoke to them this morning. I was polite, showed them evidence and asked if they could check it physically and maybe replace the controller. They promised me a solution today. Just before the close of business, I contacted them again to ask what they're doing and they told me that the RAID controller reports itself as healthy.

    Run it through their hardware test, which you can start from the recovery console, and see whether it reports any errors.

  • I assume you are using OVH UK?

  • MarkTurner said: I assume you are using OVH UK?

    The server was bought at the Irish store and my UK account is the technical contact, so I've been speaking to the UK people.

    AshleyUk said: Run it through their hardware test, which you can start from the recovery console, and see whether it reports any errors.

    Thanks - I'll give this a go as well.

  • What read-only error did you get exactly? I have a client with a dedicated server that he runs multiple VMs on. His CentOS 7 VMs went into a read-only state, requiring a reboot and fsck, but none of the Ubuntu or CentOS 6 VMs had the error. I haven't been able to reproduce the error since, but nonetheless it seems similar to this.

  • I guess you don't use avatar like yours here in your OVH account. Otherwise, I could see the reason why they didn't believe you :-)

  • jh Member
    edited September 2015

    @OnraHost it said "rejecting i/o to offline device". CentOS 7, stock kernel, no VMs.

    @jvnadr it's a low end avatar to remove obvious connections with my work life :)
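
When a filesystem flips to read-only as described above, it can help to record exactly which mounts are affected before rebooting. The following is only a minimal sketch, assuming the usual Linux `/proc/mounts` layout (device, mount point, type, options); the function name and the sample text are hypothetical and not taken from this thread.

```python
def read_only_mounts(mounts_text):
    """Return (device, mount_point) pairs whose mount options include 'ro'."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, _fstype, options = fields[:4]
        # Options are comma-separated; match 'ro' exactly so that
        # options such as 'errors=remount-ro' are not counted.
        if "ro" in options.split(","):
            found.append((device, mount_point))
    return found


# Illustrative input only (made up for this example, not from the server in the thread).
SAMPLE_MOUNTS = (
    "/dev/sda1 / ext4 ro,relatime 0 0\n"
    "/dev/sdb1 /data xfs rw,noatime 0 0\n"
)

for device, mount_point in read_only_mounts(SAMPLE_MOUNTS):
    print(f"{device} on {mount_point} is mounted read-only")
```

On a live system the same function could be fed the contents of `/proc/mounts`. Some pseudo-filesystems are legitimately mounted read-only, so the result needs a human look before it goes into a ticket.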

  • @OnraHost said:
    What read-only error did you get exactly? I have a client with a dedicated server that he runs multiple VMs on. His CentOS 7 VMs went into a read-only state, requiring a reboot and fsck, but none of the Ubuntu or CentOS 6 VMs had the error. I haven't been able to reproduce the error since, but nonetheless it seems similar to this.

    I have a client with a similar issue: an MG-128 running a CentOS 7.1 hypervisor with MegaRAID and 4 x 800 GB SSDs (good old ext4) on the hypervisor itself. I didn't have time to try to reproduce it, as the client wanted it back online ASAP. After a reboot and fsck everything looked OK; I checked the controller, virtual drive and physical drives, and there have been no issues for almost 3 weeks (knock on wood). I'm starting to think it is related to CentOS 7 itself...

  • Clouvider Member, Patron Provider
    edited September 2015

    The last time a customer of ours had an issue with a RAID controller, we ended up recovering from backups after more than 48 hours of 'the controller is ok', while the controller log (coincidentally at the time monitoring reported the server down) reported 'battery cable broken'.

    The server is still booted in recovery; they haven't fixed it yet (more than a week now). It probably won't be fixed until it expires, and someone will be lucky enough to get a broken server on provisioning :-).

  • I made an order with OVH three days ago. Sent in copies of license, credit card, utility bill along with fingerprints and a blood sample. Absolutely no communication back.

    Sorry for not offering anything positive I just wanted to gripe. Good luck though!

  • @badpatrick they need a sperm/egg sample, too

    I think it might help to call them but that's just OVH for you. Imagine their support always being like that when something does not get fixed automatically.

  • edited September 2015

    badpatrick said: Sent in copies of license, credit card, utility bill along with fingerprints and a blood sample. Absolutely no communication back.

    No eye ball and first born blood? Then you're out of luck with summoning OVH support

  • Eye of newt, hair of frog, hens teeth, and a fat log.

  • Thanks all. I'll try a different OS tomorrow and contact them again on Monday. Remember that this is one of their top-end servers, not a Kimsufi, so although they are supposed to be more helpful, they are probably more reluctant to replace expensive hardware.

  • netomx Moderator, Veteran

    @jh said:
    Thanks all. I'll try a different OS tomorrow and contact them again on Monday. Remember that this is one of their top-end servers, not a Kimsufi, so although they are supposed to be more helpful, they are probably more reluctant to replace expensive hardware.

    Maybe Debian + Proxmox for your CentOS needs?

  • singsing Member
    edited September 2015

    jh said: they are probably more reluctant to replace expensive hardware

    While I'm hardly OVH's biggest fan, I seriously doubt that the problem is OVH's reluctance to replace broken hardware.

    Mostly, they seem to be doing an experiment which I would call "clean-room hosting", where the technical team is completely isolated from the customer support team. (This is partly meant as a joke, but I do think it is quite hard to get escalation to technical from OVH support).

    Their main strategy is probably to monitor everything possible and replace something if -they- notice it's broken. And then assume customers have no idea what they're talking about for any remaining problems. Incidentally, they are probably right about this at least 95% of the time. But it sucks when the chips fall into that remaining 5% category.

  • perennate Member, Host Rep
    edited September 2015

    I ordered an EG-128 with hardware RAID controller and SSD in OVH BHS, the RAID controller was broken and the virtual disks kept freezing after ~30 minutes of being booted. Contacted OVH BHS and they replaced the RAID controller within six hours. The server failed to reinstall from panel (probably something wasn't synchronized between RAID controller and their management software), but this was fixed after another two to three hours (the issue with reinstall was automatically reported since the panel ran into an error; I think I called them anyway, but I don't think that helped in that case; the first time they did "hardware diagnosis", the issue wasn't solved, but after the second time all was good).

    Here's the ticket I opened (called them immediately after opening):

    Hi,

    We believe this server has a faulty RAID controller. First I will say what we have done and what signs we are seeing that point to faulty RAID controller as the source of the problem.

    We have been getting issues on our server where the filesystem frequently freezes, and a hard reset is necessary to bring the system back online. We did not see any problems in /var/log/syslog or other log files; when the problem occurs, any further communication with the disk is terminated and only cached data can be read/written (we tried dd if=/dev/sda1 of=/run/test bs=64k count=4, even this command to just read from the partition fails).

    We then booted the server into rescue mode and mounted the /dev/sda1 partition to /mnt/sda1. We verified that the mount succeeded and files were accessible. After approximately 20 minutes, however, any operation involving the partition froze and we saw the same frozen-disk symptoms. Furthermore, we attempted to run fdisk -l /dev/sdb and this also froze (/dev/sda is the 2x SSD in RAID1 while /dev/sdb is the 2x HDD in RAID1; these are two independent logical RAID devices, so it seems unlikely that it is a disk issue).

    We then also tried running these commands to see more information about the status:

    MegaCli -AdpAllInfo -aALL
    MegaCli -LDInfo -Lall -aALL | egrep 'Adapter|Size' | grep -v Strip

    However both of the commands simply freeze as if they are not able to contact the RAID controller or some similar problem.

    The server is currently booted into rescue mode (ssh root@{redacted}; I assume you have access to the password or can connect via SSH key), so if you want to see in more detail what the issue is then you can connect. You will observe that both fdisk -l /dev/sda and fdisk -l /dev/sdb fail.

    We do not have any important data on these servers. However if the RAID controller is replaced then I assume that the data can be preserved? This would be easier so we wouldn't need to reinstall, but if not possible then please proceed with whatever action that will fix the problem.

    Thank you.

    The data was lost after the RAID controller was replaced, but this was a new server (and I didn't try to see if the RAID controller could auto-detect the RAID configuration from the disks without reinstall).

    Anyway it probably won't help you since ovh.ie support is terrible.

    Edit: also in your case it sounds like it might be more likely that the disk is malfunctioning?
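
The ticket above relies on the observation that disk commands freeze instead of returning. That kind of check can be scripted so it produces a clear yes/no answer for support. What follows is a rough sketch under stated assumptions, not anything the poster actually ran: the helper name `finishes_within` and the timeout values are hypothetical, and the demo uses harmless stand-in commands instead of real disk tools.

```python
import subprocess
import sys
import time


def finishes_within(command, timeout_seconds):
    """Return True if the command exits within timeout_seconds, else False."""
    process = subprocess.Popen(
        command,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if process.poll() is not None:
            return True  # the command returned (with any exit status)
        time.sleep(0.05)
    # Request a kill but deliberately do not wait for it: a process stuck
    # in uninterruptible disk I/O may never exit, and waiting would hang
    # this script as well.
    process.kill()
    return False


# Stand-in commands for demonstration. On an affected server one might
# pass something like ["fdisk", "-l", "/dev/sda"] from the ticket instead.
quick = [sys.executable, "-c", "pass"]
stuck = [sys.executable, "-c", "import time; time.sleep(30)"]

print("quick command returned:", finishes_within(quick, 10))
print("stuck command returned:", finishes_within(stuck, 0.5))
```

Polling with `Popen` is used here instead of `subprocess.run(..., timeout=...)` because the latter waits for the child after killing it, which could block indefinitely if the child is stuck on the controller.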

  • LordSpock Member, Host Rep

    I always ring their French brand up, and speak to them in French - they are more than happy to help me then :)

  • @LordSpock said:
    I always ring their French brand up, and speak to them in French - they are more than happy to help me then :)

    Only 70.8% of people actually passed their French GCSE. :P

  • @TropiThomas said:
    Only 70.8% of people actually passed their French GCSE. :P

    I didn't even do French back at high school.

  • @wych said:
    I didn't even do French back at high school.

    Nor did I. But only 70% of kids this year passed. 1% more than last year :P

  • jh Member
    edited September 2015

    TropiThomas said: Only 70.8% of people actually passed their French GCSE. :P

    The rest probably don't pass the others either knowing our crappy state education system. Then they go on to do a BTEC and end up at the JCP and finally MCD.

    I'll try the French office tomorrow. I did fine with Online.net's telephone support!

  • jh said: I did fine with Online.net's telephone support!

    They do know quite a lot of English.

  • wych said: They do know quite a lot of English.

    I spoke to three of them, and all were confident enough to speak to me in English.

  • FlamesRunner Member
    edited September 2015

    Vous êtes sérieux?
    Au Canada, à l'école primaire et secondaire, vous devez apprendre le français.

    Translation: It's good to be in Canada.

    Now, if I could only speak Spanish...

  • jh Member
    edited September 2015

    @FlamesRunner said:
    Vous êtes sérieux?
    Au Canada, à l'école primaire et secondaire, vous devez apprendre le français.

    Translation: It's good to be in Canada.

    I went to a public school and dabbled in English, French, Spanish, Italian, German and Mandarin. There was also Latin and Greek. Some 95% of the students got entirely A* or A grades and as for French, I sat GCSE early and got full marks in French, then sat A levels and got full marks in French again and also did an "Advanced Extension". I forgot what I knew in the others though - turns out the teachers lied - most people I encounter are happy to speak English ;)

    That's not the case in the state schools though - I know because I moved from a dodgy state school to a decent public school. They did a German exchange when I was 12 and in the English class, the teacher was correcting the English students...

  • Um, dodgy state school?

    I don't exactly know how you guys run your education system, so I can't exactly get at what you're saying :/

  • jh Member
    edited September 2015

    FlamesRunner said: Um, dodgy state school?

    We have 3 tiers: state schools (free, generally in disrepair), private schools (average about $13k a year, no help from the state, generally better results) and public schools (similar to private but more history/prestige). My public school was founded in the 1500s and the teachers still wear gowns!

    There are also some in-between ones - faith schools are very common and I think they get some funding from the Church and some from the state. Also academies - state schools that supposedly are less controlled by politicians.

    So in summary your academic prospects are largely determined by your parents' income. My wife and I jokingly debate where our children might go - I'm insistent it won't be a state school and she doesn't want them talking like the Queen.

    Dodgy is a colloquial British word that means poor quality.

  • Well, I knew what dodgy meant, but thanks for the explanation :)

  • @jh said:
    Does anyone have any experience talking to OVH about broken hardware? One of our clients got one of their higher end servers and paid the extra for hardware RAID (about £500 total). I log on, install CentOS 7 in the CP and after a few hrs it goes read only. The RAID controller says everything's ok - VD, PD, controller, battery etc but it's trying to rebuild. I reboot, same all over again.

    I spoke to them this morning. I was polite, showed them evidence and asked if they could check it physically and maybe replace the controller. They promised me a solution today. Just before the close of business, I contacted them again to ask what they're doing and they told me that the RAID controller reports itself as healthy.

    I had a similar problem with a P410 controller in a DL120G7 at online.net via Oneprovider: the RAID controller reported everything as fine, but dmesg was reporting I/O errors and the array kept going read-only.

    Diagnostics kept reporting it as OK even though it wasn't. Credit to Oneprovider, though: they arranged a replacement server after I sent them several logs showing the I/O errors. Something was definitely faulty, as I didn't get the problem on the replacement (another DL120G7 with the P410 RAID controller).
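
Since logs showing the I/O errors were what finally persuaded the provider here, pulling just the relevant kernel messages out of a saved dmesg dump may be worth scripting. This is only a sketch: the pattern list is an assumption based on commonly seen Linux kernel messages plus the "rejecting i/o to offline device" text quoted earlier in the thread, and the sample log lines are invented for illustration.

```python
import re

# Case-insensitive patterns for messages that usually indicate disk trouble.
# The list is an assumption and almost certainly incomplete.
DISK_ERROR_PATTERNS = [
    re.compile(r"rejecting I/O to offline device", re.IGNORECASE),
    re.compile(r"I/O error", re.IGNORECASE),
    re.compile(r"remounting filesystem read-only", re.IGNORECASE),
]


def disk_error_lines(log_text):
    """Return the lines of log_text that match any disk-error pattern."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in DISK_ERROR_PATTERNS)
    ]


# Invented sample, not real output from any server in this thread.
SAMPLE_LOG = """\
[ 1201.100000] sd 0:2:0:0: rejecting I/O to offline device
[ 1201.200000] EXT4-fs (sda1): Remounting filesystem read-only
[ 1300.000000] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex
"""

for line in disk_error_lines(SAMPLE_LOG):
    print(line)
```

Saving the matched lines with their timestamps next to the controller's own "healthy" report makes the contradiction easy for support staff to see.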

  • IDK what OVH is like overseas, but here in their BHS-Canada location, they're great. I didn't have to send them any proof of address, and my server was up and running within 30 minutes of ordering. I've had a few hiccups with their network, but everything was fixed in a matter of minutes.
