BuyVM: All data lost on nodes 27, 41, 52, 58

klikli Member
edited May 2012 in General

Just got the following email from them:

On Friday at 11:10 PM GMT -8 we experienced a catastrophic failure of a power strip, causing near fatal damage to node 27, 41, 52, & 58's file systems.

When the power strip failed, our RAID batteries were damaged causing all data stored in our write-back caches to be lost. We worked quickly with our datacenter to diagnose the issue as we originally thought it was a network failure or a configuration issue. Once we knew the severity of the issue we had our datacenter replace all of our RAID batteries as well as multiple failed hard drives also damaged.

At this time we're awaiting node27 to finish its long FSCK. Nodes 41 & 58 have both returned with severe damage to the file system.

We are quickly working to provision you a fresh VPS with your stored IP information. If you're able to rebuild your VPS with minimal disruption, we would greatly appreciate it if you were able to just continue with a fresh VM as we work to try to recover what we can for other users without any backups.

With that being said, we want to remind everyone that we do have free backup space available under our BuyVM+ product. This product currently offers 5GB of space at no additional cost as well as free DNS hosting.

Although I do not store anything critical on my BuyVM VPS, it would still take time for me to reconfigure things and get it working again :(

Comments

  • FRCorey Member

    Never heard of PDUs going out like that. @buyvm who made the strips?

  • jar Patron Provider, Top Host, Veteran

    It sounds like more than a few people had a rough night. Here's to hoping everyone had backups.

  • NateN34 Member

    Wow, that really sucks.

    And to think I almost used the VPS for very important data...thank god I did not.

  • cedric Member

    @NateN34 said: And to think I almost used the VPS for very important data...thank god I did not.

    This is why you have a backup system of some sort. Hardware failures can happen quite unexpectedly.

  • Infinity Member, Host Rep
    edited May 2012

    Human error possibly. The only time a power strip has failed on me was when my uncle placed a diode in it, killing everything connected to it, including my Q6600 mobo + PSU. That was the worst 'prank' ever, but there is such a thing as bad power strips..

    http://m.youtube.com/?dc=organic&source=mog#/watch?v=7psPwpZWoW0

  • rds100 Member

    Shit happens. Keep backups.

  • Jacob Member
    edited May 2012

    Moral of the story is use software raid. Luckily my node was not affected, but regardless I am syncing hourly backups now.

    Unfortunate. People on IRC were also saying the storage VPSes were not working, though the node was showing online.

  • miTgiB Member

    @Jacob said: I am syncing hourly backups now.

    That is just overkill. Nobody likes these events, but they do happen, and there is no reason to panic if you have a reasonable backup strategy in place. Maybe you can just run replication to a remote MySQL server; if everyone were to knee-jerk to hourly backups, how much would that degrade node performance? :(

  • Aldryic Member

    Just the four nodes mentioned, Liam. We're doing our best to put the pieces back together from this... episode.

  • manma Member

    Looks like I have a new empty VPS. I guess this doesn't bode well for my data...

  • lbft Member
    edited May 2012

    @Jacob said: Unfortunate. People on IRC were also saying the storage VPSes were not working

    I was probably one of the people you saw on IRC. Storage had downtime with AFAIK no data loss for what I assume was an unrelated issue (it probably would've been up sooner if they weren't already battling major problems).

    I also happen to have VPSes on two of the dead nodes (lucky me! 4 out of 60-something nodes die and I'm on two of them). I keep backups of my BuyVM stuff elsewhere because having your sole backup in the same place is a terrible idea - there are any number of failure/disaster scenarios that could take out an entire datacentre.

    @Jacob said: Moral of the story is use software raid

    I would imagine that more data has been lost to software raid bugs/config issues than freak hardware asplosions.

    Edit: I had to open my bloody mouth and tempt fate, maybe 5 minutes after I posted this my storage crapped itself.

  • pcan Member

    The RAID writeback cache battery has claimed another victim, it seems. That reminds me of a big IBM server I managed in the late '90s. This expensive mission-critical machine had an impressive look, multiple redundancies and failovers. It had many processors and a huge proprietary SCSI RAID card. Over time it was repurposed to a less mission-critical role and the maintenance contract was cancelled. Shortly after that, the server simply turned off by itself. The power switch had no effect. This was strange: the machine had 3 power supplies and an impressive service panel full of diagnostic LEDs. They were all off. With no maintenance contract and no hope of a cost-effective repair, I took off the cover and started pulling out the redundant components to find the culprit. One of the battery packs of the RAID card had a short circuit. The short circuit triggered the protection switch of the PCI backplane, which turned off the diagnostic card, which turned off the power supply control module. After unplugging this battery pack, the server restarted and booted fine (using the other battery).

    Well, It seems that later today I will have a VPS to restore from backup...

  • manma Member

    So is it safe to say my data is lost forever and I'll need to rebuild from a week old backup? I just need to know whether to feel hopeful or devastated.

  • If you're able to rebuild your VPS with minimal disruption, we would greatly appreciate it if you were able to just continue with a fresh VM as we work to try to recover what we can for other users without any backups.

    Looks like they are trying to recover for people with no backups.

  • CVPS_Chris Member, Patron Provider

    They are human, holy shit!

    Anyway, hope it gets better for you guys, I can only imagine. What kind of PDU was it?

  • manma Member

    So, how would I go about finding out if there's a chance of my data getting recovered? Would it be wise to just leave my VPS alone for the day?

  • Francisco Top Host, Host Rep, Veteran

    @CVPS_Chris said: Anyway, hope it gets better for you guys, I can only imagine. What kind of PDU was it?

    It was supposed to be some decent 10-port Tripp Lite, but I get the feeling it was something cheapo =\

    We're still doing FSCKs where we can.

    The data is still around, just in really bad shape. Anywhere from 10GB to 30GB in lost+found.

    Since they were all 128MB VPSes, I'm hoping most people are just running VPNs and simple things they can easily rebuild. For anyone who needs data, just let us know what folders to check and we'll do our best as the nodes return.

    Francisco

  • Francisco Top Host, Host Rep, Veteran

    Mass remakes are done so I ask you to please log a ticket and let us know:

    • if you're needing us to hunt for data
    • if you do, what data to hunt for

    Francisco

  • Francisco Top Host, Host Rep, Veteran

    @manma said: So, how would I go about finding out if there's a chance of my data getting recovered? Would it be wise to just leave my VPS alone for the day?

    Assume we can't.

    If your VPS is just a config thing and you're out some time then I ask that you please rebuild and let us hunt for people that didn't keep backups of their life's work.

    Francisco

  • Francisco Top Host, Host Rep, Veteran

    @pcan said: Well, It seems that later today I will have a VPS to restore from backup...

    All but node58 have been provisioned to new gear, i'm just working on node58 right now.

    Francisco

  • Hardware failures can happen at any time, even to the best provider. Thumbs up to Francisco and team for the way they're dealing with this issue.

  • manma Member
    edited May 2012

    @Francisco said: Assume we can't.

    If your VPS is just a config thing and you're out some time then I ask that you please rebuild and let us hunt for people that didn't keep backups of their life's work.

    Francisco

    Sadly I didn't keep backups of my life's work. My fault, I know, but I wouldn't blame any of you if I couldn't get anything back. You're still my #1 VPS provider despite all of this. I've gone ahead and logged a ticket with the most important directory, and I was instructed to wait until bzImage made an announcement regarding the FSCKs finishing, so I'll just be patient until then.

    Thanks for being so transparent about all of this. It's amazing how an unmanaged VPS provider does more for its customers than a managed shared host would.

  • BlueVM Member

    Sorry to hear you're having a bad day/night. Let us know if there is anything we can do to help you out...

  • Francisco Top Host, Host Rep, Veteran

    @manma said: Sadly I didn't keep backups of my life's work. My fault, I know, but I wouldn't blame any of you if I couldn't get anything back. You're still my #1 VPS provider despite all of this.

    We'll do our best to hunt for things, just give a file list to Anthony.

    I was positive we mentioned BuyVM+ in one of our company emails a while back, and I wish we had even more people using it (around 1000 people do right now).

    As I mentioned, we'll be increasing space on the offering so people can store a lot more.

    Francisco

  • Mon5t3r Member

    Thankfully I didn't receive any email.. :D

  • Francisco Top Host, Host Rep, Veteran

    @BlueVM said: Sorry to hear you're having a bad day/night. Let us know if there is anything we can do to help you out...

    Anthony is pulling NFS mounts off everything but node27 right now to start salvaging what he can. Some of the boxes are booted on live CDs just because the HN got smacked around. I mean, we were missing half of our kernel on the box and it was on a different partition altogether o_O

    Francisco

  • Asim Member

    Speaking of backups, I just noticed that my backup VPS has also been down for hours

  • Francisco Top Host, Host Rep, Veteran

    @Asim said: Speaking of backups, I just noticed that my backup VPS has also been down for hours

    Yea, unrelated, i'm just seeing what's acting up on that one.

    Francisco

  • Asim Member

    @Francisco said: Yea, unrelated, i'm just seeing what's acting up on that one.

    Thanks, I have already created a ticket for the same.

  • miTgiB Member

    @Asim said: I just noticed that my backup VPS has also been down

    Stop selling off all your servers, how can you even keep tabs on that merry-go-round ;)
