
Spectre and Meltdown - The what is my provider going to do about it? thread!

Comments

  • @perennate said:
    Ah, this is with RHEL? The RHEL patch seemed more comprehensive than the Debian one actually, since Debian hasn't updated its kernel package yet AFAIK. So that is strange...

    Yeah, the example above is from RHEL (tested on actual Red Hat, CentOS and CloudLinux).

    I've asked a bunch of other sysadmins working with RHEL-based systems, who see the same behaviour - so either we're all updating things incorrectly, or there's yet to be a new microcode release.

    Online.net, which keeps its list up to date, also has most of them marked "Pending" because they're waiting.

    @ramnet said:
    Red Hat released microcode updates also.

    Sure - but those microcode updates aren't really fixing Spectre; they're not enabling IBPB and IBRS, which is required.

    @ramnet said:
    Linux has long had the ability to patch the microcode during OS bootup, unlike certain other OSes which require BIOS updates to do that.

    Correct - but if the microcode is not there, you'll still have to reboot to get it applied once it's available.
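
    For anyone wanting to check where their RHEL-family box actually stands, here's a rough sketch. The debugfs flags below are what the Red Hat kernels of that era exposed; paths and output vary by distro and kernel, so treat this as an assumption, not gospel:

        # Kernel page-table isolation (Meltdown) and Spectre v2 mitigations -
        # 1 means enabled, 0 means the kernel/microcode combo doesn't support it yet
        cat /sys/kernel/debug/x86/pti_enabled
        cat /sys/kernel/debug/x86/ibrs_enabled
        cat /sys/kernel/debug/x86/ibpb_enabled

        # Which microcode revision is actually loaded right now
        grep -m1 microcode /proc/cpuinfo
        dmesg | grep -i microcode

    If ibrs_enabled/ibpb_enabled stay at 0 after updating the kernel plus microcode_ctl and rebooting, that usually means exactly what's described above: no new-enough microcode for your CPU yet.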

  • Neoon Community Contributor, Veteran
    edited January 2018

    Even when they reboot, it takes anywhere from 5 minutes to 1 hour+ - what the fuck do they need that long for?
    GestionDBI, Virtmach & BandwagonHost needed a long time to bring the servers back up.

    I just stay with dedis, too much hassle.

  • Zerpy said: I've asked a bunch of other sysadmins working with RHEL-based systems, who see the same behaviour - so either we're all updating things incorrectly, or there's yet to be a new microcode release.

    Intel's press release says most of the microcode updates are coming next week, AFAIK.

  • @Neoon said:
    Even when they reboot, it takes anywhere from 5 minutes to 1 hour+ - what the fuck do they need that long for?
    GestionDBI has been down for 80 minutes and counting.

    I just stay with dedis, too much hassle.

    Maybe people run fsck as well :-D

    @eva2000 said:

    Zerpy said: I've asked a bunch of other sysadmins working with RHEL-based systems, who see the same behaviour - so either we're all updating things incorrectly, or there's yet to be a new microcode release.

    Intel's press release says most of the microcode updates are coming next week, AFAIK.

    Yeap, that's what I'm counting on as well - so we all just have to sit tight :-D

    But the fact that people believe they're all 100% safe now is fake news <3
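
    If you'd rather verify than believe, the usual sanity check at the time was the speed47 spectre-meltdown-checker script (github.com/speed47/spectre-meltdown-checker) - a sketch only, and only if you're comfortable running a third-party script as root:

        # Fetch the community checker and read it before running anything as root
        git clone https://github.com/speed47/spectre-meltdown-checker.git
        cd spectre-meltdown-checker
        sudo sh ./spectre-meltdown-checker.sh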

  • gestiondbi Member, Patron Provider

    @Neoon said:
    Even when they reboot, it takes anywhere from 5 minutes to 1 hour+ - what the fuck do they need that long for?
    GestionDBI has been down for 80 minutes and counting.

    I just stay with dedis, too much hassle.

    D*uq. No node has been down for that long. I know you don't like us, but there's no need to bash us and post bulls**t on forums...

  • Neoon Community Contributor, Veteran

    @davidgestiondbi said:

    @Neoon said:
    Even when they reboot, it takes anywhere from 5 minutes to 1 hour+ - what the fuck do they need that long for?
    GestionDBI has been down for 80 minutes and counting.

    I just stay with dedis, too much hassle.

    D*uq. No node has been down for that long. I know you don't like us, but there's no need to bash us and post bulls**t on forums...

    "Server Unity GestionDBI England went offline. Detected: 06.01.2018 19:50:10"

    Just checked whether my monitoring screwed up; it did not:

  • gestiondbi Member, Patron Provider

    @Neoon said:

    @davidgestiondbi said:

    @Neoon said:
    Even when they reboot, it takes anywhere from 5 minutes to 1 hour+ - what the fuck do they need that long for?
    GestionDBI has been down for 80 minutes and counting.

    I just stay with dedis, too much hassle.

    D*uq. No node has been down for that long. I know you don't like us, but there's no need to bash us and post bulls**t on forums...

    "Server Unity GestionDBI England went offline. Detected: 06.01.2018 19:50:10"

    Just checked whether my monitoring screwed up; it did not:

    • Did you open a ticket? No.
    • Did you try to log in to the SolusVM portal? Probably not.
    • Why do you say we are down, when it's your VPS that is down?

    All nodes are up and running, except LAX-03, which has been rebooting for the last 5 minutes.

  • Neoon Community Contributor, Veteran

    @davidgestiondbi said:

    • Did you open a ticket? No.
    • Did you try to log in to the SolusVM portal? Probably not.
    • Why do you say we are down, when it's your VPS that is down?

    All nodes are up and running, except LAX-03, which has been rebooting for the last 5 minutes.

    So you go reboot the nodes and check that they're back up, but you don't care whether the customer VMs are back up? Well, OK then.

    I had 10 restarts today. The question was asked in general; listing just one provider was maybe a bit unfair, so I have updated it.

    At least half of them went down for about an hour. Should I open a ticket for each of those? No. I expect a provider to bring the VMs back up so that I don't have to log in to each panel and reboot them by hand.

    Everyone got it working except GestionDBI.

    Thanked by Zerpy
  • WSS Member

    Oh, look, it's time for some @Neoon rage. Which project are you going to abandon now?

  • Neoon Community Contributor, Veteran

    @WSS said:
    Oh, look, it's time for some @Neoon rage. Which project are you going to abandon now?

  • Clouvider Member, Patron Provider

    @Neoon said:
    Even when they reboot, it takes anywhere from 5 minutes to 1 hour+ - what the fuck do they need that long for?
    GestionDBI, Virtmach & BandwagonHost needed a long time to bring the servers back up.

    I just stay with dedis, too much hassle.

    I guess to safely switch off all VMs?

  • Neoon Community Contributor, Veteran

    @Clouvider said:

    @Neoon said:
    Even when they reboot, it takes anywhere from 5 minutes to 1 hour+ - what the fuck do they need that long for?
    GestionDBI, Virtmach & BandwagonHost needed a long time to bring the servers back up.

    I just stay with dedis, too much hassle.

    I guess to safely switch off all VMs?

    How many containers do you need on a single node for it to take 60+ minutes to reboot?

  • WSS Member

    @Clouvider said:
    I guess to safely switch off all VMs?

    They're OVZ tho. If they're simfs, just reboot now and it'll work itself out if you've got a journaling filesystem.

  • Clouvider Member, Patron Provider

    Neoon said: How many containers do you need on a single node for it to take 60+ minutes to reboot?

    1 Windows KVM that refuses to budge. And then you have a choice. Downtime or potential loss of data?

  • Neoon Community Contributor, Veteran

    @Clouvider said:

    Neoon said: How many containers do you need on a single node for it to take 60+ minutes to reboot?

    1 Windows KVM that refuses to budge. And then you have a choice. Downtime or potential loss of data?

    Well, I said containers - most of them were OVZ boxes.

  • WSS Member

    @Clouvider said:

    Neoon said: How many containers do you need on a single node for it to take 60+ minutes to reboot?

    1 Windows KVM that refuses to budge. And then you have a choice. Downtime or potential loss of data?

    Take a snapshot, shutdown, restart using snapshot?

  • Clouvider Member, Patron Provider

    @WSS said:

    @Clouvider said:

    Neoon said: How many containers do you need on a single node for it to take 60+ minutes to reboot?

    1 Windows KVM that refuses to budge. And then you have a choice. Downtime or potential loss of data?

    Take a snapshot, shutdown, restart using snapshot?

    Always an option. Depending on how many of those stubborn VMs you have at scale, you may still hit the 60 minutes mentioned.
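
    For reference, the snapshot route @WSS describes looks roughly like this on a plain libvirt/KVM host - a sketch only, with a hypothetical qcow2-backed guest named win-guest; panels like SolusVM wrap this differently:

        # Internal snapshot of disk + RAM state while the guest is still running
        virsh snapshot-create-as win-guest pre-reboot

        # Force the stubborn guest off so the node can reboot
        virsh destroy win-guest

        # ...node reboots with the new kernel/microcode...

        # Resume the guest from the saved state
        virsh snapshot-revert win-guest pre-reboot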

  • gestiondbi Member, Patron Provider

    Funny fact, longest reboot time was MTL-02, with ~35min.

    Thanked by Clouvider, WSS
  • WSS Member

    @davidgestiondbi Well, this is an outrage - I could (probably) be done pooping by then!

    Thanked by gestiondbi
  • uh... ... ...... .. sudo (Holy shit that's half of it) apt-get (common sense part) update (omg that simple?) (hit enter) (only if I bought softlayer or theplanet!!)

  • Neoon Community Contributor, Veteran

    @davidgestiondbi said:
    Funny fact, longest reboot time was MTL-02, with ~35min.

    Funny fact: the Smokeping just sends emails to one address, yet I still got my VPS suspended for 20 monitored servers.

  • perennate Member, Host Rep

    cheapwebdev said: uh... ... ...... .. sudo (Holy shit that's half of it) apt-get (common sense part) update (omg that simple?) (hit enter) (only if I bought softlayer or theplanet!!)

    Well, last I checked, neither Debian nor Ubuntu has kernel updates yet; other updates (e.g. qemu) are likely missing too. Also, apt-get update only fetches the package index, heh.
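
    In other words, once the packages actually land, the minimum on Debian/Ubuntu would look something like this - a sketch; package names like intel-microcode are an assumption about what your distro ships, and none of it helps until the fixed kernel is actually released:

        # Refresh the package index, then actually install the updates
        sudo apt-get update
        sudo apt-get dist-upgrade

        # Early microcode loading on Debian/Ubuntu comes from this package
        sudo apt-get install intel-microcode

        # The new kernel only takes effect after a reboot
        sudo reboot

        # After the reboot, confirm you're on the patched kernel
        uname -r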

  • Clouvider Member, Patron Provider

    @Neoon said:

    @davidgestiondbi said:
    Funny fact, longest reboot time was MTL-02, with ~35min.

    Funny fact: the Smokeping just sends emails to one address, yet I still got my VPS suspended for 20 monitored servers.

    You should really work on this rage mate. It’s not healthy...

  • Neoon Community Contributor, Veteran

    @Clouvider said:

    @Neoon said:

    @davidgestiondbi said:
    Funny fact, longest reboot time was MTL-02, with ~35min.

    Funny fact: the Smokeping just sends emails to one address, yet I still got my VPS suspended for 20 monitored servers.

    You should really work on this rage mate. It’s not healthy...

    I am calm, it's fine - I just like to bash something sometimes, the way you like to bash OVH.

    Thanked by Clouvider, Zerpy
  • Maounique Host Rep, Veteran

    mfs said: Prometeus have issued alerts and/or reboots (Prometeus announced they may retire the XenPower product completely)

    We had been planning for a long time to retire XenPower in its PV form and replace it with HVM; we thought this bug was a good opportunity, but it seems we may not be ready in time.

    We are now rebooting the OVZ nodes one by one, so it will take 24 hours, possibly more. Expected downtime for every node is below 30 minutes if everything goes well.
    Some will take as little as 10 minutes (the E3 ones), the largest around 20, up to 30. Containers may take longer to come up in some cases; if you have been down for more than 30 minutes, please check the announcement, and if there's nothing about your node, please open a ticket.
    We do not expect problems, but this is not an exact science.
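
    The "one by one" part is basically just a serial loop with a wait-until-back check in between - a trivial sketch with made-up hostnames, nothing Prometeus-specific:

        # Roll through nodes sequentially; never reboot the next one
        # until the previous one answers SSH again
        for node in ovz01 ovz02 ovz03; do
            ssh "root@$node" reboot
            sleep 120
            until ssh -o ConnectTimeout=5 "root@$node" true; do
                sleep 30
            done
        done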

    Thanked by vimalware, sureiam
  • Mr_Tom Member, Host Rep

    One of my providers did live migrations of VMs and updated the hosts.

    E.g. move running VMs to a new host live, patch/reboot the other host, and cycle machines back while patching the previous host.

    I've not heard from DO/Vultr/Hosthatch/ZX about it to be honest - but at the same time I've not logged in to check either lol.
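
    At the hypervisor level, that live-migration dance is roughly a one-liner per guest - a sketch with hypothetical names, assuming plain libvirt with shared storage; the provider in question may well be using a panel that does this for them:

        # Move a running guest to the already-patched host, then the old host is free to reboot
        virsh migrate --live guest01 qemu+ssh://patched-host/system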

  • Maounique Host Rep, Veteran

    Mr_Tom said: One of my providers did live migrations of VMs and updated the hosts.

    This can work and is a good opportunity to see how well this works in the event of a real node failure.
    IWStack with SAN storage supports this; the nodes with local SSD storage do not, though. Since the first line of (lack of) defense is OVZ, we will do those first.

  • Mr_Tom Member, Host Rep

    @Maounique said:

    Mr_Tom said: One of my providers did live migrations of VMs and updated the hosts.

    This can work and is a good opportunity to see how well this works in the event of a real node failure.
    IWStack with SAN storage supports this; the nodes with local SSD storage do not, though. Since the first line of (lack of) defense is OVZ, we will do those first.

    Apparently one set of migrations went wrong and the VMs had to be rebooted. None of my services were affected, other than a slight loss of network to one VM for about 3 minutes.

  • Maounique Host Rep, Veteran
    edited January 2018

    Mr_Tom said: Apparently one set of migrations went wrong and the VMs had to be rebooted.

    We have an iwStack node down atm, but it's not one with SAN storage; it's one of the SSD nodes. It is also unrelated: 2 of its disks died and we are trying to recover the data now.
    You know that guy Murphy - I expect a lot of unrelated failures right when the workload from planned work is at its highest.

    Thanked by vimalware
  • Ramnode had a big reboot yesterday.
