XenPower Uptime/Redundancy
For the last day or so I've had a major issue with our XenPower Dallas server. Salvatore replied to my ticket to say that the issue was regarding one of the upstreams which wasn't announcing their prefix correctly so some traffic was getting dropped.
I thanked Salvatore for fixing the problem but I also mentioned that I was quite concerned that it took over 24 hours to fix the problem and asked why it took so long to fix.
This is the reply I received from another member of support regarding my concerns:
If you are a business where uptime is very important, I suggest you switch to a redundant setup like www.iwstack.com. Our budget brands such as XenPower and OVerZold do not benefit from the same level of uptime and network redundancy.
XenPower is thought to be a budget brand offering a lot of resources at a low price, however it does not compare well with a business grade cloud with a lot of extra features and HA/fail-over. For a business, the small price difference of a couple of dollars should be totally irrelevant.
I moved over to Prometeus/Incero because they both have top-notch reputations, so this response really did surprise me. Uptime is important to everybody, not just businesses. Never has it been mentioned that XenPower is a budget brand. In fact, the complete opposite has been stated numerous times on LET.
To receive a reply worded in that way after experiencing over 24 hours of downtime leaves a very bad taste in the mouth. Is it a reply which should be expected and accepted?
Comments
All brands other than Prometeus/iwStack are low-end brands from Mr. Salvatore. In fact, I'm pretty sure Mr. S wants iwStack.com to be the brand with the highest quality now, since they can offer more features there than on the Prometeus.com brand.
Nothing unusual for a business trying to upsell their clients.
So because XenPower is "low end" I should just accept over 24 hours down time and not ask any questions?
I would class LES at $5/y budget. Not a $90/y XEN. In your opinions what should I be spending to avoid situations such as this then?
Is there an uptime SLA? Has it been met? If not, ask them.
That is budget. Even the most basic plan Linode offers is $20/month, so more than 2x $90/year.
There will be issues when you do not control all aspects of the DC. In Italy, everything is under our control, from the DC to the multiple peering and carriers; in the US, however, we depend on Incero staff. This is one of the reasons we are still testing there. Hopefully things will improve or we will be able to find the perfect mix, but that is not likely to be up to par with the DC in Italy. US people have another approach towards business: a high cost of litigation means they might provide what they promised, but if they don't, you can't do much about it without huge legal costs, especially for non-locals. Therefore, finding a perfect DC there is going to take some time.
That being said, Incero is not bad for a US DC; there are very few incidents like these. Too bad they were quick to blame their upstream and to suggest that our customers no longer use Cogent when the issue was on their side. Me being out for a couple of days and Salvatore being sick didn't help either.
So the problem is at XenPower Dallas? Sometimes small details explain a lot. It should be noted in the first post.
As for simple uptime, see uptime.erawan.me. Most of my VPSes are from Prometeus because of its stability, and based on my experience it's hard to trust another provider on uptime and other factors.
As for now, I hope uncle gets well soon
It was a problem with a small subnet of ours not being announced properly with some routes over a couple of carriers being broken/filtered. Nobody is perfect and will never be, the more people along the chain, the higher the chance something will go wrong and will take longer to fix, and, as you know, if something can go wrong, it will. The difference is made on the level of cooperation you get to fix it.
Nobody said you should not ask any questions, but let's be fair:
Prometeus doesn't seem to advertise an SLA. Salvatore has stated on a couple of forums that their uptime is near 100%. I guess 99.7% is still "near". My fault for not checking this beforehand.
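For what it's worth, the 99.7% figure can be checked with simple arithmetic. This is only a sketch: it assumes a single 24-hour outage in a 365-day year with no other downtime, and the function names are made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def uptime_percent(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Return uptime as a percentage of the period."""
    return 100.0 * (1 - downtime_hours / period_hours)


def allowed_downtime_hours(target_percent, period_hours=HOURS_PER_YEAR):
    """Return the hours of downtime permitted by an uptime target."""
    return period_hours * (1 - target_percent / 100.0)


print(round(uptime_percent(24), 2))            # 99.73
print(round(allowed_downtime_hours(99.9), 2))  # 8.76
```

So one 24-hour outage alone puts a year at roughly 99.7%, while a 99.9% target would allow under nine hours of downtime per year.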
This isn't Linode but if you are saying that's what I need to be paying to avoid similar situations then fine. I didn't realise $90/y was looked on as being such a poor product, even by their own staff.
Maounique thank you for explaining what went wrong. It would have been good to have that sort of reply in my ticket rather than the upsell I received instead.
My replies to your edit: That was the point of this post - I asked questions/raised concerns and received no answers. 1) From my testing around 15% of locations could see the server. That's down in my book. 2) The ticket has zero explanations, that's what I was asking for. 3) So budget means over 24 hours down time is Ok? 4) Shouldn't be necessary but maybe I will have to consider it.
my bad, sorry. I do need some rest
I count this as explanation:
"I'm sorry it took so long. One of the upstream wasn't announcing our prefix correctly so some traffic were dropped."
You replied saying how bad we are and how you thought we were better. I agree it is not a good impression for a first-time customer, I agree it should have been solved faster, and I agree it is a failure. But if you have been online long enough, you know things on the internet do not stay up all the time, including Google or Amazon; you know you need to set up redundancy if you need 100% uptime; and you know that, even so, if a major carrier fails it will not be up 100%, just that fewer people will see it down. We have a redundant network and redundant storage, but that is as far as it goes; you do need to set up redundant locations too. With IWStack you can do that from the same interface: you can scale it up or down as needed, and can set up internal load balancing and firewall/NAT/IPSec, so if you have to switch something you just clone the VM, redirect ports and do the work with 0 downtime, etc.
Not by far. I agreed and still agree this was a serious screw-up; my point is that you will need at least 2 locations to make sure things stay up close to 100%. Hardware failure can also strike at any time, and for this you need a redundant setup such as iwstack, but even that will not overcome the issue of possible network failure, whether in our network, at the gates or in the internet carriers, unless you set up 2 locations there too. You say uptime is very important to you, which is why I thought you were a company. No matter what the reason you need high uptime for, you need to get a redundant setup.
We offer free backup space, free anycast DNS failover, exactly because nothing is perfect, everything can fail, given enough servers, enough switches, enough routes, enough people, enough time, it will never be 100%, but it can be close if we take precautions and consider our options well, even at a budget price.
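To put rough numbers on the two-location point, here is a minimal sketch. It assumes each location fails independently (a shared carrier or DNS problem would break that assumption) and that failover is instant. The 0.997 input is only the "near 100%" figure quoted earlier in the thread, not a measured value.

```python
def combined_availability(availabilities):
    """Probability that at least one location is up, assuming independent failures."""
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)  # chance that this location is down as well
    return 1.0 - all_down


single = 0.997  # assumed availability of one location
print(f"{combined_availability([single]) * 100:.4f}%")          # 99.7000%
print(f"{combined_availability([single, single]) * 100:.4f}%")  # 99.9991%
```

Under those assumptions a second location moves the estimate from about 26 hours of downtime a year to a few minutes, which is why it never reaches 100% but can get close.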
1) Ok
2) You count that as an explanation? One short sentence. No details, no reasons for long delay to fix, why did it happen in the first place, what's been done to stop similar situations etc. I did not reply saying how bad you are at all. This was my reply:
Which was polite and completely reasonable. To which I received the reply in the first post with zero answers to my questions. I have since received another reply from the same member of staff with more facetious comments regarding my previous hosting experiences. I did not want this to play out the way that it is. It was not my intention. But I will defend my corner when necessary.
3) Uptime is important to everyone, no matter how small and insignificant.
4) I back up daily, and using Anycast DNS should be total overkill for what I'm doing. In your previous post you mentioned that you were not available and Salvatore was sick. So I should build redundancy into my setup to cater for your staffing issues, as the problem would have been sorted much more quickly had you been available?
Then you do something like this: http://www.tuxz.net/blog/High_Availability_Automated_origin_failover_using_CloudFlare_Nagios_and_OpenShift/
Once again, we are both sorry for this situation; however, we can count the failures we have had in 2 years+ on the fingers of one hand. This excludes planned downtime for upgrades and such, short DDoSes lasting a few minutes, and OVZ reboots due to kernel issues, soft lockups and race conditions. Demanding reassurance that this will not happen again is not reasonable, I am sorry to say, but I did point out how to increase your uptime in the future. You took it wrong, hence this whole issue.
Either way, to show how sorry we are, I am offering you a full refund for this issue, even though we do not offer an SLA. What do you think?
Money, Availability, Latency. Pick two.
I'm going to leave it there as we're going round in circles.
@serverian thanks for the suggestion. Looks like it might be a bit tricky to use that setup with WordPress, but I need to sort something out.
This looks easier: http://blog.booru.org/?p=12
Go tell that at WHT; you would be ridiculed, because by general industry standards that really is ultra-low budget pricing for the specs & quality you get.
Here the problem was out of the provider's hands & they could do nothing but wait for their DC to fix it, yet they tried to explain it to you.
Problems can happen to any VPS provider, even the best in the industry. Try to understand & move along or, if uptime is so important to you, host your stuff in multiple locations with the same or multiple providers, so that if things go wrong at one place the other one is ready to take its place.
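The "other one is ready to take its place" idea can be sketched as a simple health check that picks the first location that answers. This is only an illustration using the Python standard library: the hostnames are placeholders, the function names are made up, and actually switching traffic (a DNS update or a load balancer change) is left out.

```python
import urllib.request


def is_healthy(url, timeout=5):
    """Return True if the URL answers with an HTTP 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:  # covers URLError, timeouts, and connection errors
        return False


def pick_origin(origins, check=is_healthy):
    """Return the first origin that passes the health check, or None if all fail."""
    for origin in origins:
        if check(origin):
            return origin
    return None


# Demo with a stubbed check so it runs without network access.
# Hypothetical hostnames; the primary is treated as down here.
down = {"https://primary.example.com/"}
origins = ["https://primary.example.com/", "https://standby.example.com/"]
print(pick_origin(origins, check=lambda u: u not in down))
# prints https://standby.example.com/
```

Run from a third location on a schedule (cron, for example), a check like this is the core of what the failover setups linked above automate.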