Infected VMs - what can my provider do?
Hi all,
I have a VM with a provider I found through lowendbox. When I first got the box, it would lose networking for around a minute and then resume:
ping 172.245.108.236 | perl -nle 'print scalar(localtime), " ", $_'
Fri Mar 20 10:48:19 2015 64 bytes from 172.245.108.236: icmp_req=147 ttl=58 time=0.305 ms
Fri Mar 20 10:48:20 2015 64 bytes from 172.245.108.236: icmp_req=148 ttl=58 time=0.288 ms
Fri Mar 20 10:48:21 2015 64 bytes from 172.245.108.236: icmp_req=149 ttl=58 time=0.402 ms
Fri Mar 20 10:48:22 2015 64 bytes from 172.245.108.236: icmp_req=150 ttl=58 time=0.360 ms
Fri Mar 20 10:49:19 2015 64 bytes from 172.245.108.236: icmp_req=207 ttl=58 time=10.7 ms
Fri Mar 20 10:49:26 2015 64 bytes from 172.245.108.236: icmp_req=214 ttl=58 time=0.378 ms
Notice how after 10:48:22 there are no pings for around a minute? The pings would resume.
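For anyone who wants to pull these gaps out of a longer log automatically, here is a minimal Python sketch. It assumes every line starts with the timestamp that the perl one-liner above prepends, and that day and month names are in English. The names find_gaps and min_gap_seconds are made up for illustration.

```python
import re
from datetime import datetime

# Matches lines such as:
# Fri Mar 20 10:48:22 2015 64 bytes from 172.245.108.236: icmp_req=150 ttl=58 time=0.360 ms
LINE_RE = re.compile(
    r"^(\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}) .*icmp_(?:req|seq)=(\d+)"
)


def find_gaps(lines, min_gap_seconds=5):
    """Return (last_seen, resumed, seconds, lost_probes) for each silence.

    min_gap_seconds is an arbitrary threshold chosen for this sketch.
    """
    gaps = []
    prev = None  # (timestamp, sequence number) of the previous reply
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip headers, timeouts and anything unparseable
        # Collapse repeated spaces (localtime pads single-digit days).
        stamp_text = " ".join(m.group(1).split())
        stamp = datetime.strptime(stamp_text, "%a %b %d %H:%M:%S %Y")
        seq = int(m.group(2))
        if prev is not None:
            seconds = (stamp - prev[0]).total_seconds()
            if seconds >= min_gap_seconds:
                lost = seq - prev[1] - 1  # probes sent but never answered
                gaps.append((prev[0], stamp, int(seconds), lost))
        prev = (stamp, seq)
    return gaps
```

Passing it the lines of a saved log (for example `find_gaps(open("ping.log"))`, where ping.log is a hypothetical file name) gives one tuple per outage, which is easier to paste into a support ticket than raw ping output.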
The host responded with
I have suspended the infected VM. This should be fixed.
That worked. Then it happened again:
Mon Mar 30 05:14:43 2015 64 bytes from 172.245.108.236: icmp_req=39 ttl=50 time=48.0 ms
Mon Mar 30 05:15:42 2015 64 bytes from 172.245.108.236: icmp_req=98 ttl=50 time=47.9 ms
^ 1 minute wait between the last 2 pings
The response?
I have removed an infected VM.
So I asked
Also, it seemed to be the same problem as the previous one. Can you scan the node for VMs with the same infection?
The response?
I have to scan for them manually sadly, there is no way to make it automated.
Now it's back
Tue Apr 7 12:00:06 2015 64 bytes from 172.245.108.236: icmp_req=125 ttl=50 time=60.4 ms
Tue Apr 7 12:00:38 2015 64 bytes from 172.245.108.236: icmp_req=157 ttl=50 time=60.5 ms
^ 32 second wait between the last 2 pings.
So, is there anything you think this provider could do? Or should I just accept this will happen every 7-10 days and find a different host.
Thanks!
Comments
They should provide hardware and network support. Ask them to make sure it won't happen again, even if that means manually scanning every VPS. I mean, it's their job to take care of the node's health.
Let me guess, Phase7? I had nearly the same issue, but their support could not fix it.
If I were you, I would move after the third time the same problem occurs. It is just not worth the time.
A provider can't force somebody to secure their VPS, but the provider should be terminating that client if it happens multiple times. The provider should also have automation in place to suspend VPSs that are compromised and used for attacks or that impact other clients on the node. We don't have all of the details, though, so it's hard to tell what the provider can and cannot do, as well as what clients will allow the provider to do. Some providers have shipped VPSs with fail2ban installed and configured by default, which got them some bad threads on WHT for forcing security on clients. Other providers required clients to use an SSH port other than 22 for security, and clients, again, complained that it should be their choice whether they want a secure VPS or not, even though you can see how insecure VPSs impact other people.
Anything in the logs? I wonder if this is an IP conflict or something else.
As per the IP, this is a ColoCrossing reseller in Buffalo.
Definitely it's your provider's job to ensure the service quality. If they are not happy/able to do that, I would recommend finding a company that does.
I wonder if the infected VPS is sending an outbound flood. I see a lot of VPSs with insecure passwords running this stuff!
Random file name in /boot and process with the same name, does anyone else see this?
1009893 ukxnngtcpu ls
1009896 ukxnngtcpu grep "A"
1009899 ukxnngtcpu ifconfig eth0
1009901 ukxnngtcpu ifconfig
1009902 ukxnngtcpu netstat -an
Helped somebody out recently who had the same thing running on his box. It's installed by an SSH bruteforcer - one from South Korea in this case (guy tried to clear the logs and didn't really do it very successfully). The process pretends to be all kinds of system-y looking stuff, but pstree shows the real process name. Saw no strange network activity or otherwise on that system, though, it was just sitting there consuming CPU.
My recommendation for that one would be to just wipe the VM and start over.
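To illustrate the disguise trick: on Linux a process can overwrite its own command line, but the kernel keeps a separate short name in /proc/PID/comm, which, as far as I know, is the name pstree displays. A rough Python sketch that flags processes where the two disagree is below. The function names are made up, and it will produce false positives (shebang scripts and daemons that rename themselves, for example), so treat a hit as a reason to look closer and nothing more.

```python
import os


def looks_disguised(comm, cmdline):
    """True when argv[0] in cmdline does not match the kernel's comm name.

    comm is the text of /proc/PID/comm, cmdline the raw bytes of
    /proc/PID/cmdline (arguments separated by NUL bytes).
    """
    argv = [arg for arg in cmdline.split(b"\0") if arg]
    if not argv:
        return False  # kernel threads have an empty cmdline
    # A rewritten cmdline is often one string such as 'grep "A"', so also
    # cut at the first space before taking the basename.
    first = argv[0].split(b" ")[0]
    argv0 = os.path.basename(first).decode(errors="replace").lstrip("-")
    name = comm.strip()
    if len(name) >= 15:
        # comm is truncated to 15 characters, so compare prefixes only.
        return not argv0.startswith(name)
    return argv0 != name


def scan_proc(proc_root="/proc"):
    """Return (pid, comm, cmdline) for every process that looks disguised."""
    suspects = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read()
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                cmdline = f.read()
        except OSError:
            continue  # process exited or is not readable
        if looks_disguised(comm, cmdline):
            shown = cmdline.replace(b"\0", b" ").decode(errors="replace")
            suspects.append((int(entry), comm.strip(), shown.strip()))
    return suspects
```

With the listing quoted above, a process whose real name is ukxnngtcpu but whose command line claims to be netstat -an would be flagged. Malware running as root can hide from /proc entirely, so a clean result proves nothing.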
On OpenVZ I've had luck mounting the ploopy thing, then removing the cronjob, removing the init bits and the files in /boot, then starting the container up with no IPs added just to be safe...
The problem is that you can't know whether it's really gone. It's installed as root, so it could be hiding pretty much anywhere.
Correct, but not every VPS. The service is unmanaged, and if a customer doesn't know how to secure a VPS, I think they need to upgrade to a managed service. If you read the subject, it's a VPS that is infected, not the VPS node. See the difference?
I would expect the provider to suspend the infected VPS if it's causing issues for other users on the node?
That's exactly what I meant.
Try adding:
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 2048
to /etc/sysctl.conf and then run 'sysctl -e -p'.
It would be better to null route the IP instead of suspending the VM, so that the OP can fix the security issue.
Hi,
Thanks for the responses. Just to clarify - this isn't my box that's infected. It's somewhere else on the node. Every time this happens, the provider suspends someone else's VM and it's fixed for a few days.
Still no response from them on this, but I've linked through this thread in the support ticket so hopefully it'll give them some ideas of how they can clean up the node.
Cheers
I received a response from the host!
That's it. No further detail.
This has left me slightly perplexed. The VM was doing this the very first day I got it and the very same issue has twice been fixed by support by, in their words, removing infected VMs. Now, suddenly, it's isolated to my container.
By the time the support ticket was replied to, the network hangs had stopped (they do stop and start for periods) so I suppose when he checked there was nothing to see.
So annoying. I hate getting into discussions with support. Would be better to move on I think.
@d60eba The only thing a provider can and should do with a noticed infected VM is to turn it off right away and promptly warn the user regarding the issue.
The rest, such as, but not limited to: system installation, integrity management, backup, restoration - is all up to the user.
In a situation of integrity compromise, the system must be rebuilt from scratch. When it comes to malware, there is no quick "scan and fix" way of maintaining integrity.
Nothing beats a well defined system, period.
@Janevski I agree that a general malware scan is tough, but as the provider had already identified 2 infected VMs, I thought perhaps a quick scan could be made for whatever it was that infected the first 2. I don't know anything about OpenVZ and how it's set up, though, so it's interesting to hear you say that.
Anyway, the network hangs are back and worse than ever:
I checked the next IP along and the same thing:
even though I shut my VPS down. So I don't think this can be an issue with just my container.
So, time for a new host I think.
I don't know about provider scanning all VMs, but they should certainly set up a script or other system so that the moment they detect packet flood from some VM, they shut down the VM.
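As a rough idea of what such a script could look like, here is a Python sketch that compares two snapshots of /proc/net/dev on the host and reports interfaces whose packet rate exceeds a limit. It assumes each VM has its own interface visible on the host (a veth or tap device, which is not always the case on OpenVZ), and the 50000 packets per second limit and the function names are invented for illustration. What to do with a flagged interface (suspend, null route, alert) is left out on purpose.

```python
def parse_net_dev(text):
    """Map interface name -> total packets (rx + tx) from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        name, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) >= 10:
            # Field 1 is received packets, field 9 is transmitted packets.
            counters[name.strip()] = int(fields[1]) + int(fields[9])
    return counters


def flooding(before, after, interval_seconds, max_pps=50000):
    """Return {interface: packets_per_second} for interfaces over max_pps.

    before and after are dicts from parse_net_dev taken interval_seconds
    apart. max_pps is a made-up default; a real limit depends on the node.
    """
    offenders = {}
    for name, count in after.items():
        if name not in before:
            continue  # interface appeared between the two snapshots
        pps = (count - before[name]) / interval_seconds
        if pps > max_pps:
            offenders[name] = pps
    return offenders
```

A cron job or small loop could read /proc/net/dev, sleep ten seconds, read it again and pass both results to flooding(). A dedicated tool will do a far better job than this, but even something this crude would catch a VM saturating the node.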
http://vpsantiabuse.com is nothing short of amazing for OpenVZ.
@linuxthefish vpsantiabuse looks amazing. I will recommend it to my (now ex) host
Agreed, on all points
This provider doesn't seem to be proactive (screen new clients, monitor for abuse) only reactive. Don't waste your time doing their work. Move on. There's lots of genuinely good low end hosts!