Anyone had this issue with arp requests on CentOS?

jar · January 2013

This is an interesting issue, and I'm wondering if any other providers might have gone through a similar problem. This system is currently inaccessible, so I can't really take advice and run tests right now, but I'd like to see if sharing this generates some ideas.

So we get a new server set up, rented IP range and connected to our provider's network. The primary IP operates fine, the node is fully accessible from the internet. So I install OpenVZ, set all of the configuration correctly (knowing the configuration is correct, I still compared to a working system that is almost identical and connected to identical network equipment), create OpenVZ container, container has no internet. Also cannot ping container from the outside.

Now, I know exactly what you're thinking. I messed up a setting in sysctl.conf or vz.conf. Unfortunately, not so. IP forwarding is enabled, verified as enabled. Neighbor_devs is set to all. Tried setting neighbor_devs to detect. Tried enabling arp proxy.
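For anyone following along, the checks above could be scripted roughly like this. It is only a sketch under assumptions: eth0 is a placeholder interface, NEIGHBOUR_DEVS is the spelling used in a stock /etc/vz/vz.conf, and the PROC_ROOT and VZ_CONF overrides are hypothetical names added so the function can be pointed at copies of the files.

```shell
#!/bin/sh
# Print IP forwarding, per-interface proxy ARP, and the OpenVZ NEIGHBOUR_DEVS value.
# PROC_ROOT and VZ_CONF are illustrative overrides, not standard variables.
show_net_settings() {
    proc_root="${PROC_ROOT:-/proc/sys}"
    vz_conf="${VZ_CONF:-/etc/vz/vz.conf}"
    iface="${1:-eth0}"
    for key in net/ipv4/ip_forward "net/ipv4/conf/$iface/proxy_arp"; do
        # Show the key in sysctl dotted notation.
        name=$(echo "$key" | tr / .)
        if [ -r "$proc_root/$key" ]; then
            printf '%s = %s\n' "$name" "$(cat "$proc_root/$key")"
        else
            printf '%s = (unreadable)\n' "$name"
        fi
    done
    if [ -r "$vz_conf" ]; then
        grep '^NEIGHBOUR_DEVS=' "$vz_conf" || echo 'NEIGHBOUR_DEVS not set'
    fi
}
```

Calling show_net_settings eth0 on the node prints the three values, which makes it easy to diff a working server against a broken one.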

The only way I can get an IP beyond the primary one to work on this system is to manually initiate an arp request OR to have the provider remove the subnet from the vlan and then add it back. Of course, only a matter of time before the arp entry expires.
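One way to manually initiate the ARP announcement is a gratuitous ARP with arping from iputils. This is a sketch, not necessarily the exact command used here: eth0 and 192.0.2.10 are placeholders, and the RUN switch is a hypothetical convenience so the command is only printed unless you opt in.

```shell
#!/bin/sh
# Build an arping command that announces an IP with gratuitous ARP.
# -U sends unsolicited ARP to update neighbours' caches, -c limits the packet count.
announce_cmd() {
    iface="$1"
    ip="$2"
    count="${3:-3}"
    printf 'arping -U -I %s -c %s %s\n' "$iface" "$count" "$ip"
}

# Print the command by default; execute it only when RUN=1 is set (needs root).
announce() {
    cmd=$(announce_cmd "$@")
    if [ "${RUN:-0}" = "1" ]; then
        $cmd
    else
        echo "$cmd"
    fi
}
```

For example, RUN=1 announce eth0 192.0.2.10 would send three announcements for the alias IP. As noted above, this only helps until the upstream ARP entry expires.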

According to my provider, and from everything he has shown me (been quite open about it), there is 0 difference between two nearly identical servers and their setup. Yet, on server1, I can do "ifconfig eth0:0 whateveriphere up" and immediately ping it from the outside internet. On server2, if I do the same, I have to manually initiate an arp request or it will never be pingable.

Both systems running CentOS 6.3, issue persists with or without OpenVZ kernel.

So at the risk of sounding like I can't run my business, because I know that's how some people will take it here, I'm being humble enough to ask for ideas. My real hope is that I find someone else who has had the same problem.

Nick_A · January 2013

Yes, I've had this same problem (as we've discussed) :P

I can't recall if I fixed it on my own or if it ever went away with the provider you're using.

jar · January 2013

@Nick_A said: I can't recall if I fixed it on my own or if it ever went away with the provider you're using.

I'm told that ip stealing protection on SolusVM was related to your issue, not sure on that. I'm down to malfunctioning NIC, gateway not acting as it should, or bug in CentOS. But...I've never had to manually initiate an arp request on any server with any operating system prior to this. I've always added IPs as an alias and then used them (aside from OpenVZ usage obviously). I'd be thrilled to find out I'm a complete idiot so this could be over haha

eastonch · January 2013

Jarland,

Are you having this problem on a fresh install, or just after installing the kernel and/or SolusVM / other sugary treats...

I wouldn't perhaps say it was a faulty NIC if you're able to manually initiate the ARP request, however who knows? Are both nodes setup the exact same?

jar · January 2013

Fresh install of CentOS or Scientific 6.3, with or without OpenVZ installed by hand or with SolusVM (done for kicks, when options start running out). Every time we tried something different yesterday we did a fresh install, just to make sure it was a clean slate.

Network and mostly hardware are exactly the same. At least same model, not the exact same gateway just identical.

eastonch · January 2013

Looks like you've whittled through all the things I would have. Good luck fixing it, and keep us updated, for SEO keepsakes.

jar · January 2013

Thanks brother. Definitely let you guys know what happens so this can be documented for anyone else. I've run into pieces that look similar everywhere, a couple people that may or may not have the same issue, never quite exactly this and a solid solution that doesn't involve a returned arp error.

I'll say this, Gordon has gone above and beyond. Given the circumstances I don't blame him for thinking it's my config first and foremost. Don't blame him one bit. But he's messing with it now.

eastonch · January 2013

@Jarland

Did you mention you were getting another node setup? I'd be interested to see if the new node has the same problem.

Have you performed a yum upgrade?

Check the package versions on NODE 2 against NODE 1. It's a long shot, but it's a process of elimination.

jar · January 2013

Yeah this is our second Dallas server. The first one works great and works exactly as I would expect. Now, it was configured with SolusVM when we set it up, but I tried that on the new one too. I keep the first node up to date because I haven't seen a risky package update since first install, so both are running the same packages. I had to install the previous openvz kernel on the new one to completely match the first, then started mirroring all relevant configs aside from hardware addresses and IPs. Didn't seem to have any effect.

At least I'm not the only one who thinks it's strange. I'd love to hear "Hey stupid, do this, can't believe you're a provider!" But I'll settle for knowing it makes others scratch their heads.

qps · January 2013

Is it an Intel-based Ethernet card? I've heard of some of their cards having issues. Sometimes upgrading to the latest version of the drivers helps. I think there was a thread about it on WHT a while back.

jar · January 2013

Interesting. It is intel. Worth looking into, thanks

Jacob · January 2013

Yeah. I'm really not sure what to suggest on this. I've never encountered this before, but I've had some flaky .32 servers that required a lot of tweaking to perfect. (Perhaps try this with the .18 kernel?)

What @qps said is also a very good point, but Intel NICs are often quite reliable, and I've never had a NIC fail on me, or any faults at all. Touch wood.

On a couple of Supermicro boards, the inbuilt NICs have driver problems. But without a doubt Gordon would already know this. (Driver updates can fix this.)

@jarland said: I'd love to hear "Hey stupid, do this, can't believe you're a provider!" But I'll settle for knowing it makes others scratch their heads

miTgiB · January 2013

@Jacob said: On a couple supermicro boards, the inbuilt NICs have driver problems. But without a doubt gordon would already know this. (Driver updates can fix this).

While I doubt this is a related issue, E3 boards have horrible NIC issues with CentOS (or RHEL-based distros), and updating to the latest BIOS and using kmod-e1000e from elrepo will fix all but OpenVZ-based distros. The telltale sign is a NIC reset on the console, and the only way to resolve it is a reboot, but again, that doesn't seem related at all to what @jarland is describing.

Corey · January 2013

@jarland use

'ip route add' to add your other ranges; don't waste an IP in ifcfg-eth0:0. Make sure iptables is turned off.

eastonch · January 2013

I would most likely rofl if this issue is iptables related.

Corey · January 2013

@Corey said: @jarland use

'ip route add' to add your other ranges; don't waste an IP in ifcfg-eth0:0. Make sure iptables is turned off.

Then run service network restart, ping, and wait.

George_Fusioned · January 2013

Try manually installing the e1000 drivers like others have suggested. I use the following on my CentOS 6 servers:

wget -O e1000e-1.6.3.tar.gz "http://sourceforge.net/projects/e1000/files/e1000e%20stable/1.6.3/e1000e-1.6.3.tar.gz/download"
tar -zxf e1000e-1.6.3.tar.gz
cd e1000e-1.6.3/src
make CFLAGS_EXTRA="-DDISABLE_PCI_MSI -DE1000E_NO_NAPI" install

Then add the following to your kernel line in grub.conf

pcie_aspm=off e1000e.IntMode=1,1

jar · January 2013

@Corey said: don't waste an ip in ifcfg eth0:0

Main reason I started doing it that way was that it was a good way to show what I believed the problem to be while removing OpenVZ from the equation. Opens up my options for troubleshooting.

@eastonch said: I would most likely rofl if this issue is iptables related.

Oh man I wish it was

@George_Fusioned said: Try manually installing the e1000 drivers like others have suggested. I use the following on my CentOS 6 servers:

Thanks! Gave it a shot and no luck, but certainly didn't hurt anything.

This is why I love LET.

Handing it back to Gordon for him to toy with it some more. I'm about to lose my mind. This thing is such a beast too. Dual E5, 8x 500GB drives.

trewq · January 2013

I'm having this issue with just one IP in the middle of a range. It's so weird. I get my ips so cheap it's not worth fixing it.

jar · January 2013

@trewq said: I'm having this issue with just one IP in the middle of a range. It's so weird. I get my ips so cheap it's not worth fixing it.

That is quite strange. Maybe what we figure out helps you gain an IP

Corey · January 2013

@jarland said: That is quite strange. Maybe what we figure out helps you gain an IP

Sorry I couldn't be of help... those are the issues I usually run into with openvz.

concerto49 · January 2013

@trewq said: I'm having this issue with just one IP in the middle of a range. It's so weird. I get my ips so cheap it's not worth fixing it.

Same issue here strangely.

jar · January 2013

@Corey said: Sorry I couldn't be of help... those are the issues I usually run into with openvz.

Appreciate any attempts. It's really all any of us can do, throw random ideas at it. It seems I've hit a pretty legit issue.

jar · January 2013

Broadcast limit changed on gateway, problem appears solved. Watched Gordon repeat all of our steps on the KVM, and he was quite generous in dealing with the issue once it was caught.
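For anyone who lands here with similar symptoms: ARP who-has requests are broadcasts, so a broadcast limit upstream could plausibly stop them from reaching the server. The thread does not say which setting was changed, so treat the following as a rough diagnostic sketch only. It builds a tcpdump command to watch ARP traffic for one address; eth0 and 192.0.2.10 are placeholders.

```shell
#!/bin/sh
# Build a tcpdump command that shows only ARP traffic involving one IP.
# -n skips name resolution, -e prints the link-level (MAC) headers.
arp_watch_cmd() {
    iface="$1"
    ip="$2"
    printf 'tcpdump -n -e -i %s arp and host %s\n' "$iface" "$ip"
}

# Print the command; run it as root on the node while pinging the IP from outside.
arp_watch_cmd "${1:-eth0}" "${2:-192.0.2.10}"
```

If no who-has requests for the new IP ever show up while an outside host is pinging it, the requests are probably being dropped before they reach the NIC, which would point at the network rather than the server config.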

Corey · January 2013

@jarland said: Broadcast limit changed on gateway, problem appears solved. Watched Gordon repeat all of our steps on the KVM, and he was quite generous in dealing with the issue once it was caught.

Broadcast limit on gateway? So on the provider's switch there was a limit on your gateway? Or do you have your own layer 3 device?

jar · January 2013

@Corey said: Broadcast limit on gateway? So on the provider's switch there was a limit on your gateway? Or do you have your own layer 3 device?

All Gordon's equipment. Gateway, switch, words are getting lumped together in my head after all this.

support123 · January 2013

@jarland I am having the same problem with my server at Incero. Can you explain how you solved it?

AnthonySmith · January 2013

@ftpit

Did you read the thread? The resolution is in it... 2 posts up.

rsk · January 2013

@jarland said: So we get a new server set up, rented IP range and connected to our provider's network. The primary IP operates fine, the node is fully accessible from the internet. So I install OpenVZ, set all of the configuration correctly (knowing the configuration is correct, I still compared to a working system that is almost identical and connected to identical network equipment), create OpenVZ container, container has no internet. Also cannot ping container from the outside.

This is the problem we faced too

support123 · January 2013

@rsk said: This is the problem we faced too

Are you with Incero too?

jar · January 2013

@ftpit said: I am having the same problem with my server at Incero. Can you explain how you solved it?

It is entirely possible that there could be another problem, perhaps even software. However, if you have ruled this out, you can ask Gordon if he can see any reason why you would be having the same issue that Jarland & Ryan from Catalyst had. The situation should still be pretty fresh on his mind.
