Anyone had this issue with arp requests on CentOS?

jar Patron Provider, Top Host, Veteran
edited January 2013 in Help

This is an interesting issue, and I'm wondering if any other providers might have gone through a similar problem. This system is currently inaccessible, so I can't really take advice and run tests right now, but I'd like to see if sharing this generates some ideas.

So we get a new server set up, rented IP range and connected to our provider's network. The primary IP operates fine, the node is fully accessible from the internet. So I install OpenVZ, set all of the configuration correctly (knowing the configuration is correct, I still compared to a working system that is almost identical and connected to identical network equipment), create OpenVZ container, container has no internet. Also cannot ping container from the outside.

Now, I know exactly what you're thinking. I messed up a setting in sysctl.conf or vz.conf. Unfortunately, not so. IP forwarding is enabled, verified as enabled. Neighbor_devs is set to all. Tried setting neighbor_devs to detect. Tried enabling arp proxy.
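For anyone wanting to repeat these checks, here is a minimal sketch of verifying the kernel flags mentioned above. The function name `check_flag` and the `PROC_SYS` override are made up for illustration; `net.ipv4.ip_forward` and `net.ipv4.conf.eth0.proxy_arp` are the standard Linux sysctl keys, and `eth0` is assumed to be the public interface. NEIGHBOUR_DEVS lives in /etc/vz/vz.conf rather than in sysctl, so it is not covered here.

```shell
# Hypothetical helper: compare a sysctl key against an expected value.
# PROC_SYS defaults to /proc/sys and can be pointed elsewhere for a dry run.
PROC_SYS="${PROC_SYS:-/proc/sys}"

check_flag() {
    key="$1"
    want="$2"
    # sysctl keys map to paths under /proc/sys with dots replaced by slashes
    path="$PROC_SYS/$(printf '%s' "$key" | tr . /)"
    if [ ! -r "$path" ]; then
        echo "$key: unreadable"
        return 2
    fi
    got="$(cat "$path")"
    if [ "$got" = "$want" ]; then
        echo "$key: ok ($got)"
        return 0
    fi
    echo "$key: expected $want, found $got"
    return 1
}

# Example calls (eth0 is an assumed interface name):
#   check_flag net.ipv4.ip_forward 1
#   check_flag net.ipv4.conf.eth0.proxy_arp 1
```

Running the same two calls on a working node and on the broken one gives a quick side-by-side comparison.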

The only way I can get an IP beyond the primary one to work on this system is to manually initiate an arp request OR to have the provider remove the subnet from the vlan and then add it back. Of course, only a matter of time before the arp entry expires.

According to my provider, and from everything he has shown me (been quite open about it), there is 0 difference between two nearly identical servers and their setup. Yet, on server1, I can do "ifconfig eth0:0 whateveriphere up" and immediately ping it from the outside internet. On server2, if I do the same, I have to manually initiate an arp request or it will never be pingable.
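The workaround described above (bring up the alias, then announce it by hand) might look roughly like this. It is a sketch, not the exact commands used at the time: one common way to "manually initiate" ARP is a gratuitous announcement, which is what is assumed here. `announce_alias` and the `DRY_RUN` switch are invented for illustration, `192.0.2.10` is a documentation address, and the `arping` flags are those of the iputils version shipped with CentOS (`-U` sends unsolicited ARP, `-c` sets the count, `-I` picks the interface).

```shell
# With DRY_RUN set, commands are printed instead of executed.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: add an alias address and announce it with gratuitous ARP.
announce_alias() {
    dev="$1"    # physical interface, e.g. eth0
    ip="$2"     # additional address to bring up
    run ifconfig "${dev}:0" "$ip" up &&
    run arping -U -c 3 -I "$dev" "$ip"
}

# Example: DRY_RUN=1 announce_alias eth0 192.0.2.10
```

With `DRY_RUN` set, the helper only prints the two commands, so it is safe to try on a production node before running it for real as root.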

Both systems running CentOS 6.3, issue persists with or without OpenVZ kernel.

So at the risk of sounding like I can't run my business, because I know that's how some people will take it here, I'm being humble enough to ask for ideas. My real hope is that I find someone else who has had the same problem.

Comments

  • Nick_A Member, Top Host, Host Rep

    Yes, I've had this same problem (as we've discussed) :P

    I can't recall if I fixed it on my own or if it ever went away with the provider you're using.

  • jar Patron Provider, Top Host, Veteran
    edited January 2013

    @Nick_A said: I can't recall if I fixed it on my own or if it ever went away with the provider you're using.

    I'm told that ip stealing protection on SolusVM was related to your issue, not sure on that. I'm down to malfunctioning NIC, gateway not acting as it should, or bug in CentOS. But...I've never had to manually initiate an arp request on any server with any operating system prior to this. I've always added IPs as an alias and then used them (aside from OpenVZ usage obviously). I'd be thrilled to find out I'm a complete idiot so this could be over haha

  • Jarland,

    Are you having this problem on a fresh install, or just after installing the kernel and/or SolusVM / other sugary treats...

    I wouldn't perhaps say it was a faulty NIC if you're able to manually initiate the ARP request, however who knows? Are both nodes setup the exact same?

  • jar Patron Provider, Top Host, Veteran
    edited January 2013

    Fresh install of CentOS or Scientific 6.3, with or without OpenVZ installed by hand or with SolusVM (done for kicks, when options start running out). Every time we tried something different yesterday we did a fresh install, just to make sure it was a clean slate.

    Network and mostly hardware are exactly the same. At least same model, not the exact same gateway just identical.

  • Looks like you've whittled through all the things I would have; good luck fixing it, and keep us updated, for SEO keepsakes :)

  • jar Patron Provider, Top Host, Veteran

    Thanks brother. Definitely let you guys know what happens so this can be documented for anyone else. I've run into pieces that look similar everywhere, a couple people that may or may not have the same issue, never quite exactly this and a solid solution that doesn't involve a returned arp error.

    I'll say this, Gordon has gone above and beyond. Given the circumstances I don't blame him for thinking it's my config first and foremost. Don't blame him one bit. But he's messing with it now.

  • @Jarland

    Did you mention you were getting another node setup? I'd be interested to see if the new node has the same problem.

    Have you performed a yum upgrade?

    Compare the package versions on NODE 2 to those on NODE 1. It's a long shot, but it's a process of elimination.

  • jar Patron Provider, Top Host, Veteran

    Yeah this is our second Dallas server. The first one works great and works exactly as I would expect. Now, it was configured with SolusVM when we set it up, but I tried that on the new one too. I keep the first node up to date because I haven't seen a risky package update since first install, so both are running the same packages. I had to install the previous openvz kernel on the new one to completely match the first, then started mirroring all relevant configs aside from hardware addresses and IPs. Didn't seem to have any effect.

    At least I'm not the only one who thinks it's strange. I'd love to hear "Hey stupid, do this, can't believe you're a provider!" But I'll settle for knowing it makes others scratch their heads ;)

  • qps Member, Host Rep

    Is it an Intel-based Ethernet card? I've heard of some of their cards having issues. Sometimes upgrading to the latest version of the drivers helps. I think there was a thread about it on WHT a while back.

  • jar Patron Provider, Top Host, Veteran

    Interesting. It is intel. Worth looking into, thanks :)

  • Jacob Member
    edited January 2013

    Yeah. I'm really not sure what to suggest on this. I've never encountered this before, but I've had some flaky .32 servers that required a lot of tweaking to perfect. (Perhaps try this with the .18 kernel?)

    What @qps said is also a very good point, but Intel NICs are often quite reliable, and I've never had a NIC fail on me, or any faults at all. Touch wood.

    On a couple of Supermicro boards, the inbuilt NICs have driver problems. But without a doubt Gordon would already know this. (Driver updates can fix this.)

    @jarland said: I'd love to hear "Hey stupid, do this, can't believe you're a provider!" But I'll settle for knowing it makes others scratch their heads ;)

  • @Jacob said: On a couple supermicro boards, the inbuilt NICs have driver problems. But without a doubt gordon would already know this. (Driver updates can fix this).

    While I doubt this is a related issue, E3 boards have horrible NIC issues with CentOS (or RHEL-based distros), and updating to the latest BIOS and using kmod-e1000e from ELRepo will fix all but OpenVZ-based distros. The telltale sign is a NIC reset on the console, and the only way to resolve it is a reboot. But again, that doesn't seem related at all to what @jarland is describing.

  • @jarland use 'ip route add' to add your other ranges; don't waste an IP in ifcfg-eth0:0. Make sure iptables is turned off.

  • I would most likely rofl if this issue is IpTables related. :)

  • @Corey said: @jarland use 'ip route add' to add your other ranges; don't waste an IP in ifcfg-eth0:0. Make sure iptables is turned off.

    service network restart and then ping and wait

  • Try manually installing the e1000 drivers like others have suggested. I use the following on my CentOS 6 servers:

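    # -O names the saved file so the tar glob below matches it
    wget -O e1000e-1.6.3.tar.gz "http://sourceforge.net/projects/e1000/files/e1000e%20stable/1.6.3/e1000e-1.6.3.tar.gz/download"
    tar -zxf e1000e-*.tar.gz
    cd e1000e-*/src
    # pass both defines in one CFLAGS_EXTRA; a second assignment would override the first
    make CFLAGS_EXTRA="-DDISABLE_PCI_MSI -DE1000E_NO_NAPI" install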

    Then add the following to your kernel line in grub.conf

    pcie_aspm=off e1000e.IntMode=1,1

  • jar Patron Provider, Top Host, Veteran
    edited January 2013

    @Corey said: don't waste an ip in ifcfg eth0:0

    Main reason I started doing it that way was that it was a good way to show what I believed the problem to be while removing OpenVZ from the equation. Opens up my options for troubleshooting.

    @eastonch said: I would most likely rofl if this issue is IpTables related. :)

    Oh man I wish it was :(

    @George_Fusioned said: Try manually installing the e1000 drivers like others have suggested. I use the following on my CentOS 6 servers:

    Thanks! Gave it a shot and no luck, but certainly didn't hurt anything.

    This is why I love LET.

    Handing it back to Gordon for him to toy with it some more. I'm about to lose my mind. This thing is such a beast too. Dual E5, 8x 500GB drives.

  • trewq Administrator, Patron Provider

    I'm having this issue with just one IP in the middle of a range. It's so weird. I get my ips so cheap it's not worth fixing it.

  • jar Patron Provider, Top Host, Veteran

    @trewq said: I'm having this issue with just one IP in the middle of a range. It's so weird. I get my ips so cheap it's not worth fixing it.

    That is quite strange. Maybe what we figure out helps you gain an IP ;)

  • @jarland said: That is quite strange. Maybe what we figure out helps you gain an IP ;)

    Sorry I couldn't be of help... those are the issues I usually run into with openvz.

  • @trewq said: I'm having this issue with just one IP in the middle of a range. It's so weird. I get my ips so cheap it's not worth fixing it.

    Same issue here strangely.

  • jar Patron Provider, Top Host, Veteran

    @Corey said: Sorry I couldn't be of help... those are the issues I usually run into with openvz.

    Appreciate any attempts. It's really all any of us can do, throw random ideas at it. It seems I've hit a pretty legit issue.

  • jar Patron Provider, Top Host, Veteran
    edited January 2013

    Broadcast limit changed on the gateway, problem appears solved. I watched Gordon repeat all of our steps on the KVM, and he was quite generous in dealing with the issue once it was caught.
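    Since the fix turned out to be a broadcast limit on the provider's side, one check that may help narrow down a similar case is seeing whether ARP who-has broadcasts for the extra IP ever reach the server. A rough sketch, assuming `eth0` and a documentation address; `count_who_has` is a made-up helper, and the line format it matches is what tcpdump normally prints for ARP requests.

```shell
# Hypothetical helper: count ARP requests for a given IP in tcpdump text output.
# Reads tcpdump lines on stdin, e.g. from:
#   tcpdump -l -n -i eth0 -c 50 arp
count_who_has() {
    # -F treats the address literally, so dots are not regex wildcards;
    # the trailing space keeps 192.0.2.1 from matching 192.0.2.10
    grep -F -c "who-has $1 " || true
}
```

    Something like `tcpdump -l -n -i eth0 -c 50 arp | count_who_has 192.0.2.10`, run while an outside host pings the address, would be expected to print 0 if the requests are being dropped upstream.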

  • @jarland said: Broadcast limit changed on the gateway, problem appears solved. I watched Gordon repeat all of our steps on the KVM, and he was quite generous in dealing with the issue once it was caught.

    Broadcast limit on the gateway? So on the provider's switch there was a limit on your gateway? Or do you have your own layer 3 device?

  • jar Patron Provider, Top Host, Veteran
    edited January 2013

    @Corey said: Broadcast limit on the gateway? So on the provider's switch there was a limit on your gateway? Or do you have your own layer 3 device?

    All Gordon's equipment. Gateway, switch, words are getting lumped together in my head after all this ;)

  • @jarland I am having the same problem with my server at Incero. Can you explain how you solved it?

  • AnthonySmith Member, Patron Provider

    @ftpit

    Did you read the thread? The resolution is in it... 2 posts up.

  • rsk Member, Patron Provider

    @jarland said: So we get a new server set up, rented IP range and connected to our provider's network. The primary IP operates fine, the node is fully accessible from the internet. So I install OpenVZ, set all of the configuration correctly (knowing the configuration is correct, I still compared to a working system that is almost identical and connected to identical network equipment), create OpenVZ container, container has no internet. Also cannot ping container from the outside.

    This is the problem we faced too :(

  • @rsk said: This is the problem we faced too :(

    Are you with Incero too?

  • jar Patron Provider, Top Host, Veteran

    @ftpit said: I am having the same problem with my server at Incero. Can you explain how you solved it?

    It is entirely possible that there could be another problem, perhaps even software. However, if you have ruled this out, you can ask Gordon if he can see any reason why you would be having the same issue that Jarland & Ryan from Catalyst had. The situation should still be pretty fresh on his mind.
