ebtables eats 100% CPU on proxmox node
Good day.
I have installed Proxmox VE on a Debian 10 host node that was already running a configured firewalld, MariaDB, Postfix and BIND.
It is a fresh Proxmox installation; the host currently has no load and no traffic and is just idling. (The host node is an OVH KVM instance with 2 vCPUs and nested virtualization enabled.)
Host node network config (/etc/network/interfaces):
auto lo
iface lo inet loopback
# The primary network interface for PROXMOX host node
auto ens3
iface ens3 inet static
address 152.228.91.75/32
gateway 152.228.90.1
pointopoint 152.228.90.1
dns-nameservers 127.0.0.1 8.8.8.8 8.8.4.4
post-up echo 1 > /proc/sys/net/ipv4/ip_forward
post-up echo 1 > /proc/sys/net/ipv4/conf/ens3/proxy_arp
# for KVM VM-IPs
auto vmbr0
iface vmbr0 inet static
address 152.228.91.75/32
bridge_ports none
bridge_stp off
bridge_fd 0
# up ip route for additional-IPs/32 on vmbr0
up ip route add 178.32.100.221/32 dev vmbr0
Firewalld uses 2 zones:
PUBLIC for host node (for ens3):
<?xml version="1.0" encoding="utf-8"?>
<zone>
<short>Public</short>
<interface name="ens3"/>
<service name="smtp"/>
<service name="smtps"/>
<service name="pop3"/>
<service name="pop3s"/>
<service name="imap"/>
<service name="imaps"/>
<port port="587" protocol="tcp"/>
<port port="53" protocol="tcp"/>
<port port="2222" protocol="tcp"/>
<port port="3306" protocol="tcp"/>
<port port="53" protocol="udp"/>
<port port="22" protocol="tcp"/>
<port port="8006" protocol="tcp"/>
</zone>
TRUSTED for the KVM instance (for vmbr0; lets all traffic to the vmbr0 IPs pass through the host without being filtered by the host's firewall):
<?xml version="1.0" encoding="utf-8"?>
<zone target="ACCEPT">
<short>Trusted</short>
<description>All network connections are accepted.</description>
<interface name="vmbr0"/>
</zone>
ISSUE:
Immediately after setup (or after a host node reboot) everything works fine:
- The host node OS has low CPU load (0% up to 8 or 9% at most; always under 10%),
- The KVM instance (using the vmbr0 interface) is also idling (CPU constantly below 4-5%; its LAMP stack, MTA and all internet connections work fine),
- no traffic on either, ALL FINE
BUT: after a few hours (by the next morning) I always find:
- The host node has a crazy load of 100% CPU: the ebtables-restor process permanently consumes all available CPU, while traffic is still idle
- (The KVM instance remains OK, with idle CPU and no traffic; there is no problem there)
BTW: I also tried to achieve the same thing (let all traffic for vmbr0 go directly to the VM without being affected by the host firewall) with the only other option I know of, a firewall config using direct FORWARD rules:
firewall-cmd --permanent --direct --passthrough ipv4 -I FORWARD -i vmbr0 -j ACCEPT
firewall-cmd --permanent --direct --passthrough ipv4 -I FORWARD -o vmbr0 -j ACCEPT
firewall-cmd --reload; service firewalld restart
Unfortunately, the result after several hours is exactly the same ISSUE as in the first case with vmbr0 added to the TRUSTED zone (after a few hours the host node's CPU is 100% busy, consumed by ebtables-restor).
I am totally desperate by now: I have lost 2 whole days on this issue. I reinstalled everything from scratch several times, but exactly the same problem always came back, with ebtables consuming 100% CPU on an idle host node and servers. I also cannot find a single similar case on the internet, or any useful resolutions or suggestions, nothing... It is an absolutely strange thing to me, and I am not able to resolve it and move on.
BTW#2: I know it is best to run nothing but Proxmox on the host, but I need to keep the current setup for several reasons (firewalld and the software mentioned above must keep running on the host node, and I cannot move it all into a nested VM behind the Proxmox host node).
Thank you very much for all your effort and help. I would be really grateful if someone could help me resolve this crazy and (to me) strange issue. All attempts are highly appreciated.
Comments
Additionally, here is the current complete iptables content, in case it is useful:
iptables -S
Thanks for all help!
You are hosting the node on a VPS. Try hosting the Proxmox VE node on a dedicated server instead of a VPS with limited resources.
Hello. I am fairly sure that the small resources of the VPS are not the root cause of the issue in this case, because:
1) When everything works well, the load on the host is permanently below 10% (mostly 1-4%, and that is usually consumed by the KVM process), but later, for an unknown reason, ebtables starts sabotaging the whole system and the load suddenly goes to 100%.
2) The same issue occurs on a dedicated server (dual Xeon with 24 cores, in the scenario described above): the ebtables process eats an abnormal amount of CPU. I assume that something ebtables does gets stuck and loops forever (when I reboot the node or restart the firewall, everything runs fine again for a few hours, and then the cycle repeats).
@ everybody who is interested in this issue:
UPDATE:
After another day of searching, reading and investigating, I have probably found the cause of the problem and also a usable solution (not quite clean, but so far everything seems to work well).
CAUSE:
It seems that when firewalld and Proxmox VE meet on one host/OS, they compete for control of ebtables and conflict with each other. At the beginning, the ebtables ruleset is empty (all connections allowed), but after some time I found it full of nonsensical rules and many repeated rows; after a few hours there were several thousand of them. Then ebtables-restor tries, unsuccessfully and over and over, to maintain the ebtables ruleset, and this permanently eats a crazy portion of CPU resources and causes my issue.
MY PARTIAL SOLUTION (which so far seems to work well):
As I never need the Ethernet bridge firewall (because each IP is filtered by its own OS firewall), I decided to try disabling ebtables. As I did not find an option for this in the firewalld docs, I disabled ebtables in the kernel (as a blacklisted module in modprobe.d). Since then everything works well: after more than 24 hours with ebtables off, the load is still normal and very low (example CPU load averages: 0.14 (1 min), 0.21 (5 min), 0.18 (15 min)).
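For anyone who wants to check whether their ebtables ruleset is growing in the same way, here is a rough sketch. It assumes ebtables-save is available on the host; the function name count_duplicate_rules is just something made up for this post.

```shell
# Count how often each rule line repeats in an ebtables dump read from stdin.
# Table headers ('*...'), chain policies (':...'), comments and blank lines
# are skipped; only rules that occur more than once are printed.
count_duplicate_rules() {
    grep -v -e '^\*' -e '^:' -e '^#' -e '^$' \
        | sort | uniq -c | sort -rn | awk '$1 > 1'
}

# Usage on the host (as root):
#   ebtables-save | wc -l                          # total size of the ruleset
#   ebtables-save | count_duplicate_rules | head   # most repeated rules first
```

If the line count keeps climbing on an idle host and the same rules show up with large repeat counts, that would match the behaviour described above.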
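A minimal sketch of what such a modprobe.d blacklist could look like. This is an assumption-laden example, not the exact file from this host: the file name disable-ebtables.conf and the helper function are made up, and it assumes the legacy ebtables kernel modules (ebtables, ebtable_filter, ebtable_nat, ebtable_broute) are what is being loaded.

```shell
# Write a modprobe.d fragment that keeps the ebtables modules from loading.
# The file name and the default directory argument are illustrative only.
write_ebtables_blacklist() {
    conf_dir="${1:-/etc/modprobe.d}"
    cat > "$conf_dir/disable-ebtables.conf" <<'EOF'
# "blacklist" only stops alias-based autoloading; the "install" lines make
# an explicit modprobe of the module fail as well.
blacklist ebtables
blacklist ebtable_filter
blacklist ebtable_nat
blacklist ebtable_broute
install ebtables /bin/false
install ebtable_filter /bin/false
install ebtable_nat /bin/false
install ebtable_broute /bin/false
EOF
}

# On the host (as root): write_ebtables_blacklist, then reboot.
# Afterwards, `lsmod | grep ebtable` should print nothing.
```

Note that this switches off bridge-level filtering for everything on the host, so it only makes sense if, as here, the ethernet bridge firewall is not needed at all.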