For kube-proxy, node addition and node update are semantically
the same event; we have exactly the same handler logic for both,
resulting in duplicate code and unit tests.
This merges the `NodeHandler` interface methods OnNodeAdd and
OnNodeUpdate into OnNodeChange, along with the implementations
of the interface.
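Roughly, the merged interface looks like this sketch (package clause
elided; OnNodeDelete and OnNodeSynced are shown for context and are
unchanged):

    import v1 "k8s.io/api/core/v1"

    type NodeHandler interface {
        // OnNodeChange is called whenever a Node object is added or
        // updated.
        OnNodeChange(node *v1.Node)
        // OnNodeDelete is called whenever a Node object is deleted.
        OnNodeDelete(node *v1.Node)
        // OnNodeSynced is called once all of the initial event handlers
        // have been called.
        OnNodeSynced()
    }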
Signed-off-by: Daman Arora <aroradaman@gmail.com>
ProxyHealthServer now consumes NodeManager to get the latest
Node object when determining node eligibility.
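As a sketch of the eligibility side (the NodeManager accessor name and
the exact criteria are assumptions):

    import v1 "k8s.io/api/core/v1"

    // nodeEligible reports whether the Node returned by the NodeManager
    // (e.g. something like nodeManager.Node()) can still serve traffic;
    // a missing or terminating node is not eligible.
    func nodeEligible(node *v1.Node) bool {
        return node != nil && node.DeletionTimestamp == nil
    }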
Signed-off-by: Daman Arora <aroradaman@gmail.com>
Co-authored-by: Dan Winship <danwinship@redhat.com>
NodeManager, if configured to watch PodCIDRs, watches for
changes to the node's PodCIDRs and crashes kube-proxy if a
change is detected.
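A minimal sketch of the check (the helper name and how the exit is
wired up are assumptions):

    import (
        "os"
        "reflect"

        v1 "k8s.io/api/core/v1"
        "k8s.io/klog/v2"
    )

    // checkPodCIDRs exits kube-proxy if the node's PodCIDRs no longer
    // match the ones it started with, so it restarts with the new values.
    func checkPodCIDRs(node *v1.Node, initialPodCIDRs []string) {
        if !reflect.DeepEqual(node.Spec.PodCIDRs, initialPodCIDRs) {
            klog.ErrorS(nil, "Node PodCIDRs changed, exiting",
                "node", node.Name, "old", initialPodCIDRs, "new", node.Spec.PodCIDRs)
            os.Exit(1)
        }
    }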
Signed-off-by: Daman Arora <aroradaman@gmail.com>
Co-authored-by: Dan Winship <danwinship@redhat.com>
NodeManager initialises the node informer, waits for the cache to
sync, and polls for the Node object to retrieve NodeIPs; it handles
node events and crashes kube-proxy when a change in NodeIPs is
detected.
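Roughly, the startup sequence looks like the sketch below (the
interval/timeout values and surrounding wiring are illustrative):

    import (
        "context"
        "fmt"
        "time"

        v1 "k8s.io/api/core/v1"
        apierrors "k8s.io/apimachinery/pkg/api/errors"
        "k8s.io/apimachinery/pkg/util/wait"
        coreinformers "k8s.io/client-go/informers/core/v1"
        "k8s.io/client-go/tools/cache"
    )

    // initialNode waits for the node informer cache to sync and then
    // polls until the Node object exists, so NodeIPs can be read from it.
    func initialNode(ctx context.Context, nodeInformer coreinformers.NodeInformer, nodeName string) (*v1.Node, error) {
        if !cache.WaitForCacheSync(ctx.Done(), nodeInformer.Informer().HasSynced) {
            return nil, fmt.Errorf("timed out waiting for node informer cache to sync")
        }
        var node *v1.Node
        err := wait.PollUntilContextTimeout(ctx, time.Second, 30*time.Second, true,
            func(ctx context.Context) (bool, error) {
                var getErr error
                node, getErr = nodeInformer.Lister().Get(nodeName)
                if apierrors.IsNotFound(getErr) {
                    return false, nil // Node not created yet; keep polling
                }
                return getErr == nil, getErr
            })
        return node, err
    }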
Signed-off-by: Daman Arora <aroradaman@gmail.com>
Co-authored-by: Dan Winship <danwinship@redhat.com>
This simplifies how the proxier receives updates for changes in
node labels. Instead of passing the complete Node object, we pass
only the proxy-relevant topology labels extracted from the full
label set, and the downstream event handlers are only notified
when the topology labels change.
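For illustration, the extraction could look like this (the exact set of
labels kube-proxy treats as topology labels is an assumption here):

    import v1 "k8s.io/api/core/v1"

    // topologyLabels returns only the topology-related labels from the
    // node's full label set.
    func topologyLabels(nodeLabels map[string]string) map[string]string {
        out := map[string]string{}
        for _, key := range []string{
            v1.LabelTopologyZone,   // topology.kubernetes.io/zone
            v1.LabelTopologyRegion, // topology.kubernetes.io/region
            v1.LabelHostname,       // kubernetes.io/hostname
        } {
            if value, ok := nodeLabels[key]; ok {
                out[key] = value
            }
        }
        return out
    }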
Signed-off-by: Daman Arora <aroradaman@gmail.com>
Rather than having a RetryAfter function, do a retry (at a fixed
interval) if the work function returns an error.
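Conceptually (illustrative names, not the actual BoundedFrequencyRunner
internals):

    import "time"

    type runner struct {
        fn            func() error  // the work function
        retryInterval time.Duration // fixed retry interval
        timer         *time.Timer
    }

    func (r *runner) tryRun() {
        if err := r.fn(); err != nil {
            // The work failed; schedule another attempt at a fixed
            // interval instead of making callers invoke RetryAfter.
            r.timer.Reset(r.retryInterval)
        }
    }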
Co-authored-by: Antonio Ojea <aojea@google.com>
Burst syncs are theoretically useful for dealing with a single change
that results in multiple Run() calls (eg, a Service and EndpointSlice
both changing), but 2 isn't enough to cover all cases, and a better
way of dealing with this problem is to just use a smaller
minSyncPeriod.
Co-authored-by: Antonio Ojea <aojea@google.com>
- Use structured logging.
- Use t.Helper() in unit tests.
- Improve some comments.
- Remove an unnecessary check/panic.
Co-authored-by: Antonio Ojea <aojea@google.com>
With the filter-output chain already operating at post-DNAT
priority, we can merge the two chains together.
Signed-off-by: Daman Arora <aroradaman@gmail.com>
With this commit the filter-input, filter-forward, and filter-output base
chains are hooked with priority 0. For filtering before DNAT,
filter-prerouting-pre-dnat and filter-output-pre-dnat should be used instead,
which have a priority (-110) lower than DNAT (-100).
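As a sketch using sigs.k8s.io/knftables (the chain names are the ones
above; the exact priority expression is an assumption):

    import "sigs.k8s.io/knftables"

    func declareFilterChains(tx *knftables.Transaction) {
        // Plain filter chains hook at the standard filter priority (0).
        tx.Add(&knftables.Chain{
            Name:     "filter-input",
            Type:     knftables.PtrTo(knftables.FilterType),
            Hook:     knftables.PtrTo(knftables.InputHook),
            Priority: knftables.PtrTo(knftables.FilterPriority), // 0
        })
        // Pre-DNAT filtering hooks just below DNAT (-100 - 10 = -110).
        tx.Add(&knftables.Chain{
            Name:     "filter-prerouting-pre-dnat",
            Type:     knftables.PtrTo(knftables.FilterType),
            Hook:     knftables.PtrTo(knftables.PreroutingHook),
            Priority: knftables.PtrTo(knftables.DNATPriority + "-10"),
        })
    }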
Signed-off-by: Daman Arora <aroradaman@gmail.com>
With this commit, the conntrack reconciler clears stale
entries when endpoints change port without changing IP.
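A hedged sketch of the staleness check (types and fields are
illustrative):

    type endpoint struct {
        IP   string
        Port uint16
    }

    // isStaleFlow reports whether a conntrack flow no longer points at
    // any current serving endpoint; because matching requires both IP and
    // port, an endpoint that keeps its IP but changes port is now treated
    // as stale and its entries are cleared.
    func isStaleFlow(flowIP string, flowPort uint16, endpoints []endpoint) bool {
        for _, ep := range endpoints {
            if ep.IP == flowIP && ep.Port == flowPort {
                return false
            }
        }
        return true
    }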
Signed-off-by: Daman Arora <aroradaman@gmail.com>
A packet can traverse the service-xxxx chains by matching on either the
service-ips or the service-nodeports verdict map. We masquerade off-cluster
traffic to a ClusterIP (when masqueradeAll = false) by adding a rule in
service-xxxx which checks that the destination IP is the ClusterIP, the
port and protocol match the service spec, and the source IP doesn't belong
to the PodCIDR, and masquerades on match.
If the packet reaches the service chain via a match on the service-ips map,
then the ClusterIP, port, and protocol already match the service spec. If
it comes via the external-xxxx chain then the destination IP will never be
the ClusterIP. Therefore, we can simplify the "masquerade off-cluster
traffic to ClusterIP" check to match only on destination IP and source IP.
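A sketch of the simplified rule using knftables (chain and variable
names are illustrative, not the exact proxier code):

    import (
        "strings"

        "sigs.k8s.io/knftables"
    )

    // The port and protocol were already matched by the service-ips
    // verdict map that dispatched to this chain, so only the destination
    // and source IPs need to be checked before masquerading.
    func addClusterIPMasqRule(tx *knftables.Transaction, svcChain, markMasqChain, clusterIP string, podCIDRs []string) {
        tx.Add(&knftables.Rule{
            Chain: svcChain,
            Rule: knftables.Concat(
                "ip daddr", clusterIP,
                "ip saddr != {", strings.Join(podCIDRs, ", "), "}",
                "jump", markMasqChain,
            ),
        })
    }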
Signed-off-by: Daman Arora <aroradaman@gmail.com>
To make switching to/from nftables easier, kube-proxy runs iptables
and ipvs cleanup when starting in nftables mode, and runs nftables
cleanup when starting in iptables or ipvs mode. But there's no
guarantee that the node actually supports the mode we're trying to
clean up, so don't log errors if it doesn't.
iptables and ipvs were both leaving KUBE-MARK-MASQ behind (even though
the corresponding KUBE-POSTROUTING rule to actually do the masquerade
got deleted).
iptables was failing to clean up its KUBE-PROXY-FIREWALL chain (the
cleanup rules never got updated when that was split out of
KUBE-FIREWALL), and also not cleaning up its canary chain.
This also fixes it so that ipvs.CleanupLeftovers only deletes
ipvs/ipset stuff once, rather than first deleting all of it on behalf
of the IPv4 Proxier and then no-op "deleting" it all again on behalf
of the IPv6 Proxier.
Various parts of kube-proxy passed around a "hostname", but it is
actually the name of the *node* kube-proxy is running on, which is not
100% guaranteed to be exactly the same as the hostname. Rename it
everywhere to make it clearer that (a) it is definitely safe to use
that name to refer to the Node, (b) it is not necessarily safe to use
that name with DNS, etc.