This simplifies how the proxier receives updates for changes in node
labels. Instead of passing the complete Node object, we pass only the
proxy-relevant topology labels extracted from the complete label set,
and the downstream event handlers are only notified when the topology
labels change.
Signed-off-by: Daman Arora <aroradaman@gmail.com>
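A minimal sketch of that flow in Go (the label list and function names
here are assumptions for illustration, not the actual kube-proxy code):

```go
package proxy

import (
	"maps"

	v1 "k8s.io/api/core/v1"
)

// topologyLabels is the set of node labels the proxy cares about
// (assumed here for the sketch).
var topologyLabels = []string{
	v1.LabelTopologyZone,
	v1.LabelTopologyRegion,
}

// extractTopologyLabels pulls only the proxy-relevant labels out of
// the node's full label map.
func extractTopologyLabels(nodeLabels map[string]string) map[string]string {
	out := map[string]string{}
	for _, key := range topologyLabels {
		if value, ok := nodeLabels[key]; ok {
			out[key] = value
		}
	}
	return out
}

// onNodeUpdate notifies downstream handlers only when the topology
// labels actually changed.
func onNodeUpdate(oldNode, newNode *v1.Node, notify func(map[string]string)) {
	oldTopo := extractTopologyLabels(oldNode.Labels)
	newTopo := extractTopologyLabels(newNode.Labels)
	if !maps.Equal(oldTopo, newTopo) {
		notify(newTopo)
	}
}
```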
Rather than having a RetryAfter function, retry (at a fixed interval)
whenever the work function returns an error.
Co-authored-by: Antonio Ojea <aojea@google.com>
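A hedged sketch of the resulting behavior (illustrative only; the real
runner is more involved):

```go
package async

import "time"

// runWithRetry runs the work function and, if it returns an error,
// retries at a fixed interval until it succeeds or stop is closed.
func runWithRetry(work func() error, retryInterval time.Duration, stop <-chan struct{}) {
	for {
		if err := work(); err == nil {
			return
		}
		select {
		case <-time.After(retryInterval):
			// retry
		case <-stop:
			return
		}
	}
}
```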
Burst syncs are theoretically useful for dealing with a single change
that results in multiple Run() calls (e.g., a Service and an
EndpointSlice both changing), but a burst of 2 isn't enough to cover
all cases, and a better way of dealing with this problem is simply to
use a smaller minSyncPeriod.
Co-authored-by: Antonio Ojea <aojea@google.com>
To make switching to/from nftables easier, kube-proxy runs iptables
and ipvs cleanup when starting in nftables mode, and runs nftables
cleanup when starting in iptables or ipvs mode. But there's no
guarantee that the node actually supports the mode we're trying to
clean up, so don't log errors if it doesn't.
iptables and ipvs were both leaving KUBE-MARK-MASQ behind (even though
the corresponding KUBE-POSTROUTING rule to actually do the masquerade
got deleted).
iptables was failing to clean up its KUBE-PROXY-FIREWALL chain (the
cleanup rules never got updated when that was split out of
KUBE-FIREWALL), and also not cleaning up its canary chain.
This also fixes ipvs.CleanupLeftovers so that it deletes the ipvs/ipset
state only once, rather than first deleting all of it on behalf of the
IPv4 Proxier and then no-op "deleting" it all again on behalf of the
IPv6 Proxier.
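Roughly, the startup cleanup now looks like this (the helper names are
illustrative, not the real kube-proxy functions):

```go
package main

import (
	"errors"

	"k8s.io/klog/v2"
)

// Hypothetical stand-ins for the real per-backend cleanup routines.
func cleanupIPTablesAndIPVS() error { return errors.New("no iptables support") }
func cleanupNFTables() error        { return nil }

// cleanupOtherModes cleans up the backends we are *not* about to use.
// A failure may just mean the node doesn't support that backend, so it
// is logged at low verbosity instead of as an error.
func cleanupOtherModes(mode string) {
	if mode == "nftables" {
		if err := cleanupIPTablesAndIPVS(); err != nil {
			klog.V(2).InfoS("Skipping iptables/ipvs cleanup", "reason", err)
		}
	} else {
		if err := cleanupNFTables(); err != nil {
			klog.V(2).InfoS("Skipping nftables cleanup", "reason", err)
		}
	}
}
```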
Various parts of kube-proxy passed around a "hostname", but it is
actually the name of the *node* kube-proxy is running on, which is not
100% guaranteed to be exactly the same as the hostname. Rename it
everywhere to make it clearer that (a) it is definitely safe to use
that name to refer to the Node, (b) it is not necessarily safe to use
that name with DNS, etc.
Basically all callers want dual-stack-if-possible, so simplify that.
Also, tweak the startup-time checking in kubelet to treat "no iptables
support" as interesting but not an error.
Remove a bunch of comments that are either inaccurate ("the proxier
can only be tested by e2e tests") or weirdly overspecific about
obvious details ("the proxier will not exit if an iptables call
fails").
Historically the constructor took an exec argument so you could pass a
FakeExec to mock its behavior in unit tests, but there is now a fake
implementation that is much more useful for unit tests than trying to
use the real implementation with a fake exec. (The unit tests still use
fake execs, but they don't need to use a public constructor.) So remove
the exec args from the public constructors.
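For example, a unit test can now construct the fake directly instead of
threading a FakeExec through a public constructor (a sketch; the test
body is illustrative):

```go
package proxy_test

import (
	"testing"

	utiliptables "k8s.io/kubernetes/pkg/util/iptables"
	iptablestest "k8s.io/kubernetes/pkg/util/iptables/testing"
)

func TestWithFakeIPTables(t *testing.T) {
	// The fake satisfies utiliptables.Interface and records the rules
	// that would have been created, with no exec involved.
	var ipt utiliptables.Interface = iptablestest.NewFake()
	if _, err := ipt.EnsureChain(utiliptables.TableNAT, "KUBE-TEST"); err != nil {
		t.Fatal(err)
	}
}
```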
Remove the utilexec.Interface args from the iptables/ipvs constructors
(which have been unused since the conntrack cleanup code was ported to
netlink).
Remove the EventRecorder fields from the iptables/ipvs Proxiers, which
have been unused since we removed the port-opener code in 2022.
Remove the strictARP field from the ipvs Proxier, which has apparently
always been unused (strictARP is only consulted at construction time).
kube-proxy operates with a single health server and two proxies,
one for each IP family. The use of the term 'proxier' in the
types and functions within pkg/proxy/healthcheck can be
misleading, as it may suggest the existence of two health
servers, one for each IP family.
Signed-off-by: Daman Arora <aroradaman@gmail.com>
kube-proxy needs to delete stale conntrack entries for UDP services to
avoid blackholing traffic. Instead of using the conntrack binary, it
can make netlink calls directly, reducing the container image size and
the security surface.
Signed-off-by: Daman Arora <aroradaman@gmail.com>
Co-authored-by: Antonio Ojea <aojea@google.com>
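A hedged sketch of the netlink approach using
github.com/vishvananda/netlink (the real kube-proxy code differs; this
only shows the shape of the calls):

```go
package main

import (
	"net"

	"github.com/vishvananda/netlink"
	"golang.org/x/sys/unix"
)

// deleteUDPServiceEntries flushes conntrack entries whose original
// destination is the given service IP and port (IPv4 here; use
// unix.AF_INET6 for IPv6).
func deleteUDPServiceEntries(serviceIP net.IP, port uint16) error {
	filter := &netlink.ConntrackFilter{}
	if err := filter.AddProtocol(unix.IPPROTO_UDP); err != nil {
		return err
	}
	if err := filter.AddIP(netlink.ConntrackOrigDstIP, serviceIP); err != nil {
		return err
	}
	if err := filter.AddPort(netlink.ConntrackOrigDstPort, port); err != nil {
		return err
	}
	_, err := netlink.ConntrackDeleteFilter(netlink.ConntrackTable, unix.AF_INET, filter)
	return err
}
```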
* LocalTrafficDetector construction and test improvements
* Reorder getLocalDetector unit test fields so "input" args come before "output" args
* Don't pass DetectLocalMode as a separate arg to getLocalDetector
It's already part of `config`
* Clarify test names in preparation for merging
* Merge single-stack/dual-stack LocalTrafficDetector construction
Also, only warn if the *primary* IP family is not correctly configured
(since we don't actually know whether the cluster is really dual-stack
or not), and pass the pair of detectors to the proxiers as a map rather
than an array (see the sketch after this list).
* Remove the rest of Test_getDualStackLocalDetectorTuple
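A sketch of the map-based shape described above (the types and the
newDetector/noop parameters are hypothetical stand-ins for the real
LocalTrafficDetector plumbing in pkg/proxy/util):

```go
package proxy

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/klog/v2"
)

// localDetector is a stand-in for the real LocalTrafficDetector
// interface.
type localDetector interface {
	IsImplemented() bool
}

// getLocalDetectors returns one detector per IP family, keyed by
// family instead of by array position.
func getLocalDetectors(primary v1.IPFamily, cidrs map[v1.IPFamily]string,
	newDetector func(string) localDetector, noop localDetector) map[v1.IPFamily]localDetector {
	detectors := map[v1.IPFamily]localDetector{}
	for _, family := range []v1.IPFamily{v1.IPv4Protocol, v1.IPv6Protocol} {
		if cidr := cidrs[family]; cidr != "" {
			detectors[family] = newDetector(cidr)
			continue
		}
		// Only warn about the primary family; the secondary one may be
		// unset simply because the cluster is single-stack.
		if family == primary {
			klog.InfoS("detect-local is not configured for the primary IP family", "family", family)
		}
		detectors[family] = noop
	}
	return detectors
}
```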
Windows proxy metric registration was in a separate file, which had
led to some metrics (e.g. the new ProxyHealthzTotal and ProxyLivezTotal)
not being registered on Windows even though they were implemented by
platform-generic code.
(A few other metrics were neither registered on, nor implemented on
Windows, and that's probably a bug.)
Also, beyond linux-vs-windows, make it clearer which metrics are
specific to individual backends.
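A hedged sketch of the registration pattern implied above (helper and
metric names are illustrative, not the actual kube-proxy code):
register the platform-generic metrics in shared code, and let each
backend add only its own, so nothing is skipped on Windows by accident.

```go
package proxymetrics

import (
	"sync"

	"k8s.io/component-base/metrics"
	"k8s.io/component-base/metrics/legacyregistry"
)

// Stand-ins for the real platform-generic counters.
var (
	proxyHealthzTotal = metrics.NewCounterVec(
		&metrics.CounterOpts{Name: "kubeproxy_proxy_healthz_total"},
		[]string{"code"},
	)
	proxyLivezTotal = metrics.NewCounterVec(
		&metrics.CounterOpts{Name: "kubeproxy_proxy_livez_total"},
		[]string{"code"},
	)

	registerOnce sync.Once
)

// RegisterMetrics registers the metrics shared by every backend, plus
// whatever backend-specific metrics the caller passes in.
func RegisterMetrics(backendMetrics ...metrics.Registerable) {
	registerOnce.Do(func() {
		legacyregistry.MustRegister(proxyHealthzTotal)
		legacyregistry.MustRegister(proxyLivezTotal)
		for _, m := range backendMetrics {
			legacyregistry.MustRegister(m)
		}
	})
}
```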
This reverts commit 8bccf4873b, except
for the nftables unit test changes, since we still want the "new"
results (not to mention the bugfixes), just for a different reason
now.