kube-proxy needs to delete stale conntrack entries for UDP services to
avoid blackholing traffic. Instead of using the conntrack binary, it
can make netlink calls directly, reducing the container image size and
the security surface.
Signed-off-by: Daman Arora <aroradaman@gmail.com>
Co-authored-by: Antonio Ojea <aojea@google.com>
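For illustration, a minimal sketch of the netlink-based approach using a
recent github.com/vishvananda/netlink; the helper name and the filter
values (a single service IP and port) are illustrative, not the actual
kube-proxy code:

    package main

    import (
        "fmt"
        "net"

        "github.com/vishvananda/netlink"
        "golang.org/x/sys/unix"
    )

    // deleteStaleUDPEntries removes conntrack entries whose original
    // destination is the given service IP and port, over netlink, without
    // exec'ing the conntrack binary.
    func deleteStaleUDPEntries(serviceIP net.IP, servicePort uint16) error {
        filter := &netlink.ConntrackFilter{}
        if err := filter.AddIP(netlink.ConntrackOrigDstIP, serviceIP); err != nil {
            return err
        }
        if err := filter.AddPort(netlink.ConntrackOrigDstPort, servicePort); err != nil {
            return err
        }
        if err := filter.AddProtocol(unix.IPPROTO_UDP); err != nil {
            return err
        }

        family := netlink.InetFamily(unix.AF_INET)
        if serviceIP.To4() == nil {
            family = netlink.InetFamily(unix.AF_INET6)
        }

        // Walk the conntrack table and delete every flow matching the filter.
        n, err := netlink.ConntrackDeleteFilters(netlink.ConntrackTable, family, filter)
        if err != nil {
            return err
        }
        fmt.Printf("deleted %d stale conntrack entries\n", n)
        return nil
    }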
objects.
Change the order of operations to stop the current iteration if no changes
to the service chains are needed.
Bump the syncProxy period to 1 hour.
In a test kind cluster, creating 10K services with 2 endpoints each
takes ~25min before the fix and ~9min after. Maximum memory usage
during creation is ~650MiB and ~260MiB respectively.
Another important metric is the time it takes to create 1 new service
when 10K services already exist: it used to take ~8min before the fix;
with partialSync it takes ~141ms.
Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
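A rough sketch of the reordering; the names below (the changed set,
writeServiceChains) are hypothetical stand-ins, not the actual proxier
code:

    package main

    import "k8s.io/apimachinery/pkg/util/sets"

    // syncServices stands in for the proxier's per-service loop: with partial
    // sync, a service whose spec and endpoints are unchanged since the last
    // successful sync is skipped entirely instead of having its chains
    // regenerated.
    func syncServices(services []string, changed sets.Set[string], tryPartialSync bool, writeServiceChains func(svc string)) {
        for _, svc := range services {
            if tryPartialSync && !changed.Has(svc) {
                continue // nothing to do for this service; stop this iteration early
            }
            writeServiceChains(svc)
        }
    }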
* LocalTrafficDetector construction and test improvements
* Reorder getLocalDetector unit test fields so "input" args come before "output" args
* Don't pass DetectLocalMode as a separate arg to getLocalDetector
It's already part of `config`
* Clarify test names in preparation for merging
* Merge single-stack/dual-stack LocalTrafficDetector construction
Also, only warn if the *primary* IP family is not correctly configured
(since we don't actually know if the cluster is really dual-stack or
not), and pass the pair of detectors to the proxiers as a map rather
than an array.
* Remove the rest of Test_getDualStackLocalDetectorTuple
The constructors only return an error if you pass them invalid data,
but we only ever pass them data which has already been validated,
making the error checking just annoying. Just make them return garbage
output if you give them garbage input.
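For the map-vs-array point, a sketch of the shape; the interface here is a
stand-in, since the real LocalTrafficDetector and its constructors live in
the kube-proxy util packages:

    package main

    import (
        v1 "k8s.io/api/core/v1"
    )

    // LocalTrafficDetector is a stand-in for kube-proxy's detector interface.
    type LocalTrafficDetector interface {
        IsImplemented() bool
    }

    // Keying the detectors by IP family makes call sites self-describing,
    // compared to a positional two-element array.
    func buildDetectors(newDetector func(v1.IPFamily) LocalTrafficDetector) map[v1.IPFamily]LocalTrafficDetector {
        return map[v1.IPFamily]LocalTrafficDetector{
            v1.IPv4Protocol: newDetector(v1.IPv4Protocol),
            v1.IPv6Protocol: newDetector(v1.IPv6Protocol),
        }
    }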
Windows proxy metric registration was in a separate file, which had
led to some metrics (e.g., the new ProxyHealthzTotal and ProxyLivezTotal)
not being registered for Windows even though they were implemented by
platform-generic code.
(A few other metrics were neither registered nor implemented on
Windows, and that's probably a bug.)
Also, beyond Linux-vs-Windows, make it clearer which metrics are
specific to individual backends.
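A sketch of registering such a metric once from platform-generic code via
component-base metrics; the counter definition is simplified relative to
the real one:

    package metrics

    import (
        "sync"

        "k8s.io/component-base/metrics"
        "k8s.io/component-base/metrics/legacyregistry"
    )

    // ProxyHealthzTotal is a simplified version of the counter mentioned above.
    var ProxyHealthzTotal = metrics.NewCounterVec(
        &metrics.CounterOpts{
            Subsystem: "kubeproxy",
            Name:      "proxy_healthz_total",
            Help:      "Cumulative proxy healthz HTTP status",
        },
        []string{"code"},
    )

    var registerOnce sync.Once

    // RegisterMetrics registers the platform-generic metrics exactly once, so
    // both the Linux and Windows proxies pick them up from shared code rather
    // than from per-platform files.
    func RegisterMetrics() {
        registerOnce.Do(func() {
            legacyregistry.MustRegister(ProxyHealthzTotal)
        })
    }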
This reverts commit 8bccf4873b, except
for the nftables unit test changes, since we still want the "new"
results (not to mention the bugfixes), just for a different reason
now.
The behavior when you specify no --nodeport-addresses value in a
dual-stack cluster is terrible and we can't fix it, for
backward-compatibility reasons. Actually, the behavior when you
specify no --nodeport-addresses value in a single-stack cluster isn't
exactly awesome either...
Allow specifying `--nodeport-addresses primary` to get the
previously-nftables-backend-specific behavior of listening only on the
node's primary IP or IPs.
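A hypothetical sketch of how the new keyword might be consumed when
computing NodePort listen addresses; the helper and parameter names are
illustrative, not the kube-proxy implementation:

    package main

    import "net"

    // nodePortAddressesPrimary is the literal value behind
    // `--nodeport-addresses primary`.
    const nodePortAddressesPrimary = "primary"

    // nodePortIPs returns the IPs to accept NodePort traffic on: only the
    // node's primary IP(s) when "primary" is requested, otherwise whatever
    // the configured CIDRs select (CIDR matching elided here).
    func nodePortIPs(cidrStrings []string, primaryNodeIPs []net.IP) []net.IP {
        if len(cidrStrings) == 1 && cidrStrings[0] == nodePortAddressesPrimary {
            return primaryNodeIPs
        }
        // ... match node IPs against the configured CIDRs ...
        return nil
    }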
Add packet tracing unit tests for IPv4 and IPv6.
Remove unreachable code from runChain, since some of the parsed rules
are never generated by the proxy implementation.
Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
Do an extra "add+delete" once to ensure all previous base chains in the
table will be recreated. Otherwise, altering properties (e.g. priority)
of these chains would fail the transaction.
Signed-off-by: Quan Tian <qtian@vmware.com>
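A sketch of the transaction shape using sigs.k8s.io/knftables (the base
chain shown is illustrative): one way to get the effect described is a
one-time add+delete of the table itself, which guarantees it exists and
then destroys it along with any base chains left over from an older proxy,
so the subsequent adds recreate everything with the desired properties
instead of the transaction failing on a property change:

    package main

    import (
        "context"

        "sigs.k8s.io/knftables"
    )

    func setupTable(nft knftables.Interface) error {
        tx := nft.NewTransaction()

        // One-time cleanup: create-if-missing, then delete, so the table (and
        // every base chain in it) is recreated from scratch below.
        tx.Add(&knftables.Table{})
        tx.Delete(&knftables.Table{})

        tx.Add(&knftables.Table{})
        tx.Add(&knftables.Chain{
            Name:     "filter-forward",
            Type:     knftables.PtrTo(knftables.FilterType),
            Hook:     knftables.PtrTo(knftables.ForwardHook),
            Priority: knftables.PtrTo(knftables.FilterPriority),
        })

        return nft.Run(context.TODO(), tx)
    }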
For node port services with no endpoints, NFTables proxy will no longer
install drop and reject rules in the chains associated with the forward
and output hooks.
Signed-off-by: Daman Arora <aroradaman@gmail.com>
NFTables proxy will now drop traffic directed towards unallocated
ClusterIPs and reject traffic directed towards invalid ports of
ClusterIPs.
Signed-off-by: Daman Arora <aroradaman@gmail.com>
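A sketch of the rule shapes via knftables; the chain name, set name and
service CIDR below are placeholders, not the real kube-proxy objects:

    package main

    import "sigs.k8s.io/knftables"

    // addClusterIPFilterRules shows the two behaviors described above.
    // Traffic that reaches this chain destined to an allocated ClusterIP must
    // have used a port with no service rule, so it is rejected; anything else
    // inside the service CIDR is an unallocated ClusterIP and is dropped.
    func addClusterIPFilterRules(tx *knftables.Transaction, serviceCIDR string) {
        tx.Add(&knftables.Rule{
            Chain:   "cluster-ips-check",
            Rule:    "ip daddr @cluster-ips reject",
            Comment: knftables.PtrTo("Reject traffic to invalid ports of ClusterIPs"),
        })
        tx.Add(&knftables.Rule{
            Chain:   "cluster-ips-check",
            Rule:    "ip daddr " + serviceCIDR + " drop",
            Comment: knftables.PtrTo("Drop traffic to unallocated ClusterIPs"),
        })
    }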
And use the fake interface in the unit tests, removing the dependency
on setting up FakeExec stuff when conntrack cleanup is invoked.
Also, remove the isIPv6 argument to CleanStaleEntries, because it can
be inferred from the other args.
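For the second point, the family can be derived from an IP that is already
among the arguments, along these lines (illustrative helper, using
k8s.io/utils/net):

    package main

    import netutils "k8s.io/utils/net"

    // familyForIP derives the conntrack family from a service or endpoint IP
    // string that CleanStaleEntries already receives, instead of taking a
    // separate isIPv6 flag.
    func familyForIP(ip string) string {
        if netutils.IsIPv6String(ip) {
            return "ipv6"
        }
        return "ipv4"
    }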
The iptables and nftables proxy backends had 2 unit tests
(TestDeleteEndpointConnections and TestProxierDeleteNodePortStaleUDP)
that were effectively testing that:
- If the proxy saw various Service/EndpointSlice events this would
result in specific changes to the service/endpoints trackers, AND
- If the service/endpoints trackers changed in those specific ways
this would result in specific UpdateServiceMapResult and
UpdateEndpointsMapResult values being generated, AND
- If you passed those specific UpdateServiceMapResult and
UpdateEndpointsMapResult values to conntrack.CleanStaleEntries it
would make specific calls to the lower-level conntrack methods,
AND
- If you called the lower-level conntrack methods with those
specific arguments, it would result in specific executions of the
conntrack binary, mixed with a specific number of klog
invocations.
This... is not a good unit test. We already test the change tracker
behavior in other unit tests, we already test the
Update{Service,Endpoints}MapResult behavior in the pkg/proxy unit
tests, we already test the conntrack exec behavior in
pkg/proxy/conntrack/conntrack_test.go, and we now test the
CleanStaleEntries behavior in pkg/proxy/conntrack/cleanup_test.go. So
there is no need to try to test the top-to-bottom behavior as a "unit
test".