Commit Graph

78 Commits

Author SHA1 Message Date
Dan Winship
13f0449e4c Fix up kube-proxy import ordering/organization. 2025-03-07 10:43:43 -05:00
Kubernetes Prow Robot
80026570aa Merge pull request #130119 from npinaeva/nft-restart
[kube-proxy: nftables] Optimize kube-proxy restart time
2025-03-04 10:17:44 -08:00
Nadia Pinaeva
cc0faf086d [kube-proxy:nftables] Skip EP chain updates on startup.
Endpoint chain contents are fairly predictable from their name and
existing affinity sets. Skip endpoint chain updates, when we can be sure
that rules in that chain are still correct.

Add unit test to verify first transaction is optimized.
Change baseRules ordering to make it accepted by nft.ParseDump.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
2025-02-27 10:07:22 +01:00
Ryota Sakamoto
f484ae5bcb Fix kernel version check condition in nftables proxier
Signed-off-by: Ryota Sakamoto <skmt@amazon.com>
2025-02-24 18:45:16 +00:00
Nadia Pinaeva
7d5f3c5723 [kube-proxy:nftables] Read map/set elements on setup.
We used to flush and re-add all map/set elements on nftables
setup, but it is faster to read the existing elements and only
transact the diff.

Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
2025-02-18 11:28:41 +01:00
Kubernetes Prow Robot
3a4c2a0bbb Merge pull request #129271 from aroradaman/dual_stack_healthz
Dual stack healthz server
2025-01-20 07:32:42 -08:00
olderTaoist
561c1d235a full sync per one hour with BFR 2025-01-14 09:24:38 +08:00
Daman Arora
d6c575532a pkg/proxy/healthcheck: rename 'proxier' to 'proxy'
KubeProxy operates with a single health server and two proxies,
one for each IP family. The use of the term 'proxier' in the
types and functions within pkg/proxy/healthcheck can be
misleading, as it may suggest the existence of two health
servers, one for each IP family.

Signed-off-by: Daman Arora <aroradaman@gmail.com>
2025-01-08 17:26:47 +05:30
Dan Winship
f5969adb14 Clean up NewServiceChangeTracker/NewEndpointsChangeTracker args
Remove the now-unused event recorders, and put the remaining args into
a sensible order, and consistent between the two.
2024-12-14 12:12:42 -05:00
Nadia Pinaeva
90e64a57c6 kube-proxy,nftables: add debug logging for failed transaction.
Use a rate limiter to avoid large output with a high rate.

Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
2024-12-13 15:53:19 +01:00
Antonio Ojea
f93e6f3d3a kube-proxy implement dual stack metrics
Signed-off-by: Daman Arora <aroradaman@gmail.com>
Co-authored-by: Antonio Ojea <aojea@google.com>
2024-12-12 16:13:30 +05:30
Daman Arora
c398af07fa proxy: refactor UpdateEndpointsMapResult
Signed-off-by: Daman Arora <aroradaman@gmail.com>
2024-10-28 20:10:34 +05:30
Daman Arora
1ad8880c0f proxy/conntrack: reconciler
Signed-off-by: Daman Arora <aroradaman@gmail.com>
2024-10-28 20:08:53 +05:30
Paco Xu
0e10a3a28c Revert "re: kube-proxy: internal config: refactor HealthzAddress and MetricsAddress " 2024-10-21 11:36:59 +08:00
Daman Arora
48f1356b2f pkg/proxy: refactor NodePortAddresses to NodeAddressHandler
Signed-off-by: Daman Arora <aroradaman@gmail.com>
2024-10-14 21:49:29 +05:30
Daman Arora
c34b20fa63 proxy/conntrack: use proxier ip family for conntrack cleanup
Signed-off-by: Daman Arora <aroradaman@gmail.com>
2024-09-04 22:56:03 +05:30
Daman Arora
b0f823e6cc remove the conntrack binary dependency
kube-proxy needs to delete stale conntrack entries for UDP services to
avoid blackholing traffic. Instead of using the conntrack binary it
can use netlink calls directly, reducing the containers images size and
the security surface.

Signed-off-by: Daman Arora <aroradaman@gmail.com>
Co-authored-by: Antonio Ojea <aojea@google.com>
2024-09-04 21:48:34 +05:30
Nadia Pinaeva
3ccf5b8a55 [kube-proxy:nftables] Add partialSync mode to only transact changed
objects.
Change the order of operations to stop current iteration if no changes
to the service chains are needed.
Bump syncProxy frequency to 1 hour.
In a test kind cluster creation of 10K services, 2 endpoints each,
takes ~25m before the fix and ~9min after. Maximum memory usage
during creation is ~650MiB and 260MiB respectively.
Another important metric is the time it takes to create 1 new service
when 10K svc already exist. It used to take ~8m before the fix,
with partialSync it takes ~141ms.

Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
2024-07-23 17:32:30 +02:00
Nadia Pinaeva
dc13e42f56 [kube-proxy:nftables] cleanup: remove unused parameter and fix typo.
Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
2024-07-23 17:32:29 +02:00
Dan Winship
f762e5c8de Remove an unnecessary comment in nftables output
(It's redundant with the chain name.)
2024-07-18 10:54:30 -04:00
Dan Winship
b39fd03ee4 Allow disabling nftables kernel version check 2024-07-08 07:29:27 -04:00
Dan Winship
505f6833d9 Require kernel 5.13 for nftables kube-proxy 2024-07-01 10:07:27 -04:00
Dan Winship
912eca9e8b Reorganize nftables proxy init
Move the "nftables is supported" check into a separate function, and
call it before the --init-only return.
2024-07-01 10:07:27 -04:00
Quan Tian
9d71e5338d Remove unused sysctl parameter from nftables proxy
Signed-off-by: Quan Tian <quan.tian@broadcom.com>
2024-06-08 21:48:54 +08:00
Dan Winship
f1f390f13b clean up LocalTrafficDetector construction / tests (#124582)
* LocalTrafficDetector construction and test improvements

* Reorder getLocalDetector unit test fields so "input" args come before "output" args

* Don't pass DetectLocalMode as a separate arg to getLocalDetector

It's already part of `config`

* Clarify test names in preparation for merging

* Merge single-stack/dual-stack LocalTrafficDetector construction

Also, only warn if the *primary* IP family is not correctly configured
(since we don't actually know if the cluster is really dual-stack or
not), and pass the pair of detectors to the proxiers as a map rather
than an array.

* Remove the rest of Test_getDualStackLocalDetectorTuple
2024-04-28 08:51:23 -07:00
Kubernetes Prow Robot
ae8474adcd Merge pull request #124557 from danwinship/metrics-and-stuff
kube-proxy metrics cleanup (and stuff)
2024-04-26 18:31:57 -07:00
Dan Winship
c4dd2c5ad7 Re-enable V(9) transaction logging in nftables proxy 2024-04-26 11:41:51 -04:00
Dan Winship
d4e6e62134 Add nftables cleanup failure metric, fix cleanup bug
If the sync fails, don't try to cleanup, since it's guaranteed to fail
too.
2024-04-26 11:41:51 -04:00
Dan Winship
fc05a294cc Rename nftables sync failure metric 2024-04-26 09:27:41 -04:00
Dan Winship
1823de063b fix "Iptables" -> "IPTables" in metrics variable names 2024-04-26 09:27:41 -04:00
Dan Winship
dc1155bd53 Move LocalTrafficDetector from pkg/proxy/util/iptables to pkg/proxy/util
Since it's used for nftables as well now.
2024-04-25 08:51:43 -04:00
Ziqi Zhao
be4535bd34 convert k8s.io/kubernetes/pkg/proxy to contextual logging, part 1
Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
2024-04-22 13:08:41 +08:00
Dan Winship
19b3a9e194 (Mostly) Revert "change --nodeport-addresses behavior to default to primary node ip only"
This reverts commit 8bccf4873b, except
for the nftables unit test changes, since we still want the "new"
results (not to mention the bugfixes), just for a different reason
now.
2024-04-18 09:25:06 -04:00
Quan Tian
42672ee2ea Make comment about reject action more accurate
Signed-off-by: Quan Tian <qtian@vmware.com>
2024-02-07 23:57:47 +08:00
Quan Tian
c7e48f1ebf kube-proxy: flush nftables base chains on startup
Do an extra "add+delete" once to ensure all previous base chains in the
table will be recreated. Otherwise, altering properties (e.g. priority)
of these chains would fail the transaction.

Signed-off-by: Quan Tian <qtian@vmware.com>
2024-02-07 23:57:40 +08:00
Kubernetes Prow Robot
27ad20db35 Merge pull request #123005 from danwinship/minor-proxy-cleanup
Minor proxy cleanup
2024-01-28 08:44:38 -08:00
Kubernetes Prow Robot
053acbed90 Merge pull request #122724 from nayihz/feat_nft_nodeport_addr
change --nodeport-addresses behavior to default to primary node ip only
2024-01-26 16:32:03 +01:00
Kubernetes Prow Robot
e023511deb Merge pull request #122920 from danwinship/knftables-migration
Update knftables, with new sigs.k8s.io module name
2024-01-26 07:14:16 +01:00
Dan Winship
ebba2d4472 Move some code in the proxiers
For no real reason, the core Proxier definitions weren't at the start
of the files.

(This just moves code around. It doesn't change anything.)
2024-01-25 18:41:58 -05:00
nayihz
8bccf4873b change --nodeport-addresses behavior to default to primary node ip only 2024-01-25 13:42:30 +08:00
Kubernetes Prow Robot
55f9657e07 Merge pull request #122692 from aroradaman/reject-packets-to-invalid-port
proxy/nftables: reject packets destined for invalid ports of service ips
2024-01-24 23:17:34 +01:00
Dan Winship
09abfa46be Update knftables, with new sigs.k8s.io module name 2024-01-23 08:09:05 -05:00
Daman Arora
25a40b1c7c pkg/proxy/nftables: handle traffic to node ports with no endpoints
NFTables proxy will no longer install drop and reject rules for node
port services with no endpoints in chains associated with forward and
output hooks.

Signed-off-by: Daman Arora <aroradaman@gmail.com>
2024-01-21 20:07:56 +05:30
Daman Arora
4b40299133 pkg/proxy/nftables: handle traffic to cluster ip
NFTables proxy will now drop traffic directed towards unallocated
ClusterIPs and reject traffic directed towards invalid ports of
Cluster IPs.

Signed-off-by: Daman Arora <aroradaman@gmail.com>
2024-01-21 19:58:37 +05:30
Daman Arora
01d7de5464 pkg/proxy/nftables: rename constant names for nftable objects
Signed-off-by: Daman Arora <aroradaman@gmail.com>
2024-01-21 13:12:18 +05:30
Dan Winship
fcb51554a1 Plumb the conntrack.Interface up to the proxiers
And use the fake interface in the unit tests, removing the dependency
on setting up FakeExec stuff when conntrack cleanup will be invoked.

Also, remove the isIPv6 argument to CleanStaleEntries, because it can
be inferred from the other args.
2024-01-15 13:09:05 -05:00
Daman Arora
4ffa12b9d9 pkg/proxy/nftables: drop ct-state-invalid rule
Signed-off-by: Daman Arora <aroradaman@gmail.com>
2024-01-10 22:53:09 +05:30
Kubernetes Prow Robot
95a159299b Merge pull request #122614 from tnqn/nftables-firewall
kube-proxy: fix LoadBalancerSourceRanges not working for nftables mode
2024-01-09 22:27:16 +01:00
Quan Tian
f21f8d9984 kube-proxy: fix LoadBalancerSourceRanges not working for nftables mode
Previously, the firewall-check chain was run in input, forward, and
output hook but not prerouting hook. When the LoadBalancer traffic
arrived at input or forward hook, it had been DNATed to endpoint IP and
port, so the firewall-check chain didn't take effect, traffic from out
of LoadBalancerSourceRanges was not dropped.

It was not detected by unit test because the chains were sorted by
priority only, while hook should be taken into consideration.

The commit links the firewall-check chain to prerouting hook and unlinks
it from input and forward hook to ensure the traffic is filtered before
DNAT. The priorities of filter chains are updated from "DNATPriority-1"
to "DNATPriority-10" to allow third parties to insert something else
between them.

Signed-off-by: Quan Tian <qtian@vmware.com>
2024-01-09 17:34:16 +08:00
Lars Ekman
50b3ffc71f kube-proxy: LoadBalancerSourceRanges as *net.IPNet 2024-01-09 09:17:56 +01:00