6746 Commits

Author SHA1 Message Date
Jan Safranek
e6807a8e4f Use _ for unused parameters
Sometimes the logger is not used. This fixes some linter warnings.
2024-11-06 11:16:06 +01:00
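A minimal sketch of the blank-identifier convention this commit refers to. The `Logger` type, the `handler` signature and `syncNoop` are hypothetical names made up for illustration; they are not taken from the controller code.

```go
package main

import "fmt"

// Logger is a hypothetical stand-in for a structured logger type.
type Logger struct{ prefix string }

// handler is a hypothetical callback signature that always receives a logger.
type handler func(logger Logger, key string) error

// syncNoop satisfies handler but never logs, so the logger parameter
// is named _ to make that explicit and to keep unused-parameter linters quiet.
func syncNoop(_ Logger, key string) error {
	if key == "" {
		return fmt.Errorf("empty key")
	}
	return nil
}

func main() {
	var h handler = syncNoop
	fmt.Println(h(Logger{prefix: "demo"}, "default/pod-a")) // prints "<nil>"
}
```

The signature stays compatible with callers that pass a logger, while the function body makes no claim to use it.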
Jan Safranek
dfb88095b0 Rename label to seLinuxLabel
In various parameters, variables and fields. To make the name more
obvious.
2024-11-06 11:16:06 +01:00
Jan Safranek
e438bc0561 Rework event recorder startup
* Remove Controller.recorder field, there already is eventRecorder.
* Start the event broadcaster in Run(), to save a bit of CPU and memory
  when something initializes the controller, but does not Run() it.
* Log events with log level 3, as the other controllers usually do.
* Use StartStructuredLogging(), which looks fancier than StartLogging
2024-11-06 11:16:06 +01:00
Jan Safranek
da2d9fa16e Fix golint errors
Revealed by the new SELinux warning controller, but not related to it.
2024-11-06 11:16:05 +01:00
Jan Safranek
aa8872d7a3 Add SELinux warning controller
2024-11-06 11:16:02 +01:00
Jan Safranek
0d71dc677e Refactor CreateVolumeSpec
Rename old CreateVolumeSpec to CreateVolumeSpecWithNodeMigration that
extracts volume.Spec with node specific CSI migration.

Add CreateVolumeSpec that does the same, only without evaluating node CSI
migration.
2024-11-06 11:15:31 +01:00
Kubernetes Prow Robot
08391b3d27 Merge pull request #123549 from carlory/kep-3751-finalizer
A new controller adds/removes finalizer to VAC for protection
2024-11-05 21:45:30 +00:00
Filip Křepinský
05bc270870 add tests for getReplicaSetFraction in the deployment controller (#128535)
* better name variables in deployment_util

* add tests for getReplicaSetFraction in the deployment controller

- make validation more robust and make sure we do not divide by 0
2024-11-04 19:11:43 +00:00
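The divide-by-zero guard mentioned in this commit could look roughly like the sketch below. `fraction` and its parameters are hypothetical and only illustrate the guard; this is not the actual `getReplicaSetFraction` implementation.

```go
package main

import (
	"fmt"
	"math"
)

// fraction is a hypothetical helper: it scales rsReplicas by the ratio
// newTotal/oldTotal and returns the resulting change in replicas.
func fraction(rsReplicas, newTotal, oldTotal int32) int32 {
	if oldTotal <= 0 {
		// No meaningful ratio exists. Integer division would panic here and
		// float division would yield Inf or NaN, so report "no change".
		return 0
	}
	scaled := math.Round(float64(rsReplicas) * float64(newTotal) / float64(oldTotal))
	return int32(scaled) - rsReplicas
}

func main() {
	fmt.Println(fraction(5, 20, 10)) // prints 5: 5 replicas scale to 10
	fmt.Println(fraction(5, 20, 0))  // prints 0: guarded case
}
```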
Kubernetes Prow Robot
7a4d755644 Merge pull request #128507 from dims/use-k8s.io/utils/lru-instead-of-github.com/golang/groupcache/lru
Use k8s.io/utils/lru instead of github.com/golang/groupcache/lru
2024-11-04 19:11:35 +00:00
Alay Patel
3e3276e9fe Promote PodIndexLabel for Statefulset and IndexedJob stable (#128387)
* lock feature gate for PodIndexLabel and mark it GA

Signed-off-by: Alay Patel <alayp@nvidia.com>

* add emulated version if testing disabling of PodIndexLabel FG

Signed-off-by: Alay Patel <alayp@nvidia.com>

---------

Signed-off-by: Alay Patel <alayp@nvidia.com>
2024-11-04 19:11:28 +00:00
Davanum Srinivas
2b0592ee77 Use k8s.io/utils/lru instead of github.com/golang/groupcache/lru
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2024-11-04 10:51:13 -05:00
Kubernetes Prow Robot
cc319bf654 Merge pull request #128527 from atiratree/annotations-validation
improve validation for ReplicaSet annotations in the deployment controller
2024-11-04 13:07:28 +00:00
Filip Křepinský
3ac0ac7a81 improve validation for ReplicaSet annotations in the deployment controller
2024-11-04 10:43:16 +01:00
Filip Křepinský
a460e2c413 simplify ScalingReplicaSet event building in the deployment controller
2024-11-04 08:42:22 +01:00
Kubernetes Prow Robot
ff5cb3791a Merge pull request #127903 from soltysh/test_daemonset
Add unit tests verifying the update touches old, unhealthy pods first, and only after new pods
2024-10-31 13:53:26 +00:00
Tobias Klauser
b01b016668 Use Go 1.21 min/max builtins
The `min` and `max` builtins are available since Go 1.21[^1]. The
top-level go.mod file is specifying Go 1.22, so the builtins can be
used.

[^1]: https://go.dev/doc/go1.21#language
2024-10-31 11:17:23 +01:00
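For illustration, a short example of the `min` and `max` builtins this commit switches to. The `clamp` helper is a made-up name and does not come from the commit; the code needs Go 1.21 or newer.

```go
package main

import "fmt"

// clamp bounds v to the range [lo, hi] using the min and max builtins,
// replacing hand-written helper functions or if/else chains.
func clamp(v, lo, hi int) int {
	return max(lo, min(v, hi))
}

func main() {
	fmt.Println(clamp(15, 0, 10), clamp(-3, 0, 10), clamp(7, 0, 10)) // prints "10 0 7"
}
```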
Maciej Szulik
174288d751 Add unit tests verifying the update touches old, unhealthy pods first, and only after new pods.
Signed-off-by: Maciej Szulik <soltysh@gmail.com>
2024-10-31 11:13:01 +01:00
Kubernetes Prow Robot
d001d5684e Merge pull request #128417 from tenzen-y/self-nominate-job-controller-reviewer
Self nominate tenzen-y as a reviewer for the Job controller
2024-10-30 11:21:39 +00:00
Kubernetes Prow Robot
a18b50e7e4 Merge pull request #128373 from mimowo/job-cover-negative-codes
Job Pod Failure policy - cover testing of negative exit codes
2024-10-30 11:21:31 +00:00
Kubernetes Prow Robot
daef8c2419 Merge pull request #127266 from pohly/dra-admin-access-in-status
DRA API: AdminAccess in DeviceRequestAllocationResult + DRAAdminAccess feature gate
2024-10-30 03:41:25 +00:00
Yuki Iwai
eca7ee877a Self nominate tenzen-y as a reviewer for the Job controller
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-10-30 01:14:47 +09:00
Kubernetes Prow Robot
c5ccf59974 Merge pull request #128379 from pohly/dra-owners-wg-label
DRA: add wg/device-management label automatically
2024-10-29 15:24:57 +00:00
Patrick Ohly
4419568259 DRA: treat AdminAccess as a new feature gated field
Using the "normal" logic for a feature gated field simplifies the
implementation of the feature gate.

There is one (entirely theoretic!) problem with updating from 1.31: if a claim
was allocated in 1.31 with admin access, the status field was not set because
it didn't exist yet. If a driver now follows the current definition of "unset =
off", then it will not grant admin access even though it should. This is
theoretic because drivers are starting to support admin access with 1.32, so
there shouldn't be any claim where this problem could occur.
2024-10-29 10:22:31 +01:00
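The "unset = off" rule described in this commit message can be sketched with an optional boolean. The `allocationResult` struct below is a hypothetical stand-in, not the real API type, and `adminAccessGranted` is an illustrative helper a driver might write.

```go
package main

import "fmt"

// allocationResult is a hypothetical stand-in for a status entry with an
// optional AdminAccess field. nil means the field was never set.
type allocationResult struct {
	AdminAccess *bool
}

// adminAccessGranted treats an unset field the same as false ("unset = off").
func adminAccessGranted(r allocationResult) bool {
	return r.AdminAccess != nil && *r.AdminAccess
}

func main() {
	yes := true
	fmt.Println(adminAccessGranted(allocationResult{}))                  // prints false
	fmt.Println(adminAccessGranted(allocationResult{AdminAccess: &yes})) // prints true
}
```

Under this rule, a claim allocated before the status field existed reads as "no admin access", which is the theoretical upgrade problem the message describes.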
Patrick Ohly
9a7e4ccab2 DRA admin access: add feature gate
The new DRAAdminAccess feature gate has the following effects:
- If disabled in the apiserver, the spec.devices.requests[*].adminAccess
  field gets cleared. Same in the status. In both cases the scenario
  that it was already set and a claim or claim template get updated
  is special: in those cases, the field is not cleared.

  Also, allocating a claim with admin access is allowed regardless of the
  feature gate and the field is not cleared. In practice, the scheduler
  will not do that.
- If disabled in the resource claim controller, creating ResourceClaims
  with the field set gets rejected. This prevents running workloads
  which depend on admin access.
- If disabled in the scheduler, claims with admin access don't get
  allocated. The effect is the same.

The alternative would have been to ignore the fields in claim controller and
scheduler. This is bad because a monitoring workload then runs, blocking
resources that probably were meant for production workloads.
2024-10-29 09:50:11 +01:00
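The apiserver behavior described above (clear the field when the gate is off, unless the stored object already had it set) can be sketched as follows. All names here (`request`, `adminAccessInUse`, `dropDisabledAdminAccess`) are hypothetical and simplified; this is not the code from the commit.

```go
package main

import "fmt"

// request is a hypothetical, simplified device request with an optional field.
type request struct {
	AdminAccess *bool
}

// adminAccessInUse reports whether the previously stored object had the field set.
func adminAccessInUse(old *request) bool {
	return old != nil && old.AdminAccess != nil
}

// dropDisabledAdminAccess clears the field on create/update when the gate is
// disabled, except when the old object already used it (the update case).
func dropDisabledAdminAccess(gateEnabled bool, newReq, oldReq *request) {
	if gateEnabled || adminAccessInUse(oldReq) {
		return
	}
	newReq.AdminAccess = nil
}

func main() {
	yes := true

	created := &request{AdminAccess: &yes}
	dropDisabledAdminAccess(false, created, nil)
	fmt.Println(created.AdminAccess == nil) // prints true: cleared on create

	updated := &request{AdminAccess: &yes}
	dropDisabledAdminAccess(false, updated, &request{AdminAccess: &yes})
	fmt.Println(updated.AdminAccess == nil) // prints false: kept on update
}
```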
Kubernetes Prow Robot
5f594f4215 Merge pull request #128401 from tenzen-y/use-same-receiver-name
Job: Consistently use the same receiver name in the controller
2024-10-29 08:16:55 +00:00
Yuki Iwai
d4959d8d29 Job: Consistently use the same receiver name in the controller
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-10-29 14:11:10 +09:00
Yuki Iwai
a23e7a42d3 Job: Refactor uncountedTerminatedPods to avoid casting everywhere
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-10-29 13:12:35 +09:00
Kubernetes Prow Robot
8aae9aabf3 Merge pull request #127661 from pohly/dra-resourceclaim-metrics
DRA resourceclaims: maintain metric of total and allocated claims
2024-10-28 21:12:53 +00:00
Patrick Ohly
9d1b0654e0 DRA: add wg/device-management label automatically
This makes PRs show up automatically in the WG's project
board (https://github.com/orgs/kubernetes/projects/95/views/1).
2024-10-28 16:36:04 +01:00
Michal Wozniak
cad648035a Job Pod Failure policy - cover testing of negative exit codes
2024-10-28 07:24:26 +01:00
Kubernetes Prow Robot
7b7a7968d4 Merge pull request #125314 from enj/enj/i/proto_for_core
Use protobuf for core clients
2024-10-24 18:20:54 +01:00
Adrian Moisey
4d2f3ed8e6 Ensure that a node's CIDR isn't released until the node is deleted
Fixes https://github.com/kubernetes/kubernetes/issues/127792

Fixes bug where a node's PodCIDR was released when the node was given a
delete time stamp, but was hanging around due to a finalizer.
2024-10-24 13:19:34 +02:00
Kubernetes Prow Robot
aa8f2878a5 Merge pull request #117943 from lowang-bh/lessFunCall
improve: reduce function calling number
2024-10-24 04:52:52 +01:00
Aldo Culquicondor
5fab6175b7 Remove alculquicondor from job approvers
Change-Id: I2b1514ff70108602a589522cbb63dcdc88849313
2024-10-23 17:58:55 +00:00
Monis Khan
6595fa4026 Fix tests that assume core clients use JSON
Signed-off-by: Monis Khan <mok@microsoft.com>
2024-10-23 11:35:30 -04:00
Kubernetes Prow Robot
ea379082fb Merge pull request #126322 from lance5890/ds_update_typo
typo: update the daemon update typo
2024-10-23 01:18:06 +01:00
Kubernetes Prow Robot
6edbee19b9 Merge pull request #123152 from tnqn/fix-error-log
Fix internal error when serializing groupLookupFailures in log
2024-10-23 01:17:29 +01:00
Kubernetes Prow Robot
71523a7db6 Merge pull request #122644 from gyuho/logs-removing-taints
chores(controller/nodelifecycle): make node taint removal logs more a…
2024-10-23 01:17:15 +01:00
Kubernetes Prow Robot
3e66160f30 Merge pull request #107362 from shawnhanx/controller_redundant
remove redundant return statement in attachdetach/util/util.go
2024-10-23 01:16:53 +01:00
Patrick Ohly
c2524cbf9b DRA resourceclaims: maintain metric of total and allocated claims
These metrics can provide insights into ResourceClaim usage. The total count is
redundant because the apiserver also provides count of resources, but having it
in the same sub-system next to the count of allocated claims might be more
discoverable and helps monitor the controller itself.
2024-10-18 09:13:42 +02:00
Kubernetes Prow Robot
b1b4e5d397 Merge pull request #128003 from pohly/dra-classic-dra-removal
DRA: remove "classic DRA"
2024-10-18 00:55:17 +01:00
Kubernetes Prow Robot
b7d1766c18 Merge pull request #128158 from pohly/dra-controller-logging
DRA resource claim controller: improve log messages
2024-10-17 20:31:11 +01:00
Kubernetes Prow Robot
51f76febd7 Merge pull request #127402 from mimowo/managed-by-beta-update
Graduate JobManagedBy to Beta in 1.32
2024-10-17 19:27:14 +01:00
Patrick Ohly
d572df2493 DRA resource claim controller: improve log messages
Some code paths didn't log anything. One log message about "claim got deleted"
was incorrect.
2024-10-17 18:28:55 +02:00
Kubernetes Prow Robot
1f9038a468 Merge pull request #127919 from carlory/fix-127852
Fix data race in kubelet/volumemanager
2024-10-17 14:57:03 +01:00
Michal Wozniak
70a8ceb6f0 Graduate JobManagedBy to Beta in 1.32
# Conflicts:
#	pkg/features/kube_features.go
2024-10-17 09:01:54 +02:00
Patrick Ohly
f84eb5ecf8 DRA: remove "classic DRA"
This removes the DRAControlPlaneController feature gate, the fields controlled
by it (claim.spec.controller, claim.status.deallocationRequested,
claim.status.allocation.controller, class.spec.suitableNodes), the
PodSchedulingContext type, and all code related to the feature.

The feature gets removed because there is no path towards beta and GA and DRA
with "structured parameters" should be able to replace it.
2024-10-16 23:09:50 +02:00
carlory
4558dc1432 node-lifecycle-controller: improve processPod test-coverage
2024-10-10 13:52:10 +08:00
Kubernetes Prow Robot
a1df68a31f Merge pull request #125118 from jsoref/from-to
Order ScalingReplicaSet message from->to
2024-10-09 08:36:22 +01:00
Josh Soref
502c05ed01 chore: Order ScalingReplicaSet message from->to
* change format of event
* include `from 0` for new replicas
* update describe tests to reflect current output

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2024-10-08 09:59:29 -04:00
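A rough sketch of a from->to ordered scaling message, including `from 0` for a newly created replica set. The exact wording of the real event is an assumption here, and `scalingMessage` is a hypothetical helper, not the function changed by this commit.

```go
package main

import "fmt"

// scalingMessage builds a hypothetical event message that always lists the
// old replica count before the new one.
func scalingMessage(rsName string, from, to int32) string {
	direction := "up"
	if to < from {
		direction = "down"
	}
	return fmt.Sprintf("Scaled %s replica set %s from %d to %d", direction, rsName, from, to)
}

func main() {
	fmt.Println(scalingMessage("web-abc123", 0, 3)) // new replica set: "from 0"
	fmt.Println(scalingMessage("web-abc123", 3, 1))
}
```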