* Remove Controller.recorder field, there already is eventRecorder.
* Start the event broadcaster in Run(), to save a bit of CPU and memory
when something initializes the controller, but does not Run() it.
* Log events with log level 3, as the other contollers usually do.
* Use StartStructuredLogging(), which looks fancier than StartLogging
Rename old CreateVolumeSpec to CreateVolumeSpecWithNodeMigration that
extracts volume.Spec with node specific CSI migration.
Add CreateVolumeSpec that does the same, only without evaluating node CSI
migration.
* better name variables in deployment_util
* add tests for getReplicaSetFraction in the deployment controller
- make validation more robust and make sure we do not divide by 0
* lock feature gate for PodIndexLabel and mark it GA
Signed-off-by: Alay Patel <alayp@nvidia.com>
* add emulated version if testing disabling of PodIndexLabel FG
Signed-off-by: Alay Patel <alayp@nvidia.com>
---------
Signed-off-by: Alay Patel <alayp@nvidia.com>
The `min` and `max` builtins are available since Go 1.21[^1]. The
top-level go.mod file is specifying Go 1.22, so the builtins can be
used.
[^1]: https://go.dev/doc/go1.21#language
Using the "normal" logic for a feature gated field simplifies the
implementation of the feature gate.
There is one (entirely theoretic!) problem with updating from 1.31: if a claim
was allocated in 1.31 with admin access, the status field was not set because
it didn't exist yet. If a driver now follows the current definition of "unset =
off", then it will not grant admin access even though it should. This is
theoretic because drivers are starting to support admin access with 1.32, so
there shouldn't be any claim where this problem could occur.
The new DRAAdminAccess feature gate has the following effects:
- If disabled in the apiserver, the spec.devices.requests[*].adminAccess
field gets cleared. Same in the status. In both cases the scenario
that it was already set and a claim or claim template get updated
is special: in those cases, the field is not cleared.
Also, allocating a claim with admin access is allowed regardless of the
feature gate and the field is not cleared. In practice, the scheduler
will not do that.
- If disabled in the resource claim controller, creating ResourceClaims
with the field set gets rejected. This prevents running workloads
which depend on admin access.
- If disabled in the scheduler, claims with admin access don't get
allocated. The effect is the same.
The alternative would have been to ignore the fields in claim controller and
scheduler. This is bad because a monitoring workload then runs, blocking
resources that probably were meant for production workloads.
These metrics can provide insights into ResourceClaim usage. The total count is
redundant because the apiserver also provides count of resources, but having it
in the same sub-system next to the count of allocated claims might be more
discoverable and helps monitor the controller itself.
This removes the DRAControlPlaneController feature gate, the fields controlled
by it (claim.spec.controller, claim.status.deallocationRequested,
claim.status.allocation.controller, class.spec.suitableNodes), the
PodSchedulingContext type, and all code related to the feature.
The feature gets removed because there is no path towards beta and GA and DRA
with "structured parameters" should be able to replace it.
* change format of event
* include `from 0` for new replicas
* update describe tests to reflect current output
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>