Commit Graph

22 Commits

Author SHA1 Message Date
Ed Bartosh
e70a2ad828 Kubelet: DRA: fix testify errors 2024-09-09 22:18:07 +03:00
Ed Bartosh
e1bc8defac kubelet: Migrate DRA Manager to contextual logging
Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>
2024-08-22 11:12:41 +03:00
Patrick Ohly
877829aeaa DRA kubelet: adapt to v1alpha3 API
This adds the ability to select specific requests inside a claim for a
container.

NodePrepareResources is always called, even if the claim is not used by any
container. This could be useful for drivers where that call has some effect
other than injecting CDI device IDs into containers. It also ensures that
drivers can validate configs.

The pod resource API can no longer report a class for each claim because there
is no such 1:1 relationship anymore. Instead, that API reports claim,
API devices (with driver/pool/device as ID) and CDI device IDs. The kubelet
itself doesn't extract that information from the claim. Instead, it relies on
drivers to report this information when the claim gets prepared. This isolates
the kubelet from API changes.

Because of a faulty E2E test, kubelet was told to contact the wrong driver for
a claim. This was not visible in the kubelet log output. Now changes to the
claim info cache are getting logged. While at it, naming of variables and some
existing log output gets harmonized.

Co-authored-by: Oksana Baranova <oksana.baranova@intel.com>
Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>
2024-07-22 18:09:34 +02:00
Patrick Ohly
b51d68bb87 DRA: bump API v1alpha2 -> v1alpha3
This is in preparation for revamping the resource.k8s.io completely. Because
there will be no support for transitioning from v1alpha2 to v1alpha3, the
roundtrip test data for that API in 1.29 and 1.30 gets removed.

Repeating the version in the import name of the API packages is not really
required. It was done for a while to support simpler grepping for usage of
alpha APIs, but there are better ways for that now. So during this transition,
"resourceapi" gets used instead of "resourcev1alpha3" and the version gets
dropped from informer and lister imports. The advantage is that the next bump
to v1beta1 will affect fewer source code lines.

Only source code where the version really matters (like API registration)
retains the versioned import.
2024-07-21 17:28:13 +02:00
Kubernetes Prow Robot
f2428d66cc Merge pull request #125163 from pohly/dra-kubelet-api-version-independent-no-rest-proxy
DRA: make kubelet independent of the resource.k8s.io API version
2024-07-18 17:47:48 -07:00
Patrick Ohly
7701a48bd6 dra kubelet: bump gRPC API to v1alpha4
The previous changes are an API break, therefore we need a new version.
2024-07-18 23:30:09 +02:00
Matthieu MOREL
f014b754fb fix: enable empty and len rules from testifylint on pkg package
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>

Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>
2024-07-06 23:15:43 +00:00
Patrick Ohly
bde9b64cdf DRA: remove "source" indirection from v1 Pod API
This makes the API nicer:

    resourceClaims:
    - name: with-template
      resourceClaimTemplateName: test-inline-claim-template
    - name: with-claim
      resourceClaimName: test-shared-claim

Previously, this was:

    resourceClaims:
    - name: with-template
      source:
        resourceClaimTemplateName: test-inline-claim-template
    - name: with-claim
      source:
        resourceClaimName: test-shared-claim

A more long-term benefit is that other, future alternatives
might not make sense under the "source" umbrella.

This is a breaking change. It's justified because DRA is still
alpha and will have several other API breaks in 1.31.
2024-06-27 17:53:24 +02:00
Ed Bartosh
6ce294558a kubelet: DRA: add stress test
The tests calls PrepareResources and UnprepareResources API in
parallel to help discover race conditions.
2024-05-03 13:30:29 +00:00
Kevin Klues
86a18d5333 kubelet: DRA: update manager test to adhere to new claiminfo cache APIs
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2024-05-03 13:28:37 +00:00
Ayato Tokubi
d04f87abde add nil check for Node(Un)PrepareResources.
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2024-04-04 23:24:25 +00:00
HirazawaUi
10b6319e64 fix slow dra unit test 2024-03-16 22:21:15 +08:00
Ed Bartosh
26881132bd kubelet: assign Node as an owner for the ResourceSlice
Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>
2024-03-15 09:46:13 +02:00
Patrick Ohly
d59676a545 dra kubelet: publish NodeResourceSlices
The information is received from the DRA driver plugin through a new gRPC
streaming interface. This is backwards compatible with old DRA driver kubelet
plugins, their gRPC server will return "not implemented" and that can be
handled by kubelet. Therefore no API break is needed.

However, DRA drivers need to be updated because the Go API changed. They can
return
    status.New(codes.Unimplemented, "no node resource support").Err()
if they don't support the new ListAndWatchResources method and
structured parameters.

The controller in kubelet then synchronizes this information from the driver
with NodeResourceSlice objects, creating, updating and deleting them as needed.
2024-03-07 22:22:13 +01:00
TommyStarK
6f021e99cf dra: increase timeout in setupFakeDRADriverGRPCServer to prevent tests to flake.
Signed-off-by: TommyStarK <thomasmilox@gmail.com>
2024-01-11 09:20:04 +01:00
charles-chenzz
abaf7a800d increase timeout in fakeDraDriverGrpcServer to fix flake in dra/manger_test 2023-11-07 19:38:27 +08:00
adrianc
3738111337 Add unit tests
adjust existing tests and add new test flows
to cover new DRA manager behaviour

Signed-off-by: adrianc <adrianc@nvidia.com>
2023-10-25 13:20:22 +03:00
Kubernetes Prow Robot
82bca6304b Merge pull request #119464 from TommyStarK/dra/cleanup-manager-unit-tests
dra: cleanup manager unit tests
2023-09-18 07:08:43 -07:00
charles-chenzz
ba9ce3ab08 fix flaky test on dra TestPrepareResources/should_timeout
Co-authored-by: TommyStarK <thomasmilox@gmail.com>
2023-08-03 22:37:54 +08:00
TommyStarK
391c1a3ecc dra: cleanup manager unit tests
Signed-off-by: TommyStarK <thomasmilox@gmail.com>
2023-08-02 23:35:45 +02:00
Ed Bartosh
0ec99fb0b2 Kubelet DRA: fix failing test cases 2023-07-18 19:06:33 +03:00
charles-chenzz
0372e4b662 add unit test for dra/manager.go.
Co-Authored-By: charles-chenzz <Rekles666@gmail.com>
Signed-off-by: TommyStarK <thomasmilox@gmail.com>
2023-07-18 12:14:27 +02:00