Commit Graph

21400 Commits

Author SHA1 Message Date
Mike Spreitzer
77541c1e35 Relax noise margin in TestOneWeightedHistogram
Signed-off-by: Mike Spreitzer <mspreitz@us.ibm.com>
2024-07-24 17:45:12 -04:00
devppratik
f8bf6b97b8 Update Node Monitor Grace Period default duration to 50s
Update description

Improve flag comment

Update Test case value to be 50s by default

Update Description

Run make update

Minor description fix
2024-07-24 22:54:44 +05:30
Jefftree
919e7abe0f update codegen and openapi 2024-07-24 14:41:13 +00:00
Jefftree
0c774d0b1f Change PingTime to be persistent 2024-07-24 14:41:13 +00:00
Jefftree
e1ea24a171 fix ordering issue in candidates 2024-07-24 14:38:13 +00:00
Jefftree
42678f1553 regen clients 2024-07-24 14:38:12 +00:00
Jefftree
fac7581640 feedback: leasecandidate clients 2024-07-24 14:38:12 +00:00
Dr. Stefan Schimanski
68226b0501 Review feedback
Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>
2024-07-24 14:38:12 +00:00
Jefftree
c47ff1e1a9 CLE controller and client changes 2024-07-24 14:38:11 +00:00
Jefftree
9b16b0dc97 CLE feature gate 2024-07-24 14:38:11 +00:00
Jefftree
3999b98c88 Coordinated Leader Election Alpha API 2024-07-24 14:38:10 +00:00
Kubernetes Prow Robot
5af1710d90 Merge pull request #126243 from SergeyKanzhelev/devicePluginFailures
Implement resource health in pod status (KEP 4680)
2024-07-23 20:12:24 -07:00
Kubernetes Prow Robot
49ff255074 Merge pull request #126308 from cici37/hotFix
Update with stdlib errors
2024-07-23 18:02:07 -07:00
Sergey Kanzhelev
2253b53b58 generated files 2024-07-24 00:29:35 +00:00
Sergey Kanzhelev
16e8911fdc add AllocatedResourcesStatus field to ContainerStatus 2024-07-24 00:29:34 +00:00
Cici Huang
a48a92c72e Allowing direct CEL reserved keyword usage in CRD (#126188)
* automatically escape reserved keywords for direct usage

* Add reserved keyword support in a ratcheting way, add tests.

---------

Co-authored-by: Wenxue Zhao <ballista01@outlook.com>
2024-07-23 15:45:20 -07:00
Anish Ramasekar
f80c73248f Validate structured authn feature is enabled for discovery url/multiple
audiences

Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2024-07-23 15:12:03 -07:00
Kubernetes Prow Robot
f93fe412c7 Merge pull request #126281 from saschagrunert/oci-volume-docs
[KEP-4639] Mention that `fsGroupChangePolicy` has no effect
2024-07-23 14:40:14 -07:00
cici37
ac2c450da7 Update with stdlib errors 2024-07-23 21:16:53 +00:00
Kubernetes Prow Robot
c2fdeca4ab Merge pull request #126145 from carlory/kep-3751-api
[KEP-3751] Promote VolumeAttributesClass to beta
2024-07-23 13:31:05 -07:00
Kubernetes Prow Robot
107f621462 Merge pull request #126108 from gnufied/changes-volume-recovery
Reduce state changes when expansion fails and mark certain failures as infeasible
2024-07-23 13:30:56 -07:00
Kubernetes Prow Robot
c01bc31fa2 Merge pull request #126163 from haircommander/procMount-baseline
PSA: allow procMount type Unmasked in baseline
2024-07-23 12:21:20 -07:00
Kubernetes Prow Robot
04d2f33641 Merge pull request #124061 from Jefftree/conversion-webhook-invalidca
Validate CABundle when writing CRD
2024-07-23 12:20:53 -07:00
Kubernetes Prow Robot
05bb5f71f8 Merge pull request #120611 from pohly/dra-resource-quotas
DRA: resource quotas
2024-07-23 12:20:44 -07:00
Kubernetes Prow Robot
a00181d4d4 Merge pull request #121902 from carlory/kep-3751-pv-controller
[kep-3751] pvc bind pv with vac
2024-07-23 11:02:13 -07:00
Patrick Ohly
299ecde5cc DRA quota: add ResourceClaim v1.ResourceQuota limits
Dynamic resource allocation is similar to storage in the sense that users
create ResourceClaim objects to request resources, same as with persistent
volume claims. The actual resource usage is only known when allocating claims,
but some limits can already be enforced at admission time:

- "count/resourceclaims.resource.k8s.io" limits the number of ResourceClaim objects in
  a namespace; this is a generic feature that is already supported also without
  this commit.

- "resourceclaims" is *not* an alias - use "count/resourceclaims.resource.k8s.io"
  instead.

- <device-class-name>.deviceclass.resource.k8s.io/devices limits the number of
  ResourceClaim objects in a namespace such that the number of devices
  requested through those objects with that class does not exceed the limit.

A single request may cause the allocation of multiple devices. For exact
counts, the quota limit is based on the sum of those exact counts. For requests
asking for "all" matching devices, the maximum number of allocated devices per
claim is used as a worst-case upper bound.

Requests asking for "admin access" contribute to the quota.

DRA quota: remove admin mode exception
2024-07-23 18:52:34 +02:00
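The device-count accounting described in commit 299ecde5cc might look roughly like the Go sketch below. It is an illustration only, not the code from the commit: the `request` type, the `deviceUsage` function, and the `maxDevicesPerClaim` value are hypothetical stand-ins, and the real limit is whatever the resource.k8s.io API defines. Requests with exact counts are summed, and a request for "all" matching devices is charged the per-claim maximum as a worst-case upper bound. Admin-access requests get no special treatment, so they count like any other request.

```go
package main

import "fmt"

// request is a hypothetical, simplified stand-in for one device request
// inside a ResourceClaim. It is not the real resource.k8s.io API type.
type request struct {
	deviceClass string
	all         bool  // true when the request asks for "all" matching devices
	count       int64 // exact number of devices; ignored when all is true
}

// maxDevicesPerClaim is an assumed worst-case bound for "all" requests.
// The value 32 is a placeholder for this sketch.
const maxDevicesPerClaim = 32

// deviceUsage returns how many devices of the given class the requests
// would be charged against a "<class>.deviceclass.resource.k8s.io/devices" quota.
func deviceUsage(requests []request, class string) int64 {
	var total int64
	for _, r := range requests {
		if r.deviceClass != class {
			continue
		}
		if r.all {
			// Actual usage is unknown until allocation, so charge the maximum.
			total += maxDevicesPerClaim
		} else {
			total += r.count
		}
	}
	return total
}

func main() {
	claim := []request{
		{deviceClass: "gpu", count: 2},
		{deviceClass: "gpu", all: true},
		{deviceClass: "nic", count: 1},
	}
	fmt.Println(deviceUsage(claim, "gpu")) // 2 exact + 32 worst case
}
```

An admission check could then compare this usage, summed over all claims in the namespace, with the configured quota limit.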
Kubernetes Prow Robot
8e175c688e Merge pull request #126165 from haircommander/selinux-engine_t
PSA: allow container_engine_t selinux type
2024-07-23 09:21:20 -07:00
Kubernetes Prow Robot
fbdfb9d8d9 Merge pull request #126031 from harche/kubelet_cgroupv1_arg
KEP-4569: Kubelet option to disable cgroup v1 support
2024-07-23 09:21:11 -07:00
Peter Hunt
7e750a62a1 PSA: small cleanups for tests that use RelaxPolicyForUserNamespacePods
make sure to cleanup after setting RelaxPolicyForUserNamespacePods
setup test variables to be a little more terse and similar between tests
cleanup Allowed checking

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-07-23 12:01:06 -04:00
Peter Hunt
17521f04a4 PSA: allow procMount type Unmasked in baseline
A masked proc mount has traditionally been used to prevent untrusted containers from accessing leaky kernel APIs.
However, within a user namespace, typical ID checks protect better than masked proc. Further, allowing unmasked proc
with a user namespace gives access to a container mounting sub procs, which opens avenues for container-in-container use cases.

Update PSS for baseline to allow a container to access an unmasked /proc, if it's in a user namespace and if the UserNamespacesPodSecurityStandards feature is enabled.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-07-23 12:01:06 -04:00
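The baseline rule from commit 17521f04a4 can be sketched in Go as follows. This is a simplified illustration, not the Pod Security admission code: `podSpec` and `baselineAllowsProcMount` are hypothetical names, and the feature gate is passed in as a plain boolean. An Unmasked /proc is accepted only when the feature is enabled and the pod runs in its own user namespace, that is, when `hostUsers` is explicitly false.

```go
package main

import "fmt"

// podSpec is a hypothetical, trimmed-down view of a pod. The real check
// operates on the Kubernetes Pod API types.
type podSpec struct {
	hostUsers  *bool    // nil or true: the pod shares the host user namespace
	procMounts []string // procMount type per container; "" means unset
}

// baselineAllowsProcMount reports whether every container's procMount type
// is acceptable at the baseline level under the relaxed rule.
func baselineAllowsProcMount(p podSpec, featureEnabled bool) bool {
	inUserNS := featureEnabled && p.hostUsers != nil && !*p.hostUsers
	for _, pm := range p.procMounts {
		switch pm {
		case "", "Default":
			// A masked /proc is always allowed.
		case "Unmasked":
			if !inUserNS {
				return false
			}
		default:
			return false
		}
	}
	return true
}

func main() {
	no := false
	pod := podSpec{hostUsers: &no, procMounts: []string{"Unmasked"}}
	fmt.Println(baselineAllowsProcMount(pod, true))  // feature enabled
	fmt.Println(baselineAllowsProcMount(pod, false)) // feature disabled
}
```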
Kubernetes Prow Robot
fc03f3e74c Merge pull request #126125 from mprahl/stop-idempotent
Allow calling Stop multiple times on RetryWatcher
2024-07-23 08:16:24 -07:00
Kubernetes Prow Robot
1854839ff0 Merge pull request #126067 from tenzen-y/implement-job-success-policy-e2e
Graduate the JobSuccessPolicy to Beta
2024-07-23 06:14:23 -07:00
Kubernetes Prow Robot
bb350f7111 Merge pull request #125661 from mjudeikis/mjudeikis/poststarthookctx.stopch.cleanup
Clean deprecated context.StopCh
2024-07-23 02:12:22 -07:00
Sascha Grunert
479a7c34fe ImageVolumeSource: mention that fsGroupChangePolicy has no effect
A small documentation follow-up based on the review:
https://github.com/kubernetes/kubernetes/pull/125660#discussion_r1686859866

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-07-23 10:15:18 +02:00
carlory
3a6a4830df pvc bind pv with vac 2024-07-23 15:04:11 +08:00
carlory
0260c7d023 Promote VolumeAttributesClass to beta 2024-07-23 13:58:14 +08:00
Cici Huang
5420b2fe9a Hot fix for panic on schema conversion. (#126167) 2024-07-22 19:43:45 -07:00
Yuki Iwai
551931c6a8 Graduate the JobSuccessPolicy to beta
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-07-23 09:29:06 +09:00
Kubernetes Prow Robot
04cc0a1034 Merge pull request #126187 from seans3/portforward-websockets-metrics
Adds metrics to PortForward Websockets
2024-07-22 16:53:25 -07:00
Kubernetes Prow Robot
f753a444a5 Merge pull request #126091 from seans3/ws-err-extra-info
Adds extra error information from response to bad handshake error when possible
2024-07-22 16:53:16 -07:00
Kubernetes Prow Robot
6e52e705d0 Merge pull request #125374 from pwschuurman/kep-3335-stable
Promote StatefulSetStartOrdinal to stable in 1.31
2024-07-22 14:25:49 -07:00
Sean Sullivan
f387f0b69a Adds extra error information from response to bad handshake error when possible 2024-07-22 14:12:01 -07:00
Sean Sullivan
90d70ed73d Adds metrics to PortForward Websockets 2024-07-22 14:08:42 -07:00
Kubernetes Prow Robot
d21b17264e Merge pull request #125488 from pohly/dra-1.31
DRA for 1.31
2024-07-22 11:45:55 -07:00
Kubernetes Prow Robot
887def08b6 Merge pull request #126237 from cici37/promoteMetrics
Promote metrics for VAP and CRD validation rules to beta.
2024-07-22 10:17:49 -07:00
Kubernetes Prow Robot
0caeba5cbe Merge pull request #126204 from vrutkovs/unsafeRecordQueried-atomicPointer
feature_gate: avoid extra copy when queried feature is already stored, use Set instead of map
2024-07-22 09:09:42 -07:00
Patrick Ohly
d11b58efe6 DRA kubelet: refactor gRPC call timeouts
Some of the E2E node tests were flaky. Their timeout apparently was chosen
under the assumption that kubelet would retry immediately after a failed gRPC
call, with a factor of 2 as safety margin. But according to
0449cef8fd,
kubelet has a different, higher retry period of 90 seconds, which was exactly
the test timeout. The test timeout has to be higher than that.

As the tests don't use the gRPC call timeout anymore, it can be made
private. While at it, the name and documentation get updated.
2024-07-22 18:09:34 +02:00
Patrick Ohly
599fe605f9 DRA scheduler: adapt to v1alpha3 API
The structured parameter allocation logic was written from scratch in
staging/src/k8s.io/dynamic-resource-allocation/structured where it might be
useful for out-of-tree components.

Besides the new features (amount, admin access) and API it now supports
backtracking when the initial device selection doesn't lead to a complete
allocation of all claims.

Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>
Co-authored-by: John Belamaric <jbelamaric@google.com>
2024-07-22 18:09:34 +02:00
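The backtracking mentioned in commit 599fe605f9 can be illustrated with a small Go sketch. It is a toy model under stated assumptions, not the allocator in staging/src/k8s.io/dynamic-resource-allocation/structured: the `allocate` function and its input shape are hypothetical, and each claim here needs exactly one device. When a later claim cannot be satisfied, the search undoes an earlier device choice and tries the next candidate.

```go
package main

import "fmt"

// allocate tries to give each claim one distinct device.
// candidates[i] lists the devices acceptable for claim i (hypothetical input).
// It returns the chosen device per claim, or false if no complete
// allocation exists.
func allocate(candidates [][]string) ([]string, bool) {
	result := make([]string, len(candidates))
	used := map[string]bool{}

	var try func(i int) bool
	try = func(i int) bool {
		if i == len(candidates) {
			return true // every claim has a device
		}
		for _, dev := range candidates[i] {
			if used[dev] {
				continue
			}
			used[dev] = true
			result[i] = dev
			if try(i + 1) {
				return true
			}
			// The remaining claims could not be satisfied: undo and try the next device.
			used[dev] = false
		}
		return false
	}

	if !try(0) {
		return nil, false
	}
	return result, true
}

func main() {
	// Picking "a" for the first claim leaves nothing for the second,
	// so the search has to back up and pick "b" instead.
	got, ok := allocate([][]string{{"a", "b"}, {"a"}})
	fmt.Println(got, ok)
}
```

A purely greedy pass would fail on this input, which is the situation the commit message describes as the initial device selection not leading to a complete allocation.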
Patrick Ohly
877829aeaa DRA kubelet: adapt to v1alpha3 API
This adds the ability to select specific requests inside a claim for a
container.

NodePrepareResources is always called, even if the claim is not used by any
container. This could be useful for drivers where that call has some effect
other than injecting CDI device IDs into containers. It also ensures that
drivers can validate configs.

The pod resource API can no longer report a class for each claim because there
is no such 1:1 relationship anymore. Instead, that API reports claim,
API devices (with driver/pool/device as ID) and CDI device IDs. The kubelet
itself doesn't extract that information from the claim. Instead, it relies on
drivers to report this information when the claim gets prepared. This isolates
the kubelet from API changes.

Because of a faulty E2E test, kubelet was told to contact the wrong driver for
a claim. This was not visible in the kubelet log output. Now changes to the
claim info cache are getting logged. While at it, naming of variables and some
existing log output get harmonized.

Co-authored-by: Oksana Baranova <oksana.baranova@intel.com>
Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>
2024-07-22 18:09:34 +02:00
Patrick Ohly
20f98f3a2f DRA: update helper packages
Publishing ResourceSlices now supports network-attached devices and the new
v1alpha3 API.  The logic for splitting up across different slices is missing.
2024-07-22 18:09:34 +02:00