By default, check the KCM and scheduler on 127.0.0.1:<port>, as that is the
default --bind-address kubeadm uses for these components.
For kube-apiserver, take the value from APIEndpoint.AdvertiseAddress, which is
dynamically detected from the host, unless the user has explicitly passed
--advertise-address as an extra arg.
Read the <port> values for all components from the --secure-port flag
value if it is set. Otherwise use the defaults.
Use /livez for apiserver and scheduler. Add TODO for KCM to
switch to /livez as well.
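A minimal sketch of the resulting probe matrix, assuming the stock secure ports (6443, 10259, 10257) and a placeholder advertise address; the probe helper is illustrative, not the kubeadm implementation:
```
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"time"
)

// probe performs a single HTTPS GET and reports a non-200 status as an error.
func probe(addr string, port int, path string) error {
	client := &http.Client{
		Timeout: 5 * time.Second,
		// The components serve self-signed certs; skip verification in this sketch.
		Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}},
	}
	url := fmt.Sprintf("https://%s:%d%s", addr, port, path)
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("%s returned status %d", url, resp.StatusCode)
	}
	return nil
}

func main() {
	checks := []struct {
		addr string
		port int
		path string
	}{
		{"192.168.0.10", 6443, "/livez"}, // kube-apiserver: AdvertiseAddress (placeholder value)
		{"127.0.0.1", 10259, "/livez"},   // kube-scheduler: default --bind-address
		{"127.0.0.1", 10257, "/healthz"}, // kube-controller-manager: /healthz until it moves to /livez
	}
	for _, c := range checks {
		fmt.Printf("%s:%d%s -> %v\n", c.addr, c.port, c.path, probe(c.addr, c.port, c.path))
	}
}
```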
Make the following changes:
- When dry-running, if the given kubeconfig does not exist,
create a DryRun object without a real client. This means only
a fake client will be used for all actions.
- Skip the preflight check for existing manifests during dry-run
and print "would ..." messages instead.
- Add new reactors that handle objects during upgrade.
- Add unit tests for new reactors.
- Print a message during "upgrade node" that this is not a CP node
if the apiserver manifest is missing.
- Add a new function GetNodeName() that uses 3 different methods
for fetching the node name. This solves a long-standing issue where
we only used the cert in kubelet.conf for determining the node name.
(A sketch follows this list.)
- Various other minor fixes.
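The 3 methods are not spelled out here; below is a purely hypothetical sketch of the ordered-fallback shape such a GetNodeName() could take. The concrete sources are illustrative assumptions, not the actual kubeadm logic:
```
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// nodeNameSource is one lookup method; it fails with an error or an empty name.
type nodeNameSource func() (string, error)

// GetNodeName tries each source in order and returns the first hit.
func GetNodeName(sources ...nodeNameSource) (string, error) {
	var errs []error
	for _, src := range sources {
		name, err := src()
		if err == nil && name != "" {
			return name, nil
		}
		errs = append(errs, err)
	}
	return "", fmt.Errorf("could not determine node name: %w", errors.Join(errs...))
}

func main() {
	name, err := GetNodeName(
		// 1. hypothetical: an explicit flag or config override
		func() (string, error) { return "", errors.New("no override set") },
		// 2. hypothetical: the client certificate CN in kubelet.conf
		//    (previously the only method used)
		func() (string, error) { return "", errors.New("kubelet.conf not readable") },
		// 3. hypothetical: fall back to the lowercased OS hostname
		func() (string, error) { h, err := os.Hostname(); return strings.ToLower(h), err },
	)
	fmt.Println(name, err)
}
```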
The current dryrun client implementation is suboptimal
and sparse. It has the following problems:
- When an object CREATE or UPDATE reaches the default dryrun client
the operation is a NO-OP, which means subsequent GET calls must
fully emulate the object that exists in the store.
- There are multiple implementations of a DryRunGetter interface,
such as the one in init_dryrun.go, but there are no implementations
for reset, upgrade, and join.
- There is a specific DryRunGetter backed by a real
client in clientbacked_dryrun.go, but this is used for upgrade
and does not work in conjunction with a fake client.
This commit does the following changes:
- Removes all existing *dryrun*.go implementations.
- Adds a new DryRun implementation in dryrun.go that implements
3 clients - fake clientset, real clientset, and real dynamic client.
- The DryRun object uses the method chaining pattern (sketched after this list).
- Allows the user to opt in to real clients only if needed, by passing
a real kubeconfig. By default only a fake client is constructed.
- The default reactor chain for the fake client always logs the
object action, and then for GET or LIST actions attempts to use the
real dynamic client to get the object. If a real object does not
exist, it attempts to get the object from the fake object store.
- The user can prepend or append reactors to the chain.
- All known needed reactors for operations during init, join,
reset, upgrade are added as methods of the DryRun struct.
- Adds detailed unit tests for the DryRun struct and its methods,
including reactors.
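A minimal sketch of that shape, built on the reactor chain of the fake clientset from k8s.io/client-go/testing; every name other than DryRun itself is an assumption:
```
package dryrun

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/kubernetes/fake"
	clienttesting "k8s.io/client-go/testing"
)

// DryRun holds a fake clientset and, optionally, real clients.
type DryRun struct {
	fakeClient    *fake.Clientset
	realClient    kubernetes.Interface // nil unless a real kubeconfig was passed
	dynamicClient dynamic.Interface    // nil unless a real kubeconfig was passed
}

// NewDryRun constructs the default, fake-only DryRun with the default reactor chain.
func NewDryRun() *DryRun {
	d := &DryRun{fakeClient: fake.NewSimpleClientset()}
	d.fakeClient.PrependReactor("*", "*", d.defaultReactor)
	return d
}

// WithDynamicClient opts in to a real dynamic client; returns d for chaining.
func (d *DryRun) WithDynamicClient(c dynamic.Interface) *DryRun {
	d.dynamicClient = c
	return d
}

// PrependReactor puts a caller-supplied reactor ahead of the default chain.
func (d *DryRun) PrependReactor(verb, resource string, fn clienttesting.ReactionFunc) *DryRun {
	d.fakeClient.PrependReactor(verb, resource, fn)
	return d
}

// defaultReactor logs every action; for GETs it consults the real cluster
// first and falls through to the fake object store when that fails.
func (d *DryRun) defaultReactor(a clienttesting.Action) (bool, runtime.Object, error) {
	fmt.Printf("[dryrun] %s %s\n", a.GetVerb(), a.GetResource().Resource)
	if d.dynamicClient == nil || a.GetVerb() != "get" {
		return false, nil, nil // unhandled: continue down the chain
	}
	if ga, ok := a.(clienttesting.GetAction); ok {
		obj, err := d.dynamicClient.Resource(a.GetResource()).
			Namespace(ga.GetNamespace()).
			Get(context.Background(), ga.GetName(), metav1.GetOptions{})
		if err == nil {
			return true, obj, nil
		}
	}
	return false, nil, nil // no real object: fall back to the fake store
}
```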
Additional changes:
- Use the new DryRun implementation in all command workflows -
init, join, reset, upgrade.
- Ensure that --dry-run works even if there is no active cluster
by returning faked objects. For join, a faked cluster-info
with a fake bootstrap token and CA is used.
During kubeadm join in 1.30, kubeadm started respecting
the KubeletConfiguration healthz address/port. Previously
it hardcoded the health check to localhost:defaultport.
A corner case was not handled where the user applies --patches
on join to modify the local KubeletConfiguration. In that case,
"kubeletconfiguration" patch target patches are not applied to
the KubeletConfiguration in memory, and the health check
runs against the address:port present in the kubelet-config
ConfigMap.
Fix that by explicitly calling a new function to patch the
KubeletConfiguration in memory. This is scoped to only handle
the healthz checks *after* the kubelet config.yaml has already
been patched and written to disk.
When doing a kubelet health check on init/join, do not
hardcode the "localhost" address. Instead, use the
KubeletConfiguration HealthzBindAddress and HealthzPort
fields.
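A sketch of the address/port resolution, assuming the v1beta1 KubeletConfiguration types and the stock kubelet defaults (127.0.0.1:10248):
```
package main

import (
	"fmt"
	"net"
	"strconv"

	kubeletconfig "k8s.io/kubelet/config/v1beta1"
)

// kubeletHealthzURL derives the health check target from the (already
// patched) KubeletConfiguration instead of a hardcoded localhost.
func kubeletHealthzURL(cfg *kubeletconfig.KubeletConfiguration) string {
	addr := cfg.HealthzBindAddress
	if addr == "" {
		addr = "127.0.0.1" // kubelet default
	}
	port := int32(10248) // kubelet default
	if cfg.HealthzPort != nil {
		port = *cfg.HealthzPort
	}
	return fmt.Sprintf("http://%s/healthz", net.JoinHostPort(addr, strconv.Itoa(int(port))))
}

func main() {
	port := int32(10248)
	cfg := &kubeletconfig.KubeletConfiguration{HealthzBindAddress: "0.0.0.0", HealthzPort: &port}
	fmt.Println(kubeletHealthzURL(cfg))
}
```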
idempotency.go (perhaps not so accurately named) contains
API calls that kubeadm makes against an API server using client-go.
Some users seem to have unstable setups where, for unknown reasons,
the API server can be unavailable or refuse to respond as expected.
Use PollUntilContextTimeout in all exported functions to ensure
such API calls are all retry-able.
NOTE: The context passed to PollUntilContextTimeout is not propagated
to the polled function. Instead, the polled function creates its own
context with 'ctx := context.Background()'. This avoids
breaking expectations on the side of callers, which expect
a certain type of error and not "context timeout" errors.
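A sketch of that pattern with an illustrative ConfigMap Get; the interval and timeout values are placeholders:
```
package apiclient

import (
	"context"
	"time"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

// GetConfigMap retries the call until it succeeds or the poll times out,
// but surfaces the last API error instead of the poll's context error.
func GetConfigMap(client kubernetes.Interface, namespace, name string) (*v1.ConfigMap, error) {
	var cm *v1.ConfigMap
	var lastErr error
	err := wait.PollUntilContextTimeout(context.Background(),
		500*time.Millisecond, 10*time.Second, true, // placeholder interval/timeout
		func(_ context.Context) (bool, error) {
			// Deliberately not the poll context: callers expect API errors,
			// not "context deadline exceeded".
			ctx := context.Background()
			cm, lastErr = client.CoreV1().ConfigMaps(namespace).Get(ctx, name, metav1.GetOptions{})
			return lastErr == nil, nil
		})
	if err != nil {
		return nil, lastErr
	}
	return cm, nil
}
```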
Additional changes:
- Change all context.TODO() to context.Background().
- Update all unit tests and make sure during testing the retry
interval and timeout are short. Test coverage of idempotency.go
is at ~97%.
- Remove the TestMutateConfigMapWithConflict test. It does not
contribute much, because conflict handling is done on the API
server side, not on the kubeadm side. Simulating this is not
needed.
WaitForAllControlPlaneComponents is a new feature gate
that can be used to tell kubeadm to wait for all control plane
components and not only kube-apiserver.
- Add the Waiter function WaitForControlPlaneComponents
that waits for all CP components in parallel. It uses the regular
healthz endpoint and checks for status 200.
- Add a new experimental phase to kubeadm join called "wait-control-plane".
A similar phase exists for kubeadm init.
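A sketch of the parallel wait, with placeholder endpoints (see the health check notes earlier in this log) and an assumed fan-out/collect shape:
```
package main

import (
	"context"
	"crypto/tls"
	"errors"
	"fmt"
	"net/http"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// waitFor200 polls one endpoint until it returns HTTP 200 or the timeout hits.
func waitFor200(name, url string, timeout time.Duration) error {
	client := &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{InsecureSkipVerify: true}, // self-signed certs
	}}
	err := wait.PollUntilContextTimeout(context.Background(), time.Second, timeout, true,
		func(_ context.Context) (bool, error) {
			resp, err := client.Get(url)
			if err != nil {
				return false, nil // not up yet: retry
			}
			defer resp.Body.Close()
			return resp.StatusCode == http.StatusOK, nil
		})
	if err != nil {
		return fmt.Errorf("%s did not become healthy: %v", name, err)
	}
	return nil
}

// WaitForControlPlaneComponents fans out one goroutine per component and
// collects all failures rather than stopping at the first one.
func WaitForControlPlaneComponents(endpoints map[string]string, timeout time.Duration) error {
	errCh := make(chan error, len(endpoints))
	for name, url := range endpoints {
		go func(name, url string) { errCh <- waitFor200(name, url, timeout) }(name, url)
	}
	var errs []error
	for range endpoints {
		if err := <-errCh; err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}

func main() {
	// Placeholder endpoints, per the health check notes above.
	endpoints := map[string]string{
		"kube-apiserver":          "https://192.168.0.10:6443/livez",
		"kube-scheduler":          "https://127.0.0.1:10259/livez",
		"kube-controller-manager": "https://127.0.0.1:10257/healthz",
	}
	fmt.Println(WaitForControlPlaneComponents(endpoints, 4*time.Minute))
}
```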
Intentionally pass a new context to this API call.
This will let the API call run independently of the parent
context timeout, which is quite short and can cause the API
call to return abruptly.
If some code is about to go over the context deadline,
"x/time/rate/rate.go" returns an untyped error with the string
"would exceed context deadline". If some code has already exceeded
the deadline, the error is of type DeadlineExceeded.
Ignore such context errors and only store API and connectivity errors.
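A sketch of the filtering this describes; the sample errors are illustrative:
```
package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
)

// isRateLimiterDeadlineError matches both shapes described above: the typed
// DeadlineExceeded error and the untyped "would exceed context deadline"
// string from x/time/rate.
func isRateLimiterDeadlineError(err error) bool {
	if errors.Is(err, context.DeadlineExceeded) {
		return true
	}
	return err != nil && strings.Contains(err.Error(), "would exceed context deadline")
}

func main() {
	var stored []error
	for _, err := range []error{
		context.DeadlineExceeded,
		errors.New("rate: Wait(n=1) would exceed context deadline"),
		errors.New("connection refused"), // a real connectivity error
	} {
		if !isRateLimiterDeadlineError(err) {
			stored = append(stored, err) // keep only API/connectivity errors
		}
	}
	fmt.Println(stored)
}
```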
- Name the function GetConfigMapWithShortRetry to make it
easier to understand that the function has a very short timeout.
Add a note that this function should be used in cases where there
is a fallback to local config.
- Apply a custom hardcoded interval of 50ms and timeout of 350ms to it.
Previously the function used exponential backoff with 5 steps, up to ~340ms.
Replace the usage of the deprecated wait.Poll() and
wait.PollImmediate() functions with wait.PollUntilContextTimeout().
Since we don't have piping of context around kubeadm,
use context.Background() everywhere.
Some wait.Poll() calls were converted to "immediate", as there
is no reason for them not to be. This is done for consistency.
Replace the only instance of wait.JitterUntil with
wait.PollUntilContextTimeout. JitterUntil is not deprecated
but this is also done for consistency.
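The before/after shape of the conversion, wrapped in a hypothetical helper:
```
package main

import (
	"context"
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// waitForCheck shows the replacement shape. The old, deprecated form was:
//
//	wait.Poll(interval, timeout, func() (bool, error) { return check(), nil })
//
// Since kubeadm does not pipe contexts around, context.Background() is used,
// and immediate=true so the first attempt runs without an initial sleep.
func waitForCheck(check func() bool, interval, timeout time.Duration) error {
	return wait.PollUntilContextTimeout(context.Background(), interval, timeout, true,
		func(_ context.Context) (bool, error) {
			return check(), nil
		})
}

func main() {
	start := time.Now()
	err := waitForCheck(func() bool { return time.Since(start) > time.Second },
		100*time.Millisecond, 5*time.Second)
	fmt.Println(err)
}
```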
The function TryRunCommand() uses an exponential backoff,
which is good, but it is inconsistent with the rest of the code
and only used in a couple of places.
Remove its usage in the token.go#UpdateOrCreateTokens()
and switch to using the standard function used in other places -
PollUntilContextTimeout().
Remove wait.go#TryRunCommand(), as there are no other usages.
The function wait.go#WaitForKubeletAndFunc() has been used in
a number of places in kubeadm. It starts a goroutine to wait for
the kubelet /healthz and, in parallel, starts another goroutine
to wait for a custom function.
This logic is problematic. If kubeadm is waiting for the kubelet
in parallel with something that requires the kubelet, the right
solution is to first wait for the kubelet serially and only
then proceed with the other action. The parallelism here, particularly
during "init", required an unwanted "initial timeout" of 40s before
the kubelet waiting even starts. In most cases this prevents the
kubelet waiter from even starting, while the main point of waiting
becomes the "other action".
- Remove the function WaitForKubeletAndFunc() from the Waiter interface.
- Rename the function WaitForHealthyKubelet() to just WaitForKubelet()
to be consistent with the naming WaitForAPI().
- Update WaitForKubelet() to not use TryRunCommand() and instead
use PollUntilContextTimeout().
- Remove the "initial timeout" of 40s in WaitForKubelet().
- Make both WaitForKubelet() and WaitForAPI() use similar error
handling and output.
- Update all usage of WaitForKubelet() to be a serial call before
any other action, such as another wait* call.
- Make the default wait timeout for the kubelet
/healthz to be 1 minute (kubeadmconstants.DefaultKubeletTimeout).
- Apply updates to all implementations of the Waiter interface.
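A sketch of the resulting call order; the Waiter interface here is a trimmed stand-in showing only the two methods named above:
```
package apiclient

// Waiter is a trimmed-down stand-in for the kubeadm interface,
// showing only the two methods named above.
type Waiter interface {
	WaitForKubelet() error
	WaitForAPI() error
}

// waitForNodeReady waits serially: kubelet first, then anything that needs it.
func waitForNodeReady(w Waiter) error {
	if err := w.WaitForKubelet(); err != nil {
		return err
	}
	// Only reached once the kubelet /healthz reported healthy.
	return w.WaitForAPI()
}
```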
This touches cases where FromInt() is used on numeric constants, on
values which are already int32s, or on int variables which are defined
close by and can be changed to int32s with little impact.
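Assuming these are the intstr helpers from k8s.io/apimachinery, the shape of the change:
```
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/util/intstr"
)

func main() {
	var port int32 = 10259
	before := intstr.FromInt(int(port)) // old: int32 -> int -> IntOrString
	after := intstr.FromInt32(port)     // new: direct, no intermediate conversion
	fmt.Println(before.IntValue(), after.IntValue())
}
```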
Signed-off-by: Stephen Kitt <skitt@redhat.com>
The kubeadm dry run client reactor code is flawed, as it assumes
all invoked "get" verb actions can be cast to GetAction.
Apparently that is not the case when Discovery().ServerVersion()
and other discovery calls are made. In such cases the action
type is the bare ActionImpl.
Catch the case where an action can be cast to ActionImpl and construct
a GetAction from it. GetActionImpl only supersets ActionImpl with
a Name field (an empty string in this case).
Add unit test for Discovery().ServerVersion().
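A sketch of the described fix, using the action types from k8s.io/client-go/testing:
```
package dryrun

import (
	clienttesting "k8s.io/client-go/testing"
)

// toGetAction handles both action shapes: a proper GetAction, and the bare
// ActionImpl produced by discovery calls such as Discovery().ServerVersion().
func toGetAction(action clienttesting.Action) (clienttesting.GetAction, bool) {
	if getAction, ok := action.(clienttesting.GetAction); ok {
		return getAction, true
	}
	if impl, ok := action.(clienttesting.ActionImpl); ok {
		// GetActionImpl supersets ActionImpl with only a Name field,
		// which stays an empty string here.
		return clienttesting.GetActionImpl{ActionImpl: impl}, true
	}
	return nil, false
}
```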
Checked the TODO about removing "retry" if feasible; it should not be
removed, because it is used throughout the repository.
Update idempotency.go
Most of these could have been refactored automatically, but it would
have been uglier: the unsophisticated tooling left lots of unnecessary
struct -> pointer -> struct transitions.
CreateOrMutateConfigMap was not resilient when it was trying to Create
the ConfigMap. If this operation returned an unknown error, the whole
operation would fail, because the code was strict about what error it
expected right afterwards: if the error returned by the Create call
was an IsAlreadyExists error, it would work fine. However, if an
unexpected error (such as an EOF) happened, this call would fail.
We are seeing this error especially when running control plane node
joins in an automated fashion, where things happen at a relatively
high pace.
It was especially easy to reproduce with kind, with several control
plane instances. E.g.:
```
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
I1130 11:43:42.788952 887 round_trippers.go:443] POST https://172.17.0.2:6443/api/v1/namespaces/kube-system/configmaps?timeout=10s in 1013 milliseconds
Post https://172.17.0.2:6443/api/v1/namespaces/kube-system/configmaps?timeout=10s: unexpected EOF
unable to create ConfigMap
k8s.io/kubernetes/cmd/kubeadm/app/util/apiclient.CreateOrMutateConfigMap
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/util/apiclient/idempotency.go:65
```
This change makes this logic more resilient to unknown errors. It will
retry in the face of unknown errors until one of the expected outcomes
happens: either `IsAlreadyExists`, in which case we will mutate the
ConfigMap, or no error, in which case the ConfigMap has been created.
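A sketch of the resilient logic described; interval and timeout are placeholders, and the mutate-on-conflict details are simplified:
```
package apiclient

import (
	"context"
	"time"

	v1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

// CreateOrMutateConfigMap retries through unknown errors (such as the EOF
// above) until Create either succeeds or reports IsAlreadyExists.
func CreateOrMutateConfigMap(client kubernetes.Interface, cm *v1.ConfigMap, mutate func(*v1.ConfigMap) error) error {
	return wait.PollUntilContextTimeout(context.Background(),
		time.Second, 30*time.Second, true, // placeholder interval/timeout
		func(ctx context.Context) (bool, error) {
			_, err := client.CoreV1().ConfigMaps(cm.Namespace).Create(ctx, cm, metav1.CreateOptions{})
			if err == nil {
				return true, nil // created: done
			}
			if !apierrors.IsAlreadyExists(err) {
				return false, nil // unknown error (e.g. unexpected EOF): retry
			}
			// Already exists: get, mutate, update; retry on transient failures.
			existing, err := client.CoreV1().ConfigMaps(cm.Namespace).Get(ctx, cm.Name, metav1.GetOptions{})
			if err != nil {
				return false, nil
			}
			if err := mutate(existing); err != nil {
				return true, err // a mutation bug is not retryable
			}
			_, err = client.CoreV1().ConfigMaps(cm.Namespace).Update(ctx, existing, metav1.UpdateOptions{})
			return err == nil, nil
		})
}
```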