The current dryrun client implemnetation is suboptimal
and sparse. It has the following problems:
- When an object CREATE or UPDATE reaches the default dryrun client
the operation is a NO-OP, which means subsequent GET calls must
fully emulate the object that exists in the store.
- There are multiple implmentations of a DryRunGetter interface
such the one in init_dryrun.go but there are no implementations
for reset, upgrade, join.
- There is a specific DryRunGetter that is backed by a real
client in clientbacked_dryrun.go, but this is used for upgrade
and does not work in conjuction with a fake client.
This commit does the following changes:
- Removes all existing *dryrun*.go implementations.
- Add a new DryRun implementation in dryrun.go that implements
3 clients - fake clientset, real clientset, real dynamic client.
- The DryRun object uses the method chaining pattern.
- Allows the user opt-in into real clients only if needed, by passing
a real kubeconfig. By default only constructs a fake client.
- The default reactor chain for the fake client, always logs the
object action, then for GET or LIST actions attempts to use the
real dynamic client to get the object. If a real object does not
exist it attempts to get the object from the fake object store.
- The user can prepend or append reactors to the chain.
- All known needed reactors for operations during init, join,
reset, upgrade are added as methods of the DryRun struct.
- Adds detailed unit test for the DryRun struct and its methods
including reactors.
Additional changes:
- Use the new DryRun implementation in all command workflows -
init, join, reset, upgrade.
- Ensure that --dry-run works even if there is no active cluster
by returning faked objects. For join, a faked cluster-info
with a fake bootstrap token and CA are used.
Allow the user to pass custom cert validity period with
ClusterConfiguration.CertificateValidityPeriod and
CACertificateValidityPeriod.
The defaults remain 1 year for regular cert and 10 years for CA.
Show warnings if the provided values are more than the defaults.
Additional changes:
- In "certs show-expiration" use HumanDuration() to print
more detailed durations instead of ShortHumanDuration().
- Add a new kubeadm util GetStartTime() which can be used
to consistently get a UTC time for tasks like writing certs
and unit tests.
- Update unit tests to validate the new customizable NotAfter.
There is no point to track more than 3 etcd versions at a time
where each etcd versions maps to a k8s CP version.
It's 3 instead of 2 (k8s CP / kubeadm version skew size) because
there is a period of time where the 3rd version (newest) will
be WIP at k/k master - e.g. at the time of this commit it's 1.31.
Add a unit test to block on this.
Also fixate the min etcd version to 3.5.11.
The function KubernetesReleaseVersion is being called in
a number of locations during unit tests but by default it
uses a "fetch version from URL" approach.
- Update the function to return a placeholder version
during unit tests.
- Update unit tests for this function.
- Update strings / comments in other version_tests.go
locations.
The improvement is significant:
time go test k8s.io/kubernetes/cmd/kubeadm/app/... -count=1
before:
real 2m47.733s
after:
real 0m10.234s
StaticPodMirroringTimeout and StaticPodMirroringRetryInterval
are use for just an API call to get Pods(). The already existing
constants.KubernetesAPICallRetryInterval
and kubeadmapi.GetActiveTimeouts().KubernetesAPICall.Duration
can be used for that instead.
Follow the same process of adding the Timeouts struct
to UpgradeConfiguration similarly to how it was done for
other API Kinds.
In the Timeouts struct include one new timeout:
- UpgradeManifests
Switch to PollUntilContextTimeout() everywhere to allow
usage of the exposed timeouts in the kubeadm API. Exponential backoff
options are more difficult to expose in this regard and a bit too
detailed for the common user - i.e. have "steps", "factor" and so on.
The function wait.go#WaitForKubeletAndFunc() has been used in
a number of places in kubeadm. It starts a go routine to wait for
the kubelet /healthz and in parallel starts another go routine
to wait for an custom function.
This logic is problematic. If kubeadm is waiting for the kubelet
in parallel with something that requires the kubelet, the right
solution would be to first wait for the kubelet in serial and only
then proceed with the other action. The parallelism here particularly
during "init" required a unwanted "initial timeout" of 40s, before
the kubelet waiting even starts. In most cases, this makes the kubelet
waiter to not even start, while the main point of waiting becomes
the "other action".
- Remove the function WaitForKubeletAndFunc() from the Waiter interface.
- Rename the function WaitForHealthyKubelet() to just WaitForKubelet()
to be consistent with the naming WaitForAPI().
- Update WaitForKubelet() to not use TryRunCommand() and instead
use PollUntilContextTimeout().
- Remove the "initial timeout" of 40s in WaitForKubelet().
- Make both WaitForKubelet() and WaitForAPI() use similar error
handling and output.
- Update all usage of WaitForKubelet() to be a serial call before
any other action, such as another wait* call.
- Make the default wait timeout for the kubelet
/healthz to be 1 minute (kubeadmconstants.DefaultKubeletTimeout).
- Apply updates to all implementations of the Waiter interface.
- Add the new file name: super-admin.conf and a function
to return its default path GetSuperAdminKubeConfigPath()
- Add the ClusterAdminsGroupAndClusterRoleBinding object name.
Move the defaulting of the BootstrapToken type inside the
bootstraptoken/v1 package. This prevents an error where
codegen complains that a defaulter for the type exists in both
the kubeadm v1beta3 and v1beta4.
Adapt kubeadm code to use the defaulter function and constants
that were moved to bootstraptoken/v1.
NOTE: technically this is a breaking change for direct users of
v1beta3/SetDefaults_BootstrapToken().
- drop versions < 1.22 in the etcd map
- use 3.5.9-0 for >= 1.22 versions
- make the minimum version for external etcd 3.4.13-4 and max 3.5.9-0
- update images_test to not rely on a pinned etcd version in tests
note: the image 3.4.18-0 was never released in registry.k8s.io!