Kubelet doesn't perform checkpointing and loses all its internal states after
restarts. It'd then mistaken pods from the api server as new pods and attempt
to go through the admission process. This may result in pods being rejected
even though they are running on the node (e.g., out of disk situation). This
change adds a condition to check whether the pod was seen before and categorize
such pods as updates. The change also removes freeze/unfreeze mechanism used to
work around such cases, since it is no longer needed and it stopped working
correctly ever since we switched to incremental updates.
Changes include:
- Moving stats serving & routes to pkg/kubelet/server/stats/handler.go
- Managing the routes with restful.WebService, rather than manual
parsing
- Misc cleanup
These changes will make adding the new routes for /stats/summary more
manageable.
There has been a recent regression causing kubelet to assume no containers are
running for the pod if kubelet has not seen the pod before. This would cause
all containers to be restarted after kubelet gets restarted. This change fixes
the bug.
Implement a flag that defines the frequency at which a node's out of
disk condition can change its status. Use this flag to suspend out of
disk status changes in the time period specified by the flag, after
the status is changed once.
Set the flag to 0 in e2e tests so that we can predictably test out of
disk node condition.
Also, use util.Clock interface for all time related functionality in
the kubelet. Calling time functions in unversioned package or time
package such as unversioned.Now() or time.Now() makes it really hard
to test such code. It also makes the tests flaky and sometimes
unnecessarily slow due to time.Sleep() calls used to simulate the
time elapsed. So use util.Clock interface instead which can be faked
in the tests.
- Sample is now the toplevel struct, so all child structs have the same
timestamp
- Removed FilesystemStats. There are more discussions needed
wrt. volumes and disk accounting, so this will be added in a follow
up PR
- Removed Options. The most recent sample will be returned.
This API has been discussed ad nauseam across several forums, and this
API represents the latest conclusion. In summary, we will provide this
API as temporary solution for providing the new stats required for 1.2.
In the longterm this API will be split into "essential" stats, which
will be provided by a first-party API served through the kubelet, and
"non-essential" (monitoring) stats, which will be provided by a 3rd
party API served from a pod.