Automatic merge from submit-queue
Fix use of docker removed ParseRepositoryTag() function
Docker has removed the ParseRepositoryTag() function in
leading to failures using the kubernetes Go client API.
Failure:
```
../k8s.io/kubernetes/pkg/util/parsers/parsers.go:30: undefined: parsers.ParseRepositoryTag
```
Automatic merge from submit-queue
Collect and expose runtime's image storage usage via Kubelet's /stats/summary endpoint
This information is useful to users since docker images are typically not stored on the root filesystem.
Kubelet will also consume this feature in the future to decide is evicting images will help with disk usage on the nodes.
cc @kubernetes/sig-node
This has been supplanted by a) the DockerJSON.CreatedAt field and b) the
ContainerStatus.CreatedAt, where the first is used for creating the
second.
The `.Created` field was only written to as far as I can see.
Docker has removed the ParseRepositoryTag() function in
leading to failures using the kubernetes Go client API.
Lets use github.com/docker/distribution reference.ParseNamed()
instead.
Failure:
../k8s.io/kubernetes/pkg/util/parsers/parsers.go:30: undefined: parsers.ParseRepositoryTag
Automatic merge from submit-queue
Refactor image related functions to use docker engine-api
ref #23563
Hopes can do some help, cc @Random-Liu
If it's ok, will add more work here.
Automatic merge from submit-queue
rkt: Return `FinishedAt` for pod
This is implemented via touching a file on stop as a hook in the systemd
unit. The ctime of this file is then used to get the `finishedAt` time
in the future.
In addition, this changes the `startedAt` and `createdAt` to use the api
server's results rather than the annotations it previously used.
It's possible we might want to move this into the api in the future.
Fixes#23887
I did the following manual testing:
```
$ cat ./examples/output/exit-output.yml
apiVersion: v1
kind: Pod
metadata:
labels:
name: exit
name: exit-output
spec:
restartPolicy: Never
containers:
- name: exit
image: busybox
command: ["sh", "-c", "echo Exiting in 60; sleep 60; echo goodbye"]
$ kubectl create -f ./examples/exit/exit-output.yaml
$ # wait
$ kubectl describe pod exit-output | grep State -A 4
State: Terminated
Reason: Completed
Exit Code: 0
Started: Tue, 19 Apr 2016 13:23:13 -0700
Finished: Tue, 19 Apr 2016 13:24:13 -0700
$ kubectl logs exit-output
Exiting in 60
goodbye
```
I double checked as well that the file at `/var/lib/kubelet/pods/$id/finished-$id` existed and looked as expected.
This is related to https://github.com/coreos/rkt/issues/1789#issuecomment-207111814 and follows https://github.com/kubernetes/kubernetes/pull/24367 + https://github.com/coreos/rkt/issues/2445
cc @jonboulle @iaguis @yifan-gu @kubernetes/sig-node
This adds a poll-and-timeout procedure after the pod is
started, to make sure the post-start hooks execute when the
container is actually running.
This is a temporal workaround for implementing post-hooks,
a long term solution is to use lifecycle event to trigger
those hooks, see https://github.com/kubernetes/kubernetes/issues/23084.
Also this fixes a bug of getting container ID for a non-running
container when running pre-stop hook.
This is implemented via touching a file on stop as a hook in the systemd
unit. The ctime of this file is then used to get the `finishedAt` time
in the future.
In addition, this changes the `startedAt` and `createdAt` to use the api
server's results rather than the annotations it previously used.
It's possible we might want to move this into the api in the future.
Fixes#23887
Automatic merge from submit-queue
Kubelet: Refactor all but image related functions in DockerInterface
For #23563.
Based on #23699 and #23844.
Only last 3 commits are new. This PR refactored all functions except image related functions, including:
* CreateExec
* StartExec
* InspectExec
* AttachToContainer
* Logs
* Info
* Version
@kubernetes/sig-node
Automatic merge from submit-queue
docker daemon complains SHM size must be greater than 0
Fixes https://github.com/kubernetes/kubernetes/issues/24588
I am hitting this on Fedora 23 w/ docker 1.9.1 using systemd cgroup-driver.
```
$ docker version
Client:
Version: 1.9.1
API version: 1.21
Package version: docker-1.9.1-9.gitee06d03.fc23.x86_64
Go version: go1.5.3
Git commit: ee06d03/1.9.1
Built:
OS/Arch: linux/amd64
Server:
Version: 1.9.1
API version: 1.21
Package version: docker-1.9.1-9.gitee06d03.fc23.x86_64
Go version: go1.5.3
Git commit: ee06d03/1.9.1
Built:
OS/Arch: linux/amd64
```
Not sure why I am on the only one hitting it right now, but putting this out here for comment.
/cc @kubernetes/sig-node @kubernetes/rh-cluster-infra @smarterclayton
Automatic merge from submit-queue
Fix PullImage and add corresponding node e2e test
Fixes#24101. This is a bug introduced by #23506, since ref #23563.
The root cause of #24101 is described [here](https://github.com/kubernetes/kubernetes/issues/24101#issuecomment-208547623).
This PR
1) Fixes#24101 by decoding the messages returned during pulling image, and return error if any of the messages contains error.
2) Add the node e2e test to detect this kind of failure.
3) Get present check out of `ConformanceImage.Remove()` and `ConformanceImage.Pull()`. Because sometimes we may expect error to occur in `PullImage()` and `RemoveImage()`, but even that doesn't happen, the `Present()` check will still return error and let the test pass.
@yujuhong @freehan @liangchenye
Also /cc @resouer, because he is doing the image related functions refactoring.
Automatic merge from submit-queue
kubenet: Load bridge netfilter module in Init().
This lets the kubenet loads the bridge netfilter module and set bridge-nf-call-iptables=1
Fix#24018
Follow up PRs would be appreciate if we also load the module in the bridge plugin binary itself. Ref https://github.com/kubernetes/kubernetes/issues/24018#issuecomment-207682514
cc @kubernetes/sig-node @sjpotter @euank
Automatic merge from submit-queue
Expose SummaryProvider for reuse by other parts of kubelet
To support out of resource killing in the kubelet, we will introduce a new top-level module that will ensure node stability by checking if eviction thresholds have been met for memory and file-system usage on the node. In addition, it will then need information about pod memory and disk usage in order to make an eviction selection. Currently, this information is collected in `SummaryProvider` but it's hidden away and not available for re-use by other top-level modules of the kubelet. This initial refactor adds the ability to get summary stat information from the `ResourceAnalyzer` so it can be reused by other top-level modules.
I suspect we will further re-factor this area as code evolves, but this unblocks further progress on out-of-resource killing.
/cc @vishh @timothysc @kubernetes/sig-node @kubernetes/rh-cluster-infra
Automatic merge from submit-queue
Add memory available to summary stats provider
To support out of resource killing when low on memory, we want to let operators specify eviction thresholds based on available memory instead of memory usage for ease of use when working with heterogeneous nodes.
So for example, a valid eviction threshold would be the following:
* If node.memory.available < 200Mi for 30s, then evict pod(s)
For the node, `memory.availableBytes` is always known since the `memory.limit_in_bytes` is always known for root cgroup. For individual containers in pods, we only populate the `availableBytes` if the container was launched with a memory limit specified. When no memory limit is specified, the cgroupfs sets a value of 1 << 63 in the `memory.limit_in_bytes` so we look for a similar max value to handle unbounded limits, and ignore setting `memory.availableBytes`.
FYI @vishh @timstclair - as discussed on Slack.
/cc @kubernetes/sig-node @kubernetes/rh-cluster-infra
Automatic merge from submit-queue
Kubelet: Refactor container related functions in DockerInterface
For #23563.
Based on #23506, will rebase after #23506 is merged.
The last 4 commits of this PR are new.
This PR refactors all container lifecycle related functions in DockerInterface, including:
* ListContainers
* InspectContainer
* CreateContainer
* StartContainer
* StopContainer
* RemoveContainer
@kubernetes/sig-node
Automatic merge from submit-queue
rkt: Fix hostnetwork.
Mount hosts' /etc/hosts, /etc/resolv.conf, set host's hostname
when running the pod in the host's network.
Fix#24235
cc @kubernetes/sig-node
Automatic merge from submit-queue
rkt: Use rkt pod's uuid as the systemd service file's name.
Previously, the service file's name is 'k8s_${POD_UID}.service',
which means we need to `systemctl daemon-reload` if the we replace
the content of the service file (e.g. pod is restarted).
However this makes the journal in the previous pod get disconnected.
This PR solves the issue by using the unique rkt uuid as the service
file's name. After the change, the service file's name will be:
'k8s_${rkt_uuid}.service'.
Fix#23691
Automatic merge from submit-queue
rkt: Update the directory path for saving auth config.
Since #23308 is merged, now we have more stable way to determine where to store the auth configs.
cc @yujuhong @sjpotter
Automatic merge from submit-queue
Allow lazy binding in credential providers; don't use it in AWS yet
This is step one for cross-region ECR support and has no visible effects yet.
I'm not crazy about the name LazyProvide. Perhaps the interface method could
remain like that and the package method of the same name could become
LateBind(). I still don't understand why the credential provider has a
DockerConfigEntry that has the same fields but is distinct from
docker.AuthConfiguration. I had to write a converter now that we do that in
more than one place.
In step two, I'll add another intermediate, lazy provider for each AWS region,
whose empty LazyAuthConfiguration will have a refresh time of months or years.
Behind the scenes, it'll use an actual ecrProvider with the usual ~12 hour
credentials, that will get created (and later refreshed) only when kubelet is
attempting to pull an image. If we simply turned ecrProvider directly into a
lazy provider, we would bypass all the caching and get new credentials for
each image pulled.
Automatic merge from submit-queue
Kubelet: Better-defined Container Waiting state
For issue #20478 and #21125.
This PR corrected logic and add unit test for `ShouldContainerBeRestarted()`, cleaned up `Waiting` state related code and added unit test for `generateAPIPodStatus()`.
Fixes#20478Fixes#17971
@yujuhong