Commit Graph

2716 Commits

Author SHA1 Message Date
k8s-merge-robot
9f00ed6075 Merge pull request #25377 from freehan/kubenetmutex
Automatic merge from submit-queue

modify kubenet mutex and add timer
2016-05-10 17:22:15 -07:00
k8s-merge-robot
3894c7972c Merge pull request #25185 from freehan/kubenetgetpodstatus
Automatic merge from submit-queue

kubenet try to retrieve ip inside pod net namespace

Kubenet currently stores the ips of pods inside a map. Kubelet gets pod ip from kubenet during syncpod. If Kubelet restarts, all pods on the node lost their ips in podStatus. This PR adds logic to retrieve pod IP from pod netns. 

cc: @yujuhong
2016-05-10 16:08:45 -07:00
k8s-merge-robot
f9b8fd0c96 Merge pull request #25011 from zhouhaibing089/addclose
Automatic merge from submit-queue

followup to add http server close method

Fixes #25009, a follow up of https://github.com/kubernetes/kubernetes/pull/24595.
2016-05-09 22:32:02 -07:00
k8s-merge-robot
c4214f743f Merge pull request #24918 from Random-Liu/add-docker-operation-timeout
Automatic merge from submit-queue

Kubelet: Add docker operation timeout

For #23563.
Based on #24748, only the last 2 commits are new.

This PR:
1) Add timeout for all docker operations.
2) Add docker operation timeout metrics
3) Cleanup kubelet stats and add runtime operation error and timeout rate monitoring.
4) Monitor runtime operation error and timeout rate in kubelet perf.

@yujuhong 
/cc @gmarek Because of the metrics change.
/cc @kubernetes/sig-node
2016-05-09 21:51:52 -07:00
k8s-merge-robot
def7639457 Merge pull request #25245 from pmorie/kubelet/cadvisor
Automatic merge from submit-queue

Reduce kubelet LOC: extract cadvisor

Step 2 of #25028 

@yujuhong @kubernetes/sig-node
2016-05-09 21:09:42 -07:00
Minhan Xia
3573903a8d modify kubenet mutex and add timer 2016-05-09 14:54:15 -07:00
k8s-merge-robot
545d56a63b Merge pull request #24810 from derekwaynecarr/sources_cleanup
Automatic merge from submit-queue

Clean-up sources ready tracking in kubelet

moved sources ready tracking behind an interface, made it thread-safe.
2016-05-09 05:48:09 -07:00
k8s-merge-robot
2cf511b1f5 Merge pull request #24750 from derekwaynecarr/kubelet_eviction_flag_parsing
Automatic merge from submit-queue

Kubelet eviction flag parsers and tests

The first two commits are from https://github.com/kubernetes/kubernetes/pull/24559 that have achieved LGTM.  

The last commit is only part that is interesting, it adds the parsing logic to handle the flags, and reserves `pkg/kubelet/eviction` for eviction manager logic.
2016-05-09 04:15:04 -07:00
Tim Hockin
817abc3213 Kill our atomic pkg, now that 1.6 is req'd 2016-05-08 20:30:37 -07:00
k8s-merge-robot
fe135fc251 Merge pull request #24630 from euank/redundant-created
Automatic merge from submit-queue

kubelet: Remove redundant `Container.Created`

As far as I can tell, this has been supplanted by a) the `DockerJSON.CreatedAt` field and b) the
`ContainerStatus.CreatedAt`, where the first is used for creating the
second.

The `.Created` field was only written to as far as I can see.

cc @yifan-gu & @Random-Liu 

Is there any reason we might want to keep this around?
2016-05-08 16:21:05 -07:00
k8s-merge-robot
d4b1b6776a Merge pull request #24557 from swagiaal/attacher-interface
Automatic merge from submit-queue

 Abstract node side functionality of attachable plugins

- Create PhysicalAttacher interface to abstract MountDevice and
  WaitForAttach.
- Create PhysicalDetacher interface to abstract WaitForDetach and
  UnmountDevice.
- Expand unit tests to check that Attach, Detach, WaitForAttach,
  WaitForDetach, MountDevice, and UnmountDevice get call where
  appropriet.

Physical{Attacher,Detacher} are working titles suggestions welcome. Some other thoughts:
- NodeSideAttacher or NodeAttacher.
- AttachWatcher
- Call this Attacher and call the Current Attacher CloudAttacher.
- DeviceMounter (although there are way too many things called Mounter right now :/)

This is to address: https://github.com/kubernetes/kubernetes/pull/21709#issuecomment-192035382

@saad-ali
2016-05-08 14:04:44 -07:00
k8s-merge-robot
f2f3b49f58 Merge pull request #22575 from MikaelCluseau/wip-issue-20466
Automatic merge from submit-queue

Add subPath to mount a child dir or file of a volumeMount

Allow users to specify a subPath in Container.volumeMounts so they can use a single volume for many mounts instead of creating many volumes. For instance, a user can now use a single PersistentVolume to store the Mysql database and the document root of an Apache server of a LAMP stack pod by mapping them to different subPaths in this single volume.

Also solves https://github.com/kubernetes/kubernetes/issues/20466.
2016-05-08 08:45:15 -07:00
k8s-merge-robot
8217172cd4 Merge pull request #19025 from aveshagarwal/master-imagepull-messages
Automatic merge from submit-queue

Fix parallel image pullers event messages with reasons constants.
2016-05-08 07:31:49 -07:00
Andy Goldstein
f091ea5eda Handle image digests in node status and image GC
Start including Docker image digests in the node status and consider image digests during image
garbage collection.
2016-05-07 06:50:51 -04:00
k8s-merge-robot
660050631e Merge pull request #25077 from ncdc/pleg-retry
Automatic merge from submit-queue

PLEG: reinspect pods that failed prior inspections

Fix the following sequence of events:

1. relist call 1 successfully inspects a pod (just has infra container)
1. relist call 2 gets an error inspecting the same pod (has infra container and a transient
container that failed to create) and doesn't update the old/new pod records
1. relist calls 3+ don't inspect the pod any more (just has infra container so it doesn't look like
anything changed)

This change adds a new list that keeps track of pods that failed inspection and retries them the
next time relist is called. Without this change, a pod in this state would never be inspected again,
its entry in the status cache would never be updated, and the pod worker would never call syncPod
again because the most recent entry in the status cache has an error associated with it. Without
this change, pods in this state would be stuck Terminating forever, unless the user issued a
deletion with a grace period value of 0.

Fixes #24819 

cc @kubernetes/rh-cluster-infra @kubernetes/sig-node
2016-05-06 22:14:08 -07:00
Robert Bailey
a2d8b0af13 Merge pull request #25027 from xiangpengzhao/fix_funcname
Rename a func in manager.go
2016-05-06 20:41:26 -07:00
Robert Bailey
b274c5b7de Merge pull request #24843 from derekwaynecarr/graceperiod_override
Allow KillPod to take a gracePeriodOverride
2016-05-06 15:17:56 -07:00
Robert Bailey
2493a9de62 Merge pull request #24959 from Random-Liu/fix-flaky-unit-test
Use fake clock in TestGetPodsToSync.
2016-05-06 14:14:02 -07:00
Robert Bailey
2c678f1ec1 Merge pull request #25053 from yujuhong/rm_cahce_update
kubelet: do not force update the runtime cache
2016-05-06 14:11:38 -07:00
Robert Bailey
d9a4e9b49c Merge pull request #25071 from zhouhaibing089/clock-fix
allow equality to avoid flaky on clock
2016-05-06 14:10:43 -07:00
Robert Bailey
303f059efa Merge pull request #24817 from pmorie/clarify-orphaned-cleanup
Clarify orphaned volume cleanup
2016-05-06 13:52:33 -07:00
Robert Bailey
71706e0ad5 Merge pull request #25206 from yifan-gu/fix_hostport
rkt: When host port is zero, we should not forward the port.
2016-05-06 13:43:56 -07:00
Robert Bailey
1474145db1 Merge pull request #24823 from derekwaynecarr/fix-kubelet-typo
Fix function name typo in kubelet
2016-05-06 13:28:45 -07:00
Minhan Xia
1252f5695b add unit tests for kubenet 2016-05-06 12:10:45 -07:00
Random-Liu
148588e6a1 1) Add docker operation timeout metrics.
2) Cleanup kubelet stats and add runtime operation error and timeout
rate monitoring.
3) Monitor runtime operation error and timeout rate in
kubelet perf.
2016-05-06 10:53:13 -07:00
Random-Liu
66678354a0 Add timeout for all docker operation. 2016-05-06 10:53:13 -07:00
derekwaynecarr
7bab6999d4 Allow KillPod to take a gracePeriodOverride 2016-05-06 12:14:43 -04:00
derekwaynecarr
582e662581 Clean-up sources ready tracking 2016-05-06 12:11:29 -04:00
derekwaynecarr
725af223aa Add parsers for eviction thresholds 2016-05-06 12:06:03 -04:00
k8s-merge-robot
16159b8bd0 Merge pull request #24344 from derekwaynecarr/kubelet-lifecycle-callouts
Automatic merge from submit-queue

Define interfaces for kubelet pod admission and eviction

There is too much code and logic in `kubelet.go` that makes it hard to test functions in discrete pieces.

I propose an interface that an internal module can implement that will let it make an admission decision for a pod.  If folks are ok with the pattern, I want to move the a) predicate checking, b) out of disk, c) eviction preventing best-effort pods being admitted into their own dedicated handlers that would be easier for us to mock test.  We can then just write tests to ensure that the `Kubelet` calls a call-out, and we can write easier unit tests to ensure that dedicated handlers do the right thing.

The second interface I propose was a `PodEvictor` that is invoked in the main kubelet sync loop to know if pods should be pro-actively evicted from the machine.  The current active deadline check should move into a simple evictor implementation, and I want to plug the out of resource killer code path as an implementation of the same interface.

 @vishh @timothysc - if you guys can ack on this, I will add some unit testing to ensure we do the call-outs.

/cc @kubernetes/sig-node @kubernetes/rh-cluster-infra
2016-05-06 08:53:35 -07:00
k8s-merge-robot
32256d53aa Merge pull request #25136 from dcbw/kubenet-fixup-txqueuelen
Automatic merge from submit-queue

kubenet: fix up CNI bridge TX queue length if needed

CNI's bridge plugin mis-handles the TxQLen when creating the bridge,
leading to a zero-length TX queue.  This doesn't typically cause
problems (since virtual interfaces don't have hard queue limits)
but when adding traffic shaping, some qdiscs pull their packet
limits from the TX queue length, leading to a packet limit of 0
in some cases.  Until we can depend on a new enough version of
CNI, fix up the TX queue length internally.

Closes: https://github.com/kubernetes/kubernetes/issues/25092
2016-05-06 06:29:31 -07:00
k8s-merge-robot
66ef87347e Merge pull request #24968 from wojtek-t/remove_node_name
Automatic merge from submit-queue

Remove nodeName from predicate signature.

With this approach, I'm getting the initial throughput (in empty cluster) in 1000-node cluster of ~95pods/s.
Which is ~30% improvement.

@kubernetes/sig-scalability
2016-05-06 04:09:13 -07:00
k8s-merge-robot
346ddc52c2 Merge pull request #24748 from Random-Liu/cleanup-with-new-engine-api
Automatic merge from submit-queue

Kubelet: Cleanup with new engine api

Finish step 2 of #23563

This PR:
1) Cleanup go-dockerclient reference in the code.
2) Bump up the engine-api version.
3) Cleanup the code with new engine-api.

Fixes #24076.
Fixes #23809.

/cc @yujuhong
2016-05-06 03:16:53 -07:00
Wojciech Tyczynski
a51f266ebf Remove nodeName from predicate signature. 2016-05-06 11:23:37 +02:00
k8s-merge-robot
4a00266f40 Merge pull request #25224 from Random-Liu/delete-pod-with-uid
Automatic merge from submit-queue

Delete pod with uid as precondition.

Addressed https://github.com/kubernetes/kubernetes/issues/25169#issuecomment-217033202.

Fix #25169 
Fix #24937

This PR change status manager to delete pods with uid as a precondition, so that kubelet won't delete pods with different uid but the same name and namespace accidentally.

/cc @yujuhong
2016-05-05 22:02:14 -07:00
Paul Morie
bc5d7a1bca Reduce kubelet LOC: extract cadvisor 2016-05-06 00:26:48 -04:00
Mikaël Cluseau
06900a934d Introduce subPath in VolumeMount 2016-05-06 15:08:41 +11:00
Minhan Xia
ae6f9ab970 kubenet try to retrieve ip inside pod net namespace 2016-05-05 17:57:32 -07:00
k8s-merge-robot
03e7e08e70 Merge pull request #25124 from pmorie/kubelet-getters
Automatic merge from submit-queue

Reduce kubelet LOC: extract getters

Step 1 of #25028 as discussed in @kubernetes/sig-node meeting
2016-05-05 16:52:09 -07:00
Random-Liu
cb6fe9e7ef Delete pod with uid as precondition. 2016-05-05 14:34:49 -07:00
zhouhaibing089
5923fd352e followup to add http server close method 2016-05-05 12:04:41 +08:00
Yifan Gu
36f3185223 rkt: When host port is zero, we should not forward the port. 2016-05-04 19:02:39 -07:00
Minhan Xia
04b80f7fb8 rename Status interface to GetPodNetworkStatus 2016-05-04 13:46:31 -07:00
Minhan Xia
265fdd9344 add NetworkStatus in NetworkPlugin interface for kubelet to consume 2016-05-04 13:46:31 -07:00
Dan Williams
aad6535a00 kubenet: fix up CNI bridge TX queue length if needed
CNI's bridge plugin mis-handles the TxQLen when creating the bridge,
leading to a zero-length TX queue.  This doesn't typically cause
problems (since virtual interfaces don't have hard queue limits)
but when adding traffic shaping, some qdiscs pull their packet
limits from the TX queue length, leading to a packet limit of 0
in some cases.  Until we can depend on a new enough version of
CNI, fix up the TX queue length internally.
2016-05-04 10:14:40 -05:00
Sami Wagiaalla
71e7dba845 Abstract node side functionality of attachable plugins
- Expand Attacher/Detacher interfaces to break up work more
  explicitly.
- Add arguments to all functions to avoid having implementers store
  the data needed for operations.
- Expand unit tests to check that Attach, Detach, WaitForAttach,
  WaitForDetach, MountDevice, and UnmountDevice get call where
  appropriet.
2016-05-04 10:18:39 -04:00
Paul Morie
7521503ab9 Reduce kubelet LOC: extract getters 2016-05-04 02:25:22 -04:00
zhouhaibing089
67747ca08f allow equality to avoid flaky on clock 2016-05-04 09:11:22 +08:00
Andy Goldstein
3a87bfb6f7 PLEG: reinspect pods that failed prior inspections
Fix the following sequence of events:

1. relist call 1 successfully inspects a pod (just has infra container)
1. relist call 2 gets an error inspecting the same pod (has infra container and a transient
container that failed to create) and doesn't update the old/new pod records
1. relist calls 3+ don't inspect the pod any more (just has infra container so it doesn't look like
anything changed)

This change adds a new list that keeps track of pods that failed inspection and retries them the
next time relist is called. Without this change, a pod in this state would never be inspected again,
its entry in the status cache would never be updated, and the pod worker would never call syncPod
again because the most recent entry in the status cache has an error associated with it. Without
this change, pods in this state would be stuck Terminating forever, unless the user issued a
deletion with a grace period value of 0.
2016-05-03 11:06:35 -04:00
Random-Liu
4cca5b2290 Use fake clock in TestGetPodsToSync to fix flake. 2016-05-02 16:05:36 -07:00