From a83ddbb30ed5b377209c5291977fd94fedb8de45 Mon Sep 17 00:00:00 2001
From: Dalton Hubble
Date: Sat, 9 May 2020 22:48:56 -0700
Subject: [PATCH] Add CoreDNS "soft" nodeAffinity for controller nodes

* Add nodeAffinity to CoreDNS deployment PodSpec to prefer running
CoreDNS pods on controllers, while relying on podAntiAffinity for
spreading.
* For single master clusters, running two CoreDNS pods on the master
or running one pod on a worker is permissible.
* Note: It's still _possible_ to end up with CoreDNS pods all running
on workers since we only express scheduling preference ("soft"), but
unlikely. Plus the motivating scenario (below) is also rare.

Background:

* CoreDNS replicas are set to the higher of 2 or the number of control
plane nodes to (at a minimum) support Deployment updates or pod
restarts and match the cluster size (e.g. 5 master/controller nodes
likely means a larger cluster, so run 5 CoreDNS replicas)
* In the past (before v1.14), we required kube-dns (CoreDNS
predecessor) pods to run on master nodes. With CoreDNS this node
selection was relaxed. We'd like a gentler form of it now.

Motivation:

* On clusters using 100% preemptible/spot workers, it is possible that
CoreDNS pods schedule to workers that are all preempted at the same
time, causing a loss of cluster internal DNS service until a CoreDNS
pod reschedules (1 min).
We'd like CoreDNS to prefer controller/master nodes (which aren't
preempted) to reduce the possibility of control plane disruption
---
 resources/manifests/coredns/deployment.yaml | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/resources/manifests/coredns/deployment.yaml b/resources/manifests/coredns/deployment.yaml
index 479acc6..8f55315 100644
--- a/resources/manifests/coredns/deployment.yaml
+++ b/resources/manifests/coredns/deployment.yaml
@@ -25,6 +25,13 @@ spec:
         seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
     spec:
       affinity:
+        nodeAffinity:
+          preferredDuringSchedulingIgnoredDuringExecution:
+          - weight: 100
+            preference:
+              matchExpressions:
+              - key: node.kubernetes.io/master
+                operator: Exists
         podAntiAffinity:
           preferredDuringSchedulingIgnoredDuringExecution:
           - weight: 100