diff --git a/README.md b/README.md index 9b87e775..388278f7 100644 --- a/README.md +++ b/README.md @@ -1,128 +1,128 @@ -# My home Kubernetes cluster :sailboat: -_... managed by Flux and serviced with RenovateBot_ :robot: +# My home operations repository 🎛🔨 +_... managed by Flux, Renovate, and GitHub Actions_ 🤖
-
-
- - -[![Discord](https://img.shields.io/discord/673534664354430999?color=7289da&label=DISCORD&style=for-the-badge)](https://discord.gg/sTMX7Vh) -[![k3s](https://img.shields.io/badge/k3s-v1.20.6-orange?style=for-the-badge)](https://k3s.io/) -[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white&style=for-the-badge)](https://github.com/pre-commit/pre-commit) -[![renovate](https://img.shields.io/badge/renovate-enabled-green?style=for-the-badge&logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjUgNSAzNzAgMzcwIj48Y2lyY2xlIGN4PSIxODkiIGN5PSIxOTAiIHI9IjE4NCIgZmlsbD0iI2ZlMiIvPjxwYXRoIGZpbGw9IiM4YmIiIGQ9Ik0yNTEgMjU2bC0zOC0zOGExNyAxNyAwIDAxMC0yNGw1Ni01NmMyLTIgMi02IDAtN2wtMjAtMjFhNSA1IDAgMDAtNyAwbC0xMyAxMi05LTggMTMtMTNhMTcgMTcgMCAwMTI0IDBsMjEgMjFjNyA3IDcgMTcgMCAyNGwtNTYgNTdhNSA1IDAgMDAwIDdsMzggMzh6Ii8+PHBhdGggZmlsbD0iI2Q1MSIgZD0iTTMwMCAyODhsLTggOGMtNCA0LTExIDQtMTYgMGwtNDYtNDZjLTUtNS01LTEyIDAtMTZsOC04YzQtNCAxMS00IDE1IDBsNDcgNDdjNCA0IDQgMTEgMCAxNXoiLz48cGF0aCBmaWxsPSIjYjMwIiBkPSJNMjg1IDI1OGw3IDdjNCA0IDQgMTEgMCAxNWwtOCA4Yy00IDQtMTEgNC0xNiAwbC02LTdjNCA1IDExIDUgMTUgMGw4LTdjNC01IDQtMTIgMC0xNnoiLz48cGF0aCBmaWxsPSIjYTMwIiBkPSJNMjkxIDI2NGw4IDhjNCA0IDQgMTEgMCAxNmwtOCA3Yy00IDUtMTEgNS0xNSAwbC05LThjNSA1IDEyIDUgMTYgMGw4LThjNC00IDQtMTEgMC0xNXoiLz48cGF0aCBmaWxsPSIjZTYyIiBkPSJNMjYwIDIzM2wtNC00Yy02LTYtMTctNi0yMyAwLTcgNy03IDE3IDAgMjRsNCA0Yy00LTUtNC0xMSAwLTE2bDgtOGM0LTQgMTEtNCAxNSAweiIvPjxwYXRoIGZpbGw9IiNiNDAiIGQ9Ik0yODQgMzA0Yy00IDAtOC0xLTExLTRsLTQ3LTQ3Yy02LTYtNi0xNiAwLTIybDgtOGM2LTYgMTYtNiAyMiAwbDQ3IDQ2YzYgNyA2IDE3IDAgMjNsLTggOGMtMyAzLTcgNC0xMSA0em0tMzktNzZjLTEgMC0zIDAtNCAybC04IDdjLTIgMy0yIDcgMCA5bDQ3IDQ3YTYgNiAwIDAwOSAwbDctOGMzLTIgMy02IDAtOWwtNDYtNDZjLTItMi0zLTItNS0yeiIvPjxwYXRoIGZpbGw9IiMxY2MiIGQ9Ik0xNTIgMTEzbDE4LTE4IDE4IDE4LTE4IDE4em0xLTM1bDE4LTE4IDE4IDE4LTE4IDE4em0tOTAgODlsMTgtMTggMTggMTgtMTggMTh6bTM1LTM2bDE4LTE4IDE4IDE4LTE4IDE4eiIvPjxwYXRoIGZpbGw9IiMxZGQiIGQ9Ik0xMzQgMTMxbDE4LTE4IDE4IDE4LTE4ID
E4em0tMzUgMzZsMTgtMTggMTggMTgtMTggMTh6Ii8+PHBhdGggZmlsbD0iIzJiYiIgZD0iTTExNiAxNDlsMTgtMTggMTggMTgtMTggMTh6bTU0LTU0bDE4LTE4IDE4IDE4LTE4IDE4em0tODkgOTBsMTgtMTggMTggMTgtMTggMTh6bTEzOS04NWwyMyAyM2M0IDQgNCAxMSAwIDE2TDE0MiAyNDBjLTQgNC0xMSA0LTE1IDBsLTI0LTI0Yy00LTQtNC0xMSAwLTE1bDEwMS0xMDFjNS01IDEyLTUgMTYgMHoiLz48cGF0aCBmaWxsPSIjM2VlIiBkPSJNMTM0IDk1bDE4LTE4IDE4IDE4LTE4IDE4em0tNTQgMThsMTgtMTcgMTggMTctMTggMTh6bTU1LTUzbDE4LTE4IDE4IDE4LTE4IDE4em05MyA0OGwtOC04Yy00LTUtMTEtNS0xNiAwTDEwMyAyMDFjLTQgNC00IDExIDAgMTVsOCA4Yy00LTQtNC0xMSAwLTE1bDEwMS0xMDFjNS00IDEyLTQgMTYgMHoiLz48cGF0aCBmaWxsPSIjOWVlIiBkPSJNMjcgMTMxbDE4LTE4IDE4IDE4LTE4IDE4em01NC01M2wxOC0xOCAxOCAxOC0xOCAxOHoiLz48cGF0aCBmaWxsPSIjMGFhIiBkPSJNMjMwIDExMGwxMyAxM2M0IDQgNCAxMSAwIDE2TDE0MiAyNDBjLTQgNC0xMSA0LTE1IDBsLTEzLTEzYzQgNCAxMSA0IDE1IDBsMTAxLTEwMWM1LTUgNS0xMSAwLTE2eiIvPjxwYXRoIGZpbGw9IiMxYWIiIGQ9Ik0xMzQgMjQ4Yy00IDAtOC0yLTExLTVsLTIzLTIzYTE2IDE2IDAgMDEwLTIzTDIwMSA5NmExNiAxNiAwIDAxMjIgMGwyNCAyNGM2IDYgNiAxNiAwIDIyTDE0NiAyNDNjLTMgMy03IDUtMTIgNXptNzgtMTQ3bC00IDItMTAxIDEwMWE2IDYgMCAwMDAgOWwyMyAyM2E2IDYgMCAwMDkgMGwxMDEtMTAxYTYgNiAwIDAwMC05bC0yNC0yMy00LTJ6Ii8+PC9zdmc+)](https://github.com/renovatebot/renovate) - - ---- - -## :book:  Overview - -This repository _is_ my home Kubernetes cluster in a declarative state. [Flux](https://github.com/fluxcd/flux2) watches my [cluster](./cluster/) folder and makes the changes to my cluster based on the YAML manifests. - -Feel free to open a [Github issue](https://github.com/toboshii/home-cluster/issues/new/choose) or join the [k8s@home Discord](https://discord.gg/sTMX7Vh) if you have any questions. - -This repository is built off the [k8s-at-home/template-cluster-k3s](https://github.com/k8s-at-home/template-cluster-k3s) repository. - ---- - -## :sparkles:  Cluster setup - -This cluster consists of VMs provisioned on [PVE](https://www.proxmox.com/en/proxmox-ve) via the [Terraform Proxmox provider](https://github.com/Telmate/terraform-provider-proxmox). 
These run [k3s](https://k3s.io/) provisioned overtop Ubuntu 20.10 using the [Ansible](https://www.ansible.com/) galaxy role [ansible-role-k3s](https://github.com/PyratLabs/ansible-role-k3s). This cluster is not hyper-converged as block storage is provided by the underlying PVE Ceph cluster using rook-ceph-external. - -See my [server/ansible](./server/ansible/) directory for my playbooks and roles, and [server/terraform](./server/terraform) for infrastructure provisioning. - -## :art:  Cluster components - -- [kube-vip](https://kube-vip.io/): Uses BGP to load balance the control-plane API, making it highly availible without requiring external HA proxy solutions. -- [calico](https://docs.projectcalico.org/about/about-calico): For internal cluster networking using BGP. -- [traefik](https://traefik.io/): Provides ingress cluster services. -- [rook-ceph](https://rook.io/): Provides persistent volumes, allowing any application to consume RBD block storage from the underlying PVE cluster. -- [SOPS](https://toolkit.fluxcd.io/guides/mozilla-sops/): Encrypts secrets which is safe to store - even to a public repository. -- [external-dns](https://github.com/kubernetes-sigs/external-dns): Creates DNS entries in a separate [coredns](https://github.com/coredns/coredns) deployment which is backed by my clusters [etcd](https://github.com/etcd-io/etcd) deployment. -- [cert-manager](https://cert-manager.io/docs/): Configured to create TLS certs for all ingress services automatically using LetsEncrypt. -- [kasten-k10](https://www.kasten.io): Provides disaster recovery via snapshots and out-of-band backups. - ---- - -## :open_file_folder:  Repository structure - -The Git repository contains the following directories under `cluster` and are ordered below by how Flux will apply them. 
- -- **base** directory is the entrypoint to Flux -- **crds** directory contains custom resource definitions (CRDs) that need to exist globally in your cluster before anything else exists -- **core** directory (depends on **crds**) are important infrastructure applications (grouped by namespace) that should never be pruned by Flux -- **apps** directory (depends on **core**) is where your common applications (grouped by namespace) could be placed, Flux will prune resources here if they are not tracked by Git anymore - -``` -./cluster -├── ./apps -├── ./base -├── ./core -└── ./crds -``` - ---- - -## :robot:  Automate all the things! - -- [Github Actions](https://docs.github.com/en/actions) for checking code formatting -- Rancher [System Upgrade Controller](https://github.com/rancher/system-upgrade-controller) to apply updates to k3s -- [Renovate](https://github.com/renovatebot/renovate) with the help of the [k8s-at-home/renovate-helm-releases](https://github.com/k8s-at-home/renovate-helm-releases) Github action keeps my application charts and container images up-to-date - ---- - -## :spider_web:  Networking - -In my network Calico is configured with BGP on my Brocade ICX 6610. With BGP enabled, I advertise a load balancer using `externalIPs` on my Kubernetes services. This makes it so I do not need `Metallb`. Another benefit to this is that I can directly hit any pods IP directly from any device on my local network. All physical hardware (including local clients) are interconnected with 10gig networking, with a seperate dedicated 10gig network for Ceph traffic. 
- -| Name | CIDR | -| --------------------------- | --------------- | -| Management | `10.75.10.0/24` | -| Physical Servers | `10.75.30.0/24` | -| CoroSync0 | `10.75.31.0/24` | -| CoroSync1 | `10.75.32.0/24` | -| Ceph Cluster | `10.75.33.0/24` | -| Virtual Servers | `10.75.40.0/24` | -| K8s external services (BGP) | `10.75.45.0/24` | -| K8s pods | `172.22.0.0/16` | -| K8s services | `172.24.0.0/16` | - -## :man_shrugging:  DNS - -_(this section blindly copied from [Devin Buhl](https://github.com/onedr0p/home-cluster) as I could never attempt to explain this in a better way)_ - -To prefix this, I should mention that I only use one domain name for internal and externally facing applications. Also this is the most complicated thing to explain but I will try to sum it up. - -On [pfSense](https://arstechnica.com/gadgets/2021/03/buffer-overruns-license-violations-and-bad-code-freebsd-13s-close-call/) under `Services: DNS Resolver: Domain Overrides` I have a `Domain Override` set to my domain with the address pointing to my _in-cluster-non-cluster service_ CoreDNS load balancer IP. This allows me to use [Split-horizon DNS](https://en.wikipedia.org/wiki/Split-horizon_DNS). [external-dns](https://github.com/kubernetes-sigs/external-dns) reads my clusters `Ingress`'s and inserts DNS records containing the sub-domain and load balancer IP (of traefik) into the _in-cluster-non-cluster service_ CoreDNS service and into Cloudflare depending on if an annotation is present on the ingress. See the diagram below for a visual representation.
- + +[![Discord](https://img.shields.io/discord/673534664354430999?style=for-the-badge&label=discord&logo=discord&logoColor=white)](https://discord.gg/k8s-at-home) +[![talos](https://img.shields.io/badge/talos-v1.1.2-brightgreen?style=for-the-badge&logo=linux&logoColor=white)](https://www.talos.dev/) +[![kubernetes](https://img.shields.io/badge/kubernetes-v1.24.3-brightgreen?style=for-the-badge&logo=kubernetes&logoColor=white)](https://kubernetes.io/) +[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white&style=for-the-badge)](https://github.com/pre-commit/pre-commit) +[![GitHub Workflow Status](https://img.shields.io/github/workflow/status/toboshii/home-ops/Schedule%20-%20Renovate?label=renovate&logo=renovatebot&style=for-the-badge)](https://github.com/toboshii/home-ops/actions/workflows/schedule-renovate.yaml) +[![Lines of code](https://img.shields.io/tokei/lines/github/toboshii/home-ops?style=for-the-badge&color=brightgreen&label=lines&logo=codefactor&logoColor=white)](https://github.com/toboshii/home-ops/graphs/contributors) +
--- -## :gear:  Hardware +## 📖 Overview -| Device | Count | OS Disk Size | Data Disk Size | Ram | Purpose | -| ---------------- | ----- | ------------ | ---------------------------------- | ----- | ---------------------------------------- | -| Intel R1208GL4DS | 4 | 120GB SSD | 2x480GB SSD
4x900GB 10.6k SAS | 64GB | Proxmox hypervisors
and Ceph cluster | -| Intel R1208GL4DS | 1 | 120GB SSD | 2x900GB 10.6k SAS | 32GB | Backup cold spare | -| NAS (franxx) | 1 | 120GB SSD | 16x8TB RAIDZ2
6x4TB ZFS Mirror | 192GB | Media and shared file storage | +This is a mono repository for my home infrastructure and Kubernetes cluster implementing Infrastructure as Code (IaC) and GitOps practices using tools like [Kubernetes](https://kubernetes.io/), [Flux](https://github.com/fluxcd/flux2), [Renovate](https://github.com/renovatebot/renovate) and [GitHub Actions](https://github.com/features/actions). + +Feel free to open a [GitHub issue](https://github.com/toboshii/home-ops/issues/new/choose) or join the [k8s@home Discord](https://discord.gg/sTMX7Vh) if you have any questions. --- -## :wrench:  Tools +## ⛵ Kubernetes -| Tool | Purpose | -| ------------------------------------------------------ | ------------------------------------------------------------------------- | -| [direnv](https://github.com/direnv/direnv) | Sets `KUBECONFIG` environment variable based on present working directory | -| [go-task](https://github.com/go-task/task) | Alternative to makefiles, who honestly likes that? | -| [pre-commit](https://github.com/pre-commit/pre-commit) | Enforce code consistency and verifies no secrets are pushed | -| [stern](https://github.com/stern/stern) | Tail logs in Kubernetes | +This repo generally attempts to follow the structure and practices of the excellent [k8s-at-home/template-cluster-k3s](https://github.com/k8s-at-home/template-cluster-k3s); check it out if you're uncomfortable starting out with an immutable operating system. + +### Installation + +The cluster is running on [Talos Linux](https://talos.dev/), an immutable and ephemeral Linux distribution built around Kubernetes, deployed on bare-metal. [Rook Ceph](https://rook.io/) running hyper-converged with workloads provides persistent block and object storage, while a separate server provides bulk (NFS) file storage. + +### Core components + +- [cilium/cilium](https://github.com/cilium/cilium): Internal Kubernetes networking plugin.
+ +- [rook/rook](https://github.com/rook/rook): Distributed block storage for persistent volumes. +- [mozilla/sops](https://toolkit.fluxcd.io/guides/mozilla-sops/): Manages secrets for Kubernetes, Ansible and Terraform. +- [kubernetes-sigs/external-dns](https://github.com/kubernetes-sigs/external-dns): Automatically manages DNS records from my cluster in a cloud DNS provider. +- [jetstack/cert-manager](https://cert-manager.io/docs/): Creates SSL certificates for services in my Kubernetes cluster. +- [kubernetes/ingress-nginx](https://github.com/kubernetes/ingress-nginx/): Ingress controller to expose HTTP traffic to pods over DNS. + +### GitOps + +[Flux](https://github.com/fluxcd/flux2) watches my [cluster](./cluster/) folder (see Directories below) and makes the changes to my cluster based on the YAML manifests. + +[Renovate](https://github.com/renovatebot/renovate) watches my **entire** repository looking for dependency updates; when they are found, a PR is automatically created. When PRs are merged, [Flux](https://github.com/fluxcd/flux2) applies the changes to my cluster. + +### Directories + +This Git repository contains the following directories (_kustomizations_) under [cluster](./cluster/).
+ +```sh +📁 cluster # k8s cluster defined as code +├─📁 bootstrap # contains the initial kustomization used to install flux +├─📁 flux # flux, gitops operator, loaded before everything +├─📁 crds # custom resources, loaded before 📁 core and 📁 apps +├─📁 charts # helm repos, loaded before 📁 core and 📁 apps +├─📁 config # cluster config, loaded before 📁 core and 📁 apps +├─📁 core # crucial apps, namespaced dir tree, loaded before 📁 apps +└─📁 apps # regular apps, namespaced dir tree, loaded last +``` + +### Networking + +| Name | CIDR | +|----------------------------------------------|-----------------| +| Kubernetes Nodes | `10.75.40.0/24` | +| Kubernetes external services (Cilium w/ BGP) | `10.75.45.0/24` | +| Kubernetes pods | `172.22.0.0/16` | +| Kubernetes services | `172.24.0.0/16` | + +## 🌐 DNS + +### Ingress Controller + +Over WAN, I have port forwarded ports `80` and `443` to the load balancer IP of my ingress controller that's running in my Kubernetes cluster. + +[Cloudflare](https://www.cloudflare.com/) works as a proxy to hide my home's WAN IP and also as a firewall. When not on my home network, all the traffic coming into my ingress controller on ports `80` and `443` comes from Cloudflare. In `VyOS` I block all IPs not originating from [Cloudflare's list of IP ranges](https://www.cloudflare.com/ips/). + +🔸 _Cloudflare is also configured to GeoIP block all countries except a few I have whitelisted_ + +### Internal DNS + +[k8s_gateway](https://github.com/ori-edge/k8s_gateway) is deployed on my router running [VyOS](https://vyos.io/). With this setup, `k8s_gateway` has direct access to my cluster's ingress records and serves DNS for them in my internal network. + +Without much engineering of DNS @home, these options have made my `VyOS` router a single point of failure for DNS. I believe this is OK though because my router _should_ have the most uptime of all my systems.
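As a rough illustration of how `k8s_gateway` serves ingress records (this sketch is not taken from this repo's config, and `example.com` is a placeholder domain), a CoreDNS Corefile using the `k8s_gateway` plugin might look like:

```
example.com:53 {
    # answer queries for Ingress and LoadBalancer hostnames
    # under example.com straight from the Kubernetes API
    k8s_gateway example.com
    log
    errors
}
```

The router then only needs to forward queries for that zone to the process serving this Corefile.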
+ +### External DNS + +[external-dns](https://github.com/kubernetes-sigs/external-dns) is deployed in my cluster and configured to sync DNS records to [Cloudflare](https://www.cloudflare.com/). `external-dns` only gathers DNS records to put in `Cloudflare` from ingresses where I have explicitly set the annotation `external-dns.home.arpa/enabled: "true"`. --- -## :handshake:  Thanks +## 🔧 Hardware -A lot of inspiration for my cluster came from the people that have shared their clusters over at [awesome-home-kubernetes](https://github.com/k8s-at-home/awesome-home-kubernetes) +| Device | Count | OS Disk Size | Data Disk Size | Ram | Operating System | Purpose | +|---------------------------|-------|--------------|----------------------------|-------|------------------|--------------------------------| +| Dell R220 | 1 | 120GB SSD | N/A | 16GB | VyOS 1.4 | Router | +| HP S01-pf1000 | 3 | 120GB SSD | N/A | 8GB | Talos Linux | Kubernetes Control Nodes | +| HP S01-pf1000 | 3 | 120GB SSD | 1TB NVMe (rook-ceph) | 32GB | Talos Linux | Kubernetes Workers | +| SuperMicro SC836 | 1 | 120GB SSD | 16x8TB + 16x3TB ZFS RAIDZ2 | 192GB | Ubuntu 20.04 | NFS | +| Brocade ICX 6610 | 1 | N/A | N/A | N/A | N/A | Core Switch | +| Raspberry Pi 4B | 1 | 32GB SD Card | N/A | 4GB | PiKVM | Network KVM | +| TESmart 8 Port KVM Switch | 1 | N/A | N/A | N/A | N/A | Network KVM switch for PiKVM | +| APC SUA3000RMXL3U w/ NIC | 1 | N/A | N/A | N/A | N/A | UPS | +| APC AP7930 | 1 | N/A | N/A | N/A | N/A | PDU | + +--- + +## 🤝 Thanks + +Thanks to all folks who donate their time to the [Kubernetes @Home](https://github.com/k8s-at-home/) community. A lot of inspiration for my cluster came from those that have shared their clusters over at [awesome-home-kubernetes](https://github.com/k8s-at-home/awesome-home-kubernetes).
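To illustrate the `external-dns.home.arpa/enabled` annotation from the External DNS section above, a hypothetical ingress opting into Cloudflare sync might look like this (the name, namespace, service, and domain are all illustrative, not taken from this repo):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echo-server
  namespace: default
  annotations:
    # opt this ingress into Cloudflare sync via external-dns;
    # ingresses without this annotation stay internal-only
    external-dns.home.arpa/enabled: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: echo.example.com   # placeholder domain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: echo-server
                port:
                  number: 80
```

Ingresses without the annotation are still served internally by `k8s_gateway`, but never leak into public DNS.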
+ +--- + +## 📜 Changelog + +See [commit history](https://github.com/onedr0p/home-ops/commits/main) + +--- + +## 🔏 License + +See [LICENSE](./LICENSE) diff --git a/ansible/ansible.cfg b/ansible/ansible.cfg deleted file mode 100644 index e4e66d97..00000000 --- a/ansible/ansible.cfg +++ /dev/null @@ -1,53 +0,0 @@ -[defaults] - -#--- General settings -nocows = True -forks = 8 -module_name = command -deprecation_warnings = True -executable = /bin/bash - -#--- Files/Directory settings -log_path = ~/ansible.log -inventory = ./inventory -library = /usr/share/my_modules -remote_tmp = ~/.ansible/tmp -local_tmp = ~/.ansible/tmp -roles_path = ./roles -retry_files_enabled = False - -#--- Fact Caching settings -fact_caching = jsonfile -fact_caching_connection = ~/.ansible/facts_cache -fact_caching_timeout = 7200 - -#--- SSH settings -remote_port = 22 -timeout = 60 -host_key_checking = False -ssh_executable = /usr/bin/ssh -private_key_file = ~/.ssh/id_ed25519 - -force_valid_group_names = ignore - -#--- Speed -callback_whitelist = ansible.posix.profile_tasks -internal_poll_interval = 0.001 - -[inventory] -unparsed_is_failed = true - -[privilege_escalation] -become = True -become_method = sudo -become_user = root -become_ask_pass = False - -[ssh_connection] -scp_if_ssh = smart -transfer_method = smart -retries = 3 -timeout = 10 -ssh_args = -o ControlMaster=auto -o ControlPersist=30m -o Compression=yes -o ServerAliveInterval=15s -pipelining = True -control_path = %(directory)s/%%h-%%r diff --git a/ansible/inventory/home-cluster/group_vars/all/calico.yml b/ansible/inventory/home-cluster/group_vars/all/calico.yml deleted file mode 100644 index f1046940..00000000 --- a/ansible/inventory/home-cluster/group_vars/all/calico.yml +++ /dev/null @@ -1,15 +0,0 @@ ---- - -# Encapsulation type -calico_encapsulation: "None" -# BGP Peer IP -# (usually your router IP address) -calico_bgp_peer_ip: 10.75.40.1 -# BGP Autonomous System Number -# (must be the same across all BGP peers) 
-calico_bgp_as_number: 64512 -# BGP Network you want services to consume -# (this network should not exist or be defined anywhere in your network) -calico_bgp_external_ips: 10.75.45.0/24 -# CIDR of the host node interface Calico should use -calico_node_cidr: 10.75.40.0/24 diff --git a/ansible/inventory/home-cluster/group_vars/all/k3s.yml b/ansible/inventory/home-cluster/group_vars/all/k3s.yml deleted file mode 100644 index 1d0796c8..00000000 --- a/ansible/inventory/home-cluster/group_vars/all/k3s.yml +++ /dev/null @@ -1,35 +0,0 @@ ---- -# -# Below vars are for the xanmanning.k3s role -# ...see https://github.com/PyratLabs/ansible-role-k3s#globalcluster-variables -# - -# Use a specific version of k3s -k3s_release_version: "v1.21.2+k3s1" - -# Install using hard links rather than symbolic links. -# ...if you are using the system-upgrade-controller you will need to use hard links rather than symbolic links as the controller will not be able to follow symbolic links. -k3s_install_hard_links: true - -# Escalate user privileges for all tasks. 
-k3s_become_for_all: true - -# Enable debugging -k3s_debug: false - -# HA settings -k3s_etcd_datastore: true -k3s_registration_address: 10.75.45.5 -k3s_registration_domain: k8s-api.dfw.56k.sh - -k3s_server_manifests_templates: - - "calico/calico-installation.yaml.j2" - - "calico/calico-bgpconfiguration.yaml.j2" - - "calico/calico-bgppeer.yaml.j2" - - "kube-vip/kube-vip-rbac.yaml.j2" - - "kube-vip/kube-vip-daemonset.yaml.j2" - -# Custom manifest URLs -k3s_server_manifests_urls: - - url: https://docs.projectcalico.org/archive/v3.19/manifests/tigera-operator.yaml - filename: tigera-operator.yaml diff --git a/ansible/inventory/home-cluster/group_vars/all/kube-vip.yml b/ansible/inventory/home-cluster/group_vars/all/kube-vip.yml deleted file mode 100644 index 373ae77b..00000000 --- a/ansible/inventory/home-cluster/group_vars/all/kube-vip.yml +++ /dev/null @@ -1,11 +0,0 @@ ---- - -kubevip_interface: eth0 - -kubevip_bgp_peer_ip: 10.75.40.1 - -kubevip_address: 10.75.45.5 - -kubevip_bgp_as_number: 64512 - -kubevip_bgp_peer_as_number: 64512 diff --git a/ansible/inventory/home-cluster/group_vars/all/rsyslog.yml b/ansible/inventory/home-cluster/group_vars/all/rsyslog.yml deleted file mode 100644 index e96a21ed..00000000 --- a/ansible/inventory/home-cluster/group_vars/all/rsyslog.yml +++ /dev/null @@ -1,8 +0,0 @@ ---- - -# Enable rsyslog -# ...requires a rsyslog server already set up -rsyslog: - enabled: false - ip: 10.75.45.102 - port: 1514 diff --git a/ansible/inventory/home-cluster/group_vars/all/ubuntu.yml b/ansible/inventory/home-cluster/group_vars/all/ubuntu.yml deleted file mode 100644 index d207d47c..00000000 --- a/ansible/inventory/home-cluster/group_vars/all/ubuntu.yml +++ /dev/null @@ -1,20 +0,0 @@ ---- -# Enable to skip apt upgrade -skip_upgrade_packages: false -# Enable to skip removing crufty packages -skip_remove_packages: false - -# Timezone for the servers -timezone: "America/Chicago" - -# Set custom ntp servers -ntp_servers: - primary: - - "gw.dfw.56k.sh" - 
fallback: - - "0.us.pool.ntp.org" - - "1.us.pool.ntp.org" - - "2.us.pool.ntp.org" - - "3.us.pool.ntp.org" -# Additional ssh public keys to add to the nodes -# ssh_authorized_keys: diff --git a/ansible/inventory/home-cluster/group_vars/gpu-nodes/nvidia-settings.yml b/ansible/inventory/home-cluster/group_vars/gpu-nodes/nvidia-settings.yml deleted file mode 100644 index 240e3e09..00000000 --- a/ansible/inventory/home-cluster/group_vars/gpu-nodes/nvidia-settings.yml +++ /dev/null @@ -1,9 +0,0 @@ ---- - -nvidia_driver: - version: "465.27" - checksum: "sha256:7e69ffa85bdee6aaaa6b6ea7e1db283b0199f9ab21e41a27dc9048f249dc3171" - -nvidia_patch: - version: "d5d564b888aaef99fdd45e23f2fc3eae8e337a39" - checksum: "sha256:d80928c381d141734c13463d69bfaecff77ac66ee6f9036b2f0348b8602989d8" diff --git a/ansible/inventory/home-cluster/group_vars/master-nodes/k3s-settings.yml b/ansible/inventory/home-cluster/group_vars/master-nodes/k3s-settings.yml deleted file mode 100644 index db5b0ea2..00000000 --- a/ansible/inventory/home-cluster/group_vars/master-nodes/k3s-settings.yml +++ /dev/null @@ -1,40 +0,0 @@ ---- -# https://rancher.com/docs/k3s/latest/en/installation/install-options/server-config/ -# https://github.com/PyratLabs/ansible-role-k3s#server-control-plane-configuration - -# Define the host as control plane nodes -k3s_control_node: true - -# k3s settings for all control-plane nodes -k3s_server: - node-ip: "{{ ansible_host }}" - tls-san: - - "{{ k3s_registration_domain }}" - - "{{ k3s_registration_address }}" - docker: false - flannel-backend: "none" # This needs to be in quotes - disable: - - flannel - - traefik - - servicelb - - metrics-server - - local-storage - disable-network-policy: true - disable-cloud-controller: true - write-kubeconfig-mode: "644" - # Network CIDR to use for pod IPs - cluster-cidr: "172.22.0.0/16" - # Network CIDR to use for service IPs - service-cidr: "172.24.0.0/16" - kubelet-arg: - - "feature-gates=GracefulNodeShutdown=true" - # Required to use 
kube-prometheus-stack - kube-controller-manager-arg: - - "address=0.0.0.0" - - "bind-address=0.0.0.0" - kube-proxy-arg: - - "metrics-bind-address=0.0.0.0" - kube-scheduler-arg: - - "address=0.0.0.0" - - "bind-address=0.0.0.0" - etcd-expose-metrics: true diff --git a/ansible/inventory/home-cluster/group_vars/worker-nodes/k3s-settings.yml b/ansible/inventory/home-cluster/group_vars/worker-nodes/k3s-settings.yml deleted file mode 100644 index 8584868b..00000000 --- a/ansible/inventory/home-cluster/group_vars/worker-nodes/k3s-settings.yml +++ /dev/null @@ -1,12 +0,0 @@ ---- -# https://rancher.com/docs/k3s/latest/en/installation/install-options/agent-config/ -# https://github.com/PyratLabs/ansible-role-k3s#agent-worker-configuration - -# Don't define the host as control plane nodes -k3s_control_node: false - -# k3s settings for all worker nodes -k3s_agent: - node-ip: "{{ ansible_host }}" - kubelet-arg: - - "feature-gates=GracefulNodeShutdown=true" diff --git a/ansible/inventory/home-cluster/host_vars/k8s-cuda01.yml b/ansible/inventory/home-cluster/host_vars/k8s-cuda01.yml deleted file mode 100644 index 87437932..00000000 --- a/ansible/inventory/home-cluster/host_vars/k8s-cuda01.yml +++ /dev/null @@ -1,15 +0,0 @@ ---- -# IP address of node -ansible_host: "10.75.40.24" - -# Ansible user to ssh into servers with -ansible_user: "ubuntu" -# ansible_ssh_pass: "ubuntu" -# ansible_ssh_common_args: "-o UserKnownHostsFile=/dev/null" -ansible_become_pass: "ubuntu" - -# Set enabled to true to mark this host as running a distributed storage rook-ceph -rook_ceph: - enabled: false - devices: - - /dev/nvme0n1 diff --git a/ansible/inventory/home-cluster/host_vars/k8s-master01.yml b/ansible/inventory/home-cluster/host_vars/k8s-master01.yml deleted file mode 100644 index 48215e54..00000000 --- a/ansible/inventory/home-cluster/host_vars/k8s-master01.yml +++ /dev/null @@ -1,16 +0,0 @@ ---- - -# IP address of node -ansible_host: "10.75.40.10" - -# Ansible user to ssh into servers with 
-ansible_user: "ubuntu" -# ansible_ssh_pass: "ubuntu" -# ansible_ssh_common_args: "-o UserKnownHostsFile=/dev/null" -ansible_become_pass: "ubuntu" - -# Set enabled to true to mark this host as running a distributed storage rook-ceph -rook_ceph: - enabled: false - # devices: - # - /dev/nvme0n1 diff --git a/ansible/inventory/home-cluster/host_vars/k8s-master02.yml b/ansible/inventory/home-cluster/host_vars/k8s-master02.yml deleted file mode 100644 index e7fd9ab1..00000000 --- a/ansible/inventory/home-cluster/host_vars/k8s-master02.yml +++ /dev/null @@ -1,16 +0,0 @@ ---- - -# IP address of node -ansible_host: "10.75.40.11" - -# Ansible user to ssh into servers with -ansible_user: "ubuntu" -# ansible_ssh_pass: "ubuntu" -# ansible_ssh_common_args: "-o UserKnownHostsFile=/dev/null" -ansible_become_pass: "ubuntu" - -# Set enabled to true to mark this host as running a distributed storage rook-ceph -rook_ceph: - enabled: false - # devices: - # - /dev/nvme0n1 diff --git a/ansible/inventory/home-cluster/host_vars/k8s-master03.yml b/ansible/inventory/home-cluster/host_vars/k8s-master03.yml deleted file mode 100644 index 0a8ab230..00000000 --- a/ansible/inventory/home-cluster/host_vars/k8s-master03.yml +++ /dev/null @@ -1,16 +0,0 @@ ---- - -# IP address of node -ansible_host: "10.75.40.12" - -# Ansible user to ssh into servers with -ansible_user: "ubuntu" -# ansible_ssh_pass: "ubuntu" -# ansible_ssh_common_args: "-o UserKnownHostsFile=/dev/null" -ansible_become_pass: "ubuntu" - -# Set enabled to true to mark this host as running a distributed storage rook-ceph -rook_ceph: - enabled: false - # devices: - # - /dev/nvme0n1 diff --git a/ansible/inventory/home-cluster/host_vars/k8s-worker01.yml b/ansible/inventory/home-cluster/host_vars/k8s-worker01.yml deleted file mode 100644 index d2b311d3..00000000 --- a/ansible/inventory/home-cluster/host_vars/k8s-worker01.yml +++ /dev/null @@ -1,15 +0,0 @@ ---- -# IP address of node -ansible_host: "10.75.40.20" - -# Ansible user to ssh into 
servers with -ansible_user: "ubuntu" -# ansible_ssh_pass: "ubuntu" -# ansible_ssh_common_args: "-o UserKnownHostsFile=/dev/null" -ansible_become_pass: "ubuntu" - -# Set enabled to true to mark this host as running a distributed storage rook-ceph -rook_ceph: - enabled: false - devices: - - /dev/nvme0n1 diff --git a/ansible/inventory/home-cluster/host_vars/k8s-worker02.yml b/ansible/inventory/home-cluster/host_vars/k8s-worker02.yml deleted file mode 100644 index 28a69c52..00000000 --- a/ansible/inventory/home-cluster/host_vars/k8s-worker02.yml +++ /dev/null @@ -1,15 +0,0 @@ ---- -# IP address of node -ansible_host: "10.75.40.21" - -# Ansible user to ssh into servers with -ansible_user: "ubuntu" -# ansible_ssh_pass: "ubuntu" -# ansible_ssh_common_args: "-o UserKnownHostsFile=/dev/null" -ansible_become_pass: "ubuntu" - -# Set enabled to true to mark this host as running a distributed storage rook-ceph -rook_ceph: - enabled: false - devices: - - /dev/nvme0n1 diff --git a/ansible/inventory/home-cluster/host_vars/k8s-worker03.yml b/ansible/inventory/home-cluster/host_vars/k8s-worker03.yml deleted file mode 100644 index 706e01e8..00000000 --- a/ansible/inventory/home-cluster/host_vars/k8s-worker03.yml +++ /dev/null @@ -1,15 +0,0 @@ ---- -# IP address of node -ansible_host: "10.75.40.22" - -# Ansible user to ssh into servers with -ansible_user: "ubuntu" -# ansible_ssh_pass: "ubuntu" -# ansible_ssh_common_args: "-o UserKnownHostsFile=/dev/null" -ansible_become_pass: "ubuntu" - -# Set enabled to true to mark this host as running a distributed storage rook-ceph -rook_ceph: - enabled: false - devices: - - /dev/nvme0n1 diff --git a/ansible/inventory/home-cluster/host_vars/k8s-worker04.yml b/ansible/inventory/home-cluster/host_vars/k8s-worker04.yml deleted file mode 100644 index e045ddcb..00000000 --- a/ansible/inventory/home-cluster/host_vars/k8s-worker04.yml +++ /dev/null @@ -1,15 +0,0 @@ ---- -# IP address of node -ansible_host: "10.75.40.23" - -# Ansible user to ssh into 
servers with -ansible_user: "ubuntu" -# ansible_ssh_pass: "ubuntu" -# ansible_ssh_common_args: "-o UserKnownHostsFile=/dev/null" -ansible_become_pass: "ubuntu" - -# Set enabled to true to mark this host as running a distributed storage rook-ceph -rook_ceph: - enabled: false - devices: - - /dev/nvme0n1 diff --git a/ansible/inventory/home-cluster/host_vars/nas-franxx.yml b/ansible/inventory/home-cluster/host_vars/nas-franxx.yml deleted file mode 100644 index 6c8289c6..00000000 --- a/ansible/inventory/home-cluster/host_vars/nas-franxx.yml +++ /dev/null @@ -1,5 +0,0 @@ ---- - -ansible_host: "10.75.30.15" -ansible_user: toboshii -ansible_become: true diff --git a/ansible/inventory/home-cluster/hosts.yml b/ansible/inventory/home-cluster/hosts.yml deleted file mode 100644 index b3c4fd6d..00000000 --- a/ansible/inventory/home-cluster/hosts.yml +++ /dev/null @@ -1,27 +0,0 @@ ---- - -all: - children: - # Control Plane group, do not change the 'control-plane' name - # hosts should match the filenames in 'host_vars' - master-nodes: - hosts: - k8s-master01: - k8s-master02: - k8s-master03: - # Node group, do not change the 'node' name - # hosts should match the filenames in 'host_vars' - worker-nodes: - hosts: - k8s-worker01: - k8s-worker02: - k8s-worker03: - k8s-worker04: - gpu-nodes: - hosts: - k8s-cuda01: - # Storage group, these are my NAS devices - # hosts should match the filenames in 'host_vars' - storage: - hosts: - nas-franxx: diff --git a/ansible/playbooks/k3s/install.yml b/ansible/playbooks/k3s/install.yml deleted file mode 100644 index 3f2b3e8d..00000000 --- a/ansible/playbooks/k3s/install.yml +++ /dev/null @@ -1,26 +0,0 @@ ---- -- hosts: - - master-nodes - - worker-nodes - - gpu-nodes - become: true - gather_facts: true - any_errors_fatal: true - pre_tasks: - - name: Pausing for 5 seconds... - pause: - seconds: 5 - roles: - - k3s - -- hosts: - - gpu-nodes - become: true - gather_facts: true - any_errors_fatal: true - pre_tasks: - - name: Pausing for 5 seconds... 
- pause: - seconds: 5 - roles: - - nvidia diff --git a/ansible/playbooks/k3s/nuke.yml b/ansible/playbooks/k3s/nuke.yml deleted file mode 100644 index f9523f14..00000000 --- a/ansible/playbooks/k3s/nuke.yml +++ /dev/null @@ -1,33 +0,0 @@ ---- -- hosts: - - master-nodes - - worker-nodes - - gpu-nodes - become: true - gather_facts: true - any_errors_fatal: true - pre_tasks: - - name: Pausing for 5 seconds... - pause: - seconds: 5 - tasks: - - name: kill k3s - ansible.builtin.command: /usr/local/bin/k3s-killall.sh - - name: uninstall k3s - ansible.builtin.command: - cmd: /usr/local/bin/k3s-uninstall.sh - removes: /usr/local/bin/k3s-uninstall.sh - - name: uninstall k3s agent - ansible.builtin.command: - cmd: /usr/local/bin/k3s-agent-uninstall.sh - removes: /usr/local/bin/k3s-agent-uninstall.sh - - name: gather list of CNI files to delete - find: - paths: /etc/cni/net.d - patterns: "*" - register: files_to_delete - - name: delete CNI files - ansible.builtin.file: - path: "{{ item.path }}" - state: absent - loop: "{{ files_to_delete.files }}" diff --git a/ansible/playbooks/k3s/upgrade.yml b/ansible/playbooks/k3s/upgrade.yml deleted file mode 100644 index 63a1e529..00000000 --- a/ansible/playbooks/k3s/upgrade.yml +++ /dev/null @@ -1,14 +0,0 @@ ---- -- hosts: - - master-nodes - - worker-nodes - - gpu-nodes - become: true - gather_facts: true - any_errors_fatal: true - pre_tasks: - - name: Pausing for 5 seconds... - pause: - seconds: 5 - roles: - - k3s diff --git a/ansible/playbooks/ubuntu/prepare.yml b/ansible/playbooks/ubuntu/prepare.yml deleted file mode 100644 index e9fe88c1..00000000 --- a/ansible/playbooks/ubuntu/prepare.yml +++ /dev/null @@ -1,14 +0,0 @@ ---- -- hosts: - - master-nodes - - worker-nodes - - gpu-nodes - become: true - gather_facts: true - any_errors_fatal: true - pre_tasks: - - name: Pausing for 5 seconds... 
- pause: - seconds: 5 - roles: - - ubuntu diff --git a/ansible/playbooks/ubuntu/upgrade.yml b/ansible/playbooks/ubuntu/upgrade.yml deleted file mode 100644 index dfc98c69..00000000 --- a/ansible/playbooks/ubuntu/upgrade.yml +++ /dev/null @@ -1,23 +0,0 @@ ---- -- hosts: - - master-nodes - - worker-nodes - - gpu-nodes - become: true - gather_facts: true - any_errors_fatal: true - pre_tasks: - - name: Pausing for 5 seconds... - pause: - seconds: 5 - tasks: - - name: upgrade - ansible.builtin.apt: - upgrade: full - update_cache: true - cache_valid_time: 3600 - autoclean: true - autoremove: true - register: apt_upgrade - retries: 5 - until: apt_upgrade is success diff --git a/ansible/requirements.txt b/ansible/requirements.txt deleted file mode 100644 index d3119825..00000000 --- a/ansible/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -jmespath==0.10.0 diff --git a/ansible/requirements.yml b/ansible/requirements.yml deleted file mode 100644 index d1fa4490..00000000 --- a/ansible/requirements.yml +++ /dev/null @@ -1,6 +0,0 @@ ---- -roles: - - src: xanmanning.k3s - version: v2.11.1 -collections: - - name: community.general diff --git a/ansible/roles/k3s/tasks/addons.yml b/ansible/roles/k3s/tasks/addons.yml deleted file mode 100644 index 990f07c4..00000000 --- a/ansible/roles/k3s/tasks/addons.yml +++ /dev/null @@ -1,13 +0,0 @@ ---- - -- name: addons | check if cluster is installed - ansible.builtin.stat: - path: "/etc/rancher/k3s/config.yaml" - register: k3s_check_installed - check_mode: false - -- name: addons | set manifest facts - ansible.builtin.set_fact: - k3s_server_manifests_templates: [] - k3s_server_manifests_urls: [] - when: k3s_check_installed.stat.exists diff --git a/ansible/roles/k3s/tasks/cleanup.yml b/ansible/roles/k3s/tasks/cleanup.yml deleted file mode 100644 index 4a1e4ce1..00000000 --- a/ansible/roles/k3s/tasks/cleanup.yml +++ /dev/null @@ -1,13 +0,0 @@ ---- - -- name: cleanup | remove deployed manifest templates - ansible.builtin.file: - path: "{{ 
k3s_server_manifests_dir }}/{{ item | basename | regex_replace('\\.j2$', '') }}" - state: absent - loop: "{{ k3s_server_manifests_templates }}" - -- name: cleanup | remove deployed manifest urls - ansible.builtin.file: - path: "{{ k3s_server_manifests_dir }}/{{ item.filename }}" - state: absent - loop: "{{ k3s_server_manifests_urls }}" diff --git a/ansible/roles/k3s/tasks/main.yml b/ansible/roles/k3s/tasks/main.yml deleted file mode 100644 index 1df3a716..00000000 --- a/ansible/roles/k3s/tasks/main.yml +++ /dev/null @@ -1,17 +0,0 @@ ---- -- include: addons.yml - tags: - - addons - -- name: k3s | cluster configuration - include_role: - name: xanmanning.k3s - public: true - -- include: cleanup.yml - tags: - - cleanup - -- include: kubeconfig.yml - tags: - - kubeconfig diff --git a/ansible/roles/k3s/templates/calico/calico-bgpconfiguration.yaml.j2 b/ansible/roles/k3s/templates/calico/calico-bgpconfiguration.yaml.j2 deleted file mode 100644 index 2ef46fa1..00000000 --- a/ansible/roles/k3s/templates/calico/calico-bgpconfiguration.yaml.j2 +++ /dev/null @@ -1,10 +0,0 @@ ---- -apiVersion: crd.projectcalico.org/v1 -kind: BGPConfiguration -metadata: - name: default -spec: - serviceClusterIPs: - - cidr: "{{ k3s_server['service-cidr'] }}" - serviceExternalIPs: - - cidr: "{{ calico_bgp_external_ips }}" diff --git a/ansible/roles/k3s/templates/calico/calico-bgppeer.yaml.j2 b/ansible/roles/k3s/templates/calico/calico-bgppeer.yaml.j2 deleted file mode 100644 index bfa7cb01..00000000 --- a/ansible/roles/k3s/templates/calico/calico-bgppeer.yaml.j2 +++ /dev/null @@ -1,8 +0,0 @@ ---- -apiVersion: crd.projectcalico.org/v1 -kind: BGPPeer -metadata: - name: global -spec: - peerIP: {{ calico_bgp_peer_ip }} - asNumber: {{ calico_bgp_as_number }} diff --git a/ansible/roles/k3s/templates/calico/calico-installation.yaml.j2 b/ansible/roles/k3s/templates/calico/calico-installation.yaml.j2 deleted file mode 100644 index 16fe73d7..00000000 --- 
a/ansible/roles/k3s/templates/calico/calico-installation.yaml.j2 +++ /dev/null @@ -1,18 +0,0 @@ -#jinja2:lstrip_blocks: True ---- -apiVersion: operator.tigera.io/v1 -kind: Installation -metadata: - name: default -spec: - calicoNetwork: - # Note: The ipPools section cannot be modified post-install. - ipPools: - - blockSize: 26 - cidr: "{{ k3s_server["cluster-cidr"] }}" - encapsulation: "{{ calico_encapsulation }}" - natOutgoing: Enabled - nodeSelector: all() - nodeAddressAutodetectionV4: - cidrs: - - "{{ calico_node_cidr }}" diff --git a/ansible/roles/k3s/templates/kube-vip/kube-vip-daemonset.yaml.j2 b/ansible/roles/k3s/templates/kube-vip/kube-vip-daemonset.yaml.j2 deleted file mode 100644 index 20fe5eca..00000000 --- a/ansible/roles/k3s/templates/kube-vip/kube-vip-daemonset.yaml.j2 +++ /dev/null @@ -1,64 +0,0 @@ ---- -apiVersion: apps/v1 -kind: DaemonSet -metadata: - name: kube-vip - namespace: kube-system -spec: - selector: - matchLabels: - name: kube-vip - template: - metadata: - labels: - name: kube-vip - spec: - containers: - - name: kube-vip - image: ghcr.io/kube-vip/kube-vip:v0.3.5 - imagePullPolicy: IfNotPresent - args: - - manager - env: - - name: vip_arp - value: "false" - - name: vip_interface - value: lo - - name: port - value: "6443" - - name: vip_cidr - value: "32" - - name: cp_enable - value: "true" - - name: cp_namespace - value: kube-system - - name: vip_startleader - value: "false" - - name: vip_loglevel - value: "5" - - name: bgp_enable - value: "true" - - name: bgp_routerinterface - value: "{{ kubevip_interface }}" - - name: bgp_as - value: "{{ kubevip_bgp_as_number }}" - - name: bgp_peeraddress - value: "{{ kubevip_bgp_peer_ip }}" - - name: bgp_peeras - value: "{{ kubevip_bgp_peer_as_number }}" - - name: vip_address - value: "{{ kubevip_address }}" - securityContext: - capabilities: - add: - - NET_ADMIN - - NET_RAW - - SYS_TIME - hostNetwork: true - nodeSelector: - node-role.kubernetes.io/master: "true" - serviceAccountName: kube-vip - 
tolerations: - - key: node-role.kubernetes.io/master - operator: Exists - effect: NoSchedule diff --git a/ansible/roles/k3s/templates/kube-vip/kube-vip-rbac.yaml.j2 b/ansible/roles/k3s/templates/kube-vip/kube-vip-rbac.yaml.j2 deleted file mode 100644 index d1fef4f6..00000000 --- a/ansible/roles/k3s/templates/kube-vip/kube-vip-rbac.yaml.j2 +++ /dev/null @@ -1,33 +0,0 @@ ---- -apiVersion: v1 -kind: ServiceAccount -metadata: - name: kube-vip - namespace: kube-system ---- -apiVersion: rbac.authorization.k8s.io/v1 -kind: ClusterRole -metadata: - annotations: - rbac.authorization.kubernetes.io/autoupdate: "true" - name: system:kube-vip-role -rules: -- apiGroups: [""] - resources: ["services", "services/status", "nodes"] - verbs: ["list", "get", "watch", "update"] -- apiGroups: ["coordination.k8s.io"] - resources: ["leases"] - verbs: ["list", "get", "watch", "update", "create"] ---- -kind: ClusterRoleBinding -apiVersion: rbac.authorization.k8s.io/v1 -metadata: - name: system:kube-vip-binding -roleRef: - apiGroup: rbac.authorization.k8s.io - kind: ClusterRole - name: system:kube-vip-role -subjects: -- kind: ServiceAccount - name: kube-vip - namespace: kube-system diff --git a/ansible/roles/nvidia/files/blacklist-nouveau.conf b/ansible/roles/nvidia/files/blacklist-nouveau.conf deleted file mode 100644 index c9b9bfcf..00000000 --- a/ansible/roles/nvidia/files/blacklist-nouveau.conf +++ /dev/null @@ -1,2 +0,0 @@ -blacklist nouveau -options nouveau modeset=0 diff --git a/ansible/roles/nvidia/files/config.toml.tmpl b/ansible/roles/nvidia/files/config.toml.tmpl deleted file mode 100644 index c4778987..00000000 --- a/ansible/roles/nvidia/files/config.toml.tmpl +++ /dev/null @@ -1,53 +0,0 @@ -[plugins.opt] - path = "{{ .NodeConfig.Containerd.Opt }}" -[plugins.cri] - stream_server_address = "127.0.0.1" - stream_server_port = "10010" - enable_selinux = {{ .NodeConfig.SELinux }} -{{- if .IsRunningInUserNS }} - disable_cgroup = true - disable_apparmor = true - restrict_oom_score_adj = 
true -{{end}} -{{- if .NodeConfig.AgentConfig.PauseImage }} - sandbox_image = "{{ .NodeConfig.AgentConfig.PauseImage }}" -{{end}} -{{- if .NodeConfig.AgentConfig.Snapshotter }} -[plugins.cri.containerd] - disable_snapshot_annotations = true - snapshotter = "{{ .NodeConfig.AgentConfig.Snapshotter }}" -{{end}} -{{- if not .NodeConfig.NoFlannel }} -[plugins.cri.cni] - bin_dir = "{{ .NodeConfig.AgentConfig.CNIBinDir }}" - conf_dir = "{{ .NodeConfig.AgentConfig.CNIConfDir }}" -{{end}} -[plugins.cri.containerd.runtimes.runc] - runtime_type = "io.containerd.runtime.v1.linux" - -[plugins.linux] - runtime = "nvidia-container-runtime" -{{ if .PrivateRegistryConfig }} -{{ if .PrivateRegistryConfig.Mirrors }} -[plugins.cri.registry.mirrors]{{end}} -{{range $k, $v := .PrivateRegistryConfig.Mirrors }} -[plugins.cri.registry.mirrors."{{$k}}"] - endpoint = [{{range $i, $j := $v.Endpoints}}{{if $i}}, {{end}}{{printf "%q" .}}{{end}}] -{{end}} -{{range $k, $v := .PrivateRegistryConfig.Configs }} -{{ if $v.Auth }} -[plugins.cri.registry.configs."{{$k}}".auth] - {{ if $v.Auth.Username }}username = {{ printf "%q" $v.Auth.Username }}{{end}} - {{ if $v.Auth.Password }}password = {{ printf "%q" $v.Auth.Password }}{{end}} - {{ if $v.Auth.Auth }}auth = {{ printf "%q" $v.Auth.Auth }}{{end}} - {{ if $v.Auth.IdentityToken }}identitytoken = {{ printf "%q" $v.Auth.IdentityToken }}{{end}} -{{end}} -{{ if $v.TLS }} -[plugins.cri.registry.configs."{{$k}}".tls] - {{ if $v.TLS.CAFile }}ca_file = "{{ $v.TLS.CAFile }}"{{end}} - {{ if $v.TLS.CertFile }}cert_file = "{{ $v.TLS.CertFile }}"{{end}} - {{ if $v.TLS.KeyFile }}key_file = "{{ $v.TLS.KeyFile }}"{{end}} - {{ if $v.TLS.InsecureSkipVerify }}insecure_skip_verify = true{{end}} -{{end}} -{{end}} -{{end}} diff --git a/ansible/roles/nvidia/tasks/container-runtime.yml b/ansible/roles/nvidia/tasks/container-runtime.yml deleted file mode 100644 index fe08c8b3..00000000 --- a/ansible/roles/nvidia/tasks/container-runtime.yml +++ /dev/null @@ -1,21 +0,0 @@ ---- 
-- name: container-runtime | add apt key - ansible.builtin.apt_key: - url: https://nvidia.github.io/nvidia-container-runtime/gpgkey - state: present - -- name: container-runtime | add apt repos - ansible.builtin.apt_repository: - repo: "{{ item }}" - state: present - mode: 0644 - update_cache: true - filename: nvidia-container-runtime - with_items: - - "deb https://nvidia.github.io/libnvidia-container/stable/{{ ansible_distribution | lower }}{{ ansible_distribution_version }}/$(ARCH) /" - - "deb https://nvidia.github.io/nvidia-container-runtime/stable/{{ ansible_distribution | lower }}{{ ansible_distribution_version }}/$(ARCH) /" - -- name: container-runtime | install nvidia-container-runtime - ansible.builtin.apt: - name: "nvidia-container-runtime" - state: present diff --git a/ansible/roles/nvidia/tasks/driver.yml b/ansible/roles/nvidia/tasks/driver.yml deleted file mode 100644 index 4cd27b8c..00000000 --- a/ansible/roles/nvidia/tasks/driver.yml +++ /dev/null @@ -1,39 +0,0 @@ ---- -- name: driver | blacklist nouveau driver - ansible.builtin.copy: - src: files/blacklist-nouveau.conf - dest: /etc/modprobe.d/blacklist-nouveau.conf - register: blacklist - -- name: driver | update initramfs - ansible.builtin.command: "update-initramfs -u" - when: blacklist.changed - -- name: driver | reboot to unload nouveau - ansible.builtin.reboot: - when: blacklist.changed - -- name: driver | install dkms build tools - ansible.builtin.apt: - name: "{{ item }}" - state: present - with_items: - - "dkms" - - "build-essential" - -- name: driver | download nvidia driver - ansible.builtin.get_url: - url: https://international.download.nvidia.com/XFree86/Linux-x86_64/{{ nvidia_driver.version }}/NVIDIA-Linux-x86_64-{{ nvidia_driver.version }}.run - dest: /tmp/NVIDIA-Linux-x86_64-{{ nvidia_driver.version }}.run - checksum: "{{ nvidia_driver.checksum }}" - mode: "0755" - -- name: driver | install nvidia driver - ansible.builtin.command: - cmd: "/tmp/NVIDIA-Linux-x86_64-{{ 
nvidia_driver.version }}.run -s --no-opengl-files --dkms" - creates: "/proc/driver/nvidia/version" - -- name: driver | load nvidia driver - modprobe: - name: nvidia - state: present diff --git a/ansible/roles/nvidia/tasks/k3s-agent.yml b/ansible/roles/nvidia/tasks/k3s-agent.yml deleted file mode 100644 index 5625a7d6..00000000 --- a/ansible/roles/nvidia/tasks/k3s-agent.yml +++ /dev/null @@ -1,13 +0,0 @@ ---- - -- name: k3s-agent | enable nvidia-container-runtime - ansible.builtin.copy: - src: files/config.toml.tmpl - dest: /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl - register: containerd_config - -- name: k3s-agent | restart agent - service: - name: k3s - state: restarted - when: containerd_config.changed diff --git a/ansible/roles/nvidia/tasks/main.yml b/ansible/roles/nvidia/tasks/main.yml deleted file mode 100644 index 15b318fc..00000000 --- a/ansible/roles/nvidia/tasks/main.yml +++ /dev/null @@ -1,17 +0,0 @@ ---- - -- include: driver.yml - tags: - - driver - -- include: patch.yml - tags: - - patch - -- include: container-runtime.yml - tags: - - container-runtime - -- include: k3s-agent.yml - tags: - - k3s-agent diff --git a/ansible/roles/nvidia/tasks/patch.yml b/ansible/roles/nvidia/tasks/patch.yml deleted file mode 100644 index 80ff2b72..00000000 --- a/ansible/roles/nvidia/tasks/patch.yml +++ /dev/null @@ -1,18 +0,0 @@ ---- - -- name: patch | create patch directory - ansible.builtin.file: - path: /opt/nvidia-patch - state: directory - mode: '0755' - -- name: patch | download nvidia-patch - ansible.builtin.get_url: - url: https://raw.githubusercontent.com/keylase/nvidia-patch/{{ nvidia_patch.version }}/patch.sh - dest: /opt/nvidia-patch/patch.sh - checksum: "{{ nvidia_patch.checksum }}" - mode: '0755' - -- name: patch | patch current nvidia driver - ansible.builtin.command: - cmd: /opt/nvidia-patch/patch.sh diff --git a/ansible/roles/ubuntu/defaults/main.yml b/ansible/roles/ubuntu/defaults/main.yml deleted file mode 100644 index 
e1fb1945..00000000 --- a/ansible/roles/ubuntu/defaults/main.yml +++ /dev/null @@ -1,46 +0,0 @@ ---- -packages: - apt_install: - - apt-transport-https - - arptables - - ca-certificates - - curl - - ebtables - - gdisk - - hdparm - - htop - - iputils-ping - - ipvsadm - - net-tools - - nfs-common - - nano - - ntpdate - - open-iscsi - - psmisc - - socat - - software-properties-common - - unattended-upgrades - - unzip - apt_remove: - - apport - - bcache-tools - - btrfs-progs - - byobu - - cloud-init - - cloud-guest-utils - - cloud-initramfs-copymods - - cloud-initramfs-dyn-netconf - - friendly-recovery - - fwupd - - landscape-common - - lxd-agent-loader - - ntfs-3g - - open-vm-tools - - plymouth - - plymouth-theme-ubuntu-text - - popularity-contest - - snapd - - sosreport - - tmux - - ubuntu-advantage-tools - - ufw diff --git a/ansible/roles/ubuntu/handlers/main.yml b/ansible/roles/ubuntu/handlers/main.yml deleted file mode 100644 index 217990fd..00000000 --- a/ansible/roles/ubuntu/handlers/main.yml +++ /dev/null @@ -1,4 +0,0 @@ ---- - -- name: reboot - ansible.builtin.reboot: diff --git a/ansible/roles/ubuntu/tasks/boot.yml b/ansible/roles/ubuntu/tasks/boot.yml deleted file mode 100644 index 760f8c67..00000000 --- a/ansible/roles/ubuntu/tasks/boot.yml +++ /dev/null @@ -1,46 +0,0 @@ ---- -- name: boot | grub | check for existence of grub - ansible.builtin.stat: - path: /etc/default/grub - register: grub_result - -- name: boot | grub | set apparmor=0 - ansible.builtin.replace: - path: /etc/default/grub - regexp: '^(GRUB_CMDLINE_LINUX=(?:(?![" ]{{ option | regex_escape }}=).)*)(?:[" ]{{ option | regex_escape }}=\S+)?(.*")$' - replace: '\1 {{ option }}={{ value }}\2' - vars: - option: apparmor - value: 0 - when: - - grub_result.stat.exists - notify: reboot - -- name: boot | grub | set mitigations=off - ansible.builtin.replace: - path: /etc/default/grub - regexp: '^(GRUB_CMDLINE_LINUX=(?:(?![" ]{{ option | regex_escape }}=).)*)(?:[" ]{{ option | regex_escape }}=\S+)?(.*")$' 
- replace: '\1 {{ option }}={{ value }}\2' - vars: - option: mitigations - value: "off" - when: - - grub_result.stat.exists - notify: reboot - -- name: boot | grub | set pti=off - ansible.builtin.replace: - path: /etc/default/grub - regexp: '^(GRUB_CMDLINE_LINUX=(?:(?![" ]{{ option | regex_escape }}=).)*)(?:[" ]{{ option | regex_escape }}=\S+)?(.*")$' - replace: '\1 {{ option }}={{ value }}\2' - vars: - option: pti - value: "off" - when: - - grub_result.stat.exists - notify: reboot - -- name: boot | grub | run grub-mkconfig - ansible.builtin.command: grub-mkconfig -o /boot/grub/grub.cfg - when: - - grub_result.stat.exists diff --git a/ansible/roles/ubuntu/tasks/filesystem.yml b/ansible/roles/ubuntu/tasks/filesystem.yml deleted file mode 100644 index f13c0fa0..00000000 --- a/ansible/roles/ubuntu/tasks/filesystem.yml +++ /dev/null @@ -1,21 +0,0 @@ ---- - -- name: filesystem | sysctl | update max_user_watches - ansible.posix.sysctl: - name: fs.inotify.max_user_watches - value: "65536" - state: present - sysctl_file: /etc/sysctl.d/98-kubernetes-fs.conf - -- name: filesystem | swap | disable at runtime - ansible.builtin.command: swapoff -a - when: ansible_swaptotal_mb > 0 - -- name: filesystem | swap | disable on boot - ansible.posix.mount: - name: "{{ item }}" - fstype: swap - state: absent - loop: - - swap - - none diff --git a/ansible/roles/ubuntu/tasks/host.yml b/ansible/roles/ubuntu/tasks/host.yml deleted file mode 100644 index 3582ba57..00000000 --- a/ansible/roles/ubuntu/tasks/host.yml +++ /dev/null @@ -1,6 +0,0 @@ ---- -- name: host | hostname | update inventory hostname - ansible.builtin.hostname: - name: "{{ inventory_hostname }}" - when: - - ansible_hostname != inventory_hostname diff --git a/ansible/roles/ubuntu/tasks/kernel.yml b/ansible/roles/ubuntu/tasks/kernel.yml deleted file mode 100644 index a26592b5..00000000 --- a/ansible/roles/ubuntu/tasks/kernel.yml +++ /dev/null @@ -1,19 +0,0 @@ ---- -- name: kernel | modules | enable at runtime - 
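The `boot | grub` tasks above all reuse one idempotent replace pattern: it rewrites an existing `option=value` inside the `GRUB_CMDLINE_LINUX="..."` quotes if present, and otherwise appends the option before the closing quote. A standalone sketch of the same regex outside Jinja (function names are illustrative, not from the repo):

```javascript
// Escape regex metacharacters in the option name (mirrors Jinja's regex_escape,
// which is why the Ansible tasks pass `option` through that filter).
function escapeRe(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Set or replace `option=value` inside a GRUB_CMDLINE_LINUX="..." line, using
// the same regex as the Ansible replace tasks. Applying it twice with the same
// value is a no-op, which is what makes the tasks idempotent.
function setKernelOption(line, option, value) {
  const opt = escapeRe(option);
  const re = new RegExp(
    `^(GRUB_CMDLINE_LINUX=(?:(?![" ]${opt}=).)*)(?:[" ]${opt}=\\S+)?(.*")$`
  );
  return line.replace(re, `$1 ${option}=${value}$2`);
}

console.log(setKernelOption('GRUB_CMDLINE_LINUX="quiet mitigations=on"', 'mitigations', 'off'));
// -> GRUB_CMDLINE_LINUX="quiet mitigations=off"
```

The negative lookahead `(?![" ]opt=)` stops the first capture group right before an existing `option=value`, so the optional middle group can swallow the old value while the closing-quote tail is preserved.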
community.general.modprobe: - name: "{{ item }}" - state: present - loop: - - br_netfilter - - overlay - - rbd - -- name: kernel | modules | enable on boot - ansible.builtin.copy: - mode: 0644 - content: "{{ item }}" - dest: "/etc/modules-load.d/{{ item }}.conf" - loop: - - br_netfilter - - overlay - - rbd diff --git a/ansible/roles/ubuntu/tasks/locale.yml b/ansible/roles/ubuntu/tasks/locale.yml deleted file mode 100644 index ab3e19a4..00000000 --- a/ansible/roles/ubuntu/tasks/locale.yml +++ /dev/null @@ -1,44 +0,0 @@ ---- -- name: locale | set timezone - community.general.timezone: - name: "{{ timezone | default('America/Chicago') }}" - -- name: locale | copy timesyncd config - ansible.builtin.copy: - mode: 0644 - content: | - [Time] - NTP={{ ntp_servers.primary | default("") | join(" ") }} - FallbackNTP={{ ntp_servers.fallback | join(" ") }} - dest: /etc/systemd/timesyncd.conf - when: - - ntp_servers.primary is defined - - ntp_servers.primary is iterable - - ntp_servers.primary | length > 0 - - ntp_servers.fallback is defined - - ntp_servers.fallback is iterable - - ntp_servers.fallback | length > 0 - -- name: locale | start systemd service - ansible.builtin.systemd: - name: systemd-timesyncd - enabled: true - state: started - -- name: locale | restart systemd service - ansible.builtin.systemd: - name: systemd-timesyncd - daemon_reload: true - enabled: true - state: restarted - -- name: locale | run timedatectl status - ansible.builtin.command: /usr/bin/timedatectl show - changed_when: false - check_mode: false - register: timedatectl_result - -- name: locale | enable ntp - ansible.builtin.command: /usr/bin/timedatectl set-ntp true - when: - - "'NTP=no' in timedatectl_result.stdout" diff --git a/ansible/roles/ubuntu/tasks/main.yml b/ansible/roles/ubuntu/tasks/main.yml deleted file mode 100644 index 592a8adc..00000000 --- a/ansible/roles/ubuntu/tasks/main.yml +++ /dev/null @@ -1,48 +0,0 @@ ---- - -- include: host.yml - tags: - - host - -- include: locale.yml - 
tags: - - locale - -- include: packages.yml - tags: - - packages - -- include: power-button.yml - tags: - - power-button - -- include: kernel.yml - tags: - - kernel - -- include: boot.yml - tags: - - boot - -- include: network.yml - tags: - - network - -- include: filesystem.yml - tags: - - filesystem - -- include: unattended-upgrades.yml - tags: - - unattended-upgrades - -- include: user.yml - tags: - - user - -- include: rsyslog.yml - when: - - rsyslog.enabled is defined - - rsyslog.enabled - tags: - - rsyslog diff --git a/ansible/roles/ubuntu/tasks/network.yml b/ansible/roles/ubuntu/tasks/network.yml deleted file mode 100644 index 9224bd3d..00000000 --- a/ansible/roles/ubuntu/tasks/network.yml +++ /dev/null @@ -1,41 +0,0 @@ ---- -- name: network | check for bridge-nf-call-iptables - ansible.builtin.stat: - path: /proc/sys/net/bridge/bridge-nf-call-iptables - register: bridge_nf_call_iptables_result - -- name: network | sysctl | set config - ansible.builtin.blockinfile: - path: /etc/sysctl.d/99-kubernetes-cri.conf - mode: 0644 - create: true - block: | - net.ipv4.ip_forward = 1 - net.bridge.bridge-nf-call-iptables = 1 - when: - - bridge_nf_call_iptables_result.stat.exists - register: sysctl_network - -- name: network | sysctl | reload - ansible.builtin.shell: sysctl -p /etc/sysctl.d/99-kubernetes-cri.conf - when: - - sysctl_network.changed - - bridge_nf_call_iptables_result.stat.exists - -- name: network | check for vm cloud-init config - ansible.builtin.stat: - path: /etc/netplan/50-cloud-init.yaml - register: cloud_init_result - -- name: network | set ceph interface mtu - ansible.builtin.lineinfile: - path: /etc/netplan/50-cloud-init.yaml - regexp: '^\s*mtu' - insertafter: '^\s*set-name: eth1' - line: " mtu: 9000" - register: netplan_apply - when: cloud_init_result.stat.exists - -- name: network | apply netplan - ansible.builtin.shell: netplan apply - when: netplan_apply.changed diff --git a/ansible/roles/ubuntu/tasks/packages.yml 
b/ansible/roles/ubuntu/tasks/packages.yml deleted file mode 100644 index af98dded..00000000 --- a/ansible/roles/ubuntu/tasks/packages.yml +++ /dev/null @@ -1,94 +0,0 @@ ---- -- name: packages | disable recommends - ansible.builtin.blockinfile: - path: /etc/apt/apt.conf.d/02norecommends - mode: 0644 - create: true - block: | - APT::Install-Recommends "false"; - APT::Install-Suggests "false"; - APT::Get::Install-Recommends "false"; - APT::Get::Install-Suggests "false"; - -- name: packages | upgrade all packages - ansible.builtin.apt: - upgrade: full - update_cache: true - cache_valid_time: 3600 - autoclean: true - autoremove: true - register: apt_upgrade - retries: 5 - until: apt_upgrade is success - when: - - (skip_upgrade_packages is not defined or (skip_upgrade_packages is defined and not skip_upgrade_packages)) - -- name: packages | install common - ansible.builtin.apt: - name: "{{ packages.apt_install }}" - install_recommends: false - update_cache: true - cache_valid_time: 3600 - autoclean: true - autoremove: true - register: apt_install_common - retries: 5 - until: apt_install_common is success - when: - - packages.apt_install is defined - - packages.apt_install is iterable - - packages.apt_install | length > 0 - -- name: packages | remove crufty packages - block: - - name: packages | remove crufty packages | gather install packages - ansible.builtin.package_facts: - manager: auto - when: - - "'snapd' in packages.apt_remove" - - name: packages | remove crufty packages | check if snap is installed - ansible.builtin.debug: - msg: "snapd is installed" - register: snapd_check - when: - - "'snapd' in packages.apt_remove" - - "'snapd' in ansible_facts.packages" - - name: packages | remove crufty packages | remove snap packages - ansible.builtin.command: snap remove {{ item }} - loop: - - lxd - - core18 - - snapd - when: - - "'snapd' in packages.apt_remove" - - "'snapd' in ansible_facts.packages" - - snapd_check.failed is defined - - name: packages | remove crufty 
packages | remove packages - ansible.builtin.apt: - name: "{{ packages.apt_remove }}" - state: absent - autoremove: true - - name: packages | remove crufty packages | remove crufty files - ansible.builtin.file: - state: absent - path: "{{ item }}" - loop: - - "/home/{{ ansible_user }}/.snap" - - "/snap" - - "/var/snap" - - "/var/lib/snapd" - - "/var/cache/snapd" - - "/usr/lib/snapd" - - "/etc/cloud" - - "/var/lib/cloud" - when: - - "'snapd' in packages.apt_remove" - - "'cloud-init' in packages.apt_remove" - when: - - packages.apt_remove is defined - - packages.apt_remove is iterable - - packages.apt_remove | length > 0 - - (skip_remove_packages is not defined or (skip_remove_packages is defined and not skip_remove_packages)) diff --git a/ansible/roles/ubuntu/tasks/power-button.yml b/ansible/roles/ubuntu/tasks/power-button.yml deleted file mode 100644 index 18d93524..00000000 --- a/ansible/roles/ubuntu/tasks/power-button.yml +++ /dev/null @@ -1,15 +0,0 @@ ---- -- name: power-button | disable single power button press shutdown - ansible.builtin.lineinfile: - path: /etc/systemd/logind.conf - regexp: "{{ item.setting }}" - line: "{{ item.setting }}={{ item.value }}" - loop: - - { setting: HandlePowerKey, value: ignore } - -- name: power-button | restart logind systemd service - ansible.builtin.systemd: - name: systemd-logind.service - daemon_reload: true - enabled: true - state: restarted diff --git a/ansible/roles/ubuntu/tasks/rsyslog.yml b/ansible/roles/ubuntu/tasks/rsyslog.yml deleted file mode 100644 index 54e69b51..00000000 --- a/ansible/roles/ubuntu/tasks/rsyslog.yml +++ /dev/null @@ -1,20 +0,0 @@ ---- - -- name: rsyslog - block: - - name: rsyslog | copy promtail configuration - ansible.builtin.template: - src: "rsyslog-50-promtail.conf.j2" - dest: "/etc/rsyslog.d/50-promtail.conf" - mode: 0644 - - name: rsyslog | start systemd service - ansible.builtin.systemd: - name: rsyslog - enabled: true - state: started - - name: rsyslog | restart systemd service - 
ansible.builtin.systemd: - name: rsyslog.service - daemon_reload: true - enabled: true - state: restarted diff --git a/ansible/roles/ubuntu/tasks/unattended-upgrades.yml b/ansible/roles/ubuntu/tasks/unattended-upgrades.yml deleted file mode 100644 index 890cadf4..00000000 --- a/ansible/roles/ubuntu/tasks/unattended-upgrades.yml +++ /dev/null @@ -1,38 +0,0 @@ ---- - -- name: unattended-upgrades | copy 20auto-upgrades config - ansible.builtin.blockinfile: - path: /etc/apt/apt.conf.d/20auto-upgrades - mode: 0644 - create: true - block: | - APT::Periodic::Update-Package-Lists "14"; - APT::Periodic::Download-Upgradeable-Packages "14"; - APT::Periodic::AutocleanInterval "7"; - APT::Periodic::Unattended-Upgrade "1"; - -- name: unattended-upgrades | copy 50unattended-upgrades config - ansible.builtin.blockinfile: - path: /etc/apt/apt.conf.d/50unattended-upgrades - mode: 0644 - create: true - block: | - Unattended-Upgrade::Automatic-Reboot "false"; - Unattended-Upgrade::Remove-Unused-Dependencies "true"; - Unattended-Upgrade::Allowed-Origins { - "${distro_id}:${distro_codename}"; - "${distro_id} ${distro_codename}-security"; - }; - -- name: unattended-upgrades | start systemd service - ansible.builtin.systemd: - name: unattended-upgrades - enabled: true - state: started - -- name: unattended-upgrades | restart systemd service - ansible.builtin.systemd: - name: unattended-upgrades.service - daemon_reload: true - enabled: true - state: restarted diff --git a/ansible/roles/ubuntu/tasks/user.yml b/ansible/roles/ubuntu/tasks/user.yml deleted file mode 100644 index b930082b..00000000 --- a/ansible/roles/ubuntu/tasks/user.yml +++ /dev/null @@ -1,28 +0,0 @@ ---- -- name: user | add to sudoers - ansible.builtin.copy: - content: "{{ ansible_user }} ALL=(ALL:ALL) NOPASSWD:ALL" - dest: "/etc/sudoers.d/{{ ansible_user }}_nopasswd" - mode: "0440" - -- name: user | add additional SSH public keys - ansible.posix.authorized_key: - user: "{{ ansible_user }}" - key: "{{ item }}" - loop: "{{ 
ssh_authorized_keys }}" - when: - - ssh_authorized_keys is defined - - ssh_authorized_keys is iterable - - ssh_authorized_keys | length > 0 - -- name: user | check if hushlogin exists - ansible.builtin.stat: - path: "/home/{{ ansible_user }}/.hushlogin" - register: hushlogin_result - -- name: user | silence the login prompt - ansible.builtin.file: - dest: "/home/{{ ansible_user }}/.hushlogin" - state: touch - owner: "{{ ansible_user }}" - mode: "0775" diff --git a/ansible/roles/ubuntu/templates/rsyslog-50-promtail.conf.j2 b/ansible/roles/ubuntu/templates/rsyslog-50-promtail.conf.j2 deleted file mode 100644 index fa61c4e1..00000000 --- a/ansible/roles/ubuntu/templates/rsyslog-50-promtail.conf.j2 +++ /dev/null @@ -1,4 +0,0 @@ -module(load="omprog") -module(load="mmutf8fix") -action(type="mmutf8fix" replacementChar="?") -action(type="omfwd" protocol="tcp" target="{{ rsyslog.ip }}" port="{{ rsyslog.port }}" Template="RSYSLOG_SyslogProtocol23Format" TCP_Framing="octet-counted" KeepAlive="on") diff --git a/scripts/lib/Talos.class.mjs b/scripts/lib/Talos.class.mjs index b2e32206..fc58b94c 100644 --- a/scripts/lib/Talos.class.mjs +++ b/scripts/lib/Talos.class.mjs @@ -73,7 +73,7 @@ class Talos { console.log(`Waiting for Talos apid to be available`) await sleep(30000) - let healthCheck = await retry(30, expBackoff(), () => $`curl -k https://${nodeConfig.ipAddress}:50000`) + let healthCheck = await retry(30, expBackoff(), () => $`nc -z ${nodeConfig.ipAddress} 50000`) if (await healthCheck.exitCode === 0) { console.log(`${chalk.green.bold('Success:')} You can now push a machine config to ${this.nodes}`) } @@ -86,7 +86,7 @@ class Talos { // Set TESMART switch channel async setChannel(headers, channel) { - const response = await fetch(`${this.proto}://${this.kvm}/api/gpio/pulse?channel=server${channel}_switch`, { method: 'POST', headers }) + const response = await fetch(`${this.proto}://${this.kvm}/api/gpio/pulse?channel=server${--channel}_switch`, { method: 'POST', headers }) 
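In the `setChannel` hunk above, the URL changes from `server${channel}_switch` to `server${--channel}_switch`, which suggests callers pass 1-based TESMART channel numbers while the piKVM GPIO channels are named from `server0_switch` upward (the 1-based assumption is inferred from the diff, not stated in it). Note that the prefix decrement also mutates `channel` itself inside the template literal; `channel - 1` produces the same URL without that side effect:

```javascript
// Map a 1-based KVM channel number to the piKVM GPIO channel name.
// Assumes GPIO channels are named server0_switch..server{N-1}_switch,
// as inferred from the `--channel` change above.
function gpioChannelName(channel) {
  // `channel - 1` rather than `--channel`: same value in the URL,
  // but the caller's variable is left untouched.
  return `server${channel - 1}_switch`;
}

console.log(gpioChannelName(1)); // -> server0_switch
```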
if (!response.ok) { const json = await response.json() throw new Error(`${json.result.error} - ${json.result.error_msg}`) @@ -140,19 +140,19 @@ class Talos { // Send CTRL-ALT-DEL to piKVM async sendReboot(headers) { - await Promise.all([ - fetch(`${this.proto}://${this.kvm}/api/hid/events/send_key?key=ControlLeft&state=true`, { method: 'POST', headers }), - fetch(`${this.proto}://${this.kvm}/api/hid/events/send_key?key=AltLeft&state=true`, { method: 'POST', headers }), - fetch(`${this.proto}://${this.kvm}/api/hid/events/send_key?key=Delete&state=true`, { method: 'POST', headers }), - ]) + // await Promise.all([ + await fetch(`${this.proto}://${this.kvm}/api/hid/events/send_key?key=ControlLeft&state=true`, { method: 'POST', headers }) + await fetch(`${this.proto}://${this.kvm}/api/hid/events/send_key?key=AltLeft&state=true`, { method: 'POST', headers }) + await fetch(`${this.proto}://${this.kvm}/api/hid/events/send_key?key=Delete&state=true`, { method: 'POST', headers }) + // ]) - await sleep(500) + await sleep(2000) - await Promise.all([ - fetch(`${this.proto}://${this.kvm}/api/hid/events/send_key?key=ControlLeft&state=false`, { method: 'POST', headers }), - fetch(`${this.proto}://${this.kvm}/api/hid/events/send_key?key=AltLeft&state=false`, { method: 'POST', headers }), - fetch(`${this.proto}://${this.kvm}/api/hid/events/send_key?key=Delete&state=false`, { method: 'POST', headers }), - ]) + // await Promise.all([ + await fetch(`${this.proto}://${this.kvm}/api/hid/events/send_key?key=ControlLeft&state=false`, { method: 'POST', headers }) + await fetch(`${this.proto}://${this.kvm}/api/hid/events/send_key?key=AltLeft&state=false`, { method: 'POST', headers }) + await fetch(`${this.proto}://${this.kvm}/api/hid/events/send_key?key=Delete&state=false`, { method: 'POST', headers }) + // ]) } } diff --git a/talos/cni/values.yaml b/talos/cni/values.yaml index fa6775bd..be81400c 100644 --- a/talos/cni/values.yaml +++ b/talos/cni/values.yaml @@ -18,6 +18,3 @@ bgp: hubble: 
enabled: false - -ipv6: - enabled: false diff --git a/terraform/.secrets.sops.yaml b/terraform/.secrets.sops.yaml deleted file mode 100644 index 3c340f8a..00000000 --- a/terraform/.secrets.sops.yaml +++ /dev/null @@ -1,34 +0,0 @@ -k8s: - user_password: ENC[AES256_GCM,data:4EVJWH/7,iv:35y9BQXaKcPQMfr5vtcHiWQqV1MZIKchzHExTg0d3H0=,tag:pKBxiQkeHaLvLS0BdI786w==,type:str] - ssh_key: ENC[AES256_GCM,data:q+atXCWn5fSACn9SHOIHxHqHzM7RnySUcJBJzwF18RbEwzSOHthdbLwhYbgy0jiyWollUx6q0OZYKrkNovR+GecqRQZkjPpqBZly/ktMLOc=,iv:bYJcLXeZ5GtgjvOKAeS/JXi+Elx6DIfA4cy9lcAOIJ0=,tag:oZpTLWaUlImDxSxf734oUA==,type:str] -sops: - kms: [] - gcp_kms: [] - azure_kv: [] - hc_vault: [] - age: [] - lastmodified: "2021-05-11T19:00:29Z" - mac: ENC[AES256_GCM,data:EVEYk/NvrM0zZcpuOCl5Ums1vYzgRQtPzPwAz1WTi3d5CEZFQJxps9N1FsFeVqE65o7myqzOZ2EI/ImVAJsq+8RbpXCbkIBffYcMPqL8KrJsy0nRAKNZtmeYdB1jRthGeW20vzbIAbU//LxNFB3Pm4wmRg3/VFF9UcgT/Ka6MKU=,iv:neN8mfBCW9LYWR1kHdXWu/pW6896Z7+jyCNxPJJGBWc=,tag:3n5Oo9/ZHXziYf3x1P+uzA==,type:str] - pgp: - - created_at: "2021-05-11T18:53:16Z" - enc: | - -----BEGIN PGP MESSAGE----- - - hQIMAySEZvKqXwiCAQ//TuGQ7qa8uA3hhn4DpzBPwiR6js536rNwNESIPFlRJSYc - NC7sxhEHdQNeQycbf39uVhQKqF3m9HcT5KePbWkMW+WwjrHOO2rXwDO8AUY8zzdM - COU7s/UeZbTr7kWNGEun6mfPKbQ0nEdCnbXfktbBZJp2vkjFKpnm4Ibnam2D7c8K - j50T+0WJXROpNeF/eVqXCGKHZu7BUqGTuyldHRTMFlxufJzOw1ZeIoN2hXEGR4kT - 4OBYCzkAmf0/EMKSQnOwJfSFKfTMaNsYP+fB3kh1fwQlobsdtVDinNLnnFFx1RIS - DSgnsqtxqslNleTNmngqCPL+AISlfuA5oUa6W31IZDK/1amd7h2A/XuLl885NrSp - /CEJosxmmU7lMSwUXLZVi3RyLxJvQWRwvKmzlCwnC3L10WfWuTjUV7Wdb1+SoqBB - gFp/hRYl8I7MIzRJREsjCJmdKYbF2KfJYOtPo7Aht9mmiczgZCJBzwXhE/ZLIHCj - /lyIzogT8h/R9bgRlfZS6qd+okSXKL28K7gBcFZ/7rK0gf+mhtgbxS3emGwoXOa1 - JWe+B7m4BK39lZ3vWs0bty3u0tjbb6f40pk2/0EB6tSq3pw0Zqqv4SG5vxaJaOGh - 7ICp7m9IMMmhfJ79nvOSwpGORIz/MS3jvxEcWxtnrB7VHMsku0HgdGpQamjZ95jS - XgFX/HLN49ASpyiaF47IPjqtLGX2tJ2lOJg9tYGDayOJGRbL8xJCCDwPjKn5KwOe - gT75OaBWsWvAKCuixi0AF6jS/kC/BXD6k8yDm5uee47kA5bnUdY5Tz9yK77jELw= - =BzQi - -----END PGP MESSAGE----- - fp: 
0E883B2F1196288130061C6BA8B44BCF50372B6B - unencrypted_suffix: _unencrypted - version: 3.7.1 diff --git a/terraform/README.md b/terraform/README.md deleted file mode 100644 index c0a3c7a9..00000000 --- a/terraform/README.md +++ /dev/null @@ -1,40 +0,0 @@ -# Peparing Ubuntu cloudinit image for Proxmox - -### Download Ubuntu 20.04 cloudimg -`wget https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img` - -### Install libguestfs-tools on Proxmox server. -`apt-get install libguestfs-tools` - -### Install qemu-guest-agent on Ubuntu image. -`virt-customize -a focal-server-cloudimg-amd64.img --install qemu-guest-agent` - -### Enable password authentication in the template. Obviously, not recommended for except for testing. -`virt-customize -a focal-server-cloudimg-amd64.img --run-command "sed -i 's/.*PasswordAuthentication.*/PasswordAuthentication yes/g' /etc/ssh/sshd_config"` - -### Set environment variables. Change these as necessary. -```sh -export STORAGE_POOL="local-lvm" -export VM_ID="10000" -export VM_NAME="ubuntu-20.04-cloudimg" -``` - -### Create Proxmox VM image from Ubuntu Cloud Image. -```sh -qm create $VM_ID --memory 2048 --net0 virtio,bridge=vmbr0 -qm importdisk $VM_ID focal-server-cloudimg-amd64.img $STORAGE_POOL -qm set $VM_ID --scsihw virtio-scsi-pci --scsi0 $STORAGE_POOL:vm-$VM_ID-disk-0 -qm set $VM_ID --agent enabled=1,fstrim_cloned_disks=1 -qm set $VM_ID --name $VM_NAME -``` - -### Create Cloud-Init Disk and configure boot. 
-```sh -qm set $VM_ID --ide2 $STORAGE_POOL:cloudinit -qm set $VM_ID --boot c --bootdisk scsi0 -qm set $VM_ID --serial0 socket --vga serial0 - -qm template $VM_ID - -rm focal-server-cloudimg-amd64.img -``` diff --git a/terraform/main.tf b/terraform/main.tf deleted file mode 100644 index d22a0ccc..00000000 --- a/terraform/main.tf +++ /dev/null @@ -1,24 +0,0 @@ -terraform { - required_version = ">= 0.13.0" - - required_providers { - proxmox = { - source = "Telmate/proxmox" - version = "2.9.0" - } - - sops = { - source = "carlpett/sops" - version = "0.6.3" - } - } -} - -provider "proxmox" { - pm_tls_insecure = true - pm_api_url = "https://10.75.30.20:8006/api2/json" - pm_user = "root@pam" - pm_parallel = 4 -} - -provider "sops" {} diff --git a/terraform/masters.tf b/terraform/masters.tf deleted file mode 100644 index 74dda3b6..00000000 --- a/terraform/masters.tf +++ /dev/null @@ -1,47 +0,0 @@ -resource "proxmox_vm_qemu" "kube-master" { - for_each = var.masters - - name = each.key - target_node = each.value.target_node - agent = 1 - clone = var.common.clone - vmid = each.value.id - memory = each.value.memory - cores = each.value.cores - vga { - type = "qxl" - } - network { - model = "virtio" - macaddr = each.value.macaddr - bridge = "vmbr0" - tag = 40 - firewall = true - } - network { - model = "virtio" - bridge = "vmbr1" - } - disk { - type = "scsi" - storage = "fast-pool" - size = each.value.disk - format = "raw" - ssd = 1 - discard = "on" - } - serial { - id = 0 - type = "socket" - } - bootdisk = "scsi0" - scsihw = "virtio-scsi-pci" - os_type = "cloud-init" - ipconfig0 = "ip=${each.value.cidr},gw=${each.value.gw}" - ipconfig1 = "ip=${each.value.ceph_cidr}" - ciuser = "ubuntu" - cipassword = data.sops_file.secrets.data["k8s.user_password"] - searchdomain = var.common.search_domain - nameserver = var.common.nameserver - sshkeys = data.sops_file.secrets.data["k8s.ssh_key"] -} diff --git a/terraform/outputs.tf b/terraform/outputs.tf deleted file mode 100644 index 
e69de29b..00000000 diff --git a/terraform/secrets.tf b/terraform/secrets.tf deleted file mode 100644 index 0121e765..00000000 --- a/terraform/secrets.tf +++ /dev/null @@ -1,3 +0,0 @@ -data "sops_file" "secrets" { - source_file = ".secrets.yaml" -} diff --git a/terraform/variables.tf b/terraform/variables.tf deleted file mode 100644 index 70f56c47..00000000 --- a/terraform/variables.tf +++ /dev/null @@ -1,99 +0,0 @@ - -variable "common" { - type = map(string) - default = { - os_type = "ubuntu" - clone = "ubuntu-20.04-cloudimg" - search_domain = "dfw.56k.sh 56k.sh" - nameserver = "10.75.0.1" - } -} - -variable "masters" { - type = map(map(string)) - default = { - k8s-master01 = { - id = 4010 - cidr = "10.75.40.10/24" - ceph_cidr = "10.75.33.40/24" - cores = 8 - gw = "10.75.40.1" - macaddr = "02:DE:4D:48:28:01" - memory = 8192 - disk = "40G" - target_node = "pve01" - }, - k8s-master02 = { - id = 4011 - cidr = "10.75.40.11/24" - ceph_cidr = "10.75.33.41/24" - cores = 8 - gw = "10.75.40.1" - macaddr = "02:DE:4D:48:28:02" - memory = 8192 - disk = "40G" - target_node = "pve02" - }, - k8s-master03 = { - id = 4012 - cidr = "10.75.40.12/24" - ceph_cidr = "10.75.33.42/24" - cores = 8 - gw = "10.75.40.1" - macaddr = "02:DE:4D:48:28:03" - memory = 8192 - disk = "40G" - target_node = "pve03" - } - } -} - -variable "workers" { - type = map(map(string)) - default = { - k8s-worker01 = { - id = 4020 - cidr = "10.75.40.20/24" - ceph_cidr = "10.75.33.50/24" - cores = 16 - gw = "10.75.40.1" - macaddr = "02:DE:4D:48:28:0A" - memory = 16384 - disk = "40G" - target_node = "pve01" - }, - k8s-worker02 = { - id = 4021 - cidr = "10.75.40.21/24" - ceph_cidr = "10.75.33.51/24" - cores = 16 - gw = "10.75.40.1" - macaddr = "02:DE:4D:48:28:0B" - memory = 16384 - disk = "40G" - target_node = "pve02" - }, - k8s-worker03 = { - id = 4022 - cidr = "10.75.40.22/24" - ceph_cidr = "10.75.33.52/24" - cores = 16 - gw = "10.75.40.1" - macaddr = "02:DE:4D:48:28:0C" - memory = 16384 - disk = "40G" - 
target_node = "pve03" - }, - k8s-worker04 = { - id = 4023 - cidr = "10.75.40.23/24" - ceph_cidr = "10.75.33.53/24" - cores = 16 - gw = "10.75.40.1" - macaddr = "02:DE:4D:48:28:0D" - memory = 16384 - disk = "40G" - target_node = "pve04" - }, - } -} diff --git a/terraform/workers.tf b/terraform/workers.tf deleted file mode 100644 index a9c29a50..00000000 --- a/terraform/workers.tf +++ /dev/null @@ -1,47 +0,0 @@ -resource "proxmox_vm_qemu" "kube-worker" { - for_each = var.workers - - name = each.key - target_node = each.value.target_node - agent = 1 - clone = var.common.clone - vmid = each.value.id - memory = each.value.memory - cores = each.value.cores - vga { - type = "qxl" - } - network { - model = "virtio" - macaddr = each.value.macaddr - bridge = "vmbr0" - tag = 40 - firewall = true - } - network { - model = "virtio" - bridge = "vmbr1" - } - disk { - type = "scsi" - storage = "rust-pool" - size = each.value.disk - format = "raw" - ssd = 1 - discard = "on" - } - serial { - id = 0 - type = "socket" - } - bootdisk = "scsi0" - scsihw = "virtio-scsi-pci" - os_type = "cloud-init" - ipconfig0 = "ip=${each.value.cidr},gw=${each.value.gw}" - ipconfig1 = "ip=${each.value.ceph_cidr}" - ciuser = "ubuntu" - cipassword = data.sops_file.secrets.data["k8s.user_password"] - searchdomain = var.common.search_domain - nameserver = var.common.nameserver - sshkeys = data.sops_file.secrets.data["k8s.ssh_key"] -}