diff --git a/README.md b/README.md index bdd62ba..4347883 100644 --- a/README.md +++ b/README.md @@ -13,11 +13,13 @@ Kubernetes The Hard Way is optimized for learning, which means taking the long r This tutorial is a modified version of the original developed by [Kelsey Hightower](https://github.com/kelseyhightower/kubernetes-the-hard-way). While the original uses GCP as the platform to deploy Kubernetes, we use VirtualBox and Vagrant to deploy a cluster on a local machine. If you prefer the cloud version, refer to the original [here](https://github.com/kelseyhightower/kubernetes-the-hard-way). -> The results of this tutorial should not be viewed as production ready, and may receive limited support from the community, but don't let that stop you from learning! +> The results of this tutorial should *not* be viewed as production ready, and may receive limited support from the community, but don't let that stop you from learning!
Note that we are only building 2 masters here instead of the recommended minimum of 3 that `etcd` needs to tolerate the loss of one node while still maintaining quorum. This is to save on resources, and simply to show how to load balance across more than one master. Please note that with this particular challenge, it is all about the minute detail. If you miss one tiny step anywhere along the way, it's going to break! -Always run the `cert_verify` script at the places it suggests, and always ensure you are on the correct node when you do stuff. If `cert_verify` shows anything in red, then you have made an error in a previous step. For the master node checks, run the check on `master-1` and on `master-2` +Note also that in developing this lab, it has been tested *many many* times! Once you have the VMs up and you start to build the cluster, if at any point something isn't working, it is 99.9999% likely to be because you missed something, not a bug in the lab! + +Always run the `cert_verify.sh` script at the places it suggests, and always ensure you are on the correct node when you run each command. If `cert_verify.sh` shows anything in red, then you have made an error in a previous step. For the master node checks, run the check on both `master-1` and `master-2`. ## Target Audience @@ -27,11 +29,10 @@ The target audience for this tutorial is someone planning to support a productio Kubernetes The Hard Way guides you through bootstrapping a highly available Kubernetes cluster with end-to-end encryption between components and RBAC authentication. -* [Kubernetes](https://github.com/kubernetes/kubernetes) 1.24.3 -* [Container Runtime](https://github.com/containerd/containerd) 1.5.9 -* [CNI Container Networking](https://github.com/containernetworking/cni) 0.8.6 +* [Kubernetes](https://github.com/kubernetes/kubernetes) Latest version +* [Container Runtime](https://github.com/containerd/containerd) Latest version * [Weave Networking](https://www.weave.works/docs/net/latest/kubernetes/kube-addon/) -* [etcd](https://github.com/coreos/etcd) v3.5.3 +* [etcd](https://github.com/coreos/etcd) v3.5.9 * [CoreDNS](https://github.com/coredns/coredns) v1.9.4 ### Node configuration @@ -40,7 +41,7 @@ We will be building the following: * Two control plane nodes (`master-1` and `master-2`) running the control plane components as operating system services. * Two worker nodes (`worker-1` and `worker-2`) -* One loadbalancer VM running HAProxy to balance requests between the two API servers. +* One loadbalancer VM running [HAProxy](https://www.haproxy.org/) to balance requests between the two API servers. ## Labs diff --git a/docs/01-prerequisites.md b/docs/01-prerequisites.md index 7ca75e9..00116da 100644 --- a/docs/01-prerequisites.md +++ b/docs/01-prerequisites.md @@ -10,7 +10,7 @@ Download and Install [VirtualBox](https://www.virtualbox.org/wiki/Downloads) on any one of the supported platforms: - Windows hosts - - OS X hosts (x86 only, not M1) + - OS X hosts (x86 only, not Apple Silicon M-series) - Linux distributions - Solaris hosts diff --git a/docs/02-compute-resources.md b/docs/02-compute-resources.md index 407e468..0a008be 100644 --- a/docs/02-compute-resources.md +++ b/docs/02-compute-resources.md @@ -11,9 +11,13 @@ git clone https://github.com/mmumshad/kubernetes-the-hard-way.git CD into vagrant directory ```bash -cd kubernetes-the-hard-way\vagrant +cd kubernetes-the-hard-way/vagrant ``` +The `Vagrantfile` is configured to assume you have at least an 8 core CPU, which most modern Core i5, i7 and i9 processors do, and at least 16GB RAM. You can tune these values, especially if you have *less* than this, by editing the `Vagrantfile` before the next step and adjusting the values for `RAM_SIZE` and `CPU_CORES` accordingly, as sketched below.
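A minimal sketch of that tuning, assuming the `RAM_SIZE` and `CPU_CORES` variables defined near the top of the `Vagrantfile` (both appear in the Vagrantfile changes later in this diff) and GNU `sed` on the host:

```bash
# Halve the assumed host memory. 8 is the floor; the Vagrantfile
# deliberately raises an error for anything lower.
sed -i 's/^RAM_SIZE = .*/RAM_SIZE = 8/' Vagrantfile

# CPU_CORES can be adjusted the same way if your CPU differs from the
# assumed 8 cores.
```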
+ +This will not work if you have less than 8GB of RAM. + Run Vagrant up ```bash diff --git a/docs/03-client-tools.md b/docs/03-client-tools.md index 2b68809..c92ff60 100644 --- a/docs/03-client-tools.md +++ b/docs/03-client-tools.md @@ -10,33 +10,39 @@ Here we create an SSH key pair for the `vagrant` user who we are logged in as. W Generate Key Pair on `master-1` node +[//]: # (host:master-1) + ```bash ssh-keygen ``` -Leave all settings to default. +Leave all settings to default by pressing `ENTER` at any prompt. -View the generated public key ID at: - -```bash -cat ~/.ssh/id_rsa.pub -``` - -Add this key to the local authorized_keys (`master-1`) as in some commands we scp to ourself +Add this key to the local authorized_keys (`master-1`), as some commands `scp` to ourselves. ```bash cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys ``` -Copy the output into a notepad and form it into the following command +Copy the key to the other hosts. For this step please enter `vagrant` where a password is requested. + +The option `-o StrictHostKeyChecking=no` tells it not to ask if you want to connect to a previously unknown host. Not best practice in the real world, but speeds things up here. ```bash -cat >> ~/.ssh/authorized_keys <<EOF -ssh-rsa AAAA... -EOF +ssh-copy-id -o StrictHostKeyChecking=no vagrant@master-2 +ssh-copy-id -o StrictHostKeyChecking=no vagrant@worker-1 +ssh-copy-id -o StrictHostKeyChecking=no vagrant@worker-2 ``` -> output +> output will be similar to this, although versions may be newer ``` kubectl version -o yaml clientVersion: - buildDate: "2022-07-13T14:30:46Z" + buildDate: "2023-11-15T16:58:22Z" compiler: gc - gitCommit: aef86a93758dc3cb2c658dd9657ab4ad4afc21cb + gitCommit: bae2c62678db2b5053817bc97181fcc2e8388103 gitTreeState: clean - gitVersion: v1.24.3 - goVersion: go1.18.3 + gitVersion: v1.28.4 + goVersion: go1.20.11 major: "1" - minor: "24" + minor: "28" platform: linux/amd64 -kustomizeVersion: v4.5.4 +kustomizeVersion: v5.0.4-0.20230601165947-6ce0bf390ce3 The connection to the server localhost:8080 was refused - did you specify the right host or port? ``` diff --git a/docs/04-certificate-authority.md b/docs/04-certificate-authority.md index 7e0608e..aa38611 100644 --- a/docs/04-certificate-authority.md +++ b/docs/04-certificate-authority.md @@ -59,9 +59,6 @@ Create a CA certificate, then generate a Certificate Signing Request and use it # Create private key for CA openssl genrsa -out ca.key 2048 - # Comment line starting with RANDFILE in /etc/ssl/openssl.cnf definition to avoid permission issues - sudo sed -i '0,/RANDFILE/{s/RANDFILE/\#&/}' /etc/ssl/openssl.cnf - # Create CSR using the private key openssl req -new -key ca.key -subj "/CN=KUBERNETES-CA/O=Kubernetes" -out ca.csr @@ -355,7 +352,9 @@ service-account.crt Run the following, and select option 1 to check all required certificates were generated.
-```bash +[//]: # (command:./cert_verify.sh 1) + +``` ./cert_verify.sh ``` @@ -374,7 +373,7 @@ Copy the appropriate certificates and private keys to each instance: ```bash { for instance in master-1 master-2; do - scp ca.crt ca.key kube-apiserver.key kube-apiserver.crt \ + scp -o StrictHostKeyChecking=no ca.crt ca.key kube-apiserver.key kube-apiserver.crt \ apiserver-kubelet-client.crt apiserver-kubelet-client.key \ service-account.key service-account.crt \ etcd-server.key etcd-server.crt \ @@ -389,11 +388,13 @@ done } ``` -## Optional - Check Certificates +## Optional - Check Certificates on master-2 -At `master-1` and `master-2` nodes, run the following, selecting option 1 +At the `master-2` node, run the following, selecting option 1 -```bash +[//]: # (command:ssh master-2 './cert_verify.sh 1') + +``` ./cert_verify.sh ``` diff --git a/docs/05-kubernetes-configuration-files.md b/docs/05-kubernetes-configuration-files.md index 02d92a4..3963fd0 100644 --- a/docs/05-kubernetes-configuration-files.md +++ b/docs/05-kubernetes-configuration-files.md @@ -178,7 +178,10 @@ done At `master-1` and `master-2` nodes, run the following, selecting option 2 -```bash +[//]: # (command:./cert_verify.sh 2) +[//]: # (command:ssh master-2 './cert_verify.sh 2') + +``` ./cert_verify.sh ``` diff --git a/docs/07-bootstrapping-etcd.md b/docs/07-bootstrapping-etcd.md index b3d813b..2698077 100644 --- a/docs/07-bootstrapping-etcd.md +++ b/docs/07-bootstrapping-etcd.md @@ -20,16 +20,17 @@ Download the official etcd release binaries from the [etcd](https://github.com/e ```bash +ETCD_VERSION="v3.5.9" wget -q --show-progress --https-only --timestamping \ - "https://github.com/coreos/etcd/releases/download/v3.5.3/etcd-v3.5.3-linux-amd64.tar.gz" + "https://github.com/coreos/etcd/releases/download/${ETCD_VERSION}/etcd-${ETCD_VERSION}-linux-amd64.tar.gz" ``` Extract and install the `etcd` server and the `etcdctl` command line utility: ```bash { - tar -xvf etcd-v3.5.3-linux-amd64.tar.gz - sudo mv etcd-v3.5.3-linux-amd64/etcd* /usr/local/bin/ + tar -xvf etcd-${ETCD_VERSION}-linux-amd64.tar.gz + sudo mv etcd-${ETCD_VERSION}-linux-amd64/etcd* /usr/local/bin/ } ``` diff --git a/docs/08-bootstrapping-kubernetes-controllers.md b/docs/08-bootstrapping-kubernetes-controllers.md index c98af75..f9fe117 100644 --- a/docs/08-bootstrapping-kubernetes-controllers.md +++ b/docs/08-bootstrapping-kubernetes-controllers.md @@ -16,14 +16,16 @@ You can perform this step with [tmux](01-prerequisites.md#running-commands-in-pa ### Download and Install the Kubernetes Controller Binaries -Download the official Kubernetes release binaries: +Download the latest official Kubernetes release binaries: ```bash +KUBE_VERSION=$(curl -L -s https://dl.k8s.io/release/stable.txt) + wget -q --show-progress --https-only --timestamping \ - "https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kube-apiserver" \ - "https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kube-controller-manager" \ - "https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kube-scheduler" \ - "https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kubectl" + "https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kube-apiserver" \ + "https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kube-controller-manager" \ + "https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kube-scheduler" \ + "https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kubectl" ``` Reference: 
https://kubernetes.io/releases/download/#binaries @@ -210,7 +212,9 @@ sudo chmod 600 /var/lib/kubernetes/*.kubeconfig At `master-1` and `master-2` nodes, run the following, selecting option 3 -```bash +[//]: # (command:./cert_verify.sh 3) + +``` ./cert_verify.sh ``` @@ -316,7 +320,7 @@ curl https://${LOADBALANCER}:6443/version -k { "major": "1", "minor": "24", - "gitVersion": "v1.24.3", + "gitVersion": "${KUBE_VERSION}", "gitCommit": "aef86a93758dc3cb2c658dd9657ab4ad4afc21cb", "gitTreeState": "clean", "buildDate": "2022-07-13T14:23:26Z", diff --git a/docs/09-install-cri-workers.md b/docs/09-install-cri-workers.md index d50220f..d55e9f2 100644 --- a/docs/09-install-cri-workers.md +++ b/docs/09-install-cri-workers.md @@ -1,4 +1,4 @@ -# Installing CRI on the Kubernetes Worker Nodes +# Installing Container Runtime on the Kubernetes Worker Nodes In this lab you will install the Container Runtime Interface (CRI) on both worker nodes. CRI is a standard interface for the management of containers. Since v1.24 the use of dockershim has been fully deprecated and removed from the code base. [containerd replaces docker](https://kodekloud.com/blog/kubernetes-removed-docker-what-happens-now/) as the container runtime for Kubernetes, and it requires support from [CNI Plugins](https://github.com/containernetworking/plugins) to configure container networks, and [runc](https://github.com/opencontainers/runc) to actually do the job of running containers. @@ -8,74 +8,49 @@ Reference: https://github.com/containerd/containerd/blob/main/docs/getting-start The commands in this lab must be run on each worker instance: `worker-1`, and `worker-2`. Login to each worker instance using SSH Terminal. +Here we will install the container runtime `containerd` from the Ubuntu distribution, and `kubectl` plus the CNI tools from the Kubernetes distribution. Kubectl is required on worker-2 to initialize kubeconfig files for the worker-node auto registration. + [//]: # (host:worker-1-worker-2) You can perform this step with [tmux](01-prerequisites.md#running-commands-in-parallel-with-tmux) -The versions chosen here align with those that are installed by the current `kubernetes-cni` package for a v1.24 cluster. +Set up the Kubernetes `apt` repository ```bash { - CONTAINERD_VERSION=1.5.9 - CNI_VERSION=0.8.6 - RUNC_VERSION=1.1.1 + KUBE_LATEST=$(curl -L -s https://dl.k8s.io/release/stable.txt | awk 'BEGIN { FS="." } { printf "%s.%s", $1, $2 }') - wget -q --show-progress --https-only --timestamping \ - https://github.com/containerd/containerd/releases/download/v${CONTAINERD_VERSION}/containerd-${CONTAINERD_VERSION}-linux-amd64.tar.gz \ - https://github.com/containernetworking/plugins/releases/download/v${CNI_VERSION}/cni-plugins-linux-amd64-v${CNI_VERSION}.tgz \ - https://github.com/opencontainers/runc/releases/download/v${RUNC_VERSION}/runc.amd64 + sudo mkdir -p /etc/apt/keyrings + curl -fsSL https://pkgs.k8s.io/core:/stable:/${KUBE_LATEST}/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg - sudo mkdir -p /opt/cni/bin - - sudo chmod +x runc.amd64 - sudo mv runc.amd64 /usr/local/bin/runc - - sudo tar -xzvf containerd-${CONTAINERD_VERSION}-linux-amd64.tar.gz -C /usr/local - sudo tar -xzvf cni-plugins-linux-amd64-v${CNI_VERSION}.tgz -C /opt/cni/bin + echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/${KUBE_LATEST}/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list } ``` -Next create the `containerd` service unit. 
- -```bash -cat <<EOF [... truncated in this diff: the removed manual containerd/runc/CNI install steps and their systemd units, and the apt-based install that replaces them ...] Next: [Bootstrapping the Kubernetes Worker Nodes](10-bootstrapping-kubernetes-workers.md) diff --git a/docs/10-bootstrapping-kubernetes-workers.md index 6d07408..4f6558b 100644 --- a/docs/10-bootstrapping-kubernetes-workers.md +++ b/docs/10-bootstrapping-kubernetes-workers.md @@ -108,10 +108,11 @@ All the following commands from here until the [verification](#verification) ste ```bash +KUBE_VERSION=$(curl -L -s https://dl.k8s.io/release/stable.txt) + wget -q --show-progress --https-only --timestamping \ - https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kubectl \ - https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kube-proxy \ - https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kubelet + https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kube-proxy \ + https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kubelet ``` Reference: https://kubernetes.io/releases/download/#binaries @@ -130,8 +131,8 @@ Install the worker binaries: ```bash { - chmod +x kubectl kube-proxy kubelet - sudo mv kubectl kube-proxy kubelet /usr/local/bin/ + chmod +x kube-proxy kubelet + sudo mv kube-proxy kubelet /usr/local/bin/ } ``` @@ -168,6 +169,8 @@ CLUSTER_DNS=$(echo $SERVICE_CIDR | awk 'BEGIN {FS="."} ; { printf("%s.%s.%s.10", Create the `kubelet-config.yaml` configuration file: +Reference: https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/ + ```bash cat <<EOF [...] -worker-1 NotReady <none> 93s v1.24.3 +worker-1 NotReady <none> 93s v1.28.4 ``` The node is not ready as we have not yet installed pod networking. This comes later. diff --git a/docs/11-tls-bootstrapping-kubernetes-workers.md b/docs/11-tls-bootstrapping-kubernetes-workers.md index 9682e8c..2e8cda4 100644 --- a/docs/11-tls-bootstrapping-kubernetes-workers.md +++ b/docs/11-tls-bootstrapping-kubernetes-workers.md @@ -212,11 +212,14 @@ Going forward all activities are to be done on the `worker-2` node until [step 1 ### Download and Install Worker Binaries +Note that kubectl is required here to assist with creating the bootstrap kubeconfigs for kubelet and kube-proxy. + ```bash +KUBE_VERSION=$(curl -L -s https://dl.k8s.io/release/stable.txt) + wget -q --show-progress --https-only --timestamping \ - https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kubectl \ - https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kube-proxy \ - https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kubelet + https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kube-proxy \ + https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kubelet ``` Reference: https://kubernetes.io/releases/download/#binaries @@ -235,8 +238,8 @@ Install the worker binaries: ```bash { - chmod +x kubectl kube-proxy kubelet - sudo mv kubectl kube-proxy kubelet /usr/local/bin/ + chmod +x kube-proxy kubelet + sudo mv kube-proxy kubelet /usr/local/bin/ } ``` Move the certificates and secure them. @@ -316,6 +319,8 @@ Reference: https://kubernetes.io/docs/reference/access-authn-authz/kubelet-tls-b Create the `kubelet-config.yaml` configuration file: +Reference: https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/ + ```bash cat <<EOF [...] -worker-1 NotReady <none> 93s v1.24.3 -worker-2 NotReady <none> 93s v1.24.3 +worker-1 NotReady <none> 93s v1.28.4 +worker-2 NotReady <none> 93s v1.28.4 ``` Prev: [Bootstrapping the Kubernetes Worker Nodes](10-bootstrapping-kubernetes-workers.md)
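One sanity check worth noting here (not part of the patch above): when the `worker-2` kubelet boots with TLS bootstrapping, its serving-certificate CSR may sit pending until approved. A minimal sketch, run on `master-1`; the CSR name varies per run:

```bash
# List certificate signing requests; the serving certificate request from
# worker-2 appears with requestor system:node:worker-2.
kubectl get csr

# Approve it by name (csr-xxxxx is a placeholder, substitute the real name).
kubectl certificate approve csr-xxxxx
```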
diff --git a/docs/12-configuring-kubectl.md b/docs/12-configuring-kubectl.md index 1852e4e..2543ac8 100644 --- a/docs/12-configuring-kubectl.md +++ b/docs/12-configuring-kubectl.md @@ -71,8 +71,8 @@ kubectl get nodes ``` NAME STATUS ROLES AGE VERSION -worker-1 NotReady <none> 118s v1.24.3 -worker-2 NotReady <none> 118s v1.24.3 +worker-1 NotReady <none> 118s v1.28.4 +worker-2 NotReady <none> 118s v1.28.4 ``` Prev: [TLS Bootstrapping Kubernetes Workers](11-tls-bootstrapping-kubernetes-workers.md)
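As a quick follow-up check after configuring kubectl (a sketch, not part of the patch; output will vary):

```bash
# Confirm the admin context points at the load balancer rather than a
# single API server.
kubectl config view --minify | grep server

# A wider view of node state, including internal IPs and runtime versions.
kubectl get nodes -o wide
```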
diff --git a/docs/13-configure-pod-networking.md b/docs/13-configure-pod-networking.md index 02562fd..a2f874d 100644 --- a/docs/13-configure-pod-networking.md +++ b/docs/13-configure-pod-networking.md @@ -15,9 +15,10 @@ On `master-1` ```bash kubectl apply -f "https://github.com/weaveworks/weave/releases/download/v2.8.1/weave-daemonset-k8s-1.11.yaml" + ``` -Weave uses POD CIDR of `10.32.0.0/12` by default. +Weave uses a pod CIDR of `10.244.0.0/16` by default. ## Verification @@ -47,8 +48,8 @@ kubectl get nodes ``` NAME STATUS ROLES AGE VERSION -worker-1 Ready <none> 4m11s v1.24.3 -worker-2 Ready <none> 2m49s v1.24.3 +worker-1 Ready <none> 4m11s v1.28.4 +worker-2 Ready <none> 2m49s v1.28.4 ``` Reference: https://kubernetes.io/docs/tasks/administer-cluster/network-policy-provider/weave-network-policy/#install-the-weave-net-addon diff --git a/docs/15-dns-addon.md b/docs/15-dns-addon.md index 5ca224c..0fcbb8a 100644 --- a/docs/15-dns-addon.md +++ b/docs/15-dns-addon.md @@ -48,7 +48,7 @@ Reference: https://kubernetes.io/docs/tasks/administer-cluster/coredns/#installi Create a `busybox` pod: ```bash -kubectl run busybox --image=busybox:1.28 --command -- sleep 3600 +kubectl run busybox -n default --image=busybox:1.28 --restart Never --command -- sleep 15 ``` [//]: # (command:kubectl wait pods -n default -l run=busybox --for condition=Ready --timeout=90s) @@ -57,7 +57,7 @@ kubectl run busybox --image=busybox:1.28 --command -- sleep 3600 List the `busybox` pod: ```bash -kubectl get pods -l run=busybox +kubectl get pods -n default -l run=busybox ``` > output ``` NAME READY STATUS RESTARTS AGE busybox 1/1 Running 0 10s ``` Execute a DNS lookup for the `kubernetes` service inside the `busybox` pod: ```bash -kubectl exec -ti busybox -- nslookup kubernetes +kubectl exec -ti -n default busybox -- nslookup kubernetes ``` > output diff --git a/docs/16-smoke-test.md b/docs/16-smoke-test.md index 6e4af80..f4a336a 100644 --- a/docs/16-smoke-test.md +++ b/docs/16-smoke-test.md @@ -151,5 +151,14 @@ kubectl exec -ti $POD_NAME -- nginx -v nginx version: nginx/1.23.1 ``` +Clean up the test resources: + + +```bash +kubectl delete pod -n default busybox +kubectl delete service -n default nginx +kubectl delete deployment -n default nginx +``` + Prev: [DNS Addon](15-dns-addon.md)
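To confirm the cleanup above leaves the `default` namespace empty apart from the built-in `kubernetes` service, something like this can be run:

```bash
# Lists any remaining test resources; only the kubernetes service should show.
kubectl get pods,deployments,services -n default
```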
Next: [End to End Tests](17-e2e-tests.md) diff --git a/docs/17-e2e-tests.md index 9e021f0..b7880c0 100644 --- a/docs/17-e2e-tests.md +++ b/docs/17-e2e-tests.md @@ -1,5 +1,10 @@ # Run End-to-End Tests +Observations by Alistair (KodeKloud): + +Depending on your computer, you may have varying success with these. I have found them to run much more smoothly on a 12 core Intel(R) Core(TM) i7-7800X desktop processor (circa 2017) than on a 20 core Intel(R) Core(TM) i7-12700H laptop processor (circa 2022) - both machines having 32GB RAM and both machines running the same version of VirtualBox. On the latter, it tends to destabilize the cluster, resulting in timeouts in the tests. This *may* be a processor issue, in that laptop processors are not really designed to take the kind of abuse the tests throw at a kube cluster, which really should be run on a server processor. Laptop processors do odd things for power conservation, like constantly varying the clock speed and mixing "performance" and "efficiency" cores, even when the laptop is plugged in, and this could be causing synchronization issues with the goroutines running in the kube components. If anyone has a definitive explanation for this, please do post in the kubernetes-the-hard-way Slack channel. + + ## Install latest Go ```bash GO_VERSION=$(curl -s 'https://go.dev/VERSION?m=text' | head -1) wget "https://dl.google.com/go/${GO_VERSION}.linux-amd64.tar.gz" sudo tar -C /usr/local -xzf ${GO_VERSION}.linux-amd64.tar.gz + +sudo ln -s /usr/local/go/bin/go /usr/local/bin/go +sudo ln -s /usr/local/go/bin/gofmt /usr/local/bin/gofmt + +source <(go env) +export PATH=$PATH:$GOPATH/bin ``` -## Install kubetest +## Install kubetest2 + +Here we pull the kubetest2 code, plus the Google Cloud CLI which kubetest2 uses to pull the test packages for our version of the cluster. kubetest2 will download and then compile, which takes a few minutes. + ```bash -git clone --depth 1 https://github.com/kubernetes/test-infra.git -cd test-infra/kubetest -export GOPATH="$HOME/go" -export PATH=$PATH:/usr/local/go/bin:$GOPATH/bin -go build +go install sigs.k8s.io/kubetest2/...@latest +sudo snap install google-cloud-cli --classic ``` -> Note: it will take a while to build as it has many dependencies. +## Run the tests +Here we set up a couple of environment variables to supply arguments to the test package - the version of our cluster and the number of CPUs on `master-1` to aid with test parallelization. -## Use the version specific to your cluster +Then we invoke the test package: ```bash -sudo apt install jq -y +KUBE_VERSION=$(curl -L -s https://dl.k8s.io/release/stable.txt) +NUM_CPU=$(cat /proc/cpuinfo | grep '^processor' | wc -l) + +cd ~ +kubetest2 noop --kubeconfig ${PWD}/.kube/config --test=ginkgo -- \ + --focus-regex='\[Conformance\]' --test-package-version $KUBE_VERSION --logtostderr --parallel $NUM_CPU ``` ```bash -K8S_VERSION=$(kubectl version -o json | jq -r '.serverVersion.gitVersion') -export KUBERNETES_CONFORMANCE_TEST=y -export KUBECONFIG="$HOME/.kube/config" +While this is running, you can open an additional session on `master-1` from your workstation and watch the activity in the cluster -./kubetest --provider=skeleton --test --test_args=”--ginkgo.focus=\[Conformance\]” --extract ${K8S_VERSION} | tee test.out +``` +vagrant ssh master-1 +``` -This could take *18 hours or more*! There are several thousand tests in the suite. The number of tests run and passed will be displayed at the end. 
Expect some failures as it tries tests that aren't supported by our cluster, e.g. mounting persistent volumes using NFS. +then + +``` +watch kubectl get all -A +``` + +Observations by Alistair (KodeKloud): + +This should take up to an hour to run. The number of tests run and passed will be displayed at the end. Expect some failures! + +I am not able to say exactly why the failed tests fail. It would take days to go through the truly enormous test code base to determine why the tests that fail do so. Prev: [Smoke Test](16-smoke-test.md) \ No newline at end of file diff --git a/docs/verify-certificates.md b/docs/verify-certificates.md index 123ba0e..73a68dd 100644 --- a/docs/verify-certificates.md +++ b/docs/verify-certificates.md @@ -7,23 +7,8 @@ This script was developed to assist the verification of certificates for each Ku Run the script with the following commands after logging into the respective virtual machine [ master-1 / master-2 / worker-1 ] via SSH. ```bash -cd /home/vagrant +cd ~ bash cert_verify.sh ``` -Following are the successful output of script execution under different nodes, - -1. VM: Master-1 - - ![Master-1-Cert-Verification](./images/master-1-cert.png) - -2. VM: Master-2 - - ![Master-2-Cert-Verification](./images/master-2-cert.png) - -3. VM: Worker-1 - - ![Worker-1-Cert-Verification](./images/worker-1-cert.png) - -Any misconfiguration in certificates will be reported in red. - +All successful validations are in green text, errors in red. \ No newline at end of file diff --git a/tools/lab-script-generator.py b/tools/lab-script-generator.py index a0d5d39..7db9c31 100644 --- a/tools/lab-script-generator.py +++ b/tools/lab-script-generator.py @@ -41,13 +41,14 @@ if not os.path.isdir(qs_path): newline = chr(10) # In case running on Windows (plus writing files as binary to not convert to \r\n) file_number_rx = re.compile(r'^(?P<number>\d+)') -comment_rx = re.compile(r'^\[//\]:\s\#\s\((?P<token>\w+):(?P<value>[^\)]+)\)') +comment_rx = re.compile(r'^\[//\]:\s\#\s\((?P<token>\w+):(?P<value>.*)\)\s*$') choice_rx = re.compile(r'^\s*-+\s+OR\s+-+') script_begin = '```bash' script_end = '```' script_open = ('{' + newline).encode('utf-8') -script_close = '}'.encode('utf-8') +script_close = '\n}'.encode('utf-8') current_host = None +file_nos = [] def write_script(filename: str, script: list): path = os.path.join(qs_path, filename) @@ -57,18 +58,29 @@ def write_script(filename: str, script: list): f.write(script_close) print(f'-> {path}') - +output_file_no = 1 +script = [] +output_file = None for doc in glob.glob(os.path.join(docs_path, '*.md')): print(doc) - script = [] state = State.NONE ignore_next_script = False m = file_number_rx.search(os.path.basename(doc)) if not m: continue file_no = m['number'] + if int(file_no) < 3: + continue + file_nos.append(file_no) section = 0 - output_file = None script.extend([ "##################################################", "#", f"# {os.path.basename(doc)}", "#", "##################################################", "" ]) with codecs.open(doc, "r", encoding='utf-8') as f: for line in f.readlines(): line = line.rstrip() @@ -78,11 +90,24 @@ for doc in glob.glob(os.path.join(docs_path, '*.md')): token = m['token'] value = m['value'] if token == 'host': - if script: + if script and current_host and current_host != value: + #fns = file_no if len(file_nos) < 2 else '-'.join(file_nos[:-1]) + script.append('set +e') + output_file = os.path.join(qs_path, f'{output_file_no}-{current_host}.sh') write_script(output_file, 
script) - script = [] + output_file_no += 1 + script = [ + "##################################################", + "#", + f"# {os.path.basename(doc)}", + "#", + "##################################################", + "" + ] + file_nos = [file_no] output_file = os.path.join(qs_path, f'{file_no}{chr(97 + section)}-{value}.sh') section += 1 + current_host = value elif token == 'sleep': script.extend([ f'echo "Sleeping {value}s"', @@ -112,8 +137,14 @@ for doc in glob.glob(os.path.join(docs_path, '*.md')): state = State.NONE script.append(newline) ignore_next_script = False + # elif line.startswith('source') or line.startswith('export'): + # script.append('}') + # script.append(line) + # script.append('{') elif not (ignore_next_script or line == '{' or line == '}'): script.append(line) - if output_file and script: - write_script(output_file, script) +if script: + # fns = '-'.join(file_nos[1:]) + output_file = os.path.join(qs_path, f'{output_file_no}-{current_host}.sh') + write_script(output_file, script) diff --git a/vagrant/Vagrantfile b/vagrant/Vagrantfile index a8cba18..66b9705 100644 --- a/vagrant/Vagrantfile +++ b/vagrant/Vagrantfile @@ -1,15 +1,42 @@ # -*- mode: ruby -*- # vi:set ft=ruby sw=2 ts=2 sts=2: -# Define the number of master and worker nodes -# If this number is changed, remember to update setup-hosts.sh script with the new hosts IP details in /etc/hosts of each VM. -NUM_MASTER_NODE = 2 -NUM_WORKER_NODE = 2 +# Define how much memory your computer has in GB (e.g. 8, 16) +# Larger nodes will be created if you have more. +RAM_SIZE = 16 + +# Define how many CPU cores you have. +# More powerful workers will be created if you have more. +CPU_CORES = 8 + +# Internal network prefix for the VM network +# See the documentation before changing this IP_NW = "192.168.56." -MASTER_IP_START = 10 -NODE_IP_START = 20 -LB_IP_START = 30 + +# Calculate resource amounts +# based on RAM/CPU +ram_selector = (RAM_SIZE / 4) * 4 +if ram_selector < 8 + raise "Insufficient memory #{RAM_SIZE}GB. min 8GB" +end +RESOURCES = { + "master" => { + 1 => { + # master-1 bigger since it may run e2e tests. + "ram" => [ram_selector * 128, 2048].max(), + "cpu" => CPU_CORES >= 12 ? 4 : 2, + }, + 2 => { + # All additional masters get this + "ram" => [ram_selector * 128, 2048].min(), + "cpu" => CPU_CORES > 8 ? 2 : 1, + }, + }, + "worker" => { + "ram" => [ram_selector * 128, 4096].min(), + "cpu" => (((CPU_CORES / 4) * 4) - 4) / 4, + }, +} # Sets up hosts file and DNS def setup_dns(node) @@ -25,18 +52,23 @@ end def provision_kubernetes_node(node) # Set up kernel parameters, modules and tunables node.vm.provision "setup-kernel", :type => "shell", :path => "ubuntu/setup-kernel.sh" - # Restart - node.vm.provision :shell do |shell| - shell.privileged = true - shell.inline = "echo Rebooting" - shell.reboot = true - end + # Set up ssh + node.vm.provision "setup-ssh", :type => "shell", :path => "ubuntu/ssh.sh" # Set up DNS setup_dns node # Install cert verification script node.vm.provision "shell", inline: "ln -s /vagrant/ubuntu/cert_verify.sh /home/vagrant/cert_verify.sh" end +# Define the number of master and worker nodes. You should not change this. +NUM_MASTER_NODE = 2 +NUM_WORKER_NODE = 2 + +# Host address start points +MASTER_IP_START = 10 +NODE_IP_START = 20 +LB_IP_START = 30 + # All Vagrant configuration is done below. The "2" in Vagrant.configure # configures the configuration version (we support older styles for # backwards compatibility). 
Please don't change it unless you know what @@ -50,6 +82,7 @@ Vagrant.configure("2") do |config| # boxes at https://vagrantcloud.com/search. # config.vm.box = "base" config.vm.box = "ubuntu/jammy64" + config.vm.boot_timeout = 900 # Disable automatic box update checking. If you disable this, then # boxes will only be checked for updates when the user runs @@ -62,12 +95,8 @@ Vagrant.configure("2") do |config| # Name shown in the GUI node.vm.provider "virtualbox" do |vb| vb.name = "kubernetes-ha-master-#{i}" - if i == 1 - vb.memory = 2048 # More needed to run e2e tests at end - else - vb.memory = 1024 - end - vb.cpus = 2 + vb.memory = RESOURCES["master"][i > 2 ? 2 : i]["ram"] + vb.cpus = RESOURCES["master"][i > 2 ? 2 : i]["cpu"] end node.vm.hostname = "master-#{i}" node.vm.network :private_network, ip: IP_NW + "#{MASTER_IP_START + i}" @@ -91,6 +120,8 @@ Vagrant.configure("2") do |config| node.vm.hostname = "loadbalancer" node.vm.network :private_network, ip: IP_NW + "#{LB_IP_START}" node.vm.network "forwarded_port", guest: 22, host: 2730 + # Set up ssh + node.vm.provision "setup-ssh", :type => "shell", :path => "ubuntu/ssh.sh" setup_dns node end @@ -99,8 +130,8 @@ Vagrant.configure("2") do |config| config.vm.define "worker-#{i}" do |node| node.vm.provider "virtualbox" do |vb| vb.name = "kubernetes-ha-worker-#{i}" - vb.memory = 1024 - vb.cpus = 1 + vb.memory = RESOURCES["worker"]["ram"] + vb.cpus = RESOURCES["worker"]["cpu"] end node.vm.hostname = "worker-#{i}" node.vm.network :private_network, ip: IP_NW + "#{NODE_IP_START + i}" diff --git a/vagrant/ubuntu/cert_verify.sh b/vagrant/ubuntu/cert_verify.sh index 9ed634c..7fa9403 100755 --- a/vagrant/ubuntu/cert_verify.sh +++ b/vagrant/ubuntu/cert_verify.sh @@ -157,8 +157,14 @@ check_cert_only() exit 1 fi else - printf "${FAILED}${cert} missing. More details: https://github.com/mmumshad/kubernetes-the-hard-way/blob/master/docs/04-certificate-authority.md#certificate-authority\n${NC}" - echo "These should be in ${CERT_LOCATION}${NC}" + if [[ $cert == *kubelet-client-current* ]] + then + printf "${FAILED}${cert} missing. This probably means that kubelet failed to start.${NC}\n" + echo -e "Check logs with\n\n sudo journalctl -u kubelet\n" + else + printf "${FAILED}${cert} missing. More details: https://github.com/mmumshad/kubernetes-the-hard-way/blob/master/docs/04-certificate-authority.md#certificate-authority\n${NC}" + echo "These should be in ${CERT_LOCATION}" + fi exit 1 fi } @@ -425,17 +431,27 @@ check_systemd_ks() # END OF Function - Master node # +if [ ! -z "$1" ] +then + choice=$1 +else + echo "This script will validate the certificates in master as well as worker-1 nodes. Before proceeding, make sure you ssh into the respective node [ Master or Worker-1 ] for certificate validation" + while true + do + echo + echo " 1. Verify certificates on Master Nodes after step 4" + echo " 2. Verify kubeconfigs on Master Nodes after step 5" + echo " 3. Verify kubeconfigs and PKI on Master Nodes after step 8" + echo " 4. Verify kubeconfigs and PKI on worker-1 Node after step 10" + echo " 5. Verify kubeconfigs and PKI on worker-2 Node after step 11" + echo + echo -n "Please select one of the above options: " + read choice -echo "This script will validate the certificates in master as well as worker-1 nodes. Before proceeding, make sure you ssh into the respective node [ Master or Worker-1 ] for certificate validation" -echo -echo " 1. Verify certificates on Master Nodes after step 4" -echo " 2. Verify kubeconfigs on Master Nodes after step 5" -echo " 3. 
Verify kubeconfigs and PKI on Master Nodes after step 8" -echo " 4. Verify kubeconfigs and PKI on worker-1 Node after step 10" -echo " 5. Verify kubeconfigs and PKI on worker-2 Node after step 11" -echo -echo -n "Please select one of the above options: " -read value + [ -z "$choice" ] && continue + [ $choice -gt 0 -a $choice -lt 6 ] && break + done +fi HOST=$(hostname -s) @@ -450,7 +466,7 @@ SUBJ_SA="Subject:CN=service-accounts,O=Kubernetes" SUBJ_ETCD="Subject:CN=etcd-server,O=Kubernetes" SUBJ_APIKC="Subject:CN=kube-apiserver-kubelet-client,O=system:masters" -case $value in +case $choice in 1) if ! [ "${HOST}" = "master-1" -o "${HOST}" = "master-2" ] exit 1 fi - echo -e "The selected option is $value, proceeding the certificate verification of Master node" + echo -e "The selected option is $choice, proceeding with certificate verification of the Master node" CERT_LOCATION=$HOME check_cert_and_key "ca" $SUBJ_CA $CERT_ISSUER diff --git a/vagrant/ubuntu/setup-kernel.sh b/vagrant/ubuntu/setup-kernel.sh index 2a76c3f..b4ef7aa 100644 --- a/vagrant/ubuntu/setup-kernel.sh +++ b/vagrant/ubuntu/setup-kernel.sh @@ -1,19 +1,27 @@ #!/bin/bash # # Sets up the kernel with the requirements for running Kubernetes -# Requires a reboot, which is carried out by the vagrant provisioner. -set -ex - -# Disable cgroups v2 (kernel command line parameter) -sed -i 's/GRUB_CMDLINE_LINUX_DEFAULT="/GRUB_CMDLINE_LINUX_DEFAULT="systemd.unified_cgroup_hierarchy=0 ipv6.disable=1 /' /etc/default/grub -update-grub +set -e # Add br_netfilter kernel module -echo "br_netfilter" >> /etc/modules +cat <<EOF >> /etc/modules +ip_vs +ip_vs_rr +ip_vs_wrr +ip_vs_sh +br_netfilter +nf_conntrack +EOF +systemctl restart systemd-modules-load.service # Set network tunables cat <<EOF >> /etc/sysctl.d/10-kubernetes.conf +net.ipv6.conf.all.disable_ipv6 = 1 +net.ipv6.conf.default.disable_ipv6 = 1 +net.ipv6.conf.lo.disable_ipv6 = 1 net.bridge.bridge-nf-call-iptables=1 net.ipv4.ip_forward=1 EOF +sysctl --system + diff --git a/vagrant/ubuntu/ssh.sh b/vagrant/ubuntu/ssh.sh new file mode 100644 index 0000000..476f681 --- /dev/null +++ b/vagrant/ubuntu/ssh.sh @@ -0,0 +1,5 @@ +#!/bin/bash + +# Enable password auth in sshd so we can use ssh-copy-id +sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config +systemctl restart sshd
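A quick way to check that the new `ssh.sh` provisioner took effect, assuming the VMs are already up (`sshd -T` prints the effective server configuration):

```bash
# Run from the vagrant/ directory on the host machine.
vagrant ssh master-1 -c 'sudo sshd -T | grep -i passwordauthentication'
# Expected: passwordauthentication yes
```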