Update to latest version (#325)

* Build now functional

* Use ssh option to reduce questions

* Use IPVS

* Further e2e observations

* Tidy up

* RAM and CPU adjustments
Alistair Mackay 2023-11-23 19:52:14 +00:00 committed by GitHub
parent 24d0565f89
commit 2dd8f64d31
22 changed files with 344 additions and 215 deletions


@ -13,11 +13,13 @@ Kubernetes The Hard Way is optimized for learning, which means taking the long r
This tutorial is a modified version of the original developed by [Kelsey Hightower](https://github.com/kelseyhightower/kubernetes-the-hard-way).
While the original one uses GCP as the platform to deploy kubernetes, we use VirtualBox and Vagrant to deploy a cluster on a local machine. If you prefer the cloud version, refer to the original one [here](https://github.com/kelseyhightower/kubernetes-the-hard-way)
> The results of this tutorial should not be viewed as production ready, and may receive limited support from the community, but don't let that stop you from learning!
> The results of this tutorial should *not* be viewed as production ready, and may receive limited support from the community, but don't let that stop you from learning!<br/>Note that we are only building 2 masters here instead of the recommended 3 that `etcd` requires to maintain quorum. This is to save on resources, and simply to show how to load balance across more than one master.
Please note that with this particular challenge, it is all about the minute detail. If you miss one tiny step anywhere along the way, it's going to break!
Always run the `cert_verify` script at the places it suggests, and always ensure you are on the correct node when you do stuff. If `cert_verify` shows anything in red, then you have made an error in a previous step. For the master node checks, run the check on `master-1` and on `master-2`
Note also that this lab has been tested *many many* times during its development! Once you have the VMs up and you start to build the cluster, if at any point something isn't working it is 99.9999% likely to be because you missed something, not a bug in the lab!
Always run the `cert_verify.sh` script at the places it suggests, and always ensure you are on the correct node when you do stuff. If `cert_verify.sh` shows anything in red, then you have made an error in a previous step. For the master node checks, run the check on `master-1` and on `master-2`
## Target Audience
@ -27,11 +29,10 @@ The target audience for this tutorial is someone planning to support a productio
Kubernetes The Hard Way guides you through bootstrapping a highly available Kubernetes cluster with end-to-end encryption between components and RBAC authentication.
* [Kubernetes](https://github.com/kubernetes/kubernetes) 1.24.3
* [Container Runtime](https://github.com/containerd/containerd) 1.5.9
* [CNI Container Networking](https://github.com/containernetworking/cni) 0.8.6
* [Kubernetes](https://github.com/kubernetes/kubernetes) Latest version
* [Container Runtime](https://github.com/containerd/containerd) Latest version
* [Weave Networking](https://www.weave.works/docs/net/latest/kubernetes/kube-addon/)
* [etcd](https://github.com/coreos/etcd) v3.5.3
* [etcd](https://github.com/coreos/etcd) v3.5.9
* [CoreDNS](https://github.com/coredns/coredns) v1.9.4
### Node configuration
@ -40,7 +41,7 @@ We will be building the following:
* Two control plane nodes (`master-1` and `master-2`) running the control plane components as operating system services.
* Two worker nodes (`worker-1` and `worker-2`)
* One loadbalancer VM running HAProxy to balance requests between the two API servers.
* One loadbalancer VM running [HAProxy](https://www.haproxy.org/) to balance requests between the two API servers.
## Labs


@ -10,7 +10,7 @@
Download and Install [VirtualBox](https://www.virtualbox.org/wiki/Downloads) on any one of the supported platforms:
- Windows hosts
- OS X hosts (x86 only, not M1)
- OS X hosts (x86 only, not Apple Silicon M-series)
- Linux distributions
- Solaris hosts


@ -11,9 +11,13 @@ git clone https://github.com/mmumshad/kubernetes-the-hard-way.git
cd into the vagrant directory
```bash
cd kubernetes-the-hard-way\vagrant
cd kubernetes-the-hard-way/vagrant
```
The `Vagrantfile` is configured assuming you have at least an 8 core CPU (which most modern Core i5, i7 and i9 processors have) and at least 16GB RAM. You can tune these values, especially if you have *less* than this, by editing the `Vagrantfile` before the next step and adjusting the values for `RAM_SIZE` and `CPU_CORES` accordingly.
This will not work if you have less than 8GB of RAM.
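A minimal sketch of making that change from the shell, assuming GNU `sed` and that you are already in the `vagrant` directory (editing the file in a text editor works just as well). The values below are illustrative only - pick numbers that match your machine.
```bash
# Example: a host with only 8GB RAM (the stated minimum), keeping CPU_CORES at its default
sed -i 's/^RAM_SIZE = .*/RAM_SIZE = 8/' Vagrantfile
# Example: a larger host with 32GB RAM and a 12+ core CPU
# sed -i 's/^RAM_SIZE = .*/RAM_SIZE = 32/' Vagrantfile
# sed -i 's/^CPU_CORES = .*/CPU_CORES = 12/' Vagrantfile
```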
Run Vagrant up
```bash


@ -10,33 +10,39 @@ Here we create an SSH key pair for the `vagrant` user who we are logged in as. W
Generate Key Pair on `master-1` node
[//]: # (host:master-1)
```bash
ssh-keygen
```
Leave all settings to default.
Leave all settings to default by pressing `ENTER` at any prompt.
View the generated public key ID at:
```bash
cat ~/.ssh/id_rsa.pub
```
Add this key to the local authorized_keys (`master-1`) as in some commands we scp to ourself
Add this key to the local `authorized_keys` (`master-1`), as some of the commands that follow `scp` files to this same host.
```bash
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
```
Copy the output into a notepad and form it into the following command
Copy the key to the other hosts. For this step please enter `vagrant` where a password is requested.
The option `-o StrictHostKeyChecking=no` tells `ssh-copy-id` not to ask whether you want to connect to a previously unknown host. This is not best practice in the real world, but it speeds things up here.
```bash
cat >> ~/.ssh/authorized_keys <<EOF
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQD...OUTPUT-FROM-ABOVE-COMMAND...8+08b vagrant@master-1
EOF
ssh-copy-id -o StrictHostKeyChecking=no vagrant@master-2
ssh-copy-id -o StrictHostKeyChecking=no vagrant@loadbalancer
ssh-copy-id -o StrictHostKeyChecking=no vagrant@worker-1
ssh-copy-id -o StrictHostKeyChecking=no vagrant@worker-2
```
Now ssh to each of the other nodes and paste the above from your notepad at each command prompt.
For each host, the output should be similar to this. If it is not, then you may have entered an incorrect password. Retry the step.
```
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'vagrant@master-2'"
and check to make sure that only the key(s) you wanted were added.
```
## Install kubectl
@ -44,37 +50,39 @@ The [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl). command l
Reference: [https://kubernetes.io/docs/tasks/tools/install-kubectl/](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
We will be using kubectl early on to generate kubeconfig files for the controlplane components.
### Linux
```bash
wget https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl
sudo mv kubectl /usr/local/bin/
```
### Verification
Verify `kubectl` version 1.24.3 or higher is installed:
Verify `kubectl` is installed:
```
kubectl version -o yaml
```
> output
> output will be similar to this, although versions may be newer
```
kubectl version -o yaml
clientVersion:
buildDate: "2022-07-13T14:30:46Z"
buildDate: "2023-11-15T16:58:22Z"
compiler: gc
gitCommit: aef86a93758dc3cb2c658dd9657ab4ad4afc21cb
gitCommit: bae2c62678db2b5053817bc97181fcc2e8388103
gitTreeState: clean
gitVersion: v1.24.3
goVersion: go1.18.3
gitVersion: v1.28.4
goVersion: go1.20.11
major: "1"
minor: "24"
minor: "28"
platform: linux/amd64
kustomizeVersion: v4.5.4
kustomizeVersion: v5.0.4-0.20230601165947-6ce0bf390ce3
The connection to the server localhost:8080 was refused - did you specify the right host or port?
```


@ -59,9 +59,6 @@ Create a CA certificate, then generate a Certificate Signing Request and use it
# Create private key for CA
openssl genrsa -out ca.key 2048
# Comment line starting with RANDFILE in /etc/ssl/openssl.cnf definition to avoid permission issues
sudo sed -i '0,/RANDFILE/{s/RANDFILE/\#&/}' /etc/ssl/openssl.cnf
# Create CSR using the private key
openssl req -new -key ca.key -subj "/CN=KUBERNETES-CA/O=Kubernetes" -out ca.csr
@ -355,7 +352,9 @@ service-account.crt
Run the following, and select option 1 to check all required certificates were generated.
```bash
[//]: # (command:./cert_verify.sh 1)
```
./cert_verify.sh
```
@ -374,7 +373,7 @@ Copy the appropriate certificates and private keys to each instance:
```bash
{
for instance in master-1 master-2; do
scp ca.crt ca.key kube-apiserver.key kube-apiserver.crt \
scp -o StrictHostKeyChecking=no ca.crt ca.key kube-apiserver.key kube-apiserver.crt \
apiserver-kubelet-client.crt apiserver-kubelet-client.key \
service-account.key service-account.crt \
etcd-server.key etcd-server.crt \
@ -389,11 +388,13 @@ done
}
```
## Optional - Check Certificates
## Optional - Check Certificates on master-2
At `master-1` and `master-2` nodes, run the following, selecting option 1
At `master-2` node run the following, selecting option 1
```bash
[//]: # (command:ssh master-2 './cert_verify.sh 1')
```
./cert_verify.sh
```


@ -178,7 +178,10 @@ done
At `master-1` and `master-2` nodes, run the following, selecting option 2
```bash
[//]: # (command./cert_verify.sh 2)
[//]: # (command:ssh master-2 './cert_verify.sh 2')
```
./cert_verify.sh
```


@ -20,16 +20,17 @@ Download the official etcd release binaries from the [etcd](https://github.com/e
```bash
ETCD_VERSION="v3.5.9"
wget -q --show-progress --https-only --timestamping \
"https://github.com/coreos/etcd/releases/download/v3.5.3/etcd-v3.5.3-linux-amd64.tar.gz"
"https://github.com/coreos/etcd/releases/download/${ETCD_VERSION}/etcd-${ETCD_VERSION}-linux-amd64.tar.gz"
```
Extract and install the `etcd` server and the `etcdctl` command line utility:
```bash
{
tar -xvf etcd-v3.5.3-linux-amd64.tar.gz
sudo mv etcd-v3.5.3-linux-amd64/etcd* /usr/local/bin/
tar -xvf etcd-${ETCD_VERSION}-linux-amd64.tar.gz
sudo mv etcd-${ETCD_VERSION}-linux-amd64/etcd* /usr/local/bin/
}
```
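Optionally verify the binaries are installed and on the path (the reported version should match whatever `ETCD_VERSION` you downloaded):
```bash
etcd --version
etcdctl version
```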


@ -16,14 +16,16 @@ You can perform this step with [tmux](01-prerequisites.md#running-commands-in-pa
### Download and Install the Kubernetes Controller Binaries
Download the official Kubernetes release binaries:
Download the latest official Kubernetes release binaries:
```bash
KUBE_VERSION=$(curl -L -s https://dl.k8s.io/release/stable.txt)
wget -q --show-progress --https-only --timestamping \
"https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kube-apiserver" \
"https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kube-controller-manager" \
"https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kube-scheduler" \
"https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kubectl"
"https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kube-apiserver" \
"https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kube-controller-manager" \
"https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kube-scheduler" \
"https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kubectl"
```
Reference: https://kubernetes.io/releases/download/#binaries
@ -210,7 +212,9 @@ sudo chmod 600 /var/lib/kubernetes/*.kubeconfig
At `master-1` and `master-2` nodes, run the following, selecting option 3
```bash
[//]: # (command:./cert_verify.sh 3)
```
./cert_verify.sh
```
@ -316,7 +320,7 @@ curl https://${LOADBALANCER}:6443/version -k
{
"major": "1",
"minor": "24",
"gitVersion": "v1.24.3",
"gitVersion": "${KUBE_VERSION}",
"gitCommit": "aef86a93758dc3cb2c658dd9657ab4ad4afc21cb",
"gitTreeState": "clean",
"buildDate": "2022-07-13T14:23:26Z",


@ -1,4 +1,4 @@
# Installing CRI on the Kubernetes Worker Nodes
# Installing Container Runtime on the Kubernetes Worker Nodes
In this lab you will install the Container Runtime Interface (CRI) on both worker nodes. CRI is a standard interface for the management of containers. Since v1.24 the use of dockershim has been fully deprecated and removed from the code base. [containerd replaces docker](https://kodekloud.com/blog/kubernetes-removed-docker-what-happens-now/) as the container runtime for Kubernetes, and it requires support from [CNI Plugins](https://github.com/containernetworking/plugins) to configure container networks, and [runc](https://github.com/opencontainers/runc) to actually do the job of running containers.
@ -8,74 +8,49 @@ Reference: https://github.com/containerd/containerd/blob/main/docs/getting-start
The commands in this lab must be run on each worker instance: `worker-1` and `worker-2`. Log in to each worker instance using an SSH terminal.
Here we will install the container runtime `containerd` from the Ubuntu distribution, and kubectl plus the CNI tools from the Kubernetes distribution. Kubectl is required on worker-2 to initialize kubeconfig files for the worker-node auto registration.
[//]: # (host:worker-1-worker-2)
You can perform this step with [tmux](01-prerequisites.md#running-commands-in-parallel-with-tmux)
The versions chosen here align with those that are installed by the current `kubernetes-cni` package for a v1.24 cluster.
Set up the Kubernetes `apt` repository
```bash
{
CONTAINERD_VERSION=1.5.9
CNI_VERSION=0.8.6
RUNC_VERSION=1.1.1
KUBE_LATEST=$(curl -L -s https://dl.k8s.io/release/stable.txt | awk 'BEGIN { FS="." } { printf "%s.%s", $1, $2 }')
wget -q --show-progress --https-only --timestamping \
https://github.com/containerd/containerd/releases/download/v${CONTAINERD_VERSION}/containerd-${CONTAINERD_VERSION}-linux-amd64.tar.gz \
https://github.com/containernetworking/plugins/releases/download/v${CNI_VERSION}/cni-plugins-linux-amd64-v${CNI_VERSION}.tgz \
https://github.com/opencontainers/runc/releases/download/v${RUNC_VERSION}/runc.amd64
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/${KUBE_LATEST}/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
sudo mkdir -p /opt/cni/bin
sudo chmod +x runc.amd64
sudo mv runc.amd64 /usr/local/bin/runc
sudo tar -xzvf containerd-${CONTAINERD_VERSION}-linux-amd64.tar.gz -C /usr/local
sudo tar -xzvf cni-plugins-linux-amd64-v${CNI_VERSION}.tgz -C /opt/cni/bin
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/${KUBE_LATEST}/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
}
```
Next create the `containerd` service unit.
```bash
cat <<EOF | sudo tee /etc/systemd/system/containerd.service
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target
[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/containerd
Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=infinity
# Comment TasksMax if your systemd version does not supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity
OOMScoreAdjust=-999
[Install]
WantedBy=multi-user.target
EOF
```
Now start it
Install `containerd` and the CNI tools, first refreshing the `apt` repos to get up-to-date versions.
```bash
{
sudo systemctl enable containerd
sudo systemctl start containerd
sudo apt update
sudo apt install -y containerd kubernetes-cni kubectl ipvsadm ipset
}
```
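Optionally, a quick sanity check that everything landed where the next steps expect (paths assume the Ubuntu `containerd` package and the Kubernetes `kubectl`/`kubernetes-cni` packages installed above):
```bash
containerd --version        # containerd from the Ubuntu repositories
kubectl version --client    # kubectl from the Kubernetes apt repository
ls /opt/cni/bin             # CNI plugin binaries provided by kubernetes-cni
```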
Set up `containerd` configuration to enable systemd Cgroups
```bash
{
sudo mkdir -p /etc/containerd
containerd config default | sed 's/SystemdCgroup = false/SystemdCgroup = true/' | sudo tee /etc/containerd/config.toml
}
```
Now restart `containerd` to read the new configuration
```bash
sudo systemctl restart containerd
```
Prev: [Bootstrapping the Kubernetes Control Plane](08-bootstrapping-kubernetes-controllers.md)</br>
Next: [Bootstrapping the Kubernetes Worker Nodes](10-bootstrapping-kubernetes-workers.md)


@ -108,10 +108,11 @@ All the following commands from here until the [verification](#verification) ste
```bash
KUBE_VERSION=$(curl -L -s https://dl.k8s.io/release/stable.txt)
wget -q --show-progress --https-only --timestamping \
https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kubectl \
https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kube-proxy \
https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kubelet
https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kube-proxy \
https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kubelet
```
Reference: https://kubernetes.io/releases/download/#binaries
@ -130,8 +131,8 @@ Install the worker binaries:
```bash
{
chmod +x kubectl kube-proxy kubelet
sudo mv kubectl kube-proxy kubelet /usr/local/bin/
chmod +x kube-proxy kubelet
sudo mv kube-proxy kubelet /usr/local/bin/
}
```
@ -168,6 +169,8 @@ CLUSTER_DNS=$(echo $SERVICE_CIDR | awk 'BEGIN {FS="."} ; { printf("%s.%s.%s.10",
Create the `kubelet-config.yaml` configuration file:
Reference: https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/
```bash
cat <<EOF | sudo tee /var/lib/kubelet/kubelet-config.yaml
kind: KubeletConfiguration
@ -181,9 +184,11 @@ authentication:
clientCAFile: /var/lib/kubernetes/pki/ca.crt
authorization:
mode: Webhook
containerRuntimeEndpoint: unix:///var/run/containerd/containerd.sock
clusterDomain: cluster.local
clusterDNS:
- ${CLUSTER_DNS}
cgroupDriver: systemd
resolvConf: /run/systemd/resolve/resolv.conf
runtimeRequestTimeout: "15m"
tlsCertFile: /var/lib/kubernetes/pki/${HOSTNAME}.crt
@ -207,7 +212,6 @@ Requires=containerd.service
[Service]
ExecStart=/usr/local/bin/kubelet \\
--config=/var/lib/kubelet/kubelet-config.yaml \\
--container-runtime-endpoint=unix:///var/run/containerd/containerd.sock \\
--kubeconfig=/var/lib/kubelet/kubelet.kubeconfig \\
--v=2
Restart=on-failure
@ -227,13 +231,15 @@ sudo mv kube-proxy.kubeconfig /var/lib/kube-proxy/
Create the `kube-proxy-config.yaml` configuration file:
Reference: https://kubernetes.io/docs/reference/config-api/kube-proxy-config.v1alpha1/
```bash
cat <<EOF | sudo tee /var/lib/kube-proxy/kube-proxy-config.yaml
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
kubeconfig: "/var/lib/kube-proxy/kube-proxy.kubeconfig"
mode: "iptables"
kubeconfig: /var/lib/kube-proxy/kube-proxy.kubeconfig
mode: ipvs
clusterCIDR: ${POD_CIDR}
EOF
```
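kube-proxy is now configured for `ipvs` mode rather than `iptables`. As an optional check once kube-proxy is running, you can list the virtual services it programs using `ipvsadm` (installed earlier along with `ipset`):
```bash
# Show IPVS virtual services and their backend (real server) mappings
sudo ipvsadm -Ln
```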
@ -261,7 +267,9 @@ EOF
At `worker-1` node, run the following, selecting option 4
```bash
[//]: # (command:./cert_verify.sh 4)
```
./cert_verify.sh
```
@ -294,7 +302,7 @@ kubectl get nodes --kubeconfig admin.kubeconfig
```
NAME STATUS ROLES AGE VERSION
worker-1 NotReady <none> 93s v1.24.3
worker-1 NotReady <none> 93s v1.28.4
```
The node is not ready as we have not yet installed pod networking. This comes later.


@ -212,11 +212,14 @@ Going forward all activities are to be done on the `worker-2` node until [step 1
### Download and Install Worker Binaries
Note that kubectl is required here to assist with creating the bootstrap kubeconfigs for kubelet and kube-proxy.
```bash
KUBE_VERSION=$(curl -L -s https://dl.k8s.io/release/stable.txt)
wget -q --show-progress --https-only --timestamping \
https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kubectl \
https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kube-proxy \
https://storage.googleapis.com/kubernetes-release/release/v1.24.3/bin/linux/amd64/kubelet
https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kube-proxy \
https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/amd64/kubelet
```
Reference: https://kubernetes.io/releases/download/#binaries
@ -235,8 +238,8 @@ Install the worker binaries:
```bash
{
chmod +x kubectl kube-proxy kubelet
sudo mv kubectl kube-proxy kubelet /usr/local/bin/
chmod +x kube-proxy kubelet
sudo mv kube-proxy kubelet /usr/local/bin/
}
```
Move the certificates and secure them.
@ -316,6 +319,8 @@ Reference: https://kubernetes.io/docs/reference/access-authn-authz/kubelet-tls-b
Create the `kubelet-config.yaml` configuration file:
Reference: https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/
```bash
cat <<EOF | sudo tee /var/lib/kubelet/kubelet-config.yaml
kind: KubeletConfiguration
@ -329,6 +334,8 @@ authentication:
clientCAFile: /var/lib/kubernetes/pki/ca.crt
authorization:
mode: Webhook
containerRuntimeEndpoint: unix:///var/run/containerd/containerd.sock
cgroupDriver: systemd
clusterDomain: "cluster.local"
clusterDNS:
- ${CLUSTER_DNS}
@ -360,7 +367,6 @@ ExecStart=/usr/local/bin/kubelet \\
--config=/var/lib/kubelet/kubelet-config.yaml \\
--kubeconfig=/var/lib/kubelet/kubeconfig \\
--cert-dir=/var/lib/kubelet/pki/ \\
--container-runtime-endpoint=unix:///var/run/containerd/containerd.sock \\
--v=2
Restart=on-failure
RestartSec=5
@ -379,6 +385,7 @@ Things to note here:
In one of the previous steps we created the kube-proxy.kubeconfig file. Check [here](https://github.com/mmumshad/kubernetes-the-hard-way/blob/master/docs/05-kubernetes-configuration-files.md) if you missed it.
```bash
{
sudo mv kube-proxy.kubeconfig /var/lib/kube-proxy/
@ -389,13 +396,15 @@ In one of the previous steps we created the kube-proxy.kubeconfig file. Check [h
Create the `kube-proxy-config.yaml` configuration file:
Reference: https://kubernetes.io/docs/reference/config-api/kube-proxy-config.v1alpha1/
```bash
cat <<EOF | sudo tee /var/lib/kube-proxy/kube-proxy-config.yaml
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
kubeconfig: /var/lib/kube-proxy/kube-proxy.kubeconfig
mode: iptables
mode: ipvs
clusterCIDR: ${POD_CIDR}
EOF
```
@ -437,7 +446,10 @@ On worker-2:
At `worker-2` node, run the following, selecting option 5
```bash
[//]: # (command:sleep 5)
[//]: # (command:./cert_verify.sh 5)
```
./cert_verify.sh
```
@ -447,7 +459,8 @@ At `worker-2` node, run the following, selecting option 5
Now, go back to `master-1` and approve the pending kubelet-serving certificate
[//]: # (host:master-1)
[//]: # (comment:Please now manually approve the certificate before proceeding)
[//]: # (command:sudo apt install -y jq)
[//]: # (command:kubectl certificate approve --kubeconfig admin.kubeconfig $(kubectl get csr --kubeconfig admin.kubeconfig -o json | jq -r '.items | .[] | select(.spec.username == "system:node:worker-2") | .metadata.name'))
```bash
kubectl get csr --kubeconfig admin.kubeconfig
@ -464,7 +477,7 @@ csr-n7z8p 98s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap
Approve the pending certificate. Note that the certificate name `csr-7k8nh` will be different for you, and will differ each time you run through the lab.
```
kubectl certificate approve csr-7k8nh --kubeconfig admin.kubeconfig
kubectl certificate approve --kubeconfig admin.kubeconfig csr-7k8nh
```
@ -484,8 +497,8 @@ kubectl get nodes --kubeconfig admin.kubeconfig
```
NAME STATUS ROLES AGE VERSION
worker-1 NotReady <none> 93s v1.24.3
worker-2 NotReady <none> 93s v1.24.3
worker-1 NotReady <none> 93s v1.28.4
worker-2 NotReady <none> 93s v1.28.4
```
Prev: [Bootstrapping the Kubernetes Worker Nodes](10-bootstrapping-kubernetes-workers.md)</br>


@ -71,8 +71,8 @@ kubectl get nodes
```
NAME STATUS ROLES AGE VERSION
worker-1 NotReady <none> 118s v1.24.3
worker-2 NotReady <none> 118s v1.24.3
worker-1 NotReady <none> 118s v1.28.4
worker-2 NotReady <none> 118s v1.28.4
```
Prev: [TLS Bootstrapping Kubernetes Workers](11-tls-bootstrapping-kubernetes-workers.md)</br>


@ -15,9 +15,10 @@ On `master-1`
```bash
kubectl apply -f "https://github.com/weaveworks/weave/releases/download/v2.8.1/weave-daemonset-k8s-1.11.yaml"
```
Weave uses POD CIDR of `10.32.0.0/12` by default.
Weave uses POD CIDR of `10.244.0.0/16` by default.
## Verification
@ -47,8 +48,8 @@ kubectl get nodes
```
NAME STATUS ROLES AGE VERSION
worker-1 Ready <none> 4m11s v1.24.3
worker-2 Ready <none> 2m49s v1.24.3
worker-1 Ready <none> 4m11s v1.28.4
worker-2 Ready <none> 2m49s v1.28.4
```
Reference: https://kubernetes.io/docs/tasks/administer-cluster/network-policy-provider/weave-network-policy/#install-the-weave-net-addon


@ -48,7 +48,7 @@ Reference: https://kubernetes.io/docs/tasks/administer-cluster/coredns/#installi
Create a `busybox` pod:
```bash
kubectl run busybox --image=busybox:1.28 --command -- sleep 3600
kubectl run busybox -n default --image=busybox:1.28 --restart Never --command -- sleep 15
```
[//]: # (command:kubectl wait pods -n default -l run=busybox --for condition=Ready --timeout=90s)
@ -57,7 +57,7 @@ kubectl run busybox --image=busybox:1.28 --command -- sleep 3600
List the pod created by the `busybox` pod:
```bash
kubectl get pods -l run=busybox
kubectl get pods -n default -l run=busybox
```
> output
@ -70,7 +70,7 @@ busybox-bd8fb7cbd-vflm9 1/1 Running 0 10s
Execute a DNS lookup for the `kubernetes` service inside the `busybox` pod:
```bash
kubectl exec -ti busybox -- nslookup kubernetes
kubectl exec -ti -n default busybox -- nslookup kubernetes
```
> output


@ -151,5 +151,14 @@ kubectl exec -ti $POD_NAME -- nginx -v
nginx version: nginx/1.23.1
```
Clean up test resources
```bash
kubectl delete pod -n default busybox
kubectl delete service -n default nginx
kubectl delete deployment -n default nginx
```
Prev: [DNS Addon](15-dns-addon.md)</br>
Next: [End to End Tests](17-e2e-tests.md)


@ -1,5 +1,10 @@
# Run End-to-End Tests
Observations by Alistair (KodeKloud):
Depending on your computer, you may have varying success with these. I have found them to run much more smoothly on a 12 core Intel(R) Core(TM) i7-7800X desktop processor (circa 2017) than on a 20 core Intel(R) Core(TM) i7-12700H laptop processor (circa 2022) - both machines having 32GB RAM and running the same version of VirtualBox. On the latter, the tests tend to destabilize the cluster, resulting in timeouts. This *may* be a processor issue, in that laptop processors are not really designed for the kind of load these tests throw at a kube cluster, which really ought to run on server-class processors. Laptop processors do odd things for power conservation, like constantly varying the clock speed and mixing "performance" and "efficiency" cores even when the laptop is plugged in, and this could be causing synchronization issues with the goroutines running in the kube components. If anyone has a definitive explanation for this, please do post in the kubernetes-the-hard-way Slack channel.
## Install latest Go
```bash
@ -7,35 +12,55 @@ GO_VERSION=$(curl -s 'https://go.dev/VERSION?m=text' | head -1)
wget "https://dl.google.com/go/${GO_VERSION}.linux-amd64.tar.gz"
sudo tar -C /usr/local -xzf ${GO_VERSION}.linux-amd64.tar.gz
sudo ln -s /usr/local/go/bin/go /usr/local/bin/go
sudo ln -s /usr/local/go/bin/gofmt /usr/local/bin/gofmt
source <(go env)
export PATH=$PATH:$GOPATH/bin
```
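Optionally confirm that Go is installed and on the path:
```bash
go version
```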
## Install kubetest
## Install kubetest2
Here we install kubetest2, plus the Google Cloud CLI, which kubetest2 uses to pull the test packages for our version of the cluster. kubetest2 will download and then compile, which takes a few minutes.
```bash
git clone --depth 1 https://github.com/kubernetes/test-infra.git
cd test-infra/kubetest
export GOPATH="$HOME/go"
export PATH=$PATH:/usr/local/go/bin:$GOPATH/bin
go build
go install sigs.k8s.io/kubetest2/...@latest
sudo snap install google-cloud-cli --classic
```
> Note: it will take a while to build as it has many dependencies.
## Run test
Here we set up a couple of environment variables to supply arguments to the test package - the version of our cluster and the number of CPUs on `master-1` to aid with test parallelization.
## Use the version specific to your cluster
Then we invoke the test package
```bash
sudo apt install jq -y
KUBE_VERSION=$(curl -L -s https://dl.k8s.io/release/stable.txt)
NUM_CPU=$(cat /proc/cpuinfo | grep '^processor' | wc -l)
cd ~
kubetest2 noop --kubeconfig ${PWD}/.kube/config --test=ginkgo -- \
--focus-regex='\[Conformance\]' --test-package-version $KUBE_VERSION --logtostderr --parallel $NUM_CPU
```
```bash
K8S_VERSION=$(kubectl version -o json | jq -r '.serverVersion.gitVersion')
export KUBERNETES_CONFORMANCE_TEST=y
export KUBECONFIG="$HOME/.kube/config"
While this is running, you can open an additional session on `master-1` from your workstation and watch the activity in the cluster
./kubetest --provider=skeleton --test --test_args=”--ginkgo.focus=\[Conformance\]” --extract ${K8S_VERSION} | tee test.out
```
vagrant ssh master-1
```
This could take *18 hours or more*! There are several thousand tests in the suite. The number of tests run and passed will be displayed at the end. Expect some failures as it tries tests that aren't supported by our cluster, e.g. mounting persistent volumes using NFS.
then
```
watch kubectl get all -A
```
Observations by Alistair (KodeKloud):
This should take up to an hour to run. The number of tests run and passed will be displayed at the end. Expect some failures!
I am not able to say exactly why the failed tests fail. It would take days to go through the truly enormous test code base to determine why the tests that fail do so.
Prev: [Smoke Test](16-smoke-test.md)


@ -7,23 +7,8 @@ This script was developed to assist the verification of certificates for each Ku
It is important that the script is executed with the following commands after logging into the respective virtual machine (master-1 / master-2 / worker-1) via SSH.
```bash
cd /home/vagrant
cd ~
bash cert_verify.sh
```
Following are the successful output of script execution under different nodes,
1. VM: Master-1
![Master-1-Cert-Verification](./images/master-1-cert.png)
2. VM: Master-2
![Master-2-Cert-Verification](./images/master-2-cert.png)
3. VM: Worker-1
![Worker-1-Cert-Verification](./images/worker-1-cert.png)
Any misconfiguration in certificates will be reported in red.
All successful validations are in green text, errors in red.


@ -41,13 +41,14 @@ if not os.path.isdir(qs_path):
newline = chr(10) # In case running on Windows (plus writing files as binary to not convert to \r\n)
file_number_rx = re.compile(r'^(?P<number>\d+)')
comment_rx = re.compile(r'^\[//\]:\s\#\s\((?P<token>\w+):(?P<value>[^\)]+)\)')
comment_rx = re.compile(r'^\[//\]:\s\#\s\((?P<token>\w+):(?P<value>.*)\)\s*$')
choice_rx = re.compile(r'^\s*-+\s+OR\s+-+')
script_begin = '```bash'
script_end = '```'
script_open = ('{' + newline).encode('utf-8')
script_close = '}'.encode('utf-8')
script_close = '\n}'.encode('utf-8')
current_host = None
file_nos = []
def write_script(filename: str, script: list):
path = os.path.join(qs_path, filename)
@ -57,18 +58,29 @@ def write_script(filename: str, script: list):
f.write(script_close)
print(f'-> {path}')
output_file_no = 1
script = []
output_file = None
for doc in glob.glob(os.path.join(docs_path, '*.md')):
print(doc)
script = []
state = State.NONE
ignore_next_script = False
m = file_number_rx.search(os.path.basename(doc))
if not m:
continue
file_no = m['number']
if int(file_no) < 3:
continue
file_nos.append(file_no)
section = 0
output_file = None
script.extend([
"##################################################",
"#",
f"# {os.path.basename(doc)}",
"#",
"##################################################",
""
])
with codecs.open(doc, "r", encoding='utf-8') as f:
for line in f.readlines():
line = line.rstrip()
@ -78,11 +90,24 @@ for doc in glob.glob(os.path.join(docs_path, '*.md')):
token = m['token']
value = m['value']
if token == 'host':
if script:
if script and current_host and current_host != value:
#fns = file_no if len(file_nos) < 2 else '-'.join(file_nos[:-1])
script.append('set +e')
output_file = os.path.join(qs_path, f'{output_file_no}-{current_host}.sh')
write_script(output_file, script)
script = []
output_file_no += 1
script = [
"##################################################",
"#",
f"# {os.path.basename(doc)}",
"#",
"##################################################",
""
]
file_nos = [file_no]
output_file = os.path.join(qs_path, f'{file_no}{chr(97 + section)}-{value}.sh')
section += 1
current_host = value
elif token == 'sleep':
script.extend([
f'echo "Sleeping {value}s"',
@ -112,8 +137,14 @@ for doc in glob.glob(os.path.join(docs_path, '*.md')):
state = State.NONE
script.append(newline)
ignore_next_script = False
# elif line.startswith('source') or line.startswith('export'):
# script.append('}')
# script.append(line)
# script.append('{')
elif not (ignore_next_script or line == '{' or line == '}'):
script.append(line)
if output_file and script:
if script:
# fns = '-'.join(file_nos[1:])
output_file = os.path.join(qs_path, f'{output_file_no}-{current_host}.sh')
write_script(output_file, script)

vagrant/Vagrantfile

@ -1,15 +1,42 @@
# -*- mode: ruby -*-
# vi:set ft=ruby sw=2 ts=2 sts=2:
# Define the number of master and worker nodes
# If this number is changed, remember to update setup-hosts.sh script with the new hosts IP details in /etc/hosts of each VM.
NUM_MASTER_NODE = 2
NUM_WORKER_NODE = 2
# Define how much memory your computer has in GB (e.g. 8, 16)
# Larger nodes will be created if you have more.
RAM_SIZE = 16
# Define how many CPU cores you have.
# More powerful workers will be created if you have more
CPU_CORES = 8
# Internal network prefix for the VM network
# See the documentation before changing this
IP_NW = "192.168.56."
MASTER_IP_START = 10
NODE_IP_START = 20
LB_IP_START = 30
# Calculate resource amounts
# based on RAM/CPU
ram_selector = (RAM_SIZE / 4) * 4
if ram_selector < 8
raise "Unsufficient memory #{RAM_SIZE}GB. min 8GB"
end
RESOURCES = {
"master" => {
1 => {
# master-1 bigger since it may run e2e tests.
"ram" => [ram_selector * 128, 2048].max(),
"cpu" => CPU_CORES >= 12 ? 4 : 2,
},
2 => {
# All additional masters get this
"ram" => [ram_selector * 128, 2048].min(),
"cpu" => CPU_CORES > 8 ? 2 : 1,
},
},
"worker" => {
"ram" => [ram_selector * 128, 4096].min(),
"cpu" => (((CPU_CORES / 4) * 4) - 4) / 4,
},
}
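# Worked example (illustrative): with the defaults RAM_SIZE = 16 and CPU_CORES = 8,
# ram_selector = 16, so master-1 gets max(16*128, 2048) = 2048 MB RAM and 2 CPUs,
# master-2 gets min(16*128, 2048) = 2048 MB and 1 CPU, and each worker gets
# min(16*128, 4096) = 2048 MB and (((8/4)*4)-4)/4 = 1 CPU.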
# Sets up hosts file and DNS
def setup_dns(node)
@ -25,18 +52,23 @@ end
def provision_kubernetes_node(node)
# Set up kernel parameters, modules and tunables
node.vm.provision "setup-kernel", :type => "shell", :path => "ubuntu/setup-kernel.sh"
# Restart
node.vm.provision :shell do |shell|
shell.privileged = true
shell.inline = "echo Rebooting"
shell.reboot = true
end
# Set up ssh
node.vm.provision "setup-ssh", :type => "shell", :path => "ubuntu/ssh.sh"
# Set up DNS
setup_dns node
# Install cert verification script
node.vm.provision "shell", inline: "ln -s /vagrant/ubuntu/cert_verify.sh /home/vagrant/cert_verify.sh"
end
# Define the number of master and worker nodes. You should not change this
NUM_MASTER_NODE = 2
NUM_WORKER_NODE = 2
# Host address start points
MASTER_IP_START = 10
NODE_IP_START = 20
LB_IP_START = 30
# All Vagrant configuration is done below. The "2" in Vagrant.configure
# configures the configuration version (we support older styles for
# backwards compatibility). Please don't change it unless you know what
@ -50,6 +82,7 @@ Vagrant.configure("2") do |config|
# boxes at https://vagrantcloud.com/search.
# config.vm.box = "base"
config.vm.box = "ubuntu/jammy64"
config.vm.boot_timeout = 900
# Disable automatic box update checking. If you disable this, then
# boxes will only be checked for updates when the user runs
@ -62,12 +95,8 @@ Vagrant.configure("2") do |config|
# Name shown in the GUI
node.vm.provider "virtualbox" do |vb|
vb.name = "kubernetes-ha-master-#{i}"
if i == 1
vb.memory = 2048 # More needed to run e2e tests at end
else
vb.memory = 1024
end
vb.cpus = 2
vb.memory = RESOURCES["master"][i > 2 ? 2 : i]["ram"]
vb.cpus = RESOURCES["master"][i > 2 ? 2 : i]["cpu"]
end
node.vm.hostname = "master-#{i}"
node.vm.network :private_network, ip: IP_NW + "#{MASTER_IP_START + i}"
@ -91,6 +120,8 @@ Vagrant.configure("2") do |config|
node.vm.hostname = "loadbalancer"
node.vm.network :private_network, ip: IP_NW + "#{LB_IP_START}"
node.vm.network "forwarded_port", guest: 22, host: 2730
# Set up ssh
node.vm.provision "setup-ssh", :type => "shell", :path => "ubuntu/ssh.sh"
setup_dns node
end
@ -99,8 +130,8 @@ Vagrant.configure("2") do |config|
config.vm.define "worker-#{i}" do |node|
node.vm.provider "virtualbox" do |vb|
vb.name = "kubernetes-ha-worker-#{i}"
vb.memory = 1024
vb.cpus = 1
vb.memory = RESOURCES["worker"]["ram"]
vb.cpus = RESOURCES["worker"]["cpu"]
end
node.vm.hostname = "worker-#{i}"
node.vm.network :private_network, ip: IP_NW + "#{NODE_IP_START + i}"


@ -156,9 +156,15 @@ check_cert_only()
printf "${FAILED}Exiting...Found mismtach in the ${name} certificate, More details: https://github.com/mmumshad/kubernetes-the-hard-way/blob/master/docs/04-certificate-authority.md#certificate-authority\n${NC}"
exit 1
fi
else
if [[ $cert == *kubelet-client-current* ]]
then
printf "${FAILED}${cert} missing. This probably means that kubelet failed to start.${NC}\n"
echo -e "Check logs with\n\n sudo journalctl -u kubelet\n"
else
printf "${FAILED}${cert} missing. More details: https://github.com/mmumshad/kubernetes-the-hard-way/blob/master/docs/04-certificate-authority.md#certificate-authority\n${NC}"
echo "These should be in ${CERT_LOCATION}${NC}"
echo "These should be in ${CERT_LOCATION}"
fi
exit 1
fi
}
@ -425,17 +431,27 @@ check_systemd_ks()
# END OF Function - Master node #
if [ ! -z "$1" ]
then
choice=$1
else
echo "This script will validate the certificates in master as well as worker-1 nodes. Before proceeding, make sure you ssh into the respective node [ Master or Worker-1 ] for certificate validation"
while true
do
echo
echo " 1. Verify certificates on Master Nodes after step 4"
echo " 2. Verify kubeconfigs on Master Nodes after step 5"
echo " 3. Verify kubeconfigs and PKI on Master Nodes after step 8"
echo " 4. Verify kubeconfigs and PKI on worker-1 Node after step 10"
echo " 5. Verify kubeconfigs and PKI on worker-2 Node after step 11"
echo
echo -n "Please select one of the above options: "
read choice
echo "This script will validate the certificates in master as well as worker-1 nodes. Before proceeding, make sure you ssh into the respective node [ Master or Worker-1 ] for certificate validation"
echo
echo " 1. Verify certificates on Master Nodes after step 4"
echo " 2. Verify kubeconfigs on Master Nodes after step 5"
echo " 3. Verify kubeconfigs and PKI on Master Nodes after step 8"
echo " 4. Verify kubeconfigs and PKI on worker-1 Node after step 10"
echo " 5. Verify kubeconfigs and PKI on worker-2 Node after step 11"
echo
echo -n "Please select one of the above options: "
read value
[ -z "$choice" ] && continue
[ $choice -gt 0 -a $choice -lt 6 ] && break
done
fi
HOST=$(hostname -s)
@ -450,7 +466,7 @@ SUBJ_SA="Subject:CN=service-accounts,O=Kubernetes"
SUBJ_ETCD="Subject:CN=etcd-server,O=Kubernetes"
SUBJ_APIKC="Subject:CN=kube-apiserver-kubelet-client,O=system:masters"
case $value in
case $choice in
1)
if ! [ "${HOST}" = "master-1" -o "${HOST}" = "master-2" ]
@ -459,7 +475,7 @@ case $value in
exit 1
fi
echo -e "The selected option is $value, proceeding the certificate verification of Master node"
echo -e "The selected option is $choice, proceeding the certificate verification of Master node"
CERT_LOCATION=$HOME
check_cert_and_key "ca" $SUBJ_CA $CERT_ISSUER


@ -1,19 +1,27 @@
#!/bin/bash
#
# Sets up the kernel with the requirements for running Kubernetes
# Requires a reboot, which is carried out by the vagrant provisioner.
set -ex
# Disable cgroups v2 (kernel command line parameter)
sed -i 's/GRUB_CMDLINE_LINUX_DEFAULT="/GRUB_CMDLINE_LINUX_DEFAULT="systemd.unified_cgroup_hierarchy=0 ipv6.disable=1 /' /etc/default/grub
update-grub
set -e
# Add br_netfilter kernel module
echo "br_netfilter" >> /etc/modules
cat <<EOF >> /etc/modules
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
br_netfilter
nf_conntrack
EOF
systemctl restart systemd-modules-load.service
# Set network tunables
cat <<EOF >> /etc/sysctl.d/10-kubernetes.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.bridge.bridge-nf-call-iptables=1
net.ipv4.ip_forward=1
EOF
sysctl --system
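# To verify after the post-provision reboot (e.g. via "vagrant ssh <node>"), one can run
#   lsmod | grep -E 'ip_vs|br_netfilter'
#   sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables
# to confirm the modules and tunables above took effect.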

vagrant/ubuntu/ssh.sh

@ -0,0 +1,5 @@
#!/bin/bash
# Enable password auth in sshd so we can use ssh-copy-id
sed -i 's/PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config
systemctl restart sshd