
# Bootstrapping a H/A etcd cluster

In this lab you will bootstrap a three-node etcd cluster. The following virtual machines will be used:

* controller0
* controller1
* controller2
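
Three members is the smallest cluster size that survives the loss of one member, because etcd needs a majority of members (a quorum) to agree before it commits a write. The sketch below works through that arithmetic; the helper names `quorum` and `tolerated_failures` are hypothetical and are not part of etcd.

```shell
# Majority (quorum) size for a cluster of $1 members.
quorum() { echo $(( $1 / 2 + 1 )); }

# Number of members that can fail while a quorum remains.
tolerated_failures() { echo $(( $1 - ($1 / 2 + 1) )); }

quorum 3              # prints 2
tolerated_failures 3  # prints 1
```

A four-member cluster still tolerates only one failure, which is why odd cluster sizes are the usual choice.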

## Why

All Kubernetes components are stateless, which greatly simplifies managing a Kubernetes cluster. All state is stored in etcd, which is a database and must be treated with special care. To limit the compute resources needed to complete this lab, etcd is installed on the Kubernetes controller nodes, although some people prefer to run etcd on a dedicated set of machines for the following reasons:

* The etcd lifecycle is not tied to Kubernetes. We should be able to upgrade etcd independently of Kubernetes.
* Scaling out etcd is different from scaling out the Kubernetes control plane.
* Dedicated machines prevent other applications from taking up resources (CPU, memory, I/O) required by etcd.

However, all the e2e tested configurations currently run etcd on the master nodes.

## Provision the etcd Cluster

Run the following commands on `controller0`, `controller1`, and `controller2`:

### TLS Certificates

The TLS certificates created in the [Setting up a CA and TLS Cert Generation](02-certificate-authority.md) lab will be used to secure communication between the Kubernetes API server and the etcd cluster. The TLS certificates will also be used to limit access to the etcd cluster using TLS client authentication. Only clients with a TLS certificate signed by a trusted CA will be able to access the etcd cluster.

Copy the TLS certificates to the etcd configuration directory:

```
sudo mkdir -p /etc/etcd/
```

```
sudo cp ca.pem kubernetes-key.pem kubernetes.pem /etc/etcd/
```
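
Before moving on, it can help to confirm that all three files arrived in the configuration directory. The following is a minimal sketch; `missing_certs` is a hypothetical helper that only checks that each file exists and is non-empty, not that the certificates are valid.

```shell
# Print the name of every expected certificate file that is missing
# or empty in the directory given as $1.
missing_certs() {
  dir="$1"
  for f in ca.pem kubernetes-key.pem kubernetes.pem; do
    [ -s "$dir/$f" ] || echo "$f"
  done
}
```

Running `missing_certs /etc/etcd` after the copy should print nothing.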

### Download and Install the etcd binaries

Download the official etcd release binaries from the `coreos/etcd` GitHub project:

```
wget https://github.com/coreos/etcd/releases/download/v3.1.4/etcd-v3.1.4-linux-amd64.tar.gz
```

Extract and install the `etcd` server binary and the `etcdctl` command line client: 

```
tar -xvf etcd-v3.1.4-linux-amd64.tar.gz
```

```
sudo mv etcd-v3.1.4-linux-amd64/etcd* /usr/bin/
```
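
The version string appears in all three commands above. As a sketch, assuming the release URL layout stays the same across versions, a hypothetical helper `etcd_url` can build the download URL from a single version value:

```shell
# Build the release tarball URL for the etcd version given as $1,
# following the URL pattern used in this lab.
etcd_url() {
  echo "https://github.com/coreos/etcd/releases/download/$1/etcd-$1-linux-amd64.tar.gz"
}

etcd_url v3.1.4
```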

All etcd data is stored under the etcd data directory. In a production cluster the data directory should be backed by a persistent disk. Create the etcd data directory:

```
sudo mkdir -p /var/lib/etcd
```

### Set the Internal IP Address

The internal IP address will be used by etcd to serve client requests and communicate with other etcd peers. Retrieve it from the Google Compute Engine metadata server:

```
INTERNAL_IP=$(curl -s -H "Metadata-Flavor: Google" \
  http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/0/ip)
```
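
If the metadata request fails, `INTERNAL_IP` is left empty and the later steps produce a broken unit file. A loose format check can catch that early. This is a sketch only: `is_ipv4` is a hypothetical helper that checks the dotted-quad shape and does not validate octet ranges.

```shell
# Succeed if $1 looks like a dotted-quad IPv4 address.
is_ipv4() {
  echo "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

is_ipv4 "10.240.0.10" && echo "looks like an IPv4 address"
```

On a controller node, `is_ipv4 "$INTERNAL_IP" || echo "INTERNAL_IP is not set correctly"` would flag an empty or malformed value.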

Each etcd member must have a unique name within an etcd cluster. Set the etcd name from the last digit of the internal IP address (character 11 of addresses such as `10.240.0.10`):

```
ETCD_NAME=controller$(echo $INTERNAL_IP | cut -c 11)
```
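
The `cut -c 11` step depends on the addressing scheme used in this lab. The sketch below wraps the same command in a hypothetical function, `etcd_name_for_ip`, to show the mapping for each controller address:

```shell
# Derive the member name from the 11th character of the IP address in $1.
# This only works for addresses shaped like 10.240.0.1x.
etcd_name_for_ip() {
  echo "controller$(echo "$1" | cut -c 11)"
}

etcd_name_for_ip 10.240.0.10   # prints controller0
etcd_name_for_ip 10.240.0.11   # prints controller1
etcd_name_for_ip 10.240.0.12   # prints controller2
```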

The etcd server will be started and managed by systemd. Create the etcd systemd unit file:

```
cat > etcd.service <<EOF
[Unit]
Description=etcd
Documentation=https://github.com/coreos

[Service]
ExecStart=/usr/bin/etcd \\
  --name ${ETCD_NAME} \\
  --cert-file=/etc/etcd/kubernetes.pem \\
  --key-file=/etc/etcd/kubernetes-key.pem \\
  --peer-cert-file=/etc/etcd/kubernetes.pem \\
  --peer-key-file=/etc/etcd/kubernetes-key.pem \\
  --trusted-ca-file=/etc/etcd/ca.pem \\
  --peer-trusted-ca-file=/etc/etcd/ca.pem \\
  --initial-advertise-peer-urls https://${INTERNAL_IP}:2380 \\
  --listen-peer-urls https://${INTERNAL_IP}:2380 \\
  --listen-client-urls https://${INTERNAL_IP}:2379,http://127.0.0.1:2379 \\
  --advertise-client-urls https://${INTERNAL_IP}:2379 \\
  --initial-cluster-token etcd-cluster-0 \\
  --initial-cluster controller0=https://10.240.0.10:2380,controller1=https://10.240.0.11:2380,controller2=https://10.240.0.12:2380 \\
  --initial-cluster-state new \\
  --data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
```
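
Because the heredoc delimiter is unquoted, the shell expands `${ETCD_NAME}` and `${INTERNAL_IP}` while writing the file, and each `\\` is written as a single `\` line continuation. The following sketch shows that rendering on a two-line sample; `render_sample` is a hypothetical name used only for illustration.

```shell
# Render a shortened ExecStart fragment the same way the unit file is
# written: variables are expanded and "\\" becomes "\".
render_sample() {
  ETCD_NAME="$1"
  cat <<EOF
ExecStart=/usr/bin/etcd \\
  --name ${ETCD_NAME}
EOF
}

render_sample controller0
```

Inspecting the generated file with `cat etcd.service` should likewise show concrete values in place of the variables.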

Once the etcd systemd unit file is ready, move it to the systemd system directory:

```
sudo mv etcd.service /etc/systemd/system/
```

Start the etcd server:

```
sudo systemctl daemon-reload
```

```
sudo systemctl enable etcd
```

```
sudo systemctl start etcd
```

```
sudo systemctl status etcd --no-pager
```

> Remember to run these steps on `controller0`, `controller1`, and `controller2`.

## Verification

Once all three etcd nodes have been bootstrapped, verify that the etcd cluster is healthy:

* On one of the controller nodes, run the following command:

```
sudo etcdctl \
  --ca-file=/etc/etcd/ca.pem \
  --cert-file=/etc/etcd/kubernetes.pem \
  --key-file=/etc/etcd/kubernetes-key.pem \
  cluster-health
```

```
member 3a57933972cb5131 is healthy: got healthy result from https://10.240.0.12:2379
member f98dc20bce6225a0 is healthy: got healthy result from https://10.240.0.10:2379
member ffed16798470cab5 is healthy: got healthy result from https://10.240.0.11:2379
cluster is healthy
```
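
For scripted checks, the number of healthy members can be counted from that output. This is a sketch that assumes the output format shown above; `count_healthy` is a hypothetical helper that reads the `cluster-health` output on standard input.

```shell
# Count the lines that report a healthy member. The trailing colon in the
# pattern excludes the final "cluster is healthy" summary line.
count_healthy() {
  grep -c ' is healthy: '
}

printf '%s\n' \
  'member 3a57933972cb5131 is healthy: got healthy result from https://10.240.0.12:2379' \
  'member f98dc20bce6225a0 is healthy: got healthy result from https://10.240.0.10:2379' \
  'member ffed16798470cab5 is healthy: got healthy result from https://10.240.0.11:2379' \
  'cluster is healthy' | count_healthy   # prints 3
```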