Make sure that swap is disabled and not running. If swap was not removed from the installer's disk proposal, disable it now:
$ systemctl disable swap.target
$ swapoff -a
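To keep swap off across reboots, also comment out the swap entry in /etc/fstab. A quick sed sketch (double-check the pattern against your own fstab before running it):
$ sed -i '/\sswap\s/ s/^/#/' /etc/fstab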
Installing Containerd
# Configure persistent loading of modules
sudo tee /etc/modules-load.d/containerd.conf <<EOF
overlay
br_netfilter
EOF

# Load at runtime
sudo modprobe overlay
sudo modprobe br_netfilter

# Ensure sysctl params are set
sudo tee /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

# Reload configs
sudo sysctl --system
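As a quick sanity check (not part of the original steps), verify that the modules are loaded and the sysctl settings are active:
lsmod | grep -E 'overlay|br_netfilter'
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward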
# Install required packages
sudo apt install -y curl gnupg2 software-properties-common apt-transport-https ca-certificates
There are two ways to install containerd: either from the Docker repositories, or from the distribution's repositories.
# Add Docker repo
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

# Install containerd
sudo apt update
From the Docker repositories
sudo apt install -y containerd.io
From the distribution repositories
sudo apt install containerd
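Whichever repository you use, you can confirm the installed version afterwards:
containerd --version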
Create a containerd configuration file
sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml
Set the cgroup driver for runc to systemd
This is required by the kubelet.
For more information on this config file, see the containerd configuration docs.
In /etc/containerd/config.toml, locate the following section:
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options] ...
Around line 112, change the value of SystemdCgroup from false to true.
SystemdCgroup = true
If you like, you can use sed to make the change without editing the file manually.
sudo sed -i 's/ SystemdCgroup = false/ SystemdCgroup = true/' /etc/containerd/config.toml
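Either way, a quick grep confirms the change took effect:
grep -n 'SystemdCgroup' /etc/containerd/config.toml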
Restart containerd with the new configuration
# restart containerd
sudo systemctl restart containerd
sudo systemctl enable containerd
systemctl status containerd
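Optionally, you can also query the daemon with the bundled ctr client to make sure it is responding on its socket:
sudo ctr version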
And that’s it, from here you can install and configure Kubernetes on top of this container runtime. In an upcoming post, I will bootstrap a cluster using containerd as the container runtime.
If the nodes are not labeled as workers:
kubectl label node worker001 node-role.kubernetes.io/worker=worker
kubectl label node worker002 node-role.kubernetes.io/worker=worker
To remove the worker label from node 2:
kubectl label node worker002 node-role.kubernetes.io/worker-
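To check the resulting roles and labels:
kubectl get nodes --show-labels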
RKE2 - HA Clusters
This section describes how to install a high availability (HA) RKE2 cluster. An HA RKE2 cluster consists of:
- A fixed registration address that is placed in front of server nodes to allow other nodes to register with the cluster
- An odd number (three recommended) of server nodes that will run etcd, the Kubernetes API, and other control plane services
- Zero or more agent nodes that are designated to run your apps and services
Agents register through the fixed registration address. However, when RKE2 launches the kubelet and it must connect to the Kubernetes api-server, it does so through the rke2 agent process, which acts as a client-side load balancer.
Setting up an HA cluster requires the following steps:
- Configure a fixed registration address
- Launch the first server node
- Join additional server nodes
- Join agent nodes
1. Configure the Fixed Registration Address
Server nodes beyond the first one and all agent nodes need a URL to register against. This can be the IP or hostname of any of the server nodes, but in many cases those may change over time as nodes are created and destroyed. Therefore, you should have a stable endpoint in front of the server nodes.
This endpoint can be set up using any number of approaches, such as:
- A layer 4 (TCP) load balancer
- Round-robin DNS
- Virtual or elastic IP addresses
This endpoint can also be used for accessing the Kubernetes API. So you can, for example, modify your kubeconfig file to point to it instead of a specific node.
Note that the rke2 server process listens on port 9345 for new nodes to register. The Kubernetes API is served on port 6443, as normal. Configure your load balancer accordingly.
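As an illustration, here is a minimal HAProxy sketch for such a layer 4 load balancer, forwarding both the registration and API ports to the three server nodes. The master002/master003 addresses are placeholders for this example, not values taken from the original setup:

frontend rke2-registration
    bind *:9345
    mode tcp
    default_backend rke2-servers-registration

frontend rke2-apiserver
    bind *:6443
    mode tcp
    default_backend rke2-servers-apiserver

backend rke2-servers-registration
    mode tcp
    balance roundrobin
    server master001 10.75.168.101:9345 check
    server master002 10.75.168.102:9345 check
    server master003 10.75.168.103:9345 check

backend rke2-servers-apiserver
    mode tcp
    balance roundrobin
    server master001 10.75.168.101:6443 check
    server master002 10.75.168.102:6443 check
    server master003 10.75.168.103:6443 check

Round-robin DNS or a virtual IP (for example with keepalived) works just as well; the only requirement is a stable name or address that always reaches a healthy server node.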
2. Launch the first server node
root@master001:~# curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="server" sh -
[INFO] finding release for channel stable
[INFO] using v1.22.7+rke2r1 as release
[INFO] downloading checksums at https://github.com/rancher/rke2/releases/download/v1.22.7+rke2r1/sha256sum-amd64.txt
[INFO] downloading tarball at https://github.com/rancher/rke2/releases/download/v1.22.7+rke2r1/rke2.linux-amd64.tar.gz
[INFO] verifying tarball
[INFO] unpacking tarball file to /usr/local
3. Configure the server nodes
The first server node establishes the secret token that other server or agent nodes will register with when connecting to the cluster.
To specify your own pre-shared secret as the token, set the token argument on startup.
If you do not specify a pre-shared secret, RKE2 will generate one and place it at /var/lib/rancher/rke2/server/node-token.
To avoid certificate errors with the fixed registration address, you should launch the server with the tls-san parameter set. This option adds an additional hostname or IP as a Subject Alternative Name in the server's TLS cert, and it can be specified as a list if you would like to access via both the IP and the hostname.
Here is an example of what the RKE2 config file (at /etc/rancher/rke2/config.yaml) would look like if you are following this guide.
# mkdir -p /etc/rancher/rke2
Note: The RKE2 config file needs to be created manually. You can do that by running touch /etc/rancher/rke2/config.yaml as a privileged user.
token: my-shared-secret
tls-san:
  - my-kubernetes-domain.com
  - another-kubernetes-domain.com
mkdir -p /etc/rancher/rke2
cat << 'EOF' > /etc/rancher/rke2/config.yaml
write-kubeconfig-mode: "0644"
tls-san:
  - "oowy.lan"
# (db) Set the base name of etcd snapshots. Default: etcd-snapshot-<unix-timestamp> (default: "etcd-snapshot")
etcd-snapshot-name: "etcd-snapshot"
# (db) Snapshot interval time in cron spec. eg. every 6 hours '* */6 * * *' (default: "0 */12 * * *")
etcd-snapshot-schedule-cron: "* */6 * * *"
# (db) Number of snapshots to retain Default: 5 (default: 5)
etcd-snapshot-retention: "5"
# (db) Directory to save db snapshots. (Default location: ${data-dir}/db/snapshots)
# etcd-snapshot-dir: "${data-dir}/db/snapshots"
# (networking) IPv4/IPv6 network CIDRs to use for pod IPs (default: 10.42.0.0/16)
cluster-cidr: "10.42.0.0/16"
# (networking) IPv4/IPv6 network CIDRs to use for service IPs (default: 10.43.0.0/16)
service-cidr: "10.43.0.0/16"
# (networking) Port range to reserve for services with NodePort visibility (default: "30000-32767")
service-node-port-range: "30000-32767"
# (networking) IPv4 Cluster IP for coredns service. Should be in your service-cidr range (default: 10.43.0.10)
cluster-dns: "10.43.0.10"
# (networking) Cluster Domain (default: "cluster.local")
cluster-domain: "cluster.local"
cni:
  - calico
disable:
  - rke2-canal
  - rke2-kube-proxy
EOF
Start the RKE2 server:
root@master001:~# systemctl enable rke2-server.service
Created symlink /etc/systemd/system/multi-user.target.wants/rke2-server.service → /usr/local/lib/systemd/system/rke2-server.service.
root@master001:~# systemctl start rke2-server.service
root@master001:~# journalctl -u rke2-server -f
-- Logs begin at Thu 2022-02-24 10:18:19 UTC. --
Feb 25 13:53:24 master001 rke2[2878]: time="2022-02-25T13:53:24Z" level=info msg="Event(v1.ObjectReference{Kind:\"Addon\", Namespace:\"kube-system\", Name:\"rke2-multus\", UID:\"\", APIVersion:\"k3s.cattle.io/v1\", ResourceVersion:\"\", FieldPath:\"\"}): type: 'Normal' reason: 'DeletingManifest' Deleting manifest at \"/var/lib/rancher/rke2/server/manifests/rke2-multus.yaml\""
Feb 25 13:53:24 master001 rke2[2878]: time="2022-02-25T13:53:24Z" level=info msg="Stopped tunnel to 127.0.0.1:9345"
Feb 25 13:53:24 master001 rke2[2878]: time="2022-02-25T13:53:24Z" level=info msg="Connecting to proxy" url="wss://10.75.168.101:9345/v1-rke2/connect"
Feb 25 13:53:24 master001 rke2[2878]: time="2022-02-25T13:53:24Z" level=info msg="Proxy done" err="context canceled" url="wss://127.0.0.1:9345/v1-rke2/connect"
Feb 25 13:53:24 master001 rke2[2878]: time="2022-02-25T13:53:24Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Feb 25 13:53:24 master001 rke2[2878]: time="2022-02-25T13:53:24Z" level=info msg="Updating TLS secret for rke2-serving (count: 10): map[listener.cattle.io/cn-10.43.0.1:10.43.0.1 listener.cattle.io/cn-10.75.168.101:10.75.168.101 listener.cattle.io/cn-127.0.0.1:127.0.0.1 listener.cattle.io/cn-kubernetes:kubernetes listener.cattle.io/cn-kubernetes.default:kubernetes.default listener.cattle.io/cn-kubernetes.default.svc:kubernetes.default.svc listener.cattle.io/cn-kubernetes.default.svc.cluster.local:kubernetes.default.svc.cluster.local listener.cattle.io/cn-localhost:localhost listener.cattle.io/cn-master001:master001 listener.cattle.io/cn-oowy.lan:oowy.lan listener.cattle.io/fingerprint:SHA1=70BC8C7B219CFEB57118179109D5CB7EAA0F0460]"
Feb 25 13:53:25 master001 rke2[2878]: time="2022-02-25T13:53:25Z" level=info msg="Active TLS secret rke2-serving (ver=371) (count 10): map[listener.cattle.io/cn-10.43.0.1:10.43.0.1 listener.cattle.io/cn-10.75.168.101:10.75.168.101 listener.cattle.io/cn-127.0.0.1:127.0.0.1 listener.cattle.io/cn-kubernetes:kubernetes listener.cattle.io/cn-kubernetes.default:kubernetes.default listener.cattle.io/cn-kubernetes.default.svc:kubernetes.default.svc listener.cattle.io/cn-kubernetes.default.svc.cluster.local:kubernetes.default.svc.cluster.local listener.cattle.io/cn-localhost:localhost listener.cattle.io/cn-master001:master001 listener.cattle.io/cn-oowy.lan:oowy.lan listener.cattle.io/fingerprint:SHA1=70BC8C7B219CFEB57118179109D5CB7EAA0F0460]"
Feb 25 13:53:25 master001 rke2[2878]: time="2022-02-25T13:53:25Z" level=info msg="Handling backend connection request [master001]"
Feb 25 13:53:25 master001 rke2[2878]: time="2022-02-25T13:53:25Z" level=info msg="Running kube-proxy --cluster-cidr=10.42.0.0/16 --conntrack-max-per-core=0 --conntrack-tcp-timeout-close-wait=0s --conntrack-tcp-timeout-established=0s --healthz-bind-address=127.0.0.1 --hostname-override=master001 --kubeconfig=/var/lib/rancher/rke2/agent/kubeproxy.kubeconfig --proxy-mode=iptables"
Feb 25 13:53:26 master001 rke2[2878]: time="2022-02-25T13:53:26Z" level=info msg="Labels and annotations have been set successfully on node: master001"
Check that the cluster responds:
/var/lib/rancher/rke2/bin/kubectl \
  --kubeconfig /etc/rancher/rke2/rke2.yaml get nodes
Output:
root@master001:~# /var/lib/rancher/rke2/bin/kubectl \
> --kubeconfig /etc/rancher/rke2/rke2.yaml get nodes
NAME        STATUS   ROLES                       AGE     VERSION
master001   Ready    control-plane,etcd,master   9m35s   v1.22.7+rke2r1
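To avoid typing the full paths every time, you can add the RKE2 binaries to your PATH and export the kubeconfig (a convenience step, not required):

export PATH=$PATH:/var/lib/rancher/rke2/bin
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
kubectl get nodes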
For the other server nodes of the cluster, we additionally add the first server's details to the config file:
server: https://my-kubernetes-domain.com:9345
token: my-shared-secret
The token is stored on master001 at /var/lib/rancher/rke2/server/token. Example:
server: https://master001.oowy.lan:9345
token: K10b93dee6011a468fe9ea43d59ae8064c12ccefbe34a6b446bfd07ff7902d6d88e::server:eb92b677c234026da5db07826b243ca4
write-kubeconfig-mode: "0644"
tls-san:
  - "oowy.lan"
# (db) Set the base name of etcd snapshots. Default: etcd-snapshot-<unix-timestamp> (default: "etcd-snapshot")
etcd-snapshot-name: "etcd-snapshot"
# (db) Snapshot interval time in cron spec. eg. every 6 hours '* */6 * * *' (default: "0 */12 * * *")
etcd-snapshot-schedule-cron: "* */6 * * *"
# (db) Number of snapshots to retain Default: 5 (default: 5)
etcd-snapshot-retention: "5"
# (db) Directory to save db snapshots. (Default location: ${data-dir}/db/snapshots)
# etcd-snapshot-dir: "${data-dir}/db/snapshots"
# (networking) IPv4/IPv6 network CIDRs to use for pod IPs (default: 10.42.0.0/16)
cluster-cidr: "10.42.0.0/16"
# (networking) IPv4/IPv6 network CIDRs to use for service IPs (default: 10.43.0.0/16)
service-cidr: "10.43.0.0/16"
# (networking) Port range to reserve for services with NodePort visibility (default: "30000-32767")
service-node-port-range: "30000-32767"
# (networking) IPv4 Cluster IP for coredns service. Should be in your service-cidr range (default: 10.43.0.10)
cluster-dns: "10.43.0.10"
# (networking) Cluster Domain (default: "cluster.local")
cluster-domain: "cluster.local"
cni:
  - calico
disable:
  - rke2-canal
  - rke2-kube-proxy
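On each additional server node, run the installer with INSTALL_RKE2_TYPE="server" and start the service just as on the first node; the node registers through the server and token values above:

curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="server" sh -
systemctl enable rke2-server.service
systemctl start rke2-server.service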
Adding agent nodes
curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="agent" sh -
root@worker001:~# curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="agent" sh -
[INFO] finding release for channel stable
[INFO] using v1.22.7+rke2r1 as release
[INFO] downloading checksums at https://github.com/rancher/rke2/releases/download/v1.22.7+rke2r1/sha256sum-amd64.txt
[INFO] downloading tarball at https://github.com/rancher/rke2/releases/download/v1.22.7+rke2r1/rke2.linux-amd64.tar.gz
[INFO] verifying tarball
[INFO] unpacking tarball file to /usr/local
Create a config file:
mkdir -p /etc/rancher/rke2/
nano /etc/rancher/rke2/config.yaml
Only the server and token entries are needed (a worker node label is added here as well):
server: https://master001.oowy.lan:9345
token: K10b93dee6011a468fe9ea43d59ae8064c12ccefbe34a6b446bfd07ff7902d6d88e::server:eb92b677c234026da5db07826b243ca4
node-label:
  - "node-role.kubernetes.io/worker=true"
Start the agent:
systemctl enable rke2-agent.service
systemctl start rke2-agent.service
Check the agent logs:
root@worker001:~# journalctl -u rke2-agent -f
-- Logs begin at Thu 2022-02-24 10:18:25 UTC. --
Feb 25 14:26:07 worker001 rke2[4626]: Flag --tls-cert-file has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Feb 25 14:26:07 worker001 rke2[4626]: Flag --tls-private-key-file has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Feb 25 14:26:14 worker001 rke2[4599]: time="2022-02-25T14:26:14Z" level=info msg="Failed to update node worker001: nodes \"worker001\" is forbidden: node \"worker001\" is not allowed to modify taints"
Feb 25 14:26:14 worker001 rke2[4599]: time="2022-02-25T14:26:14Z" level=info msg="Failed to update node worker001: nodes \"worker001\" is forbidden: node \"worker001\" is not allowed to modify taints"
Feb 25 14:26:14 worker001 rke2[4599]: time="2022-02-25T14:26:14Z" level=info msg="Failed to update node worker001: nodes \"worker001\" is forbidden: node \"worker001\" is not allowed to modify taints"
Feb 25 14:26:14 worker001 rke2[4599]: time="2022-02-25T14:26:14Z" level=info msg="Failed to update node worker001: nodes \"worker001\" is forbidden: node \"worker001\" is not allowed to modify taints"
Feb 25 14:26:14 worker001 rke2[4599]: time="2022-02-25T14:26:14Z" level=info msg="Failed to update node worker001: nodes \"worker001\" is forbidden: node \"worker001\" is not allowed to modify taints"
Feb 25 14:26:14 worker001 rke2[4599]: time="2022-02-25T14:26:14Z" level=info msg="Failed to update node worker001: Operation cannot be fulfilled on nodes \"worker001\": the object has been modified; please apply your changes to the latest version and try again"
Feb 25 14:26:14 worker001 rke2[4599]: time="2022-02-25T14:26:14Z" level=info msg="labels have been set successfully on node: worker001"
Feb 25 14:26:14 worker001 systemd[1]: Started Rancher Kubernetes Engine v2 (agent).
Verify that the node shows up in the cluster:
root@master001:~# /var/lib/rancher/rke2/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml get nodes
NAME        STATUS   ROLES                       AGE   VERSION
master001   Ready    control-plane,etcd,master   34m   v1.22.7+rke2r1
master002   Ready    control-plane,etcd,master   21m   v1.22.7+rke2r1
master003   Ready    control-plane,etcd,master   18m   v1.22.7+rke2r1
worker001   Ready    <none>                      70s   v1.22.7+rke2r1
Removing a node
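Optionally, drain the node first so its workloads are rescheduled elsewhere (a standard kubectl step, not in the original notes):

kubectl drain worker002 --ignore-daemonsets --delete-emptydir-data

Then run the uninstall script on the node being removed and delete the node object from the cluster: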
/usr/local/bin/rke2-uninstall.sh
kubectl delete nodes worker002
kubectl get secret -n kube-system
If the node's secret has not been deleted:
kubectl delete secret <node> -n kube-system