Install a Kubernetes Cluster with kubeadm on Ubuntu 22.04 (Kubernetes 1.31, Calico 3.29)
Requirements:
- 3 machines running Ubuntu 22.04 – Jammy
- master-1
- worker-1
- worker-2
- 4 GiB or more of RAM per machine; any less leaves little room for your apps.
- At least 2 CPUs on the machine that you use as a control-plane node.
- Full network connectivity among all machines in the cluster. You can use either a public or a private network.
Initialize the nodes
reference : https://kubernetes.io/docs/setup/production-environment/container-runtimes/#containerd
Make sure the date and time are correct and synced.
We use systemd-timesyncd; check its status:
systemctl status systemd-timesyncd
Set your NTP server in the timesyncd config:
root@node1:/home/chairul# cat /etc/systemd/timesyncd.conf
[Time]
NTP=[NTP Server]
Restart the service
systemctl restart systemd-timesyncd
Check your time and timezone, and set the timezone if necessary:
$timedatectl
$timedatectl set-timezone Asia/Jakarta
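To confirm the clock is actually syncing against your NTP server, timedatectl can show the timesync status:
$timedatectl timesync-status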
Remove Swap
#swapoff -a
#sudo sed -i '/\sswap\s/d' /etc/fstab
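As a quick check, swap should now be gone; swapon prints nothing and free shows 0B of swap:
$swapon --show
$free -h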
Sometimes etcd has problems running on Ubuntu 22.04; if so, edit the GRUB config and change the following line:
#cat /etc/default/grub
…
GRUB_CMDLINE_LINUX_DEFAULT="systemd.unified_cgroup_hierarchy=0"
…
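After editing /etc/default/grub, regenerate the GRUB configuration and reboot so the new kernel command line takes effect:
#update-grub
#reboot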
Disable AppArmor
sudo systemctl stop apparmor && sudo systemctl disable apparmor
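Verify that AppArmor is now stopped and disabled; the two commands below should report inactive and disabled:
$systemctl is-active apparmor
$systemctl is-enabled apparmor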
Flush any existing iptables rules:
iptables -F
On all nodes, run as root:
#apt update
Initialize the node parameters
#cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
$sudo modprobe overlay
$sudo modprobe br_netfilter
### sysctl params required by setup, params persist across reboots
#cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
net.netfilter.nf_conntrack_max = 1000000
EOF
### Apply sysctl params without reboot
$sudo sysctl --system
### Verify
#lsmod | grep br_netfilter
#lsmod | grep overlay
#sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
#Sample Output
root@master-1:/home/ubuntu# lsmod | grep br_netfilter
br_netfilter 32768 0
bridge 311296 1 br_netfilter
root@master-1:/home/ubuntu# lsmod | grep overlay
overlay 151552 25
root@master-1:/home/ubuntu# sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
root@master-1:/home/ubuntu#
Install Containerd
Reference: https://docs.docker.com/engine/install/ubuntu/
For all nodes:
Update the package index
$sudo apt update
Set up the repository
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install containerd.io
Use the correct cgroup driver
# Verify which init system is running; this determines the cgroup driver
$sudo ps -p 1
root@master-1:/home/ubuntu# sudo ps -p 1
PID TTY TIME CMD
1 ? 00:00:01 systemd
If it is systemd, configure containerd's cgroup driver accordingly.
Generate the default containerd config, then edit /etc/containerd/config.toml and set SystemdCgroup = true under the runc options:
sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml
sudo nano /etc/containerd/config.toml
...
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
...
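Before restarting containerd, a quick grep confirms the setting was saved (it should print SystemdCgroup = true):
$sudo grep -n 'SystemdCgroup' /etc/containerd/config.toml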
Set resource limits and kernel modules for containerd
#mkdir -p /etc/systemd/system/containerd.service.d
#cat <<EOF | tee /etc/systemd/system/containerd.service.d/override.conf
[Service]
LimitMEMLOCK=4194304
LimitNOFILE=1048576
EOF
#cat > /etc/modules-load.d/containerd.conf <<EOF
overlay
br_netfilter
EOF
#systemctl daemon-reload
#systemctl restart containerd
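Containerd should now be active; ctr, which ships with the containerd.io package, can confirm the client and server versions:
$systemctl is-active containerd
$sudo ctr version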
Installing kubeadm, kubelet and kubectl
Reference : https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
Do this on all nodes
Update the apt package index and install packages needed to use the Kubernetes apt repository:
$sudo apt-get update
### apt-transport-https may be a dummy package; if so, you can skip that package
$sudo apt-get install -y apt-transport-https ca-certificates curl
Download the public signing key for the Kubernetes package repositories. The same signing key is used for all repositories so you can disregard the version in the URL:
# If the directory `/etc/apt/keyrings` does not exist, it should be created before the curl command, read the note below.
# sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.31/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
Add the appropriate Kubernetes apt repository:
# This overwrites any existing configuration in /etc/apt/sources.list.d/kubernetes.list
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.31/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
Update the apt package index, install kubelet, kubeadm and kubectl, and pin their versions:
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
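Confirm the expected 1.31.x versions are installed and held:
$kubeadm version
$kubelet --version
$kubectl version --client
$apt-mark showhold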
Deploy the cluster
Do this on the Master node
Get the IP address of the main interface/adapter of the control plane node:
ip a
root@master-1:/home/ubuntu# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether fa:16:3e:b7:9b:bd brd ff:ff:ff:ff:ff:ff
altname enp0s3
inet 10.5.24.32/24 metric 100 brd 10.5.24.255 scope global dynamic ens3
valid_lft 41773sec preferred_lft 41773sec
inet6 fe80::f816:3eff:feb7:9bbd/64 scope link
valid_lft forever preferred_lft forever
Set the hostnames to resolve locally to the node IPs in /etc/hosts:
root@master-1:/home/ubuntu# cat /etc/hosts
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
10.5.24.32 master-1
10.5.24.165 worker-1
10.5.24.181 worker-2
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
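Verify the names resolve to the intended IPs and the nodes can reach each other, for example:
$getent hosts master-1 worker-1 worker-2
$ping -c 2 worker-1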
Initialize the Kubernetes control plane with that IP:
$sudo kubeadm init \
  --apiserver-cert-extra-sans=master-1 \
  --control-plane-endpoint 10.5.24.32 \
  --pod-network-cidr=192.168.0.0/16
### If you encounter a [preflight] "Some fatal errors occurred" message like:
failed to create new CRI runtime service: validate service connection: validate CRI v1 runtime API for endpoint "unix:///var/run/containerd/containerd.sock", Error
#### Refer to this https://forum.linuxfoundation.org/discussion/862825/kubeadm-init-error-cri-v1-runtime-api-is-not-implemented
### At the end of the output you should see something like this:
==========================================
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:
kubeadm join 10.5.24.32:6443 --token 4639jr.a7smqg4qwzp3tv2u \
--discovery-token-ca-cert-hash sha256:70b896b391bc1956dfbf6b658fa75d6d1da5c07eeca75645a13664dc7a341b22 \
--control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.5.24.32:6443 --token 4639jr.a7smqg4qwzp3tv2u \
--discovery-token-ca-cert-hash sha256:70b896b391bc1956dfbf6b658fa75d6d1da5c07eeca75645a13664dc7a341b22
==========================================
Follow the instructions accordingly.
If you missed copying the join command, run the following on the master node to recreate the token and print the join command:
kubeadm token create --print-join-command
Then follow the steps from the output:
$mkdir -p $HOME/.kube
$sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$sudo chown $(id -u):$(id -g) $HOME/.kube/config
Add the workers to the cluster (do this on worker-1 and worker-2):
#kubeadm join 10.5.24.32:6443 --token qqo5iu.0rnk4f2rwoetyyor \
--discovery-token-ca-cert-hash sha256:b32b88879a91026e5ec224ab1630adb8cdbf42ba60de6e1ec7a8ee24493193e2
Verify the cluster is running:
root@master-1:/home/ubuntu# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master-1 NotReady control-plane 2m45s v1.31.2
root@master-1:/home/ubuntu# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-7c65d6cfc9-jptjm 0/1 Pending 0 30s
kube-system coredns-7c65d6cfc9-z74gm 0/1 Pending 0 32s
kube-system etcd-master-1 1/1 Running 356 2m45s
kube-system kube-apiserver-master-1 1/1 Running 348 2m45s
kube-system kube-controller-manager-master-1 1/1 Running 70 (113s ago) 2m45s
kube-system kube-proxy-mvsbd 1/1 Running 0 34s
kube-system kube-scheduler-master-1 1/1 Running 368 2m42s
The nodes will stay in the "NotReady" state until the network add-on is ready.
Set up the pod network and metrics add-on
Open the add-ons link from the kubeadm init output:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
In this part we will use Calico
reference : https://docs.tigera.io/calico/latest/getting-started/kubernetes/self-managed-onprem/onpremises
On the master node, install the operator:
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.29.0/manifests/tigera-operator.yaml
### Also download the custom resources manifest to edit and use later
curl https://raw.githubusercontent.com/projectcalico/calico/v3.29.0/manifests/custom-resources.yaml -O
### Customize the resources
nano custom-resources.yaml
# This section includes base Calico installation configuration.
# For more information, see: https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  calicoNetwork:
    ipPools:
    - name: default-ipv4-ippool
      blockSize: 26
      cidr: 192.168.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
---
### If the nodes run inside OpenStack, use IPIP encapsulation instead (use this Installation in place of the one above):
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  calicoNetwork:
    ipPools:
    - name: default-ipv4-ippool
      blockSize: 26
      cidr: 192.168.0.0/16
      encapsulation: IPIP
      natOutgoing: Enabled
      nodeSelector: all()
---
# This section configures the Calico API server.
# For more information, see: https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}
### Create Resource
kubectl create -f custom-resources.yaml
### Wait until all pods are running; this can take a few minutes
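You can watch the Calico pods come up; once the operator has created its CRDs, kubectl get tigerastatus also gives an aggregate view:
$watch kubectl get pods -n calico-system
$kubectl get tigerastatus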
### Set up the metrics add-on
kubectl apply -f https://raw.githubusercontent.com/techiescamp/kubeadm-scripts/main/manifests/metrics-server.yaml
### Check node metric
kubectl top nodes
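If kubectl top nodes reports that metrics are not yet available, wait for the metrics-server rollout to finish and retry (the manifest above installs it in kube-system; adjust the namespace if yours differs):
$kubectl -n kube-system rollout status deployment/metrics-server
$kubectl top nodes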
Verify and troubleshoot calico
### Install calicoctl
## Refer to: https://docs.tigera.io/calico/latest/operations/calicoctl/install
## This is for operating the Calico installation
curl -L https://github.com/projectcalico/calico/releases/download/v3.29.1/calicoctl-linux-amd64 -o calicoctl
chmod +x ./calicoctl
## Depending on your 'echo $PATH', copy calicoctl into a directory on the PATH
mv calicoctl /usr/local/bin/
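Check that calicoctl runs; if it can reach the cluster datastore it also prints the cluster version:
$calicoctl version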
### Verifying Calico config
calicoctl get ipPool --export -o yaml > pool.yaml
cat pool.yaml
### Change the encapsulation to IPIP in case you missed the custom config
## Also allow IP protocols 4 and 47 in your security group
nano set-ipinip.yaml
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 192.168.0.0/16
  ipipMode: Always
  natOutgoing: true
Save and exit, then replace the pool:
kubectl replace -f set-ipinip.yaml
### Verify node status and Networking
calicoctl node status
### The BGP port (TCP 179) must be reachable between nodes; on OpenStack, allow it in the security group used by the nodes.
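A quick way to test BGP reachability from the master is to probe TCP 179 on a worker; the IP below is worker-1 from this example, and nc (netcat-openbsd) must be installed:
$nc -vz 10.5.24.165 179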
This is the output of kubectl get pods -A at this stage. Make sure all pods are Running:
root@master-1:/home/ubuntu# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-apiserver calico-apiserver-654d7c769b-mggb9 1/1 Running 0 5m11s
calico-apiserver calico-apiserver-654d7c769b-z7vvj 1/1 Running 2 (2m51s ago) 5m12s
calico-system calico-kube-controllers-5bc67c77b-xxwtq 1/1 Running 0 4m51s
calico-system calico-node-44nb9 1/1 Running 0 5m1s
calico-system calico-typha-645c4bcfb6-z5jkh 1/1 Running 0 5m8s
calico-system csi-node-driver-kx4gq 2/2 Running 0 4m58s
kube-system coredns-7c65d6cfc9-jptjm 1/1 Running 0 8m23s
kube-system coredns-7c65d6cfc9-z74gm 1/1 Running 0 8m25s
kube-system etcd-master-1 1/1 Running 356 10m
kube-system kube-apiserver-master-1 1/1 Running 348 10m
kube-system kube-controller-manager-master-1 1/1 Running 70 (9m46s ago) 10m
kube-system kube-proxy-mvsbd 1/1 Running 0 8m27s
kube-system kube-scheduler-master-1 1/1 Running 368 10m
tigera-operator tigera-operator-f8bc97d4c-lks6t 1/1 Running 0 6m8s
Add the worker nodes to the cluster
Using the join command from the kubeadm init output, paste it on each worker node:
kubeadm join 10.5.24.32:6443 --token 2ucm4t.wdqf2wcqpx0cdhia \
--discovery-token-ca-cert-hash sha256:618005b50326f476a0b766d4987ab95b54e3fc1a2dc295d77c5a1a0315c88357
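Optionally, once the workers have joined, label them from the master so kubectl get nodes shows a worker role; this label is conventional and purely cosmetic, adjust the node names to your cluster:
$kubectl label node worker-1 node-role.kubernetes.io/worker=
$kubectl label node worker-2 node-role.kubernetes.io/worker=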
Verify the cluster
###Check Component
#kubectl get componentstatuses
###Check Node
#kubectl get node
root@node1:/home/chairul# kubectl get nodes
NAME STATUS ROLES AGE VERSION
node1 Ready control-plane 43m v1.28.2
node2 Ready <none> 116s v1.28.2
node3 Ready <none> 98s v1.28.2
###Check Pods
#kubectl get pods -A
root@node1:/home/chairul# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-5dd5756b68-52vpg 1/1 Running 0 42m
kube-system coredns-5dd5756b68-p6cgk 1/1 Running 0 42m
kube-system etcd-node1 1/1 Running 9 (18m ago) 42m
kube-system kube-apiserver-node1 1/1 Running 7 (16m ago) 43m
kube-system kube-controller-manager-node1 1/1 Running 14 (19m ago) 42m
kube-system kube-proxy-9l2gv 1/1 Running 11 (16m ago) 42m
kube-system kube-proxy-ksvbt 1/1 Running 2 (18s ago) 113s
kube-system kube-proxy-sxkmr 1/1 Running 1 (74s ago) 95s
kube-system kube-scheduler-node1 1/1 Running 12 (19m ago) 42m
kube-system weave-net-dcfsf 2/2 Running 2 (36s ago) 95s
kube-system weave-net-vfzkf 2/2 Running 2 (55s ago) 113s
kube-system weave-net-w7k6x 2/2 Running 1 (11m ago) 11m
Troubleshooting and Verification Commands
###Check Component Status
#kubectl get componentstatuses
### System journal, if kubectl is failing
#sudo journalctl --since "10min ago" > all_logs.txt
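The kubelet and containerd unit logs are usually the most useful; journalctl can filter on those units directly:
#journalctl -u kubelet -u containerd --since "10min ago" --no-pager | tail -n 50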
### Use crictl to inspect kube-system containers through containerd, for example the kube-apiserver container
# crictl pods | grep kube-apiserver
# crictl ps --pod a40aa4b396b9b
# crictl logs 0072c84f747ce |& tail -2
### Refer to this link to debug DNS Issue
https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/
Double check the "Initialize the node parameters" section above.
### Use this image to troubleshoot from a pod's perspective
kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
kubectl exec -i -t dnsutils -- nslookup kubernetes.default
kubectl exec -i -t dnsutils -- cat /etc/resolv.conf
kubectl exec -i -t dnsutils -- ping 1.1.1.1
kubectl exec -i -t dnsutils -- nc -v 10.96.0.10 53 -u
kubectl exec -i -t dnsutils -- sh ### For interactive shell access
### Use nc -v to test connectivity
nc -v 10.96.0.10 53 -u
Connection to 10.96.0.10 53 port [tcp/domain] succeeded!