Kubernetes
Revision as of 03:18, 27 April 2022
Kubernetes, also known as K8s, is a container orchestration system originally developed by Google and now maintained by the Cloud Native Computing Foundation.
It supposedly has a steeper learning curve than Docker Swarm and is heavily inspired by Google's internal Borg system.
This document contains notes on both administering a self-hosted Kubernetes cluster and deploying applications to one.
Getting Started
Background
Kubernetes runs applications across nodes which are physical or virtual machines.
Each node contains a kubelet process, a container runtime (typically containerd), and any running pods.
Pods contain the resources needed to host your application, including volumes and containers.
Typically you will want one container per pod, since deployments scale by creating multiple pods.
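To make the background concrete, here is a sketch of a minimal single-container pod manifest; the pod name and image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod        # hypothetical name
spec:
  containers:
    - name: app
      image: nginx:alpine  # any container image
      ports:
        - containerPort: 80
```

In practice you will rarely create bare pods; a Deployment wraps this pod template and manages replicas for you.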
Installation
For local development, you can install minikube.
Otherwise, install kubeadm.
kubeadm
Deploy a Kubernetes cluster using kubeadm
Pods per node
How to increase pods per node
By default, Kubernetes allows 110 pods per node.
You may increase this up to roughly 255 pods with the default /24 per-node pod subnet.
For reference, GCP GKE uses 110 pods per node and AWS EKS uses 250 pods per node.
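As a sketch, the per-node limit is controlled by the kubelet's maxPods setting, which can be set in a KubeletConfiguration; 250 here is an arbitrary example value:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 250
```

If you go above the pod-subnet capacity, you will also need a larger per-node pod CIDR (for kubeadm clusters, the controller-manager's --node-cidr-mask-size).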
kubectl
In general you will want to create a .yaml manifest and use apply, create, or delete to manage your resources.
nodes
kubectl get nodes
# Drain evicts all pods from a node.
kubectl drain $NODE_NAME
# Uncordon to re-enable scheduling.
kubectl uncordon $NODE_NAME
pods
# List all pods
kubectl get pods
kubectl describe pods
# List pods and node name
kubectl get pods -o=custom-columns='NAME:metadata.name,Node:spec.nodeName'
# Access a port on a pod
kubectl port-forward <pod> <localport>:<podport>
deployment
kubectl get deployments
kubectl logs $POD_NAME
kubectl exec -it $POD_NAME -- bash
# For one-off deployments of an image.
kubectl create deployment <name> --image=<image> [--replicas=1]
proxy
kubectl proxy
service
Services handle routing to your pods.
kubectl get services
kubectl expose deployment/<name> --type=<type> --port <port>
kubectl describe services/<name>
run
https://gc-taylor.com/blog/2016/10/31/fire-up-an-interactive-bash-pod-within-a-kubernetes-cluster
# Spin up an interactive Ubuntu container
kubectl run my-shell --rm -i --tty --image ubuntu -- bash
kubectl run busybox-shell --rm -i --tty --image odise/busybox-curl -- sh
Services
Services handle networking.
For self-hosted/bare-metal deployments, there are four types of services.
- ClusterIP - This creates an IP address on the internal cluster which nodes and pods on the cluster can access. (Default)
- NodePort - This exposes the port on every node. It implicitly creates a ClusterIP and every node will route to that. This allows access from outside the cluster.
- ExternalName - uses a CNAME record. Primarily for accessing other services from within the cluster.
- LoadBalancer - Creates a ClusterIP and NodePort, then asks the load balancer to allocate an external IP and route it to the NodePort.
- On bare-metal deployments you will need to install a load balancer such as MetalLB.
By default, ClusterIP routing is provided by kube-proxy, which load-balances across pods (random selection in the default iptables mode, round-robin in IPVS mode).
For exposing non-HTTP(S) production services, you will typically use a LoadBalancer service.
For HTTP(S) services, you will typically use an Ingress.
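As an example, a LoadBalancer service exposing a TCP port might look like the following sketch; the service name, selector label, and port are placeholders:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-db       # hypothetical service name
spec:
  type: LoadBalancer
  selector:
    app: example-db      # must match your pod labels
  ports:
    - port: 5432         # port exposed on the external IP
      targetPort: 5432   # port the pod listens on
```

On bare metal, MetalLB (or a similar load balancer) assigns the external IP for this service.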
Ingress
Ingress | Kubernetes
An Ingress is roughly equivalent to running a load-balancer / reverse-proxy pod behind a NodePort service.
Installing an Ingress Controller
See ingress-nginx to deploy an ingress controller.
Personally, I have:
To set options per-Ingress, add the annotation to your Ingress definition.
If your backend uses HTTPS, you will need to add the annotation: nginx.ingress.kubernetes.io/backend-protocol: HTTPS
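Putting this together, an Ingress for ingress-nginx with the backend-protocol annotation might look like this sketch; the hostname, service name, and ports are placeholders:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    # Only needed when the backend service itself serves HTTPS.
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
spec:
  ingressClassName: nginx
  rules:
    - host: example.mydomain.com   # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-service   # hypothetical backend service
                port:
                  number: 443
```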
Autoscaling
Horizontal Autoscale Walkthrough
Horizontal Pod Autoscaler
You will need to install metrics-server.
For testing, you may need to allow insecure TLS (e.g. metrics-server's --kubelet-insecure-tls flag).
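Once metrics-server is running, a HorizontalPodAutoscaler can scale a deployment on CPU utilization. A sketch, where the deployment name and thresholds are placeholders:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment   # hypothetical deployment to scale
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # scale up above 80% average CPU
```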
Accessing External Services
access mysql on localhost
To access services running outside of your Kubernetes cluster, including services running directly on a node, you need to add an Endpoints object and a Service.
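The pattern is a Service without a selector plus a manually managed Endpoints object with the same name. A sketch for an external MySQL server; the IP address is a placeholder:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: external-mysql
spec:
  ports:
    - port: 3306
---
apiVersion: v1
kind: Endpoints
metadata:
  name: external-mysql     # must match the Service name exactly
subsets:
  - addresses:
      - ip: 192.168.1.50   # placeholder: IP of the external host
    ports:
      - port: 3306
```

Pods can then reach the external server at external-mysql:3306 via cluster DNS.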
NetworkPolicy
Network policies are used to limit ingress or egress to pods.
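As an illustration, this sketch of a NetworkPolicy allows ingress to backend pods only from frontend pods on one port; the labels and port are placeholders:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-frontend
spec:
  podSelector:
    matchLabels:
      app: backend       # pods this policy applies to
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend   # only these pods may connect
      ports:
        - protocol: TCP
          port: 8080
```

Note that enforcement requires a CNI plugin that supports network policies (e.g. Calico or Cilium).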
Devices
Generic devices
See https://gitlab.com/arm-research/smarter/smarter-device-manager
and https://github.com/kubernetes/kubernetes/issues/7890#issuecomment-766088805
Intel GPU
See https://github.com/intel/intel-device-plugins-for-kubernetes/tree/main/cmd/gpu_plugin
After adding the gpu plugin, add the following to your deployment.
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - resources:
            limits:
              gpu.intel.com/i915: 1
Restarting your cluster
Scale to 0
reference
If you wish to restart your cluster, you can scale your deployments and StatefulSets down to 0 and then scale them back up afterwards.
# Annotate existing deployments and statefulsets with replica count.
kubectl get deploy -o jsonpath='{range .items[*]}{"kubectl annotate --overwrite deploy "}{@.metadata.name}{" previous-size="}{@.spec.replicas}{" \n"}{end}' | sh
kubectl get sts -o jsonpath='{range .items[*]}{"kubectl annotate --overwrite sts "}{@.metadata.name}{" previous-size="}{@.spec.replicas}{" \n"}{end}' | sh
# Scale to 0.
# shellcheck disable=SC2046
kubectl scale --replicas=0 $(kubectl get deploy -o name)
# shellcheck disable=SC2046
kubectl scale --replicas=0 $(kubectl get sts -o name)
# Scale back up.
kubectl get deploy -o jsonpath='{range .items[*]}{"kubectl scale deploy "}{@.metadata.name}{" --replicas="}{.metadata.annotations.previous-size}{"\n"}{end}' | sh
kubectl get sts -o jsonpath='{range .items[*]}{"kubectl scale sts "}{@.metadata.name}{" --replicas="}{.metadata.annotations.previous-size}{"\n"}{end}' | sh
Helm
Helm is a method for deploying applications using premade Kubernetes manifest templates known as Helm charts.
Rather than writing your own manifest or copying one from elsewhere, you can use Helm charts, which generate and install Kubernetes manifests.
Charts can also be composed into other charts for applications which require multiple microservices.
Usage
To install an application, generally you do the following:
- Create a yaml file, e.g. values.yaml, with the options you want.
- If necessary, create any PVs, PVCs, and Ingresses which might be required.
- Install the application using helm.
helm upgrade --install $NAME $CHARTNAME -f values.yaml [--version $VERSION]
Variants
minikube
minikube is a tool to quickly set up a local Kubernetes cluster on your PC.
kind
kind (Kubernetes in Docker) runs a local cluster using Docker containers as nodes.
k3s
k3s is a lighter-weight Kubernetes distribution by Rancher Labs.
KubeVirt
KubeVirt allows you to run virtual machines with vGPU support on your Kubernetes cluster.