Types of Health Checks in Kubernetes

This article covers two types of health checks in Kubernetes: the liveness probe and the readiness probe. You will learn why these two health checks matter.

Why do we need a health check system?

Distributed systems are hard to manage because they have many moving parts that all need to work for the system to function. If a small part breaks, the system has to detect it and fix it automatically. That is why we need to check the health of each small part of the system continuously.

In this article you will learn how to set up readiness and liveness probes in your Kubernetes cluster.

What are health checks?

Health checks are one of the best practices in Kubernetes.

A health check is a simple way to let the system know whether an instance of an app is working.

If an instance of the app is not working, other services should not access it or send requests to it. Instead, requests should go to another instance that is ready, or be retried later. The system should also bring the failed instance back to a healthy state.

By default, Kubernetes starts sending traffic to a Pod once all the containers inside the Pod have started, and restarts containers when they crash. These defaults are often good enough, but you can make your deployment more robust by implementing custom health checks.

Fortunately, Kubernetes makes custom health checks relatively straightforward.

What are liveness checks?

Liveness probes tell Kubernetes whether your app is alive or dead. If your app is alive, Kubernetes leaves it alone. If your app is dead, Kubernetes restarts the container to replace the failed instance.

Consider an app that takes a minute to warm up and start. The Service will not work until the app is up and running, even though the process has already started. The same problem appears when you scale up by increasing the replica count: new Pods should not receive traffic until they are fully ready, but by default Kubernetes starts sending traffic as soon as the process inside the container starts. That start-up gap is what readiness probes address. Liveness probes cover a different failure: when a running app is no longer serving requests, Kubernetes detects it and restarts the container.

What are readiness checks?

Readiness probes let Kubernetes know when your app is ready to serve traffic. Kubernetes makes sure the readiness probe passes before allowing a Service to send traffic to the Pod. If the readiness probe fails, Kubernetes stops sending traffic to the Pod until the probe passes again. With readiness probes in place, a Service does not send traffic to new copies of a Pod until the app has fully started.
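As a rough illustration of the scale-up case, the sketch below shows a Deployment with three replicas whose containers only receive Service traffic after the readiness probe passes. The Deployment name, image, port, and the /ready path are placeholders chosen for this example, and it assumes the app exposes an endpoint that returns a 2xx response once warm-up is finished.

```yaml
# Hypothetical Deployment: name, image, port, and path are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: example.com/web:1.0   # placeholder image
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /ready             # assumed endpoint, succeeds once warm-up is done
            port: 8080
          initialDelaySeconds: 5     # wait 5 seconds before the first check
          periodSeconds: 5           # then check every 5 seconds
```

Until a replica's probe succeeds, that Pod is left out of the Service's endpoints, so requests keep going to the replicas that are already ready.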

Types of Probes

Kubernetes provides three types of probes, described below. You can use any of them for liveness and readiness checks.

HTTP Probe

HTTP is the most common type of custom probe. Even if your app is not an HTTP server, you can usually create a lightweight HTTP server inside it to respond to the liveness probe. Kubernetes sends a request to a path, and if it gets an HTTP response with a status code in the 200 or 300 range, it considers the container healthy. Otherwise, it marks the container as unhealthy.

Syntax of HTTP probe

spec:
  containers:
  - name: liveness
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080

Example of HTTP probe.

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/liveness
    args:
    - /server
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: Custom-Header
          value: Awesome
      initialDelaySeconds: 3
      periodSeconds: 3

Command probe

Kubernetes runs a command inside the container. If the command returns exit code zero (0), the container is marked as healthy. Otherwise, it is marked as unhealthy. This kind of probe is useful when you can’t or don’t want to run an HTTP server, but you can run a command that checks whether your app is healthy.

Syntax of command probe.

spec:
  containers:
  - name: liveness
    livenessProbe:
      exec:
        command:
        - mycommand

Example of Command probe.

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5

TCP probe

With TCP probes, Kubernetes tries to establish a TCP connection on the specified port. If it can establish a connection, the container is considered healthy. Otherwise, it is considered unhealthy. This is useful when HTTP or command probes do not work, for example with gRPC or FTP services.

Syntax of TCP probe

spec:
  containers:
  - name: liveness
    livenessProbe:
      tcpSocket:
        port: 8080

Example of TCP probe

apiVersion: v1
kind: Pod
metadata:
  name: goproxy
  labels:
    app: goproxy
spec:
  containers:
  - name: goproxy
    image: k8s.gcr.io/goproxy:0.1
    ports:
    - containerPort: 8080
    readinessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 20

Configuring Probes

Probes have several settings that control their behavior:

initialDelaySeconds: The number of seconds to wait after the container starts before the first liveness or readiness probe runs. Defaults to 0. This setting is very important for a livenessProbe, because a liveness failure restarts the container. Make sure the probe does not start before the app has finished starting. Otherwise, the container will be restarted in a loop and never become ready.

periodSeconds: How often to perform the probe. Defaults to 10 seconds. The minimum value is 1. Shorter periods detect failures sooner, at the cost of more frequent probe requests.

timeoutSeconds: The number of seconds after which the probe times out. Defaults to 1 second. The minimum value is 1.

successThreshold: The minimum number of consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup probes. The minimum value is 1.

failureThreshold: The number of consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. The minimum value is 1.
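To see how these settings fit together, here is a minimal sketch of a liveness probe that sets all five of them. It is a fragment that belongs under a container entry in a Pod spec. The values are illustrative rather than recommendations, and it assumes an app that needs about a minute to warm up and serves /healthz on port 8080.

```yaml
# Fragment of a container spec; all values are illustrative assumptions.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 60   # give a slow-starting app about a minute before the first probe
  periodSeconds: 10         # probe every 10 seconds
  timeoutSeconds: 2         # an attempt fails if there is no response within 2 seconds
  successThreshold: 1       # must be 1 for liveness probes
  failureThreshold: 3       # restart the container after 3 consecutive failures
```

With these values, a container that stops responding is restarted after roughly periodSeconds × failureThreshold, or about 30 seconds, of failed probes.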

Conclusion

Health checks are essential for any distributed system, and Kubernetes is no exception. Using health checks gives your Kubernetes Services a solid foundation, better reliability, and higher uptime, and Kubernetes makes them easy to set up.