Intro

At Kalc we have a lot of experience, gained both from customers and from our time at Fortune 500 companies, and we are concerned about the mistakes you might make with Kubernetes. In some ways this is motivated by the fact that, despite all these new schedulers and the immutable-style infrastructure we are all striving towards, and despite everyone wanting to be cloud native, we still find ourselves making the same mistakes.

From security to availability to assumptions based on the past, we will propose solutions to these issues rooted in lessons learned from small-scale and large-scale deployments. The goal is to share these experiences and spare all of us as much pain as possible.

The Setup

At its core, Kubernetes schedules your workloads to the best place: you give it a container and a bunch of computers, and it decides the best machine to run the container on. If that machine doesn’t work out, it reschedules the workload onto another one. Kubernetes is not just about running containers, though; you can also schedule VMs with it.

The Challenge

So, if we have this shiny, intent-based platform, why isn't it working in production? What are the top six gotchas that application development and DevOps teams run into as they move into production? It is important to consider these questions during design, so that you can schedule anything with Kubernetes. But how do you get the high availability (HA) that you truly desire?

The Event

  1. A Reliable Cluster with a SPOF

Basically, this is all about putting all your eggs in one basket, or in some ways creating a Single Point of Failure (SPOF). Etcd is a good example of this: if etcd is not replicated across enough nodes, losing a single instance means losing the state of the whole cluster.

  2. What is your Backup Strategy?

The reason you need a backup strategy is that at any point you may need a way to recover both data and configurations. This is something you have to plan for from the beginning.

  3. Allowing * in your Ingress

If you put a * as the host in your Ingress, Kubernetes is going to forward all the traffic from the whole cluster to your container, in such a way that one container receives all the traffic from the cluster.


apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: test-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: "*"   # matches every host: all ingress traffic hits this one backend
    http:
      paths:
      - path: /testpath
        backend:
          serviceName: test
          servicePort: 80

  4. Large Images

If you deploy a hundred images and each of them is 5GB, then a kubectl get pods is going to take roughly 5 seconds, whereas with 50MB images the response comes back in milliseconds. Now, if you have thousands of developers, each deploying a 5GB image, the whole cluster slows down.

  5. Externally Hosted Images

Deploying to production from images hosted on Docker Hub is a terrible idea, because Docker Hub is a free service unless you pay for it. Imagine you are deploying to prod on the free tier and then suddenly:

  • Docker Hub rate limits you
  • Image gets deleted
  • Image gets compromised

  6. Privileged Containers

Privileged containers will not take your cluster down by themselves, but they will compromise your security. A container running in privileged mode can read other containers’ namespaces, which means one container can inspect other containers’ processes and data. This is not only a security problem: a mistake inside a privileged container can also trigger a process that takes down the whole cluster.

The Fix

  1. Single Point of Failure - Use a multi-master cluster

It’s incredibly important that you deploy your resources in such a way that if an availability zone goes down, you can still run your app in another availability zone. HA is all about setting up Kubernetes and its supporting components so that there is no single point of failure. A single-master cluster fails easily; a multi-master cluster, on the other hand, uses multiple master nodes, each of which has access to all the worker nodes.

In a single-master cluster, the mission-critical core components such as the Kubernetes API server and the controller manager run only on the single master node, and if it fails you cannot create more services, pods, and so on. In a Kubernetes HA environment, these key components are replicated across multiple masters, so should one master fail, the others keep the cluster up and running.
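As a minimal sketch, with kubeadm you can point every control-plane node at a shared load-balancer endpoint so that no single master is a SPOF. The endpoint name and version below are placeholders for your own environment:

```yaml
# Hypothetical kubeadm ClusterConfiguration for an HA control plane.
# "lb.example.com:6443" stands in for a load balancer that fronts
# all master nodes; pick the Kubernetes version you actually run.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.27.0
controlPlaneEndpoint: "lb.example.com:6443"
etcd:
  local:
    dataDir: /var/lib/etcd
```

Additional masters then join through the same endpoint with kubeadm join's --control-plane flag, so clients and kubelets never depend on any one master's address.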

  2. Backup Strategy

When you are running Kubernetes, it is good practice to develop a backup strategy early. If your cluster crashes, you will need a backup to return to the previous stable state of the Kubernetes cluster.

A backup will help you to:

2.1. Recover from disasters: like someone accidentally deleting the namespace where your deployments reside.

2.2. Replicate the environment: for example, replicating your production environment to a staging environment before a major update.

2.3. Migrate your Kubernetes cluster from one environment to another.
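A common building block for such a strategy is a periodic etcd snapshot, since etcd holds the cluster's entire state. A sketch, assuming the kubeadm default certificate paths (substitute your own endpoints and paths):

```shell
# Take a snapshot of etcd state; paths are kubeadm defaults and
# may differ in your cluster.
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Verify the snapshot before relying on it.
ETCDCTL_API=3 etcdctl snapshot status /backup/etcd-snapshot.db
```

Run this on a schedule and ship the snapshot off-cluster, so a disaster that takes out the masters does not also take out the backup.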

  3. Avoid using * in your Ingress controller

Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster. An Ingress resource whose host is * routes all traffic to a single container, and this can easily take down a cluster. You should avoid using * in your Ingress rules.
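Instead, pin each rule to an explicit host. A sketch in the current networking.k8s.io/v1 schema, where the host and service names are placeholders:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: test-ingress
spec:
  rules:
  - host: app.example.com   # explicit host instead of "*"
    http:
      paths:
      - path: /testpath
        pathType: Prefix
        backend:
          service:
            name: test    # only this service's traffic reaches this backend
            port:
              number: 80
```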

  4. Large Images - Convert large images to smaller sizes

For example, converting a 5GB Docker image into a 46MB image. Large image size means a large attack surface; large image size also means slower deployments.
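Multi-stage builds are one common way to get there. A minimal sketch for a Go service (the build paths and entrypoint are illustrative):

```dockerfile
# Build stage: full toolchain, large image.
FROM golang:1.21 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app .

# Final stage: only the static binary, a few MB instead of gigabytes.
FROM scratch
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```

Only the final stage is shipped, so the compiler, sources, and intermediate layers never reach the cluster.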
  5. Externally Hosted Images

Use a Docker container registry that will make it easier for developers to store, manage, and deploy images. This will eliminate the need to operate your own container repositories or worry about scaling the underlying infrastructure, allowing you to reliably deploy containers for your applications. A popular alternative would be to use Amazon Elastic Container Registry (ECR), which is fully integrated with Amazon Elastic Container Service (ECS).
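With a private registry, pods reference the registry image directly and authenticate through an image pull secret. A sketch where the account ID, region, tag, and secret name are all placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    # Placeholder ECR image reference; pin an immutable tag or digest in prod.
    image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:1.0.0
  imagePullSecrets:
  - name: ecr-registry-credentials   # hypothetical secret holding registry credentials
```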

  6. Privileged Containers

Running in privileged mode gives the container all the capabilities of the host machine and it also lifts all the limitations enforced by the device cgroup controller. An attacker can use this as a starting point to exploit your whole system. To have a more secure container:

6.1. Run as a non-root user, using the Dockerfile USER instruction.

6.2. Drop as many Linux capabilities as you can (ideally all of them) when you run your container.
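In a pod spec, both points can be expressed through securityContext. A sketch, where the image name and UID are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app
spec:
  containers:
  - name: app
    image: my-app:1.0.0   # placeholder image
    securityContext:
      privileged: false
      runAsNonRoot: true
      runAsUser: 10001               # any non-zero UID
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]                # drop every Linux capability
```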

Conclusion

The greatest challenge we are facing right now is a mismatch between expectations and reality when running Kubernetes in production. Kalc is addressing this problem with a revolutionary new product called kubectl-val that scans your cluster configuration and evaluates it against a large database of known vulnerabilities.