Kubernetes

Where Did All My Pods Go?

Toni Kurya
November 15, 2019

Enterprises using Kubernetes often need to autoscale their resources based on more than just CPU usage, for example on concurrent persistent connections or queue length. This post walks you through an incident in which one of our customers enabled autoscaling for their application and, one day, all their Pods disappeared.

#cpu issues
Kubernetes

CoreDNS Autopath Failure For External Name Services

Toni Kurya
November 4, 2019

Datadog is a monitoring service for cloud-based workflows offering Kubernetes insights through metrics, traces, logs, dashboards, etc. They are positioned as a Cloud Native service provider. Through their SaaS-based analytics platform,

#Kubernetes cluster
Kubernetes

Zalando's Total DNS outage in Kubernetes cluster

Nisar Ahmad
October 30, 2019

Zalando is an e-commerce store that provides lifestyle and fashion products to customers in seventeen European markets. Zalando is considered the starting point for fashion in Europe, and it currently offers more than 300,000 products from 2,000 different fashion and lifestyle brands.

#Kubernetes cluster
Kubernetes

How To Break a Cassandra Cluster

Toni Kurya
October 28, 2019

Apache Cassandra is a highly scalable free and open source NoSQL database, achieving great performance on multi-node setups with no single point of failure. Cassandra supports replication across multiple data centers and offers lower latency for users and the ability to survive regional outages.

#Kubernetes cluster
Kubernetes

Kubernetes Jobs and the Sidecar Problem

Toni Kurya
October 25, 2019

Imagine you have a large computation to perform, and once the computation is done, you want Kubernetes to stop the Pods automatically. Simply put, we are talking about running Pods temporarily until a Job is completed.

#sidecarcontainer
Kubernetes

Job being constantly recreated despite RestartPolicy: Never

Toni Kurya
October 22, 2019

Universe.com, a division within Ticketmaster, is shaping the future of the event industry using Kubernetes. They provide meaningful, real-life experiences to people around the globe through a world-class event ticketing platform.

#KubernetesJob
Kubernetes

The Case of the Infected Cluster

Toni Kurya
October 19, 2019

Today's distributed systems need to be resilient. Resilience, in short, means that ideally a user does not notice at all when a random failure takes place, or that the user can at least continue to use the degraded application. On Monday 9 July 2018,

#Kubernetes cluster
Kubernetes

Debugging DNS Failure On Pods Looking Up External Resources

Toni Kurya
October 14, 2019

Docker makes building containers remarkably easy. The downside of this simplicity is that it's easy to build huge containers full of things you don't need, including security holes. By using a smaller, specialized base image such as Alpine, you can significantly reduce the attack surface.

#Kubernetes pods
Kubernetes

Challenges With Running PostgreSQL On Kubernetes

Toni Kurya
October 10, 2019

Containers have become the next big thing in infrastructure software. However, to take full advantage of containers you need to know how to turn them into production services. This is where Kubernetes shines: as an orchestrator of your containerized applications.

#run PostgreSQL
Kubernetes

The Developer Guide to Taking a Kubernetes Cluster Down

Toni Kurya
October 8, 2019

At Kalc we have a lot of experience, gained either from customers or from our time at Fortune 500 companies, and we are concerned about all the mistakes you might make with Kubernetes. In some ways, this is motivated by the fact that we have all these new schedulers, this immutably styled infrastructure that we are all striving towards

Kubernetes

Managing Kubernetes Clusters on AWS Using Kops

Toni Kurya
September 29, 2019

Containers are a well-established way of packaging an application. Kubernetes has also gotten out of the early-adopters phase. Today it is a widely held view that Kubernetes is a cost-effective, ready-made solution that enterprise customers can trust.

#Kubernetes cluster
Kubernetes

How Jetstack's Simple Admission Webhook Led to a Kubernetes Cluster Outage

Nisar Ahmad
September 19, 2019

Jetstack is a fast-growing Kubernetes professional services company that helps startups, SMBs, and enterprises modernize their cloud-native Kubernetes infrastructure. They have been building, operating, and contributing to the Kubernetes ecosystem since 2015.

#cpu issues
Kubernetes

How to solve the strange case of kube-api pods constantly restarting

Nisar Ahmad
September 18, 2019

NRE Labs is a site for teaching network automation in the browser using real, interactive, compelling virtual environments. Its main aim is to democratize interactive, dependency-free learning. The Labs are powered by the Antidote project, which provides a platform for representing curriculum-as-code.

Kubernetes

Setting Up Your EKS Cluster for Scale

Toni Kurya
August 30, 2019

Many organizations are modernizing their existing applications to become more agile and innovate faster. Architectural patterns like microservices enable teams to independently test services and continuously push applications to delivery environments.

#resources utilization
Kubernetes

How Pivotal Caused an Application Outage on Kubernetes?

Nisar Ahmad
August 29, 2019

Pivotal offers business transformation, a cloud-native platform, microservices, containers, developer tools, and consulting services to help enterprise-level businesses build and run their applications. VMware recently announced its intention to acquire Pivotal for $2.7 billion.

#Pod Disruption Budget
Kubernetes

How was Grafana’s Production Outage Caused Using Kubernetes Pod Priorities?

Nisar Ahmad
August 3, 2019

Grafana is the leading open source metric suite for analytics and visualization that is commonly used for analyzing time series data. It can also be used in many other domains, such as home automation, industrial sensors, process control, weather, etc.

#resources issues
Kubernetes

How Moonlight Fixed Application Outage Issues on Kubernetes?

Nisar Ahmad
August 1, 2019

Moonlight is a professional community of software developers and designers where you can find and work with quality candidates based on their experience and location.

#cpu issues
Kubernetes

How Blue Matador Recovered Kubernetes Node OOM?

Nisar Ahmad
August 16, 2018

Blue Matador is a platform that monitors your AWS infrastructure and compute resources, understands the baselines, manages thresholds, and sends actionable alerts. It is described as a check-engine light for your public cloud infrastructure, effortlessly keeping a pulse on everything in the cloud environment.

#cpu issues