In this blog, we're going to explain what Kubernetes is. We'll start with the official definition to see what Kubernetes is and what it does: why did Kubernetes even come around, and what problems does it solve? Then we'll look at the basic architecture of Kubernetes: what master and worker nodes are, and which Kubernetes processes actually make up the platform.
Then we'll cover some basic concepts and components of Kubernetes, such as pods, containers, and services, and the role each of those components plays.
And finally, we'll look at a simple configuration that you, as a Kubernetes cluster user, would use to create those components and configure the cluster to your needs.
What is Kubernetes?
Kubernetes is an open-source container orchestration framework originally developed by Google.
At its foundation, it manages containers, be they Docker containers or containers from some other technology. This means Kubernetes helps you manage applications made up of hundreds or even thousands of containers, and it helps you manage them across different environments: physical machines, virtual machines, cloud environments, or even hybrid deployments.
So, what problems does Kubernetes solve, and what are the tasks of a container orchestration tool?
The rise of microservices caused increased usage of container technologies, because containers offer the perfect host for small, independent applications like microservices.
The rise of container and microservice technology resulted in applications that now comprise hundreds or sometimes even thousands of containers.
Managing that many containers across multiple environments using scripts and self-made tools can be really complex and sometimes even impossible. That specific scenario is exactly what created the need for container orchestration technologies.
Orchestration tools like Kubernetes guarantee the following features.
In simple words, high availability means that the application has no downtime, so it is always accessible to users.
Scalability means the application scales with load: it keeps performing well under high demand, and users get fast response times from the application.
Disaster recovery means that if the infrastructure has problems, like data being lost, servers failing, or something bad happening to the data center, there has to be some mechanism to back up the data and restore it to the latest state, so that the application doesn't lose any data and the containerized application can run from the latest state after the recovery. All of these are functionalities that container orchestration technologies like Kubernetes offer.
What does the basic Kubernetes architecture look like?
A Kubernetes cluster is made up of at least one master node, and connected to it a couple of worker nodes, where each node has a kubelet process running on it.
Kubelet is the Kubernetes process that makes it possible for the nodes in the cluster to communicate with each other and to actually execute tasks on those nodes, like running application processes.
Each worker node has Docker containers of different applications deployed on it. So depending on how the workload is distributed, you would have a different number of Docker containers running on each worker node. The worker nodes are where the actual work happens; this is where your applications run.
So, the question is: what is running on the master node?
The master node runs several Kubernetes processes that are absolutely necessary to run and manage the cluster properly.
The first is the API server, which is also a container. The API server is the entry point to the Kubernetes cluster. This is the process that the different Kubernetes clients talk to: the UI if you're using the Kubernetes dashboard, the API if you're using scripts and automation technologies, and the command-line tool. All of these talk to the API server.
Another process running on the master node is the controller manager, which keeps an overview of what's happening in the cluster: whether something needs to be repaired, or whether a container died and needs to be restarted, and so on.
The scheduler is responsible for scheduling containers on different nodes based on the workload and the available server resources on each node. It is an intelligent process that decides which worker node the next container should be scheduled on, based on the resources available on the worker nodes and the resources that container needs.
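To make this concrete, here is a minimal sketch of the resource information the scheduler works with: a pod spec can declare resource requests, and the scheduler only places the pod on a worker node that still has at least that much capacity free. All names and values below are illustrative assumptions, not from the original post:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app            # illustrative name
spec:
  containers:
    - name: my-app
      image: nginx:1.25   # illustrative image
      resources:
        requests:         # minimum capacity the scheduler must find on a node
          cpu: "250m"
          memory: "128Mi"
        limits:           # hard caps enforced on the node
          cpu: "500m"
          memory: "256Mi"
```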
Another very important component of the whole cluster is etcd, a key-value store which holds the current state of the Kubernetes cluster at any time. It contains all the configuration data and all the status data of each node and each container on that node.
Backup and restore are made from etcd snapshots, because you can recover the whole cluster state using such a snapshot.
And last but not least, a very important component of Kubernetes that enables the worker nodes and master nodes to talk to each other is the virtual network that spans all the nodes in the cluster. In simple words, the virtual network turns all the nodes inside the cluster into one powerful machine that has the sum of the resources of the individual nodes.
One thing to note here is that worker nodes carry the most load, because they run the applications. They are usually much bigger and have more resources, since they will be running hundreds of containers.
The master node, on the other hand, runs just a handful of master processes, so it doesn't need that many resources.
However, as you can imagine, a master node is much more important than any individual worker node, because if you lose access to the master node, you will not be able to access the cluster anymore. That means you absolutely have to have a backup of your master at all times. So, in production environments you would usually have at least two masters inside your Kubernetes cluster.
In most cases you're going to have multiple masters, so that if one master node goes down the cluster continues to function smoothly, because you have other masters available.
Kubernetes Basic Concepts
So, let's now look at some basic concepts of Kubernetes, like pods and containers.
In Kubernetes, a pod is the smallest unit that you, as a Kubernetes user, configure and interact with. A pod is basically a wrapper around a container. On each worker node you're going to have multiple pods, and inside a pod you can actually have multiple containers. Usually you have one pod per application; the only time you would need more than one container inside a pod is when a main application needs some helper containers. For example, a message broker would be one pod, a server would be another pod, and a Java application would be yet another pod.
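As a sketch of the "main application plus helper container" case described above, a pod with two containers could look like the following; both containers share the pod's IP address and lifecycle. All names and images here are illustrative assumptions:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: my-app          # the main application container
      image: my-app:1.0     # illustrative image
    - name: log-collector   # a helper (sidecar) container for the main app
      image: fluentd:v1.16  # illustrative image
```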
And, as mentioned previously, there is a virtual network that spans the Kubernetes cluster. So what does the virtual network do?
It assigns each pod its own IP address. So each pod is its own self-contained server with its own IP address, and the way pods communicate with each other is by using those internal IP addresses. Note that we don't actually configure or create containers inside a Kubernetes cluster; we only work with pods, which are an abstraction layer over containers.
A pod is a component of Kubernetes that manages the containers running inside it without our intervention. For example, if a container stops or dies inside a pod, it will be automatically restarted inside the pod.
However, pods are ephemeral components, which means pods can also die quite frequently, and when a pod dies a new one gets created.
So what happens is that whenever a pod dies or gets restarted and a new pod is created, the new pod gets a new IP address. For example, if your application talks to a database pod using that pod's IP address and the pod restarts, it gets a new IP address. Because of this problem, Kubernetes has another component called a service, which is basically a substitute for those IP addresses.
So, instead of relying on these dynamic IP addresses, there are services sitting in front of the pods, and the pods talk to each other through them. Now if a pod behind a service dies and gets recreated, the service stays in place, because their lifecycles are not tied to each other. A service has two main functionalities: it is a permanent IP address that you can use for communication between the pods, and at the same time it is a load balancer.
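As an illustration of those two functionalities, a minimal Service manifest might look like this: it gives any pod carrying the label app: my-app a stable address and load-balances traffic across those pods. The names and ports are assumptions for the example:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-service   # illustrative name
spec:
  selector:
    app: my-app          # traffic is forwarded to pods with this label
  ports:
    - port: 80           # the service's own, permanent port
      targetPort: 8080   # the port the container inside the pod listens on
```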
Now that we have seen the basic concepts of Kubernetes, how do we actually create those components, like pods and services, to configure the Kubernetes cluster? All configuration in a Kubernetes cluster goes through the master node, via the process called the API server, which we mentioned briefly earlier.
Kubernetes clients, which could be a UI (the Kubernetes dashboard, for example), an API (a script or a curl command), or a command-line tool like kubectl, all talk to the API server and send it their configuration requests. The API server is the main, and in fact the only, entry point into the cluster, and these requests have to be in either YAML or JSON format.
Here is what an example configuration in YAML format looks like.
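A minimal example along these lines follows; the names, image, replica count, and port are illustrative assumptions:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2                 # how many pod copies Kubernetes should keep running
  selector:
    matchLabels:
      app: my-app             # which pods this deployment manages
  template:                   # the blueprint for each pod
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:1.0   # illustrative image
          ports:
            - containerPort: 8080
```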
So, with this YAML file we are sending a request to Kubernetes to configure a component called a deployment, which is basically a template or blueprint for creating pods.