Manjusaka

Some Thoughts on Kubernetes and Containerization

Kubernetes and containerization have come up in a lot of group discussions recently, so here is a loose summary of my thoughts. This article represents my personal standpoint and does not reflect any commercial opinions.

Containerization

Currently, a popular viewpoint is to use containers as much as possible. To review this idea, we need to understand the changes that containers have brought us.

Containers undoubtedly bring many benefits:

  1. They make it easy to keep the development and production environments consistent. In other words, when a developer says "this service works fine on my local machine," the statement actually means something.
  2. They make shipping services more convenient, in both distribution and deployment.
  3. They allow for a certain level of resource isolation and allocation.

So, can we blindly use containers? No, we can't. We need to review the drawbacks of containerization:

  1. Container security is a significant concern. The most popular container implementation, Docker, fundamentally relies on cgroups and namespaces for resource and process isolation, so every container shares the host kernel. After all, Docker vulnerabilities and container escapes are discovered every year. This means we need a systematic mechanism to regulate how containers are used, so that any potential security risk stays within a controllable range. Image security is another aspect. As programmers who rely on Baidu, CSDN, Google, and Stack Overflow, we are quite likely to hit a problem, search for a solution, and copy a Dockerfile directly. This poses a significant risk because we don't know what has been added to the base image.
  2. Networking is another issue with containers. When we start multiple containers, how do they communicate with each other? In a production environment with more than one machine, how do we ensure stable communication between containers across different hosts?
  3. Container scheduling and operations are also challenges. When a machine is under high load, how do we schedule some containers to other machines? How do we determine if a container is alive? If a container crashes, how do we restart it?
  4. There are specific details about containers that need to be addressed, such as how to build and package images, how to upload them, and how to troubleshoot corner cases.
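To make the image security point a bit more concrete, here is a minimal sketch of a check that flags risky base images in a copied Dockerfile. The registry prefix `registry.example.com/` and the two rules (trusted registry, pinned by digest) are my own assumptions for illustration, not an established policy, and the parser only handles plain `FROM` lines.

```python
from typing import List, Tuple

# Hypothetical allow-list: replace with a registry you actually control or trust.
TRUSTED_PREFIXES = ("registry.example.com/",)


def audit_base_images(dockerfile_text: str) -> List[Tuple[str, str]]:
    """Return (image, problem) pairs for every FROM line that looks risky."""
    findings = []
    stage_names = set()
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if not line.upper().startswith("FROM "):
            continue
        # Drop the FROM keyword and options such as --platform=linux/amd64.
        tokens = [t for t in line.split()[1:] if not t.startswith("--")]
        if not tokens:
            continue
        image = tokens[0]
        # Remember "AS <name>" aliases so a later "FROM <name>" is not flagged.
        if len(tokens) >= 3 and tokens[1].upper() == "AS":
            stage_names.add(tokens[2])
        if image == "scratch" or image in stage_names:
            continue
        if not image.startswith(TRUSTED_PREFIXES):
            findings.append((image, "not from a trusted registry"))
        if "@sha256:" not in image:
            findings.append((image, "not pinned by digest"))
    return findings
```

A check like this does not prove an image is safe. It only makes the "where did this base image come from?" question impossible to skip.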

When making a business decision, we should not choose a technology just because it is advanced or comfortable. We need to measure the ROI of the decision and make a trade-off between its advantages and disadvantages. Regarding containerization, let's consider some common misconceptions:

  1. We want to use containers for resource isolation. In that case, how is this better than systemd + cgroups, which is a simpler approach? Is containerization really more cost-effective?
  2. We want to practice DevOps, so we want to adopt containerization. In reality, DevOps and containerization are not closely related. DevOps is more of a methodology, a set of practices for collaboration within a team. Loosely speaking, it means simplifying the distribution and operation of a set of services through automation, process improvement, and the introduction of SOPs. In other words, practicing DevOps is not just a technical problem but also an organizational one (here's a joke: DevOps developers don't need to write scripts). Traditional operations tools like Ansible and the various automated testing methods and frameworks can all be part of DevOps. So why do we need containers? Is implementing DevOps with traditional tools really more expensive than with containers?
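For the systemd + cgroups route, a rough sketch of what "simpler" can look like: the function below renders a plain service unit whose limits systemd enforces through cgroups. The service name, binary path, and limit values are hypothetical. `MemoryMax=` and `CPUQuota=` are systemd resource-control settings (`MemoryMax=` assumes a cgroup v2 host).

```python
def render_unit(name: str, exec_start: str, memory_max: str, cpu_quota_percent: int) -> str:
    """Render a minimal systemd service unit with cgroup-based resource limits."""
    return "\n".join([
        "[Unit]",
        f"Description={name}",
        "",
        "[Service]",
        f"ExecStart={exec_start}",
        "Restart=on-failure",
        # Resource-control settings; systemd applies them via the service's cgroup.
        f"MemoryMax={memory_max}",
        f"CPUQuota={cpu_quota_percent}%",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
        "",
    ])


if __name__ == "__main__":
    # Hypothetical service: cap it at 512 MiB of memory and half a CPU.
    print(render_unit("demo-api", "/usr/local/bin/demo-api", "512M", 50))
```

If a unit file like this covers the isolation you need, the extra machinery of images, registries, and a container runtime has to justify itself some other way.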

From these examples, we can see that when we embark on containerization, we must consider whether it truly solves our pain points or if we are simply adopting it because it seems advanced and impressive.

Kubernetes

The aforementioned issues with containerization have led to the emergence of container orchestration systems, with Kubernetes being the representative one. Now, let's discuss Kubernetes.
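As a rough illustration of what an orchestrator does about the scheduling questions raised earlier (is a container alive, where should it move when a machine is overloaded), here is a toy control loop. It is a deliberately simplified model of my own, not how Kubernetes is implemented; the `Container` type, the load numbers, and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Container:
    name: str
    node: str
    alive: bool


def reconcile(containers: List[Container], node_load: Dict[str, float],
              max_load: float = 0.8) -> List[str]:
    """Return the actions a toy orchestrator would take for the observed state."""
    actions = []
    # Nodes with spare capacity, least loaded first.
    spare = sorted((n for n, load in node_load.items() if load < max_load),
                   key=lambda n: node_load[n])
    for c in containers:
        if not c.alive:
            # Crashed container: restart it in place.
            actions.append(f"restart {c.name} on {c.node}")
        elif node_load.get(c.node, 0.0) >= max_load and spare:
            # Healthy container on an overloaded node: move it elsewhere.
            actions.append(f"move {c.name} from {c.node} to {spare[0]}")
    return actions
```

Even this toy version needs health data, load data, and a placement policy. A real orchestrator has to do the same thing reliably across many machines, which is where the complexity comes from.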

Firstly, I will ignore the scenario of building a self-managed Kubernetes cluster because it is not something that ordinary people can handle. Instead, let's focus on the usage of public cloud platforms. Taking Alibaba Cloud as an example, when we visit their website, we see the following images:

[Two screenshots from the Alibaba Cloud website; images not available]

Now, let's ask some questions:

  1. What is VPC?
  2. What is the difference between Kubernetes 1.16.9 and 1.14.8?
  3. What are Docker 19.03.5 and Alibaba Cloud Security Sandbox 1.1.0? What is the difference between them?
  4. What is a dedicated network?
  5. What is a virtual switch?
  6. What are network plugins? What are Flannel and Terway? How do they differ? When you look through the documentation and find out that Terway is an Alibaba Cloud-customized CNI plugin based on Calico, you might wonder, what is a CNI plugin? What is Calico?
  7. What is Pod CIDR and how do you set it?
  8. What is Service CIDR and how do you set it?
  9. What is SNAT and how do you configure it?
  10. How do you configure security groups?
  11. What is Kube-Proxy? What is the difference between iptables and IPVS? How do you choose?
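To give a feel for just the CIDR questions: a common constraint is that the VPC, Pod, and Service address ranges must not overlap, and the size of each range limits how many Pods or Services you can have. The sketch below checks a plan with Python's standard `ipaddress` module. The address ranges are made-up examples, not recommended values.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List, Tuple


def find_overlaps(cidrs: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return every pair of named CIDR blocks whose address ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [(a, b) for a, b in combinations(nets, 2) if nets[a].overlaps(nets[b])]


def capacity(cidr: str) -> int:
    """Return the number of addresses in a CIDR block."""
    return ipaddress.ip_network(cidr).num_addresses


if __name__ == "__main__":
    # Hypothetical plan: the Pod range accidentally sits inside the VPC range.
    plan = {
        "vpc": "192.168.0.0/16",
        "pod": "192.168.128.0/18",
        "service": "172.21.0.0/20",
    }
    print(find_overlaps(plan))
    print(capacity(plan["service"]))
```

And that only covers questions 7 and 8 on the list.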

Quite different from what you imagined, isn't it? You might say that small companies don't need to worry about these things and can just use the default settings... Well, then why bother with Kubernetes? Okay, let's assume you've deployed it anyway. Now let's add up the costs.

  1. You need a container registry, right? It's not expensive: the basic version in the China region costs 780 RMB per month.
  2. Do you need to expose services within your cluster? Okay, buy the lowest-spec SLB (Server Load Balancer), the simple version, for 200 RMB per month.
  3. Alright, you need to pay for logs every month, right? Let's say you have 20GB of logs per month, not much, right? Okay, that will be 39.1 RMB.
  4. Do you want cluster monitoring? Great, buy it. Let's say you have 500,000 log entries reported per day. It's not expensive, only 975 RMB per month.

Let's calculate the yearly cost for one cluster: (780+200+39.1+975)*12 = 23929.2 RMB. This doesn't include the costs of basic resources like ENIs and ECS instances. Quite a hefty sum, isn't it?
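If you want to redo the sum with your own numbers, a few lines are enough. The prices below are the ones quoted in the list above. The dictionary keys are just my labels, and `Decimal` is used only to avoid floating-point noise on the 39.1.

```python
from decimal import Decimal

# Monthly prices in RMB, taken from the list above.
monthly_costs_rmb = {
    "container registry (basic)": Decimal("780"),
    "SLB (lowest spec)": Decimal("200"),
    "logs (20 GB per month)": Decimal("39.1"),
    "cluster monitoring": Decimal("975"),
}

monthly_total = sum(monthly_costs_rmb.values())
yearly_total = monthly_total * 12

if __name__ == "__main__":
    print(f"per month: {monthly_total} RMB")
    print(f"per year:  {yearly_total} RMB")
```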

Moreover, many other issues will arise. For more details, you can check out the Kubernetes issue tracker.

Conclusion

I wrote this article not to complain or criticize anyone but to express a viewpoint. I want to borrow a passage from an article I really like, "中台,我信了你的邪 | 深氪" (roughly, "Middle Platform, I Fell for Your Hype"):

At the end of last year, Alibaba's Chairman and CEO, Zhang Yong, also mentioned during a lecture at Hupan University: "If a company focuses solely on building a middle platform, it will die."

I'm not sure whether Zhang Yong actually said this, but I agree with it. By the same token, I believe that if a company focuses solely on pursuing technological advancement, it will die. After all, technology should serve the business, and technical progress largely grows out of accumulated business needs.

Well, this is probably the most casual article I have ever written. That's all for now. Back to work!
