What are we talking about when we discuss CI/CD?

This article is actually the content shared for the second time in the group. This time, I want to talk about some issues and the evolution process we need to encounter regarding CI/CD. Of course, this article is somewhat aimed at beginners and includes a bit of explosive discussion, so feel free to just browse through it.

To Begin with, Let's Define#

Before we discuss something, we need to provide a definition for it. So let's take a look at the definitions of CI and CD that we are going to talk about today.

First, CI stands for Continuous Integration, which is expressed in Chinese as 持续集成. CD, in common contexts, may refer to two meanings: Continuous Delivery or Continuous Deployment, corresponding to 持续交付 / 持续部署. Here, I borrow the definition given by Brent Laster in What is CI/CD?¹

Continuous integration (CI) is the process of automatically detecting, pulling, building, and (in most cases) doing unit testing as source code is changed for a product. CI is the activity that starts the pipeline (although certain pre-validations—often called "pre-flight checks"—are sometimes incorporated ahead of CI).
The goal of CI is to quickly make sure a new change from a developer is "good" and suitable for further use in the code base.
Continuous deployment (CD) refers to the idea of being able to automatically take a release of code that has come out of the CD pipeline and make it available for end users. Depending on the way the code is "installed" by users, that may mean automatically deploying something in a cloud, making an update available (such as for an app on a phone), updating a website, or simply updating the list of available releases.

Just looking at the definitions might leave everyone a bit confused, so let's go through some practical examples to clarify CI/CD.

Re: Building the Process from Scratch#

This title seems a bit casual, but never mind. First, let's assume the simplest requirement:

We have built a personal blog system based on Hexo. It includes the articles we need to publish and the theme we configured. We need to publish it to a specific Repo.

Alright, based on this requirement, let's play through the process from 0 to 0 (laughs).

Building from the Ground Up#

Many people might ask why we chose Hexo as our starting point. The reason is simple! Because it's simple enough!

Back to the point, Hexo has two commands hexo g && hexo d, which generate static web pages based on the Markdown files in the current directory and then push the generated products to the corresponding repo based on the configuration.

OK, so at the most basic stage, a build process looks like this:

Use an editor to happily write articles.
Then execute hexo g && hexo d in the local terminal.

Now the problem arises: what if sometimes we forget to execute the generation command after submitting the blog? Or it's cumbersome to type the same command repeatedly? Let's automate the whole process. Let's rock!

Further Building#

OK, let's assume that if we complete the automation, what should our workflow for publishing a blog look like?

We write a Markdown file and push it to the Master branch of the GitHub repository.
Our automated task starts building our blog, generating a series of static files and styles.
We push our static files and styles to our site Repo/CDN or other target locations.

Now, there are two core issues here:

Automatically start building when we push code.
Push the products after the build is complete.

Now, let's configure a set of our automated build tasks based on GitHub Action.

name: Build And Publish Blog
on:
  push:
    branches: [ master ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
        with:
          node-version: '12.x'
      - name: Install Package
        run: npm install -g hexo-cli && npm install
      - name: Generate Html File
        run: hexo g
      - name: Deploy 🚀
        uses: JamesIves/[email protected]
        with:
          GITHUB_TOKEN: ${{ secrets.PUBLSH_TOKEN }}
          BRANCH: gh-pages # The branch the action should deploy to.
          FOLDER: public # The folder the action should deploy.

We can see that this configuration actually accomplishes the following tasks:

Triggers the build when we submit code to the master branch.
Pulls the code.
Installs the dependencies required for the build.
Builds and generates static files.
Pushes the static files.

If any of the above steps fail, the subsequent steps will be canceled. In fact, such a simple task already contains the basic elements of CI and CD (here I have not strictly distinguished between Continuous Delivery and Continuous Deployment).

Continuous building and integration with existing code.
Distinguishing multiple phases in integration. Each phase will depend on the results of the previous phase.
Delivering/deploying the build products. The success of delivery/deployment relies on the success of integration.

Now, let's switch the blog system example to one from our engineering context. Replace Hexo with our Python service. Replace adding a blog post with adding our new code. Replace the build command with tools like mypy/pylint. You see, CI/CD is actually quite different from the complex systems you might imagine, right?

Many people might ask, if we trigger these commands not online but implement them locally using Git Hooks, does that count as a form of CI and CD? I believe it undoubtedly does. From my perspective, the core element of CI/CD is to expose defects as early as possible through repeatable, automated tasks, reducing unnecessary incidents caused by human factors.

This Developer Is Excessively Foolish Yet Not Cautious#

First, let me throw out a basic explosive statement, and then we can continue discussing:

Everyone has foolish moments, and those moments can be quite frequent.

In light of this explosive statement, let's review the example of building a personal blog system based on Hexo. If we do not choose to solve our build and publishing requirements through a convergent, automated system, what risks might arise in our process?

At the most basic level, finishing the blog and forgetting to build or publish.
For example, if we upgrade the Hexo version or theme version in our dependencies without testing, it could lead to broken styles.
Issues in our Markdown could cause the build to fail.
In cases where multiple people maintain a blog, each of us needs to save the credentials for the target repository/CDN, leading to information leakage, etc.

Switching the example of building a personal blog system based on Hexo to our daily development scenarios, we might encounter even more problems. Here are a few examples:

Inability to quickly roll back.
Inability to trace specific build/deployment records.
Lack of automated tasks, leading developers to neglect running tests or linting, resulting in code degradation.
Accidents caused by peak-time deployments.

Hmm, are these problems familiar to everyone? Basically, I built it, and it broke, what’s there to say? 23333

At this point, have you noticed a problem? In this article, I haven't distinguished between CI and CD? From my perspective, CI/CD is essentially the same practice. That is, the convergence of the development process and the delivery process.

From my perspective, the core goal of building a CI/CD system is:

To minimize the system's instability caused by human factors through convergent entry points and automated task triggers.
To expose problems in the system as early as possible through fast, multiple, repeatable, and seamless tasks.

Under these two major goals, we will adopt different means and forms to enrich the content of our CI/CD based on different business scenarios, including but not limited to:

Automated unit tests, E2E tests, etc., in the CI phase.
Periodic Nightly Builds in the CI phase.
Release management in the CD phase.

However, regardless of how we build a CI/CD system or what granularity we choose for CI/CD, I believe a qualified CI/CD system and mechanism must adhere to the following principles (personal summary):

Convergence of entry points and establishment of SOPs. If this consensus is not reached, and developers can bypass the CI/CD system through technical means, then we return to the title of this chapter (this developer is excessively foolish yet not cautious).
Non-intrusiveness to business code.
Integration tasks/deployment tasks must be automated and repeatable.
Traceable historical records and results.
Traceable build integration products.
Top-down support.

Following these summarized principles, let's iterate on our previous blog publishing process.

name: Build And Publish Blog
on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - uses: actions/setup-node@v1
        with:
          node-version: '12.x'

      - name: Install Package
        run: npm install -g hexo-cli && npm install

      - name: Generate Html File
        run: hexo g

      - name: Deploy To Repo🚀
        if: ${{ github.ref == 'refs/heads/master'}}
        uses: JamesIves/[email protected]
        with:
          GITHUB_TOKEN: ${{ secrets.PUBLSH_TOKEN }}
          BRANCH: gh-pages # The branch the action should deploy to.
          FOLDER: public # The folder the action should deploy.
      - name: Upload to Collect Repo
        uses: JamesIves/[email protected]
        with:
          GITHUB_TOKEN: ${{ secrets.PUBLSH_TOKEN }}
          BRANCH: build-${{ github.run_id }} # The branch the action should deploy to.
          FOLDER: public # The folder the action should deploy.

In this revised build process, I chose to trigger the CI process at the PR level and store historical products, while adding a new deployment process after merging into the main branch. This way, when I build and publish the blog, I can verify the correctness of framework upgrades and new blog posts through the traceable historical products. Additionally, relying on GitHub Action, I can effectively complete the historical build traceability.

This way, I can avoid various side effects caused by my foolish actions as much as possible (escape).

The Final Chapter of Aggressive Building#

Alright, nothing left, stunned, right?
.
.
.
.

Just kidding. In fact, this article can be concluded here. Through this article, you can see that building a CI/CD system may not involve many complex technical issues (with very few exceptions). Whether it's traditional Jenkins or new GitHub Action, GitLab-CI, or services provided by cloud vendors, they can all help us build a CI/CD system that fits our business well. However, I previously stated on Twitter that "the establishment of CI/CD is often not a technical issue, but an institutional issue, or more accurately, a conceptual issue."

Therefore, I hope each of us can recognize the fact that we all make mistakes and then try to converge and automate the development and delivery processes of the systems we are responsible for as much as possible. Let CI/CD truly become a part of our daily work.

That's about it, I'm off, I'm off.