Manjusaka

A Brief Discussion on Signal Handling in Processes V2

Last time I wrote a filler post, A Simple Discussion on Signal Handling in Processes, and my mentor scolded me angrily: the examples in it were too old-fashioned, too simple, too naive, and if they lead anyone astray in the future, I will be held responsible too. That scared me so much that I skipped writing the anniversary post with my girlfriend and hurried to write this follow-up on better, more convenient ways to handle signals.

Background

First, let's take a look at the example from the previous article.

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

void deletejob(pid_t pid) { printf("delete task %d\n", pid); }

void addjob(pid_t pid) { printf("add task %d\n", pid); }

void handler(int sig) {
  int olderrno = errno;
  sigset_t mask_all, prev_all;
  pid_t pid;
  sigfillset(&mask_all);
  while ((pid = waitpid(-1, NULL, 0)) > 0) {
    sigprocmask(SIG_BLOCK, &mask_all, &prev_all);
    deletejob(pid);
    sigprocmask(SIG_SETMASK, &prev_all, NULL);
  }
  if (errno != ECHILD) {
    printf("waitpid error");
  }
  errno = olderrno;
}

int main(int argc, char **argv) {
  int pid;
  sigset_t mask_all, prev_all;
  sigfillset(&mask_all);
  signal(SIGCHLD, handler);
  while (1) {
    if ((pid = fork()) == 0) {
      execve("/bin/date", argv, NULL);
    }
    sigprocmask(SIG_BLOCK, &mask_all, &prev_all);
    addjob(pid);
    sigprocmask(SIG_SETMASK, &prev_all, NULL);
  }
}

Now let's review the key calls it uses.

  1. signal: Registers a signal handler for the current process. When the given signal is delivered, the system calls that handler to run the corresponding logic.
  2. sigfillset: One of the functions used to manipulate signal sets. Here it adds every supported signal to a signal set.
  3. fork: A well-known API that creates a new process. In the parent process, the return value is the child's pid; in the child process, it is 0.
  4. execve: Replaces the current process image with the specified executable file.
  5. sigprocmask: Changes the process's signal mask. When the first parameter is SIG_BLOCK, the signals in the set passed as the second parameter are added to the current mask. When the first parameter is SIG_SETMASK, the mask is replaced by the set passed as the second parameter. In both cases, if the third parameter is not NULL, the previous mask is saved there.
  6. waitpid: As an imprecise summary, it reaps terminated child processes and releases their resources.

Now that we've reviewed the key points, let's move on to the main part of this article.

More Elegant Signal Handling

More Elegant Handler

First, let's take another look at the signal handling part of the code above.

void handler(int sig) {
  int olderrno = errno;
  sigset_t mask_all, prev_all;
  pid_t pid;
  sigfillset(&mask_all);
  while ((pid = waitpid(-1, NULL, 0)) > 0) {
    sigprocmask(SIG_BLOCK, &mask_all, &prev_all);
    deletejob(pid);
    sigprocmask(SIG_SETMASK, &prev_all, NULL);
  }
  if (errno != ECHILD) {
    printf("waitpid error");
  }
  errno = olderrno;
}

Here, to make sure deletejob is not interrupted by other signals, the handler blocks every signal with sigprocmask and SIG_BLOCK while it runs, then restores the old mask. Logically, this is fine, but there is a problem. Once we have many different handlers, each one has to repeat the same block-and-restore boilerplate. So is there a more elegant way to ensure the safety of our handler?

Yes (very loudly (good, very energetic! (run away))). Let me introduce a new syscall: sigaction.

Without further ado, here’s the code.

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

void deletejob(pid_t pid) { printf("delete task %d\n", pid); }

void addjob(pid_t pid) { printf("add task %d\n", pid); }

void handler(int sig) {
  int olderrno = errno; /* waitpid may change errno; restore it on exit */
  pid_t pid;
  /* No sigprocmask calls here: sa_mask already blocks signals for us. */
  while ((pid = waitpid(-1, NULL, 0)) > 0) {
    deletejob(pid);
  }
  if (errno != ECHILD) {
    printf("waitpid error");
  }
  errno = olderrno;
}

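int main(int argc, char **argv) {
  pid_t pid;
  sigset_t mask_all, prev_all;
  struct sigaction new_action;
  sigfillset(&mask_all);
  /* Zero the struct first so no field is left uninitialized. */
  memset(&new_action, 0, sizeof(new_action));
  new_action.sa_handler = handler;
  new_action.sa_mask = mask_all; /* blocked while handler runs */
  new_action.sa_flags = SA_RESTART;
  /* Install the handler with sigaction instead of signal. */
  if (sigaction(SIGCHLD, &new_action, NULL) == -1) {
    perror("sigaction");
    return 1;
  }
  while (1) {
    if ((pid = fork()) == 0) {
      execve("/bin/date", argv, NULL);
      _exit(1); /* only reached if execve fails */
    }
    sigprocmask(SIG_BLOCK, &mask_all, &prev_all);
    addjob(pid);
    sigprocmask(SIG_SETMASK, &prev_all, NULL);
  }
}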

Great! Very energetic! You may have noticed that, compared with the previous version, this code adds some sigaction setup. Why?

Because with sigaction we can set sa_mask, the set of signals that are blocked while the signal handler is running. The signal being handled is also blocked during the handler by default, unless the SA_NODEFER flag is set.

See, our code is indeed more elegant compared to before. Of course, sigaction has many other useful settings, which you can explore.

Faster Signal Handling

In our previous example, we have solved the problem of elegantly setting the signal handling function, but now we face a brand new problem.

As mentioned earlier, while our signal handler is executing, we block other signals. Here lies a problem: suppose the logic in the handler takes a long time and does not actually require atomicity (i.e., it does not need to be synchronized with the signal handler), and signals arrive frequently. Then signals keep piling up as pending while the handler runs. Standard signals are not queued, so several occurrences of the same signal can collapse into a single pending one and information is lost; real-time signals are queued, but only up to a system limit. Either way, the consequences are hard to predict.

So, is there a better way to handle this?

Suppose we open a file, and in the signal handler we do only one thing: write a specific value to this file. Then we poll the file, and when it changes, we read the value back, work out which signal it stands for, and run the corresponding handling logic outside the handler. This way the signal is still noticed, and the time spent with signals blocked inside the handler is kept to a minimum.

Of course, the community knows that everyone finds writing code hard, so Linux provides a dedicated syscall for exactly this: signalfd.

As usual, let’s take a look at the example.

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h> /* abort, calloc, exit */
#include <string.h>
#include <sys/epoll.h>
#include <sys/signalfd.h>
#include <sys/wait.h>
#include <unistd.h> /* close, read, fork, execve */

#define MAXEVENTS 64
void deletejob(pid_t pid) { printf("delete task %d\n", pid); }

void addjob(pid_t pid) { printf("add task %d\n", pid); }

int main(int argc, char **argv) {
  int pid;
  struct epoll_event event;
  struct epoll_event *events;
  sigset_t mask;
  sigemptyset(&mask);
  sigaddset(&mask, SIGCHLD);
  if (sigprocmask(SIG_SETMASK, &mask, NULL) < 0) {
    perror("sigprocmask");
    return 1;
  }
  int sfd = signalfd(-1, &mask, 0);
  int epoll_fd = epoll_create(MAXEVENTS);
  event.events = EPOLLIN | EPOLLEXCLUSIVE | EPOLLET;
  event.data.fd = sfd;
  int s = epoll_ctl(epoll_fd, EPOLL_CTL_ADD, sfd, &event);
  if (s == -1) {
    abort();
  }
  events = calloc(MAXEVENTS, sizeof(event));
  while (1) {
    int n = epoll_wait(epoll_fd, events, MAXEVENTS, 1);
    if (n == -1) {
      if (errno == EINTR) {
        fprintf(stderr, "epoll EINTR error\n");
      } else if (errno == EINVAL) {
        fprintf(stderr, "epoll EINVAL error\n");
      } else if (errno == EFAULT) {
        fprintf(stderr, "epoll EFAULT error\n");
        exit(-1);
      } else if (errno == EBADF) {
        fprintf(stderr, "epoll EBADF error\n");
        exit(-1);
      }
    }
    printf("%d\n", n);
    for (int i = 0; i < n; i++) {
      if ((events[i].events & EPOLLERR) || (events[i].events & EPOLLHUP) ||
          (!(events[i].events & EPOLLIN))) {
        printf("%d\n", i);
        fprintf(stderr, "epoll err\n");
        close(events[i].data.fd);
        continue;
      } else if (sfd == events[i].data.fd) {
        struct signalfd_siginfo si;
        ssize_t res = read(sfd, &si, sizeof(si));
        if (res < 0) {
          fprintf(stderr, "read error\n");
          continue;
        }
        if (res != sizeof(si)) {
          fprintf(stderr, "Something wrong\n");
          continue;
        }
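        if (si.ssi_signo == SIGCHLD) {
          printf("Got SIGCHLD\n");
          /* Several exits may share one SIGCHLD, so reap without blocking
             until no terminated child is left. */
          pid_t child_pid;
          while ((child_pid = waitpid(-1, NULL, WNOHANG)) > 0) {
            deletejob(child_pid);
          }
        }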
      }
    }
    if ((pid = fork()) == 0) {
      execve("/bin/date", argv, NULL);
    }
    addjob(pid);
  }
}

Now, let's go over some key points of this code.

  1. signalfd returns a special file descriptor that is readable and can be monitored. When one of the specified signals occurs, we can read its details, including the signal number, from the returned fd.
  2. signalfd only sees signals that are blocked. If the signal is not blocked, it is delivered in the usual way, so if we register a signal handler for SIGCHLD and also create a signalfd for it, the handler is called and nothing arrives on the fd. Therefore, when using signalfd, we first need to block the signal with sigprocmask.
  3. As mentioned above, this file descriptor can be monitored, meaning we can use select, poll, epoll, etc., to watch the fd. In the code above, we use epoll to monitor the signalfd.

Of course, one additional point to note is that many languages (Python, for example) may not provide an official signalfd API, but they may offer equivalent alternatives, a typical example being Python's signal.set_wakeup_fd.

Here’s a thought question for you: besides using signalfd, what other methods can achieve efficient and safe signal handling?

Conclusion

I believe that signal handling is a fundamental skill for developers, and we need to safely and reliably handle various signals encountered in the program environment. The system also provides many well-designed APIs to alleviate the burden on developers. However, we must understand that signals are essentially a means of communication, and their inherent drawback is that they carry limited information. Often, when we have a lot of high-frequency information to transmit, using signals may not be a good choice. Of course, this is not a definitive conclusion; it can only be a trade-off done on a case-by-case basis.

That's about it for this week's second filler post (run away).
