3.5 Signals and Interrupts


3.5.1 Problems associated with signals

POSIX defines the notion of "signals", which are events that cause a process or a thread to be interrupted. Windows uses the term "exception", which also covers a more general class of errors.

In both cases the consequence is that a thread or process may be interrupted at any time, either by causes intrinsic to it (synchronous signals), such as floating point exceptions, or by extrinsic ones (asynchronous signals), such as the process being aborted by the user.

Of course, those interruptions are not always welcome. When the interrupt is delivered and a handler is invoked, the thread or even the whole program may be in an inconsistent state. For instance, the thread may have acquired a lock, or it may be in the process of filling the fields of a structure. Furthermore, sometimes the signal that a process receives may not even be related to it, as in the case when a user presses Ctrl-C and a SIGINT signal is delivered to an arbitrary thread, or when the process receives the Windows exception CTRL_CLOSE_EVENT denoting that the terminal window is being closed.

Understanding this, POSIX severely restricts which functions can be called from a signal handler, thereby limiting its usefulness. However, Common Lisp users expect to be able to handle floating point exceptions and to gracefully manage user interrupts, program exits, etc. In an attempt to solve this seemingly impossible problem, ECL has taken a pragmatic approach that works and is reasonably safe, but requires some work from the ECL maintainers and also from users who want to embed ECL as a library.


3.5.2 Kinds of signals


3.5.2.1 Synchronous signals

The name derives from POSIX and it denotes interrupts that occur due to the code that a particular thread executes. They are largely equivalent to C++ and Java exceptions, and in Windows they are called "unchecked exceptions."

Common Lisp programs mainly generate three kinds of synchronous signals:

  • Floating point exceptions, which result from overflows in computations, division by zero, and so on.
  • Access violations, such as dereferencing NULL pointers, writing into regions of memory that are protected, etc.
  • Process interrupts.

The first family of signals is generated by the floating point processing hardware in the computer, and they typically happen when code is compiled with low safety settings, performing mathematical operations without checks.

The second family of signals may seem rare, but unfortunately they still happen quite often. One scenario is wrong code that handles memory directly via FFI. Another one is undetected stack overflows, which typically result in access to protected memory regions. Finally, a very common cause of this kind of exception is invoking a function that has been compiled with very low safety settings with arguments that are not of the expected type, for instance passing a float when a structure is expected.

The third family is related to the multiprocessing capabilities in Common Lisp systems and more precisely to the mp:interrupt-process function which is used to kill, interrupt and inspect arbitrary threads. In POSIX systems ECL informs a given thread about the need to interrupt its execution by sending a particular signal from the set which is available to the user.

Note that in none of these cases should we let the signal pass unnoticed. Access violations and floating point exceptions may propagate through the program causing more harm than expected, and without process interrupts we will not be able to stop and cancel different threads. The only question that remains is whether such signals can be handled by the thread in which they were generated, and how.


3.5.2.2 Asynchronous signals

In addition to the set of synchronous signals or "exceptions", we have a set of signals that denote "events", things that happen while the program is being executed, and "requests". Some typical examples are:

  • Request for program termination (SIGKILL, SIGTERM).
  • Indication that a child process has finished.
  • Request for program interruption (SIGINT), typically as a consequence of pressing a key combination, e.g. Ctrl-C.

The important difference from synchronous signals is that we have no thread that causes the interrupt and thus there is no preferred way of handling them. Moreover, the operating system will typically dispatch these signals to an arbitrary thread, unless we set up mechanisms to prevent it. This can have nasty consequences if the incoming signal interrupts a system call, or leaves the interrupted thread in an inconsistent state.


3.5.3 Signals and interrupts in ECL

The signal handling facilities in ECL are constrained by three needs. First of all, we cannot ignore the synchronous signals mentioned in Synchronous signals. Second, all other signals should cause the least harm to the running threads. Third, when a signal is handled synchronously using a signal handler, the handler should do almost nothing unless we are completely sure that we are in an interruptible region, that is, outside system calls, in code that ECL knows and controls.

The way in which this is solved is based on the existence of both synchronous and asynchronous signal handling code, as explained in the following two sections.


3.5.3.1 Handling of asynchronous signals

In systems in which this is possible, ECL creates a signal handling thread to detect and process asynchronous signals (see Asynchronous signals). This thread is a trivial one and does not process the signals itself: it communicates with, or launches, new signal handling threads that act according to the denoted events.

The use of a separate thread has some nice consequences. The first one is that those signals will not interrupt any sensitive code. The second one is that the signal handling thread will be able to execute arbitrary Lisp or C code, since it is not being executed in a sensitive context. Most importantly, this is the style of signal handling recommended by the POSIX standards, and it is the one that Windows uses.

The installation of the signal handling thread is dictated by a boot time option, ECL_OPT_SIGNAL_HANDLING_THREAD (see Table 1.1 for a summary of boot options), and it will only be possible in systems that support either POSIX or Windows threads.

Systems which embed ECL as an extension language may wish to deactivate the signal handling thread using the previously mentioned option. If this is the case, then they should take appropriate measures to avoid interrupting the code in ECL when such signals are delivered.

Systems which embed ECL and do not mind having a separate signal handling thread can control the set of asynchronous signals handled by this thread. This is done, again, using the appropriate boot options such as ECL_OPT_TRAP_SIGINT, ECL_OPT_TRAP_SIGTERM, etc. Note that in order to detect and handle those signals, ECL must block them from delivery to any other thread. This means changing the signal mask with sigprocmask() in POSIX systems or setting up a custom handler with SetConsoleCtrlHandler() in Windows.
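The general POSIX pattern behind a dedicated signal handling thread can be sketched as follows. This is a minimal illustration and not ECL's actual implementation: the names signal_thread and signal_thread_demo are hypothetical, and the choice of SIGTERM is only an example. The signal is blocked before any thread is created, so that every thread inherits the mask, and a single thread then collects it synchronously with sigwait().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

/* Written by the signal thread, read by the caller only after the join. */
static int received_signal = 0;

/* Hypothetical signal handling thread: it waits synchronously for a signal
 * from the given set. This is ordinary thread code, not a signal handler,
 * so it is not limited to async-signal-safe functions. */
static void *signal_thread(void *arg)
{
    const sigset_t *set = arg;
    int signo = 0;

    if (sigwait(set, &signo) == 0)
        received_signal = signo;
    return NULL;
}

/* Hypothetical demo: returns the signal number seen by the signal thread,
 * or -1 if the setup failed. */
int signal_thread_demo(void)
{
    sigset_t set;
    pthread_t tid;
    int rc;

    sigemptyset(&set);
    sigaddset(&set, SIGTERM);

    /* Block the signal first; threads created later inherit this mask,
     * so no thread can be interrupted asynchronously by SIGTERM. */
    rc = pthread_sigmask(SIG_BLOCK, &set, NULL);
    assert(rc == 0);
    if (pthread_create(&tid, NULL, signal_thread, &set) != 0)
        return -1;

    /* Simulate an external, process-directed termination request. */
    kill(getpid(), SIGTERM);

    pthread_join(tid, NULL);
    return received_signal;
}
```

Calling signal_thread_demo() is expected to return SIGTERM: the signal stays pending for the process until the waiting thread picks it up, and no other thread is interrupted.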


3.5.3.2 Handling of synchronous signals

We have already mentioned that certain synchronous signals and exceptions cannot be ignored and yet the corresponding signal handlers are not able to execute arbitrary code. To resolve this apparent contradiction, ECL uses a simple solution, which is to mark the sections of code which are interruptible, and in which it is safe for the handler to run arbitrary code. All other regions are considered "unsafe" and are protected from signals and exceptions.

In principle this "marking" of safe areas can be done using POSIX functions such as pthread_sigmask() or sigprocmask(). However, in practice this is slow, as it involves at least a function call, resolving thread-local variables, and so on, and it will not work in Windows.
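To make the mask-based approach concrete, here is a minimal POSIX sketch of an "unsafe" region protected with sigprocmask(). It is only an illustration, not code taken from ECL: the helper masked_region_demo and the use of SIGUSR1 are assumptions chosen for the example. In a multithreaded program pthread_sigmask() would be used in the same way.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Incremented by the handler each time it runs. */
static volatile sig_atomic_t handled_count = 0;

static void on_sigusr1(int signo)
{
    (void)signo;
    handled_count++;
}

/* Hypothetical helper: raises SIGUSR1 inside a masked region and reports
 * whether the signal was held back while masked and delivered right after
 * the mask was restored (1) or not (0). */
int masked_region_demo(void)
{
    struct sigaction sa;
    sigset_t block, previous, pending;
    int rc, deferred, delivered;

    sa.sa_handler = on_sigusr1;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    handled_count = 0;

    /* Enter the "unsafe" region: SIGUSR1 is blocked from here on. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    rc = sigprocmask(SIG_BLOCK, &block, &previous);
    assert(rc == 0);

    raise(SIGUSR1);                 /* stays pending, handler does not run */
    sigpending(&pending);
    deferred = (handled_count == 0) &&
               (sigismember(&pending, SIGUSR1) == 1);

    /* Leave the region: restoring the old mask delivers the pending signal
     * before sigprocmask() returns. */
    sigprocmask(SIG_SETMASK, &previous, NULL);
    delivered = (handled_count == 1);

    return deferred && delivered;
}
```

Assuming SIGUSR1 is not already blocked by the caller, masked_region_demo() is expected to return 1. Note that every entry to and exit from the protected region costs a call into the C library or the kernel, which is the overhead the paragraph above refers to.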

Furthermore, sometimes we want signals to be detected but not to be immediately processed. For instance, when reading from the terminal we want to be able to interrupt the process, but we cannot execute the code from the handler, since the C function which is used to read from the terminal, read(), may have left the input stream in an inconsistent, or even locked, state.

The approach in ECL is more lightweight: we install our own signal handler and use a thread-local variable as a flag that determines whether the thread is executing interrupt-safe code or not. More precisely, if the variable ecl_process_env()->disable_interrupts is set, signals and exceptions are postponed and the information about the signal is queued. Otherwise the appropriate code is executed: for instance invoking the debugger, jumping to a condition handler, quitting, etc.
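The flag-based idea can be illustrated with a small, self-contained POSIX sketch. It is a simplification under stated assumptions and not ECL's code: the variables interrupts_disabled and pending_signal are hypothetical stand-ins for per-thread state such as ecl_process_env()->disable_interrupts, they are plain globals rather than thread-local variables, and the "queue" holds a single signal only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical per-thread state, kept global here for brevity. */
static volatile sig_atomic_t interrupts_disabled = 0;
static volatile sig_atomic_t pending_signal = 0;   /* 0 means nothing queued */
static volatile sig_atomic_t processed_count = 0;

/* Stand-in for the real work: invoking a debugger, signaling a condition... */
static void process_signal(int signo)
{
    (void)signo;
    processed_count++;
}

/* The handler itself does almost nothing while interrupts are disabled. */
static void handler(int signo)
{
    if (interrupts_disabled)
        pending_signal = signo;     /* postpone: only record the signal */
    else
        process_signal(signo);      /* interruptible region: act now */
}

void disable_interrupts(void)
{
    interrupts_disabled = 1;
}

void enable_interrupts(void)
{
    interrupts_disabled = 0;
    if (pending_signal) {           /* replay what arrived in the meantime */
        int signo = pending_signal;
        pending_signal = 0;
        process_signal(signo);
    }
}

/* Hypothetical demo: returns 1 if a signal raised while disabled was
 * postponed until enable_interrupts(), and a later one handled at once. */
int deferred_handling_demo(void)
{
    struct sigaction sa;
    int rc, while_disabled, after_enable;

    sa.sa_handler = handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    processed_count = 0;
    disable_interrupts();
    raise(SIGUSR1);                     /* handler runs but only queues */
    while_disabled = processed_count;
    enable_interrupts();                /* queued signal is processed here */
    after_enable = processed_count;
    raise(SIGUSR1);                     /* processed immediately */

    return while_disabled == 0 && after_enable == 1 && processed_count == 2;
}
```

Toggling the flag is a plain memory write, which is why this scheme is cheaper than changing the signal mask around every sensitive region. deferred_handling_demo() is expected to return 1.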

Systems that embed ECL may wish to completely deactivate these signal handlers. This is done using the boot options ECL_OPT_TRAP_SIGFPE, ECL_OPT_TRAP_SIGSEGV, ECL_OPT_TRAP_SIGBUS and ECL_OPT_TRAP_INTERRUPT_SIGNAL.

Systems that embed ECL and want to allow handling of synchronous signals should take care to also trap the associated Lisp conditions that may arise. This is automatically taken care of by functions such as si_safe_eval, and in all other cases it can be solved by enclosing the unsafe code in an ECL_CATCH_ALL frame.


3.5.4 Considerations when embedding ECL

There are several approaches when handling signals and interrupts in a program that uses ECL. One is to install your own signal handlers. This is perfectly fine, but you should respect the same restrictions as ECL. Namely, you may not execute arbitrary code from those signal handlers, and in particular it will not always be safe to execute Common Lisp code from there.

If you want to use your own signal handlers then you should set the appropriate options before invoking cl_boot, as explained in ecl_set_option. Note that in this case ECL will not always be able to detect floating point exceptions.

The other option is to let ECL handle signals itself. This would be safer when the dominant part of the code is Common Lisp, but you may need to protect the code that embeds ECL from being interrupted using either the macros ecl_disable_interrupts and ecl_enable_interrupts or the POSIX functions pthread_sigmask and sigprocmask.


3.5.5 Signals Reference

Macro: mp:with-interrupts &body body
Macro: mp:without-interrupts &body body

Execute code with interrupts enabled or disabled, respectively; see Processes dictionary.

Condition: ext:unix-signal-received

Unix signal condition

Class Precedence List

condition, t

Methods

Function: ext:unix-signal-received-code condition

Returns the signal code of condition

Function: ext:get-signal-handler code

Queries the currently active signal handler for code.

Function: ext:set-signal-handler code handler

Arranges for the signal code to be caught in all threads and sets the signal handler for it. The value of handler modifies the signal handling behaviour as follows:

handler is a function designator

The function designated by handler will be invoked with no arguments

handler is a symbol denoting a condition type

A continuable error of the given type will be signaled

handler is equal to code

A condition of type ext:unix-signal-received with the corresponding signal code will be signaled

handler is nil

The signal will be caught but no handler will be called

Function: ext:catch-signal code flag &key process

Changes the action taken on receiving the signal code. flag can be one of the following:

nil or :ignore

Ignore the signal

:default

Use the default signal handling strategy of the operating system

t or :catch

Catch the signal and invoke the signal handler as given by ext:get-signal-handler

:mask, :unmask

Change the signal mask of either a) the not yet enabled process or b) the current process, if process is not supplied

Returns t on success and nil on failure.

Example:

CL-USER> (ext:catch-signal ext:+SIGPIPE+ :catch)
T
CL-USER> (ext:get-signal-handler ext:+SIGPIPE+)
NIL
CL-USER> (ext:set-signal-handler ext:+SIGPIPE+
                                 #'(lambda ()
                                     (format t "SIGPIPE detected in process: ~a~%" mp:*current-process*)))
#<bytecompiled-function 0x25ffca8>

Passing the SIGPIPE signal to the ECL program with killall -s SIGPIPE ecl results in the output:

SIGPIPE detected in process: #<process TOP-LEVEL 0x1ecdfc0>