Logging messages onto a strictly bounded queue is perilous

Imagine we have a complicated program with a logging framework in it which puts messages on a queue. One thread is responsible for taking log messages from the queue and writing them somewhere, such as a logstash instance or local database.

Ideally we don't want to ever drop messages on the floor, even if the threads that add log messages go faster than the logging thread can save the messages. We can't buffer messages forever or we'll eventually just run out of memory and crash, so the only way out that never loses messages is to force the threads generating the messages to slow down if they outrace the logging thread too much.

The obvious way to do this is with a blocking queue. Unfortunately, this is perilous in any system in which finalisers or signals can cause messages to be logged from any thread.

The setup:

Say we have the following program, which simulates the system described above:

try:
    import queue
except ImportError:
    import Queue as queue

from threading import Thread, current_thread
import time
import os
import sys


def usage():
    print("Usage: %s <MAX_SIZE>" % sys.argv[0])
    print("  where MAX_SIZE is an integer >= 0.")
    print("")
    print("If MAX_SIZE is 0, model an unbounded log message queue,")
    print("otherwise model a queue holding up to MAX_SIZE messages.")
    sys.exit(1)


if not len(sys.argv) == 2:
    usage()

MAX_SIZE = int(sys.argv[1], base=10)
if MAX_SIZE < 0:
    usage()

message_queue = queue.Queue(maxsize=MAX_SIZE)


class HLD(object):
    def __del__(self):
        message = u"%r destroyed %r\n" % (current_thread().name, self)
        os.write(1, message.encode("utf-8"))
        message_queue.put(message)

    def __repr__(self):
        return 'HLD(%r)' % (id(self),)


def in_read_thread():
    while True:
        item = message_queue.get()
        message = u"reader dequeued (%r)\n" % (item,)
        os.write(1, message.encode("utf-8"))
        # This block allocates cyclic garbage in order to
        # make the gc sometimes activate on the read thread.
        # For a faster repro, replace this block with:
        #   import gc; gc.collect()
        for _ in range(2):
            trash = [object() for _ in range(100)]
            trash.append(trash)
            trash = None


def in_write_thread():
    while True:
        # This block allocates cyclic garbage which has references to HLD
        # objects. The HLD objects themselves aren't part of the cycles,
        # so this cyclic garbage can be collected at some point.
        trash = [HLD() for _ in range(10)]
        trash.append(trash)
        trash = None
        time.sleep(0.1)
        message_queue.put(u"Hello from write thread!")


read_thread = Thread(target=in_read_thread)
read_thread.name = "read_thread"
read_thread.daemon = True
read_thread.start()

write_thread = Thread(target=in_write_thread)
write_thread.name = "write_thread"
write_thread.daemon = True
write_thread.start()

while True:
    # Pause the main thread indefinitely (until the program is interrupted
    # by pressing control-C at the terminal). Use time.sleep() for this
    # rather than a lock or Thread.join(), because Python 2 has a bug
    # wherein signals such as KeyboardInterrupt can't be delivered
    # when the main thread is blocked on any kind of mutex.
    # This bug appears to have been fixed in Python 3.
    time.sleep(100)

If you run this program with the argument 0, it will simulate a system with an unbounded queue for log messages. If you run it with any positive argument, it will simulate a system with a strictly bounded queue for log messages.

Unfortunately, when the queue is bounded, it manifests a deadlock bug.

The HLD class models a type of object which sometimes logs a message from its finaliser (the __del__ method). This can happen in practical frameworks where, for example, exceptions get raised from buggy code in finalisers. We need to log these so we can find out about and fix the bugs.

The read_thread executes the function in_read_thread(). It accepts log messages from the log queue, saves them somewhere (in this case, stdout) and in the process allocates some garbage. The deliberate garbage allocations in this code are a simulation of doing something slightly complicated with the messages, like formatting them to put into a database through an ORM.

The write_thread executes the function in_write_thread(). It writes log messages to the log queue and allocates garbage, simulating the program doing real work and responding to user requests. Some of the garbage it allocates has finalisers that will emit log messages.

The punchline:

When the garbage collector kicks in, some HLD objects will be destroyed, have their finalisers run, and emit log messages.

If the garbage collector kicks in on the write_thread then nothing goes wrong: if it emits more messages at once than the read_thread can keep up with, the write_thread will simply block on the full queue and wait for the read_thread to catch up.

In the output from our simulation, this will tend to look like:

'write_thread' destroyed HLD(140364268959120)
'write_thread' destroyed HLD(140364268959056)
reader dequeued (u"'write_thread' destroyed HLD(140364268958032)\n")
'write_thread' destroyed HLD(140364268958992)
reader dequeued (u"'write_thread' destroyed HLD(140364268957968)\n")
'write_thread' destroyed HLD(140364268958928)
reader dequeued (u"'write_thread' destroyed HLD(140364268957904)\n")
'write_thread' destroyed HLD(140364268958864)
reader dequeued (u"'write_thread' destroyed HLD(140364268959120)\n")
'write_thread' destroyed HLD(140364268958800)
reader dequeued (u"'write_thread' destroyed HLD(140364268959056)\n")

If the garbage collector kicks in on the read_thread instead, then the system can deadlock when the queue is bounded. When the finaliser for an HLD object runs on the read_thread, it will try to log a message to the queue. The queue might, at this point, be full. If it is, then the read_thread will block inside the HLD object's finaliser. It can never be unblocked, because it would have to return from the finaliser in order to drain the logging queue and make room for the very message it is waiting for a chance to add!

In the output from our simulation (with MAX_SIZE == 4), this looks like:

reader dequeued (u'Hello from write thread!')
reader dequeued (u'Hello from write thread!')
reader dequeued (u'Hello from write thread!')
'read_thread' destroyed HLD(140364269227472)
'read_thread' destroyed HLD(140364269227408)
'read_thread' destroyed HLD(140364269227344)
'read_thread' destroyed HLD(140364269227280)
'read_thread' destroyed HLD(140364269227216)

And there it freezes, deadlocked.


The respite:

One way out of this conundrum is to move the message queue out of the process, into a separate process which does nothing but drain the queue and does not ever log messages itself.
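A minimal sketch of that arrangement, assuming the multiprocessing module and picklable string messages (the names drain, log_queue and drainer are illustrative, not from the program above):

```python
import multiprocessing
import sys


def drain(log_queue):
    # This process does nothing but write messages out. It never creates
    # objects with finalisers that log, so its own garbage collection can
    # never try to enqueue a message and deadlock on a full queue.
    while True:
        message = log_queue.get()
        sys.stdout.write(message)
        sys.stdout.flush()


if __name__ == "__main__":
    log_queue = multiprocessing.Queue(maxsize=4)
    drainer = multiprocessing.Process(target=drain, args=(log_queue,))
    drainer.daemon = True
    drainer.start()
    log_queue.put(u"Hello from the parent process!\n")
```

Threads in the parent still block when the queue fills, preserving backpressure, but nothing that runs in the drainer can ever log back into the queue.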

Another possibility is to change the method for adding messages to the queue to refrain from blocking if called from the read_thread. However, we're now back to worrying about how we can't buffer messages forever. If we go down this route, we'll either have to drop messages on the floor, or hope really hard that any messages that happen to get added by the logging thread itself don't happen to be large enough to exhaust available memory. This may be entirely practical; many programs don't ever allocate very many objects with finalisers.
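A sketch of that dropping variant, assuming a log() entry point and a reader_thread global that is pointed at the draining thread once it starts (both names are illustrative):

```python
try:
    import queue
except ImportError:
    import Queue as queue
from threading import current_thread

message_queue = queue.Queue(maxsize=4)
reader_thread = None  # set to the draining Thread once it is started


def log(message):
    if current_thread() is reader_thread:
        try:
            # Never block on the reader thread: drop the message if the
            # queue is full rather than deadlock inside a finaliser.
            message_queue.put_nowait(message)
        except queue.Full:
            pass  # dropped on the floor
    else:
        # Every other thread may block, applying backpressure as before.
        message_queue.put(message)
```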

A less practical idea is to make sure that signals and finalisers can't cause messages to be logged from the read_thread. In Python, it's easy enough to arrange for signals never to arrive on the thread which is draining the queue - simply make sure that the log message queue is drained from some thread that is not the 'main' thread, because Python only delivers signals to the 'main' thread. However, avoiding logging messages from finalisers is not so practical. If we have finalisers at all then eventually we'll probably find bugs in them, and if our logging system doesn't help us get the exception messages and tracebacks so we can try to fix them, then the logging system isn't actually doing us any good.
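The signal half of this is easy to demonstrate. The following sketch (POSIX-only; it assumes SIGUSR1 is available) raises a signal from a worker thread and records which thread the handler actually ran on:

```python
import os
import signal
import threading
import time

handled_on = []


def handler(signum, frame):
    # Record which thread the handler actually ran on.
    handled_on.append(threading.current_thread().name)


signal.signal(signal.SIGUSR1, handler)

# Raise the signal from a worker thread. CPython still runs the handler
# on the main thread, at its next bytecode boundary.
worker = threading.Thread(target=lambda: os.kill(os.getpid(), signal.SIGUSR1))
worker.start()
worker.join()
time.sleep(0.2)  # give the main thread a chance to run the handler
print(handled_on)
```

So as long as the queue is drained from a thread other than the main thread, a signal handler that logs can never block the drainer.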

One idea which will not work is simply adding more than one logging thread, in the hope that the second thread will always unblock the first. The big problem is that two (or more!) readers can both block if they each log a message at the wrong time. A smaller issue is that if having more than one thread do the actual writing-down of messages isn't a problem, why use a queue at all? We could just change the logging mechanism to write the messages down directly, without the queue.
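A sketch of that direct-write alternative (the names here are illustrative). Note the reentrant lock: if the garbage collector fires a finaliser while the current thread is already inside log(), the finaliser can re-acquire the lock instead of deadlocking on it, so there is no bounded resource left to block on.

```python
import os
import threading

# An RLock rather than a Lock: a finaliser triggered by GC on a thread
# that is already holding the lock can still log re-entrantly.
_write_lock = threading.RLock()


def log(message):
    with _write_lock:
        os.write(1, message.encode("utf-8"))
```

The trade-off is that a slow write now stalls whichever thread happens to be logging, instead of being absorbed by a queue.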