Monday, March 7, 2011

Cross Process Locking and Synchronization in Python

Python includes all the bells and whistles for multi-threaded and multi-process locking and synchronization. However, the standard libraries lack some key features needed for true cross-process synchronization. We are going to take a trip through the Python source code (2.6.5 in our examples) to point out its shortcomings, then find a workaround to solve these problems.

First, a little primer. POSIX-compliant systems have the necessary APIs to let the kernel do the work for you. The standard tool for cross-process synchronization is the “named semaphore”. Because it has a name, more than one process can share the same semaphore (you have to be able to find it somehow), with all the blocking (or non-blocking) fun handled by the API. These same APIs are used by Python to handle locking and synchronization in both the threading and multiprocessing libraries. However, Python only wraps some of their functionality and leaves out the pieces you may need.
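To make the primer concrete, here is a rough sketch of what the raw named-semaphore API looks like when driven from Python via ctypes. This is purely illustrative, not something the standard library gives you; the semaphore name and the Linux-style library lookup are my own assumptions:

import ctypes
import ctypes.util
import os

# sem_* lives in librt/libpthread on Linux, plain libc elsewhere -- an assumption
libname = ctypes.util.find_library("rt") or ctypes.util.find_library("c")
libc = ctypes.CDLL(libname, use_errno=True)
libc.sem_open.restype = ctypes.c_void_p  # returns a sem_t *

# O_CREAT without O_EXCL: create the semaphore if it is missing,
# otherwise open the existing one by name ("/demo_sem" is made up)
sem = libc.sem_open(b"/demo_sem", os.O_CREAT, 0o600, 1)

libc.sem_wait(ctypes.c_void_p(sem))   # blocks until the semaphore is free
# critical section, shared with any process that opens "/demo_sem"
libc.sem_post(ctypes.c_void_p(sem))   # release
libc.sem_close(ctypes.c_void_p(sem))

Python uses these same calls internally, but as we are about to see, wraps only part of them.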

First, the threading module (/Python-2.6.5/Python/thread_pthread.h):

POSIX semaphores for legacy systems:
296         status = sem_init(lock,0,1);

POSIX thread mutexes for newer systems:
398         status = pthread_mutex_init(&lock->mut,
399                         pthread_mutexattr_default);

Depending on your system, one of these locking mechanisms backs the Lock, RLock, Event, Condition, and Semaphore classes. Essentially, all of these classes are identical underneath: each rests on a single unnamed, process-private binary primitive (a POSIX semaphore or mutex). This disqualifies them from any cross-process synchronization; since they are not named, we cannot open them from another process. It is also useful to note that they handle their own counting internally in pure Python (/Python-2.6.5/Lib/threading.py). That means all lock-acquisition counting for re-entrant locks and non-binary semaphores in the threading library happens in user space in Python, not in kernel space.
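To illustrate what “counting in user space” means, here is a simplified sketch (my own, not the actual threading.py code) of how a re-entrant lock can be built from a single binary lock plus pure-Python bookkeeping:

import threading

class MiniRLock(object):
    """Sketch of an RLock: one kernel-backed binary lock,
    with owner/count bookkeeping done entirely in Python."""

    def __init__(self):
        self._block = threading.Lock()   # the unnamed binary primitive
        self._owner = None
        self._count = 0

    def acquire(self):
        me = threading.current_thread()
        if self._owner is me:
            self._count += 1             # re-entry counted in user space
            return
        self._block.acquire()            # first entry blocks in the kernel
        self._owner = me
        self._count = 1

    def release(self):
        if self._owner is not threading.current_thread():
            raise RuntimeError("cannot release un-acquired lock")
        self._count -= 1
        if self._count == 0:
            self._owner = None
            self._block.release()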

Now the multiprocessing module (/Python-2.6.5/Modules/_multiprocessing/semaphore.c):

Straight up POSIX semaphores only here:
195 #define SEM_CREATE(name, val, max) sem_open(name, O_CREAT | O_EXCL, 0600, val)
.
.
.
439     handle = SEM_CREATE(buffer, value, maxvalue);

This looks more promising. It appears we can set the value, and it takes a name. Just like in the threading module, the Lock, RLock, Event, Condition, and Semaphore classes in the multiprocessing module all depend on this one central primitive. However, upon further review, the naming of semaphores is not exposed by the Python multiprocessing interface, and opening an existing name is expressly forbidden by the flags passed to the sem_open() library call in the Python source.

From the man pages:
If O_EXCL and O_CREAT are set, sem_open() fails if the semaphore name exists.

From the Python source:
195 #define SEM_CREATE(name, val, max) sem_open(name, O_CREAT | O_EXCL, 0600, val)

This is fine if we stay within the bounds of the multiprocessing library, using the Process class to fork() new processes. Since they are all related, they will have access to the same Lock, and there is no need to know the internal name the multiprocessing module gave your semaphore. This is great if you plan on spawning new processes off of existing ones.
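For completeness, here is a minimal example of that supported path, where a Lock created in the parent is inherited by the forked children (the worker function and names are my own):

from multiprocessing import Process, Lock

def worker(lock, n):
    lock.acquire()      # same underlying semaphore, inherited across fork()
    try:
        print "process %d has the lock" % n
    finally:
        lock.release()

if __name__ == "__main__":
    lock = Lock()
    procs = [Process(target=worker, args=(lock, n)) for n in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()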

However, if you have two or more independent processes (all Python, or a mix of languages), how do you synchronize resources between them? Python doesn’t expose named semaphores, and rolling your own takes a lot of work. Maybe another kernel-controlled synchronization primitive will do. That leaves us with file locks. We can use them for our purposes without ever doing disk reads or writes.

Python exposes them through the fcntl module:

import fcntl

class Lock:

    def __init__(self, filename):
        self.filename = filename
        # Opening in write mode creates the file if it does not exist already
        self.handle = open(filename, 'w')

    # Pass fcntl.LOCK_EX | fcntl.LOCK_NB here if you need a non-blocking lock
    def acquire(self):
        fcntl.flock(self.handle, fcntl.LOCK_EX)

    def release(self):
        fcntl.flock(self.handle, fcntl.LOCK_UN)

    def __del__(self):
        # Closing the file also drops any lock still held on it
        self.handle.close()

# Usage
lock = Lock("/tmp/lock_name.tmp")
lock.acquire()
try:
    # Do important stuff that needs to be synchronized
    pass
finally:
    lock.release()

As long as all your processes use that class for locking, you will be fine. I find it works great for sharing resources between Django, APIs, and regular Python processes. If you need to mix languages, just use flock() with the same file name, and set it to blocking or non-blocking accordingly. It is also very efficient: the file is created only when the Lock is first instantiated, and the locking itself is handled by the kernel, so there are no disk reads or writes after instantiation.
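For example, a non-blocking variant might look like this (reusing the same /tmp/lock_name.tmp file as above):

import fcntl

handle = open("/tmp/lock_name.tmp", "w")
try:
    # LOCK_NB makes flock() raise IOError (EWOULDBLOCK) immediately
    # instead of blocking when another process holds the lock
    fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    try:
        # Do important stuff that needs to be synchronized
        pass
    finally:
        fcntl.flock(handle, fcntl.LOCK_UN)
except IOError:
    print "lock is busy, try again later"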

Just remember that file locks are advisory: every program must use flock() for them to work. They are also local, so if your lock file lives on an NFS mount, it will only coordinate processes on the same system.

If you are interested in IPC and Python, I would suggest checking out POSIX IPC for Python. It exposes all kinds of goodness you may find useful, like shared memory and named semaphores. However, at the time of this writing it is at version 0.9.0 and still a little beta. I preferred to create a solution that is simple and uses the well-tested, proven Python standard libraries.
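If you do go that route, a named semaphore shared between unrelated processes might look something like this (a sketch against the posix_ipc API as I understand it; the semaphore name is made up):

import posix_ipc

# O_CREAT without O_EXCL: create the semaphore on first use,
# otherwise open the existing one by name
sem = posix_ipc.Semaphore("/my_app_lock", posix_ipc.O_CREAT,
                          initial_value=1)
sem.acquire()
try:
    # Do the synchronized work
    pass
finally:
    sem.release()
sem.close()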

If you need to scale your locking past one system, I would suggest checking out Elock. It’s a distributed lock server written in Erlang, so you know it can scale. The administrative guide can be found on the main page, and you can find a client example in Ruby here and in Python using Twisted here.

I would also suggest checking out this example that uses memcache. Great for distributed systems.
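The idea there is simple enough to sketch: memcached’s add() is atomic and only succeeds when the key does not exist yet, which gives you a test-and-set across machines. This sketch assumes the python-memcached client and a local memcached; the key name, TTL, and polling interval are made up:

import time
import memcache

mc = memcache.Client(["127.0.0.1:11211"])

def acquire(name, ttl=30):
    # add() fails if the key already exists, so only one
    # client at a time can create the lock key
    while not mc.add("lock:" + name, 1, time=ttl):
        time.sleep(0.1)   # someone else holds the lock; poll

def release(name):
    mc.delete("lock:" + name)

acquire("shared_resource")
try:
    # Do the synchronized work
    pass
finally:
    release("shared_resource")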
