Sunday, March 13, 2005

Python decorators

I'm getting back into Python; in particular, I took the plunge and upgraded to 2.4.

One of the new features in 2.4 is decorators: a decorator is a function which wraps a function, adding extra behaviour. The name comes from a well-established design pattern, although it's slightly misapplied in Python; strictly speaking, a classic GoF Decorator adds responsibility without changing the interface, while a Python decorator may change both behaviour and interface.

Really though, decorators were always possible in Python:

def func(...):
    # Function body.
    pass

func = decorator(func)
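
As a concrete illustration (a made-up traced decorator, purely to show the shape of the thing):

def traced(func):
    # Return a wrapper which logs the call, then delegates to func.
    def wrapper(*args, **kwargs):
        print 'calling', func.__name__
        return func(*args, **kwargs)
    return wrapper

def greet(name):
    print 'hello,', name

greet = traced(greet)   # rebind the name to the wrapped version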


In earlier versions of Python, the classmethod and staticmethod builtins were used just like this.
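
For instance, before 2.4 a class method had to be wrapped up by hand after its definition (a throwaway example):

class Shape(object):
    def describe(cls):
        return 'I am a %s' % cls.__name__
    describe = classmethod(describe)   # pre-2.4 style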

What 2.4 adds is syntax. The decorator syntax was thrashed out in one of the longest and bloodiest debates on python-dev — second only to the great Ternary Operator War of 2004 — which eventually led to Guido (Python's BDFL) pronouncing that the syntax should be:

@decorator
def func(...):
    # Function body.
    pass


And look: in 2.4, the classmethod and staticmethod builtins are decorators and can be applied with the @-syntax. What the @-syntax brings to the table is brevity (no repetition of the function name) and better scoping (the decorator is applied at, not after, the function definition).
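
With the new syntax, the same throwaway example becomes:

class Shape(object):
    @classmethod
    def describe(cls):
        return 'I am a %s' % cls.__name__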

The best current documentation for decorators is the relevant section of the Python 2.4 What's New; there's also a library of decorator examples forming in the Python wiki, and a number of Python Cookbook recipes for decorators.

I thought I'd have a play with decoration to attack a problem: expressing prerequisites. Say I have a pile of data — track metadata extracted from iTunes — and a pile of functions which validate various aspects of that data and which need to be applied in a certain order: for example, it may not be meaningful to validate disc numbers until the disc counts have been validated.

The blunt way to do this is to hard-code a list of functions that obey the constraints of the prerequisites. But it'd be nice to be able to plug in new functions without having to manually update that list; Python's introspection capabilities make plugin architectures easy.
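
A rough sketch of what I mean by that: assuming the validators all live together in one module, a hypothetical find_validators could round them up without any hand-maintained list.

import inspect

def find_validators(module):
    # Pick up every function defined directly in the module, so a
    # newly written validator is discovered automatically.
    return [func for name, func in inspect.getmembers(module, inspect.isfunction)
            if func.__module__ == module.__name__]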

So, step 1: add a prereqs attribute to each function. (In Python, functions are objects, so they can — and do — carry attributes.) The What's New gives an example of this:

>>> def deco(func):
...     func.attr = 'decorated'
...     return func
...
>>> @deco
... def f(): pass
...
>>> f
<function f at 0x402ef0d4>
>>> f.attr
'decorated'


The decorator takes a function as argument and returns a function as result. But I want to be able to provide a list of prerequisites as arguments to my decorator. This is funkier: a decorator which takes arguments is really a factory, a function which returns the actual decorator to apply. As the What's New says: "getting this right can be slightly brain-bending". Here's my decorator:

def prereqs(*args):
    def deco(func):
        func.prereqs = args
        return func
    return deco


Two notable Pythonisms at work here: firstly, the *args syntax collects all the positional arguments into a tuple; and secondly, the nested def defines a new function on each invocation of the containing function. This dynamic definition of functions is a common Python idiom.

Here are some functions with prerequisites applied:

def f1(): pass

@prereqs(f1)
def f2(): pass

@prereqs(f1)
def f3(): pass

@prereqs(f2, f3)
def f4(): pass
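
A quick poke at the interpreter confirms the attributes are in place (addresses will vary, of course):

>>> f4.prereqs
(<function f2 at 0x0115B9F0>, <function f3 at 0x0115BA30>)
>>> hasattr(f1, 'prereqs')
False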


Step 2: sort these functions into a list ordered by prerequisites. This is in fact a topological sort of a directed acyclic graph, and the method for sorting a DAG is well-known: run successive depth-first searches, appending each vertex to the output once all of its prerequisites have been visited.

def sort_prereqs(funcs):

    # Depth-first search, accumulating in result.
    def depth_first(func, result):
        if not func.visited:
            func.visited = True
            try:
                for prereq in func.prereqs:
                    depth_first(prereq, result)
            except AttributeError:
                pass
            result.append(func)

    # Mark all vertices as unvisited.
    for func in funcs:
        func.visited = False

    # Perform depth-first search from all vertices.
    result = []
    for func in funcs:
        depth_first(func, result)

    return result


The try... except block handles the case where a function has no prerequisites listed, as is the case for f1 above. Here are the results of applying this sort to a misordered list of the example functions defined above:

>>> sort_prereqs([f2, f4, f1, f3])
[<function f1 at 0x0115B9B0>, <function f2 at 0x0115B9F0>, <function f3 at 0x0115BA30>, <function f4 at 0x0115BA70>]


Mildly cool. I was lukewarm about decorators during the arguments: the @-syntax struck me as just so much syntactic sugar. But it really does make decorators easier to apply; and if they're easier to use, they're more likely to be added to a developer's personal toolbox.

Finally, a note: this has been a noddy example. But dealing with prerequisites is a problem I've faced in real production code. How to determine the initialisation order of modules, some of which are dependent on others? In the past I've used both hard-coded orders and dynamic ordering with explicitly-stated dependencies, as shown here. While the dynamic solution cost more initially, it caused a lot fewer maintenance problems in the long run.
