Friday, December 6, 2013

Properties on Python modules

Python properties let programmers abstract away get, set, and delete methods using simple attribute access without exposing getValue(), setValue(), and delValue() methods to the user. Normally, properties can only be added to classes as either an instance-level property, or a class-level property (few people use class-level properties; I've only used them once, and I had to build a ClassProperty object to have it). But in this post, you will find out how to create module-level properties, and where to get a package that offers transparent module-level property creation and use from Python 2.4 all the way up to early versions of 3.4.

Why do I want module-level properties in Python?


For many people, the desire to have module-level properties boils down to flexibility. In this case, I came up with this solution while talking with Giampaolo Rodola (author of pyftpdlib and psutil, current maintainer of asyncore and related Python libraries, ...) and hearing about psutil's use of module-level constants. Now, module-level constants aren't usually a big deal, but in this case, some of psutil's constants are actually relatively expensive to compute - expensive enough that Giampaolo was planning on deprecating the constants, deferring computation until the library user explicitly called the relevant compute_value() functions.

But with module properties, Giampaolo can defer calculation of those values until a program accesses the module's attributes, at which point the values can be computed (and cached as necessary). Even more useful: if you don't use those attributes, they never get computed, so most people will get a convenient but unexpected performance improvement any time they import psutil.
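
To make the "compute on first access, then cache" idea concrete, here is a minimal sketch on an ordinary class, since that is where properties work out of the box. The Stats class, the total_memory name, and the placeholder computation are all made up for illustration; none of this is psutil's code.

```python
class Stats(object):
    """Hypothetical stand-in for a module with an expensive constant."""

    def __init__(self):
        self._cache = {}
        self.computations = 0  # counts how often the real work runs

    @property
    def total_memory(self):
        # Compute on first access only, then reuse the cached value.
        if "total_memory" not in self._cache:
            self.computations += 1
            # Placeholder for the expensive computation.
            self._cache["total_memory"] = sum(range(1000))
        return self._cache["total_memory"]


stats = Stats()
first = stats.total_memory   # triggers the computation
second = stats.total_memory  # served from the cache
```

If nobody ever touches total_memory, the computation never runs, which is exactly the behavior we want from a module attribute.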

What doesn't work and why


The first time someone wants to use a module property, they will try to decorate a function in their module with the property() decorator like they are used to using on methods of classes. Unfortunately, when trying to access that "property", they discover that the property didn't get any of the descriptor magic applied to it, and they have a property object that doesn't do anything particularly useful.

The reason it doesn't do anything useful is that properties only work when they are attached to a class; Python does not apply descriptor magic to objects stored in an instance's dictionary. And during import/execution of a module, your definitions are being executed in the context of an instance dictionary of the module with no substantive post-processing. On the other hand, a typical Python class definition results in the body of the class being executed, then the results passed to type() (via the type(name, bases, dict) form) for class creation.
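
A small sketch of that difference, using a throwaway module object built with types.ModuleType. The names demo, answer, and Holder are only for illustration.

```python
import types


def answer(self):
    return 42


prop = property(answer)

# Stored in a module's instance dictionary: no descriptor lookup happens,
# so attribute access hands back the raw property object.
mod = types.ModuleType("demo")
mod.answer = prop
module_level = mod.answer


# Stored on a class: attribute access on an instance calls the getter.
class Holder(object):
    answer = prop


class_level = Holder().answer
```

Here module_level is the property object itself, while class_level is the value returned by the getter.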

Making it work


Someone who knows a bit more about how Python's internals are put together knows that you can muck with the contents of sys.modules, and that doing so during module import will let you replace the module object itself. So that's what we are going to do. Along the way, we're going to be doing a bit of deep magic, so don't be scared if you see something that you don't quite understand.

There are 5 major steps to make module properties work:
  1. Define your property
  2. Create a new type to offer unique properties for the module
  3. Ensure that the replacement module has access to the module namespace
  4. Fix up the module namespace and handle property definitions
  5. Replace the module in sys.modules

Our first two steps are easy. We can simply use the standard @property decorator (that we'll make work later) to create a property, and we define an empty class that subclasses object.

@property
def module_property(module):
    return "I work!", module

class Module(object):
    pass

Our third step is also easy: we just need to instantiate our replacement module and replace its __dict__ with the globals from the module we are replacing.

module = Module()
module.__dict__ = globals()
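
If the __dict__ swap looks suspicious, here is a tiny standalone sketch of what it does, using a plain dictionary in place of globals(). The namespace contents (VERSION, extra) are made up for the example.

```python
class Module(object):
    pass


# A plain dict standing in for the module's globals().
namespace = {"VERSION": "1.0"}

module = Module()
module.__dict__ = namespace

# Reads go through the shared dictionary...
version = module.VERSION

# ...and so do writes, in both directions.
module.extra = 1
namespace["added_later"] = True
```

The instance and the dictionary are now two views of the same namespace, which is why the replacement module can see everything the original module defined.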

Our fourth step also isn't all that difficult: we just need to go through the module's globals and pull out any properties that are defined. Generally speaking, we really want to pull out *any* descriptors, not just properties. But for this version, we'll extract only property instances.

for k, v in list(module.__dict__.items()):
    if isinstance(v, property):
        setattr(Module, k, v)
        del module.__dict__[k]

Note that when we move the properties from the module globals to the replacement module, we have to assign to the replacement module class, not the instance of the replacement module. Generally speaking, this kind of class-level function/descriptor assignment is frowned upon, but in some cases (like this), it is necessary in order to get the functionality that we want.
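
If you do want to handle descriptors beyond property, you need some rule for deciding what counts. The sketch below is one possible rule, and an assumption on my part rather than anything the code above does: treat a value as a descriptor if its type defines __get__, but skip plain functions, because every function is technically a descriptor and we don't want to turn all of the module's functions into methods. The is_module_descriptor and Constant names are hypothetical.

```python
import types


def is_module_descriptor(value):
    # Hypothetical helper, not part of any library.
    # Plain functions define __get__ too, so exclude them explicitly.
    if isinstance(value, types.FunctionType):
        return False
    # Look on the type, because that is where descriptor lookup happens.
    return hasattr(type(value), "__get__")


class Constant(object):
    """A tiny non-property descriptor used only for this illustration."""

    def __init__(self, value):
        self.value = value

    def __get__(self, instance, owner=None):
        return self.value


def plain_function():
    return None


checks = {
    "property": is_module_descriptor(property(lambda module: 1)),
    "custom descriptor": is_module_descriptor(Constant(3)),
    "plain function": is_module_descriptor(plain_function),
    "integer": is_module_descriptor(7),
    "descriptor class": is_module_descriptor(Constant),
}
```

With a helper like this, the isinstance(v, property) test in the loop could be swapped for is_module_descriptor(v).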

And our final step is actually pretty easy, but we have to remember to keep a reference to the original module, as standard module destruction includes the resetting of all values in the module to be equal to None.

import sys

# Keep the original module alive so its globals are not reset to None.
module._module = sys.modules[module.__name__]
module._pmodule = module
sys.modules[module.__name__] = module

And that is it. If you copy and paste all of the above code into a module with all of your module properties defined before our fourth step executes, then after the module is imported you will be able to reference any of your defined properties as attributes of the module. Note that if you want to access the properties from within the module, you need to reference them from the _pmodule global we injected.
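
To see all five steps run together without creating a file, here is a sketch that executes the same code inside a throwaway module. The mprop_demo name and the trick of feeding the source to exec() are just scaffolding for the demonstration; in real use the code simply lives at the bottom of your module.

```python
import sys
import types

SOURCE = '''
import sys

# Step 1: define the property.
@property
def module_property(module):
    return "I work!"

# Step 2: a new type to hold this module's properties.
class Module(object):
    pass

# Step 3: share the module namespace with the replacement.
module = Module()
module.__dict__ = globals()

# Step 4: move properties from the namespace onto the class.
for k, v in list(module.__dict__.items()):
    if isinstance(v, property):
        setattr(Module, k, v)
        del module.__dict__[k]

# Step 5: keep the original alive, then replace it in sys.modules.
module._module = sys.modules[module.__name__]
module._pmodule = module
sys.modules[module.__name__] = module
'''

# Register an empty module, then run the source in its namespace,
# roughly what the import system does for a real file.
original = types.ModuleType("mprop_demo")
sys.modules["mprop_demo"] = original
exec(SOURCE, original.__dict__)

replaced = sys.modules["mprop_demo"]
result = replaced.module_property
```

After this runs, sys.modules holds the replacement object, and reading module_property on it calls the getter instead of returning a property object.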

Where can I get a pre-packaged copy of this magic?


To save you (and me) from needing to copy/paste the above into every module where we want module properties, I've gone ahead and built a Python package for module properties. You can find it on GitHub, or you can find it on the Python Package Index. How do you use it? Very similar to what I defined above:

@property
def module_property(module):
    return "I work!", module

# after all properties are defined (put this at the end of the file)
import mprop; mprop.init()

Alternatively, if you don't want to remember to throw an mprop.init() call at the end, I've got a property work-alike that handles all of the magic:

from mprop import mproperty

@mproperty
def module_property(module):
    return "I also work!", module

And that's it. Module properties in Python. Enjoy :)
