Patterns in Python

Author: Duncan Booth Contact: duncan@rcp.co.uk

Abstract

What design patterns are applicable to Python? Some patterns are an intrinsic part of Python; others require some careful coding to get the best from them. What new patterns appear in Python?

Contents

  • 1   What is a pattern?
    • 1.1   A word of warning
  • 2   Creational Patterns
    • 2.1   Factory
    • 2.2   Singleton (and the Borg)
      • Intent
  • 3   Structural Patterns
    • 3.1   Flyweight
      • Intent
  • 4   Behavioural Patterns
    • 4.1   Observer
      • Intent
    • 4.2   Iterators and Generators
      • Intent
      • Generators
      • itertools
    • 4.3   Command Dispatch Pattern
  • 5   Little patterns in Python
  • 6   Conclusion
  • 7   References

1   What is a pattern?

The definitive reference book Design Patterns [GoF] describes a set of patterns for object-oriented software design. This book is often referred to as the 'Gang of Four' book (or even GoF) after the four authors (Gamma, Helm, Johnson and Vlissides).

A design pattern describes a problem that occurs over and over again, together with the core of a solution to that problem, in such a way that the solution may be used in many different ways. That last point is important: when we talk about design patterns in software, we aren't talking about things that can be neatly tied up into a class or library implementation and just used; we are talking about techniques that are applied in different ways. Recognising when the same technique is being used in a different context allows us to apply our experience across a much wider domain.

The GoF introduced a pattern vocabulary to the software community. Each pattern as they describe it has:

  1. A pattern name which is a handle used to describe a design problem. Using a name lets us have a common vocabulary with other software developers.
  2. The problem describes when to apply a pattern.
  3. The solution describes the elements that make up the design, their relationships, responsibilities and collaborations.
  4. The consequences are the results and trade-offs of applying the pattern. Recognising the consequences of applying a pattern in one situation lets us better evaluate how appropriate the pattern may be in another.

Design Patterns gives outline implementations of patterns in C++ (and a supplementary book translated many of them into Smalltalk). However, it is apparent that while some patterns are largely independent of the language in which they are implemented, others either become inappropriate in another language, or virtually disappear.

This paper looks at a few of the common patterns, identifies what if anything is their equivalent in Python, and also considers whether Python has its own patterns different than the C++ patterns.

Others have looked at how Design Patterns relate to Python, most notably Vespe Savikko [VS], and Alex Martelli [AM], but as Python evolves, the ways you can implement these patterns are changing.

1.1   A word of warning

I was reading an article by Ron Jeffries [RJ] recently where he wrote:

Small Boy with a Patterns Book

After spending a bunch of time thinking about these ideas, over a few days now, I finally recognized in myself what I call "Small Boy with a Patterns Book". You can always tell when someone on your team is reading the Gang of Four book (Gamma, et al., Design Patterns). Every day or so, this person comes in with a great idea for a place in the system that is just crying out for the use of Composite, or whatever chapter he read last night.

There's an old saying: To a small boy with a hammer, everything looks like a nail. As programmers, we fall into the same trap all too often. We learn about some new technology or solution, and we immediately begin seeing places to apply it.

Patterns are useful, but they can also be addictive. Try not to overuse them.

2   Creational Patterns

The GoF identified several creational patterns. These patterns abstract the process of instantiating objects.

2.1   Factory

The most fundamental of patterns identified by the GoF are probably the Factory and Abstract Factory. The Factory pattern in a language such as C++ wraps the usual object creation syntax new someclass() in a function or method which can control the creation. The advantage of this is that the code using the class no longer needs to know all of the details of creation. It may not even know the exact type of object it has created. In other words it reduces the dependencies between modules.

A more advanced form of factory (Abstract Factory) provides the extra indirection to let the type of object created vary.

The factory pattern is fundamental in Python: where other languages use special syntax to indicate creation of an object, Python uses function call syntax as the (almost) only way to create any object: some of the builtin types such as int, str, list, and dict, have their own special syntax, but they all support factory construction as well.

Moreover, Python uses abstract factories for everything. The dynamic nature of the system means that any factory may be overridden.

For example, consider the following code:

import random

def listOfRandom(n):
    return [random.random() for i in range(n)]

At first sight it looks as though this function will return a list of n pseudo-random numbers. However, you can reassign random at the module level, and make it return anything you wish. Although at first this may sound like a crazy thing to do, in fact it is one of the reasons why Python is such a great language for writing unit tests. It is hard to write an automated test for a function with a pseudo-random result, but if you can temporarily replace the random number generator with a known, repeatable sequence, you can have repeatable tests. Python makes this easy.
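The replacement trick can be sketched as follows for current Python 3. The names list_of_random and fake_random are illustrative, and unittest.mock is a later standard library addition rather than something this paper relies on; the same effect can be had by assigning to random.random by hand and restoring it afterwards.

```python
import random
from unittest import mock


def list_of_random(n):
    """Return a list of n pseudo-random floats."""
    return [random.random() for _ in range(n)]


def fake_random(values):
    """Build a stand-in for random.random that replays a fixed sequence."""
    iterator = iter(values)
    return lambda: next(iterator)


# Temporarily rebind random.random; the original is restored on exit.
with mock.patch.object(random, "random", fake_random([0.1, 0.2, 0.3])):
    patched_result = list_of_random(3)

# Outside the block the real generator is back in place.
restored_result = list_of_random(3)
```

Inside the with block the function under test sees the known sequence, so a test can assert on exact values.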

It is hard to say whether this really counts as a pattern in Python at all. At one level it is basic to the language, and does not involve actual code. On the other hand, the pattern is so well known that it is important to acknowledge that it corresponds to the Factory pattern.

Python 2.2 introduced a new way to control object creation. New-style objects (where object is the base class) allow a method __new__ to control the actual creation of the object. This may be seen as another form of the factory pattern, and one where there is actual Python code to implement it. We see more of this in the next section.

2.2   Singleton (and the Borg)

Intent

Ensure a class has only one instance and provide a global point of access to it.

One of the first patterns many programmers learn to identify as a pattern is the 'Singleton'. This is a pity, as in many ways it is rather more of an anti-pattern.

A singleton is an object which can only be instantiated once in a process. Not of course an object which you only happen to instantiate once, but rather an object which will resist all attempts to create multiple instances.

The singleton pattern is often used for a class controlling an application's access to a database, a network link to a server, a connection to the computer's registry and so on. This is a poor use of singleton. The application may only require a single database, but it isn't a requirement that there can only be one database connection. Maybe one day it will evolve into an application with two databases, so why write code to prevent that?

A more significant drawback of singleton is that it breaks testing. Unit tests often work by creating mock objects that look similar to real objects but have dummy implementations. If your code has built brick walls protecting that 'database' instance, then it becomes harder, or even impossible, to temporarily stub it out (although very little is completely impossible in Python). Test driven development very quickly leads you to abandon large singletons.

Nevertheless, should you require it, it is easy to implement the singleton pattern in Python:

>>> class Singleton(object):
        _instance = None
        def __new__(cls, *args, **kwargs):
            if not cls._instance:
                cls._instance = super(Singleton, cls).__new__(
                    cls, *args, **kwargs)
            return cls._instance


>>> class C(Singleton):
        pass


>>> class D(Singleton):
        pass


>>> c = C()
>>> d = C()
>>> id(c), id(d)
(10049912, 10049912)
>>> e = D()
>>> f = D()
>>> id(e)
10113672
>>> id(f)
10113672
>>> g = C()
>>> id(g)
10049912
>>>

This example creates a new mixin class Singleton. Each new subclass creates one instance and thereafter returns that instance. Further subclassing of C or D could be confusing, but by checking the type returned we can avoid the obvious errors.
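One way to make further subclassing less confusing is sketched below for current Python 3. It is an illustrative variant, not the listing above: the cached instance is looked up only in the class's own namespace, so a subclass of C gets its own instance rather than inheriting C's. The class name E is hypothetical.

```python
class Singleton:
    """Mixin that gives every subclass its own single instance."""

    def __new__(cls, *args, **kwargs):
        # Look only in this class's own namespace, so a subclass never
        # picks up the instance cached by one of its parents.
        instance = cls.__dict__.get("_instance")
        if instance is None:
            # object.__new__ is called without the extra arguments.
            instance = super().__new__(cls)
            cls._instance = instance
        return instance


class C(Singleton):
    pass


class E(C):  # further subclassing of C
    pass


c1, c2, e1 = C(), C(), E()
```

Here c1 and c2 are the same object, while e1 is a separate object of type E.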

It has been noted elsewhere [AM] that the requirement driving people to use the Singleton is not a requirement for a single instance at all. Rather it is a need for shared state. This led Python programmers to invent one of the few genuine Python patterns with a name: the Borg.

The Borg pattern allows multiple class instances, but shares state between instances so the end user cannot tell them apart. Here is the Borg example from the Python Cookbook:

class Borg:
    __shared_state = {}
    def __init__(self):
        self.__dict__ = self.__shared_state
    # and whatever else you want in your class -- that's all!

Problems with the Borg as a pattern start when you begin to write 'new style' classes. The __dict__ attribute is not always assignable but, worse, any attributes defined within __slots__ will simply not be shared. The Borg is cool, but it isn't your friend.
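The __slots__ problem can be seen in a small sketch, written for current Python 3. The subclass SlottedBorg and the attribute names are hypothetical, and the shared dictionary is named _shared_state (single underscore) purely to keep the example easy to read.

```python
class Borg:
    _shared_state = {}

    def __init__(self):
        # Every instance uses the same attribute dictionary.
        self.__dict__ = self._shared_state


class SlottedBorg(Borg):
    # 'serial' lives in a slot, outside __dict__, so it is not shared.
    __slots__ = ("serial",)


a, b = Borg(), Borg()
a.setting = "assimilated"      # visible through b as well

s1, s2 = SlottedBorg(), SlottedBorg()
s1.serial = 1
s2.serial = 2                  # does not overwrite s1.serial
```

Ordinary attributes such as setting are shared by every instance, but each SlottedBorg keeps its own serial.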

There is another implementation of Singleton which is even simpler than the one given above. In fact, I'm sure every Python programmer has used this method, although many of them may have failed to recognise the Singleton pattern within it.

Consider this file (singleton.py):

"""This module implements the singleton pattern"""

Simple, isn't it? You can access the singleton object using the import statement. You can set and access attributes on the object. Obviously in real life you might want a few methods, or some initial values, so just put them in the module.

Python modules are Singleton instances: another case of Python taking a design pattern and making it a fundamental part of the language.
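A quick way to see this behaviour, sketched for current Python 3: the module name singleton_demo is hypothetical, and the module is registered by hand in sys.modules only so that the example needs no file on disk. With a real singleton.py an ordinary import statement behaves the same way.

```python
import importlib
import sys
import types

# Register a throwaway module so the example needs no file on disk.
module = types.ModuleType(
    "singleton_demo", "This module implements the singleton pattern")
sys.modules["singleton_demo"] = module

# Importing twice hands back the very same module object.
first = importlib.import_module("singleton_demo")
second = importlib.import_module("singleton_demo")

first.counter = 1       # set an attribute through one reference
second.counter += 1     # and update it through the other
```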

3   Structural Patterns

3.1   Flyweight

Intent

Use sharing to support large numbers of fine-grained objects efficiently.

This is related to the singleton pattern. Whereas with singleton we wanted exactly one instance of an object, in some cases we need very many instances but not all of the objects need to be distinct.

For example, consider an application that handles stock market prices. Perhaps we have several portfolios, each of which contains a large number of underlying stock instruments. Each instrument holds some data (current and recent prices, daily high and low, etc.), but this data is common to the instrument wherever it is used. Each portfolio might record the amount of each instrument held, the date purchased, and the price at which it was purchased.

We have a choice here. We could store the portfolio specific data inside each instrument, but then instrument instances cannot be shared between portfolios. If we store that data as part of the portfolio, then the instrument instances can be shared:

# Model a financial instrument
# The instrument class represents a financial instrument,
# with updates arriving from some network source.
#
# N.B. As an example, this code is not threadsafe.
# Real code might have to handle asynchronous updates to data.
#
import weakref

class Instrument(object):
    _InstrumentPool = weakref.WeakValueDictionary()

    def __new__(cls, name):
        '''Instrument(name)
        Create a new instrument object, or return an existing one'''
        obj = Instrument._InstrumentPool.get(name, None)

        if not obj:
            print "new",name
            obj = object.__new__(cls)
            Instrument._InstrumentPool[name] = obj

        return obj

    def __init__(self, name):
        '''Complete object construction'''
        self.name = name
        print "New instrument @%04x, %s" % (id(self), name)

        # ... connect instrument to datasource ...

import unittest

class InstrumentTests(unittest.TestCase):
    def testInstrument(self):
        ibm1 = Instrument("IBM")
        ms = Instrument("MS")
        ibm2 = Instrument("IBM")
        self.assertEquals(id(ibm1), id(ibm2))
        self.assertNotEquals(id(ibm1), id(ms))

        self.assertEquals(2, len(Instrument._InstrumentPool),
            "Total instruments allocated")

        # This bit assumes CPython memory allocation:
        del(ibm1)
        del(ibm2)
        self.assertEquals(1, len(Instrument._InstrumentPool),
            "Total instruments allocated")

if __name__=='__main__':
    unittest.main()

This code shows a simple way to create objects which share their state if they are created with compatible parameters. The two IBM objects are in fact only one object, but the MS object is separate.

If we run this code with the 'print' statements then we can see that although we only create two objects, the __init__ constructor is called all three times. This could be useful, for example, if we want the Instrument class to generate events to some Portfolio class further up the line.

D:/accu>instrument.py
new IBM
New instrument @7bd240, IBM
new MS
New instrument @7a80b8, MS
New instrument @7bd240, IBM
.
----------------------------------------------------------------------
Ran 1 tests in 0.020s

OK

The weakref dictionary ensures that when nothing is actively using a particular instrument the storage for it may be automatically released. The actual behaviour of this may vary somewhat: for example, Jython (the Java implementation of Python) wouldn't actually release unused instruments until a garbage collection cycle.
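The portfolio side of the design, which the listing above leaves out, might look something like the following sketch for current Python 3. The Holding and Portfolio classes, the last_price attribute, and all quantities, dates and prices are invented for illustration; the Instrument class here is a trimmed-down stand-in for the one above.

```python
import weakref
from dataclasses import dataclass
from datetime import date


class Instrument:
    """Shared (intrinsic) state: one object per instrument name."""
    _pool = weakref.WeakValueDictionary()

    def __new__(cls, name):
        obj = cls._pool.get(name)
        if obj is None:
            obj = super().__new__(cls)
            obj.name = name
            obj.last_price = None   # would be fed by a data source
            cls._pool[name] = obj
        return obj


@dataclass
class Holding:
    """Per-portfolio (extrinsic) state kept outside the flyweight."""
    instrument: Instrument
    quantity: int
    purchased: date
    purchase_price: float


class Portfolio:
    def __init__(self):
        self.holdings = []

    def buy(self, name, quantity, purchased, price):
        self.holdings.append(
            Holding(Instrument(name), quantity, purchased, price))


growth, income = Portfolio(), Portfolio()
growth.buy("IBM", 100, date(2003, 1, 6), 80.5)
income.buy("IBM", 25, date(2003, 2, 3), 78.0)

# A price update on the shared instrument is visible from both portfolios.
Instrument("IBM").last_price = 81.25
```

Each portfolio keeps its own quantity and purchase details, while both refer to a single IBM instrument object.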

4   Behavioural Patterns

4.1   Observer

Intent

Define a one-to-many dependency between objects so that when one object changes state, all its dependents are notified and updated automatically.

Observer is one of the patterns I find myself using over and again, but until recently I never felt completely happy with my Python implementations of it.

You have two classes, the subject and an observer which registers itself with the subject and receives notification callbacks when data changes in the subject. I find the GoF form of this pattern somewhat limiting, because they describe a system where a subject class has a general 'notify' method used for everything (although they do suggest a mechanism for generating more selective events using aspects).

The implementation described here uses a more general and, I believe, cleaner form of event generation which is based (loosely) on the event structure from Microsoft's .NET framework. It is best described starting with the intended use.

The actual example is taken from the GoF book, although of course the implementation is not. ClockTimer is a subject for storing and maintaining the time of day. It notifies its Observers every second. ClockTimer provides the interface for retrieving individual time units such as the hour, minute and second:

class ClockTimer:
    def GetHour(self):
        return self._hour
    def GetMinute(self):
        return self._minute
    def GetSecond(self):
        return self._second

    TickEvent = Event()
    def OnTick(self):
        ClockTimer.TickEvent.call(self, self.GetHour(),
            self.GetMinute(), self.GetSecond())

    def Tick(self):
        # update internal time-keeping state
        # ...
        self.OnTick()

The Tick method gets called by an internal timer at regular intervals. It updates the internal state and calls the OnTick method to notify observers of the change.

The OnTick method fires the event with whatever parameters seem appropriate. Firing the event indirectly in this way allows subclasses to override the event handling.

Although we have a single Event in this class, the implementation allows for any number of different events to be defined.

Now, we can define a class DigitalClock that displays the time:

class DigitalClock(Widget):
    def __init__(self, clockTimer):
        self.__subject = clockTimer
        clockTimer.TickEvent += self.Update

    def close(self):
        self.__subject.TickEvent -= self.Update

    def Update(self, subject, hour, min, sec):
        self.displayedTime = (hour, min, sec)
        self.Draw()

    def Draw(self):
        # draw the digital clock
        pass

N.B. We need an explicit close method to be called on this object because there is a circular dependency (ClockTimer contains a reference to the Update method of the DigitalClock instance, and the DigitalClock instance stores a reference to the clockTimer). This means that a __del__ method would never be called. In cases where this could be a problem, one solution would be to define a WeakMethod class that simulates a bound method but only holds a weak reference to the instance.
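Later versions of Python (3.4 onwards) ship such a class as weakref.WeakMethod. The following sketch, for current Python 3, shows how it behaves; the Display class is a hypothetical stand-in for DigitalClock, and the example is not wired into the Delegate plumbing.

```python
import gc
import weakref


class Display:
    def __init__(self):
        self.shown = []

    def update(self, hour, minute, second):
        self.shown.append((hour, minute, second))


display = Display()
# Holds the bound method without keeping 'display' alive.
weak_update = weakref.WeakMethod(display.update)

callback = weak_update()        # a live bound method, or None
callback(12, 30, 0)
shown_while_alive = list(display.shown)

del callback, display           # drop the last strong references
gc.collect()                    # for implementations without refcounting
dead_callback = weak_update()   # the instance is gone, so this is None
```

A subject that stores weak_update rather than the bound method itself no longer keeps its observers alive, at the cost of having to check for None before each call.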

The plumbing that allows this to work is as follows:

class Delegate:
    '''Handles a list of methods and functions
    Usage:
        d = Delegate()
        d += function    # Add function to end of delegate list
        d(*args, **kw)   # Call all functions, returns a list of results
        d -= function    # Removes last matching function from list
        d -= object      # Removes all methods of object from list
    '''
    def __init__(self):
        self.__delegates = []

    def __iadd__(self, callback):
        self.__delegates.append(callback)
        return self

    def __isub__(self, callback):
        # If callback is a class instance,
        # remove all callbacks for that instance
        self.__delegates = [ cb
            for cb in self.__delegates
                if getattr(cb, 'im_self', None) != callback]

        # If callback is callable, remove the last
        # matching callback
        if callable(callback):
            for i in range(len(self.__delegates)-1, -1, -1):
                if self.__delegates[i] == callback:
                    del self.__delegates[i]
                    return self
        return self

    def __call__(self, *args, **kw):
        return [ callback(*args, **kw)
            for callback in self.__delegates]

The delegate class maintains a list of callbacks (so we can have several observers for a single subject). The only operations supported on a delegate are to add a function, remove a function or call all of the functions in the delegate. The callback functions are stored in order (so first added is also first called), and removed in last in/first out order.

We could create the Delegate instances in __init__, but there is a potential drawback to this. If we created a class that could fire many events, but whose events mostly went unused, we would have a lot of delegates created for no reason. The Event class below creates delegates only when they are needed, and the indirect call used in the subject class supports this:

class Event(property):
    '''Class event notifier
    Usage:
        class C:
            TheEvent = Event()
            def OnTheEvent(self):
                self.TheEvent(self, context)

        instance = C()
        instance.TheEvent += callback
        instance.OnTheEvent()
        instance.TheEvent -= callback
    '''
    def __init__(self):
        self.attrName = attrName = "__Event_" + str(id(self))
        def getEvent(subject):
            if not hasattr(subject, attrName):
                setattr(subject, attrName, Delegate())
            return getattr(subject, attrName)
        super(Event, self).__init__(getEvent)

    def call(self, subject, *args, **kw):
        if hasattr(subject, self.attrName):
            getattr(subject, self.attrName)(subject, *args, **kw)

Within the ClockTimer class a reference to instance.TickEvent will create the Delegate. The Delegate could be called using self.TickEvent(args), but this would always create it. By calling it instead using ClockTimer.TickEvent.call(args) we avoid doing this unnecessarily.
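For readers on current Python 3, the same lazy-creation idea can be sketched with the descriptor protocol. This is an illustrative variant under stated assumptions, not the implementation above: it implements __get__ and __set__ directly instead of subclassing property (augmented assignment such as timer.TickEvent += cb performs a set as well as a get), it drops the "remove all methods of an object" feature, and the names tick, on_tick and seen are hypothetical.

```python
class Delegate:
    """Ordered list of callbacks that can be called as one."""

    def __init__(self):
        self._callbacks = []

    def __iadd__(self, callback):
        self._callbacks.append(callback)
        return self

    def __isub__(self, callback):
        # Remove the most recently added matching callback, if any.
        for i in range(len(self._callbacks) - 1, -1, -1):
            if self._callbacks[i] == callback:
                del self._callbacks[i]
                break
        return self

    def __call__(self, *args, **kw):
        return [cb(*args, **kw) for cb in self._callbacks]


class Event:
    """Descriptor that creates a per-instance Delegate on first access."""

    def __set_name__(self, owner, name):
        self._attr = "_event_" + name

    def __get__(self, subject, owner=None):
        if subject is None:
            return self              # accessed on the class itself
        if self._attr not in subject.__dict__:
            subject.__dict__[self._attr] = Delegate()
        return subject.__dict__[self._attr]

    def __set__(self, subject, value):
        # 'subject.TickEvent += cb' assigns the Delegate back; store it.
        subject.__dict__[self._attr] = value

    def call(self, subject, *args, **kw):
        # Fire the event without creating a Delegate nobody asked for.
        delegate = subject.__dict__.get(self._attr)
        if delegate is None:
            return []
        return delegate(subject, *args, **kw)


class ClockTimer:
    TickEvent = Event()

    def tick(self, hour, minute, second):
        ClockTimer.TickEvent.call(self, hour, minute, second)


seen = []

def on_tick(subject, hour, minute, second):
    seen.append((hour, minute, second))

timer = ClockTimer()
timer.tick(9, 0, 0)                  # no observers: nothing is created
created_early = "_event_TickEvent" in timer.__dict__
timer.TickEvent += on_tick
timer.tick(9, 0, 1)                  # delivered to on_tick
timer.TickEvent -= on_tick
timer.tick(9, 0, 2)                  # no longer delivered
```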

4.2   Iterators and Generators

Intent

Provide a way to access the elements of an aggregate object sequentially without exposing its underlying representation.

The iterator pattern is one which Python has embraced fully, albeit in a slightly simpler form than the one proposed by the GoF. GoF iterators have methods:

First()
Next()
IsDone()
CurrentItem()

Python's iterator interface requires the following methods to be defined:

__iter__()    Returns self
next()        Returns the next value or raises StopIteration

In addition, any object which supports iteration but is not itself an iterator supports the iterable interface, i.e. it has a method __iter__() which creates a new iterator object of the appropriate type. It may also have additional methods for creating different types of iterator, but this is not required by the language.

The main difference in Python is that Python's iterators cannot be reset. If you want to iterate over a sequence more than once in Python, then you simply have to create multiple iterators.
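The difference between an iterable and an iterator, and the lack of a reset, can be seen in a few lines. This sketch is for current Python 3, where the next() method described above is spelled __next__; the variable names are illustrative.

```python
letters = ["a", "b", "c"]

first_pass = iter(letters)
consumed = list(first_pass)      # walks the whole list
leftover = list(first_pass)      # the same iterator is now exhausted

second_pass = iter(letters)      # a fresh iterator starts again
again = list(second_pass)

# An iterator's __iter__ returns itself; an iterable's returns a new iterator.
iterator_is_self = iter(first_pass) is first_pass
fresh_each_time = iter(letters) is not iter(letters)
```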

Here is a simple example using Python's iterators to iterate over a binary tree structure. This involves walking the tree, and because each value is returned in order we must remember which nodes have been processed, and which we have yet to see. We could write the code recursively if we didn't need to keep suspending the iterator to return a result, but handling a stack manually in Python is actually pretty straightforward:

class Node(object):
    class NodeIterator:
        def __init__(self, node):
            self.stack = [node]

        def __iter__(self):
            return self

        def next(self):
            if not self.stack:
                raise StopIteration

            node = self.stack.pop(-1)
            while isinstance(node, Node):
                self.stack.append(node.right)
                node = node.left
            return node

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def __iter__(self):
        return Node.NodeIterator(self)

import unittest
class NodeTests(unittest.TestCase):
    def testNode(self):
        tree = Node(
            Node('a', 'b'),
            Node(
                Node('c', 'd'),
                'e'))

        self.assertEquals(['a', 'b', 'c', 'd', 'e'], list(iter(tree)))

if __name__=='__main__':
    unittest.main()

The main problem with this code is that it isn't immediately clear why it works. Why, if I want to return the left side of each branch before the right do I have to deal with the right node first? Why do I never (apparently) return the right hand node of anything?

Iterators often involve thinking backwards in this way. First we maintain our state so we can resume the iterator later on, then we worry about what to return on this iteration.

Also, of course, the unit test obscures the normal use of this code. Where I simply convert the iteration into a flat list to check that all the leaves came back in the correct order, normally we would have some code more like:

for leaf in tree:
    ... do something with the leaf ...

Generators

Since Python 2.2 there has been special syntax in the language to make it easier to write iterators. Generators turn the process of iterating over an object on its head. The iterator object exists solely to maintain the state of the iteration. This is usually some sort of loop index, but in some cases the data structures can be much more complex. For example, the same binary tree structure using a generator becomes:

from __future__ import generators

class Node(object):
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def __iter__(self):
        if (isinstance(self.left, Node)):
            for n in self.left:
                yield n
        else:
            yield self.left

        if (isinstance(self.right, Node)):
            for n in self.right:
                yield n
        else:
            yield self.right

import unittest
class NodeTests(unittest.TestCase):
    def testNode(self):
        tree = Node(
            Node('a', 'b'),
            Node(
                Node('c', 'd'),
                'e'))

        self.assertEquals(['a', 'b', 'c', 'd', 'e'], list(iter(tree)))

if __name__=='__main__':
    unittest.main()

The code isn't actually any shorter, but walking the tree has now become rather more obvious. If the left side is another node then we yield each leaf in turn, otherwise we just yield the leaf. Then we repeat on the right side either yielding each leaf in turn or yielding the leaf if that is all we have. Without the generator we were forced to abandon the 'obvious' recursive implementation, but the generator lets us suspend execution as each result is generated.
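In current Python 3 the inner loops can be written with yield from (added in Python 3.3), which does make the code shorter. The following is a sketch of the same tree under that assumption, not a replacement for the listing above; the names child and leaves are illustrative.

```python
class Node:
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def __iter__(self):
        for child in (self.left, self.right):
            if isinstance(child, Node):
                yield from child     # delegate to the subtree's generator
            else:
                yield child          # a leaf


tree = Node(Node("a", "b"), Node(Node("c", "d"), "e"))
leaves = list(tree)
```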

Other benefits of the generator are that it hides the need for another object type, and (rather surprisingly) it turns out that using a generator is actually much faster than simply calling a Python function repeatedly.

itertools

Iterators and generators may be combined to form pipelines, and the upcoming Python 2.3 includes a new standard library module, itertools, with a variety of useful iterators. Because these iterators generate each result only as it is returned, they provide a way to work with potentially infinite lists:

count([n])
    Return consecutive integers starting with n.
ifilter(predicate, iterable)
    Return all elements x of iterable for which predicate(x) is true.
imap(function, *iterables)
    Like map(), but returns an iterator rather than a list.
izip(*iterables)
    Like zip(), except it returns an iterator.
repeat(obj)
    Returns an iterator that yields obj an unlimited number of times.
times(n, [object])
    Returns object a total of n times.

and many more.
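A small pipeline over an infinite sequence might look like the sketch below. It is written for current Python 3, where the lazy behaviour of ifilter, imap and izip is provided by the builtin filter, map and zip; the variable names are illustrative.

```python
from itertools import count, islice, repeat

# An endless stream of integers: 1, 2, 3, ...
numbers = count(1)
# Lazily keep the even ones, then square them.
evens = filter(lambda n: n % 2 == 0, numbers)
squares = map(lambda n: n * n, evens)
# Nothing is computed until islice pulls the first five results.
first_five = list(islice(squares, 5))

# zip stops at the shorter input, so pairing with an endless repeat() is safe.
labelled = list(zip(repeat("sq"), first_five))
```

Each stage asks the previous one for a value only when it needs it, so the infinite count() never has to be materialised.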

It is too early to see how the Python community takes to this new support for iterators. So far there seem to be those (like myself) who see almost every problem as an opportunity for generators, and those who are steering well clear.

4.3   Command Dispatch Pattern

The GoF describe a Command pattern where a request is encapsulated as an object. Back in 1997, Guido van Rossum [GvR] identified a pattern that performs a similar function, but which is unique to dynamic languages such as Python, Perl &c. He gave it the name Command Dispatch.

Sadly, although use of this pattern is common through many Python programs, the name, and perhaps also the identification of this as a pattern, have largely been forgotten.

Suppose you have a class that needs to execute a number of different commands sent from some outside source, e.g. 'get()' and 'put()'. There are various ways to handle this, such as:

if command == 'get':
    get()
elif command == 'put':
    put()
else:
    error()

or:

dispatch_table = {
    'get': get,
    'put': put,
}

# Command dispatch:
if dispatch_table.has_key(command):
    func = dispatch_table[command]
    func()
else:
    error()

but the one used by Python programmers everywhere is:

class Dispatcher:

    def do_get(self): ...

    def do_put(self): ...

    def error(self): ...

    def dispatch(self, command):
        mname = 'do_' + command
        if hasattr(self, mname):
            method = getattr(self, mname)
            method()
        else:
            self.error()

As Guido put it: I find this approach super elegant and have used it many times.

You can find this pattern used throughout Python's libraries, including BaseHTTPServer, cmd, pydoc, repr, sgmllib, SimpleXMLRPCServer, urllib, distutils and so on.
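A complete, runnable version of the pattern could look like the following sketch for current Python 3. The store dictionary, the method arguments and the return values are assumptions made for the sake of a self-contained example; only the do_ prefix lookup comes from the text above.

```python
class Dispatcher:
    def __init__(self):
        self.store = {}

    def do_get(self, key):
        return self.store.get(key)

    def do_put(self, key, value):
        self.store[key] = value
        return "ok"

    def error(self, command):
        return "unknown command: " + command

    def dispatch(self, command, *args):
        # Look the handler up by name; fall back to error() if missing.
        method = getattr(self, "do_" + command, None)
        if method is None:
            return self.error(command)
        return method(*args)


d = Dispatcher()
results = [
    d.dispatch("put", "colour", "blue"),
    d.dispatch("get", "colour"),
    d.dispatch("delete", "colour"),
]
```

Adding a new command is just a matter of adding a do_ method; a subclass can add or override commands without touching dispatch.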

5   Little patterns in Python

There are many other common idioms in Python which (depending on your viewpoint) are certainly patterns, although they are perhaps too small to count as Design Patterns. However, even if they don't count as fully fledged Design Patterns, the small patterns listed here are representative of how writing software in Python influences the way you think.

DSU

Decorate, Sort, Undecorate. The way in Python to do any but the simplest sorting. Instead of providing a comparison function for your objects, simply replace a list of objects with a list that sorts into the desired order using the builtin functions.

For example, to produce a list of files in the current directory sorted by their 'last modified' time:

>>> import glob, os
>>> files = glob.glob('*')
>>> decorated = [ (os.stat(file).st_mtime, file) for file in files ]
>>> decorated.sort()
>>> files = [ file for (time, file) in decorated ]
>>> print files
['py.ico', 'pyc.ico', 'pycon.ico', 'default.tag',
... (long list of files here) ...
'Doc', 'Tools', 'INSTALL.LOG', 'win32', 'win32com', 'Lib']
>>>
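The three steps are easier to see on data that does not depend on the file system. The sketch below is for current Python 3 and uses an illustrative word list; it also shows the key argument that later versions of Python (2.4 onwards) added to sorting, which covers many uses of DSU.

```python
words = ["pear", "fig", "banana", "kiwi"]

# Decorate: pair each word with its sort key (its length).
decorated = [(len(word), word) for word in words]
# Sort: tuples compare by length first, then by the word itself.
decorated.sort()
# Undecorate: throw the keys away again.
by_dsu = [word for (length, word) in decorated]

# A key function gives a similar result; ties keep their original order.
by_key = sorted(words, key=len)
```

Note the difference on ties: DSU falls back to comparing the decorated items themselves, whereas a key function leaves equal-keyed items in their original order.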

List comprehensions

Introduced in Python 2.0, list comprehensions let you build new lists by writing a description instead of a series of commands. This lets you think about what you are writing in a different way, which is one of the important features of design patterns, although the list comprehension on its own is too small to be a major pattern.
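As a small illustration of "description instead of commands" (the data and names are made up for the example):

```python
numbers = [3, 8, 1, 6]

# A series of commands...
doubled_evens_loop = []
for n in numbers:
    if n % 2 == 0:
        doubled_evens_loop.append(n * 2)

# ...versus a description of the result.
doubled_evens = [n * 2 for n in numbers if n % 2 == 0]
```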

Bound methods

Coming to Python from other languages where functions and methods are not first class objects requires a definite mental gearshift. A common Python technique is to pass around bound methods or use them for micro-optimisations, e.g.

result = []
save = result.append
while somecondition:
    save(calculateValue())
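A runnable variant of the same idea, sketched for current Python 3 with a simple loop standing in for somecondition and calculateValue:

```python
result = []
save = result.append        # bound method: remembers which list it belongs to

for value in range(5):
    save(value * value)     # no attribute lookup inside the loop

# The bound method carries a reference to its instance.
owner_is_result = save.__self__ is result
```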

Lists, tuples, and dictionaries

Everything in Python is an object, but not everything has to be a user-defined object. Thinking in terms of the builtin types is important to Python programmers. Partly this is because the builtin types can run faster or be less memory hungry, but there is another bonus in that code is often easier to understand when it uses more primitive but familiar types rather than customised classes everywhere.

Module as a script

Write every module as a script in Python (and the inverse, write every script as a module). By encapsulating the main code of a script in a block headed by if __name__=='__main__', any classes or functions in the script can be reused in other programs. Likewise, adding the same block to a module lets you test or use the module outside the context of the program.
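A minimal sketch of such a file for current Python 3; the file name stats.py and the functions mean and main are hypothetical:

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def main():
    # Runs only when the file is executed directly, e.g. "python stats.py".
    print(mean([1, 2, 3, 4]))


if __name__ == "__main__":
    main()
```

Another program can write "from stats import mean" without triggering the print, because on import __name__ is the module's name rather than '__main__'.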

6   Conclusion

Design Patterns are very useful tools. They give you a language for thinking about the design and allow you to recognise familiar problems in new contexts. Recognising a pattern lets you immediately relate that pattern back to previous experiences both good and bad.

Some patterns are almost universal across programming languages but other patterns are specific to the features or even the syntax of a particular language. To become fluent in a programming language means not just understanding the syntax but also adopting a common pattern of thought with the other developers.

Even within a language, possible implementations of patterns change as the language evolves. The examples given in this paper use new style classes, weak references, and properties, all features added to Python comparatively recently.

7   References

[GoF] Design Patterns: Elements of Reusable Object-Oriented Software; Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides; 1995.

[AM] Five Easy Pieces: Simple Python Non-Patterns; Alex Martelli, AB Strakt. http://www.aleax.it/Python/5ep.html

[VS] Design Patterns in Python; Vespe Savikko, Tampere University of Technology. http://www.python.org/workshops/1997-10/proceedings/savikko.html

[GvR] Command Dispatch Pattern; Guido van Rossum; Python Pattern-SIG mailing list, May 1997.

[RJ] Adventures in C#: Some Things We Ought to Do; Ron Jeffries; Jan 2003. http://www.xprogramming.com/xpmag/acsMusings.htm