Python Attributes and Methods

http://www.cafepy.com/article/python_attributes_and_methods/python_attributes_and_methods.html

Before You Begin

Some points you should note:

  • This book covers the new-style objects (introduced a long time ago in Python 2.2). Examples are valid for Python 2.5 and all the way to Python 3.x.

  • This book is not for absolute beginners. It is for people who already know Python (some Python at least), and want to know more.

  • You should be familiar with the different kinds of objects in Python and not be confused when you come across the term type where you expected class. You can read the first part of this series for background information - Python Types and Objects.

Happy pythoneering!

Chapter 1. New Attribute Access

The Dynamic __dict__

What is an attribute? Quite simply, an attribute is a way to get from one object to another. Apply the power of the almighty dot - objectname.attributename - and voila! You now have the handle to another object. You also have the power to create attributes, by assignment: objectname.attributename = anotherobject.

Which object does an attribute access return, though? And where does the object set as an attribute end up? These questions are answered in this chapter.

Example 1.1. Simple attribute access

    >>> class C(object):
    ...     classattr = "attr on class"  # (1)
    ...
    >>> cobj = C()
    >>> cobj.instattr = "attr on instance"  # (2)
    >>>
    >>> cobj.instattr  # (3)
    'attr on instance'
    >>> cobj.classattr  # (4)
    'attr on class'
    >>> C.__dict__['classattr']  # (5)
    'attr on class'
    >>> cobj.__dict__['instattr']  # (6)
    'attr on instance'
    >>>
    >>> cobj.__dict__  # (7)
    {'instattr': 'attr on instance'}
    >>> C.__dict__  # (8)
    {'classattr': 'attr on class', '__module__': '__main__', '__doc__': None}

1

Attributes can be set on a class.

2

Or even on an instance of a class.

3, 4

Both class and instance attributes are accessible from an instance.

5, 6

Attributes really sit inside a dictionary-like __dict__ in the object.

7, 8

__dict__ contains only the user-provided attributes.

Ok, I admit 'user-provided attribute' is a term I made up, but I think it is useful to understand what is going on. Note that __dict__ is itself an attribute. We didn't set this attribute ourselves, but Python provides it. Our old friends __class__ and __bases__ (neither of which appears to be in __dict__ either) also seem to be similar. Let's call them Python-provided attributes. Whether an attribute is Python-provided or not depends on the object in question (__bases__, for example, is Python-provided only for classes).

We, however, are more interested in user-defined attributes. These are attributes provided by the user, and they usually (but not always) end up in the __dict__ of the object on which they're set.

When accessed (for example, print objectname.attributename), the following objects are searched in sequence for the attribute:

  1. The object itself (objectname.__dict__ or any Python-provided attribute of objectname).

  2. The object's type (objectname.__class__.__dict__). Observe that only __dict__ is searched, which means only user-provided attributes of the class. In other words, objectname.__bases__ may not return anything even though objectname.__class__.__bases__ does exist.

  3. The bases of the object's class, their bases, and so on (__dict__ of each of objectname.__class__.__bases__). More than one base does not confuse Python, and should not concern us at the moment. The point to note is that all bases are searched until an attribute is found.

If all this hunting around fails to find a suitably named attribute, Python raises an AttributeError. The type of the type (objectname.__class__.__class__) is never searched for attribute access on an object (objectname in the example).
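The search order above can be tried out with a small experiment. This is only a sketch: the names Meta, Base, Child and the attribute names are made up for illustration, and the metaclass is called directly so that the same code runs on Python 2 and Python 3.

```python
class Meta(type):
    # Lives on the type of the type; instances of Child never see it.
    meta_only = "attr on metaclass"

# Calling the metaclass directly avoids version-specific metaclass syntax.
Base = Meta("Base", (object,), {"base_attr": "attr on base"})

class Child(Base):
    child_attr = "attr on class"

obj = Child()
obj.inst_attr = "attr on instance"

# Found on the instance, on its class, and on a base of its class.
found = [obj.inst_attr, obj.child_attr, obj.base_attr]

# The class itself can reach its own type (Meta)...
from_class = Child.meta_only

# ...but the instance cannot: the type of the type is never searched.
try:
    obj.meta_only
except AttributeError:
    meta_visible_on_instance = False
else:
    meta_visible_on_instance = True
```

Here found holds the three strings in order, while meta_visible_on_instance ends up False.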

The built-in dir() function returns a list of all attributes of an object. Also look at the inspect module in the standard library for more functions to inspect objects.

The above section explains the general mechanism for all objects. It applies even to classes (for example, accessing classname.attrname), with a slight modification: the bases of the class are searched before the class of the class (which is classname.__class__ and for most types, by the way, is <type 'type'>).

Some objects, such as built-in types and their instances (lists, tuples, etc.), do not have a modifiable __dict__. Consequently user-defined attributes cannot be set on them.
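A quick way to see this restriction is to attempt the assignment and catch the error. The helper try_setattr and the class Plain below are hypothetical names used only for this sketch.

```python
def try_setattr(target, name, value):
    "Return None on success, or the exception raised by the assignment"
    try:
        setattr(target, name, value)
    except (AttributeError, TypeError) as exc:
        return exc
    return None

class Plain(object):
    pass

results = {
    "list instance": try_setattr([], "color", "red"),
    "list type": try_setattr(list, "color", "red"),
    "Plain instance": try_setattr(Plain(), "color", "red"),
}

# A plain list has no __dict__ to put the attribute in.
list_has_dict = hasattr([], "__dict__")
```

Only the instance of the user-defined class accepts the new attribute; the other two entries hold the exception that was raised.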

We're not done yet! This was the short version of the story. There is more to what can happen when setting and getting attributes. This is explored in the following sections.

From Function to Method

Continuing our Python experiments:

Example 1.2. A function is more

    >>> class C(object):
    ...     classattr = "attr on class"
    ...     def f(self):
    ...             return "function f"
    ...
    >>> C.__dict__  # (1)
    {'classattr': 'attr on class', '__module__': '__main__', '__doc__': None, 'f': <function f at 0x008F6B70>}
    >>> cobj = C()
    >>> cobj.classattr is C.__dict__['classattr']  # (2)
    True
    >>> cobj.f is C.__dict__['f']  # (3)
    False
    >>> cobj.f  # (4)
    <bound method C.f of <__main__.C object at 0x008F9850>>
    >>> C.__dict__['f'].__get__(cobj, C)  # (5)
    <bound method C.f of <__main__.C object at 0x008F9850>>

1

Two innocent looking class attributes, a string 'classattr' and a function 'f'.

2

Accessing the string really gets it from the class's __dict__, as expected.

3

Not so for the function! Why?

4

Hmm, it does look like a different object. (A bound method is a callable object that calls a function (C.f in the example) passing an instance (cobj in the example) as the first argument in addition to passing through all arguments it was called with. This is what makes method calls on an instance work.)

5

Here's the spoiler - this is what Python did to create the bound method. While looking for an attribute for an instance, if Python finds an object with a __get__() method inside the class's __dict__, instead of returning the object, it calls the __get__() method and returns the result. Note that the __get__() method is called with the instance and the class as the first and second arguments respectively.

It is only the presence of the __get__() method that transforms an ordinary function into a bound method. There is nothing really special about a function object. Anyone can put objects with a __get__() method inside the class __dict__ and get away with it. Such objects are called descriptors and have many uses.
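To convince ourselves that __get__() is all there is to it, we can call it by hand, even on a function that was never placed in a class. The function greet and the extra suffix argument are invented for this sketch.

```python
class C(object):
    def f(self, suffix):
        return "function f" + suffix

cobj = C()
plain_function = C.__dict__["f"]

# Calling __get__ by hand produces the same kind of object as cobj.f.
bound_by_hand = plain_function.__get__(cobj, C)

def greet(self):
    return "hello from %s" % type(self).__name__

# A function defined outside any class carries __get__ as well.
bound_greet = greet.__get__(cobj, C)

result_f = bound_by_hand("!")     # cobj is passed as self automatically
result_greet = bound_greet()
```

Both bound objects remember cobj (available as __self__), which is why neither call needs to pass it explicitly.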

Creating Descriptors

Any object with a __get__() method, and optionally __set__() and __delete__() methods, accepting specific parameters is said to follow the descriptor protocol. Such an object qualifies as a descriptor and can be placed inside a class's __dict__ to do something special when an attribute is retrieved, set or deleted. An empty descriptor is shown below.

Example 1.3. A simple descriptor

    class Desc(object):
        "A descriptor example that just demonstrates the protocol"

        def __get__(self, obj, cls=None):  # (1)
            pass

        def __set__(self, obj, val):  # (2)
            pass

        def __delete__(self, obj):  # (3)
            pass

1

Called when the attribute is read (e.g. print objectname.attrname). Here obj is the object on which the attribute is accessed (may be None if the attribute is accessed directly on the class, e.g. print classname.attrname). Also, cls is the class of obj (or the class itself, if the access was on the class; in this case, obj is None).

2

Called when the attribute is set on an instance (e.g. objectname.attrname = 12). Here obj is the object on which the attribute is being set and val is the object provided as the value.

3

Called when the attribute is deleted from an instance (e.g. del objectname.attrname). Here obj is the object on which the attribute is being deleted.

What we defined above is a class that can be instantiated to create a descriptor. Let's see how we can create a descriptor, attach it to a class and put it to work.

Example 1.4. Using a descriptor

    class C(object):
        "A class with a single descriptor"
        d = Desc()  # (1)

    cobj = C()

    x = cobj.d  # (2)
    cobj.d = "setting a value"  # (3)
    cobj.__dict__['d'] = "try to force a value"  # (4)
    x = cobj.d  # (5)
    del cobj.d  # (6)

    x = C.d  # (7)
    C.d = "setting a value on class"  # (8)

1

Now the attribute called d is a descriptor. (This uses Desc from the previous example.)

2

Calls d.__get__(cobj, C). The value returned is bound to x. Here d means the instance of Desc defined in 1. It can be found in C.__dict__['d'].

3

Calls d.__set__(cobj, "setting a value").

4

Sticking a value directly in the instance's __dict__ works, but...

5

is futile. This still calls d.__get__(cobj, C).

6

Calls d.__delete__(cobj).

7

Calls d.__get__(None, C).

8

Doesn't call anything. This replaces the descriptor with a new string object. After this, accessing cobj.d or C.d will just return the string "setting a value on class". The descriptor has been kicked out of C's __dict__.

Note that when accessed from the class itself, only the __get__() method comes into the picture; setting or deleting the attribute will actually replace or remove the descriptor.

Descriptors work only when attached to classes. Sticking a descriptor in an object that is not a class gives us nothing.
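The calls listed above can be observed directly with a descriptor that keeps a log. TracingDesc and Plain are hypothetical names for this sketch; the log format is also just an illustration.

```python
class TracingDesc(object):
    "Hypothetical descriptor that records every protocol call it receives"

    def __init__(self):
        self.calls = []

    def __get__(self, obj, cls=None):
        self.calls.append(("get", obj, cls))
        return "value from descriptor"

    def __set__(self, obj, val):
        self.calls.append(("set", obj, val))

    def __delete__(self, obj):
        self.calls.append(("delete", obj))


class C(object):
    d = TracingDesc()


tracer = C.__dict__["d"]        # fetch the descriptor without triggering __get__
cobj = C()

cobj.d                          # __get__(cobj, C)
cobj.d = "setting a value"      # __set__(cobj, "setting a value")
cobj.__dict__["d"] = "forced"   # bypasses the descriptor, but...
after_force = cobj.d            # ...reading still goes to __get__
del cobj.d                      # __delete__(cobj)
C.d                             # __get__(None, C)

kinds = [call[0] for call in tracer.calls]


# A descriptor stored on an instance (not on a class) is just an ordinary value.
class Plain(object):
    pass

pobj = Plain()
pobj.d = TracingDesc()
stored = pobj.d                 # the TracingDesc object itself; __get__ is not called
```

The kinds list comes out as get, set, get, delete, get, and stored.calls stays empty because nothing ever invoked the descriptor placed on the instance.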

Two Kinds of Descriptors

In the previous section we used a descriptor with both __get__() and __set__() methods. Such descriptors, by the way, are called data descriptors. Descriptors with only the __get__() method are somewhat weaker than their cousins, and are called non-data descriptors.

Repeating our experiment, but this time with non-data descriptors, we get:

Example 1.5. Non-data descriptors

    class GetonlyDesc(object):
        "Another useless descriptor"

        def __get__(self, obj, typ=None):
            pass

    class C(object):
        "A class with a single descriptor"
        d = GetonlyDesc()

    cobj = C()

    x = cobj.d  # (1)
    cobj.d = "setting a value"  # (2)
    x = cobj.d  # (3)
    del cobj.d  # (4)

    x = C.d  # (5)
    C.d = "setting a value on class"  # (6)

1

Calls d.__get__(cobj, C) (just like before).

2

Puts "setting a value" in the instance itself (in cobj.__dict__ to be precise).

3

Surprise! This now returns "setting a value", which is picked up from cobj.__dict__. Recall that for a data descriptor, the instance's __dict__ is bypassed.

4

Deletes the attribute d from the instance (from cobj.__dict__ to be precise).

5, 6

These behave identically to the data descriptor case.

Interestingly, not having a __set__() affects not just attribute setting, but also retrieval. What is Python thinking? If on setting, the descriptor gets fired and puts the data somewhere, then it follows that only the descriptor knows how to get it back. Why even bother with the instance's __dict__?

Data descriptors are useful for providing full control over an attribute. This is what one usually wants for attributes used to store some piece of data. For example, an attribute that gets transformed and saved somewhere on setting would usually be reverse-transformed and returned when read. When you have a data descriptor, it controls all access (both read and write) to the attribute on an instance. Of course, you could still directly go to the class and replace the descriptor, but you can't do that from an instance of the class.
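As an illustration of the transform-on-set, reverse-transform-on-get idea, here is a minimal sketch. The names Percent, Discount and _rate_fraction are invented, and storing the value in the instance's __dict__ under a different key is just one possible choice of "somewhere".

```python
class Percent(object):
    "Hypothetical data descriptor: stores a fraction, exposes a percentage"

    def __init__(self, storage_name):
        self.storage_name = storage_name      # key under which the value is kept

    def __get__(self, obj, cls=None):
        if obj is None:
            return self                       # accessed on the class itself
        return obj.__dict__[self.storage_name] * 100.0   # reverse-transform

    def __set__(self, obj, val):
        obj.__dict__[self.storage_name] = val / 100.0    # transform and save


class Discount(object):
    rate = Percent("_rate_fraction")


offer = Discount()
offer.rate = 25                           # goes through Percent.__set__
stored = offer.__dict__["_rate_fraction"]
offer.__dict__["rate"] = "ignored"        # cannot shadow a data descriptor
shown = offer.rate                        # still goes through Percent.__get__
```

Here stored is 0.25 and shown is 25.0, even though a conflicting entry was forced into the instance's __dict__.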

Non-data descriptors, in contrast, only provide a value when an instance itself does not have a value. So setting the attribute on an instance hides the descriptor. This is particularly useful in the case of functions (which are non-data descriptors) as it allows one to hide a function defined in the class by attaching one to an instance.

Example 1.6. Hiding a method

    class C(object):
        def f(self):
            return "f defined in class"

    cobj = C()

    cobj.f()  # (1)

    def another_f():
        return "another f"

    cobj.f = another_f

    cobj.f()  # (2)

1

Calls the bound method returned by f.__get__(cobj, C). Essentially ends up calling C.__dict__['f'](cobj).

2

Calls another_f(). The function f() defined in C has been hidden.

Attribute Search Summary

This is the long version of the attribute access story, included just for the sake of completeness.

When retrieving an attribute from an object (print objectname.attrname) Python follows these steps:

  1. If attrname is a special (i.e. Python-provided) attribute for objectname, return it.

  2. Check objectname.__class__.__dict__ for attrname. If it exists and is a data-descriptor, return the descriptor result. Search all bases of objectname.__class__ for the same case.

  3. Check objectname.__dict__ for attrname, and return if found. If objectname is a class, search its bases too. If it is a class and a descriptor exists in it or its bases, return the descriptor result.

  4. Check objectname.__class__.__dict__ for attrname. If it exists and is a non-data descriptor, return the descriptor result. If it exists, and is not a descriptor, just return it. If it exists and is a data descriptor, we shouldn't be here because we would have returned at point 2. Search all bases of objectname.__class__ for the same case.

  5. Raise AttributeError.

Note that Python first checks for a data descriptor in the class (and its bases), then for the attribute in the object __dict__, and then for a non-data descriptor in the class (and its bases). These are points 2, 3 and 4 above.

The descriptor result above implies the result of calling the __get__() method of the descriptor with appropriate arguments. Also, checking a __dict__ for attrname means checking if __dict__["attrname"] exists.
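Points 2 to 5 can be written out as plain Python for the common case where objectname is not itself a class. This is a rough sketch, not how the interpreter is implemented: lookup, find_in_mro and Demo are hypothetical names, point 1 is not modelled separately, and an object defining __delete__ is also treated as a data descriptor.

```python
_MISSING = object()

def find_in_mro(cls, name):
    "Search the __dict__ of cls and of all its bases, in __mro__ order"
    for klass in cls.__mro__:
        if name in klass.__dict__:
            return klass.__dict__[name]
    return _MISSING

def lookup(obj, name):
    "Sketch of points 2-5 for an object that is not a class"
    cls = type(obj)
    class_attr = find_in_mro(cls, name)
    attr_type = type(class_attr)

    # Point 2: a data descriptor on the class (or a base) wins outright.
    if class_attr is not _MISSING and hasattr(attr_type, "__get__") and (
            hasattr(attr_type, "__set__") or hasattr(attr_type, "__delete__")):
        return attr_type.__get__(class_attr, obj, cls)

    # Point 3: the instance's own __dict__, if it has one.
    inst_dict = getattr(obj, "__dict__", {})
    if name in inst_dict:
        return inst_dict[name]

    # Point 4: a non-data descriptor or a plain value on the class.
    if class_attr is not _MISSING:
        if hasattr(attr_type, "__get__"):
            return attr_type.__get__(class_attr, obj, cls)
        return class_attr

    # Point 5: nothing found anywhere.
    raise AttributeError(name)


class Demo(object):
    plain = "class value"

    @property
    def prop(self):
        return "from property"

    def method(self):
        return "from method"

d = Demo()
d.__dict__["prop"] = "instance value"      # loses to the data descriptor
d.__dict__["method"] = "instance value"    # beats the non-data descriptor

answers = (lookup(d, "prop"), lookup(d, "method"), lookup(d, "plain"))
```

With these definitions answers is ("from property", "instance value", "class value"), which is the same as what d.prop, d.method and d.plain return.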

Now, the steps Python follows when setting a user-defined attribute (objectname.attrname = something):

  1. Check objectname.__class__.__dict__ for attrname. If it exists and is a data-descriptor, use the descriptor to set the value. Search all bases of objectname.__class__ for the same case.

  2. Insert something into objectname.__dict__ for key "attrname".

  3. Think "Wow, this was much simpler!"

What happens when setting a Python-provided attribute depends on the attribute. Python may not even allow some attributes to be set. Deletion of attributes is very similar to setting as above.

Descriptors In The Box

Before you rush to the mall and get yourself some expensive descriptors, note that Python ships with some very useful ones that can be found by simply looking in the box.

Example 1.7. Built-in descriptors

    class HidesA(object):

        def get_a(self):
            return self.b - 1

        def set_a(self, val):
            self.b = val + 1

        def del_a(self):
            del self.b

        a = property(get_a, set_a, del_a, "docstring")  # (1)

        def cls_method(cls):
            return "You called class %s" % cls

        clsMethod = classmethod(cls_method)  # (2)

        def stc_method():
            return "Unbindable!"

        stcMethod = staticmethod(stc_method)  # (3)

1

A property provides an easy way to call functions whenever an attribute is retrieved, set or deleted on the instance. When the attribute is retrieved from the class, the getter method is not called but the property object itself is returned. A docstring can also be provided which is accessible as HidesA.a.__doc__.

2

A classmethod is similar to a regular method, except that it passes the class (and not the instance) as the first argument to the function. The remaining arguments are passed through as usual. It can also be called directly on the class and it behaves the same way. The first argument is named cls instead of the traditional self to avoid confusion regarding what it refers to.

3

A staticmethod is just like a function outside the class. It is never bound, which means no matter how you access it (on the class or on an instance), it gets called with exactly the same arguments you pass. No object is inserted as the first argument.

As we saw earlier, Python functions are descriptors too. They weren't descriptors in earlier versions of Python (as there were no descriptors at all), but now they fit nicely into a more generic mechanism.

A property is always a data-descriptor, but not all arguments are required when defining it.

Example 1.8. More on properties

    class VariousProperties(object):

        def get_p(self):
            pass

        def set_p(self, val):
            pass

        def del_p(self):
            pass

        allOk = property(get_p, set_p, del_p)  # (1)

        unDeletable = property(get_p, set_p)  # (2)

        readOnly = property(get_p)  # (3)

1

Can be set, retrieved, or deleted.

2

Attempting to delete this attribute from an instance will raise AttributeError.

3

Attempting to set or delete this attribute from an instance will raise AttributeError.

The getter and setter functions need not be defined in the class itself; any function can be used. In any case, the functions will be called with the instance as the first argument. Note that where the functions are passed to the property constructor above, they are not bound functions anyway.

Another useful observation would be to note that subclassing the class and redefining the getter (or setter) functions is not going to change the property. The property object is holding on to the actual functions provided. When kicked, it is going to say "Hey, I'm holding this function I was given, I'll just call this and return the result.", and not "Hmm, let me look up the current class for a method called 'get_a' and then use that". If that is what one wants, then defining a new descriptor would be useful. How would it work? Let's say it is initialized with a string (i.e. the name of the method to call). On activation, it does a getattr() for the method name on the class, and uses the method found. Simple!
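The name-based descriptor described above could look roughly like this. NamedGetter, Base and Derived are hypothetical names, and only the getter side is sketched (so this is a non-data descriptor, and an instance attribute of the same name would hide it).

```python
class NamedGetter(object):
    "Hypothetical descriptor that looks the getter up by name on each access"

    def __init__(self, getter_name):
        self.getter_name = getter_name

    def __get__(self, obj, cls=None):
        if obj is None:
            return self
        # Find the method on the class of the instance at access time.
        getter = getattr(type(obj), self.getter_name)
        return getter(obj)


class Base(object):
    def get_a(self):
        return "Base.get_a"

    fixed = property(get_a)          # holds on to Base.get_a itself
    late = NamedGetter("get_a")      # remembers only the name


class Derived(Base):
    def get_a(self):
        return "Derived.get_a"


d = Derived()
results = (d.fixed, d.late)
```

The property keeps calling the function it was given, so results is ("Base.get_a", "Derived.get_a").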

Classmethods and staticmethods are non-data descriptors, and so can be hidden if an attribute with the same name is set directly on the instance. If you are rolling your own descriptor (and not using properties), it can be made read-only by giving it a __set__() method but raising AttributeError in the method. This is how a property behaves when it does not have a setter function.
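A read-only descriptor of that kind might be sketched as follows. ReadOnly and Config are made-up names, and the error message is arbitrary.

```python
class ReadOnly(object):
    "Hypothetical read-only data descriptor wrapping a fixed value"

    def __init__(self, value):
        self.value = value

    def __get__(self, obj, cls=None):
        return self.value

    def __set__(self, obj, val):
        # Having __set__ at all makes this a data descriptor;
        # raising here is what makes the attribute read-only.
        raise AttributeError("can't set attribute")


class Config(object):
    version = ReadOnly("1.0")


cfg = Config()
try:
    cfg.version = "2.0"
except AttributeError:
    blocked = True
else:
    blocked = False

cfg.__dict__["version"] = "sneaky"   # stored, but never consulted on read
current = cfg.version
```

The assignment is blocked, and current is still "1.0" because the data descriptor takes precedence over the instance's __dict__.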

Chapter 2. Method Resolution Order

The Problem (Method Resolution Disorder)

Why do we need Method Resolution Order? Let's say:

  1. We're happily doing OO programming and building a class hierarchy.

  2. Our usual technique to implement the do_your_stuff() method is to first call do_your_stuff() on the base class, and then do our stuff.

    Example 2.1. Usual base call technique

        class A(object):
            def do_your_stuff(self):
                # do stuff with self for A
                return

        class B(A):
            def do_your_stuff(self):
                A.do_your_stuff(self)
                # do stuff with self for B
                return

        class C(A):
            def do_your_stuff(self):
                A.do_your_stuff(self)
                # do stuff with self for C
                return

  3. We subclass a new class from two classes and end up having the same superclass being accessible through two paths.

    Example 2.2. Base call technique fails

        class D(B,C):
            def do_your_stuff(self):
                B.do_your_stuff(self)
                C.do_your_stuff(self)
                # do stuff with self for D
                return

    Figure 2.1. The Diamond Diagram

  4. Now we're stuck if we want to implement do_your_stuff(). Using our usual technique, if we want to call both B and C, we end up calling A.do_your_stuff() twice. And we all know it might be dangerous to have A do its stuff twice, when it is only supposed to be done once. The other option would leave either B's stuff or C's stuff not done, which is not what we want either.

There are messy solutions to this problem, and clean ones. Python, obviously, implements a clean one which is explained in the next section.
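The double call is easy to observe if each class records its name instead of doing real work. The calls list is an addition made only for this sketch.

```python
calls = []   # records which class did its stuff, in order

class A(object):
    def do_your_stuff(self):
        calls.append("A")

class B(A):
    def do_your_stuff(self):
        A.do_your_stuff(self)
        calls.append("B")

class C(A):
    def do_your_stuff(self):
        A.do_your_stuff(self)
        calls.append("C")

class D(B, C):
    def do_your_stuff(self):
        B.do_your_stuff(self)
        C.do_your_stuff(self)
        calls.append("D")

D().do_your_stuff()
# calls is now ['A', 'B', 'A', 'C', 'D']: A did its stuff twice.
```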

The "Who's Next" List

Let's say:

  1. For each class, we arrange all superclasses into an ordered list without repetitions, and insert the class itself at the start of the list. We put this list in a class attribute called next_class_list for our use later.

    Example 2.3. Making a "Who's Next" list

        B.next_class_list = [B,A]
        C.next_class_list = [C,A]
        D.next_class_list = [D,B,C,A]

  2. We use a different technique to implement do_your_stuff() for our classes.

    Example 2.4. Call next method technique

        class B(A):
            def do_your_stuff(self):
                next_class = self.find_out_whos_next()
                next_class.do_your_stuff(self)
                # do stuff with self for B

            def find_out_whos_next(self):
                l = self.next_class_list           # l depends on the actual instance
                mypos = l.index(B)  # (1)          # Find this class in the list
                return l[mypos+1]                  # Return the next one

    The interesting part is how we find_out_whos_next(), which depends on which instance we are working with. Note that:

    • Depending on whether we passed an instance of D or of B, next_class above will resolve to either C or A.

    • We have to implement find_out_whos_next() for each class, since it has to have the class name hardcoded in it (see 1 above). We cannot use self.__class__ here. If we have called do_your_stuff() on an instance of D, and the call is traversing up the hierarchy, then self.__class__ will be D here.

Using this technique, each method is called only once. It appears clean, but seems to require too much work. Fortunately for us, we neither have to implement find_out_whos_next() for each class, nor set the next_class_list, as Python does both of these things.
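Here is a runnable sketch of the whole diamond using the hand-made lists. It deviates from Example 2.4 in one respect, which is an assumption of this sketch: instead of a find_out_whos_next() method per class, a module-level helper whos_next() receives the current class explicitly, so a subclass cannot accidentally override it. The calls list is again only there to record the order.

```python
calls = []

def whos_next(current_class, instance):
    "Hypothetical helper: the class after current_class in the instance's list"
    order = type(instance).next_class_list
    return order[order.index(current_class) + 1]

class A(object):
    def do_your_stuff(self):
        calls.append("A")          # end of the chain: nothing further to call

class B(A):
    def do_your_stuff(self):
        whos_next(B, self).do_your_stuff(self)
        calls.append("B")

class C(A):
    def do_your_stuff(self):
        whos_next(C, self).do_your_stuff(self)
        calls.append("C")

class D(B, C):
    def do_your_stuff(self):
        whos_next(D, self).do_your_stuff(self)
        calls.append("D")

# The hand-written "Who's Next" lists from Example 2.3
B.next_class_list = [B, A]
C.next_class_list = [C, A]
D.next_class_list = [D, B, C, A]

D().do_your_stuff()
d_calls = list(calls)      # ['A', 'C', 'B', 'D']

del calls[:]
B().do_your_stuff()
b_calls = list(calls)      # ['A', 'B']
```

For an instance of D the class after B is C, while for an instance of B it is A, which is exactly the dependence on the instance noted above.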

A Super Solution

Python provides a class attribute __mro__ for each class, and a type called super. The __mro__ attribute is a tuple containing the class itself and all of its superclasses without duplicates in a predictable order. A super object is used in place of the find_out_whos_next() method.

Example 2.5. One super technique

    class B(A):  # (1)
        def do_your_stuff(self):
            super(B, self).do_your_stuff()  # (2)
            # do stuff with self for B

2

The super() call creates a super object. It finds the next class after B in self.__class__.__mro__. Attributes accessed on the super object are searched on the next class (and the classes after it in the __mro__) and returned. Descriptors are resolved. What this means is accessing a method (as above) returns a bound method (note the do_your_stuff() call does not pass self). When using super() the first parameter should always be the same as the class in which it is being used (1).
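Applied to the whole diamond from the previous sections, the technique looks like this. As before, the calls list is only there to record the order and is not part of the original example.

```python
calls = []

class A(object):
    def do_your_stuff(self):
        calls.append("A")

class B(A):
    def do_your_stuff(self):
        super(B, self).do_your_stuff()
        calls.append("B")

class C(A):
    def do_your_stuff(self):
        super(C, self).do_your_stuff()
        calls.append("C")

class D(B, C):
    def do_your_stuff(self):
        super(D, self).do_your_stuff()
        calls.append("D")

mro_names = [klass.__name__ for klass in D.__mro__]
# ['D', 'B', 'C', 'A', 'object']

D().do_your_stuff()
# calls is now ['A', 'C', 'B', 'D']: every class did its stuff exactly once.
```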

If we're using a class method, we don't have an instance self to pass into the super call. Fortunately for us, super works even with a class as the second argument. Observe that above, super uses self only to get at self.__class__.__mro__. The class can be passed directly to super as shown below.

Example 2.6. Using super with a class method

    class A(object):
        @classmethod  # (1)
        def say_hello(cls):
            print('A says hello')

    class B(A):
        @classmethod
        def say_hello(cls):
            super(B, cls).say_hello()  # (2)
            print('B says hello')

    class C(A):
        @classmethod
        def say_hello(cls):
            super(C, cls).say_hello()
            print('C says hello')

    class D(B, C):
        @classmethod
        def say_hello(cls):
            super(D, cls).say_hello()
            print('D says hello')

    B.say_hello()  # (3)
    D.say_hello()  # (4)

1

This example is for class methods (not instance methods).

2

Note we pass cls (the class and not the instance) to super().

3

This prints out:

A says hello
B says hello

4

This prints out (observe each method is called only once):

A says hello
C says hello
B says hello
D says hello

There is yet another way to use super:

Example 2.7. Another super technique

    class B(A):
        def do_your_stuff(self):
            self.__super.do_your_stuff()
            # do stuff with self for B

    B._B__super = super(B)  # (1)

When created with only a type, the super instance behaves like a descriptor. This means (if d is an instance of D) that super(B).__get__(d) returns the same thing as super(B, d). In 1 above, we munge an attribute name, similar to what Python does for names starting with double underscore inside the class. So this is accessible as self.__super within the body of the class. If we didn't use a class specific attribute name, accessing the attribute through the instance self might return an object defined in a subclass.

While using super we typically use only one super call in one method even if the class has multiple bases. Also, it is a good programming practice to use super instead of calling methods directly on a base class.

A possible pitfall appears if do_your_stuff() accepts different arguments for C and A. This is because, if we use super in B to call do_your_stuff() on the next class, we don't know if it is going to be called on A or C. If this scenario is unavoidable, a case specific solution might be required.

Computing the MRO

One question as yet unanswered is how Python determines the __mro__ for a type. A basic idea behind the algorithm is provided in this section. This is not essential for just using super, or reading the following sections, so you can jump to the next section if you want.

Python determines the precedence of types (or the order in which they should be placed in any __mro__) from two kinds of constraints specified by the user:

  1. If A is a superclass of B, then B has precedence over A. Or, B should always appear before A in all __mro__s (that contain both). In short let's denote this as B > A.

  2. If C appears before D in the list of bases in a class statement (e.g. class Z(C,D):), then C > D.

In addition, to avoid being ambiguous, Python adheres to the following principle:

  1. If E > F in one scenario (or one __mro__), then it should be that E > F in all scenarios (or all __mro__s).

We can satisfy the constraints if we build the __mro__ for each new class C we introduce, such that:

  1. All superclasses of C appear in the C.__mro__ (plus C itself, at the start), and

  2. The precedence of types in C.__mro__ does not conflict with the precedence of types in B.__mro__ for each B in C.__bases__.

Here the same problem is translated into a game. Consider a classhierarchy as follows:

Figure 2.2. A Simple Hierarchy

Since only single inheritance is in play, it is easy to find the __mro__ of these classes. Let's say we define a new class as class N(A,B,C). To compute the __mro__, consider a game using abacus style beads over a series of strings.

Figure 2.3. Beads on Strings - Unsolvable

Beads can move freely over the strings, but the strings cannot be cut or twisted. The strings from left to right contain beads in the order of __mro__ of each of the bases. The rightmost string contains one bead for each base, in the order the bases are specified in the class statement.

The objective is to line up beads in rows, so that each row contains beads with only one label (as done with the O bead in the diagram). Each string represents an ordering constraint, and if we can reach the goal, we would have an order that satisfies all constraints. We could then just read the labels off rows from the bottom up to get the __mro__ for N.

Unfortunately, we cannot solve this problem. The last two strings have C and B in different orders. However, if we change our class definition to class N(A,C,B), then we have some hope.

Figure 2.4. Beads on Strings - Solved

We just found out that N.__mro__ is (N,A,C,B,object) (note we inserted N at the head). The reader can try out this experiment in real Python (for the unsolvable case above, Python raises an exception). Observe that we even swapped the position of two strings, keeping the strings in the same order as the bases are specified in the class statement. The usefulness of this is seen later.

Sometimes, there might be more than one solution, as shown in the figure below. Consider four classes class A(object), class B(A), class C(object) and class D(C). If a new class is defined as class E(B, D), there are multiple possible solutions that satisfy all constraints.

Figure 2.5. Multiple Solutions

Possible positions for A are shown as the little beads. The order can be kept unambiguous (more correctly, monotonic) if the following policies are followed:

  1. Arrange strings from left to right in order of appearance of bases in the class statement.

  2. Attempt to arrange beads in rows moving from bottom up, and left to right. What this means is that the MRO of class E(B, D) will be set to: (E,B,A,D,C,object). This is because A, being left of C, will be selected first as a candidate for the second row from bottom.

This, essentially, is the idea behind the algorithm used by Python to generate the __mro__ for any new type. The algorithm is formally explained elsewhere [mro-algorithm].
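Both outcomes can be checked by asking Python for the __mro__ directly. The first part uses the four classes named in the text. For the bead game, the hierarchy in the figure is not reproduced here, so the sketch assumes A(object), B(object) and C(B), which is consistent with the results quoted above; bead_game is a hypothetical helper.

```python
# The "multiple solutions" case from the text.
class A(object): pass
class B(A): pass
class C(object): pass
class D(C): pass
class E(B, D): pass

e_mro = [klass.__name__ for klass in E.__mro__]
# ['E', 'B', 'A', 'D', 'C', 'object']


def bead_game(base_order):
    """Build N over an assumed hierarchy: A(object), B(object), C(B).

    base_order is a string such as "ABC" naming the bases of N in order.
    Returns the names in N.__mro__, or the TypeError Python raised.
    """
    class A(object): pass
    class B(object): pass
    class C(B): pass
    bases = tuple({"A": A, "B": B, "C": C}[name] for name in base_order)
    try:
        N = type("N", bases, {})
    except TypeError as exc:
        return exc
    return [klass.__name__ for klass in N.__mro__]

unsolvable = bead_game("ABC")   # a TypeError: no consistent order exists
solved = bead_game("ACB")       # ['N', 'A', 'C', 'B', 'object']
```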

Chapter 3. Usage Notes

This chapter includes usage notes that do not fit in other chapters.

Special Methods

In Python, we can use methods with special names like __len__(), __str__() and __add__() to make objects convenient to use (for example, with the built-in functions len(), str() or with the '+' operator, etc.)

Example 3.1. Special methods work on type only

    class C(object):
        def __len__(self):  # (1)
            return 0

    cobj = C()

    def mylen():
        return 1

    cobj.__len__ = mylen  # (2)

    print(len(cobj))  # (3)

1

Usually we put the special methods in a class.

2

We can try to put them in the instance itself, but it doesn't work.

3

This goes straight to the class (calls C.__len__()), not to the instance.

The same is true for all such methods; putting them on the instance we want to use them with does not work. If it did go to the instance, then even something like str(C) (str of the class C) would go to C.__str__(), which is a method defined for an instance of C, and not C itself.

A simple technique to allow defining such methods for each instance separately is shown below.

Example 3.2. Forwarding special method to instance

    class C(object):
        def __len__(self):
            return self._mylen()  # (1)

        def _mylen(self):  # (2)
            return 0

    cobj = C()

    def mylen():
        return 1

    cobj._mylen = mylen  # (3)

    print(len(cobj))  # (4)

1

We call another method on the instance,

2

for which we provide a default implementation in the class.

3

But it can be overwritten (or rather, hidden) by setting it on the instance.

4

This now calls mylen().

Subclassing Built-in Types

Subclassing built-in types is straightforward. Actually we have been doing it all along (whenever we subclass <type 'object'>). Some built-in types (types.FunctionType, for example) are not subclassable (not yet, at least). However, here we talk about subclassing <type 'list'>, <type 'tuple'> and other basic data types.

Example 3.3. Subclassing <type 'list'>

    >>> class MyList(list):  # (1)
    ...     "A list that converts appended items to ints"
    ...     def append(self, item):  # (2)
    ...         list.append(self, int(item))  # (3)
    ...
    >>>
    >>> l = MyList()
    >>> l.append(1.3)  # (4)
    >>> l.append(444)
    >>> l  # (5)
    [1, 444]
    >>> len(l)  # (6)
    2
    >>> l[1] = 1.2  # (7)
    >>> l
    [1, 1.2]
    >>> l.color = 'red'  # (8)

1

A regular class statement.

2

Define the method to be overridden. In this case we will convert all items passed through append() to integers.

3

Upcall to the base if required. list.append() works like an unbound method, and is passed the instance as the first argument.

4

Append a float and...

5

watch it automatically become an integer.

6

Otherwise, it behaves like any other list.

7

This doesn't go through append. We would have to define __setitem__() in our class to massage such data. The upcall would be to list.__setitem__(self, index, item). Note that the special methods such as __setitem__ exist on built-in types.

8

We can set attributes on our instance. This is because it has a __dict__.
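A version of MyList that also massages items assigned by index might look like the following. It is a sketch only: slice assignment is deliberately left unconverted, and other entry points such as extend() or insert() are not covered.

```python
class MyList(list):
    "A list that converts appended and assigned items to ints"

    def append(self, item):
        list.append(self, int(item))

    def __setitem__(self, index, item):
        # Only plain indexes are converted in this sketch;
        # values assigned to slices are passed along as they are.
        if not isinstance(index, slice):
            item = int(item)
        list.__setitem__(self, index, item)


l = MyList()
l.append(1.3)
l.append(444)
l[1] = 1.2        # now routed through our __setitem__
# l is now [1, 1]
```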

Basic lists do not have __dict__ (and so no user-defined attributes), but ours does. This is usually not a problem and may even be what we want. If we use a very large number of MyLists, however, we could optimize our program by telling Python not to create the __dict__ for instances of MyList.

Example 3.4. Using __slots__ for optimization

    class MyList(list):
        "A list subclass disallowing any user-defined attributes"

        __slots__ = []  # (1)

    ml = MyList()
    ml.color = 'red'  # raises exception! (2)

    class MyListWithFewAttrs(list):
        "A list subclass allowing specific user-defined attributes"

        __slots__ = ['color']  # (3)

    mla = MyListWithFewAttrs()
    mla.color = 'red'  # (4)
    mla.weight = 50  # raises exception! (5)

1

The __slots__ class attribute tells Python to not create a __dict__ for instances of this type.

2

Setting any attribute on this raises an exception.

3

__slots__ can contain a list of strings. The instances still don't get a __dict__, but Python reserves space in the instance for the specified attributes (each one is exposed through a descriptor on the class).

4

Now, if an attribute has space reserved, it can be used.

5

Otherwise, it cannot. This will raise an exception.

The purpose and recommended use of __slots__ is for optimization. After a type is defined, its slots cannot be changed. Also, every subclass must define __slots__, otherwise its instances will end up having __dict__.
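The effect of __slots__, including what happens in a subclass that forgets to define it, can be checked with a few assignments. Careless and can_set are hypothetical names introduced for this sketch.

```python
class MyList(list):
    __slots__ = []              # no per-instance __dict__

class MyListWithFewAttrs(list):
    __slots__ = ['color']       # room for exactly one attribute

class Careless(MyList):
    pass                        # no __slots__ here, so __dict__ comes back

def can_set(obj, name):
    "Hypothetical helper: True if obj accepts an attribute called name"
    try:
        setattr(obj, name, 'value')
    except AttributeError:
        return False
    return True

report = {
    'MyList.color': can_set(MyList(), 'color'),
    'MyListWithFewAttrs.color': can_set(MyListWithFewAttrs(), 'color'),
    'MyListWithFewAttrs.weight': can_set(MyListWithFewAttrs(), 'weight'),
    'Careless.color': can_set(Careless(), 'color'),
}

dict_present = [hasattr(cls(), '__dict__')
                for cls in (MyList, MyListWithFewAttrs, Careless)]
# [False, False, True]
```

Only the reserved name works on MyListWithFewAttrs, while instances of Careless accept anything again because they have a __dict__.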

We can create a list even by instantiating it like any other type: list([1,2,3]). This means list.__init__() accepts the same argument (i.e. any iterable) and initializes a list. We can customize initialization in a subclass by redefining __init__() and upcalling __init__() on the base.

Tuples are immutable and different from lists. Once an instance is created, it cannot be changed. Note that the instance of a type already exists when __init__() is called (in fact the instance is passed as the first argument). The __new__() static method of a type is called to create an instance of the type. It is passed the type itself as the first argument, and passed through other initial arguments (similar to __init__()). We use this to customize immutable types like a tuple.

Example 3.5. Customizing creation of subclasses

    class MyList(list):

        def __init__(self, itr):  # (1)
            list.__init__(self, [int(x) for x in itr])

    class MyTuple(tuple):

        def __new__(typ, itr):  # (2)
            seq = [int(x) for x in itr]
            return tuple.__new__(typ, seq)  # (3)

1

For a list, we massage the arguments and hand them over to list.__init__().

2

For a tuple, we have to override __new__().

3

A __new__() should always return. It is supposed to return an instance of the type.

The __new__() method is not special to immutable types, it is used for all types. It is also converted to a static method automatically by Python (by virtue of its name).
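To see why the tuple needs __new__(), we can compare the two classes from Example 3.5 with a tuple subclass that only overrides __init__(). LateTuple is a hypothetical counter-example added for this sketch.

```python
class MyList(list):
    def __init__(self, itr):
        list.__init__(self, [int(x) for x in itr])

class MyTuple(tuple):
    def __new__(typ, itr):
        seq = [int(x) for x in itr]
        return tuple.__new__(typ, seq)

class LateTuple(tuple):
    "Hypothetical counter-example: __init__ runs after the contents are fixed"
    def __init__(self, itr):
        # tuple.__new__ has already built the instance from itr by now.
        tuple.__init__(self)

ml = MyList([1.5, "2", 3])      # [1, 2, 3]
mt = MyTuple([1.5, "2", 3])     # (1, 2, 3)
lt = LateTuple([1.5, "2", 3])   # (1.5, '2', 3), nothing was converted
```

Note that mt is an instance of MyTuple, not a plain tuple, because the type passed to __new__() was handed on to tuple.__new__().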

Related Documentation

[descrintro] Unifying types and classes in Python 2.2. Guido van Rossum.

[pep-252] Making Types Look More Like Classes. Guido van Rossum.

[pep-253] Subclassing Built-in Types. Guido van Rossum.

[descriptors-howto] How-To Guide for Descriptors. Raymond Hettinger.

[mro-algorithm] The Python 2.3 Method Resolution Order. Michele Simionato.

Colophon

This book was written in DocBook XML. The HTML version was produced using DocBook XSL stylesheets and xsltproc. The PDF version was produced using htmldoc. The diagrams were drawn using OmniGraffle [1]. The process was automated using Paver [2].



[1] http://www.omnigroup.com/
[2] http://www.blueskyonmars.com/projects/paver/