A collection of not-so-obvious Python stuff you should know

Sebastian Raschka
last updated: 04/25/2014
Link to this IPython Notebook on GitHub
All code was executed in Python 3.4

I am really looking forward to your comments and suggestions to improve and
extend this little collection! Just send me a quick note
via Twitter: @rasbt
or Email: bluewoodtree@gmail.com

Sections

The C3 class resolution algorithm for multiple class inheritance
Using += on lists creates new objects
True and False in the datetime module
Python reuses objects for small integers - always use "==" for equality, "is" for identity
Shallow vs. deep copies if list contains other structures and objects
Picking True values from logical ands and ors
Don't use mutable objects as default arguments for functions!
Be aware of the consuming generator
bool is a subclass of int
About lambda-in-closures-and-a-loop pitfall
Python's LEGB scope resolution and the keywords global and nonlocal
When mutable contents of immutable tuples aren't so mutable
List comprehensions are fast, but generators are faster!?
Public vs. private class methods and name mangling
The consequences of modifying a list when looping through it
Dynamic binding and typos in variable names
List slicing using indexes that are "out of range"
Reusing global variable names and UnboundLocalErrors
Creating copies of mutable objects
Key differences between Python 2 and 3
Function annotations - What are those ->'s in my Python code?
Abortive statements in finally blocks

The C3 class resolution algorithm for multiple class inheritance

If we are dealing with multiple inheritance, according to the newer C3 class resolution algorithm, the following applies:
Assuming that child class C inherits from two parent classes A and B, "class A should be checked before class B".

If you want to learn more, please read the original blog post by Guido van Rossum.

(Original source: http://gistroll.com/rolls/21/horizontal_assessments/new)

In [2]:

class A(object):
    def foo(self):
        print("class A")

class B(object):
    def foo(self):
        print("class B")

class C(A, B):
    pass

C().foo()

class A

So what actually happened above was that class C looked in the scope of the parent class A for the method .foo() first (and found it)!

I received an email containing a suggestion which uses a more nested example to illustrate Guido van Rossum's point a little bit better:

In [3]:

class A(object):
    def foo(self):
        print("class A")

class B(A):
    pass

class C(A):
    def foo(self):
        print("class C")

class D(B,C):
    pass

D().foo()

class C

Here, class D searches in B first, which in turn inherits from A (note that class C also inherits from A, but has its own .foo() method) so that we come up with the search order: D, B, C, A.
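The search order described above can be inspected directly, since every class exposes it through its `__mro__` attribute. The following is a minimal sketch that reuses the class names from the nested example; the helper variable `mro_names` is only introduced here for illustration, and the methods return strings instead of printing so the result is easy to check.

```python
class A(object):
    def foo(self):
        return "class A"

class B(A):
    pass

class C(A):
    def foo(self):
        return "class C"

class D(B, C):
    pass

# __mro__ lists the classes in the order in which Python searches them
mro_names = [cls.__name__ for cls in D.__mro__]
print(mro_names)   # ['D', 'B', 'C', 'A', 'object']
print(D().foo())   # class C
```

C appears before A in the list, which is why D().foo() ends up in C even though B inherits from A.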

Using `+=` on lists creates new objects

Python lists are mutable objects, as we all know. So, if we use the += operator on a list, we extend the list by modifying the object in place.

However, if we use the assignment my_list = my_list + ..., we create a new list object, which can be demonstrated by the following code:

In [6]:

a_list = []
print('ID:', id(a_list))

a_list += [1]
print('ID (+=):', id(a_list))

a_list = a_list + [2]
print('ID (list = list + ...):', id(a_list))

ID: 4366496544
ID (+=): 4366496544
ID (list = list + ...): 4366495472

Just for reference, the .append() and .extend() methods modify the list object in place, just as expected.

In [7]:

a_list = []
print('ID:',id(a_list))

a_list.append(1)
print('ID (append):',id(a_list))

a_list.extend([2])
print('ID (extend):',id(a_list))

ID: 4366495544
ID (append): 4366495544
ID (extend): 4366495544
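Where this difference tends to matter in practice is aliasing, i.e., when a second name refers to the same list. A small sketch (the names `first` and `alias` are made up for this illustration):

```python
first = [0]
alias = first          # second name for the same list object

first += [1]           # in-place: the alias sees the change
print(alias)           # [0, 1]

first = first + [2]    # new object: the alias still points to the old list
print(first)           # [0, 1, 2]
print(alias)           # [0, 1]
print(first is alias)  # False
```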

`True` and `False` in the datetime module

"It often comes as a big surprise for programmers to find (sometimes by way of a hard-to-reproduce bug) that, unlike any other time value, midnight (i.e.datetime.time(0,0,0)) is False. A long discussion on the python-ideas mailing list shows that, while surprising, that behavior is desirable—at least in some quarters."

(Original source: http://lwn.net/SubscriberLink/590299/bf73fe823974acea/)

In [8]:

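import datetime

print('"datetime.time(0,0,0)" (Midnight) ->', bool(datetime.time(0,0,0)))
print('"datetime.time(1,0,0)" (1 am) ->', bool(datetime.time(1,0,0)))

"datetime.time(0,0,0)" (Midnight) -> False
"datetime.time(1,0,0)" (1 am) -> True

(Note: this output is from Python 3.4. Since Python 3.5, datetime.time objects are always considered true, so midnight no longer evaluates to False.)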

Python reuses objects for small integers - use "==" for equality, "is" for identity

This oddity occurs because Python keeps an array of small integer objects (i.e., integers between -5 and 256; see the documentation).

In [9]:

a = 1
b = 1
print('a is b', bool(a is b))

c = 999
d = 999
print('c is d', bool(c is d))

a is b True
c is d False

(I received a comment that this is in fact a CPython artifact and need not be true in all implementations of Python!)

So the take home message is: always use "==" for equality, "is" for identity!

Here is a nice article explaining it by comparing "boxes" (C language) with "name tags" (Python).

This example demonstrates that this indeed applies to integers in the range from -5 to 256:

In [11]:

print('256 is 257-1', 256 is 257-1)
print('257 is 258-1', 257 is 258 - 1)
print('-5 is -6+1', -5 is -6+1)
print('-7 is -6-1', -7 is -6-1)

256 is 257-1 True
257 is 258-1 False
-5 is -6+1 True
-7 is -6-1 False

And to illustrate the test for equality (`==`) vs. identity (`is`):

In [6]:

a = 'hello world!'
b = 'hello world!'
print('a is b,', a is b)
print('a == b,', a == b)

a is b, False
a == b, True

We would think that identity would always imply equality, but this is not always true, as we can see in the next example:

In [12]:

a = float('nan')
print('a is a,', a is a)
print('a == a,', a == a)

a is a, True
a == a, False

Shallow vs. deep copies if list contains other structures and objects

Shallow copy:
If we use the assignment operator to assign one list to another name, we just create a new reference to the original list. If we want to create a new list object, we have to make a copy of the original list. This can be done via a_list[:] or a_list.copy().

In [23]:

list1 = [1,2]
list2 = list1        # reference
list3 = list1[:]     # shallow copy
list4 = list1.copy() # shallow copy

print('IDs:\nlist1: {}\nlist2: {}\nlist3: {}\nlist4: {}\n'
      .format(id(list1), id(list2), id(list3), id(list4)))

list2[0] = 3
print('list1:', list1)

list3[0] = 4
list4[1] = 4
print('list1:', list1)

IDs:
list1: 4377955288
list2: 4377955288
list3: 4377955432
list4: 4377954784

list1: [3, 2]
list1: [3, 2]

Deep copy
As we have seen above, a shallow copy works fine if we want to create a new list with contents of the original list which we want to modify independently.

However, if we are dealing with compound objects (e.g., lists that contain other lists, read here for more information) it becomes a little trickier.

In the case of compound objects, a shallow copy creates a new compound object, but it only inserts references to the contained objects into it. In contrast, a deep copy goes "deeper" and also creates new objects for the objects found in the original compound object. If you follow the code, the concept should become clearer:

In [25]:

from copy import deepcopy

list1 = [[1],[2]]
list2 = list1.copy()    # shallow copy
list3 = deepcopy(list1) # deep copy

print('IDs:\nlist1: {}\nlist2: {}\nlist3: {}\n'
      .format(id(list1), id(list2), id(list3)))

list2[0][0] = 3
print('list1:', list1)

list3[0][0] = 5
print('list1:', list1)

IDs:
list1: 4377956296
list2: 4377961752
list3: 4377954928

list1: [[3], [2]]
list1: [[3], [2]]
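The IDs above only compare the outer lists. To see why the shallow copy still affected list1, it may help to compare the inner objects as well. A short sketch, with the names `outer`, `shallow`, and `deep` chosen just for this illustration:

```python
from copy import deepcopy

outer = [[1], [2]]
shallow = outer.copy()
deep = deepcopy(outer)

# The shallow copy is a new outer list that still shares the inner lists
print(shallow is outer)        # False
print(shallow[0] is outer[0])  # True

# The deep copy also duplicated the inner lists
print(deep[0] is outer[0])     # False
```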

Picking `True` values from logical `and`s and `or`s

Logical or:

a or b == a if a else b

If both values in an or expression are true, Python selects the first value (e.g., "a" in "a" or "b"); in an and expression it selects the second one.
This is also called short-circuiting: we already know that a logical or must be true if the first value is true, so the evaluation of the second value can be omitted.

Logical and:

a and b == b if a else a

If both values in and expressions are True, Python will select the second value, since for a logical and, both values must be true.

In [9]:

result = (2 or 3) * (5 and 7)
print('2 * 7 =', result)

2 * 7 = 14
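The same rules also cover the cases where one of the values is false, and they explain why the second operand is sometimes never evaluated. A minimal sketch; `never_called` is a hypothetical helper that would raise an error if Python actually evaluated it:

```python
def never_called():
    raise RuntimeError("short-circuiting should skip this call")

# `or` returns the first truthy operand, or the last operand if none is truthy
print('' or 'default')         # default
print(0 or None)               # None

# `and` returns the first falsy operand, or the last operand if all are truthy
print([] and 'unreached')      # []
print(1 and 'last')            # last

# The right-hand side is not evaluated once the result is known
print(2 or never_called())     # 2
print(0 and never_called())    # 0
```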

Don't use mutable objects as default arguments for functions!

Don't use mutable objects (e.g., dictionaries, lists, sets, etc.) as default arguments for functions! You might expect that a new list is created every time we call the function without providing an argument for the default parameter, but this is not the case: Python creates the mutable object (default parameter) once, when the function is defined, not each time it is called. See the following code:

(Original source: http://docs.python-guide.org/en/latest/writing/gotchas/)

In [1]:

def append_to_list(value, def_list=[]):
    def_list.append(value)
    return def_list

my_list = append_to_list(1)
print(my_list)

my_other_list = append_to_list(2)
print(my_other_list)

[1]
[1, 2]

Another good example demonstrating that default arguments are created when the function is created (and not when it is called!):

In [10]:

import time

def report_arg(my_default=time.time()):
    print(my_default)

report_arg()

time.sleep(5)

report_arg()

1397764090.456688
1397764090.456688
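A common way around this pitfall is to use None as the default and create the mutable object inside the function body. The sketch below rewrites the append_to_list example from above in that style; it is one widely used idiom, not the only possible fix.

```python
def append_to_list(value, def_list=None):
    # Create the list at call time, so each call gets its own object
    if def_list is None:
        def_list = []
    def_list.append(value)
    return def_list

my_list = append_to_list(1)
my_other_list = append_to_list(2)
print(my_list)        # [1]
print(my_other_list)  # [2]
```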

Be aware of the consuming generator

Be aware of what is happening when combining "in" checks with generators, since they won't evaluate from the beginning once a position is "consumed".

In [9]:

gen = (i for i in range(5))
print('2 in gen,', 2 in gen)
print('3 in gen,', 3 in gen)
print('1 in gen,', 1 in gen)

2 in gen, True
3 in gen, True
1 in gen, False

Although this defeats the purpose of a generator (in most cases), we can convert a generator into a list to circumvent the problem.

In [27]:

gen = (i for i in range(5))
a_list = list(gen)
print('2 in l,', 2 in a_list)
print('3 in l,', 3 in a_list)
print('1 in l,', 1 in a_list)

2 in l, True
3 in l, True
1 in l, True
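To make the "consuming" part more visible, we can look at what is left in the generator after a single membership test. A small sketch; the variable names `found_two`, `remaining`, and `found_one` are only used for this illustration:

```python
gen = (i for i in range(5))

# The membership test pulls items from the generator until it finds a match
found_two = 2 in gen
print(found_two)    # True  (0, 1 and 2 are now consumed)

# Only the items after the match are still available
remaining = list(gen)
print(remaining)    # [3, 4]

# Once exhausted, the generator yields nothing more
found_one = 1 in gen
print(found_one)    # False
```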

`bool` is a subclass of `int`

Chicken or egg? In the history of Python (Python 2.2, to be specific), truth values were implemented via 1 and 0 (similar to old C). To avoid breaking old (but perfectly working) code, bool was added as a subclass of int in Python 2.3.

Original source: http://www.peterbe.com/plog/bool-is-int

In [16]:

print('isinstance(True, int):', isinstance(True, int))
print('True + True:', True + True)
print('3*True:', 3*True)
print('3*True - False:', 3*True - False)

isinstance(True, int): True
True + True: 2
3*True: 3
3*True - False: 3

About lambda-in-closures-and-a-loop pitfall

Remember the "consuming generators"? This example is somewhat related, but the result might still be unexpected.

(Original source: http://openhome.cc/eGossip/Blog/UnderstandingLambdaClosure3.html)

In the first example below, we create lambda functions in a list comprehension. The variable i is not evaluated when a lambda is created, but looked up each time the lambda is called. Since the list is already fully constructed when we loop through it, i is by then set to its last value, 4.

In [11]:

my_list = [lambda: i for i in range(5)]
for l in my_list:
    print(l())

4
4
4
4
4

This, however, does not apply to generators:

In [9]:

my_gen = (lambda: n for n in range(5))
for l in my_gen:
    print(l())

0
1
2
3
4
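The generator version behaves differently because each lambda is called right after it is produced, while n still holds the current value. If we need a list of lambdas that remember their own value, one common workaround is to bind the loop variable as a default argument. A minimal sketch; the names `late` and `early` are made up for this illustration:

```python
# Late binding: every lambda looks up i when it is called
late = [lambda: i for i in range(5)]
late_results = [f() for f in late]
print(late_results)    # [4, 4, 4, 4, 4]

# Early binding: the default argument stores the current value of i
early = [lambda i=i: i for i in range(5)]
early_results = [f() for f in early]
print(early_results)   # [0, 1, 2, 3, 4]
```

This relies on the same mechanism as the mutable-default pitfall: default arguments are evaluated once, when the function is created.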

Python's LEGB scope resolution and the keywords `global` and `nonlocal`

There is nothing particularly surprising about Python's LEGB scope resolution (Local -> Enclosed -> Global -> Built-in), but it is still useful to take a look at some examples!

`global` vs. `local`

According to the LEGB rule, Python will first look for a variable in the local scope. So if we set the variable x = 1 locally in the function's scope, it won't have an effect on the global x.

In [33]:

x = 0
def in_func():
    x = 1
    print('in_func:', x)

in_func()
print('global:', x)

in_func: 1
global: 0

If we want to modify the global x via a function, we can simply use the global keyword to import the variable into the function's scope:

In [34]:

x = 0
def in_func():
    global x
    x = 1
    print('in_func:', x)

in_func()
print('global:', x)

in_func: 1
global: 1

`local` vs. `enclosed`

Now, let us take a look at local vs. enclosed. Here, we set the variable x = 1 in the outer function and set x = 2 in the enclosed function inner. Since inner looks in the local scope first, it won't modify outer's x.

In [36]:

def outer():
    x = 1
    print('outer before:', x)
    def inner():
        x = 2
        print("inner:", x)
    inner()
    print("outer after:", x)

outer()

outer before: 1
inner: 2
outer after: 1

Here is where the nonlocal keyword comes in handy - it allows us to modify the x variable in the enclosed scope:

In [35]:

def outer():
    x = 1
    print('outer before:', x)
    def inner():
        nonlocal x
        x = 2
        print("inner:", x)
    inner()
    print("outer after:", x)

outer()

outer before: 1
inner: 2
outer after: 2

When mutable contents of immutable tuples aren't so mutable

As we all know, tuples are immutable objects in Python, right!? But what happens if they contain mutable objects?

First, let us have a look at the expected behavior: a TypeError is raised if we try to modify immutable types in a tuple:

In [41]:

tup = (1,)
tup[0] += 1

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-41-c3bec6c3fe6f> in <module>()
      1 tup = (1,)
----> 2 tup[0] += 1

TypeError: 'tuple' object does not support item assignment

But what if we put a mutable object into the immutable tuple? Well, modification works, but we also get a `TypeError` at the same time.

In [42]:

tup = ([],)
print('tup before: ', tup)
tup[0] += [1]

tup before:  ([],)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-42-aebe9a31dbeb> in <module>()
      1 tup = ([],)
      2 print('tup before: ', tup)
----> 3 tup[0] += [1]

TypeError: 'tuple' object does not support item assignment

In [43]:

print('tup after: ', tup)

tup after:  ([1],)

However, there are ways to modify the mutable contents of the tuple without raising the TypeError: the solution is the .extend() method, or alternatively .append() (for lists):

In [44]:

tup = ([],)
print('tup before: ', tup)
tup[0].extend([1])
print('tup after: ', tup)

tup before:  ([],)
tup after:  ([1],)

In [5]:

tup = ([],)
print('tup before: ', tup)
tup[0].append(1)
print('tup after: ', tup)

tup before:  ([],)
tup after:  ([1],)

Explanation

A. Jesse Jiryu Davis has a nice explanation for this phenomenon (Original source: http://emptysqua.re/blog/python-increment-is-weird-part-ii/)

If we try to extend the list via += "then the statement executes STORE_SUBSCR, which calls the C function PyObject_SetItem, which checks if the object supports item assignment. In our case the object is a tuple, so PyObject_SetItem throws the TypeError. Mystery solved."
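The explanation can be retraced in plain Python by writing the += statement out as two separate steps. This is only a rough sketch of what the interpreter does (the intermediate name `item` is introduced here for illustration), but it shows why the list is modified even though the statement fails:

```python
tup = ([],)

# Roughly what `tup[0] += [1]` does, written out as two steps
item = tup[0]
item = item.__iadd__([1])   # step 1: the list is extended in place
try:
    tup[0] = item           # step 2: storing back into the tuple fails
except TypeError as err:
    print('TypeError:', err)

print(tup)                  # ([1],)
```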

One more note about the `immutable` status of tuples. Tuples are famous for being immutable. So how come this code works?

In [6]:

my_tup = (1,)
my_tup += (4,)
my_tup = my_tup + (5,)
print(my_tup)

(1, 4, 5)

What happens behind the curtains is that the tuple is not modified; instead, a new object is created every time, which then inherits the old "name tag":

In [8]:

my_tup = (1,)
print(id(my_tup))
my_tup += (4,)
print(id(my_tup))
my_tup = my_tup + (5,)
print(id(my_tup))

4337381840
4357415496
4357289952

List comprehensions are fast, but generators are faster!?

Not really (or at least not significantly; see the benchmarks below). So what's the reason to prefer one over the other?

use lists if you want to use list methods
use generators when you are dealing with huge collections to avoid memory issues

In [11]:

import timeit

def plainlist(n=100000):
    my_list = []
    for i in range(n):
        if i % 5 == 0:
            my_list.append(i)
    return my_list

def listcompr(n=100000):
    my_list = [i for i in range(n) if i % 5 == 0]
    return my_list

def generator(n=100000):
    my_gen = (i for i in range(n) if i % 5 == 0)
    return my_gen

def generator_yield(n=100000):
    for i in range(n):
        if i % 5 == 0:
            yield i

To be fair to the list, let us exhaust the generators:

In [13]:

def test_plainlist(plain_list):
    for i in plain_list():
        pass

def test_listcompr(listcompr):
    for i in listcompr():
        pass

def test_generator(generator):
    for i in generator():
        pass

def test_generator_yield(generator_yield):
    for i in generator_yield():
        pass

print('plain_list:     ', end = '')
%timeit test_plainlist(plainlist)
print('\nlistcompr:     ', end = '')
%timeit test_listcompr(listcompr)
print('\ngenerator:     ', end = '')
%timeit test_generator(generator)
print('\ngenerator_yield:     ', end = '')
%timeit test_generator_yield(generator_yield)

plain_list:     10 loops, best of 3: 22.4 ms per loop

listcompr:     10 loops, best of 3: 20.8 ms per loop

generator:     10 loops, best of 3: 22 ms per loop

generator_yield:     10 loops, best of 3: 21.9 ms per loop
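While the timings are close, the memory argument mentioned above can be illustrated with sys.getsizeof. This is only a rough sketch: sys.getsizeof reports the size of the container object itself (not of the integers it refers to), and the exact numbers depend on the Python build, which is why no concrete sizes are given here.

```python
import sys

n = 100000
my_list = [i for i in range(n) if i % 5 == 0]
my_gen = (i for i in range(n) if i % 5 == 0)

# The list stores references to all of its items; the generator object only
# stores its current state, regardless of n
list_size = sys.getsizeof(my_list)
gen_size = sys.getsizeof(my_gen)
print('list size (bytes):     ', list_size)
print('generator size (bytes):', gen_size)

# Both produce the same values
same_total = sum(my_list) == sum(my_gen)
print(same_total)   # True
```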

Public vs. private class methods and name mangling

Who has not yet stumbled across the quote "we are all consenting adults here" in the Python community? Unlike in other languages like C++ (sorry, there are many more, but that's the one I am most familiar with), we can't really protect class methods from being used outside the class.
All we can do is mark methods as private to make clear that they should not be used outside the class, but it is really up to the class user, since "we are all consenting adults here"!
So, when we want to "make" a class method private, we just put a double underscore in front of its name (same with other class members), which invokes name mangling: outside the class, the member is only reachable under the mangled name _ClassName__member.
This doesn't prevent class users from accessing the member, though; they just have to know the trick and also know that they do so at their own risk...

Let the following example illustrate what I mean:

In [28]:

class my_class():
    def public_method(self):
        print('Hello public world!')
    def __private_method(self):
        print('Hello private world!')
    def call_private_method_in_class(self):
        self.__private_method()

my_instance = my_class()

my_instance.public_method()
my_instance._my_class__private_method()
my_instance.call_private_method_in_class()

Hello public world!
Hello private world!
Hello private world!
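To see what the name mangling actually did, we can check under which names the method exists on the instance. A minimal sketch using a cut-down version of my_class from above (the method returns a string here instead of printing, purely to keep the example short):

```python
class my_class():
    def __private_method(self):
        return 'Hello private world!'

my_instance = my_class()

# The plain double-underscore name does not exist on the instance ...
print(hasattr(my_instance, '__private_method'))            # False

# ... because the method is stored under the mangled name
print(hasattr(my_instance, '_my_class__private_method'))   # True
print(my_instance._my_class__private_method())             # Hello private world!
```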

The consequences of modifying a list when looping through it

It can be really dangerous to modify a list while iterating through it - it is a very common pitfall that can cause unintended behavior!
Look at the following examples, and for a fun exercise: try to figure out what is going on before you skip to the solution!

In [3]:

a = [1, 2, 3, 4, 5]
for i in a:
    if not i % 2:
        a.remove(i)
print(a)

[1, 3, 5]

In [4]:

b = [2, 4, 5, 6]
for i in b:
    if not i % 2:
        b.remove(i)
print(b)

[4, 5]

The solution is that we are iterating through the list index by index, and if we remove one of the items in between, we inevitably mess around with the indexing. Look at the following example, and it will become clear:

In [7]:

b = [2, 4, 5, 6]
for index, item in enumerate(b):
    print(index, item)
    if not item % 2:
        b.remove(item)
print(b)

0 2
1 5
2 6
[4, 5]
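Two common ways to avoid this pitfall are to build a new list, or to loop over a copy while modifying the original. A short sketch based on the list b from above; the name `odd_only` is introduced just for this illustration:

```python
b = [2, 4, 5, 6]

# Option 1: build a new list with the items we want to keep
odd_only = [item for item in b if item % 2]
print(odd_only)   # [5]

# Option 2: loop over a copy, so removals don't shift the items being visited
for item in b[:]:
    if not item % 2:
        b.remove(item)
print(b)          # [5]
```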

Dynamic binding and typos in variable names

Be careful, dynamic binding is convenient, but can also quickly become dangerous!

In [14]:

print('first list:')
for i in range(3):
    print(i)

print('\nsecond list:')
for j in range(3):
    print(i) # I (intentionally) made typo here!

first list:
0
1
2

second list:
2
2
2

List slicing using indexes that are "out of range"

As we have all encountered it 1 (x10000) time(s) in our lives, the infamous IndexError:

In [15]:

my_list = [1, 2, 3, 4, 5]
print(my_list[5])

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-15-eb273dc36fdc> in <module>()
      1 my_list = [1, 2, 3, 4, 5]
----> 2 print(my_list[5])

IndexError: list index out of range

But surprisingly, it is not raised when we are doing list slicing, which can be a real pain for debugging:

In [16]:

my_list = [1, 2, 3, 4, 5]
print(my_list[5:])

[]

Reusing global variable names and `UnboundLocalErrors`

Usually, it is no problem to access global variables in the local scope of a function:

In [37]:

def my_func():
    print(var)

var = 'global'
my_func()

global

And it is also no problem to use the same variable name in the local scope without affecting the global counterpart:

In [38]:

def my_func():
    var = 'locally changed'

var = 'global'
my_func()
print(var)

global

But we have to be careful if we want to first access a global variable inside a function and then reuse its name for a local assignment in the same function:

In [40]:

def my_func():
    print(var) # want to access global variable
    var = 'locally changed' # but Python thinks we forgot to define the local variable!

var = 'global'
my_func()

---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-40-3afd870b7c35> in <module>()
      4 
      5 var = 'global'
----> 6 my_func()

<ipython-input-40-3afd870b7c35> in my_func()
      1 def my_func():
----> 2     print(var) # want to access global variable
      3     var = 'locally changed'
      4 
      5 var = 'global'

UnboundLocalError: local variable 'var' referenced before assignment

In this case, we have to use the global keyword!

In [43]:

def my_func():
    global var
    print(var) # want to access global variable
    var = 'locally changed' # changes the global variable

var = 'global'
my_func()
print(var)

global
locally changed

Creating copies of mutable objects

Let's assume a scenario where we want to duplicate sublists of values stored in another list. If we want to create independent sublist objects, using the arithmetic multiplication operator could lead to rather unexpected (or undesired) results:

In [24]:

my_list1 = [[1, 2, 3]] * 2

print('initially ---> ', my_list1)

# modify the 1st element of the 2nd sublist
my_list1[1][0] = 'a'
print("after my_list1[1][0] = 'a' ---> ", my_list1)

initially --->  [[1, 2, 3], [1, 2, 3]]
after my_list1[1][0] = 'a' --->  [['a', 2, 3], ['a', 2, 3]]

In this case, we had better create "new" objects:

In [25]:

my_list2 = [[1, 2, 3] for i in range(2)]

print('initially:  ---> ', my_list2)

# modify the 1st element of the 2nd sublist
my_list2[1][0] = 'a'
print("after my_list2[1][0] = 'a':  ---> ", my_list2)

initially:  --->  [[1, 2, 3], [1, 2, 3]]
after my_list2[1][0] = 'a':  --->  [[1, 2, 3], ['a', 2, 3]]

And here is the proof:

In [26]:

for a,b in zip(my_list1, my_list2):
    print('id my_list1: {}, id my_list2: {}'.format(id(a), id(b)))

id my_list1: 4350764680, id my_list2: 4350766472
id my_list1: 4350764680, id my_list2: 4350766664

Key differences between Python 2 and 3

There are some good articles already that are summarizing the differences between Python 2 and 3, e.g.,

https://wiki.python.org/moin/Python2orPython3
https://docs.python.org/3.0/whatsnew/3.0.html
http://python3porting.com/differences.html
https://docs.python.org/3/howto/pyporting.html
etc.

But it might still be worthwhile, especially for Python newcomers, to take a look at some of those! (Note: the code was executed in Python 3.4.0 and Python 2.7.5 and copied from interactive shell sessions.)

Unicode...

- Python 2:

We have ASCII str() types, a separate unicode() type, but no dedicated byte type (bytes is just an alias for str)

- Python 3:

Now, we finally have Unicode (utf-8) strings, and 2 byte classes: bytes and bytearray

In []:

#############
# Python 2
#############

>>> type(unicode('is like a python3 str()'))
<type 'unicode'>

>>> type(b'byte type does not exist')
<type 'str'>

>>> 'they are really' + b' the same'
'they are really the same'

>>> type(bytearray(b'bytearray oddly does exist though'))
<type 'bytearray'>

#############
# Python 3
#############

>>> print('strings are now utf-8 \u03BCnico\u0394é!')
strings are now utf-8 μnicoΔé!

>>> type(b' and we have byte types for storing data')
<class 'bytes'>

>>> type(bytearray(b'but also bytearrays for those who prefer them over strings'))
<class 'bytearray'>

>>> 'string' + b'bytes for data'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't convert 'bytes' object to str implicitly

The print statement

Very trivial, but this change makes sense: Python 3 now only accepts print calls with proper parentheses - just like other function calls ...

In []:

# Python 2
>>> print 'Hello, World!'
Hello, World!
>>> print('Hello, World!')
Hello, World!

# Python 3
>>> print('Hello, World!')
Hello, World!
>>> print 'Hello, World!'
  File "<stdin>", line 1
    print 'Hello, World!'
                        ^
SyntaxError: invalid syntax

And if we want to print the output of 2 consecutive print calls on the same line, we would use a trailing comma in Python 2, and end="" in Python 3:

In []:

# Python 2
>>> print "line 1", ; print 'same line'
line 1 same line

# Python 3
>>> print("line 1", end="") ; print (" same line")
line 1 same line

Integer division

This is a pretty dangerous thing if you are porting code, or executing Python 3 code in Python 2, since the change in integer-division behavior can often go unnoticed.
So, I still tend to use float(3)/2 or 3/2.0 instead of 3/2 in my Python 3 scripts to save the Python 2 guys some trouble ... (PS: and vice versa, you can use from __future__ import division in your Python 2 scripts).

In []:

# Python 2
>>> 3 / 2
1
>>> 3 // 2
1
>>> 3 / 2.0
1.5
>>> 3 // 2.0
1.0

# Python 3
>>> 3 / 2
1.5
>>> 3 // 2
1
>>> 3 / 2.0
1.5
>>> 3 // 2.0
1.0

`xrange()`

xrange() was pretty popular in Python 2.x if you wanted to create an iterable object. Its behavior was quite similar to a generator ('lazy evaluation'), but unlike a generator you could iterate over it repeatedly. The advantage was that it was generally faster than range() (e.g., in a for-loop), though not necessarily if you had to iterate over the sequence multiple times, since the values are generated from scratch every time!
In Python 3, range() is implemented like the xrange() function, so a dedicated xrange() function does not exist anymore.

In []:

# Python 2
> python -m timeit 'for i in range(1000000):' ' pass'
10 loops, best of 3: 66 msec per loop

> python -m timeit 'for i in xrange(1000000):' ' pass'
10 loops, best of 3: 27.8 msec per loop

# Python 3
> python3 -m timeit 'for i in range(1000000):' ' pass'
10 loops, best of 3: 51.1 msec per loop

> python3 -m timeit 'for i in xrange(1000000):' ' pass'
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/timeit.py", line 292, in main
    x = t.timeit(number)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/timeit.py", line 178, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
    for i in xrange(1000000):
NameError: name 'xrange' is not defined

Raising exceptions

Where Python 2 accepts both notations, the 'old' and the 'new' way, Python 3 chokes (and raises a SyntaxError in turn) if we don't enclose the exception argument in parentheses:

In []:

# Python 2
>>> raise IOError, "file error"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: file error

>>> raise IOError("file error")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: file error

# Python 3
>>> raise IOError, "file error"
  File "<stdin>", line 1
    raise IOError, "file error"
                 ^
SyntaxError: invalid syntax

>>> raise IOError("file error")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: file error

Handling exceptions

Also the handling of exceptions has slightly changed in Python 3. Now, we have to use the as keyword!

In []:

# Python 2
>>> try:
...     blabla
... except NameError, err:
...     print err, '--> our error msg'
...
name 'blabla' is not defined --> our error msg

# Python 3
>>> try:
...     blabla
... except NameError as err:
...     print(err, '--> our error msg')
...
name 'blabla' is not defined --> our error msg

The `next()` function and `.next()` method

Where you can use both the function and the method in Python 2.7.5, the next() function is all that remains in Python 3!

In []:

# Python 2
>>> my_generator = (letter for letter in 'abcdefg')
>>> my_generator.next()
'a'
>>> next(my_generator)
'b'

# Python 3
>>> my_generator = (letter for letter in 'abcdefg')
>>> next(my_generator)
'a'
>>> my_generator.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'generator' object has no attribute 'next'

Function annotations - What are those `->`'s in my Python code?

Have you ever seen any Python code that used colons inside the parentheses of a function definition?

In [8]:

def foo1(x: 'insert x here', y: 'insert x^2 here'):
    print('Hello, World')
    return

And what about the fancy arrow here?

In [10]:

def foo2(x, y) -> 'Hi!':
    print('Hello, World')
    return

Q: Is this valid Python syntax?
A: Yes!

Q: So, what happens if I just call the function?
A: Nothing!

Here is the proof!

In [9]:

foo1(1,2)

Hello, World

In [11]:

foo2(1,2)

Hello, World

So, those are function annotations ...

the colon for the function parameters
the arrow for the return value

You probably will never make use of them (or at least very rarely). Usually, we write good function documentation below the function signature as a docstring - or at least this is how I would do it (okay, this case is a little bit extreme, I have to admit):

In []:

def is_palindrome(my_str):
    """
    Case- and punctuation-insensitive check if a string is a palindrome.

    Keyword arguments:
        my_str (str): The string to be checked if it is a palindrome.

    Returns `True` if input string is a palindrome, else False.

    """
    stripped_str = [l for l in my_str.lower() if l.isalpha()]
    return stripped_str == stripped_str[::-1]

However, function annotations can be useful to indicate that work is still in progress in some cases. But they are optional and I see them very very rarely.

As it is stated in PEP3107:

Function annotations, both for parameters and return values, are completely optional.
Function annotations are nothing more than a way of associating arbitrary Python expressions with various parts of a function at compile-time.

The nice thing about function annotations is their __annotations__ attribute, which is a dictionary of all the parameters and/or the return value you annotated.

In [17]:

foo1.__annotations__

Out[17]:

{'y': 'insert x^2 here', 'x': 'insert x here'}

In [18]:

foo2.__annotations__

Out[18]:

{'return': 'Hi!'}

When are they useful?

Function annotations can be useful for a couple of things

Documentation in general
pre-condition testing
type checking

...
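As a rough illustration of the type-checking use case, here is a small decorator that reads __annotations__ and compares the arguments against them at call time. Both `check_annotated_types` and `repeat` are hypothetical names made up for this sketch; it only handles annotations that are plain classes and ignores everything else (including the return annotation), so it is not meant as a complete solution.

```python
import functools

def check_annotated_types(func):
    """Hypothetical helper: verify arguments against class-valued annotations."""
    annotations = func.__annotations__
    # Positional parameter names, in order (taken from the code object)
    param_names = func.__code__.co_varnames[:func.__code__.co_argcount]

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        given = dict(zip(param_names, args))
        given.update(kwargs)
        for name, value in given.items():
            expected = annotations.get(name)
            # Only annotations that are classes are checked; others are ignored
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError('{} must be {}, got {}'.format(
                    name, expected.__name__, type(value).__name__))
        return func(*args, **kwargs)
    return wrapper

@check_annotated_types
def repeat(text: str, times: int) -> str:
    return text * times

print(repeat('ab', 2))        # abab

try:
    repeat('ab', '2')
except TypeError as err:
    print('TypeError:', err)  # TypeError: times must be int, got str
```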

Abortive statements in `finally` blocks

Python's try-except-finally blocks are very handy for catching and handling errors. The finally block is always executed whether an exception has been raised or not as illustrated in the following example.

In [24]:

def try_finally1():
    try:
        print('in try:')
        print('do some stuff')
        float('abc')
    except ValueError:
        print('an error occurred')
    else:
        print('no error occurred')
    finally:
        print('always execute finally')

try_finally1()

in try:
do some stuff
an error occurred
always execute finally

But can you also guess what will be printed in the next code cell?

In [21]:

def try_finally2():
    try:
        print("do some stuff in try block")
        return "return from try block"
    finally:
        print("do some stuff in finally block")
        return "always execute finally"

print(try_finally2())

do some stuff in try block
do some stuff in finally block
always execute finally

Here, the abortive return statement in the finally block simply overrules the return in the try block, since finally is guaranteed to always be executed. So, be careful using abortive statements in finally blocks!
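A related consequence that is easy to miss: a return in the finally block does not only override an earlier return, it also discards an exception that is still on its way out. A minimal sketch; `swallow_error` is a made-up function name for this illustration:

```python
def swallow_error():
    try:
        raise ValueError('this error gets lost')
    finally:
        # Returning here discards the exception that is in flight
        return 'returned from finally'

result = swallow_error()
print(result)   # returned from finally
```

The caller never sees the ValueError, which can make bugs very hard to track down.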

From

http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/not_so_obvious_python_stuff.ipynb

0 0

A collection of not-so-obvious Python stuff you should know

Sebastian Raschkalast updated: 04/25/2014Link to this IPython Notebook on GitHubAll code was executed in Python 3.4A collection of not-so-obvious Python stuff you should know!

All code was executed in Python 3.4

Sections

The C3 class resolution algorithm for multiple class inheritance

Using += on lists creates new objects

True and False in the datetime module

Python reuses objects for small integers - use "==" for equality, "is" for identity

And to illustrate the test for equality (==) vs. identity (is):

Shallow vs. deep copies if list contains other structures and objects

Picking True values from logical ands and ors

Don't use mutable objects as default arguments for functions!

Be aware of the consuming generator

bool is a subclass of int

About lambda-in-closures-and-a-loop pitfall

Python's LEGB scope resolution and the keywords global and nonlocal

global vs. local

local vs. enclosed

When mutable contents of immutable tuples aren't so mutable

But what if we put a mutable object into the immutable tuple? Well, modification works, but we also get a TypeError at the same time.

Explanation

One more note about the immutable status of tuples. Tuples are famous for being immutable. However, how comes that this code works?

List comprehensions are fast, but generators are faster!?

To be fair to the list, let us exhaust the generators:

Public vs. private class methods and name mangling

The consequences of modifying a list when looping through it

Dynamic binding and typos in variable names

List slicing using indexes that are "out of range"

Reusing global variable names and UnboundLocalErrors

Creating copies of mutable objects

Key differences between Python 2 and 3

Unicode...

- Python 2:

- Python 3:

The print statement

Integer division

xrange()

Raising exceptions

Handling exceptions

The next() function and .next() method

Function annotations - What are those ->'s in my Python code?

Abortive statements in finally blocks

Sebastian Raschka
last updated: 04/25/2014
Link to this IPython Notebook on GitHub
All code was executed in Python 3.4

A collection of not-so-obvious Python stuff you should know!

Using `+=` on lists creates new objects

`True` and `False` in the datetime module

And to illustrate the test for equality (`==`) vs. identity (`is`):

Picking `True` values from logical `and`s and `or`s

`bool` is a subclass of `int`

Python's LEGB scope resolution and the keywords `global` and `nonlocal`

`global` vs. `local`

`local` vs. `enclosed`

But what if we put a mutable object into the immutable tuple? Well, modification works, but we also get a `TypeError` at the same time.

One more note about the `immutable` status of tuples. Tuples are famous for being immutable. However, how comes that this code works?

Reusing global variable names and `UnboundLocalErrors`

`xrange()`

The `next()` function and `.next()` method

Function annotations - What are those `->`'s in my Python code?

Abortive statements in `finally` blocks