Code Like a Pythonista: Idiomatic Python
In this interactive tutorial, we'll cover many essential Python idioms and techniques in depth, adding immediately useful tools to your belt.
There are 3 versions of this presentation:
- S5 presentation
- Plain HTML handout
- reStructuredText source
©2006-2008, licensed under a Creative Commons Attribution/Share-Alike (BY-SA) license.
My credentials: I am
- a resident of Montreal,
- father of two great kids, husband of one special woman,
- a full-time Python programmer,
- author of the Docutils project and reStructuredText,
- an editor of the Python Enhancement Proposals (or PEPs),
- an organizer of PyCon 2007, and chair of PyCon 2008,
- a member of the Python Software Foundation,
- a Director of the Foundation for the past year, and its Secretary.
In the tutorial I presented at PyCon 2006 (called Text & Data Processing), I was surprised at the reaction to some techniques I used that I had thought were common knowledge. But many of the attendees were unaware of these tools that experienced Python programmers use without thinking.
Many of you will have seen some of these techniques and idioms before. Hopefully you'll learn a few techniques that you haven't seen before and maybe something new about the ones you have already seen.
The Zen of Python (1)
These are the guiding principles of Python, but are open to interpretation. A sense of humor is required for their proper interpretation.
If you're using a programming language named after a sketch comedy troupe, you had better have a sense of humor.
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
...
The Zen of Python (2)
In the face of ambiguity, refuse the temptation to guess.
There should be one—and preferably only one—obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea—let's do more of those!
—Tim Peters
This particular "poem" began as a kind of a joke, but it really embeds a lot of truth about the philosophy behind Python. The Zen of Python has been formalized in PEP 20, where the abstract reads:
Long time Pythoneer Tim Peters succinctly channels the BDFL's guiding principles for Python's design into 20 aphorisms, only 19 of which have been written down.
—http://www.python.org/dev/peps/pep-0020/
You can decide for yourself if you're a "Pythoneer" or a "Pythonista". The terms have somewhat different connotations.
When in doubt:
import this
Try it in a Python interactive interpreter:
>>> import this
Here's another easter egg:
>>> from __future__ import braces
  File "<stdin>", line 1
SyntaxError: not a chance
What a bunch of comedians! :-)
Coding Style: Readability Counts
Programs must be written for people to read, and only incidentally for machines to execute.
—Abelson & Sussman, Structure and Interpretation of Computer Programs
PEP 8: Style Guide for Python Code
Worthwhile reading:
http://www.python.org/dev/peps/pep-0008/
PEP = Python Enhancement Proposal
A PEP is a design document providing information to the Python community, or describing a new feature for Python or its processes or environment.
The Python community has its own standards for what source code should look like, codified in PEP 8. These standards are different from those of other communities, like C, C++, C#, Java, VisualBasic, etc.
Because indentation and whitespace are so important in Python, the Style Guide for Python Code approaches a standard. It would be wise to adhere to the guide! Most open-source projects and (hopefully) in-house projects follow the style guide quite closely.
Whitespace 1
4 spaces per indentation level.
No hard tabs.
Never mix tabs and spaces.
This is exactly what IDLE and the Emacs Python mode support. Other editors may also provide this support.
One blank line between functions.
Two blank lines between classes.
Whitespace 2
- Add a space after "," in dicts, lists, tuples, & argument lists, and after ":" in dicts, but not before.
- Put spaces around assignments & comparisons (except in argument lists).
- No spaces just inside parentheses or just before argument lists.
- No spaces just inside docstrings.
def make_squares(key, value=0):
    """Return a dictionary and a list..."""
    d = {key: value}
    l = [key, value]
    return d, l
Naming
joined_lower for functions, methods, attributes
joined_lower or ALL_CAPS for constants
StudlyCaps for classes
camelCase only to conform to pre-existing conventions
Attributes: interface, _internal, __private
But try to avoid the __private form. I never use it. Trust me. If you use it, you WILL regret it later.
Explanation:
People coming from a C++/Java background are especially prone to overusing/misusing this "feature". But __private names don't work the same way as in Java or C++. They just trigger a name mangling whose purpose is to prevent accidental namespace collisions in subclasses: MyClass.__private just becomes MyClass._MyClass__private. (Note that even this breaks down for subclasses with the same name as the superclass, e.g. subclasses in different modules.) It is possible to access __private names from outside their class, just inconvenient and fragile (it adds a dependency on the exact name of the superclass).
The problem is that the author of a class may legitimately think "this attribute/method name should be private, only accessible from within this class definition" and use the __private convention. But later on, a user of that class may make a subclass that legitimately needs access to that name. So either the superclass has to be modified (which may be difficult or impossible), or the subclass code has to use manually mangled names (which is ugly and fragile at best).
There's a concept in Python: "we're all consenting adults here". If you use the __private form, who are you protecting the attribute from? It's the responsibility of subclasses to use attributes from superclasses properly, and it's the responsibility of superclasses to document their attributes properly.
It's better to use the single-leading-underscore convention, _internal. This isn't name mangled at all; it just indicates to others to "be careful with this, it's an internal implementation detail; don't touch it if you don't fully understand it". It's only a convention though.
There are some good explanations in the answers here:
- http://stackoverflow.com/questions/70528/why-are-pythons-private-methods-not-actually-private
- http://stackoverflow.com/questions/1641219/does-python-have-private-variables-in-classes
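The name mangling described in the explanation above can be observed directly. The following is a minimal sketch; the class names Base and Child and the attribute __token are hypothetical, chosen only for illustration, and the code avoids print so that it should run unchanged on Python 2 and Python 3.

```python
class Base(object):
    def __init__(self):
        self.__token = 'base'       # stored on the instance as _Base__token
        self._internal = 'shared'   # stored as-is; privacy by convention only

    def base_token(self):
        return self.__token         # compiled as self._Base__token


class Child(Base):
    def __init__(self):
        Base.__init__(self)
        self.__token = 'child'      # stored as _Child__token, so no collision

    def child_token(self):
        return self.__token         # compiled as self._Child__token


c = Child()

# Both mangled attributes live side by side on the same instance.
mangled_names = sorted(name for name in vars(c) if name.endswith('__token'))

# Access from outside the class is possible, but depends on the class name.
from_outside = c._Base__token
```

Here mangled_names lists one entry per defining class, which is the collision avoidance the mangling exists for. The from_outside line is the "inconvenient and fragile" access: it breaks as soon as Base is renamed.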
Long Lines & Continuations
Keep lines below 80 characters in length.
Use implied line continuation inside parentheses/brackets/braces:
def __init__(self, first, second, third,
             fourth, fifth, sixth):
    output = (first + second + third
              + fourth + fifth + sixth)
Use backslashes as a last resort:
VeryLong.left_hand_side \
    = even_longer.right_hand_side()
Long Strings
>>> print 'o' 'n' "e"
one
The spaces between literals are not required, but help with readability. Any type of quoting can be used:
>>> print 't' r'\/\/' """o"""
t\/\/o
The string prefixed with an "r" is a "raw" string. Backslashes are not evaluated as escapes in raw strings. They're useful for regular expressions and Windows filesystem paths.
Note named string objects are not concatenated:
>>> a = 'three'
>>> b = 'four'
>>> a b
  File "<stdin>", line 1
    a b
      ^
SyntaxError: invalid syntax
That's because this automatic concatenation is a feature of the Python parser/compiler, not the interpreter. You must use the "+" operator to concatenate strings at run time.
text = ('Long strings can be made up '
        'of several shorter strings.')
The parentheses allow implicit line continuation.
Multiline strings use triple quotes:
"""Triple
double
quotes"""
'''\
Triple
single
quotes\
'''
Compound Statements
Good:
if foo == 'blah':
    do_something()
do_one()
do_two()
do_three()
Bad:
if foo == 'blah': do_something()
do_one(); do_two(); do_three()
Whitespace & indentations are useful visual indicators of the program flow. The indentation of the second "Good" line above shows the reader that something's going on, whereas the lack of indentation in "Bad" hides the "if" statement.
Multiple statements on one line are a cardinal sin. In Python, readability counts.
Docstrings & Comments
Docstrings = How to use code
Comments = Why (rationale) & how code works
Docstrings explain how to use code, and are for the users of your code. Uses of docstrings:
- Explain the purpose of the function even if it seems obvious to you, because it might not be obvious to someone else later on.
- Describe the parameters expected, the return values, and any exceptions raised.
- If the method is tightly coupled with a single caller, make some mention of the caller (though be careful as the caller might change later).
Comments explain why, and are for the maintainers of your code. Examples include notes to yourself, like:
# !!! BUG: ...
# !!! FIX: This is a hack
# ??? Why is this here?
Both of these groups include you, so write good docstrings and comments!
Docstrings are useful in interactive use (help()) and for auto-documentation systems.
False comments & docstrings are worse than none at all. So keep them up to date! When you make changes, make sure the comments & docstrings are consistent with the code, and don't contradict it.
There's an entire PEP about docstrings, PEP 257, "Docstring Conventions":
http://www.python.org/dev/peps/pep-0257/
Practicality Beats Purity
A foolish consistency is the hobgoblin of little minds.
—Ralph Waldo Emerson
(hobgoblin: Something causing superstitious fear; a bogy.)
There are always exceptions. From PEP 8:
But most importantly: know when to be inconsistent -- sometimes the style guide just doesn't apply. When in doubt, use your best judgment. Look at other examples and decide what looks best. And don't hesitate to ask!
Two good reasons to break a particular rule:
- When applying the rule would make the code less readable, even for someone who is used to reading code that follows the rules.
- To be consistent with surrounding code that also breaks it (maybe for historic reasons) -- although this is also an opportunity to clean up someone else's mess (in true XP style).
... but practicality shouldn't beat purity to a pulp!
Idiom Potpourri
A selection of small, useful idioms.
Now we move on to the meat of the tutorial: lots of idioms.
We'll start with some easy ones and work our way up.
Swap Values
In other languages:
temp = a
a = b
b = temp
In Python:
b, a = a, b
- The comma is the tuple constructor syntax.
- A tuple is created on the right (tuple packing).
- A tuple is the target on the left (tuple unpacking).
The right-hand side is unpacked into the names in the tuple on the left-hand side.
Further examples of unpacking:
>>> l =['David', 'Pythonista', '+1-514-555-1234']
>>> name, title, phone = l
>>> name
'David'
>>> title
'Pythonista'
>>> phone
'+1-514-555-1234'
Useful in loops over structured data:
l (L) above is the list we just made (David's info). So people is a list containing two items, each a 3-item list.
>>> people = [l, ['Guido', 'BDFL', 'unlisted']]
>>> for (name, title, phone) in people:
...     print name, phone
...
David +1-514-555-1234
Guido unlisted
Each item in people is being unpacked into the (name, title, phone) tuple.
Arbitrarily nestable (just be sure to match the structure on the left & right!):
>>> david, (gname, gtitle, gphone) = people
>>> gname
'Guido'
>>> gtitle
'BDFL'
>>> gphone
'unlisted'
>>> david
['David', 'Pythonista', '+1-514-555-1234']
More About Tuples
>>> 1,
(1,)
>>> (1,)
(1,)
>>> (1)
1
>>> ()
()
>>> tuple()
()
>>> value = 1,
>>> value
(1,)
Interactive "_"
This is a really useful feature that surprisingly few people know.
In the interactive interpreter, whenever you evaluate an expression or call a function, the result is bound to a temporary name, _ (an underscore):
>>> 1 + 1
2
>>> _
2
_ stores the last printed expression.
When a result is None, nothing is printed, so _ doesn't change. That's convenient!
This only works in the interactive interpreter, not within a module.
It is especially useful when you're working out a problem interactively, and you want to store the result for a later step:
>>> import math
>>> math.pi / 3
1.0471975511965976
>>> angle = _
>>> math.cos(angle)
0.50000000000000011
>>> _
0.50000000000000011
Building Strings from Substrings
colors = ['red', 'blue', 'green', 'yellow']
Don't do this:
result = ''
for s in colors:
    result += s
This is very inefficient.
It has terrible memory usage and performance patterns. The "summation" will compute, store, and then throw away each intermediate step.
Instead, do this:
result = ''.join(colors)
The join() string method does all the copying in one pass.
When you're only dealing with a few dozen or hundred strings, it won't make much difference. But get in the habit of building strings efficiently, because with thousands or with loops, it will make a difference.
Building Strings, Variations 1
If you want spaces between your substrings:
result = ' '.join(colors)
Or commas and spaces:
result = ', '.join(colors)
Here's a common case:
colors = ['red', 'blue', 'green', 'yellow']
print 'Choose', ', '.join(colors[:-1]), \
      'or', colors[-1]
To make a nicely grammatical sentence, we want commas between all but the last pair of values, where we want the word "or". The slice syntax does the job. The "slice until -1" ([:-1]) gives all but the last value, which we join with comma-space.
Of course, this code wouldn't work with corner cases, lists of length 0 or 1.
Choose red, blue, green or yellow
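One way to cover the corner cases mentioned above is to branch on the length before slicing. This is only a sketch: the function name choose_sentence and the wording returned for an empty list are assumptions, not part of the original example. It returns the sentence instead of printing it, so it should run on both Python 2 and Python 3.

```python
def choose_sentence(options):
    """Return 'Choose a, b or c', coping with lists of length 0 and 1."""
    options = list(options)
    if not options:
        # Hypothetical wording; pick whatever suits the application.
        return 'Nothing to choose from'
    if len(options) == 1:
        return 'Choose ' + options[0]
    # Two or more: commas between all but the last pair, then "or".
    return 'Choose %s or %s' % (', '.join(options[:-1]), options[-1])


sentence = choose_sentence(['red', 'blue', 'green', 'yellow'])
```

With two items the slice options[:-1] holds a single value, so join() adds no comma and the result reads "Choose red or blue".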
Building Strings, Variations 2
If you need to apply a function to generate the substrings:
result = ''.join(fn(i) for i in items)
If you need to compute the substrings incrementally, accumulate them in a list first:
items = []
...
items.append(item)  # many times
...
# items is now complete
result = ''.join(fn(i) for i in items)
Use in where possible (1)
Good:
for key in d:
    print key
- in is generally faster.
- This pattern also works for items in arbitrary containers (such as lists, tuples, and sets).
- in is also an operator (as we'll see).
Bad:
for key in d.keys():
    print key
Use in where possible (2)
But .keys() is necessary when mutating the dictionary:
for key in d.keys():
    d[str(key)] = d[key]
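The loop above relies on Python 2's .keys() returning a new list. In Python 3, .keys() returns a live view of the dictionary, so the same loop fails once a key is added. A sketch that behaves the same on both versions takes an explicit snapshot with list(d); the sample dictionaries here are made up for illustration.

```python
d = {1: 'one', 2: 'two'}

# list(d) copies the keys up front, so adding entries in the loop is safe.
for key in list(d):
    d[str(key)] = d[key]

# Iterating over the dictionary itself while adding keys is an error.
e = {1: 'one', 2: 'two'}
try:
    for key in e:
        e[str(key)] = e[key]
except RuntimeError as error:
    failure = str(error)
else:
    failure = None
```

After the first loop d holds both the integer and the string keys. The second loop stops with a RuntimeError as soon as the dictionary changes size.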
For consistency, use key in dict, not dict.has_key():
# do this:
if key in d:
    ...do something with d[key]

# not this:
if d.has_key(key):
    ...do something with d[key]
Dictionary get Method
We often have to initialize dictionary entries before use:
navs = {}
for (portfolio, equity, position) in data:
    if portfolio not in navs:
        navs[portfolio] = 0
    navs[portfolio] += position * prices[equity]
dict.get(key, default) removes the need for the test:
navs = {}
for (portfolio, equity, position) in data:
    navs[portfolio] = (navs.get(portfolio, 0)
                       + position * prices[equity])
Dictionary setdefault Method (1)
Initializing mutable dictionary values:
equities = {}
for (portfolio, equity) in data:
    if portfolio in equities:
        equities[portfolio].append(equity)
    else:
        equities[portfolio] = [equity]
dict.setdefault(key, default) does the job much more efficiently:
equities = {}
for (portfolio, equity) in data:
    equities.setdefault(portfolio, []).append(equity)
dict.setdefault() is equivalent to "get, or set & get". Or "set if necessary, then get". It's especially efficient if your dictionary key is expensive to compute or long to type.
The only problem with dict.setdefault() is that the default value is always evaluated, whether needed or not. That only matters if the default value is expensive to compute.
If the default value is expensive to compute, you may want to use the defaultdict class, which we'll cover shortly.
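The difference in evaluation can be made visible by counting calls. In this sketch, expensive_default is a hypothetical stand-in for a costly computation; it only records that it was called. The code should run on Python 2.5+ and Python 3.

```python
from collections import defaultdict

calls = []

def expensive_default():
    """Hypothetical costly default; records each call and returns a new list."""
    calls.append('called')
    return []

plain = {}
for key in ['a', 'a', 'a']:
    # The argument is evaluated before setdefault() runs, on every pass.
    plain.setdefault(key, expensive_default()).append(1)
setdefault_calls = len(calls)

del calls[:]
lazy = defaultdict(expensive_default)
for key in ['a', 'a', 'a']:
    # The factory is called only when the key is missing.
    lazy[key].append(1)
defaultdict_calls = len(calls)
```

Both dictionaries end up with the same contents, but setdefault() pays for the default three times while defaultdict pays once.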
Dictionary setdefault Method (2)
setdefault can also be used as a stand-alone statement:
navs = {}
for (portfolio, equity, position) in data:
    navs.setdefault(portfolio, 0)
    navs[portfolio] += position * prices[equity]
defaultdict
New in Python 2.5.
defaultdict is new in Python 2.5, part of the collections module. A defaultdict behaves like a regular dictionary, except for two things:
- it takes an extra first argument: a default factory function; and
- when a dictionary key is encountered for the first time, the default factory function is called and the result used to initialize the dictionary value.
There are two ways to get defaultdict:
- import the collections module and reference it via the module:
import collections
d = collections.defaultdict(...)
- or import the defaultdict name directly:
from collections import defaultdict
d = defaultdict(...)
Here is the earlier equities example, rewritten with defaultdict:
from collections import defaultdict
equities = defaultdict(list)
for (portfolio, equity) in data:
    equities[portfolio].append(equity)
There's no fumbling around at all now. In this case, the default factory function is list, which returns an empty list.
This is how to get a dictionary with default values of 0: use int as a default factory function:
navs = defaultdict(int)
for (portfolio, equity, position) in data:
    navs[portfolio] += position * prices[equity]
Building & Splitting Dictionaries
given = ['John', 'Eric', 'Terry', 'Michael']
family = ['Cleese', 'Idle', 'Gilliam', 'Palin']
pythons = dict(zip(given, family))
>>> pprint.pprint(pythons)
{'John': 'Cleese',
 'Michael': 'Palin',
 'Eric': 'Idle',
 'Terry': 'Gilliam'}
>>> pythons.keys()
['John', 'Michael', 'Eric', 'Terry']
>>> pythons.values()
['Cleese', 'Palin', 'Idle', 'Gilliam']
Testing for Truth Values
# do this:
if x:
    pass

# not this:
if x == True:
    pass
Testing a list:
# do this:
if items:
    pass

# not this:
if len(items) != 0:
    pass

# and definitely not this:
if items != []:
    pass
Truth Values
Example of an object's truth value:
>>> class C:
...     pass
...
>>> o = C()
>>> bool(o)
True
>>> bool(C)
True
(Examples: execute truth.py.)
To control the truth value of instances of a user-defined class, use the __nonzero__ or __len__ special methods. Use __len__ if your class is a container which has a length:
class MyContainer(object):
    def __init__(self, data):
        self.data = data
    def __len__(self):
        """Return my length."""
        return len(self.data)
If your class is not a container, use __nonzero__:
class MyClass(object):
    def __init__(self, value):
        self.value = value
    def __nonzero__(self):
        """Return my truth value (True or False)."""
        # This could be arbitrarily complex:
        return bool(self.value)
In Python 3.0, __nonzero__ has been renamed to __bool__ for consistency with the bool built-in type. For compatibility, add this to the class definition:
__bool__ = __nonzero__
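Put together, a class that defines __nonzero__ and aliases it as __bool__ has the same truth value under either Python version. A minimal sketch follows; the Basket class and its count attribute are hypothetical.

```python
class Basket(object):
    """Hypothetical non-container class whose truth value follows `count`."""

    def __init__(self, count):
        self.count = count

    def __nonzero__(self):      # consulted by Python 2
        return self.count > 0

    __bool__ = __nonzero__      # consulted by Python 3


empty, full = Basket(0), Basket(3)

# "if basket:" style tests now reflect the count, not mere existence.
labels = ['full' if basket else 'empty' for basket in (empty, full)]
```

Without either special method, both instances would be true, as in the class C example earlier in this section.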
Index & Item (1)
>>> items = 'zero one two three'.split()
>>> print items
['zero', 'one', 'two', 'three']
Say we want to iterate over the items, and we need both the item's index and the item itself:
i = 0
for item in items:
    print i, item
    i += 1

- or -

for i in range(len(items)):
    print i, items[i]
Index & Item (2): enumerate
The enumerate function takes a list and returns (index, item) pairs:
>>> print list(enumerate(items))
[(0, 'zero'), (1, 'one'), (2, 'two'), (3, 'three')]
Our loop becomes much simpler:
for (index, item) in enumerate(items):
    print index, item
# compare:
index = 0
for item in items:
    print index, item
    index += 1

# compare:
for i in range(len(items)):
    print i, items[i]
The enumerate version is much shorter and simpler than the first of these, and much easier to read and understand than either.
An example showing how the enumerate function actually returns an iterator (a generator is a kind of iterator):
>>> enumerate(items)
<enumerate object at 0x011EA1C0>
>>> e = enumerate(items)
>>> e.next()
(0, 'zero')
>>> e.next()
(1, 'one')
>>> e.next()
(2, 'two')
>>> e.next()
(3, 'three')
>>> e.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration
Other languages have "variables"
int a = 1;
Box "a" now contains an integer 1.
Assigning another value to the same variable replaces the contents of the box:
a = 2;
Now box "a" contains an integer 2.
Assigning one variable to another makes a copy of the value and puts it in the new box:
int b = a;
Python has "names"
a = 1
Here, an integer 1 object has a tag labelled "a".
If we reassign to "a", we just move the tag to another object:
a = 2
Now the name "a" is attached to an integer 2 object.
The original integer 1 object no longer has a tag "a". It may live on, but we can't get to it through the name "a". (When an object has no more references or tags, it is removed from memory.)
If we assign one name to another, we're just attaching another nametag to an existing object:
b = a
Although we commonly refer to "variables" even in Python (because it's common terminology), we really mean "names" or "identifiers". In Python, "variables" are nametags for values, not labelled boxes.
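The nametag model is easiest to see with a mutable object, where a change made through one name is visible through the other. A small sketch, using a made-up list and the identity operator "is":

```python
a = [1, 2, 3]
b = a                       # a second tag on the same list object
b.append(4)                 # mutate the object through the name "b"

same_object = a is b        # both names refer to one object
a_after_append = list(a)    # the change is visible through "a" too

b = [9]                     # rebinding moves the tag "b" to a new object
a_after_rebind = list(a)    # "a" still refers to the original list
still_same = a is b
```

Mutation goes through the tag to the object. Assignment only moves the tag, so rebinding b leaves a untouched.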
If you get nothing else out of this tutorial, I hope you understand how Python names work. A good understanding is certain to pay dividends, helping you to avoid cases like this:
Default Parameter Values
def bad_append(new_item, a_list=[]):
    a_list.append(new_item)
    return a_list
>>> print bad_append('one')
['one']
>>> print bad_append('two')
['one', 'two']
def good_append(new_item, a_list=None):
    if a_list is None:
        a_list = []
    a_list.append(new_item)
    return a_list
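The surprising output of bad_append comes from the default value being evaluated once, when the def statement runs, and then stored on the function object. Every call that omits a_list therefore gets the same list. The sketch below repeats the bad_append definition to stay self-contained and inspects the stored default through the function's __defaults__ attribute (available in Python 2.6+ and Python 3).

```python
def bad_append(new_item, a_list=[]):
    a_list.append(new_item)
    return a_list

first = bad_append('one')
second = bad_append('two')

# Both calls returned the very same list object...
shared = first is second

# ...which is the default stored on the function itself.
stored_default = bad_append.__defaults__[0]
default_is_shared = stored_default is first
```

This is the names-and-objects model again: a_list is a name that is bound, on each call, to the one list object created at definition time.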
% String Formatting
Python's % operator works much like C's sprintf function, although if you don't know C, that's not very helpful. Basically, you provide a template or format and interpolation values.
In this example, the template contains two conversion specifications: "%s" means "insert a string here", and "%i" means "convert an integer to a string and insert here". "%s" is particularly useful because it uses Python's built-in str() function to convert any object to a string.
The interpolation values must match the template; we have two values here, a tuple.
name = 'David'
messages = 3
text = ('Hello %s, you have %i messages'
        % (name, messages))
print text
Output:
Hello David, you have 3 messages
Advanced % String Formatting
By name with a dictionary:
values = {'name': name, 'messages': messages}
print ('Hello %(name)s, you have %(messages)i '
       'messages' % values)
Here we specify the names of interpolation values, which are looked up in the supplied dictionary.
Notice any redundancy? The names "name" and "messages" are already defined in the local namespace. We can take advantage of this.
By name using the local namespace:
print ('Hello %(name)s, you have %(messages)i '
       'messages' % locals())
The locals() function returns a dictionary of all locally-available names.
This is very powerful. With this, you can do all the string formatting you want without having to worry about matching the interpolation values to the template.
But power can be dangerous. ("With great power comes great responsibility.") If you use the locals() form with an externally-supplied template string, you expose your entire local namespace to the caller. This is just something to keep in mind.
To examine your local namespace:
>>> from pprint import pprint
>>> pprint(locals())
Advanced % String Formatting
By name using the instance namespace:
print ("We found %(error_count)d errors" % self.__dict__)
Equivalent to, but more flexible than:
print ("We found %d errors" % self.error_count)
List Comprehensions
The traditional way, with for and if statements:
new_list = []
for item in a_list:
    if condition(item):
        new_list.append(fn(item))
As a list comprehension:
new_list = [fn(item) for item in a_list if condition(item)]
For example, a list of the squares of 0–9:
>>> [n ** 2 for n in range(10)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
A list of the squares of odd 0–9:
>>> [n ** 2 for n in range(10) if n % 2]
[1, 9, 25, 49, 81]
Generator Expressions (1)
As a loop:
total = 0
for num in range(1, 101):
    total += num * num
As a list comprehension:
total = sum([num * num for num in range(1, 101)])
As a generator expression:
total = sum(num * num for num in xrange(1, 101))
Generator expressions ("genexps") are just like list comprehensions, except that where listcomps are greedy, generator expressions are lazy. Listcomps compute the entire result list all at once, as a list. Generator expressions compute one value at a time, when needed, as individual values. This is especially useful for long sequences where the computed list is just an intermediate step and not the final result.
In this case, we're only interested in the sum; we don't need the intermediate list of squares. We use xrange for the same reason: it lazily produces values, one at a time.
Generator Expressions (2)
total = sum(num * num for num in xrange(1, 1000000000))
Rule of thumb:
- Use a list comprehension when a computed list is the desired endresult.
- Use a generator expression when the computed list is just anintermediate step.
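The greedy/lazy distinction behind this rule of thumb can be checked by recording when the work happens. In this sketch, square is a hypothetical helper that logs each call. The code uses range and the built-in next() so that it should run on Python 2.6+ and Python 3.

```python
log = []

def square(n):
    """Record each call so we can see when the work is done."""
    log.append(n)
    return n * n

eager = [square(n) for n in range(3)]   # all three calls happen right here
calls_after_listcomp = len(log)

del log[:]
lazy = (square(n) for n in range(3))    # no call to square() yet
calls_after_genexp = len(log)

first = next(lazy)                      # computes exactly one value
calls_after_next = len(log)

rest = list(lazy)                       # drains the remaining values
```

The generator expression does no squaring until a consumer such as next(), sum() or list() asks for values, which is why it never needs to hold the whole intermediate list in memory.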
Here's a recent example I saw at work.
We needed a dictionary mapping month numbers (both as strings and as integers) to month codes for futures contracts. It can be done in one logical line of code.
The way this works is as follows:
- The dict() built-in takes a list of key/value pairs (2-tuples).
- We have a list of month codes (each month code is a single letter, and a string is also a sequence of letters). We enumerate over this list to get both the month code and the index.
- The month numbers start at 1, but Python starts indexing at 0, so the month number is one more than the index.
- We want to look up months both as strings and as integers. We can use the int() and str() functions to do this for us, and loop over them.
Recent example:
month_codes = dict((fn(i+1), code)
    for i, code in enumerate('FGHJKMNQUVXZ')
    for fn in (int, str))
month_codes result:
{ 1:  'F',  2:  'G',  3:  'H',  4:  'J', ...
 '1': 'F', '2': 'G', '3': 'H', '4': 'J', ...}
Sorting
a_list.sort()
(Note that the list is sorted in-place: the original list is sorted, and the sort method does not return the list or a copy.)
But what if you have a list of data that you need to sort, but it doesn't sort naturally (i.e., sort on the first column, then the second column, etc.)? You may need to sort on the second column first, then the fourth column.
We can use list's built-in sort method with a custom function:
def custom_cmp(item1, item2):
    return cmp((item1[1], item1[3]),
               (item2[1], item2[3]))

a_list.sort(custom_cmp)
Sorting with DSU *
DSU = Decorate-Sort-Undecorate
* Note: DSU is often no longer necessary. See the next section, Sorting With Keys, for the new approach.
# Decorate:
to_sort = [(item[1], item[3], item)
           for item in a_list]

# Sort:
to_sort.sort()

# Undecorate:
a_list = [item[-1] for item in to_sort]
The first line creates a list containing tuples: copies of the sort terms in priority order, followed by the complete data record.
The second line does a native Python sort, which is very fast and efficient.
The third line retrieves the last value from each tuple in the sorted list. Remember, this last value is the complete data record. We're throwing away the sort terms, which have done their job and are no longer needed.
Sorting With Keys
def my_key(item):
    return (item[1], item[3])

to_sort.sort(key=my_key)
The function my_key will be called once for each item in the to_sort list.
You can make your own key function, or use any existing one-argument function if applicable:
- str.lower to sort alphabetically regardless of case.
- len to sort on the length of the items (strings or containers).
- int or float to sort numerically, as with numeric strings like "2", "123", "35".
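The key functions listed above can be tried on small samples. The sketch below also uses operator.itemgetter from the standard library as a ready-made replacement for a hand-written key function like my_key; that substitution is a suggestion here, not something the original example uses. The sample rows and words are made up. Because Python 3 removed cmp() and the comparison-function argument to sort(), the key approach is the one that works on both Python 2 and Python 3.

```python
from operator import itemgetter

# Hypothetical records: sort on the second column, then the fourth.
rows = [
    ('r1', 'b', 0, 2),
    ('r2', 'a', 0, 9),
    ('r3', 'b', 0, 1),
    ('r4', 'a', 0, 3),
]
by_columns = sorted(rows, key=itemgetter(1, 3))
names = [row[0] for row in by_columns]

# Existing one-argument functions used directly as keys.
caseless = sorted(['pear', 'Apple', 'fig'], key=str.lower)
by_length = sorted(['ccc', 'a', 'bb'], key=len)
numeric = sorted(['2', '123', '35'], key=int)
```

itemgetter(1, 3) returns a callable that builds the tuple (item[1], item[3]) for each record, which is what my_key does by hand.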
Generators
def my_range_generator(stop):
    value = 0
    while value < stop:
        yield value
        value += 1

for i in my_range_generator(10):
    do_something(i)
The yield keyword turns a function into a generator. When you call a generator function, instead of running the code immediately Python returns a generator object, which is an iterator; it has a next method. for loops just call the next method on the iterator, until a StopIteration exception is raised. You can raise StopIteration explicitly, or implicitly by falling off the end of the generator code as above.
Generators can simplify sequence/iterator handling, because we don't need to build concrete lists; just compute one value at a time. The generator function maintains state.
This is how a for loop really works. Python looks at the sequence supplied after the in keyword. If it's a simple container (such as a list, tuple, dictionary, set, or user-defined container) Python converts it into an iterator. If it's already an iterator, Python uses it directly.
Then Python repeatedly calls the iterator's next method, assigns the return value to the loop counter (i in this case), and executes the indented code. This is repeated over and over, until StopIteration is raised, or a break statement is executed in the code.
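The steps just described can be written out by hand. This is a rough sketch of the protocol, not the interpreter's actual implementation, and the function name manual_for is hypothetical. It uses the built-in iter() and next() functions (Python 2.6+ and Python 3); next() calls the iterator's next method, which Python 3 spells __next__.

```python
def manual_for(iterable, body):
    """Roughly what `for item in iterable: body(item)` does."""
    iterator = iter(iterable)       # containers are converted; iterators pass through
    while True:
        try:
            item = next(iterator)   # ask the iterator for its next value
        except StopIteration:
            break                   # the iterator has run dry
        body(item)                  # the indented code of the loop


seen = []
manual_for(['a', 'b', 'c'], seen.append)

squares = []
manual_for((n * n for n in range(3)), squares.append)
```

The first call passes a list, which iter() turns into a list iterator. The second passes a generator expression, which is already an iterator and is used directly.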
A for loop can have an else clause, whose code is executed after the iterator runs dry, but not after a break statement is executed. This distinction allows for some elegant uses. else clauses are not always or often used on for loops, but they can come in handy. Sometimes an else clause perfectly expresses the logic you need.
For example, if we need to check that a condition holds on some item, any item, in a sequence:
for item in sequence:
    if condition(item):
        break
else:
    raise Exception('Condition not satisfied.')
Example Generator
Filter out blank rows from a CSV reader (or items from a list):
def filter_rows(row_iterator):
    for row in row_iterator:
        if row:
            yield row

data_file = open(path, 'rb')
irows = filter_rows(csv.reader(data_file))
Reading Lines From Text/Data Files
datafile = open('datafile')
for line in datafile:
    do_something(line)
This is possible because files support a next method, as do other iterators: lists, tuples, dictionaries (for their keys), generators.
There is a caveat here: because of the way the buffering is done, you cannot mix .next & .read* methods unless you're using Python 2.5+.
EAFP vs. LBYL
It's easier to ask forgiveness than permission
Look before you leap
Duck typing
If it walks like a duck, and talks like a duck, and looks like a duck: it's a duck. (Goose? Close enough.)
Exceptions
Use coercion if an object must be a particular type. If x must be a string for your code to work, why not call
str(x)
instead of trying something like
isinstance(x, str)
EAFP try/except Example
try:
    return str(x)
except TypeError:
    ...
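The two styles can be compared side by side on a conversion that fails more readily than str(). In this sketch the function names to_int_lbyl and to_int_eafp are hypothetical, and returning None on failure is just one possible policy.

```python
def to_int_lbyl(text):
    """Look before you leap: test first, then convert."""
    if isinstance(text, str) and text.isdigit():
        return int(text)
    return None


def to_int_eafp(text):
    """Ask forgiveness: just try the conversion and handle failure."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return None


lbyl_negative = to_int_lbyl('-5')   # rejected: the hand-written check is too strict
eafp_negative = to_int_eafp('-5')   # accepted: int() itself decides
```

The LBYL version has to re-implement int()'s idea of a valid number and gets it wrong for '-5'. The EAFP version defers to int() and only names the exceptions it is prepared to handle.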
Importing
from module import *
You've probably seen this "wild card" form of the import statement. You may even like it. Don't use it.
To adapt a well-known exchange:
(Exterior Dagobah, jungle, swamp, and mist.)
LUKE: Is from module import * better than explicit imports?
YODA: No, not better. Quicker, easier, more seductive.
LUKE: But how will I know why explicit imports are better than the wild-card form?
YODA: Know you will when your code you try to read six months from now.
Wild-card imports are from the dark side of Python.
Never!
The from module import * wild-card style leads to namespace pollution. You'll get things in your local namespace that you didn't expect to get. You may see imported names obscuring module-defined local names. You won't be able to figure out where certain names come from. Although a convenient shortcut, this should not be in production code.
Moral: don't use wild-card imports!
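A concrete case of such pollution is the standard os module, whose wild-card import brings in os.open and so hides the built-in open function. The sketch below runs the wild-card import inside a scratch dictionary (the name namespace is arbitrary) so that the surrounding code stays clean.

```python
import os

namespace = {}
# Run a wild-card import in a scratch namespace instead of our own.
exec("from os import *", namespace)

# The name "open" now refers to the low-level os.open...
shadowed = namespace['open'] is os.open

# ...and it arrived together with a large number of other names.
imported_count = len([name for name in namespace
                      if not name.startswith('__')])
```

Code that later calls open(path) expecting a file object would instead call os.open with the wrong arguments, and nothing at the call site shows where that name came from.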
Instead, it's much better to:
- reference names through their module (fully qualified identifiers),
import module
module.name
- import a long module using a shorter name (alias; recommended),
import long_module_name as mod
mod.name
- or explicitly import just the names you need.
from module import name
name
Modules & Scripts
To make a simultaneously importable module and executable script:
if __name__ == '__main__': # script code here
When imported, a module's __name__ attribute is set to themodule's file name, without ".py". So the code guarded by theif statement above will not run when imported. When executedas a script though, the __name__ attribute is set to"__main__", and the script codewill run.
Except for special cases, you shouldn't put any major executablecode at the top-level. Put code in functions, classes, methods,and guard it withif __name__ == '__main__'.
Module Structure
"""module docstring"""

# imports
# constants
# exception classes
# interface functions
# classes
# internal functions & classes

def main(...):
    ...

if __name__ == '__main__':
    status = main()
    sys.exit(status)
Command-Line Processing
Example: cmdline.py:
#!/usr/bin/env python

"""Module docstring."""

import sys
import optparse

def process_command_line(argv):
    """
    Return a 2-tuple: (settings object, args list).
    `argv` is a list of arguments, or `None` for ``sys.argv[1:]``.
    """
    if argv is None:
        argv = sys.argv[1:]

    # initialize the parser object:
    parser = optparse.OptionParser(
        formatter=optparse.TitledHelpFormatter(width=78),
        add_help_option=None)

    # define options here:
    parser.add_option(      # customized description; put --help last
        '-h', '--help', action='help',
        help='Show this help message and exit.')

    settings, args = parser.parse_args(argv)

    # check number of arguments, verify values, etc.:
    if args:
        parser.error('program takes no command-line arguments; '
                     '"%s" ignored.' % (args,))

    # further process settings & args if necessary

    return settings, args

def main(argv=None):
    settings, args = process_command_line(argv)
    # application code here, like:
    # run(settings, args)
    return 0        # success

if __name__ == '__main__':
    status = main()
    sys.exit(status)
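The optparse module used above has since been deprecated in favour of argparse in the standard library. As a hedged sketch of the same structure with argparse (a swapped-in module, not the author's code), the following keeps the process_command_line/main split. The --verbose flag and the paths argument are invented for illustration, whereas the original example accepts no positional arguments:

```python
import argparse


def process_command_line(argv=None):
    """Return the parsed settings; `argv=None` means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="Hypothetical example program.")
    parser.add_argument(
        "-v", "--verbose", action="store_true",
        help="Print extra progress information.")
    parser.add_argument(
        "paths", nargs="*",
        help="Zero or more input files (illustrative only).")
    return parser.parse_args(argv)


def main(argv=None):
    settings = process_command_line(argv)
    if settings.verbose:
        print("processing %d path(s)" % len(settings.paths))
    # application code here
    return 0  # success


# Passing an explicit list exercises the parser without touching sys.argv.
status = main(["--verbose", "a.txt", "b.txt"])
```

Because main() accepts an argv list, it can be called from tests or from an interactive session; in a real script it would still be invoked from the if __name__ == '__main__' guard and its return value handed to sys.exit().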
Packages
package/
    __init__.py
    module1.py
    subpackage/
        __init__.py
        module2.py
- Used to organize your project.
- Reduces entries in load-path.
- Reduces import name conflicts.
Example:
import package.module1
from package.subpackage import module2
from package.subpackage.module2 import name
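To try the three import forms without setting up a project by hand, the sketch below builds the same layout in a temporary directory and imports from it. The package name demo_pkg and the module contents (VALUE, name) are placeholders made up for this example:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the directory layout shown above under a temporary root.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
sub = pkg / "subpackage"
sub.mkdir(parents=True)
(pkg / "__init__.py").write_text("")
(pkg / "module1.py").write_text("VALUE = 1\n")
(sub / "__init__.py").write_text("")
(sub / "module2.py").write_text("name = 'from module2'\n")

# Only the root goes on the load path; the package supplies the rest.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pkg.module1
from demo_pkg.subpackage import module2
from demo_pkg.subpackage.module2 import name

print(demo_pkg.module1.VALUE, module2.name, name)
```

A single sys.path entry is enough to reach every module, which is what "reduces entries in load-path" refers to.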
In Python 2.5 we now have absolute and relative imports via a future import:
from __future__ import absolute_import
I haven't delved into these myself yet, so we'll conveniently cut this discussion short.
Simple is Better Than Complex
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
-- Brian W. Kernighan, co-author of The C Programming Language and the "K" in "AWK"
Don't reinvent the wheel
Check Python's standard library.
Check the Python Package Index (the "Cheese Shop"):
http://cheeseshop.python.org/pypi
Search the web. Google is your friend.
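As one small example of what checking the standard library can save, compare a hand-written word count with collections.Counter. The sample sentence is arbitrary, and this is only a sketch of the general point:

```python
from collections import Counter

words = "the quick brown fox jumps over the lazy dog the end".split()

# Hand-rolled version: works, but reinvents the wheel.
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1

# Standard-library version: one line, plus helpers such as most_common().
library_counts = Counter(words)

print(library_counts.most_common(1))
```

Both approaches produce the same counts; the library version is shorter and already tested.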
References
- "Python Objects", Fredrik Lundh, http://www.effbot.org/zone/python-objects.htm
- "How to think like a Pythonista", Mark Hammond, http://python.net/crew/mwh/hacks/objectthink.html
- "Python main() functions", Guido van Rossum, http://www.artima.com/weblogs/viewpost.jsp?thread=4829
- "Python Idioms and Efficiency", http://jaynes.colorado.edu/PythonIdioms.html
- "Python track: python idioms", http://www.cs.caltech.edu/courses/cs11/material/python/misc/python_idioms.html
- "Be Pythonic", Shalabh Chaturvedi, http://www.cafepy.com/article/be_pythonic/ (PDF version)
- "Python Is Not Java", Phillip J. Eby, http://dirtsimple.org/2004/12/python-is-not-java.html
- "What is Pythonic?", Martijn Faassen, http://faassen.n--tree.net/blog/view/weblog/2005/08/06/0
- "Sorting Mini-HOWTO", Andrew Dalke, http://wiki.python.org/moin/HowTo/Sorting
- "Python Idioms", http://www.gungfu.de/facts/wiki/Main/PythonIdioms
- "Python FAQs", http://www.python.org/doc/faq/