5 Simple Rules For Building Great Python Packages


A package seems simple enough to build: it is just a collection of modules in a directory with an __init__.py, right? As straightforward as that may seem, a poorly designed package that accumulates modifications over time will tend towards circular dependency problems, and may become non-portable and brittle.
Following these 5 simple design patterns will help you avoid these common pitfalls, and write packages that will live long and prosper.
1. __init__.py is Only for Imports
For a simple package, you might be tempted to throw utility methods, factories and exceptions into your __init__.py. Don’t.
A well-formed __init__.py serves one very important purpose: to import from sub-modules. Your __init__.py should look something like this:

# ORDER MATTERS HERE -- SOME MODULES ARE DEPENDENT ON OTHERS
# (explicit relative imports, so the example also works on Python 3)
from .exceptions import (
    FSQError, FSQEnvError, FSQEncodeError,
    FSQTimeFmtError, FSQMalformedEntryError,
    FSQCoerceError, FSQEnqueueError, FSQConfigError,
    FSQPathError, FSQInstallError, FSQCannotLockError,
    FSQWorkItemError, FSQTTLExpiredError,
    FSQMaxTriesError, FSQScanError, FSQDownError,
    FSQDoneError, FSQFailError, FSQTriggerPullError,
    FSQHostsError, FSQReenqueueError, FSQPushError,
)

# constants relies on: exceptions, internal
from . import constants
# const relies on: constants, exceptions, internal
from .const import const, set_const  # has tests
# path relies on: exceptions, constants, internal
from . import path  # has tests
# lists relies on: path
from .lists import hosts, queues
#...

2. Use __init__.py to Enforce Import Order
As seen in the above example, __init__.py solves 2 problems:

  1. Expose methods and classes at package scope, so your user doesn’t have to descend into your internal package structure, making your package easy to use.
  2. Become a single place for reconciling the order of imports.

Used well, __init__.py affords you the flexibility to reorganize your internal package structure without worrying about side effects from internal sub-module imports or the order of imports within each module. Because you are only importing from sub-modules in a specific order, your __init__.py should be simple for other programmers to grok, and should serve as a manifest of all functionality provided by the package.
A docstring and an assignment to an __all__ attribute at the package level should be the only non-import code in your __init__.py:

__all__ = [ 'FSQError', 'FSQEnvError', 'FSQEncodeError', 'FSQTimeFmtError',
            'FSQMalformedEntryError', 'FSQCoerceError', 'FSQEnqueueError',
            'FSQConfigError', 'FSQCannotLock', 'FSQWorkItemError',
            'FSQTTLExpiredError', 'FSQMaxTriesError', 'FSQScanError',
            'FSQDownError', 'FSQDoneError', 'FSQFailError', 'FSQInstallError',
            'FSQTriggerPullError', 'FSQCannotLockError', 'FSQPathError',
            'path', 'constants', 'const', 'set_const', 'down', 'up',
            # ...
          ]
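
To make the two roles of __init__.py concrete, here is a minimal sketch that writes a tiny package to a temporary directory and imports it. The package name demo_pkg, its sub-modules and everything they define are hypothetical and exist only for this illustration; they are not part of fsq.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical package name and contents, used only for this illustration.
PKG_NAME = "demo_pkg"
FILES = {
    "exceptions.py": '''
        class DemoError(Exception):
            """Root of all demo_pkg errors."""
    ''',
    "lists.py": '''
        from .exceptions import DemoError

        def queues():
            return ["a", "b"]
    ''',
    "__init__.py": '''
        """demo_pkg: imports only, in dependency order."""
        # exceptions has no dependencies, so it comes first
        from .exceptions import DemoError
        # lists relies on: exceptions
        from .lists import queues

        __all__ = ["DemoError", "queues"]
    ''',
}


def build_package(root: Path) -> None:
    """Write the demo package into a new directory under root."""
    pkg_dir = root / PKG_NAME
    pkg_dir.mkdir()
    for name, source in FILES.items():
        (pkg_dir / name).write_text(textwrap.dedent(source))


with tempfile.TemporaryDirectory() as tmp:
    build_package(Path(tmp))
    sys.path.insert(0, tmp)
    try:
        demo_pkg = importlib.import_module(PKG_NAME)
    finally:
        sys.path.remove(tmp)

# Callers work at package scope and never reach into demo_pkg.lists.
exported = sorted(demo_pkg.__all__)
print(exported)
print(demo_pkg.queues())
```

The caller only ever touches demo_pkg.queues and demo_pkg.DemoError, so the sub-modules could be split or renamed later without changing calling code, as long as __init__.py keeps exposing the same names.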

3. Use One Module to Define All Exceptions
You may have noticed that the first import in the __init__.py pulls in all exceptions from a single exceptions.py sub-module. This is a departure from what you’ll see in most packages, where exceptions are defined close to the code raising them. While that may provide high cohesion within a module, in a sufficiently complex package the pattern breaks down in 2 ways:

  1. Often a module or program will need to import from one sub-module to get a function that, in turn, imports and uses the code raising an exception. To trap the exception with granularity, you’ll need to import both the module you need and the module defining the exception (or worse, chain the import of the exception). This sort of derivative import requirement is the first step towards a convoluted web of imports within your package. The more often you repeat this pattern, the more interdependent and error-prone your package becomes.
  2. Over time, as the number of exceptions increases, it becomes more and more difficult to find all of the exceptions a package is capable of raising. Defining all exceptions in a single module provides one convenient place a programmer can inspect to determine the full surface area of potential error conditions raised by your package.
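
As a rough illustration of the second point, the sketch below lists every exception a single module defines. The module object and the Demo* classes are hypothetical stand-ins for a real exceptions.py; the only library call relied on is inspect.getmembers from the standard library.

```python
import inspect
import types


class DemoError(Exception):
    """Root for all errors of the hypothetical package."""


class DemoScanError(DemoError):
    """Raised when a scan fails."""


class DemoEnqueueError(DemoError):
    """Raised when an enqueue fails."""


# Stand-in for a package's exceptions.py sub-module (hypothetical).
exceptions = types.ModuleType("exceptions")
for cls in (DemoError, DemoScanError, DemoEnqueueError):
    setattr(exceptions, cls.__name__, cls)
exceptions.VERSION = "1.0"  # a non-exception attribute, to show it is skipped


def error_surface(module, base):
    """Return the names of all classes in module that derive from base."""
    def is_error(obj):
        return inspect.isclass(obj) and issubclass(obj, base)
    # getmembers returns (name, value) pairs sorted by name
    return [name for name, _ in inspect.getmembers(module, is_error)]


print(error_surface(exceptions, DemoError))
```

With exceptions scattered across many sub-modules, the same question would require walking every module in the package.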

You should define 1 base Exception for your package:

class APackageException(Exception):
    '''root for APackage Exceptions, only used to except any APackage error, never raised'''
    pass

And then ensure that your package raises only descendants of this base Exception for all error conditions, so you can suppress all exceptions if you need to:

try:
    '''bunch of code from your package'''
except APackageException:
    '''blanket condition to handle all errors from your package'''

The notable exceptions to this rule are generic error conditions already covered by the standard library (e.g. TypeError, ValueError, etc.), which you can raise directly.
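A minimal sketch of how the base class and the standard library exceptions can coexist. All names here (DemoError, DemoConfigError, SETTINGS, get_setting, safe_get) are hypothetical and chosen only for illustration.

```python
class DemoError(Exception):
    """Root for all errors of the hypothetical package; never raised directly."""


class DemoConfigError(DemoError):
    """Raised when a setting is unknown."""


SETTINGS = {"retries": 3}


def get_setting(name):
    """Look up a setting by name."""
    if not isinstance(name, str):
        # generic misuse: a standard library exception already fits
        raise TypeError("name must be a str, not %s" % type(name).__name__)
    try:
        return SETTINGS[name]
    except KeyError:
        # package-specific failure: raise a descendant of the base class
        raise DemoConfigError("unknown setting: %s" % name) from None


def safe_get(name, default=None):
    """Return a setting, falling back to default on any package error."""
    try:
        return get_setting(name)
    except DemoError:
        return default


print(safe_get("retries"))
print(safe_get("missing", default=0))
```

Catching DemoError suppresses every package-specific failure, while a TypeError caused by calling safe_get(5) still propagates to the caller, which is usually what you want for programming errors.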
Define exceptions liberally and with plenty of granularity:

# from fsq
class FSQEnvError(FSQError):
    '''An error if something cannot be loaded from env, or env has an invalid
       value'''
    pass

class FSQEncodeError(FSQError):
    '''An error occurred while encoding or decoding an argument'''
    pass

# ... and 20 or so more

More granularity in your exceptions allows programmers to wrap larger and larger uninterrupted blocks of code in a single try/except:

import fsq
from fsq import FSQEnqueueError, FSQScanError

# this
try:
    item = fsq.senqueue('queue', 'str', 'arg', 'arg')
    scanner = fsq.scan('queue')
except FSQScanError:
    '''do something'''
except FSQEnqueueError:
    '''do something else'''

# not this
try:
    item = fsq.senqueue('queue', 'str', 'arg', 'arg')
except FSQEnqueueError:
    '''do something else'''
try:
    scanner = fsq.scan('queue')
except FSQScanError:
    '''do something'''

# and definitely not
try:
    item = fsq.senqueue('queue', 'str', 'arg', 'arg')
    try:
        scanner = fsq.scan('queue')
    except FSQScanError:
        '''do something'''
except FSQEnqueueError:
    '''do something else'''

High levels of granularity in exception definitions lead to less convoluted error handling, and allow you to group the normal execution instructions and the error handling instructions separately, making your code easier to understand and maintain.
4. Only Relative Imports Within the Package
One of the simplest mistakes you’ll see commonly in sub-modules is importing from the package using the package name itself:

# within a sub-module
from a_package import APackageError

This decision results in two unsavory outcomes:

  1. The sub-modules will only function properly if the package is installed in PYTHONPATH.
  2. The sub-modules will only function properly if the package is named a_package.

While the former may not seem like a big problem, consider what happens if you have two packages of the same name in two different directories in the PYTHONPATH. Your sub-module may well end up importing from another package, and you’ll have unintentionally inflicted a late night of debugging on one or more unsuspecting programmers (or yourself).
Rather than importing from your own package name, always use relative imports within a package:

# within a sub-module
from . import FSQEnqueueError, FSQCoerceError, FSQError, FSQReenqueueError,\
              constants as _c, path as fsq_path, construct,\
              hosts as fsq_hosts, FSQWorkItem
from .internal import rationalize_file, wrap_io_os_err, fmt_time,\
                      coerce_unicode, uid_gid
# you can also use `..`, `...`, etc. in sub-packages.
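
The second outcome (dependence on the package name) can be checked with a small experiment. The sketch below writes the same sources to disk under two different package names and imports both; the names demo_first_name and demo_second_name, and the sub-modules themselves, are hypothetical.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical sub-modules; none of them mentions the package name.
FILES = {
    "exceptions.py": '''
        class DemoError(Exception):
            """Root error for the package."""
    ''',
    "core.py": '''
        from .exceptions import DemoError

        def package_of_error():
            # report which package DemoError was actually loaded from
            return DemoError.__module__.split(".")[0]
    ''',
    "__init__.py": '''
        from .exceptions import DemoError
        from .core import package_of_error
    ''',
}


def install_as(root: Path, package_name: str) -> None:
    """Write the same sources into root under the given package name."""
    pkg_dir = root / package_name
    pkg_dir.mkdir()
    for name, source in FILES.items():
        (pkg_dir / name).write_text(textwrap.dedent(source))


results = {}
with tempfile.TemporaryDirectory() as tmp:
    sys.path.insert(0, tmp)
    try:
        for package_name in ("demo_first_name", "demo_second_name"):
            install_as(Path(tmp), package_name)
            # new files appeared on sys.path after the finder was created
            importlib.invalidate_caches()
            module = importlib.import_module(package_name)
            results[package_name] = module.package_of_error()
    finally:
        sys.path.remove(tmp)

print(results)
```

Each copy resolves DemoError from its own directory, whatever that directory is called. Had core.py used an absolute import of a fixed package name, the renamed copy would either fail to import or silently pick up the other copy.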

5. Keep Modules Small
Your modules should be small. Remember that a programmer using your package will be importing from package scope, and you will be using your __init__.py as an organizational tool to help expose a coherent interface.
A good rule of thumb is to only have one class definition per module, along with any helper and factory methods you’ll expose to help construct it:

class APackageClass(object):
    '''One class'''

def apackage_builder(how_many):
    for i in range(how_many):
        yield APackageClass()

If your module exposes methods, group interdependent methods into single modules, and move any non-interdependent methods to separate modules:

####### EXPOSED METHODS #######
def enqueue(trg_queue, item_f, *args, **kwargs):
    '''Enqueue the contents of a file, or file-like object, file-descriptor or
       the contents of a file at an address (e.g. '/my/file') queue with
       arbitrary arguments, enqueue is to venqueue what printf is to vprintf
    '''
    return venqueue(trg_queue, item_f, args, **kwargs)

def senqueue(trg_queue, item_s, *args, **kwargs):
    '''Enqueue a string, or string-like object to queue with arbitrary
       arguments, senqueue is to enqueue what sprintf is to printf, senqueue
       is to vsenqueue what sprintf is to vsprintf.
    '''
    return vsenqueue(trg_queue, item_s, args, **kwargs)

def venqueue(trg_queue, item_f, args, user=None, group=None, mode=None):
    '''Enqueue the contents of a file, or file-like object, file-descriptor or
       the contents of a file at an address (e.g. '/my/file') queue with
       an argument list, venqueue is to enqueue what vprintf is to printf
       if entropy is passed in, failure on duplicates is raised to the caller,
       if entropy is not passed in, venqueue will increment entropy until it
       can create the queue item.
    '''
    # setup defaults
    trg_fd = name = None
    # ...

The above example is fsq/enqueue.py, which exposes a family of functions that provide different interfaces for the same functionality (like load/loads in simplejson). While this example is straightforward, keeping your sub-modules small requires some judgement. A good rule of thumb is:
When in doubt, create a new sub-module.

From 
http://axialcorps.com/2013/08/29/5-simple-rules-for-building-great-python-packages/?goback=%2Egde_25827_member_269818493
