Monday, August 19, 2013

Python line continuation

PEP 8 is Python's Bible when it comes to style, but it deliberately avoids certain kinds of assertions about how to write code while suggesting some only in examples. One area in which I feel it is too vague is line-continuation. In this article, I'll try to outline the reasons that one wants to continue lines and how one should deal with that. This is partly a matter of preference, but most of the recommendations I present come with practical reasons why these choices make sense.

Let's start with the simplest case. You have a line where a simple function call extends past the end of the normal boundary (79 characters per PEP 8, though I find myself trying to stay below that for historical reasons relating to specific editors, formatters and printing systems):

do_the_first_things_first(thefirstthing, thesecondthing, thelastofthethings, reasons=None)

There are many ways that would work. Here's one:

do_the_first_things_first(thefirstthing,
                          thesecondthing,
                          thelastofthethings,
                          reasons=None)


Aesthetically, this is quite pleasing because it has the function name on the left and all of the items it acts on lined up perfectly on the right. However, this format has some significant drawbacks:
  • Because it takes up a horizontal width equal to the sum of the longest parameter (including any naming) plus the length of the function name (plus any object reference for methods), it discourages the use of parameter names and/or function names beyond a certain, in some cases quite reasonable, length.
  • Any change to the function name requires re-indenting all of the following lines.
  • Subsequent continuations in the same block will be indented differently unless their function name is the same length.
  • The correct handling for nested calls is not at all clear.
To resolve these problems, a more block-oriented approach can be used. This approach has a few simple rules:
  • By default indent all continuations by 4 spaces.
  • Exception: Count any enclosure (parentheses, brackets, braces, etc.) as an additional block level and indent appropriately.
  • Exception: If an indented block immediately follows the current statement, indent one extra 4-space level.
Here is the above example using this approach:
do_the_first_things_first(
    thefirstthing, thesecondthing, thelastofthethings,
    reasons=None)

Notice that the first continuation line has three items. This is a general pattern that I use whenever there are arguments which meet all of these criteria:
  • Not named
  • Required
  • Operate as a group to identify the data that the function is acting on
  • Fit on a single line
In such a case, I provide them as a group on the same line if possible, but if it is not possible, then I break them up as normal. This helps to illustrate their use and purpose.
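
The third rule above (one extra level when an indented block follows) is easiest to see on a compound statement. Here is a minimal sketch of how I read that rule; the function and parameter names are hypothetical and exist only for illustration:

```python
def should_retry(
        attempt_count, max_attempts,
        last_error=None,
        retryable_errors=(IOError,)):
    # The continued parameters above sit at 8 spaces, so they cannot be
    # mistaken for the function body, which sits at 4.
    if (attempt_count < max_attempts and
            isinstance(last_error, retryable_errors)):
        # Same idea: the continued condition is one level deeper than
        # the block it introduces.
        return True
    return False


print(should_retry(1, 3, last_error=IOError("disk full")))
```

Without the extra level, the continuation and the first line of the block would start in the same column, and the eye would have to hunt for the colon.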

Here's a more complex example that illustrates the value of this format:
class Foo(object):
    ...
    def blah(self):
        with MyContext(self) as ctx:
            try:
                ...
            except ValueError as e:
                self.logger.warning(
                    self.format_error(
                        "Failed during context", self.state,
                        context=ctx,
                        debug=self.debug))
                return
That's all well and good, but these are trivial examples. What happens when things get hairier? I'll cover some specific cases below:

Strings

String literals in Python can be continued in the same way as C; that is, you can simply end the string with a close-quote and then immediately open a new string literal with a new quote and continue. The two strings are treated as one. This helps tremendously in line-continuation, and can be used to great effect with the above rules:
                self.logger.warning(
                    self.format_error(
                        "Failed during use of MyContext in "
                        "user-initiated callback function",
                        self.state,
                        context=ctx,
                        debug=self.debug))
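
A quick way to convince yourself of how the joining works is to try it on its own. This is only a sketch; the variable name `message` is mine and not part of the example above:

```python
message = (
    "Failed during use of MyContext in "
    "user-initiated callback function")

# The fragments are joined exactly as written: the trailing space in the
# first fragment is the only thing separating "in" from "user-initiated".
assert "MyContext in user-initiated callback" in message

# No newline sneaks in, even though the source spans two lines.
assert "\n" not in message

print(message)
```

The one thing to watch is that trailing space: leave it out and the two words run together.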
If formatting is being used, the string continuation can be enclosed in parentheses to disambiguate:
                self.logger.warning(
                    self.format_error(
                        ("Failed during use of MyContext(%s) in "
                         "user-initiated callback function") % (
                             ctx,),
                        self.state,
                        debug=self.debug))
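
It is worth knowing that the parentheses around the string are there for the reader, not for Python. Adjacent literals are joined before the % operator is applied, so both spellings below give the same result. A small sketch, where `ctx` is a hypothetical stand-in value rather than a real context object:

```python
ctx = "db-session"  # hypothetical stand-in for a context object

# Parenthesized string, with the hanging indent used in the example above.
with_parens = ("Failed during use of MyContext(%s) in "
               "user-initiated callback function") % (
                   ctx,)

# No parentheses around the string: the literals are still joined first,
# and only then is the result formatted.
without_parens = str(
    "Failed during use of MyContext(%s) in "
    "user-initiated callback function" % (ctx,))

assert with_parens == without_parens
print(with_parens)
```

I still prefer the parenthesized form, because a reader who does not know this detail might assume the % applies only to the last fragment.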
Data Structures

Notice in the example above that the indent after the parenthesis is hanging, not 4 spaces. Why not 4 spaces? The glib answer is: this isn't Lisp. The more detailed answer is that the parenthetical isn't really introducing a block scope and has no identifier to modify indent level (thus if this next parameter were continued in the same way, it would be indented in the same way). This is a bit of a shortcut, but it doesn't hurt any of the goals which we set out to achieve, so it's painless and pleasing to the eye. If it bothers you, feel free to indent fully to 4 spaces.

Here's a more complex example:
{'alpha': 1,
 'beta': [
     "Apples",
     "Oranges",
     "Pears" ],
 'gamma': [
     [ format_gamma(
           "And here we are, formatting a gamma",
           dogfood=1,
           catfood=2),
       format_gamma(
           "Why not go ahead and format a second?",
           dogfood=8,
           catfood=10) ],
     [...] ] }
Again, where there are no identifiers to the left of the newly introduced level, we can hang the indentation off of that opening bracket or paren without a problem. This is clean, compact and as consistent as it needs to be without being overly slavish to its own style.

Methods

One thing that I find maddeningly hard to read in other people's code is chained method invocation in continued lines. Really, just don't do that, but if you must, please never do this:
                some_object.some_method_name(
                                             blahblah,
                                             bletchbletch)\
                    .boingboing(snobbery)
Yeah, so I just want to kill something. Please don't do that. Here are two suggestions:
                some_object.some_method_name(
                    blahblah,
                    bletchbletch).boingboing(snobbery)

But far more realistically:
                intermediate = some_object.some_method_name(
                    blahblah,
                    bletchbletch)
                intermediate.boingboing(snobbery)

You don't have to do all of the work in one line, and if you find that it's getting cumbersome, just don't do that.
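
To show that the two suggestions really are interchangeable, here is a runnable sketch. The `Query` class is entirely hypothetical; it exists only so that the method names from the snippets above have something to call:

```python
class Query(object):
    """Hypothetical chainable object, used only for this illustration."""

    def __init__(self, terms=()):
        self.terms = tuple(terms)

    def some_method_name(self, first, second):
        # Return a new object so that calls can be chained.
        return Query(self.terms + (first, second))

    def boingboing(self, extra):
        return Query(self.terms + (extra,))


some_object = Query()

# First suggestion: chain directly off the closing parenthesis.
chained = some_object.some_method_name(
    "blahblah",
    "bletchbletch").boingboing("snobbery")

# Second suggestion: name the intermediate result, then call on it.
intermediate = some_object.some_method_name(
    "blahblah",
    "bletchbletch")
stepwise = intermediate.boingboing("snobbery")

assert chained.terms == stepwise.terms
```

The second form costs one extra name, but that name is also a handy place to hang a breakpoint or a log line.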

Exceptions

There are some excellent examples in PEP 8, and one of them strikes me as ideal for an exception to the above:

        if width == 0 and height == 0 and (color == 'red' or
                                           emphasis is None):
            raise ValueError("I don't think so -- values are %s, %s" %
                             (width, height))

Here the PEP makes an important clarification, visually, as to what part of the clause the or goes with. It can't be stressed enough that if you have something in your code that would be made less clear by whatever style you're using, you should think very carefully about using another style. Yes, consistency is nice, but when you can make your code more readable with inconsistency, you definitely should!
