Posts tagged python

Posted 2 years ago

My Welcome Non-Committee

I’ve been home from PyCon US for three weeks, which has given me plenty of time to reflect on the experience. Now that the Easter break has started, I also have the time to put some of those thoughts into words.

PyCon US this year was huge, or so I was told. Many of the 2250+ attendees were, like me, making their first appearance. Perhaps as a consequence, ‘welcome’ was a strong theme of the conference. Stormy Peters encouraged returning attendees to meet somebody new during her keynote address. People wore funny hats and called themselves the Welcome Un-Committee. And so on.

As great as these initiatives were, what made the strongest difference for me was the group of people who consciously went out of their way to welcome me to PyCon: my own personal Welcome Non-Committee. What I want to do now is publicly acknowledge and thank them for this. It’s quite a long list of people, so after a couple of weeks of soul searching I settled on five names that have made a lasting impression.

David, without doubt one of the friendliest, most engaging people I have ever met. One day I hope to visit Canada and be able to see one of the modular, futuristic signage displays you work on. Thank you.

Maciej, who made joining PyPy for the sprints the welcoming, entertaining and overall fun experience it was. The enthusiasm you imparted leaves me feeling guilty for not working on PyPy even while I write this. Thank you.

Taavi, who even after giving his highly rated presentation to PyCon was happy to have a conversation with me on the topic at dinner. I still smile when recalling the steady stream of often hilarious ‘WTF’ moments on NumPy edge cases you delivered at the sprints. Thank you.

Jonas, kept me company during the final hours of sprinting and shared some beers supplied by my travel allowance. I enjoyed our gentle jibing over whose fault it is that pymaging runs slower on PyPy.

Andrew, a fellow Australian, government employee and first-time PyCon attendee. Being able to share the roller coaster that is PyCon tutorials, conference and sprints while spending a fortnight away from our families made an enormous difference. Thank you.

Of course, there is a very long list of other names that I could/should have included.

  • Simon, I really enjoyed discussing all things radio with you.
  • Dan, I’m amazed you chose to share a drink with me after the sprints wrapped up.
  • Michael, o.m.g, I met the guy behind mock and unittest!
  • Tavis, I was relieved to find someone who’d attempted a presentation that’s a little unusual, although dictating code is so much cooler …
  • James, o.m.g, I met the Eldarion guy!
  • Yarko, such positive enthusiasm is rare and inspirational. That and packing the swag bag was a blast.

As Steve Holden said, “people come for the programming language and stay for the community”. Each of you, as well as all the persons I was unable to name above, are that community for me. Thank you. I hope we meet again.

Posted 2 years ago

Hidden Features

How do you wind down after a weekend whirlwind interstate visit? By perusing the Hidden features of Python thread on Stack Overflow, of course!

Ok, so strictly speaking it wasn’t for winding down, more that the farm has incredibly limited wireless internet. That and I was unable to fit my laptop in amongst the other 47kg of luggage that comprises the family Iles travelling road show.

Seriously though, the thread has some impressive answers. I’m surprised it hasn’t made it onto Jesse Noller's Good to Great Python Reads yet.

Posted 2 years ago

Where does heapq belong now?

Two of my favorite features in Python’s standard library are namedtuple and heapq. Namedtuple resides in the collections module, while heapq is a module in its own right.

While heapq predates collections, the standard library has never been considered a fossilized structure. Reorganisation has taken place, and will again, when necessary.

So, is it time to relocate heapq to a logical home in collections?

Posted 2 years ago

Why metaclass?

I was asked an excellent question today:

So those metaclasses you’ve been writing about, when would you use them?

I thought this question was great because, if you’ve been following my previous posts on using metaclasses, you may have noticed I never stopped to describe some scenarios when metaclasses are useful.

A good place to start is this 2002 quote from Tim Peters, the Tim behind timsort.

[Metaclasses] are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don’t (the people who actually need them know with certainty that they need them, and don’t need an explanation about why).

While not exactly answering the question at hand, it’s a pertinent reminder that just because we have a hammer it doesn’t mean we need to treat every problem like a nail.

So when might you know you need them? Metaclasses are the class of classes, so they’re capable of making systematic changes to class definitions or class instances as they are created. When multiple classes need to adhere to a common interface, that logic can be abstracted out into a metaclass.

That’s very abstract, so here’s a short list of real world examples:

What all these examples have in common is the application of a common transform to class definition or instance creation. That transform could add new methods or properties, modify existing methods or properties, or communicate with an external framework.
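To make the idea of a common transform concrete, here is a minimal sketch of a metaclass that registers every subclass as it is defined. The names PluginRegistry, Plugin, JsonExporter and CsvExporter are invented for this illustration, and the syntax assumes Python 3.

```python
class PluginRegistry(type):
    """Hypothetical metaclass that records each subclass it creates."""

    registry = {}

    def __new__(meta, classname, bases, classdict):
        cls = super().__new__(meta, classname, bases, classdict)
        # The base class has no explicit bases; register only its subclasses.
        if bases:
            meta.registry[classname.lower()] = cls
        return cls


class Plugin(metaclass=PluginRegistry):
    """Base class; subclasses are registered automatically."""


class JsonExporter(Plugin):
    pass


class CsvExporter(Plugin):
    pass


print(sorted(PluginRegistry.registry))  # ['csvexporter', 'jsonexporter']
```

Neither subclass mentions the registry. The transform is applied once, in the metaclass, at class definition time.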

Metaclasses are a specialised tool. While invaluable for framework authors, it’s rare to use them in standard application code. If in doubt, refer to the Zen of Python and ask yourself: ‘Is using a metaclass making my code clearer, more concise and more maintainable?’

Posted 3 years ago

The when of Python scoping

Here’s a fun bit of Python trivia I learned yesterday. The scope of a Python variable is determined by the interpreter at compile time, not at run time.

Compile time for Python, you ask? Why yes. When Python code is first read by the interpreter it is translated into bytecode that includes representations for the classes and functions in the code. It’s during this first read that Python determines a variable’s scope. As an aside, Python includes support for disassembling the bytecode, which can be useful for performance analysis and debugging.

Normally this is not something you need to worry about. However, it is possible to introduce interesting bugs as a side effect of when scoping rules are applied. Consider this short piece of code:

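from __future__ import print_function


def outer(name):
    def inner():
        # Assigning to name here makes it local to inner for the whole body,
        # so the right-hand side reads the local before it has a value.
        name = name.capitalize()
        return 'Hello {0:s}'.format(name)
    return inner()


print(outer('world'))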

This code will raise an UnboundLocalError for name. That’s because, when first reading the code, Python determines that name is local to the inner function because there is an assignment to name. This happens even though name is available to inner from outer’s scope.
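The compile-time decision can be inspected on the code object itself. The following is a small sketch, assuming Python 3. The outer_fixed variant and the capitalized local are my own additions for comparison, and binding to a new local is only one possible way to avoid the error.

```python
import dis


def outer(name):
    def inner():
        name = name.capitalize()  # the assignment makes name local to inner
        return 'Hello {0:s}'.format(name)
    return inner


def outer_fixed(name):
    def inner():
        # Bind the result to a new local so name stays a closure variable.
        capitalized = name.capitalize()
        return 'Hello {0:s}'.format(capitalized)
    return inner


broken = outer('world')
fixed = outer_fixed('world')

print(broken.__code__.co_varnames)  # names compiled as locals of inner
print(fixed.__code__.co_freevars)   # names looked up in the enclosing scope

try:
    broken()
except UnboundLocalError as err:
    print('UnboundLocalError:', err)

print(fixed())

# The disassembly shows name being loaded as a local in the broken version.
dis.dis(broken)
```

Note that no call is needed to see the difference. co_varnames and co_freevars are populated when the function is compiled, before any of its code has run.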

Scoping rules are more than just interesting trivia. They go to the heart of how closures work in Python. But that’s a post for another night.

Posted 3 years ago

Keeping up with Python

It’s hard keeping up to date with the “happenings” in an open source community as large and diverse as Python. I make no claims of knowing where the vanguard of Python news is. Though if it helps you, here’s how I stay up to date with developments in the community.

Most important to me is Twitter. The Python community is diverse, active and vibrant on Twitter. I’ve been acquiring new people to follow for well over a year now, so to help you get started I’ve started curating two Python related Twitter lists for you to follow:

The first list is of general members of the Python community I find interesting. The second list is of people predominately related to the PyPy project.

After Twitter I use Google Reader to follow Python related blogs and news feeds. To date I’ve acquired 47 Python related feeds. Again, to make it easy for you, I’ve collated these feeds into a single Google Reader bundle.

The bundle page includes an Atom link so you can follow along with your preferred news reader.

That should be enough to get you started following along with the Python community. If there are other interesting people and feeds you follow, please share them in the comments below.

Posted 3 years ago

Replacing exec with metaclass for namedtuple

namedtuple is one of my favourite additions to Python’s standard library. Even though I’ve used it regularly throughout my Python code, I’ve never really wondered how it was implemented. That was until earlier this year when Kristjan Valur described how removing exec from their port of Python to the PS3 broke namedtuple and much of the standard library as a consequence.

The use of exec was raised as issue 3974, which generated some discussion and was eventually rejected by Raymond Hettinger. However, as an exercise to continue my experiments with Python metaprogramming, this afternoon I reimplemented namedtuple using a metaclass. Between bouts of babysitting, that is.

The resulting implementation takes a little over 150 lines and borrows heavily from the current implementation:

"collections.namedtuple implementation without using exec."
from collections import OrderedDict
from keyword import iskeyword
from operator import itemgetter
import itertools
import sys

__all__ = ['NamedTuple', 'namedtuple']


class NamedTuple(type):
    """Metaclass for a new subclass of tuple with named fields.

    >>> class Point(metaclass=NamedTuple):
    ...     _fields = ['x', 'y']
    ...
    >>> Point.__doc__                   # docstring for the new class
    'Point(x, y)'
    >>> p = Point(11, y=22)             # instantiate with positional args or keywords
    >>> p[0] + p[1]                     # indexable like a plain tuple
    33
    >>> x, y = p                        # unpack like a regular tuple
    >>> x, y
    (11, 22)
    >>> p.x + p.y                       # fields also accessible by name
    33
    >>> d = p._asdict()                 # convert to a dictionary
    >>> d['x']
    11
    >>> Point(**d)                      # convert from a dictionary
    Point(x=11, y=22)
    >>> p._replace(x=100)               # _replace() is like str.replace() but targets named fields
    Point(x=100, y=22)

    """

    def __new__(meta, classname, bases, classdict):
        if '_fields' not in classdict:
            raise ValueError("NamedTuple must have _fields attribute.")
        if tuple not in bases:
            bases = (tuple,) + tuple(bases)
        for pos, name in enumerate(classdict['_fields']):
            classdict[name] = property(itemgetter(pos),
                    doc='Alias for field number {0:d}'.format(pos))
        classdict.update(meta.NAMESPACE)
        classdict['__doc__'] = '{0:s}({1:s})'.format(classname,
                repr(classdict['_fields']).replace("'", "")[1:-1])
        cls = type.__new__(meta, classname, bases, classdict)
        cls._make = classmethod(cls._make)
        return cls

    def _new(cls, *args, **kwargs):
        'Create new instance of {0:s}({1:s})'.format(cls.__name__, cls.__doc__)
        expected = len(cls._fields)
        received = len(args) + len(kwargs)
        if received != expected:
            raise TypeError('__new__() takes exactly {0:d} arguments ({1:d} given)'.format(expected, received))
        values = itertools.chain(args,
                (kwargs[name] for name in cls._fields[len(args):]))
        return tuple.__new__(cls, values)

    def _make(cls, iterable, new=tuple.__new__, len=len):
        'Make a new {0:s} object from a sequence or iterable'.format(cls.__name__)
        result = new(cls, iterable)
        if len(result) != len(cls._fields):
            raise TypeError('Expected {0:d} arguments, got {1:d}'.format(
                len(cls._fields), len(result)))
        return result

    def _repr(self):
        'Return a nicely formatted representation string'
        keywords = ', '.join('{0:s}={1!r:s}'.format(self._fields[i], self[i])
                for i in itertools.islice(itertools.count(), len(self._fields)))
        classname = self.__class__.__name__
        return '{0:s}({1:s})'.format(classname, keywords)

    def _asdict(self):
        'Return a new OrderedDict which maps field names to their values'
        return OrderedDict(zip(self._fields, self))

    def _replace(self, **kwargs):
        'Return a new {0:s} object replacing specified fields with new values'.format(self.__class__.__name__)
        result = self._make(map(kwargs.pop, self._fields, self))
        if kwargs:
            raise ValueError('Got unexpected field names: {0!r}'.format(list(kwargs)))
        return result

    def _getnewargs(self):
        'Return self as a plain tuple. Used by copy and pickle.'
        return tuple(self)

    NAMESPACE = {
            '__slots__': (),
            '__new__': _new,
            '_make': _make,
            '__repr__': _repr,
            '_asdict': _asdict,
            '_replace': _replace,
            '__getnewargs__': _getnewargs,
    }


def namedtuple(typename, field_names, rename=False):
    """Returns a new subclass of tuple with named fields.

    >>> Point = namedtuple('Point', 'x y')
    >>> Point.__doc__                   # docstring for the new class
    'Point(x, y)'
    >>> p = Point(11, y=22)             # instantiate with positional args or keywords
    >>> p[0] + p[1]                     # indexable like a plain tuple
    33
    >>> x, y = p                        # unpack like a regular tuple
    >>> x, y
    (11, 22)
    >>> p.x + p.y                       # fields also accessible by name
    33
    >>> d = p._asdict()                 # convert to a dictionary
    >>> d['x']
    11
    >>> Point(**d)                      # convert from a dictionary
    Point(x=11, y=22)
    >>> p._replace(x=100)               # _replace() is like str.replace() but targets named fields
    Point(x=100, y=22)

    """
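    if hasattr(field_names, 'split'):
        field_names = field_names.replace(',', ' ').split()
    # Normalise to a tuple so it can be concatenated with (typename,) below.
    field_names = tuple(field_names)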
    if rename:
        names = list(field_names)
        seen = set()
        for i, name in enumerate(names):
            if (not all(c.isalnum() or c == '_' for c in name) or iskeyword(name)
                or not name or name[0].isdigit() or name.startswith('_')
                or name in seen):
                names[i] = '_%d' % i
            seen.add(name)
        field_names = tuple(names)
    for name in (typename,) + field_names:
        if not all(c.isalnum() or c == '_' for c in name):
            raise ValueError('Type names and field names can only contain alphanumeric characters and underscores: {0!r:s}'.format(name))
        if iskeyword(name):
            raise ValueError('Type names and field names cannot be a keyword: {0!r:s}'.format(name))
        if name[0].isdigit():
            raise ValueError('Type names and field names cannot start with a number: {0!r:s}'.format(name))
    seen_names = set()
    for name in field_names:
        if name.startswith('_') and not rename:
            raise ValueError('Field names cannot start with an underscore: {0!r:s}'.format(name))
        if name in seen_names:
            raise ValueError('Encountered duplicate field name: {0!r:s}'.format(name))
        seen_names.add(name)

    result = NamedTuple.__new__(NamedTuple, typename, (tuple, object), {'_fields': tuple(field_names)})

    # For pickling to work, the __module__ variable needs to be set to the frame
    # where the named tuple is created.  Bypass this step in environments where
    # sys._getframe is not defined (Jython for example) or sys._getframe is not
    # defined for arguments greater than 0 (IronPython).
    try:
        result.__module__ = sys._getframe(1).f_globals.get('__name__', '__main__')
    except (AttributeError, ValueError):
        pass

    return result

Ultimately I think Raymond is correct that while the exec version is clear and maintainable, other implementations such as my metaclass version are not. Even so, I learnt a lot in the process of making it, which was the point of the exercise.

Posted 3 years ago

Another Python REPL enhancement

I’ve previously written about enhancing the Python interpreter for REPL. Today I added a new feature to my startup.py script thanks to this tweet from Michael Ford.

I’ve added the following section to the PYTHONSTARTUP script to replace the default pager, less, with the more command.

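try:
    import functools
    import pydoc
except ImportError:
    pass
else:
    # Route pydoc output (and so help()) through more instead of less.
    pydoc.pager = functools.partial(pydoc.pipepager, cmd='more')

    # Keep the interactive namespace clean.
    del functools, pydoc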

The more command is preferable (to me) because it doesn’t clear the screen after the pager terminates. This is useful in conjunction with the help function: when the pager exits, the last help documentation remains visible.

While I could have achieved the same result by setting the PAGER environment variable, this would result in replacing less with more in every instance where PAGER is honoured. Using startup.py allows me to use more just within the Python interpreter. 

Posted 3 years ago

A universal Python interpreter?

It started when @zedshaw lamented that PyPy didn’t support both Python2 and Python3 code at the same time. (PyPy currently only implements Python 2.7.) So, an interesting thought exercise: what would the experience be like working with both Python2 and Python3 in the same codebase? How would it work? Given that both my hands were largely spent looking after a small baby today, a thought exercise was just what I needed.

After some mental games, I focused on having the integration of the two major Python versions introduce as little complexity in the language as possible. Complexity is a barrier to those teaching and learning the language. Complexity also has a habit of hiding obscure bugs. As The Zen of Python says, “simple is better than complex”.

With that philosophy in mind, the first question to answer is ‘how would the compiler know which Python version this code was written for?’ A few ideas came to mind:

  • Using the file extensions .py2 and .py3.
  • Automatically detecting the language version.
  • Through a magic comment like source encodings.
  • Via a special attribute such as ‘__interpreter__’.
  • Assuming a default language version unless otherwise directed.

Having a default version removes ambiguity for the developer. The code will be interpreted as a known Python version unless otherwise directed. This also aids learning Python; teachers and students can focus on a single version until it’s time to introduce the complexity of an alternate, incompatible, language version.

Using an attribute such as ‘__interpreter__’ (which is similar to __name__, __all__ and __metaclass__) to change the Python version used to interpret a block of code has a couple of desirable advantages over the dismissed alternatives.

Firstly, it’s backwards compatible with current versions of Python. All current interpreters will treat the presence of __interpreter__ as a variable assignment. Only the imaginary universal interpreter will observe its presence.

Secondly, the attribute could be used to annotate at any level of scope, not just at module scope. This means that classes and functions could be marked to be interpreted using a different Python language version.

Here’s an overly complicated example of how this might look:

"Simple universal Python module, defaulting to the Python3 language"

__interpreter__ = 3

class Demo:
    "This will be a new style class as it's interpreted as Python3 code."

def hello(name):
    "This is a Python2 function, therefore print is a statement."
    __interpreter__ = 2
    print "hello %s" % name

if __name__ == "__main__":
    hello("world")

Assuming that the default language version is Python3, then the module level __interpreter__ attribute above is redundant.

Now that my thought experiment has defined how to determine which version of Python is being interpreted, next consider the changes to the standard library. From Python2 to Python3 a significant reorganisation of the standard library took place. What should the universal interpreter do about this?

The superficial answer is ‘nothing’. As the interpreter knows what version of Python is being interpreted, it knows which version of the standard library to use. The universal interpreter can maintain multiple versions of the standard library in line with the language versions supported. This is fine until one version of the language needs to call into another.

In these cases some casting will likely need to take place. A simple case of this is the change to integers in Python3 where (simplistically) the int type was replaced by the long type.

The complex case is the change to text in Python3. As the universal interpreter knows when it’s changing from a Python3 to a Python2 code segment, it would be possible to automatically encode from a Python3 unicode text string to a Python2 byte string. I don’t think this would be a good idea.

One obvious problem is how to pass a unicode string to Python2 from Python3. But the deeper issue I have with automatically encoding and decoding unicode is the lack of transparency to the developer. I’ve grown accustomed to ‘explicit is better than implicit’. Automatically encoding/decoding doesn’t fall into the syntactic sugar category either. It could easily mask a subtle bug or introduce a serious performance issue.
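To illustrate the transparency concern, here is a small Python3 sketch of the decision an automatic conversion would have to make on the developer’s behalf. The sample string and the two codecs are chosen purely for illustration.

```python
text = 'caf\u00e9'  # a Python3 text string with one non-ASCII character

# An automatic conversion has to pick a codec, and the bytes differ by codec.
utf8_bytes = text.encode('utf-8')
latin1_bytes = text.encode('latin-1')
print(utf8_bytes, latin1_bytes)

# Decoding with a different codec from the one used to encode either
# silently produces the wrong text or fails outright.
print(ascii(utf8_bytes.decode('latin-1')))
try:
    latin1_bytes.decode('utf-8')
except UnicodeDecodeError as err:
    print('UnicodeDecodeError:', err.reason)
```

The silent case is the worrying one: nothing is raised, yet the text that comes out the other side is no longer the text that went in.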

However, if a workable solution to passing text objects between Python2 and Python3 code segments were to be found, a warning that this casting is taking place would be valuable to the developer. If a tight loop involved calling between Python2 and Python3 with text objects, all the encoding and decoding taking place could constitute a performance issue. There is precedent for warnings that are ignored by default in DeprecationWarning.
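An ignored-by-default warning of this kind can be sketched with the standard warnings module. CrossVersionTextWarning and pass_text_to_python2 are hypothetical names invented for this illustration, since the universal interpreter does not exist.

```python
import warnings


class CrossVersionTextWarning(Warning):
    """Hypothetical category for text crossing a Python2/Python3 boundary."""


# Ignore the category by default.
warnings.simplefilter('ignore', CrossVersionTextWarning)


def pass_text_to_python2(text, encoding='utf-8'):
    """Stand-in for the cast the imaginary interpreter would perform."""
    warnings.warn('text encoded for Python2 code', CrossVersionTextWarning)
    return text.encode(encoding)


with warnings.catch_warnings(record=True) as caught:
    pass_text_to_python2('hello')
    quiet = len(caught)

with warnings.catch_warnings(record=True) as caught:
    # A developer hunting a performance problem opts in explicitly.
    warnings.simplefilter('always', CrossVersionTextWarning)
    pass_text_to_python2('hello')
    loud = len(caught)

print(quiet, loud)  # 0 1
```

The cast stays silent in normal use, and a developer who suspects a problem can switch the warning on without touching the calling code.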

That’s about as far as I got with my thought exercise, other than to contemplate that, even though Python3 support in PyPy hasn’t been an issue to date, Python3 is now in year three of a five year plan. I can’t help but think that some time in the next two or three years, Python3 support in PyPy will be an issue.