
Does enumerate() produce a generator object?

As a complete Python newbie, it certainly looks that way. Running the following...

x = enumerate(['fee', 'fie', 'foe'])
x.next()
# Out[1]: (0, 'fee')

list(x)
# Out[2]: [(1, 'fie'), (2, 'foe')]

list(x)
# Out[3]: []

... I notice that: (a) x does have a next method, as seems to be required for generators, and (b) x can only be iterated over once, a characteristic of generators emphasized in this famous python-tag answer.

On the other hand, the two most highly-upvoted answers to this question about how to determine whether an object is a generator would seem to indicate that enumerate() does not return a generator.

import types
import inspect

x = enumerate(['fee', 'fie', 'foe'])

isinstance(x, types.GeneratorType)
# Out[4]: False

inspect.isgenerator(x)
# Out[5]: False

... while a third poorly-upvoted answer to that question would seem to indicate that enumerate() does in fact return a generator:

def isgenerator(iterable):
    return hasattr(iterable,'__iter__') and not hasattr(iterable,'__len__')

isgenerator(x)
# Out[8]: True

So what's going on? Is x a generator or not? Is it in some sense "generator-like", but not an actual generator? Does Python's use of duck-typing mean that the test outlined in the final code block above is actually the best one?

Rather than continue to write down the possibilities running through my head, I'll just throw this out to those of you who will immediately know the answer.

While the Python documentation says that enumerate is functionally equivalent to:

def enumerate(sequence, start=0):
    n = start
    for elem in sequence:
        yield n, elem
        n += 1

The real enumerate function returns an iterator, but not an actual generator. You can see this if you call help(x) after creating an enumerate object:

>>> x = enumerate([1,2])
>>> help(x)
class enumerate(object)
 |  enumerate(iterable[, start]) -> iterator for index, value of iterable
 |  Return an enumerate object.  iterable must be another object that supports
 |  iteration.  The enumerate object yields pairs containing a count (from
 |  start, which defaults to zero) and a value yielded by the iterable argument.
 |  enumerate is useful for obtaining an indexed list:
 |      (0, seq[0]), (1, seq[1]), (2, seq[2]), ...
 |  Methods defined here:
 |  __getattribute__(...)
 |      x.__getattribute__('name') <==> x.name
 |  __iter__(...)
 |      x.__iter__() <==> iter(x)
 |  next(...)
 |      x.next() -> the next value, or raise StopIteration
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  __new__ = <built-in method __new__ of type object>
 |      T.__new__(S, ...) -> a new object with type S, a subtype of T

In Python, generators are basically a specific type of iterator that's implemented by using a yield to return data from a function. However, enumerate is actually implemented in C, not pure Python, so there's no yield involved. You can find the source here: http://hg.python.org/cpython/file/2.7/Objects/enumobject.c
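The difference between the two implementations can be checked directly. The following is a minimal sketch, assuming Python 3 (where iterators are advanced with next(x) rather than x.next()); py_enumerate is a hypothetical name for the pure-Python equivalent quoted from the documentation above.

```python
import types

def py_enumerate(iterable, start=0):
    """Pure-Python equivalent of enumerate; calling it returns a generator."""
    n = start
    for elem in iterable:
        yield n, elem
        n += 1

words = ['fee', 'fie', 'foe']
builtin_result = enumerate(words)       # built-in type, implemented in C
generator_result = py_enumerate(words)  # generator object, created by yield

# Both produce the same (index, value) pairs...
same_pairs = list(enumerate(words)) == list(py_enumerate(words))

# ...but only the yield-based version is an actual generator object.
builtin_is_generator = isinstance(builtin_result, types.GeneratorType)
generator_is_generator = isinstance(generator_result, types.GeneratorType)

print(same_pairs)                                                   # True
print(type(builtin_result).__name__, builtin_is_generator)          # enumerate False
print(type(generator_result).__name__, generator_is_generator)      # generator True
```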

    • @JoshO'Brien I'm not aware of any way to determine if the data being returned by a given iterator is obtained lazily or if it's loaded completely into memory. An iterator just provides a way to iterate over an object once (and only once) one step at a time, by exposing a next() method. Exactly what goes on inside next() is unknown to the caller.
    • @JohnY -- Thanks for leaving the glossary link in place. Both of your comments help a lot. Was just off comparing iter(range(9)), iter(xrange(9)), iter([2,1]), iter((2,1)), and iter(enumerate([2,1])), which was quite enlightening. Still not totally sure why enumerate and generator objects get "used up" by being iterated across whereas the other types don't, but I suppose that's just a matter of their implementation and of decisions on the part of their authors that that behavior was desirable. Edit: OK -- scratch that final bit. I just figured it out. Now it all makes good sense
    • Very interesting. Just to be sure I've got this straight, iterators (or iterables) are defined by their behavior, whereas there are actual "generator" and "enumerate" classes that define those objects. "enumerate"-class objects share many behaviors with "generator"-class objects, but so do objects of many other classes. Is there a simple way to find out whether any one of these "generalized generator" classes generates its elements on the fly (like a true generator), rather than storing them in memory?
    • @JoshO'Brien A little nitpick: an "iterable" is any object that can be iterated over, e.g. list, dict, str, file. An "iterator" is the object that's actually created to iterate over an iterable. For most Python containers, you get it by calling iter(obj). This happens implicitly when you do for x in obj. Edit: I see John Y. beat me to this point :)
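The "used up" behavior raised in these comments comes down to the iterable/iterator distinction. A minimal sketch, assuming Python 3 (the variable names are illustrative only): a list hands out a fresh iterator on every call to iter(), whereas an enumerate object is its own iterator.

```python
numbers = [2, 1]

# A list is an iterable: every call to iter() builds a fresh iterator,
# so the list itself is never "used up".
first_pass = list(iter(numbers))
second_pass = list(iter(numbers))

# An enumerate object is already an iterator: iter() returns the same object,
# so a second pass finds nothing left.
e = enumerate(numbers)
enumerate_is_own_iterator = iter(e) is e
first_enum_pass = list(e)
second_enum_pass = list(e)

print(first_pass, second_pass)            # [2, 1] [2, 1]
print(enumerate_is_own_iterator)          # True
print(first_enum_pass, second_enum_pass)  # [(0, 2), (1, 1)] []
```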

Testing for enumerate types:

I would include this important test in an exploration of the enumerate type and how it fits into the Python language:

>>> import collections
>>> e = enumerate('abc')
>>> isinstance(e, enumerate)
True
>>> isinstance(e, collections.Iterable)
True
>>> isinstance(e, collections.Iterator)
True

But we see that:

>>> import types
>>> isinstance(e, types.GeneratorType)
False

So we know that enumerate objects are not generators.

The Source:

In the source, we can see the enumerate object (PyEnum_Type), which iteratively returns the tuple, and in the ABC module we can see that any object with next and __iter__ methods (actually, attributes) is defined to be an iterator (the method is named __next__ in Python 3).

The Standard Library Test

So the Abstract Base Class library uses the following test:

>>> hasattr(e, 'next') and hasattr(e, '__iter__')
True

So we know that enumerate types are iterators. But according to the documentation, a generator is created either by a function containing yield or by a generator expression. So generators are iterators, because they have the next and __iter__ methods, but not all iterators are necessarily generators (an interface that additionally requires send, close, and throw), as we've seen with this enumerate object.
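The extra generator interface can be probed directly. A minimal sketch, assuming Python 3.5 or later, where the ABCs live in collections.abc and a Generator ABC is available:

```python
from collections.abc import Generator, Iterator

e = enumerate('abc')
g = (ch for ch in 'abc')

# Methods that generators provide on top of the plain iterator protocol.
generator_only_methods = ('send', 'throw', 'close')

enumerate_has_them = all(hasattr(e, name) for name in generator_only_methods)
generator_has_them = all(hasattr(g, name) for name in generator_only_methods)

print(enumerate_has_them, generator_has_them)             # False True
print(isinstance(e, Iterator), isinstance(e, Generator))  # True False
print(isinstance(g, Iterator), isinstance(g, Generator))  # True True
```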

So what do we know about enumerate?

From the docs and the source, we know that enumerate returns an enumerate object, and we know by definition that it is an iterator, even if our testing states that it is explicitly not a generator.

We also know from the documentation that generator types simply "provide a convenient way to implement the iterator protocol." Therefore, generators are a subset of iterators. Furthermore, this allows us to derive the following generalization:

All generators are iterators, but not all iterators are generators.

So while we can make our enumerate object into a generator:

>>> g = (i for i in e)
>>> isinstance(g, types.GeneratorType)
True

We can't expect that it is a generator itself, so this would be the wrong test.

So What to Test?

This means you should not be testing for a generator. You should probably use the first of the tests I provided rather than reimplementing the Standard Library's check (which I hope I can be excused for doing today):

If you require an enumerate type, you'll probably want to allow for iterables or iterators of tuples with integer indexes, and the following will return True:

isinstance(g, collections.Iterable)

If you only want specifically an enumerate type:

isinstance(e, enumerate)
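As a rough illustration of testing for the interface rather than the concrete type, here is a minimal sketch, assuming Python 3 (where Iterable is imported from collections.abc; the collections.Iterable alias was removed in Python 3.10). The helper label_items is hypothetical and not part of any library.

```python
from collections.abc import Iterable

def label_items(pairs):
    """Hypothetical helper: accept any iterable of (index, value) pairs."""
    if not isinstance(pairs, Iterable):
        raise TypeError('expected an iterable of (index, value) pairs')
    return ['{}: {}'.format(index, value) for index, value in pairs]

# An enumerate object, a real generator, and a plain list are all accepted.
from_enumerate = label_items(enumerate('ab'))
from_generator = label_items((i, ch) for i, ch in enumerate('ab'))
from_list = label_items([(0, 'a'), (1, 'b')])

print(from_enumerate)  # ['0: a', '1: b']
print(from_enumerate == from_generator == from_list)  # True
```

Checking for Iterable keeps the function open to any source of pairs; checking for enumerate or GeneratorType would reject two of the three calls above.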

PS In case you're interested, here's the source implementation of generators: https://github.com/python/cpython/blob/master/Objects/genobject.c
And here's the Generator Abstract Base Class (ABC): https://github.com/python/cpython/blob/master/Lib/_collections_abc.py#L309


Is it in some sense "generator-like", but not an actual generator?

Yes, it is. You shouldn't really care if it is a duck, but only if it walks, talks, and smells like one. It might just as well be a generator; it shouldn't make a real difference.

It is typical to have generator-like types instead of actual generators when you want to extend the functionality. E.g. range (xrange in Python 2.x) is also generator-like in that it produces its values on demand, but it also supports things like y in range(x) and len(range(x)).
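The extra behavior of range can be sketched as follows, assuming Python 3. Note one way in which range goes beyond a generator: it can be iterated more than once.

```python
r = range(5)

# range supports len() and membership tests without building a list.
supports_len = len(r) == 5
supports_membership = 3 in r

# Unlike a generator, a range can be iterated more than once.
first_pass = list(r)
second_pass = list(r)

# A generator over the same values has no length and is consumed by one pass.
g = (n for n in range(5))
generator_has_len = hasattr(g, '__len__')
generator_first_pass = list(g)
generator_second_pass = list(g)

print(supports_len, supports_membership)            # True True
print(first_pass == second_pass)                    # True
print(generator_has_len)                            # False
print(generator_first_pass, generator_second_pass)  # [0, 1, 2, 3, 4] []
```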


You can try a few things out to prove to yourself that it's neither a generator nor a subclass of a generator:

>>> x = enumerate(["a","b","c"])
>>> type(x)
<type 'enumerate'>
>>> import types
>>> issubclass(type(x), types.GeneratorType)
False

As Daniel points out, it is its own type, enumerate. That type happens to be iterable. Generators are also iterable. That second, down-voted answer you reference basically just points that out somewhat indirectly by talking about the __iter__ method.

So they implement some of the same methods by virtue of both being iterable. Just like lists and generators are both iterable, but are not the same thing.

So rather than say that something of type enumerate is "generator-like", it makes more sense to simply say that both the enumerate and GeneratorType classes are iterable (along with lists, etc.). How they iterate over data (and the shape of the data they store) might be quite different, but the interface is the same.
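To see why the __iter__-based check from the question says "yes" for enumerate, here is a minimal sketch, assuming Python 3. It applies that check to a few iterators and compares it with a strict type test; the candidates dictionary is illustrative only.

```python
import types

def isgenerator(iterable):
    # The duck-typing test quoted in the question.
    return hasattr(iterable, '__iter__') and not hasattr(iterable, '__len__')

candidates = {
    'enumerate': enumerate('abc'),
    'zip': zip('abc', 'xyz'),
    'list iterator': iter([1, 2, 3]),
    'generator expression': (ch for ch in 'abc'),
}

# The duck-typing test accepts every iterator here...
duck_results = {name: isgenerator(obj) for name, obj in candidates.items()}

# ...while the strict type test accepts only the real generator.
strict_results = {name: isinstance(obj, types.GeneratorType)
                  for name, obj in candidates.items()}

for name in candidates:
    print(name, duck_results[name], strict_results[name])
```

In other words, that check detects "iterable without a length", which is a broader category than generators.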

Hope that helps!
