Lazy iterators in PHP and Python

I recently watched Clean Code Episode 34: Pattern Apocalypse, which is about several design patterns, including the Iterator. Afterwards I took some example code from this episode and tried writing it first in PHP and then in Python. While there is no rocket science involved in any of it, I did hit a few bumps and learned a thing or two. In this blog post I’m following the Share What You Learn pattern from Apprenticeship Patterns by sharing my experiences.

Printing the squares of integers

One way of writing an application that prints the squares of the first n integers (in Java) is as follows:

In this Clean Code episode, Unclebob explains that he finds this code “quite awful”, even though it is simple, because the calculation is buried within the loop, driven by it and coupled to it. And while not explicitly stated in the episode, there is also the binding of the algorithm to the output call. This coupling can be avoided using iterators, as follows:

PrintAll takes a collection, and speaks for itself when it comes to its function. “take” returns a collection with the first n elements from the collection it got. SquaresOf returns a collection of integers which are the squares of those in the collection it got. “Integers.all” returns a collection of all positive integers. Of course, this last collection is an Iterator and not some kind of infinite list. SquaresOf is also an Iterator, which lazily applies the square function on an element at the time of request.
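To make the laziness concrete, here is a minimal Python sketch (not the Java code from the episode). The names `integers_all`, `squares_of` and the `computed` log are made up for illustration; the log only exists to show that a square is calculated when the consumer asks for it, and not before.

```python
computed = []  # hypothetical trace of which squares were actually calculated


def integers_all():
    """Yield 1, 2, 3, ... without ever building a list."""
    n = 1
    while True:
        yield n
        n += 1


def squares_of(numbers):
    """Lazily square each number at the moment it is requested."""
    for n in numbers:
        computed.append(n)
        yield n * n


squares = squares_of(integers_all())  # nothing has been computed yet
first = next(squares)
second = next(squares)
```

After the two `next` calls only two squares have been calculated, even though the underlying source is infinite.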

One thing that stood out to me was how much scaffolding code Unclebob had written to create the constructs used here. (These are not listed in this blog post. If you’re curious, go watch the episode itself.) As I’m not a Java programmer, it’s unclear to me how much of that was needed, and how much was written for the sake of explaining the relevant Iterator concepts in a manner easily consumable by non-Java programmers. This made me wonder how much of that I’d need in PHP, which then led me to writing the code in PHP.

Lazy Iterators in PHP

Even though PHP is the primary language I use at work, and even though I think the Iterator pattern is quite cool and useful, sadly enough I do not get to use it often. This is due to a combination of PHP’s horrid collection support, the limitations of type inference in PHP (or the lack of it when it comes to iterators) and me being constrained to using an outdated version of the language most of the time. That makes it all the more fun to play with Iterators, as I did for instance in my “Some fun with iterators” post.

I started with replicating the example code from the outside, so starting with the “print all” part. I wanted to know if there is a sane way of doing that without using a loop.

Unfortunately you cannot do that, so I had to create a new function. My first inclination was to create a named “printLine” function, which could then also be used elsewhere. That’d however force me to refer to this function using a string, which is another one of my PHP pet peeves, since it breaks static code analysis. So I settled for the inline notation, which unfortunately is also more verbose than either the Java or Python one.

As a next step, I went for implementing the “take” function.

Unclebob implemented it using a for loop with a counter. I figured such a loop can be avoided in PHP by using the LimitIterator provided by PHP’s SPL.

As soon as I did that, the output of my script disappeared. The cause of this resides in the printAll function, which uses array_walk on an Iterator. I knew I’d need to change that when I wrote it, as PHP’s arrays and Iterators don’t work well together, but it worked without me making the change and I forgot all about it. Turns out that array_walk works fine with an ArrayIterator, though not with an ArrayIterator wrapped in a LimitIterator. I decided to write a collection_walk that would work with both arrays and iterators:

With that out of the way and writing an additional collection_to_iterator helper, the following worked fine for creating the take function:

I did waste a few seconds on incorrectly assuming that LimitIterator would take the limit after the collection, rather than also having an offset in there.

As a next step I decided to go with the “all integers” part, as otherwise I’d have to write the “squares of iterator” without being forced to make it lazy.

As of PHP 5.5, this is really easy to implement using generators:

The final step was of course adding in the squaresOf function:

All this iterator needs to do is perform a lazy map. As far as I can tell, PHP does not provide any lazy map functions, so I created my own. Again, generators make this rather easy:

View the full PHP code on GitHub.

Lazy Iterators in Python

While I know Python basics, it’s not a language I regularly work in. I also implemented this algorithm in Python to freshen up my Python skills in general and perhaps learn some new language features I was not yet aware of. Furthermore, after having to deal with PHP’s nice collection handling, I was curious just how much smoother this would go in Python.

I took a similar approach and started with the “print all” function:

Before ending up with that approach, I looked for something equivalent to PHP’s array_walk. I tried using the map function by doing map(print, collection), though that does not work.
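Assuming Python 3 (which the later use of `nonlocal` suggests), one likely explanation is that `map` returns a lazy iterator, so the callback is never invoked until something consumes it. A small sketch, with a hypothetical `seen.append` standing in for `print` so the effect can be checked:

```python
seen = []  # stands in for printed output

lazy = map(seen.append, [1, 4, 9])  # builds a map object, calls nothing yet
calls_before = len(seen)

for _ in lazy:  # consuming the iterator is what triggers the calls
    pass
calls_after = len(seen)
```

So `map` can be made to print, but only by looping over its result, at which point a plain `for` loop is the more direct choice.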

To implement the take function I created

Which is rather silly to do in Python, though I decided to stick to making it work with the same calling code as in the example, and then to look at how to make that part of the code more idiomatic Python later on. Also, this makes the take function non-lazy, so it would need to change later on anyway.

The lazy iterator of all positive integers was as easy to implement as in PHP:
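A minimal sketch of what such a generator might look like. The class and method names mirror the `Integers.all` call from the example, but the body is an assumption and not necessarily the code from the repository:

```python
class Integers:
    @staticmethod
    def all():
        """Yield the positive integers 1, 2, 3, ... indefinitely."""
        i = 1
        while True:
            yield i
            i += 1


numbers = Integers.all()
first_three = [next(numbers) for _ in range(3)]
```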

That brought me to

Which went boom with the following error:

TypeError: ‘generator’ object is not subscriptable
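This is the error you would expect from a `take` built on slicing. The sketch below is an assumption about what that first version looked like; `take_by_slicing` and `count_up` are hypothetical names.

```python
def take_by_slicing(n, collection):
    """Eager take: only works on sequences that support slicing."""
    return collection[:n]


def count_up():
    """Infinite generator of 1, 2, 3, ..."""
    i = 1
    while True:
        yield i
        i += 1


from_list = take_by_slicing(2, [1, 4, 9, 16])  # lists can be sliced

try:
    take_by_slicing(2, count_up())  # generators cannot
    error = None
except TypeError as exc:
    error = str(exc)
```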

I changed the take function as follows:

Which resulted in some kind of progress: another error message:

AttributeError: ‘generator’ object has no attribute ‘takewhile’

That puzzled me, as I figured takewhile is a method iterators have, and generators are iterators. I tried wrapping the thing in the iter function, which had no effect on the error message. While searching for a solution, I found out that there is such a thing as generator expressions, which are very similar to list comprehensions, in this excellent post on iterators and generators. Using one of those I rewrote the take function in a working and lazy fashion:

While writing that code I did wonder if that would actually work as intended, or if it’d continue iterating over the generator forever. As I got the output I expected, I assumed it worked correctly, until a minute or so later when I realized my IDE was still running the script 🙂

After some more looking through the docs I realized that the takewhile function that I had tried to use earlier is not a method on iterators at all, but a function that takes an iterator. The facepalm was real. The reason I figured it was a method is because of things such as join being a method on strings. Anyway, finally I got a working version of the take function:
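The difference between the two attempts can be sketched as follows. A generator expression with an `if` clause filters but never stops, so it keeps pulling from the source after the last match, whereas `itertools.takewhile` stops at the first element that fails the predicate. The `counting_source` helper and its `pulled` list are hypothetical, and the source is capped at 1000 items so that the filtering version terminates here.

```python
import itertools


def counting_source(pulled, cap=1000):
    """Yield 1..cap and record every value that is requested."""
    for i in range(1, cap + 1):
        pulled.append(i)
        yield i


# Filtering generator expression: correct output, but drains the source.
pulled_by_filter = []
filtered = list(x for x in counting_source(pulled_by_filter) if x <= 3)

# takewhile: stops as soon as the predicate is false.
pulled_by_takewhile = []
taken = list(itertools.takewhile(lambda x: x <= 3,
                                 counting_source(pulled_by_takewhile)))
```

Both produce the same three values, but the filter requests every item the source has to offer, which on an infinite generator means it never finishes.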

For the “squares of” function I was happy to have found a use case for the newly learned generator expressions:
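A sketch of how such a function could look with a generator expression (the name `squares_of` follows the example; the body is an assumption):

```python
def squares_of(collection):
    """Return a lazy iterator over the squares of the given numbers."""
    return (x * x for x in collection)


squares = squares_of(iter([1, 2, 3]))
result = list(squares)
```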

Of course at that point I realized that the take function was in fact still wrong, as it compared the squared number to the number count limit rather than an actual number count. So I added a counter to be used in the function passed to takewhile:

I was surprised to find that “i” was undefined in should_take, as I figured it’d just be accessible from the parent function scope à la JavaScript. Which made me discover the “nonlocal” keyword.
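A sketch of what a counter-based `take` along these lines might look like (an assumption, not necessarily the code in the repository). Without the `nonlocal` declaration, the `i += 1` assignment would make `i` local to `should_take` and fail with an `UnboundLocalError`.

```python
import itertools


def take(n, collection):
    """Lazily yield at most the first n items of collection."""
    i = 0

    def should_take(_item):
        nonlocal i  # rebind the counter in the enclosing take() scope
        i += 1
        return i <= n

    return itertools.takewhile(should_take, collection)


squares = (x * x for x in itertools.count(1))
first_five = list(take(5, squares))
```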

With that the algorithm works. The one part I’m still unhappy about is the implementation of the take function. Surely this can be done nicer. The inner should_take function has an unused parameter and using takewhile in this manner seems like a hack.
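One candidate for a nicer version is `itertools.islice` from the standard library, which a commenter also suggests below. A minimal sketch, assuming `take` keeps the same `(n, collection)` signature:

```python
import itertools


def take(n, collection):
    """Lazily yield at most the first n items; no counter or predicate needed."""
    return itertools.islice(collection, n)


squares = (x * x for x in itertools.count(1))
first_four = list(take(4, squares))
```

Unlike the takewhile variant, this does not pull an extra element from the source just to find out that it should stop.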

View the full Python code on GitHub.

6 thoughts on “Lazy iterators in PHP and Python”

  1. While reading through the description of the itertools module I saw the islice function that probably does what your “take” function does.

    And your Integers class can probably be shortened like this:

    import itertools

    class Integers:
        def all():
            return itertools.count(start=1, step=1)

    1. Yeah, both are working and are certainly better than implementing it myself. This way all explicit assignment is gone from the code \o/

      The “islice” name is unfortunate though, as I read it as “is slice”, and thus figured it was not of use to me.
