Data Structures for Andela: Iterators (part 1)

As part of Stack Overflow's pro-diversity initiatives, I have been helping Andela by mentoring one of their fellows, Bosun, in data structures and algorithms. In the Stack Overflow "default open" style, I am putting some of our notes on the internet, in the hope they will be useful to more people. Most of the code is either pseudo-code or JavaScript, which we chose because it's simple and easy to access and everyone pretty much knows the basics of it.

Tree

Andela notes index

  1. Iterators (part 1)
  2. Iterators (part 2)
  3. Stacks
  4. Queues
  5. Implementing a calculator (part 1)
  6. Implementing a calculator (part 2)
  7. Trees & Graphs
  8. Compression: RLE encoding and decoding
  9. Compression: Huffman encoding (part 1)
  10. Compression: Huffman encoding (part 2) & decoding

In bold the current part, links point to the other available parts, in italics the parts not yet available.

A disclaimer

IANACS! A lot of the following may be wrong, simplistic or offensive to c++ developers. That's because I am a self-taught frood that knows where his towel is, and not one of those fancy-pants computer scientists. If you encounter any problem, ask your parents if this content is suitable for your programming religion.

Iterators (aka enumerations)

Iterators or enumerations are sets with an order. In practice, they are sets of elements (objects, values...), which implement the following behaviors:

MoveForward(); // prepares the next element
Current();     // returns the current element
IsEnd();       // returns true if at the last element, false otherwise

In .NET, MoveForward() and IsEnd() are combined in one method in the IEnumerator interface. In JavaScript, the three methods are collapsed into one, next(), which advances to the next element, and returns a composite object of the form { value: <current>, done: <isEnd> }.

What's cool about them

They are a great starting point to build a library of functions you can chain toghether. If you ever met LINQ or jQuery you might have seen them. By the way, JavaScript natively offers some support, but they are functions which are easy enough to write and extremely powerful once available.

There are two categories of functions that we can compose:

  • functions that work on the single elements, and can be implemented lazily.
  • functions that work by iterating the whole set (and thus are functions of the whole set)

In this blog post I'll focus on the lazy functions, and I'll cover the other functions in the next part.

Limitations

There are two main limitations to iterators. Firstly they can, in general, be iterated only once. Which is simple English for the posh "not idempotent". Think of them as pointers to items in a set: the element they point to might be mutated or not available for another iteration; or they could be generators: the element might not be retrievable anymore; or they could be reading a IO port, whose value is effectively not under the control of the iterator.

Secondly, they offer no random access.

Oh, but they can be infinite.

A sample iterator

A nice way of implementing a base iterator, which uses JavaScript 6 is using generator functions. These allow us to use the yield keyword which makes building enumerators a breeze. This iterator is infinite and returns all natural numbers (ℕ)

function* numbers() {
    var i = 0;
    while (true) {
        yield i++;
    }
}

this can then be used as such:

var num = numbers();

console.log(num.next());  // { value: 0, done: false }
console.log(num.next());  // { value: 1, done: false }

This returns an object with two properties: value and done. These are equivalent to the two methods I described above.

Lazy composable functions (aka lazy higher-order functions)

We can start building functions which we can compose upon the above sample. As you will note, all these functions operate on iterators, and are iterators themselves. They all take a function (with a particular signature) and call it by passing it the iterator elements one by one. These two properties (being iterators, and operating on one element at a time) make it so that no code is actually executed until the first call to the .next() method of the iterator. In many cases this represents a considerable saving on execution times.

Filters

Filters are functions that operate on iterator elements, taking in input a function with the following signature:

bool filter_function( item )

thus, a filter (for example created by extending Object as it seems it's the only common ancestor of iterators in JavaScript):

Object.prototype.filter = function* ( filter_function ) {}

A filter is thus an iterator as well. It should be implemented so that it only returns items from another iterator for which filter_function returns true. In this way, only one filter is ever necessary: it can be applied on any iterator (even on itself!) and the logic of what should be filtered is all in the filter_function which is a parameter.

For example, let's say you have an iterator that returns plane ticket reservations, and you want only the reservations for a particular flight. You can pass a function that, given a reservation, returns true if it belongs to the correct flight, to a filter.

var foo123Flights = ticketReservations().filter(function(ticket){
    return ticket.flight == "FOO123";    
});

Projections

A projection, or map, is a function that operates on iterator elements, taking in input a function with this signature:

<any> project_function( item )

Projections are also iterators. They are used to change the elements of an iteration to something else. The signature of a projection is this:

Object.prototype.project = function* ( project_function ) {}

It should be implemented as an iterator whose elements are the results of passing the items of the iterator it operates on to project_function.

Suppose you have an iterator returning fulfilled orders and want to generate a list of invoices from those orders. You can pass a function that, given an order, creates an invoice, to a project.

var invoices = orders().project(function(order){
    createInvoice(order);
});

Flatten

A flatten is a function that operates on iterator elements, and that takes in input a function looking like this:

<Iterator> flatten_function( item )

Flattens are iterators and they are used to merge sequences of sequences into a single sequence. In a way it's the "opposite" of grouping by some characteristic. Imagine having an iterator returning invoices, and wanting to extract all invoice entries in all invoices. To do so we pass a function that, given an invoice, returns all the entries. flatten takes care of "glueing" all the entries.

var invoiceEntries = invoices().flatten(invoice) {
    return function* entries() {
        for(var entry in invoice.entries) {
            yield entry;
        }
    }();
});

Let's code!

I've put together a project on GitHub which you can clone and complete. It already has tests and solutions if you want to cheat -- but I'd advise you at least try to implement these operators yourself for max enjoy.

The exercise boils down to completing the functions in this file

Object.prototype.project = function* (func) {
    if (this.constructor.name=='GeneratorFunctionPrototype') {
        // code here
    }
};

Object.prototype.filter = function* (func) {
    if (this.constructor.name=='GeneratorFunctionPrototype') {
        // code here
    }
};

Object.prototype.flatten = function* (func) {
    if (this.constructor.name=='GeneratorFunctionPrototype') {
        // code here
    }
};

How to set up your system

  1. Install a recent version of node.js (link)
  2. Fork https://github.com/sklivvz/andela-data-structures on github
  3. Clone in your preferred folder
  4. From a terminal npm install restores the test packag
  5. npm start runs the code

You should get something like

screenshot

Edit the file ./parts/01-iterators.js with your solution and run again until all tests pass.


External links:


(discuss on hacker news)


I am the Chief R&D at BaxEnergy, developer, hacker, blogger, conference lecturer. Bio: ex Stack Overflow core, ex Toptal core.

Read more

Newest Posts

TDD and the Zero-Defects Myth

TDD can’t guarantee zero-defects. Let us debunk this software development myth.

Read more
What can Stack Overflow learn from ChatGPT?

Stack Overflow could benefit from adopting a using conversational AI to provide specific answers

Read more
Fan mail

Multiple people with my name use my email address and I can read their email, chaos ensues!

Read more
Intelligent Trip

After years of building, our top-notch consultancy to help start-ups and scale-ups create great, scalable products, I think it is high time I added an update to how it is going and what's next for us.

Read more
Guest blog: Building, in partnership with communities by Shog9

A lesson in building communities by Stack Overflow's most prominent community manager emeritus, Shog9

Read more

Gleanings

hackurls - news for hackers and programmers
Claudio Santini • Mar 27, 2017

$ wget -O - hackurls.com/ascii | less

Read more…