Saturday, November 23, 2013

Exploring ways to express an algorithm

Suppose you want to rewrite the following loop
  
struct Shape {
    int rgb_color() const;
};

map<int, Shape*> id_to_shape;
vector<int> bright_colors;


for( map<int, Shape*>::iterator it=id_to_shape.begin(); it != id_to_shape.end(); ++it )
{
    int color = it->second->rgb_color();
    if (color_is_bright (color))
        v.push_back(bright_colors);
}
  
Here follows a tour in ways that this loop could be expressed.


Quick and Dirty

The loop could easily be coded like
  
void push_back_to_vector_if_color_is_bright (const pair<int, Shape*>& entry, vector<int>& colors) {
    Shape* shape = entry->second;
    int color = shape->rgb_color();
    if (color_is_bright(color))
        colors.push_back(color);
}

for_each(id_to_shape.begin(), id_to_shape.end(),
         bind(push_back_to_vector_if_color_is_bright, _1, bright_colors));
  
which is relatively straightforward, but push_back_to_vector_if_color_is_bright() will hardly ever be reused.
- it is tightly coupled to std::pair<int,Shape*>
- it is tightly coupled to std::vector<int>

If we choose to follow this path, the best that can be done make the for_each more readable, is to write a function object:
  
struct push_back_to_vector_if_color_is_bright {
    vector<int> &m_dest;
    push_back_to_vector_if_color_is_bright (vector<int>& colors) : m_dest(colors) {}

    void operator() (const pair<int, Shape*>& entry) {
        Shape* shape = entry->second;
        int color = shape->rgb_color();
        if (color_is_bright(color))
            m_dest.push_back(color);
    }
};

for_each(id_to_shape.begin(), id_to_shape.end(),
         push_back_to_vector_if_color_is_bright(bright_colors));
  

Exploring alternative ways to express

I think that since copying is involved, a std::copy_if() would be ideal to express it. However, there is a transformation involved here (from pair<int,Shape*> to Shape* then to color), so copy_if is not easy to implement.

As a second best, std::transform() could be used. I could either
  1. transform map value_types to Shapesand then insert to the vector if the color of the shape is bright
  2. transform map value_types to Shapes to colors, and then insert to the vector if color is bright
option 1
 
transform(id_to_shape.begin(), id_to_shape.end(),
          conditional_back_inserter (colors, color_of_shape_is_bright),
          select2nd() );
  
option 2
  
transform(id_to_shape.begin(), id_to_shape.end(),
          conditional_back_inserter (colors, color_is_bright),
          map_entry_to_color );
  
out of these two options, I'd reject the 2nd because converting from specific map entry to a color is not a useful piece of code, and will be thrown away on next change of structure.

Option 1 still has a tightly coupled function, color_of_shape_is_bright(), which could be decomposed like so:
  
transform(id_to_shape.begin(), id_to_shape.end(),
          conditional_back_inserter (colors, bind(color_is_bright, bind(&Shape::rgb_color, _1)),
          select2nd() );
  
Now, this piece of code contains only reusable snippets:
  • color_is_bright()
  • select2nd(), which is SGI's implementation that selects the second from a pair (hence the value from a map). A select1st() also exists, to select the key from a map.
  • conditional_back_inserter(), which we'll have to implement. If we do so, we should also provide a conditional_inserter, and a  conditional_front_inserter, for completeness in our library. The downside of this is that this is custom code and has no value outside our codebase.

transformation iterators

It would be nice if the iterators were clever enough to do the transformation themselves. Then a copy_if algorithm could be put to use:

    copy_if (begin_map_iterator_to_color, end_map_iterator_to_color,
             back_inserter(colors), 
             color_is_bright );

I recently discovered that boost has a number of flexible iterators. Here's what you could do with a boost::transform_iterator:
  
copy_if (make_transform_iterator(id_to_shape.begin(), pair_to_color)
         make_transform_iterator(id_to_shape.end(), pair_to_color),
         back_inserter(colors), 
         color_is_bright );

int  pair_to_color( const pair<int,Shape*> &p ) {
    return p->second->rgb_color();
}
  
in this case the pair_to_color() is not reusable, and the declaration of the iterators is a bit wooly.

we could do a bit better by
  
auto to_color = make_adaptor (pair_to_color);
copy_if (to_color(id_to_shape.begin()), to_color(id_to_shape.end()),
         back_inserter(colors), 
         color_is_bright );
  
where make_adaptor() produces a helper object that reduces the need to write the make_transform_iterator twice.

Taking things further, I try to break down the pair_to_color() into reusable items:
  
auto to_color = make_adaptor (bind(&Shape::rgb_color, bind(select2nd(), _1));
copy_if (to_color(id_to_shape.begin()), to_color(id_to_shape.end()),
         back_inserter(colors), 
         color_is_bright );
  


Concluding

Although the example does not fit all cases you might face, you can see that there is some choice in expression. 

First, you can choose on how to name your algorithms. for_each is more technical, less expressive. transform still involves some consideration that the container is a map. copy_if feels closer to what we want to achieve.
  
for_each(id_to_shape.begin(), id_to_shape.end(),
         bind(push_back_to_vector_if_color_is_bright, _1, bright_colors));
  
  
transform(id_to_shape.begin(), id_to_shape.end(),
          conditional_back_inserter (colors, color_of_shape_is_bright),
          select2nd() );
  
  
auto to_color = make_adaptor (pair_to_color);
copy_if (to_color(id_to_shape.begin()), to_color(id_to_shape.end()),
         back_inserter(colors), 
         color_is_bright );
  
moving down, the complexity of the loop is reduced, and it is shifted into the iterators or the transformation function. 


You can now choose which style reads better in your case.


You might have noticed that I did not choose the versions that use bind, like the one below:
  
auto to_color = make_adaptor (bind(&Shape::rgb_color, bind(select2nd(), _1));
copy_if (to_color(id_to_shape.begin()), to_color(id_to_shape.end()),
         back_inserter(colors), 
         color_is_bright );
  
Although such versions use of smaller and reusable functions, I feel that the heavy use bind reduces readability. So I prefer to write the more expressive

    auto to_color = make_adaptor (pair_to_color);

instead of 

    auto to_color = make_adaptor (bind(&Shape::rgb_color, bind(select2nd(), _1));

I tend to prefer writing function objects to binding functions, because they read better and tend to perform better.


more to read:

different ways to declare transform_iterators

No comments:

Post a Comment