Tuesday, November 12, 2013

Moving from legacy data structures to STL containers

When one is introduced to the STL containers there is a temptation to switch from the non-homogeneous legacy C-based data structures to the more smooth and standard containers. The existing code cannot magically turn all Trees to std::maps, DynamicArrays to std::vectors, LinkedLists to std::lists, so old code will probably still use old data structures.

So, a question arises:

  • I have C and C++ functions that take Tree*, DynamicArray*, etc as arguments. Isn't it going to be awkward to introduce functions that now take vector, map, deque as well ?
  • Do I have to make a version of the function for each data type ?


The quick answer

My guess is that most of the times, you don't provide two versions of a function. Either you provide a version that takes an array, or a dynamic array, or a tree.

To call such functions, user either

  • created the structure required by your function and called it, or
  • copied his existing structure to your structure before calling it. 
For cases such as these, you will be doing no harm in creating a function that has a suitable data structure for you.

However, this is not the proper way of designing functions (or algorithms)
  •  If a function needs a container to perform a job, why should this be a vector and not a deque, or a list, or a set ?
Think about it; whenever you write some pseudo-code (aka algorithm), you almost never describe a data structure for it. Why should your code have this constraint ?

The proper answer

The right way to approach functions that have a container as an argument, it to build a generic one. Follow the trend of the STL algorithms:
  • find_if is not written for a specific data type only.
  • find_if is not overloaded for each container by rewriting it.
You should write functions that take generic iterators and not a specific container as arguments. This way, 
  • everybody's code runs faster, since the original data structure doesn't have to copied anymore to adapt a function's arguments
  • writing user code will be easier, because they will not have to think of the algorithms as a constraint for what data structure they want to use.


Isn't this an overkill ?

Just because a function can be generic doesn't mean you always have the time to do it. 
In such cases you can follow the scout's rule:
whenever you spend some time on some field, you should leave it cleaner than when you arrived
Make the function generic the next time you need to use it, but it's arguments are not suitable for you.

No comments:

Post a Comment