C++ Standard Template Library Part III - Algorithms

C++ Standard Template Library Part III - Algorithms

This is the third part to a series on the C++ Standard Template Library (STL). We have previously discussed containers and iterators. This article discusses C++ STL algorithms, which are operations that act on the containers via the iterator concept.

STL algorithms are extremely useful because they reduce or eliminate the need to 'reinvent the wheel' when implementing common algorithmic functionality. In quantitative finance coding they can save a significant amount of time.

Many quant finance applications make use of legacy C++98/C++03 code and cannot make use of the new C++11 standard. Hence, this article will solely concentrate on algorithms included prior to the introduction of C++11. Later articles will discuss algorithms added with the C++11 standard.

While some of the algorithms are designed to modify the individual values within ranges (certain STL containers or arrays of values), they do not modify the containers themselves - i.e. there is no reallocation or size modification to the containers.

Algorithm Categories

To use these algorithms it is necessary to include the <algorithm> header file. For the numeric algorithms, it is necessary to include the <numeric> header file. As with containers and iterators, algorithms are categorised according to their behaviour and application:

  • Nonmodifying algorithms - Nonmodifying algorithms do not change the value of any element, nor do they modify the order in which the elements appear. They can be called for all of the standard STL container objects, as they make use of the simpler forward iterators.
  • Modifying algorithms - Modifying algorithms are designed to alter the value of elements within a container. This can either be done on the container directly or via a copy into another container range. Some algorithms classed as 'modifying' algorithms can change the order of elements (in particular copy_backward()), but most do not.
  • Removal algorithms - Removal algorithms are, by definition, modifying algorithms, but they are designed to remove elements in a container or when copying into another container.
  • Mutating algorithms - Once again, mutating algorithms are modifying algorithms, but they are designed specifically to modify the order of elements (e.g. a random shuffle or rotation).
  • Sorting algorithms - Sorting algorithms are modifying algorithms specifically designed for efficient sorting of elements in a container (or into a range container).
  • Sorted range algorithms - Sorted range algorithms are special sorting algorithms designed to function on a container which is already sorted according to a particular sorting criterion. This allows for greater efficiency.
  • Numeric algorithms - Numeric algorithms are designed to work on numerical data in some fashion. The principal algorithm in this category is accumulate(), which allows mathematical operators to be applied to all elements in a container.

We'll now take a look at these categories in depth. If you would like to find out more about each subsequent algorithm, you can click on the name of any algorithm below.

Nonmodifying Algorithms

  • for_each() - This is possibly the most important algorithm in this section, as it allows any unary function (i.e. a function of one argument) to be applied to each element in a range/container. Note that this function can actually also be modifying (hence why it is included below). It is often better to use a more specific algorithm, if one exists, than to use this, as specialist implementations will be more efficient.
  • count() - This returns the number of elements in a range or container.
  • count_if() - This counts how many elements in a range or container much a particular criterion.
  • min_element() - Returns the element that has the smallest value, making use of the < relation to perform comparison. It can accept a custom binary function to perform the comparison instead.
  • max_element() - Returns the element that has the largest value, making use of the > relation to perform comparison. It can accept a custom binary function to perform the comparison instead.
  • find() - Finds the first element in a range or container that equals a passed value.
  • find_if() - Finds the first element in a range or container that matches a particular criterion, rather than a passed value.
  • search_n() - This is like find AND find_if except that it looks for the first $N$ occurances of such a value OR the first $N$ occurances where a relational predicate is met.
  • search() - This searches for the first occurance of a subrange within a range/container and can do so either by first/last value of the subrange or via a predicate matching all the values of the desired first/last subrange.
  • find_end() - Similar to search, except that it finds the last occurance of such a subrange.
  • find_first_of() - Finds the first element in a subrange of a range or container. Can make use of a binary predicate function, otherwise uses direct value.
  • adjacent_find() - Returns the location (an iterator) to the first matching consecutive pair of values in a range/container. Can also match via a binary predicate.
  • equal() - Compares two ranges to see if they are equal.
  • mismatch() - Compares two ranges and returns a pair of iterators containing the points at which the ranges differ.
  • lexicographical_compare() - The lexicographical comparison is used to sort elements in a manner similar to how words are ordered in a dictionary. It can either use operator< or make use of a binary predicate function to perform the comparison.

Modifying Algorithms

  • for_each() - This is the same as the for_each we discussed above, but I have included it in the Modifying section to reinforce that it can be used this way too!
  • copy() - Copies a range/container of elements into another range.
  • copy_backward() - Copy a range/container of elements into another range, starting from the last element and working backwards.
  • transform() - Transform is quite a flexible algorithm. It works in two ways. A unary operation can be applied to the source range, on a per element basis, which ouputs the results in the destination range. A binary operation can be applied to both elements in the source and destination range, subsequently overwriting elements in the destination range.
  • merge() - Merge is intended to take two sorted ranges and combine them to produce a merged sorted range. However, it is possible to utilise unsorted ranges as arguments but this then leads to an unsorted merge! For this reason I've decided to include it in the modifying category, rather than the category for sorted ranges, below.
  • swap_ranges() - This swaps the elements of two ranges.
  • fill() - This replaces each element in a range with a specific value.
  • fill_n() - Similar to fill, but replaces the first $N$ elements in a range with a specific value.
  • generate() - This replaces each element in a range with the result of an operation of a generator function.
  • generate_n() - Similar to generate, but replaces the first $N$ elements in a range with the result of an operation of a generator function.
  • replace() - This replaces elements matching a specific value with another specific value.
  • replace_if() - This replaces elements matching a specific criterion with another specific value.
  • replace_copy() - Similar to replace, except that the result is copied into another range.
  • replace_copy_if() - Similar to replace_if, except that the result is copied into another range.

Removal Algorithms

  • remove() - This removes elements from a range that match a specific value.
  • remove_if() - This removes elements from a range that match a specific criterion, as determined via a unary predicate.
  • remove_copy() - Similar to remove, except that elements are copied into another range.
  • remove_copy_if() - Similar to remove_if, except that elements are copied into another range.
  • unique() - This is quite a useful algorithm. It removes adjacent duplicate elements, i.e. consecutive elements with specific values.
  • unique_copy() - Similar to unique, except that it copies the elements into another range.

Mutating Algorithms

  • reverse() - This simply reverses the order of the elements in a range or container.
  • reverse_copy() - Similar to reverse, except that the results of the reversal are copied into another range.
  • rotate() - By choosing a 'middle' element in a range, this algorithm will cyclically rotate the elements such that the middle element becomes the first.
  • rotate_copy() - Similar to rotate, except that the result is copied into another range.
  • next_permutation() - This rearranges the elements in a range to produce the next lexicographically higher permutation, using operator<. It is also possible to use a binary predicate comparison function instead of operator<.
  • prev_permutation() - Similar to next_permutation, except that it rearranges to produce the next lexicographically lower permutation.
  • random_shuffle() - Rearranges the list of elements in a range in a random fashion. The source of randomness can be supplied as a random number generator argument.
  • partition() - Rearranges a range/container such that the elements matching a predicate are at the front. Does NOT guarantee relative ordering from the original range.
  • stable_partition() - Similar to partition, except that it does guarantee relative ordering from the original range.

Sorting Algorithms

  • sort() - Sorts the elements into ascending order, using operator< or another supplied comparison function.
  • stable_sort() - This is similar to sort. It is used when you need to ensure that elements remain in the same order when they are "tied" for the same position. This often comes up when dealing with priorities of tasks. Note also that the performance guarantee is different.
  • partial_sort() - Similar to sort, except that it only sorts the first $N$ elements and terminates after they're sorted.
  • partial_sort_copy() - Similar to partial_sort except that it copies the results into a new range.
  • nth_element() - This allows you to ensure that an element at position $n$ is in the correct position, were the rest of the list to be sorted. It only guarantees that the elements preceeding $n$ are less than in value (in the sense of operator<) and that proceeding elements are greater than in value.
  • partition() - See above for partition.
  • stable_partition() - See above for stable_partition.
  • make_heap() - Rearranges the elements in a range such that they form a heap, i.e. allowing fast retrieval of elements of the highest value and fast insertion of new elements.
  • push_heap() - This adds an element to a heap.
  • pop_heap() - This removes an element from a heap.
  • sort_heap() - This sorts the elements in a heap, with the caveat that the range is no longer a heap subsequent to the function call.

Sorted Range Algorithms

  • binary_search() - Searches the range for any matches of a specific value.
  • includes() - Determines whether each element in one range is also an element in another range.
  • lower_bound() - Searches for the first element in a range which does not compare less than a specific value. Can also use a custom comparison function.
  • upper_bound() - Searches for the first element in a range which does not compare greater than a specific value. Can also use a custom comparison function.
  • equal_range() - Finds a subrange within a range which contains values equal to a specific value.
  • merge() - See above for merge.
  • set_union() - Creates a set union of two sorted ranges. Thus the destination range will include elements from either or both of the source ranges. Since this is a set operation, duplicates are eliminated.
  • set_intersection() - Creates a set intersection of two sorted ranges. Thus the destination range will include elements that exist only in both of the source ranges.
  • set_difference() - Creates a set difference of two sorted ranges. Thus the destation range will include elements from the first range that are not in the second range.
  • set_symmetric_difference() - This is the symmetric version of set_difference. It creates a set symmetric difference, which is formed by the elements that are found in one of the sets, but not in the other.
  • inplace_merge() - This combines two sorted ranges into a destination range, which is also sorted. It is also stable as it will preserve ordering for subranges.

Numeric Algorithms

  • accumulate() - This is an extremely useful algorithm. It allows all elements of a container to be combined by a particular operation, such as via a sum, product etc.
  • inner_product() - This combines all elements of two ranges by summing the multiples of each 'component'. It is possible to override both the 'sum' and 'multiply' binary operations.
  • adjacent_difference() - Assigns every element the value of the difference between the current value and the prior value, except in the first instance where the current value is simply used.
  • partial_sum() - Assigns the result of the partial sum of the current element and its predecessors, i.e. $y_i = \sum^i_{k=1} x_i$, for $y$ the result and $x$ the source.

In later articles we will be making extensive use of these algorithms. Not only do they save us valuable time, but we are able to ascertain their performance characteristics from the C++ Reference on Algorithms. Further, they have all been highly optimised via their respective compilers and so our programs will likely run significantly faster than if we construct our own.