C++ Standard Template Library Part I - Containers

C++ Standard Template Library Part I - Containers

The Standard Template Library (STL) is a cross-platform standard of efficient data structures and algorithms. It is one of the most important features of C++ and will form the basis of all ongoing quantitative finance programs within this book. The STL is often confusing to novice programmers because it makes use of the generic programming paradigm, as opposed to the object oriented paradigm. In fact, it could be said that the STL is the opposite approach, in that rather than encapsulate the data and algorithms acting upon it, the STL actually separates these functions through a common iterator interface.

The STL is extremely important in quantitative finance because many of the quant data structures and algorithms rely on more elementary concepts such as sorting, maxima/minima, copying, mathematical function modelling and searching. Rather than reinvent these basic algorithms from scratch, C++ provides them "out of the box". This allows you to spend more time testing your own algorithms, rather then "reinventing the wheel".

The STL is a large library. It contains a wealth of data structures and algorithms. In this article I will attempt to give a broad overview of the STL and highlight those aspects which are relevant to quantitative finance.

Components of the STL

The STL can be broadly split into four separate components:

  • Containers - Data structures designed to efficiently store information across a variety of generic types.
  • Iterators - Methods by which to traverse/access the data within the containers.
  • Algorithms - Basic algorithmic capabilities, such as sort, find, max, min, replace. Makes use of iterators to allow "container independent" code.
  • Function Objects - a.k.a. functors. Used to make templates classes "callable" to allow modelling of mathematical functions.

There are extra smaller components, but these are the four that will interest us for quantitative finance work.

Containers

Containers allow certain types of data to be grouped into logical structures. The STL provides generic containers so that the same data structure can handle multiple different types without any additional required coding. For instance, we could create a std::vector<double> or a std::vector<int>. The former would be a vector of double precision values, while the latter is a vector of integer precision values. Notice that we are not required to code anything else to let the vector know we are using different types!

Sequence Containers

Sequence containers are extremely common within quantitative finance applications. They are often used to store numerical data sequentially. For instance we may want to store components of an element of a vector field or matrix of complex numbers. There are three main types of sequential containers that tend to be used the most:

  • Lists - The two types of list are the singly-linked list and the doubly-linked list. A singly-linked list only has forward pointers to the next element, whereas a doubly-linked list has forward and backward pointers. They do not allow random access to elements, but do allow fast inserts and removals.
  • Vectors - Vectors allow fast random access to elements, but are slow for inserts and removals within the vector (as all elements need to be copied and moved). They allow fast appends and "pops" (i.e. removals from the end).
  • Deques - i.e. double-ended queues. Deques are similar to vectors except that they allow fast appends/removals to both ends of the queue.

There is also the valarray sequence container, which is (supposedly) optimised for numerical work. However, in practice valarray is harder to use and many modern compilers do not implement the optimisations well enough to warrant using them as an alternative to vectors.

Associative Containers

Associative containers are different to sequential containers in that there is no direct linear sequence to the elements within the container. Instead, they are sorted based on a comparison criterion. This criterion is often determined via the context. For instance, in a std::set<int>, the comparison operator is the less than (<) operator. There are four main types of associative container, which differ in terms of duplicate elements and whether the key data and value data coincide:

  • Set - A set provides a container that does not allow duplicate elements and where the data is itself the key and value. Sets can be sorted according to comparison functions. Here is an example of a set of integers: $\{1,2,3,4\}$.
  • Multiset - A multiset is very similar to a set except that it can possess multiple duplicate values. Here is an example of a multiset of integers: $\{1,2,3,3,4\}$. Notice the duplicate 3s.
  • Map - A map contains pairs of elements, the two components of which are known as the key and the value. The key is used to "look up" the value data. Maps do not allow duplicate keys, but they do provide a great deal of flexibility in what types of data can represent the keys and values. An example of a map would be a set of date keys, each of which possessed a price value, without duplicate dates.
  • Multimap - A multimap is similar to a map except that duplicate keys are allowed. This is possible because the values can be iterated over in a sorted fashion, using the aforementioned comparison function.

Although it might initially seem that associative containers are less appropriate for handling numerical data within quantitative finance, this is not actually the case. A map is ideal for holding the values of sparse matrices (such as tridiagonal or banded), which crop up in finite difference methods, for instance.

Container Adaptors

Although not strictly containers in themselves container adaptors are a very useful component of the STL. Container adaptors are designed to adapt the behaviour of the aforementioned containers above such that their behaviour is restricted or modified in some fashion. The three container adaptors that are of interest to us are as follows:

  • Stacks - A stack is a common data structure within computer science as well in day-to-day life. It represents the Last-In-First-Out (LIFO) concept, where elements are placed upon each other and the first one to be removed was the last one to be added.
  • Queues -A queue differs from a stack in that it represents the First-In-First-Out (FIFO) concept, where the first element added to the queue becomes the first to be removed. Queues are often used to represent task lists of "background jobs" that might need to be carried out, such as heavy-duty numerical processing.
  • Priority Queues -A priority queue is similar to a queue, except that elements are removed from it based on a priority comparison function. This is particularly common in operating system design where many tasks are constantly being added to a queue, but some are more essential than others and so will be executed first, even if they were added at a later point.

In the next article we will discuss iterators and how they allow algorithms to interface with the containers we have just discussed.