Forex Trading Diary #6 - Multi-Day Trading and Plotting Results

By Michael Halls-Moore on June 3rd, 2015

It's been a while since my last Forex Trading Diary update. I've been busy working on the new QuantStart Jobs Board and so I've not had as much time as usual to work on QSForex, although I have made some progress!

In particular I have been able to add some new features including:

  • Documentation - I've now created a QSForex subsection on the site, which includes all entries of the Forex Trading Diary and documentation for QSForex. In particular, it includes detailed installation instructions and a guide to usage for both backtesting and live trading.
  • Simulated Tick Data Generation - Since it is challenging to download forex tick data in bulk (or at least it has been from certain vendors I use!) I decided it would be more straightforward to simply generate some random tick data for testing the system.
  • Multi-Day Backtesting - A long-standing feature request in QSForex is the ability to backtest over multiple days of tick data. In the latest release QSForex now supports both multi-day and multi-pair backtesting, making it substantially more useful.
  • Plotting Backtesting Results - While console output is useful, nothing beats being able to visualise an equity curve or historical drawdown. I've made use of the Seaborn library to plot the various performance charts.

In this entry I'll describe each of these new features in detail.

If you haven't been able to follow the series to date you can head to the QSForex section in order to catch up on previous entries.

Simulated Tick Data Script

An extremely important requested feature for QSForex has been the ability to backtest over multiple days. Previously the system only supported backtesting via a single file. This was not a scalable solution, as each file must be read into memory and subsequently into a Pandas DataFrame. While the tick data files produced are not huge (roughly 3.5MB each), they do add up quickly if we consider multiple pairs over periods of months or more.

In order to begin creating a multi-day/multi-file capability I started trying to download more files from the DukasCopy historical tick feed. Unfortunately, I ran into some trouble and I was not able to download the necessary files in order to test the system.

Since I was not too fussed about the actual time series themselves, I felt it would be simpler to write a script to generate simulated forex data myself. I have placed this script in the scripts/generate_simulated_pair.py file. The current code can be found here.

The basic idea of the script is to generate a list of randomly distributed timestamps, each possessing bid/ask prices and bid/ask volumes. The spread between the bid and the ask is constant, while the bid/ask prices themselves follow a random walk.

Since I won't actually ever be testing any real strategies on this data I wasn't too bothered about its statistical properties or its absolute values in relation to real forex currency pairs. As long as it had the correct format, and approximate length, I could use it to test the multi-day backtesting system.

The script is currently hardcoded to generate forex data for the entire month of January 2014. It uses the Python calendar library in order to ascertain business days (although I haven't excluded holidays yet) and then generates a set of files of the form BBBQQQ_YYYYMMDD.csv, where BBBQQQ will be the specified currency pair (e.g. GBPUSD) and YYYYMMDD is the specified date (e.g. 20140112).

These files are placed in the CSV_DATA_DIR directory, which is specified in the settings.py in the application root.

In order to generate the data the following command must be run, where BBBQQQ must be replaced with the particular currency pair of interest, e.g. GBPUSD:

python scripts/generate_simulated_pair.py BBBQQQ

The file will require modification in order to generate multiple months or years of data. Each daily tick file is on the order of 3.2MB in size.

In the future I will be modifying this script to generate multiple months or years of data based on a list of currency pairs provided, rather than the values being hardcoded. However, for the time being this should help you get started.

Please note that the format exactly matches that of the DukasCopy historical tick data, which is the dataset that I am currently using.
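To make the approach concrete, here is a minimal sketch of such a generator. The column layout (Time, Ask, Bid, AskVolume, BidVolume), the random-walk parameters and the tick-count are illustrative assumptions; the actual scripts/generate_simulated_pair.py may differ in its details:

```python
import calendar
import csv
import datetime
import os
import random


def generate_month(pair, year=2014, month=1, out_dir=".",
                   spread=0.0002, start_price=1.5000, ticks_per_day=1000):
    """Write one CSV file of simulated ticks per business day of the month.

    The bid follows a Gaussian random walk with a constant bid/ask spread.
    Weekends are skipped; holidays are not. The column names and parameter
    values are illustrative only.
    """
    price = start_price
    for day in range(1, calendar.monthrange(year, month)[1] + 1):
        # calendar.weekday returns 5 (Saturday) or 6 (Sunday) for weekends
        if calendar.weekday(year, month, day) >= 5:
            continue
        t = datetime.datetime(year, month, day)
        fname = os.path.join(
            out_dir, "%s_%04d%02d%02d.csv" % (pair, year, month, day)
        )
        with open(fname, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["Time", "Ask", "Bid", "AskVolume", "BidVolume"])
            for _ in range(ticks_per_day):
                # Randomly distributed timestamps: ~1 tick per second on average
                t += datetime.timedelta(seconds=random.expovariate(1.0))
                price += random.gauss(0.0, 0.0001)  # random walk step
                bid, ask = price, price + spread    # constant spread
                writer.writerow([
                    t.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3],
                    "%.5f" % ask, "%.5f" % bid,
                    "%.2f" % random.uniform(1.0, 2.0),
                    "%.2f" % random.uniform(1.0, 2.0),
                ])
```

Running `generate_month("GBPUSD", out_dir=CSV_DATA_DIR)` would produce one file per business day of January 2014, named as described above.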

Multi-Day Backtesting Implemented

Following on directly from the generation of simulated tick data is the implementation of multi-day backtesting.

While my long-term plan is to use a more robust historical storage system such as PyTables with HDF5, for the time being I am going to make use of a set of CSV files, one file per day per currency pair.

This is a scalable solution as the number of days increases. The event-driven nature of the system requires only ever needing $N$ files in memory at once, where $N$ is the number of currency pairs being traded on a particular day.

The basic idea of the system is for the current HistoricCSVPriceHandler to continue to use the stream_next_tick method, but with a modification to account for multiple days of data via loading each day of data sequentially.

The current implementation exits the backtest upon receipt of the StopIteration exception raised by the next(..) call on self.all_pairs, as shown in this pseudocode snippet:

# price.py

..
..

def stream_next_tick(self):
    ..
    ..
    try:
        index, row = next(self.all_pairs)
    except StopIteration:
        return
    else:
        ..
        ..
        # Add the tick to the queue

In the new implementation, this snippet is modified to the following:

# price.py

..
..

def stream_next_tick(self):
    ..
    ..
    try:
        index, row = next(self.cur_date_pairs)
    except StopIteration:
        # End of the current day's data
        if self._update_csv_for_day():
            index, row = next(self.cur_date_pairs)
        else: # End of the data
            self.continue_backtest = False
            return

    ..
    ..

    # Add the tick to the queue

In this snippet, when StopIteration is raised, the code checks the result of self._update_csv_for_day(). If the result is True the backtest continues (with self.cur_date_pairs having been replaced by the subsequent day's data). If the result is False, the backtest ends.

This approach is very memory efficient as only a particular day's worth of data is loaded at any one point. It means that we can potentially carry out months of backtesting, limited only by CPU processing speed and the amount of data we can generate or acquire.
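The day-by-day control flow can be demonstrated with a small self-contained sketch. Here each "day" is just a Python list standing in for a day's DataFrame iterator, and the class and attribute names mirror the snippet above rather than the actual QSForex implementation:

```python
class MultiDayStreamer(object):
    """Minimal sketch of sequential day-by-day tick streaming.

    'days' is a list of per-day tick sequences; in QSForex each day
    would instead be an iterator over a day's CSV-backed DataFrame.
    """
    def __init__(self, days):
        self.days = days
        self.day_idx = 0
        self.cur_date_pairs = iter(days[0]) if days else iter([])
        self.continue_backtest = True

    def _update_csv_for_day(self):
        # Advance to the next day's data; return False when none remains
        self.day_idx += 1
        if self.day_idx >= len(self.days):
            return False
        self.cur_date_pairs = iter(self.days[self.day_idx])
        return True

    def stream_next_tick(self):
        try:
            tick = next(self.cur_date_pairs)
        except StopIteration:
            # End of the current day's data
            if self._update_csv_for_day():
                tick = next(self.cur_date_pairs)
            else:  # End of the data
                self.continue_backtest = False
                return None
        return tick
```

Only one day's sequence is ever held by the iterator, which is the property that keeps memory usage flat as the number of backtested days grows.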

I have updated the documentation to reflect the fact that the system now expects multiple days of data in a particular format, in a particular directory which must be specified.

Plotting Backtesting Results with Seaborn Library

A backtest is relatively useless if we can't visualise the performance of the strategy over time. While the system has been mostly console-based to date, I have begun the transition to a Graphical User Interface (GUI) with this release.

In particular, I have created the usual "three-pane" set of charts that often accompanies performance reporting for quantitative trading systems, namely the equity curve, the returns profile and the drawdown curve. All three are calculated for every tick and are output into a file called equity.csv in the OUTPUT_RESULTS_DIR found in settings.py.

In order to view the data we make use of a library called Seaborn, which produces publication-quality (yes, ACTUAL publication-quality) graphics that look substantially better than the default graphs produced by Matplotlib.

The graphics look very close to those produced by the R package ggplot2. In addition Seaborn actually uses Matplotlib underneath, so you can still use the Matplotlib API.

In order to allow output I've written the output.py script that lives in the backtest/ directory. The listing for the script is as follows:

# output.py

import os, os.path

import pandas as pd
import matplotlib
try:
    matplotlib.use('TkAgg')
except Exception:
    # Fall back to the default backend if TkAgg is unavailable
    pass
import matplotlib.pyplot as plt
import seaborn as sns

from qsforex.settings import OUTPUT_RESULTS_DIR


if __name__ == "__main__":
    """
    A simple script to plot the balance of the portfolio, or
    "equity curve", as a function of time.

    It requires OUTPUT_RESULTS_DIR to be set in the project
    settings.
    """
    sns.set_palette("deep", desat=.6)
    sns.set_context(rc={"figure.figsize": (8, 4)})

    equity_file = os.path.join(OUTPUT_RESULTS_DIR, "equity.csv")
    equity = pd.read_csv(
        equity_file, parse_dates=True, header=0, index_col=0
    )

    # Plot three charts: Equity curve, period returns, drawdowns
    fig = plt.figure()
    fig.patch.set_facecolor('white')     # Set the outer colour to white
    
    # Plot the equity curve
    ax1 = fig.add_subplot(311, ylabel='Portfolio value')
    equity["Equity"].plot(ax=ax1, color=sns.color_palette()[0])

    # Plot the returns
    ax2 = fig.add_subplot(312, ylabel='Period returns')
    equity['Returns'].plot(ax=ax2, color=sns.color_palette()[1])

    # Plot the drawdowns
    ax3 = fig.add_subplot(313, ylabel='Drawdowns')
    equity['Drawdown'].plot(ax=ax3, color=sns.color_palette()[2])

    # Plot the figure
    plt.show()

As you can see, the script imports Seaborn, loads the equity.csv file into a Pandas DataFrame and then creates three subplots, one each for the equity curve, the returns and the drawdown.

Note that the drawdown chart itself is actually calculated from a helper function that lives in performance/performance.py, which is called from the Portfolio class at the end of a backtest.
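The helper in performance/performance.py is not listed here, but the standard drawdown calculation it performs can be sketched as follows. The function name and signature are illustrative, not necessarily those used in QSForex; drawdown at each point is the drop from the running high-water mark of the equity curve:

```python
import pandas as pd


def create_drawdowns(equity):
    """Compute the drawdown series and the maximum drawdown from an
    equity curve given as a pandas Series.

    Illustrative sketch: the drawdown at each point is the difference
    between the running peak (high-water mark) and the current equity.
    """
    hwm = equity.cummax()    # running high-water mark
    drawdown = hwm - equity  # absolute drop from the peak
    return drawdown, drawdown.max()
```

The resulting series is what gets written to the Drawdown column of equity.csv and plotted in the third pane above.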

An example of the output for the included MovingAverageCrossStrategy strategy, on a randomly generated set of GBPUSD data for the month of January 2014, is given as follows:


Portfolio value/equity curve, returns profile and drawdown curve for the test MovingAverageCrossStrategy

In particular, you can see the flat sections of the equity curve on the weekends where no data is present (at least, for this simulated data set). In addition you can see that the strategy simply loses money in a rather predictable fashion on this randomly simulated data set.

This is a good test of the system. We are simply attempting to follow a trend on a randomly generated time series. The losses occur due to the fixed spread introduced in the simulation process.

This makes it abundantly clear that if we are to make a consistent profit in higher frequency forex trading we will need a specific quantifiable edge that generates positive returns over and above transaction costs such as spread and slippage.

We will have a lot more to say about this extremely important point in subsequent entries of the Forex Trading Diary.

Next Steps

Fixing Position Calculations

I've recently had a lot of extremely helpful correspondence with QSForex users via the Disqus comments and the QSForex Issues page regarding the correctness of the calculations within the Position class.

Some have noted that the calculations may not be mirroring exactly how OANDA (the broker that is used for the trading.py system) themselves calculate cross-currency trades.

Hence, one of the most important next steps is to actually make and test these suggested modifications in position.py and also update the unit tests that live in position_test.py. This will have a knock-on effect with portfolio.py and also portfolio_test.py.

Performance Measurement

While we now have a basic set of visual performance indicators via the equity curve, returns profile and drawdown series, we are in need of more quantified performance measures.

In particular we will need strategy level metrics, including common risk/reward ratios such as the Sharpe Ratio, Information Ratio and Sortino Ratio. We will also need drawdown statistics including the distribution of the drawdowns, as well as descriptive stats such as maximum drawdown. Other useful metrics include the Compound Annual Growth Rate (CAGR) and total return.

At the trade/position level we want to see metrics such as avg profit/loss, max profit/loss, profit ratio and win/loss ratio. Since we have built the Position class as a fundamental part of the software from the start, it shouldn't be too problematic to generate these metrics via some additional methods. More on this in the next entry, however!
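As a preview of the strategy-level metrics, the Sharpe Ratio and CAGR can be computed directly from the period returns and the equity curve. This is a minimal sketch, assuming a zero risk-free rate; the annualisation factor of 252 trading days is a common convention for daily returns and would need adjusting for tick-level data:

```python
import numpy as np


def sharpe_ratio(returns, periods=252):
    """Annualised Sharpe ratio of a sequence of period returns,
    assuming a zero risk-free rate. 'periods' is the number of
    return periods per year (252 for daily data by convention)."""
    returns = np.asarray(returns, dtype=float)
    return np.sqrt(periods) * returns.mean() / returns.std(ddof=1)


def cagr(start_equity, end_equity, years):
    """Compound Annual Growth Rate between two equity values
    over a span measured in years."""
    return (end_equity / start_equity) ** (1.0 / years) - 1.0
```

For example, an account growing from 100,000 to 121,000 over two years has a CAGR of 10%. The drawdown statistics would follow similarly from the Drawdown series already computed for equity.csv.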
