Announcing the QuantStart Advanced Trading Infrastructure Article Series

To date on QuantStart we have considered two major quantitative backtesting and live trading engines. The first arose from the Event-Driven Backtesting series I wrote back in March 2014. The second is QSForex, an open-source backtesting and live trading engine that hooks into the OANDA Forex Broker API, which is still being used by many of you.

I've had a lot of requests recently for an updated version of the event-driven backtesting series and/or a version of QSForex that works on other asset classes. The most common request is that it should play extremely well with Interactive Brokers. Now that we're coming up to 2016, I've also been thinking about updating my own trading infrastructure design.

For these reasons I've decided to write a new article series on how to design and construct an end-to-end infrastructure for a full portfolio and order management system, including a backtesting/research environment, remote server deployment capabilities and algorithmic execution. Eventually I would like this to be cross-asset capable, but it is generally preferable to start with a single asset class in order to avoid excessive configuration complexity.

I'm going to start with US equities/ETFs traded with Interactive Brokers, on a daily frequency, as this is often the most popular request.

In particular, the series will lead to my new personal/QuantStart trading infrastructure as well, so I will have a lot of personal interest in making sure it is robust, reliable and highly efficient! I'll document the process in an "end-to-end" fashion so that you should be able to replicate my results in full.

Note that any future posts on the site that discuss trading strategy performance will make use of this library, allowing you to fully replicate the results as long as you use a) the exact same processed data as I do and b) an identical set of random seeds for any stochastic models used within the code. I will of course outline how to make sure these two criteria are fulfilled!

Once again, I'll be making the software available under an open-source MIT-style license on GitHub, along with all of the scripts and configuration files. This should allow you to either design your own system using mine as a template, or get started on strategy development knowing that you'll have a robust library doing the "heavy lifting" for you.

Design Considerations

This design will be equivalent to what I would write were I still employed at a small quant fund. Thus, I consider the end goal of this project to be a fully open-source, but institutional grade, production-ready portfolio and order management system, with risk management layers across positions, portfolios and the infrastructure as a whole.

It will be end-to-end automated, meaning that minimal human intervention is necessary for the system to trade once it is set "live". It is impossible to completely eliminate human intervention, especially when it comes to input data quality, such as with erroneous ticks, but it is certainly possible to have the system running in an automated fashion most of the time.

Quantitative Trading Considerations

The trading system will mirror the infrastructure that might be found in a small quant fund or family office quant arm. It will be highly modular and loosely coupled. The main components are the data store (securities master), signal generator, portfolio/order management system, risk layer and brokerage interface.
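
To make this loose coupling concrete, below is a minimal Python sketch of how such components might communicate via a shared event queue. All of the class names, the sizing logic and the risk limit here are illustrative placeholders rather than the final design:

```python
import queue

class SignalEvent:
    """Trading recommendation passed from the signal generator to the OMS."""
    def __init__(self, symbol, direction, strength):
        self.symbol = symbol
        self.direction = direction  # "LONG" or "SHORT"
        self.strength = strength    # e.g. model confidence in [0, 1]

class RiskManager:
    """The 'veto' layer consulted by the OMS before any order is sent."""
    MAX_ORDER_QUANTITY = 10_000  # illustrative per-order volume limit

    def approve(self, order):
        return order["quantity"] <= self.MAX_ORDER_QUANTITY

class PortfolioOrderManager:
    """Turns signal recommendations into orders, subject to risk approval."""
    def __init__(self, risk_manager):
        self.risk_manager = risk_manager

    def on_signal(self, signal):
        order = {
            "symbol": signal.symbol,
            "direction": signal.direction,
            "quantity": int(100 * signal.strength),  # naive position sizing
        }
        if self.risk_manager.approve(order):
            print("Order passed risk checks:", order)  # would go to the broker
        else:
            print("Order vetoed by risk layer:", order)

# Components share nothing but the event queue, keeping them loosely coupled
events = queue.Queue()
events.put(SignalEvent("SPY", "LONG", 0.75))  # stand-in for a real signal
oms = PortfolioOrderManager(RiskManager())
while not events.empty():
    oms.on_signal(events.get())
```

The key point is that the signal generator and the OMS never reference each other directly; swapping either out only requires honouring the event interface.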

We have already outlined in previous articles how these systems tend to fit together, but the following is a list of "institutional grade" components that we wish to build the system around:

  • Data Provider Integration - The first major component involves interacting with a set of data providers, usually via some form of API. I make use of Quandl, DTN IQFeed and Interactive Brokers as my providers so I will support these initially.
  • Data Ingestion and Cleaning - In between data download and storage we will have a filtration/cleansing layer that will only store data if it passes certain checks. It will flag "bad" data and note when data is unavailable (a sketch of such a check appears after this list).
  • Pricing Data Storage - We will need to create an intraday securities master database, storing symbols as well as price values obtained from a brokerage or data provider.
  • Trading Data Storage - All of our orders, trades and portfolio states will need to be stored over time. We will use a form of object serialisation/persistence for this, such as Python's pickle library (a first sketch appears after this list).
  • Configuration Data Storage - We will need to store time-dependent configuration information for historical reference in a database, either in tabular format or, once again, in pickled format.
  • Research/Backtesting Environment - We have discussed Python and R at length both on the site and in my two ebooks Successful Algorithmic Trading and Advanced Algorithmic Trading. Our research/backtesting environment will hook into our securities master and ultimately use the same trading logic to generate realistic backtests.
  • Signal Generation - We've discussed machine learning, time series analysis and Bayesian statistics to some degree on the site. We can implement these techniques for our signal generators to produce trading recommendations to our portfolio engine.
  • Portfolio/Order Management - The "heart" of the system will be the portfolio and order management system (OMS) which will receive signals from the signal generator and use them as "recommendations" for constructing orders. The OMS will communicate directly with the risk management component in order to determine how these orders should be constructed.
  • Risk Management - The risk manager will provide a "veto" or modification mechanism for the OMS, such that sector-specific weightings, leverage constraints, broker margin availability and average daily volume limits are kept in place. The risk layer will also handle "umbrella hedge" scenarios, providing market- or sector-wide hedging capability to the portfolio.
  • Brokerage Interface - The brokerage interface will consist of the raw interface code to the broker API (in this case the C++ API of Interactive Brokers) as well as the implementation of multiple order types such as market, limit and stop.
  • Algorithmic Execution - We will implement and utilise automated execution algorithms in order to mitigate market impact effects (a simple order-slicing sketch appears after this list).
  • Accounting and P&L - Accounting is basically answering the question "How much money have I made?". It is actually not that straightforward a question to answer! We will spend a lot of time thinking about how to correctly account for P&L in a professional trading system (a simplified first sketch also follows this list).

Software Engineering Considerations

Perhaps the crucial difference between this system and most "retail" algo trading systems is that high availability, redundancy, monitoring, reporting, accounting, data quality and robust risk management will be given "first class citizen" status within the system. Creating the system in this manner will allow us to have a significant degree of confidence in our automation, allowing us to (eventually) concentrate on optimising signal generation and portfolio management.

The following concepts, the majority of which are taken from the field of professional software engineering, will provide the basis of the design:

  • Automated Task Scheduling - We will use robust automated task scheduling software, such as managed cron, to ensure that our repeated actions are carried out reliably.
  • High Availability - Our system will achieve a significant degree of high availability through redundancy, using multiple instances of our databases and application servers.
  • Backup and Restoration - All of our data will be backed up using robust cloud systems (such as Amazon RDS), allowing straightforward recovery if we have a failure in our databases.
  • Monitoring - Our systems will be continually monitored, including the "usual" metrics of CPU usage, RAM usage, hard disk capacity and network I/O, allowing us to track the "health" of our trading system over time.
  • Logging - As far as possible, everything will be logged in our system to allow retrospective fault-finding and straightforward debugging.
  • Reporting - Our performance will be continually calculated and compared against desired benchmarks, and risk will be continually assessed.
  • Version Control Systems - Our source code, scripts and configuration files will all live in version control, namely GitHub, avoiding the tedium of copying and pasting new versions of code locally and remotely.
  • Test Driven Development - As with QSForex we will follow a full test-driven development (TDD) approach by writing extensive unit tests for the code (a sample test appears after this list).
  • Continuous Delivery - In order to minimise the introduction of bugs that could rapidly eliminate profits, we will use the concepts of Continuous Integration and Continuous Deployment for our server deployment. More on this in future articles.
  • Remote Deployment - We will use a fully cloud/server-based deployment such that there is no local dependence on our trading infrastructure.
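
To give a flavour of the TDD approach, here is a short unittest example against a hypothetical position-sizing helper; both the helper and the behaviour it encodes are illustrative:

```python
import unittest

def position_size(equity, risk_fraction, price):
    """Hypothetical helper: whole shares to buy given account equity,
    the fraction of equity to allocate and the current price."""
    if price <= 0:
        raise ValueError("price must be positive")
    return int((equity * risk_fraction) / price)

class TestPositionSize(unittest.TestCase):
    def test_basic_allocation(self):
        # 2% of $100,000 at $50/share -> 40 shares
        self.assertEqual(position_size(100_000, 0.02, 50.0), 40)

    def test_rounds_down_to_whole_shares(self):
        self.assertEqual(position_size(100_000, 0.02, 30.0), 66)

    def test_rejects_non_positive_price(self):
        with self.assertRaises(ValueError):
            position_size(100_000, 0.02, 0.0)

if __name__ == "__main__":
    unittest.main()
```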

Next Steps

The first task will be to discuss the software stack and tools we will use to build our trading system. This will include our hosting provider, version control and continuous deployment systems, our monitoring tools and our data storage mechanisms (including backup and restore), as well as our choice of brokerage and interface.

In the next article I will outline all of the vendors that I feel are up to the task, as well as a reasonable estimate of costs. We will then proceed to actually flesh out this infrastructure in a detailed manner.