I've written a lot on the site about how to become a financial engineer or a quant analyst, but I've not really delved into the role I actually had in a hedge fund, that of a pricing quantitative developer, or what it involved. Since many of you are probably as interested in programming as in mathematics and finance, it makes sense for me to discuss what the role was actually like and what I was working on "day to day", in case you decide that this type of work is more suitable than a "purer" quant role.
Many systematic/quantitative hedge funds are structured as independent "intrapreneurial" units that consist of small groups of quant researchers, quant traders and quant developers. All of those job titles are prefixed with "quant" because they all involve a significant degree of mathematics. Each aspect of systematic trading is highly interwoven and thus every individual is exposed to mathematics and algorithms.
In systematic funds there are three key areas that need to be implemented before a "trading pipeline" can be established. Broadly, they are:
Unfortunately I won't be talking about the exact algorithm that we used, because this article is not about divulging trading strategies! However, I will discuss the pricing aspects of being a quantitative developer.
Pricing consists of four main areas: connecting to data sources and obtaining data, storing that data in a unified manner, cleaning the data so it is free of errors, and presenting that data to quant researchers in a straightforward, easy-to-use way.
Our fund predominantly, but not exclusively, utilised an equity long/short model as a trading mechanism. We were primarily concerned with the following asset classes: Global equities, fixed income macro and derivatives data, forex spot data (and futures), commodities (futures and options) and indices such as S&P500, FTSE100, VIX etc. Frequencies were predominantly end-of-day/OHLC (open, high, low, close) through to ten minute polls of other proprietary sources.
The first step to building a securities database of this kind is to create what is known as a securities master list. This lists every security/asset that might be of interest in a single, non-duplicated database. One of the key issues with such master lists is that different sources refer to the same security via different codes. It is necessary to construct a securities mapping list that ties each source's code to a single master entry, so that every security has one unique set of pricing data.
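To make the mapping idea concrete, here is a minimal Python sketch of a master list that stores each security once and resolves vendor-specific codes to it. The class names, the vendor names and the ticker codes are all hypothetical and are not taken from the system we actually ran, which lived in a database rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Security:
    """One entry in the master list, keyed by an internal identifier."""
    internal_id: int
    name: str
    asset_class: str
    # Vendor name -> the code that vendor uses for this security (hypothetical).
    vendor_codes: dict = field(default_factory=dict)


class SecurityMaster:
    """Holds each security once and resolves vendor codes to it."""

    def __init__(self):
        self._by_id = {}
        self._by_vendor_code = {}

    def add(self, security):
        # Reject duplicates so the master list stays non-duplicated.
        if security.internal_id in self._by_id:
            raise ValueError(f"duplicate internal_id {security.internal_id}")
        for vendor, code in security.vendor_codes.items():
            if (vendor, code) in self._by_vendor_code:
                raise ValueError(f"{(vendor, code)} is already mapped")
        self._by_id[security.internal_id] = security
        for vendor, code in security.vendor_codes.items():
            self._by_vendor_code[(vendor, code)] = security.internal_id

    def resolve(self, vendor, code):
        """Return the internal id for a vendor-specific code."""
        return self._by_vendor_code[(vendor, code)]


master = SecurityMaster()
master.add(Security(1, "Example Corp", "equity",
                    {"vendor_a": "EXM", "vendor_b": "EXM.L"}))

# Two different vendor codes resolve to the same master entry.
same_security = master.resolve("vendor_a", "EXM") == master.resolve("vendor_b", "EXM.L")
```

The point of keying the lookup on the (vendor, code) pair rather than on the code alone is that two vendors can reuse the same code for different securities.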
Our pricing data was obtained from a mix of proprietary and free sources, usually via Application Programming Interfaces (APIs), so that collection could be carried out in a repeated, automated fashion. We constructed a system to check for errors and flag up concerns if the data was not obtained or did not match other sources of the same securities. Our data was stored in a Relational Database Management System (RDBMS), which had been extensively tweaked for performance and our use cases.
Once the data was downloaded we ran three main types of data analysis and modification scripts. The first checked that identical values were obtained for the same security from separate sources. The second checked that there were no unexplained "spikes" in the data (i.e. significant deviations from the normal trading range), which are usually indicative of an error. The third type of analysis was price adjustment for corporate actions (dividends, stock splits, share issues etc), such that our output returns stream became a series of percentage price changes, rather than absolute prices.
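As a rough illustration of storing prices from several sources in a relational database, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table names, column names and sample values are my own assumptions for the example; they are not the schema or the RDBMS we used at the fund.

```python
import sqlite3

# Hypothetical schema: one row per security, source and date.
SCHEMA = """
CREATE TABLE security (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE daily_price (
    security_id INTEGER NOT NULL REFERENCES security(id),
    source TEXT NOT NULL,
    price_date TEXT NOT NULL,
    open_price REAL,
    high_price REAL,
    low_price REAL,
    close_price REAL,
    PRIMARY KEY (security_id, source, price_date)
);
"""


def create_database():
    """Create an in-memory database holding the example schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def store_bar(conn, security_id, source, price_date, o, h, l, c):
    """Insert one OHLC bar, overwriting any earlier download of the same bar."""
    conn.execute(
        "INSERT OR REPLACE INTO daily_price VALUES (?, ?, ?, ?, ?, ?, ?)",
        (security_id, source, price_date, o, h, l, c),
    )


conn = create_database()
conn.execute("INSERT INTO security (id, name) VALUES (1, 'Example Corp')")
store_bar(conn, 1, "vendor_a", "2020-01-02", 10.0, 10.5, 9.8, 10.2)
# A repeated download of the same bar replaces the earlier row.
store_bar(conn, 1, "vendor_a", "2020-01-02", 10.0, 10.5, 9.8, 10.3)
store_bar(conn, 1, "vendor_b", "2020-01-02", 10.0, 10.5, 9.8, 10.3)

rows = conn.execute(
    "SELECT source, close_price FROM daily_price "
    "WHERE security_id = 1 ORDER BY source"
).fetchall()
```

Keeping the source in the primary key means the same security can hold one bar per vendor per date, which is what makes later cross-source comparison possible.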
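The three checks can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the function names, the tolerance, the 20% spike threshold and the sample prices are all made up for the example, and the adjustment shown handles a single stock split only, not dividends or share issues.

```python
def cross_source_mismatches(prices_a, prices_b, tolerance=1e-6):
    """Dates where two sources disagree on the close, or where one is missing."""
    mismatches = []
    for date in sorted(set(prices_a) | set(prices_b)):
        a, b = prices_a.get(date), prices_b.get(date)
        if a is None or b is None or abs(a - b) > tolerance:
            mismatches.append(date)
    return mismatches


def percentage_returns(closes):
    """Convert a list of closes into fractional close-to-close changes."""
    return [(curr - prev) / prev for prev, curr in zip(closes, closes[1:])]


def spike_indices(closes, threshold=0.2):
    """Indices into closes whose move from the previous close exceeds threshold."""
    return [i + 1 for i, r in enumerate(percentage_returns(closes))
            if abs(r) > threshold]


def adjust_for_split(closes, split_index, ratio):
    """Divide closes before split_index by ratio (e.g. 2.0 for a 2-for-1 split)."""
    return [c / ratio if i < split_index else c for i, c in enumerate(closes)]


# Check 1: the two hypothetical sources disagree on the second date.
source_a = {"2020-01-02": 10.0, "2020-01-03": 10.5}
source_b = {"2020-01-02": 10.0, "2020-01-03": 10.7}
mismatches = cross_source_mismatches(source_a, source_b)

# Check 2: a 2-for-1 split at index 2 shows up as a spike in the raw closes.
raw_closes = [100.0, 102.0, 51.5, 52.0]
raw_spikes = spike_indices(raw_closes)

# Check 3: after adjusting for the split the spike is explained and disappears.
adjusted = adjust_for_split(raw_closes, split_index=2, ratio=2.0)
adjusted_spikes = spike_indices(adjusted)
returns = percentage_returns(adjusted)
```

The example also shows why the order of the checks matters: a corporate action looks exactly like a data error until the adjustment has been applied, so a flagged spike needs to be compared against known corporate actions before it is treated as bad data.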
This data was then exposed to other software packages via a mixture of internally written APIs and database replication techniques.
This entire process was eventually fully automated. The only manual tasks that needed to be carried out were checking error logs and fixing data sources, adding new data sources and adjusting APIs to allow additional functions to be called.
On top of my duties as a "pricing quant dev" I also produced web-based reporting tools, portfolio reconciliation tools and a variety of other "housekeeping" scripts for certain tasks. All of this software was written in a mixture of Python (80%) and C++ (20%). I used C++ where I needed extensive speed-up of some algorithms (particularly portfolio reconciliation) and Python for most of the data collection and storage. We also made heavy use of MatLab and Excel for our strategy development and analysis.