Cholesky Decomposition in Python and NumPy

Following on from the article on LU Decomposition in Python, we will look at a Python implementation for the **Cholesky Decomposition** method, which is used in certain quantitative finance algorithms.

In particular, it makes an appearance in Monte Carlo Methods where it is used to simulating systems with correlated variables. Cholesky decomposition is applied to the correlation matrix, providing a lower triangular matrix L, which when applied to a vector of uncorrelated samples, u, produces the covariance vector of the system. Thus it is highly relevant for quantitative trading.

Cholesky decomposition assumes that the matrix being decomposed is *Hermitian* and *positive-definite*. Since we are only interested in real-valued matrices, we can replace the property of Hermitian with that of *symmetric* (i.e. the matrix equals its own transpose). Cholesky decomposition is approximately 2x faster than LU Decomposition, where it applies.

In order to solve for the lower triangular matrix, we will make use of the **Cholesky-Banachiewicz Algorithm**. First, we calculate the values for L on the main diagonal. Subsequently, we calculate the off-diagonals for the elements below the diagonal:

As with LU Decomposition, the most efficient method in both development and execution time is to make use of the NumPy/SciPy linear algebra (`linalg`

) library, which has a built in method `cholesky`

to decompose a matrix. The optional `lower`

parameter allows us to determine whether a lower or upper triangular matrix is produced:

import pprint import scipy import scipy.linalg # SciPy Linear Algebra Library A = scipy.array([[6, 3, 4, 8], [3, 6, 5, 1], [4, 5, 10, 7], [8, 1, 7, 25]]) L = scipy.linalg.cholesky(A, lower=True) U = scipy.linalg.cholesky(A, lower=False) print "A:" pprint.pprint(A) print "L:" pprint.pprint(L) print "U:" pprint.pprint(U)

The output from the code is given below:

A: array([[ 6, 3, 4, 8], [ 3, 6, 5, 1], [ 4, 5, 10, 7], [ 8, 1, 7, 25]]) L: array([[ 2.44948974, 0. , 0. , 0. ], [ 1.22474487, 2.12132034, 0. , 0. ], [ 1.63299316, 1.41421356, 2.30940108, 0. ], [ 3.26598632, -1.41421356, 1.58771324, 3.13249102]]) U: array([[ 2.44948974, 1.22474487, 1.63299316, 3.26598632], [ 0. , 2.12132034, 1.41421356, -1.41421356], [ 0. , 0. , 2.30940108, 1.58771324], [ 0. , 0. , 0. , 3.13249102]])

As with LU Decomposition, it is unlikely that you will ever need to code up a Cholesky Decomposition in pure Python (i.e. without NumPy/SciPy), since you can just include the libraries and use the far more efficient implements found within. However, for completeness I have included the pure Python implementation of the Cholesky Decomposition so that you can understand how the algorithm works:

from math import sqrt from pprint import pprint def cholesky(A): """Performs a Cholesky decomposition of A, which must be a symmetric and positive definite matrix. The function returns the lower variant triangular matrix, L.""" n = len(A) # Create zero matrix for L L = [[0.0] * n for i in xrange(n)] # Perform the Cholesky decomposition for i in xrange(n): for k in xrange(i+1): tmp_sum = sum(L[i][j] * L[k][j] for j in xrange(k)) if (i == k): # Diagonal elements # LaTeX: l_{kk} = \sqrt{ a_{kk} - \sum^{k-1}_{j=1} l^2_{kj}} L[i][k] = sqrt(A[i][i] - tmp_sum) else: # LaTeX: l_{ik} = \frac{1}{l_{kk}} \left( a_{ik} - \sum^{k-1}_{j=1} l_{ij} l_{kj} \right) L[i][k] = (1.0 / L[k][k] * (A[i][k] - tmp_sum)) return L A = [[6, 3, 4, 8], [3, 6, 5, 1], [4, 5, 10, 7], [8, 1, 7, 25]] L = cholesky(A) print "A:" pprint(A) print "L:" pprint(L)

The output from the pure Python implementation is given below:

A: [[6, 3, 4, 8], [3, 6, 5, 1], [4, 5, 10, 7], [8, 1, 7, 25]] L: [[2.449489742783178, 0.0, 0.0, 0.0], [1.2247448713915892, 2.1213203435596424, 0.0, 0.0], [1.6329931618554523, 1.414213562373095, 2.309401076758503, 0.0], [3.2659863237109046, -1.4142135623730956, 1.5877132402714704, 3.1324910215354165]]

The SciPy implementation and the pure Python implementation both agree, although we haven't calculated the upper version for the pure Python implementation. In production code you should use SciPy as it will be significantly faster at decomposing larger matrices.

comments powered by DisqusYou'll get instant access to a free 10-part email course packed with hints and tips to help you get started in quantitative trading!

Every week I'll send you a wrap of all activity on QuantStart so you'll never miss a post again.

Real, actionable quant trading tips with no nonsense.