# Linear Programming in Python with CVXOPT

In a previous post,
I compared the performances of two Linear Programming (LP) solvers, COIN and
GLPK, called by a Python library named PuLP. It then took around 100 ms to solve
problems of moderate size. As it turns out, this is *way too slow* for this
kind of problems, probably due to the fact that PuLP calls solvers externally
via the command line. In this second post, I used the CVXOPT library and compared the performances with the previous
approach. As it turns out, using CVXOPT is 50~70 times faster! Where it took
100 ms with PuLP, it now takes 2~3 ms with CVXOPT on my machine.

## CVXOPT setup

If you don't plan on using external solvers such as GLPK or MOSEK, installing CVXOPT on Ubuntu or Debian is as simple as:

$ sudo apt-get install python-cvxopt

To install GLPK as well, you'd best build from source. An easy way to get
everything done automatically is to use `pip`:

$ sudo apt-get install libglpk-dev $ sudo CVXOPT_BUILD_GLPK=1 pip install cvxopt

You should now be able to import `cvxopt` from Python.

## Matrix-vector LP problem

The problem for this benchmark is the same as in the previous post: find a vector \({\bf x} \in [x_\min, x_\max]^n\) that minimizes the maximum of a set of affine functions:

However, from CVXOPT's documentation, CVXOPT takes LP problems formulated as:

We thus need to formulate our problem in matrix-vector form. First, we append \(m\) as the last coordinate of the variables vector \({\bf x}\) so that \(m = {\bf c}^\top {\bf x}\) with \({\bf c} = [0\ 0\ \ldots\ 0\ 1]^\top\). Next, we stack the scalars \(a_i\) into a vector \(\bf a\), and the vectors \({\bf b}_i\) into a matrix \(\bf B\). The LP problem becomes:

Where the vector notation \({\bf a} \leq {\bf b}\) means \(\forall i, a_i \leq b_i\). Each instance of our benchmark problem is then a pair \(({\bf a}, {\bf B})\) that we will generate by uniform random sampling in \([-1, 1]^n \times [-1, 1]^{n \times n}\).

## Solving LPs from CVXOPT

Here is the function that solves the LP corresponding to an instance \(({\bf a}, {\bf B})\):

from numpy import array, eye, hstack, ones, vstack, zeros def cvxopt_solve_minmax(n, a, B, x_min=-42, x_max=42, solver=None): c = hstack([zeros(n), [1]]) # cvxopt constraint format: G * x <= h # first, a + B * x[0:n] <= x[n] G1 = zeros((n, n + 1)) G1[0:n, 0:n] = B G1[:, n] = -ones(n) h1 = -a # then, x_min <= x <= x_max x_min = x_min * ones(n) x_max = x_max * ones(n) G2 = vstack([ hstack([+eye(n), zeros((n, 1))]), hstack([-eye(n), zeros((n, 1))])]) h2 = hstack([x_max, -x_min]) c = cvxopt.matrix(c) G = cvxopt.matrix(vstack([G1, G2])) h = cvxopt.matrix(hstack([h1, h2])) sol = cvxopt.solvers.lp(c, G, h, solver=solver) return array(sol['x']).reshape((n + 1,))

You can choose which solver to use via the `solver` keyword argument, for
example `solver='glpk'` to use GLPK. Leaving it to `None` will call
CVXOPT's default solver for Linear Cone Programs, which should be less
efficient than GLPK as it solves a more general class of problems.

### Disabling the output from GLPK in CVXOPT

A minor problem I had was to disable solver outputs in CVXOPT. The standard way
to do that is via the `options` dictionary in `cvxopt.solvers`, which is
passed to the selected solver at instantiation time:

cvxopt.solvers.options['show_progress'] = False

It works for the default solver, but not with GLPK. A post on CVXOPT's bulletin
board points to the parameter `LPX_K_MSGLEV`, but it didn't work with my
version (1.1.7) of the software either. A docstring in the source code src/C/glpk.c
mentions another parameter `msg_lev`, which works for me:

cvxopt.solvers.options['glpk'] = {'msg_lev': 'GLP_MSG_OFF'} # cvxopt 1.1.8 cvxopt.solvers.options['msg_lev'] = 'GLP_MSG_OFF' # cvxopt 1.1.7 cvxopt.solvers.options['LPX_K_MSGLEV'] = 0 # previous versions

## Comparing solver performances

In this benchmark, I compared four methods:

`pulp_coin`: COIN called via PuLP`pulp_glpk`: GLPK called via PuLP`cvxopt`: CVXOPT's default (general) solver`cvxopt_glpk`: GLPK called via CVXOPT

Here is a sample of computation times on my machine for problems of size \(n=10\):

In [1]: %timeit solve_random_minmax(10, 'pulp_coin') 10 loops, best of 3: 30.4 ms per loop In [2]: %timeit solve_random_minmax(10, 'pulp_glpk') 10 loops, best of 3: 23.3 ms per loop In [3]: %timeit solve_random_minmax(10, 'cvxopt') 100 loops, best of 3: 2.22 ms per loop In [4]: %timeit solve_random_minmax(10, 'cvxopt_glpk') 1000 loops, best of 3: 330 µs per loop

In this case, calling GLPK from CVXOPT rather than PuLP is 70 times faster! The reason is most likely that PuLP writes files and call the command line, while CVXOPT uses its own internal modules. Meanwhile, CVXOPT-GLPK is faster than CVXOPT, which is also expected because the default solver in CVXOPT handles a larger class of problems called Cone Programs.

Here is more benchmarking data:

The bottom line is:

`cvxopt_glpk`is 2 to 10 times faster than`cvxopt`,`cvxopt_glpk`and`cvxopt`are 10 to 70 times faster than PuLP.

This difference is especially significant on small problems. You can try for yourself on your own machine, the full benchmark script is available here: lp-benchmark.py.