Linear Programming in Python with CVXOPT

In a previous post, I compared the performances of two Linear Programming (LP) solvers, COIN and GLPK, called by a Python library named PuLP. It then took around 100 ms to solve problems of moderate size. As it turns out, this is way too slow for this kind of problems, probably due to the fact that PuLP calls solvers externally via the command line. In this second post, I used the CVXOPT library and compared the performances with the previous approach. As it turns out, using CVXOPT is 50~70 times faster! Where it took 100 ms with PuLP, it now takes 2~3 ms with CVXOPT on my machine.

You can try out the benchmarking script and test on yours ;)

CVXOPT setup

If you don't plan on using external solvers such as GLPK or MOSEK, installing CVXOPT on Ubuntu or Debian is as simple as:

$ sudo apt-get install python-cvxopt

To install GLPK as well, you'd best build from source. An easy way to get everything done automatically is to use pip:

$ sudo apt-get install libglpk-dev
$ sudo CVXOPT_BUILD_GLPK=1 pip install cvxopt

You should now be able to import cvxopt from Python.

Matrix-vector LP problem

The problem for this benchmark is the same as in the previous post: find a vector \({\bf x} \in [x_\min, x_\max]^n\) that minimizes the maximum of a set of affine functions:

\begin{equation*} \begin{array}{rl} \textrm{minimize} & m \\ \textrm{subject to} & \forall i, \ a_i + \sum_j b_{ij}\,x_j \leq m \\ & \forall i, x_\min \leq x_i \leq x_\max \\ \end{array} \end{equation*}

However, from CVXOPT's documentation, CVXOPT takes LP problems formulated as:

\begin{equation*} \begin{array}{rl} \textrm{minimize} & {\bf c}^\top {\bf x} \\ \textrm{subject to} & {\bf G} {\bf x} \leq {\bf h} \\ & {\bf A} {\bf x} = {\bf b} \end{array} \end{equation*}

We thus need to formulate our problem in matrix-vector form. First, we append \(m\) as the last coordinate of the variables vector \({\bf x}\) so that \(m = {\bf c}^\top {\bf x}\) with \({\bf c} = [0\ 0\ \ldots\ 0\ 1]^\top\). Next, we stack the scalars \(a_i\) into a vector \(\bf a\), and the vectors \({\bf b}_i\) into a matrix \(\bf B\). The LP problem becomes:

\begin{equation*} \begin{array}{rl} \textrm{minimize} & {\bf c}^\top {\bf x} \\ \textrm{s.t.} & {\bf a} + {\bf B} {\bf x} \leq {\bf 0} \\ & x_\min \leq {\bf x} \leq x_\max \end{array} \end{equation*}

Where the vector notation \({\bf a} \leq {\bf b}\) means \(\forall i, a_i \leq b_i\). Each instance of our benchmark problem is then a pair \(({\bf a}, {\bf B})\) that we will generate by uniform random sampling in \([-1, 1]^n \times [-1, 1]^{n \times n}\).

Solving LPs from CVXOPT

Here is the function that solves the LP corresponding to an instance \(({\bf a}, {\bf B})\):

from numpy import array, eye, hstack, ones, vstack, zeros

def cvxopt_solve_minmax(n, a, B, x_min=-42, x_max=42, solver=None):
    c = hstack([zeros(n), [1]])

    # cvxopt constraint format: G * x <= h
    # first,  a + B * x[0:n] <= x[n]
    G1 = zeros((n, n + 1))
    G1[0:n, 0:n] = B
    G1[:, n] = -ones(n)
    h1 = -a

    # then, x_min <= x <= x_max
    x_min = x_min * ones(n)
    x_max = x_max * ones(n)
    G2 = vstack([
        hstack([+eye(n), zeros((n, 1))]),
        hstack([-eye(n), zeros((n, 1))])])
    h2 = hstack([x_max, -x_min])

    c = cvxopt.matrix(c)
    G = cvxopt.matrix(vstack([G1, G2]))
    h = cvxopt.matrix(hstack([h1, h2]))
    sol = cvxopt.solvers.lp(c, G, h, solver=solver)
    return array(sol['x']).reshape((n + 1,))

You can choose which solver to use via the solver keyword argument, for example solver='glpk' to use GLPK. Leaving it to None will call CVXOPT's default solver for Linear Cone Programs, which should be less efficient than GLPK as it solves a more general class of problems.

Disabling the output from GLPK in CVXOPT

A minor problem I had was to disable solver outputs in CVXOPT. The standard way to do that is via the options dictionary in cvxopt.solvers, which is passed to the selected solver at instantiation time:

cvxopt.solvers.options['show_progress'] = False

It works for the default solver, but not with GLPK. A post on CVXOPT's bulletin board points to the parameter LPX_K_MSGLEV, but it didn't work with my version (1.1.7) of the software either. A docstring in the source code src/C/glpk.c mentions another parameter msg_lev, which works for me:

cvxopt.solvers.options['glpk'] = {'msg_lev': 'GLP_MSG_OFF'}  # cvxopt 1.1.8
cvxopt.solvers.options['msg_lev'] = 'GLP_MSG_OFF'  # cvxopt 1.1.7
cvxopt.solvers.options['LPX_K_MSGLEV'] = 0  # previous versions

Comparing solver performances

In this benchmark, I compared four methods:

  • pulp_coin: COIN called via PuLP
  • pulp_glpk: GLPK called via PuLP
  • cvxopt: CVXOPT's default (general) solver
  • cvxopt_glpk: GLPK called via CVXOPT

Here is a sample of computation times on my machine for problems of size \(n=10\):

In [1]: %timeit solve_random_minmax(10, 'pulp_coin')
10 loops, best of 3: 30.4 ms per loop

In [2]: %timeit solve_random_minmax(10, 'pulp_glpk')
10 loops, best of 3: 23.3 ms per loop

In [3]: %timeit solve_random_minmax(10, 'cvxopt')
100 loops, best of 3: 2.22 ms per loop

In [4]: %timeit solve_random_minmax(10, 'cvxopt_glpk')
1000 loops, best of 3: 330 µs per loop

In this case, calling GLPK from CVXOPT rather than PuLP is 70 times faster! The reason is most likely that PuLP writes files and call the command line, while CVXOPT uses its own internal modules. Meanwhile, CVXOPT-GLPK is faster than CVXOPT, which is also expected because the default solver in CVXOPT handles a larger class of problems called Cone Programs.

Here is more benchmarking data:

Results of the benchmark of LP solvers in Python

The bottom line is:

  • cvxopt_glpk is 2 to 10 times faster than cvxopt,
  • cvxopt_glpk and cvxopt are 10 to 70 times faster than PuLP.

This difference is especially significant on small problems. You can try for yourself on your own machine, the full benchmark script is available here: lp-benchmark.py.

Pages of this website are under the CC-BY 4.0 license.