# Conversion from least squares to quadratic programming

Casting a linear least squares to a quadratic program is a common question for newcomers, who see this operation routinely mentioned or taken for granted in writing. Let us review the details of this derivation using linear algebra over normed vector spaces.

## Definitions

### Least Squares

A constrained weighted linear least squares (LS) problem is written as:

This problem seeks the vector \(x\) of optimization variables such that
\(R x\) is as "close" as possible to a prescribed vector \(s\),
meanwhile satisfying a set of linear inequality and equality constraints
defined by the matrix-vector couples \((G, h)\) and \((A, b)\),
respectively. Vector inequalities apply coordinate by coordinate. How close the
two vectors \(R x\) and \(s\) are is measured via the weighted norm
\(\| y \|_W = \sqrt{y^T W y}\), where the weight matrix \(W\) is
positive semi-definite and usually
diagonal. The overall function \(x \mapsto (1/2) \| R x - s \|^2_W\) is
called the *objective function* (*a.k.a.* cost function) of the problem.

### Quadratic Programming

A quadratic program (QP) is written in standard form as:

This problem seeks the vector \(x\) of optimization variables such that the quadratic objective function defined by the (positive semi-definite) matrix \(P\) and vector \(q\) is minimized, meanwhile satisfying a set of linear inequality and equality constraints defined by the matrix-vector couples \((G, h)\) and \((A, b)\), respectively. Vector inequalities apply coordinate by coordinate.

### Equivalence between optimization problems

Let us define formally what it means to "cast" a least squares to a quadratic
program. Since they have the same inequality and equality constraints, the only
difference between them lies in their objective functions. We will say that two
objective functions \(f(x)\) and \(g(x)\) are *equivalent*, written as
\(f(x) \propto g(x)\), if they have the same optimum:

Informally, the symbol \(\propto\) thus means "does not change the optimum of the optimization problem".

## Conversion to a quadratic objective

Our goal, starting from the LS objective function \(f(x) = (1/2) \| R x - s \|^2_W\), is to find a QP objective function \(g(x) = (1/2) x^T P x + q^T x\) such that \(f(x) \propto g(x)\). The one-half multiplicative constant being here by convention (and not affecting the optimum in any way), we can forget it for now and expand the weighted norm as:

We can discard the constant term \(s^T W s\) as it does not depend on the optimization vector \(x\) and therefore doesn't affect the optimum of our problem. We then have:

Next, recall that \(W\) is positive semi-definite, hence in particular symmetric: \(W^T = W\). The number \(s^T W R x\) being real, it is equal to its own transpose, that is:

Back to our rewriting of the objective function, we get:

Multiplying by the positive one-half constant yields:

with \(P = R^T W R\) and \(q = - R^T W s\).

### Example in Python

The qpsolvers Python module
for quadratic programming provides a `solve_ls` function alongside its main
`solve_qp` function. This function boils down to:

def solve_ls(R, s, G, h, A, b, lb, ub, W, solver='quadprog'): WR = dot(W, R) P = dot(R.transpose(), WR) q = -dot(s.transpose(), WR) return solve_qp(P, q, G, h, A, b, lb, ub, solver=solver)

## To go further

LS and QP are not exactly equivalent (thanks to Adrien Escande for correcting this in an earlier version of this post): LS can be cast to QP, but QP can only be cast to LS when the vector \(q\) is in the image of the matrix \(P\).

The post on least squares
gives more details on *e.g.* using the weight matrix to steer the optimum of
the problem, or switching to lexicographic optimization when LS is not
expressive enough. For more focus on implementation, you can also check out the
more practical blog post on quadratic programming in Python.