Sensitivity Analysis

  • Relative error analysis

  • Condition number

We consider the OLS problem:

 \min_x \; \|Ax - y\|_2,

with

  • A \in \mathbf{R}^{m \times n} is the data matrix (known);

  • y \in \mathbf{R}^m is the measurement vector (known);

  • x \in \mathbf{R}^n is the vector to be estimated (unknown).

We assume that A has full column rank. Then the solution \hat{x}_{\rm LS} to the OLS problem is unique, and can be written as a linear function of the measurement vector y:

 \hat{x}_{\rm LS} = A^\dagger y,

with A^\dagger the pseudo-inverse of A. Again, since A has full column rank,

 A^\dagger = (A^T A)^{-1} A^T.
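
As a quick numerical sanity check, here is a minimal NumPy sketch (the dimensions and random data are made up for illustration), comparing the closed-form expression above against NumPy's built-in least-squares routines:

  import numpy as np

  rng = np.random.default_rng(0)
  m, n = 20, 5
  A = rng.standard_normal((m, n))   # a random tall matrix has full column rank almost surely
  y = rng.standard_normal(m)

  # Normal-equations form of the pseudo-inverse solution: (A^T A)^{-1} A^T y
  x_ls = np.linalg.solve(A.T @ A, A.T @ y)

  # Cross-check against the pseudo-inverse and NumPy's least-squares solver
  print(np.allclose(x_ls, np.linalg.pinv(A) @ y))                  # True
  print(np.allclose(x_ls, np.linalg.lstsq(A, y, rcond=None)[0]))   # True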

We are interested in analyzing the impact of perturbations in the vector y on the resulting solution \hat{x}_{\rm LS}. We begin by analyzing the absolute error in the estimate, and then turn to the analysis of relative errors.

Bound on the absolute error

We first try to find a bound on the error in the solution given a bound on the error in the input. Let us assume a simple model of potential perturbations: we assume that \delta y belongs to a ball of radius \alpha: \|\delta y\|_2 \le \alpha, where \alpha > 0 is given. We will assume \alpha = 1 for simplicity; the analysis is easily extended to any \alpha > 0.

Let us define \hat{x}_{\rm LS} + \delta x to be the estimate corresponding to the measurement y + \delta y, that is:

 \hat{x}_{\rm LS} + \delta x = A^\dagger (y + \delta y).

Since \hat{x}_{\rm LS} = A^\dagger y, we have

 \delta x = A^\dagger \, \delta y.

Taking norms, we obtain

 \|\delta x\|_2 \le \|A^\dagger\| \cdot \|\delta y\|_2,

where \|\cdot\| is the matrix norm, which is defined for a matrix B as

 \|B\| := \max_{x \,:\, \|x\|_2 \le 1} \; \|Bx\|_2.

In other words, \|B\| is simply the maximum output norm that we can attain with a unit-norm input. (The matrix norm can be computed from an SVD of the matrix B: it equals the largest singular value of B.)
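
For concreteness, a small NumPy check (with an arbitrary random matrix) that the norm equals the largest singular value, and that sampled unit-norm inputs do not exceed it:

  import numpy as np

  rng = np.random.default_rng(0)
  B = rng.standard_normal((4, 3))

  # The matrix norm is the largest singular value of B
  sigma_max = np.linalg.svd(B, compute_uv=False)[0]
  print(np.isclose(sigma_max, np.linalg.norm(B, 2)))     # True

  # Sampled unit-norm inputs never produce a larger output norm
  X = rng.standard_normal((3, 100_000))
  X /= np.linalg.norm(X, axis=0)                         # normalize each column
  print(np.linalg.norm(B @ X, axis=0).max() <= sigma_max + 1e-12)   # True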

The interpretation is that \|A^\dagger\| is the largest absolute error \|\delta x\|_2 that can result from an absolute error in the input y that is bounded by 1.
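
The bound is easy to verify numerically; the sketch below (data again made up) draws a few random perturbations and checks that none exceeds the worst case \|A^\dagger\|:

  import numpy as np

  rng = np.random.default_rng(0)
  m, n = 20, 5
  A = rng.standard_normal((m, n))
  A_pinv = np.linalg.pinv(A)

  # Worst-case amplification of an absolute error in y
  amp = np.linalg.norm(A_pinv, 2)

  # No random perturbation beats the bound |delta x|_2 <= |A^dagger| |delta y|_2
  for _ in range(5):
      delta_y = rng.standard_normal(m)
      delta_x = A_pinv @ delta_y
      assert np.linalg.norm(delta_x) <= amp * np.linalg.norm(delta_y) + 1e-12
  print("bound verified on 5 random perturbations")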

Bound on the relative error

We now turn to the problem of analyzing the impact of relative errors in the measurement on the relative error in the solution.

To simplify, let us assume that the matrix A is square (m = n) and full rank (hence invertible). In this case A^\dagger = A^{-1}, and the development above simplifies to

 \delta x = (A^T A)^{-1} A^T \delta y = A^{-1} \delta y,

so that

 \|\delta x\|_2 \le \|A^{-1}\| \cdot \|\delta y\|_2.

Likewise, the equation Ax = y allows us to bound \|y\|_2 with respect to \|x\|_2:

 \|y\|_2 \le \|A\| \cdot \|x\|_2.

Combining the two inequalities we obtain:

 \frac{\|\delta x\|_2}{\|x\|_2} \le \left( \|A\| \cdot \|A^{-1}\| \right) \cdot \frac{\|\delta y\|_2}{\|y\|_2}.
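
To see the bound at work, here is a small experiment on the Hilbert matrix, a standard ill-conditioned example (the perturbation size and data are made up):

  import numpy as np

  rng = np.random.default_rng(0)
  n = 4
  # Hilbert matrix: H_ij = 1 / (i + j + 1) with 0-based indices; badly conditioned
  A = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)

  x = rng.standard_normal(n)
  y = A @ x
  delta_y = 1e-8 * rng.standard_normal(n)
  delta_x = np.linalg.solve(A, delta_y)

  rel_in = np.linalg.norm(delta_y) / np.linalg.norm(y)
  rel_out = np.linalg.norm(delta_x) / np.linalg.norm(x)
  kappa = np.linalg.cond(A, 2)
  print(rel_out <= kappa * rel_in)    # True: the bound holds
  print(rel_out / rel_in, kappa)      # observed amplification vs. worst case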

Condition number

As seen above, when A is square and invertible, the quantity

 \kappa(A) := \|A\| \cdot \|A^{-1}\|

measures the sensitivity to relative errors in the system Ax=y.

The quantity above is called the condition number of A. The condition number is always greater than or equal to 1. The closer it is to 1, the more confident we can be about the solution x when y is subject to small relative errors. We can generalize this notion to not-necessarily-square matrices of full rank: in terms of the singular values of A, \kappa(A) = \sigma_{\max}(A) / \sigma_{\min}(A), as checked in the sketch below.
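
A quick NumPy check of this singular-value characterization on a random tall matrix:

  import numpy as np

  rng = np.random.default_rng(0)
  A = rng.standard_normal((20, 5))    # tall, full column rank almost surely

  s = np.linalg.svd(A, compute_uv=False)
  kappa = s[0] / s[-1]                # sigma_max / sigma_min
  print(np.isclose(kappa, np.linalg.cond(A)))   # True: matches NumPy's definition
  print(kappa >= 1.0)                           # the condition number is at least 1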

Many practical algorithms for solving linear equations (or OLS) use the condition number (or an estimate of it) to evaluate the quality of the computed solution.
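
As a rough illustration of that practice (the helper below is hypothetical, not taken from any particular library), one might flag a solve whose condition number suggests that few digits of the result can be trusted:

  import numpy as np

  def solve_checked(A, y, kappa_max=1e8):
      """Solve Ax = y, warning when kappa(A) is large.
      Hypothetical helper, for illustration only."""
      kappa = np.linalg.cond(A)
      if kappa > kappa_max:
          print(f"warning: kappa(A) = {kappa:.1e}; the solution may be inaccurate")
      return np.linalg.solve(A, y)

  # A well-conditioned system passes silently; a Hilbert system triggers the warning
  rng = np.random.default_rng(0)
  A_good = np.eye(3) + 0.1 * rng.standard_normal((3, 3))
  n = 10
  A_bad = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)
  solve_checked(A_good, np.ones(3))
  solve_checked(A_bad, np.ones(n))    # kappa(A_bad) is about 1.6e13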