# svdsketch

Compute SVD of low-rank matrix sketch

## Syntax

``[U,S,V] = svdsketch(A)``
``[U,S,V] = svdsketch(A,tol)``
``[U,S,V] = svdsketch(A,tol,Name,Value)``
``[U,S,V,apxErr] = svdsketch(___)``

## Description

example

````[U,S,V] = svdsketch(A)` returns the singular value decomposition (SVD) of a low-rank matrix sketch of input matrix `A`. The matrix sketch is a low-rank approximation that only reflects the most important features of `A` (up to a tolerance), which enables faster calculation of a partial SVD of large matrices compared to using `svds`.```

example

````[U,S,V] = svdsketch(A,tol)` specifies a tolerance for the matrix sketch. `svdsketch` uses `tol` to adaptively determine the rank of the matrix sketch approximation. As the tolerance gets larger, fewer features of `A` are used in the matrix sketch.```

example

````[U,S,V] = svdsketch(A,tol,Name,Value)` specifies additional options with one or more `Name,Value` pair arguments. For example, you can specify `'MaxIterations'` and a scalar to adjust the number of iterations used to form the matrix sketch.```

example

````[U,S,V,apxErr] = svdsketch(___)` additionally returns a vector `apxErr` that contains the relative approximation error in each iteration, `norm(U*S*V'-A,'fro')/norm(A,'fro')`. The final entry in the vector `apxErr(end)` is the relative approximation error of the output returned by `svdsketch`.```

## Examples

collapse all

Use `svdsketch` to compute the SVD factors of a low-rank matrix approximation.

Use `gallery` to create a 200-by-200 random matrix with geometrically distributed singular values.

`A = gallery('randsvd',200);`

Use `svdsketch` to calculate the SVD of a low-rank approximation of `A`.

`[U,S,V] = svdsketch(A);`

Check the size of the outputs.

`size(S)`
```ans = 1×2 120 120 ```

The results indicate that the low-rank matrix approximation of `A` has a rank of 120.

Specify a tolerance with `svdsketch` to compute the SVD factors of a low-rank matrix approximation. `svdsketch` adaptively determines the appropriate rank of the matrix sketch based on the specified tolerance.

Use `gallery` to create a 200-by-200 random matrix with geometrically distributed singular values.

`A = gallery('randsvd',200);`

Use `svdsketch` to calculate the SVD of a low-rank approximation of A. Specify a tolerance of `1e-2`, and find the size of the output `S` to determine the rank `svdsketch` uses for the matrix sketch.

```tol = 1e-2; [U,S,V] = svdsketch(A,tol); size(S)```
```ans = 1×2 60 60 ```

The results indicate that the low-rank matrix approximation of `A` has a rank of 60.

Examine how well the matrix sketch approximates `A` by comparing `A` to `U*S*V'`.

`norm(U*S*V' - A,'fro')/norm(A,'fro')`
```ans = 0.0048 ```

The result indicates that the matrix sketch approximates `A` within the specified tolerance of `1e-2`.

Use `svdsketch` with the `MaxSubspaceDimension` option on a matrix that has slowly decaying singular values. You can use this option to force `svdsketch` to use a subset of the features of `A` in the matrix sketch.

Create a 5000-by-5000 matrix with values drawn from the standard normal distribution. View the distribution of matrix singular values.

```A = randn(5000); semilogy(svd(A),'-o')```

Since the singular values in `A` decay slowly, `A` has many important features and does not lend itself to low-rank approximations. To form a matrix sketch that reasonably approximates `A`, most or nearly all of the features need to be preserved.

Use `svdsketch` on the matrix with a tolerance of `1e-5`. Specify four outputs to return the SVD factors as well as the relative approximation error in each iteration.

```tol = 1e-5; [U1,S1,V1,apxError1] = svdsketch(A,tol); size(S1)```
```ans = 1×2 5000 5000 ```

The size of `S` indicates that to satisfy the tolerance, `svdsketch` needs to preserve all of the features of `A`. For large sparse input matrices, this can present memory issues since the low-rank approximation of `A` formed by `svdsketch` is roughly the same size as `A` and thus might not fit in memory as a dense matrix.

Check the approximation error of the outputs. Since `svdsketch` preserves everything in `A`, the computed answer is accurate, but the calculation was just an expensive way to calculate `svd(X)`.

`apxError1(end)`
```ans = 1.9075e-08 ```

Now, do the same calculation but specify `MaxSubspaceDimension` as 650 to limit the size of the subspace used to sketch `A`. This is useful to force `svdsketch` to use only a subset of the features in `A` to form the matrix sketch, reducing the size of the outputs.

```[U2,S2,V2,apxError2] = svdsketch(A,tol,'MaxSubspaceDimension',650); size(S2)```
```ans = 1×2 650 650 ```

The outputs now have a smaller size.

Check the approximation error of the new outputs. The tradeoff for forcing the outputs to be smaller is that many important features of A need to be omitted in the matrix sketch, and the resulting rank 650 approximation of `A` does not meet the specified tolerance.

`apxError2(end)`
```ans = 0.8214 ```

## Input Arguments

collapse all

Input matrix, specified as a sparse or full matrix. `A` is typically, but not always, a large and sparse matrix. `svdsketch` is best suited to operate on rank-deficient matrices that have a relatively small number of features.

Data Types: `single` | `double`
Complex Number Support: Yes

Matrix sketch tolerance, specified as a real numeric scalar in the range `sqrt(eps(class(A))) <= tol < 1`.

`svdsketch` uses the value of `tol` to adaptively determine which features of `A` to use in the low-rank approximation (matrix sketch) of `A`. As the value of `tol` increases, `svdsketch` uses fewer features of `A` to form the matrix sketch.

Example: `[U,S,V] = svdsketch(A,1e-4)`

Data Types: `single` | `double`

### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: `[U,S,V] = svdsketch(A,1e-10,'MaxIterations',100)` uses 100 iterations in the `svdsketch` algorithm.

Maximum subspace dimension, specified as a positive integer scalar. The subspace dimension controls the memory consumption of the `svdsketch` algorithm. If the algorithm runs out of memory for a specified tolerance, then you can specify a smaller value for `MaxSubspaceDimension` so that the algorithm stays within memory. For example, adjusting this parameter can be useful when `A` has singular values that decay slowly.

When you specify the `MaxSubspaceDimension` option, you set a maximum value on the rank of the matrix sketch used by `svdsketch`. Therefore, if `svdsketch` cannot satisfy the specified tolerance with a rank smaller than `MaxSubspaceDimension`, it uses the maximum allowed rank, and the resulting outputs might not satisfy the specified tolerance.

Example: ```[U,S,V] = svdsketch(A,1e-7,'MaxSubspaceDimension',150)```

Data Types: `single` | `double`

Initial algorithm block size, specified as a positive integer scalar. The block size is the number by which the rank of the matrix sketch increases each iteration. A larger block size reduces the number of needed iterations by doing more work per iteration, but might also add more information to the calculation than is necessary to achieve convergence.

As the algorithm proceeds, `svdsketch` might adjust the block size from the initial value to speed up convergence if the relative approximation error `apxErr` is not decaying fast enough.

If you specify `BlockSize`, the value should be smaller than `MaxSubspaceDimension`.

Example: ```[U,S,V] = svdsketch(A,1e-7,'BlockSize',10)```

Data Types: `single` | `double`

Maximum number of algorithm iterations, specified as a positive integer scalar. More iterations can produce a higher-quality matrix sketch at the cost of more execution time and higher memory consumption.

Example: ```[U,S,V] = svdsketch(A,1e-7,'MaxIterations',25)```

Data Types: `single` | `double`

Number of power iterations, specified as a nonnegative integer scalar. Power iterations improve the orthogonality of the `U` and `V` outputs. You should generally select the number of power iterations to be `0`, `1`, or `2`, as larger values can adversely contribute to round-off error.

Example: ```[U,S,V] = svdsketch(A,1e-7,'NumPowerIterations',2)```

Data Types: `single` | `double`

## Output Arguments

collapse all

Left singular vectors of matrix sketch, returned as a matrix. The columns of `U` are orthonormal, and they form a set of basis vectors for the range of the matrix sketch of `A`.

The size of `U` depends on the value of `tol`. As `tol` gets larger, `svdsketch` uses fewer features of the input matrix to form the matrix sketch, so `U` and `V` also have fewer columns.

Different machines and releases of MATLAB® can produce different singular vectors that are still numerically accurate. Corresponding columns in `U` and `V` can flip their signs, since this does not affect the value of the expression `A = U*S*V'`.

Singular values of matrix sketch, returned as a square diagonal matrix. The diagonal elements of `S` are the strictly positive singular values of the matrix sketch in decreasing order.

The size of `S` depends on the value of `tol`. As the tolerance increases, `svdsketch` uses fewer features of the input matrix to form the matrix sketch, so `S` has correspondingly fewer rows and columns.

Right singular vectors of matrix sketch, returned as a matrix. The columns of `V` are orthonormal, and they form a set of basis vectors for the null space of the matrix sketch of `A`.

The size of `V` depends on the value of `tol`. As `tol` gets larger, `svdsketch` uses fewer features of the input matrix to form the matrix sketch, so `U` and `V` also have fewer columns.

Different machines and releases of MATLAB can produce different singular vectors that are still numerically accurate. Corresponding columns in `U` and `V` can flip their signs, since this does not affect the value of the expression `A = U*S*V'`.

Relative approximation error in each iteration, returned as a vector. The length of `apxErr` is equal to the number of iterations used in the `svdsketch` algorithm. Use `MaxIterations` to adjust the number of iterations.

The entries of `apxErr` are the relative approximation errors in each iteration, `norm(U*S*V'-A,'fro')/norm(A,'fro')`. The final entry in the vector `apxErr(end)` is the relative approximation error of the output returned by `svdsketch`.

## Tips

• Use `svdsketch` when you do not know ahead of time what rank to specify with `svds`, but you know what tolerance the approximation of the SVD should satisfy.

• `svds` computes the best possible rank-k approximation of the SVD (using the default `"largest"` method). `svdsketch` does not guarantee its rank-k approximation is the best one, which accounts for its speed advantage over `svds`.

## Algorithms

`svdsketch` applies a tolerance to form a low-rank matrix approximation $A\approx QB$ of the input matrix `A`. This low-rank approximation is called a matrix sketch. The matrix sketch only preserves important features from `A`, filtering unnecessary information out. The relative approximation error `apxErr` of the matrix sketch aims to satisfy the specified tolerance `tol`:

`${e}_{\text{sketch}}=\frac{{‖QB-A‖}_{F}}{{‖A‖}_{F}}\le tol$`

The process `svdsketch` follows to form the matrix sketch is:

• `svdsketch` forms the matrix sketch iteratively, with each iteration adding new columns to Q and new rows to B. The new columns and rows are created by extracting features from `A` using a random sample matrix. You can control the number of columns and rows added in each iteration with the `BlockSize` name-value pair.

• During each iteration, `svdsketch` uses power iterations to improve the orthogonality of the new columns in Q. You can adjust the number of power iterations with the `NumPowerIterations` name-value pair.

• The iterations to form the matrix sketch stop when: the number of columns in Q and rows in B reach the specified value for `MaxSubspaceDimension`, the number of iterations reaches `MaxIterations`, or the relative approximation error converges (`apxErr <= tol`).

• To improve convergence speed, `svdsketch` might increase the specified initial value for `BlockSize` from iteration to iteration if the decay in `apxErr` is not sufficient.

After the matrix sketch $A\approx QB$ is formed, `svdsketch` computes the singular value decomposition (SVD) of the matrix sketch via `[U1,S,V] = svd(B,'econ')`, such that

`$A\approx QB=\left(Q{U}_{1}\right)S{V}^{H}=US{V}^{H}.$`

If `svdsketch` is able to filter out some features of `A` based on the specified tolerance, then the resulting SVD factors contain fewer singular values and singular vectors than if a full SVD was performed on `A`.

## References

[1] Yu, Wenjian, Yu Gu, and Yaohang Li. “Efficient Randomized Algorithms for the Fixed-Precision Low-Rank Matrix Approximation.” SIAM Journal on Matrix Analysis and Applications 39, no. 3 (January 2018): 1339–59. https://doi.org/10.1137/17M1141977.