Incoherence is sufficient for recovering a matrix

Theorem. Suppose the $n\times n$ matrix $\mathbf A$ has coherence $\mu$ and rank $r$. There exist universal constants $c, c_1, c_2 > 0$ such that, if each entry of $\mathbf A$ is observed i.i.d. with probability

$$ p > c\frac{\mu r \log^2 n}{n}. $$

Then $\mathbf A$ can be recovered as the unique optimal solution of the following optimization w.p. at least $1 - c_1 n^{-c_2}$:

$$
\begin{align*}
\min_{\hat{A}} &\quad \|\hat{A}\|_* \quad \text{(nuclear norm)} \cr
\text{subject to} &\quad \hat A_{ij} = A_{ij}, \quad \forall (i,j) \in \Omega \text{ (observed set)}
\end{align*}
$$
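To make this concrete, here is a small numerical sketch. It is an illustration only: instead of solving the nuclear-norm program above (which would need a convex solver such as cvxpy), it runs a simple alternating-projection heuristic, projecting onto rank-$r$ matrices and re-imposing the observed entries, which recovers easy low-rank instances.

```python
import numpy as np

# Illustration only: a simple alternating-projection heuristic standing in
# for the nuclear-norm program (which would need a convex solver like cvxpy).
rng = np.random.default_rng(0)
n, r = 20, 1
A = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # ground truth

mask = rng.random((n, n)) < 0.7      # Omega: each entry observed w.p. 0.7
X = np.where(mask, A, 0.0)           # unobserved entries start at zero

for _ in range(300):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X = (U[:, :r] * s[:r]) @ Vt[:r]  # project onto rank-r matrices
    X[mask] = A[mask]                # enforce X_ij = A_ij on Omega

print(np.linalg.norm(X - A) / np.linalg.norm(A))  # small relative error
```

A random Gaussian low-rank matrix like this one is incoherent with high probability, which is exactly the regime where completion from uniform samples works.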

Btw, the coherence $\mu$ is defined as follows.

Definition (Coherence). Assume that matrix $\mathbf A$ has rank $r$ with SVD

$$ \mathbf A = U\Sigma V^T $$

so $U\in \R^{n\times r}$ and $V\in \R^{n\times r}$ are rectangular matrices with orthonormal columns (without loss of generality, $\mathbf A$ can be zero-padded to be a square matrix).

The coherence of $\mathbf A$ is defined as the smallest $\mu\in \R$ such that

$$
\max_{k= 1, \ldots, n} \|U^T e_k\|^2 \le \mu\frac{r}{n}, \quad \max_{k= 1, \ldots, n} \|V^T e_k\|^2 \le \mu\frac{r}{n}.
$$
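Spelled out numerically (a numpy sketch on an assumed random example): $\|U^T e_k\|^2$ is just the squared norm of the $k$th row of $U$, so $\mu$ is the largest squared row norm of $U$ or $V$, rescaled by $n/r$.

```python
import numpy as np

# Sketch: compute mu directly from the definition. ||U^T e_k||^2 is the
# squared norm of the k-th row of U, so mu is the largest squared row norm
# of U or V, rescaled by n / r.
rng = np.random.default_rng(0)
n, r = 100, 5
A = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # random rank-5

U, s, Vt = np.linalg.svd(A, full_matrices=False)
U, V = U[:, :r], Vt[:r].T

mu = (n / r) * max(np.max(np.sum(U**2, axis=1)), np.max(np.sum(V**2, axis=1)))
print(f"coherence: {mu:.2f}")  # modest, far below the worst case n/r = 20
```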

A note on coherence:

We first look at the behavior of the coherence number $\mu$ through two extreme examples.

(i) For any $n\times r$ matrix $U$ with orthonormal columns, the smallest $\mu$ that satisfies $\max_{k= 1, \ldots, n} \|U^T e_k\|^2 \le \mu\frac{r}n$ is $\mu = 1$. Indeed, since $\sum_k \|U^T e_k\|^2 = \|U\|_F^2 = r$, the maximum is at least $r/n$, so $\mu \ge 1$ always; equality holds when every row of $U$ has squared norm exactly $r/n$, e.g. when the columns of $U$ come from a normalized Fourier or Hadamard basis, whose entries all have magnitude $\frac 1{\sqrt n}$.

(ii) On the other hand, the largest coherence occurs when $U$ is (part of) an identity matrix, i.e. the $k$th column of $U$ is $u_k = e_k$. Then $\mu = \frac nr$, and this is in fact $\mu$'s upper bound, since $\|U^T e_k\|^2 \le \|e_k\|^2 = 1$ for any $U$ with orthonormal columns.
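The two extremes can be checked numerically. A sketch, assuming an $8\times 8$ rank-$2$ matrix, with a Hadamard basis standing in for the incoherent case:

```python
import numpy as np

# Sketch: coherence from the definition, mu = (n/r) * max_k ||U^T e_k||^2.
def coherence(A):
    U, s, Vt = np.linalg.svd(A)
    r = int(np.sum(s > 1e-10 * s[0]))  # numerical rank
    n = max(A.shape)
    muU = (n / r) * np.max(np.sum(U[:, :r]**2, axis=1))
    muV = (n / r) * np.max(np.sum(Vt[:r].T**2, axis=1))
    return max(muU, muV)

n, r = 8, 2

# (ii) maximally coherent: column space spanned by standard basis vectors
E = np.zeros((n, n)); E[0, 0] = E[1, 1] = 1.0
print(round(coherence(E), 6))  # n/r = 4

# (i) maximally incoherent: singular vectors from a Hadamard basis,
# every entry of magnitude 1/sqrt(n)
H = np.array([[1.0]])
for _ in range(3):
    H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
H /= np.sqrt(n)
B = H[:, :r] @ np.diag([3.0, 1.0]) @ H[:, r:2*r].T
print(round(coherence(B), 6))  # 1, up to floating point
```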

The coherence number $\mu$ reflects how concentrated matrix $\mathbf A$'s singular vectors ($U$, $V$ from the SVD) are. Intuitively, when coherence $\mu$ is high, the singular vectors of $\mathbf A$ are concentrated, and it's like having a few very important entries in the matrix that determine the rest. Random sampling would likely miss these crucial entries, making it difficult to reconstruct the matrix accurately. So we want **in**coherence (low $\mu$).

Incoherence is not strictly necessary for recovering a matrix

The incoherence assumption is required precisely because of uniform sampling. But imagine this: if you had perfect control over which entries of $\mathbf A$ are observed, technically you would only need about $n\times r$ observations to recover a rank-$r$ matrix.
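A sketch of this "perfect control" scenario, assuming we may pick $r$ full rows and $r$ full columns (about $2nr$ entries in total): for a generic rank-$r$ matrix, the skeleton decomposition $A = C W^{-1} R$ then recovers every other entry.

```python
import numpy as np

# Sketch of the "perfect control" claim: for a generic rank-r matrix, observing
# r full columns (C), r full rows (R), and their intersection (W) determines
# all remaining entries via the skeleton decomposition A = C W^{-1} R.
rng = np.random.default_rng(2)
n, r = 30, 3
A = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-r truth

C = A[:, :r]   # r observed columns
R = A[:r, :]   # r observed rows
W = A[:r, :r]  # their r x r intersection (invertible for generic A)

A_hat = C @ np.linalg.solve(W, R)
print(np.allclose(A_hat, A))  # True: ~2nr entries sufficed
```

No incoherence assumption was needed here; the catch is that we chose exactly which entries to observe.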

So it is plausible that the incoherence requirement can be relaxed when we have some control over which entries of $\mathbf A$ are observed. One way: observe each entry of $\mathbf A$ with a probability that scales with its local coherence:

Definition (Local Coherence)

For an $n_1\times n_2$ matrix $\mathbf A$ with SVD $U\Sigma V^T$ and rank $r$, its local coherences, $\mu_i$ for each row $i$ and $\nu_j$ for each column $j$, are defined by

$$
\begin{align*}
\text{row:}\quad & \|U^T e_i\| = \sqrt{\frac{\mu_i r}{n_1}},\quad \forall i\in [n_1]\cr
\text{column:}\quad & \|V^T e_j\| = \sqrt{\frac{\nu_j r}{n_2}},\quad \forall j\in [n_2]
\end{align*}
$$
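As a sanity check (a numpy sketch on an assumed random example), the local coherences can be read off the rows of $U$ and $V$, and they average to $1$: orthonormal columns force $\sum_i \mu_i = n_1$ and $\sum_j \nu_j = n_2$.

```python
import numpy as np

# Sketch: local coherences from the definition, on an assumed random example.
rng = np.random.default_rng(1)
n1, n2, r = 6, 9, 2
A = rng.standard_normal((n1, r)) @ rng.standard_normal((r, n2))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
U, V = U[:, :r], Vt[:r].T

mu = (n1 / r) * np.sum(U**2, axis=1)  # mu_i from ||U^T e_i||^2 = mu_i r / n1
nu = (n2 / r) * np.sum(V**2, axis=1)  # nu_j from ||V^T e_j||^2 = nu_j r / n2

# Orthonormal columns imply sum_i mu_i = n1 and sum_j nu_j = n2:
print(round(mu.sum(), 6), round(nu.sum(), 6))  # -> 6.0 9.0
```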

And then

Theorem (Chen et al. 2014)

Let the $n_1\times n_2$ matrix $\mathbf A$ have local coherence parameters $\lbrace \mu_i, \nu_j \rbrace$. Then there exist constants $c', c_1', c_2'$ such that, if each entry $A_{ij}$ is observed independently with probability $p_{ij}$ satisfying:

$$
\begin{align*}
p_{ij} & \ge \min\lbrace c'\frac{(\mu_i + \nu_j)r\log^2(n_1 + n_2)}{\min(n_1, n_2)}, 1\rbrace\cr
p_{ij} & \ge \frac{1}{\min(n_1, n_2)^{10}}.
\end{align*}
$$

Then $\mathbf A$ can be recovered as the unique optimal solution of the following optimization w.p. at least $1 - c_1'(n_1 + n_2)^{-c_2'}$ (the same program as in the previous theorem):

$$
\begin{align*}
\min_{\hat{A}} &\quad \|\hat{A}\|_* \quad \text{(nuclear norm)} \cr
\text{subject to} &\quad \hat A_{ij} = A_{ij}, \quad \forall (i,j) \in \Omega \text{ (observed set)}
\end{align*}
$$

Note that the expected number of observations in this theorem is

$$ \sum_{i, j}p_{ij} \ge c' \frac{r \log^2 (n_1 + n_2)}{\min(n_1, n_2)}\sum_{i, j}(\mu_i + \nu_j). $$

Because $U, V$ have orthonormal columns, $\sum_i \mu_i r/n_1 = \sum_j \nu_j r/n_2 = r$, i.e. $\sum_i \mu_i = n_1$ and $\sum_j \nu_j = n_2$. Hence $\sum_{i,j}(\mu_i + \nu_j) = 2 n_1 n_2$, and using $n_1 n_2 / \min(n_1, n_2) = \max(n_1, n_2)$ we further have

$$ \sum_{i, j}p_{ij}\ge 2c'\max(n_1, n_2)r\log^2(n_1 + n_2). $$

This matches the rate in the previous theorem.

Reference

[1] Chen et al. Coherent Matrix Completion. ICML 2014.

[2] V. Morgenshtern. Lectures 19–20: Matrix Completion. Lecture notes.