Week 6: SDP and Combinatorial Optimization
Instructor: Prof. Xin Wang
Date: October 16, 2025

Outline:
1. Matrix Norms via SDP: Nuclear & Spectral Norms
2. Maximum Cut Problem: Graph Laplacian & Binary Encoding
3. Goemans-Williamson: 0.878-Approximation Algorithm
4. Shannon Capacity: Lovász Theta Function
For a matrix $A \in \mathbb{R}^{m \times n}$ with singular values $\sigma_1(A) \geq \sigma_2(A) \geq \cdots \geq 0$:
Nuclear norm $\|A\|_1 = \sum_k \sigma_k(A)$: sum of all singular values (also called trace norm)
Frobenius norm $\|A\|_2 = (\sum_{i,j} A_{ij}^2)^{1/2}$: Euclidean norm of the matrix entries
Spectral norm $\|A\|_\infty = \sigma_1(A)$: largest singular value (also called operator norm or $\ell_2$ operator norm)
The nuclear and spectral norms are dual to each other:
$\|A\|_1 = \max\{\langle A, Y\rangle : \|Y\|_\infty \leq 1\}$
where $\langle A, Y \rangle = \tr(A^TY)$ is the trace inner product
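A quick numerical check of these definitions and of the duality, using NumPy's SVD (the random matrix is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))

# Singular values of A, in decreasing order
s = np.linalg.svd(A, compute_uv=False)

nuclear = s.sum()                    # ||A||_1: sum of singular values
frobenius = np.sqrt((s ** 2).sum())  # ||A||_2: Euclidean norm of the entries
spectral = s.max()                   # ||A||_inf: largest singular value

assert np.isclose(frobenius, np.linalg.norm(A, "fro"))
assert spectral <= frobenius <= nuclear

# Duality: <A, Y> over ||Y||_inf <= 1 is maximized by Y = U V^T
U, _, Vt = np.linalg.svd(A, full_matrices=False)
Y = U @ Vt
assert np.isclose(np.trace(A.T @ Y), nuclear)  # attains the nuclear norm
assert np.isclose(np.linalg.norm(Y, 2), 1.0)   # ||Y||_inf = 1
```

The maximizer $Y = UV^T$ aligns with every singular direction of $A$ at once, which is exactly why the trace inner product picks up the full sum of singular values.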
For a symmetric matrix $M \in \mathbb{S}^n$, the spectral norm is $\|M\|_\infty = \max_i |\lambda_i(M)|$
Consider the block matrix $X(t) := \begin{bmatrix} tI & M \\ M^{T} & tI \end{bmatrix}$
Key Insight (Schur Complement): $X(t) \succeq 0$ if and only if $t \geq 0$ and $M^{T}M \preceq t^{2}I$, i.e. $\|M\|_\infty \leq t$; hence $\|M\|_\infty = \min\{t : X(t) \succeq 0\}$
For a general matrix $M \in \mathbb{R}^{m \times n}$, the nuclear norm $\|M\|_1 = \sum_{k} \sigma_k(M)$ admits the SDP formulation:
$\minimize \ \frac{1}{2}(\tr(U) + \tr(V))$ subject to $\begin{bmatrix} U & M \\ M^{T} & V \end{bmatrix} \succeq 0$
Correctness: At optimum, by complementarity, $U$ and $V$ are tight, giving $U = (MM^T)^{1/2}$ and $V = (M^TM)^{1/2}$, hence $\tr(U) + \tr(V) = 2\sum_k \sigma_k(M)$
The PSD constraint ensures that every feasible $(U,V)$ satisfies $\frac{1}{2}(\tr(U) + \tr(V)) \geq \|M\|_1$, so the optimum never falls below the nuclear norm
With dual variable $Y \in \mathbb{R}^{m \times n}$, the dual SDP is:
$\maximize \ \tr(M^{T}Y)$
subject to: $\begin{bmatrix} I_m & Y \\ Y^{T} & I_n \end{bmatrix} \succeq 0$,
which is equivalent to $\|Y\|_\infty \leq 1$
The unit balls $\{\|M\|_\infty \leq 1\}$ and $\{\|M\|_1 \leq 1\}$ are polar to each other under the trace inner product, reflecting the fundamental duality between spectral and nuclear norms.
Weighted Undirected Graph: $G = (V,E,w)$ with $V = \{1,\dots,n\}$ and edge weights $w_{ij} > 0$ for $\{i,j\} \in E$. MAX-CUT asks for a subset $S \subseteq V$ maximizing the total weight $\val(S,S^{c})$ of the edges crossing the cut $(S,S^{c})$
This is NP-hard!
No polynomial-time exact algorithm unless P = NP (Karp, 1972)
Complete graph on 3 vertices with unit weights:
Optimal value: 2 (any partition with $|S| = 1$, which cuts two of the three edges)
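This small instance can be verified by brute force, enumerating the $2^3$ bipartitions as sign vectors (a minimal sketch; the 0-based vertex labels are ours):

```python
from itertools import product

# Unit-weight triangle K3
edges = [(0, 1), (0, 2), (1, 2)]

def cut_value(x):
    # x[i] = +1 if vertex i is in S, -1 if in S^c
    return sum((1 - x[i] * x[j]) // 2 for i, j in edges)

best = max(cut_value(x) for x in product((-1, 1), repeat=3))
print(best)  # 2: putting any single vertex on one side cuts two edges
```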
Associate to every cut $(S,S^{c})$ the sign vector $x\in\{-1,+1\}^{n}$:
For any edge $\{i,j\} \in E$: the edge is cut iff $x_ix_j = -1$, so its indicator is $\frac{1-x_ix_j}{2}$
Since $(1-x_ix_j)/2 = (x_i - x_j)^2/4$ when $x_i, x_j \in \{-1,+1\}$
Define $L_G\in\mathbb{S}^{n}$ such that $x^{T}L_Gx = \sum_{\{i,j\}\in E} w_{ij}(x_i - x_j)^2$ for all $x$
Explicit form: $(L_G)_{ii} = \sum_{j:\{i,j\}\in E} w_{ij}$, $(L_G)_{ij} = -w_{ij}$ for $\{i,j\}\in E$, and $0$ otherwise, i.e. $L_G = D - W$
Note: $L_G$ is positive semidefinite with smallest eigenvalue 0
Step 1: Cut value in terms of indicator: $\val(S,S^{c}) = \sum_{\{i,j\}\in E} w_{ij}\frac{1-x_ix_j}{2}$
Step 2: Convert to quadratic form: $= \frac{1}{4}\sum_{\{i,j\}\in E} w_{ij}(x_i - x_j)^2$
Step 3: Express using Laplacian: $= \frac{1}{4}x^{T}L_{G}x$
Conclusion: MAX-CUT $\equiv$ $\maximize_{x\in\{-1,1\}^{n}} \frac{1}{4}x^{T}L_{G}x$
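The equivalence can be sanity-checked numerically (a sketch; the weighted 4-cycle and its weights are arbitrary choices):

```python
import numpy as np

# Weighted 4-cycle with illustrative weights
n = 4
W = np.zeros((n, n))
for i, j, w in [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 3.0), (3, 0, 4.0)]:
    W[i, j] = W[j, i] = w

L = np.diag(W.sum(axis=1)) - W          # graph Laplacian L_G = D - W

x = np.array([1, -1, 1, -1])            # the cut S = {0, 2}
cut = sum(W[i, j] * (1 - x[i] * x[j]) / 2
          for i in range(n) for j in range(i + 1, n))

assert np.isclose(cut, x @ L @ x / 4)           # cut value = (1/4) x^T L_G x
assert np.all(np.linalg.eigvalsh(L) >= -1e-9)   # L_G is PSD
print(cut)  # 10.0: this cut severs all four edges
```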
Given feasible $x\in\{-1,1\}^{n}$, define $X := xx^{T}\in \mathbb{S}^{n}$
$X\succeq 0$
(PSD as Gram matrix)
$\text{rank}(X)=1$
(Rank-one structure)
$X_{ii}=1$
(Since $x_i^2 = 1$)
Key observation: $x^{T}L_Gx = \sum_{ij}(L_G)_{ij}x_ix_j = \sum_{ij}(L_G)_{ij}X_{ij} = \tr(L_GX)$
Drop the non-convex rank-one constraint to obtain the SDP relaxation:
$\maximize_{X\in\mathbb{S}^{n}} \ \frac{1}{4}\tr(L_GX)$ subject to $X \succeq 0$, $X_{ii} = 1$ for all $i$
Alternative formulation using $\tr(L_GX) = \sum_{\{i,j\} \in E}w_{ij}(X_{ii} + X_{jj} - 2X_{ij})$:
Cut Polytope: $\conv\{xx^T : x \in \{-1,1\}^n\}$
Elliptope: $\mathcal{E}_n = \{X \succeq 0 : X_{ii} = 1, i=1,\ldots,n\}$
We have: Cut Polytope $\subseteq$ Elliptope
SDP optimizes over the larger elliptope
Fundamental Question: How close is the SDP optimum to the true MAX-CUT value?
Every convex relaxation of a discrete optimization problem provides an upper bound (for maximization) or lower bound (for minimization) on the optimal value. The quality of this bound depends on how tight the relaxation is.
Theorem (Goemans-Williamson, 1995): random-hyperplane rounding of the SDP optimum returns a cut with $\mathbb{E}[\val(S,S^{c})] \geq \alpha_{GW} \cdot \mathrm{SDP}(G)$,
where $\alpha_{GW} = \min_{\theta\in(0,\pi]} \frac{2}{\pi}\cdot\frac{\theta}{1-\cos\theta} \approx 0.87856$
The minimum is achieved at $\theta \approx 2.331$ radians (equivalently $t = \cos\theta \approx -0.689$)
This is the best possible for polynomial-time algorithms assuming the Unique Games Conjecture (Khot et al., 2007)!
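The constant $\alpha_{GW}$ and the minimizing angle can be reproduced by a simple grid search (a sketch; the grid resolution is arbitrary):

```python
import math

def alpha(theta):
    # ratio of rounding probability theta/pi to SDP edge term (1 - cos(theta))/2
    return (theta / math.pi) / ((1 - math.cos(theta)) / 2)

N = 100_000  # grid resolution over (0, pi]
theta_star = min((k * math.pi / N for k in range(1, N + 1)), key=alpha)
print(alpha(theta_star), theta_star)  # roughly 0.8786 near theta = 2.331
```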
Input: Optimal SDP solution $X^{\star}$ with Cholesky factorization $X^{\star} = VV^T$
where $V \in \mathbb{R}^{n \times r}$ (with $r = \text{rank}(X^{\star})$) has rows $v_1, \ldots, v_n$ that are unit vectors: $\|v_i\|_2 = 1$
The algorithm cuts the unit vectors $\{v_i\}_{i=1}^n \subset \mathbb{R}^r$ by a random hyperplane through the origin with normal direction $g$. Vertices on the same side of the hyperplane are placed in the same partition.
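The full pipeline can be simulated on the triangle $K_3$, whose SDP optimum is known in closed form (unit vectors at mutual angle $2\pi/3$, i.e. $X^{\star}_{ij} = -1/2$ off the diagonal), so no solver is needed:

```python
import numpy as np

rng = np.random.default_rng(7)

# Known SDP optimum for K3: three unit vectors at mutual angle 2*pi/3
angles = np.array([0, 2 * np.pi / 3, 4 * np.pi / 3])
V = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # rows v_1, v_2, v_3
edges = [(0, 1), (0, 2), (1, 2)]

# Random-hyperplane rounding, averaged over many Gaussian draws
g = rng.standard_normal((100_000, 2))
X = np.sign(g @ V.T)                 # each row: a rounded cut x in {-1,+1}^3
cuts = sum((1 - X[:, i] * X[:, j]) / 2 for i, j in edges)
print(cuts.mean())  # close to 2.0, the true MAX-CUT value of the triangle
```

On this instance the rounding loses nothing: no halfplane contains all three vectors, so every draw cuts exactly two edges.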
For the rounding $x_i = \sign(\langle v_i,g\rangle)$ where $g \sim \mathcal{N}(0,I_r)$:
where $\theta_{ij} = \arccos(\langle v_i,v_j\rangle) \in [0,\pi]$ is the angle between $v_i$ and $v_j$
Let $\theta = \arccos(\langle v_i,v_j\rangle)$ be the angle between $v_i$ and $v_j$.
Key insight: $x_ix_j = -1$ (different signs) occurs when $g$ falls in a cone of angular width $\theta$ on each side.
By rotational symmetry of the Gaussian, the direction of the projection of $g$ onto $\mathrm{span}(v_i,v_j)$ is uniform on the circle
Therefore: $\Pr[x_i \neq x_j] = \frac{2\theta}{2\pi} = \frac{\theta}{\pi}$
Using the identity $\val(S,S^c) = \frac{1}{4}x^TL_Gx$:
$\mathbb{E}[\val(S,S^{c})] = \sum_{\{i,j\}\in E} w_{ij}\Pr[x_i \neq x_j] = \sum_{\{i,j\}\in E} w_{ij}\frac{\theta_{ij}}{\pi}$
The SDP optimal value is:
$\mathrm{SDP}(G) = \frac{1}{4}\tr(L_GX^{\star}) = \sum_{\{i,j\}\in E} w_{ij}\frac{1 - X^{\star}_{ij}}{2}$
Since $X^*_{ij} = \langle v_i, v_j \rangle = \cos\theta_{ij}$:
$\mathrm{SDP}(G) = \sum_{\{i,j\}\in E} w_{ij}\frac{1-\cos\theta_{ij}}{2}$
Define $\alpha(\theta) := \frac{\theta/\pi}{(1-\cos\theta)/2}$ for $\theta \in (0,\pi]$
Then: $\frac{2\theta}{\pi} \geq \alpha_{GW} \cdot (1-\cos\theta)$ where $\alpha_{GW} = \min_{\theta \in (0,\pi]} \alpha(\theta) \approx 0.87856$
Therefore: $\mathbb{E}[\val(S,S^{c})] = \sum_{\{i,j\}\in E} w_{ij}\frac{\theta_{ij}}{\pi} \geq \alpha_{GW}\sum_{\{i,j\}\in E} w_{ij}\frac{1-\cos\theta_{ij}}{2} = \alpha_{GW}\cdot\mathrm{SDP}(G) \geq \alpha_{GW}\cdot\mathrm{MAXCUT}(G)$
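The lemma's probability can be verified by Monte Carlo simulation (a sketch; the angle $\theta = 2$ and the sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

theta = 2.0                                   # angle between v_i and v_j
v_i = np.array([1.0, 0.0])
v_j = np.array([np.cos(theta), np.sin(theta)])

# x = sign(<v, g>) for standard Gaussian g, many independent draws
g = rng.standard_normal((200_000, 2))
sep = np.sign(g @ v_i) != np.sign(g @ v_j)    # rounded to different sides
print(sep.mean(), theta / np.pi)              # empirical vs. theta/pi
```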
We'll explore how SDP techniques apply to information theory and graph coloring problems through the Lovász theta function.
Setting: Transmit symbols from alphabet $\{1,\dots,n\}$ over a noisy channel
Confusability Graph $G=(V,E)$: vertices are symbols; $\{i,j\} \in E$ iff symbols $i$ and $j$ can be confused at the receiver
With a single channel use, the largest zero-error code has size $\alpha(G)$, where $\alpha(G)$ is the independence number (maximum size of an independent set)
Transmit blocks of length $k$: $(i_1,\dots,i_k) \in V^k$
$G^{\boxtimes k}$ has vertex set $V^k$; distinct tuples $(i_1,\dots,i_k)$ and $(j_1,\dots,j_k)$ are adjacent
if for every coordinate $\ell$, either $i_\ell=j_\ell$ or $\{i_\ell,j_\ell\} \in E(G)$
Shannon Capacity: $\Theta(G) := \lim_{k\to\infty} \alpha(G^{\boxtimes k})^{1/k}$
(The limit exists by Fekete's lemma since $\{\log\alpha(G^{\boxtimes k})\}$ is superadditive)
Information Rate: $\log_2\Theta(G)$ bits per symbol
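The value $\alpha(C_5^{\boxtimes 2}) = 5$, used below for the pentagon, can be confirmed by brute force; the code checks the classical 5-element code $\{(i, 2i \bmod 5)\}$ and that no 6-element independent set exists (the 0-based pentagon indexing is ours):

```python
from itertools import combinations

def adj(a, b):                    # C5 adjacency on vertices 0..4
    return a != b and (a - b) % 5 in (1, 4)

verts = [(a, b) for a in range(5) for b in range(5)]

def sp_adj(u, v):                 # strong product C5 x C5: distinct pairs
    # adjacent iff each coordinate is equal or adjacent in C5
    return u != v and all(x == y or adj(x, y) for x, y in zip(u, v))

def independent(S):
    return all(not sp_adj(u, v) for u, v in combinations(S, 2))

assert independent([(i, 2 * i % 5) for i in range(5)])          # a 5-element code
assert not any(independent(S) for S in combinations(verts, 6))  # and no larger one
print("alpha(C5 strong-product C5) = 5  =>  Theta(C5) >= sqrt(5)")
```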
Alternative form: $\Theta(G) = \sup_{k} \alpha(G^{\boxtimes k})^{1/k}$
A collection $(c,u_1,\dots,u_n)$ of unit vectors in $\mathbb{R}^m$ is an orthonormal representation of $G$ if $\langle u_i,u_j\rangle = 0$ whenever $i \neq j$ and $\{i,j\} \notin E$
Non-adjacent vertices must have orthogonal unit vectors!
Note: All vectors are unit: $\|c\|_2 = \|u_i\|_2 = 1$
Lovász theta function: $\vartheta(G) := \min \max_{i} \frac{1}{\langle c,u_i\rangle^{2}}$, where the minimum is over all orthonormal representations of $G$
Alternative formulation: $\vartheta(G) = \max \sum_{i=1}^{n}\langle c,u_i\rangle^{2}$, the maximum over orthonormal representations of the complement $\overline{G}$, is the dual form
Sandwich Theorem: $\alpha(G) \leq \vartheta(G) \leq \chi(\overline{G})$
Lower bound: an independent set gives pairwise orthogonal vectors, forcing $\vartheta(G) \geq \alpha(G)$
Upper bound: $\chi(\overline{G})$ is the chromatic number of the complement
And $\vartheta(G)$ is computable to any accuracy in polynomial time, via semidefinite programming!
Furthermore, $\vartheta$ is multiplicative under strong product: $\vartheta(G \boxtimes H) = \vartheta(G) \cdot \vartheta(H)$
This implies $\vartheta(G^{\boxtimes k}) = \vartheta(G)^k$, so $\Theta(G) \leq \vartheta(G)$
Since computing $\alpha(G)$ is NP-hard, $\vartheta(G)$ provides a polynomial-time computable upper bound!
Let $u_0 = c$ and construct the $(n+1) \times (n+1)$ Gram matrix $X = (\langle u_i, u_j \rangle)_{i,j=0}^{n} \succeq 0$
So $X_{ij} = \langle u_i, u_j \rangle$ for $i,j = 0,1,\ldots,n$
Note: The objective $\sum_{i=1}^n X_{0i} = \sum_{i=1}^n \langle c,u_i\rangle$ is maximized when all $\langle c,u_i\rangle$ are equal
where $J$ is the all-ones matrix and $A_{\overline{G}}$ is the adjacency matrix of the complement graph $\overline{G}$
Both primal and dual satisfy Slater's condition, ensuring strong duality: the two optimal values coincide and are attained
The pentagon $C_5$ has 5 vertices arranged in a cycle. By symmetry, we can construct an optimal solution to the dual SDP:
For $C_5$, the complement $\overline{C_5}$ consists of edges between vertices at distance 2 in the cycle.
The matrix $\lambda I - J - A_{\overline{C_5}}$ must be PSD. By symmetry and eigenvalue analysis:
where $\varphi = \frac{1+\sqrt{5}}{2}$ is the golden ratio
For PSD, we need all eigenvalues $\geq 0$:
A careful analysis of the pentagon's eigenvalues shows the smallest feasible value is $\lambda = \sqrt{5}$, hence $\vartheta(C_5) \leq \sqrt{5}$
The primal constructs unit vectors in $\mathbb{R}^3$ forming a symmetric "umbrella" configuration. The handle vector $c$ and rib vectors $u_1,\ldots,u_5$ satisfy:
Each rib satisfies $\langle c,u_i\rangle^2 = 1/\sqrt{5}$, giving $\vartheta(C_5) = 5 \cdot (1/\sqrt{5}) = \sqrt{5}$
Since we also know $\alpha(C_5^{\boxtimes 2}) = 5$, we have $\Theta(C_5) \geq \sqrt{\alpha(C_5^{\boxtimes 2})} = \sqrt{5}$; combined with $\Theta(C_5) \leq \vartheta(C_5) = \sqrt{5}$, this gives $\Theta(C_5) = \sqrt{5}$ (Lovász, 1979)
This is one of the few cases where Shannon capacity is exactly computable!
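The umbrella can be written out explicitly and checked (a sketch assuming the standard parametrization with handle $c = e_3$ and the five ribs on a cone, which is not spelled out above):

```python
import math

# Handle c = e3; ribs opened so that <c, u_i> = 5**(-1/4)
ct = 5 ** (-0.25)
st = math.sqrt(1 - ct * ct)

def rib(i):
    a = 2 * math.pi * i / 5
    return (st * math.cos(a), st * math.sin(a), ct)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

for i in range(5):
    assert abs(dot(rib(i), rib(i)) - 1) < 1e-12          # unit ribs
    assert abs(dot(rib(i), rib((i + 2) % 5))) < 1e-12    # non-adjacent => orthogonal

print(1 / ct ** 2)  # sqrt(5) ~ 2.23607, the value of theta(C5)
```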
Semidefinite programming has revolutionized approximation algorithms and beyond.
Questions and Discussion