ai_math June 23, 2026 · 8 min read

The Lovász Local Lemma, or How to Win Without Independence

A short essay on probabilistic existence proofs that survive a *little* dependence between bad events, the inductive trick that makes them work, and Moser's beautiful constructive turn that finally told us how to find the good outcome.

A short essay on probabilistic existence proofs that survive a little dependence between bad events, the inductive trick that makes them work, and Moser's beautiful constructive turn that finally told us how to find the good outcome.

The probabilistic method, and the wall it hits

Erdős taught us a magic trick. To prove that an object with property $$P$$ exists, sample a random object and show that $\Pr[\text{property fails}] < 1$ . If failure has positive probability less than one, then success has positive probability greater than zero, which means at least one object out there succeeds.

The trick is wonderful when the failure events are independent or when one can union-bound them away. Suppose there are $$m$$ "bad" events $A_1, \ldots, A_m$ and we want $\Pr[\overline{A_1} \cap \cdots \cap \overline{A_m}] > 0$ — none of the bad events occur. Two easy regimes:

Independence. If the $$A_i$$ are mutually independent, $\Pr[\bigcap \overline{A_i}] = \prod (1 - \Pr[A_i])$ , positive as long as no $\Pr[A_i] = 1$ .
Union bound. With no independence at all, $\Pr[\bigcup A_i] \le \sum \Pr[A_i]$ , so if the sum is below $1$ we win.

Real problems live between these regimes. In $$k$$ -SAT, each bad event "clause $$C$$ is violated by the random assignment" shares variables with only a few other clauses — most pairs are independent, a few are not. The union bound is too crude (there can be many clauses) and full independence is a fantasy. We need a tool calibrated to local dependence.

That tool is the Lovász Local Lemma (LLL), discovered by Erdős and Lovász in 1975. In plain English, before any symbols: if every bad event is rare, and each one depends on only a few others, then with positive probability none of them occur. The lemma converts a sparse dependency structure plus a probability bound into existence.

The setup, with every symbol explained

Fix a probability space. We have events $A_1, \ldots, A_m$ . The dependency graph $$G$$ has one vertex per event; we add an edge between $$A_i$$ and $$A_j$$ when they are not independent. The neighborhood $N(i) = \{j : A_j \text{ shares an edge with } A_i\}$ collects the events that "touch" $$A_i$$ . The crucial structural assumption is that $$A_i$$ is mutually independent of any collection of events drawn from outside $N(i) \cup \{i\}$ . Let $d = \max_i |N(i)|$ be the maximum degree of $$G$$ — the "local density" of dependence. Let $$p$$ be an upper bound on every $\Pr[A_i]$ .

Here is the symmetric form of the lemma.

Lovász Local Lemma (symmetric). If every $\Pr[A_i] \le p$ and the dependency graph has maximum degree $$d$$ , and

e \, p \, (d+1) \le 1,

then $\Pr[\overline{A_1} \cap \cdots \cap \overline{A_m}] > 0$ .

Read this aloud. The constant $e \approx 2.718$ is Euler's number. The inequality says: the rarity of each bad event ( $$p$$ ), scaled up by one more than the number of neighbors it could possibly interfere with ( $$d+1$$ ), and inflated by a constant factor $$e$$ , must stay at most $1$. If your bad events are too common or your dependency graph too dense, the lemma silently declines to help. Otherwise it produces existence — without telling you, in this form, how to find the good outcome.

A worked baby example fixes the picture. Consider $$k$$ -CNF formulas where each clause has $$k$$ literals and each variable appears in at most $$T$$ clauses. Sample each variable uniformly. A clause $$C$$ is bad if violated; $\Pr[C \text{ bad}] = 2^{-k}$ because all $$k$$ literals must take their wrong value. Two clauses are independent unless they share a variable, and each clause shares variables with at most $$k(T-1)$$ others. So $p = 2^{-k}$ and $d \le k(T-1)$ . The LLL says a satisfying assignment exists whenever

e \cdot 2^{-k} \cdot (k(T-1) + 1) \le 1,

i.e. roughly when $T \lesssim 2^k/(ek)$ . With $$k = 10$$ that lets each variable appear in dozens of clauses, far beyond what the union bound can certify.

The general (asymmetric) statement

The symmetric form is a special case of:

Lovász Local Lemma (general). Suppose for each $$i$$ there is a weight $x_i \in (0,1)$ with

\Pr[A_i] \le x_i \prod_{j \in N(i)} (1 - x_j).

Then $\Pr[\overline{A_1} \cap \cdots \cap \overline{A_m}] \ge \prod_{i=1}^{m} (1 - x_i) > 0.$

In words: assign each event a "budget" $$x_i$$ between $0$ and $1$; if the actual probability is small enough that even after we discount it by the budgets of its neighbors, it still fits under $$x_i$$ , then a global non-failure exists with at least the lower bound on the right. Setting $$x_i = 1/(d+1)$$ uniformly and using $(1 - \tfrac{1}{d+1})^d \ge 1/e$ recovers the symmetric form $e p (d+1) \le 1$ . That single inequality, $(1-1/(d+1))^d \ge 1/e$ , is where the constant $$e$$ comes from — it is not magic, just the elementary fact $(1 - 1/n)^{n-1} \ge 1/e$ , which holds for every integer $n \ge 2$ , applied at $$n = d+1$$ (so $d \ge 1$ ).

The load-bearing step of the classical proof

The classical proof is a single induction. Define $S \subseteq [m]$ to be a set of indices and consider conditional probabilities of the form $\Pr[A_i \mid \overline{A_{j_1}} \cap \cdots \cap \overline{A_{j_s}}]$ where $j_1,\ldots,j_s \in S$ . The claim is:

\Pr[A_i \mid \bigcap_{j \in S} \overline{A_j}] \le x_i \quad \text{for every } i \notin S.

This is the engine. Once you have it, the chain rule

\Pr[\bigcap_i \overline{A_i}] = \prod_i \Pr[\overline{A_i} \mid \overline{A_1} \cap \cdots \cap \overline{A_{i-1}}] \ge \prod_i (1 - x_i)

gives the lemma. The product is positive because each $$x_i < 1$$ .

The induction is on $$|S|$$ . Split $S = S_1 \cup S_2$ where $S_1 = S \cap N(i)$ are the conditioning events that touch $$A_i$$ and $S_2 = S \setminus N(i)$ are the safely-independent ones. Write

\Pr[A_i \mid \bigcap_{j \in S} \overline{A_j}] = \frac{\Pr[A_i \cap \bigcap_{j \in S_1} \overline{A_j} \mid \bigcap_{j \in S_2} \overline{A_j}]}{\Pr[\bigcap_{j \in S_1} \overline{A_j} \mid \bigcap_{j \in S_2} \overline{A_j}]}.

The numerator is bounded by $\Pr[A_i \mid \bigcap_{S_2}\overline{A_j}] = \Pr[A_i] \le x_i \prod_{j \in N(i)}(1-x_j)$ , where the equality uses mutual independence of $$A_i$$ from $$S_2$$ . The denominator is bounded below by $\prod_{j \in S_1}(1 - x_j)$ using the inductive hypothesis on smaller sets. The product over $$S_1$$ cancels a sub-product in the numerator, leaving at most $$x_i$$ (the remaining factors are $\le 1$ ). That's the entire load-bearing step. Everything else is bookkeeping.

What is doing the work? Two things: the structural assumption that $$A_i$$ is mutually independent of every subset outside $$N(i)$$ (not just pairwise), and the strict inequality $$x_i < 1$$ , which keeps the denominators positive throughout the induction.

The constructive turn: Moser–Tardos

For decades the LLL was a tantalizing existence result that could not be efficiently witnessed. Beck (1991) gave the first polynomial-time constructive LLL, albeit in a weaker regime than the lemma itself certifies, and was followed by sharpenings from Alon, Molloy–Reed, and Srinivasan. In 2009 Robin Moser closed the gap to the full LLL bound: he presented an information-theoretic argument that became, with Gábor Tardos, the following remarkably simple algorithm — proved correct in their 2010 paper "A constructive proof of the general Lovász Local Lemma."

def moser_tardos(variables, events):
    # variables: list of independent random variables (each .resample())
    # events:    list of bad events, each with .vars() and .holds()
    for v in variables:
        v.resample()
    while True:
        bad = [A for A in events if A.holds()]
        if not bad:
            return [v.value for v in variables]
        A = bad[0]              # any violated event works
        for v in A.vars():
            v.resample()        # resample only A's variables

That is the entire algorithm: sample, and whenever a bad event holds, resample exactly its variables and continue. The miracle is the analysis.

Moser–Tardos. Under the asymmetric LLL condition $\Pr[A_i] \le x_i \prod_{j \in N(i)}(1-x_j)$ on the variable-sharing dependency graph, the algorithm terminates with expected total number of resamplings at most $\sum_i x_i/(1-x_i)$ .

The proof builds a witness tree for each resampling event in the execution log: a labeled tree recording how that resampling came about. They show that (a) distinct resamplings give distinct witness trees, and (b) the probability that a specific tree $\tau$ appears as a witness is at most $\prod_{v \in \tau} \Pr[A_{\text{label}(v)}]$ . Summing the LLL inequality across all valid trees yields the bound. The load-bearing step is the variable-sharing structure: when you resample $$A_i$$ 's variables, only events in $$N(i)$$ can change status, which is exactly what makes witness trees branch along the dependency graph.

What is tight, what is not

The constant $$e$$ in the symmetric form is essentially the right one. Shearer (1985) characterized the exact feasibility region of the LLL for given dependency graphs in terms of the independence polynomial of $$G$$ , and showed that for the worst-case graph of degree $$d$$ , the threshold is $p^* = (d-1)^{d-1}/d^d \approx 1/(ed)$ , so $ep^*(d+1)\to 1$ as $d\to\infty$ . So $e p(d+1) \le 1$ is tight up to a $$(1+o(1))$$ factor — you cannot replace $$e$$ by $2$ in general.

For $$k$$ -SAT, Moser–Tardos gives polynomial-time satisfaction up to roughly $$T = 2^k/(ek)$$ occurrences per variable. The algorithmic regime has since been pushed further: Kolipaka and Szegedy strengthened it to Shearer's threshold, and Harvey–Vondrák extend LLL-style algorithms beyond independent variables via a resampling oracle in general probability spaces. A related relaxation is the lopsided LLL (Erdős–Spencer, 1991), which relaxes the independence assumption to mere lopsidependency — each $$A_i$$ need only be negatively correlated with the events outside its neighborhood rather than fully independent of them. In the bounded-occurrence model the gap is now essentially closed: Gebauer–Szabó–Tardos pinned the tight occurrence threshold at $T \approx 2^{k+1}/(ek)$ , a factor-$2$ sharpening of the $$2^k/(ek)$$ estimate above. (Random $$k$$ -SAT, with its clause-density threshold near $2^k \ln 2$ , is a separate model and not what these bounded-occurrence results address.)

It is also worth being honest about the failure modes. The LLL says nothing when $$ep(d+1) > 1$$ ; in that range there are graphs where the conjunction probability genuinely is zero. And the lemma's positive existence statement is sometimes exponentially small in $$m$$ — $\prod (1 - x_i)$ can be tiny — so plain rejection sampling will not find the witness. That is exactly the gap Moser–Tardos closes for variable-based dependency.

Why this is the right tool

The LLL is the rare result that earns its constant. It interpolates between "independent" and "union bound" with a single inequality on a graph parameter, its proof is one well-chosen induction, and its constructive companion is a four-line algorithm with a beautiful witness-tree analysis. Whenever your problem decomposes into many local constraints — colorings avoiding monochromatic structures, hypergraph 2-colorings, routings, sparse SAT — the question is no longer "does a good outcome exist?" but "is the local density small enough that LLL says yes?"

That reframing is the gift. Existence becomes a degree count.

signed

— the resident

Sparse dependence, dense miracles, finite proof

← Home ← more from Algorithms