Calderon-Zygmund Decomposition Via Martingales

This topic first came up while attending a talk by Daniel Restrepo at the UT Austin harmonic analysis learning seminar last Friday. Daniel gave a basic overview of singular integral operators, and in the process proved and used the classical Calderon-Zygmund decomposition:

Theorem 1 (Calderón-Zygmund Decomposition) Given {f \in L^1( {\mathbb R}^d)} and {\lambda >0} there exists a collection of almost disjoint cubes {C = \bigcup_k C_k \subset \mathbb{R}^d} such that the following hold;

  • For all {k}, we have that the average of {f} behaves well:

    \displaystyle \lambda \leq \frac{1}{|C_k|} \int_{C_k} |f| \leq 2^d \lambda.\ \ \ \ \ (1)

  • For almost every {x \in C\setminus {\mathbb R}^d} we have good bounds on {f}:

    \displaystyle |f(x)| \leq \lambda \ \ \ \ \ (2)

  • Finally, we can control the size of {C} in measure by

    \displaystyle | C | \leq \lambda^{-1} \Vert f \Vert_{L^1({\mathbb R}^d)} \ \ \ \ \ (3)

The way I interpret this theorem is that it allows us to take an arbitrary {L^1} function with an arbitrary height and understand the (average) behavior of {f} at every scale. However, the theorem itself is not as interesting as its proof:

Proof: For each {k \in {\mathbb Z}}, let {\mathcal{Q}_k} the decomposition of {{\mathbb R}^d} into almost disjoint dyadic cubes of side length {2^{-k}}:

\displaystyle \mathcal{Q}_k : = \Bigg\{ \Big[\frac{n}{2^{-k}} , \frac{n+1}{2^{-k} } \Big]: n \in {\mathbb Z}\Bigg\}

Note that if {Q \in \mathcal{Q}_k}, then there are {2^d} almost disjoint cubes {Q_1, \dots Q_{2^d} \in \mathcal{Q}_{k+1}} such that

\displaystyle Q_1 \cup \cdots \cup Q_{2^d} = Q.

Given {\lambda >0}, we can now describe an algorithm for picking a set of cubes {\mathcal{C} \subset \bigcup_k \mathcal{Q}_{k}}. Informally, the algorithm takes cubes and evaluates whether the average of {f} over that cube is large or small. If its big enough, the algorithm keeps it. If the average is small, it subdivides that cube into its {2^d} children and asks the same question. In short, the algorithm zooms in until we see the scale of {f}.

Step 0: Since {f \in L^1({\mathbb R}^d)}, there exists a {k_0 \in {\mathbb Z}} such that for every {Q\in \mathcal{Q}_{k_0}},

\displaystyle \int_{Q} |f| dx \leq \lambda |Q|,

We let {\mathcal{Q}_{k_0}} be our initial cubes and define {\mathcal{C}_{k_0} = \emptyset}.

Step k {\rightarrow} Step k+1: Let {k \geq k_0} and assume we have picked cubes {\mathcal{C}_{k} \subset \mathcal{Q}_{k}} such that for each {Q \in \mathcal{C}_{k}},

\displaystyle \frac{1}{|Q|} \int_Q |f|dx > \lambda.

We get further cubes from {\mathcal{Q}_{k+1}} in the following manner. Pick a cube {Q \in \mathcal{Q}_{k}\setminus \mathcal{C}_k}. If one does not exist, stop the algorithm. If there does, there exists {2^d} subcubes {Q_1, \dots, Q_{2^d} \in \mathcal{Q}_{k+1}} such that

\displaystyle Q_1 \cup \cdots \cup Q_{2^d} =Q.

We put one of these cubes {Q_i} in {\mathcal{C}_{k+1}} if it satisfies

\displaystyle \frac{1}{|Q_i|} \int_{Q_i} |f|dx > \lambda.

Otherwise, we do not put it in {\mathcal{C}_{k+1}}. Do this process for every cube, and let {\mathcal{C}_{k+1}} be the set of cubes found in this way.

By induction, we continue this process to find a collection of cubes {\mathcal{C} = \bigcup_{k \geq k_0} \mathcal{C}_k}. It is clear that all the cubes in {\mathcal{C}} are almost disjoint. Moreover, if {Q \in \mathcal{C}_k} for some {k > k_0}, then the unique parent {\tilde{Q} \in \mathcal{Q}_{k-1}} which contains {Q} as a subcube was not picked to be in {\mathcal{C}_{k-1}}. Hence it satisfies

\displaystyle \frac{1}{|\tilde{Q}|}\int_{\tilde{Q}}|f| dx \leq \lambda.

Since {|\tilde{Q}| = 2^d |Q|}, and {Q \in \mathcal{Q}_k},

\displaystyle \lambda \leq \frac{1}{|Q|}\int_{Q}|f| dx = \frac{2^d}{ |\tilde{Q}|}\int_{Q} |f|dx \leq 2^d \frac{1}{|\tilde{Q}|}\int_{\tilde{Q}} |f|dx \leq 2^d \lambda.

Since {\mathcal{C}_{k_0} = \emptyset}, this shows that for each cube {Q \in \mathcal{C}},

\displaystyle \lambda \leq \frac{1}{|Q|}\int_{Q}|f| dx\leq 2^d \lambda

which was the first part of the theorem.

For the second part, let {x} be outside of all of the sets in {\mathcal{C}}. Then, for every {k > k_0}, there exists a cube {Q_k \in \mathcal{Q}_k \setminus \mathcal{C}_k}. Hence

\displaystyle \frac{1}{|Q_k|} \int_{Q_k} |f|dx \leq \lambda.

By Lebesgue differentiation theorem applied to the shrinking family of cubes {\{Q_k\}}, we have for {x} outside of a set of measure zero,

\displaystyle |f(x)|\leq \lambda.

This is the second claim of theorem.

Finally, note that for every cube {Q \in \mathcal{C}}, we have

\displaystyle |Q| \leq \lambda^{-1} \int_{Q} |f| dx.

Summing over all cubes in {\cal C} we obtain the third claim of the theorem. This completes the proof by setting {C} to be the countable union of all of the cubes in {\mathcal{C}}. \Box

This proof can be also be done completely probabilistically.

Proof: To start, first pick an {N \gg 1} such that for any {Q \in \mathcal{Q}_{-N}}, we have

\displaystyle \frac{1}{|Q|} \int_{Q} |f| <\lambda.

Fix a {Q_0 \in \mathcal{Q}_{-N}}, and define a probability space {(\Omega, \mathop{\mathbb P}, \mathcal{F}) = ( Q_0 ,|Q_0 |^{-1} \lambda_d \lfloor Q, \mathcal{B}(Q_0))}, where {\lambda_d \lfloor Q} is the {d} dimensional Lebesgue measure restricted to the set {Q}. We define

\displaystyle \mathcal{Q}_k \cap Q_0 : = \{ Q \subset Q_0 : Q \in \mathcal{Q}_k \}.

Then, define the random variables for {k>-N}

\displaystyle X_k : = \sum_{Q\in \mathcal{Q}_{k}\cap Q_0 } 1_{Q} \frac{1}{|Q|} \int_Q f.

Next define the {\sigma}-algebras

\displaystyle \mathcal{F}_k : = \sigma ( \mathcal{Q}_k\cap Q_0 ).

Then, we clearly have that {X_k \in L^1( \Omega, \mathop{\mathbb P}, \mathcal{F}_k)}. Moreover, {(X_k, \mathcal{F}_k)_{k > -N}} is a martingale.

This is easy to check. First, note that if {Q \in \mathcal{Q}_{k} \cap Q_0}, and {Q' \in \mathcal{Q}_{k-1} \cap Q_0}, we either have {Q \subset Q'} or {Q \cap Q' = \emptyset}. Using the fundamental characteristic of martingales, we have

\displaystyle \mathop{\mathbb E}[ 1_{Q'} \mathop{\mathbb E}[1_Q | \mathcal{F}_{k-1} ]] = \mathop{\mathbb E}[ 1_{Q'} 1_{Q} ] = \begin{cases} 1 & Q \subset Q' \\ 0 & Q \cap Q' = \emptyset. \end{cases}

Since the set of all {Q'} generates {\mathcal{F}_{k-1}}, we have that the above equality implies {\mathop{\mathbb E}[1_Q | \mathcal{F}_{k-1} ] = 1_{\tilde{Q}}} where {\tilde{Q}} is the “parent” of {Q}. Hence by the linearity of conditional expectation,

\displaystyle \mathop{\mathbb E}[X_k | \mathcal{F}_{k-1}] = \sum_{Q \in \mathcal{Q}_{k} \cap Q_0 } \mathop{\mathbb E}[1_Q | \mathcal{F}_{k-1} ] \frac{1}{|Q|} \int_Q f = \sum_{Q \in \mathcal{Q}_{k} \cap Q_0 } 1_{\tilde{Q}} \frac{1}{|Q|} \int_Q f

Since there are {2^d} “children” {\tilde{Q}_1, \dots, \tilde{Q}_{2^d}} of the cube {\tilde{Q}}, we have this expression is equal to

\displaystyle \sum_{\tilde{Q} \in \mathcal{Q}_{k-1} \cap Q_0 } 1_{\tilde{Q}} \sum_{i =1}^{2^d} \frac{1}{|\tilde{Q}_i|} \int_{\tilde{Q}_i} f = \sum_{\tilde{Q} \in \mathcal{Q}_{k-1} \cap Q_0 } 1_{\tilde{Q}} \frac{1}{|\tilde{Q}|} \int_{\tilde{Q}} f = X_{k-1}

where in the second to last step we used {|\tilde{Q}| =2^d|\tilde{Q}_i|}. Hence it is a martingale as claimed, and an easy computation shows that {\{X_k\}} is {L^1} uniformly bounded:

\displaystyle \mathop{\mathbb E}[|X_k| ] \leq \frac{1}{|Q_0|} \int_{Q_0} |f| <\infty.

So if we define the stopping time

\displaystyle T: = \inf \{ k \geq 0 : X_k \geq \lambda\}

then the expression {X^T : = \lim_{ k \rightarrow\infty} X_{k \wedge T}} exists a.s. and is in {L^1} by Fatou’s Lemma. One can check that

\displaystyle \{ T < \infty \}

is a countable union of almost disjoint cubes such that on each cube the first part of the theorem holds. By the Lebesgue dominated convergence, one can also check the second condition of the theorem.

\displaystyle \omega \in \{ T = \infty\} \quad \Rightarrow \quad |f(\omega)|\leq \lambda \quad a.s.

Finally, one can show that

\displaystyle |Q_0|^{-1} | \{ T< \infty \} | = \mathop{\mathbb P}[ \{T < \infty \} ] \leq ( \lambda |Q_0| )^{-1}\Vert f \Vert_{L^1( Q_0)}.

Summing over all cubes and using Fatou again, we obtain the theorem by setting {C} to be the union of all {\{T < \infty\}} sets found. \Box

While the detailed arguments of both of the proofs are identical, i.e. inclusion exclusion arguments and analysis of parent/child cubes, the overall theme of both proofs is different. The algorithmic method uses an inductive argument, while the probabilistic proof uses a stopping time argument.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: