This topic first came up while attending a talk by Daniel Restrepo at the UT Austin harmonic analysis learning seminar last Friday. Daniel gave a basic overview of singular integral operators, and in the process proved and used the classical Calderon-Zygmund decomposition:

Theorem 1 (Calderón-Zygmund Decomposition)Given and there exists a collection of almost disjoint cubes such that the following hold;

- For all , we have that the average of behaves well:
- For almost every we have good bounds on :
- Finally, we can control the size of in measure by

The way I interpret this theorem is that it allows us to take an arbitrary function with an arbitrary height and understand the (average) behavior of at every scale. However, the theorem itself is not as interesting as its proof:

*Proof:* For each , let the decomposition of into almost disjoint dyadic cubes of side length :

Note that if , then there are almost disjoint cubes such that

Given , we can now describe an algorithm for picking a set of cubes . Informally, the algorithm takes cubes and evaluates whether the average of over that cube is large or small. If its big enough, the algorithm keeps it. If the average is small, it subdivides that cube into its children and asks the same question. In short, the algorithm zooms in until we see the scale of .

**Step 0:** Since , there exists a such that for every ,

We let be our initial cubes and define .

**Step k Step k+1:** Let and assume we have picked cubes such that for each ,

We get further cubes from in the following manner. Pick a cube . If one does not exist, stop the algorithm. If there does, there exists subcubes such that

We put one of these cubes in if it satisfies

Otherwise, we do not put it in . Do this process for every cube, and let be the set of cubes found in this way.

By induction, we continue this process to find a collection of cubes . It is clear that all the cubes in are almost disjoint. Moreover, if for some , then the unique *parent * which contains as a subcube was not picked to be in . Hence it satisfies

Since , and ,

Since , this shows that for each cube ,

which was the first part of the theorem.

For the second part, let be outside of all of the sets in . Then, for every , there exists a cube . Hence

By Lebesgue differentiation theorem applied to the shrinking family of cubes , we have for outside of a set of measure zero,

This is the second claim of theorem.

Finally, note that for every cube , we have

Summing over all cubes in we obtain the third claim of the theorem. This completes the proof by setting to be the countable union of all of the cubes in .

This proof can be also be done completely probabilistically.

*Proof:* To start, first pick an such that for any , we have

Fix a , and define a probability space , where is the dimensional Lebesgue measure restricted to the set . We define

Then, define the random variables for

Next define the -algebras

Then, we clearly have that . Moreover, is a martingale.

This is easy to check. First, note that if , and , we either have or . Using the fundamental characteristic of martingales, we have

Since the set of all generates , we have that the above equality implies where is the “parent” of . Hence by the linearity of conditional expectation,

Since there are “children” of the cube , we have this expression is equal to

where in the second to last step we used . Hence it is a martingale as claimed, and an easy computation shows that is uniformly bounded:

So if we define the stopping time

then the expression exists a.s. and is in by Fatou’s Lemma. One can check that

is a countable union of almost disjoint cubes such that on each cube the first part of the theorem holds. By the Lebesgue dominated convergence, one can also check the second condition of the theorem.

Finally, one can show that

Summing over all cubes and using Fatou again, we obtain the theorem by setting to be the union of all sets found.

While the detailed arguments of both of the proofs are identical, i.e. inclusion exclusion arguments and analysis of parent/child cubes, the overall theme of both proofs is different. The algorithmic method uses an inductive argument, while the probabilistic proof uses a stopping time argument.