The Dirichlet problem for second order parabolic operators in divergence form

We study parabolic operators $H = \partial_t - \operatorname{div}_{\lambda,x} A(x,t)\nabla_{\lambda,x}$ in the parabolic upper half space $\mathbb{R}^{n+2}_+ = \{(\lambda, x, t) : \lambda > 0\}$. We assume that the coefficients are real, bounded, measurable, uniformly elliptic, but not necessarily symmetric. We prove that the associated parabolic measure is absolutely continuous with respect to the surface measure on $\mathbb{R}^{n+1}$ in the sense defined by $A_\infty(dx\,dt)$. Our argument also gives a simplified proof of the corresponding result for elliptic measure.


Introduction and statement of main results
A classical result due to Dahlberg [8] states in the context of Lipschitz domains that harmonic measure is absolutely continuous with respect to surface measure, and that the Poisson kernel (its Radon-Nikodym derivative) satisfies a scale-invariant reverse Hölder inequality in L 2 . Equivalently, the Dirichlet problem with L 2 -data can be solved with L 2 -control of a non-tangential maximal function. Ever since Dahlberg's original work the study of elliptic measure has been a very active area of research and a number of fine results have been established, see [1,14,20] for recent accounts of the state of the art.
In contrast to the study of elliptic measure, the fine properties of parabolic measure are considerably less understood. In [13] a parabolic version of Dahlberg's result was established for the heat equation in time-independent Lipschitz cylinders. A major contribution in the study of boundary value problems and parabolic measure for the heat equation in time-dependent Lipschitz type domains was achieved in [15,21,22]. In these papers the correct notion of time-dependent Lipschitz type cylinders, correct from the perspective of parabolic measure and parabolic layer potentials, was found. In particular, in [21,22] the mutual absolute continuity of parabolic measure and surface measure and the A ∞ -property were established and in [15] the authors obtained a version of Dahlberg's result for parabolic measure associated to the heat equation in time-dependent Lipschitz-type domains. In this context the properties of parabolic measures were further analyzed in the influential work [16], parts of which have been simplified in [28].
Very recently, there have been advances in the theory of boundary value problems for second order parabolic equations (and systems) of the form
\[ Hu := \partial_t u - \operatorname{div}_{\lambda,x} A(x,t)\nabla_{\lambda,x} u = 0 \tag{1.1} \]
in the upper-half parabolic space $\mathbb{R}^{n+2}_+ := \{(\lambda, x, t) \in \mathbb{R}\times\mathbb{R}^n\times\mathbb{R} : \lambda > 0\}$, $n \ge 1$, with boundary determined by $\lambda = 0$, assuming only bounded, measurable, uniformly elliptic and complex coefficients. In [6,26,27], solvability of the Dirichlet, regularity and Neumann problems with $L^2$-data was established for the class of parabolic equations (1.1) under the additional assumptions that the elliptic part is also independent of the time variable $t$ and that it has either constant (complex) coefficients, real symmetric coefficients, or small perturbations thereof. Focusing on parabolic measure, a particular consequence of Theorem 1.3 in [6] is the generalization of [13] to equations of the form (1.1) but with $A$ real, symmetric and time-independent. This analysis was advanced further in [4], where a first order strategy to study boundary value problems of parabolic systems with second order elliptic part in the upper half-space was developed. The outcome of [4] was the possibility to address arbitrary parabolic equations (and systems) as in (1.1) with coefficients depending also on time and on the transverse variable with additional transversal regularity.
In this paper we advance the study of parabolic boundary value problems and parabolic measure even further. We consider parabolic equations as in (1.1), assuming that the coefficients are real, bounded, measurable, uniformly elliptic, but not necessarily symmetric. We prove that the associated parabolic measure is absolutely continuous with respect to the surface measure on R n+1 (dx dt) in the sense defined by the Muckenhoupt class A ∞ (dx dt). As consequences, the associated Poisson kernel exists, satisfies a scale-invariant reverse Hölder inequality in L p for some p ∈ (1, ∞), and the Dirichlet problem with L q -data, q being the index dual to p, can be solved with appropriate control of non-tangential maximal functions. In particular, our main result, which is new already in the case when A is symmetric and time-dependent, gives a parabolic analogue of the main result in [14] concerning elliptic measure. Our proof heavily relies on square function estimates and non-tangential estimates for parabolic operators with time-dependent coefficients that were only recently obtained by us in [4] as well as the reduction to a Carleson measure estimate proved in [9]. As we shall avoid the change of variables utilized in [14], this also gives a simpler and more direct proof of the A ∞ -property of elliptic measure.

Weak solutions.
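If $\Omega$ is an open subset of $\mathbb{R}^{n+1}$, we let $H^1(\Omega) = W^{1,2}(\Omega)$ be the standard Sobolev space of complex valued functions $v$ defined on $\Omega$, such that $v$ and $\nabla v$ are in $L^2(\Omega)$ and $L^2(\Omega; \mathbb{C}^{n+1})$, respectively. A subscripted 'loc' will indicate that these conditions hold locally. A function $u$ is called a weak solution to the equation $Hu = 0$ on $\mathbb{R}^{n+1}_+ \times \mathbb{R}$ if it satisfies $u \in L^2_{\mathrm{loc}}(\mathbb{R}; W^{1,2}_{\mathrm{loc}}(\mathbb{R}^{n+1}_+))$ and
\[ \int_{\mathbb{R}}\iint_{\mathbb{R}^{n+1}_+} \Big( A\nabla_{\lambda,x}u\cdot\overline{\nabla_{\lambda,x}\phi} - u\,\overline{\partial_t\phi}\Big)\,dx\,d\lambda\,dt = 0 \]
for all $\phi \in C_0^\infty(\mathbb{R}^{n+2}_+)$.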
Parabolic measure.
Given $(x, t) \in \mathbb{R}^{n+1}$ and $r > 0$ we let $Q = Q_r(x) := B(x, r) \subset \mathbb{R}^n$ be the standard Euclidean ball centered at $x$ and of radius $r$, and we let $I = I_r(t) := (t - r^2, t + r^2)$. We let $\Delta = \Delta_r(x, t) = Q_r(x) \times I_r(t)$ and write $\ell(\Delta) := r$. We use the convention that $cQ$ and $cI$ denote the dilates of balls and intervals, respectively, keeping the center fixed and dilating the radius by $c$, and we let $c\Delta := cQ \times c^2 I$.
Given $A$ real, satisfying (1.2), and $f$ continuous and compactly supported in $\mathbb{R}^{n+1}$, there exists a unique (weak) solution $u$ to the continuous Dirichlet problem for $Hu = 0$ in $\mathbb{R}^{n+2}_+$ with boundary data $f$. Indeed, assume $f \ge 0$ and let $u_k$, $k \ge 1$, be the unique weak solution to $Hu = 0$ in $\Omega_k := (0, k) \times \Delta_k(0, 0)$, with boundary values $f(x, t)\psi(\|(x, t)\|/k)$ on $\Delta_k(0, 0)$, and zero otherwise. Here, $\|(x, t)\| := |x| + |t|^{1/2}$ and $\psi$ is a continuous decreasing function on $[0, \infty)$ such that $0 \le \psi \le 1$, $\psi(r) = 1$ for $0 \le r \le 1/2$, and $\psi(r) = 0$ for $r > 3/4$. Then $0 \le u_k \le u_{k+1} \le \|f\|_\infty$ in $\Omega_k$ and one can deduce, by the maximum principle and the Harnack inequality, see [25] for these estimates, that $u$ can be constructed as the monotone and uniform limit of $\{u_k\}$ as $k \to \infty$ on the closure of $\Omega_l$ for each $l \ge 1$. Uniqueness follows from the maximum principle. Furthermore, by the maximum principle and the Riesz representation theorem we deduce
\[ u(\lambda, x, t) = \iint_{\mathbb{R}^{n+1}} f(y, s)\,d\omega(\lambda, x, t, y, s), \]
where $\{\omega(\lambda, x, t, \cdot) : (\lambda, x, t) \in \mathbb{R}^{n+2}_+\}$ is a family of regular Borel measures on $\mathbb{R}^{n+1}$, and we refer to $\omega(\lambda, x, t, \cdot)$ as $H$-parabolic measure, or simply parabolic measure (at $(\lambda, x, t)$).
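The parabolic scaling behind the conventions for $\Delta_r(x,t)$ and $c\Delta$ can be made concrete by a short computation of Lebesgue measures. The sketch below uses only the definitions of $Q_r$, $I_r$ and $c\Delta$ given above; the symbol $|B(0,1)|$ for the volume of the unit ball in $\mathbb{R}^n$ is notation introduced here for illustration.

```latex
% Measure of a parabolic cube: a ball of radius r in x times an interval of length 2 r^2 in t
\[
|\Delta_r(x,t)| = |Q_r(x)|\,|I_r(t)| = |B(0,1)|\,r^{n}\cdot 2r^{2} = 2\,|B(0,1)|\,r^{n+2}.
\]
% Dilation by c > 0 scales x by c and t by c^2
\[
|c\Delta| = |cQ|\,|c^{2}I| = c^{n}|Q|\cdot c^{2}|I| = c^{n+2}\,|\Delta| .
\]
```

In other words, with respect to parabolic cubes the boundary $\mathbb{R}^{n+1}$ behaves like a space of homogeneous dimension $n+2$.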
Given r > 0 and (x 0 , t 0 ) ∈ R n+1 we let . Assume that A satisfies (1.2). Then parabolic measure is a doubling measure in the sense that there exists a constant c, 1 ≤ c < ∞, depending only on n and the ellipticity constants such that the following is true.
, ∆ whenever ∆ ⊂ 4∆ 0 . We refer to [11], [12] and [25] for details. The doubling property of parabolic measure serves as a starting point for further investigation. In this paper we are interested in a scale-invariant quantitative version of absolute continuity of parabolic measure with respect to the measure dx dt on R n+1 . Given a set E ⊂ R n+1 we let |E| denote the Lebesgue measure of E.
If $\omega$ belongs to $A_\infty(dx\,dt)$, then $\omega(A^+_{4r_0}(x_0, t_0), \cdot)$ and $dx\,dt$ are mutually absolutely continuous and hence one can write $d\omega(A^+_{4r_0}(x_0, t_0), x, t) = K(A^+_{4r_0}(x_0, t_0), x, t)\,dx\,dt$. We refer to $K(A^+_{4r_0}(x_0, t_0), x, t)$ as the associated Poisson kernel (at $A^+_{4r_0}(x_0, t_0)$).
Definition 1.2. For $p \in (1, \infty)$ we say that $\omega$ belongs to the reverse Hölder class $B_p(dx\,dt)$ if there exists a constant $c$, $1 \le c < \infty$, such that for all $\Delta_0 := \Delta_{r_0}(x_0, t_0)$ the Poisson kernel $K(A^+_{4r_0}(x_0, t_0), \cdot)$ satisfies the reverse Hölder inequality
Note that as parabolic measure has the doubling property the statement that parabolic measure $\omega$ belongs to $A_\infty(dx\,dt)$ has several equivalent formulations. Furthermore, $A_\infty(dx\,dt) = \bigcup_{p>1} B_p(dx\,dt)$. We refer to [7] for more on $A_\infty$.
For $(x, t) \in \mathbb{R}^{n+1}$, and a function $F$ on $\mathbb{R}^{n+2}_+$, we define the non-tangential maximal function
\[ N_*(F)(x, t) := \sup_{\lambda > 0}\ \sup_{\Lambda \times Q \times I} |F|, \tag{1.3} \]
where $\Lambda = (\lambda/2, \lambda)$, $Q = B(x, \lambda)$ and $I = (t - \lambda^2, t + \lambda^2)$. Given $(x_0, t_0) \in \mathbb{R}^{n+1}$, $\eta > 0$, we also introduce the parabolic cone . We say that the Dirichlet problem for $H$ in $\mathbb{R}^{n+2}_+$ with data in $L^q(\mathbb{R}^{n+1})$, $D_q$ for short, is solvable if the following holds. Given $f \in L^q(\mathbb{R}^{n+1})$ then there exists a weak solution $u$ such that in $L^q(\mathbb{R}^{n+1})$ and n.t., Here, n.t. is short for non-tangentially and means $u(\lambda, x, t) \to f(x_0, t_0)$ for almost every $(x_0, t_0) \in \mathbb{R}^{n+1}$ as $(\lambda, x, t) \to (x_0, t_0)$ through the parabolic cone $\Gamma^\eta(x_0, t_0)$ for some $\eta > 0$. Furthermore, we say that $D_q$ is uniquely solvable if $D_q$ is solvable and if the solution is unique.
Assume that parabolic measure ω belongs to A ∞ (dx dt) and, in particular, that ω belongs to B p (dx dt) for some p ∈ (1, ∞). The latter is equivalent to the statement that D q for H is solvable, q being the dual index to p, see for example Theorem 6.2 in [25]. While the results in [25] are derived under the assumption of symmetric coefficients, the lemmas underlying the proof of Theorem 6.2 in [25] do not rely on this assumption.
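For orientation, the relation between the reverse Hölder exponent $p$ and the exponent $q$ of the boundary data can be spelled out. The display below is only the standard definition of the dual (conjugate) index together with two numerical instances; it is not taken from [25].

```latex
% Dual index: q is determined by p through 1/p + 1/q = 1
\[
\frac{1}{p} + \frac{1}{q} = 1 \quad\Longleftrightarrow\quad q = \frac{p}{p-1}, \qquad p \in (1,\infty).
\]
% Examples: p = 2 gives q = 2, and p = 3 gives q = 3/2.
\[
p = 2 \ \Rightarrow\ q = 2, \qquad p = 3 \ \Rightarrow\ q = \tfrac{3}{2}.
\]
```

A larger reverse Hölder exponent $p$ therefore corresponds to a smaller $q$, that is, to solvability for a larger class of boundary data.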

Remark 1.4.
Concerning $D_q$ being uniquely solvable, establishing a criterion for this in terms of parabolic measure is more complicated and forces one to also consider the adjoint parabolic measure. The adjoint parabolic measure $\omega^*$ is the parabolic measure associated to $H^* := -\partial_t - \operatorname{div}_{\lambda,x} A^*(x, t)\nabla_{\lambda,x}$, $A^*$ being the transpose of $A$. Definition 1.1 and Definition 1.2 for $\omega^*$ are as stated but with the point We claim that one can prove that if $\omega$ belongs to $B^{H}_p(dx\,dt)$ and $\omega^*$ belongs to $B^{H^*}_p(dx\,dt)$, then $D_q$ for $H$ is uniquely solvable, $q$ still being the dual index to $p$. The assumption that $\omega^*$ belongs to $B^{H^*}_p(dx\,dt)$ is used to conclude the uniqueness. The proof of the claim is akin to the elliptic argument in Theorem 1.7.7 in [19].

Statement of the main result.
The following theorem is our main result.
Theorem 1.5. Assume that $A$ satisfies (1.2). Then parabolic measure $\omega$ belongs to $A_\infty(dx\,dt)$ with constants depending only on $n$ and the ellipticity constants. In particular, there exists $p \in (1, \infty)$ such that $\omega$ belongs to the reverse Hölder class $B_p(dx\,dt)$ with $p$ and the constant in the reverse Hölder inequality depending only on $n$ and the ellipticity constants. Equivalently, $D_q$, where $q$ is the index dual to $p$, is solvable.
Theorem 1.5 is new and gives the parabolic counterpart of the corresponding recent result for elliptic measure obtained in [14], with a simplified argument compared to [14]. As mentioned before, Theorem 1.5 is new even in the case when $A$ is symmetric and time-dependent. Note that in [17] the result of Dahlberg was proved for elliptic measure associated to the elliptic counterpart of (1.1) with symmetric $A$, that is, in this case the associated Poisson kernel exists and belongs to $B_2$. In contrast, in the parabolic case it is not clear if such a result holds true if we allow for time-dependent coefficients (the case of time-independent coefficients was treated in [6] and does give $B_2$).
Theorem 1.5 generalizes immediately to the setting of time-independent Lipschitz domains in the following sense. Consider the domain $\{(x_0, x, t) : x_0 > \varphi(x)\}$ above the graph of the time-independent Lipschitz function $\varphi$ and consider the equation in this domain. Using the simple change of variables $(\lambda, x, t) \mapsto (\lambda + \varphi(x), x, t)$, this equation is equivalent to an equation in the upper parabolic half space to which Theorem 1.5 applies. In contrast, this argument does not apply to a time-dependent domain of the form $\{(x_0, x, t) : x_0 > \varphi(x, t)\}$ as the change of variables $(\lambda, x, t) \mapsto (\lambda + \varphi(x, t), x, t)$ with $\varphi$ Lipschitz in both $x$ and $t$ destroys the structure of the equations studied here.
If ϕ is only Lipschitz with respect to the parabolic metric, that is, Lipschitz continuous in x and 1/2-Hölder continuous in t, then more elaborate changes of variables have to be employed but this changes the nature of the assumption on the coefficients, see [16] for details.
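The effect of the flattening change of variables can be written out explicitly. The following is a sketch under the assumption that the equation in the domain above the graph is $\partial_t u - \operatorname{div}_{x_0,x} A(x,t)\nabla_{x_0,x} u = 0$ with $A$ independent of $x_0$; the names $\rho$, $v$, $J$ and $\tilde A$ are introduced here for illustration and do not appear in the text.

```latex
% Flattening map for a time-independent graph and the pulled-back function v
\[
\rho(\lambda,x) := (\lambda + \varphi(x),\, x), \qquad v(\lambda,x,t) := u(\rho(\lambda,x),t).
\]
% Jacobian of rho (rows: (x_0,x), columns: (lambda,x)); it has determinant one
\[
J(x) = \begin{bmatrix} 1 & \nabla_x\varphi(x)^{T} \\ 0 & I_n \end{bmatrix},
\qquad \det J = 1,
\qquad J(x)^{-1} = \begin{bmatrix} 1 & -\nabla_x\varphi(x)^{T} \\ 0 & I_n \end{bmatrix}.
\]
% Chain rule: grad v = J^T (grad u)(rho) and d_t v = (d_t u)(rho); changing variables in the weak formulation gives
\[
\partial_t v - \operatorname{div}_{\lambda,x} \tilde A(x,t)\nabla_{\lambda,x} v = 0,
\qquad \tilde A := J^{-1} A\, (J^{-1})^{T}.
\]
% If instead varphi = varphi(x,t), the time derivative produces an extra first order term
\[
\partial_t v = (\partial_t u)\circ\rho + \partial_t\varphi\,(\partial_{x_0}u)\circ\rho
             = (\partial_t u)\circ\rho + \partial_t\varphi\,\partial_\lambda v .
\]
```

In the time-independent case $\tilde A$ is again real, bounded, measurable, independent of $\lambda$ and uniformly elliptic, with constants depending additionally on the Lipschitz constant of $\varphi$, and it is in general not symmetric even when $A$ is. In the time-dependent case the drift term $\partial_t\varphi\,\partial_\lambda v$ takes the equation outside the class (1.1).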

Outline of the proof of Theorem 1.5.
The proof consists of three parts: a reduction to a Carleson measure estimate, the construction of a particular set $F$, and the proof of the Carleson measure estimate by partial integration. These three parts draw on four sources of insight [4,9,14,20]. In general, $c$ will denote a generic constant, not necessarily the same at each instance, which, unless otherwise stated, only depends on $n$ and the ellipticity constants. We often write $c_1 \lesssim c_2$ when we mean that $c_1/c_2$ is bounded by a constant depending only on $n$ and the ellipticity constants.
Reduction to a Carleson measure estimate. The key insight in [20] is that the A ∞ -property of elliptic measure follows once a certain Carleson measure condition is verified. More recently, this idea has also been implemented in the parabolic context: On pp.1172-1175 in [9] it is shown that in order to conclude ω ∈ A ∞ (dx dt) it suffices to prove the following result, which we state here as our second main theorem.
Remark 1.7. Theorem 1.6 is a priori equivalent to the statement that (1.5) holds for all parabolic cubes whenever u is the unique solution to the continuous Dirichlet problem for Hu = 0 with continuous compactly supported boundary data f satisfying |f | ≤ 1, see Remark 5 in [9]. Note that in this case |u| ≤ 1 by the maximum principle. This reformulation has the advantage that it allows one to assume that A is smooth as long as all bounds depend on A only through its ellipticity constants, see p. 20 in [16] for this type of reduction.
Based on Remark 1.7 we can assume qualitatively that A is smooth and we are left with the task of proving the Carleson measure estimate (1.5) if u is any weak solution to (1.1) bounded by |u| ≤ 1. The fact that u could be chosen continuous up to the boundary will not enter the argument.
As a first reduction step we claim that instead of (1.5) it suffices to prove for all parabolic cubes ∆, To see this, we truncate the integral on the left at 2ε > 0 and pick a piecewise linear function η = η(λ), equal to 1 on (2ε, ℓ(∆)) and equal to 0 on (0, ε) and (2ℓ(∆), ∞). In particular, |λ∂ λ η| ≤ 2. Integration by parts in λ on the term η|∇ λ,x u| 2 λ then leads to where the third and fourth term arise from bounding λ 2 2 ∂ λ |∇ λ,x u| 2 via Young's inequality. The standard Caccioppoli inequality (see Lemma 2.1 below) along with the uniform bound |u| ≤ 1 allows us to control the first two integrals on the right-hand side by C|∆|, where C depends on n and the ellipticity constants. The third integral is finite and can be absorbed into the left-hand side. Finally, for the fourth integral we use that ∂ λ u is a solution to Hu = 0 as well (A is independent of λ) and apply Caccioppoli's inequality on parabolic Whitney cubes covering (2ε, ℓ(∆)) × ∆. In total, we get |∂ λ u| 2 λ dx dt dλ.
Passing to the limit ε → 0, we see that having (1.6) for all parabolic cubes is sufficient for having (1.5) for all parabolic cubes. Hence, we can concentrate on (1.6).
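The integration by parts in $\lambda$ used in this reduction can be sketched as follows. The displays are a reconstruction of the standard computation under the assumptions stated in the text ($u$ real valued and smooth in the interior, $\eta = \eta(\lambda)$ piecewise linear with support in $[\varepsilon, 2\ell(\Delta)]$); the precise domains of integration in $(x,t)$ are not reproduced here.

```latex
% eta vanishes near lambda = 0 and for lambda > 2 l(Delta), so there are no boundary terms
\[
\int_0^\infty \eta\,|\nabla_{\lambda,x}u|^2\,\lambda\,d\lambda
 = -\int_0^\infty \partial_\lambda\eta\,|\nabla_{\lambda,x}u|^2\,\frac{\lambda^2}{2}\,d\lambda
   -\int_0^\infty \eta\,\partial_\lambda|\nabla_{\lambda,x}u|^2\,\frac{\lambda^2}{2}\,d\lambda .
\]
% First term: |lambda d_lambda eta| <= 2 and d_lambda eta lives on two Whitney-type layers
\[
\Big|\partial_\lambda\eta\,\frac{\lambda^2}{2}\Big|
 \le \lambda\,\mathbf{1}_{(\varepsilon,2\varepsilon)\,\cup\,(\ell(\Delta),2\ell(\Delta))}(\lambda).
\]
% Second term: product rule and Young's inequality
\[
\Big|\frac{\lambda^2}{2}\,\partial_\lambda|\nabla_{\lambda,x}u|^2\Big|
 = \lambda^2\,\big|\nabla_{\lambda,x}u\cdot\nabla_{\lambda,x}\partial_\lambda u\big|
 \le \frac12\,|\nabla_{\lambda,x}u|^2\,\lambda + \frac12\,|\nabla_{\lambda,x}\partial_\lambda u|^2\,\lambda^3 .
\]
```

The two layers in the second display give the two integrals handled by Caccioppoli's inequality and $|u| \le 1$, the term $\tfrac12|\nabla_{\lambda,x}u|^2\lambda$ is the one absorbed into the left-hand side, and the term with $\lambda^3$ is reduced to $|\partial_\lambda u|^2\lambda$ by Caccioppoli's inequality for $\partial_\lambda u$ on Whitney cubes.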
Furthermore, as our equations have real and uniformly elliptic coefficients, the solution ∂ λ u satisfies De Giorgi-Moser-Nash estimates, see for example Lemmas 3.3 and 3.4 in [16] or [2]. From a John-Nirenberg Lemma for Carleson measures, see Lemma 2.14 in [5], it follows that for (1.6) it is sufficient to prove that the following holds: For each parabolic cube ∆ ⊂ R n+1 , r := ℓ(∆), there is a Borel set Hence, we are in the setup of Lemma 2.14 in [5] with parabolic scaling. Its proof can then be readily adapted to justify the reduction in (1.7). Note that in (1.7) the set F is a degree of freedom subject to the restrictions. This completes our reduction to a Carleson measure estimate. To avoid duplication with [9] and for the sake of brevity, we will not give more details concerning these facts. Instead we will simply prove Theorem 1.6 and Theorem 1.5 by verifying (1.7) for a properly constructed set F and this is the main contribution of the paper.
Construction of the set $F$. In the context of elliptic measure the freedom of having a set $F \subset \Delta$ at one's disposal in (1.7) was cleverly brought into play in [14] via an adapted Hodge decomposition. Inspired by this, we look for a parabolic Hodge decomposition. To this end, we split the coefficient matrix $A$ as
\[ A(x, t) = \begin{bmatrix} A_{\perp\perp}(x, t) & A_{\perp\parallel}(x, t) \\ A_{\parallel\perp}(x, t) & A_{\parallel\parallel}(x, t) \end{bmatrix}. \]
Then $A_{\perp\parallel}$ is an $n$-dimensional row vector and $A_{\parallel\perp}$ is an $n$-dimensional column vector. We have a similar decomposition of $A^*$, which is the transpose of $A$ since $A$ has real coefficients.
Introduce the parabolic operator H := ∂ t − div x A ∇ x and its adjoint H * := −∂ t − div x A * ∇ x on R n+1 . Let us recall that H and H * admit the following hidden coercivity used systematically in [4,6,26,27]. In fact, it appeared already in [18]. First, we define the homogeneous energy spacė and identifying functions that differ only by a constant. Here, the half-order t-derivative D 1/2 t is defined via the Fourier symbol i|τ | 1/2 . This closure can be realized in L 2 (R n+1 ) + L ∞ (R n+1 ) and modulo constantsĖ(R n+1 ) becomes a Hilbert space, see for example Section 3.2 in [4]. The corresponding inhomogeneous energy space The hidden coercivity of the sesquilinear form on the right-hand side now pays for this operator being invertible with operator norm depending only on n and the ellipticity constants of A , see Theorem 1 in [18] or Lemma 5.9 in [6]. An analogous construction applies to H * . Considering a parabolic cube ∆ = ∆ r ⊂ R n+1 , we let χ 8∆ = χ 8∆ (x, t) be a smooth cut off for 8∆ which is 1 on 8∆, vanishes outside of 16∆ and satisfies r|∇ (1.10) and satisfying the a priori estimates (1.11) We refer to ϕ andφ as parabolic hodge decompositions of the vector fields A ⊥ χ 8∆ and A ⊥ χ 8∆ , respectively. These decompositions give representations of the vector fields A ⊥ χ 8∆ , A ⊥ χ 8∆ adapted to the operators H * , H , representations which combined with the a priori estimates in (1.11) allow us to make use of the powerful toolbox behind the solution of the parabolic Kato problem in [4]. Note that as we can undo the factorization of ∂ t leading to (1.9) if v is a test function, (1.10) holds a fortiori in the usual weak sense. More in the spirit of operator theory, Lemma 4 in [3] shows that the part of H in L 2 (R n+1 ) with maximal domain The recent resolution of the Kato problem for parabolic operators identifies the domain of its unique maximal accretive square root as D( with a homogeneous estimate see Theorem 2.6 in [4]. 
Thus, writing we can extend (µ + H ) −1 by density from E(R n+1 ) to a bounded and invertible operator onĖ(R n+1 ). Again we also have the analogous results for H * . In particular, for m a natural number and λ > 0 we can introduce the higher order resolvents of ϕ,φ, within the homogeneous energy spaceĖ(R n+1 ). In the further course we will fix m large enough (without trying to get optimal values) to have a number of estimates at our disposal. In fact, as can be seen from the proof of Lemma 4.5 below, m = n + 1 is sufficient for our purposes as this allows us to prove pointwise estimates of certain kernels needed in the proof of non-tangential maximal estimates of ∂ λ P * λ ϕ and ∂ λ P λφ . Coming back to the actual construction of F , we also introduce the parabolic maximal differential operator which maps boundedly into L 2 (R n+1 ) as we shall prove later on in Lemma 2.3. Here, · indicates again the parabolic distance. In particular, (1.11) implies (1.14) The non-tangential maximal function operator N * acting on measurable functions F on R n+2 + was introduced in (1.3). For (x, t) ∈ R n+1 we also introduce the integrated non-tangential maximal function where Λ = (λ/2, λ), Q = B(x, λ) and I = (t − λ 2 , t + λ 2 ). If g : R n+1 → R and is locally integrable we let M(g) be the (n + 1)-dimensional (parabolic) Hardy-Littlewood maximal function |g| dy ds and we let M x and M t denote the standard (euclidean) Hardy-Littlewood maximal operators in the x and t variables only. Our construction of F is then done through the following definition. Definition 1.8. Let ∆ be fixed and also fix m = n + 1. Given κ 0 ≫ 1, we let F ⊂ 16∆ be the set of all (x, t) ∈ 16∆ such that the following requirements are met: Given ∆ and κ 0 ≫ 1, let F be defined as above. 
Then, using the weak type (1, 1) of M, the strong type (2, 2) of M x M t , the estimates (1.11) and (1.14) and the L 2 -bounds for the non-tangential maximal functions that will later be obtained in Lemma 4.2 and Lemma 4.5, it follows that In particular, we can now choose κ 0 , depending only on n and the ellipticity constants, so that This completes our construction of the set F and from now on κ 0 is fixed as stated ensuring that (1.16) holds.
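The mechanism by which a large $\kappa_0$ makes the exceptional set small is Chebyshev's inequality combined with the mapping properties of the maximal operators. The display below is a generic sketch: $g$ and $h$ stand for arbitrary functions in $L^1(\mathbb{R}^{n+1})$ and $L^2(\mathbb{R}^{n+1})$, respectively, and are placeholders for the specific functions entering Definition 1.8, which are not reproduced here.

```latex
% Weak type (1,1) of the parabolic maximal function M
\[
\big|\{(x,t) : M(g)(x,t) > \kappa_0\}\big| \le \frac{c}{\kappa_0}\,\|g\|_{L^1(\mathbb{R}^{n+1})}.
\]
% Chebyshev's inequality plus the strong type (2,2) of the iterated maximal operator M_x M_t
\[
\big|\{(x,t) : M_x M_t(h)(x,t) > \kappa_0\}\big|
 \le \frac{1}{\kappa_0^{2}}\,\|M_x M_t(h)\|_{2}^{2}
 \le \frac{c}{\kappa_0^{2}}\,\|h\|_{2}^{2}.
\]
```

If $\|g\|_1$ and $\|h\|_2^2$ are bounded by a multiple of $|\Delta|$, as the a priori estimates suggest, both exceptional sets have measure at most $c(\kappa_0^{-1} + \kappa_0^{-2})|\Delta|$, which is a small fraction of $|\Delta|$ once $\kappa_0$ is large.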
Proof of the Carleson measure estimate. Based on the previous steps, the proofs of Theorem 1.5 and Theorem 1.6 are reduced to verifying (1.7). To do this we construct, given $\Delta = \Delta_r$, a Borel set $F \subset \Delta$ and $\varepsilon > 0$, a parabolic sawtooth region above $F$ using parabolic cones of aperture $0 < \eta \ll 1$. The parameter $\eta$ is an important degree of freedom in the argument. In (5.4) we will construct a (smooth) cut-off function $\Psi = \Psi_{\eta,\varepsilon}$ such that $\Psi(\lambda, x, t) = 1$ on $F \times (2\varepsilon, 2r)$ and $\Psi(\lambda, x, t) = 0$ if $\lambda \in (0, \varepsilon) \cup (4r, \infty)$, and we let Then, by ellipticity of $A$, Since $\Psi$ has compact support in the upper half space, we can ensure finiteness of $J_{\eta,\varepsilon}$ and hence everything boils down to the following key lemma:
Lemma 1.9 (Key Lemma). Let $\sigma, \eta \in (0, 1)$ be given degrees of freedom. Then there exist a finite constant $c$ depending only on $n$ and the ellipticity constants, and a finite constant $\tilde c$ depending additionally on $\sigma$ and $\eta$, such that
Indeed, choosing $\sigma$ and $\eta$ small, both depending at most on $n$ and the ellipticity constants, we first derive where now $\eta$ is fixed but $\tilde c$ is still independent of $\varepsilon$. On letting $\varepsilon \to 0$, we see from (1.17) that the estimate (1.7) holds. As discussed before, this completes the proofs of Theorem 1.6 and Theorem 1.5.
Organization of the paper.
Section 2 is partly of a preliminary nature and here we prove (1.14). Section 3 is devoted to the important square function estimates underlying the proof of Lemma 1.9. These estimates rely on recent results established in [4]. In Section 4 we prove the non-tangential maximal function estimates underlying the statements in Definition 1.8 (iv)-(v). Based on the material of Sections 2-4 the set $F$ introduced in Definition 1.8 is well-defined and we can ensure (1.16). In particular, the set $F \subset 16\Delta$ is thereby fixed as we proceed into Section 5 and Section 6. In Section 5 we then introduce sawtooth domains above $F$, define the cut-off function $\Psi = \Psi_{\eta,\varepsilon}$ referred to above, and prove some auxiliary Carleson measure estimates. The proof of Lemma 1.9 is given in Section 6.

Technical tools
In this section we collect three technical lemmas that will prove useful later on. We begin with the standard Caccioppoli estimate, which we state here without proof.
for some finite constant c depending on n and the ellipticity constants of A.
Next, we record a Poincaré-type estimate for functions in the homogeneous energy space $\dot{E}(\mathbb{R}^{n+1})$. We use the standard notation for parabolic cubes introduced in Section 1.3.
Proof. We write $\Delta_\varrho = Q_\varrho \times I_\varrho$ and we let noting that this function is contained in the homogeneous fractional Sobolev space $\dot{H}^{1/2}(\mathbb{R})$, see Section 3.1 in [4]. Then by Poincaré's inequality in the spatial variable $x$ only. Furthermore, for $f \in \dot{H}^{1/2}(\mathbb{R})$ we have at hand the non-local Poincaré inequality [4]. Rearranging the covering of the real line by translates of $I_\varrho$ into a covering by dyadic annuli, we obtain where the second step can rigorously be justified using Fubini's theorem, see Lemma 3.10 in [4].
As a consequence, we obtain an important estimate for the parabolic maximal differential operator D defined in (1.13).

Lemma 2.3. The operator
holds for almost every $(x, t), (y, s) \in \mathbb{R}^{n+1}$. Indeed, let $(x, t)$ be a Lebesgue point for $v$ and for $\varrho > 0$ let $v_\varrho$ denote the average of $v$ over the parabolic cube $\Delta_\varrho := \Delta_\varrho(x, t)$. Then, by a telescoping sum and an application of Lemma 2.2, Furthermore, let also $(y, s)$ be a Lebesgue point for $v$ and assume that $(y, s) \in \Delta_\varrho(x, t)$. Then $\Delta_\varrho(x, t) \subset \Delta_{2\varrho}(y, s)$ and we obtain as above, Now, for $(x, t) \neq (y, s)$ as above we can specify $\varrho := \|(x - y, t - s)\|$ and (2.1) follows by adding up the previous two estimates. In particular, we obtain for almost every $(x, t) \in \mathbb{R}^{n+1}$ and since all occurring maximal operators are $L^2$-bounded, we conclude $\|Dv\|_2 \lesssim \|\nabla_x v\|_2 + \|D_t^{1/2} v\|_2$ as required.

Functional calculus and square function estimates
In this section we prove the important square function estimates for H and H * underlying the proof of Lemma 1.9. Most of this material is taken from [4].
Given µ ∈ (0, π/2) we let denote the open double sector of angle µ. We let where H ∞ (S µ ) is the set of all bounded holomorphic functions on S µ . Furthermore, recall that an operator T in a Hilbert space is bisectorial of angle ω ∈ (0, π/2) if its spectrum is contained in the closure of S ω and if, for each µ ∈ (ω, π/2), the map z → z(z − T ) −1 is uniformly bounded on C \ S µ . In this case a bounded operator ψ(T ) is defined by the functional calculus for bisectorial operators and we refer to [24] or [10] for the few essentials of this theory used in this section. Turning to concrete operators, we represent vectors h ∈ C n+2 as where the normal part h ⊥ is scalar valued, the tangential part h is valued in C n and the time part h θ is again scalar valued and let Here, M is considered as a bounded multiplication operator on L 2 (R n+1 ; C n+2 ) and the parabolic Dirac operator P is an unbounded operator in L 2 (R n+1 ; C n+2 ) with maximal domain. The link with the parabolic operator H is that (P M ) 2 and (M P ) 2 are operator matrices in block form where the entries * do not play any role in the following but of course they could be computed explicitly. Note that taking adjoints in (3.1), hence using (P * M * ) 2 or (M * P * ) 2 , allows to obtain H * .
The following theorem provides square function estimates. is mentioned. But due to a general result on quadratic estimates for bisectorial operators on Hilbert spaces, see [24] or Theorem 3.4.11 in [10], this quadratic estimate is in fact equivalent to the set of quadratic estimates stated above. The statement for M P follows from the fact that this operator is similar to P M on their respective ranges by M P = M (P M )M −1 . The statements for P * M * , M * P * follow by duality, see again [10,24].
Below, we single out some particular instances of the theorem above and reformulate them in terms of H and H * to have direct references later on. Throughout, we let ϕ,φ be as in (1.10), (1.11) and we recall that the resolvent operators P * λ , P λ were defined in (1.12) for the moment with m unspecified.

Lemma 3.2.
There exists c, 1 ≤ c < ∞, depending only on n, the ellipticity constants and m ≥ 1 such that Proof. In the following we will only prove the estimates for P λφ , the estimates for P * λ ϕ being proved similarly with P * and M * replacing P and M . Note thatφ ∈Ė(R n+1 ) and hence the following calculations can be justified, for example, by approximatingφ by smooth and compactly supported functions in the semi-norm ofĖ(R n+1 ). Keeping this in mind, we may directly argue withφ. We begin with (iii). Let and note, using (3.1) and elementary manipulations of resolvents of P M and M P , that by an application of Theorem 3.1 and (1.11). This proves (iii). Likewise, (i) and (iv) follow with ψ(z) = −2mz 2 (1 + z 2 ) −m−1 and ψ(z) = −2mz 3 (1 + z 2 ) −m−1 , respectively. Finally, to prove (ii) we write analogously  Proof. We have Applying Hardy's inequality and Lemma 3.2 (i) we see that The proof of the estimate for (I − P * λ )ϕ is similar.
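The auxiliary function $\psi(z) = -2mz^2(1+z^2)^{-m-1}$ appearing in the proof of Lemma 3.2 arises from differentiating a resolvent power in $\lambda$. The following scalar computation, in which $z$ plays the role of the operator, is a sketch of this step; the decay bounds are stated for real $z$ only and are included to illustrate the behaviour at $0$ and $\infty$ that quadratic estimates typically require.

```latex
% Scalar model of lambda * d/d(lambda) applied to the m-th resolvent power
\[
\lambda\,\partial_\lambda\,(1+\lambda^2 z^2)^{-m}
 = -2m\,\lambda^2 z^2\,(1+\lambda^2 z^2)^{-m-1}
 = \psi(\lambda z),
 \qquad \psi(z) := -2m\,z^2(1+z^2)^{-m-1}.
\]
% Decay of psi at zero and at infinity (real z, z != 0)
\[
|\psi(z)| \le 2m\,\min\{\,|z|^{2},\ |z|^{-2m}\,\}.
\]
```

Replacing the factor $z^2$ by $z^3$ gives the second function $-2mz^3(1+z^2)^{-m-1}$ mentioned in the proof, which has the analogous decay $\min\{|z|^3, |z|^{-2m+1}\}$ for real $z$.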

Non-tangential maximal function estimates
The pointwise non-tangential maximal operator $N_*$ was introduced in (1.3) and its integrated version $\widetilde{N}_*$ was defined in (1.15). In this section we use the previously obtained square function estimates to derive bounds for these maximal functions.
where the implicit constants depend only on dimension and the ellipticity constants of A. The conclusion remains true also with P M replaced by P * M * .
Proof. For P M , this is Theorem 2.12 in [4]. The same statement can be proved for P * M * .
In the following P * λ ϕ, P λφ are again as defined in (1.12).

Lemma 4.2.
There exists c, 1 ≤ c < ∞, depending only on n, the ellipticity constants and m ≥ 1 such that We only give the proof of the estimate of N * (∇ x P λφ ). To start the proof we first note as in the proof of Lemma 3.2 (ii) that To this end, we first note see for example Lemma 8.10 in [4] for an explicit proof. Since ψ ∈ Ψ(S µ ) for every µ ∈ (0, π/2), we deduce from Theorem 3.1 that which in combination with (4.1) yields the claim.
For the λ-derivatives of P * λ ϕ and P λφ we could get L 2 -bounds for the integrated non-tangential maximal function immediately from the square function estimate in Lemma 3.2 (i). However, this would not be enough for our purpose. To derive the required bounds for the pointwise non-tangential maximal function, we need the following lemma.
where C, c > 0 depend only on n, the ellipticity constants and m. An analogous representation holds for (1 + λ 2 H * ) −m with adjoint kernel K * λ,m . Proof. It suffices to do it when m = 1 as iterated convolution in (x, t) of the estimate on the right hand side of (4.2) with m = 1 yields the result.
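Let $f \in C_0^\infty(\mathbb{R}^{n+1})$ and let $u = (1 + \lambda^2 H_\parallel)^{-1} f$ be given by the functional calculus of $H_\parallel = \partial_t - \operatorname{div}_x A_{\parallel\parallel}\nabla_x$. Then $u \in L^2(\mathbb{R}^{n+1})$ and, in particular, $u$ is a weak solution to $\lambda^2\partial_t u - \lambda^2 \operatorname{div}_x A_{\parallel\parallel}\nabla_x u + u = f$. On the other hand, by Aronson's result [2], the operator $H_\parallel$ has a fundamental solution, denoted by $K(x, t, y, s)$, having bounds
\[ |K(x, t, y, s)| \le C\,\mathbf{1}_{(0,\infty)}(t - s)\,(t - s)^{-n/2}\, e^{-c\frac{|x - y|^2}{t - s}} \]
with constants $C, c$ depending only on dimension and the ellipticity constants, and satisfying $\int_{\mathbb{R}^n} K(x, t, y, s)\,dy = 1$ for $x \in \mathbb{R}^n$, $t, s \in \mathbb{R}$, $t > s$. Set $K_{\lambda,1}(x, t, y, s) = \lambda^{-2} K(x, t, y, s)\,e^{-(t-s)/\lambda^2}$ and $v(x, t) = \iint_{\mathbb{R}^{n+1}} K_{\lambda,1}(x, t, y, s) f(y, s)\,dy\,ds$. Aronson's estimate implies $v \in L^2(\mathbb{R}^{n+1})$ and a calculation shows that $v$ is a weak solution to the same equation as $u$. Thus, $w := u - v$ is a weak solution of $\partial_t w - \operatorname{div}_x A_{\parallel\parallel}\nabla_x w + \lambda^{-2} w = 0$ and we may use the Caccioppoli estimate of Lemma 2.1 in $\mathbb{R}^{n+1}$. Choosing test functions $\psi$ that converge to 1 reveals $\nabla_x w = 0$ as $w \in L^2(\mathbb{R}^{n+1})$. Hence $w$ depends only on $t$. Again, as $w \in L^2(\mathbb{R}^{n+1})$, $w$ must be 0. This shows that $P_\lambda f$ has the desired representation for all $f \in C_0^\infty(\mathbb{R}^{n+1})$ and we conclude by density.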
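The "calculation" showing that the kernel $\lambda^{-2}K e^{-(t-s)/\lambda^2}$ inverts $1 + \lambda^2(\partial_t - \operatorname{div}_x A\nabla_x)$ can be sketched formally as follows. Here $H_\parallel$ denotes the operator in the $(x,t)$ variables only, $\delta_{(y,s)}$ is the Dirac mass at $(y,s)$, and $E$ is an abbreviation introduced for illustration; the computation is formal, in the sense of distributions in $(x,t)$ for fixed $(y,s)$.

```latex
% Abbreviation for the exponential damping factor; it does not depend on x
\[
E(t,s) := e^{-(t-s)/\lambda^2}, \qquad \partial_t E = -\lambda^{-2}E, \qquad E(s,s) = 1 .
\]
% Product rule, using that K(.,.,y,s) is a fundamental solution of H_parallel
\[
H_\parallel\big(K E\big) = E\,H_\parallel K + K\,\partial_t E
 = \delta_{(y,s)} - \lambda^{-2} K E .
\]
% Hence K_{lambda,1} = lambda^{-2} K E is the kernel of the resolvent
\[
(1+\lambda^2 H_\parallel)\big(\lambda^{-2} K E\big)
 = \lambda^{-2} K E + \delta_{(y,s)} - \lambda^{-2} K E = \delta_{(y,s)} .
\]
% Total mass: integrate first in y (mass one for t > s), then in s < t
\[
\iint_{\mathbb{R}^{n+1}} K_{\lambda,1}(x,t,y,s)\,dy\,ds
 = \int_{-\infty}^{t} \lambda^{-2} e^{-(t-s)/\lambda^2}\,ds = 1 .
\]
```

The last identity is consistent with the resolvent preserving constants, a property that is used later when the kernel representation is applied to averages.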

Lemma 4.5.
Fix m = n + 1 in the definitions of P * λ and P λ . There exists c, 1 ≤ c < ∞, depending only on n and the ellipticity constants such that Proof. By symmetry of definitions, we only have to prove one of the estimate and we do the one of N * (∂ λ P * λ ϕ) for a change. To start the proof, fix (µ, y, s) ∈ W (λ, B(x, λ) and I λ (t) = (t − λ 2 , t + λ 2 ) is one of the Whitney regions used in the definition of N * and recall that ∆ λ (x, t) = Q λ (x) × I λ (t). Let σ ∈ Λ λ be arbitrary for the moment. We note that within the functional calculus for H * , and we introduceP * It is convenient to expand this identity as where the operator T is given by a linear combination of the resolvent kernels K * µ,m and K * µ,m+1 provided by Lemma 4.3. Setting G 0 (x, t) := ∆ 2λ (x, t) and G j (x, t) := ∆ 2 j+1 λ (x, t) \ ∆ 2 j λ (x, t), j ≥ 1, since (µ, y, s) ∈ Λ λ × ∆ λ (x, t), we can infer pointwise estimates where C, c > 0 depend only on n, the ellipticity constants and m + k. Note that the bound for j = 0 only holds since m + k ≥ m = n + 1 ≥ n/2 + 1 guarantees that K * µ,m+k is bounded. As we have λ/2 < σ < λ, the kernel K * of the operator acting on σH * P * σ ϕ on the right-hand side of (4.4) has analogous bounds and we can eventually record with C, c > 0 depending only on n and the ellipticity constants. As (µ, y, s) ∈ W (λ, x, t) was arbitrary in this argument, we have in fact sup (µ,y,s)∈W (λ,x,t) where we have also used Cauchy-Schwarz to switch to L 2 -averages and exploited the exponential decay. Since only the right-hand side depends on σ ∈ Λ λ , we can average in σ and take the supremum in λ to find By a direct application of Tonelli's theorem, see Lemma 8.10 in [4] for an explicit proof, this implies and hence the claim follows from Lemma 3.2 (i) applied with m = 1.

Parabolic sawtooth domains associated with F
Throughout this section, let $\Delta$ and $\kappa_0 \gg 1$ be given and let $F \subset 16\Delta$ be the set introduced in Definition 1.8 with $P_\lambda = (1 + \lambda^2 H)^{-n-1}$, $P^*_\lambda = (1 + \lambda^2 H^*)^{-n-1}$ from now on. Let us recall that the non-tangential maximal operators $N_*$ and $\tilde N_*$ at $(x_0, t_0) \in \mathbb{R}^{n+1}$ are defined with reference to the Whitney regions , and that $\Gamma(x_0, t_0)$ denotes the parabolic cone with vertex $(x_0, t_0)$ and aperture one, see (1.4). In particular, we have Next, we introduce a sawtooth domain associated with $F$, and establish pointwise estimates for the differences

Proof. If $(x,t) \in F$, then by the fundamental theorem of calculus and the construction of the set $F$, see Definition 1.8 (iv), This proves (i). Similarly, consider $(\lambda, x, t) \in \Omega$. Then $(\lambda, x, t) \in \Gamma(x_0, t_0)$ for some $(x_0, t_0) \in F$ and since $\varphi$ and $\tilde\varphi$ are functions of $(x,t)$ only, we obtain together with an analogous estimate for $\partial_\lambda \tilde\theta_\lambda(x,t)$. Hence, (ii) is again a consequence of Definition 1.8 (iv). As $\varphi$ and $\tilde\varphi$ do not depend on $\lambda$, we also have showing that (iii) is a consequence of parts (i) and (v) in Definition 1.8.
Our next lemma extends the bound in part (ii) above to the whole sawtooth region.
Proof. By symmetry of the definitions it suffices to prove the bound for $\theta_\lambda$. As a preliminary observation note that if $(\lambda, x, t) \in \Omega$, then $(\lambda, x, t) \in \Gamma(x_0, t_0)$ for some $(x_0, t_0) \in F$ and in particular $(x,t) \in \Delta_\lambda(x_0, t_0)$. Since $\varphi$ is a weak solution to the equation $H^*\varphi = \operatorname{div}_x(A_\perp \chi_{8\Delta})$ on $\mathbb{R}^{n+1}$, we can then use the classical local estimates for weak solutions with real coefficients, see e.g. Theorem 6.17 in [23], to the effect that Hence, using the construction of the set $F$, see Definition 1.8 (iii), we deduce To start with the actual proof of the estimate stated in the lemma, we let $(\lambda, x, t)$ and $(x_0, t_0)$ be fixed as above and we denote by $\varphi_{2\lambda}$ the average of $\varphi$ over the set $\Delta_{2\lambda}(x_0, t_0)$. Thinking of $P^*_\lambda$ as given by the kernel representation from Lemma 4.3, see also Remark 4.4, we have $P^*_\lambda 1 = 1$ and consequently, Hence, using (5.1) and Lemma 5.1 (i), we deduce To estimate the remaining two terms on the right, we bring in the kernel of $P^*_\lambda$ explicitly. Indeed, Lemma 4.3 yields with a kernel enjoying the bound for some constants $C, c > 0$ depending only on dimension and ellipticity. So, splitting $\mathbb{R}^{n+1}$ into $\Delta_{2\lambda}(x_0, t_0)$ and annuli $\Delta_{2^{j+1}\lambda}(x_0, t_0) \setminus \Delta_{2^j\lambda}(x_0, t_0)$, $j \ge 1$, we can infer that Next, by a telescopic sum and Lemma 2.2 we deduce that and parts (i) and (ii) of Definition 1.8 guarantee that the last term is no larger than $2^{j+3}\lambda\kappa_0$. In particular, summing up in (5.3), we can conclude |P * can be done similarly, taking into account $\Delta_\lambda(x_0, t_0) \subset \Delta_{2\lambda}(x, t)$ when writing out the telescopic sum of averages. Now, the claim follows from (5.2).
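The step "$P^*_\lambda 1 = 1$ and consequently" in the proof above rests on an elementary centering identity, which can be sketched as follows. This is not a reconstruction of the displayed formula in the original, only an illustration of what the normalization provides; it assumes that $P^*_\lambda$ acts linearly and uses the notation $\theta_\lambda = \varphi - P^*_\lambda\varphi$.

```latex
% Assumptions: P^*_\lambda is linear and P^*_\lambda 1 = 1;
% \varphi_{2\lambda} is a constant (the average of \varphi over \Delta_{2\lambda}(x_0,t_0)).
\begin{align*}
\theta_\lambda
 = \varphi - P^*_\lambda \varphi
 &= (\varphi - \varphi_{2\lambda})
    - \big( P^*_\lambda \varphi - \varphi_{2\lambda}\, P^*_\lambda 1 \big) \\
 &= (\varphi - \varphi_{2\lambda})
    - P^*_\lambda (\varphi - \varphi_{2\lambda}) .
\end{align*}
```

Subtracting the constant $\varphi_{2\lambda}$ in this way is what allows oscillation estimates for $\varphi$ on $\Delta_{2\lambda}(x_0,t_0)$ and on the surrounding annuli to be used against the decay of the kernel.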

An adapted cut-off and associated Carleson measures.
Here, we bring into play the degree of freedom $0 < \eta \ll 1$ and the parameter $0 < \epsilon \ll r$ that already appeared in the outline of Section 1.5.
Writing $\Gamma_\eta(x_0, t_0)$ for the parabolic cone with vertex $(x_0, t_0)$ and aperture $\eta$, we define the thinner sawtooth domains Then $\Omega_\eta \subset \Omega$. We are now going to define a smooth cut-off adapted to $\Omega_\eta$.

Proof of the Key Lemma
We are now ready to prove the Key Lemma, hence completing the proof of Theorem 1.5. As discussed in Section 1.5, throughout the proof we can qualitatively assume that $A$ is smooth. In that case, one can see that qualitatively $\varphi$, $\tilde\varphi$, $P_\lambda\tilde\varphi$ and $P^*_\lambda\varphi$ as well as $u$ are smooth by interior parabolic regularity. Furthermore, we will simply write $J$ for $J_{\eta,\epsilon}$ and we note (and this is a consequence of the introduction of $\epsilon$) that no boundary terms will survive when we integrate by parts. Similarly, we will write $\Psi$ for $\Psi_{\eta,\epsilon}$ defined in Section 5.1. Throughout, $\sigma$ will denote a positive degree of freedom and $c$ will denote a generic constant (not necessarily the same at each instance), which depends only on the dimension $n$ and the ellipticity constants. In contrast, $\tilde c$ will denote a generic constant that may additionally depend on $\sigma$ and $\eta$. The fact that $|u| \le 1$ will be used repeatedly in the proof.
To start the estimate of $J$ we first note that $u\Psi^2\lambda$ is a test function for the weak formulation of the equation for $u$. Hence, Combining this with (6.1), we see that $J = J_1 + J_2 + J_3$, where The estimates of $J_1$ and $J_3$ turn out to be straightforward: Indeed, by the Cauchy-Schwarz inequality and hence, using the elementary Young's inequality, ellipticity of $A$ and the Carleson measure estimates in Lemma 5.3, Furthermore, Thus, by the Carleson measure estimates in Lemma 5.3 and as $|u|, |\Psi| \le 1$, As for $J_2$, we first use the decomposition (1.8) of the coefficients and split $J_2 = J_{21} + J_{22}$, where Since $A$ does not depend on $\lambda$, integration by parts yields and we write $J_{21} = J_{211} + J_{212}$, where Once again, $|J_{212}| \le \tilde c|\Delta|$ follows by Lemma 5.3. In order to handle $J_{211}$, we introduce $\varphi$ as in (1.10), that is, as the energy solution on $\mathbb{R}^{n+1}$ to the problem The weak formulation with $\phi = u^2\Psi^2(\lambda, \cdot, \cdot)/2$ as test function for $\lambda > 0$ fixed, which by construction of $\Psi$ is supported in $8\Delta$, yields Recall that we write $\theta_{\eta\lambda} = \varphi - P^*_{\eta\lambda}\varphi$. Then, splitting $\varphi = \theta_{\eta\lambda} + P^*_{\eta\lambda}\varphi$ in both integrals, we may write For the time being, let us concentrate on the second and fourth term in (6.2). Integrating by parts with respect to $\lambda$ leads us to where we have again used the $\lambda$-independence of the coefficients. We stress that throughout (and with a slight abuse of notation) $\partial_\lambda P^*_{\eta\lambda}\varphi$ denotes the derivative in $\lambda$ of the function $\lambda \mapsto P^*_{\eta\lambda}\varphi$, so that there is a factor $\eta$ showing up in front by the chain rule. A similar notational convention will apply to $\partial_\lambda\theta_{\eta\lambda}$. Taking into account the definition of the parabolic operator $H^*$, we can regroup these terms as where $\lambda\,dx\,dt\,d\lambda$, By the Cauchy-Schwarz inequality and the square function estimates stated in parts (ii) and (iii) of Lemma 3.2 we first deduce that and then, by Lemma 5.3 and Young's inequality, we can conclude $|I_2| + |I_3| \le \sigma J + \tilde c|\Delta|$.
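The notational convention for $\partial_\lambda P^*_{\eta\lambda}\varphi$ described above can be made explicit as follows. In this sketch the auxiliary function $F$ and the dummy variable $\mu$ are introduced only for illustration and do not appear in the original.

```latex
% Illustration only: F(\mu) := P^*_\mu \varphi for \mu > 0 (hypothetical helper notation).
\[
\partial_\lambda \big( P^*_{\eta\lambda}\varphi \big)
 = \frac{d}{d\lambda}\, F(\eta\lambda)
 = \eta\, F'(\eta\lambda)
 = \eta\, \big( \partial_\mu P^*_\mu \varphi \big)\Big|_{\mu = \eta\lambda},
\qquad
\partial_\lambda \theta_{\eta\lambda}
 = \eta\, \big( \partial_\mu \theta_\mu \big)\Big|_{\mu = \eta\lambda} .
\]
```

In particular, any pointwise bound of the form $|\partial_\mu P^*_\mu\varphi| \le c$ at height $\mu = \eta\lambda$ turns into $|\partial_\lambda P^*_{\eta\lambda}\varphi| \le c\eta$, which is the shape of the bound invoked later in the proof.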
To estimate $I_1$, we write $I_1 = I_{11} + I_{12}$, where $\lambda\,dx\,dt\,d\lambda$, By a familiar argument relying on Cauchy-Schwarz, Lemma 3.2 and Lemma 5.3 we deduce $|I_{11}| \le c|\Delta|$. The estimate of $I_{12}$ is more involved. Here, we first use the equation $\partial_t u = \operatorname{div}_{\lambda,x} A\nabla_{\lambda,x} u$, which thanks to our smoothness assumption may be interpreted in the classical (pointwise) sense, in order to split $I_{12} = I_{121} + I_{122}$, where Then, For the first term on the right-hand side we can infer control by $\sigma J + \tilde c|\Delta|$ using Cauchy-Schwarz, Lemma 3.2 and Young's inequality in a by now familiar manner. For the other two terms we shall use for the first time the definition of the set $F$. More precisely, by virtue of Lemma 5.1 we can replace the resolvent by its pointwise upper bound $|\partial_\lambda P^*_{\eta\lambda}\varphi(x,t)| \le c\eta \le c$, noting that $(\lambda, x, t) \in \operatorname{supp}(\Psi)$ implies $(\eta\lambda, x, t) \in \Omega$ by construction, see Section 5.1. Having done this, the second integral on the right-hand side is bounded by $\eta J$ thanks to ellipticity of $A$ and for the third one we obtain a bound $cJ^{1/2}|\Delta|^{1/2}$ by applying Cauchy-Schwarz and Lemma 5.3. Put together, we have Also, using Lemma 3.2 we immediately have $|I_{122}| \le \sigma I_{1221} + \tilde c|\Delta|$, (6.3) where This term can be estimated using a Whitney type covering argument, the fact that $\partial_\lambda u$ is a solution and Caccioppoli's inequality: Indeed, let $\mathcal{W} = \{W_i\}$ denote a partitioning of $\mathbb{R}^{n+2}_+$ into (parabolic) Whitney cubes, that is, each $W_i$ has dyadic (parabolic) sidelength $\ell(W_i)$ and is located at distance $4\ell(W_i)$ from the boundary. Let $\phi_i \in C_0^\infty(2W_i)$ be a standard cut-off for $W_i$ such that $0 \le \phi_i \le 1$, $|\nabla_{\lambda,x}\phi_i| + |\partial_t\phi_i|^{1/2} \le c/\ell(W_i)$ and $\sum_i \phi_i^2(\lambda, x, t) = 1$ for all $(\lambda, x, t) \in \mathbb{R}^{n+2}_+$. Then by an application of Lemma 2.1 and hence, taking into account the finite overlap of the Whitney cubes, Crudely employing ellipticity, the first integral on the right-hand side is under control by $cJ$.
Since $\lambda|\partial_\lambda u| \le c$ in a pointwise fashion, as follows easily from De Giorgi-Moser-Nash interior estimates, Caccioppoli's inequality and $|u| \le 1$, we can apply Lemma 5.3 to bound the second and third integral by $\tilde c|\Delta|$. So, as to (6.3), we have Put together we can conclude that the second and fourth term all the way back in (6.2) can be estimated by At this stage of the proof it only remains to focus on $J_{2111} + J_{2113}$ and we note that by definition where Using Lemma 5.2, we have $|\theta_{\eta\lambda}(x,t)| \le c\eta\lambda \le c\lambda$ for $(\lambda, x, t)$ in the support of $\Psi$ and hence we can conclude, using Lemma 5.3, that $|II_2| \le \tilde c|\Delta|$ holds. Similarly, for $II_3$ we would like to bring into play the (integrated) non-tangential control for $\nabla_x\theta_\lambda$ provided by Lemma 5.1 (iii). To this end, we use an "averaging trick" justified by Tonelli's theorem in order to write where again $\delta(x,t)$ denotes the parabolic distance from $(x,t)$ to the set $F$, we deduce from (5.5) and (5.6) that the integrand on the right-hand side vanishes outside of $\tilde E_1 \cup \tilde E_2 \cup \tilde E_3$ and that we have a bound to see that $II_1 = II_{11} + II_{12}$, where Using again the fact that $|\theta_{\eta\lambda}| \le c\eta\lambda$ holds on the support of $\Psi$ along with Cauchy-Schwarz and Lemma 5.3, we deduce the estimate $|II_{12}| \le c\eta J + \tilde c|\Delta|^{1/2}J^{1/2} \le (\sigma + c\eta)J + \tilde c|\Delta|$.
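The last inequality above is an instance of Young's inequality with a free parameter, and bounds of this shape are the ones that can eventually be absorbed. As a sketch, assuming $0 \le J < \infty$ and keeping track of the constants that the text collects into the generic $\tilde c$:

```latex
% Young's inequality with parameter: for a, b >= 0 and \sigma > 0,
%   ab <= \sigma a^2 + b^2/(4\sigma),
% because \sigma a^2 - ab + b^2/(4\sigma) = (\sqrt{\sigma}\,a - b/(2\sqrt{\sigma}))^2 >= 0.
% Apply it with a = J^{1/2} and b = \tilde c\,|\Delta|^{1/2}:
\[
\tilde c\,|\Delta|^{1/2} J^{1/2}
 \;\le\; \sigma J + \frac{\tilde c^{\,2}}{4\sigma}\,|\Delta| .
\]
% Absorption (assumption: J is finite). If
%   J <= (\sigma + c\eta)\,J + \tilde c\,|\Delta|   and   \sigma + c\eta <= 1/2,
% then subtracting (\sigma + c\eta) J from both sides gives
\[
J \;\le\; 2\,\tilde c\,|\Delta| .
\]
```

The factor $\tilde c^{\,2}/(4\sigma)$ depends on $\sigma$, which illustrates why the generic constant $\tilde c$ is allowed to depend on $\sigma$ and $\eta$ while $c$ is not.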
To estimate $II_{11}$, we capitalize again on the fact that the smoothness of our coefficients allows us to plug in the equation $\partial_t u = \operatorname{div}_{\lambda,x} A\nabla_{\lambda,x} u$ in the pointwise sense. Then, splitting $A$ according to (1.8), we can write $II_{11} = II_{111} + II_{112} + II_{113}$, where $II_{111} := -\iiint_{\mathbb{R}^{n+2}_+} A_\perp \cdot \nabla_x(\theta_{\eta\lambda}u\Psi^2)\,\partial_\lambda u\,dx\,dt\,d\lambda$, Unwinding the derivative in $\lambda$ and using once more the bound $|\theta_{\eta\lambda}| \le c\eta\lambda$ on the support of $\Psi$, Here, the first term gives a contribution $c\eta|\Delta|$, the second one can be treated by the familiar combination of Young's inequality and Lemma 5.3, whereas for the third term we make use of Lemma 3.2 (i) instead, noting that $\partial_\lambda\theta_{\eta\lambda} = -\partial_\lambda P^*_{\eta\lambda}\varphi$ holds since $\varphi$ does not depend on $\lambda$. By these means, we find $|II_{112} + II_{113}| \le (\sigma + c\eta)J + \tilde c|\Delta|$.
To estimate $II_{1111}$ we first integrate by parts in $\lambda$ and regroup derivatives to find Note that the first term on the right-hand side has the same structure as $II_3$, with the sole exception that we have a $\lambda$-derivative on $\Psi$ instead of an $x$-derivative. Hence, we can derive a bound $\tilde c|\Delta|$ by the very same methods. Also the third term on the right-hand side is of the same kind as a term we encountered earlier in the proof, namely $I_2$, which we already know how to bound by $\sigma J + \tilde c|\Delta|$. All in all, we have reached a stage of the proof where the only term that remains to be estimated is and we remark that this final term resembles $J_{211}$ except that we have an additional factor $\partial_\lambda P^*_{\eta\lambda}\varphi$ acting in our favor. We now introduce $\tilde\varphi$ as in (1.10), that is, as the energy solution to the problem $\operatorname{div}_x(A_\perp\chi_{8\Delta}) = \partial_t\tilde\varphi - \operatorname{div}_x(A\nabla_x\tilde\varphi) = H\tilde\varphi$ on $\mathbb{R}^{n+1}$. We remark that $\Psi^2u^2\partial_\lambda P^*_{\eta\lambda}\varphi$ is qualitatively smooth and compactly supported, therefore it can be used as test function for the equation above in order to rewrite $III_1$. More precisely, we also recall $\tilde\theta_{\eta\lambda} = \tilde\varphi - P_{\eta\lambda}\tilde\varphi$ and write $III_1 = III_{11} + III_{12} + III_{13}$, (6.4) where $III_{11} := -\iiint_{\mathbb{R}^{n+2}_+} \partial_t\tilde\theta_{\eta\lambda}\,(\Psi^2u^2\partial_\lambda P^*_{\eta\lambda}\varphi)\,dx\,dt\,d\lambda$, The estimate $|III_{13}| \le \tilde c|\Delta|$ is a consequence of the square function estimate in Lemma 3.2 (i) and (iii). To estimate $III_{12}$, we write $III_{12} = III_{121} + III_{122} + III_{123}$, where Once having applied the pointwise bound $|\partial_\lambda P^*_{\eta\lambda}\varphi| \le c\eta \le c$ on the support of $\Psi$, see Lemma 5.2 (iii), the estimate of $III_{122}$ reduces to that of $II_3$ with $\tilde\theta_{\eta\lambda}$ in lieu of $\theta_{\eta\lambda}$. As the latter two functions share identical estimates, we can record $|III_{122}| \le \tilde c|\Delta|$. To estimate $III_{121}$ we first note, using the Cauchy-Schwarz and Young inequalities, that Note that so far we have neglected $III_{11}$ appearing in (6.4). Now, we come back to this term and combine it with the first integral on the right-hand side above to obtain $+\,\tilde\theta_{\eta\lambda}\,\partial_t(\Psi^2u^2)\,\partial_\lambda P^*_{\eta\lambda}\varphi\,dx\,dt\,d\lambda$.
The first term on the right can be bounded by , which in itself is bounded by $\tilde c|\Delta|$ by square function estimates, see Lemma 3.2 (iv) and Lemma 3.3.
As for the second term on the right, having applied the pointwise bound $|\tilde\theta_{\eta\lambda}| \le c\eta\lambda \le c\lambda$ on the support of $\Psi$, we are left with the task of estimating $I_2$, which we have done before.