Maximum Agreement Subtrees and Hölder homeomorphisms between Brownian trees

We prove that the size of the largest common subtree between two uniform, independent, leaf-labelled random binary trees of size n is typically less than n^{1/2−ε} for some ε > 0. Our proof relies on the coupling between discrete random trees and the Brownian tree and on a recursive decomposition of the Brownian tree due to Aldous. Along the way, we also show that almost surely, there is no (1 − ε)-Hölder homeomorphism between two independent copies of the Brownian tree.


Introduction
Maximum agreement subtree. — Let t, t′ be two binary trees with n leaves labeled from 1 to n. The maximum agreement subtree of t and t′ is the size of the largest subset I ⊂ {1, . . ., n} such that the subtrees of t and t′ induced by the labels of I are the same (as on Figure 1, see also Section 2.1 for precise definitions). This quantity, which will be denoted by MAST(t, t′), was introduced by Gordon and Finden [15, 14] in order to measure the compatibility of the outputs of different classification methods in phylogeny. It is also a generalization of the well-studied problem of the longest increasing subsequence of a permutation, and the two problems share a lot of similarities (as noted e.g. in [5]). Since then, it has been studied from algorithmic, extremal and probabilistic points of view. In particular, the quantity MAST(t, t′) can be computed in polynomial time in n [29]. On the extremal side, the minimal possible value of MAST(t, t′) over all pairs (t, t′) of leaf-labeled binary trees of size n is known to be of order log n (the upper bound was proved in [20] and the lower bound in [23]).

Figure 1. (a) A tree t and its induced subtree t|I. (b) The tree t|I = t′|I. (c) A tree t′ and its induced subtree t′|I.
Maximum agreement subtree of random trees. — Another natural question is to understand the typical order of magnitude of the maximum agreement subtree, that is, the random variable MAST(T_n, T′_n), where T_n and T′_n are random trees of size n. The most natural model is the one where T_n and T′_n are independent and picked uniformly in the set of labeled binary trees of size n. This model was first investigated by Bryant, McKenzie and Steel [10], who proved by a first moment computation that MAST(T_n, T′_n) is O(√n) with high probability. They also provided numerical evidence that MAST(T_n, T′_n) should be of order n^β for some β close to 1/2. On the other hand, a polynomial lower bound of order n^{1/8} was obtained by Bernstein, Ho, Long, Steel, St. John and Sullivant in [7]. This lower bound was recently improved to n^{(√3−1)/2} ≈ n^{0.366} by Aldous [5] and to n^{0.4464} by Khezeli [19] in expectation. Finally, we also mention that √n has been proved to be the right order of magnitude if the trees T_n and T′_n are conditioned to have the same shape [25], and that the upper bound √n holds robustly for many random tree models arising from branching processes [27].
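To see where the √n threshold comes from, here is a back-of-the-envelope version of the first-moment computation, written under the standard facts that #B_k = (2k − 5)!! and that uniform cladograms are sampling-consistent, so that for a fixed k-subset I the induced trees T_n|_I and T′_n|_I are independent and uniform; this is a sketch in the spirit of [10], not their computation verbatim:

```latex
% Expected number of agreeing k-subsets of labels:
\mathbb{E}\bigl[\#\{I \subset \{1,\dots,n\} : \#I = k,\ T_n|_I = T'_n|_I\}\bigr]
  = \binom{n}{k}\,\frac{1}{\#B_k}
  = \binom{n}{k}\,\frac{1}{(2k-5)!!}.
% Comparing factors term by term gives (2k-5)!! \ge (k-2)!, so
\binom{n}{k}\,\frac{1}{(2k-5)!!}
  \le \frac{n^k}{k!\,(k-2)!}
  \le k^2 \Bigl(\frac{e^2 n}{k^2}\Bigr)^{k},
% which tends to 0 as soon as k \ge A\sqrt{n} with A > e.
```

By Markov's inequality, MAST(T_n, T′_n) = O(√n) with high probability, which is the first-moment upper bound recalled above.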
Our main contribution in this paper is to show that the upper bound √n is actually not optimal in the independent model, as conjectured by Aldous in [5].
More explicitly, we find that we can take ε_1 = 10^{−338} (see Section 5.1 for a discussion on explicit constants). We have not tried to optimize the constants and this value should be easy to improve, but we do not think that our strategy of proof can give a "reasonable" lower bound (like e.g. ε_1 = 10^{−6}). We also mention that our arguments are sufficient to prove that the probability that MAST(T_n, T′_n) exceeds n^{1/2−ε_1} is O(n^{−a}) for some a > 0, and that E[MAST(T_n, T′_n)] ⩽ n^{1/2−ε} for some ε > 0 (see Section 5.2 for a quick discussion).
Comparison with the Brownian tree. — As recalled before, it is proved in [25] that the MAST of two trees of the same shape is typically of order √n. Therefore, our strategy relies on the fact that two independent large random trees have "different shapes" at every scale. To formalize this, we make heavy use of the continuous scaling limit of T_n, which exhibits nice scale invariance properties, and on which more explicit computations can be performed.
More precisely, we denote by T the Brownian tree, which is the scaling limit of T_n, seen as a measured metric space, where distances have been normalized by √n and masses by n. This compact, continuous random tree with fractal dimension 2 was introduced in [2] and can be built in a natural way from a normalized Brownian excursion (see Section 2.2 for complete definitions). It also has the important property that its branching points all have degree 3. We highlight that comparisons between the discrete trees T_n and the continuous object T already play an important role in the proofs of the lower bounds of [5] and [19].
Hölder homeomorphisms of the Brownian tree. — Since proving Theorem 1 requires comparing the shapes of two independent copies of T, we obtain along the way the following result of independent interest.

Theorem 2. — Let T and T′ be two independent copies of the Brownian tree. There exists a constant ε_2 > 0 such that almost surely, there is no (1 − ε_2)-Hölder homeomorphism from T to T′.
Just like in Theorem 1, we find that we can take ε_2 = 10^{−338}, which we did not try to optimize. Although neither of Theorems 1 and 2 easily implies the other, they are closely related to each other. Indeed, as can be seen on Figure 1, a common subtree of two trees T_n and T′_n gives a "correspondence" between a part of T_n and a part of T′_n, which can be extended to a homeomorphism in the continuous limit. This is not a completely new idea, as the arguments of [5] (and the improvements done in [19]) can already be interpreted as a proof of the existence of a homeomorphism from T to T′ with a certain Hölder exponent. As we check in Theorem 22 in the appendix, the actual Hölder exponent given by [5] turns out to be 5 − 2√6 ≈ 0.1010. More generally, statements similar to Theorem 2 on very different objects appear under the name of Hölder equivalence in the geometry literature. In geometry, this problem is often of the following form: given a metric space X that is homeomorphic to R^n, what is the optimal Hölder exponent of a homeomorphism from R^n to X? An immediate upper bound is n/dim_H(X), where dim_H(X) is the Hausdorff dimension of X. We refer to [16] for improved upper bounds in specific contexts such as sub-Riemannian or contact manifolds. However, we are in a very different setting here, as the Brownian trees involved are not manifolds, and so our arguments do not share any commonalities with those works. Another difference between our setting and the one studied by Gromov is that we prove that the Hölder exponent cannot be arbitrarily close to 1 in a context where both sides of the homeomorphism have the same Hausdorff dimension.
We also note that Theorem 2 becomes quite easy if "(1 − ε)-Hölder" is replaced by "Lipschitz".(1) Finally, we remark that our results have a similar flavor to those proved in [6, Th. 1.2, Th. 1.7] for a quite different model (largest increasing subsequence of a random separable permutation). More precisely, the proofs in [6] consist of showing that a random tree cannot contain a large subtree satisfying some properties, which improves on the first moment upper bound and is achieved by comparison with continuous objects. The very recent preprint [9] improves their result (and also provides a lower bound) using some careful analysis on the Brownian tree and its associated fragmentation process.
Recursive decomposition of the Brownian tree. — In order to highlight precisely what Theorem 1 and Theorem 2 have in common, we introduce an important tool in our proofs, which already crucially appears in [5]. This is a randomized recursive decomposition of the Brownian tree T, which was introduced by Aldous [4]. The decomposition consists in picking 3 uniform random points X_1, X_2, X_3 in T, blowing up T into three pieces at the unique branching point that separates X_1, X_2, X_3, and iterating the decomposition in each of the three pieces (see Section 2.5 for complete definitions). After k steps, we obtain a (randomized) partition of T into 3^k regions, indexed by a set T_3^k. We denote those regions by (R[i])_{i ∈ T_3^k}. This decomposition enjoys very nice independence properties that we will heavily rely on.
We can now state the key result that we will use to prove both Theorem 1 and Theorem 2. It roughly states that a homeomorphism between two independent realizations T, T′ of the Brownian tree cannot be "almost measure-preserving", in the sense that it has to send "most" regions of T to regions of T′ with a much smaller mass. We will denote by |A| the measure of a subset A of a Brownian tree.
Proposition 3. — Let (T, (R[i])_{i∈T_3}) be a Brownian tree and its recursive decomposition, and let T′ be an independent copy of T. There exist constants ξ, η > 0 such that almost surely the following holds for k large enough: for any homeomorphism Ψ from T to T′, the union of the regions R[i], i ∈ T_3^k, such that |Ψ(R[i])| ⩾ e^{−ηk} has measure at most e^{−ξk}.

(1) For example, take a large branching point b_1 of T, and consider an "exceptional scale" δ at which b_1 is unusually close to another branching point b_2. Then Ψ(b_1) would have to be a large branching point of T′, and Ψ(b_2) would have to be a branching point of T′ (of scale ≈ δ) very close to Ψ(b_1), which is unlikely to exist.

Theorem 2 will follow almost immediately from this result. On the other hand, in order to deduce Theorem 1 from Proposition 3, we will rely on the nice coupling existing between the discrete random tree T_n and the continuous one T. We can then argue that a common subtree between T_n and T′_n can be extended to a homeomorphism Ψ from T to T′. Proposition 3 guarantees that for most of the regions R[i], the mass of Ψ(R[i]) is very small, so only few labels can appear both in R[i] and in Ψ(R[i]) when T_n is coupled to T and T′_n to T′. We will conclude by using the classic square root upper bound for each of those regions (Lemma 19).
Ideas of the proof of Proposition 3. — Finally, let us mention some of the ideas behind the proof of Proposition 3. The proof roughly consists in showing that a certain multiscale exploration of the tree T has many "mismatches" with the analogous exploration in T′, which we believe is the main innovation of the present work. Fix a typical point x ∈ T, and imagine that we try to build a "good" homeomorphism from T to T′. By looking at smaller and smaller regions of the recursive decomposition around the point x, we can encode the masses of a sequence of nested neighborhoods of x by a sequence (f_j(x))_{j⩾0} of i.i.d. numbers, where j represents decreasing scales. We will argue that it is not possible to find x′ ∈ T′ such that f_j(x′) is very close to f_j(x) for most of the scales j. This will imply that the ratio between the mass of a small region around x and the mass of its image around Ψ(x) cannot stay "stationary" as the scale of that region decreases to 0. By a "martingale-like" argument, we will conclude that this ratio must decay quickly, which will yield Proposition 3. This is somewhat reminiscent of some ideas of [6], in the sense that "finding a large substructure is difficult because some positive proportion of the mass is lost at every scale".
Structure of the paper. — In Section 2, we will introduce all the definitions of discrete and continuous objects that will be needed in the proofs, as well as some basic properties of the Brownian tree and of its recursive decomposition. Section 3 is the central part of the paper and is devoted to the proof of Proposition 3, which represents most of the work. In Section 4, we conclude the proofs of the main theorems. In Section 5, we discuss the quantitative values of ε_1 and ε_2 provided by our proof, as well as some open questions. In the appendix, we construct a Hölder homeomorphism between T and T′.

2.1. Discrete trees. — A finite binary tree t is a finite tree in which every vertex has degree 1 or 3. The vertices of degree 3 will be called the nodes of t, and the vertices of degree 1 will be called its leaves. A labeled binary tree of size n ⩾ 2 is a finite binary tree with exactly n leaves (and therefore n − 2 nodes), some of which are labeled by integers so that each label appears at most once. By convention, we also say that a single vertex with no edge is a binary tree of size 1, and the empty tree is a binary tree of size 0.
For n ⩾ 1, we denote by B_n the set of such trees where all n leaves are labeled and the labels are 1, . . ., n, up to isomorphism (these are sometimes called cladograms in the literature). We highlight that the trees that we consider are not rooted, and they are not plane trees (i.e., there is no clockwise ordering of the neighbors of a fixed vertex).
We recall that for all n ⩾ 3, we have #B_n = (2n − 5)!! = 1 × 3 × ⋯ × (2n − 5). In all the paper, we will denote by T_n a uniform random variable on B_n. We will also denote by T′_n an independent copy of T_n. If t ∈ B_n and I is a subset of {1, . . ., n}, we will denote by t|I the subtree of t induced by I. More precisely, this is the labeled binary tree obtained from t by keeping only the paths joining the labels of I together, and by contracting the vertices of degree 2 that may appear in the process (see Figure 1). Note that t|I does not belong to B_{#I}, unless I = {1, . . ., #I}. If t, t′ ∈ B_n, we write

MAST(t, t′) = max{#I : I ⊂ {1, . . ., n} such that t|I = t′|I},

where by t|I = t′|I we mean that t|I and t′|I are isomorphic as leaf-labeled trees.
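To make the definition concrete, here is a brute-force sketch (function names are ours, not from the paper). It encodes an unrooted leaf-labeled binary tree by the nontrivial bipartitions ("splits") of its leaf set induced by removing an edge; two induced subtrees t|I and t′|I coincide exactly when the restrictions of the two split systems to I agree, so MAST can be found by scanning label subsets by decreasing size. This is exponential in n and only meant for tiny examples; [29] gives a polynomial-time algorithm.

```python
from itertools import combinations

def splits(edges, leaves):
    """Nontrivial leaf bipartitions ("splits") of an unrooted tree.

    edges: list of (u, v) pairs; leaves: set of leaf labels.
    Removing an edge cuts the leaf set in two; keep the bipartition
    when both sides contain at least two leaves.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    out = set()
    for u, v in edges:
        stack, seen, side = [u], {u, v}, set()
        while stack:  # collect the leaves on the u-side of the edge (u, v)
            w = stack.pop()
            if w in leaves:
                side.add(w)
            for x in adj[w] - seen:
                seen.add(x)
                stack.append(x)
        other = leaves - side
        if len(side) >= 2 and len(other) >= 2:
            out.add(frozenset({frozenset(side), frozenset(other)}))
    return out

def restrict(split_set, labels):
    """Restrict every split to a label subset, keeping nontrivial ones."""
    out = set()
    for sp in split_set:
        a, b = tuple(sp)
        a, b = a & labels, b & labels
        if len(a) >= 2 and len(b) >= 2:
            out.add(frozenset({a, b}))
    return out

def mast(edges1, edges2, n):
    """Brute-force MAST of two trees with leaves labeled 1..n."""
    leaves = frozenset(range(1, n + 1))
    s1, s2 = splits(edges1, leaves), splits(edges2, leaves)
    for k in range(n, 0, -1):
        for labels in combinations(leaves, k):
            labels = frozenset(labels)
            if restrict(s1, labels) == restrict(s2, labels):
                return k
    return 0
```

For instance, the two 4-leaf trees pairing {1, 2} against {3, 4} and {1, 3} against {2, 4} differ, but agree on any 3 labels, so their MAST is 3, consistently with the fact that any two trees on at most 3 leaves are isomorphic.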
If v_1, v_2 are nodes of a finite binary tree t, a region r of t delimited by v_1, v_2 is a connected component of the forest obtained by blowing up the nodes v_1 and v_2 into three leaves each. In particular, it is a labeled binary tree, where the leaves coming from v_1 or v_2 are unlabeled. We write #r for the number of original leaves of t that belong to the region r and call this quantity the size of r. If t, t′ are two labeled binary trees and if r ⊂ t and r′ ⊂ t′ are two such regions, we may consider the quantity MAST(r, r′), which is the size of the largest subset I of {1, . . ., n} such that all the elements of I appear both in r and in r′, and r|I = r′|I.
2.2. The Brownian tree. — We start with the construction of the Brownian tree T introduced in [2, 3]. Let e = (e_t)_{0⩽t⩽1} be a normalized Brownian excursion (that is, a Brownian motion conditioned to stay positive in (0, 1) and hit 0 at time 1). For s, t ∈ [0, 1], we write s ∼_e t if e_s = e_t = min_{[s∧t, s∨t]} e, where s ∧ t and s ∨ t stand respectively for min(s, t) and max(s, t). We also write

d_e(s, t) = e_s + e_t − 2 min_{[s∧t, s∨t]} e.

Then d_e is a pseudo-distance on [0, 1], i.e., it is symmetric and satisfies the triangle inequality. Moreover, we have d_e(s, t) = 0 if and only if s ∼_e t. Then the quotient T = [0, 1]/∼_e equipped with d_e is a random compact metric space, which we call the Brownian tree. Moreover, T carries a natural probability measure, which is the pushforward of the Lebesgue measure on [0, 1] under the canonical map from [0, 1] to T. We will denote by | · | this probability measure on T, which has full support and no atom. We recall that the metric space T is almost surely a real tree, i.e., for all x, y in T, there is a unique injective, continuous path from x to y, and this path is a geodesic. For m > 0, we say that a random measured metric space (E, d, µ) is a Brownian tree of mass m if (E, m^{−1/2} d, m^{−1} µ) has the law of T. Note in particular that T is a Brownian tree with mass 1.
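The formulas above are easy to experiment with on a discrete substitute for the excursion. The sketch below (illustrative only; helper names are ours) builds a nonnegative ±1 walk excursion by the cycle-lemma rotation of a shuffled bridge, and implements d_e; one can then check numerically that d_e is symmetric, vanishes on the diagonal and satisfies the triangle inequality, as stated above.

```python
import random

def excursion(m, rng):
    """Length-2m nonnegative walk excursion: shuffle a +-1 bridge and
    rotate it at its running minimum (cycle lemma / discrete Vervaat)."""
    steps = [1] * m + [-1] * m
    rng.shuffle(steps)
    path = [0]
    for s in steps:
        path.append(path[-1] + s)
    i = path.index(min(path))       # rotate the steps at the first argmin
    rotated = steps[i:] + steps[:i]
    e = [0]
    for s in rotated:
        e.append(e[-1] + s)
    return e                         # e[0] = e[2m] = 0 and e[j] >= 0

def d_e(e, s, t):
    """Tree pseudo-distance d_e(s, t) = e_s + e_t - 2 min over [s^t, svt] of e."""
    lo, hi = min(s, t), max(s, t)
    return e[s] + e[t] - 2 * min(e[lo:hi + 1])
```

Note that the triangle inequality for d_e holds deterministically for any excursion-type function, not only in distribution, so the check below never fails.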
Let k ⩾ 1 and let (E, d, µ) be a Brownian tree with mass m. Conditionally on (E, d, µ), let X_1, . . ., X_k be k i.i.d. points of E, sampled according to (a normalized version of) µ. Note that almost surely, the points X_i are all leaves, i.e., E ∖ {X_i} is connected. We call (E, d, µ, X_1, . . ., X_k) a k-pointed Brownian tree with mass m.
In this work, decompositions of T into several regions will play a crucial role. For this purpose, we will call region of T the closure of a connected component of T ∖ F, where F is a subset of T of cardinality 0, 1 or 2.
We recall that T is almost surely a binary real tree, i.e., for all x ∈ T, the space T ∖ {x} has at most three connected components. A point b ∈ T such that T ∖ {b} has exactly three connected components will be called a branching point of T. Moreover, we define the size of a branching point b as the measure of the smallest of the three connected components of T ∖ {b}, and denote this quantity by |b|_T. Similarly, if R is a region of T and b ∈ R is such that R ∖ {b} has three connected components, the relative size of b in R is the measure of the smallest such connected component, and is denoted by |b|_R.

2.3. The Dirichlet distribution. — For a_1, . . ., a_d > 0, we write Dir(a_1, . . ., a_d) for the Dirichlet distribution with parameters (a_1, . . ., a_d). The Dirichlet distribution enjoys two properties that we use throughout the paper. The first one can be derived from the so-called beta-gamma algebra results developed in [13]; for the second one we refer to [1, Lem. 17]. Suppose (W_1, . . ., W_d) ∼ Dir(a_1, . . ., a_d). Then:
– For any i ∈ {1, . . ., d − 1}, we have

(2.1)  W_1 + ⋯ + W_i ∼ Beta(a_1 + ⋯ + a_i, a_{i+1} + ⋯ + a_d),  (W_1, . . ., W_i)/(W_1 + ⋯ + W_i) ∼ Dir(a_1, . . ., a_i),  (W_{i+1}, . . ., W_d)/(W_{i+1} + ⋯ + W_d) ∼ Dir(a_{i+1}, . . ., a_d),

and the three random variables appearing in the last display are independent.
– Let I be such that P(I = i | (W_1, . . ., W_d)) = W_i. Then for any i ∈ {1, . . ., d} we have

(2.2)  P(I = i) = a_i / (a_1 + ⋯ + a_d),

(2.3)  (W_1, . . ., W_d) conditionally on {I = i} ∼ Dir(a_1, . . ., a_i + 1, . . ., a_d).

Figure 2. A realization of the coupling between T and T_5. The combinatorial structure of the paths joining the distinguished points, shown in blue, is that of the discrete tree T_5 we started with.
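Both properties above are easy to sanity-check by simulation. The sketch below (illustrative; helper names are ours) samples Dir(1/2, 1/2, 1/2) through the standard gamma construction, then tests the aggregation property (W_1 + W_2 should follow Beta(1, 1/2), of mean 2/3, and W_1/(W_1 + W_2) should follow Beta(1/2, 1/2), of mean 1/2) and the size-biased property (the index I should be uniform on {1, 2, 3}, and W_I should follow Beta(3/2, 1), of mean 3/5).

```python
import random

def dirichlet(alphas, rng):
    """Sample a Dirichlet vector via the standard gamma construction."""
    g = [rng.gammavariate(a, 1.0) for a in alphas]
    s = sum(g)
    return [x / s for x in g]

def mean(xs):
    return sum(xs) / len(xs)

rng = random.Random(0)
N = 30000
agg = []     # W1 + W2: should be Beta(1, 1/2), mean 2/3
ratio = []   # W1 / (W1 + W2): should be Beta(1/2, 1/2), mean 1/2
w_I = []     # size-biased coordinate: should be Beta(3/2, 1), mean 3/5
counts = [0, 0, 0]
for _ in range(N):
    w = dirichlet([0.5, 0.5, 0.5], rng)
    agg.append(w[0] + w[1])
    ratio.append(w[0] / (w[0] + w[1]))
    # draw I with P(I = i | W) = W_i
    u, i, acc = rng.random(), 0, w[0]
    while i < 2 and u > acc:
        i += 1
        acc += w[i]
    counts[i] += 1
    w_I.append(w[i])
```

With 30000 samples the empirical means match the predictions above to well within the Monte Carlo error.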

2.4. Coupling between discrete and continuum trees. — We recall the classical coupling between the discrete tree T_n and the continuous tree T. Let n ⩾ 3 and let (e_i)_{1⩽i⩽2n−3} be an enumeration of the edges of T_n such that for 1 ⩽ i ⩽ n, the edge e_i is the unique edge incident to the leaf labeled i.
Let (W_i)_{1⩽i⩽2n−3} be a random vector with distribution Dir(1/2, . . ., 1/2), independent from T_n. Conditionally on T_n and (W_i), let (T_i, X^1_i, X^2_i)_{1⩽i⩽2n−3} be independent bi-pointed Brownian trees with respective masses W_i. For all i, let v^1_i and v^2_i be the two endpoints of the edge e_i of T_n, with the convention that if one of the endpoints is a leaf, then it is v^1_i. For 1 ⩽ i, i′ ⩽ 2n − 3 and j, j′ ∈ {1, 2}, we write X^j_i ∼ X^{j′}_{i′} if v^j_i = v^{j′}_{i′}, and we let T be the quotient of the disjoint union of the trees T_i by the relation ∼. As a metric space, this quotient is understood as the "metric gluing" of the T_i's in the sense of [11]. The measure on T is straightforwardly obtained from those of the T_i, and its total mass is Σ_{1⩽i⩽2n−3} W_i = 1. The following result can be found for n = 3 in [4]. Even though the corresponding result for n > 3 is folklore, we were not able to find a statement in the literature that exactly matches the one that we use here. However, it can be seen as a consequence of the discrete result proved in [26, Ex. 7.4.12 and Ex. 7.4.13] (the proof relies on the Rémy algorithm [28] to build uniform binary trees, and the Dirichlet distribution comes from a Pólya urn argument).
See Figure 2 for an illustration of this construction. In particular, the combinatorial structure of the paths joining n uniform points X_1, . . ., X_n of T is that of T_n. In the rest of the paper, we will always consider that the continuous tree T and the discrete tree T_n are coupled in this way.
We conclude this section with a lemma comparing the mass of a region in T and some corresponding region in T n .
Lemma 5. — Let ε > 0 and let (T, X_1, . . ., X_n) be an n-pointed Brownian tree. Then, with probability 1 − o_n(1), for any region R of T delimited by at most two branching points, denoting by R̂ the smallest region of T_n that contains all the leaves with label j ∈ {1, . . ., n} such that X_j ∈ R, we have the following bound.

Proof. — Let R be a region of T delimited by at most two branching points. This way the topological frontier ∂R of the region R contains at most two elements. With the notation just above, for any i ∈ {1, . . ., 2n − 3}, we denote by T̊_i the interior of the region T_i, seen as a subset of T. We introduce

I = {i ∈ {1, . . ., 2n − 3} : T̊_i ∩ R ≠ ∅}.

We note that the set of edges {e_i, i ∈ I} defines a region R̃ of T_n delimited by at most two nodes; we denote N := #R̃. By definition of I, the region R̃ contains all the leaves with labels j ∈ {1, . . ., n} that are such that X_j ∈ R, so R̂ ⊂ R̃ and so #R̂ ⩽ #R̃ = N. If N < n^ε we have nothing to prove, so let us assume N ⩾ n^ε. Remark that since R̃ is a region of T_n, it has at most two unlabeled leaves, so its number of edges satisfies #I ∈ {2N − 3, 2N − 1, 2N + 1}, depending on whether it is delimited by 0, 1 or 2 nodes. Now, for any i ∈ I, if T̊_i is not entirely contained in R, then T̊_i must intersect the frontier ∂R. Since ∂R has cardinality at most 2, this can happen for at most two such i ∈ I, say i_1 and i_2 (pick them arbitrarily if only 0 or 1 such value exists). This entails

R ⊃ ⋃_{i ∈ I∖{i_1, i_2}} T̊_i.

Now, let (W_i)_{1⩽i⩽2n−3} be as in the coupling of Theorem 4 (that is, W_i is the mass of the set of those points x ∈ T such that e_i is the closest edge of T_n from x). From the above reasoning, we have

|R| ⩾ Σ_{i ∈ I∖{i_1, i_2}} W_i.

We now condition on the discrete tree T_n. We recall that (W_i)_{1⩽i⩽2n−3} is independent of T_n and has Dirichlet distribution, so according to (2.1), the sum Σ_{i ∈ I∖{i_1, i_2}} W_i follows a Beta((#I − 2)/2, (2n − 1 − #I)/2) distribution conditionally on T_n. Let us now fix the region R̃ of T_n. Using the explicit density of the Beta distribution, we can bound the probability that this sum is abnormally small by a quantity which decays faster than any polynomial in n. In this computation we have used the fact that Γ(x)/Γ(x − k) ⩽ x^k for any x > 0 and any integer k such that x − k > 0, and in the end our assumption that N ⩾ n^ε. Still conditionally on T_n, we can perform a union bound over all the O(n^4) possibilities for choosing the region R̃ and the two labels i_1, i_2 corresponding to the removed edges. Combining this with (2.4), with high probability, for any region R and corresponding I, i_1, i_2, N, we have the desired bound, which is what we wanted to prove.

2.5. Aldous' recursive decomposition. — Theorem 4, applied with n = 3, implies that the vector of masses of the three regions obtained by blowing up T at the branching point separating X_1, X_2, X_3 follows the Dir(1/2, 1/2, 1/2) distribution, and that conditionally on this vector, the three regions are independent bi-pointed Brownian trees with prescribed masses. We will then pick a third point uniformly at random in each of these regions and apply this decomposition recursively.
The complete ternary tree. — More precisely, let T_3 = ⋃_{k⩾0} {1, 2, 3}^k be the set of finite words on the alphabet {1, 2, 3}. We will generally use the notation i = i_1 i_2 . . . i_k to denote an element of T_3. For such a word we call k the depth of i, and denote by T_3^k the set of elements of T_3 of depth k. The set T_3 can be seen as the complete ternary tree, with root ∅, and where the parent of i = i_1 . . . i_k is i_1 . . . i_{k−1}. For any k ⩾ 0 and any word i ∈ T_3 with depth at least k, we will write i_k = i_1 . . . i_k. Finally, we will use concatenation of words of T_3: if i = i_1 . . . i_k and j = j_1 . . . j_ℓ, we will write ij for i_1 . . . i_k j_1 . . . j_ℓ.
The recursive decomposition of the Brownian tree. — Our decomposition will associate to each word i of T_3 a region R[i] of T, in such a way that for all i ∈ T_3, the regions R[i1], R[i2], R[i3] are the three regions obtained by decomposing R[i].

Figure 3. (a) A realization of the Brownian tree T. (b) Its recursive decomposition up to depth 2.
Figure 4. (a) A region delimited by one point. (b) Its decomposition. (c) A region delimited by two points. (d) Its decomposition.

It can easily be checked by induction on the depth that this construction makes sense and that for all i ∈ T_3, the tree R[i] is a bi-pointed Brownian tree with a randomized mass. Indeed, this is a consequence of the following statement.

Proposition 6. — Let (T, (R[i])_{i∈T_3}) be a decomposed Brownian tree. Then:
(i) the vectors

(2.5)  (|R[i1]|, |R[i2]|, |R[i3]|) / |R[i]|,  for i ∈ T_3,

are i.i.d. with law Dir(1/2, 1/2, 1/2);
(ii) conditionally on their masses, the regions R[i], for i of a given depth, are i.i.d. bi-pointed Brownian trees.
Proof. — Let F_k be the σ-algebra generated by the masses |R[i]| for i of depth at most k. It is sufficient to show that for all k ⩾ 0, conditionally on F_k, the vectors (2.5) for i of depth k are i.i.d. with law Dir(1/2, 1/2, 1/2), and that conditionally on F_k and on those vectors, the regions R[i] for i of depth k + 1 are independent bi-pointed Brownian trees with prescribed masses. We prove this statement by induction on k.
For k = 0, this is just Theorem 4 for n = 3. If the statement is true for some k ⩾ 0, the induction hypothesis guarantees that conditionally on F_k, the R[i] for i of depth k + 1 are independent bi-pointed Brownian trees with randomized masses. We then apply Theorem 4 for n = 3 to each of these Brownian trees, still conditionally on F_k. □

Zooming in on a random point. — Finally, let x ∈ T and k ⩾ 0. If x is not one of the points x[i], which is the case for almost every x ∈ T, we will denote by i_k(x) the unique word of depth k such that x ∈ R[i_k(x)], and by i_k(x) the k-th letter of this word. It will be useful for us to study the recursive decomposition around a random point X picked uniformly in T. In this setting, we will need the following result.
Lemma 7. — Let (T, (R[i])_{i∈T_3}) be a decomposed Brownian tree with a point X sampled uniformly on T, independently of the decomposition. Then the vectors

(2.6)  ((|R[i_j(X)1]|, |R[i_j(X)2]|, |R[i_j(X)3]|) / |R[i_j(X)]|, i_{j+1}(X)),  for j ⩾ 0,

are i.i.d. Moreover, they have the distribution of (W_1, W_2, W_3, I), where (W_1, W_2, W_3) ∼ Dir(1/2, 1/2, 1/2) and P(I = i | (W_1, W_2, W_3)) = W_i. In particular, the index I is uniform on {1, 2, 3} and the variable W_I follows the Beta(3/2, 1) distribution.
Proof. — The argument is basically the same as the proof of Proposition 6, where F_k is replaced by the σ-algebra G_k generated by the vectors (2.6) for 0 ⩽ j ⩽ k − 1. The only additional element is that since X is picked independently of the decomposition, conditionally on the Brownian trees R[1], R[2], R[3], it has probability |R[a]| to be in R[a] for a ∈ {1, 2, 3}. In particular, conditionally on {X ∈ R[a]}, the region R[a] is still a (bi-pointed) Brownian tree with prescribed mass. The heredity step is adapted in the same way. Finally, the fact that W_I has a Beta(3/2, 1) distribution is a consequence of (2.1), (2.2) and (2.3). □

Remark 8. — Often in the paper, it will be useful to re-interpret a quantity defined as the mass of a certain subset of T, typically a union of regions of the decomposition, as the probability that a random point X, taken uniformly and independently of the decomposition, falls into it.

Estimates for the size of the regions. — We now present a very rough result controlling the sizes of the regions appearing in the Aldous recursive decomposition of T.

Lemma 9. — There are constants C > c > 0 such that almost surely, for k large enough, for all i ∈ T_3^k, we have e^{−Ck} ⩽ |R[i]| ⩽ e^{−ck}.

Proof. — The proof uses classical branching random walk arguments. We first notice that, by the independence properties of our recursive decomposition, the process (log |R[i]|)_{i∈T_3} is a branching random walk. That is, the vectors (log |R[ia]| − log |R[i]|)_{a∈{1,2,3}} for i ∈ T_3 are i.i.d. and have the law of (log W_1, log W_2, log W_3), where (W_1, W_2, W_3) ∼ Dir(1/2, 1/2, 1/2). Therefore, for any C > 0, using the Chernoff bound we can write, for any λ > 0, for any i ∈ T_3^k,

P(|R[i]| ⩽ e^{−Ck}) ⩽ e^{−λCk} E[e^{−λ log W_1}]^k.

Since Dirichlet distributions have polynomial tails, we can find λ such that the expectation is finite. We can then find C such that the right-hand side in the last equation is bounded by 4^{−k}. We conclude the proof by a union bound over i ∈ T_3^k and then using the Borel-Cantelli lemma over k.
Similarly, for c > 0 and λ > 0, we have

P(|R[i]| ⩾ e^{−ck}) ⩽ e^{λck} E[e^{λ log W_1}]^k.

Since W_1 < 1 almost surely, there exists a number λ > 0 such that E[e^{λ log W_1}] < 1/5. Once the value of λ is fixed, we can choose c > 0 sufficiently small so that the last right-hand side is bounded by 4^{−k}. This proves the other direction, again by union bound and Borel-Cantelli. □
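The branching random walk describing the masses is straightforward to simulate. The sketch below (illustrative; not part of the proof) splits mass 1 by independent Dir(1/2, 1/2, 1/2) vectors down to depth 8 and checks the structural facts that hold deterministically: each level of 3^k masses sums to 1, and the maximal mass is nonincreasing in the depth, in line with the exponential decay stated in Lemma 9.

```python
import random

def dir3(rng):
    """One Dir(1/2, 1/2, 1/2) sample via the gamma construction."""
    g = [rng.gammavariate(0.5, 1.0) for _ in range(3)]
    s = sum(g)
    return [x / s for x in g]

def depth_masses(k, rng):
    """Masses (|R[i]|) for i in T_3^j, j = 0..k, generated level by level."""
    level = [1.0]
    levels = [level]
    for _ in range(k):
        nxt = []
        for m in level:
            w = dir3(rng)
            nxt.extend(m * x for x in w)  # split each region into three
        level = nxt
        levels.append(level)
    return levels

rng = random.Random(3)
levels = depth_masses(8, rng)
```

Printing max(levels[j]) against j already makes the exponential decay of the largest region visible at these small depths.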

Proof of Proposition 3
This entire section is devoted to proving Proposition 3, which is central to the proof of the main results in the next section.
Rough idea of the proof. — We start with a rough idea of the proof. Let us fix a point x ∈ T and consider the ratios

(3.1)  |R[i_j(x)]| / |R[i_{j−1}(x)]|,  j ⩾ 1,

obtained by "zooming scale after scale" around x. By Lemma 7, these are i.i.d. variables with a fixed, absolutely continuous distribution. Using this, we will argue that if we fix a small region r′ of T′, the probability that we can find nested regions of T′ around r′ that respect the ratios (3.1) up to a factor 1 ± δ at most of the scales 1 ⩽ j ⩽ k is of order δ^k. By choosing δ small enough, we will be able to do a union bound over the possible candidate regions r′ of T′. This shows that for any small region r′ of T′, a homeomorphism sending x to a point of r′ will have "δ-mismatches" between the ratios for "many" scales j (this is Proposition 13 below, that we prove in Section 3.1). Moreover, the ratio

(3.2)  |Ψ(R[i_k(x)])| / |R[i_k(x)]|

can be written as the telescopic product of those mismatches, so the existence of mismatches proves that the ratio (3.2) will vary "quite often" by a factor 1 ± δ. To conclude, we will argue that for most points x, those mismatches cannot compensate each other. More precisely, if x is picked uniformly at random in T, the ratios (3.2) form a nonnegative martingale in k (see (3.13) below), so they converge almost surely. By existence of mismatches, they change by a factor 1 ± δ many times, so the limit of the martingale has to be 0. This argument is done in a quantitative way in Section 3.2.
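The martingale claim can be checked in one line. Assuming, as is implicit (and true up to null sets, since regions meet only at finitely many points and the measures have no atoms), that the images Ψ(R[ia]), a = 1, 2, 3, partition Ψ(R[i]), and conditioning on the word i_k(X) together with all the masses, we get:

```latex
\mathbb{E}\left[\frac{|\Psi(R[\mathbf{i}_{k+1}(X)])|}{|R[\mathbf{i}_{k+1}(X)]|}
  \,\middle|\, \mathbf{i}_k(X)=\mathbf{i}\right]
= \sum_{a=1}^{3}
  \underbrace{\frac{|R[\mathbf{i}a]|}{|R[\mathbf{i}]|}}_{\mathbb{P}(X\in R[\mathbf{i}a])}
  \cdot \frac{|\Psi(R[\mathbf{i}a])|}{|R[\mathbf{i}a]|}
= \frac{\sum_{a=1}^{3}|\Psi(R[\mathbf{i}a])|}{|R[\mathbf{i}]|}
= \frac{|\Psi(R[\mathbf{i}])|}{|R[\mathbf{i}]|},
```

so the ratio is indeed a nonnegative martingale in k and converges almost surely.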
Good scales. — Before giving the precise definition of mismatches, it will be convenient to restrict ourselves to a (not too small) subset of the intermediate scales. More precisely, we first notice that by construction of the Aldous recursive decomposition, if a word i has its last letter equal to 3, then the boundary of the region R[i] in T is a single point (see Figure 4).
Definition 10. — Let α > 0 and let i ∈ T_3 with depth k. We say that a scale j ∈ {1, . . ., k − 1} is α-good for i if i_j = i_{j+1} = 3, and if furthermore

(3.3)  |b[i_j]|_{R[i_j]} ⩾ α |R[i_j]|.

The reason why we want i_j = 3 is that it implies that there are two small regions R[i_j 2], R[i_j 3] whose boundary is the singleton {b[i_j]} (see Figure 4), and the ratio between the masses of those regions gives a convenient way to say that T and T′ have "different shapes" at a certain scale. The point of the assumption (3.3) is that if the branching point b[i] splits the region R[i] in a very uneven way, there will be many possible candidates for its image, whereas under (3.3) only few choices will be available.
The first step is to make sure that around most points of the tree T, good scales represent a positive proportion of the scales.
Lemma 11. — Let (T, (R[i])_{i∈T_3}) be a decomposed Brownian tree. There exist constants α, β, γ > 0 such that, denoting by E_k the set of points x ∈ T for which fewer than βk of the scales 1 ⩽ j ⩽ k − 1 are α-good for i_k(x), then almost surely, for k large enough, we have |E_k| < e^{−γk}.

Proof. — Let us prove that we can find α, β, γ such that E[|E_k|] < e^{−2γk}, so that the conclusion of the lemma can be obtained using the Markov inequality and the Borel-Cantelli lemma. Let X be a uniform point taken under the mass measure on T, independently of the decomposition. By Remark 8, we can rewrite E[|E_k|] in the following way:

E[|E_k|] = P(the word i_k(X) has fewer than βk scales that are α-good).

J.É.P. -M., 2024, tome 11
As in the proof of Lemma 7, for k ⩾ 0, we denote by G_k the σ-algebra generated by the word i_k(X) and by the masses |R[i_j(X)a]| for 0 ⩽ j ⩽ k − 1 and a ∈ {1, 2, 3}. It is clear that the event that j is a good scale for i_k(X) belongs to G_{j+1}.
Moreover, by Lemma 7, for every odd j, the probability that j is an α-good scale for i_k(X), conditionally on G_{j−1}, can be written explicitly in terms of a vector (W_1, W_2, W_3) ∼ Dir(1/2, 1/2, 1/2). This probability is deterministic and does not depend on j, so we denote it by p, and note that p > 0 if α > 0 was chosen small enough. Therefore, the indicator variables of the events {j is an α-good scale for i_k(X)}, for j odd, are i.i.d. Bernoulli variables with positive mean, which is sufficient to prove the lemma by a Chernoff bound. □

From now on, we will fix α, β, γ > 0 that satisfy the conclusion of Lemma 11, and we will simply refer to an α-good scale as a good scale. Now let 0 < δ < α and let Ψ : T → T′ be a homeomorphism.
Since T, (R[i])_{i∈T_3} and δ are fixed throughout the paper, we will often simply write "mismatch for Ψ at i" instead of "δ-mismatch for Ψ at i with respect to (R[i])_{i∈T_3}". Informally, the scale j being a mismatch for Ψ at i indicates that when we perform one step of the Aldous recursive decomposition in T, the decomposition of R[i_j] into three parts and its image by Ψ in Ψ(R[i_j]) do not split the masses with the same proportions.
Our next goal is the following result, which guarantees the existence of mismatches around most points, for any homeomorphism. It represents a significant proportion of the proof of Proposition 3.
Proposition 13. — There exists δ > 0 such that almost surely, for k large enough, for any i ∈ T_3^k that has at least βk good scales, for any homeomorphism Ψ from T to T′, one of the following holds:
(1) the image Ψ(R[i]) has mass less than e^{−Ck}, where C is the constant from Lemma 9;
(2) there are at least ⌊βk/2⌋ good scales in {1, . . ., k − 1} that are δ-mismatches for Ψ at i.
Proof. — Let δ ∈ (0, α), whose value will be specified later. Let k ⩾ 0. Let C > 0 be the constant given by the lower bound in Lemma 9, and let B_k be the event that for all i ∈ T_3 of depth at most k, the branching point b[i] has size at least e^{−C(k+1)}. By Lemma 9 (applied to k + 1), almost surely B_k occurs for k large enough.

Now let i ∈ T_3^k, and let J be a subset of {1, 2, . . ., k − 1} with size ⌊βk/2⌋ and with only odd elements. We are interested in the existence of a homeomorphism Ψ that would satisfy three properties (i), (ii), (iii), the last of which reads:
(iii) for all j ∈ J, the scale j is good but is not a δ-mismatch for Ψ at i.
We insist that in item (i), we mean e^{−Ck} and not e^{−Cj}. We hence define the event A_J(i) as A_J(i) = {there exists a homeomorphism Ψ : T → T′ such that (i), (ii) and (iii) hold for Ψ and i}.
In the rest of the proof, whenever there exists a homeomorphism Ψ such that (i), (ii) and (iii) hold for Ψ and i, we will say that "Ψ makes A_J(i) occur". Now let us consider what happens on the event where B_k occurs but none of the A_J(i) does, for any i ∈ T_3^k and any J ⊂ {1, 2, . . ., k − 1} with #J = ⌊βk/2⌋. Fix some Ψ : T → T′ and i ∈ T_3^k that has at least βk good scales. On the event considered, (i) is satisfied, so either (ii) fails (and in this case point 1 of the proposition holds); or (ii) holds for Ψ and i, which entails that (iii) fails for all choices of J. Since we assumed that i has at least βk good scales, the fact that we cannot find ⌊βk/2⌋ good scales that are not δ-mismatches tells us that at least βk − ⌊βk/2⌋ ⩾ ⌊βk/2⌋ of them are indeed δ-mismatches. In view of this, since we already know that B_k occurs for k large enough, it suffices to find δ > 0 such that almost surely, none of the A_J(i) with i ∈ T_3^k and J ⊂ {1, . . ., k − 1} of cardinality ⌊βk/2⌋ occurs for k large enough. For that, we will show by a union bound that the probability of (3.5) is summable in k, and conclude with the Borel-Cantelli lemma.

Suppose that there exists some Ψ that makes the event A_J(i) occur. We write ℓ = ⌊βk/2⌋, denote by j_1 < j_2 < · · · < j_ℓ the elements of J, and write b_h = b[i_{j_h}] for all h ∈ {1, . . ., ℓ}. We now try to understand the sequence of points We first note that this sequence must satisfy a topological condition: we recall that i_{j_h} = i_{j_h+1} = 3 because the scale j_h is good. Hence, for all h ∈ {1, . . ., ℓ}, the point b_h is a branching point of T with the points b_1, . . ., b_{h−1} in the same connected component of T ∖ {b_h}, and the points b_{h+1}, . . ., b_ℓ in another component. Since Ψ is a homeomorphism, the same is true for the sequence (c_h) in T′. Moreover, we claim that the sizes of the branching points c_h cannot be too small. More precisely, for all h ∈ {1, . . ., ℓ − 1}, let R'_h(c) be the connected component of T′ ∖ {c_h} that contains c_{h+1}, . . ., c_ℓ, with the convention that R'_0(c) = T′. In other words, we have Then for all h ∈ {1, . . ., ℓ}, we have Using successively the fact that j_h is not a δ-mismatch and the fact that the scale j_h is α-good, we deduce, for 1 ⩽ h ⩽ ℓ − 1: From now on, we assume δ < α/4, so that For the same reasons, we also have so, using (ii) and (i) in the definition of the event A_J(i), we have

Following what precedes, we define a candidate sequence of length ℓ as a sequence (c_h)_{1⩽h⩽ℓ} of branching points of T′ such that:
- for all h ∈ {1, . . ., ℓ}, the points c_1, . . ., c_{h−1} all lie in one connected component of T′ ∖ {c_h}, and the points c_{h+1}, . . ., c_ℓ lie in another one, denoted, if h < ℓ, by R'_h(c);
- for all h ∈ {1, . . ., ℓ − 1}, we have
Note that, crucially, the notion of candidate sequence does not depend on δ. By the discussion above, any Ψ : T → T′ that makes the event A_J(i) occur must send (b_h)_{1⩽h⩽ℓ} to some candidate sequence of length ℓ in T′. We will now provide a bound on the number of such candidate sequences in T′. This bound is entirely deterministic, and only uses the fact that T′ is a real tree with total mass 1 in which all branching points have degree 3, which is almost surely the case for a realization of a Brownian tree.
Lemma 14. - There exists a constant K = K(C, α, β) such that almost surely, for all ℓ ⩾ 2, the number of candidate sequences of length ℓ in Therefore, the number of possible values of the sequence (s_h(c))_{0⩽h⩽ℓ} is at most On the other hand, for any such sequence s = (s_h)_{0⩽h⩽ℓ}, let us bound the number of candidate sequences c for which the scale sequence s(c) is s. For this, we start with an easy remark. Let R be a region of T of mass m_0. Then we claim the following.

Claim. - The number of branching points c in R satisfying |c|_R ⩾ m is at most m_0/m.

Indeed, let S_m be the set of branching points of size at least m in R. Then R can be obtained by gluing the connected components of R ∖ S_m along the structure of a finite binary tree. The nodes of this binary tree correspond to the points of S_m, so this binary tree has #S_m nodes and therefore #S_m + 2 leaves. Those #S_m + 2 leaves correspond to #S_m + 2 disjoint parts of R with mass at least m each, so (#S_m + 2) m ⩽ m_0 and the claim follows.

Now, let s = (s_h)_{0⩽h⩽ℓ} be a non-decreasing sequence of integers with s_0 = 0 and s_ℓ ⩽ log(2/α) + (C + 1)ℓ. Let us build step by step a candidate sequence c satisfying s(c) = s.
- For c to be a candidate sequence, c_1 needs to satisfy by definition of s_1(c). Using the last display and the claim, the number of possible choices of c_1 for which s_1(c) = s_1 is at most (2/α) e^{s_1+1}.
- Let 1 ⩽ h ⩽ ℓ − 1 and assume that c_1, . . ., c_{h−1} have already been chosen. Then , so there are only 3 possible choices for R'_{h−1}(c). Moreover, once this region has been chosen, the point c_h must be a point of R'_{h−1}(c) with by definition of s_h(c). Using the claim again and the fact that |R'_{h−1}(c)| ⩽ e^{−s_{h−1}(c)} = e^{−s_{h−1}} by construction, the number of possible choices of c_h that ensure that s_h(c) = s_h, given c_1, . . ., c_{h−1}, is bounded above by
- Finally, by the same reasoning, the number of possible choices of c_ℓ given c_1, . . ., c_{ℓ−1} is bounded above by
Using the above in cascade and telescoping the product, we get that, for any s, the number of candidate sequences c such that s(c) = s is bounded above by Combined with (3.7), this proves the lemma, with

We return to the proof of Proposition 13. From now on, we work conditionally on the tree T′ and fix J and i. We recall that j_1 < · · · < j_ℓ are the elements of J and that b_h = b[i_{j_h}]. We have seen that if there exists a Ψ that makes the event A_J(i) occur, then the sequence (Ψ(b_h))_{1⩽h⩽ℓ} is a candidate sequence in T′. Therefore, we fix such a candidate sequence (c_h)_{1⩽h⩽ℓ}, and estimate the probability that there exists a Ψ satisfying (i), (ii) and (iii) as well as Ψ(b_h) = c_h for all h ∈ {1, . . ., ℓ}. For this, let 2 ⩽ h ⩽ ℓ − 1. On the event that such a Ψ exists, since j_h is not a δ-mismatch for i, we have and a similar estimate holds for |R[i_{j_h}3]|.
On the other hand, by our conventions in the construction of the recursive decomposition and the fact that | the respective masses of these two connected components, and we highlight that those masses are completely determined by the sequence c. From (3.9) and the analogous equation for R[i_{j_h}3], using that (x, y) ↦ x/(x + y) is increasing in x and decreasing in y, we get Since the scale j_h is α-good and we have chosen δ < α/2, we have, as in (3.6), that for all h ∈ {2, . . ., ℓ − 1}. On the other hand, by Proposition 6, the variables It follows that we have where On the other hand, by (2.1), the law of integrating between q_h − δ/α and q_h + δ/α, and assuming that δ < α²/4, we have We can now take the union bound over all candidate sequences (c_h). Using Lemma 14, we obtain We can now remove the conditioning on T′ and perform a union bound over the 3^k possible values of i and the (k−1 choose ℓ) ⩽ 2^k possible values of the set J. We find (3.11) Recalling that ℓ = ⌊βk/2⌋, if we choose the constant δ > 0 sufficiently small, this decays exponentially in k. Given the discussion before (3.5), this proves the proposition. □

3.2. The martingale argument. - For the next part of the argument, we need to introduce another notion of mismatch that is slightly looser than that of Definition 12. From now on, we fix a value of δ > 0 that satisfies the conclusion of Proposition 13. Suppose that T, (R[i])_{i∈T_3} is a decomposed Brownian tree, that T′ is another Brownian tree, and that Ψ : T → T′ is a homeomorphism.
Definition 15. - Let j ⩾ 1, and let i ∈ T_3 of depth at least j + 1. We say that the scale j is a weak mismatch for In particular, a mismatch as defined in Definition 12 is also a weak mismatch. The difference between the two definitions is that we have removed the "topological" part of the assumption that j has to be a good scale, i.e., that i_j = i_{j+1} = 3 (see Definition 10). In this section, we prove the following result.
Proposition 16. - Let β be as in Lemma 11. There exists a constant η > 0 such that for any homeomorphism Ψ from T to T′ and any k ⩾ 1, the set V_{k,Ψ} of indices i ∈ T_3^k such that Ψ has more than (β/2)k weak mismatches at i and such that We highlight that the result is in fact deterministic: it holds almost surely for realizations of the Brownian trees T, T′, of the decomposition (R[i])_{i∈T_3^k}, and for any homeomorphism Ψ. In what follows, we will refer to weak mismatches simply as mismatches.
Proof. - Although the result is deterministic, we will give a probabilistic proof by interpreting the mass of a subset of T as the probability that a point X sampled uniformly in T belongs to that subset, as explained in Remark 8. Throughout the proof, we treat T, T′ as deterministic, compact real trees, each equipped with a nonatomic mass measure, and we also treat (R[i])_{i∈T_3} as deterministic. We pick X in T according to its mass measure. For j ⩾ 0, we denote by H_j the σ-algebra generated by i_j(X), so that (H_j)_{j⩾0} is a filtration. Note that since X is uniform, we have for j ⩾ 0 and a ∈ {1, 2, 3} (3.12) We also note that if 0 ⩽ j ⩽ k, then the event that scale j is a mismatch for Ψ at i_k(X) is H_j-measurable (2), since it only depends on i_j(X). Now, for j ⩾ 0, we define A simple computation using (3.12) shows that the process (M_j)_{j⩾0} is an (H_j)-martingale.
The idea behind Proposition 16 is that a mismatch gives an opportunity for M_{j+1} to be significantly different from M_j. Since the martingale M is positive, it converges almost surely, and if its value changes often then the limit has to be 0. To obtain a quantitative version of this intuition, we will study (log M_j)_{j⩾0}, which is a supermartingale. We will use the fact that the steps corresponding to mismatches tend to bring the value of this process down by more than an additive constant in expectation. Hence, after a large number k of steps, either we have seen few mismatches or log M has gone down by a lot.

(2) This is the reason why we are looking at the weak mismatches of Definition 15, and not at the mismatches of Definition 12: the assumption i_{j+1} = 3 in the definition of a good scale is not H_j-measurable.
More precisely, let us fix a constant µ > 0 (to be made precise later). For j ⩾ 1, we introduce We will prove that if µ > 0 is chosen sufficiently small, then for all j ⩾ 1, we have From here, the result follows from a Chernoff bound. Indeed, using the last display in cascade, we obtain E[exp(½ Σ_{j=1}^k Z_j)] ⩽ 1, so that we have which is what we want to prove if we set η = µβ/8. So now, we only have to prove (3.14) for some value of µ > 0. First, on the event that scale j − 1 is not a mismatch, we have by the martingale property. Therefore, by concavity of x ↦ x^{1/2} and Jensen's inequality, we have E[e^{Z_j/2} | H_{j−1}] ⩽ 1^{1/2} = 1. Hence we only have to focus on the event where scale j − 1 is a mismatch. For this, we use the following lemma.
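Before stating the lemma, the "cascade" step just used can be written out explicitly. This is a sketch: it only uses (3.14) and the fact that Z_j is H_j-measurable, via the tower property of conditional expectation:

```latex
\mathbb{E}\Big[\exp\Big(\tfrac{1}{2}\sum_{j=1}^{k} Z_j\Big)\Big]
  = \mathbb{E}\Big[\exp\Big(\tfrac{1}{2}\sum_{j=1}^{k-1} Z_j\Big)\,
      \mathbb{E}\big[e^{Z_k/2}\,\big|\,\mathcal{H}_{k-1}\big]\Big]
  \leqslant \mathbb{E}\Big[\exp\Big(\tfrac{1}{2}\sum_{j=1}^{k-1} Z_j\Big)\Big]
  \leqslant \dots \leqslant 1.
```

Markov's inequality then yields P(Σ_{j=1}^k Z_j ⩾ a) ⩽ e^{−a/2} for every a > 0, which is the shape of bound used above with a proportional to k.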
Lemma 17. - There exists µ = µ(α, δ) such that the following holds. Let (p_1, p_2, p_3) and (q_1, q_2, q_3) be two elements of the simplex {(x_1, x_2, x_3) | x_1 + x_2 + x_3 = 1}, and let Z be a random variable given by where for all i ∈ {1, 2, 3}, We then apply the lemma to the random variable Z_j conditionally on H_{j−1}, on the event that j − 1 is a mismatch, by taking This ensures that (3.14) is satisfied for the appropriate choice of µ given by the lemma. □

Proof of Lemma 17. - For µ > 0, we have For each term, we have p Moreover, let a be such that |p_a − q_a| ⩾ δ. We have This proves our claim, by taking µ > 0 small enough. □

Given Propositions 13 and 16, it is now easy to conclude the proof of Proposition 3.
Proof of Proposition 3. - We know that almost surely, for k large enough, the conclusions of Lemma 11 and Proposition 13 hold. We fix such a k. Let Ψ : T → T′ and consider some i ∈ U_{k,Ψ}, meaning that Then:
- either i ∈ U_k as defined in Lemma 11, meaning that i has fewer than βk good scales;
- or Item 2 of Proposition 13 holds, as Item 1 is prohibited since In this case, the index i must have at least ⌊βk/2⌋ scales that are δ-mismatches, hence also weak δ-mismatches. This entails, using our initial assumption on i, that i ∈ V_{k,Ψ}.
Therefore, on the event that we considered, we have where γ and η are given respectively by Lemma 11 and Proposition 16. This concludes the proof by taking 0 < ξ < min(γ, η). □

Proofs of the main results
4.1. Hölder homeomorphisms. - We start with the proof of Theorem 2, which follows quite straightforwardly from Proposition 3.
Proof of Theorem 2. - Let T, (R[i])_{i∈T_3} be a decomposed Brownian tree and let T′ be an independent Brownian tree. Let also Ψ : T → T′ be a homeomorphism. We will find a constant ζ > 0 such that almost surely, for k large enough, we can find i ∈ T_3^k Since the maximal diameter over all the regions of level k tends to 0 as k → +∞, this will entail that Ψ^{−1} cannot be (1 − ζ/2)-Hölder. Since the problem is symmetric in T and T′, this also shows that Ψ cannot be (1 − ζ/2)-Hölder either.
Let k ⩾ 0. On the one hand, we know that almost surely, for k large enough, the conclusions of Proposition 3 and Lemma 9 hold. On the other hand, we recall that by Proposition 6, conditionally on their masses, the regions (R[i])_{i∈T_3^k} are independent Brownian trees with those respective masses. Moreover, there exists a constant u > 0 such that the Brownian tree T of mass 1 satisfies for all x > 0 (this can, e.g., be deduced from the explicit distribution of the maximum [18]). By a union bound and Borel-Cantelli, there is a constant c > 0 such that almost surely, we have for k large enough Combining this with the upper bound of Lemma 9 (which ensures that |R[i]| behaves roughly like a decreasing exponential in k), this implies that for any ε > 0, almost surely for k large enough and i ∈ T_3^k, we have

Controlling the diameter of the regions Ψ(R[i]) of T′ cannot be done in the same way, as we have no a priori estimates on the shape of Ψ(R[i]). Therefore, we will use the definition of T′ via the Brownian excursion, and the fact that the excursion is Hölder. More precisely, we recall from Section 2.2 that T′ is built as a quotient of [0, 1] using a Brownian excursion e′ = (e′_t)_{0⩽t⩽1}. We denote by p_{e′} the canonical projection from [0, 1] to T′. Now, for any i ∈ T_3, the region R[i] is delimited by at most 2 points. Therefore, the same is true for the region Ψ(R[i]) of T′. This implies that Ψ(R[i]) is of the form p_{e′}(I_1 ∪ I_2 ∪ I_3), where I_1 ∪ I_2 ∪ I_3 is a union of three sub-intervals of [0, 1]. By connectedness of Ψ(R[i]), we have On the other hand, we know that |Ψ(R[i])| goes a.s. to 0 as the depth of i goes to +∞, so almost surely, for k large enough and i of depth k, we can write since p_{e′} is measure-preserving. We can finally put things together. Using the conclusion of Proposition 3, almost surely for k large enough, there exists i ∈ T_3^k such that (4.4) where the second inequality comes from Lemma 9. For this i, we have From there, we just need to take ε > 0 small enough to conclude. This proves the theorem, and we can take ε² < η/(2C). □

4.2. Maximum agreement subtree bound. - In this section, we prove Theorem 1 from two intermediate results. The first one, Corollary 18, is a direct corollary of Proposition 3. The second one, Lemma 19, roughly says that the square root upper bound holds simultaneously in all regions of T_n and T′_n (its proof relies on the same ideas as the classical square root upper bound). We only state this lemma here, and prove it in the next subsection.
Corollary 18. - Let T, (R[i])_{i∈T_3} be a decomposed Brownian tree and let T′ be an independent Brownian tree. There exists a constant ρ > 0 such that with probability 1 − o_k(1), for any homeomorphism Ψ : T → T′, we have

Proof. - Let ξ, η > 0 be given by Proposition 3, and let Ψ : T → T′ be a homeomorphism. For U_{k,Ψ} defined as in Proposition 3, we have (4.5) On the other hand, by Proposition 3, almost surely, for k large enough, we can write (4.6) where the first inequality follows from the Cauchy-Schwarz inequality and the second from Proposition 3. We conclude by summing (4.5) and (4.6), and by taking ρ > 0 small enough. □

We can now prove Theorem 1.
Proof of Theorem 1. - We let θ = min(1/(2C), 1/(4 log 3)), where C > 0 is given by Lemma 9, and we take k = ⌊θ log n⌋. Note that in particular, the number of regions of the recursive decomposition of T at scale k is 3^k ⩽ n^{θ log 3} ⩽ n^{1/4}. We now let ε = θρ/4, where ρ is given by Corollary 18. We assume that T′_n is coupled with an n-pointed Brownian tree T′, (X′_j)_{1⩽j⩽n} in the way described in Section 2.4. We also assume that T′, (X′_j) is independent of T, (R[i])_{i∈T_3^k} and T_n. Suppose that for some S ⊂ {1, . . ., n}, we have T_n|_S = T′_n|_S = t. Then we claim that there exists a homeomorphism Ψ from T to T′ such that Ψ(X_j) = X′_j for all j ∈ S. Indeed, up to reparametrization of the edges, there exists a unique embedding φ of t into T that sends the leaf labeled j to X_j for any label j appearing in t, and a unique embedding φ′ of t into T′ that similarly sends j to X′_j. For every edge e = (x_e, y_e) of t, let T_e be the set of points x ∈ T such that the closest point of φ(t) to x belongs to φ(e). We define similarly the region T′_e ⊂ T′. For every e, the regions T_e and T′_e are compact real trees in which branching points are dense and all have degree 3, so they have the same topology by [8, Th. 1] (see also [12]). (3) Therefore, there exists a homeomorphism Ψ_e : T_e → T′_e such that Ψ_e(φ(x_e)) = φ′(x_e) and Ψ_e(φ(y_e)) = φ′(y_e). The homeomorphism Ψ is obtained by patching the Ψ_e together.

Finally, for any i ∈ T_3^k, we denote by R[i] the smallest region of T_n that contains all the labels j ∈ {1, . . ., n} such that X_j ∈ R[i]. We define similarly the regions R′[i] of T′_n using T′ and (X′_j). For every i ∈ T_3^k, note that by definition of R[i] and R′[i], and by the fact that Ψ(X_j) = X′_j for j ∈ S, we have In particular, this subset induces the same subtree in R[i] and in R′[i]. Therefore, we can write On the other hand, Lemma 19 ensures that with probability 1 − o_n(1), for all i ∈ T_3^k we have We now use Lemma 5: with probability 1 − o_n(1), for any i Finally, putting everything together, we get with probability 1 − o_n(1). □

(3) The results in [8] are only stated for unpointed Brownian trees, but the proofs extend straightforwardly to bipointed trees.
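As an aside, the quantity MAST(t, t′) itself is computable in polynomial time, as recalled in the introduction. The following minimal sketch implements the classical dynamic programming recursion for a rooted binary variant; the tuple encoding and function names are ours, not from the cited references, and the paper's trees are unrooted, which requires an extra rooting step not shown here.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def leaf_set(t):
    # labels appearing below t; a leaf is an int, an internal node a pair
    return frozenset([t]) if isinstance(t, int) else leaf_set(t[0]) | leaf_set(t[1])

@lru_cache(maxsize=None)
def mast(u, v):
    # size of the largest agreement subtree of the rooted binary trees u and v
    if isinstance(u, int):
        return 1 if u in leaf_set(v) else 0
    if isinstance(v, int):
        return 1 if v in leaf_set(u) else 0
    u1, u2 = u
    v1, v2 = v
    return max(
        mast(u1, v), mast(u2, v),      # discard one side of u
        mast(u, v1), mast(u, v2),      # discard one side of v
        mast(u1, v1) + mast(u2, v2),   # match the root splits
        mast(u1, v2) + mast(u2, v1),   # match them the other way
    )

t  = ((1, 2), (3, 4))
tp = ((1, 3), (2, 4))
```

On this pair, mast(t, tp) = 2: no subset of three labels induces the same rooted subtree in both trees, while {1, 2} does.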
In particular, as m → ∞. For t, t′ ∈ B_s, we write t ∼ t′ if t′ can be obtained from t by relabeling its leaves. Note that this defines an equivalence relation on B_s. Then for any random variable T_s on B_s that satisfies the exchangeability property, we have Since the distribution of (S_m)|_{{1,...,s}} satisfies the exchangeability condition, we just need to check that the number appearing in the denominator on the right-hand side of the inequality is bounded from below by s!/(2^{s−2} s) for any t ∈ B_s. For that, it suffices to prove that the number of graph automorphisms of any tree t ∈ B_s is bounded above by 2^{s−2} s. This follows from the fact that an automorphism of a tree is determined by the image of one leaf and a cyclic ordering of the edges around each node. This proves the first claim of the lemma, and the second follows from Stirling's formula. □

In the proof of Lemma 19, the above result will be used to control the size of the MAST of two regions R ⊂ T_n and R′ ⊂ T′_n in terms of their number of common labels. The goal of the next result is to bound the number of these common labels in terms of m = #R and m′ = #R′.

Lemma 21. - Let ε > 0. Let S and S′ be two independent uniform random subsets of {1, . . ., n} of respective sizes m and m′. Then we have the following bound:

Proof. - We write #(S ∩ S′) as a sum of indicators #(S ∩ S′) = Σ_{i=1}^n 1_{i∈S∩S′}. We also denote by (H_i)_{0⩽i⩽n} the filtration generated by the sequence (1_{i∈S}, 1_{i∈S′})_{1⩽i⩽n}.
We can check that Therefore, we have the following stochastic domination By shuffling the indices, the same domination holds for Σ_{i=1}^{n/2} 1_{i∈S∩S′}, so we obtain where for the last inequality we use that for a binomial random variable X with expectation µ, we have P(X ⩾ (1 + ε)µ) ⩽ exp(−εµ/3). □

We can finally prove Lemma 19.
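As a quick numerical illustration of Lemma 21 (not part of the proof), the expected overlap of two independent uniform subsets of sizes m and m′ of {1, . . ., n} is mm′/n, and the simulated overlap concentrates around this value. The function names below are ours.

```python
import random

def mean_overlap(n, m, m_prime, trials=2000, seed=0):
    # average of #(S ∩ S') over independent uniform subsets S, S' of sizes m, m'
    rng = random.Random(seed)
    population = range(1, n + 1)
    total = 0
    for _ in range(trials):
        s = set(rng.sample(population, m))
        s_prime = set(rng.sample(population, m_prime))
        total += len(s & s_prime)
    return total / trials

n, m, m_prime = 1000, 100, 200
estimate = mean_overlap(n, m, m_prime)   # E #(S ∩ S') = m * m_prime / n = 20
```

With these parameters the empirical mean is close to 20, matching the indicator computation E[1_{i∈S∩S′}] = (m/n)(m′/n) summed over i, up to the without-replacement correction.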
Proof of Lemma 19. - We will actually prove a much stronger statement: the estimate of Lemma 19 holds even if we condition on the shapes of T_n and T′_n. More precisely, let σ, σ′ be two independent uniform random permutations of {1, . . ., n}, defined on the same probability space and independent of T_n, T′_n. We denote by T_n^σ (resp. (T′_n)^{σ′}) the tree obtained from T_n (resp. T′_n) by replacing each label j by the label σ(j) (resp. σ′(j)). By exchangeability of the model, the pair (T_n^σ, (T′_n)^{σ′}) has the same distribution as (T_n, T′_n), so we may prove the lemma for the former, and it will hold for the latter as well. For any region R That is, the label j is in R^σ if and only if σ^{−1}(j) is in R. Note that if one of our regions is empty or consists of a single leaf, the result is obvious, so we may focus on regions delimited by 1 or 2 nodes. Let E_n be the event that there exist two regions R ⊂ T_n and R′ ⊂ T′_n delimited by at most two nodes such that (4.7) fails for the regions R^σ and (R′)^{σ′}. We want to show that P(E_n) → 0 as n → +∞. We will actually show that this is true even if we condition on T_n, T′_n, that is For this, we fix t, t′ ∈ B_n. From now on, we condition on (T_n, T′_n) = (t, t′). Let r, r′ be two regions of t, t′ delimited by at most 2 nodes each. Since the number of nodes in those two trees is fixed and equal to n − 2, there are O(n^4) such pairs (r, r′). We denote by m, m′ the respective numbers of leaves of r and r′. We denote by Lab(σ, r) (resp. Lab(σ′, r′)) the set of labels contained in the region r^σ (resp. (r′)^{σ′}). Then Lab(σ, r) and Lab(σ′, r′) are two independent uniform random subsets of {1, . . ., n} of respective sizes m and m′. Let L = Lab(σ, r) ∩ Lab(σ′, r′) be their intersection. By Lemma 21, it follows that (4.9) On the other hand, any common induced subtree of r^σ and (r′)^{σ′} can only use leaf labels that are common to those two regions, so By summing (4.9) and (4.10), we obtain the bound uniformly in (t, t′) and (r, r′). Finally, we can sum over the O(n^4) choices of (r, r′) to obtain (4.8). □

Discussion of the results
5.1. Explicit constants. - The goal of this paragraph is to obtain a quantitative lower bound for the constants appearing in Theorems 1 and 2. We do not try to optimize the computations. We decide to take α = 10⁻⁶. We then have p ⩾ (1/9) · (1 − 3√α) = 0.997/9, and we need β < p/2, so we take β = 1/20. Finally, to conclude the proof in a quantitative way, we use Hoeffding's inequality and we find γ = (p − 2β)² ≈ 10⁻⁴ (its exact value will not be needed for the computations below).
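The arithmetic of this paragraph can be checked directly. This is a sanity check only; p_lower denotes the lower bound (1/9)(1 − 3√α) on p stated above.

```python
alpha = 1e-6
p_lower = (1 - 3 * alpha ** 0.5) / 9   # p >= (1/9)(1 - 3*sqrt(alpha)) = 0.997/9
beta = 1 / 20
assert beta < p_lower / 2              # the requirement beta < p/2 holds
gamma = (p_lower - 2 * beta) ** 2      # exponent from Hoeffding's inequality
# gamma is about 1.16e-4, consistent with the value 10^-4 quoted above
```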

Remarks and open questions
The expected maximum agreement subtree. - In all the arguments of the paper, the estimates that are stated with probability 1 − o_n(1) actually hold with probability 1 − O(n^{−a}) for some (small) a > 0 (or with probability 1 − O(e^{−ak}), which is equivalent since we take k of order log n). Hence, by Theorem 1 and Lemma 20, we can write

Other models of random trees. - Another random tree model for which the maximum agreement subtree has been investigated [10, 7] is the Yule-Harding one Y_n, i.e., the model where the binary tree Y_n is obtained from Y_{n−1} by choosing a leaf uniformly at random and splitting it into one node and two leaves. The best known lower and upper bounds for this model are given in [7] and are respectively of order n^{0.344} and √n. It seems likely to us that adaptations of the ideas developed in the present paper could be used to prove Theorem 1 for the Yule-Harding model.
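The Yule-Harding growth rule described above is straightforward to simulate. This sketch is illustrative; in particular, the labeling convention (the new leaf receives the next unused label) is our assumption, and conventions vary across references.

```python
import random

def yule_harding(n, seed=0):
    # Grow Y_n from a single leaf: repeatedly pick a uniform leaf and split it
    # into an internal node carrying the old leaf and a freshly labeled leaf.
    # A leaf is a one-element list [label]; an internal node is [left, right].
    rng = random.Random(seed)
    tree = [1]
    leaves = [tree]
    for new_label in range(2, n + 1):
        idx = rng.randrange(len(leaves))
        node = leaves[idx]
        old_label = node[0]
        node[:] = [[old_label], [new_label]]   # the leaf becomes an internal node
        leaves[idx] = node[0]
        leaves.append(node[1])
    return tree

def n_leaves(t):
    return 1 if len(t) == 1 else n_leaves(t[0]) + n_leaves(t[1])
```

Each of the n − 1 splitting steps adds exactly one leaf, so yule_harding(n) is a binary tree with n labeled leaves.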
Beyond the binary case, another natural question would be to estimate the MAST between more general Galton-Watson trees. We believe that similar results could be obtained provided the tail of the offspring distribution is light enough (with the technical difficulty that the coupling between the discrete model and the Brownian tree would not be as simple). On the other hand, when the tail is very heavy, the MAST should become larger because of star-shaped subtrees.

Optimal regularity for homeomorphisms between continuous random structures
It seems natural to introduce the critical exponent γ⁺, defined as the infimum of the exponents γ such that almost surely there exists no γ-Hölder homeomorphism Ψ : T → T′. In some sense, this exponent captures how metrically different two independent realizations of the Brownian tree are. Theorem 2 ensures that γ⁺ ⩽ 1 − 10^{−338}, and, as pointed out in the introduction, Aldous's construction in [5] amounts to constructing a (5 − 2√6)-Hölder homeomorphism between T and T′, so γ⁺ ⩾ 5 − 2√6 ≈ 0.1010. It would be an interesting direction of research to find tighter bounds on γ⁺, or a good heuristic as to what the value of γ⁺ may be.
The question of finding the optimal regularity of homeomorphisms between independent copies of random metric spaces can be asked for many other models. For example, it is natural to ask whether an analog of Theorem 2 could be proved for models with the topology of the plane, such as the Brownian map [22, 24] or, more generally, Liouville Quantum Gravity metrics [17].
Appendix. Construction of a Hölder homeomorphism

Theorem 22. - Let T and T′ be two independent copies of the Brownian tree. There almost surely exists a homeomorphism Ψ from T to T′ such that Ψ and Ψ^{−1} are both γ-Hölder, for any γ < 5 − 2√6.

Figure 2. Coupling between the 5-pointed Brownian tree (T, X_1, . . ., X_5) and T_5. The combinatorial structure of the paths joining the distinguished points, shown in blue, is that of the discrete tree T_5 we started with.

□

2.5. The Aldous recursive decomposition

2.5.1. Decomposing T into three regions. - We now introduce a recursive decomposition of the Brownian tree T, which consists in repeatedly applying the above decomposition for n = 3. More precisely, let (T, x[1], x[2], x[3]) be a 3-pointed Brownian tree with mass Note that almost surely, the points x[1], x[2] and x[3] are leaves. In this case, there exists a unique branching point b[∅] of T such that x[1], x[2] and x[3] lie in three different connected components of T ∖ {b[∅]}. For i ∈ {1, 2, 3}, we call R[i] the (closure in T of the) connected component containing x[i]. Then Theorem 4 for n = 3 More precisely, we write R[∅] = T and we define the branching point b[∅] and the bi-pointed trees R[1], R[2] and R[3] as in the previous paragraph. Moreover, let i ∈ T_3 of depth k ⩾ 1, assume that the region R[i] has been defined, and call x[i1], x[i2] its marked points. Conditionally on all the rest, let x[i3] be a random point sampled according to the mass measure on R[i]. We denote by b[i] the unique branching point of R[i] such that x[i1], x[i2] and x[i3] lie in three different connected components of R[i] ∖ {b[i]}, and we call the closures of these three components respectively R[i1], R[i2] and R[i3]. Then, for a ∈ {1, 2, 3}, we set x[ia1] := b[i] and x[ia2] := x[ia] (that is, we mark two points on each of the regions R[ia], in the same way as in the first step). Finally, we equip the regions R[ia] with the metric and the measure naturally inherited from R[i]. See Figure 3 and Figure 4 for an illustration.
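Under this recursive decomposition, and assuming (as in Lemma 7, where the split is Dirichlet(1/2, 1/2, 1/2)) that each step divides the mass of a region according to an independent Dirichlet vector, the masses |R[i]| at a fixed scale can be simulated as follows. This is an illustrative sketch; the encoding of indices i ∈ T_3^k as tuples over {1, 2, 3} is our choice.

```python
import random

def dirichlet_half(rng):
    # Dir(1/2, 1/2, 1/2) via normalized independent Gamma(1/2) variables
    g = [rng.gammavariate(0.5, 1.0) for _ in range(3)]
    s = sum(g)
    return [x / s for x in g]

def region_masses(k, seed=0):
    # masses |R[i]| for all i in T_3^k, starting from |R[∅]| = 1
    rng = random.Random(seed)
    level = {(): 1.0}
    for _ in range(k):
        nxt = {}
        for i, mass in level.items():
            w = dirichlet_half(rng)
            for a in (1, 2, 3):
                nxt[i + (a,)] = mass * w[a - 1]
        level = nxt
    return level
```

At scale k there are 3^k regions and their masses sum to 1, reflecting that the decomposition is a partition of the mass measure of T.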

Figure 3. A decomposed Brownian tree T and its decomposition (R[i])_{i∈T_3^2}.

Figure 4. One step of the decomposition for a region R[i] delimited by one point and by two points, respectively. Note that in both cases, the newly created region R[i3] is delimited by only one point.
is a Brownian tree, then almost surely the points x[i1], x[i2] and x[i3] defined above are pairwise distinct leaves, so b[i] is uniquely characterized and distinct from x[i1], x[i2], x[i3], and Theorem 4 for n = 3 ensures that the R[ia] are Brownian trees. If i has depth k, we will call R[i] a region of scale k. Using Theorem 4 with n = 3 at each scale, we easily obtain the following independence properties.

Proposition 6. - The following statements hold.

Figure 5. The sequence (b_h)_{1⩽h⩽ℓ} and their positions relative to R[i_{j_h}2] and R[i_{j_h}3].
4.3. The refined square root bound. - This section is devoted to proving Lemma 19. Before getting to the proof, we first state and prove two other intermediate results, Lemma 20 and Lemma 21. First, Lemma 20, stated below, is a consequence of [10, Lem. 4.1] (see also [7, Lem. 4.1, Prop. 4.2]), but expressed in a slightly different context. We provide a quick proof for completeness, adapted from the same references. We say that a random labeled tree is exchangeable if its distribution is invariant under a uniform random permutation of the labels on the leaves.

Lemma 20. - Suppose that S_m and S′_m are independent exchangeable random variables on B_m, the set of leaf-labeled binary trees with m leaves (with possibly different distributions). Then for any s ⩾ 1,