Random walks on hyperbolic spaces: second order expansion of the rate function at the drift

Let $(X,d)$ be a geodesic Gromov-hyperbolic space, $o \in X$ a basepoint and $\mu$ a countably supported non-elementary probability measure on $\operatorname{Isom}(X)$. Denote by $z_n$ the random walk on $X$ driven by the probability measure $\mu$. Supposing that $\mu$ has finite exponential moment, we give a second-order Taylor expansion of the large deviation rate function of the sequence $\frac{1}{n}d(z_n,o)$ and show that the corresponding coefficient is expressed by the variance in the central limit theorem satisfied by the sequence $d(z_n,o)$. This provides a positive answer to a question raised in \cite{BMSS}. The proof relies on the study of the Laplace transform of $d(z_n,o)$ at the origin using a martingale decomposition first introduced by Benoist--Quint together with an exponential submartingale transform and large deviation estimates for the quadratic variation process of certain martingales.


Introduction
Let (X, d) be a separable geodesic Gromov-hyperbolic space, G = Isom(X), and o ∈ X a base point of X. A probability measure µ on G defines a random walk on the group G and subsequently on the metric space X in the following way. Let (X_i)_{i∈N} be a sequence of i.i.d. random variables on G with distribution µ. We let L_n = X_n · · · X_1 denote the successive positions of the random walk on G. The process (z_n)_{n∈N} on X defined by z_n = L_n · o constitutes a Markov chain on X that we shall refer to as a random walk on X. To avoid measurability issues, we will always suppose that the probability measure µ is countably supported.
Thanks to the subadditive ergodic theorem, under a finite first moment assumption, we have the following law of large numbers:

(1.1) (1/n) d(z_n, o) → ℓ_µ almost surely as n → ∞,

where ℓ_µ ∈ [0, ∞) is a constant called the drift of the random walk. There has recently been substantial interest in the finer study of asymptotic properties of random walks on Gromov-hyperbolic spaces. This recent progress shows that the resemblance between the asymptotic behaviour of the random walk displacement and classical sums of i.i.d. real random variables goes far beyond the law of large numbers (1.1): a central limit theorem (CLT) with the optimal finite second moment assumption was proved by Benoist-Quint [3] (see also Horbez [22]), improving previous more restrictive versions by Ledrappier [23] and Björklund [5]; an alternative proof of the CLT was later given by Mathieu-Sisto [26] and, in a more restrictive setting, by Gouëzel [19]. These show that for a non-elementary probability measure µ with finite second moment (see below for the definitions), we have

(1.2) (d(z_n, o) − nℓ_µ)/√n → N(0, σ_µ²) in distribution.

The analogue of Cramér's theorem on large deviation principles was recently proved by Boulanger-Mathieu-Sert-Sisto [6] (see also Gouëzel [20]): they showed that for a non-elementary probability measure with a finite exponential moment, the sequence (1/n) d(z_n, o) satisfies a large deviation principle with a proper convex rate function I vanishing only at ℓ_µ, i.e., for every measurable subset R of the real line,

(1.3) − inf_{int(R)} I ⩽ lim inf_{n→∞} (1/n) log P((1/n) d(z_n, o) ∈ R) ⩽ lim sup_{n→∞} (1/n) log P((1/n) d(z_n, o) ∈ R) ⩽ − inf_{R̄} I,

where int(R) denotes the interior and R̄ the closure of R. Furthermore, concentration inequalities reminiscent of Hoeffding's inequalities were recently shown by Aoun-Sert [1], and a local limit theorem for random walks on Gromov-hyperbolic groups was proved by Gouëzel [18].
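For intuition (an illustrative simulation, not from the paper): the simple random walk on the free group F2 acts on its Cayley graph, a 4-regular tree and hence a 0-hyperbolic space, and d(z_n, o) is the reduced word length of L_n. For the uniform measure on the four standard generators the drift is known to equal 1/2, which a direct Monte Carlo estimate reproduces:

```python
import random

def walk_length(n, rng):
    """Reduced word length |L_n| = d(z_n, o) for the simple random walk
    on the free group F2 ('A' = a^{-1}, 'B' = b^{-1})."""
    w = []
    for _ in range(n):
        c = rng.choice("aAbB")
        if w and w[-1] == c.swapcase():
            w.pop()          # cancellation: step back toward the origin
        else:
            w.append(c)
    return len(w)

rng = random.Random(1)
n, trials = 400, 2000
est = sum(walk_length(n, rng) for _ in range(trials)) / (n * trials)
# Away from o the distance increases with probability 3/4 and decreases
# with probability 1/4, so the drift of this walk is 1/2.
```

The estimate `est` concentrates near 1/2, in line with (1.1).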
J.É.P. - M., 2023, tome 10

However, establishing these results, analogous to the classical setting of sums of i.i.d. real random variables, involves overcoming serious issues by means of various approaches and techniques. Apart from mostly geometric approaches such as the ones used in [6, 20, 26], two classical methods are present, say, in the aforementioned different proofs of the CLT. These are Nagaev's analytic method [28] and Gordin-Lifšic's martingale method [16].
Nagaev's method can be seen as a version of the classical Fourier-Laplace transform; it relies on techniques of analytic perturbation theory and, in general, yields sharper estimates. However, implementing it requires proving a certain spectral gap result for a Markov operator acting on an appropriate boundary space. Although this is by now standard on, say, classical hyperbolic spaces or free groups, it is not well-developed in the generality of the spaces, namely (not necessarily proper) geodesic Gromov-hyperbolic spaces, that we shall be working with in this article. A thorough study of the analytic method in the case of Gromov-hyperbolic groups was done by Gouëzel [19, Prop. 3.6, §5].
We will extensively use the martingale approach, developed in this setting by Benoist-Quint [2, 3] and adapted to greater generality by Horbez [22] and Aoun-Sert [1], to tackle the analytic problem of giving a second-order expansion of the limit Laplace transform of the sequence d(z_n, o) (or, by convex duality, of its large deviation rate function in (1.3)) and relating it to the variance in the central limit theorem (1.2). Similar results are known to hold in settings where spectral methods are available. We now expound on these notions and precisely state the main result of this note.
A geodesic metric space (X, d) is said to be Gromov-hyperbolic if there exists δ > 0 such that for every x, y, z, o ∈ X, we have (x|y)_o ⩾ (x|z)_o ∧ (z|y)_o − δ, where (·|·)_· is the Gromov product given by (x|y)_o = (1/2)(d(x, o) + d(y, o) − d(x, y)). A probability measure µ on Isom(X) is called non-elementary if its support S generates a semigroup that contains two independent loxodromic elements (see Section 3.2). For such a measure µ and n ∈ N, we denote by µ^{*n} its n-th convolution power, which is the law of the random variable L_n.
Given a probability measure µ, the limit Laplace transform of the sequence (1/n) d(z_n, o) is defined as

(1.4) Λ(λ) := lim_{n→∞} (1/n) log E[e^{λ d(z_n, o)}].
Note that, since the increments are i.i.d. and G acts by isometries on X, subadditivity implies that the limit in (1.4) exists. Under a finite super-exponential moment assumption, the Fenchel-Legendre transform of Λ is the rate function I of the large deviation principle satisfied by the sequence (1/n) d(z_n, o). The goal of this note is to prove the following result, which answers part of [6, Quest. C.1] and says that the convex function Λ has a second order Taylor expansion at 0 whose second derivative is the variance in the central limit theorem:

Theorem 1.1. - Let (X, d) be a separable geodesic Gromov-hyperbolic space and µ a non-elementary probability measure on Isom(X). Suppose that µ has a finite exponential moment, i.e., for some α > 0, ∫ e^{α d(g·o, o)} dµ(g) < +∞. Then, we have

Λ(λ) = ℓ_µ λ + (σ_µ²/2) λ² + o(λ²) as λ → 0,

where σ_µ² is the variance in the central limit theorem (1.2).

The proof uses extensively the martingale approach developed in this context by Benoist-Quint [2, 3]. The martingale decomposition proved in these works allows us to reduce the study of Λ near zero to the study of the limit Laplace transform of a martingale induced by an i.i.d. random walk on the group Isom(X). Once this reduction is done, the proof is divided into two parts: proving the lower bound, i.e., lim inf_{λ→0} (Λ(λ) − λℓ_µ)/λ² ⩾ σ_µ²/2, and the upper bound, i.e., lim sup_{λ→0} (Λ(λ) − λℓ_µ)/λ² ⩽ σ_µ²/2. The proof of the lower bound is based on a new exponential submartingale transform that we establish in Proposition 2.2. The latter extends a classical result of Freedman [15] to the case of martingales with unbounded differences. The proof of the upper bound uses ideas from martingale concentration inequalities. Another important tool is large deviation estimates for the quadratic variation of our martingales.

(3) (Positivity of σ_µ) By an argument of Benoist-Quint [3], it follows from the expression (3.14) of σ_µ² that σ_µ > 0 if and only if µ is non-arithmetic (see Remark 3.7).
Using the convexity of the rate function proved in [6] and standard results from convex analysis, we deduce:

Corollary 1.3 (About the rate function). - Keep the assumptions of Theorem 1.1 and let I be the rate function in (1.3). Then, when σ_µ > 0, we have

I(ℓ_µ + t) = t²/(2σ_µ²) + o(t²) as t → 0.

Finally, we note that our results are also valid for the right random walk R_n = X_1 · · · X_n, since for every n ∈ N, L_n and R_n have the same distribution.
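The duality behind this can be sketched as follows (a heuristic computation ignoring the o(λ²) remainders, assuming σ_µ > 0): the Fenchel-Legendre transform of the quadratic model of Λ at 0 is the quadratic model of I at ℓ_µ.

```latex
I(\ell_\mu + t)
  \;=\; \sup_{\lambda}\bigl\{\lambda(\ell_\mu + t) - \Lambda(\lambda)\bigr\}
  \;\approx\; \sup_{\lambda}\Bigl\{\lambda t - \tfrac{\sigma_\mu^2}{2}\lambda^2\Bigr\}
  \;=\; \frac{t^2}{2\sigma_\mu^2},
```

the supremum being attained at λ* = t/σ_µ², which tends to 0 with t; this is why only the behaviour of Λ near the origin is relevant.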
Remark 1.4 (Role of hyperbolicity and possible extensions). - The key ingredients of the proof of our main result in which hyperbolicity plays a role are Lemma 3.1 (in combination with the solution of the cohomological equation (3.4)) and Lemma 3.8 (a qualitative form of which is also sufficient for our purposes). It may therefore be possible to establish this connection between large deviations and the central limit theorem in similar geometric settings (see for example [22], the recent monograph [8], and the articles [9, 10]).

The paper is organized as follows. In Section 2, we recall some preliminaries on submartingales and prove an exponential transform for submartingales. In Section 3, we recall basic definitions about Gromov-hyperbolic spaces and metric compactifications as well as results from the theory of random walks on hyperbolic spaces. In particular, we recall that d(z_n, o) − nℓ_µ is at bounded distance from a martingale and prove a large deviation estimate for the predictable quadratic variation of the latter. In Section 4 we prove Theorem 1.1 in its general form, Theorem 4.1, by treating separately the lower bound (Section 4.2) and the upper bound (Section 4.3). In Section 4.4, we deduce Corollary 1.3, and finally, we discuss some ensuing questions in Section 4.5.
Acknowledgements.-The authors would like to thank the anonymous referee for the careful reading of our paper and useful suggestions.

Preliminaries on martingales
In this section, we recall some preliminaries from the theory of martingales and prove a result about exponential martingale transforms that will play a crucial role in the proof of our main theorem.
Let us first fix our notation. We shall denote by F = (F_n)_{n∈N} an increasing sequence of σ-algebras (a filtration) on a fixed standard probability space Ω. Usually, we will consider the filtration to be fixed and omit it from the notation. The notation M = (M_n)_{n∈N} will be reserved for an adapted sequence of random variables that form either a martingale or a submartingale. Denoting by ∆M the sequence of differences given by ∆_n M := M_n − M_{n−1}, we recall that M being a submartingale means that for every n ∈ N, M_n is F_n-measurable, integrable, and satisfies E[∆_n M | F_{n−1}] ⩾ 0. In the sequel, unless otherwise stated, we take M_0 = 0 a.s. The predictable quadratic variation (or conditional quadratic variation) of the submartingale M is denoted by ⟨M⟩ = (⟨M⟩_n)_{n∈N}. Finally, the following special function defined on R will play a significant role:

f(λ) := e^{−λ} + λ − 1.

We start by recalling Freedman's submartingale transform, whose statement and proof strategy will be used in our generalization below.

Proposition 2.1 (Freedman). -
(1) Let X be an integrable random variable with E[X] = 0 (resp. E[X] ⩾ 0) and X ⩾ −1 (resp. |X| ⩽ 1) a.s. Then, for every λ ⩾ 0, we have E[e^{λX}] ⩾ e^{f(λ) Var(X)}.
(2) Let (M_n)_{n∈N} be a submartingale such that for every 1 ⩽ n ∈ N, |∆_n M| ⩽ 1 almost surely. Then for every λ ⩾ 0, the sequence of random variables exp(λ M_n − f(λ) ⟨M⟩_n) is a submartingale with respect to the same filtration.
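Taking f(λ) = e^{−λ} + λ − 1 (an assumption of this sketch: this is the form consistent with the two properties used repeatedly below, namely that x ↦ f(x)/x² is decreasing and f(x) ⩽ x for x ⩾ 0), the one-variable inequality of statement (1) can be checked numerically on mean-zero two-point laws (illustrative, not from the paper):

```python
import math

def f(lam):
    # Assumed form of the special function (an assumption of this sketch):
    # f(lam) = exp(-lam) + lam - 1, so f(lam)/lam**2 is decreasing
    # and f(lam) <= lam for lam >= 0.
    return math.exp(-lam) + lam - 1

def check_bound(c, d, lam):
    """For X with P(X = -c) = d/(c+d) and P(X = d) = c/(c+d) we have
    E[X] = 0, Var(X) = c*d, and X >= -1 as soon as c <= 1. Returns both
    sides of the claimed bound E[exp(lam*X)] >= exp(f(lam)*Var(X))."""
    p = d / (c + d)
    lhs = p * math.exp(-lam * c) + (1 - p) * math.exp(lam * d)
    rhs = math.exp(f(lam) * c * d)
    return lhs, rhs

violations = sum(
    1
    for c in (0.2, 0.5, 1.0)          # c <= 1 so that X >= -1
    for d in (0.01, 0.5, 2.0)
    for lam in (0.1, 1.0, 2.0, 3.0)
    if check_bound(c, d, lam)[0] < check_bound(c, d, lam)[1] - 1e-12
)
```

The bound is nearly saturated for c = 1 and small d, which is precisely the two-point extremal family used in the proofs below.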
We note that the second statement above is a consequence of the first one. The following result provides a generalization of Proposition 2.1 to submartingales whose increments possess a finite exponential moment.

Proposition 2.2 (Submartingale transform). - Let (M_n)_{n∈N} be a submartingale. Suppose that there exists a constant α > 0 such that for every n ∈ N, we have E[e^{α|∆_n M|}] < +∞. Then, given any a > 0, for every λ > 0 small enough, the sequence of random variables

exp(λ M_n − (f(λa)/a²) G^a_n)

is a submartingale with respect to the same filtration.
This is an extension (to unbounded differences) of Freedman's submartingale transform from his seminal work [15]. Indeed, if the difference sequence ∆_n M satisfies |∆_n M| ⩽ 1 a.s., the transform in the previous result boils down to Proposition 2.1. On the other hand, it applies, for instance, when there exists a constant α > 0 such that for every n ∈ N, E[e^{α|∆_n M|} | F_{n−1}] < +∞. This will be the case in our application. The counterparts of Proposition 2.2 for supermartingale transforms were obtained by Dzhaparidze-van Zanten [13] (see also Fan-Grama-Liu [14]).
Proof. - Let a > 0. By the finite exponential moment hypothesis on (M_n)_{n∈N}, it is clear that for every λ > 0 small enough and every n ∈ N, exp(λ M_n − (f(λa)/a²) G^a_n) is F_n-measurable and integrable. Therefore, by expanding the conditional expectation, one sees that it is enough to show the one-step inequality (2.2) below for any integrable random variable X with E[X] ⩾ 0 and any λ > 0. Denote by ν the distribution of X.
- Case 1: E[X] = 0 and ν is supported on two points −c and d with c, d > 0 and both c, d ⩽ a. Let r := max{c, d}. Since E[X] = 0 and X ⩾ −r almost surely, part (1) of Proposition 2.1 (applied to X/r and to λr) yields E[e^{λX}] ⩾ e^{f(λr) Var(X)/r²}. Since the function x ↦ f(x)/x² is decreasing on R and since r ⩽ a, we deduce that f(λr)/r² ⩾ f(λa)/a², and (2.2) follows.
- Case 2: E[X] = 0, ν is supported on exactly two points −c and d, and we are not in Case 1. By Jensen's inequality, we obtain (2.3). If both c, d ⩾ a, then the right-hand side of (2.2) is equal to 1 and hence (2.2) holds in view of (2.3). So, since we are also not in Case 1, we can suppose that either c > a and d < a, or d > a and c < a. Let us treat the case c > a and d < a. By the assumption on c, d and (2.3), we obtain (2.4); on the other hand, we have (2.5), and (2.2) follows from combining (2.4) and (2.5). The case d > a and c ⩽ a can be treated similarly.
- Case 3: E[X] ⩾ 0 and ν is supported on two points {−c, d} with c, d > 0. We study the behavior of the left-hand side and the right-hand side of (2.2) as we vary ν subject to the condition E[X] ⩾ 0, while fixing a, c, d, λ. Since ν is supported on two points, it is enough to track these quantities as the weight β := ν({d}) varies. The required monotonicity follows from the facts that the function x ↦ f(x)/x² is decreasing and that f(x) ⩽ x for every x ⩾ 0. This concludes the proof of (2.2) in this case.
- Case 4: here we treat the general case (cf. the proof of [15, Prop. 3.6]). Since E[X] ⩾ 0, we can find a family (ν_α)_{α∈I} of probability measures, each supported on two points −c_α ⩽ 0 and d_α > 0 and with expectation ⩾ 0, and a probability measure θ on I such that ν = ∫ ν_α dθ(α). The claimed inequality then follows, where we applied (2.2) to each probability measure ν_α in the second inequality and Jensen's inequality in the third. □

Random walks on hyperbolic spaces
3.1. Preliminaries on hyperbolic spaces. - Let us first fix our notation. Let (X, d) be a geodesic metric space. Fix a base point o ∈ X. Recall that (X, d) is said to be δ-hyperbolic (where δ ⩾ 0) if for every x, y, z, o ∈ X we have (x|y)_o ⩾ (x|z)_o ∧ (z|y)_o − δ, where (·|·)_· is the Gromov product given by (x|y)_o = (1/2)(d(x, o) + d(y, o) − d(x, y)). For simplicity, we will often omit the basepoint o from the notation. We recall that this category of metric spaces comprises many usual spaces: trees, classical hyperbolic spaces, Cayley graphs of fundamental groups of compact surfaces of genus ⩾ 2. We recall that the definition of hyperbolicity is equivalent to geodesic triangles being thin. We refer to [11] for general properties of these spaces. Denote by G := Isom(X) the group of isometries of the metric space (X, d). The displacement of g ∈ G is by definition κ(g) := d(g · o, o). An element γ ∈ G is said to be loxodromic if for any x ∈ X, the sequence (γ^n x)_{n∈Z} constitutes a quasi-geodesic (see [11, Ch. 3]). Equivalently, γ is loxodromic if and only if it fixes precisely two points x^+_γ, x^−_γ on the Gromov boundary ∂X of X [11, Ch. 9 & 10]. Two loxodromic elements γ_1, γ_2 are said to be independent if the sets of fixed points {x^+_{γ_i}, x^−_{γ_i}} for i = 1, 2 are disjoint. Finally, a set S, or equivalently a probability measure with support S, is said to be non-elementary if the semigroup generated by S contains at least two independent loxodromic elements.

Now we recall the definition of the Busemann compactification of X (no hyperbolicity is needed in this part). Denote by Lip¹(X) the set of real-valued Lipschitz functions on X with Lipschitz constant 1, endowed with the topology of pointwise convergence. Fixing o ∈ X, for x ∈ X, let h_x be the function y ↦ d(x, y) − d(x, o). The closure of {h_x : x ∈ X} in Lip¹(X) is compact; it is called the horofunction compactification of X (see e.g. [25, Prop. 3.1]) and will be denoted by X^h. The map x ↦ h_x is injective on X (and an embedding when X is a proper metric space) and we usually identify X with its image in X^h. The horofunction boundary of X is defined as ∂^h X := X^h ∖ X. The group Isom(X) acts on X^h by homeomorphisms via (g · h)(y) := h(g^{−1} · y) − h(g^{−1} · o); this extends equivariantly the isometric action of Isom(X) on X. Observe that for every g ∈ G and x ∈ X^h, the Busemann cocycle σ(g, x) := h_x(g^{−1} · o) satisfies |σ(g, x)| ⩽ κ(g). Finally, we recall that the Gromov product can be extended to the whole Busemann compactification; in particular, one can infer that for x ∈ X^h and y ∈ X,

(3.3) h_x(y) = d(o, y) − 2(x|y)_o.
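As a concrete illustration (not from the paper): in the Cayley tree of the free group F2, which is 0-hyperbolic, the Gromov product (x|y)_e of two reduced words based at the identity equals the length of their longest common prefix, i.e., the distance from the basepoint to the geodesic [x, y]:

```python
def reduce_word(w):
    """Freely reduce a word over {'a','A','b','B'} ('A' = a^{-1}, 'B' = b^{-1})."""
    out = []
    for c in w:
        if out and out[-1] == c.swapcase():
            out.pop()
        else:
            out.append(c)
    return "".join(out)

def dist(x, y):
    # d(x, y) = |x^{-1} y| in the Cayley graph of F2
    x_inv = "".join(c.swapcase() for c in reversed(x))
    return len(reduce_word(x_inv + y))

def gromov_product(x, y):
    # (x|y)_e = (d(x, e) + d(y, e) - d(x, y)) / 2, basepoint e = identity
    return (len(x) + len(y) - dist(x, y)) // 2

def common_prefix(x, y):
    n = 0
    while n < min(len(x), len(y)) and x[n] == y[n]:
        n += 1
    return n
```

For any pair of reduced words the two quantities agree, which makes the four-point δ-inequality (with δ = 0) easy to visualize on the tree.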

Random walks.
- There are two main goals in this section. The first one (discussed in Section 3.2.1) is to recall a martingale decomposition (Lemma 3.1) of the Busemann cocycle along non-elementary random walks on Gromov-hyperbolic spaces, which is due to Benoist-Quint [2, 3] (see also an extension in [22]). We will use a slightly more general version of this worked out in [1]. The second goal (discussed in Section 3.2.2) is to prove Proposition 3.3 about large deviations of the predictable quadratic variation, and its consequence expressed in Corollary 3.9. The latter will be crucial in the proof of our main result.

3.2.1.
Benoist-Quint martingale decomposition. - Let µ be a probability measure on the isometry group G of X with countable support. Recall that it is said to have a finite exponential moment (resp. finite second moment) if there exists α > 0 such that ∫ e^{ακ(g)} dµ(g) < ∞ (resp. if ∫ κ(g)² dµ(g) < ∞). Let L_n = X_n · · · X_1 be the random walk on G and ℓ_µ the drift of the random walk on X defined in (1.1). Denote by F the natural filtration generated by the increments X_i. Finally, we denote by P_µ the Markov operator on the horofunction compactification X^h induced by the random walk on G, i.e., P_µ f(x) = ∫ f(g · x) dµ(g) for every bounded measurable function f on X^h. The starting point of the proof of Theorem 1.1 is the following.
Lemma 3.1. - Let µ be a non-elementary probability measure with finite second moment. Then, for every x ∈ X^h, there exists a martingale M_x = (M_{x,n})_{n∈N} with respect to the filtration F starting at the origin and such that for every n ∈ N,

σ(L_n, x) = nℓ_µ + M_{x,n} + O_{x,n}(1),

where O_{x,n}(1) is a random variable whose absolute value is bounded uniformly in x ∈ X^h and n ∈ N.

Proof. - When X is proper, Benoist-Quint [3, Prop. 4.6] showed that there exists a bounded measurable function ψ on the Busemann boundary ∂^h X solving the associated cohomological equation. It was then verified in [22] that this solution can be extended to the case when X is non-proper, and in [1] that ψ can be defined on the whole compactification X^h while remaining bounded. This is equivalent to finding a cocycle σ_0 : G × X^h → R with constant drift equal to ℓ_µ, i.e., ∫ σ_0(g, x) dµ(g) = ℓ_µ for every x ∈ X^h, such that the following identity holds for every g ∈ G and x ∈ X^h:

(3.4) σ_0(g, x) = σ(g, x) + ψ(g · x) − ψ(x).

Let then

(3.5) M_{x,n} := σ_0(L_n, x) − nℓ_µ.

The cocycle property and the constant drift property of σ_0 imply that M_x := (M_{x,n})_{n∈N} is a martingale with respect to the filtration F, which finishes the proof.
Combining this with the boundedness of ψ, we obtain the existence of some C ⩾ 0 such that for every n ∈ N and every x ∈ X^h,

|σ(L_n, x) − nℓ_µ − M_{x,n}| ⩽ C.

From now on, for every x ∈ X^h we denote by M_x = (M_{x,n})_{n∈N} the martingale defined in the proof of Lemma 3.1, i.e., M_{x,n} = σ_0(L_n, x) − nℓ_µ. Many properties of a martingale are encoded in its different notions of quadratic variation. For instance, a martingale whose predictable quadratic variation (see below for the definition) is almost surely bounded satisfies a Bennett-Bernstein type concentration result (see [15] for the bounded difference case and [29, 13, 14] for the general case). Burkholder's inequalities [7] are another instance of the relevance of the quadratic variation in studying martingales.

3.2.2.
Large deviation estimate for the predictable quadratic variation of M_{x,n}. - We now proceed with the second goal of Section 3.2, namely proving Proposition 3.3 below and deducing Corollary 3.9. We first give some observations and definitions regarding the martingale (M_{x,n})_{n∈N} introduced in Section 3.2.1. The martingale difference of M_x is given by

(3.6) ∆_j M_x = σ_0(X_j, Z_{x,j−1}) − ℓ_µ,

where Z_{x,j} := L_j · x is the Markov chain on X^h induced by the random walk on G and starting at x. We recall that the (predictable) quadratic variation ⟨M_x⟩ of M_x is the unique increasing predictable process vanishing at 0 such that M_x² − ⟨M_x⟩ is a martingale. We now come to the main result of this section. Its statement contains the expression

σ_µ² := ∫_{X^h} ∫_G (σ_0(g, x) − ℓ_µ)² dµ(g) dν(x),

where ν is any µ-stationary probability measure on X^h; we will see that the integral does not depend on ν. This constant σ_µ² is also the variance appearing in the central limit theorem (1.2) (see the proof of [3, Th. 4.7.b] or [22, Th. 1.3]).

Proposition 3.3 (Large deviation estimates for the quadratic variation). - Let µ be a non-elementary probability measure with finite second moment. Then, for every ε > 0,

lim sup_{n→∞} (1/n) log sup_{x∈X^h} P(|⟨M_x⟩_n/n − σ_µ²| > ε) < 0.

To prove this result, we first observe that the statement can be reformulated as one about large deviations of an additive functional of a Markov chain. Indeed, for x ∈ X^h, defining

(3.9) ϕ(x) := ∫_G (σ_0(g, x) − ℓ_µ)² dµ(g),

the expression (3.7) shows that ⟨M_x⟩_n = Σ_{j=1}^{n} ϕ(Z_{x,j−1}). Benoist-Quint showed a large deviation estimate for functionals along Markov chains [2, Prop. 3.1], which is a quantitative refinement of Breiman's law of large numbers. In the aforementioned paper, the authors work with continuous functions in the framework of Markov-Feller operators on compact metric spaces. However, in the generality that we work in, we were not able to prove the continuity of ϕ. Note that by the expression (3.4) of the cocycle σ_0, the continuity of ϕ would follow from the continuity of the Gromov product on the Busemann compactification X^h. To our knowledge, the latter is known in familiar cases, including trees and classical hyperbolic spaces, but not in our generality (note that by [27, §10] the Gromov product on the Busemann compactification of a general metric space X may fail to be continuous even if X is proper and geodesic). To overcome this issue, we will adapt the statement of Benoist-Quint by relaxing the continuity assumption.
The first step is the observation that the relevant centered sequence is a bounded martingale difference sequence with respect to the filtration F, so that (3.11) follows from the Azuma-Hoeffding concentration inequality for martingales with bounded differences.
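The Azuma-Hoeffding inequality invoked here states that if (M_n) is a martingale with M_0 = 0 and |∆_i M| ⩽ c almost surely, then P(|M_n| ⩾ t) ⩽ 2 exp(−t²/(2nc²)). A quick sanity check (an illustrative simulation, not from the paper) on the simplest case, a ±1 random walk:

```python
import math
import random

def azuma_check(n=200, trials=5000, t=30.0, c=1.0, seed=0):
    """Empirically compare P(|M_n| >= t) with the Azuma-Hoeffding bound
    2*exp(-t^2/(2*n*c^2)), for M_n a sum of n i.i.d. +/-1 increments."""
    rng = random.Random(seed)
    hits = sum(
        1
        for _ in range(trials)
        if abs(sum(rng.choice((-1, 1)) for _ in range(n))) >= t
    )
    empirical = hits / trials
    bound = 2 * math.exp(-t * t / (2 * n * c * c))
    return empirical, bound

emp, bnd = azuma_check()  # the empirical tail probability stays below the bound
```

The bound is not sharp (for this walk the true tail is Gaussian with variance n), but it has the exponential-in-n decay that drives the estimates of this section.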
In the second step, we show that (1/n) Σ_{j=1}^{n} ϕ(Z_j) concentrates around the Cesàro average (1/m) Σ_{ℓ=1}^{m} P^ℓ ϕ(Z_j); more precisely, for every ε′ > 0, n, m ∈ N, and x ∈ X, estimate (3.12) holds, where F is the filtration induced by the Markov chain. For each k ∈ {0, . . . , ℓ − 1}, we apply (3.11) to the sequence of random variables {ξ_{j,k} = E_x[ϕ(Z_j) | F_{j−k}] : j ∈ N}, which are adapted to the filtration {F_{j−k} : j ∈ N} and bounded by ∥ϕ∥_∞. Combining the resulting ℓ estimates, and noticing that E_x[ϕ(Z_j) | F_{j−ℓ}] = P^ℓ ϕ(Z_{j−ℓ}), the previous estimate gives (after killing the boundary terms using the boundedness of ϕ) the desired bound. Estimate (3.12) immediately follows.
Finally, let ε > 0 be given. The uniform convergence (3.10) yields an integer m_0 such that P_x-almost surely, for every j ∈ N, estimate (3.13) holds. Plugging (3.13) into (3.12) with m = m_0 and ε′ = ε/3m_0 gives some constant C(ε) > 0, valid for every n ⩾ 6 m_0 ∥ϕ∥_∞/ε.

If E is a compact metric space, P a Markov-Feller operator and ϕ a continuous function which has a unique average with respect to stationary measures on E, then (3.10) is fulfilled. As mentioned earlier, this is the case, for instance, for random walks on trees, classical hyperbolic spaces, and also for strongly irreducible and proximal random walks on projective spaces (see for instance [4]).
We now check that (3.10) is satisfied for our function ϕ defined in (3.9) and the Markov operator P = P_µ of the Markov chain on X^h induced by the random walk on G (see Section 3.2).
Lemma 3.6. - Let ϕ : X^h → R be as defined in (3.9). Then the sequence of functions f_n := (1/n) Σ_{j=0}^{n−1} P_µ^j ϕ converges uniformly on X^h to a constant σ_µ² ⩾ 0. The limit σ_µ² can be expressed as

(3.14) σ_µ² = ∫_{X^h} ϕ dν,

where ν is any µ-stationary measure on X^h.

Here, a probability measure µ on Isom(X) is said to be non-arithmetic if there exist n ∈ N and g, g′ ∈ supp(µ^{*n}) such that τ(g) ≠ τ(g′), where τ denotes the translation distance, τ(g) = lim_{n→∞} κ(g^n)/n.
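As an illustration of non-arithmeticity (not from the paper): for the free group F2 acting on its Cayley tree, τ(g) = lim κ(g^n)/n equals the length of the cyclic reduction of g, so any measure whose support contains, say, a and ab (with τ = 1 and τ = 2 respectively) is non-arithmetic. A minimal sketch:

```python
def reduce_word(w):
    """Freely reduce a word over {'a','A','b','B'} ('A' = a^{-1}, 'B' = b^{-1})."""
    out = []
    for c in w:
        if out and out[-1] == c.swapcase():
            out.pop()
        else:
            out.append(c)
    return "".join(out)

def translation_length(w):
    """tau(g) = lim_n |g^n|/n; on a tree this is the length of the cyclic
    reduction of the reduced word representing g (0 exactly for the identity)."""
    w = reduce_word(w)
    while len(w) >= 2 and w[0] == w[-1].swapcase():
        w = w[1:-1]          # strip a conjugating letter from both ends
    return len(w)
```

For example, translation_length("aBA") is 1, since aBA is conjugate to the single generator B; the limit definition can be checked directly because powers of a cyclically reduced word do not cancel.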
The proof of the previous lemma is based on showing that f_n(x) − f_n(y) converges uniformly to zero (see (3.17)), which forces the limit to be the average σ_µ² defined in (3.14). To prove this, we express f_n(x) as the variance of M_{x,n}/√n (see (3.18)). Using Burkholder's inequalities, the proof boils down to showing deviation inequalities for σ(L_n, x) − σ(L_n, y), uniformly in x, y ∈ X^h (see (3.22)). For the latter fact, we will use the following lemma, which is a direct consequence of the uniform pointwise deviation estimates given in [6, Prop. 2.12].

Lemma 3.8. - There are constants C, β > 0 such that for any k ∈ N and any x ∈ X^h, R > 0, we have

(3.15) P(κ(L_k) − σ(L_k, x) > R) ⩽ C e^{−βR}.

Proof. - Notice that for g ∈ Isom(X) and x ∈ X^h, by (3.3) we have κ(g) − σ(g, x) = 2(g^{−1}·o | x). In particular, when x ∈ X, the statement precisely corresponds to [6, Prop. 2.12] applied with the image μ̌ of µ under the map g ↦ g^{−1} on Isom(X).
To extend it to X^h, given x ∈ X^h, let (x_n) be a sequence in X such that x_n → x in X^h.
By continuity of σ(g, ·), we have κ(g) − σ(g, x) = lim_{n→∞} 2(g^{−1}·o | x_n). Therefore, given R > 0, the corresponding bound holds for every k ∈ N. Denoting by h_n(·) the map g ↦ 1_{(g·o | x_n) > R/2}, by (3.15) we obtain the required estimate, where we used dominated convergence in the last equality. Hence the statement follows from [6, Prop. 2.12]. □

Proof of Lemma 3.6. - First, we reduce the problem to showing that

(3.16) f_n(x) − f_n(y) → 0 as n → ∞,

uniformly in x and y in X^h. Indeed, let us assume for a while that (3.16) holds. Fix any µ-stationary measure ν on X^h (the latter exists by compactness of X^h). Since ν is µ^{*n}-stationary for every n ∈ N, we deduce that ∫ f_n dν = σ²_{µ,ν} for every n ∈ N. Let ε > 0 and y ∈ X. We can find n_0 depending only on ε such that for every n ⩾ n_0 and every x ∈ X^h, f_n(x) − ε ⩽ f_n(y) ⩽ f_n(x) + ε. Integrating both sides with respect to dν(x), we obtain that |f_n(y) − σ²_{µ,ν}| ⩽ ε for every n ⩾ n_0, concluding the proof of the uniform convergence of the sequence of functions (f_n)_{n∈N} towards σ²_{µ,ν}. This also shows that σ²_{µ,ν} is independent of the choice of the stationary measure ν. From now on, we focus on showing the convergence (3.16), uniformly in x, y ∈ X^h.
Since (M²_{x,n} − ⟨M_x⟩_n)_{n∈N} is a martingale starting at zero, we have E[⟨M_x⟩_n] = E[M²_{x,n}] for every n ∈ N, so that by (3.17):

(3.18) f_n(x) = E[(M_{x,n}/√n)²].

Let us check that the sequence ((M_{x,n}/√n)²)_{n∈N} is bounded in L^p for every p > 1, and hence in particular uniformly integrable. Indeed, by Burkholder's inequality ([7, Th. 9]), we have for every k > 2 the estimate (3.19), where [M_x] denotes the quadratic variation of M_x. By Jensen's inequality, we have (3.20). Remembering that σ(g, x) = σ_0(g, x) − ψ(g · x) + ψ(x), that |σ(g, x)| ⩽ κ(g), and that ψ is bounded on X^h, we get (3.21) for every n ∈ N. Since the X_i's have the same distribution, plugging (3.21) and (3.20) into (3.19) yields a bound whose right-hand side is finite (since µ has a finite moment of every order k > 2) and depends neither on n nor on x, showing the boundedness in L^{k/2} of (M_{x,n}/√n)², uniformly in n and x.
Let now ε > 0. It follows from the uniform integrability of the family (M_{x,n}/√n) that there exists L(ε) > 0 such that the corresponding truncation bound holds for every n ∈ N and x, y ∈ X^h. Using Lemma 3.8, together with the fact that σ differs from σ_0 by a bounded function on X^h, we obtain some T(ε) > 0 such that (3.22) holds for every n ∈ N and x, y ∈ X^h. Hence, if A_{x,y,n} denotes the corresponding exceptional event, we have P(A_{x,y,n}) < 2ε for every n ∈ N and x, y ∈ X^h. Now we write the difference f_n(x) − f_n(y) as a sum a_{x,y,n} + b_{x,y,n}, splitting according to the event A_{x,y,n}. Let us estimate a_{x,y,n}. By the Cauchy-Schwarz inequality, we have for every n ∈ N a bound with a constant C_4 > 0 independent of n, x and ε, guaranteed by the uniform boundedness in L^4 shown at the beginning of the proof. Finally, we estimate b_{x,y,n}. Since the function x ↦ x² is uniformly continuous on bounded sets, from the definition of the event A^c_{x,y,n} we deduce that b_{x,y,n} < ε for every n ⩾ n_0(ε) and x, y ∈ X^h. Hence for n ⩾ n_0(ε) the desired smallness holds, which finishes the proof of the uniform convergence (3.16). □

We end this section with the following consequence of Proposition 3.3. In the statement below, for every x ∈ X^h, we use the transform G^a_n introduced in (2.1), associated to the martingale M_x. To ease the notation, we omit the dependence on x in G^a_n.
Corollary 3.9. - Suppose that µ has a finite exponential moment. Then for every x ∈ X^h, we have:

Proof. - Let x ∈ X^h. The result will follow from the Cauchy-Schwarz inequality and the two estimates (3.23) and (3.24).
(i) We start by proving (3.23). To ease the notation, set Y_n := ⟨M_x⟩_n − nσ_µ². Noticing that Y_n ⩾ −nσ_µ² and applying Proposition 3.3, we get the announced bound for every n ⩾ n_0(ε). Keeping ε and λ > 0 (small enough) fixed, we let n → +∞. Since α(ε) > 0, we then let λ → 0⁺ (while keeping ε fixed), and finally ε → 0. This shows (3.23).
(ii) Finally, we show (3.24). Using the expression (2.1) of G^a_n, we reduce to estimating its summands. Observe that by the expression (3.6) of our martingale difference and by the decomposition (3.4), we have a.s., for every i ∈ N, a pointwise bound on the summands. Since the ζ_i's have the same distribution and the constant h_µ(a) is independent of i, and since µ has a finite second moment, we deduce the required convergence. On the other hand, for every a > 0, the random variables aζ_i have a finite exponential moment. This concludes the proof of (3.24) and hence of the corollary. □

Proof of the main result
Having established the submartingale transform of Section 2 and the exponential decay of large deviation probabilities of the predictable quadratic variation of Section 3.2, we are now ready to give the proofs of Theorem 1.1 and Corollary 1.3. In fact, we will prove a slightly more general version, given by Theorem 4.1 below.

Statement of the main result.
- To state the more general version of Theorem 1.1, we recall and introduce some notation. We are given a separable geodesic Gromov-hyperbolic space X with a fixed base point o ∈ X. The Busemann cocycle σ : Isom(X) × X^h → R with respect to the base point o is as defined in Section 3.1. Given a countably supported probability measure µ on Isom(X) and x ∈ X^h, we define the upper Λ⁺_x and lower Λ⁻_x limit Laplace transforms as

Λ⁺_x(λ) := lim sup_{n→∞} (1/n) log E[e^{λ σ(L_n, x)}] and Λ⁻_x(λ) := lim inf_{n→∞} (1/n) log E[e^{λ σ(L_n, x)}].

Whenever µ has a finite exponential moment, both functions take values in R in a neighborhood of 0 ∈ R.
We will omit the sub/super-scripts when x ∈ X: indeed, for every x ∈ X, the upper and lower transforms coincide and do not depend on the point (Λ⁺_x = Λ⁻_x = Λ⁺_y = Λ⁻_y for all x, y ∈ X). This common function Λ = Λ_o is the notation used in Theorem 1.1, where we work with the basepoint x = o ∈ X.
Theorem 4.1. - Let (X, d) be a separable geodesic Gromov-hyperbolic space and µ a non-elementary probability measure on Isom(X). Suppose that µ has a finite exponential moment. Then for every x ∈ X^h,

Λ^±_x(λ) = ℓ_µ λ + (σ_µ²/2) λ² + o(λ²) as λ → 0.

Proof of Proposition 4.2. - Given a probability measure µ as in the statement and x ∈ X^h, let M_x be the martingale given by Lemma 3.1. It satisfies σ(L_n, x) = nℓ_µ + M_{x,n} + O_{x,n}(1) for every n ∈ N, where O_{x,n}(1) is a random variable that is bounded (in absolute value) uniformly in x ∈ X^h and n ∈ N. Let σ_µ² > 0 be as defined in (3.14), and let x ∈ X^h be fixed for the rest of the proof. For every λ ∈ R, we have the comparison of Laplace transforms, where we used the fact that the random variables O_{x,n}(1) are bounded below and above uniformly in n ∈ N. Notice that since µ has a finite exponential moment, for every λ in a neighborhood of 0 ∈ R (independent of x ∈ X^h), the last quantity in the above display is finite.
The inequality in the other direction is proved in precisely the same way, replacing the martingale M_x by the martingale −M_x and using the fact that both martingales have the same transforms G^a. This completes the proof of Proposition 4.2.
The proof is based on showing that, for large n ∈ N, the random variable (1/n) σ(L_n, x) − ℓ_µ has a subgaussian behaviour in a neighborhood of 0. This is shown in the following proposition, which controls the limit Laplace transform of the sequence of random variables (1/n) σ(L_n, x) − ℓ_µ. The proof is based on the martingale decomposition given in Lemma 3.1 and standard techniques for concentration results for martingales (in particular [31, Th. 2.19]). With the notation of Section 3, the main tool for the proof of Proposition 4.3 is the following.
Proposition 4.4. - Keep the assumptions of Theorem 4.1 and let v(µ) be the quantity defined in (4.5). Then there exists C > 0 such that for every ε > 0, there exists b > 0 such that for every |λ| < v(µ)/b, every n ∈ N and every x ∈ X^h,

E[e^{λ(σ(L_n, x) − nℓ_µ)}] ⩽ exp((v(µ) + ε) nλ²/2 + C).

This proposition will yield (4.4) with σ_µ² replaced by the larger quantity v(µ). To obtain (4.4), we will use an acceleration technique speeding up the random walk; see the proof of Proposition 4.3.
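The way such a subgaussian bound feeds into the expansion can be sketched as follows (the bound is written here in the schematic form suggested by the statement; the exact right-hand side is not fully legible in this extraction):

```latex
\frac{1}{n}\log \mathbb{E}\bigl[e^{\lambda(\sigma(L_n,x)-n\ell_\mu)}\bigr]
\le \frac{(v(\mu)+\varepsilon)\lambda^2}{2} + \frac{C}{n}
\quad\Longrightarrow\quad
\Lambda_x^{+}(\lambda) - \lambda\ell_\mu \le \frac{(v(\mu)+\varepsilon)\lambda^2}{2},
```

after letting n → ∞; letting ε → 0 and dividing by λ² then gives lim sup_{λ→0}(Λ(λ) − λℓ_µ)/λ² ⩽ v(µ)/2, and the acceleration argument of Proposition 4.3 replaces v(µ) by σ_µ².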
We now proceed with proving Proposition 4.4. The proof is based on the following control of the conditional expectation of the martingale differences ∆M^x:

Lemma 4.5. — For every ε > 0, there exists a constant b > 0 such that for every |λ| < b, n ∈ N and x ∈ X_h, the following inequality holds almost surely:

Proof. — By expanding the expression (3.5) of the martingale M^x and taking conditional expectations, it suffices to show that for every ε > 0, there exists a constant b > 0 such that for every ξ ∈ X_h and |λ| < b. Using the exponential moment assumption on µ, let α > 0 be such that ∫ e^{ακ(g)} dµ(g) < ∞.
Remark 4.6. — Given a martingale M with unbounded differences, controlling various quantities involving the conditional expectations of the martingale difference sequence ∆M is generally an important step in proving concentration results for the martingale M; see the works of de la Peña [30], Dzhaparidze-van Zanten [13], Fan-Grama-Liu [14] and Liu-Watbled [24], who prove Bennett-Bernstein type concentration inequalities generalizing results of Freedman [15] to the case of unbounded differences. Proposition 4.4 avoids using these more sophisticated results thanks to Lemma 4.5 which, exploiting the special form of our martingales (namely, coming from an i.i.d. random walk on a group), gives a deterministic bound for the exponential of the conditional expectation.
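The way such a deterministic bound feeds into a concentration estimate is the classical tower-property iteration. As a sketch, assume a bound of the form E[e^{λ∆M^x_n} | F_{n−1}] ⩽ exp((1+ε)λ²v(µ)/2) almost surely (the exact bound in Lemma 4.5 may differ). Then, since M^x_0 = 0,

```latex
\mathbb{E}\big[e^{\lambda M^x_n}\big]
  = \mathbb{E}\Big[e^{\lambda M^x_{n-1}}\,
      \mathbb{E}\big[e^{\lambda \Delta M^x_n}\,\big|\,\mathcal{F}_{n-1}\big]\Big]
  \;\le\; e^{\frac{(1+\varepsilon)}{2}\lambda^{2} v(\mu)}\,
      \mathbb{E}\big[e^{\lambda M^x_{n-1}}\big]
  \;\le\;\cdots\;\le\; e^{\frac{(1+\varepsilon)}{2}\lambda^{2}\,n\,v(\mu)}.
```

Iterating the conditional bound n times in this way is exactly what produces a subgaussian bound on the unconditional Laplace transform.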
We are now ready to give:

Proof of Proposition 4.3. — This yields (4.4) with σ²_µ replaced with the larger quantity v(µ). We now employ an acceleration trick. More precisely, consider, for every k ∈ N, the probability measure µ^{*k} (the distribution of L_k), which is a non-elementary probability measure with finite exponential moment. Denote by Λ(µ^{*k}, ·) the Laplace transform based at x = o for the µ^{*k}-random walk (L_{nk})_{n∈N}. In particular, Λ(µ, ·) = Λ(·). Applying (4.8) for the µ^{*k}-random walk, we deduce that for every k ⩾ 1, and Λ(µ^{*k}, ·) = kΛ(·). Hence, for every k ⩾ 1. It remains to check that (4.9). By the definition of v(µ^{*k}) given in (4.5) and using (3.18) (with the notation of Lemma 3.6), we get that for every k ⩾ 1. Finally, the uniform convergence given in Lemma 3.6 for the sequence (f_k)_{k∈N} implies (4.9) and finishes the proof of the proposition.

The result is true in that case. We therefore suppose σ²_µ > 0. To treat this case, we will use some standard terminology from convex analysis, for which we refer the reader to [21]. Let, as usual, Λ denote the limit Laplace transform of the sequence (1/n) d(L_n · o, o). Note that Λ is convex (as follows from a direct application of the Hölder inequality) and, thanks to the finite exponential moment assumption, it takes finite values on an interval of the form (−∞, α) with α > 0; hence it is continuous on this interval. Let Λ* be its Fenchel-Legendre transform. By Theorem 1.1,
This is in line with the recent work [1] where, under additional assumptions, the appearing constants are made explicit (e.g. relating to the spectral radius of the probability measure µ in the regular representation L²(G) of the isometry group G).

By Theorem 4.1, the functions Λ⁺_x and Λ⁻_x have the same derivatives at 0 for every x ∈ X_h. Moreover, it is not hard to see that Λ⁺_x = Λ⁻_y on [0, +∞) for every x, y ∈ X_h. These suggest the following questions:

Question 4.8. — Is it true that Λ⁺_x = Λ⁻_x for every x ∈ X_h? More importantly, does there exist a neighborhood of 0 such that Λ⁺_x = Λ⁺_y for every x, y ∈ X_h (and similarly for Λ⁻)?

The answer to Question 4.8 is positive for x, y ∈ ∂_h X in standard cases where an analytic approach can be implemented. These include random walks on free groups or on classical hyperbolic spaces H^n. Regarding the last part of the question, we note that there are simple examples showing that one cannot ask the functions Λ_x and Λ_y to coincide throughout the region where they are finite/well-defined: take for example the random walk on the group F₂ = ⟨a, b⟩ driven by the measure µ =

(1) These finite time estimates can then be used, with the acceleration trick, to prove that lim_{λ→0} I(ℓ_µ + λ)/λ² ⩾ 1/(2σ²_µ).
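As a concrete illustration of the F₂ example above, one can experiment numerically with the word-length process. The sketch below is our own illustration (not taken from the paper): it takes µ to be the uniform measure on {a, a⁻¹, b, b⁻¹}, for which the drift of the word length d(z_n, o) is classically ℓ_µ = 1/2.

```python
import random

def word_length(n, rng):
    """Length of the freely reduced word of a product of n uniform
    letters from {a, a^-1, b, b^-1} in F_2 = <a, b>.

    Letters are coded 0, 1, 2, 3 with inverse(g) = g ^ 1; since the
    letters are i.i.d., the word length of X_1 ... X_n has the same
    law as that of L_n = X_n ... X_1.
    """
    word = []  # the freely reduced word, kept as a stack
    for _ in range(n):
        g = rng.randrange(4)
        if word and word[-1] == g ^ 1:
            word.pop()        # free cancellation: last letter * g = e
        else:
            word.append(g)
    return len(word)

def estimate_drift(n=2000, trials=200, seed=0):
    """Monte Carlo estimate of the drift l_mu = lim |L_n|/n."""
    rng = random.Random(seed)
    return sum(word_length(n, rng) for _ in range(trials)) / (trials * n)

if __name__ == "__main__":
    # For the simple random walk on F_2 the word length is a nearest-
    # neighbour chain on N with up-probability 3/4 away from 0, so the
    # drift is 3/4 - 1/4 = 1/2 and the estimate should be close to 0.5.
    print(estimate_drift())
```

Replacing the uniform measure by a non-uniform one gives a quick empirical way to probe how ℓ_µ (and the variance in the CLT) depends on µ.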

Moreover, we note that, thanks to Benoist-Quint [3, §5], the definition of the variance σ²_µ given in (3.14) makes sense even under the finite first moment hypothesis, provided the isometry group Isom(X) acts cocompactly on X. This suggests the question of whether the second-order term in the second-order expansion of I below the drift vanishes when σ²_µ = ∞. Similar questions can be asked about the second-order expansion of the limit Laplace transform Λ below zero.
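For the reader's convenience, the formal mechanism transferring a second-order expansion of Λ at 0 to one of I at the drift is the following Fenchel-Legendre computation (a sketch; making the error terms rigorous requires the locally uniform control established in the text). Assume Λ(λ) = λℓ_µ + σ²_µλ²/2 + o(λ²) as λ → 0. Then, as t → 0,

```latex
\Lambda^{*}(\ell_\mu + t)
  = \sup_{\lambda}\Big(\lambda(\ell_\mu + t) - \Lambda(\lambda)\Big)
  = \sup_{\lambda}\Big(\lambda t - \tfrac{\sigma_\mu^{2}}{2}\lambda^{2} + o(\lambda^{2})\Big)
  = \frac{t^{2}}{2\sigma_\mu^{2}} + o(t^{2}),
% the supremum being attained near \lambda^{*} = t/\sigma_\mu^{2}.
```

This is why a second-order expansion of the limit Laplace transform at the origin is equivalent, via convex duality, to a second-order expansion of the rate function at ℓ_µ.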

Remark 1.2. — (1) (Busemann cocycle) A general version of Theorem 1.1 will be proved in Theorem 4.1, where the displacement d(z_n, o) is replaced by the Busemann cocycle σ(L_n, x) of L_n based at any point x in the horofunction compactification of X. See also Question 4.8 for an ensuing problem. (2) (Translation distance) Thanks to [6, Th. 1.3], when µ has bounded support, one can replace d(z_n, o) by τ(L_n) in (1.4), where τ(·) denotes the translation distance given for g ∈ Isom(X) by τ(g) = lim_{n→∞} (1/n) d(g^n · o, o).
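For completeness, the limit defining τ(g) exists by Fekete's subadditivity lemma: writing a_n = d(g^n · o, o) and using that g^n is an isometry,

```latex
a_{m+n} = d(g^{m+n}\cdot o,\, o)
  \;\le\; d(g^{m+n}\cdot o,\; g^{n}\cdot o) + d(g^{n}\cdot o,\, o)
  \;=\; d(g^{m}\cdot o,\, o) + d(g^{n}\cdot o,\, o)
  \;=\; a_m + a_n,
% so a_n/n converges to \inf_{n} a_n/n = \tau(g).
```

The same subadditivity argument shows that τ(g) does not depend on the choice of basepoint o.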

Remark 3.7. — It follows from (3.14) and the argument in the proof of [3, Th. 4.7.b] that σ_µ = 0 if and only if there exists a constant C > 0 such that for every n ∈ N and g ∈ supp(µ^{*n}), we have |κ(g) − nℓ_µ| ⩽ C. It follows that σ_µ > 0 if and only if µ is non-arithmetic.

The random variables ζ_i 1_{|ζ_i|>a} are i.i.d. Denote by ζ^a their common distribution and by Λ_{ζ^a} the Laplace transform of ζ^a. The latter is differentiable at 0 for every a > 0, since ζ_1 has finite exponential moment (because µ has finite exponential moment). It follows that
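The differentiability at 0 can be seen by dominated convergence. As a sketch, write ζ^a also for a random variable with that law, assume E[e^{α|ζ^a|}] < ∞ for some α > 0 (as follows here from the exponential moment assumption on µ), and take the convention Λ_{ζ^a}(λ) = log E[e^{λζ^a}] (with or without the logarithm, the derivative at 0 is the mean):

```latex
\Lambda_{\zeta^{a}}'(\lambda)
  = \frac{\mathbb{E}\big[\zeta^{a}\, e^{\lambda \zeta^{a}}\big]}
         {\mathbb{E}\big[e^{\lambda \zeta^{a}}\big]},
\qquad
\Lambda_{\zeta^{a}}'(0) = \mathbb{E}\big[\zeta^{a}\big],
% differentiation under the expectation being justified for |\lambda| \le \alpha - \delta
% by the domination |\zeta^{a}|\, e^{\lambda \zeta^{a}} \le C_\delta\, e^{\alpha |\zeta^{a}|}.
```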

Sections 4.2 and 4.3 are devoted to the proof of Theorem 4.1.

4.2. Proof of the lower bound. — Here we prove the following.

Proposition 4.2. — Keep the setting of Theorem 4.1. Then, for every x ∈ X_h, lim_{λ→0}

4.5. Concluding remarks and questions. — In this final part, we include two questions motivated by our results and make some brief comments on them.

4.5.1. Limit Laplace transform of the Busemann cocycle. — As a direct consequence of Theorem 4.1, the functions Λ⁺_x and Λ⁻_x have the same derivatives at 0 for every x ∈ X_h.
4.5.2. Second-order expansion below the drift without exponential moment. — The rate function I appearing in (1.3) for (1/n)κ(L_n) exists without any moment assumption [6, Th. 2.8]. Moreover, if µ fails to have finite exponential moment, then the rate function I vanishes on [ℓ_µ, +∞) (see [6, Rem. 3.2]). On the other hand, it follows from Gouëzel's [20, Th. 1.2] that I is positive on [0, ℓ_µ) when µ has finite first moment. This suggests the following question:

Question 4.9. — Suppose µ is a non-elementary probability measure with finite second moment. Is it true that