The Markov-quantile process attached to a family of marginals

Let µ = (µ_t)_{t∈R} be any one-parameter family of probability measures on R. Its quantile process (G_t)_{t∈R} : ]0, 1[ → R^R, given by G_t(α) = inf{x ∈ R : µ_t(]−∞, x]) ⩾ α}, is not Markov in general. We modify it to build the Markov process we call "Markov-quantile". We first describe the discrete analogue: if (µ_n)_{n∈Z} is a family of probability measures on R, a Markov process Y = (Y_n)_{n∈Z} such that Law(Y_n) = µ_n is given by the data of its couplings from n to n + 1, i.e. Law((Y_n, Y_{n+1})), and the process Y is the inhomogeneous Markov chain having those couplings as transitions. Therefore, there is a canonical Markov process with marginals µ_n that is as similar as possible to the quantile process: the chain whose transitions are the quantile couplings. We show that an analogous process exists for a continuous parameter t: there is a unique Markov process X with the measures µ_t as marginals that is a limit, for the finite-dimensional topology, of quantile processes in which the past is made independent of the future at finitely many times (many non-Markovian limits exist in general). The striking fact is that the construction requires no regularity for the family µ. We rely on order arguments, which seems to be completely new for this purpose. We also prove new results that the Markov-quantile process yields in two contemporary frameworks:
-- In case µ is increasing for the stochastic order, X has increasing trajectories. This is an analogue of a result of Kellerer dealing with the convex order, peacocks and martingales. Modifying Kellerer's proof, we also prove his result and ours simultaneously in this case.
-- If µ is absolutely continuous in the Wasserstein space P_2(R), then X is a solution of a Benamou-Brenier transport problem with marginals µ_t. It provides a Markov probabilistic representation of the continuity equation, unique in a certain sense.


Introduction
We prove four main theorems, stated and labelled as A, B, C and D in this introduction and proved in the order C, A, B, D in the paper; see also their interdependence in Figure 1 p. 7. Theorem A answers Problem 1.4 p. 3 below, and is a general theoretical result in Probability Theory: it builds a certain stochastic process with given marginals, the Markov-quantile process. Theorem B gives a convergence result towards it. Theorems C and D present applications of the Markov-quantile process to two other contexts (martingales and a theorem of Kellerer [33,34], and Optimal Transport), giving Problem 1.4 additional motivation along the way, see §1.2-1.3 and Figure 1. We also prove Theorems 2.26 and 5.20, linked with Theorems C and D respectively. Being a bit more technical, they are not stated in this introduction.
In this introduction we first give the notions strictly necessary to state Theorems A-B as quickly as possible in §1.1, then state Theorems C and D in §1.2-1.3. We slow the flow in §1.4 to give a qualitative insight into Problem 1.4 and its difficulties, which also shows why it is a natural problem in itself. We give in §1.5 the structure of the article and in §1.6 an index of our notation. This paper deals with Measure, Probability, and Transport Theories. To be accessible to a broad range of readers, we give the definitions of the more specific notions of each of these fields, or Reminders about them where needed.
1.1. Our results and their motivation. Take (E_τ)_{τ∈T} a (finite or not) family of measurable spaces; the product ∏_{τ∈T} E_τ is endowed with its cylindrical σ-algebra, generated by the preimages of the σ-algebras of the factors under the projections.
Reminder 1.1. A process is a family X = (X_τ)_{τ∈T} of measurable maps X_τ : Ω → E_τ, called random variables, defined on the same probability space (Ω, P). In this article, contrary to what may be considered usual, no measurability condition is required on Ω × T. For every T′ ⊂ T, (X_τ)_{τ∈T′} defines a map F_{T′} from Ω to ∏_{τ∈T′} E_τ. The law of (X_τ)_{τ∈T′} is the pushed-forward probability measure (F_{T′})_# P on ∏_{τ∈T′} E_τ, which is also called the marginal law of the measure (F_T)_# P on ∏_{τ∈T′} E_τ.

Now let µ_τ be a probability measure on E_τ, for each τ.

Notation 1.2. For every measurable space E, M(E) and P(E) are the spaces of measures and probability measures on E. If T′ ⊂ T, proj_{T′} is the projection ∏_{τ∈T} E_τ → ∏_{τ∈T′} E_τ; in case T′ = {τ_1, . . . , τ_m} is finite, proj_{τ_1,...,τ_m} means proj_{{τ_1,...,τ_m}}. When P ∈ P(∏_{τ∈T} E_τ) and s < t, P_s stands for (proj_s)_# P and P^{s,t} for (proj_{s,t})_# P, and Marg((µ_τ)_{τ∈T}) denotes {P ∈ P(∏_{τ∈T} E_τ) : ∀τ ∈ T, (proj_τ)_# P = µ_τ}. When not otherwise specified, what we call the marginals of P are its marginals P_s on a single factor.

Reminder 1.3. (a) If P ∈ Marg((µ_τ)_{τ∈T}), setting Ω := (∏_{τ∈T} E_τ, P) and X = (X_τ)_{τ∈T} := (proj_τ)_{τ∈T}, we get a process called the canonical process, of law P. For this reason, by an abuse of language (e.g., in the title of this article), we sometimes call a probability measure on a product space a process. For the same reason we may also see Marg((µ_τ)_{τ∈T}) as the set of the processes (X_τ)_{τ∈T} such that Law(X_τ) = µ_τ for all τ.
Here is our problem. It is stated for any family of measures, without any assumption of regularity in the parameter t. Problem 1.4. If µ = (µ t ) t∈R is a one-parameter family of probability measures on R, we want to build a measure MQ ∈ Marg(µ) that at once: (a) is Markov, (b) resembles as much as possible the quantile measure Q ∈ Marg(µ).
Let us explain those two points. Take P ∈ Marg((µ_t)_{t∈R}) and (X_t)_{t∈R} a process of law P; point (a) means:

(1) ∀s ∈ R, ∀t > s, Law(X_t | (X_u)_{u⩽s}) = Law(X_t | X_s),

where Law(X_t | (X_u)_{u⩽s}) is the law of X_t conditionally on the σ-algebra generated by the X_u for u ⩽ s. See also Definition 2.11, where the Markov property is introduced only through notions defined in this article. For point (b), Q ∈ Marg((µ_t)_{t∈R}) may be defined by an explicit construction (see Reminder 1.9 below) or implicitly, as the unique measure such that for all x ∈ R and all t > s, if X is a process of law Q:

(2) Law(X_t | X_s ⩽ x) = min {Law(Y_t | Y_s ⩽ x) : (Y_u)_{u∈R} ∈ Marg((µ_u)_{u∈R})},

where this minimum is with respect to the stochastic order:

Reminder 1.5. For µ, ν ∈ P(R), we write µ ⩽_sto ν if µ(]x, +∞[) ⩽ ν(]x, +∞[) for every x ∈ R, or equivalently if ∫ ϕ dµ ⩽ ∫ ϕ dν for every bounded increasing function ϕ.

The problem is that Q is not Markov in general, see Remark 1.8(a) a bit below. In view of its definition through (2), we seek some Markov process MQ, if it exists, such that:
- processes of law MQ satisfy a version of (2) among Markov processes,
- like for Q ∈ Marg((µ_t)_{t∈R}), if (µ_t)_{t∈R} is increasing for ⩽_sto, some processes (X_t)_{t∈R} of law MQ ∈ Marg((µ_t)_{t∈R}) are increasing, i.e. the functions t → X_t(ω) are,
- like for Q, the couplings (proj_{s,t})_# MQ of MQ have an increasing kernel (or briefly, MQ has increasing kernels), as follows (see Definition 3.11 for alternative definitions not resorting to conditional laws):

Definition 1.6. Take P ∈ Marg((µ_t)_t) and for s < t, set P^{s,t} = (proj_{s,t})_# P. We call kernel of P^{s,t} the data of the conditional measures Law(X_t | X_s = x), where X is a process of law P; just below we denote it by (P^{s,t}_x)_{x∈R}. We say that P^{s,t} has increasing kernel if, for every process (X_s, X_t) of law P^{s,t}, x ⩽ y ⇒ P^{s,t}_x ⩽_sto P^{s,t}_y, and that P has increasing kernels if every P^{s,t} has one.
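The stochastic order invoked here can be checked mechanically for finitely supported measures: µ ⩽_sto ν exactly when the cumulative distribution function of µ dominates that of ν pointwise. The following sketch is our own illustration of this standard fact, not a construction from the paper:

```python
def cdf(weighted_atoms, x):
    """F_mu(x) = mu(]-oo, x]) for a finitely supported measure,
    given as a list of (atom, weight) pairs."""
    return sum(w for a, w in weighted_atoms if a <= x)

def leq_sto(mu, nu):
    """mu <=_sto nu iff F_mu(x) >= F_nu(x) for all x.
    For step functions it suffices to check at the atoms."""
    points = [a for a, _ in mu] + [a for a, _ in nu]
    return all(cdf(mu, x) >= cdf(nu, x) for x in points)

mu = [(0, 0.5), (1, 0.5)]
nu = [(0, 0.25), (2, 0.75)]
print(leq_sto(mu, nu))  # True: nu puts more mass to the right
print(leq_sto(nu, mu))  # False
```

The equivalent characterization via bounded increasing test functions ϕ is less convenient computationally, which is why the pointwise comparison of distribution functions is used here.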
Our answer to Problem 1.4 is Theorem A below. Note that the following convention is used throughout. Convention 1.7. When we introduce finite sets {r_1, . . . , r_m} or m-tuples (r_k)_{k=1}^m of real numbers, we mean implicitly that r_1 < . . . < r_m.
Theorem A. Let (µ t ) t∈R be a family of probability measures on R.
(a) There exists a unique measure MQ ∈ Marg((µ_t)_{t∈R}) such that:
(i) MQ is Markov,
(ii) MQ has increasing kernels,
(iii) MQ has minimal couplings (alias transports) among the measures satisfying (i) and (ii), in the sense that it satisfies (2) where the minimum is taken among processes (Y_t)_t satisfying (i) and (ii).
It is also the unique process satisfying (i) above and:
(iv) each MQ^{s,t} is a limit of products of quantile couplings (Q^{s,t}_{[R]})_R, where for R = {r_1, . . . , r_m} ⊂ [s, t], Q^{s,t}_{[R]} is the product Q^{s,r_1}.Q^{r_1,r_2}. · · · .Q^{r_{m−1},r_m}.Q^{r_m,t} of the quantile couplings Q^{r_i,r_j} ∈ Marg(µ_{r_i}, µ_{r_j}).
Moreover:
(b) If (µ_t)_{t∈R} is increasing for the stochastic order, i.e. s ⩽ t ⇒ µ_s ⩽_sto µ_t, there exists a process X = (X_t)_{t∈R} : Ω → R^R of law MQ with increasing trajectories, i.e. such that t → X_t(ω) is an increasing function, for all ω ∈ Ω.
Informally, we may interpret Theorem A(a) as the following answer to Problem 1.4: MQ is the Markov process whose "infinitesimal transitions" are those of the quantile process. Besides, an immediate consequence of Theorem A(a) is the important point (b) of Remark 1.8.
Reminder 1.9 (The quantile process). The quantiles of a measure µ ∈ P(R) generalize the notion of the median, which is the quantile of level 1/2. The quantile of µ of level α is the smallest real number x_µ(α) such that µ(]−∞, x_µ(α)]) ⩾ α and µ([x_µ(α), +∞[) ⩾ 1 − α. The quantile process X = (X_τ)_{τ∈T}, defined on Ω = [0, 1] with the Lebesgue measure, is given by X_t(α) = x_{µ_t}(α), and we denote Law(X) by Q ∈ Marg((µ_t)_{t∈T}).

Remark 1.10 (Justification of the name "Markov-quantile"). While Properties (i) and (ii) of Theorem A(a) are satisfied by the product measure ⊗_{t∈R} µ_t (the law of the independent process), the quantile process Q satisfies (ii) and (iii), in the sense that it satisfies (ii) and its couplings Q^{s,t} are minimal among those of the measures satisfying (ii). In fact, Theorem A is constructive and builds MQ as a modification of Q; therefore we call this measure MQ the "Markov-quantile" measure attached to (µ_t)_{t∈R}.
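To make the quantile of Reminder 1.9 concrete, here is a minimal sketch of ours (not from the paper) computing x_µ(α) for a finitely supported measure: it returns the smallest atom x with µ(]−∞, x]) ⩾ α, which automatically also satisfies µ([x, +∞[) ⩾ 1 − α.

```python
def quantile(weighted_atoms, alpha):
    """Quantile of level alpha of a finitely supported probability
    measure, given as a list of (atom, weight) pairs: the smallest x
    with mu(]-oo, x]) >= alpha."""
    cumulative = 0.0
    for x, w in sorted(weighted_atoms):
        cumulative += w
        if cumulative >= alpha:
            return x
    return max(x for x, _ in weighted_atoms)  # guard against rounding

# Median (level 1/2) of mu = (1/4) delta_0 + (1/2) delta_1 + (1/4) delta_2:
mu = [(0, 0.25), (1, 0.5), (2, 0.25)]
print(quantile(mu, 0.5))   # 1
print(quantile(mu, 0.25))  # 0
print(quantile(mu, 0.8))   # 2
```

Sampling α uniformly on [0, 1] and returning quantile(µ_t, α) for each t is exactly the quantile process of Reminder 1.9 restricted to such measures: all times share the same quantile level α.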
In fact, a deeper convergence statement holds than that resulting from point (iv) above. Indeed we introduce the following notion of a measure in Marg((µ t ) t∈R ) "turned into a Markov law at a finite set of instants", denoted in a way that is consistent with the notation of Theorem A(iv).
Proposition/Notation 1.11. If P ∈ Marg((µ_t)_{t∈R}) and R ⊂ R is finite, there is a unique measure in Marg((µ_t)_{t∈R}), denoted by P_{[R]}, such that:
- P_{[R]} is the law of a family of variables (X_t)_{t∈R} that is "Markov at the instants of R", i.e. (1) holds for every s in the finite set R (instead of every real s),
- for the closure I of each connected component of the complement of R, (proj_I)_# P_{[R]} = (proj_I)_# P.
This proposition follows from the way we define P_{[R]} in Definition 4.18, using the catenation of transport plans given by Definition 2.8. We show:

Theorem B. There is an increasing sequence (R_n)_{n∈N} of finite subsets of R such that Q_{[R_n]} ∈ Marg((µ_t)_{t∈R}) converges weakly to MQ.

Reminder 1.12. A sequence (P_n)_n of (probability) measures on some measurable topological space E converges weakly to P if, for every bounded continuous function f, ∫ f dP_n → ∫ f dP. For E = R^R with the weak topology, this convergence amounts to the weak convergence of all finite marginals.

Remark 1.13. Of course, not every sequence (R_n)_{n∈N} is admissible for Theorem B. In fact, we prove a more precise version of Theorem B, see Theorem 4.21. It notably introduces the notion of essential atomic times of (µ_t)_t, which turn out to be the times that ∪_n R_n must contain for every sequence (R_n)_{n∈N} admissible for Theorem B; see also Remark 1.29 in this introduction.
Our problem: a classical type of question. The problem of defining a measure or a process P with given marginals and additional properties is a general problem that includes Problem 1.4 and has already been explored several times in pure and applied Probability Theory as well as in Analysis or Dynamics. Without claiming exhaustiveness on this rich topic we review some research streams and provide references.
A result related to Theorem A(b) is proved by Kamae and Krengel in [31]. There, the measures (µ_t)_{t∈R} lie in P(E), where E is a partially ordered Polish space. Assuming that the measures are in stochastic order, in a suitable sense, the authors prove that there exists an increasing process in Marg(µ). Other orders can be considered, together with expected properties of the processes; for E = R^d, Chapter 8 of [50] proposes plenty of orders. In Stochastic Analysis and Mathematical Finance, the topic of peacocks and their associated martingales is closely related to our problem. "Peacock" stands for PCOC: Processus Croissant pour l'Ordre Convexe (French for "process increasing for the convex order"). One aims at defining a martingale in Marg(µ), where µ = (µ_t)_{t∈R} are the marginals of some peacock, using various techniques. Most of the time the peacock belongs to a specific class, so the purpose is more specific than the work of Kellerer presented in Subsection 1.2. In most of this literature the martingales may or may not be Markov [18,41,29,20,23,25,45]. The papers by Lowther [40,39] on limits of diffusion processes for the finite-dimensional convergence allowed some authors to refocus on the Markov setting (see, e.g., [7,24,28]), rediscovering Kellerer's work along the way. Important examples in this topic are the fake Brownian motions, which are processes sharing some of the properties characterizing standard Brownian motion: they are continuous Markov martingales with marginals µ_t = N(0, t). See, e.g., [41,19,1,43,26,20] for constructions.
In Section 5 we will present a more analytic field related to Optimal Transport Theory: Benamou-Brenier's study of the incompressible Euler equation [10], and the transport representation of solutions of PDEs after Jordan, Kinderlehrer and Otto [27,44]. This transport formalism was thoroughly studied by Ambrosio, Gigli and Savaré [3], and continued, among others (see [17,51,5]), by Lisini [38] in metric spaces.

1.2. Relations to Kellerer's Theorem.
Forgetting about MQ itself, Theorem A(b) gives the following existence result. Corollary 1.14. If (µ_t)_{t∈R} ∈ P(R)^R is increasing for ⩽_sto, there exists a Markov process X = (X_t)_{t∈R} : Ω → R^R such that Law(X) ∈ Marg((µ_t)_t) and the trajectories t → X_t(ω) are increasing.
This extends to the case of the stochastic order a famous theorem of Kellerer on martingales and submartingales with given marginals [33,34]. Our Theorem C recovers, with a different proof, Kellerer's result, as well as (simultaneously) Corollary 1.14 on increasing processes. The proof of Theorem C is moreover completely independent of that of Theorem A. The method used to prove it also leads to an existence statement for certain Markov processes, Theorem 2.26 p. 20, omitted in this introduction. To state Theorem C we need to recall two definitions.

Definition 1.15. Two measures µ and ν on R, with finite first moments, are said to be in convex order ⩽_C, respectively in convex stochastic order ⩽_{C,sto}, if for every convex, respectively convex increasing, function ϕ:

(3) ∫ ϕ dµ ⩽ ∫ ϕ dν.

Notice that µ ⩽_sto ν if and only if (3) holds for (bounded) increasing functions ϕ. Now we define a martingale. We do it in the Markov framework (all we need), where this is a bit simpler, and add in (c) a terminology of our own.

Definition 1.16. (a) A measure P on (R^d)^2 is a martingale coupling if for every non-negative continuous bounded function f : R^d → R:

(4) ∫ f(x)(y − x) dP(x, y) = 0.

For T ⊂ R, a Markov measure P on (R^d)^T is a Markov martingale if for every {s, t} ⊂ T with s < t, the coupling (proj_{s,t})_# P is one.
(b) When d = 1, submartingale couplings and Markov submartingales are defined alike, the integral in (4) being non-negative instead of null.
(c) A measure P on R^2 is called an increasing coupling if P({(x, y) ∈ R^2 : x ⩽ y}) = 1, i.e. P = Law((X_i)_{i∈{1,2}}) where X_1 ⩽ X_2. For T ⊂ R, we say that a measure P on R^T has increasing couplings if every (proj_{s,t})_# P is one.

Remark 1.17. (a) If a measure P on R^R is the law of a process with increasing trajectories, as in Theorem A(b), P is in case (c) of Definition 1.16. Actually a classical type of reasoning shows the converse, see Lemma 1.20 below. So the result in Corollary 1.14 amounts to the existence of a Markov measure P ∈ Marg((µ_t)_t) with increasing couplings.
(b) Take T ⊂ R. If there exists P ∈ Marg((µ_t)_{t∈T}) as defined in case (a), (b) or (c) of Definition 1.16, then it is immediate that (µ_t)_t is, respectively, increasing for ⩽_C, ⩽_{C,sto} or ⩽_sto. When #T = 2, by Strassen's theory [52], the converse is true. More generally, it is also true when T = N; one can deduce it by a quite simple induction based on the Markov catenation (Definition 2.8).
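To make the two-marginal case of this remark concrete, here is a small numerical sketch of ours (an illustration, not a construction from the paper): µ = δ_0 and ν = ½(δ_{−1} + δ_1) are in convex order, and the coupling sending 0 to ±1 with probability ½ each is a martingale coupling, since its kernel has mean x at every point x.

```python
# A coupling P on R^2 is encoded by weights P[(x, y)]. It is a martingale
# coupling when the conditional mean of y given x equals x for every atom x
# of the first marginal (Definition 1.16(a), finitely supported case).
P = {(0.0, -1.0): 0.5, (0.0, 1.0): 0.5}  # couples delta_0 with (delta_{-1} + delta_1)/2

def is_martingale_coupling(P, tol=1e-12):
    first_marginal = {}
    conditional_sum = {}
    for (x, y), w in P.items():
        first_marginal[x] = first_marginal.get(x, 0.0) + w
        conditional_sum[x] = conditional_sum.get(x, 0.0) + w * y
    return all(abs(conditional_sum[x] - x * first_marginal[x]) <= tol
               for x in first_marginal)

print(is_martingale_coupling(P))                  # True
print(is_martingale_coupling({(0.0, 1.0): 1.0}))  # False: pushes all mass upward
```

Strassen's theorem asserts that such a martingale coupling exists precisely when the two marginals are in convex order; the code above only verifies the martingale property of a given coupling.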
Kellerer shows the converse in Remark 1.17(b) for ⩽_C and ⩽_{C,sto} in the much more delicate case T = R. Theorem 1.18 (Kellerer, [33, Theorem 3], [34]). If (µ_t)_{t∈R} is a family of probability measures on R, increasing for ⩽_C (or ⩽_{C,sto}), there is a Markov measure P ∈ Marg((µ_t)_t) which is a (sub)martingale.
Kellerer's Theorem remains unproved for vector-valued measures, see Open question 6.5.2. This is a major motivation to search for new methods to construct certain Markov processes, or to establish their existence, as is done to prove Theorems A and C. Following Kellerer's line of proof, but replacing one of its key lemmas by another one (see details below), we prove the following generalization of his theorem.
Theorem C. If (µ_t)_{t∈R} is a family of probability measures on R, increasing for ⩽_C, ⩽_{C,sto} or ⩽_sto, there is a Markov measure P ∈ Marg((µ_t)_t) which, respectively, is a martingale, is a submartingale, or has increasing couplings.
Using Remark 1.17(a), one sees that this theorem proves Corollary 1.14, which also follows from Theorem A, in another way.
Kellerer's proof uses a continuity result for certain kernels, recalled in Lemma 2.19. We replace it by Lemma 2.23, a continuity result for the increasing kernels of Definition 1.6. A form of Lemma 2.19 appears in every proof of Kellerer's theorem we know of [33,34,40,24,7], so in this respect our proof, resting on another type of kernels, is new. About Lemma 2.23, the following comments, which also appear in Figure 1, are in order.
-It is a significant result of this article; §3 is devoted to its proof.
-It is not used in the proof of Theorem A, so that the proofs of our Theorems A and C are really independent. They separately yield Corollary 1.14, i.e. the ⩽_sto case of Theorem C, which is new with respect to Kellerer's work.
-It plays a prominent role in Theorem B, see p. 50, and, as a consequence of it, in all the results of §5 concerning the representation of absolutely continuous curves of order two in Wasserstein space.
Remark 1.19. The Doob-Meyer decomposition theorem, writing certain submartingales as the sum of an increasing process and a martingale, is another reason why our generalization of Theorem 1.18 to ⩽_sto is a natural undertaking.
We also mention that Kellerer seems never to have considered the question of the extension to ⩽_sto in his papers. However, in [35], with applications in [36], he explored the related question of the existence of increasing couplings P ∈ Marg(µ, ν) that are as independent as possible, in a suitable sense.
Finally, here is the lemma announced in Remark 1.17(a), yielding Corollary 1.14 from Theorem C. It is proved p. 29 in §3.4. Lemma 1.20. If P is a probability measure on R^R with increasing couplings (see Definition 1.16(c)), there exists an increasing process X = (X_t)_{t∈R} : Ω → R^R, i.e. one whose functions t → X_t(ω) are increasing, such that P = Law(X).

1.3. Application to the continuity equation and its treatment in Optimal Transport Theory. Our Markov-quantile process provides in §5 a uniqueness statement in the context of the transport (i.e. continuity) equation. We introduce it briefly here and state Theorem D; in the introduction of §5 we give more details on the context of Optimal Transport, place and motivate our work within it, and introduce Theorem 5.20, absent from this general introduction. That theorem shows that the Markov-quantile process is the limit of processes built by an interpolation construction classical in Transport, and opens the question of a generalization of this construction to higher dimensions.
We stress that the Markov property was not involved up to now in the a priori rather analytic context of the transport equation. As we have seen in §1.2, considering this property stems from Kellerer's Theorem, which is nowadays mostly represented in Martingale Optimal Transport (and peacocks). This provides new results in Optimal Transport, where it is a novelty; the links between the two fields went, up to now, only the other way around.
In §5, we consider only the (nevertheless still rich) set of continuous curves (µ_t)_{t∈R} : R → P(R), where P(R) is endowed with a suitable topology, made precise in §5.
Then an inequality involving energies for curves in R^d and P(R^d) is proven in Remark 5.10: for every Γ ∈ P(C([0, 1], R^d)) with marginals (µ_t)_{t∈[0,1]},

(5) ∫_0^1 |µ̇_t|² dt ⩽ ∫_C ∫_0^1 |γ̇(t)|² dt dΓ(γ),

where |µ̇_t| denotes the metric speed of t → µ_t in the Wasserstein space P_2(R^d). In Theorem D below, the existence of Γ such that (5) is an equality is well-known. The theorem rather proves the existence and uniqueness of a Markov such measure.

Theorem D. Let µ = (µ_t)_{t∈[0,1]} be a curve which is absolutely continuous in the Wasserstein space P_2(R).
(a) (Existence of a Markov representation.) There exists a Markov probability measure Γ ∈ Marg_C(µ) for which (5) is an equality, and a nested sequence (R_n)_{n∈N} of finite subsets of ]0, 1[ such that Q_{[R_n]} converges to Γ in P(C).
(b) (Uniqueness) If Γ is as in (a) then its law is MQ.
Note that Theorem D relies on Theorems A and B and on Lemma 2.23, as represented in Figure 1.
1.4. A qualitative insight into Problem 1.4. Consider the family µ_t = λ_{[0,|t|]}, the uniform measure on [0, |t|] (with µ_0 = δ_0). The quantile trajectory (X_t(α))_{t∈R} associated with the level α ∈ [0, 1] on the probability space Ω = ([0, 1], λ) is t → α|t|. The process X is not Markov: at time t = 0, information on (X_t)_{t<0} improves our knowledge of (X_t)_{t>0}; actually here, it determines it completely. A modification makes X Markov. Namely, the catenation of the laws of (X_t)_{t⩽0} and (X_t)_{t⩾0}, making the past independent of the future at time 0, is the only possible answer to Problem 1.4: it is Markov (hence (X_t)_{t<0} and (X_t)_{t>0} are independent) and equal to the quantile process where the latter is Markov. Moreover it satisfies the properties of Theorem A, so it is the Markov-quantile process.
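As a sanity check of this example, here is a numerical sketch of ours (not from the paper), with µ_t the uniform measure on [0, |t|], consistent with the quantile trajectories t → α|t|: in the quantile process, X_{−1} determines X_1 exactly, while after the catenation at time 0 the two are independent, both constructions having the same marginals.

```python
import random

random.seed(0)
N = 10_000

# Quantile process for mu_t = Uniform[0, |t|]: X_t(alpha) = alpha * |t|.
# We record the pair (X_{-1}, X_1) for each sampled level alpha.
alphas = [random.random() for _ in range(N)]
quantile_paths = [(a * 1.0, a * 1.0) for a in alphas]

# Modification: past and future made independent at time t = 0, where
# mu_0 = delta_0 pinches all trajectories to a single point, so the
# future level can be resampled independently of the past one.
alphas2 = [random.random() for _ in range(N)]
markov_paths = [(a * 1.0, b * 1.0) for a, b in zip(alphas, alphas2)]

def corr(pairs):
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs) / n
    vx = sum((x - mx) ** 2 for x, _ in pairs) / n
    vy = sum((y - my) ** 2 for _, y in pairs) / n
    return cov / (vx * vy) ** 0.5

print(round(corr(quantile_paths), 2))  # 1.0: X_1 is a function of X_{-1}
print(round(corr(markov_paths), 2))    # near 0: independence after catenation
```

Both versions have Uniform[0, 1] marginals at times ±1; only the coupling across time 0 differs, which is exactly the modification described above.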
Definition 1.25. A measure without atoms is said to be diffuse. We say that t is an atomic time of a family (µ_t)_{t∈R} of measures if µ_t is not diffuse.
Example 1.27. When the set R of atomic times of (µ_t)_{t∈R} is finite, one may repeat at each r ∈ R the independence operation described above at time 0, to produce a Markov process. Moreover, one may check that the order in which this is done does not matter, because these operations commute. The resulting process is indeed our Markov-quantile process; with the notation of the paper, its law is Q_{[R]}. Similarly, it is easy to imagine the Markov-quantile process when the atomic times form a locally finite set, like, e.g., Z. The situation becomes complicated when the set A of atomic times is not locally finite, or even worse, uncountable. Consider the following a priori reasonable approach. Let (R_n)_{n∈N} be a nested family of finite sets such that R_∞ = ∪_n R_n is dense in A. We consider the sequence (Q_{[R_n]})_n and hope for a limit. Then we encounter three problems:
- By compactness of Marg((µ_t)_t), (Q_{[R_n]})_n has an accumulation point (see similar reasonings in [23,29,31,38,54]), but it has no reason to be unique.
-If A is uncountable, this limit need not satisfy the Markov property (1) at times s ∈ A \ R_∞, at which we did not perform the modification of Examples 1.24-1.26. A continuity assumption on t → µ_t could be hoped to yield it (such assumptions were used, with other goals, for measures in stochastic or convex order, see, e.g., [7,24]), but we make no such assumption. Also, in the "space of quantile levels", the irregularity may be maximal: the set {(t, α) ∈ R × [0, 1] : X_t(α) is an atom of µ_t} need not be measurable.
-Anyway, limits of Markov processes are in general not Markov, so here property (1) is not ensured even at s ∈ R_∞. To our knowledge, before the present paper this type of problem had principally one solution, based on Lipschitz kernels (see Lemma 2.19), first discovered by Kellerer [33,40,24,9,7]; however, see [42, Lemma 5.3] for a different statement. Another consequence of this non-stability of the Markov property is that it is also not possible to consider the sequence of quantile processes attached to mollified curves µ^(n) = (µ_t * θ_n)_t, relying on the fact that all the measures µ_t * θ_n are diffuse: the limit may again fail to be Markov.

Remark 1.29. (a) In fact, the convergence in our Theorem A(iv), and hence in its enhancement Theorem B, rests on the order ⩽_lo introduced in Definition 3.4: for all s and t > s, the choice of the times r_i follows from that of a sequence in some set of measures, tending to the supremum of this set for ⩽_lo, see Lemma 4.10, in particular its point (c).

(b) As the examples above suggest, for all s and t > s, the products appearing in point (iv) of Theorem A need only use couplings Q^{r_i,r_{i+1}} where the times r_i are atomic: adding non-atomic times has no effect. Similarly, in Theorem B, which provides nested finite sets R_n such that Q_{[R_n]} → MQ, R = ∪_n R_n may avoid all non-atomic times. It moreover appears, but only as a consequence of Theorem A once it is proven, that not all the atomic times of (µ_t)_t play the same role:
-Some are "essential" (see Definition 4.25). All of those lying in ]s, t[ must eventually appear among the r_i in Theorem A(iv), and all of them must belong to R in Theorem B. This is possible as they turn out to be at most countable (Proposition 4.26).
-One may choose the r i in Theorem A, or R in Theorem B, so that they avoid any fixed finite set of the other atomic times.
Therefore, the intersection of the sets R satisfying the convergence property of Theorem B is the set of the essential atomic times.
The existence, for any family (µ_t)_t, of the set of its essential atomic times, at most countable even when the set {(t, α) ∈ R × [0, 1] : X_t(α) is an atom of µ_t} is not measurable, is in itself a significant result of this article. Perhaps this notion admits generalizations when the set of parameters or the measurable space, both equal to R in this work, are more general spaces.
Remark 1.30. It is also very important to notice that as soon as the set {(t, α) ∈ R × [0, 1] : X_t(α) is an atom of µ_t} is regular enough (see the examples of §6 for clearly stated instances of this), the Markov-quantile process may be explicitly computed, as is done in the important Example 6.1. More generally, certain properties of this set and of MQ are linked, see the whole of §6. Through its various examples, that section also gives an intuition of how MQ behaves.
1.5. Organization of the paper. In §2.1 we introduce kernels and transport plans, their composition and catenation, and the Markov property expressed in this language. In §2.2 we give the structure of Kellerer's work [33,34], explain why it motivates our reasoning towards Theorem C, and prove the latter; however, we postpone the introduction of one auxiliary notion, and the proofs of Lemma 2.24 and of the essential Lemma 2.23, to §3. In §2.3 we state and prove the "Markovinification" Theorem 2.26.
In §3 we introduce the auxiliary notions and results leading to the proofs of Lemmas 1.20, 2.23 and 2.24, also used later in §4, namely: (a) the "lower orthant" and stochastic orders, the related suprema and the notion of increasing kernel in §3.1, (b) the quantile transport and the notion of minimal coupling in §3.2, (c) two distances, ρ and ρ̄, inducing the weak topology on the spaces of transport plans Marg(µ, ν) in §3.3. Lemma 1.20 is proved in §3.4, and Lemmas 2.23 and 2.24 in §3.5.
By the way, §2.1, §3.1, and §3.2 also give all the background to understand in detail the three properties (i)-(iii) characterizing MQ in Theorem A.
In §4 we prove Theorems A and B: in §4.1 we explain how the situation may be pushed forward to the space [0, 1] of "levels of quantiles", in §4.2 we prove Theorem A, i.e. build the Markov-quantile process, and in §4.3 we state and prove Theorem 4.21 which is a more precise and complete version of Theorem B. To do this we introduce the essential atomic times of (µ t ) t .
In §5 we prove Theorem D as well as Theorem 5.20, and §6 gathers examples. Note. When we introduce various tools, sometimes classical, we do it in a way, and with remarks, adapted to our context. The reader already knowing them may read quickly, taking notice of our few specific remarks, which are useful in the rest of the article.
1.6. Notation. (a) We gather here the notation we use widely, indicating where each item is introduced, so that the reader can find it quickly if needed.
We introduce M(E), P(E), proj_T, P_t (similarly Γ_t), P^{s,t} and the set Marg((µ_τ)_{τ∈T}) in Notation 1.2; the quantile process Q in Reminder 1.9 and, together with the quantile coupling Q(µ, ν), in Definitions 3.18 and 3.20 and Notation 3.21; the stochastic order ⩽_sto in Reminder 1.5; MQ in Definition 4.16(b); if P is some process, R ⊂ R and #R < ∞, P_{[R]} in Notation 1.11 and Definition 4.18; ⩽_C and ⩽_{C,sto} in Definition 1.15; C and Marg_C(µ) in Notations 1.21 and 5.1; λ in Notation 1.23; the composition k.k′ of kernels in §2.1.1; the kernels k_P in Notation 2.2 and id_E in Notation 2.4; Joint(µ, k) in Notation 2.5; the transport plans Id_{2,µ} and Id_{n,µ} in Notation 2.6; if P is some process, ᵗP in Definition 2.7; P ∘ P′ in Definition 2.8; N^{s,t}_{LK} in Definition 2.16 and N^{s,t}_{IK} in Notation 2.25; x ⩽ y, for (x, y) ∈ (R^d)^2, in Notation 3.1; F_µ or F[µ], if µ is some measure, in Definition 3.2; µ ⩽_lo ν in Definition 3.4; losup_τ P_τ in Definition 3.5; M(µ), P(µ), M_⩽(µ) and P_⩽(µ) in Notation 3.13; G_µ in Definition 3.23; the distance ρ in Notation 3.27 and ρ̄ in §3.3.2; the kernels q_r, k_r and ᵗk_r in Notation 4.2; A_{r,x}, A_r and ⩽_r in Notation 4.4; L_R, for R ⊂ R, in Notation 4.10. Product measures are denoted by µ ⊗ ν.
If f and g are functions, f ⊗ g stands for (x, y) → (f (x), g(y)).
(b) Throughout this paper, "increasing" and "decreasing" mean "nondecreasing" and "nonincreasing". Indeed we often deal with partial orders, for which the two latter terms are unclear: the contrary of (∀s < t, µ_s ⩽ µ_t and µ_s ≠ µ_t) is (∃s < t : µ_s ⩾ µ_t or (µ_s and µ_t are incomparable)).
(c) Recall also Convention 1.7: introducing finite sets {r_1, . . . , r_m} or m-tuples (r_k)_{k=1}^m of real numbers, we implicitly mean that r_1 < . . . < r_m.
Acknowledgements. The authors wish to thank Martin Huesmann, Christian Léonard, Emmanuel Opshtein and Xiaolu Tan for bibliographic or editorial suggestions, as well as Michel Émery and Erwan Hillion for discussions on examples related to this work.
2. An extension of a theorem of Kellerer

2.1. The Markov property, composition and catenation of kernels and transport plans. Everywhere, E, E′, E″, etc., are topological spaces (sometimes Polish spaces) and B(E), B(E′) and B(E″) their Borel σ-algebras.
Definition 2.1. A (probability) kernel from E to E′ is a map k : E × B(E′) → [0, 1] such that k(x, ·) is a probability measure on E′ for every x in E and k(·, B) is a measurable map for every B ∈ B(E′).
Probability kernels are usually interpreted as transition matrices, see Remark 2.3: after one step, a particle at x in E arrives at a random position in E′, distributed according to k(x, ·). We often have this interpretation in mind.
Remark/Notation 2.2. Every transport plan P ∈ P(E × E′) can be disintegrated with respect to its first marginal P_1 := (proj_1)_# P and a kernel that we denote by k_P, defined from E to E′, so that:

∫∫ f(x, y) dP(x, y) = ∫ (∫ f(x, y) k_P(x, dy)) dP_1(x)

for every bounded continuous function f. Observe that x → k_P(x, ·) is P_1-almost surely uniquely determined.
2.1.1. Composition and action of kernels. Kernels k from E to E′ and k′ from E′ to E″ can be composed as follows:

(k.k′)(x, B) := ∫_{E′} k′(y, B) k(x, dy), for x ∈ E and B ∈ B(E″).

Similarly, acting on the right, kernels from E to E′ transport, or send, positive measures θ on E to positive measures on E′; acting on the left, they send (adequately integrable) functions f : E′ → R to functions E → R:

(θ.k)(B) := ∫_E k(x, B) dθ(x), (k.f)(x) := ∫_{E′} f(y) k(x, dy).

Associativity holds, e.g., (k.k′).k″ = k.(k′.k″), and θ.(k.f) = (θ.k).f, where the action of measures on functions is the obvious one, θ.f = ∫ f dθ. This is consistent with the following remark.
Remark 2.3. We recall the usual interpretation of the composition as matrix multiplication.
Suppose that E = {x_1, . . . , x_m}, E′ = {y_1, . . . , y_n} and E″ = {z_1, . . . , z_p} are finite. A measure θ on E is then a (row) vector (θ({x_i}))_{i=1}^m, a kernel k from E to E′ is an m × n matrix ((k_{i,j})_{j=1}^n)_{i=1}^m, where (k_{i,j})_{j=1}^n is the measure k(x_i, ·) ∈ M(E′) viewed as a vector, and a function f from E″ to R is a (column) vector f = (f(z_j))_{j=1}^p. Then, taking k′ = ((k′_{j,l})_{l=1}^p)_{j=1}^n a kernel from E′ to E″, the objects θ.k, k.k′ and k′.f introduced above have the same sense as products of matrices.

Notation 2.4. We denote by id_E the identity kernel (that acts trivially).

Notation 2.5. With µ ∈ M(E) and a kernel k from E to E′ is naturally associated the law Joint(µ, k) ∈ M(E × E′), having µ as first marginal and the family (k(x_0, ·))_{x_0∈E} as laws (on E′) conditioned by x_0 ∈ E. In particular P = Joint(P_1, k_P).
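Remark 2.3 can be checked numerically. The sketch below is an illustration of ours with arbitrary small matrices: kernels are encoded as row-stochastic matrices, and θ.k, k.k′ and k′.f become ordinary matrix products, with the associativity stated above.

```python
import numpy as np

# Finite spaces: E has 3 points, E' and E'' have 2 points each.
# A kernel is a row-stochastic matrix, a measure a row vector,
# a function a column vector.
k = np.array([[0.5, 0.5],
              [1.0, 0.0],
              [0.2, 0.8]])        # kernel from E to E'
kp = np.array([[0.9, 0.1],
               [0.3, 0.7]])       # kernel k' from E' to E''
theta = np.array([0.2, 0.3, 0.5]) # probability measure on E
f = np.array([1.0, -1.0])         # function on E''

composed = k @ kp   # k.k' : kernel from E to E''
pushed = theta @ k  # theta.k : measure on E'
lifted = kp @ f     # k'.f : function on E'

print(composed.sum(axis=1))  # each row sums to 1: a kernel again
print(np.allclose(((theta @ k) @ kp) @ f,
                  theta @ (k @ (kp @ f))))  # associativity holds
```

The same computation with infinite spaces replaces the sums by the integrals of §2.1.1; nothing in the algebra changes.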
2.1.3. Action of transport plans on measures and functions. If µ ∈ M(E) and µ′ ∈ M(E′), transport plans P ∈ Marg(µ, µ′) have an action similar to that of kernels from E to E′, on (µ′-almost surely defined) classes of functions f and on measures θ ∈ M(E; µ) absolutely continuous with respect to µ; for instance, (gµ).P is the measure B ↦ ∫_{E×B} g(x) dP(x, y).
2.1.5. The Markov Property. We introduce the Markov property here in an alternative, equivalent way to the usual one.
Remark/Definition 2.11. As recalled in (1), a process (X_t)_{t∈R} is said to be Markov if: ∀s ∈ R, ∀t > s, Law(X_t | (X_u)_{u≤s}) = Law(X_t | X_s). Denoting Law((X_t)_t) ∈ P(R^R) by P, this is equivalent to the fact that for all finite {s_1 < · · · < s_d} ⊂ R, (proj_{s_1,...,s_d})_# P is the law of a (possibly inhomogeneous) Markov chain. More generally, we say that any measure P ∈ P(R^R) is Markov if it satisfies this property.
We extend these definitions in the obvious way to processes indexed by subsets R′ ⊂ R and to measures in P(R^{R′}).
We recall the Kolmogorov-Daniell theorem and its usual corollary on Markov measures.

Proposition 2.12 (Kolmogorov-Daniell theorem). Let (µ_S)_S be a family of probability measures on some Polish space E, indexed by the finite subsets S of R. If (proj_S)_# µ_{S′} = µ_S for every S ⊂ S′, there exists a unique P ∈ P(E^R) with (proj_S)_# P = µ_S for every S.
One of the most usual applications of Proposition 2.12 is for measures µ_S of type µ_{s_1,s_2} • · · · • µ_{s_{d−1},s_d}, where S = {s_1, . . . , s_d}.
Corollary 2.13. Let (µ_{s,t})_{s<t} be a family of transport plans in P(E × E) such that µ_{s,u} = µ_{s,t}.µ_{t,u} for every s < t < u. Then there exists a unique Markov measure P ∈ P(E^R) with P_{s,t} := (proj_{s,t})_# P = µ_{s,t} for every s < t.
Definition 2.14. It is usual to call consistent family every family (µ S ) S or (µ s,t ) s<t as in Proposition 2.12 and Corollary 2.13.
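For discrete marginals, the consistency condition µ_{s,u} = µ_{s,t}.µ_{t,u} of Corollary 2.13 can be verified numerically: the transitions of any inhomogeneous Markov chain form a consistent family. A sketch (the marginal mu_s and the two transition kernels are made-up data):

```python
import numpy as np

def kernel(P):
    """Disintegrate a discrete transport plan w.r.t. its first marginal."""
    return P / P.sum(axis=1, keepdims=True)

def compose(P, Q):
    """Composition P.Q of two plans sharing the middle marginal."""
    assert np.allclose(P.sum(axis=0), Q.sum(axis=1))
    return P.sum(axis=1)[:, None] * (kernel(P) @ kernel(Q))

# An inhomogeneous chain on 2 states: marginal mu_s and two transition kernels.
mu_s = np.array([0.4, 0.6])
k_st = np.array([[0.9, 0.1], [0.3, 0.7]])
k_tu = np.array([[0.5, 0.5], [0.2, 0.8]])

P_st = mu_s[:, None] * k_st          # plan Joint(mu_s, k_st)
mu_t = P_st.sum(axis=0)
P_tu = mu_t[:, None] * k_tu
P_su = compose(P_st, P_tu)

# Chapman-Kolmogorov: the composed plan is Joint(mu_s, k_st @ k_tu).
assert np.allclose(P_su, mu_s[:, None] * (k_st @ k_tu))
```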

2.2. Kellerer's work: our motivation and proof of Theorem C. In [33] and [34], Kellerer proves the three results that we reproduce as Theorem 2.15, Lemma 2.19 and finally Theorem 2.21, which is a more precise version of Theorem 1.18 given in the introduction. He also introduces Definition 2.16. As we will see, Theorem 2.15 extends Corollary 2.13: take N_{s,t} = {µ_{s,t}}.
Our goal here is to prove Theorem C. To put forward quickly both the background and our reasoning, we postpone all the intermediate proofs, as well as the introduction of the technical tools they require, to the next section.
Kellerer first proves the following statement (we give a sketch of proof on p. 20). It seems a bit stronger than in [33] but is what he actually shows.

Theorem 2.15. Let (µ_t)_{t∈R} be a family of probability measures on some Polish space E, and for every s < t let N_{s,t} ⊂ P(E^2) be a set of transport plans. Assume that:
(1) for every s, t, N_{s,t} is not empty,
(2) for every s, t, N_{s,t} ⊂ Marg(µ_s, µ_t),
(3) for every s, t, N_{s,t} is closed for the weak topology,
(4) for r < s < t and any (P, P′) ∈ N_{r,s} × N_{s,t}, P.P′ ∈ N_{r,t},
(5) for every d and
Then, there exists a Markov measure P ∈ Marg((µ_t)_t) with (proj_{s,t})_# P ∈ N_{s,t} for every s < t.
Remark 2.17 gives some comments; Remark 2.18 is used in the following.
Remark 2.17. (a) The terminology "Lipschitz property" was introduced in [40, Definition 4.1] for a Markov process with Lipschitz transition kernels. It is renamed as "Lipschitz-Markov property" by Hirsch, Roynette and Yor in [24]. The fact that this property is stable for finite dimensional convergence of processes is crucial in those papers and in [39] and appears as an avatar of Kellerer's Lemma 2.19 stating that the catenation operator • is continuous for the corresponding class of kernels. These kernels are called Lipschitz in [28,9] and the present paper, and Lipschitz-Markov in [7].
(b) You may compare Definition 2.16 with that of transport plans with increasing kernel in Definition 3.11 (b).
(c) [33, p. 115] In case the topology of E and E′ is discrete, hence induced, e.g., by the distance d(x, y) = 1 − δ_{x,y}, every h̄ is 1-Lipschitz; hence any P has Lipschitz kernel. (d) Kellerer proves (5) for sequences (Q^n_{t_i,t_{i+1}})_n of Markov-Lipschitz transports, without the assumption that the Q^n_{t_i,t_{i+1}} have the same marginals for all n, though this stronger result is not used further in [33]. In our Lemma 2.23 this assumption is crucial.
Finally Kellerer proves this more precise version of Theorem 1.18.

Theorem 2.21 (Kellerer [34]). If (µ_t)_t is an increasing family of measures on R for ⪯_C (or ⪯_C,sto), there is a Markov measure P ∈ Marg((µ_t)_t) such that P is a (sub)martingale and the couplings P_{s,t} have Lipschitz kernel.

Replacing point (i) by an alternative version (i′), consisting of Lemma 2.23 below, and proving a version of (ii) adapted to this change, we prove Theorem C. Namely we prove that increasing kernels, introduced in Definition 1.6 (see also Definition 3.11 for more details), satisfy Lemma 2.23, a counterpart of Lemma 2.19, as well as Property (3), i.e. the little Lemma 2.24. They are proven respectively on pp. 33 and 30. Then we prove Theorem C.
Lemma 2.24. Take µ and µ′ in P(R). The space of transport plans with increasing kernel in Marg(µ, µ′) is closed for the weak topology.
Proof of Theorem C. Take (µ_t)_t ∈ P(R)^R, increasing for ⪯_C (case (a)), ⪯_C,sto (case (b)) or for ⪯_sto (case (c)), to prove the corresponding cases of Theorem C. In the sketch of proof of Theorem 2.21 given in Remark 2.22, replace N^LK_{s,t} by N^IK_{s,t} and introduce, similarly as defined in Remark 2.22 for cases (a) and (b), the spaces N^IK_{s,t} for case (c). Properties (2)-(4) follow. For (1), the required couplings exist in our three cases. For the completeness of this exposition, we also provide the following.

2.3. Relation to the Markov-quantile process. What precedes also provides, through an application of Theorem 2.15, the following existence theorem for Markov processes that are limits of products of a given process. When applied to the quantile measure Q ∈ Marg((µ_t)_t) introduced in §3.2, it provides the existence part of Theorem A, see below.
Theorem 2.26 (Markovinification). Take P ∈ P(R^R). If for each s and t > s, P_{s,t} has increasing kernel, there exists a Markov measure P′ ∈ Marg((µ_t)_t) such that each P′_{s,t} is a limit of products P_{s,r_1}. · · · .P_{r_m,t} with {r_1, . . . , r_m} ⊂ ]s, t[. One may take P′ such that for each (s, t) the limit is obtained with a sequence ({r^n_1, . . . , r^n_{m(n)}})_n of subdivisions whose mesh tends to zero. If P is already Markov, the sets of such products {P^{s,t}_{[R]}} are reduced to {P_{s,t}}, so that the Markov measure obtained from any of them is P itself. This conservation property also holds locally on intervals I ⊂ R if the restriction of P to I is Markov. Notice also that Theorem 2.26 does not require (µ_t)_{t∈R} to be increasing for ⪯_sto. Theorem 2.26 links §2 with the Markov-quantile process MQ built in §4. Indeed, taking P = Q, Q_{s,t} is in N^IK_{s,t} for all s < t by Remark 3.25, so Theorem 2.26 gives the existence of a Markov process with 2-marginals in N(Q). We prove in §4, by completely different means, that:
- this process is unique,
- it may be built using the order ⪯_sto (see also Remark 1.29), instead of being obtained by a non-constructive compactness argument.
This is Theorem A. See also Open question 6.5.1.

3. Three auxiliary notions, and postponed proofs of three lemmas
The next section introduces the notions needed to prove the results of §4 below. They are also necessary for the proofs of three lemmas that were therefore postponed: Lemma 1.20 on versions of increasing processes, the important Lemma 2.23 on the continuity of • when the kernels are increasing, and Lemma 2.24.
3.1. Lower orthant and stochastic orders, related suprema, and increasing kernels.
Notation 3.1. (a) We denote points of R^d by x and y. We endow R^d with the natural partial order defined by: x ≤ y if and only if x_i ≤ y_i for every i ∈ {1, . . . , d}. (b) Several times appear statements where some intervals have to be considered closed or open at some of their bounds, either arbitrarily or depending on possible cases. To alleviate the writing, we introduce a special symbol and place it at those bounds.
Definition 3.2. If µ ∈ M(R^d), its cumulative distribution function F_µ, that we also denote by F[µ] to avoid multiple subscripts, is defined, using Notation 3.1, by: F_µ(x) = µ(]−∞, x_1] × · · · × ]−∞, x_d]).

Definition 3.5. We call lower orthant supremum of a family (P_τ)_{τ∈T} of measures of the same mass m on R^d the smallest upper bound of {P_τ}_τ for ⪯_lo, if it exists, i.e. a measure P of mass m such that:
- for every τ, P_τ ⪯_lo P,
- P ⪯_lo Q as soon as P_τ ⪯_lo Q for every τ.
By definition, if it exists it is unique. We denote it by losup_τ P_τ. Similarly we define loinf_τ P_τ.
Remark 3.6. (a) In Reminder 3.3, (a) and the first limit of (b) pass to the infimum of functions that are both monotone and upper semi-continuous; to see it, notice the form taken by upper semi-continuity (8) for such functions. If moreover (P_τ)_τ has an upper bound P, then the second limit of (b) holds, since P_τ ⪯_lo P for all τ.

Remark/Notation 3.7. If d = 1 the order ⪯_lo is usually called the stochastic order and denoted by ⪯_sto; we then call "stochastic supremum" the lower orthant supremum of Definition 3.5 and denote it by stosup.
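For d = 1, the order ⪯_sto and the stochastic supremum are read off the cumulative distribution functions: µ ⪯_sto ν iff F_µ ≥ F_ν everywhere, and the CDF of a stosup is the pointwise infimum of the CDFs. A numerical check on the example of Remark 3.9 (the grid discretization is our own choice):

```python
import numpy as np

grid = np.linspace(-1.0, 3.0, 401)

def cdf(atoms, weights):
    """CDF of a discrete measure, evaluated on the grid."""
    return np.array([sum(w for a, w in zip(atoms, weights) if a <= x)
                     for x in grid])

# Example of Remark 3.9: stosup( (d_0 + d_2)/2 , d_1 ) = (d_1 + d_2)/2.
F1 = cdf([0.0, 2.0], [0.5, 0.5])
F2 = cdf([1.0], [1.0])
F_sup = np.minimum(F1, F2)               # CDF of the stochastic supremum
F_expected = cdf([1.0, 2.0], [0.5, 0.5])
assert np.allclose(F_sup, F_expected)

# Neither measure dominates the other for <=_sto:
assert not (F1 >= F2).all() and not (F2 >= F1).all()
```

The first assertion recovers the identity stated in Remark 3.9; the second shows the family is not a chain, so the supremum is genuinely needed.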
Lemma 3.8 (Existence criteria for losup). (a) If, for ⪯_lo, a sequence (P_n)_{n∈N} ∈ (P(R^d))^N is bounded from above and increasing, i.e. n ≤ m ⇒ P_n ⪯_lo P_m, then losup_n P_n exists and (P_n)_n converges weakly to it.
(b) If a family (P_τ)_{τ∈T} is bounded from above for ⪯_lo and if for every τ, τ′ ∈ T there exists σ ∈ T such that P_τ ⪯_lo P_σ and P_{τ′} ⪯_lo P_σ, then losup_τ P_τ exists and there is an increasing sequence (P_{τ_n})_n that converges weakly to it.
The results extend in an obvious way to measures of mass m > 0 in M(R).
We are done. The weak convergence is given by the pointwise convergence of the cumulative distribution functions, see Reminder 3.26.
For each n, by a finite induction using the assumption of (b) on the P_{τ_k,n}, we find σ_n ∈ T such that ∀k ≤ n, F[P_{σ_n}](x_k) ≤ F(x_k) + 1/n, and (P_{σ_n})_n is increasing. By (a), P := losup_n P_{σ_n} exists. Let us prove that (9) holds for any x ∈ R^d, so that F = F[P]; assume by contradiction that it fails for some x. On the dense set C, by (9) and (10), F and F[P] coincide, so F is a cumulative distribution function, so by Remark 3.6(b), P = losup_{τ∈T} P_τ. Moreover P_{σ_n} → P.

Remark 3.9. (a) (Case d = 1) In this case, Reminder 3.3(c) is automatically true. Hence, in Lemma 3.8, (a) is true for any bounded (P_n)_n, increasing or not, hence (b) shows that any S ⊂ P(I) bounded from above has a stochastic supremum (which, however, need not be the weak limit of a sequence of elements of S: consider (δ_1 + δ_2)/2 = stosup((δ_0 + δ_2)/2, δ_1)). Symmetrically, a family bounded from below has a stochastic infimum.
Remark 3.10. In the following we use several times the Lebesgue differentiation theorem for Borel measures; a reference is , e.g., [15, §2.8-2.9].
Proposition/Definition 3.11. Take µ and ν in P(R). We say that a transport plan P ∈ Marg(µ, ν) has increasing kernel if one (and then any) of the following statements holds: (a) Initial definition: if θ, θ′ ≤ µ and θ and θ′ have the same mass, then θ ⪯_sto θ′ implies θ.P ⪯_sto θ′.P.
(b) For every increasing h : R → [0, 1], P.h is µ-almost surely increasing, i.e., more exactly, there is an increasing h̄ : R → [0, 1] coinciding µ-almost surely with P.h when tested against every bounded continuous function g. (c) There exists a kernel k in the µ-equivalence class of k_P such that x ≤ x′ ⇒ k(x, ·) ⪯_sto k(x′, ·).

Remark 3.12. Be cautious that having increasing kernel is distinct from being an increasing coupling, a notion defined in Definition 1.16(c).
Proof of the equivalence in Proposition 3.11. Statements (c) and (d) are essentially a change of notation (to get (c) from (d), notice that for y in the µ-null set R \ E of (d), k(y, ·) can be defined as stosup_{x∈E, x<y} k(x, ·)), and (a) ⇒ (c): Suppose (a) and set I_q := ]−∞, q] for all q. We will build R′ ⊂ R, with µ(R′) = 1, on which x ≤ x′ ⇒ k(x, I_q) ≤ k(x′, I_q) for all q ∈ Q, hence all q ∈ R, ensuring (c). By definition of the kernel k_P, k_P(·, I_q) is the density with respect to µ of the measure B ↦ P(B × I_q); the set R′ is then provided by the Lebesgue differentiation theorem.

Notation 3.13. If µ ∈ M(R) we denote by M(µ) and P(µ) the sets of positive measures, respectively probability measures, absolutely continuous with respect to µ, and by M′(µ) and P′(µ), or M′ and P′ if there is no ambiguity, their subsets of measures with bounded and decreasing density.
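On a finite state space, condition (c) of Proposition 3.11 becomes a check on the rows of the disintegration matrix: their cumulative sums must decrease row by row. A sketch (the two couplings below are toy examples of ours):

```python
import numpy as np

def has_increasing_kernel(P):
    """Discrete analogue of Proposition 3.11(c): the rows k(x_i, .) of the
    disintegration kernel must be increasing for <=_sto as i grows,
    i.e. their CDFs must decrease row by row."""
    k = P / P.sum(axis=1, keepdims=True)    # disintegration kernel
    F = np.cumsum(k, axis=1)                # CDF of each row
    return bool((np.diff(F, axis=0) <= 1e-12).all())

# The quantile (comonotone) coupling of two uniform measures on 3 points:
Q = np.eye(3) / 3.0
assert has_increasing_kernel(Q)

# An anticomonotone coupling reverses the order; its kernel is not increasing:
A = np.fliplr(np.eye(3)) / 3.0
assert not has_increasing_kernel(A)
```

The design choice, comparing cumulative sums, is exactly the CDF characterization of ⪯_sto used throughout the section.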
"Only if" is clear. For the "if" part, by the Lebesgue differentiation theorem, provides a representative of the density. Now if a sequence (θ n ) n satisfies (11) and weakly tends (see Reminder 3.26) to θ ∈ M(µ), θ satisfies it also (if a, b, c or d is an atom of θ, re-obtain (11) by limit of larger intervals).
Remark 3.16 (P has increasing kernel if and only if ^tP maps M′(ν) to M′(µ)). In Proposition 3.11, (b) is equivalent to the same statement with decreasing functions; in turn, transposing, this means that for all decreasing h : R → [0, 1], (hν).^tP, which is equal to (P.h)µ, has decreasing density.
3.2. Quantile measures and minimal couplings. We define the quantile coupling (Definition 3.18) and the quantile process law (Definition 3.20) through a minimality property that is crucial in our paper. We also state the direct and more classical Definition 3.23 of the quantile measure. Further characterizations of this coupling and this process are given throughout the paper, in particular in §5 where the approach is optimal transport. The reader can refer to [47, 48, 53] for more background. See also the papers [28, 46].
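As a concrete illustration, the quantile coupling pushes the Lebesgue measure on the quantile levels ]0, 1[ forward by the pair of quantile functions, with G(α) = inf{x : µ(]−∞, x]) > α} as in the abstract. A Monte Carlo sketch (the two discrete measures are our own toy data):

```python
import numpy as np

def quantile_fn(atoms, weights):
    """G(alpha) = inf{ x : mu(]-inf, x]) > alpha } for a discrete measure mu."""
    atoms = np.asarray(atoms, float)
    order = np.argsort(atoms)
    atoms, F = atoms[order], np.cumsum(np.asarray(weights, float)[order])
    return lambda alpha: atoms[np.searchsorted(F, alpha, side='right')]

# Quantile coupling Q(mu, nu) = (G_mu x G_nu)_# lambda, sampled on levels:
rng = np.random.default_rng(0)
alphas = rng.uniform(0, 1, 100000)
G_mu = quantile_fn([0.0, 1.0], [0.5, 0.5])
G_nu = quantile_fn([0.0, 1.0, 2.0], [0.25, 0.5, 0.25])
x, y = G_mu(alphas), G_nu(alphas)

# The coupling is comonotone: low mass of mu only feeds low mass of nu.
assert set(y[x == 0.0]) <= {0.0, 1.0}
assert set(y[x == 1.0]) <= {1.0, 2.0}
```

Using `side='right'` in `searchsorted` matches the strict inequality in the definition of G.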
The proof follows from Definition 3.23, see below.
Remark 3.19 (Minimality in the language of transport plans). (a) The quantile process Q may also be defined by the fact that its transitions are minimal among those of all transport plans of Marg(µ, ν): for any P ∈ Marg(µ, ν) and fixed x, F[(µ|_{]−∞,x]}).k_P](y) = F_P(x, y) for every y ∈ R, hence, after Definition 3.18 and the characterization of the stochastic order in Remark 3.9, (13) holds. Notice that a property similar to (13), with ⪯_C in place of ⪯_sto, defines the (left-)curtain coupling in [8].
(b) The quantile coupling is paradoxically maximal for ⪯_sto; minimal transitions hence mean that the mass of µ is mixed as little as possible when transported onto that of ν.
Proof of Propositions 3.18 and 3.20. The existence parts follow from Definition 3.23 below and are proved just after it; in Proposition 3.18 uniqueness is clear; let us prove it in Proposition 3.20. It is rather easy to prove that the equality in (12) for every pair of measures implies the equality in the Hoeffding-Fréchet bound of general dimension (14). Since the conditions (14) for all finite S ⊂ R are a compatible set of conditions, Proposition 2.12 proves that there exists at most one quantile measure P in Marg((µ_τ)_{τ∈T}).
Remark 3.25 (Products of quantile couplings have increasing kernel and map M′ to M′). We will see in §4.1 that Q(µ, ν) is a composition of two transport plans with increasing kernel. Therefore by Remark 3.14 it has increasing kernel. Since ^tQ(µ, ν) = Q(ν, µ), Remark 3.16 ensures that ^tQ(µ, ν) maps M′(ν) to M′(µ). All this also ensures both properties for products of quantile couplings.
Reminder 3.26. We remind the reader of the "Portmanteau theorem" (see, e.g., [12, Theorem 2.1]): weak convergence on some metric space E endowed with its Borel σ-algebra is (equivalently) characterized by (15): P_n(R) → P(R) for every Borel set R with P(∂R) = 0. In R^n, it is equivalent to consider in (15) only sets R of the form ∏_{i=1}^n ]−∞, x_i], see Example 2.3 p. 18 of [12].
More precisely, for any P ∈ Marg((µ_i)_{1≤i≤d}) and any sequence (P_n)_{n∈N} of elements of Marg((µ_i)_{1≤i≤d}), the following are equivalent. The fixed marginals show that (F_Q)_Q is "equicontinuous on the right". Take ε > 0. Since each b_i ∈ R can be approached from the right by a sequence of points that are non-atomic for P and all the P_n, we get: if (i) holds, |P_n(R′) − P(R′)| ≤ ε for n large enough, hence (ii) follows.
(ii) ⇒ (iii). We apply an adapted version of the prior argument. Suppose (ii) and take ε > 0. For every i there exists a suitable finite sequence of points (this is classical and is proved, e.g., in [12, Section 12], which deals with the modulus of continuity of càdlàg paths). Every R = ]−∞, b] contains some rectangle R^−_ε and is included in the interior of some rectangle R^+_ε, both bounded by consecutive such points. Using again (16) for the first and last terms, the estimate (*) is uniform since there are finitely many rectangles R^−_ε. We get (iii).
Proof. For the first equality, as ∫ f ⊗ g dP = ((fµ).P).g, apply Proposition 3.30. The set of measures θ̄ with density bounded by 1 is included in that of the proposition, since the action of R on θ̄ does not increase the maximum of its density.

3.4. Proof of Lemma 1.20 and remarks about it. We prove Lemma 1.20 on the existence of a process consisting exclusively of increasing paths. Our proof requires the use of stoinf and stosup introduced in §3.1.
Proof of Lemma 1.20. Let (µ_t)_{t∈R} be an increasing family of probability measures for ⪯_sto and P be a Markov measure in Marg((µ_t)_{t∈R}). Set µ_{t−} := stosup_{s<t} µ_s and µ_{t+} := stoinf_{s>t} µ_s, which are also the left and right limits of (µ_t)_t for the weak topology, see Remark 3.9. Since (µ_t)_{t∈R} is increasing for ⪯_sto, we have µ_{t−} ⪯_sto µ_t ⪯_sto µ_{t+}, and t ∈ R is a discontinuity time of (µ_t)_{t∈R} for the weak topology if and only if µ_{t−} ≠ µ_{t+}. Such points are at most countably many. Indeed, the regions in R^2 between the graphs of F[µ_{t−}] and F[µ_{t+}], for the discontinuity times t, are disjoint and of positive Lebesgue measure, hence are at most countably many, by σ-additivity of the measure. Let C be a countable dense subset of R containing the discontinuity points. Introduce the corresponding exceptional set N: being a countable union of P-null sets, N is P-null. Now take (X_t)_{t∈R} with law P, e.g., take the canonical process Ω := R^R and X_t = proj_t : x ∈ R^R ↦ x(t). We define (X̄_t)_t as null functions on N, and as follows on Ω \ N:
- for t ∈ C, X̄_t(ω) = X_t(ω),
- for t ∉ C, X̄_t(ω) = lim_{s<t, s∈C} X_s(ω).
Hence for every ω ∈ Ω, the curve t ∈ C ↦ X̄_t(ω) is increasing and, even better, t ∈ R ↦ X̄_t(ω) is increasing. We are left to prove that X_t = X̄_t almost surely. This is clear for t ∈ C. For each t ∈ R \ C, {ω ∈ Ω : ∃s ∈ C, s < t and X_s(ω) > X_t(ω)} is a union of null sets, hence is null, so almost surely X_t ≥ sup_{s<t, s∈C} X_s = X̄_t. Besides X_s → X̄_t as s → t with s < t, s ∈ C, almost surely and thus in law, so that Law(X̄_t) = µ_{t−}. Moreover t is a continuity point of (µ_t)_t, so that Law(X̄_t) = µ_t = Law(X_t). Thus X_t = X̄_t almost surely.
Remark 3.32. (a) If (µ_t)_t is moreover left-continuous for the weak topology, we can adapt the proof of Lemma 1.20 so that s ↦ X̄_s(ω) is increasing and left-continuous. (d) Carrying on with the similarity between ⪯_sto and ⪯_C established in Theorem C, we mention that Kellerer's theorem has also been revisited under continuity assumptions, see [40, 24, 7]. In particular it has been proved that if (µ_t)_{t∈R} is right-continuous the associated martingale can be defined in the space of càdlàg paths. Now take a sequence (P_n)_n ∈ Marg(µ, ν)^N of transport plans having this property and converging weakly to P ∈ Marg(µ, ν), and apply Proposition 3.27. The next remark is neither related to Proposition 3.27 nor to Lemma 2.24 but is an analogue of Remark 3.34 with ⪯_sto in place of ρ.

3.5.2. Proof of Lemma 2.23. First we prove Lemma 3.37, then its consequence Proposition 3.38, and finally Lemma 2.23.
Proof. By the equivalence proven in Proposition 3.11, the functions h̄_n, n ∈ N, are increasing.
By Propositions 3.30(b) and 3.27, ρ(P_n, P_0) → 0 as n → ∞, i.e., for every increasing g with values in [0, 1], ∫ g(y)h(z) dP_n(y, z) → ∫ g(y)h(z) dP_0(y, z). In case µ = λ, since h̄_n is increasing, hence has at most countably many discontinuity points, µ(A) = 1. This also holds in the general case, by means of the increasing functions h̄_n ∘ G_µ; we leave the details to the reader. Now we take any x ∈ A and prove h̄_n(x) → h̄_0(x). It is enough to prove lim sup_n h̄_n(x) ≤ h̄_0(x), since lim inf_n h̄_n(x) ≥ h̄_0(x) can be proved symmetrically. Suppose, for contradiction, that lim sup_n h̄_n(x) > h̄_0(x): as h̄_0 is increasing, this contradicts (18) and the fact that h̄_n is increasing.
Proposition 3.38. Let P_n tend to P_0 in Marg(µ_1, . . . , µ_d, η) and P′_n tend to P′_0 in Marg(η, ν). Assume moreover that P′_n has increasing kernel for every n. Then P_n • P′_n tends to P_0 • P′_0.
Proof. By Propositions 3.30(b) and 3.27, we must show the convergence of the integrals of g ⊗ h̄_n against P_n, where h̄_n = P′_n.h, by Definition 2.8 and §2.1.3.
We can now prove Lemma 2.23. It follows directly from Proposition 3.38.

4. Construction and characterization of the Markov-quantile process
Now (µ_t)_{t∈R} is a family of probability measures on R, F_t denotes the cumulative distribution function F_{µ_t} of µ_t and G_t its quantile function, see §3.2. In this section we build the Markov-quantile measure MQ and prove Theorems A and B. Our proof of Theorems A-B is based on transport plans L_{[s,t]} ∈ Marg(λ, λ), defined in §4.1, that are the 2-marginals of an important auxiliary process law called the quantile level measure Lev ∈ Marg((λ_t)_{t∈R}), where each λ_t is a copy indexed by t of λ = λ_{[0,1]}. The link between Lev and MQ is proven in the proof of Theorem 4.21, p. 50. This section is divided in three parts. In §4.1 we define the coupling L_R for all R ⊂ R, using the key monotonicity Lemma 4.9. In §4.2 we define MQ and prove Theorem A, purely via the 2-marginals (MQ_{s,t})_{s<t}.

Notation 4.2 (q_r, k_r, ^tk_r). For all r ∈ R we set q_r = Q(λ, µ_r); thus ^tq_r (see Definition 2.7) is Q(µ_r, λ). Those couplings admit the respective disintegration kernels k_r and ^tk_r. Here Q_{s,t} = Q(µ_s, µ_t), see Definition 3.20. Indeed, ^tq_s.q_t = ^tq_s.Id_{λ,2}.q_t; then apply (b) below, which will also be useful further on.
(b) If an ordered pair (U, V) of variables has law T ∈ Marg(λ, λ), then the 4-time process (G_{µ_s}(U), U, V, G_{µ_t}(V)) has law ^tq_s • T • q_t and is Markov, since the σ-fields spanned by U and by {U, G_{µ_s}(U)} are the same. (c) From (a) we get ^tq_r.q_r = Id_{µ_r,2}, so that ^tq_r.q_r.^tq_r = ^tq_r and q_r.^tq_r.q_r = q_r. However, q_r.^tq_r ≠ Id_{λ,2}. Indeed, k_r.^tk_r maps any quantile level α ∈ ]0, 1[ to itself except when G_r(α) is an atom of µ_r. Actually, µ_r-almost surely, ]α_−, α_+[ denotes a set A_{r,x} defined as follows. The product Q_{s,r_1}.Q_{r_1,r_2}. · · · .Q_{r_{m−1},r_m}.Q_{r_m,t} appearing in Theorem A is more deeply analyzed in Theorem B. Its kernel reads as in (20). In Remark 4.17, (20) is further commented and reexpressed for transports in place of kernels. For now, it leads us to introduce the following kernel.
Notice that ℓ_{{r}} = ℓ_r and that for any R, λ.ℓ_R = λ. Moreover ℓ_R only depends on (A_r)_{r∈R}. The following lemma is particularly simple.

Proof. For r ∈ R, one easily checks that ℓ_r is increasing and stabilizes M′; (a) and the first point of (b) follow. If µ ∈ M′(λ), to see that µ ⪯_sto µ.ℓ_R, look at the cumulative distribution functions. They coincide off the components of the A_r. Now on each of those, F[µ.ℓ_R] is affine whereas F_µ is concave since µ has decreasing density, so necessarily F_µ ≥ F[µ.ℓ_R].
We add a remark, linked with Figures 2-4, on the principle of Theorem A's proof. A reader only looking for the formal proof itself may skip it.
Remark 4.8. We will not (directly) obtain the couplings MQ_{s,t} as a limit of products Q^{s,t}_{[R_n]}, for some finite sets R_n with dense union, as suggested in §1.4 p. 11, where we show this does not work. We aim at obtaining MQ_{s,t} as a supremum of the set {Q^{s,t}_{[R]} : R finite and R ⊂ ]s, t[} (see Theorem A(iv) for the notation), and actually we do it on the space of quantile levels, i.e. we look for a supremum of {ℓ_R : R finite and R ⊂ ]s, t[}. The question is to find the adequate quantity, or order relation, for which a supremum (and hopefully then a maximum) shall be sought.
First, Figure 2 makes us observe how kernels of the type ℓ_R act on measures of P(]0, 1[). It displays the action of ℓ_R with R = {r_1, r_2, r_3, r_4} on some Dirac measure δ. The vertical segment on the left is the space ]0, 1[ of quantile levels. We suppose that each µ_{r_i} has a single atom and draw vertically, at abscissa r_i, the interval A_{r_i} (see Notation 4.4(a)). The drawing is in the case where δ = δ_x with x ∈ A_{r_1}. Then, see Remark 4.5: ℓ_{r_1} maps δ to the uniform probability measure on A_{r_1}; in turn, ℓ_{r_2} leaves the latter unchanged outside of A_{r_2} and makes it uniform on A_{r_2}, etc. The first drawing shows a "possible trajectory of an element of mass at x" transported by the discrete Markov chain with transition kernels (ℓ_{r_i})_{i=1}^4. Since we take x ∈ A_{r_1}, it is displaced by ℓ_{r_1} to x′, picked uniformly at random in A_{r_1}. In case x′ ∉ A_{r_2}, as in the figure, it is unchanged by ℓ_{r_2}; then in case x′ ∈ A_{r_3} (figure), it is displaced by ℓ_{r_3} to a random x″ ∈ A_{r_3}, and finally, in case x″ ∈ A_{r_4}, displaced by ℓ_{r_4} to a random x‴ ∈ A_{r_4}. The second drawing shows the successive measures δ_x, δ_x.ℓ_{r_1}, δ_x.ℓ_{r_1}.ℓ_{r_2}, etc., the level of grey being proportional to the value of their density.
So each ℓ_{r_i} "spreads" a little more the mass of δ.ℓ_{r_1}. · · · .ℓ_{r_{i−1}}, replacing it by its mean (a measure of constant density) on each connected component of A_{r_i}. If θ ≪ λ, this averaging process lowers the total variation of the density at each step: at most, you get the measure with constant density one, i.e. λ itself, on which all the transports ℓ_R act trivially. Thus a natural idea is to consider that, if R ⊂ R′ are finite and θ ≪ λ, the density of θ.ℓ_{R′} will be closer to 1_{]0,1[}, for some adequate distance, than that of θ.ℓ_R.
A remedy is to let the kernels ℓ_R act on measures with decreasing density, and more generally on any element of P′(]0, 1[), which is stable under the action of the couplings ℓ_R; there the idea above works, with the stochastic order. This is Lemmas 4.9 and 4.10 below; see also Remark 4.13. Figure 4 gives the example of the kernels ℓ_{r_i} of Figure 3 acting on the measure ν of density (1/x)1_{]0,x[} with x = 1/2: one gets ν.ℓ_{r_1}.ℓ_{r_2}.ℓ_{r_4} ⪯_sto ν.ℓ_{r_1}.ℓ_{r_2}.ℓ_{r_3}.ℓ_{r_4}. Though simple, the next lemma is a key of our construction of MQ.

Lemma 4.9. Let R ⊂ R′ be two finite subsets of R and µ ∈ M′(λ). Then µ.ℓ_R ⪯_sto µ.ℓ_{R′}.
Proof. Using an induction on the cardinal difference, it is enough to prove this if R′ has one more element than R, say r′. We order it with the elements r_i of R: r_1 < · · · < r_k < r′ < r_{k+1} < · · · < r_m. By Lemma 4.7(b), if µ is in M′(λ), so is µ_k := µ.ℓ_{r_1}. · · · .ℓ_{r_k}, and µ_k ⪯_sto µ_k.ℓ_{r′}. We apply ℓ_{r_{k+1}}. · · · .ℓ_{r_m} to each term of this inequality.

(b) For any R ⊂ R, losup{L_{R′} : R′ ⊂ R and R′ finite} exists. We denote it by L_R (which is consistent with Notation 4.6 when R is finite).
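The mechanism behind Lemmas 4.7 and 4.9 can be observed numerically: averaging a decreasing density on intervals conserves mass and pushes the measure up for ⪯_sto. A grid sketch (the two intervals standing in for components of the A_r are hypothetical choices of ours):

```python
import numpy as np

n = 1000
dx = 1.0 / n
x = (np.arange(n) + 0.5) * dx            # grid of quantile levels in ]0,1[

def average_on(density, a, b):
    """Action of one kernel l_r: replace the density by its mean on ]a,b[."""
    d = density.copy()
    mask = (x > a) & (x < b)
    d[mask] = d[mask].mean()
    return d

mu = 2.0 * (1.0 - x)                     # a decreasing probability density
out = average_on(average_on(mu, 0.1, 0.4), 0.3, 0.8)   # l_{r1} then l_{r2}

# Mass is conserved, and (Lemma 4.7) mu <=_sto mu.l_R: the CDF drops pointwise.
assert abs(out.sum() * dx - 1.0) < 1e-9
F_mu, F_out = np.cumsum(mu) * dx, np.cumsum(out) * dx
assert (F_out <= F_mu + 1e-9).all()
```

The second assertion is the CDF comparison of the proof above: on each averaged interval the new CDF is affine while the old one is concave, so the new one lies below.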
(c) For any R ⊂ R, there is a nested sequence (R_n)_n of finite subsets of R such that (L_{R_n})_n converges weakly to L_R; if (R_n)_n has this property, so has every sequence (R′_n)_n such that R′_n ⊃ R_n.
Finally, the assertion about (R′_n)_n follows from (a) and the interpretation of both ρ and ⪯_sto in terms of cumulative distribution functions.
Notation 4.11. For all R ⊂ R, we denote by ℓ_R the kernel associated with L_R. This is again consistent with Notation 4.4(b) when R is finite. The following result is crucial to define processes on (]0, 1[, λ) with Corollary 2.13, as is done in particular in Definition 4.19 for Lev_R and Lev.
Proposition 4.14. If R and R′ are subsets of R such that r ≤ r′ for all (r, r′) ∈ R × R′, then L_{R∪R′} = L_R.L_{R′}. In particular, this applies for s < t < u.

Proof. Using Lemma 4.10(b) and (c) we find sequences (R_n)_n and (R′_n)_n of finite subsets of R and R′ respectively, such that L_{R_n} converges weakly to L_R, L_{R′_n} to L_{R′}, and L_{R_n∪R′_n} to L_{R∪R′}. Besides, since r ≤ r′ for all (r, r′) ∈ R × R′, since the R_n are finite and since L_{{r}} is idempotent (so that in case R ∩ R′ ≠ ∅ and R_n ∩ R′_n = R ∩ R′ = {r}, a repetition of L_{{r}} does not matter), L_{R_n∪R′_n} = L_{R_n}.L_{R′_n}. Then, using Proposition 3.27 and the distance ρ introduced in it, the terms are compared by Proposition 3.31 and Remark 3.36, since ^tL_{R′_n} also preserves M′(λ). All terms tend to zero when n tends to infinity. The wanted equality follows.

Q^{s,t}_{[R]} denotes the coupling Q(µ_s, µ_{r_1}).Q(µ_{r_1}, µ_{r_2}). · · · .Q(µ_{r_m}, µ_t) ∈ Marg(µ_s, µ_t), see Theorem A(iv) p. 4.

Proposition/Definition 4.16. (a) The set {Q^{s,t}_{[R]} : R ⊂ R and #R < ∞} has a lower orthant supremum, and there is a nested sequence (R_n)_{n∈N} of finite sets such that Q^{s,t}_{[R_n]} tends to it. We denote it by µ^{s,t}. (b) The family (µ^{s,t})_{s<t} is consistent in the sense of Definition 2.14, giving rise to a Markov measure MQ ∈ Marg((µ_t)_{t∈R}), that we call the Markov-quantile measure attached to (µ_t)_{t∈R}.
(c) For all s and t > s, µ^{s,t} = (G_s ⊗ G_t)_# L_{]s,t[}. To show Proposition 4.16 we first state the following crucial relation between Q^{s,t}_{[R]} ∈ Marg(µ_s, µ_t) and L_R ∈ Marg(λ, λ) defined in §4.1.
Remark 4.17. Take any finite subset R = {r_1, . . . , r_m} of ]s, t[ with s < t; then Q^{s,t}_{[R]} = (G_s ⊗ G_t)_# L_R. Indeed, Q_{s,t} = ^tq_s.q_t = Law(G_s, G_t), see Remark 4.3(a). It also equals (G_s ⊗ G_t)_#(Id_2), where Id_2 = Id_{λ,2} is the identity transport from λ to itself, see Notation 2.6, since by Remark 4.3(b), Q(µ_s, µ_t) = ^tq_s.q_t = ^tq_s.Id_2.q_t = (G_s ⊗ G_t)_#(Id_2). The same holds more generally for any {r_1, . . . , r_m} ⊂ ]s, t[ (notice that this writing involves neither ^tq_s nor q_t). Besides recall that q_s.^tq_s.q_s = q_s and ^tq_t.q_t.^tq_t = ^tq_t.

Proof of Proposition 4.16. (a) The supremum exists and is the limit of some sequence ((G_s ⊗ G_t)_# L_{R_n})_n. Hence the limit in (a) is given by Remark 4.17: Q_{s,r_1}.Q_{r_1,r_2}. · · · .Q_{r_{m(n)−1},r_{m(n)}}.Q_{r_{m(n)},t} = (G_s ⊗ G_t)_# L_{R_n}.
We prove now the first part of (c), i.e. µ^{s,t} = (G_s ⊗ G_t)_# L_{]s,t[}, and, at the same time, that the sequence (R_n)_n in (a) can be chosen to be nested. Let S denote the finite subsets of ]s, t[ and (R_n)_n be a nested sequence in S such that L_{R_n} → L_{]s,t[}, i.e. F[L_{R_n}] pointwise converges to F[L_{]s,t[}]. If M ∈ Marg(µ_s, µ_t) satisfies (G_s ⊗ G_t)_# L_R ⪯_lo M for every R ∈ S, this also holds for R = R_n for all n; we then conclude by Remark 3.34. Corollary 2.13 gives (b). Indeed, Proposition 4.14 on the composition of transports L_R gives the consistency of (µ^{s,t})_{s,t} (see Definition 2.14). For the second equality of (c), proceed as at the end of Remark 4.17.
We now prove Theorem A.

Proof of Theorem A. (a) Recall that MQ is Markov and defined in Proposition/Definition 4.16. By construction, MQ ∈ Marg((µ_t)_{t∈R}) and satisfies (iv). Then MQ satisfies (ii), i.e. has increasing kernel, as quantile couplings have, see Remark 3.25, and since this property is stable by composition and weak limit, see Remark 3.14 and Lemma 2.24. The last claim of (iii) concerns (X_t)_{t∈R} with law MQ. An alternative writing is that for all P = Law(Y) as above and all s < t, F[MQ_{s,t}] ≥ F[P_{s,t}], i.e. MQ_{s,t} ⪯_lo P_{s,t}. To show it, it is sufficient to show that for any strictly increasing m-tuple (r_1, . . . , r_m) with r_1 = s and r_m = t, Q_{r_1,r_2}. · · · .Q_{r_{m−1},r_m} ⪯_lo P_{r_1,r_m}. Indeed MQ_{s,t} = losup{Q_{r_1,r_2}. · · · .Q_{r_{m−1},r_m} : s = r_1 < . . . < r_m = t}, by Proposition/Definition 4.16. We write this as (21).

For every finite subset R of R, P ∈ Marg((µ_t)_{t∈R}) and (s, t) with s < t we introduced the couplings P^{s,t}_{[R]} ∈ Marg(µ_s, µ_t) in Theorem 2.26 (and actually in Theorem A(iv) p. 4 in the case P = Q). We used them in §4.2. Now we introduce the measure P_{[R]} ∈ Marg((µ_t)_t) that was announced in §1.4 in Notation 1.11. The notation is consistent, i.e. for all s < t, (proj_{s,t})_# P_{[R]} is the previously defined P^{s,t}_{[R]}. Then we prove Theorem B, which means that we implement the tentative program introduced p. 11 sq. in §1.4, in a way that avoids the problems explained there. By Proposition 4.14, for any R ⊂ R, (L_{R∩[s,t]})_{s<t} is a consistent family in Marg(λ, λ), thus again Proposition 2.12 enables us to define the following processes on the set of quantile levels.

The goal of the remaining part of this section is to prove the following statement, a more precise and technical version of Theorem B. After some preparation its part (a) is proved p. 45. Its parts (b) and (c) are proved p. 49 after some more auxiliary results. In the statement below, see Notation 4.10(b) for L_*, Definition 1.25 for "atomic times" and Definition 4.25 for "essential atomic" intervals or times.
(c) Conversely, if R satisfies (26), then: (i) For any nested finite sets (R_n)_n such that R = ∪_n R_n, (Q^{[R_n]})_n →_{n→∞} MQ. In other words, (26) is a property of R. Moreover, (26) is also satisfied by any countable R′ ⊃ R. (ii) Let E ⊂ R be the set of non-atomic times of R; then R \ E satisfies (26). Moreover, for any finite set E of non-essential atomic times, there is a set R satisfying (26) and such that R ∩ E = ∅. (iii) The set R meets each essential atomic interval of (µ_t)_t, hence in particular it contains all its essential atomic times (which are at most countably many, by Proposition 4.26).  µ_t = λ_{[0,1]} otherwise: then there is no essential atomic interval, every time is atomic, and R suits if and only if R ∩ Q is dense in Q and R ∩ (R \ Q) is dense in R \ Q, so that not every set dense in R is suitable. One might think of "R is the projection on R of a set dense in the set of atomic levels" (see Notation 4.4) as an at least sufficient condition, but it is not.
(b) In point (c)(ii) of Theorem 4.21, any non-essential atomic time t may be avoided by a set R satisfying (26), but it is not true that if R satisfies (26), any such t ∈ R may be removed from R without making (26) false: see Example 4.29 where R consists of a single non-essential atomic time.
(c) Condition (25) implies (26), but notice that it is not necessary. Take, e.g., µ_t = λ_{[0,1]} for t < 0 and µ_t = δ_0 otherwise. Then R satisfies (25) if and only if R ∩ R_+ ≠ ∅, but Q is Markov, so that (26) is true with R = ∅.
Lemma 4.23. Let T denote a totally ordered set of indices (in practice, T = R_+ or T = N). If (R_τ)_{τ∈T} is a family of subsets of R, increasing for the inclusion, then, setting R = ∪_τ R_τ, the family (L_{R_τ})_{τ∈T} is increasing for ≼_lo and tends weakly to L_R when τ tends to infinity.
Proof. It rests on the following remark: by definition of ≼_lo in Definition 3.4 and of ρ in Proposition 3.27, if A, A′, A″ are measures of P(R^d) with the same marginals and A ≼_lo A′ ≼_lo A″, then ρ(A′, A″) ≤ ρ(A, A″). Now we prove the lemma. By the definition of L_{R_τ} in Lemma 4.10(b), the family (L_{R_τ})_{τ∈T} is increasing, whether the sets R_τ are finite or not. By Lemma 4.10(c), and since ρ metrizes the weak topology (Proposition 3.27), for any ε > 0 we find a finite R′ ⊂ R such that ρ(L_{R′}, L_R) ≤ ε. Since R = ∪_τ R_τ there is a τ_0 such that R_{τ_0} ⊃ R′. Then: τ ≥ τ_0 ⇒ L_{R′} ≼_lo L_{R_τ} ≼_lo L_R ⇒ ρ(L_R, L_{R_τ}) ≤ ε, by the remark.
Therefore, with L′ = L_{]s,t[∩R_n} we obtain the desired convergence to L_{]s,t[}.
Definition 4.25. Let I be an interval. If, for some interval J ⊃ I such that J \ I is disconnected (and then for every smaller such J′), L_J ≠ L_{J\I}, we call I an essential atomic interval of (µ_t)_{t∈R}. If I = {t} is essential, we call t an essential atomic time of (µ_t)_{t∈R}.
To check the parenthetical claim in the definition, suppose that L_{J′} = L_{J′\I} for some smaller such J′, and deduce L_J = L_{J\I} by Proposition 4.14.
Proposition 4.26. If a nested sequence (R n ) n is as in Lemma 4.24 then ∪ n R n contains all the essential atomic times of (µ t ) t . In particular, those times are at most countably many.
Proof. We show a contrapositive result. Suppose that t is an essential atomic time, that s < t and s′ > t are such that L_{]s,s′[\{t}} ≠ L_{]s,s′[}, and that (R_n)_n is a nested sequence of finite sets such that t ∉ ∪_n R_n. Then (R_n)_n cannot be as in Lemma 4.24.
Remark 4.27. Be careful that the property of being an essential atomic interval is in general preserved neither by the union of two such (intersecting) intervals, nor by their intersection, nor by passing to an interval containing such an interval or included in it.  (b) Suppose, to simplify, that some µ ∈ P(R) has exactly one atom x and consider a family (µ_t)_{t∈R} such that µ = µ_0. An obvious sufficient condition for 0 to be unessential is to choose µ_t such that for a certain sequence t_n → 0, µ_{t_n} has an atom x_n such that A_{t_n,x_n} ⊃ A_{0,x} (see Notation 4.4), or even only: ∀ε > 0, ∃n_0 : n ≥ n_0 ⇒ A_{t_n,x_n} + ]−ε, ε[ ⊃ A_{0,x}, i.e. "atoms merging the same quantile levels as x merges at t = 0 accumulate at 0". Indeed, taking possibly a subsequence, we may suppose that (t_n)_n tends to zero from the right or the left (say, from the right). If µ_0 has several atoms (x_i)_{i∈I}, a similar statement can be shown, the condition being that each of the intervals A_{0,x_i} has the property above.
(c) A necessary condition for an atomic time t to be unessential is im- Then for ε small enough, L ]t−ε,t+ε[ restricted to J 2 is the identity transport, which prevents t from being essential (see Remark 4.34).
(d) The condition of point (b) is not necessary, nor that of point (c) sufficient. For instance take µ_0 = δ_0 and, for t ≠ 0, µ_t = a(t)δ_0 + (1 − a(t))δ_1. If a has unbounded total variation on every interval ]0, r[, then every L_{]0,r[} is the uniform measure on [0, 1]², see Example 6.9, thus 0 is not essential (in fact, not even right-essential in the sense given in Remark 4.34). On the contrary, if a has bounded total variation, none of the measures L_{]0,r[} or L_{]−r,0[} is uniform, hence 0 is essential (see Remark 4.34).
Example 4.29. A family (µ t ) t∈R may have an essential atomic interval and no essential atomic time. Let I be any interval (but not a singleton, for our purpose) and take, e.g., µ t = δ 0 for t ∈ I and µ t = λ [0,1] otherwise, then I is the only essential atomic interval of (µ t ) t . Here, Q [{t 0 }] = MQ for any t 0 ∈ I.
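The composition of quantile couplings through intermediate times, which underlies the processes Q^{[R]}, can be illustrated numerically. The following sketch (in Python, with hypothetical discrete marginals; the names `quantile_coupling` and `compose` are ours, not taken from the text) shows in particular that composing through a single-atom marginal, as in Example 4.29, produces the independent coupling:

```python
import numpy as np

def quantile_coupling(p, q):
    """Quantile (monotone) coupling of two discrete measures whose atoms are
    listed in increasing order: the 'northwest corner' rule matches quantile
    levels in increasing order."""
    C = np.zeros((len(p), len(q)))
    i = j = 0
    a, b = p[0], q[0]
    while True:
        m = min(a, b)
        C[i, j] += m
        a -= m
        b -= m
        if a <= 1e-15:
            i += 1
            if i == len(p):
                break
            a = p[i]
        if b <= 1e-15:
            j += 1
            if j == len(q):
                break
            b = q[j]
    return C

def compose(C1, C2, mid):
    """Markov composition of two couplings through the middle marginal:
    (C1.C2)[x, z] = sum_y C1[x, y] * C2[y, z] / mid[y]."""
    return C1 @ (C2 / mid[:, None])

mu1 = np.array([0.5, 0.5])
mu3 = np.array([0.5, 0.5])

# Composing through a single atom (mu2 a Dirac mass) makes past and future
# independent, as in Example 4.29:
mu2 = np.array([1.0])
C13 = compose(quantile_coupling(mu1, mu2), quantile_coupling(mu2, mu3), mu2)
print(C13)            # the product coupling [[0.25, 0.25], [0.25, 0.25]]

# Composing through a 3-atom marginal only partially decorrelates:
mu2 = np.array([0.25, 0.5, 0.25])
C13 = compose(quantile_coupling(mu1, mu2), quantile_coupling(mu2, mu3), mu2)
print(C13)            # [[0.375, 0.125], [0.125, 0.375]]
```

The composition is the matrix-product analogue hinted at in Remark 2.3: couplings act as transition matrices once each row is renormalized by the middle marginal.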
We can now prove Theorem 4.21(a). To provide a clear proof of Theorem 4.21(b) we introduce an auxiliary notion in Definition 4.30 below. Suppose that some P ∈ Marg(µ, µ_1, . . . , µ_k) is disintegrated as P = Joint(µ, k_P) and that g : R → R and h : R^k → R^k are measurable maps. Can we disintegrate (g ⊗ h)_# P? In case g is injective one easily checks that (g ⊗ Id)_# P = Joint(g_# µ, k^g_P) where k^g_P is defined by k^g_P(y, ·) = k_P(g^{−1}(y), ·), g_#µ-almost surely. Otherwise, the next notion and lemma will enable us to obtain a similar disintegration, and associated properties.
Definition 4.30. We say that g : R → R fits P ∈ Marg(µ, µ_1, . . . , µ_k) if there exists a kernel k^g_P such that k^g_P(g(x), ·) = k_P(x, ·), µ-almost surely. Remark 4.31. (a) If g fits P = Joint(µ, k) and h is a measurable map from R^k into itself, we can disintegrate (g ⊗ h)_# P as Joint(g_#µ, h_# k^g_P) where h_# k^g_P(y, ·) = k^g_P(y, h^{−1}(·)). (b) If g fits P, it also fits P.P′ and P • P′. (c) If g fits P ∈ Marg(µ, ν_1, . . . , ν_k) or Q ∈ Marg(µ, ν′_1, . . . , ν′_{k′}), and if f : R^k → R^k and h : R^{k′} → R^{k′} are measurable maps, then: The proofs are direct, using the definitions of a kernel, composition and catenation in Subsection 2.1.4. Notice that, in case µ = λ_{[0,1]} and g is a quantile function (which is the only case in which we will use the remark), point (c) is a particular case of Lemma 4.33(b) below. In the language of this lemma, "g fits ᵗP or Q" means that ᵗP.L = ᵗP or L.Q = Q, which both imply in particular that ᵗP.L.Q = ᵗP.Q. We will need the following little technical result. Hence, since g is increasing, ∫ g dµ ≤ ∫ g dν, which is equivalent to the desired inequality.
Plainly, equality occurs if f or g is constant. If f is not constant, we in fact proved that equality in ∫ g_c dµ ≤ ∫ g_c dν holds for g_c = 1_{[c,b]} if and only if c ∈ {a, b}. This remains true for positive combinations of the functions g_c and, being a little careful, for limits of them.
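To fix ideas about Definition 4.30, here is a small discrete illustration (a Python sketch with made-up data, not taken from the text): a map g merging two points fits P exactly when the kernel of P is constant on the merged set, and then (g ⊗ Id)_#P disintegrates as Joint(g_#µ, k^g_P):

```python
import numpy as np

# A coupling P = Joint(mu, k_P) on {0, 1, 2} x {0, 1}: P(x, y) = mu(x) k_P(x, y).
mu  = np.array([0.2, 0.3, 0.5])
k_P = np.array([[0.8, 0.2],    # k_P(0, .)
                [0.8, 0.2],    # k_P(1, .)  (equal to k_P(0, .))
                [0.1, 0.9]])   # k_P(2, .)
P = mu[:, None] * k_P

# g merges 0 and 1 (g(0) = g(1) = 0, g(2) = 1).  It fits P precisely because
# k_P is constant on the merged set, so a kernel k^g with
# k^g(g(x), .) = k_P(x, .) mu-a.s. exists:
g_mu = np.array([mu[0] + mu[1], mu[2]])          # g_# mu
k_g  = np.array([k_P[0], k_P[2]])                # k^g(0,.) = k_P(0,.) = k_P(1,.)

# Check the disintegration (g x Id)_# P = Joint(g_# mu, k^g):
lhs = np.array([P[0] + P[1], P[2]])              # pushforward of P by g x Id
rhs = g_mu[:, None] * k_g
assert np.allclose(lhs, rhs)
print(lhs)
```

If the two rows k_P(0, ·) and k_P(1, ·) differed, no such kernel k^g could exist, which is the situation Lemma 4.33 quantifies for quantile functions merging atomic levels.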
Lemma 4.33. Take µ ∈ P(R), denote by g = G_µ its quantile function and F = F_µ its cumulative distribution function. Take L = Q(λ, µ).Q(µ, λ) and, similarly as in Notation 4.4, let (b_i)_{i∈I} be the atoms of µ and, for each i, A_i the interval ]F(b_i^−), F(b_i)[ of quantile levels merged by g on b_i. Finally set A = ∪_{i∈I} A_i. Moreover, take ((ν_i)_i, (ν′_i)_i) ∈ P(R)^{k+k′}, P ∈ Marg(λ_{[0,1]}, ν_1 ⊗ . . . ⊗ ν_k) and Q ∈ Marg(λ_{[0,1]}, ν′_1 ⊗ . . . ⊗ ν′_{k′}). Suppose that P and Q have increasing kernel (for ≼_lo in place of ≼_sto if k ≥ 2 or k′ ≥ 2). Then: (a) ᵗP.L.Q equals ᵗP.Q if and only if for each i ∈ I, at least one of the two kernel functions
Proof. We recall particularly here that we have throughout in mind the analogy between products of transport plans, or of their kernels, and products of matrices, hinted at in Remark 2.3. In the following, B_1 ⊂ R^k and B_3 ⊂ R^{k′} stand for any sets of the type ∏_j ]−∞, d_j] and B_2 for any interval ]−∞, b]; (a, b, c) stand for variables in the targets of (f, g, h), and (x, y, z) for variables in their sources. The coupling L, equal to q_r.ᵗq_r (with µ = µ_r) introduced in Remark 4.3, is described in this remark. It is such that ᵗP.L.Q = (ᵗP.L).(L.Q) = ᵗ((g ⊗ Id_k)_# P).((g ⊗ Id_{k′})_# Q). In turn, for any measurable functions f : R^k → R^k and h : R^{k′} → R^{k′}: Then, using for instance the expression (7) given in Definition 2.8 of the catenation, and the fact that for any transport plans R and R′, R.R′ = proj^{1,3}_#(R • R′), we get (27).
Remark 4.34. From Lemma 4.33(a), we see that t is an essential atomic time if and only if µ_t has at least one "essential atom" x in the following sense: x is a "left-essential" atom, in the sense that for some ε > 0, if s ∈ ]t−ε, t[, the kernel x ↦ ᵗL_{]s,t[}(x, ·) (see Notation 4.4) is not constant on A_{t,x}, and x is a "right-essential" atom, in the sense that for some ε > 0, if s ∈ ]t, t + ε[, the kernel x ↦ L_{]t,s[}(x, ·) is not constant on A_{t,x}.
Calling "left-essential" an atomic time t such that µ_t has at least one left-essential atom (and symmetrically for "right-essential" atomic times), left- or right-essential atomic times need not be essential. However, left- or right-essential atomic times are also at most countably many. We provide a sketch of proof, written for right-essential times: show that any such time is a discontinuity point of some function ϕ_s : ]−∞, s] ∋ u ↦ L_{[u,s]}. Now if some ϕ_s is discontinuous at t, so is ϕ_{s′} for every s′ ∈ ]t, s] (use Remark 3.36). Hence, the union of the sets of discontinuity points of the functions ϕ_s, for s ∈ R, is the same as their union for s ∈ Q. Finally the claim below, left to the reader, implies that for each ϕ_s, this set is at most countable, which gives the result. Claim. If ψ : u ↦ M_u ∈ Marg(λ, λ) is increasing or decreasing for ≼_lo, it has at most countably many (weak) discontinuity points. Now we can prove the end of Theorem 4.21, i.e. its parts (b) and (c).
Proof of Theorem 4.21(b)-(c). Let us prove point (b). Take R ⊂ R and Lev_R and Lev in Marg((λ_t)_{t∈R}) given by Definition 4.19. We will need the following claim, which extends Remark 4.20 and relies eventually on Remark 4.3(b): Let us prove it. Take R ⊂ R finite and n its cardinal. We must prove it for any finite S ⊂ R, where, by an abuse of notation we will often make use of, G stands for ⊗_{i=1}^N G_{s_i}. It suffices to prove it in the case S ⊃ R, which we suppose now. We introduce the cardinals ℓ_0, . . . , ℓ_n of the subsets of S \ R situated between two consecutive elements r_k and r_{k+1} of R ∪ {±∞}. We reindex those subsets, hence we must prove that: For all r ∈ R, the quantile function G_r = G_{µ_r} fits q_r = Joint(λ, x ↦ δ_{G_r(x)}), so we may take k^{G_r}_{q_r} : x ↦ δ_x as given by Definition 4.30. Now, by Remark 4.31(b), it also fits L_{{r}} = ᵗL_{{r}} = q_r.ᵗq_r. Therefore, by Remark 4.31(b) and (c): Proving that B_k = Q(µ_{r_k}, µ_{s^k_1}, . . . , µ_{s^k_{ℓ_k−1}}, µ_{r_{k+1}}) will now prove the claim.
So only the first and last variables may differ. Now, notice that by Remark 4.3(b) applied to T = L_{{r}} = q_r.ᵗq_r, if two variables (U, V) satisfy Law(U, V) = L_{{r}}, the law of (G_r(U), G_r(V)) is ᵗq_r.L_{{r}}.q_r = ᵗq_r.q_r.ᵗq_r.q_r = ᵗq_r.q_r = Id_{µ_r,2}, therefore G_r(U) = G_r(V) almost surely. Using this with r = r_k and (U, V) = (U_1, U_2), respectively r = r_{k+1} and (U, V) = (U_2, U_3), we get respectively G_{r_k}(U_1) = G_{r_k}(U_2) and G_{r_{k+1}}(U_2) = G_{r_{k+1}}(U_3) almost surely. The claim is proven. Now take R ⊂ R satisfying (25) and consider Lev_R. It exists by Corollary 2.13, whose assumption is satisfied by Proposition 4.14. All those transport plans have increasing kernel, hence by Lemma 2.23, for every Thus Lev_{R_n} converges weakly to Lev_R, and then by Remark 3.34, G_#Lev_{R_n} → G_#Lev_R. We are left with the task of proving G_#Lev_{R_n} = Q^{[R_n]} and G_#Lev_R = MQ. The former is our claim above. Let us prove the latter.
Note: At the beginning of §4 we announced (19), i.e. G_#Lev = MQ. In fact, we prove G_#Lev_R = MQ, which is a bit more difficult. To get (19) the same arguments work, the final reasoning with Lemma 4.33(b) being replaced by a direct use of Remark 4.31(c), as for all {r_1, . . . , r_ℓ}, G_r fits Lev_{r_1,...,r_ℓ}.
We recall that MQ was defined as the unique Markov law with the same 2-dimensional marginals as G_#Lev. But by (25) and Proposition 4.16(c), the 2-marginals of G_#Lev and G_#Lev_R are equal. Hence it is sufficient to prove that G_#Lev_R is Markov, i.e. that for all (s_1, . . . , s_k), (G_#Lev_R)_{s_1,...,s_k} = (G_#Lev_R)_{s_1,s_2} • . . . • (G_#Lev_R)_{s_{k−1},s_k}. Since Lev_R is Markov, (Lev_R)_{s_2,...,s_k} = (Lev_R)_{s_2,s_3} • . . . • (Lev_R)_{s_{k−1},s_k}; besides, notice the following fact, which we will prove a bit below: (Lev_R)_{s_2,...,s_k}, viewed as a transport plan from (R, λ) to (R^{k−1}, λ^{⊗(k−1)}), has increasing kernel (for the order ≼_lo instead of ≼_sto). The second point is true by Proposition 4.14, and if s_2 ∉ R, by Proposition 4.26, s_2 is not an essential atomic time, which gives the wanted equality. We finally must prove the fact stated above. Actually it is true for any catenation P_1 • . . . • P_k of couplings P_i from R to R with increasing kernel, and (Lev_R)_{s_1,...,s_k} is of this type, see (28). We check this for k = 2; the same argument, applied by induction, gives the general case. Take B_2 and B_3 two intervals of the type ]−∞, a] and x ≤ x′; we must show that: As P_2 has increasing kernel, the function k_{P_2}(·, B_3) is decreasing (thus also 1_{B_2} k_{P_2}(·, B_3)). As P_1 has increasing kernel, k_{P_1}(x, ·) ≼_sto k_{P_1}(x′, ·).
Thus the desired inequality follows. Let us prove (c)(i): the limits lim_n Q^{[R_n]}_{s,t} and lim_n Q^{[R′_n]}_{s,t} are equal by Lemma 4.23, since ∪_n R_n = ∪_n R′_n. Besides, if R′ ⊃ R, taking a nested sequence (R′_n)_n of finite sets such that ∪_n R′_n = R′ and R′_n ⊃ R_n for all n, we get that, for any s and t > s, lim_n Q^{[R_n]}_{s,t} = MQ_{s,t}, but by the minimality property of Theorem A(iii), for all n, MQ_{s,t} ≼_lo Q^{[R′_n]}_{s,t}. Therefore, lim_n Q^{[R′_n]}_{s,t} = MQ_{s,t} and the result follows. For (ii), it is sufficient to show that for any finite subsets R and E of R with µ_t diffuse for all t ∈ E, Q^{[R∪E]}_{s,t} = Q^{[R]}_{s,t}. Indeed, as µ_t is diffuse, G_t is injective, hence trivially fits P and Q. Then: To alleviate the writing, we prove the rest of (ii) with #E = 1 (the general proof is alike). Take R given by point (a) and t ∈ ]s, s′[ some unessential atomic time of (µ_t)_t, then Let us prove (iii). Suppose that I is some essential atomic interval, i.e. there is an interval J ⊃ I such that J \ I is disconnected and L_J ≠ L_{J\I}, and assume that I ∩ R = ∅. Since L_J ≼_lo L_{J\I}, this means that there is some by (s, s′). If a ∈ A_s and a′ ∈ A_{s′} (see

5. A Markov probabilistic representation of the continuity equation on R

5.1. Introduction. As we briefly mentioned in §3.2 and explain below in Reminder 5.7, quantile couplings are optimal transport plans for the quadratic cost function. This suggests that the quantile process Q or even the Markov-quantile process MQ could be minimizers of dynamical optimal transport problems. This is true and rather well-known for Q; one approach is in [46]. In this section we show that this also makes sense for MQ, and in which terms it can be formulated. Those terms make sense in dimension greater than one, leading to the question of a generalization of MQ in those dimensions, see Item (b) in the list below in this introduction.
Here is the minimization problem at stake. We consider a now classical action introduced by Benamou and Brenier in the context of the incompressible Euler equations, see Definition 5.9. If X = (X_t)_{t∈[0,1]} is a process, its action is:
A(X) = E(∫_0^1 |Ẋ_t|^2 dt).
Note however that the original definition by Benamou and Brenier involves the velocity vector fields (one usually calls it "Eulerian") while we present its "Lagrangian" dual action involving the trajectories t ↦ X_t. As will become clear in this section, this action for infinitely many marginals is simply related to the quadratic transport problem with two marginals.
The origin of this research goes back to the interpretation by Arnold in [6] of the solutions of the incompressible Euler equations on a compact Riemannian manifold as geodesic curves in the space of diffeomorphisms preserving the volume Vol. In [10], Benamou and Brenier relaxed the minimisation problem attached to those geodesics and introduced generalized geodesics that are, in probabilistic terms, continuous processes X = (X t ) t∈[0,1] with Law(X t ) = Vol at every time. Their minimisation property is encoded in the fact that they minimize A under the constraint that the marginals Law(X t ) and Law(X 0 , X 1 ) are prescribed.
Later, see [27,44], Otto and his coauthors discovered that the solutions of some PDEs, in particular the Fokker-Planck and porous medium equations, can be thought of as curves of maximal (negative) slope for some functionals F on the space of probability measures endowed with the 2-transport distance (alias Wasserstein distance). This yields a comprehensive picture of the infinite-dimensional manifold of measures used in optimal transport, building a differential calculus on it, called "Otto calculus". In this context, the derivative of the curve (µ_t)_t at time t shall be seen as a vector field v_t of gradient type, square integrable with respect to µ_t, such that the transport (or continuity) equation
(30) ∂_t µ_t + div(v_t µ_t) = 0
holds. The quantity ∫ |v_t|^2 dµ_t corresponds to E(|Ẋ_t|^2) in Benamou-Brenier's action; it has to coincide with the opposite of the slope of F at µ_t. A thorough study of those questions has been conducted in the monograph [3] by Ambrosio, Gigli and Savaré (see also [11,4]) under very loose assumptions on the curve (µ_t)_t or the vector field (v_t)_t. They proved, in particular, that the vector field (v_t)_t is uniquely determined if (µ_t)_t is absolutely continuous of order 2 (see "AC^2" in §5.2). They also showed that a process minimizing the action, for prescribed marginals µ_t, exists, by using limits of solutions of mollified versions of (30). Almost every trajectory of the process is in fact a solution of the Cauchy problem Ẋ_t = v_t(X_t). In a further work [38], Lisini studied the AC^2 curves of probability measures on a metric space. In this context, where the continuity equation is not defined, he also proved that the action can be minimized. Now here is the link with our work: in both the results by Ambrosio-Gigli-Savaré and Lisini, no statement is given on the uniqueness of the minimizing process (X_t)_t. But on R, the Markov-quantile process turns out to be a minimizing process, which yields a canonical minimizer.
That notion depends of course on the chosen criterion making it canonical: for instance, the quantile process is a minimizer and can also be considered canonical. Our criterion and its interest are as follows. In this context, where (µ_t)_t ∈ AC^2([0, 1], P_2(R^d)), i.e. has finite energy, using Theorems A and B gives rise to the two following results when d = 1. The first one is a slightly enhanced version of Theorem D given in the introduction.
(a) Theorem 5.17 makes explicit under which assumptions and in which sense MQ is a canonical minimizer of the action. The existence of such a minimizer, in any dimension d, is classical, and our work adds a uniqueness result when d = 1, under the assumption that it is Markov, and obtained as a limit of products of couplings.
(b) Theorem 5.20 obtains the process MQ, which is a minimizer of A, as a limit of interpolating processes belonging to Disp Rn (see Definition 5.18) instead of the limit of (Q [Rn] ) n as in Theorem B. Using limits of interpolating processes is the classical way to obtain minimizers (see [54,Chapter 7], [38]) in any dimension d, so this places our work within this context. The interest of doing it is that then, our process (that exists for d = 1) satisfies a uniqueness property that makes sense for any d. It opens the question of the existence and uniqueness of a minimizer satisfying it, for any d, i.e. of a counterpart of the Markov-Quantile process in any metric space -see Open question 6.5.4.
In particular, point (a) shows that MQ is concentrated on absolutely continuous curves γ. In point (b), notice that in a general geodesic Polish metric space, the notions from Optimal Transport Theory yield good extensions of the quantile notions but, whereas a quantile catenation does not make sense, the (Markov) catenation defined in §2.1.4 does, as well as the Markov property. That is why the Markov-quantile process appears to be a better canonical minimizing process (open to generalizations) than the quantile process.
In §§5.2-5.4 we introduce the notions we need, which are mostly classical, and prove the propositions leading to Theorems 5.17 and 5.20. In §5.5 we state and prove them. As said above, in this section we give specific results in dimension d = 1, inside a general framework making sense in every dimension d. Hence we work in R^d everywhere this makes sense.
Notation 5.1. In this section (µ_t)_t is still a family of probability measures on R or R^d; in this §5 we index it by [0, 1]; (X, d) denotes some metric space (the related notions will be used with X = R^d or X = P_2(R^d) introduced below) and C([0, 1], X), or briefly C in case X = R, the space of continuous curves from [0, 1] to X, with the σ-algebra induced by the topology of ‖·‖_∞. Instead of Marg((µ_t)_t), we work here on the set Marg_C((µ_t)_t) of real probability measures on C (or C([0, 1], R^d), according to the context) with marginal µ_t for every t ∈ [0, 1].
for any B in the cylindrical σ-algebra of (R^d)^{[0,1]}. Notice that for any dense countable set D of [0, 1], Γ̄((proj_D)^{−1}(C(D, R^d))) = 1. Conversely, suppose that some Q ∈ Marg((µ_t)_t) satisfies this property. Then we say that Q is "concentrated on C((µ_t)_t)" and there is a unique Γ_Q ∈ Marg_C((µ_t)_t) such that Γ̄_Q = Q. So by a slight abuse, we will not distinguish Γ and Γ̄, or Q and Γ_Q. For Γ ∈ Marg_C((µ_t)_t) and R a finite subset of R, this gives sense, e.g., to Γ^{[R]} after Definition 4.18.
We denote by AC([0, 1], X) the space of such curves. As explained for instance in [3], where the definition is slightly different but equivalent, these curves admit for almost every t a metric derivative, which we denote by |γ̇|(t):
|γ̇|(t) = lim_{h→0} d(γ(t + h), γ(t))/|h|
(if X = R^n and γ is differentiable this is |γ̇(t)|, so the notation is consistent). Now we introduce the notion of energy and the subsequent Proposition 5.6, which seems classical but for which we could not find any reference in the literature. Similar results concerning the length, in particular for geodesic curves, can be found in [5,3]. We will consider them as known.
(ii) We treat (ii) in the case E(γ) < ∞, letting the reader adapt the details in the case E(γ) = ∞. Take ε > 0 and R = {r_0, . . . , r_{m+1}}; for all k, there is an α_1 > 0 such that |R′| ≤ min(α, α_1) ensures the second inequality below, hence (ii): (iii) Notice that a similar argument as above gives the Chasles relation E_a^c(γ) = E_a^b(γ) + E_b^c(γ) for a < b < c. Then we proceed in three steps.
By the Fatou lemma we get: This holds for every ε, so that γ ∈ AC^2([0, 1], X) and the announced formula holds.
Reminder 5.7. On P(R^d)^2 the following infimum (a minimum, by the Prokhorov Theorem) has all the properties of a distance except that it may be infinite; it is called the 2-Wasserstein distance:
W_2(µ, ν) = inf_{P ∈ Marg(µ,ν)} ( ∫ |y − x|^2 dP(x, y) )^{1/2}.
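In dimension 1 this infimum is attained by the quantile coupling, so W_2 can be computed by matching quantile levels. The following Python sketch (helper names are ours; empirical measures with equally weighted atoms stand in for general µ, ν) computes W_2 this way and a natural discretization of the energy of a curve of measures, ∑_k W_2(µ_{r_k}, µ_{r_{k+1}})^2/(r_{k+1} − r_k):

```python
import numpy as np

def w2_empirical(xs, ys):
    """W2 between two empirical measures on R with the same number of equally
    weighted atoms: in dimension 1 the quantile coupling is optimal for the
    quadratic cost, and it simply matches sorted samples."""
    xs, ys = np.sort(xs), np.sort(ys)
    return np.sqrt(np.mean((xs - ys) ** 2))

def discrete_energy(samples, times):
    """Discretized energy of a curve of measures sampled at the given times:
    sum_k W2(mu_{r_k}, mu_{r_{k+1}})^2 / (r_{k+1} - r_k)."""
    return sum(w2_empirical(samples[k], samples[k + 1]) ** 2
               / (times[k + 1] - times[k]) for k in range(len(times) - 1))

# The curve mu_t = (translation of mu_0 by t) moves at unit speed, so its
# discretized energy over [0, 1] is 1 whatever the time grid.
rng = np.random.default_rng(0)
base = rng.standard_normal(1000)
times = np.linspace(0.0, 1.0, 11)
samples = [base + t for t in times]
print(discrete_energy(samples, times))   # 1.0 up to float error
```

Refining the grid leaves the value unchanged here because translations are geodesics in Wasserstein space; for a general curve the discretized sums increase to the energy.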

5.3. Action - expected energy of a random curve.
is defined as: Now a series of natural remarks leads one to wonder about the behaviour of the measure P of Theorem A in this framework; Proposition 5.11 gives it. (b) If Γ is a measure on C, e.g., an element of Marg_C(µ), then: because of the monotone convergence theorem: use a monotone sequence of partitions and Proposition 5.6(b).
Proof. We need the following classical claim.
The claim and Proposition 5.6(c) give that the action A : P(C) → R is lower semi-continuous. Now take (R_n)_{n∈N*} a sequence of finite subsets of R as given by Theorem B and set Γ_n := Q^{[R_n]}. If we show that (Γ_n)_n converges weakly to MQ in P(C), we will get that A(MQ) ≤ lim inf_n A(Γ_n), hence the result, since E(µ) ≤ A(MQ) by Remark 5.10(c) and A(Γ_n) = E(µ) for all n by Remark 5.10(d). So let us show this. By the Chebyshev inequality, for every ε there exists α > 0 such that, for all n ∈ N*: Γ_n({γ ∈ C : E(γ) > α}) < ε and Γ_n({γ ∈ C : |γ(0)| > α}) < ε.
Hence the set N := {γ ∈ C : E(γ) ≤ α} ∩ {γ ∈ C : |γ(0)| ≤ α} has Γ_n-mass greater than 1 − 2ε for all n. It follows from its definition that on N, ∫_0^1 |γ̇|^2 and thus also ∫_0^1 |γ|^2 are bounded, hence N is included in a ball of the Sobolev space W^{1,2}([0, 1]). This Banach space is compactly embedded in C, see [13, Theorem 8.8], so that N is relatively compact in C. So according to the Prokhorov theorem any subsequence of (Γ_n)_n has a (weak) limit point. But by Theorem B, each finite marginal of (Γ_n)_n tends weakly to the corresponding marginal of MQ, hence every such limit point must be MQ, hence (Γ_n)_n tends weakly to MQ.
we also define proj_t^2 by proj_t^2(γ) = γ̇(t) on the set where γ̇ is defined and proj_t^2(γ) = 0 on its (null) complement. As defined in [3, Definition 5.4.2] we introduce the barycentric projection.
Definition 5.13. Take Γ ∈ P(AC([0, 1], R^d)) and for all t ∈ [0, 1] denote (proj_t)_#Γ by µ_t, (proj_t × proj_t^2)_#Γ by M_t, and by κ_t a kernel such that M_t = Joint(µ_t, κ_t). The barycentric projection of M_t is the µ_t-almost surely defined vector field u^Γ_t on R^d such that u^Γ_t(x) is the barycentre of κ_t(x, ·). Alternatively, it is defined by the equation:
∫ ⟨v(x), u^Γ_t(x)⟩ dµ_t(x) = ∫ ⟨v(x), w⟩ dM_t(x, w)
for every continuous bounded vector field v.
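A discrete illustration of the barycentric projection (a Python sketch with made-up atoms; `barycentric_projection` is our name): for a joint law of position and velocity, the projection averages the velocities conditionally on the position, so opposite velocities at the same point cancel:

```python
import numpy as np

# Discrete joint law M = Law(position, velocity): atoms (x, v) with weights w.
atoms = np.array([
    # x,    v,    w
    [0.0,  1.0, 0.25],
    [0.0, -1.0, 0.25],   # two opposite velocities at x = 0: they average out
    [1.0,  2.0, 0.30],
    [1.0,  1.0, 0.20],
])

def barycentric_projection(atoms):
    """u(x) = conditional mean velocity E[V | X = x]."""
    u = {}
    for x in np.unique(atoms[:, 0]):
        rows = atoms[atoms[:, 0] == x]
        u[x] = np.sum(rows[:, 1] * rows[:, 2]) / np.sum(rows[:, 2])
    return u

u = barycentric_projection(atoms)
print(u)   # u[0.0] = 0.0, u[1.0] = (2*0.3 + 1*0.2)/0.5 = 1.6
```

By Jensen's inequality this averaging can only decrease the kinetic term, which is the mechanism behind the Eulerian lower bound on the action.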
Reminder 5.14. If (µ_t)_t = (f_t λ_{R^d})_t is a family of measures on R^d with density (x, t) ↦ f_t(x) smooth with compact support, a smooth vector field v_t transports the measure µ_t, in the sense that its flow Φ_t makes µ_t(Φ_t(B)) constant for every Borel set B, if and only if v_t satisfies the continuity equation: (38) ∂_t µ_t + div_{µ_t}(v_t) = 0 (or ∂_t µ_t + div(v_t µ_t) = 0, see below), div_{µ_t}(v_t) standing for the signed measure L_{v_t} µ_t, where L is the Lie derivative. Now (38) keeps a weak meaning on R^d × [0, 1] in our framework, namely: for every smooth function ϕ : R^d × ]0, 1[ → R with compact support, ∫_0^1 ∫ (∂_t ϕ + ⟨∇ϕ, v_t⟩) dµ_t dt = 0. In (38), div_ν(v) depends only on the product vν, so may be written div(vν). Indeed, for g, h ∈ C^∞(R^d), div_{gν}(hv) = (d(gh).v)ν + gh div_ν(v).

5.5. Our resulting theorems on MQ as a minimizer in this context. The following theorem gathers: well-known facts, actually true on any P_2(R^d) for d ≥ 1, namely (a) and (b)(i), i.e. the existence of measures Γ for which (35) is an equality; and enhancements of them following from our Theorems A and B and Proposition 5.11, notably the uniqueness in the Lagrangian statement for d = 1. The uniqueness of the field v_t in (a) is classical, but recall that it does not imply that of the minimizing process Γ tangent to it. Notice that it is not known whether the process can be chosen Markov for d ≥ 2 (see Open question 6.5.4). Moreover Q^{[R_n]} is only defined for d = 1.
Theorem 5.17 (Existence and uniqueness of representations). Take a curve µ = (µ_t)_{t∈[0,1]} in the Wasserstein space P_2(R) with finite energy E(µ). Then: (a) (Eulerian statement.) There exists a vector field v_t satisfying the continuity equation (38) and such that Inequality (40):
E(µ) ≤ ∫_0^1 ∫ |v_t|^2 dµ_t dt
is an equality. This vector field is unique.
(b) (Lagrangian statement.) There exists Γ ∈ Marg_C(µ) such that: (i) Inequality (35): A(Γ) ≥ E(µ) is an equality, (ii) the measure Γ is Markov, (iii) it is the limit in P(C) of a sequence (Q^{[R_n]})_n. Such a Γ is unique in Marg_C(µ); it is the Markov-quantile process MQ.
(c) (Link between them.) For any Γ minimizing the action, i.e. making (35) an equality, the curve γ ∈ C is Γ-almost surely a solution of the ODE γ̇(t) = v_t(γ(t)) for almost every time.
Proof. (a) With u^Γ_t given by Definition 5.13, note that A(Γ) ≥ ∫_0^1 ∫ |u^Γ_t|^2 dµ_t dt for every Γ, so that Proposition 5.16 gives the existence of the field. Its uniqueness comes from a standard argument: if u_t and v_t satisfy (38), so does w_t := (u_t + v_t)/2, but if they both make (40) an equality and differ on a non-null subset, ∫_0^1 ∫ |w_t|^2 dµ_t dt < E(µ), which contradicts (40). (b) Proposition 5.11 shows that Γ = MQ suits. By Theorem B, the conditions of Theorem 5.17(b) characterize the Markov-quantile process, which ensures the uniqueness.
(c) Use Proposition 5.16(a) and the uniqueness in (a).
In Definition 5.18, remember that an optimal transport is defined in Reminder 5.7.
Definition 5.18. Let R = {r 0 , r 1 , . . . , r m , r m+1 } be a partition in Part([0, 1]). We denote by Disp R the set of measures M ∈ P(C) that are dynamical transports made Markov at the points of R, and linearly (hence in fact optimally) interpolating (µ t ) t∈[0,1] between them, defined as follows.
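A minimal simulation sketch of such an interpolating measure (Python; the names and the concrete family of marginals are ours). For diffuse marginals, re-reading the quantile level from the position at each point of R is the identity, so the sketch reduces to quantile-matched piecewise-linear trajectories; the Markovization would only be visible at atoms:

```python
import numpy as np

def simulate_disp_path(quantile_fns, times, rng):
    """One trajectory of a piecewise-linear dynamical transport: between two
    consecutive partition times the path moves linearly from G_{r_k}(alpha)
    to G_{r_{k+1}}(alpha), the quantile-matched (hence optimal) displacement.
    Only the vertices (r_k, position) are returned."""
    alpha = rng.uniform()
    # with diffuse marginals the level read from the position at each r_k is
    # the same alpha, so it can be drawn once and for all
    return [(t, G(alpha)) for t, G in zip(times, quantile_fns)]

# Marginals mu_t = Uniform([t, t+1]), whose quantile function is G_t(a) = t + a.
times = [0.0, 0.25, 0.5, 0.75, 1.0]
quantile_fns = [lambda a, t=t: t + a for t in times]
rng = np.random.default_rng(1)
paths = [simulate_disp_path(quantile_fns, times, rng) for _ in range(2000)]

# Sanity check: the marginal at time 0.5 should be Uniform([0.5, 1.5]).
xs_at_half = [path[2][1] for path in paths]
print(np.mean(xs_at_half))   # close to 1.0
```

Each trajectory here is a straight line of unit slope, so the empirical action of these paths equals the energy of the curve, in accordance with Remark 5.19.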
Moreover, in dimension d = 1, a Markov limit Γ exists and if a limit Γ is Markov, it is the Markov-quantile measure in Marg C ((µ t ) t∈[0,1] ).
Proof. Adapting [54, Chapter 7], [38] or Proposition 5.11 to our context, we obtain the first part of the theorem for every d ≥ 1. This requires slight modifications that we do not detail: Villani's chapter is in fact written for geodesic curves (µ_t)_t between prescribed µ_0 and µ_1, and Lisini's processes are attached to curves (µ_t)_{t∈[0,1]} of finite energy, but the processes of his sequence are constant on each interval between two consecutive points of the partition, whereas ours are linear. Note, as an indication, that our measures Γ_n minimize A in {Γ ∈ P(C([0, 1], R^d)) : ∀r ∈ R_n, Γ_r = µ_r}, the minimum being A(Γ_n) = E(µ, R_n).
In case d = 1, take as before a nested sequence (R_n)_n such that Q^{[R_n]} converges to MQ in P(C). Up to taking a subsequence, the same sequence of partitions permits Γ_n ∈ Disp_{R_n} to converge to some Γ. By Definitions 4.18 and 5.18, for every S ⊂ R_n the measure (proj_S)_#Γ_n coincides with (proj_S)_#Q^{[R_n]}, and in the limit (proj_S)_#Γ = (proj_S)_#MQ. As R_∞ is dense in [0, 1] and the measures are concentrated on C, it follows that Γ = MQ. This proves the existence part in case d = 1. For the uniqueness statement, take as before a nested sequence (R_n)_n and let Γ_n be the single element of Disp_{R_n} (see Remark 5.19). Assume that (Γ_n)_n has a Markov limit Γ. By Definitions 4.18 and 5.18, for every S ⊂ R_n the measure (proj_S)_#Γ_n coincides with (proj_S)_#Q^{[R_n]}. Using the same argument as for Proposition 5.11, up to taking a subsequence, (Q^{[R_n]})_n converges to an element of Marg_C(µ) that we denote by Γ′. Hence for every S ⊂ R_∞, (proj_S)_#Γ = (proj_S)_#Γ′. As R_∞ is dense in [0, 1] and the measures are concentrated on C, it follows that Γ′ = Γ. Note now that for every n ∈ N, the measure Q^{[R_n]} has increasing kernel, so this also holds for Γ′; similarly M is stabilised by Q^{[R_n]} and Γ′ (Remark 3.33 and Lemma 2.24). Finally Γ′ is a process satisfying (i) and (ii) of Theorem A. For s < t we have, on the one hand, MQ_{s,t} ≼_lo Γ′_{s,t} because the Markov-quantile measure is minimal (Theorem A(iii)). On the other hand, for every s < t in R_∞ we have Γ′_{s,t} ≼_lo MQ_{s,t} because (Γ′)_{s,t} is a limit of products of quantile couplings, and MQ_{s,t} is defined in Proposition 4.16 as a supremum of this class for ≼_lo. Therefore the Markov processes Γ = Γ′ and MQ have the same law on R_∞, hence coincide as measures on C.
Remark 5.21. We have seen that both Q ∈ Marg(µ) and MQ ∈ Marg(µ) minimize A. The quantile measure Q is also the optimizer of another multimarginal transportation problem raised by Brendan Pass in [46]. It minimizes a cost functional involving a strictly convex function ϕ, and is the unique minimizer. Some assumptions are required (see [46, Part 2]) but some of them can probably be relaxed. The paper is based on the fact that P ∈ Marg(µ_1, . . . , µ_d) ↦ ∫ ϕ(x_1 + · · · + x_d) dP(x_1, . . . , x_d) is minimized by Q(µ_1, . . . , µ_d).

6. Examples and open questions
6.1. Example of Markov-quantile processes attached to discrete measures on N. In this section $x_+$ denotes the positive part $\max\{0, x\}$ of any $x \in \mathbb R$.
Example 6.1 (Discrete measures). Let $(\mu_t)_{t\in[0,1]}$ be concentrated on $\mathbb N$ for every $t$ and assume that for every $k \in \mathbb N$ the map $A_k : t \mapsto \sum_{i=0}^k \mu_t(i)$ is in $C^1([0,1])$ and piecewise monotone (e.g., $A_k$ is analytic). Let moreover $A_{-1}$ be the zero constant function. We assume that $|A_k'(t)|/\mu_t(k)$ is bounded from above for $(t,k) \in [0,1]\times\mathbb N$. Then, using the characterization of the Markov-quantile process as a limit of quantile couplings, namely Theorem A(iv), it can be proved that the Markov-quantile process $(X_t)_{t\in[0,1]}$ is the continuous-time Markov chain with jump rates
$$q_{k,k+1}(t) = \frac{(-A_k'(t))_+}{\mu_t(k)} \quad\text{from } k \text{ to } k+1, \qquad q_{k,k-1}(t) = \frac{(A_{k-1}'(t))_+}{\mu_t(k)} \quad\text{from } k \text{ to } k-1,$$
and $q_{k,j} = 0$ for $|j - k| > 1$. Denoting $\mathbb P(X_t = k)$ by $p_k$, it means that the so-called forward Kolmogorov–Chapman system is satisfied:
$$p_k'(t) = q_{k-1,k}(t)\,p_{k-1}(t) + q_{k+1,k}(t)\,p_{k+1}(t) - \big(q_{k,k-1}(t) + q_{k,k+1}(t)\big)\,p_k(t),$$
where the derivative is a right derivative. Recall that the jump rate is defined for $i \neq j$ by
$$q_{i,j}(t) = \lim_{h \to 0^+} \frac{\mathbb P(X_{t+h} = j \mid X_t = i)}{h}.$$
The classical theory that can be read in Feller's book [16, Chapter XVII, Section 9] and the references therein (see also [14]) ensures that our process is a solution of the forward Kolmogorov–Chapman system. The uniqueness of the solution for a Markov process is obtained from the uniform bound on the rates $q_{i,j}(t)$.
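As a concrete numerical sketch of the rate formulas above, the following Python snippet evaluates them by finite differences for a hypothetical two-atom family $\mu_t = (1 - t/2)\delta_0 + (t/2)\delta_1$ (this family and the function names are illustrative, not taken from the text):

```python
# Numerical sketch of the jump rates of Example 6.1 for the hypothetical
# family mu_t = (1 - t/2) delta_0 + (t/2) delta_1 on {0, 1}.
# A_k(t) = sum_{i <= k} mu_t(i); the rates are
#   q_{k,k+1} = (-A_k'(t))_+ / mu_t(k),  q_{k,k-1} = (A_{k-1}'(t))_+ / mu_t(k).

def mu(t, k):
    return [1 - t / 2, t / 2][k]

def A(t, k):            # cumulative function A_k(t); A_{-1} = 0
    return 0.0 if k < 0 else sum(mu(t, i) for i in range(k + 1))

def dA(t, k, h=1e-6):   # central finite difference for A_k'(t)
    return (A(t + h, k) - A(t - h, k)) / (2 * h)

def rate_up(t, k):      # q_{k,k+1}(t)
    return max(-dA(t, k), 0.0) / mu(t, k)

def rate_down(t, k):    # q_{k,k-1}(t)
    return max(dA(t, k - 1), 0.0) / mu(t, k)

t = 0.5
print(rate_up(t, 0))    # A_0'(t) = -1/2, so the rate is (1/2)/(1 - t/2) = 2/3
print(rate_down(t, 1))  # 0.0: A_0 is decreasing, so no downward flow
```

Here mass only flows upward, which matches the sign convention: atom 0 shrinks (so $-A_0' > 0$ drives jumps $0 \to 1$) while no rate pushes mass back down.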
In place of a complete proof let us compute the jump rate in a typical case. Notice before that similar computations can be found in [29, Section 4]. We are looking for the jump rate $q_{k,k+1}(t)$ in the case where $A := A_{k-1}$ and $B := A_k$ are locally decreasing on the right of $t$. At every time $t$ the atomic measure $\mu_t$ is completely described by the partition of the interval $[0,1]$ of quantile levels through the sequence $(A_k(t))_{k\in\mathbb N}$. Indeed, $]A_{k-1}(t), A_k(t)[\, \subset [0,1]$ is the interval of the quantile levels of the atom $\mu_t(k)\delta_k$. Recall that both $A$ and $B$ are in $C^1([0,1])$. We can assume that $h$ is so small that $B(t+h) > A(t)$. For $\{r, r'\} \subset [t, t+h]$ with $r < r'$ and $r'$ close to $r$, the quantile coupling between $\mu_r$ and $\mu_{r'}$ transports the main part of the mass of the atom $\mu_r(k)\delta_k$ on itself and the rest on the atoms $\mu_{r'}(k')\delta_{k'}$ with $k' > k$. We aim at proving that the conditional probability to be still in $k$ at time $t+h$ is:
$$\mathbb P(X_{t+h} = k \mid X_t = k) = \frac{B(t+h) - A(t)}{B(t) - A(t)} + O(h^2) = 1 + h\,\frac{B'(t)}{\mu_t(k)} + O(h^2). \quad (41)$$
Since the probability to jump at least twice is $O(h^2)$, (41) furnishes the announced jump rate $q_{k,k+1}(t) = (-A_k'(t))_+/\mu_t(k)$ in the case of decreasing functions. So let us prove (41).
We consider a partition $R = \{r_0, \dots, r_m\}$ of $[t, t+h]$ with $(r_0, r_m) = (t, t+h)$ and the discrete quantile Markov chain associated with it. As $A$ and $B$ are decreasing, note that no mass can leave the quantile level interval.

Example 6.2 (Poisson distributions). Elaborating on the last example we consider, for $t \in \mathbb R_+$, $\mu_t = \mathcal P(t)$ where $\mathcal P(t)$ is the Poisson law of parameter $t$. In this case $A_k(t) = \sum_{i=0}^k e^{-t} t^i/i!$, so that the jump rate $q_{k,k+1}(t)$ is constantly $1$ for every $k$ and $t$, and the other rates are zero. We recover the Poisson process. Note that the Poisson laws are in stochastic order, which matches the increasing trajectories of the Poisson counting process.

Example 6.3 (Binomial distributions). In this example $\mu_t = \mathcal B(n, t)$ for $t \in [0,1]$. Let us define a Markov process $X = (X_t)_{t\in[0,1]} \in \mathrm{Marg}((\mu_t)_t)$ and compute its jump rates; we will then see that $\mathrm{Law}(X) = MQ$. We define $X$ on the probability space $[0,1]^n$ by $X_t : (\alpha_1, \dots, \alpha_n) \mapsto \sum_{k=1}^n \mathbf 1_{\alpha_k \in [0,t]}$, so its law is $\mu_t$. The fact that $(X_t)_{t\in[0,1]}$ is Markov comes from the following coarse argument: provided $k$ coordinates of $\alpha = (\alpha_1, \dots, \alpha_n)$ are smaller than $t$, the distribution is uniform on $[0,t]^k$ for the $k$ coordinates of the past of $t$ and on $[t,1]^{n-k}$ for the $n-k$ of its future. Between $t$ and $t+h$ the probability to have (at least, as well as exactly) one jump is $(n-k)\frac{h}{1-t} + O(h^2)$. As $A_k(t) = \sum_{i=0}^k \binom n i t^i (1-t)^{n-i}$ with the notation of Example 6.1, it can easily be checked that $\frac{n-k}{1-t} = \frac{-A_k'(t)}{\mu_t(k)}$, which proves that $(X_t)_{t\in[0,1]}$ is the Markov-quantile process attached to $(\mu_t)_{t\in[0,1]}$. This example could be of interest with respect to previous works on the entropic interpolation on graphs as, e.g., [21, 37].

6.2. Example of Markov-quantile transport processes. The following examples are related to §5. In particular we will consider processes tangent to a non-autonomous vector field on $\mathbb R$.
Basically, in the examples, $\mu_t$ is made of two parts that are translated in opposite directions and cross. We examine three crossing situations for atomic or diffuse measures.

Example 6.4 (One atom crossing a diffuse measure). Consider $\mu = (\mu_t)_{t\in[0,1]}$ with $\mu_t = \frac12\lambda_{[t-3/4,\,t-1/4]} + \frac12\delta_0$. This is the family of marginals of a simple process $\Gamma$ with affine trajectories, defined by $\Gamma(t \mapsto 0) = 1/2$ and $\Gamma(\{t \mapsto x_0 + t : x_0 \in A\}) = \lambda_{[-3/4,-1/4]}(A)$. This is not the Markov-quantile process attached to $\mu$, but it is a Markov process and it is tangent to the optimal vector field of Theorem 5.17(a), namely
$$v_t(x) = \begin{cases} 0 & \text{if } x = 0, \\ 1 & \text{otherwise}, \end{cases}$$
so the action $\mathcal A(\Gamma)$ equals the minimal value $\mathcal E(\mu)$. The Markov-quantile process $(X_t)_{t\in[0,1]}$ attached to $(\mu_t)_{t\in[0,1]}$ can be described as follows: the trajectories start according to $\mu_0$ and are piecewise affine, with pieces taken from the affine curves above. Provided $X_0 \in [-3/4, -1/4]$, the first piece is $X_t = X_0 + t$ on $[0, \tau]$ where $\tau = -X_0$. The second affine piece is constant equal to zero on $[\tau, \min(\tau + \eta, 1)]$ where $\eta$ is an exponential random variable of parameter $2$, independent from $X_0$. The third piece, if it exists, is affine of slope $1$, namely $X_t = t - (\tau + \eta)$ on $[\tau + \eta, 1]$.
Unlike Γ, the process (X t ) t∈[0,1] has increasing kernels and is a strongly Markov process.
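The three-piece description above translates into a short simulation. This is a sketch restricted, as in the text, to the case $X_0 \in [-3/4, -1/4]$; the function names are illustrative:

```python
import random

# Sketch of a trajectory of the Markov-quantile process of Example 6.4,
# conditionally on X_0 in [-3/4, -1/4]: slope 1 until the hitting time
# tau = -X_0 of the atom at 0, a flat piece of Exp(2)-distributed length
# eta, then slope 1 again.
def trajectory(x0, eta):
    tau = -x0
    def x(t):
        if t <= tau:            # first affine piece: X_t = X_0 + t
            return x0 + t
        if t <= tau + eta:      # plateau at the atom 0
            return 0.0
        return t - (tau + eta)  # third piece, slope 1
    return x

random.seed(0)
x0 = random.uniform(-0.75, -0.25)
eta = random.expovariate(2.0)       # exponential holding time, parameter 2
x = trajectory(x0, eta)
grid = [i / 100 for i in range(101)]
vals = [x(t) for t in grid]
assert all(a <= b + 1e-12 for a, b in zip(vals, vals[1:]))  # nondecreasing path
assert abs(x(-x0)) < 1e-12                                  # hits the atom at tau
```

The assertions check the two qualitative features stated in the text: trajectories are nondecreasing and reach the atom $0$ exactly at time $\tau = -X_0$.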
Example 6.5 (Crossing of two purely atomic measures). Consider two measures $\alpha$ and $\beta$ of mass $1/2$, concentrated on the rational numbers of $[0,1]$, with finite or infinite support. Let $\tau_t$ be the translation of vector $t$ in $\mathbb R$. Set $\mu = (\mu_t)_{t\in\mathbb R} = ((\tau_t)_\#\alpha + (\tau_{-t})_\#\beta)_{t\in\mathbb R}$. As in Example 6.4 the measure $\Gamma \in \mathrm{Marg}(\mu)$ concentrated on the space of piecewise affine paths (of slopes $1$ and $-1$) is a minimizer of the action. The two measures $(\tau_t)_\#\alpha$ and $(\tau_{-t})_\#\beta$ are concentrated on $\mathbb Q$ when $t \in \mathbb Q$ and they are singular if $t \notin \mathbb Q$. Hence according to Proposition 5.16 the optimal vector field $(v_t)_{t\in[0,1]}$ satisfies $v_t = \pm 1$, $\Gamma \otimes \lambda_{[0,1]}$-almost surely. It can be checked that the Markov-quantile process is again piecewise affine with a random finite number of changes of slope. Interesting exercises on the Markov-quantile process can be considered, as for instance finding the probability for a trajectory coming from $-\infty$ at time $-\infty$ to tend to $+\infty$ at time $+\infty$. Note that the situation seems to be well approached by truncating the measure to finitely many 'big' atoms. This corresponds to the case of $\alpha$ and $\beta$ with finite support. In this particular case the above mentioned exercise reduces to the so-called 'gladiator game' [32], which is a stochastic version of Borel's Blotto game [49].

Example 6.6 (Crossing of two diffuse measures). Consider $\mu_t = \lambda_{[t-2,t-1]} + \lambda_{[1-t,2-t]}$ and again $\Gamma$ such that $\Gamma(\{t \mapsto t + x_0 : x_0 \in A\}) = \lambda_{[-2,-1]}(A)$ and $\Gamma(\{t \mapsto x_0 - t : x_0 \in A\}) = \lambda_{[1,2]}(A)$. Unlike in the previous examples, $\Gamma$ does not minimize $\mathcal A$ on $\mathrm{Marg}_C(\mu)$. All the measures $\mu_t$ are continuous, so that the Markov-quantile process $(X_t)_{t\in\mathbb R}$ is the quantile process. It is piecewise affine and continuous. With probability $1/2$, in fact if $X_0 \leq 0$, it first has slope $1$, then slope $0$ on $[\frac{1-X_0}{2}, \frac{5+X_0}{2}]$ and finally slope $-1$. If $X_0 > 0$, the process $(X_t)_t$ starts with slope $-1$, is flat on $[\frac{1+X_0}{2}, \frac{5-X_0}{2}]$ and continues with slope $1$ after $\frac{5-X_0}{2}$.

6.3. Theoretic Markov-quantile processes.
Example 6.7 (One atom with regular level functions). Take $(\mu_t)_{t\in[0,1]}$ such that for every $t$, $\mu_t$ has exactly one atom $x_t \in \mathbb R$ and the interval of quantile levels of this atom at time $t$ is $]A(t), B(t)[$. Assume moreover that $A$ and $B$ are of class $C^1$ and piecewise monotone. Then the Markov-quantile process $(X_t)_{t\in[0,1]}$ can be described using two Poisson point processes of jump rates $(A')_+/(B-A)$ and $(B')_-/(B-A)$. Conditionally on $F_{\mu_t}(X_t) \in\, ]A(t), B(t)[$, we have $X_t = x_t$ until the next time $t_0 \geq t$ in the point process. Then the process $(X_t)_t$ leaves the atom and starts a piece of quantile trajectory, constant in the space $[0,1]$ of quantile levels, with value $A(t_0)$ or $B(t_0)$. The process may hit the atom again if there exists some $t_1 > t_0$ with $A(t_1) = A(t_0)$, or $B(t_1) = B(t_0)$ respectively.

The next remark is of general interest and particularly significant with respect to Example 6.9. It presents the Markov-quantile process as one end of the spectrum of processes with law in $\mathrm{Marg}(\mu)$ that satisfy (ii) of Theorem A, i.e. have increasing kernels, the other end of which is the independent process.
Remark 6.8. The minimality condition (iii) of Theorem A satisfied by the Markov-quantile process $(X_t)_t$ attached to some $(\mu_t)_t$ can also be stated as follows. For every process $(Y_t)_{t\in\mathbb R}$ satisfying (i) and (ii) of Theorem A, for every $s < t$ and every $x \in \mathbb R$ it holds
$$\mathrm{Law}(X_t \mid X_s \leq x) \preceq_{\mathrm{sto}} \mathrm{Law}(Y_t \mid Y_s \leq x).$$
A similar relation that concerns maxima for $\preceq_{\mathrm{sto}}$ in place of minima is satisfied by the independent process $(Z_t)_{t\in\mathbb R}$: if a process $(Y_t)_t$ has increasing kernels, we obtain
$$\mathrm{Law}(Y_t \mid Y_s \leq x) \preceq_{\mathrm{sto}} \mathrm{Law}(Z_t \mid Z_s \leq x) = \mu_t.$$
We conclude with the following result. Assume that for some $s < t$ and $(X_t)_t$ the Markov-quantile process, $X_s$ is independent of $X_t$. Then for any process $(Y_t)_t$ satisfying (i) and (ii) of Theorem A, we have for every $x \in \mathbb R$
$$\mu_t = \mathrm{Law}(X_t \mid X_s \leq x) \preceq_{\mathrm{sto}} \mathrm{Law}(Y_t \mid Y_s \leq x) \preceq_{\mathrm{sto}} \mathrm{Law}(Z_t \mid Z_s \leq x) = \mu_t,$$
so that, due to the Markov property, for every $s' \leq s$ and $t' \geq t$, $Y_{s'}$ and $Y_{t'}$ are independent.

Example 6.9 (Two atoms). We set $\mu_t = a(t)\delta_0 + b(t)\delta_1$ with $a + b = 1$ but do not assume any regularity on the functions $a$ and $b$. Let $(X_t)_t$ be the Markov-quantile process. We shall show that $X_s$ and $X_t$ are independent if and only if the total variation of $a$ (or $b$) on $[s,t]$ is infinite or $m_{s,t} := \min(\inf_{[s,t]} a, \inf_{[s,t]} b) = 0$. If $m_{s,t} = 0$ the independence is true for any Markov process. Indeed, by assumption, for any $\varepsilon > 0$ we may take $r$ such that $\mathbb P(X_r = 0)$ is small enough so that $\mathbb P(X_r = 0 \mid X_s = 0) \leq \varepsilon$, $\mathbb P(X_r = 1 \mid X_s = 0) \geq 1 - \varepsilon$, and $|\mathbb P(X_t = 0 \mid X_r = 1) - \mathbb P(X_t = 0)| \leq \varepsilon$. Then:
$$\begin{aligned} |\mathbb P(X_t = 0 \mid X_s = 0) - \mathbb P(X_t = 0)| &= |\mathbb P(X_t = 0 \mid X_r = 0)\,\mathbb P(X_r = 0 \mid X_s = 0) \\ &\qquad + \mathbb P(X_t = 0 \mid X_r = 1)\,\mathbb P(X_r = 1 \mid X_s = 0) - \mathbb P(X_t = 0)| \quad\text{since } X \text{ is Markov} \\ &\leq |\mathbb P(X_t = 0 \mid X_r = 1) - \mathbb P(X_t = 0)| + 2\varepsilon \leq 3\varepsilon, \end{aligned}$$
which is the wanted independence.
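The $3\varepsilon$ bound in the case $m_{s,t} = 0$ can be checked on a toy two-state chain. The transition matrices below are hypothetical, chosen only so that the mass of state $0$ at the intermediate time $r$ is of order $\varepsilon$:

```python
# Numerical illustration of the argument in Example 6.9 when m_{s,t} is ~ 0:
# if P(X_r = 0) is tiny at some intermediate time r, the chain almost
# forgets X_s. The matrices are hypothetical, not taken from the text.
eps = 1e-3
P_sr = {0: {0: eps, 1: 1 - eps},   # transition s -> r: row i = Law(X_r | X_s = i)
        1: {0: 0.0, 1: 1.0}}
P_rt = {0: {0: 0.9, 1: 0.1},       # transition r -> t
        1: {0: 0.4, 1: 0.6}}
mu_s = {0: 0.5, 1: 0.5}            # Law(X_s)

# P(X_t = 0 | X_s = 0) by the Chapman-Kolmogorov relation
cond = sum(P_sr[0][j] * P_rt[j][0] for j in (0, 1))
# unconditional P(X_t = 0)
marg = sum(mu_s[i] * P_sr[i][j] * P_rt[j][0] for i in (0, 1) for j in (0, 1))
print(abs(cond - marg))  # small: bounded by 3*eps, as in the remark
```

Conditioning on $X_s = 0$ moves $\mathbb P(X_t = 0)$ by at most $3\varepsilon$, matching the estimate in the text.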
Hence, we assume that $m_{s,t} > 0$, so that $a$ takes values in $[m_{s,t}, 1 - m_{s,t}]$ on $[s,t]$. We are left with the task of proving that independence is equivalent to an infinite total variation of $a$ on $[s,t]$. Let $\theta_0$ be the uniform probability measure on $[0, a(s)]$ and write $q^{r,r'}$ for the quantile transition kernel between $\mu_r$ and $\mu_{r'}$. Our goal reduces to establishing
$$\lambda = \theta_0^{[s,t]} := \operatorname{sto-sup}_R\; \theta_0\, q^{r_0,r_1} q^{r_1,r_2} \cdots q^{r_{m-1},r_m},$$
where $R$ ranges among the partitions $\{r_0, \dots, r_m\}$ of $[s,t]$ with $(r_0, r_m) = (s,t)$. For the measures under consideration, if $a(r_{k-1}) \leq a(r_k) \leq a(r_{k+1})$ or $a(r_{k-1}) \geq a(r_k) \geq a(r_{k+1})$, it holds $q^{r_{k-1},r_k} q^{r_k,r_{k+1}} = q^{r_{k-1},r_{k+1}}$. Therefore we can assume without loss of generality that the sequence $(a(r_k))_{k=0,\dots,m}$ has increments with alternating signs, for instance $a(r_{2k+1}) \leq a(r_{2k})$ for every $k$. We define $\theta_n = \theta_0\, q^{r_0,r_1} \cdots q^{r_{n-1},r_n}$.
The measure $\theta_n$ can be written in the form
$$\theta_n = d_n\, \lambda_{[0,\, a(r_n)]} + \frac{1 - d_n\, a(r_n)}{1 - a(r_n)}\, \lambda_{[a(r_n),\, 1]},$$
where $d_n = a(r_n)^{-1}\, \theta_n([0, a(r_n)])$ in fact parametrizes the complete measure. Note, after Remark 6.8, that $\theta_n \preceq_{\mathrm{sto}} \lambda$, which means $d_n \geq 1$. As $a(r_n) \in [m_{s,t}, 1 - m_{s,t}]$, the sequence converges to $\lambda$ if and only if $d_n \to 1$.
Recalling the effect of the kernel $q^{r_n, r_{n+1}}$, described on Figure 3 and defined in Notation 4.4(b), we find:
$$d_{n+1} = \begin{cases} d_n & \text{if } a(r_{n+1}) < a(r_n), \\[2pt] 1 + (d_n - 1)\, \dfrac{a(r_n)\,(1 - a(r_{n+1}))}{a(r_{n+1})\,(1 - a(r_n))} & \text{otherwise.} \end{cases}$$
The product $\prod_{n=0}^{m-1} \min\!\big(\tfrac{a(r_{n+1})}{a(r_n)}, \tfrac{1 - a(r_{n+1})}{1 - a(r_n)}\big)$ can be made arbitrarily close to zero (over all partitions of $[s,t]$) if and only if $a$, which takes values in $[m_{s,t}, 1 - m_{s,t}]$, has infinite total variation. This proves the claimed equivalence.
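A quick numerical sketch of this criterion: for a hypothetical $a$ oscillating between two values of $[m_{s,t}, 1 - m_{s,t}]$, the product shrinks as the number of oscillations sampled by the partition (hence the total variation) grows:

```python
# Sketch of the total-variation criterion of Example 6.9: the product of
#   min(a(r_{n+1})/a(r_n), (1 - a(r_{n+1}))/(1 - a(r_n)))
# along a partition shrinks as oscillations of a accumulate. The values
# 0.3 and 0.5 are arbitrary illustrations within [m, 1 - m].
def product(values):
    p = 1.0
    for a0, a1 in zip(values, values[1:]):
        p *= min(a1 / a0, (1 - a1) / (1 - a0))
    return p

p10 = product([0.3, 0.5] * 10)     # 10 oscillations of a
p100 = product([0.3, 0.5] * 100)   # 100 oscillations of a
assert p100 < p10 < 1.0            # more variation => smaller product
```

If $a$ were monotone on $[s,t]$, each factor along a monotone stretch would telescope and the product would stay bounded away from zero, which is the other half of the equivalence.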
Example 6.10 (One atom on the lower levels). Consider $(\mu_t)_{t\in[0,1]}$ such that for every $t$, $\mu_t$ has exactly one atom and this atom is between the quantile levels $A(t) = 0$ and $B(t)$. An example is $\mu_t = B(t)\delta_0 + (1 - B(t))\mathcal E(1)$, where $\delta_0$ is the Dirac mass at zero and $\mathcal E(1)$ the exponential law of parameter $1$.
No regularity assumption is made on $B$. Similar observations as in Example 6.9 permit us to specify the kernel between times $s$ and $t > s$. Let $\alpha_{s,t}$ be $\sup_{r\in[s,t]} B(r)$. Then $L^{[s,t]}$ is simply the uniform measure of mass $\alpha_{s,t}$ on $[0, \alpha_{s,t}]^2$ plus the one-dimensional uniform measure of mass $1 - \alpha_{s,t}$ on the diagonal between $(\alpha_{s,t}, \alpha_{s,t})$ and $(1,1)$. The same statement for the corresponding transition kernel reads: a particle of quantile level at most $\alpha_{s,t}$ at time $s$ is uniformly mapped at time $t$ on the particles of quantile levels $[0, \alpha_{s,t}]$. If the quantile level at time $s$ is greater than $\alpha_{s,t}$, the particle keeps the same level until time $t$, as if it were the quantile process.
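The kernel just described admits a direct sampling sketch; the threshold value $\alpha_{s,t} = 0.3$ below is an arbitrary illustration:

```python
import random

# Sketch of the level kernel of Example 6.10 with a hypothetical threshold
# alpha_{s,t} = 0.3: a level below alpha is redistributed uniformly on
# [0, alpha]; a level above alpha is kept, as for the quantile process.
def kernel(u, alpha, rng):
    return rng.uniform(0.0, alpha) if u <= alpha else u

rng = random.Random(1)
alpha = 0.3
lows = [kernel(0.1, alpha, rng) for _ in range(1000)]
assert all(0.0 <= v <= alpha for v in lows)  # mixed uniformly within [0, alpha]
assert kernel(0.8, alpha, rng) == 0.8        # level above alpha is untouched
```

The two assertions mirror the two regimes of the kernel: complete mixing below $\alpha_{s,t}$, identity above it.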
Example 6.11 (Markov-quantile processes). According to $(\mu_t)_t$, a quantile process may be Markov or not. Recalling Remark 1.8(b), if $Q$ is Markov it is the Markov-quantile process.

6.5.3. Markov Kamae–Krengel theorem. Kamae and Krengel proved in [31] that if $(\mu_t)_{t\in\mathbb R}$ are measures on a partially ordered Polish space $E$ such that $t \mapsto \mu_t$ is increasing for the stochastic order, in the sense that $t \mapsto \int f\,\mathrm d\mu_t$ is increasing for any increasing bounded $f : E \to \mathbb R$, there exists an increasing process $(X_t)_t$ with law in $\mathrm{Marg}(\mu)$. We proved in Theorems A and C that if $E$ is $\mathbb R$, the process can moreover be Markov. A natural problem is whether this is also true for any $E$.

The action $\mathcal A$ and energy $\mathcal E$ are defined on metric spaces. Definition 5.18 can also be extended to geodesic Polish metric spaces $\mathcal X$ in a natural way, based on processes representing the geodesics of $\mathcal P_2(\mathcal X)$ as in [54, Corollary 7.22] or [2, §2.2]. The first part of Theorem 5.20 is also true in this setting; we did not prove it to avoid technicalities. However, the question of the existence of an analogue of the Markov-quantile process on such a metric space seems to us very interesting from an Optimal Transport perspective.

6.5.5. Strong Markov property. The Markov-quantile process is not the unique process that minimizes the energy $\mathcal E$, as is shown in Example 6.4. However the process in this example is strongly Markov. Is the Markov-quantile process strongly Markov? Does this characterize it?