Time-Consistent Optimal Stopping

Dynamics

Let $(\Omega,\mathcal{F},\mathbb{P})$ denote a complete probability space which supports a standard Brownian motion $(W_t)_{t\geq 0}$ with its natural filtration $\lbrace \mathcal{F}_t\rbrace_{t\geq 0}$. Let $X = \lbrace X_t\rbrace_{t\geq 0}$ denote the payoff value process and suppose that its dynamics are given by

$$dX_t = b(X_t)\,X_t\,dt + \sigma(X_t)\,X_t\,dW_t,$$

where the bounded function $b$ describes the instantaneous conditional expected percentage change in $X$ per unit of time and the bounded function $\sigma$ is the corresponding instantaneous conditional standard deviation per unit of time. The payoff from investing (stopping) at time $t$ is $G(X_t)$, where $G : [0,\infty) \to \mathbb{R}$ is a given payoff function.
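
For instance, constant coefficients $b(x)\equiv\mu$ and $\sigma(x)\equiv\sigma$ (not required in what follows, but a convenient special case) turn $X$ into a geometric Brownian motion,

$$dX_t = \mu X_t\,dt + \sigma X_t\,dW_t .$$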

Stopping Rules

Definition A stopping rule is a function of time and the process value, $u : [0,\infty)\times [0,\infty) \to \lbrace 0,1\rbrace$, where $0$ indicates “continue” and $1$ indicates “stop”. For any time $t \geq 0$, each stopping rule $u$ defines a stopping time $\tau_u^t$ after $t$, via

$$\tau_u^t = \inf\lbrace s\geq t : u(s,X_s) = 1\rbrace .$$
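
For example, a threshold rule $u(s,x) = \mathbf{1}_{\lbrace x\geq x^*\rbrace}$ with some level $x^*>0$ (used here only for illustration) prescribes stopping the first time the process reaches or exceeds $x^*$, so that $\tau_u^t = \inf\lbrace s\geq t : X_s\geq x^*\rbrace$.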

Let $h$ denote a given weighted discount function with corresponding weighting distribution $F$, i.e., $h = h^F$ with $h^F(s) = \int_0^\infty e^{-rs}\,dF(r)$. Consider some time $t \geq 0$ and refer to it as “self $t$” of the group. Define the discount function of self $t$ by $h_t(s) := h(s-t)$, $s \geq t$. This means that self $t$ treats calendar date $t$ as the present, which is also reflected by the fact that $h_t(t) = 1$. If $X_t = x$, self $t$ seeks to maximize the weighted discounted payoff from its investment decision according to the stopping rule $u$:

$$J_t(x;u) = \mathbb{E}\big[h_t(\tau_u^t)\,G(X_{\tau_u^t})\,\big\vert\,X_t = x\big] = \mathbb{E}\big[h(\tau_u^t - t)\,G(X_{\tau_u^t})\,\big\vert\,X_t = x\big].$$
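
As a concrete example (taken up again in the Example section below), the pseudo-exponential discount function of a group with two members who discount at rates $r$ and $r+\lambda$ and carry decision weights $\delta$ and $1-\delta$ is

$$h(s) = \delta e^{-r s} + (1-\delta) e^{-(r+\lambda) s},$$

which corresponds to a weighting distribution $F$ that places mass $\delta$ on $r$ and mass $1-\delta$ on $r+\lambda$.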

Since $h$ exhibits decreasing impatience, the preferences of the selves change over time. In particular, self $0$ is more patient at time $t$ than self $t$ is at time $t$. This may lead different selves to prefer different choices of $u$. In general, each self $t$ can only choose its current (time-$t$) behavior.
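
To see decreasing impatience in the pseudo-exponential example above, note that its instantaneous discount rate

$$-\frac{h'(s)}{h(s)} = \frac{\delta r e^{-r s} + (1-\delta)(r+\lambda) e^{-(r+\lambda) s}}{\delta e^{-r s} + (1-\delta) e^{-(r+\lambda) s}}$$

is a weighted average of $r$ and $r+\lambda$ whose weight on $r+\lambda$ decreases over time; it falls from $\delta r + (1-\delta)(r+\lambda)$ at $s=0$ towards $r$ as $s\to\infty$, so the group becomes more patient the further away the payoff lies.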

Note: for fixed $t$ and $x$, $J$ is determined by the choice of $u$; the decision maker thus chooses the stopping rule so as to maximize the expected discounted payoff.

Given a stopping rule $u$ and a self $t \geq 0$, we can define the following stopping rule to be used from $t$ on:

$$u^{\varepsilon,a}(s,y) = \begin{cases} a, & s\in[t,t+\varepsilon),\\ u(s,y), & s\geq t+\varepsilon, \end{cases}$$

where $\varepsilon>0$ and $a\in \lbrace 0,1\rbrace$ are fixed. We call $u^{\varepsilon,a}$ the $(\varepsilon,a)$-deviation from $u$.

Given the infinite horizon and the stationarity of the process $X$, we need to consider only stationary stopping rules $u$, which are functions of the state variable $x$ only, as

$$J_t(x;u) = \mathbb{E}\big[h(\tau_u^t - t)\,G(X_{\tau_u^t})\,\big\vert\,X_t = x\big] = \mathbb{E}\big[h(\tau_u)\,G(X_{\tau_u})\,\big\vert\,X_0 = x\big] =: J(x;u), \quad \text{where } \tau_u = \inf\lbrace s\geq 0 : u(X_s) = 1\rbrace .$$

Hence, each self $t$ faces the same decision problem, which only depends on the current state $X_t = x$, but not on time $t$ directly. We can thus identify self $t$ of the group with the current state $X_t = x$ of the process and drop the time index from its objective functional.

Stopping Equilibrium

The sophisticated group anticipates the disagreement between its current and future selves. Therefore, it searches for a stopping rule $\hat u$ that all possible future selves $x$ are willing to go through with, i.e., no future self $x$ wishes to deviate from $\hat u$. In other words, the group plays a game with its future selves, and behavior is described by the equilibrium of that game.

Definition (Equilibrium stopping rule) The stopping rule $\hat u$ is an equilibrium stopping rule if

$$\liminf_{\varepsilon\downarrow 0}\frac{J(x;\hat u) - J(x;u^{\varepsilon,a})}{\varepsilon}\;\geq\; 0 \qquad \text{for all } x\in[0,\infty) \text{ and } a\in\lbrace 0,1\rbrace,$$

where $u^{\varepsilon,a}$ is the $(\varepsilon,a)$-deviation from $\hat u$.

In particular, either $u^{\varepsilon,1}$ or $u^{\varepsilon,0}$ is different from the equilibrium strategy, i.e., the $u^{\varepsilon,a}$ constitute possible deviation strategies. The non-negativity of the ratio in the definition says that every future self (characterized by wealth $x$) prefers $\hat u$ over deviating according to $u^{\varepsilon,1}$ or $u^{\varepsilon,0}$. Since the deviation interval $\varepsilon$ is arbitrarily short, deviation is unattractive precisely at the current value of $x$.

Theorem (Equilibrium Characterization) Consider the performance functional $J(x;u)=\mathbb{E}[h(\tau_u)G(X_{\tau_u})]$ with weighted discount function $h(t)=\int_0^\infty e^{-rt}dF(r)$, a stopping rule $\hat u$, and functions $w(x;r)=\mathbb{E}[e^{-r\tau_{\hat u}}G(X_{\tau_{\hat u}})]$ and $V(x)=\int_0^\infty w(x;r)dF(r)$. Let $\mathcal{A}={1\over 2}\sigma(x)^2x^2{\partial^2\over \partial x^2}+b(x)x{\partial\over\partial x}$ and suppose that $(V,w,\hat u)$ solves

$$\begin{aligned} &\max\Big\lbrace \mathcal{A}V(x) - \int_0^\infty r\,w(x;r)\,dF(r),\; G(x)-V(x)\Big\rbrace = 0,\\ &\hat u(x) = \mathbf{1}_{\lbrace G(x)-V(x)\,\geq\, 0\rbrace}, \end{aligned}$$

subject to $V(0)=\max\lbrace G(0),0\rbrace$. Then, $\hat u$ is an equilibrium stopping rule and the value function of the problem is given by $V(x)$, i.e., $V(x)=\mathbb{E}[h(\tau_{\hat u})G(X_{\tau_{\hat u}})]$.

Interpretation: Let us call $\mathcal{S}=\lbrace x\in [0,\infty):V(x)=G(x)\rbrace$ and $\mathcal{C}=\lbrace x\in [0,\infty):V(x)>G(x)\rbrace$ the stopping region and continuation region of the stopping problem, respectively. Then, the equilibrium stopping rule $\hat u$ can be determined by whether $x$ is in $\mathcal{S}$ or $\mathcal{C}$. The corresponding stopping time is given by $\tau_{\hat u}=\inf\lbrace s\geq 0:X_s\in \mathcal{S}\rbrace$. These equations constitute the so-called Bellman system, a system of coupled equations. Note that the second equation involving $\hat u$ is an equation (rather than a definition of $\hat u$) since its right-hand side involves $V$, which in turn depends on $\hat u$ through $w(x;r)$. Therefore, the equilibrium stopping rule $\hat u$ is part of the solution of the Bellman system. Essentially, this theorem tells us how we can obtain an equilibrium stopping rule $\hat u$ or, equivalently, the values of $x\in \mathcal{S}$ where the agent will stop and the values $x\in\mathcal{C}$ where she will continue.

The function $w(x;r)=\mathbb{E}[e^{-r\tau_{\hat u}}G(X_{\tau_{\hat u}})]$ depends on the equilibrium stopping rule and describes group member $r$’s expected discounted payoff in equilibrium when the current value of the process is $x$. If the group consists of just one member with discount rate $r$, then $V(x) = w(x;r)$, i.e., the value function $V$ is given by that member’s expected discounted payoff, and the first equation of the system becomes the well-known Bellman equation $\max\lbrace\mathcal{A}V(x)-rV(x),G(x)-V(x)\rbrace=0$, which is independent of $\hat u$ and can be solved once the model primitives are given. Then, $\hat u$ is obtained immediately from the second equation using $V(x)$ and $G(x)$.
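
As a minimal illustration of how the single-member Bellman equation is solved (assuming, only for this sketch, constant coefficients $b(x)\equiv\mu$ and $\sigma(x)\equiv\sigma$), on the continuation region the equation $\mathcal{A}V(x)-rV(x)=0$ reduces to the Cauchy–Euler ODE

$$\tfrac{1}{2}\sigma^2 x^2 V''(x) + \mu x V'(x) - r V(x) = 0,$$

whose solutions are of the form $V(x) = C_1 x^{\beta_1} + C_2 x^{\beta_2}$, where $\beta_1 > 0 > \beta_2$ are the roots of $\tfrac{1}{2}\sigma^2\beta(\beta-1) + \mu\beta - r = 0$ (for $r>0$). The boundary behavior at $x=0$ and value matching with $G$ on the stopping region then pin down the constants and the stopping threshold.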

Now consider the case of several group members. After swapping integration and expectation, $V(x)$ can be written as $\int_0^\infty w(x;r)dF(r)$. This calculation shows that the weighted form of the discount function carries over to the value function.
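
Spelling this calculation out, by the definition of $h$ and Fubini’s theorem,

$$V(x) = \mathbb{E}\big[h(\tau_{\hat u})\,G(X_{\tau_{\hat u}})\big] = \mathbb{E}\Big[\int_0^\infty e^{-r\tau_{\hat u}}\,dF(r)\,G(X_{\tau_{\hat u}})\Big] = \int_0^\infty \mathbb{E}\big[e^{-r\tau_{\hat u}}\,G(X_{\tau_{\hat u}})\big]\,dF(r) = \int_0^\infty w(x;r)\,dF(r).$$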

More specifically, because $w$ features standard exponential discounting, it satisfies

$$\mathcal{A}w(x;r) - r\,w(x;r) = 0, \qquad x\in\mathcal{C},$$

with boundary conditions $w(0;r)=0$ and $w(x;r)=G(x)$ for $x\in\mathcal{S}$, as $V(x)=G(x)$ when $x\in\mathcal{S}$. Since $V$ is the weighted average of the $w$, integrating the differential equation of $w$ against $F$ shows that $V$ must satisfy

$$\mathcal{A}V(x) - \int_0^\infty r\,w(x;r)\,dF(r) = 0, \qquad x\in\mathcal{C}.$$

The final equation given above is then obtained by comparing the value of continuation and stopping.
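
Under the illustrative constant-coefficient assumption from above and for a threshold-type stopping region $\mathcal{S}=[x^*,\infty)$, the ODE for $w$ together with its boundary conditions yields, for $x < x^*$,

$$w(x;r) = \Big(\frac{x}{x^*}\Big)^{\beta_1(r)} G(x^*),$$

where $\beta_1(r)>0$ is the positive root of $\tfrac{1}{2}\sigma^2\beta(\beta-1)+\mu\beta-r=0$. The dependence of the exponent on $r$ is what makes the group members value the same stopping rule differently.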

Example

We can consider the example of the pseudo-exponential discount function. In that case, the group consists of two group members with discount rates $r$ and $r+\lambda>r$ whose weights in the decision process are $\delta$ and $1-\delta$, respectively. We can recover the equilibrium stopping rule from the following four equations, obtained by plugging this two-point weighting distribution into the Bellman system of the theorem.
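
Writing $w_1(x) = w(x;r)$ and $w_2(x) = w(x;r+\lambda)$ for the two members’ expected discounted payoffs (notation introduced here for brevity), one way to write the specialized Bellman system as four coupled equations is the following sketch:

$$\begin{aligned} &V(x) = \delta\,w_1(x) + (1-\delta)\,w_2(x),\\ &\max\Big\lbrace \mathcal{A}V(x) - \big(\delta\, r\,w_1(x) + (1-\delta)(r+\lambda)\,w_2(x)\big),\; G(x)-V(x)\Big\rbrace = 0,\\ &\hat u(x) = \mathbf{1}_{\lbrace G(x)-V(x)\,\geq\, 0\rbrace},\\ &w_1(x) = \mathbb{E}\big[e^{-r\tau_{\hat u}}G(X_{\tau_{\hat u}})\big], \quad w_2(x) = \mathbb{E}\big[e^{-(r+\lambda)\tau_{\hat u}}G(X_{\tau_{\hat u}})\big], \end{aligned}$$

subject to $V(0)=\max\lbrace G(0),0\rbrace$.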
