MATHEMATICS I THINK ABOUT

Feynman-Kac for SPDEs with Multiplicative Noise
On Integrals with respect to Mollified Noise
Particle Filter for an easy Zakai Equation

Feynman-Kac for SPDEs with Multiplicative Noise

One application of the celebrated Feynman-Kac formula in probability theory gives a stochastic representation of solutions to Cauchy problems. Following the basic idea of the proof, the formula is first derived for terminal value problems of type \[ -\frac{\partial}{\partial t}u\,=\, {\cal A} u + k\times u + g,\quad u(x,T)=f(x), \] where $f$ is a function on $\mathbb{R}^d$, $k$ and $g$ are functions on $\mathbb{R}^d\times[0,T]$, and ${\cal A}$ is a second-order partial differential operator. Under the right conditions, the representation would read \[ u(x,t)\,=\, {\bf E}[f(X_T)\exp\{\int_t^T k(X_s,s)\,ds\}+\int_t^T g(X_s,s)\exp\{\int_t^s k(X_r,r)\,dr\}\,ds\,|\,X_t=x], \] for $(x,t)\in\mathbb{R}^d\times[0,T]$, where $X_t$ stands for the Markov process generated by ${\cal A}$, and ${\bf E}$ denotes the corresponding expectation operator.

This note is NOT about the conditions on $f,k,g, {\cal A}$ guaranteeing such a representation. This note is about the corresponding initial value problem with $k$ being a random space-time field. To concentrate on this issue, in what follows, $g$ is set to be zero. After guessing a formula for the initial value problem from the above formula by simple time-reversion, it is discussed why this formula needs further treatment when $k$ is a rather IRREGULAR random space-time field.

If $u(x,t)$ satisfies the initial value problem \[\frac{\partial}{\partial t}u\,=\, {\cal A} u + k(x,t)\times u,\quad u(x,0)=f(x),\tag{1}\label{ivp}\] and if ${\cal A}$ only acts on the $x$-variable then, for an arbitrary but fixed $t>0$, $\tilde{u}(x,t')=u(x,t-t')$ satisfies the terminal value problem \[-\frac{\partial}{\partial t'}\tilde{u}\,=\, {\cal A}\tilde{u} + k(x,t-t')\times \tilde{u},\quad \tilde{u}(x,t)=f(x),\] and hence \[ \tilde{u}(x,t')\,=\, {\bf E}[f(X_t)\exp\{\int_{t'}^t k(X_s,t-s)\,ds\}\,|\,X_{t'}=x], \] for any $(x,t')\in\mathbb{R}^d\times[0,t]$, which gives the stochastic representation \[ u(x,t)\,=\, {\bf E}[f(X_t)\exp\{\int_{0}^t k(X_s,t-s)\,ds\}\,|\,X_{0}=x]\tag{2}\label{FK}\] for the solution to the initial value problem, when $t'$ is taken to be zero.

Now assume that $k(x,t)$ is a random space-time field, for example, $k(x,t)=h(x)\times\dot{B}_t$, where $h$ denotes a deterministic function on $\mathbb{R}^d$, and $B=(B_t)_{t\ge 0}$ is a one-dimensional standard Brownian motion. Clearly, this makes (\ref{ivp}) a rather FORMAL stochastic partial differential equation (SPDE), because the product $k(x,t)\times u$ needs explanation, because $\dot{B}_t=\partial B_t/\partial t$ is not a function anymore.

However, when constructing both $X_t$ and $B_t$ as two independent processes on the same probability space, formula (\ref{FK}) could be read \[ u(x,t)(\omega)\,=\, {\bf E}[f(X_t)\exp\{\int_{0}^t h(X_s)\dot{B}_{t-s}\,ds\}\,|\,X_{0}=x,B=\omega],\tag{3}\label{FK1}\] and this might give a solution to the SPDE, in some sense, if the integral $\int_{0}^t h(X_s)\dot{B}_{t-s}\,ds$ can be well-defined.

A first idea could be to treat the integral as a backward stochastic integral against Brownian motion. But this would cause an adaptedness-problem when verifying equation (\ref{ivp}) as the integral form of this equation would rather be obtained by integrating forward in time. So, one wants to write \[ \int_{0}^t h(X_{t-s})\dot{B}_s\,ds\quad\mbox{for}\quad \int_{0}^t h(X_s)\dot{B}_{t-s}\,ds \] which is a formal operation, only, since $\dot{B}$ is not a function. But this operation would have its proper meaning when $\dot{B}$ is approximated by a continuous function, $\dot{B}_\varepsilon$, obtained by convoluting $B$ against a mollifier. Then, \[ u_\varepsilon(x,t)(\omega)\,=\, {\bf E}[f(X_t)\exp\{\int_{0}^t h(X_{t-s})\dot{B}_\varepsilon(s)\,ds\}\,|\,X_{0}=x,B=\omega]\tag{4}\label{FKapprox} \] solves \[ \frac{\partial}{\partial t}u_\varepsilon\,=\, {\cal A} u_\varepsilon + h(x)\dot{B}_\varepsilon(t)\times u_\varepsilon,\quad u_\varepsilon(x,0)=f(x), \] in mild sense, that is \[ u_\varepsilon(x,t)\,=\,\int p_t(x,y)f(y)\,dy+\int_0^t\!\!\!\int p_{t-s}(x,y)\,u_\varepsilon(y,s)\,h(y)dy\,\dot{B}_\varepsilon(s)ds, \] where $p_t(x,y)$ stands for the family of transition probability densities of the Markov process $X_t$. Note that $X_t$ is a time-homogeneous Markov process because its generator ${\cal A}$ does not act on time.

Taking $\varepsilon$ to zero in the last equation gives \[ u(x,t)\,=\,\int p_t(x,y)f(y)\,dy+\int_0^t\!\!\!\int p_{t-s}(x,y)\,u(y,s)\,h(y)dy\,\circ dB_s, \] where \[ u(x,t)(\omega)\,=\, {\bf E}[f(X_t)\exp\{\int_{0}^t h(X_{t-s})\,dB_s\}\,|\,X_{0}=x,B=\omega]\tag{5}\label{solStrato}\] is the $\varepsilon$-limit of (\ref{FKapprox}). Here $\circ dB_s$ denotes Stratonovich integration as this is what $\dot{B}_\varepsilon(s)ds$ would "produce" in the limit. All in all, (\ref{solStrato}) yields a mild solution to (\ref{ivp}) if the product $k(x,t)\times u$ with $k(x,t)=h(x)\times\dot{B}_t$ is understood in the sense of Stratonovich. If the SPDE (\ref{ivp}) is understood in the sense of Itô, under the exponential in (\ref{solStrato}), the corresponding Itô-correction term $\frac{1}{2}\int_{0}^t h(X_{s})^2\,ds$ has to be substracted. Note that the integral under the exponential in (\ref{solStrato}) is a proper Itô integral.

Now, if the process $X_t$ is reversible, there is a trick to get rid of the backward time of the integrand under the exponential in (\ref{solStrato}). This trick is used by Bertini/Cancrini in The Stochastic Heat Equation: Feynman-Kac Formula and Intermittence, Journal of Statistical Physics, Vol. 78, Nos 5/6, 1995, when ${\cal A}$ is the Laplacian and $k(x,t)$ is space-time white noise. They don't say that this trick is new but they don't give a reference, either. Below, the right-hand side of (\ref{solStrato}) is going to be transformed following Bertini/Cancrini's ideas (I do not know any earlier reference---comments on earlier references would be most welcome).

Fixing a path $B_t$ and writing ${\bf E}^B$ for the corresponding conditional expectation, the right-hand side of (\ref{solStrato}) equals \begin{align*} &{\bf E}^B[f(X_t)\exp\{\int_{0}^t h(X_{t-s})\,dB_s\}\,|\,X_{0}=x]\\ =&\int f(y)\,{\bf E}^B[\exp\{\int_{0}^t h(X_{t-s})\,dB_s\}\,|\,X_{0}=x,X_t=y]\,p_t(x,y)\,dy. \end{align*} Obviously, for fixed $t$, the expectation is now taken with respect to an $X$-bridge satisfying $X_0=x$ and $X_t=y$. But the integration is with respect to the time-reversion of this bridge, and the time-reversed bridge has the same law as the corresponding forward-time bridge since the process $X_t$ is reversible in the sense of $p_t(x,y)=p_t(y,x)$. So, if ${\bf E}^B_{y,x;t}$ denotes the expectation with respect to an $X$-bridge, $b_s^X,\,0\le s\le t$, satisfying $X_0=y$ and $X_t=x$, then the right-hand side of (\ref{solStrato}) becomes \[ \int dy\,f(y)\,p_t(y,x)\,{\bf E}^B_{y,x;t}[\exp\{\int_{0}^t h(b_{s}^X)\,dB_s\}], \] which is the type of Feynman-Kac formula used by Bertini/Cancrini in the paper referenced above. They prove that their formula gives a solution to the corresponding SPDE. This proof is quite short, and I think that the validity of (3.21) on page 1394 of their paper can hardly be understood without mentioning the underlying principle of time-reversion I stressed above.

On Integrals with respect to Mollified Noise

Let $B_t$ be a Brownian motion, and define $B_t^\varepsilon=\int_0^\infty\varrho_\varepsilon(t-u)B_u\,du,\,t\ge 0$, where $\varrho_\varepsilon$ is a smooth approximation of the right-sided mollifier ${\bf 1}_{[0,\varepsilon]}/\varepsilon$. Obviously, $B_t^\varepsilon\rightarrow B_t,\,\varepsilon\downarrow 0$, almost surely, and in $L^p$, for any $p\ge 1$.

Now, consider an equation with multiplicative mollified noise, for example \[ dX_t^\varepsilon \,=\, X_t^\varepsilon\,d{B}_t^\varepsilon \,=\, X_t^\varepsilon\,\dot{B}_t^\varepsilon\,dt. \] It is well-known that, when $\varepsilon\downarrow 0$, the limiting equation would be \[ dX_t \,=\, X_t\,d_{strato}{B}_t \,=\, X_t\,\underbrace{d{B}_t}_{ito} +\,\frac{1}{2}d\langle X,B\rangle_t\,, \] thus $\int_0^t X_s^\varepsilon\,d{B}_s^\varepsilon$ converges to the Stratonovich integral of the limiting process.

One could therefore think that, for any continuous semimartingale integrand $Y_t$, the integral $\int_0^t Y_s\,d{B}_s^\varepsilon$ against mollified noise would converge to $\int_0^t Y_s\,d_{strato}B_s$. But, this is not true, it rather holds that \[ \int_0^t Y_s\,d{B}_s^\varepsilon \stackrel{\varepsilon\downarrow 0}{\longrightarrow} \int_0^t Y_s\,dB_s+\langle Y,B\rangle_t \,=\, \int_0^t Y_s\,d_{strato}B_s+\frac{1}{2}\langle Y,B\rangle_t\,. \] Indeed, \begin{align*} \int_0^t Y_s\,d(B^\varepsilon-B)_s-\langle Y,B\rangle_t =\;&Y_t(B_t^\varepsilon-B_t)-\int_0^t(B_s^\varepsilon-B_s)\,dY_s -\langle Y,B^\varepsilon-B\rangle_t-\langle Y,B\rangle_t\\ =\;&Y_t(B_t^\varepsilon-B_t)-\int_0^t(B_s^\varepsilon-B_s)\,dY_s\,, \end{align*} and the last two terms converge to zero in probability. So, why does the limit of $\int_0^t X_s^\varepsilon\,d{B}_s^\varepsilon$ pick up less covariation? It is precisely $\frac{1}{2}\langle X,B\rangle_t$ less. This must be due to the fact that the integrand also depends on the mollified noise.

To answer this question in the case of an easy example, using a continuous semimartingale integrand $Z_t$, consider a two-times iterated integral, \[ \int_0^t\int_0^s Z_u\,dB_u^\varepsilon dB_s^\varepsilon \,=\, \int_0^t Y_s^\varepsilon\,dB_s^\varepsilon, \] where $Y_s^\varepsilon=\int_0^s Z_u\,dB_u^\varepsilon$ converges to $\int_0^s Z_u\,dB_u+\langle Z,B\rangle_s$, when $\varepsilon\downarrow 0$. Then, looking into the limiting behaviour of \[ \int_0^t(\,Y_s^\varepsilon-\int_0^s Z_u\,dB_u-\langle Z,B\rangle_s\,)\,dB_s^\varepsilon \,=\, \int_0^t(\int_0^s Z_u\,d\overbrace{(B^\varepsilon-B)_u}^{D_u^\varepsilon} -\langle Z,B\rangle_s\,)\,dB_s^\varepsilon \] seems to be a good idea, as this term might reduce the covariation created in the limit of the outer integral. Since $\langle Z,D^\varepsilon\rangle_s=-\langle Z,B\rangle_s$, the above right-hand side equals \[ \int_0^t(\,Z_s D_s^\varepsilon-\!\int_0^s D_u^\varepsilon\,dZ_u\,)\,dB_s^\varepsilon \,=\, \underbrace{ \int_0^t(\,Z_s D_s^\varepsilon-\!\int_0^s D_u^\varepsilon\,dZ_u\,)\,dD_s^\varepsilon }_{\hspace{10pt}=T_1^\varepsilon} \,+ \underbrace{ \int_0^t(\,Z_s D_s^\varepsilon-\!\int_0^s D_u^\varepsilon\,dZ_u\,)\,dB_s }_{\hspace{10pt}=T_2^\varepsilon} \] where $T_2^\varepsilon$ obviously converges to zero in probability. Interestingly, it is a part of $T_1^\varepsilon$ which picks up negative covariation between $\int_0^\bullet Z_u\,dB_u$ and $B$, when $\varepsilon\downarrow 0$. First, observe that \[ \int_0^t\int_0^s D_u^\varepsilon\,dZ_u\,dD_s^\varepsilon \,=\, D_t^\varepsilon \int_0^t D_u^\varepsilon\,dZ_u - \int_0^t(D_s^\varepsilon)^2\,dZ_s - \langle \int_0^\bullet D_u^\varepsilon\,dZ_u\,,D^\varepsilon\rangle_t \] where the covariation-term on the right-hand side equals $-\int_0^t D_u^\varepsilon\,d\langle Z,B\rangle_u$, so that all terms on the right-hand side converge to zero in probability. Thus, the non-trivial part of $T_1^\varepsilon$ is \begin{align*} \int_0^t Z_s D_s^\varepsilon\,dD_s^\varepsilon \,=\,& \frac{1}{2}\int_0^t Z_s\,d(D_s^\varepsilon)^2 - \frac{1}{2}\int_0^t Z_s\,d\langle D^\varepsilon\rangle_s\\ \,=\,& \frac{1}{2}Z_t(D_t^\varepsilon)^2 - \frac{1}{2}\int_0^t(D_s^\varepsilon)^2\,dZ_s - \frac{1}{2}\langle Z,(D^\varepsilon)^2\rangle_t - \frac{1}{2}\int_0^t Z_s\,ds \end{align*} where $\frac{1}{2}\langle Z,(D^\varepsilon)^2\rangle_t = -\int_0^t D_s^\varepsilon\,d\langle Z,B\rangle_s$, and hence $T_1^\varepsilon\to -\frac{1}{2}\int_0^t Z_s\,ds,\,\varepsilon\downarrow 0$, in probability. On the whole, writing "lim" for "limit in probability", \[ \lim_{\varepsilon\downarrow 0}\int_0^t\int_0^s Z_u\,dB_u^\varepsilon dB_s^\varepsilon \,=\, -\frac{1}{2}\int_0^t Z_s\,ds +\lim_{\varepsilon\downarrow 0} \int_0^t(\int_0^s Z_u\,dB_u+\langle Z,B\rangle_s\,)\,dB_s^\varepsilon \] \begin{align*} \,=\,& -\frac{1}{2}\int_0^t Z_s\,ds +\int_0^t(\int_0^s Z_u\,dB_u+\langle Z,B\rangle_s\,)\,dB_s +\underbrace{ \langle\int_0^\bullet Z_u\,dB_u+\langle Z,B\rangle,B\rangle_t }_{\int_0^t Z_s\,ds}\\ \,=\,& \int_0^t(\int_0^s Z_u\,dB_u+\langle Z,B\rangle_s\,)\,d_{strato}B_s, \end{align*} which confirms that our easy example is another one where, a), the integrand depends on mollified noise, b), integrating this integrand against mollified noise picks up less covariation in the limit. Again, the reduced amount of covariation turns the limiting integral into a Stratonovich integral.

But what is going to happen to the above proof, when replacing the semimartingale integrand by an indexed family of integrands and consider $\lim_{\varepsilon\downarrow 0}\int_0^t\int_0^s\phi(s,u)\,dB_u^\varepsilon dB_s^\varepsilon$, where $(s,u)\mapsto\phi(s,u)$ is a smooth deterministic function. Of course, $\int_0^s\phi(s,u)\,dB_u^\varepsilon$ converges to $Y_s=\int_0^s\phi(s,u)\,dB_u$, when $\varepsilon\downarrow 0$, but $Y_s$ is NOT anymore a semimartingale. Still, one can work out the covariation between $\int_0^\bullet\phi(\cdot,u)\,dB_u$ and $B$, leading to \[ \langle Y,B\rangle_t \,=\, \langle\int_0^\bullet\phi(\cdot,u)\,dB_u\,,B\,\rangle_t \,=\, \int_0^t\phi(s,s)\,ds, \] and then the above proof goes through resulting in \begin{align*} \lim_{\varepsilon\downarrow 0}\int_0^t\int_0^s\phi(s,u)\,dB_u^\varepsilon dB_s^\varepsilon \,=\,& \int_0^t(\int_0^s\phi(s,u)\,dB_u\,)\,d_{strato}B_s\\ \,=\,& \int_0^t\int_0^s\phi(s,u)\,dB_u dB_s+\frac{1}{2}\langle\int_0^\bullet\phi(\cdot,u)\,dB_u\,,B\,\rangle_t. \end{align*}

Unfortunately, in interesting applications, $(s,u)\mapsto\phi(s,u)$ is only smooth outside of the diagonal and converges to $\phi(s,s)=+\infty$ on the diagonal. So, what is going to happen to the above proof in such a case?

Particle Filter for an easy Zakai Equation

Solutions of Zakai equations are un-normalised densities associated with specific posterior distributions in the context of so-called non-linear filtering, thus particle approximations of such solutions could also be called particle filters.

The goal is to create a particle filter approximating the solution of a simple Zakai equation \[ dq\,=\,{\cal L}'q\,dt + hq\,dB_t \quad\Leftrightarrow\quad \frac{\partial}{\partial t}q\,=\,{\cal L}'q + (h\dot{B}_t-\frac{h^2}{2})\times q \tag{1}\label{zakai}\] where ${\cal L}$ is a second-order partial differential operator generating a continuous Markov process, $h(x)$ is a function, $B_t$ is the observed signal, and $\times$ is the product leading to Stratonovich integration. The observed signal, assumed to be one-dimensional in what follows, can be considered a Brownian motion after an appropriate change of the probability measure used for modelling, and $dB_t$ stands for the differential leading to Itô integration.

The basic principle behind a particle filter for this purpose is very simple, it is just an iteration of the Feynman-Kac formula combined with time-reversion. Following Feynman-Kac for SPDEs with Multiplicative Noise, the solution of the above Zakai equation can be represented by taking expectations of functionals of a Markov process independent of the observed signal $B_t$. This Markov process's generator would be ${\cal L}'$, the adjoint of ${\cal L}$ on a Hilbert space, and the time-reversion step would be achieved by swapping to expectations with respect to the Markov process generated by ${\cal L}$.

Assume a solution to equation (\ref{zakai}) exists in the domain $\mathbb{R}^d\times[0,\infty)$, and let $t_\star \ge 0$ be a fixed reference time point. First, construct three independent stochastic processes, $X',X,B$, on the same probability space $(\Omega,{\cal F},{\bf P})$, where $X'$ and $X$ are Markov processes with generators ${\cal L}'$ and ${\cal L}$, respectively, and $B$ is a one-dimensional standard Brownian motion. Then, for $x\in\mathbb{R}^d,\,t\ge t_\star$, the Feynman-Kac formula yields the following representation of the solution, \[ q(x,t) \,=\, {\bf E}^B[\,q(X_t',t_{\star})\exp\{ \int_{t_{\star}}^t h(X_{t_{\star}+\,t-s}')\,dB_s - \frac{1}{2}\int_{t_{\star}}^t h(X_{t_{\star}+\,t-s}')^2\,ds \}\,|\,X_{t_{\star}}'\!=x\,], \] where ${\bf E}^B$ denotes the expectation with respect to ${\bf P}(\cdot\,|B)$. Note that, if $X'$ and $B$ were not independent, then the integral $\int_{t_{\star}}^t h(X_{t_{\star}+\,t-s}')\,dB_s$ would not necessarily be well-defined, but, as they are, it can even be approximated by $\sum_{n=1}^N h(X_{t_{\star}+\,t-s_n}')(B_{s_n}-B_{s_{n-1}})$, when the mesh of the partition $t_{\star} = s_0 < s_1 < \dots < s_{N-1} < s_N = t$ goes to zero. As a consequence, successively taking conditional expectations with respect to ${\cal F}^{X'}_{t_{\star}+\,t-s_1},\dots,{\cal F}^{X'}_{t_{\star}+\,t-s_N}$, and using that, on $L^2(\mathbb{R}^d)$, the semigroup generated by ${\cal L}'$ is the adjoint semigroup of the semigroup generated by ${\cal L}$, results in \begin{align*} &\int_{\mathbb{R}^d}q(x,t)f(x)\,dx\\ \,=\,& \int_{\mathbb{R}^d}dy\,q(y,t_{\star})\, {\bf E}^B[\,f(X_t)\exp\{ \int_{t_{\star}}^t h(X_s)\,dB_s - \frac{1}{2}\int_{t_{\star}}^t h(X_s)^2\,ds \}\,|\,X_{t_{\star}}\!=y\,]\\ \,=\,& \int_{\mathbb{R}^d} \int_{\mathbb{R}^d}dy\,q(y,t_{\star})\, {\bf E}^B[\,\delta_{X_t}(x)\exp\{ \int_{t_{\star}}^t h(X_s)\,dB_s - \frac{1}{2}\int_{t_{\star}}^t h(X_s)^2\,ds \}\,|\,X_{t_{\star}}\!=y\,] \,f(x)\,dx, \end{align*} for any test function $f(x)$, and hence \[ q(x,t) \,=\, \int_{\mathbb{R}^d}dy\,q(y,t_{\star})\, {\bf E}^B[\,\delta_{X_t}(x)\exp\{ \int_{t_{\star}}^t h(X_s)\,dB_s - \frac{1}{2}\int_{t_{\star}}^t h(X_s)^2\,ds \}\,|\,X_{t_{\star}}\!=y\,],\quad t\ge t_\star. \tag{2}\label{newFeynKac}\]

Note that the above formula is an easy consequence of the definition of $q(x,t)$ as a conditional density, but here it was derived from the PDE this density satisfies, thus the method would still work for solutions of PDEs which do not have a conditional density representation.

Now, for fixed K, denote by $q_K(x,t)$ the particle filter approximation to be constructed using (\ref{newFeynKac}). Assume $q(\cdot,0)=\delta_z(\cdot)$, for some $z\in\mathbb{R}^d$, so that the actual solution satisfies \[ q(\cdot,t) \,=\, {\bf E}^B[\,\delta_{X_t}(\cdot)\exp\{ \int_{0}^t h(X_s)\,dB_s - \frac{1}{2}\int_{0}^t h(X_s)^2\,ds \}\,|\,X_{0}\!=z\,]. \] Draw $K$ independent paths $X^{(k)}_s,\,0\le s\le t$, from the Markov process generated by ${\cal L}$, all of them satisfying $X^{(k)}_0=z,\,k=1,\dots,K$, set \[ a_k(t) \,=\, \exp\{ \int_{0}^t h(X^{(k)}_s)\,dB_s - \frac{1}{2}\int_{0}^t h(X^{(k)}_s)^2\,ds \}, \] and construct the particle filter via the Monte-Carlo mean, \[ q_K(\cdot,t) \,=\, \frac{1}{K}\sum_{k=1}^K a_k(t)\delta_{X^{(k)}_t}(\cdot)\,. \tag{3}\label{initial filter}\]

This Monte-Carlo mean is a valid approximation of the solution, but it might not be a good one. A major concern is that the weights, $a_k(t)$, of some particles, $X^{(k)}_t$, become very small, thus one would want to cull such particles, while, at the same time, multiply more "important" ones, hopefully improving the rate of convergence. This culling and multiplying can be achieved by "re-constructing" the particles at pre-defined times $0 = t_0 < t_1 < t_2 < \cdots\,$. Starting with the particle filter approximation given by (\ref{initial filter}) on $[0,t_1)$, which particles should be culled at time $t_1$?

Recalling (\ref{newFeynKac}), for $t\ge t_1$, the actual solution satisfies \begin{align*} q(\cdot,t)\, &= \int_{\mathbb{R}^d}dy\,q(y,t_1)\, {\bf E}^B[\,\delta_{X_t}(\cdot)\exp\{ \int_{t_1}^t h(X_s)\,dB_s - \frac{1}{2}\int_{t_1}^t h(X_s)^2\,ds \}\,|\,X_{t_1}\!=y\,]\\ = \int_{\mathbb{R}^d}q(y,t_1)\,dy &\times \int_{\mathbb{R}^d}dy\,\frac{q(y,t_1)}{\int_{\mathbb{R}^d}q(y,t_1)\,dy}\, {\bf E}^B[\,\delta_{X_t}(\cdot)\exp\{ \int_{t_1}^t h(X_s)\,dB_s - \frac{1}{2}\int_{t_1}^t h(X_s)^2\,ds \}\,|\,X_{t_1}\!=y\,]\\ &= \int_{\mathbb{R}^d}q(y,t_1)\,dy \times {\bf E}^B[\,\delta_{\{\,\!_1X_t\}}(\cdot)\exp\{ \int_{t_1}^t h(_1X_s)\,dB_s - \frac{1}{2}\int_{t_1}^t h(_1X_s)^2\,ds \}\,], \end{align*} where $_1X$ denotes the Markov process generated by ${\cal L}$, but starting at $t_1$ with an initial condition whose pdf is given by $q(\cdot,t_1)/\int_{\mathbb{R}^d}q(y,t_1)\,dy$.

A natural idea is therefore, first, to change the importance of the particles at time $t_1$ so that the distribution of their locations closely follows the initial distribution of $_1X$, second, to draw new paths from the Markov process generated by ${\cal L}$ starting at $t_1$ at the locations of the re-constructed particles. The formal choice of importance weights for approximating the distribution with pdf $q(\cdot,t_1)/\int_{\mathbb{R}^d}q(y,t_1)\,dy$ is also quite straight forward: the normalising integral should be approximated by $\int_{\mathbb{R}^d}q_K(y,t_1-)\,dy$ based on the motion of the particles before being re-constructed at $t_1$, that is \[ \int_{\mathbb{R}^d}q_K(y,t_1-)\,dy \,=\, \frac{1}{K}\sum_{k=1}^K a_k(t_1-) \quad\mbox{using}\quad a_k(t_1-)\,=\,\lim_{t\uparrow t_1}a_k(t), \] leading to \[ \frac{a_k(t_1-)}{\sum_{k'=1}^K a_{k'}(t_1-)} \,=\, \mbox{importance weight of $k$th particle's position at $t_1$},\;k=1,\dots,K. \] Thus, formally, one should re-construct the ensemble of $K$ particles in such a way that there are $K\times{a_k(t_1-)}/{\sum_{k'=1}^K a_{k'}(t_1-)}$ particles at location $X^{(k)}_{t_1}$, which describes a kind of natural branching procedure, where some particles would die and others would multiply.

This branching procedure was introduced and studied in the second half of the 1990's, mainly by D. Crisan, P. Del Moral, T. Lions. The problem was that the formal particle numbers are not integers, thus an integer-correction is needed, which is the non-trivial bit. The above three authors worked out several random integer-correction schemes of the following type: for each $k=1,\dots,K$, find a random integer with mean $K\times{a_k(t_1-)}/{\sum_{k'=1}^K a_{k'}(t_1-)}$ subject to a small variance. Some of these integers will be zero resulting in the corresponding particles being culled, and some of the remaining particles will multiply inheriting the parent-particle's position. If the integers are assumed to be independent, and that has been the assumption in the early and most studied schemes, then, after branching, the number of surviving particles, $K_1$, would be random with mean $K$, leading to re-constructed particles with positions $_1X^{(k)}_{t_1}$, $k=1,\dots,K_1$, at time $t_1$.

Then, starting from these positions, following the second step of the natural idea laid out above, one draws $K_1$ independent paths $_1X^{(k)}_s,\,t_1\le s\le t$, from the Markov process generated by ${\cal L}$, sets \[ a_k(t) \,=\, \exp\{ \int_{t_1}^t h(_1X^{(k)}_s)\,dB_s - \frac{1}{2}\int_{t_1}^t h(_1X^{(k)}_s)^2\,ds \}, \] and continues constructing the particle filter via \[ q_K(\cdot,t) \,=\, \frac{1}{K}\sum_{k=1}^K a_k(t_1-) \times \frac{1}{K_1}\sum_{k=1}^{K_1}a_k(t)\delta_{\{\,\!_1X^{(k)}_t\}}(\cdot)\,, \quad t\in[t_1,t_2).\]

Of course, this method can be applied at $t_2$, again, and so on, leading to the following general description of the particle filter: for $n=0,1,2,\dots\,$, \begin{align*} a_k(t)&\,=\, \exp\{ \int_{t_n}^t h(_nX^{(k)}_s)\,dB_s - \frac{1}{2}\int_{t_n}^t h(_nX^{(k)}_s)^2\,ds \},\quad t\in[t_n,t_{n+1}),\\ q_K(\cdot,t)&\,=\, \prod_{m=0}^n \left( \frac{1}{K_{m-1}}\sum_{k=1}^{K_{m-1}}a_k(t_m-) \right) \times \frac{1}{K_n}\sum_{k=1}^{K_n}a_k(t)\delta_{\{\,\!_nX^{(k)}_t\}}(\cdot)\,, \quad t\in[t_n,t_{n+1}), \tag{4}\label{filter} \end{align*} setting $a_k(t_0-)=1,\,K_{-1}=K_0=K$, and writing $_0X^{(k)}$ for $X^{(k)}$ used in (\ref{initial filter}).

If introducing the branching correction was all about increasing the speed of convergence, what is a good bound on the rate of convergence of the particle filter given by (\ref{filter})?

First of all, the observed signal, $B$, would most likely come in at discrete times $t_1,t_2,\dots\,$, forming a natural grid of times at which particles should be re-constructed. However, only knowing the signal at these discrete times complicates the calculation of the integral $\int_{t_n}^t h(_nX^{(k)}_s)\,dB_s$ needed for updating $a_k(t),\,t\in[t_n,t_{n+1})$. If $t_{n+1}-t_n$ is small, though, it would make sense to approximate this integral by $h(_nX^{(k)}_{t_n})\times(B_{t_{n+1}}-B_{t_n})$, and $\int_{t_n}^t h(_nX^{(k)}_s)^2\,ds$ by $h(_nX^{(k)}_{t_n})^2\times(t_{n+1}-t_n)$, for all $t\in[t_n,t_{n+1})$, which yields a discrete-time approximation, $q_{K,\Delta}(\cdot,t)$, of the particle filter given by (\ref{filter}), assuming a uniform mesh $\Delta=t_{n+1}-t_n,\,n=1,2,\dots\,$, for simplicity. Then, following the proof of a bound given in Branching and Interacting Particle Systems Approximations of Feynman–Kac Formulae with Applications to Non-Linear Filtering, Séminaire de Probabilités XXXIV, LNM 1729, 1--145, Springer, Berlin, 2000, by P. Del Moral and L. Miclo, one can derive \[ {\bf E}[\,|\hspace{-3pt} \int_{\mathbb{R}^d}q(x,t)f(x)\,dx - \int_{\mathbb{R}^d}q_{K,\Delta}(x,t)f(x)\,dx \,|^2\,] \,\le\, \frac{C(t)}{K\Delta}\|f\|_\infty^2 +C_{eu}(t)\Delta(1+\|f\|_\infty^2), \tag{5}\label{first rate}\] for bounded test functions $f$ also satisfying some smoothness condition. While the $C_{eu}(t)$-term comes from the Euler-type approximation of the stochastic exponential, the $C(t)$-term is the combined effect of Monte-Carlo approximation and re-constructing particles at $\Delta$ time steps, and that is the effect of interest. Note that $C(t)$ seems to be as bad as $O(te^{ct\|h\|_\infty^2})$, which means that one would need many particles for good approximations at large times.

However, the worrying bit is $\Delta$ in the denominator, which means that re-constructing particles too often is bad. Therefore, if the time grid of the observed signal is too tight, with a very small mesh $\Delta_{eu}$ for example, then one should re-construct particles using a sub-grid with a larger mesh $\Delta$. In such a case, the Euler approximations of the stochastic exponentials during the larger $\Delta$ time steps would be very close to the continuous time stochastic exponentials used in (\ref{filter}), and could hence be identified with those.

Having this in mind, presumably, people studied the convergence of $q(\cdot,t)-q_K(\cdot,t)$ without discretizing the stochastic exponential. They also made $\Delta=K^{-2\alpha}$ a function of $K$, for some $0<\alpha\le 1/2$, to better understand the impact of $\Delta$ on the rate of convergence. Following A central limit type theorem for a class of particle filters, Communications on Stochastic Analysis 1(1), 103-122 (2007), by D. Crisan and J. Xiong, as well as A branching particle system approximation for nonlinear stochastic filtering, Sci China Math 56(8), 1521--1541 (2013), by L. HuiLi and J. Xiong, the best bound so far seems to be \[ {\bf E}[\,|\hspace{-3pt} \int_{\mathbb{R}^d}q(x,t)f(x)\,dx - \int_{\mathbb{R}^d}q_{K}(x,t)f(x)\,dx \,|^2\,] \,\le\, C(t)(K^{-(1-\alpha)}\vee K^{-2\alpha})|\!|\!|f|\!|\!|^2, \] for test functions $f$ satisfying some smoothness condition associated with the norm $|\!|\!|\cdot|\!|\!|$. Both papers do not shed much light on the $t$-dependence of $C(t)$, but it still seems to be as bad as $O(te^{ct\|h\|_\infty^2})$. The optimal $\alpha=1/3$ translates into a rate of convergence proportional to $K^{-2/3}$, which is quite slower than the standard rate of $K^{-1}$.

On the other hand, when using $\Delta$ for $K^{-2\alpha}$, the rate of convergence would read $1/(K\sqrt{\Delta})$, which is better than the rate given by P. Del Moral and L. Miclo in (\ref{first rate}). However, this might be a side effect of making $\Delta$ a function of $K$. In Particle Approximations for a Class of Stochastic Partial Differential Equations, Appl. Math. Optim. 54, 293--314 (2006), Dan Crisan couples the random integers of the integer-correction scheme in such a way that the number of surviving particles is always $K$, which should improve the rate of convergence, but he calculates \[ {\bf E}[\,|\hspace{-3pt} \int_{\mathbb{R}^d}q(x,t)f(x)\,dx - \int_{\mathbb{R}^d}q_{K}(x,t)f(x)\,dx \,|^2\,] \,\le\, \frac{C(t)}{K\Delta}|\!|\!|f|\!|\!|^2, \] when keeping $\Delta$ independent of $K$. Again, $C(t)$ is as bad as in (\ref{first rate}), and the test function $f$ has to satisfy a smoothness condition associated with $|\!|\!|\cdot|\!|\!|$. This coupling, though, makes estimation harder: there does not seem to be a bound on the rate of convergence when $\Delta=K^{-2\alpha},\,0<\alpha\le 1/2$.

1 page comment