Understanding the relationship between filtration $\mathcal{F}_t$ and an observed trajectory $O_{t}$

filtrations, measure-theory, probability-theory, stochastic-processes

Introduction: I understand the filtration $\mathcal{F}_t$ to model all knowledge of a stochastic process $\{X_t:t=0,1,2,\dots,T\}$ up to time $t$, which in this case is discrete-time (given my basic level of understanding, this makes it easier). It does this via a growing sequence of sigma-algebras $\mathcal{F}_t \subseteq \mathcal{F}_{t+1}$. This means that as time passes, we are able to distinguish more events, i.e. the observer is becoming smarter. For example, at $t=1$ in a heads (H) and tails (T) coin toss game that lasts 2 rounds, we can't distinguish between the outcomes $HH$ and $HT$, nor between $TH$ and $TT$, so $\mathcal{F}_1 = \{ \Omega,\varnothing,\{HH,HT\},\{TH,TT\}\}$; but at $t=2=T$ we can distinguish all outcomes, so $\mathcal{F}_2 = 2^\Omega$, the power set, which contains every singleton $\{HH\},\{HT\},\{TH\},\{TT\}$. To me, the concept of distinguishing was important in comprehending filtrations.
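To make the coin-toss example concrete, here is a minimal Python sketch (my own illustration, not part of the original question) that builds the partitions generating $\mathcal{F}_1$ and $\mathcal{F}_2$:

```python
from itertools import product

# Sample space for the two-round coin toss: all strings of two tosses.
omega = [a + b for a, b in product("HT", repeat=2)]  # ['HH', 'HT', 'TH', 'TT']

# At t = 1 we only know the first toss, so outcomes sharing a first toss are
# indistinguishable: these two atoms generate F_1.
partition_t1 = {c: {w for w in omega if w[0] == c} for c in "HT"}

# At t = 2 every outcome is distinguishable: the singletons generate F_2 = 2^Omega.
partition_t2 = [{w} for w in omega]

print(sorted(partition_t1["H"]))  # ['HH', 'HT']
print(sorted(partition_t1["T"]))  # ['TH', 'TT']
```

The sigma-algebra $\mathcal{F}_1$ itself is then obtained by closing these atoms under unions and complements, giving the four sets listed above.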

Question: I have a problem reconciling two ideas/notations. The first conditions on the filtration
$$
P(X_t|\mathcal{F}_t) \mbox{ and } \mathbb{E}[X_t|\mathcal{F}_t]
$$

while the other conditions on an observed realisation of the stochastic process $O_t = \{X_\tau(\omega):\tau=1,2,\dots,t\}$ such that one would consider
$$
P(X_t|O_t) \mbox{ and } \mathbb{E}[X_t|O_t].
$$

In the coin-toss example, we might have $O_1 = \{H\}$ or $O_1 = \{T\}$. I am much more comfortable conditioning on observations. A course on Hidden Markov Models would make use of observations, etc. Is there a connection between the two?

My attempt: I would like to show effort by sharing my attempt at understanding the relationship between the two.

Firstly, $O_t$ should tell us where we are, both in state and in time/index. The index part is important, as it means we know to use $\mathcal{F}_t$ to denote our available information. Such a statement is new to me; I would have always referred to $O_t$ as our available information or history, but now know $\mathcal{F}_t$ to be the available information.

Secondly, $O_t$ tells us what subset of $\mathcal{F}_t$ is permissible, in other words, what part of $\mathcal{F}_t$ contains information pertaining to the realisation $O_t$. Let's denote that as $\mathcal{H}_{O_t}$ such that $\mathcal{H}_{O_t}\subseteq \mathcal{F}_t$. In the coin-toss example, if we have $O_1= \{H\}$ then we have $\mathcal{H}_{O_1} = \{ \{HH,HT\} \} \subset \mathcal{F}_1$. I think we could also say that if $\mathcal{H}_{O_t}\not\subseteq \mathcal{F}_t $ then we are no longer observing the stochastic process $\{X_t:t=1,2,\dots,T\}$ but perhaps another one $\{Y_t:t=1,2,\dots,T\}$, e.g. if one observed a die roll of six as the second outcome, $O_2=\{H,6\}$.

This leads to the following conclusion: $P(X_t|\mathcal{F}_t)$ is a generalisation of $P(X_t|O_t)$ which says "put an observation here to condition on", BUT the observation must be able to locate a subset of $\mathcal{F}_t$. Hence, the filtration tells us both what informative observations would look like and what the minimum requirements for $O_t$ would be. For example, $O_2=\{H,6\}$ does not suffice, nor does $O_2 = \{H\}$: the former contains no relevant information about the second observation, while the latter does not have enough. Something like $O_2=\{H,T,6\}$ is informative enough, but the inclusion of the 6 is redundant. Hopefully this makes sense and is in the ballpark of correct intuition.

Best Answer

If $\mathcal F_t$ is the filtration generated by $(X_1,...,X_t)$, then we actually have that $\mathbb{E}[X_t|\mathcal F_t] = X_t$ because the information in $\mathcal F_t$ is sufficient to determine $X_t$.

In general, if $Z$ is a random variable, you can think of $\mathbb{E}[Z|\mathcal F_t]$ as a function mapping the sets in $\mathcal F_t$ to estimates of $Z$. $Z$ might be $X_T$ for some $T > t$, or it might be a different stochastic process, or any other random variable you are interested in. Since $\mathcal F_t$ is generated by $(X_1,...,X_t)$, there is actually a function $f$ such that $\mathbb{E}[Z|\mathcal F_t] = f(X_1,...,X_t).$ Note that since $X_1,...,X_t$ are all random variables, $\mathbb{E}[Z|\mathcal F_t]$ is also a random variable. The connection between conditioning on $\mathcal F_t$ and conditioning on an observed realization $O_t = (X_1(\omega),...,X_t(\omega))$ is that you are just plugging that observed realization into the function $f$, i.e. $\mathbb{E}[Z|O_t] = f(X_1(\omega),...,X_t(\omega)).$
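The plugging-in step can be sketched on the two-toss example. Below is a minimal Python illustration (my own, with $Z$ chosen as the total number of heads, an assumption not made in the answer) of $\mathbb{E}[Z|\mathcal F_1] = f(X_1)$ and of evaluating $f$ at an observed realisation:

```python
from itertools import product

# Sample space for two fair coin tosses; each outcome has probability 1/4.
omega = [a + b for a, b in product("HT", repeat=2)]

def Z(w):
    """Z counts the total number of heads over both tosses (my choice of Z)."""
    return w.count("H")

def f(x1):
    """E[Z | X_1 = x1]: average Z over outcomes consistent with the first toss."""
    consistent = [w for w in omega if w[0] == x1]
    return sum(Z(w) for w in consistent) / len(consistent)

# E[Z | F_1] as a random variable: a function on Omega that is constant on
# the atoms {HH, HT} and {TH, TT} of F_1.
cond_exp = {w: f(w[0]) for w in omega}

# Conditioning on an observed realisation O_1 = H just plugs it into f:
print(f("H"))  # 1.5, since {HH, HT} gives Z values 2 and 1
print(f("T"))  # 0.5
```

Note that `cond_exp` is the random variable $\mathbb{E}[Z|\mathcal F_1]$ (a function of $\omega$), while `f("H")` is the number you get once a particular realisation is observed.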

It looks like you were also interested in the case where your observation $O_t$ contains some redundant information. As you said, the observation must be able to identify a subset of $\mathcal F_t$. Since each trajectory corresponds to a unique subset of $\mathcal F_t$, that means it must also be able to identify $(X_1(\omega),...,X_t(\omega))$, in which case you again just plug that into the function $f$ and ignore any extra information in $O_t$.
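This last point can also be sketched in code: an observation carrying redundant extras still determines the trajectory $(X_1(\omega),\dots,X_t(\omega))$, and only that part is used. The dict format and helper names below are my own illustration, not from the answer:

```python
def f(x1, x2):
    """E[Z | X_1, X_2] with Z the number of heads; at t = T this is Z itself."""
    return (x1 + x2).count("H")

# A hypothetical observation including the redundant die roll from the question.
observation = {"X1": "H", "X2": "T", "die_roll": 6}

# Identify the trajectory (X_1(w), X_2(w)) and discard everything else.
trajectory = (observation["X1"], observation["X2"])

print(f(*trajectory))  # 1: the die roll is simply ignored
```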
