Solved – Apply HMM on a stock dataset or any other real dataset

hidden markov model

Can someone explain how to apply a hidden Markov model (HMM) to a stock dataset that has many rows and columns? I am new to HMMs but have been studying them for a week.

Here is a snapshot of dataset:

quarter  stock  date       open    high    low     close   volume
1        AA     1/7/2011   $15.82  $16.72  $15.78  $16.42  239655616
1        AA     1/14/2011  $16.71  $16.71  $15.64  $15.97  242963398
1        AA     1/21/2011  $16.19  $16.38  $15.60  $15.79  138428495
1        AA     1/28/2011  $15.87  $16.63  $15.82  $16.13  151379173
1        AA     2/4/2011   $16.18  $17.39  $16.18  $17.14  154387761
1        AA     2/11/2011  $17.33  $17.48  $16.97  $17.37  114691279
1        AA     2/18/2011  $17.39  $17.68  $17.28  $17.28  80023895
1        AA     2/25/2011  $16.98  $17.15  $15.96  $16.68  132981863
1        AA     3/4/2011   $16.81  $16.94  $16.13  $16.58  109493077
1        AA     3/11/2011  $16.58  $16.75  $15.42  $16.03  114332562
1        AA     3/18/2011  $15.95  $16.33  $15.43  $16.11  130374108
1        AA     3/25/2011  $16.38  $17.24  $16.26  $17.09  95550392
1        AXP    1/7/2011   $43.30  $45.60  $43.11  $44.36  45102042
1        AXP    1/14/2011  $44.20  $46.25  $44.01  $46.25  25913713
1        AXP    1/21/2011  $46.03  $46.71  $44.71  $46.00  38824728
1        AXP    1/28/2011  $46.05  $46.27  $43.42  $43.86  51427274
1        AXP    2/4/2011   $44.13  $44.23  $43.15  $43.82  39501680
1        AXP    2/11/2011  $43.96  $46.79  $43.88  $46.75  43746998
1        AXP    2/18/2011  $46.42  $46.93  $45.53  $45.53  28564910
1        AXP    2/25/2011  $44.94  $45.12  $43.01  $43.53  39654146

Specifically: what are the hidden states here? What are the observed states, and what are the transition, initial and emission probabilities associated with the data? I have seen many papers that talk about stock data, but none that explain this clearly. Most HMM examples describe the steps to be taken without ever showing the dataset.

Can someone explain how to use this dataset, or some other one (but please show the dataset as well), with an HMM? A step-by-step demo would be a great help. In particular, I want to see how the dataset is used, rather than an explanation of the forward, Viterbi and Baum-Welch algorithms, and please no weather=(sunny, rainy, windy) scenario.

Thanks in advance for any help.

Best Answer

There is no one way to apply an HMM to stock market data. There are many options, and you might find that some ways are more useful than others.

In my opinion, the primary decision you should concern yourself with is what the state process should represent. Your only constraint for this decision is that the state process must evolve on a discrete state space (e.g. $\{1,2,3,4\}$). In an economic/financial context, the values of this state space are often called "regimes."
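To make the pieces named in the question concrete, a generic $K$-state HMM can be written down as follows. This is only a sketch: the symbols $y_t$ (the observation at time $t$), $\pi$, $A$ and $f_k$ are notation assumed here for illustration, and the emission family $f_k$ is a modelling choice you make, not something the data dictates.

$$
\begin{aligned}
\pi_k &= P(x_1 = k) && \text{(initial probabilities)} \\
A_{ij} &= P(x_{t+1} = j \mid x_t = i) && \text{(transition probabilities)} \\
y_t \mid x_t = k &\sim f_k(\cdot) && \text{(emission distribution for regime } k\text{)}
\end{aligned}
$$

None of these quantities appears as a column in the dataset. They are unknown parameters that are estimated from the observed sequence $y_1, \dots, y_T$.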

One example: let your state be $x_t \in \{0,1\}$. If $x_t = 0$, then the mean return is positive. Otherwise, the mean return is negative. This suggests that your state variable could have the interpretation of whether you're in a "bull" or a "bear" market. There's a paper from the 1990s by Hamilton that people cite quite often that implements this idea (I think). I am not sure, though, because I have never read it.
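A minimal sketch of this two-regime idea, assuming Gaussian emissions and using the weekly AA closing prices from the question's snapshot. The initial, transition and emission parameters below are hand-picked placeholders, not estimates; in practice you would fit them (for example with Baum-Welch). The function `forward_filter` is a hypothetical helper that runs the normalized forward recursion to get the probability of each regime given the returns seen so far.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def forward_filter(obs, init, trans, means, stds):
    """Return P(x_t = k | y_1..y_t) for every t (normalized forward recursion)."""
    n_states = len(init)
    filtered = []
    prev = init
    for t, y in enumerate(obs):
        if t == 0:
            predicted = prev
        else:
            # One-step prediction: push yesterday's probabilities through the transition matrix.
            predicted = [sum(prev[i] * trans[i][j] for i in range(n_states))
                         for j in range(n_states)]
        # Weight each regime by how well it explains today's return, then normalize.
        unnorm = [predicted[k] * normal_pdf(y, means[k], stds[k]) for k in range(n_states)]
        total = sum(unnorm)
        prev = [u / total for u in unnorm]
        filtered.append(prev)
    return filtered

# Weekly AA closing prices copied from the snapshot in the question.
aa_close = [16.42, 15.97, 15.79, 16.13, 17.14, 17.37,
            17.28, 16.68, 16.58, 16.03, 16.11, 17.09]
returns = [math.log(b / a) for a, b in zip(aa_close, aa_close[1:])]

# ASSUMED parameters, chosen by hand for illustration only.
init = [0.5, 0.5]                   # initial probabilities
trans = [[0.9, 0.1], [0.1, 0.9]]    # regimes are persistent
means = [0.02, -0.02]               # state 0: positive mean return, state 1: negative
stds = [0.03, 0.03]                 # same dispersion in both regimes

probs = forward_filter(returns, init, trans, means, stds)
for r, p in zip(returns, probs):
    print(f"return {r:+.4f}   P(state 0 | data so far) = {p[0]:.3f}")
```

With only eleven weekly returns the output says very little about AA itself. The point is just to show where the observed column enters the model and what the hidden state refers to.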

Another example: assume your state $x_t \in \{0,1\}$ again, and if $x_t = 0$, then some measure of dispersion for the return distribution is low, and otherwise it's high. Then your state would have the interpretation of whether you're in a low-volatility or high-volatility regime. This would be a competitor to other stochastic volatility models (though most of the time their state processes evolve on a continuous state space, like $\mathbb{R}$).
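The volatility-regime variant can be sketched the same way: both regimes share a mean, but differ in standard deviation. The example below uses the weekly AXP closes from the question and a hypothetical `viterbi` helper that returns the single most likely regime sequence. All parameter values are assumptions picked by hand, not fitted to anything.

```python
import math

def log_normal_pdf(x, mean, std):
    """Log density of a normal distribution at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)

def viterbi(obs, init, trans, means, stds):
    """Most likely hidden state sequence under a Gaussian-emission HMM."""
    n = len(init)
    score = [math.log(init[k]) + log_normal_pdf(obs[0], means[k], stds[k])
             for k in range(n)]
    back = []  # back[t][j] = best predecessor of state j at step t + 1
    for y in obs[1:]:
        pointers, new_score = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + math.log(trans[i][j]))
            pointers.append(best_i)
            new_score.append(score[best_i] + math.log(trans[best_i][j])
                             + log_normal_pdf(y, means[j], stds[j]))
        back.append(pointers)
        score = new_score
    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Weekly AXP closing prices copied from the snapshot in the question.
axp_close = [44.36, 46.25, 46.00, 43.86, 43.82, 46.75, 45.53, 43.53]
returns = [math.log(b / a) for a, b in zip(axp_close, axp_close[1:])]

# ASSUMED parameters, chosen by hand for illustration only.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
means = [0.0, 0.0]      # same mean in both regimes
stds = [0.01, 0.05]     # state 0: low volatility, state 1: high volatility

path = viterbi(returns, init, trans, means, stds)
print(path)  # one regime label (0 or 1) per weekly return
```

Compared with the bull/bear setup, only the emission parameters change. The state space and the transition structure are the same.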

Two more things. First, I have never seen the observations called "states." You probably want to reserve that word for the hidden/latent portions. The observations will most likely be some kind of returns, although there is no reason that is necessary. Second, this is not an exhaustive list of suggestions. Try fitting different models, and see if they are useful.
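As a rough illustration of how the dataset turns into observations, here is one way to get weekly log returns out of rows like those in the question. It assumes whitespace-separated fields in the column order shown in the snapshot, and that log returns of the closing price are what you want to model; both are choices, not requirements.

```python
import math
from collections import defaultdict

# A few rows from the question's snapshot:
# quarter, stock, date, open, high, low, close, volume
raw = """\
1 AA 1/7/2011 $15.82 $16.72 $15.78 $16.42 239655616
1 AA 1/14/2011 $16.71 $16.71 $15.64 $15.97 242963398
1 AA 1/21/2011 $16.19 $16.38 $15.60 $15.79 138428495
1 AXP 1/7/2011 $43.30 $45.60 $43.11 $44.36 45102042
1 AXP 1/14/2011 $44.20 $46.25 $44.01 $46.25 25913713
1 AXP 1/21/2011 $46.03 $46.71 $44.71 $46.00 38824728
"""

# Collect closing prices per stock, keeping the rows in date order.
closes = defaultdict(list)
for line in raw.splitlines():
    fields = line.split()
    stock = fields[1]
    close = float(fields[6].lstrip("$"))
    closes[stock].append(close)

# One observation sequence per stock: log(close_t / close_{t-1}).
returns = {
    stock: [math.log(b / a) for a, b in zip(prices, prices[1:])]
    for stock, prices in closes.items()
}

for stock, series in returns.items():
    print(stock, [round(r, 4) for r in series])
```

Each stock gets its own sequence, so no return is ever computed across the boundary between the last AA row and the first AXP row. Whether the stocks then share one set of HMM parameters or get a separate model each is a further modelling decision.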
