How many turns will it take for this coin flipping game to end


Here is a problem (coin flipping game) I recently thought about.

Suppose there are 2 Discrete Time Markov Chains:

  • Markov Chain A (3-state chain):

$$
X_t =
\begin{bmatrix}
1/3 & 1/3 & 1/3 \\
1/3 & 1/3 & 1/3 \\
0 & 0 & 1 \\
\end{bmatrix}
$$

  • Markov Chain B (4-state chain):
    $$
    Y_t =
    \begin{bmatrix}
    1/4 & 1/4 & 1/4 & 1/4 \\
    1/4 & 1/4 & 1/4 & 1/4 \\
    0 & 0 & 1 & 0 \\
    1/4 & 1/4 & 1/4 & 1/4 \\
    \end{bmatrix}
    $$

In this problem, we start in Chain A in state 1.

  • At each new time point, we flip a fair coin: if heads, we stay in the same state we are currently in but move to Chain B; if tails, we stay in Chain A.
  • The first time we move from Chain A to Chain B, we permanently stay in Chain B.
  • Once we reach State 3 (whether in Chain A or in Chain B), we stop forever.

I tried to represent all of this mathematically (with $p = 1/2$ for the fair coin):

$$
B_t =
\begin{cases}
1 & \text{with probability } p \\
0 & \text{with probability } 1-p
\end{cases}
$$

$$
S_t = \sum_{i=0}^{t} B_i
$$

$$
Z_t =
\begin{cases}
X_t & \text{if } S_t = 0 \\
Y_t & \text{if } S_t \geq 1
\end{cases}
$$

$$
T = \min \{ t \geq 0 : Z_t = 3 \}
$$

Based on this, I am interested in answering the following types of questions:

  • On average, what percent of the time will $Z_t$ stop in $X_t$ vs $Y_t$?
  • Is it possible to find out the joint probability distribution of (number of steps to switch from $X_t$ to $Y_t$, number of steps for the process to stop)?

I am not sure how to answer these questions, so I wrote R simulations to try and answer them numerically. For example:

library(tidyverse)     # loads dplyr, ggplot2, and purrr (map_dfr) among others
library(RColorBrewer)  # color palettes for the contour plot; MASS::kde2d is called via its namespace below


simulate_markov_chain <- function(simulation_num) {
    # Transition matrices
    transition_matrix_A <- matrix(c(1/3, 1/3, 1/3,  # probabilities from state 1
                                    1/3, 1/3, 1/3,  # probabilities from state 2
                                    0,   0,   1),   # probabilities from state 3
                                  nrow = 3, byrow = TRUE)
    
    transition_matrix_B <- matrix(c(1/4, 1/4, 1/4, 1/4,  # probabilities from state 1
                                    1/4, 1/4, 1/4, 1/4,  # probabilities from state 2
                                    0,   0,   1,   0,    # probabilities from state 3
                                    1/4, 1/4, 1/4, 1/4), # probabilities from state 4
                                  nrow = 4, byrow = TRUE)
    
    # Initial state and chain
    state <- 1
    chain <- "A"
    
    # Record the path, starting from iteration 1
    path_df <- data.frame(iteration = 1, chain = chain, state = state)
    
    # Simulate until the process reaches state 3
    iteration <- 1
    while (state != 3) {
        # Flip a coin
        coin_flip <- sample(c("heads", "tails"), size = 1, prob = c(0.5, 0.5))
        
        # On heads (or once already in Chain B), take a step in Chain B; otherwise step in Chain A
        if (coin_flip == "heads" || chain == "B") {
            chain <- "B"
            state <- sample(1:4, size = 1, prob = transition_matrix_B[state, ])
        } else {
            state <- sample(1:3, size = 1, prob = transition_matrix_A[state, ])
        }
        
        # Record the new position
        iteration <- iteration + 1
        path_df <- rbind(path_df, data.frame(iteration = iteration, chain = chain, state = state))
    }
    
    # Tag every row with the simulation index
    path_df$simulation_num <- simulation_num
    
    return(path_df)
}



# Run the simulation 1000 times
results <- map_dfr(1:1000, simulate_markov_chain)
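
As a quick check on the first question above, the split between games that stop while still in Chain A and games that stop after switching to Chain B can be read directly off results. This is just a sketch; stop_split and final_chain are names used only here:

# For each simulated game, which chain was it in when it stopped?
stop_split <- results %>%
    group_by(simulation_num) %>%
    summarise(final_chain = last(chain), .groups = "drop") %>%
    count(final_chain) %>%
    mutate(proportion = n / sum(n))

stop_split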




############################################################


# For each game, flag the iteration of the A -> B switch (NA elsewhere) and the final iteration
results <- results %>%
    group_by(simulation_num) %>%
    mutate(transition_iteration = ifelse(chain == "B" & lag(chain) == "A", iteration, NA),
           max_iteration = max(iteration)) %>%
    ungroup()



# One row per game: first switch iteration (Inf if the game never switched) and stopping iteration
transition_iterations <- results %>% 
    group_by(simulation_num) %>% 
    summarise(transition_iteration = min(transition_iteration, na.rm = TRUE),
              max_iteration = max(iteration))


# Drop the games that never switched to Chain B (their transition time is Inf)
transition_iterations <- transition_iterations %>%
    filter(!is.infinite(transition_iteration), !is.infinite(max_iteration))

# 2-D kernel density estimate of (transition iteration, stopping iteration)
density_estimate <- MASS::kde2d(x = transition_iterations$transition_iteration, 
                                y = transition_iterations$max_iteration, 
                                n = 100)

filled.contour(x = density_estimate$x, 
               y = density_estimate$y, 
               z = density_estimate$z, 
               color.palette = colorRampPalette(brewer.pal(9, "Greens")),
               plot.title = title(main = "Contour Plot of Transition Iteration vs Max Iteration",
                                  xlab = "Transition Iteration", 
                                  ylab = "Max Iteration"),
               plot.axes = {axis(1); axis(2); })

[Contour plot: transition iteration vs. max iteration]

As we can see, the majority of the games finish relatively early (i.e. the bottom-left corner of the plot).

Note that sometimes $Z_t$ never transitions to $Y_t$, so the transition time is infinite, and I had to remove these rows from the analysis. I am not sure whether a probability distribution can be defined with such infinite values.
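
Since all of the simulated times are integers, the joint distribution in the second question can also be tabulated empirically, keeping the games that never switch as their own (NA) category rather than dropping them. A small sketch that recomputes both times from results (switch_step and stop_step are names used only here):

# Empirical joint distribution of (step of the A -> B switch, step the game stops),
# with NA marking games that stop while still in Chain A
joint_counts <- results %>%
    group_by(simulation_num) %>%
    summarise(switch_step = if (any(chain == "B")) min(iteration[chain == "B"]) else NA_real_,
              stop_step   = max(iteration),
              .groups = "drop") %>%
    count(switch_step, stop_step) %>%
    mutate(proportion = n / sum(n))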

But analytically (i.e. without simulations), is it possible to answer the following questions?

  • On average, what percent of the time will $Z_t$ stop in $X_t$ vs $Y_t$? (i.e. stopping in Chain A vs stopping in Chain B)
  • Is it possible to find out the joint probability distribution of (number of steps to switch from $X_t$ to $Y_t$, number of steps for the process to stop)?
  • Within a given simulation, after $n$ turns of playing the game, what is the probability we are in Chain A?

Normally I would have used the Fundamental Matrix approach (e.g. https://en.wikipedia.org/wiki/Absorbing_Markov_chain), but I am not sure if it can be applied here.

Thanks!

Note: I made this visualization for the percent of games in Chain A vs Chain B as the number of turns in a simulation increases:

library(ggplot2)
# Number of games recorded in each chain at each iteration
chain_counts <- results %>%
    group_by(iteration, chain) %>%
    summarise(n = n(), .groups = "drop")

total_counts <- results %>%
    group_by(iteration) %>%
    summarise(total = n(), .groups = "drop")

percentages <- inner_join(chain_counts, total_counts, by = "iteration") %>%
    mutate(percentage = n / total * 100)


ggplot(percentages, aes(x = iteration, y = percentage, fill = chain)) +
    geom_bar(stat = "identity", position = "dodge") +
    labs(x = "Iteration", y = "Percentage", fill = "Chain") +
    theme_minimal() +
    scale_fill_brewer(palette = "Set1") +
    scale_x_continuous(breaks = seq(min(percentages$iteration), max(percentages$iteration), by = 1)) +
    ggtitle("Length of Game vs Percentage of Games in Chain A / Chain B")

[Bar chart: length of game vs. percentage of games in Chain A / Chain B]

Best Answer

For your first question, you have a three-position Markov chain. Position $1$ is the start (Chain $A$, state $1$ or $2$). Position $2$ is Chain $A$, state $3$, and position $3$ is Chain $B$ (any state). Positions $2$ and $3$ are absorbing. From start, you go to position $3$ with probability $\frac 12$ on the coin flip. You go to position $2$ with probability $\frac 16$, with a factor $\frac 12$ from the coin flip and a factor $\frac 13$ from Chain $A$'s transition to state $3$. You return to start with probability $\frac 13$.

We can ignore the returns to start to see that the chance you end in position $2$ is $\frac{1/6}{1/6 + 1/2} = \frac 14$ and the chance you end in position $3$ is $\frac{1/2}{1/6 + 1/2} = \frac 34$.
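
Here is a quick numeric check of this lumped three-position chain using the fundamental matrix mentioned in the question; the code is just a sketch that encodes the probabilities stated above, with positions $2$ and $3$ absorbing:

# Lumped chain: 1 = start (Chain A, state 1 or 2), 2 = Chain A state 3, 3 = Chain B
P <- matrix(c(1/3, 1/6, 1/2,
              0,   1,   0,
              0,   0,   1),
            nrow = 3, byrow = TRUE)

Q <- P[1, 1, drop = FALSE]    # transient -> transient block
R <- P[1, 2:3, drop = FALSE]  # transient -> absorbing block
N <- solve(diag(1) - Q)       # fundamental matrix
N %*% R                       # absorption probabilities: 1/4 (Chain A) and 3/4 (Chain B)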

For the second and third questions your Markov chain gains one state. You split the old position $3$ into position $3$, which is Chain $B$ not in state $3$, and position $4$, which is Chain $B$, state $3$. Positions $2$ and $4$ are absorbing. The probabilities from start are $\frac 13$ to return to start, $\frac 16$ to position $2$, $\frac 38$ to position $3$, and $\frac 18$ to position $4$. From position $3$ it is $\frac 34$ to return to position $3$ and $\frac 14$ to move to position $4$ and end.
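
The same fundamental-matrix computation works for this four-position chain (a sketch reusing the names from the previous snippet; positions $2$ and $4$ are absorbing). Besides the absorption split, the row sums of the fundamental matrix give the expected number of moves until the game stops, which should come out to $\frac{15}{4}$ from the start:

# 1 = start, 2 = Chain A state 3, 3 = Chain B not in state 3, 4 = Chain B state 3
P <- matrix(c(1/3, 1/6, 3/8, 1/8,
              0,   1,   0,   0,
              0,   0,   3/4, 1/4,
              0,   0,   0,   1),
            nrow = 4, byrow = TRUE)

Q <- P[c(1, 3), c(1, 3)]   # transient block (start, Chain B not in state 3)
R <- P[c(1, 3), c(2, 4)]   # transient -> absorbing block
N <- solve(diag(2) - Q)    # fundamental matrix
N %*% R                    # absorption probabilities from each transient position
rowSums(N)                 # expected number of moves to absorption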

The chance you move from Chain $A$ to Chain $B$ on move $n$ is $\left( \frac 13 \right)^{n-1}\cdot \frac 12$, where the first factor comes from $n-1$ returns to start and the second comes from the final heads on the coin flip. You can sum this to get the chance you are on Chain $B$ after $n$ moves. Subtracting from $1$ gives the chance you are in Chain $A$ after $n$ moves. As $n \to \infty$ the chance you are in Chain $B$ approaches $\frac 34$, as it should from the first paragraph.
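
Writing that sum out explicitly (just the arithmetic behind this paragraph):

$$
\sum_{k=1}^{n} \left(\frac{1}{3}\right)^{k-1}\cdot\frac{1}{2} = \frac{3}{4}\left(1-\left(\frac{1}{3}\right)^{n}\right),
\qquad
1 - \frac{3}{4}\left(1-\left(\frac{1}{3}\right)^{n}\right) = \frac{1}{4} + \frac{3}{4}\left(\frac{1}{3}\right)^{n},
$$

so the chance of being in Chain $A$ after $n$ moves (including games already stopped there) decays geometrically to $\frac 14$.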
