EDIT: I've added the closed form expression of the probability
The probablity of unsuccessful walk after $n$ steps is given as follows
$$
P(n) = \sum_{k = 0}^{n} 2^{-n}p(k,n)$$
where
$$p(k,n)=\begin{cases}
\displaystyle\binom{n}{k} & k < x_c-x_0\\
\displaystyle\binom{n}{k} - \binom{n}{k+x_0-x_c}&x_c - x_0\leq k < \dfrac{n + x_c -x_0}{2}\\
0 &k \geq \dfrac{n + x_c - x_0}{2}
\end{cases}$$
or compactly
$$
P(n) = \sum_{k = 0}^{\lfloor\frac{1}{2}(n+x_c-x_0)\rfloor} 2^{-n}\binom{n}{k} - \sum_{k = x_c-x_0}^{\lfloor\frac{1}{2}(n+x_c-x_0)\rfloor} 2^{-n}\binom{n}{k+x_0-x_c}$$
The function $p(k,n)$ represents the number of paths that reach position $x = x_0-n+2k$ without reaching the termination target $x_c$. Hence $P(n)$ is the sum of $p(k,n)$ for all $x$, with the weight $2^{-n}$ for each path. In order to derive the formula of $p(n)$ we need to prove the following relation
$$\text{TNOW to reach } x\text{ from } x_0 \text{ without reaching }x_c\\= \\ \text{TNOW to reach } x\text{ from } x_0 - \text{TNOW to reach }x \text{ from } 2x_c - x_0$$
TNOW stands for "the number of ways" or the number of random walks. Let $X(i)$ be the position at step $i$ of a random walk that crosses or touches the boundary $x_c$. Let's define the reflection $R$ that maps the random walk $X$ to another random walk $Y_X$ by the following formula
$$Y_X(i) = \begin{cases}
2x_c - X(i) &i < L\\
X(i) &i \geq L
\end{cases}$$
where $L$ is the last step that at which the randow walk reached the position $x_c$. The figure below illustrates how the function is defined
This function maps random walk that starts from point $x_0$ and ends at $x$ to a random walk that starts at point $2x_c-x_0$ and also ends at $x$, given they reach the boundary at least once, and vice versa. Furthermore, this reflection is an involution i.e. $R(R(X)) = X$, hence it is bijection between these set of random walks. And therefore their numbers are equal. Since the number of RW that reach $x$ and don't reach $x_c$ = the number of RW that reach $x$ - the number of RW that reach x and reach $x_c$ at some step(s), this gives us the relation aforementioned.
This reflection is the technique that is used to prove the reflection principle
The last thing we need to do is calculate the number of RW that starts from $x_0$ and $2x_c - x_0$, which are given by binomial distribution. This gives the formula of $p(k,n)$ Q.E.D.
Here's the old answer with approximation
The amount of time for a random process to reach a certain point for the first time is known as the first hitting time. For a Brownian motion, the survival probability of particle remains at the posittion $x < x_c$ at time $t$ is given by
$$
\DeclareMathOperator{\erf}{erf}
S(t) =\erf\left(\frac{x_c-x_0}{2\sqrt{Dt}}\right)$$
where $D$ is the diffusion constant. The expected number of trials until the first hitting time is larger than $t$ is $\dfrac{1}{S(t)}$. A 1D random walk with time step $\delta$ and step size $\epsilon$ is approximately a 1D Brownian motion with $D=\dfrac{\epsilon^2}{2\delta}$, thus
$$\DeclareMathOperator{\erf}{erf}
S_{\mathrm{RW}}(t)\approx \erf\left(\frac{x_c-x_0}{2\sqrt{\frac{\epsilon^2t}{2\delta}}}\right)$$
Given $\delta = 1, \epsilon = 1, x_c = 100, x_0 = 0$ we have $S(100000) \approx \mathrm{erf}(0.3863) \approx 0.2481\%$ and the expected number of trials is approximately $4.03$
The data given in the program, however, correspond to $x_c = 10$ (targer <- 10
) and so $S(100000) \approx \mathrm{erf}(0.03863) \approx 2.52\%$ . $10$ is quite low and the lower $x_c$ leads to higher error in the approximation.
Edit: I've also used Python and in R to test it out (not very efficient though)
import random
t = []
target = 100
trials = 1000
for i in range(trials):
x = 0
for j in range(100000):
x += -1 + random.randint(0,1)*2
if x == target:
t.append(j)
break
print(f"Trial {i+1} finished after {j+1} steps")
print(1 - len(t)/trials)
R:
num_simulations <- 1000
max_steps <- 100000
target <- 100
steps <- numeric(num_simulations)
trajectories <- list()
terminated <- 0
for (i in 1:num_simulations) {
position <- 0
print(i)
for (step in 1:max_steps) {
position <- position + sample(c(-1, 1), 1)
if (position == target) {
steps[i] <- step
break
}
}
if (step == max_steps) {
terminated <- terminated + 1
}
}
sprintf("The percentage of unsuccessful attempts is %f %%", (terminated / num_simulations) * 100)
I got $22.6\%$ and $24.4\%$ for $x_c = 100$ and $2.5\%$ and $3.2\%$ for $x_c = 10$. which are quite close to the estimation
Best Answer
I think I have just got some sort of answer thanks to this question.
So, let's say $S_n=1 + \sum_{k=0}^n X_k$ where $(X_k)k$ are i.i.d. random variables with $\mathbb P (X_k = 1) = \frac 1 2 = 1 - \mathbb P (X_k = -1).$
Then, set $S_T = \inf\{ n\in\mathbb N :S_n \in \{0,K\} \}$ and calculate $$ \mathbb E[S_T] = 0 \ \mathbb P (S_T = 0) + K\ \mathbb P (S_T = K) = K (1 - \mathbb P (S_T = 0)) = K(1-\mathbb P(T_0 < T_K)), $$ where $$ T_0 = \inf\{n: S_n = 0\},\ T_K = \inf\{ n:S_n = K \}. $$ With the Optional Stopping Theorem, you get $\mathbb E[S_T]= \mathbb E[S_0] = 1$, and thus $$ K(1-\mathbb P(T_0 <T_K)) = 1, $$ which then leads to $\mathbb P(T_0 <T_K) = 1-\frac 1 K.$