I would change "visits every state infinitely often" to "visits every neighborhood of every point infinitely often". The process you described does not actually hit every point; in fact, it hits any given point (other than its starting point) with probability zero. One can say that the walk is "neighborhood recurrent" (for $d\le 2$) to emphasize this fact.

Since the walk visits every neighborhood of every point and superharmonic functions are lower semicontinuous, you can conclude that $f\le M_\infty$ at every point. On the other hand, the supermartingale property implies that $E f(X_n)$ does not increase with $n$, so by Fatou's lemma $E M_\infty\le f(x)$ where $x$ is the starting point. Combined with $f(x)\le M_\infty$ a.s., this gives $M_\infty = f(x)$ a.s.; since the starting point was arbitrary, it follows that $f\equiv M_\infty$ is constant.

In dimensions $d>2$ there are nonconstant bounded positive superharmonic functions such as $\min(1,|x|^{2-d})$ (truncated at $1$ just to make it bounded and continuous; $|x|^{2-d}$ itself is superharmonic too). By the argument above, you can conclude that the process you described is not neighborhood recurrent when $d>2$.
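As a quick numerical sanity check (a Monte Carlo sketch, not part of the argument; the center point and sample size are arbitrary choices), one can verify in $d=3$ that the average of $\min(1,|x|^{2-d}) = \min(1,1/|x|)$ over a unit ball does not exceed its value at the center:

```python
# Monte Carlo check (illustration only) that f(x) = min(1, |x|^(2-d))
# is superharmonic in d = 3: its average over the unit ball around x0
# should be at most f(x0).
import math
import random

random.seed(0)

def f(p):
    r = math.sqrt(sum(c * c for c in p))
    return min(1.0, 1.0 / r)  # |x|^(2-3) = 1/|x|

def uniform_ball_point():
    # rejection sampling: uniform point in the unit ball of R^3
    while True:
        q = [random.uniform(-1, 1) for _ in range(3)]
        if sum(c * c for c in q) <= 1.0:
            return q

x0 = (1.5, 0.0, 0.0)  # ball B(x0, 1) overlaps the truncation region |x| < 1
n = 200_000
avg = sum(f([x0[i] + q[i] for i in range(3)])
          for q in (uniform_ball_point() for _ in range(n))) / n

print(f"f(x0) = {f(x0):.4f}, ball average = {avg:.4f}")
assert avg < f(x0)  # superharmonic: mean value <= value at the center
```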

The difference is the sharper constants for the sub-Gaussian norm/parameter of the bounded variables. You can still use Azuma's technique to get the sharper bound. Here are some notes I wrote a while ago:

**Sub-Gaussian tail bounds**

Recall that if $\mathbb{E} X = 0$ and $\mathbb{E} e^{\lambda X} \le e^{\sigma^2 \lambda^2/2 }$ for all $\lambda$, then the Chernoff argument ($\mathbb{P}(X \ge t) \le e^{-\lambda t}\,\mathbb{E} e^{\lambda X}$, optimized at $\lambda = t/\sigma^2$) yields the tail bound
\begin{align*}
\mathbb{P}( X \ge t) \le \exp\Big( {- \frac{t^2}{2\sigma^2}}\Big)
\end{align*}
Such a random variable is called sub-Gaussian with parameter $\sigma$.

A bounded variable $X \in [a,b]$ has squared sub-Gaussian parameter $\sigma^2 = (b-a)^2/4$ (Hoeffding's lemma). This sharp constant requires some work to show (left as an exercise).
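As an illustration (a numerical check of a special case, not the proof): for a Rademacher variable $X \in \{-1,+1\}$ we have $a=-1$, $b=1$, so $\sigma^2 = 1$, and the claimed MGF bound reads $\mathbb{E} e^{\lambda X} = \cosh\lambda \le e^{\lambda^2/2}$, which we can verify on a grid:

```python
# Numerical check of the sharp sub-Gaussian parameter for a bounded
# variable in the Rademacher case: X in {-1, +1}, sigma^2 = (b-a)^2/4 = 1,
# so the MGF bound is cosh(lambda) <= exp(lambda^2 / 2).
import math

sigma2 = (1 - (-1)) ** 2 / 4  # = 1
for k in range(-100, 101):
    lam = k / 10.0                      # lambda on a grid in [-10, 10]
    mgf = math.cosh(lam)                # E exp(lambda * X) for X = +-1 w.p. 1/2
    bound = math.exp(sigma2 * lam ** 2 / 2)
    assert mgf <= bound + 1e-12
print("cosh(lambda) <= exp(lambda^2 / 2) on the whole grid")
```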

**Azuma-Hoeffding approach**

We want to provide concentration bounds for $Z = f(X) = f(X_1,\dots,X_n)$, where $X_1,\dots,X_n$ are independent.

Let $\mathbb{E}_i[Z] := \mathbb{E}[Z | X_1,\dots,X_i]$ and $\Delta_i = \mathbb{E}_i[Z] - \mathbb{E}_{i-1}[Z]$. Then, $\{\Delta_i\}$ is a martingale difference sequence: $\mathbb{E}_{i-1}[\Delta_i] = 0$. Let $S_j := \sum_{i=1}^j \Delta_i$ which is only a function of $X_i, i \le j$.
\begin{align*}
S_n = \sum_{i=1}^n \Delta_i = Z - \mathbb{E} Z
\end{align*}
Let us assume that (*) $\mathbb{E}_{i-1} [e^{\lambda \Delta_i}] \le e^{\sigma_i^2 \lambda^2/2}$ for all $\lambda \ge 0$, and all $i$, almost surely. Then,
\begin{align*}
\mathbb{E}_{n-1} [ e^{\lambda S_n}] = e^{\lambda S_{n-1}}\mathbb{E}_{n-1} [ e^{\lambda \Delta_n}] \le e^{\lambda S_{n-1}} e^{\sigma_n^2 \lambda^2/2}
\end{align*}
Taking $\mathbb{E}_{n-2}$ of both sides:
\begin{align*}
\mathbb{E}_{n-2} [ e^{\lambda S_n}] \le e^{\sigma_n^2 \lambda^2/2} \mathbb{E}_{n-2}[ e^{\lambda S_{n-1}} ] \le
e^{\lambda S_{n-2}} e^{(\sigma_n^2 + \sigma_{n-1}^2)\lambda^2/2}
\end{align*}
Repeating the process, we get $\mathbb{E}_{0} [e^{\lambda S_n}] \le \exp \big( \big(\sum_{i=1}^n \sigma_i^2\big)\lambda^2/2 \big)$, which shows that $S_n = Z - \mathbb{E} Z$ is sub-Gaussian with squared parameter $\sigma^2 = \sum_{i=1}^n \sigma_i^2$.
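For a concrete instance (a simulation sketch; the specific $n$, $t$, and trial count are arbitrary choices): take $\Delta_i = X_i$ i.i.d. Rademacher, so (*) holds with $\sigma_i^2 = 1$ and the resulting tail bound is $\mathbb{P}(S_n \ge t) \le \exp(-t^2/(2n))$, which a small Monte Carlo experiment should not violate:

```python
# Empirical check of the tail bound from the Azuma-Hoeffding argument
# for S_n = sum of n i.i.d. Rademacher variables: each sigma_i^2 = 1,
# so P(S_n >= t) <= exp(-t^2 / (2n)).
import math
import random

random.seed(1)
n, t, trials = 100, 20, 20_000
bound = math.exp(-t * t / (2 * n))  # exp(-2) here

hits = 0
for _ in range(trials):
    s = sum(random.choice((-1, 1)) for _ in range(n))
    if s >= t:
        hits += 1
empirical = hits / trials

print(f"empirical tail {empirical:.4f} <= bound {bound:.4f}")
assert empirical <= bound
```

The bound is loose in this regime (the true tail is closer to the Gaussian tail of $S_n/\sqrt{n}$), which is expected: sub-Gaussian bounds control the tail uniformly, not exactly.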

**Bounded difference**

The conditional sub-Gaussian assumption (*) holds under the bounded difference condition
\begin{align*}
|f(x_1,\dots,x_{i-1},x_i,x_{i+1},\dots,x_n) - f(x_1,\dots,x_{i-1},x_i',x_{i+1},\dots,x_n)| \le L_i
\end{align*}
Let $g_i(x_1,\dots,x_i) := \mathbb{E}[f(x_1,\dots,x_{i-1},x_i,X_{i+1},\dots,X_n) ]$ so that $\mathbb{E}_i[Z] = g_i(X_1,\dots,X_i)$. This uses independence, i.e., the distribution of the remaining coordinates does not change by conditioning. It is easy to see that $g_i$ satisfies the bounded difference condition with constants $(L_1,\dots,L_i)$, by an application of Jensen's inequality.

**Naive bound:** We have
\begin{align*}
\Delta_i = \mathbb{E}_i[Z] - \mathbb{E}_{i-1}[\mathbb{E}_i[Z]] = g_i(X_1,\dots,X_i) - \mathbb{E}_{i-1}[g_i(X_1,\dots,X_i)]
\end{align*}
Conditioned on $X_1,\dots,X_{i-1}$, we are effectively looking at ($X'_i$ is an independent copy of $X_i$)
\begin{align*}
g_i(x_1,\dots,x_{i-1},X_i)- \mathbb{E}[g_i(x_1,\dots,x_{i-1},X'_i)]
\end{align*}
due to independence of $\{X_i\}$. Thus, $|\Delta_i| \le L_i$ conditioned on $X_1,\dots,X_{i-1}$. That is, $\mathbb{E}_{i-1} [e^{\lambda \Delta_i}] \le e^{\sigma_i^2 \lambda^2/2}$ where $\sigma_i^2 = (2L_i)^2/4 = L_i^2$.

**Better bound:** We can show that $\Delta_i \in I_i$ for an interval $I_i$ of length $|I_i| \le L_i$, improving the constant by a factor of $4$: Conditioned on $X_1,\dots, X_{i-1}$, we are effectively looking at
\begin{align*}
\Delta_i = g_i(x_1,\dots,x_{i-1},X_i) - \mu_i
\end{align*}
where $\mu_i$ is a constant (only a function of $x_1,\dots,x_{i-1}$). Then, $\Delta_i + \mu_i \in [a_i,b_i]$ where $a_i = \inf_x g_i(x_1,\dots,x_{i-1},x)$ and $b_i = \sup_x g_i(x_1,\dots,x_{i-1},x)$. We have
\begin{align*}
b_i - a_i = \sup_{x,y} \big[ g_i(x_1,\dots,x_{i-1},x) - g_i(x_1,\dots,x_{i-1},y) \big] \le L_i
\end{align*}
Thus, we have $\mathbb{E}_{i-1}[e^{\lambda \Delta_i}] \le e^{\sigma_i^2 \lambda^2/2}$ where $\sigma_i^2 = (b_i-a_i)^2/4 \le L_i^2/4$.
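This interval bound can be checked by brute force on a toy example (an illustration only; the function $f(x) = |x_1+\dots+x_4 - 2|$ on i.i.d. fair bits, which has $L_i = 1$, is my choice, not from the notes above):

```python
# Exhaustive check that b_i - a_i <= L_i for a toy bounded-difference
# function on {0,1}^4 with i.i.d. fair bits and L_i = 1.
from itertools import product

n = 4

def f(x):
    # changing any single bit moves the sum by 1, so L_i = 1
    return abs(sum(x) - n // 2)

def g(prefix):
    # g_i(x_1,...,x_i) = E[f(x_1,...,x_i, X_{i+1},...,X_n)], bits uniform
    rest = n - len(prefix)
    vals = [f(prefix + tail) for tail in product((0, 1), repeat=rest)]
    return sum(vals) / len(vals)

for i in range(1, n + 1):
    for prefix in product((0, 1), repeat=i - 1):
        # Delta_i + mu_i = g_i(prefix, x) ranges over [a_i, b_i]
        vals = [g(prefix + (x,)) for x in (0, 1)]
        assert max(vals) - min(vals) <= 1.0 + 1e-12  # b_i - a_i <= L_i
print("b_i - a_i <= L_i for every i and every prefix")
```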

## Best Answer

$f(S_{n+1})=f(S_n+\xi_{n+1})$

Using Did's hint: if $g(x)=E(f(x,Y))$ with $X$ and $Y$ independent, then $g(X)=E(f(X,Y)|X)$

$\Rightarrow E(f(S_n+\xi_{n+1})|\mathcal F_n)=g(S_n)$

but $\displaystyle g(x)=\int f(x+y)\nu(dy)$

and in our case $\nu(dy)=\mathbf 1_{B(0,1)}(y)\frac{1}{|B(0,1)|}\,dy$ (because the $\xi_i$ are uniform on $B(0,1)$)

$\displaystyle\Rightarrow g(S_n)=\int f(S_n+y)\mathbf 1_{B(0,1)}(y)\frac{1}{|B(0,1)|}\,dy=\frac{1}{|B(0,1)|}\int_{B(S_n,1)}f(y)\,dy\le f(S_n),$ where the last inequality is the mean-value inequality for the superharmonic $f$. Hence $f(S_n)$ is a supermartingale.
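A small simulation sketch of this conclusion (illustrative only; the start point, step count, and sample sizes are arbitrary choices): along the walk $S_{n+1} = S_n + \xi_{n+1}$ with $\xi_i$ uniform on $B(0,1)$ in $d=3$ and $f(x)=\min(1,1/|x|)$ superharmonic, the supermartingale property says the mean of $f(S_n)$ should not exceed $f(S_0)$:

```python
# Simulation sketch: for the random walk with uniform-ball steps in
# d = 3 and the superharmonic f(x) = min(1, 1/|x|), f(S_n) is a
# supermartingale, so E f(S_n) <= f(S_0).
import math
import random

random.seed(2)

def f(p):
    return min(1.0, 1.0 / math.sqrt(sum(c * c for c in p)))

def step():
    # rejection sampling: uniform point in the unit ball of R^3
    while True:
        q = [random.uniform(-1, 1) for _ in range(3)]
        if sum(c * c for c in q) <= 1.0:
            return q

start = (2.0, 0.0, 0.0)
n_steps, paths = 20, 5_000
total = 0.0
for _ in range(paths):
    s = list(start)
    for _ in range(n_steps):
        q = step()
        s = [s[i] + q[i] for i in range(3)]
    total += f(s)
mean_f = total / paths

print(f"f(S_0) = {f(start):.3f}, mean f(S_20) ~ {mean_f:.3f}")
assert mean_f <= f(start)  # supermartingale: the mean does not rise
```

In $d=3$ the walk drifts away from the origin, so the mean drops well below $f(S_0)$; in $d\le 2$ a positive superharmonic function would have to be constant, consistent with the recurrence discussion above.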