In general, the family of random variables $(S_T)_T$ fails to be uniformly integrable. Let's introduce the following definitions:
Definition: Let $(X_t)_{t \geq 0}$ be a jointly measurable stochastic process. If the family $$\{X_{\tau}; \tau < \infty \, \, \text{stopping time}\}$$ is uniformly integrable, then we say that $(X_t)_{t \geq 0}$ is of class (D). If $$\{X_{\tau}; \tau \leq M \, \, \text{stopping time}\}$$ is uniformly integrable for any constant $M>0$, then $(X_t)_{t \geq 0}$ is of class (DL).
As you already pointed out, any uniformly integrable martingale with càdlàg sample paths is of class (D). Let me mention that these notions play an important role in the Doob-Meyer decomposition of (sub)martingales.
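To see why a uniformly integrable càdlàg martingale $(X_t)_{t \geq 0}$ is of class (D), here is a brief sketch. It assumes two standard facts: such a martingale converges almost surely and in $L^1$ to a limit, written $X_\infty$ here (notation introduced for this sketch), and optional stopping applies. For every finite stopping time $\tau$,
$$X_\tau = \mathbb{E}[X_\infty \mid \mathcal{F}_\tau].$$
Since $X_\infty \in L^1$, the family $\{\mathbb{E}[X_\infty \mid \mathcal{G}]; \mathcal{G} \subseteq \mathcal{F} \, \, \text{sub-}\sigma\text{-algebra}\}$ is uniformly integrable, and $\{X_\tau; \tau < \infty \, \, \text{stopping time}\}$ is a subfamily of it.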
For submartingales there is the following statement (see Lemma 5 here):
Lemma: A càdlàg submartingale is of class (DL) if, and only if, its negative part is of class (DL).
In particular, any non-negative càdlàg submartingale is of class (DL). Moreover, one can show the following statement (see Lemma 4):
Lemma: A non-negative càdlàg submartingale is of class (D) if, and only if, it is uniformly integrable.
In general, the equivalence fails if we drop the assumption of non-negativity.
Example: Let $(B_t)_{t \geq 0}$ be a three-dimensional Brownian motion started at $B_0 = (1,0,0)$. If we set $u(x) := \frac{1}{|x|}$, then $M_t := u(B_t)$ is a non-negative supermartingale. Note that $(M_t)_{t \geq 0}$ has continuous sample paths with probability 1 since $$\mathbb{P}(\exists t>0: B_t=0)=0$$ (recall that $(B_t)_t$ is a three-dimensional Brownian motion; in dimension $d=1$ this statement is false). It is possible to show that $(M_t)_{t \geq 0}$ is uniformly integrable but not of class (D); see the very end of the paper (1). Consequently, the process
$$N_t := -M_t$$
is a uniformly integrable submartingale which is not of class (D).
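A sketch of why $(M_t)_{t \geq 0}$ cannot be of class (D), offered as one standard argument and not necessarily the proof given in (1); the stopping times $\sigma_n$ are notation introduced here. Set $\sigma_n := \inf\{t \geq 0: |B_t| \leq 1/n\}$. Since $u$ is harmonic on $\mathbb{R}^3 \setminus \{0\}$, Itô's formula shows that the stopped process $(M_{t \wedge \sigma_n})_{t \geq 0}$ is a martingale bounded by $n$, and $\sigma_n \to \infty$ almost surely because the Brownian motion never hits the origin. Hence, for every fixed $t \geq 0$,
$$\mathbb{E}[M_{t \wedge \sigma_n}] = M_0 = 1 \quad \text{for all } n \geq 1, \qquad \lim_{n \to \infty} M_{t \wedge \sigma_n} = M_t \quad \text{a.s.}$$
If $(M_t)_{t \geq 0}$ were of class (D), then $\{M_{t \wedge \sigma_n}; n \geq 1\}$ would be uniformly integrable, and passing to the limit would give $\mathbb{E}[M_t]=1$ for all $t \geq 0$. On the other hand, $|B_t| \to \infty$ almost surely in dimension three, so $M_t \to 0$ almost surely, and the uniform integrability of $(M_t)_{t \geq 0}$ yields
$$\lim_{t \to \infty} \mathbb{E}[M_t] = 0,$$
which is a contradiction.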
There is the following equivalent characterization (see Chapter 2 in (2)):
Theorem: Let $(X_t)_{t \geq 0}$ be a right-continuous submartingale. Then:
- $(X_t)_{t \geq 0}$ is of class (DL) if, and only if, there exists a right-continuous martingale $(M_t)_{t \geq 0}$ and a non-decreasing predictable process $(A_t)_{t \geq 0}$ such that $X=M+A$.
- $(X_t)_{t \geq 0}$ is of class (D) if, and only if, it admits a Doob-Meyer decomposition $X=M+A$ for a uniformly integrable right-continuous martingale $(M_t)_{t \geq 0}$ and a non-decreasing predictable uniformly integrable process $(A_t)_{t \geq 0}$.
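As a simple illustration of the theorem (an example added here, with $(W_t)_{t \geq 0}$ denoting a one-dimensional Brownian motion started at $0$), consider the submartingale $X_t := W_t^2$. It decomposes as
$$X_t = \underbrace{(W_t^2 - t)}_{=: M_t} + \underbrace{t}_{=: A_t},$$
where $(M_t)_{t \geq 0}$ is a continuous martingale and $(A_t)_{t \geq 0}$ is deterministic, continuous and non-decreasing, hence predictable. So $(X_t)_{t \geq 0}$ is of class (DL). It is not of class (D): since
$$\mathbb{E}[X_t] = t \xrightarrow{t \to \infty} \infty,$$
the family $(X_t)_{t \geq 0}$ is not even bounded in $L^1$, which matches the fact that $A_t = t$ is not uniformly integrable.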
References
(1) Johnson, G., Helms, L.L.: Class D Supermartingales. Bull. Am. Math. Soc. 69 (1963), 59-62.
(2) Yeh, J.: Martingales and Stochastic Analysis. World Scientific, 1995.
It is part of the definition of the conditional expectation that
$$E[M_t 1_A] = E[E[M_t \mid \mathcal{F}_0] 1_A]$$
for any $A \in \mathcal{F}_0$. By taking $A = \Omega$ we see that the law of iterated expectations is an immediate consequence of the definition (without assuming that $M_t \in L^1$).
As a result, if $M_0 \in L^1$ and $(M_t)_{t \geq 0}$ satisfies condition $2$ then by your reasoning we have that $M_t$ is a martingale.
It's also fairly easy to see that there exist processes $M_t \not \in L^1$ such that the conditional expectations exist and $E[M_t \mid \mathcal{F}_s] = M_s$. For example, let $X$ be a non-integrable, non-negative random variable and for every $t \geq 0$, let $\mathcal{F}_t = \sigma(X)$ and $M_t = X$. Then we obviously have $$E[M_t \mid \mathcal{F}_s] = E[X \mid \sigma(X)] = X.$$
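For a concrete choice of such an $X$ (an illustration added here; $U$ is a hypothetical random variable that is uniformly distributed on $(0,1)$), one can take $X := 1/U$. Then $X \geq 1$ and
$$\mathbb{E}[X] = \int_0^1 \frac{1}{u} \, \mathrm{d}u = \infty,$$
so $X$ is non-negative and non-integrable. Its conditional expectation still makes sense in the extended sense for non-negative random variables.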
To show uniform integrability, note that
\begin{align} &\mathbb{E}[\exp(M_s);\exp(M_s)\geq K] \\ &\leq \mathbb{E}[\mathbb{E}[\exp(M_\infty)|\mathcal{F}_s]; \exp(M_s)\geq K] && \text{(submartingale property)}\\ & = \mathbb{E}[\exp(M_\infty); \exp(M_s)\geq K] && \text{(since } \{\exp(M_s)\geq K\}\in\mathcal{F}_s\text{)}. \end{align}
Hence you only need to show that $\sup_{0\leq s\leq \infty}\mathbb{P}[\exp(M_s)\geq K]\to 0$ as $K\to \infty$, which you can do using Doob's inequality.
And just to explain the notation: $$\mathbb{E}[X;A]:=\int_A X(\omega)\mathrm{d}\mathbb{P}(\omega)$$
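One way to complete the argument above, sketched under the assumption that $\mathbb{E}[\exp(M_\infty)] < \infty$ (which the estimate implicitly uses): Markov's inequality already suffices for the probability bound, although Doob's inequality works as well. Taking expectations in the submartingale property $\exp(M_s) \leq \mathbb{E}[\exp(M_\infty) \mid \mathcal{F}_s]$ gives $\mathbb{E}[\exp(M_s)] \leq \mathbb{E}[\exp(M_\infty)]$, and so
$$\sup_{0 \leq s \leq \infty} \mathbb{P}[\exp(M_s) \geq K] \leq \frac{1}{K} \sup_{0 \leq s \leq \infty} \mathbb{E}[\exp(M_s)] \leq \frac{\mathbb{E}[\exp(M_\infty)]}{K} \xrightarrow{K \to \infty} 0.$$
Since $\exp(M_\infty)$ is integrable, for every $\varepsilon > 0$ there is some $\delta > 0$ such that $\mathbb{E}[\exp(M_\infty); A] < \varepsilon$ whenever $\mathbb{P}(A) < \delta$. Applying this with $A = \{\exp(M_s) \geq K\}$ for $K$ sufficiently large yields
$$\sup_{0 \leq s \leq \infty} \mathbb{E}[\exp(M_s); \exp(M_s) \geq K] \leq \varepsilon,$$
which is the claimed uniform integrability.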