I'm sure that Durrett's proof is nice. A straightforward solution to the question asked is as follows.
For $n \geq 1$
$$
\begin{array}{rcl}
P(N_t = n) & = & \int_0^t P(S_{n+1} > t \mid S_n = s) P(S_n \in ds) \\
& = & \int_0^t P(T_{n+1} > t-s) P(S_n \in ds) \\
& = & \int_0^t e^{-\lambda(t-s)} \frac{\lambda^n s^{n-1} e^{-\lambda s}}{(n-1)!} \mathrm{d} s \\
& = & e^{-\lambda t} \frac{\lambda^n }{(n-1)!} \int_0^t s^{n-1} \mathrm{d} s \\
& = & e^{-\lambda t} \frac{(\lambda t)^n}{n!}
\end{array}
$$
Here $S_{n+1} = S_n + T_{n+1}$, where the interarrival time $T_{n+1}$ is independent of $S_n$, and we used the $\mathrm{Gamma}(n,\lambda)$ density of $S_n = T_1 + \cdots + T_n$. For $n = 0$ we have $P(N_t = 0) = P(T_1 > t) = e^{-\lambda t}$.
This does not prove that $(N_t)_{t \geq 0}$ is a Poisson process, which is harder, but it does show that the marginal distribution of $N_t$ is Poisson with mean $\lambda t$.
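If a numerical sanity check helps, here is a minimal simulation sketch (my own addition, not from Durrett; $\lambda$, $t$, and the number of sample paths are illustrative choices) that builds $N_t$ by summing i.i.d. exponential interarrival times and compares the empirical pmf with $e^{-\lambda t}(\lambda t)^n/n!$:

```python
# Sanity check of the marginal result: simulate N_t directly from the
# interarrival times and compare with the Poisson(lambda*t) pmf.
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(0)
lambda_, t, n_paths = 2.0, 3.0, 100_000   # illustrative parameter choices

counts = np.empty(n_paths, dtype=int)
for i in range(n_paths):
    s, n = 0.0, 0
    while True:
        s += rng.exponential(1.0 / lambda_)  # next interarrival time T_{n+1}
        if s > t:                            # S_{n+1} > t: exactly n arrivals in (0, t]
            break
        n += 1
    counts[i] = n

for n in range(9):
    emp = np.mean(counts == n)
    thy = exp(-lambda_ * t) * (lambda_ * t) ** n / factorial(n)
    print(f"n={n}: empirical {emp:.4f}  Poisson {thy:.4f}")
```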
The discussion in your book is not phrased correctly in some respects, but first let me address your question about conditioning on an event of probability $0$, something that is explicitly forbidden by the definition of conditional probability in an earlier chapter of your book.
For jointly continuous random variables $X$ and $Y$ with joint pdf $f_{X,Y}(u,v)$, the conditional pdf of $Y$ given that $X = u$ is defined to be
$$f_{Y\mid X}(v\mid X = u) = \begin{cases}
\displaystyle \frac{f_{X,Y}(u,v)}{f_{X}(u)}, & \text{if }~f_{X}(u)>0,\\
0, & \text{otherwise,}
\end{cases}$$
where $f_X(u)$ is the (marginal) pdf of $X$. The conditional complementary CDF is
$$1-F_{Y\mid X}(t\mid X = u) = P\{Y > t\mid X = u\} =
\int_t^\infty f_{Y\mid X}(v\mid X = u) \,\mathrm dv.$$
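As a quick numerical illustration of this definition (my own sketch; the choice of pair $(X,Y)=(S_1,S_2)$ with exponential interarrival times, and all parameter values, are illustrative assumptions), conditioning on the probability-zero event $\{X = u\}$ can be approximated by conditioning on $X$ landing in a narrow bin around $u$:

```python
# Approximate P{Y > t | X = u} by binning X near u.
# Here X = S_1 and Y = S_2 = S_1 + T_2 with T_1, T_2 i.i.d. Exp(lambda),
# so the exact conditional probability is e^{-lambda (t - u)} for t > u.
import numpy as np

rng = np.random.default_rng(1)
lambda_, u, t, eps = 1.5, 0.8, 2.0, 0.01             # illustrative choices

x = rng.exponential(1.0 / lambda_, size=2_000_000)   # X = S_1 = T_1
y = x + rng.exponential(1.0 / lambda_, size=x.size)  # Y = S_2 = S_1 + T_2

in_bin = np.abs(x - u) < eps            # the event {X is within eps of u}
emp = np.mean(y[in_bin] > t)            # estimate of P{Y > t | X = u}
thy = np.exp(-lambda_ * (t - u))        # exact value from the conditional pdf
print(f"empirical {emp:.4f} vs exact {thy:.4f}")
```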
Now, in your application, $P\{X_2 > t\mid X_1 = s\}$ can be calculated directly since we are told that the first arrival occurred at $s$ and are being asked for the conditional probability that no arrivals have occurred in $(s,s+t]$. But what happens in $(s,s+t]$ is independent of what happened in $(0,s]$ since the time intervals are disjoint. That is, $P\{\text{no arrivals in } (s,s+t]\mid X_1=s\}$ is the same regardless of whether we assume that there was an arrival at $s$ or that the first arrival occurred before time $s$, and so
$$P\{X_2 > t\mid X_1 = s\} = P\{\text{no arrivals in } (s,s+t]\}
= e^{-\lambda t},$$
and thus the conditional pdf $f_{X_2\mid X_1}(v\mid X_1 = s)$ is the same as the unconditional pdf $f_{X_2}(v) = \lambda e^{-\lambda v}$, $v > 0$. Conditionally or unconditionally, the distribution of $X_2$ is exponential with parameter $\lambda$. Furthermore,
\begin{align}
f_{X_2}(v) = f_{X_2\mid X_1}(v\mid X_1 = s)
= \frac{f_{X_1,X_2}(s,v)}{f_{X_1}(s)}
\implies f_{X_1,X_2}(s,v) = f_{X_1}(s)\,f_{X_2}(v)
\end{align}
showing that $X_1$ and $X_2$ are independent (exponential random variables
with parameter $\lambda$).
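To see this independence without building it in by construction, here is a rough simulation sketch (my own addition; the time step, horizon, and other parameters are arbitrary choices) that approximates the process by a fine Bernoulli lattice, so $X_1$ and $X_2$ are read off from the arrival pattern rather than drawn as exponentials:

```python
# Approximate a rate-lambda Poisson process by Bernoulli trials on a fine
# grid (arrival in each slot w.p. lambda*dt), then check that X_2 looks
# Exp(lambda) and is insensitive to the value of X_1.
import numpy as np

rng = np.random.default_rng(2)
lambda_, dt, horizon, n_paths = 1.0, 1e-3, 10.0, 20_000  # illustrative
n_steps = int(horizon / dt)

x1 = np.empty(n_paths)
x2 = np.empty(n_paths)
for i in range(n_paths):
    hits = np.empty(0, dtype=int)
    while hits.size < 2:               # retry the rare paths with < 2 arrivals
        hits = np.nonzero(rng.random(n_steps) < lambda_ * dt)[0]
    x1[i] = hits[0] * dt               # first arrival time X_1 (up to discretization)
    x2[i] = (hits[1] - hits[0]) * dt   # second interarrival time X_2

small = x1 < np.median(x1)             # condition on "X_1 small"
print("P(X2 > 1)            ", np.mean(x2 > 1.0))
print("P(X2 > 1 | X1 small) ", np.mean(x2[small] > 1.0))
print("exact e^{-lambda}    ", np.exp(-lambda_))
print("corr(X1, X2)         ", np.corrcoef(x1, x2)[0, 1])
```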
The answers to your specific questions are hidden somewhere in the above.
An interesting property of Poisson processes is that, conditional on the number of events in $[0,T]$, the event times can be considered as "placed" independently and uniformly in $[0,T]$ (just like raindrops falling uniformly over the length of a board of length $T$). In other words, if there are $n$ events $\{\tau_i\}_{i=1}^n$, the unordered times satisfy $\tau_i \sim \mathrm{Unif}(0,T)$, i.i.d., for all $i$. What you are looking for is the distribution of $t_1=\min \{\tau_i: i=1,\dots,n\}$. Please see this answer for the distribution of the minimum of i.i.d. uniform variables. In your case:
$$ p(t_1\mid n) = \frac{n}{T^n}(T-t_1)^{n-1} \quad\text{for } t_1\in[0,T]. $$
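A short simulation sketch (my own addition; $n$, $T$, and the number of trials are arbitrary choices) checking this density through its CDF $P(t_1 \le x) = 1 - ((T-x)/T)^n$:

```python
# Check the density of t_1 = min of n i.i.d. Uniform(0, T) draws by
# comparing the empirical CDF with 1 - ((T - x)/T)^n at a few points.
import numpy as np

rng = np.random.default_rng(3)
n, T, n_trials = 5, 2.0, 500_000                 # illustrative choices

t1 = rng.uniform(0.0, T, size=(n_trials, n)).min(axis=1)

for x in (0.2, 0.5, 1.0):
    emp = np.mean(t1 <= x)
    thy = 1.0 - ((T - x) / T) ** n               # integral of p(t_1 | n)
    print(f"x={x}: empirical {emp:.4f}  exact {thy:.4f}")
```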