Before you start thinking about the standard errors of your estimates, it would be good to know, or to check, if your Markov chain has a stationary distribution.
There is no way to say anything from the small sample you provided, as it seems to be randomly shuffled pairs of states rather than the actual history of the process (the pair on the next line doesn't start in the state where the previous line ended). Going forward, I will assume that the chain has a stationary distribution.
Regarding the bootstrap for the state pairs: it seems you are trying to obtain a confidence interval for your maximum likelihood estimates of the transition probabilities. The bootstrap can be used for this, but I would consider bootstrapping one row of the transition matrix at a time - essentially simulating a multinomial random variable (see also this paper).
A row-wise approach also guarantees that you will not obtain invalid matrices from your bootstrap.
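The row-wise bootstrap can be sketched in a few lines of base R. The counts below are hypothetical placeholders for the observed transitions out of one departure state; replace them with your own data:

```r
set.seed(1)
row_counts <- c(A = 0, B = 1, C = 2, D = 1, E = 0)  # hypothetical counts out of state A
n <- sum(row_counts)
p_hat <- row_counts / n  # MLE of this row of the transition matrix

# Resample the row as a multinomial B times and collect bootstrap probabilities
B <- 2000
boot_p <- t(rmultinom(B, size = n, prob = p_hat)) / n

# Percentile 95% intervals for each transition probability in the row
ci <- apply(boot_p, 2, quantile, probs = c(0.025, 0.975))
```

Because every bootstrap replicate is a multinomial draw, each resampled row sums to 1 by construction, which is what guarantees you never produce an invalid transition matrix.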
I think you could also calculate the 95% confidence intervals for your estimates directly (for example using the MultinomialCI package):
> library(MultinomialCI)
> t1 <- c("A","A","B","C","B","C","E","E","A","C","B","A","C","C")
> t2 <- c("B","C","C","D","C","D","A","E","D","B","A","C","D","B")
>
> transition_probs <- data.frame(t1, t2, stringsAsFactors = TRUE)
>
> # illustration for departure state A
> est_trans_prob_A <- table(transition_probs$t2[transition_probs$t1=="A"])/
+ length(transition_probs$t2[transition_probs$t1=="A"])
> ci <- multinomialCI(table(transition_probs$t2[transition_probs$t1=="A"]),
+ alpha=0.05)
>
> result <- data.frame(est_p = est_trans_prob_A, lo_ci=ci[,1], up_ci=ci[,2])
> result
est_p.Var1 est_p.Freq lo_ci up_ci
1 A 0.00 0.00 0.6083073
2 B 0.25 0.00 0.8583073
3 C 0.50 0.25 1.0000000
4 D 0.25 0.00 0.8583073
5 E 0.00 0.00 0.6083073
On the other hand, if you have some prior knowledge about the transition probabilities, you might prefer the CoinMinD package, which allows you to specify a Dirichlet prior and run a Bayesian analysis using MCMC.
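If you would rather avoid an extra package, the Dirichlet prior is conjugate to the multinomial, so the posterior for one row can be sampled directly in base R via normalized Gamma draws. The counts and the flat prior below are illustrative assumptions:

```r
set.seed(1)
row_counts <- c(A = 0, B = 1, C = 2, D = 1, E = 0)  # hypothetical counts out of state A
prior <- rep(1, 5)                                   # flat Dirichlet(1, ..., 1) prior
alpha <- prior + row_counts                          # conjugate posterior parameters

# One Dirichlet(alpha) draw = independent Gamma(alpha_i) draws, normalized
rdirichlet1 <- function(a) { g <- rgamma(length(a), shape = a); g / sum(g) }
post <- t(replicate(5000, rdirichlet1(alpha)))

# 95% credible intervals for each transition probability in the row
ci <- apply(post, 2, quantile, probs = c(0.025, 0.975))
```

With a flat prior the intervals will be broadly comparable to the multinomial confidence intervals above; an informative prior shrinks them toward your prior beliefs.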
Best Answer
Short Answer
Use the Kolmogorov-Smirnov test to identify a time window such that the eigenvalues of the transition matrix are not sensitive to the precise choice of window.
Longer Answer
In Markov theory, there are two ways of formulating the dynamics: in discrete time, via a transition matrix $T$ giving the probabilities of moving between states over a fixed window $\delta t$, and in continuous time, via a rate (generator) matrix $Q$. One can show that these two formulations are related through a matrix exponential.
\begin{equation} T = e^{Q\delta t} \end{equation}
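This relationship is easy to verify numerically. Below is a sketch with a hypothetical two-state generator $Q$ (off-diagonal rates, rows summing to zero), using `expm` from the Matrix package that ships with R:

```r
library(Matrix)

# Hypothetical 2-state generator: off-diagonal entries are rates, rows sum to zero
Q <- matrix(c(-0.5,  0.5,
               0.2, -0.2), nrow = 2, byrow = TRUE)
dt <- 1
Tmat <- as.matrix(expm(Q * dt))  # T = exp(Q * dt)

# The result is a proper stochastic matrix: nonnegative entries, rows sum to 1
rowSums(Tmat)
```

Any valid generator produces a valid transition matrix this way, for any choice of $\delta t > 0$.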
Continuous Time Being Well Defined
Regarding your first point about self-transitions being well defined: the
pnext.msm
quantity gives the probability of which state will come next. This is achieved by the equation in your question, which simply calculates the relative probabilities of the transitions.
Discrete Time Choice of Time Window
Now, let's say we want to look at discrete time and talk about "Which state will we be in 1 second from now?". Then we can easily find our $T$ from $Q$ if we know what our $\delta t$ is.
How do we choose the time window? Well, we want a window such that the process is memoryless (Markovian); real-world processes are often non-Markovian on short time scales.
To check for Markovianity, we compute the transition matrix over a range of windows and verify that the implied rate from the dominant (nontrivial) eigenvalue $\lambda$ is insensitive to the precise choice of window:
\begin{equation} \frac{\log \lambda(\delta t)}{\delta t} \approx \frac{\log \lambda(2\delta t)}{2\delta t} \end{equation}
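The check above can be sketched numerically. Here the transition matrices at $\delta t$ and $2\delta t$ are generated from a hypothetical generator $Q$ (in practice you would estimate them from data at each window); for a truly Markovian process the implied rates agree:

```r
library(Matrix)

# Hypothetical generator and the induced transition matrices at dt and 2*dt
Q <- matrix(c(-0.5,  0.5,
               0.2, -0.2), nrow = 2, byrow = TRUE)
dt <- 0.5
T1 <- as.matrix(expm(Q * dt))
T2 <- as.matrix(expm(Q * 2 * dt))

# Dominant nontrivial eigenvalue (the largest eigenvalue is always 1)
dom_eig <- function(M) sort(Re(eigen(M)$values), decreasing = TRUE)[2]
rate1 <- log(dom_eig(T1)) / dt
rate2 <- log(dom_eig(T2)) / (2 * dt)

# For a Markovian process, rate1 and rate2 coincide; with estimated matrices,
# look for the window beyond which they stop changing.
```

With empirically estimated matrices the two rates will only be approximately equal, so in practice one plots $\log\lambda(\delta t)/\delta t$ against $\delta t$ and picks a window in the flat region.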