If the observations of a stochastic process are irregularly spaced, the most natural approach is to model them as discrete time observations of a continuous time process.
What is generally needed of a model specification is the joint distribution of the observations $X_{1}, \ldots, X_n$ observed at times $t_1 < t_2 < \ldots < t_n$, and this can, for instance, be broken down into conditional distributions of $X_{i}$ given $X_{i-1}, \ldots, X_1$. If the process is a Markov process, this conditional distribution depends only on $X_{i-1}$, not on $X_{i-2}, \ldots, X_1$, and it depends on the time points $t_i$ and $t_{i-1}$. If the process is time-homogeneous, the dependence on the time points is only through their difference $t_i - t_{i-1}$.
We see from this that if we have equidistant observations (with $t_i - t_{i-1} = 1$, say) from a time-homogeneous Markov process, we need only specify a single conditional probability distribution, $P^1$, to specify a model. Otherwise we need a whole collection $P^{t_{i}-t_{i-1}}$ of conditional probability distributions, indexed by the time differences of the observations. The latter is, in fact, most easily done by specifying a family $P^t$ of continuous time conditional probability distributions.
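For a concrete case where the whole family $P^t$ is known in closed form, take the Ornstein–Uhlenbeck process $dX_t = \theta(\mu - X_t)\,dt + \sigma\, dB_t$: over a gap of length $\Delta$, the conditional distribution is Gaussian with mean $\mu + (x-\mu)e^{-\theta\Delta}$ and variance $\sigma^2(1-e^{-2\theta\Delta})/(2\theta)$, so an exact likelihood for irregularly spaced observations is easy to write down. A minimal sketch (the parameter values and data are illustrative):

```python
import numpy as np

def ou_loglik(x, t, theta, mu, sigma):
    """Exact log-likelihood (conditional on x[0]) of irregularly spaced
    observations x at times t under dX = theta*(mu - X) dt + sigma dB."""
    x, t = np.asarray(x, float), np.asarray(t, float)
    dt = np.diff(t)                                 # irregular gaps t_i - t_{i-1}
    decay = np.exp(-theta * dt)
    mean = mu + (x[:-1] - mu) * decay               # conditional mean of X_{t_i}
    var = sigma**2 * (1 - decay**2) / (2 * theta)   # conditional variance
    resid = x[1:] - mean
    return -0.5 * np.sum(np.log(2 * np.pi * var) + resid**2 / var)

# irregular observation times
t = np.array([0.0, 0.3, 1.1, 1.2, 2.7])
x = np.array([0.1, 0.0, -0.2, -0.1, 0.3])
ll = ou_loglik(x, t, theta=1.0, mu=0.0, sigma=0.5)
```

Maximizing this over $(\theta, \mu, \sigma)$ gives maximum likelihood estimates without any need for equidistant sampling.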
A common way to obtain a continuous time model specification is through a stochastic differential equation (SDE)
$$dX_t = a(X_t) dt + b(X_t) dB_t.$$
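When the transition distributions implied by the SDE are not available in closed form, the process can at least be simulated on an arbitrary (including irregular) time grid. A minimal Euler–Maruyama sketch; the drift `a`, diffusion `b`, and grid below are placeholders:

```python
import numpy as np

def euler_maruyama(a, b, x0, t_grid, rng=None):
    """Simulate dX = a(X) dt + b(X) dB on a (possibly irregular) time grid."""
    rng = np.random.default_rng(rng)
    x = np.empty(len(t_grid))
    x[0] = x0
    for i in range(1, len(t_grid)):
        dt = t_grid[i] - t_grid[i - 1]
        dB = rng.normal(0.0, np.sqrt(dt))   # Brownian increment ~ N(0, dt)
        x[i] = x[i - 1] + a(x[i - 1]) * dt + b(x[i - 1]) * dB
    return x

# mean-reverting example on an irregular grid
t = np.array([0.0, 0.05, 0.3, 0.31, 1.0])
path = euler_maruyama(lambda x: -x, lambda x: 0.2, 1.0, t, rng=0)
```

Simulation of this kind also underlies many of the approximate likelihood and estimating equation methods for discretely observed SDEs.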
A good place to get started with doing statistics for SDE models is Simulation and Inference for Stochastic Differential Equations by Stefano Iacus. Many methods and results are presented for equidistant observations, but this is typically a convenience of presentation rather than essential for the application. One main obstacle is that the SDE specification rarely yields an explicit likelihood for discrete observations, but there are well-developed estimating equation alternatives.
If you want to get beyond Markov processes, stochastic volatility models are, like (G)ARCH models, attempts to model a heterogeneous variance (volatility). One can also consider delay equations such as
$$dX_t = \left(\int_0^t a(s)(X_t-X_s)\, ds\right) dt + \sigma\, dB_t$$
that are continuous time analogs of AR$(p)$-processes.
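As a rough illustration of how such a delay equation could be handled numerically, the memory integral can be discretized over the simulated past with a simple Euler scheme (the kernel `a`, step size, and parameters below are illustrative choices, not a recommended method):

```python
import numpy as np

def simulate_delay_sde(a, sigma, x0, dt, n, rng=None):
    """Euler scheme for dX_t = (int_0^t a(s)(X_t - X_s) ds) dt + sigma dB_t."""
    rng = np.random.default_rng(rng)
    t = np.arange(n) * dt
    x = np.empty(n)
    x[0] = x0
    for i in range(1, n):
        # Riemann-sum approximation of the memory integral over the past path
        drift = np.sum(a(t[:i]) * (x[i - 1] - x[:i])) * dt
        x[i] = x[i - 1] + drift * dt + sigma * rng.normal(0.0, np.sqrt(dt))
    return t, x

# constant negative kernel pulls X_t back toward its own past
t, x = simulate_delay_sde(lambda s: -0.5 * np.ones_like(s),
                          sigma=0.1, x0=1.0, dt=0.05, n=100, rng=0)
```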
I think it is fair to say that the common practice when dealing with observations at irregular time points is to build a continuous time stochastic model.
Your approach makes sense. A commercial piece of software I was associated with for a couple of years did exactly this.
Your outline applies to Single Exponential Smoothing (SES), but of course you could apply the same treatment to trend or seasonal components. For the seasonal ones, you would need to go back a full seasonal cycle, just as for updating.
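As a sketch of that kind of treatment, the SES recursion can be made interval-aware by letting the effective smoothing weight decay with the gap, e.g. $\alpha_i = 1 - (1-\alpha)^{\Delta t_i}$, so that a unit gap reproduces ordinary SES. This particular adjustment is one common choice, not the only one, and it is not taken from any specific package:

```python
import numpy as np

def irregular_ses(x, t, alpha, unit=1.0):
    """Single exponential smoothing with gap-dependent weights:
    alpha_i = 1 - (1 - alpha)**(dt_i / unit), so a gap of `unit`
    reproduces ordinary SES with parameter alpha."""
    level = x[0]
    levels = [level]
    for i in range(1, len(x)):
        dt = (t[i] - t[i - 1]) / unit
        a_i = 1.0 - (1.0 - alpha) ** dt     # bigger gap -> more weight on new data
        level = a_i * x[i] + (1.0 - a_i) * level
        levels.append(level)
    return np.array(levels)

t_obs = np.array([0.0, 0.5, 2.0, 2.1])
x_obs = np.array([10.0, 11.0, 12.0, 11.5])
smoothed = irregular_ses(x_obs, t_obs, alpha=0.3)
```

The same weight adjustment carries over to the trend and seasonal recursions in the obvious way.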
Another alternative would of course be to simply interpolate missing values. This is an option in newer versions: `ets(..., na.action = "na.interp")`.
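Outside R, the interpolate-then-fit idea can be sketched with plain linear interpolation; this stands in for, and is much simpler than, what `na.interp` actually does:

```python
import numpy as np

# irregular series with one missing observation
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.0, np.nan, 4.0, 5.0])

missing = np.isnan(y)
y_filled = y.copy()
# fill each missing value by linear interpolation between observed neighbors
y_filled[missing] = np.interp(t[missing], t[~missing], y[~missing])
```

Any standard fitting routine can then be run on `y_filled` as if the series were complete, at the cost of understating uncertainty at the filled-in points.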
From what little I know of state space models, it should not be overly hard to simply treat the missing data as unobserved. I am not sure why this is not implemented in the forecast package. A quick search through Rob Hyndman's blog didn't really yield anything useful.
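For what it's worth, treating missing data as unobserved in a state space model just means running the prediction step and skipping the measurement update. A minimal local-level Kalman filter sketch (the variances `q` and `r` are illustrative, and this is not the forecast package's implementation):

```python
import numpy as np

def local_level_filter(y, q, r, m0=0.0, p0=1e6):
    """Kalman filter for the local level model; y may contain np.nan
    for missing observations, which are treated as unobserved."""
    m, p = m0, p0
    means = []
    for obs in y:
        p = p + q                    # prediction step: state variance grows
        if not np.isnan(obs):        # measurement update only when observed
            k = p / (p + r)          # Kalman gain
            m = m + k * (obs - m)
            p = (1 - k) * p
        means.append(m)
    return np.array(means)

filtered = local_level_filter([1.0, np.nan, 3.0, 2.5], q=0.1, r=0.5)
```

At a missing time point the filtered level is unchanged and its variance simply inflates, which is exactly the "treat it as unobserved" behaviour.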
Best Answer
Models like ARIMA are defined in terms of lagged variables, so you need the subsequent points. Prophet (Taylor and Letham, 2017) is defined in terms of a regression-like model
$$ y(t) =g(t) +s(t) +h(t) + \varepsilon_t $$
where the trend $g(t)$ is defined by piecewise regression, the seasonality $s(t)$ uses Fourier terms, and the holiday effects $h(t)$ are just dummies. None of these components requires you to have all the points: where information is lacking, it simply isn't used in estimation, and the model interpolates between the known points. Put differently, if you have points $a < b < c$ but $b$ is unknown, you can still fit the line (or curve) to $a$ and $c$ and interpolate for $b$. What Prophet does is fit many different lines (trend), curves (seasonalities), and constants (dummies) and combine them.
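That fit-and-interpolate behaviour can be sketched with ordinary least squares: a trend column plus Fourier seasonal columns form a design matrix evaluated only at the observed times, so an unknown middle point never enters the fit but can still be predicted. This is a toy sketch, not Prophet's actual implementation; the weekly-like period and the single Fourier pair are illustrative:

```python
import numpy as np

def design(t, period=7.0, K=1):
    """Intercept and trend columns plus K pairs of Fourier seasonal columns."""
    cols = [np.ones_like(t), t]
    for k in range(1, K + 1):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    return np.column_stack(cols)

# observed times with a gap; fit on what we have, predict anywhere
t_obs = np.array([0.0, 1.0, 2.0, 5.0, 6.0])
y_obs = 0.5 * t_obs + np.sin(2 * np.pi * t_obs / 7.0)
beta, *_ = np.linalg.lstsq(design(t_obs), y_obs, rcond=None)
y_at_3 = design(np.array([3.0])) @ beta   # prediction at the unobserved time
```

Because the model is a regression in $t$ rather than in lagged values, the missing point at $t = 3$ costs nothing beyond a slightly smaller design matrix.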