I can answer a couple of your questions, so here goes.
(1) Is there a set of generally accepted best practices for pre-processing the predictors (scoring, normalizing, rescaling, etc.) to provide more sensible inputs to the support vector regression step?
A general rule of thumb for SVM/SVR is to scale all inputs to the same interval. Common choices are $[-1,1]$ and $[0,1]$. The actual interval doesn't matter much, as long as all inputs are scaled to the same one. This prevents some input dimensions from completely dominating others when the kernel function is evaluated. I have no reference specifically for regression, but for classification this recommendation appears in the LIBSVM guide, and it holds for regression too.
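A minimal sketch of what this looks like in practice, assuming scikit-learn (my choice of tooling; the LIBSVM guide describes the scaling itself, not any particular library):

```python
# Minimal sketch: scale every input dimension to the same interval before SVR.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) * [1.0, 100.0, 0.01]  # wildly different scales
y = X[:, 0] + rng.normal(scale=0.1, size=100)

# MinMaxScaler(feature_range=(-1, 1)) maps each column to [-1, 1]; fitting it
# inside a pipeline keeps the scaling parameters tied to the training data.
model = make_pipeline(MinMaxScaler(feature_range=(-1, 1)), SVR(kernel="rbf"))
model.fit(X, y)
```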
When using kernel functions, like a Gaussian RBF, to transform the inputs, is it fair to say that the only extra degrees of freedom introduced are whatever hyperparameters govern the kernel's functional form? That feels a bit wrong to me, because you're effectively allowing yourself to explore a whole space of unvetted transformations of the data, which isn't really captured in just the functional form of the kernel function. How can you fairly penalize a model like this for having much more freedom to overfit the data with non-linear transformations?
When using SVM (or SVR), the degrees of freedom are in fact bounded by the number of training instances: each training instance can become a support vector and, as such, contribute to the separating hyperplane or regressor. Although this may seem like a lot, it is exactly why SVM works in infinite-dimensional feature spaces, for example with an RBF kernel: the effective number of degrees of freedom is always finite, no matter the dimensionality of the feature space.
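To make that concrete, here is a quick check (my illustration, again using scikit-learn): even though the RBF feature space is infinite-dimensional, the fitted model is defined by at most as many support vectors as there are training points.

```python
# Illustration: the fitted SVR model is parameterized only by its support
# vectors, so its degrees of freedom are bounded by the training set size.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(-3, 3, size=(50, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=50)

svr = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(X, y)
# support_vectors_ holds the training points that actually define the
# regressor; even with an infinite-dimensional RBF feature space, at most 50.
print(svr.support_vectors_.shape[0], "support vectors out of", X.shape[0])
```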
I suspect there is no general term that will cover all cases. Consider, for example, a white noise generator. In that case, we would just call it white noise. Now if the white noise comes from a natural source, e.g., AM radio band white noise, then it carries superimposed effects: diurnal, seasonal, and sunspot (11-year) solar variability, plus man-made primary and beat interference from radio broadcasts.
For example, the graph in the link mentioned by the OP looks like amplitude-modulated white noise, almost like an earthquake. I personally would examine such a curve in the frequency and/or phase domain and describe it as a spectrum evolving in time: directly observing how the amplitudes over a set of frequency ranges evolve in time (relative to detection limits) reveals a lot more about the signal structure than thinking about stationarity does, mainly by reason of conceptual compactness. I understand the appeal of statistical testing. However, it would take umpteen tests and oodles of different criteria, as in the link, to incompletely describe an evolving frequency-domain concept, which makes the attempt to elevate stationarity into a fundamental property seem rather confining. And how does one get from there to Bode plots and phase plots?
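To show what I mean by watching the frequency content evolve in time, here is a toy spectrogram of simulated amplitude-modulated white noise (my own synthetic signal, not the OP's data; scipy is my choice of tooling):

```python
# Toy example: amplitude-modulated white noise, examined as an evolving
# frequency-domain object via a spectrogram.
import numpy as np
from scipy.signal import spectrogram

fs = 1000.0                       # sampling rate in Hz
t = np.arange(0, 10, 1 / fs)      # 10 seconds of signal
envelope = 1.0 + 0.8 * np.sin(2 * np.pi * 0.2 * t)  # slow amplitude modulation
x = envelope * np.random.default_rng(2).normal(size=t.size)

# Each column of Sxx is the power over frequency in one short time window,
# so scanning across columns shows how the spectrum evolves in time.
f, tau, Sxx = spectrogram(x, fs=fs, nperseg=256)
print(Sxx.shape)  # (frequencies, time windows)
```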
Having said that much, signal processing becomes more complicated when a "primary" violation of stationarity occurs: the patient dies, the signal stops, a random walk continues, and so forth. Such processes are easier to describe as a non-stationarity than, variously, as an infinite sum of odd harmonics or as a frequency decreasing to zero.

The OP's complaint about not having much literature documenting secondary stationarity is entirely reasonable; there does not even seem to be complete agreement as to what constitutes ordinary stationarity. For example, NIST claims that "A stationary process has the property that the mean, variance and autocorrelation structure do not change over time." Others on this site claim that "Autocorrelation doesn't cause non-stationarity," or, using mixture distributions of random variables, that "This process is clearly not stationary, but the autocorrelation is zero for all lags since the variables are independent." This is problematic because the autocorrelation criterion is typically tacked on as an additional condition of non-stationarity without much consideration of how necessary and sufficient it is for defining a process.

My advice would be to first observe a process and then describe it, using phrases couched in modifiers such as "stationary/non-stationary with respect to," as the alternative is to confuse many readers as to what is meant.
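Incidentally, the "zero autocorrelation but clearly non-stationary" example quoted above is easy to simulate; here is my sketch of that kind of process:

```python
# Sketch: independent draws with time-growing variance. Independence makes the
# sample autocorrelation near zero at every nonzero lag, yet the process is
# not stationary because its variance changes over time.
import numpy as np

rng = np.random.default_rng(3)
n = 5000
x = rng.normal(size=n) * np.sqrt(np.arange(1, n + 1))  # Var(X_t) grows with t

xc = x - x.mean()
for lag in (1, 5, 20):
    r = np.sum(xc[:-lag] * xc[lag:]) / np.sum(xc**2)
    print(f"lag {lag}: r = {r:+.3f}")
```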
Best Answer
I would go with the notation in Hamilton (Time Series Analysis):
- Uppercase letters for random variables;
- lowercase letters for realized values of random variables;
- {} for infinite sequences;
- () for finite sequences.
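Spelled out (my paraphrase of the convention, not Hamilton's exact wording): $Y_t$ is the random variable at date $t$ and $y_t$ is its realized value; the stochastic process is the infinite sequence $\{Y_t\}_{t=-\infty}^{\infty}$, while an observed sample of size $T$ is the finite sequence $(y_1, y_2, \ldots, y_T)$.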