I'd like to know whether the solution proposed below is valid/acceptable, and what justification, if any, supports it.
We have two biological conditions, and for each condition we measured 3 time series, so at each time point we have up to 3 data points per condition. For a priori reasons we believe the time series follow a Gaussian mixture model. We'd like to test for a significant difference in means between conditions at a small number of time points (e.g. t = 14, 28, and 38). The plot below is an example, but we have thousands of similar comparisons, and most are not so clear. What's the best way to do this?
A first approach was to compute a t-test at the time points of interest (after transforming the data to be approximately normal). However, this gave us at most 6 data points per comparison, threw away a ton of data, and made no use of our assumption about the Gaussian mixture model.
Here is the proposed solution. Fit a Gaussian mixture model to both conditions, calculate confidence intervals around the fitted curve, and use these confidence intervals to assess significance at given times. That is, we would evaluate whether the fitted models differ at the time points of interest. (See below.) I could see a problem with this approach if the Gaussian mixture models don't describe the raw data, but in general they do very well (mean R^2 is around 0.9). So, given that Gaussian mixture models seem to be a good description of our data, is this a valid/acceptable solution? If so, how would we carry it out?
Edit: I asked a similar question here and linked to a different, also similar question where Rob Hyndman suggested using a parametric bootstrap. Unfortunately I'm really not familiar with parametric bootstrapping, so I'm not sure whether it would be appropriate here. This explanation of the parametric bootstrap says it involves "positing a model on the statistic you want to estimate" (in our case the means?), and estimating those parameters "by repeated sampling of the ecdf". I believe the ecdf is just our data.
However, I'm not clear how to get from there (bootstrapped distributions of the Gaussian mixture parameters) to answering whether there's a significant difference at a specific time point. My understanding is that parametric bootstrapping directly compares the model parameters (the Gaussian parameters, in our case). But we're not interested in comparing model parameters; we want to compare the model's value, e.g. $y_{\text{condition 1}}(t=38)$ vs. $y_{\text{condition 2}}(t=38)$. So I'm not sure parametric bootstrapping is the way to go.
Best Answer
The approach you suggest is to fit a model to sampled data from each condition, then use some measure of the models' variability to test whether they're different at a particular set of time points. This seems similar to some established regression techniques, which involve assuming you know the basic functional form, then using that form to do bootstrapping. These methods are the parametric bootstrap and bootstrapping residuals (more about this in a second). So, I think your general idea makes sense; the question is how to implement it.
After fitting models to the time series, the question is how to estimate variability at the time points of interest. A bootstrapping approach could work. But it won't be possible to use the simple bootstrap (i.e., resampling the data points directly), because the data are correlated in time.
That's why Rob Hyndman suggested the parametric bootstrap. In that approach, you'd fit a model to each time series, repeatedly simulate new data from the model, then run statistics on the simulated data. The model in this case wouldn't be a simple curve, but a generative model (i.e. it would have to give a probability distribution from which you could sample new points).
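A minimal sketch of that idea in Python, with everything made up for illustration: a two-component Gaussian curve in time (`gmix`), i.i.d. Gaussian noise at the estimated residual SD as the generative part, synthetic replicate data, and a hypothetical `parametric_bootstrap` helper. Your actual mixture form and noise model would replace these assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def gmix(t, a1, m1, s1, a2, m2, s2):
    """Illustrative two-component Gaussian curve over time."""
    return (a1 * np.exp(-(t - m1) ** 2 / (2 * s1 ** 2))
            + a2 * np.exp(-(t - m2) ** 2 / (2 * s2 ** 2)))

def parametric_bootstrap(t, y, p0, t_star, n_boot=200, seed=0):
    """Fit the curve, then repeatedly simulate data from the fitted model
    (fitted curve + Gaussian noise at the estimated residual SD), refit,
    and record the refitted curve's value at t_star."""
    rng = np.random.default_rng(seed)
    popt, _ = curve_fit(gmix, t, y, p0=p0, maxfev=10000)
    sigma = np.std(y - gmix(t, *popt), ddof=len(popt))
    vals = []
    for _ in range(n_boot):
        y_sim = gmix(t, *popt) + rng.normal(0.0, sigma, t.size)
        p_b, _ = curve_fit(gmix, t, y_sim, p0=popt, maxfev=10000)
        vals.append(gmix(t_star, *p_b))
    return np.array(vals)

# Made-up data: 3 replicate series per condition on a shared time grid,
# differing only in the amplitude of the second component.
t = np.repeat(np.linspace(0.0, 48.0, 25), 3)
rng = np.random.default_rng(1)
y1 = gmix(t, 1.0, 14.0, 4.0, 0.6, 38.0, 5.0) + rng.normal(0.0, 0.05, t.size)
y2 = gmix(t, 1.0, 14.0, 4.0, 0.9, 38.0, 5.0) + rng.normal(0.0, 0.05, t.size)

p0 = [1.0, 14.0, 4.0, 0.7, 38.0, 5.0]
b1 = parametric_bootstrap(t, y1, p0, t_star=38.0)
b2 = parametric_bootstrap(t, y2, p0, t_star=38.0, seed=2)
lo, hi = np.percentile(b2 - b1, [2.5, 97.5])  # bootstrap CI for the difference
print(lo, hi, not (lo <= 0.0 <= hi))
```

If the 95% interval for the difference excludes zero, you'd declare a significant difference at that time point (with the usual multiple-comparison caveats across thousands of such tests).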
Here's a paper using that approach. They use Gaussian process regression to model the time series and do parametric bootstrapping. Their method might work well for your data. You'd use the same model fitting and bootstrap procedure, but the thing you'd test would be the equality of the mean at particular time points.
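As a rough sketch of the model-fitting step only, here is how a Gaussian process could be fitted to one condition's pooled replicates with scikit-learn; the kernel choice and the data are illustrative assumptions, not the paper's procedure, and the bootstrap step would follow their method.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Made-up replicate data: 3 series for one condition on a shared grid.
t = np.repeat(np.linspace(0.0, 48.0, 25), 3)
rng = np.random.default_rng(0)
y = np.exp(-(t - 20.0) ** 2 / 50.0) + rng.normal(0.0, 0.05, t.size)

# RBF kernel for the smooth trend plus WhiteKernel for replicate noise.
kernel = 1.0 * RBF(length_scale=5.0) + WhiteKernel(noise_level=0.01)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(
    t.reshape(-1, 1), y)

# Posterior mean and pointwise SD at the time points of interest.
t_star = np.array([[14.0], [28.0], [38.0]])
mean, std = gp.predict(t_star, return_std=True)
print(mean, std)
```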
Another possibility along similar lines is to resample the residuals. The procedure would look like this:
Let's say the sampled time series is $\{x_1, \ldots, x_n\}$ for the first condition and $\{y_1, \ldots, y_n\}$ for the second, and we're interested in the differences at time points $t_1, t_2, t_3$. Then:

1. Fit a model to each series, giving fitted values $\hat{x}_i$, $\hat{y}_i$ and residuals $e_i = x_i - \hat{x}_i$ (and likewise for $y$).
2. Construct a bootstrap series $x_i^* = \hat{x}_i + e^*_i$, where the $e^*_i$ are drawn from the residuals with replacement (and likewise for $y$).
3. Refit the model to each bootstrap series and record the difference between the two fitted curves at $t_1, t_2, t_3$.
4. Repeat many times to get a bootstrap distribution of the differences, and check whether the resulting confidence intervals contain zero.
Resampling the residuals relies on the assumption that the residuals are identically distributed, so it would be good to check that this is true. This condition could be violated, for example, if the variance changes over time.
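A minimal sketch of resampling the residuals, with an illustrative two-component Gaussian curve and made-up data (the curve form and the `residual_bootstrap` helper are assumptions, not part of the answer); the only change from a parametric bootstrap is that new series are built from resampled residuals rather than simulated noise.

```python
import numpy as np
from scipy.optimize import curve_fit

def gmix(t, a1, m1, s1, a2, m2, s2):
    """Illustrative two-component Gaussian curve over time."""
    return (a1 * np.exp(-(t - m1) ** 2 / (2 * s1 ** 2))
            + a2 * np.exp(-(t - m2) ** 2 / (2 * s2 ** 2)))

def residual_bootstrap(t, y, p0, t_star, n_boot=200, seed=0):
    """Fit the curve, build new series as fitted values plus residuals
    resampled with replacement, refit, and record the refitted curve's
    value at t_star."""
    rng = np.random.default_rng(seed)
    popt, _ = curve_fit(gmix, t, y, p0=p0, maxfev=10000)
    fitted = gmix(t, *popt)
    resid = y - fitted
    vals = []
    for _ in range(n_boot):
        y_star = fitted + rng.choice(resid, size=resid.size, replace=True)
        p_b, _ = curve_fit(gmix, t, y_star, p0=popt, maxfev=10000)
        vals.append(gmix(t_star, *p_b))
    return np.array(vals)

# Made-up data: two conditions differing in the second component's amplitude.
t = np.repeat(np.linspace(0.0, 48.0, 25), 3)
rng = np.random.default_rng(1)
y1 = gmix(t, 1.0, 14.0, 4.0, 0.6, 38.0, 5.0) + rng.normal(0.0, 0.05, t.size)
y2 = gmix(t, 1.0, 14.0, 4.0, 0.9, 38.0, 5.0) + rng.normal(0.0, 0.05, t.size)

p0 = [1.0, 14.0, 4.0, 0.7, 38.0, 5.0]
b1 = residual_bootstrap(t, y1, p0, t_star=38.0)
b2 = residual_bootstrap(t, y2, p0, t_star=38.0, seed=2)
lo, hi = np.percentile(b2 - b1, [2.5, 97.5])  # CI for the difference at t = 38
print(lo, hi, not (lo <= 0.0 <= hi))
```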
Possibly of interest:
This chapter describes bootstrapping residuals.
These notes briefly compare the simple bootstrap, parametric bootstrap, and bootstrapping the residuals.