Here might be a better approach (from link)
For a missing measurement, just use the last state estimate as a measurement, but set the covariance matrix of that measurement to essentially infinity. (If the system uses the inverse covariance, just set those values to zero.) This causes the Kalman filter to essentially ignore the new measurement, since the Kalman gain goes to zero as the measurement variance goes to infinity. The result is a new prediction that maintains velocity/acceleration but whose variance grows according to the process noise.
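A minimal NumPy sketch of this trick, using a hypothetical 1D constant-velocity model (all matrices below are illustrative, not from the original post): when the measurement is missing, we feed the predicted measurement back in with an essentially infinite variance, and the update becomes a no-op.

```python
import numpy as np

def kf_step(x, P, F, Q, H, z, R):
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

F = np.array([[1.0, 1.0], [0.0, 1.0]])  # constant-velocity transition
Q = 0.01 * np.eye(2)                    # process noise
H = np.array([[1.0, 0.0]])              # observe position only
R = np.array([[0.5]])                   # normal measurement variance

x = np.array([0.0, 1.0])
P = np.eye(2)

# Normal step with a real measurement
x, P = kf_step(x, P, F, Q, H, np.array([1.1]), R)

# Missing measurement: use the predicted measurement with a huge variance
x_pred = F @ x
z_fake = H @ x_pred
x2, P2 = kf_step(x, P, F, Q, H, z_fake, np.array([[1e12]]))

# x2 follows the prediction; P2 has grown by (roughly) the process noise
```

With `R = 1e12` the gain `K` is numerically zero, so the "update" leaves the prediction untouched while the covariance still inflates through `Q`, exactly as described above.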
Following the symbols on Wikipedia, the Kalman gain is $K_k = P_{k|k-1}H^T_k S_k^{-1}$.
$S_k^{-1}$ is equivalent to what you have called $\Sigma_t^{-1}$ in your question, so it suffices to show the equivalence of $P_{k|k-1}H_k^T$ and $\Sigma^{x,t}_z$.
The cross covariance between the predicted state $\hat{x}_{k|k-1}$ and the predicted measurement $H_k \hat{x}_{k|k-1} + v_k$ (which you have called $\Sigma_{z}^{x,t}$) is defined as
$$E[(\hat{x}_{k|k-1} - E[\hat{x}_{k|k-1}])(H_k\hat{x}_{k|k-1} + v_k - E[H_k\hat{x}_{k|k-1}+v_k])^T].$$
Now use linearity of expectation to factor out $H_k$.
$$= E[(\hat{x}_{k|k-1} - E[\hat{x}_{k|k-1}])(H_k(\hat{x}_{k|k-1}- E[\hat{x}_{k|k-1}])+v_k - E[v_k])^T]. $$
Define $\Delta \hat{x}_{k|k-1} \equiv \hat{x}_{k|k-1} - E[\hat{x}_{k|k-1}]$ and $\Delta v_k \equiv v_k - E[v_k]$.
$$= E[\Delta \hat{x}_{k|k-1}(H_k\Delta \hat{x}_{k|k-1}+\Delta v_k)^T]. $$
Distribute the transpose.
$$= E[\Delta \hat{x}_{k|k-1}(\Delta \hat{x}_{k|k-1}^T H_k^T+\Delta v_k^T)]. $$
Distribute.
$$= E[\Delta \hat{x}_{k|k-1}\Delta \hat{x}_{k|k-1}^T] H_k^T + E[\Delta \hat{x}_{k|k-1} \Delta v_k^T]. $$
Now use independence of $v_k$ to distribute the expectation across the product.
$$= E[\Delta \hat{x}_{k|k-1}\Delta \hat{x}_{k|k-1}^T] H_k^T + E[\Delta \hat{x}_{k|k-1}]E[ \Delta v_k^T] $$
$$= E[\Delta \hat{x}_{k|k-1}\Delta \hat{x}_{k|k-1}^T] H_k^T + 0 $$
$$= P_{k|k-1} H_k^T $$
So in the end, the equivalence follows from the standard Kalman filter assumptions. The simplification is not possible in the UKF case because the measurement model is nonlinear, so there is no matrix $H_k$ to factor out.
You can, however, get a completely mechanical intuition about the information content of the cross covariance. It answers the following question: "If I wiggle state variable $i$ inside $\Delta \hat{x}_{k|k-1}$ upwards, how likely is it that observation variable $j$ of $(H_k \hat{x}_{k|k-1} + v_k)$ will also go up?" This is what the entry $(i,j)$ of the cross covariance means. And it makes sense that this information would be relevant for the operation of the UKF.
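The identity derived above, $\Sigma^{x,t}_z = P_{k|k-1} H_k^T$, is easy to check numerically with a Monte-Carlo sketch (the matrices below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

P = np.array([[2.0, 0.3], [0.3, 1.0]])  # predicted state covariance
H = np.array([[1.0, 0.5]])              # linear measurement matrix
Rv = np.array([[0.2]])                  # measurement noise covariance

# Sample predicted states x and noises v, form measurements z = H x + v
n = 200_000
x = rng.multivariate_normal(np.zeros(2), P, size=n)
v = rng.multivariate_normal(np.zeros(1), Rv, size=n)
z = x @ H.T + v

# Sample cross covariance between state and predicted measurement
cross = (x - x.mean(0)).T @ (z - z.mean(0)) / (n - 1)

print(cross)  # approximately equals P @ H.T
```

The sample cross covariance matches `P @ H.T` up to Monte-Carlo error, and the noise term drops out because `v` is independent of the state, mirroring the step where $E[\Delta \hat{x}_{k|k-1}]E[\Delta v_k^T] = 0$.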
Best Answer
If you have several observations of the same signal, you just need to update your model several times (once for each sensor). Your workflow would look like:
If some of the measurements are not available at the current call, you just skip them and update only with the available ones.
Doing so, you don't need to change your model depending on the system configuration. The filter should not know where your measurements come from or how many sensors you have. It just predicts over the time between measurements and updates as soon as one or several measurements are available.
Here is a short matlab example for your problem:
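The original MATLAB listing is not reproduced here; the following is a minimal NumPy sketch of the same idea, with a hypothetical constant-acceleration model and two position sensors (one of which reports only intermittently). All numbers are illustrative.

```python
import numpy as np

def predict(x, P, F, Q):
    return F @ x, F @ P @ F.T + Q

def update(x, P, H, z, R):
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

dt = 0.1
F = np.array([[1, dt, dt**2 / 2],
              [0, 1, dt],
              [0, 0, 1]])               # state: position, velocity, acceleration
Q = 0.01 * np.eye(3)
H = np.array([[1.0, 0.0, 0.0]])         # both sensors measure position
R1, R2 = np.array([[0.5]]), np.array([[1.5]])

x, P = np.zeros(3), np.eye(3)
for t in range(50):
    x, P = predict(x, P, F, Q)
    # Sensor 1 is always available (stand-in readings from a known trajectory)
    z1 = np.array([0.5 * (dt * t) ** 2]) + 0.1
    x, P = update(x, P, H, z1, R1)
    # Sensor 2 only reports every other step; otherwise skip its update
    if t % 2 == 0:
        z2 = np.array([0.5 * (dt * t) ** 2]) - 0.1
        x, P = update(x, P, H, z2, R2)
```

One predict per time step, then one update per sensor that actually delivered a value: the model itself never changes with the sensor configuration.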
Here are the results (plots not shown): velocity estimation and acceleration estimation.