[Math] Testing for a change in mean in a time series


Suppose I have $m$ observations from a time series before some specified date and $n$ additional observations after that date. The mean of the series is expected to change on that date by an unknown amount, with no other changes to the series. Given the $m + n$ observations and the date of the expected change, how does one test for a change in mean?

As an example, think about new traffic rules that are expected to decrease the number of accidents in some country. We have the daily number of accidents in the country for, say, the 60 days before the rule change and the 40 days after it, and want to know whether the change did any good, i.e., lowered the number of accidents. Importantly, the observations cannot be assumed to be independent, so a simple two-sample t test or Wilcoxon test isn't appropriate; any reasonable correlation structure – say, AR(1) – may be assumed. In my specific application, both $m$ and $n$ are around 10.
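For concreteness, here is a sketch of the kind of model I have in mind (Python with statsmodels; the data and sizes below are placeholders, and I am not claiming this is the right test): regress the series on a step dummy that switches on at the change date, with AR(1) errors, and look at the estimated step coefficient.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(0)
m, n = 10, 10                                                  # sizes as in my application
accidents = pd.Series(rng.poisson(30, m + n).astype(float))    # placeholder daily counts
after = pd.Series(np.r_[np.zeros(m), np.ones(n)], name="after_change")

# Mean-shift model with AR(1) errors:
#   y_t = mu + delta * after_t + u_t,   u_t = phi * u_{t-1} + eps_t
model = SARIMAX(accidents, exog=after, order=(1, 0, 0), trend="c")
fit = model.fit(disp=False)

print(fit.summary())                          # the "after_change" row is the mean shift
delta = fit.params["after_change"]
print("estimated mean shift:", delta)
# For the one-sided question (did accidents go down?) the two-sided p-value
# can be halved when the estimated shift is negative.
print("two-sided p-value:", fit.pvalues["after_change"])
```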

Any help will be greatly appreciated.

Best Answer

With $m$ and $n$ so small (around 10), either the change is large enough that it will jump out at you just by looking at the data, or it is small enough that you won't be able to say anything very conclusive with a statistical test.

If you insist on a formal approach nonetheless, the minimum description length (MDL) principle provides a framework.

Write the shortest program $P$ that outputs an infinite time series starting with the $m+n$ observed values, and let $x = |P|$ be the length of $P$ in bits. Then write two short programs $P_1$ and $P_2$ that output infinite time series starting respectively with the first $m$ values and the subsequent $n$ values, chosen so that $y = |P_1| + |P_2| - |P_1 \cap P_2|$ is minimal, where $|P_1 \cap P_2|$ is the length of the longest prefix of code shared by $P_1$ and $P_2$.

If $x < y$, you can't really justify treating the two segments as different time series. Otherwise, you can look at the means implied by $P_1$ and $P_2$ and see how they differ.
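The literal shortest-program lengths are of course not computable, so here is one crude practical stand-in (a sketch only, and my own substitution, not the construction above taken literally): approximate each "program" by a fitted Gaussian AR(1) model and score it with a BIC-style two-part code length in bits. The shared parameters (AR coefficient, innovation variance, baseline mean) play the role of the shared code prefix, and the extra mean-shift parameter is the extra code needed for the two-segment description.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

def code_length_bits(fit, n_obs):
    """BIC-style two-part description length of a fitted model, in bits."""
    k = len(fit.params)
    return -fit.llf / np.log(2) + 0.5 * k * np.log2(n_obs)

rng = np.random.default_rng(1)
m, n = 10, 10
series = pd.Series(rng.poisson(30, m + n).astype(float))   # placeholder data
after = pd.Series(np.r_[np.zeros(m), np.ones(n)], name="after")

# "P": one description for the whole series (common mean, AR(1) errors).
fit_one = SARIMAX(series, order=(1, 0, 0), trend="c").fit(disp=False)
x = code_length_bits(fit_one, m + n)

# "P1, P2 with shared prefix": same model plus one extra parameter, the mean shift.
fit_two = SARIMAX(series, exog=after, order=(1, 0, 0), trend="c").fit(disp=False)
y = code_length_bits(fit_two, m + n)

print(f"x (single mean)  ~ {x:.1f} bits")
print(f"y (shifted mean) ~ {y:.1f} bits")
if x < y:
    print("Single-mean description is shorter: no case for a change in mean.")
else:
    print("Split description is shorter; implied shift:", fit_two.params["after"])
```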
