Solved – Statistical test for a series of data over time

statistical significance, time series

I have a set of data that looks at the number of "hits" a specific program makes over the course of time. The data runs from September 2010 through March 2011, with monthly data points. What I want to see is whether the most recent data point (March 2011) shows a statistically significant decrease in the number of "hits" this program makes.

I have a feeling there might not be a test that would fit this perfectly, as the data is a bit limited. I can also pull the data weekly for the same time frame, which would give 31 points (in which case I would still want to compare the most recent unit against the rest). No population mean has been established for this data yet, as the data can only be pulled as far back as Jan 2010 (and the data from that far back is not reliable).

For reference, just 9 weeks of data (as I pulled that first) gives:
mean= 1013.67
n=9
st.dev= 53.57
Most recent week= 991

Just eyeballing it, this does not appear to be a statistically significant drop in "hits". However, I'll need to perform this analysis every few weeks, and I'm wondering if there's something reliable I can use. Thanks ahead of time for the input!

Best Answer

As GaBorgulya pointed out, one needs a model to detect a potential anomaly. This model needs to generate a "white noise" error series or otherwise be sufficient to separate signal from noise. With such a model in hand, based upon the older data, one can then compare the new value with the prediction interval. This is the classical, albeit limited, approach called an "out-of-model test".

A more comprehensive approach is to include a "pulse variable", i.e. zeros for the earlier periods and a 1 for the new data point, and to estimate the coefficients of the augmented model using all of the data. The probability of observing what you observed before you observed it (i.e. the new value) is then available from the "t value" of the "pulse variable" in this augmented model.

In general this approach is referred to as Intervention Detection, which scans (data mines) the time periods to detect the points where Pulses, Level Shifts, Seasonal Pulses and Local Time Trends are significantly evident. In your case you are not scanning for unknown change points; you are simply asking whether there is a potential change point at the last observation, i.e. the single period flagged with a "1". Your question also suggests solutions that we have seen, which detect a significant change in the mean of the last K periods and alert the analyst to the innovation.
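To make the pulse-variable idea concrete, here is a minimal sketch for the simplest possible model: a constant level plus white noise. That model is an assumption and may well be too simple for real hit counts with trend or seasonality. Under it, the pulse coefficient is just the new value minus the mean of the earlier points, its standard error is s·sqrt(1 + 1/n), and the resulting t value is the same one you would get from comparing the new value with the prediction interval. The function name `pulse_t_value` is hypothetical, and the sketch assumes that the summary figures in the question describe the 9 weeks before the most recent one, which the question does not state explicitly.

```python
import math


def pulse_t_value(hist_mean, hist_sd, n, new_value):
    """t value of a pulse dummy on the newest point, constant-mean model.

    ASSUMPTION: the n earlier points are white noise around a fixed level,
    and hist_sd is their sample standard deviation (n - 1 denominator).
    Returns (t value, degrees of freedom, standard error of the pulse).
    """
    if n < 2:
        raise ValueError("need at least two historical observations")
    pulse_coef = new_value - hist_mean             # estimated size of the pulse
    std_err = hist_sd * math.sqrt(1.0 + 1.0 / n)   # prediction standard error
    return pulse_coef / std_err, n - 1, std_err


# Summary figures taken from the question.
# ASSUMPTION: they describe the 9 weeks preceding the most recent week.
HIST_MEAN, HIST_SD, N, NEW_VALUE = 1013.67, 53.57, 9, 991

t_value, dof, std_err = pulse_t_value(HIST_MEAN, HIST_SD, N, NEW_VALUE)

# Tabulated one-sided 5% critical value of Student's t with 8 degrees of freedom.
T_CRIT = 1.860

# Lower end of the one-sided 95% prediction interval for the next value.
lower_bound = HIST_MEAN - T_CRIT * std_err

# One-sided test for a decrease: flag only if t falls below -T_CRIT.
significant_drop = t_value < -T_CRIT

print(f"t = {t_value:.3f} on {dof} df, lower bound = {lower_bound:.1f}, "
      f"significant drop: {significant_drop}")
```

With these figures the t value comes out at roughly -0.40, nowhere near the -1.860 cut-off, which agrees with the impression in the question that 991 is not a significant drop. For recurring use, the same check can be rerun each period with updated history. If the series shows trend, seasonality or autocorrelation, the constant-mean model should be replaced by one whose residuals look like white noise before the t value is trusted.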