Solved – Analysis of multiple time series


I'm not sure how to google for this, as I am not very familiar with time series analysis.

I have 500 websites, and I am measuring the number of visitors to each website each day. At some point, I turn on SEO (search engine optimization) for each of the websites. This happens on different days for different websites. The distribution of daily visitors across websites (accounts) has a long tail to the right. SEO may not have an immediate effect; it might take a few days or weeks to really start to see some results.

I want to measure something like the "average" lift in the number of visitors, but a plain average is probably not going to do the trick because of the mix of websites. (A daily/weekly/monthly trend curve would be really cool, but the averaging problem will exist there, too.)

I can probably average the number of visitors per day for any given account, but I can't do it across accounts.

Do I simply need to segment the websites into "number of visitors" groups? What other kinds of approaches should I read about?

Best Answer

Any study should start out with some conception of a goal. Are you interested in measuring the impact of your SEO? Or are you trying to model the visitor behavior?

It isn't clear to me that you're making use of the "time series" aspect of this data. Are you also interested in the time of day or day of week of visits, for instance? Or visits around specific events? You could just as easily divide your # of visits/day into two groups -- with/without SEO -- and eliminate time. This would then be a categorical variable in your data.
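As a rough sketch of that with/without split, you could compute the lift per site as a ratio and then summarize the ratios, which keeps one high-traffic site from swamping the rest. Everything here is an assumption for illustration: the toy numbers are made up, and the column names `site`, `seo`, and `num.visits` are placeholders for whatever your data actually uses.

```r
# Toy data, made up for illustration: 3 sites, 4 days each,
# with SEO switched on for the last 2 days of every site.
your.data <- data.frame(
  site       = rep(c("a", "b", "c"), each = 4),
  seo        = rep(c(FALSE, FALSE, TRUE, TRUE), times = 3),
  num.visits = c(10, 12, 15, 18,
                 1000, 1100, 1200, 1320,
                 50, 50, 45, 55)
)

# Mean daily visits per site, split by SEO off/on.
# Rows are sites; columns are named "FALSE" and "TRUE".
site.means <- tapply(your.data$num.visits,
                     list(your.data$site, your.data$seo), mean)

# Per-site lift as a ratio, so large and small sites share one scale.
lift <- site.means[, "TRUE"] / site.means[, "FALSE"]

# Summaries that are less sensitive to a long right tail than a raw mean.
median.lift   <- median(lift)
geo.mean.lift <- exp(mean(log(lift)))
```

Printing `lift` shows one ratio per site, and `median.lift` or `geo.mean.lift` gives a single "typical" lift. This ignores the delay before SEO takes effect; dropping the first few days or weeks after the switch is one simple way to allow for it.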

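A basic next step could be to run a regression to see the impact of SEO on your site traffic. Because the number of visits is a count, a Poisson regression is a better fit here than a logistic regression, which expects a binary or proportion outcome. In R, this could look something like this (where "seo" is a factor):

# Poisson regression of daily visit counts on the SEO on/off factor
site.traffic.glm <- glm(num.visits ~ seo, family=poisson, data=your.data)
# exp() of the seo coefficient is the multiplicative change in expected visits
summary(site.traffic.glm)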

If you want to account for the fact that the data come from a number of different kinds of sites, you could add the site (or a site category) to the formula as another variable.
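A minimal sketch of what adding the site to the formula might look like is below. The toy data and the column names (`site`, `seo`, `num.visits`) are assumptions for illustration, and I have swapped in `family = quasipoisson`, which is one common way to allow for the extra spread that long-tailed counts tend to have; it is not the only option.

```r
# Toy data, made up for illustration: 3 sites, 4 days each,
# with SEO switched on for the last 2 days of every site.
your.data <- data.frame(
  site       = factor(rep(c("a", "b", "c"), each = 4)),
  seo        = factor(rep(c("off", "off", "on", "on"), times = 3),
                      levels = c("off", "on")),
  num.visits = c(10, 12, 15, 18,
                 1000, 1100, 1200, 1320,
                 50, 50, 45, 55)
)

# Count regression with a log link: each site gets its own baseline level,
# and the seo term is a shared multiplicative effect on top of it.
fit <- glm(num.visits ~ seo + site, family = quasipoisson, data = your.data)

# With levels c("off", "on"), R names the SEO coefficient "seoon".
# Exponentiating turns it into a ratio (on vs. off).
seo.lift <- exp(coef(fit)["seoon"])

summary(fit)
```

Note that this pooled estimate is weighted toward the high-traffic sites, so it answers "how much did total traffic change" more than "what lift does a typical site see". With 500 sites the `site` factor also adds a lot of coefficients, so grouping sites into a handful of categories may be more practical.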
