Solved – Repeated loess smoothing for time series data

data visualizationloessr

I have 3 time series and I would like to determine if there are any common patterns in the data. Zuur mentioned in one of his books that a reasonably simple approach to test this is based on repeated loess smoothing. Zuur states that to visualize the general patterns of the time series, we fit a loess smoother through each series and then plot them all on the same graph. This generally captures the long term variations, to capture the short term variations (which I am interested in) we have to apply loess smoothing again, but now on the residuals.

Below, is an example data set where we have hourly values of ozone concentrations for 3 countries, I would like to perform what Zuur has explained, but to this example data set. Could anyone offer some suggestions? Please note that my knowledge of R and statistics in general is low, I apologize if this is a simplistic question.

require(plyr)
require(lattice)

TopFolder <- list("http://www.nilu.no/projects/ccc/onlinedata/ozone/CZ03_2009.dat"
                  ,"http://www.nilu.no/projects/ccc/onlinedata/ozone/CY02_2009.dat"
                  ,"http://www.nilu.no/projects/ccc/onlinedata/ozone/BE35_2009.dat"
)
## create variable for the data 
data = ldply(TopFolder, header = TRUE, read.table, sep = "", skip = 3)
## define Ozone levels
Ozone <- data$Value
    Ozone[Ozone==-999] <- NA
    Ozone <- data.frame(Ozone)
    ## define Datetime -  need to concatenate arrays
    DateTime <- paste(data$Date,data$Hour, sep = " ")
    Date <- as.POSIXct(DateTime, format = "%d.%m.%Y %H:%M")
    ## define countries 
    Countries <- c("Country1","Country2","Country3")
    Country <- data.frame(Country = rep(Countries, each = 8760))
    ## bind together
    Dat <- cbind(Ozone, Country = Country)
    Dat <- transform(Dat, Date=Date, Doy = as.numeric(format(Date,format = "%j")),
                     Tod = as.numeric(format(Date,format = "%H")),
                     DecTime = rep(seq(1,365, length = 8760),by = 3))
    ## Select data for a month during the summer 
    NewDat <- subset(Dat, as.Date(Dat$Date) >= '2009-07-01 00:00:00' & as.Date(Dat$Date) <= '2009-08-01 00:00:00') 

## plot the data
xyplot(Ozone~DecTime | Country, data = NewDat, type  = "l", col = 1,
       strip = function(bg = 'white',...)strip.default(bg = 'white',...))

So, here I have imported the data and taken a subset (for one month), and then plotted the values. I would now like to apply loess smoothing to represent the short term variation i.e. hourly.

Best Answer

Try this:

PlotData <- lapply(Countries,function(country) {
  fit1 <- loess(Ozone~DecTime ,data = NewDat[NewDat$Country==country,],na.action=na.exclude)

  fit2 <- loess(residuals(fit1)~DecTime ,data = NewDat[NewDat$Country==country,],na.action=na.exclude)

  data.frame(DecTime=NewDat[NewDat$Country==country,"DecTime"],SmoothResiduals=predict(fit2),Country=country)
  })

PlotData <- do.call('rbind',PlotData)

xyplot(SmoothResiduals~DecTime | Country, data = PlotData, type  = "l", col = 1,
       strip = function(bg = 'white',...)strip.default(bg = 'white',...))

Smoothed residuals

Explanation: lapply is used to iterate over the countries, applies a function and combines the function values in a list. The anonymous function does a loess fit (note my use of na.action=na.exclude to deal with NA values), does another loess fit on the residuals and returns a data.frame with the smoothed residuals. Subsequently, I combine all data.frames and create the plot.

Obviously you might want to adjust the parameters of the loess fits.

Related Question