Solved – How to effectively summarize and visualize time series of employee activities

data visualizationrtime series

I am managing many people entering data into a database. I have a log of user, date, time, table, and action that each person makes:

records <- data.frame(user = c('bob', 'bob', 'jane', 'jane', 'bob', 'bob', 'bob', 'jane', 'jane', 'bob'),
                      date = c("2010-06-24", "2010-06-28", "2010-06-29", "2010-06-30", "2010-07-01", "2010-07-02", "2010-07-05", "2010-07-06", "2010-07-07", "2010-07-09"), 
                      time = c("01:40:08", "01:40:18", "01:40:28", "01:40:37", "01:40:44", "01:40:52", "01:40:59", "01:56:26", "02:16:37", "03:55:06"),
                      table = c(rep('table1',5), rep('table2',5)),
                      action = c('create', 'create', 'create', 'update', 'create', 'update', 'update', 'create', 'create', 'create')) 

For a non-trivial example, the actual records dataframe with 10,000 entries can be downloaded as an .Rdata file here, and then:

load('records.Rdata')
library(ggplot2)
qplot(date, table, data = records, color = user, geom='jitter')

How can I visualize, overall and for each table:

  1. the amount of time each person works per week
  2. the type and number or frequency of actions that they made.

?

Best Answer

Below the code to plot the numbers of actions per week/per user:

load("records.Rdata")
library(ggplot2)
records$posdate <- as.POSIXlt(records$date,format="%Y-%m-%d")
records$week <- as.numeric(format(records$posdate,"%W")) #changed from previous hack!
numberActions <- by(records$action,records[,c("user","week")],function(x) length(x[x!="Login"]))
numberActions <- melt(t(numberActions[1:7,]),"week")
colnames(numberActions)[2] <- "user"
ggplot(numberActions,aes(week,value,group=user,col=user))+geom_line()+geom_point(size=3)

I hope this helps.

EDIT:

For the second part you can use plyr:

numberActions <- ddply(records,c("user","week"),function(x) table(x$action))
numberActions <- melt(numberActions,c("user","week"))
numberActions <- numberActions[numberActions$value!=0 & numberActions$variable!="Login",]

and then the usual ggplot...