The following attempt can surely be improved upon:
zucchini <- function(st, en, mingap = 1)
{
    # Sort intervals by start position, breaking ties by length
    i <- order(st, en - st)
    st <- st[i]
    en <- en[i]
    # Greedily fill one row: repeatedly take the first interval that
    # starts at least 'mingap' after the end of the last one placed
    last <- r <- 1
    while (any(ok <- st > (en[last] + mingap)))
    {
        last <- which(ok)[1]
        r <- append(r, last)
    }
    if (length(r) == length(st))
        return(list(c = list(st[r], en[r]), n = 1))
    # Pack the leftover intervals into further rows (mingap must be
    # passed down to the recursive call)
    ne <- zucchini(st[-r], en[-r], mingap)
    list(c = c(list(st[r], en[r]), ne$c), n = ne$n + 1)
}
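A quick, deterministic check of the packing (the function is restated here, with mingap passed through the recursion, so the snippet runs on its own; the interval values are made up):

```r
# Greedy interval row-packing, as above
zucchini <- function(st, en, mingap = 1) {
    i <- order(st, en - st)
    st <- st[i]
    en <- en[i]
    last <- r <- 1
    while (any(ok <- st > (en[last] + mingap))) {
        last <- which(ok)[1]
        r <- append(r, last)
    }
    if (length(r) == length(st))
        return(list(c = list(st[r], en[r]), n = 1))
    ne <- zucchini(st[-r], en[-r], mingap)
    list(c = c(list(st[r], en[r]), ne$c), n = ne$n + 1)
}

# Four intervals: the second overlaps the first, so two rows are needed
st <- c(0, 2, 12, 30)
en <- c(10, 20, 25, 40)
z <- zucchini(st, en)
z$n        # 2 rows
z$c[[1]]   # row 1 starts: 0 12 30
z$c[[2]]   # row 1 ends:   10 25 40
```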
coliflore <- function(st, en, mingap = 1)
{
    zu <- zucchini(st, en, mingap)
    plot.new()
    plot.window(xlim = c(min(st), max(en)), ylim = c(0, zu$n + 1))
    box()
    axis(1)
    # zu$c holds pairs of vectors (starts, ends), one pair per row
    for (i in seq(1, 2 * zu$n, 2))
    {
        x1 <- zu$c[[i]]      # starts for this row
        x2 <- zu$c[[i + 1]]  # ends for this row
        for (j in seq_along(x1))
            rect(x1[j], (i + 1) / 2, x2[j], (i + 1) / 2 + 0.5,
                 col = "gray", border = NA)
    }
}
Application:
> st <- runif(20,0,50)
> en <- st + runif(20, 5,20)
> st
[1] 25.571385 17.074676 4.564936 27.247745 23.832638 11.045469 2.845222
[8] 2.824046 23.319625 19.684993 42.610242 48.185618 47.748637 39.813871
[15] 9.235512 40.299425 13.797027 21.079956 31.638772 24.152991
> en
[1] 35.43667 32.20029 19.37133 44.30378 35.73845 16.63794 11.52551 16.06469
[9] 32.22477 26.05563 49.51284 67.77664 67.27914 49.35472 28.27657 50.49421
[17] 27.29273 37.87611 48.76251 39.89335
> coliflore(st, en)
Happy new year!
After you've fit the model, why not use the predicted number of defects as a variable to compare with the others, using whatever standard techniques are meaningful for them? It has the advantage of being continuous, so you can see even small differences. For example, people will understand the difference between an expected defect count of 1.4 and one of 0.6, even though both round to one.
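As a sketch of that idea (toy data and made-up coefficients, not the asker's), rank units by their expected defect count from a fitted Poisson GLM:

```r
set.seed(1)
n <- 50
d <- data.frame(time = rexp(n, 0.01),
                complexity = sample(1:5, n, replace = TRUE))
# Simulate defect counts from a Poisson model (coefficients are arbitrary)
d$defects <- rpois(n, exp(-1 + 0.005 * d$time + 0.15 * d$complexity))
m <- glm(defects ~ time + complexity, family = poisson, data = d)
# Expected defects on the response scale: a continuous score,
# so 1.4 vs 0.6 stays visible even though both round to 1
d$expected <- predict(m, type = "response")
head(d[order(-d$expected), ])
```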
For an example of how the predicted value depends on two variables, you could draw a contour plot with time and complexity on the two axes, using colour and contours to show the predicted defects, and superimpose the actual data points on top.
The plot below needs some polishing and a legend but might be a starting point.
An alternative is the added-variable plot (or partial-regression plot), more familiar from traditional Gaussian-response regression. These are implemented in the car package. Effectively, they show the relationship between what is left of the response and what is left of one explanatory variable, after the remaining explanatory variables have had their contribution to both removed. In my experience most non-statistical audiences find these a bit difficult to appreciate (which could be my poor explanations, of course).
#--------------------------------------------------------------------
# Simulate some data
n<-200
time <- rexp(n,.01)
complexity <- sample(1:5, n, prob=c(.1,.25,.35,.2,.1), replace=TRUE)
trueMod <- exp(-1 + time*.005 + complexity*.1 + complexity^2*.05)
defects <- rpois(n, trueMod)
cbind(trueMod, defects)
#----------------------------------------------------------------------
# Fit model
model <- glm(defects~time + poly(complexity,2), family=poisson)
# all sorts of diagnostic checks should be done here - not shown
#---------------------------------------------------------------------
# Two variables at once in a contour plot
# create grid
gridded <- data.frame(
    time = seq(from = 0, to = max(time) * 1.1, length.out = 100),
    complexity = seq(from = 0, to = max(complexity) * 1.1, length.out = 100))
# create predicted values (on the original scale)
yhat <- predict(model, newdata=expand.grid(gridded), type="response")
# draw plot
image(gridded$time, gridded$complexity,
      matrix(yhat, nrow = 100, byrow = FALSE),
      xlab = "Time", ylab = "Complexity",
      main = "Predicted average number of defects shown as colour and contours\n(actual data shown as circles)")
contour(gridded$time, gridded$complexity,
        matrix(yhat, nrow = 100, byrow = FALSE), add = TRUE,
        levels = c(1, 2, 4, 8, 15, 20, 30, 40, 50, 60, 70, 80, 100))
# Add the original data
symbols(time, complexity, circles = sqrt(defects), add = TRUE, inches = .5)
#--------------------------------------------------------------------
# added variable plots
library(car)
avPlots(model, layout=c(1,3))
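On the "diagnostic checks not shown" comment above: one quick screen (a sketch, not a substitute for proper diagnostics) is comparing the residual deviance to its degrees of freedom as a rough overdispersion check. The simulation is restated here so the snippet runs standalone:

```r
set.seed(42)
n <- 200
time <- rexp(n, .01)
complexity <- sample(1:5, n, prob = c(.1, .25, .35, .2, .1), replace = TRUE)
defects <- rpois(n, exp(-1 + time * .005 + complexity * .1 + complexity^2 * .05))
model <- glm(defects ~ time + poly(complexity, 2), family = poisson)
# For a well-specified Poisson model this ratio should be near 1;
# values much larger than 1 suggest overdispersion (consider quasipoisson)
dispersion <- deviance(model) / df.residual(model)
dispersion
```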
Best Answer
Sheesh, this is a good one.
I think that you're on the right track with an "infographic" approach; however, I would suggest that data-visualization infographics would serve you better.
Something like http://www.dipity.com/ could let you track time and also provide content tied to the subject being referenced.
Take a look at http://many-eyes.com/#/visualizations; maybe something there could help you evaluate your data better. I only suggest these sites to help you find a better way of approaching such a fun and exciting project.