Solved – What’s the best (Google chart) visualisation for displaying sparse timeline data across thousands of “columns”

data visualizationsparsetime series

I am trying to visualise a sparse dataset but am finding it hard to fit it into the standard categories of charts. I'm a developer building with Google Charts and I really want to stick with that product.

The data set is the total count of "alerts" per day coming from a set of 2,000 devices. On most days some of the devices have problems and raise a few alerts. I want to display a chart so that an administrator can easily see which devices are having problems recently and be able to look back in time over previous days/weeks/months.

  • "Most" days = almost every day.
  • "Some" devices = perhaps 100-1000 (per day)
  • "A few" alerts = perhaps 1 to 100 (per device per day) occasionally much higer.

I tried putting this on a column or line chart, but it seems to end up needing 2,000 columns mostly with empty data – and results in unworkably large dataset (I have to fill the entire table with missing data points) or a terrible UI, or a crashed browser.

The raw row data is really simple, just three columns:

date, device id, number of alerts 

but the sparsity over time, and the fact that different devices pop up on different days, means I can't figure out how to map it to a chart.

I was thinking a stacked column chart would be ideal, but again technically the graphing system needs a consistent set of columns on every day.

It feels like I need a kind of daily stacked histogram, to collate by #of alerts? Or a threshold…

(BTW I've tagged it google-spreadsheet because it's the same charts, but I'm not using the spreasheets, I've made a data-provider)

(BTW2 I'm new to this exchange. I may post a more technical question on SO about the programming behind Google Charts – if that's all that's holding me up, but I was hoping for some visualisation inspiration here!)

Best Answer

Answering my own question, after coming up with a set of visualisations that seems to do the job! Lesson learned: I was just trying to show too much in one chart.

In summary, the problem was solved by splitting the data into multiple charts - six or seven in all - with interactivity to enable the viewer to drill down into the data from high-level aggregate summaries.

  1. A column chart of total alerts per month
  2. A column chart of total alerts per day, within a month
  3. A histogram of total alerts per device, all-time
  4. A histogram of total alerts per device, within a month
  5. A histogram of total alerts per device, within a day
  6. A column chart showing the incidents happening on a specific unit, over a timeline e.g. a month.
  7. A line-chart showing the data associated with each alert (specific to the alert, e.g. voltage, bpm, i/o whatever).

So the user starts at the top, and clicks a day (probably today). The month and day histogram reloads and the highest bar is the device with the most problems, so the user clicks that. Below, the device-specific graphs load allowing the user to see that devices behaviour.

It works well, in two/three clicks the user can get overview and detail for the most important aspects.

In this example one device had a problem on a specific day, contributing to most of the errors that day. It's very easy to find this now.

enter image description here

Related Question