When taking in data through a file (such as table[x index=0, y index=1, y error index=2]{plots/mydata.table};), is it possible to calculate the error bars automatically (instead of putting them in manually)? Error calculation is a rather straightforward matter, and I would think that a package as complete as pgfplots would include it as a feature.
I'm open to a number of options. Obviously, a pure LaTeX (i.e. pgfplots) solution is best. If there is a way to automate the run of a script upon compilation that alters the data files before their use, that would be okay too (and this is probably the easiest, although I don't know how to effect the automation).
MWE
\documentclass{article}
\usepackage{pgfplots}
\usepackage{tikz}
\pgfplotsset{compat=1.7}
\begin{document}
\begin{tikzpicture}
\begin{axis}[grid=major]
\addplot+[smooth,
error bars/.cd,
y dir=both,
y explicit]
table[x index=0, y index=1, y error index=2]
{plots/mydata.table}; % simple space-delimited data file
\end{axis}
\end{tikzpicture}
\end{document}
Sample mydata.table
# Test input file
Sample Measure1 Measure2 Measure3 Measure4 ...
5 180 190 200 210
15 420 410 400 390
25 650 640 630 640
35 1100 1200 1150 1020
I have a script that will produce
5 194.4 4.36898157469
10 195.6 1.4310835056
15 207.4 2.23785611691
20 250.4 1.4587666023
given the input file.
Best Answer
PGFPlots comes with the PGFPlotstable package, which can process tabulated data. It doesn't include functions to calculate summary statistics like the mean, standard deviation or standard error for data columns, but these can be added quite easily.
After the necessary code has been included in the document, you can tell PGFPlots to make the standard error of the data in columns 2 to 5 available in a column called stderror by putting the following lines somewhere before your graph:
By default, the code assumes that the data columns start at index 1 (so the second column in the table) and end at index 4 (the fifth column), but this can be changed using the keys summary statistics/start index and summary statistics/end index.
Then you can plot the mean values of each row with the error bars representing the standard error of columns 2 to 5 using
Here's an example using the data you provided (with the values slightly altered for a more dramatic effect):
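As a rough sketch of the same idea, the row-wise mean and standard error can also be computed with the stock create col/expr key of pgfplotstable alone, without the summary statistics keys described above. This sketch assumes exactly four measurement columns named Measure1 to Measure4 and uses the unaltered sample data from the question. The table name \datatable and the column names mean and stderror are chosen here purely for illustration. The standard error is taken as the sample standard deviation divided by sqrt(n), which for n = 4 equals the square root of the sum of squared deviations divided by n(n-1) = 12.

```latex
\documentclass{article}
\usepackage{pgfplots}
\usepackage{pgfplotstable}
\pgfplotsset{compat=1.7}

% Inline copy of the sample data from the question
% (assumption: four measurement columns with these names).
\pgfplotstableread{
Sample Measure1 Measure2 Measure3 Measure4
5      180      190      200      210
15     420      410      400      390
25     650      640      630      640
35     1100     1200     1150     1020
}\datatable

% New column "mean": row-wise average of the four measurements.
\pgfplotstablecreatecol[
  create col/expr={(\thisrow{Measure1}+\thisrow{Measure2}
                   +\thisrow{Measure3}+\thisrow{Measure4})/4}
]{mean}\datatable

% New column "stderror": sqrt( sum (x_i - mean)^2 / (n(n-1)) ), n = 4.
% The squares are written as products to keep the expression simple.
\pgfplotstablecreatecol[
  create col/expr={sqrt((
      (\thisrow{Measure1}-\thisrow{mean})*(\thisrow{Measure1}-\thisrow{mean})
    + (\thisrow{Measure2}-\thisrow{mean})*(\thisrow{Measure2}-\thisrow{mean})
    + (\thisrow{Measure3}-\thisrow{mean})*(\thisrow{Measure3}-\thisrow{mean})
    + (\thisrow{Measure4}-\thisrow{mean})*(\thisrow{Measure4}-\thisrow{mean})
    )/12)}
]{stderror}\datatable

\begin{document}
\begin{tikzpicture}
\begin{axis}[grid=major]
% Plot the computed means with the computed standard errors as error bars.
\addplot+[error bars/.cd, y dir=both, y explicit]
  table[x=Sample, y=mean, y error=stderror]{\datatable};
\end{axis}
\end{tikzpicture}
\end{document}
```

Because the column names and the divisor are hard-coded, a table with a different number of measurement columns would need both expressions adjusted by hand.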