[Tex/LaTex] How can already binned data be plotted as a histogram

diagrams pgfplots tikz-pgf

I have the following data:

[0, 1000) : 14524
[1000, 2000) : 38214
[2000, 3000) : 36169
[3000, 4000) : 25875
[4000, 5000) : 16942
[5000, 6000) : 10603
[6000, 7000) : 6778
[7000, 8000) : 4288
[8000, 9000) : 2980
[9000, 10000) : 1986
[10000, 11000) : 1392
[11000, 12000) : 1040
[12000, 13000) : 801
[13000, 14000) : 632
[14000, 15000) : 467
[15000, \infty): 3819

How can I plot such data in the form of a histogram?

My Try

% Source: http://tex.stackexchange.com/a/65518/5645
\documentclass{standalone}
\usepackage{pgfplots}

\begin{document}
\begin{tikzpicture}
    \begin{axis}[/tikz/ybar,
        ybar legend,
        xtick align=outside,
        ymin=0,
        bar width=0.9cm,
        axis x line*=left,
        enlarge x limits=false,
        grid=major,
        height=7cm,
        title={All Results},
        xlabel={recording time $t$ in ms},
        ylabel={Number of Recordings},
        symbolic x coords={$0$, $1000$, $2000$, $3000$, $4000$, $5000$,
                           $6000$, $7000$, $8000$, $9000$, $10000$,
                           $11000$, $12000$, $13000$, $14000$, $15000$},
        minor x tick num=5,
        extra x tick style={xticklabel style={yshift=-15pt}},
        width=\textwidth,
        xtick=data,
        xticklabel style={
            inner sep=0pt,
            anchor=north east,
            rotate=45
        },
        nodes near coords,
        every node near coord/.append style={
            anchor=mid west,
            rotate=45}]
    \addplot[blue, fill=blue!40!white] coordinates {($0$,  14524) ($1000$,  38214) ($2000$,  36169) ($3000$,  25875) ($4000$,  16942) ($5000$,  10603) ($6000$,  6778) ($7000$,  4288) ($8000$,  2980) ($9000$,  1986) ($10000$,  1392) ($11000$,  1040) ($12000$,  801) ($13000$,  632) ($14000$,  467) ($15000$,  3819)};
    \legend{Time}
    \end{axis}
\end{tikzpicture}
\end{document}

which produces

[Image: resulting plot]

My main problem is that the bins are centered around the symbolic ticks. They should instead span the ranges noted above.

edit: Obviously I did not express myself clearly. My problem is that the first bar of the histogram seems to go from -500 to +500, judging by the labels. But it should go from 0 to 1000.

The other problems (overlapping bars, overlapping of the x-label and the x-tick labels) are not part of the question (although it would be nice if I could get pointers on how to fix them).

Best Answer

My suggestion is

[Image: resulting plot]

Code:

\documentclass[margin=10pt]{standalone}
\usepackage{amsmath}
\usepackage{pgfplots}
\pgfplotsset{compat=1.10}

\newcommand\clipright[1][white]{
  \fill[#1](current axis.south east)rectangle(current axis.north-|current axis.outer east);
  \pgfresetboundingbox
  \useasboundingbox(current axis.outer south west)rectangle([xshift=.5ex]current axis.outer north-|current axis.east);
}

\begin{document}
\begin{tikzpicture}
    \begin{axis}[/tikz/ybar interval,
        ybar legend,
        xtick align=outside,
        ymin=0,
        axis x line*=left,
        enlarge x limits=false,
        grid=major,
        height=7cm,
        title={All Results},
        xlabel={recording time $t$ in ms},
        ylabel={Number of Recordings},
        xtick={0,...,16},
        xticklabels={$0$, $1000$, $2000$, $3000$, $4000$, $5000$,
                           $6000$, $7000$, $8000$, $9000$, $10000$,
                           $11000$, $12000$, $13000$, $14000$, $15000$,$\infty$},
        width=\textwidth,
        xtick=data,
        xticklabel style={
            inner sep=0pt,
            anchor=north east,
            rotate=45
        },
        nodes near coords,
        every node near coord/.append style={
            anchor=mid west,
            rotate=45},
            ]
    \addplot[blue, fill=blue!40!white] coordinates {(0,  14524) (1,  38214) (2,  36169) (3,  25875) (4,  16942) (5,  10603) (6,  6778) (7,  4288) (8,  2980) (9,  1986) (10,  1392) (11,  1040) (12,  801) (13,  632) (14,  467) (15,  3819) (16,  3819)};
    \legend{Time}
    \end{axis}
    \clipright
\end{tikzpicture}
\end{document}

Explanation:

Your main problem can be solved by using /tikz/ybar interval instead of /tikz/ybar. In addition, the compat option must be set to at least 1.3 so that the overlap of the xlabel and the xtick labels is avoided automatically. At the time of writing, the current pgfplots version is 1.10.
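To make the effect of /tikz/ybar interval easier to see in isolation, here is a stripped-down sketch. The three bins and their counts are made up for illustration and are not taken from the question. Each bar runs from its own x value to the next one, so three bars need four coordinates.

```latex
\documentclass[margin=10pt]{standalone}
\usepackage{pgfplots}
\pgfplotsset{compat=1.10}

\begin{document}
\begin{tikzpicture}
  \begin{axis}[
      /tikz/ybar interval,
      ymin=0,
      xtick=data,              % one tick per bin edge
      enlarge x limits=false,
    ]
    % Hypothetical data: bins [0,1), [1,2), [2,3).
    % The fourth coordinate only supplies the right edge of the last
    % bar; repeating the previous y value is a convention, the value
    % itself should not show up as a bar.
    \addplot[blue, fill=blue!40!white]
      coordinates {(0, 3) (1, 5) (2, 2) (3, 2)};
  \end{axis}
\end{tikzpicture}
\end{document}
```

With plain /tikz/ybar the same coordinates would give four bars, each centered on its tick, which is the behaviour described in the question.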

[Image: resulting plot]

Code:

\documentclass[margin=10pt]{standalone}
\usepackage{pgfplots}
\pgfplotsset{compat=1.3}

\begin{document}
\begin{tikzpicture}
    \begin{axis}[
        /tikz/ybar interval,
        ybar legend,
        xtick align=outside,
        ymin=0,
        %bar width=0.9cm,
        axis x line*=left,
        enlarge x limits=false,
        grid=major,
        height=7cm,
        title={All Results},
        xlabel={recording time $t$ in ms},
        ylabel={Number of Recordings},
        symbolic x coords={$0$, $1000$, $2000$, $3000$, $4000$, $5000$,
                           $6000$, $7000$, $8000$, $9000$, $10000$,
                           $11000$, $12000$, $13000$, $14000$, $15000$},
        %minor x tick num=5,
        %extra x tick style={xticklabel style={yshift=-15pt}},
        width=\textwidth,
        xtick=data,
        xticklabel style={
            inner sep=0pt,
            anchor=north east,
            rotate=45
        },
        nodes near coords,
        every node near coord/.append style={
            anchor=mid west,
            rotate=45}]
    \addplot[blue, fill=blue!40!white] coordinates {($0$,  14524) ($1000$,  38214) ($2000$,  36169) ($3000$,  25875) ($4000$,  16942) ($5000$,  10603) ($6000$,  6778) ($7000$,  4288) ($8000$,  2980) ($9000$,  1986) ($10000$,  1392) ($11000$,  1040) ($12000$,  801) ($13000$,  632) ($14000$,  467) ($15000$,  3819)};
    \legend{Time}
    \end{axis}
\end{tikzpicture}
\end{document}

But there are some other problems: the last item is missing. This behavior is explained in the manual. In addition, there is a known issue: nodes near coords places the label of the last item outside the axis. To solve these problems, you can add an additional coordinate (infinite, 3819) and use the workaround from this question:

[Image: resulting plot]

\documentclass[margin=10pt]{standalone}
\usepackage{pgfplots}
\pgfplotsset{compat=1.10}

\newcommand\clipright[1][white]{
  \fill[#1](current axis.south east)rectangle(current axis.north-|current axis.outer east);
  \pgfresetboundingbox
  \useasboundingbox(current axis.outer south west)rectangle([xshift=.5ex]current axis.outer north-|current axis.east);
}

\begin{document}
\begin{tikzpicture}
    \begin{axis}[
        /tikz/ybar interval,
        ybar legend,
        xtick align=outside,
        ymin=0,
        axis x line*=left,
        enlarge x limits=false,
        grid=major,
        height=7cm,
        title={All Results},
        xlabel={recording time $t$ in ms},
        ylabel={Number of Recordings},
        symbolic x coords={$0$, $1000$, $2000$, $3000$, $4000$, $5000$,
                           $6000$, $7000$, $8000$, $9000$, $10000$,
                           $11000$, $12000$, $13000$, $14000$, $15000$, infinite},
        width=\textwidth,
        xtick=data,
        xticklabel style={
            inner sep=0pt,
            anchor=north east,
            rotate=45
        },
        nodes near coords,
        every node near coord/.append style={
            anchor=mid west,
            rotate=45}]
    \addplot[blue, fill=blue!40!white] coordinates {($0$,  14524) ($1000$,  38214) ($2000$,  36169) ($3000$,  25875) ($4000$,  16942) ($5000$,  10603) ($6000$,  6778) ($7000$,  4288) ($8000$,  2980) ($9000$,  1986) ($10000$,  1392) ($11000$,  1040) ($12000$,  801) ($13000$,  632) ($14000$,  467) ($15000$,  3819) (infinite,  3819)};
    \legend{Time}
    \end{axis}
    \clipright
\end{tikzpicture}
\end{document}

Unfortunately I wasn't able to use \infty as a symbolic coordinate without getting error messages.

But it works if I use xtick={0,...,16} and xticklabels instead of symbolic x coords. Then I must change the coordinates to

\addplot[blue, fill=blue!40!white] coordinates {(0,  14524) (1,  38214) (2,  36169) (3,  25875) (4,  16942) (5,  10603) (6,  6778) (7,  4288) (8,  2980) (9,  1986) (10,  1392) (11,  1040) (12,  801) (13,  632) (14,  467) (15,  3819) (16,  3819)};

or use a table

\addplot[blue, fill=blue!40!white] table[x expr=\coordindex,y expr=\thisrowno{0},row sep=\\,header=false]{14524\\ 38214\\ 36169\\ 25875\\ 16942\\ 10603\\ 6778\\ 4288\\ 2980\\ 1986\\ 1392\\ 1040\\ 801\\ 632\\ 467\\ 3819\\ 3819\\};

The result is shown at the top of my answer.
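A possible variation, given only as a sketch and not as part of the tested answer above: the x values can also be the real bin edges in ms instead of the indices 0 to 16. It assumes that drawing the open-ended bin [15000, \infty) with the same width as the others is acceptable, so a dummy edge at 16000 is added and labelled $\infty$ by hand. scaled x ticks=false is set so that no common factor is pulled out of the axis. nodes near coords is left out here, so the \clipright workaround is not needed.

```latex
\documentclass[margin=10pt]{standalone}
\usepackage{pgfplots}
\pgfplotsset{compat=1.10}

\begin{document}
\begin{tikzpicture}
  \begin{axis}[
      /tikz/ybar interval,
      ymin=0,
      enlarge x limits=false,
      grid=major,
      width=\textwidth,
      height=7cm,
      xlabel={recording time $t$ in ms},
      ylabel={Number of Recordings},
      xtick=data,                 % one tick per bin edge
      scaled x ticks=false,       % no common factor on the x axis
      % Labels are assigned to the ticks in order; the last edge
      % (the dummy value 16000) is shown as infinity.
      xticklabels={$0$, $1000$, $2000$, $3000$, $4000$, $5000$,
                   $6000$, $7000$, $8000$, $9000$, $10000$,
                   $11000$, $12000$, $13000$, $14000$, $15000$, $\infty$},
      xticklabel style={inner sep=0pt, anchor=north east, rotate=45},
    ]
    % Left bin edges in ms with their counts. The final coordinate
    % (16000, 3819) is an assumed dummy edge that closes the last bar.
    \addplot[blue, fill=blue!40!white] coordinates {
      (0, 14524) (1000, 38214) (2000, 36169) (3000, 25875)
      (4000, 16942) (5000, 10603) (6000, 6778) (7000, 4288)
      (8000, 2980) (9000, 1986) (10000, 1392) (11000, 1040)
      (12000, 801) (13000, 632) (14000, 467) (15000, 3819)
      (16000, 3819)};
  \end{axis}
\end{tikzpicture}
\end{document}
```

Keeping real units on the x axis would make it easier to overlay other plots on the same scale, at the cost of giving the last bin an arbitrary display width.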