It sounds to me like your primary difficulty is in binning the data properly. Please pardon me if this is not your question.
You want the total area of the rectangles comprising the histogram to be 1. If you take rectangle height to be the number of data points in the corresponding bin, then the total area will not be 1. Instead, for bins of equal width, you will have
$$
\text{area}=(\text{bin width})\times(\text{total number of data points}).
$$
The general principle for handling distributions of binned data properly (which even allows you to deal with variable-width bins) is that rectangle area, rather than rectangle height, should be proportional to the number of data points in the bin. For the fixed-width case, the height of each rectangle should be computed as
$$
\text{height}=
\frac{(\text{number of data points in bin})}{(\text{total number of data points})\times(\text{width of bin})}.
$$
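As a rough illustration of the height formula, here is a minimal Python sketch using only the standard library. The function name `density_heights`, the bin-edge convention (half-open bins, with the last bin closed on the right), and the simulated sample are my own assumptions, not part of the question. Because each height is divided by that bin's own width, the same code also covers variable-width bins.

```python
import random

def density_heights(data, edges):
    """Return rectangle heights such that the total histogram area is 1.

    `edges` is a sorted list of bin edges; widths may differ between bins.
    Bins are half-open [lo, hi), except the last, which includes its right edge.
    Assumes every data point lies within [edges[0], edges[-1]].
    """
    n = len(data)
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            lo, hi = edges[i], edges[i + 1]
            is_last = i == len(counts) - 1
            if lo <= x < hi or (is_last and x == hi):
                counts[i] += 1
                break
    # height = count / (total number of points * width of this bin)
    return [c / (n * (edges[i + 1] - edges[i])) for i, c in enumerate(counts)]

random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(1000)]  # illustrative data
lo, hi = min(sample), max(sample)
# 20 equal-width bins; the last edge is set to `hi` exactly to avoid rounding gaps.
edges = [lo + (hi - lo) * k / 20 for k in range(20)] + [hi]
heights = density_heights(sample, edges)
area = sum(h * (edges[i + 1] - edges[i]) for i, h in enumerate(heights))
print(f"total area = {area:.6f}")
```

Summing height times width over all bins recovers the fraction of points that were binned, which is 1 when every point falls inside the edges.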
If you follow this procedure, then, within reason, the resulting distribution should be relatively insensitive to the number of bins you decide to use. Of course, if you use too many bins, most bins will be empty and you'll get a very spiky distribution, which won't be very illuminating. If you use too few bins, the distribution will be too coarse-grained. The Wikipedia page on histograms describes some commonly used rules, such as Sturges' formula, for deciding how many bins to use.
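For reference, Sturges' formula suggests $k = \lceil \log_2 n \rceil + 1$ bins for $n$ data points. A small sketch follows; the function name `sturges_bins` is just an illustrative choice, and the rule is only one of several heuristics, so treat the result as a starting point rather than a prescription.

```python
import math

def sturges_bins(n):
    """Number of bins suggested by Sturges' formula: ceil(log2(n)) + 1."""
    if n < 1:
        raise ValueError("need at least one data point")
    return math.ceil(math.log2(n)) + 1

for n in (50, 500, 5000):
    print(n, "points ->", sturges_bins(n), "bins")
```

The bin count grows only logarithmically with the sample size, which is why this rule tends to give fairly coarse histograms for large samples.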
For testing goodness of fit, you should follow the suggestions in Michael Chernick's answer.
Perhaps it's more evocative if we draw it slightly differently. Think of the single line and horizontal lines as an abstract version of this (Kazimir Malevich would be so proud).
In any case, the further you go in mathematics the lazier mathematicians get in naming things. Shout outs to "admissible", "pseudo / quasi / weak", and the horrors of "normal" and "regular".
Best Answer
Normal probability plots: The main purpose of a normal probability plot (normal Q-Q plot) is to assess normality. Here are plots, each of $n = 500$ observations, from uniform, normal, and Laplace (double-exponential) families, respectively. Only the normal sample shows points along a reasonably straight line in its normal probability plot. Of the three kinds of graphs discussed here (normal probability plots, boxplots, and histograms), a normal probability plot is the most directly relevant to assessing normality.
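To make the "straight line" criterion concrete, here is a minimal standard-library sketch that computes the coordinates of a normal Q-Q plot and summarizes straightness by the correlation between theoretical and observed quantiles. The plotting positions $(i - 0.5)/n$, the helper names `qq_points` and `pearson_r`, the seed, and the particular parameters of the simulated samples are all assumptions of mine; they are not the data behind the plots described above.

```python
import random
from statistics import NormalDist

def qq_points(sample):
    """Return (theoretical, observed) coordinates for a normal Q-Q plot."""
    n = len(sample)
    observed = sorted(sample)
    # Plotting positions (i - 0.5) / n: one common convention, assumed here.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, observed

def pearson_r(xs, ys):
    """Pearson correlation; values near 1 mean the points lie close to a line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

random.seed(1)
n = 500
samples = {
    "uniform": [random.uniform(-1.0, 1.0) for _ in range(n)],
    "normal": [random.gauss(0.0, 1.0) for _ in range(n)],
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    "laplace": [random.expovariate(1.0) - random.expovariate(1.0) for _ in range(n)],
}
straightness = {name: pearson_r(*qq_points(s)) for name, s in samples.items()}
for name, r in straightness.items():
    print(f"{name:8s} Q-Q correlation = {r:.4f}")
```

Plotting `observed` against `theoretical` with any plotting library gives the usual picture; the correlation is only a crude numerical stand-in for judging the plot by eye.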
Boxplots: The major purposes of boxplots are to show quartiles and, if any are present, outliers. The boxplots below are for the same three datasets as above. All three distributions are symmetrical, and their respective boxplots are almost symmetrical. First and third quartiles (ends of boxes) become closer together as we scan from left to right.
In a boxplot, outliers are plotted individually as dots. A uniform distribution has no 'tails', and outliers are rare. A normal distribution has long thin tails, and a boxplot of a moderately large sample will typically show a few outliers (in each tail). A Laplace distribution has heavy tails, and it is rare for a boxplot not to show many outliers.
If a boxplot shows many far outliers or if the whiskers are greatly different in length, then the population from which the sample came is unlikely to be normal. However, boxplots may be the weakest of the three kinds of plots in assessing normality. (They are better at showing that a sample is not normal than at confirming that it is.)
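The outlier behavior described for the three families can be checked numerically. The sketch below assumes the usual Tukey convention, in which a point is flagged as an outlier if it lies more than 1.5 interquartile ranges beyond the nearer quartile; the answer does not state which rule its plots use, so that convention, the function name `boxplot_outliers`, the quartile method, and the simulated samples are all assumptions.

```python
import random
from statistics import quantiles

def boxplot_outliers(data, k=1.5):
    """Return the points lying beyond k * IQR from the quartiles (Tukey fences)."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]

random.seed(2)
n = 500
samples = {
    "uniform": [random.uniform(-1.0, 1.0) for _ in range(n)],
    "normal": [random.gauss(0.0, 1.0) for _ in range(n)],
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    "laplace": [random.expovariate(1.0) - random.expovariate(1.0) for _ in range(n)],
}
outlier_counts = {name: len(boxplot_outliers(s)) for name, s in samples.items()}
for name, count in outlier_counts.items():
    print(f"{name:8s} outliers flagged: {count}")
```

Different software uses slightly different quartile definitions, so the exact counts can differ a little from what a given plotting routine draws.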
Histograms: Below we show histograms of the three samples along with the respective density functions of their populations. Especially for small samples, important information can be lost when data are sorted into histogram bins. Even with our moderately large samples, the shape of the histogram is not necessarily a close match with the shape of the population distribution. Nevertheless, of the three kinds of graphical descriptions, histograms may be second-best (to normal probability plots) for assessing normality.