To understand this properly, I suggest looking at distributions and at how operations on distributions are defined in the first place.
A distribution is an object that acts linearly and continuously on the space of infinitely differentiable, compactly supported functions (check a textbook or Wikipedia for the precise definition). That is, a distribution $T$ on an open set $\Omega\subset\newcommand{\RR}{\mathbb{R}}\RR^n$ assigns to any infinitely differentiable and compactly supported function $\phi$ defined on $\Omega$ a complex number $T(\phi)$. Then one notes that any locally integrable function $f$ defined on $\Omega$ induces a distribution $T_f$ via the operation
$$T_f(\phi) = \int_\Omega f(x)\phi(x)dx.$$
Now one can try to carry over to distributions, by analogy, the operations that one can perform on locally integrable functions. Take, for example, translation (in the case $\Omega=\RR^n$): define $t_y(f)$ by $t_y(f)(x) = f(x-y)$. Observe that
$$T_{t_y(f)}(\phi) = \int t_y(f)(x)\phi(x)dx = \int f(x-y)\phi(x)dx = \int f(x)\phi(x+y)dx = T_f(t_{-y}(\phi)).$$
That is, "translating the function $f$ is the same as translating the test function $\phi$ in the opposite direction". Hence, one defines the translation of the distribution $T$ as
$$t_y(T)(\phi) := T(t_{-y}\phi).$$
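The change of variables behind this definition can be checked numerically. The following is only a sketch: the functions `bump` (a test function supported on $(-1,1)$), `f` (a discontinuous but locally integrable function) and the midpoint-rule helper `integrate` are illustrative choices, not part of the argument above.

```python
import math

def bump(x):
    """Smooth test function supported on (-1, 1) (illustrative choice)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def f(x):
    """A locally integrable function with a jump at x = 0.5 (illustrative choice)."""
    return abs(x) if x < 0.5 else 2.0

def integrate(g, a, b, n=20_000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

y = 0.3
# T_{t_y f}(phi): translate f by y; phi vanishes outside (-1, 1).
lhs = integrate(lambda x: f(x - y) * bump(x), -1.0, 1.0)
# T_f(t_{-y} phi): keep f and use phi(x + y), which vanishes outside (-1 - y, 1 - y).
rhs = integrate(lambda x: f(x) * bump(x + y), -1.0 - y, 1.0 - y)
print(lhs, rhs)
```

The two printed numbers should agree up to rounding, because the second integral is the first one after the substitution $x \mapsto x - y$.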
(Work out that the translation of the Dirac $\delta$ is what you think it should be.) Now you should be able to do the same thing with scaling, $s_a(f)(x) = f(ax)$.
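For reference, here is one way both computations might go, assuming the usual definition $\delta(\phi) = \phi(0)$ and a real scaling factor $a \neq 0$ on $\mathbb{R}^n$. For the translation,
$$t_y(\delta)(\phi) = \delta(t_{-y}\phi) = (t_{-y}\phi)(0) = \phi(0+y) = \phi(y),$$
so $t_y(\delta)$ is point evaluation at $y$, which is what one usually writes as $\delta(x-y)$. For scaling, the substitution $x \mapsto x/a$ gives
$$T_{s_a(f)}(\phi) = \int f(ax)\phi(x)dx = \frac{1}{|a|^n}\int f(x)\phi(x/a)dx = \frac{1}{|a|^n}T_f(s_{1/a}(\phi)),$$
which suggests the definition $s_a(T)(\phi) := |a|^{-n}\,T(s_{1/a}\phi)$. In particular $s_a(\delta)(\phi) = |a|^{-n}\phi(0)$, i.e. $s_a(\delta) = |a|^{-n}\delta$.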
$\newcommand{\+}{^{\dagger}}
\newcommand{\angles}[1]{\left\langle\, #1 \,\right\rangle}
\newcommand{\braces}[1]{\left\lbrace\, #1 \,\right\rbrace}
\newcommand{\bracks}[1]{\left\lbrack\, #1 \,\right\rbrack}
\newcommand{\ceil}[1]{\,\left\lceil\, #1 \,\right\rceil\,}
\newcommand{\dd}{{\rm d}}
\newcommand{\down}{\downarrow}
\newcommand{\ds}[1]{\displaystyle{#1}}
\newcommand{\expo}[1]{\,{\rm e}^{#1}\,}
\newcommand{\fermi}{\,{\rm f}}
\newcommand{\floor}[1]{\,\left\lfloor #1 \right\rfloor\,}
\newcommand{\half}{{1 \over 2}}
\newcommand{\ic}{{\rm i}}
\newcommand{\iff}{\Longleftrightarrow}
\newcommand{\imp}{\Longrightarrow}
\newcommand{\isdiv}{\,\left.\right\vert\,}
\newcommand{\ket}[1]{\left\vert #1\right\rangle}
\newcommand{\ol}[1]{\overline{#1}}
\newcommand{\pars}[1]{\left(\, #1 \,\right)}
\newcommand{\partiald}[3][]{\frac{\partial^{#1} #2}{\partial #3^{#1}}}
\newcommand{\pp}{{\cal P}}
\newcommand{\root}[2][]{\,\sqrt[#1]{\vphantom{\large A}\,#2\,}\,}
\newcommand{\sech}{\,{\rm sech}}
\newcommand{\sgn}{\,{\rm sgn}}
\newcommand{\totald}[3][]{\frac{{\rm d}^{#1} #2}{{\rm d} #3^{#1}}}
\newcommand{\ul}[1]{\underline{#1}}
\newcommand{\verts}[1]{\left\vert\, #1 \,\right\vert}
\newcommand{\wt}[1]{\widetilde{#1}}$
$$
\mbox{In spherical coordinates,}\quad
\delta\pars{\vec{r}}={\delta\pars{r}\delta\pars{\cos\pars{\theta}}\delta\pars{\phi} \over r^{2}}\quad
\mbox{such that}
$$
\begin{align}
\color{#66f}{\large\int_{{\mathbb R}^{3}}\delta\pars{\vec{r}}\,\dd^{3}\vec{r}}
&=\int_{0^{-}}^{\infty}\dd r\,r^{2}\int_{0}^{\pi}\dd\theta\,\sin\pars{\theta}
\int_{0}^{2\pi}\dd\phi\,{\delta\pars{r}\delta\pars{\cos\pars{\theta}}\delta\pars{\phi} \over r^{2}}
\\[3mm]&=\underbrace{\bracks{\int_{0^{-}}^{\infty}\delta\pars{r}\,\dd r}}
_{\ds{=\ 1}}\
\underbrace{\bracks{%
\int_{0}^{\pi}\delta\pars{\cos\pars{\theta}}\sin\pars{\theta}\,\dd\theta}}
_{\ds{=\ 1}}\
\underbrace{\bracks{\int_{0}^{2\pi}\delta\pars{\phi}\,\dd\phi}}_{\ds{=\ 1}}\
\\[3mm]&=\ \color{#66f}{\Large 1}
\end{align}
$$\mbox{Note that}\quad
\int_{{\mathbb R}^{3}}\delta\pars{\vec{r} - \vec{r}_{0}}\,\dd^{3}\vec{r}
=\int_{{\mathbb R}^{3}}\delta\pars{\vec{r}}\,\dd^{3}\vec{r}
$$
The $\delta$-function is not actually a function; it is a distribution. The idea of finding an integral by summing lots of thin rectangles doesn't work here: this is a more abstract form of integration. In fact, many integrals from quantum mechanics do not converge in the classical sense.
The $\delta$-function has the property that $\delta(x) = 0$ for all $x \neq 0$.
So, for example, $\delta(2) = 0$ and $\delta(-3.4)=0$.
The value of $\delta(0)$ is not well-defined, but we do know that $$\int_{-\infty}^{\infty}\delta(x)~\mathrm dx = 1$$
Since $\delta(x) = 0$ for all $x \neq 0$, the values of $x$ away from zero contribute nothing to the integral:
$$\int_{-\varepsilon}^{\varepsilon} \delta(x)~\mathrm dx = 1$$
for any $\varepsilon > 0$, as small as you like!
If $S$ is some open subset of the real numbers then $$\int_S \delta(x)~\mathrm dx \ \ = \ \ \left\{ \begin{array}{ccc} 1 & : & 0 \in S \\ 0 & : & 0 \notin S \end{array}\right.$$
In fact, you can make even stronger statements, e.g. $S$ doesn't need to be open, but you need to be careful how you word it.
In your example
$$H(x) := \int_{-\infty}^{x} \delta(\tau)~\mathrm d\tau$$ the set $S$ is the interval $(-\infty,x)$. If $x < 0$ then $0 \notin S$ and so $H(x) = 0$ for all $x < 0$. If $x>0$ then $0 \in S$ and so $H(x) = 1$ for all $x > 0$. What happens when $x=0$ depends on whom you speak to.
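One way to get a feel for this is to replace $\delta$ by a narrow bump of unit area and watch what the integral does as the bump shrinks. The sketch below assumes a Gaussian of width `eps` as the approximation (an illustrative choice; `smoothed_step` is a hypothetical helper name), for which the integral from $-\infty$ to $x$ has a closed form in terms of the error function.

```python
import math

def smoothed_step(x, eps):
    """Integral from -infinity to x of a unit-area Gaussian with standard deviation eps."""
    return 0.5 * (1.0 + math.erf(x / (eps * math.sqrt(2.0))))

# As eps shrinks, the values at x < 0 tend to 0 and the values at x > 0 tend to 1.
for eps in (1.0, 0.1, 0.001):
    print(eps, smoothed_step(-0.5, eps), smoothed_step(0.0, eps), smoothed_step(0.5, eps))
```

With this symmetric approximation the value at $x=0$ is $1/2$ for every `eps`. A bump that is not symmetric about $0$ would give a different value there, which is one reason the value of $H(0)$ is a matter of convention.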