By definition, $$g(x)=x-\frac{f(x)}{f'(x)}.$$ So
$$
g'(x)=1-\frac{f'(x)f'(x)-f(x)f''(x)}{f'(x)^2}=\frac{f(x)f''(x)}{f'(x)^2}.
$$
Now $r$ is chosen so that $f(r)=0$, so the numerator above is zero: thus, $g'(r)=0$.
Your second point is valid, and indeed $\xi$ will depend on $n$. But if you know that $g''$ is continuous, then, since all the $\xi$ lie inside a closed bounded interval, $g''$ is bounded on that interval. So, while the factor $g''(\xi)$ in the error is not necessarily constant, it is bounded by a constant.
For your third question, if $r$ is a "root of multiplicity $\delta$", it means that $$ 0=f(r)=f'(r)=\cdots=f^{(\delta-1)}(r).$$ So all of those terms in the Taylor expansion vanish.
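A quick numerical illustration of the two regimes (a Python sketch; the test functions are my own toy choices): at a simple root Newton's method roughly squares the error each step, while at a multiple root it only shrinks the error by a constant factor.

```python
# Sketch: Newton's method g(x) = x - f(x)/f'(x), iterated a few times.
def newton(f, fprime, x, steps):
    for _ in range(steps):
        x = x - f(x) / fprime(x)
    return x

# Simple root: f(x) = x^2 - 2 at sqrt(2); quadratic convergence,
# 5 steps from 1.0 already give full double precision.
x = newton(lambda t: t * t - 2, lambda t: 2 * t, 1.0, 5)

# Double root: f(x) = x^2 at 0; here the iteration is exactly
# x_{n+1} = x_n / 2, i.e. merely linear convergence with rate 1/2.
y = newton(lambda t: t * t, lambda t: 2 * t, 1.0, 5)
print(x, y)  # y is 1/32 = 0.03125 after 5 halvings
```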
A best approximation of a real number $r$ is a rational fraction $a/b$ with $b>0$ such that for every rational fraction $c/d$ with $d \le b$ and $c/d \ne a/b$,
$$\left| r - \frac{a}{b}\right| < \left|r - \frac{c}{d}\right|.$$
Theorem: Every best approximation of a number $r$ is either a convergent or an intermediate fraction of the continued fraction representing $r$ (if you include the "$-1$'th order" convergent $1/0$).
For example: if $r = \pi$, the continued fraction representation starts
$3 + 1/(7 + 1/(15 + 1/(1 + ...)))$. The first few convergents are
$$\frac{1}{0}, \frac{3}{1}, \frac{22}{7}, \frac{333}{106}, \frac{355}{113}$$
The intermediate fractions between $1/0$ and $22/7$ are $$\frac{4}{1}, \frac{7}{2}, \frac{10}{3}, \frac{13}{4}, \frac{16}{5}, \frac{19}{6}$$ The intermediate
fractions between $3/1$ and $333/106$ are
$$ \frac{25}{8}, \frac{47}{15}, \frac{69}{22}, \frac{91}{29}, \frac{113}{36}, \ldots,
\frac{311}{99}$$
The first few best approximations of $\pi$ are
$$\frac{3}{1}, \frac{13}{4}, \frac{16}{5}, \frac{19}{6}, \frac{22}{7}, \frac{179}{57},
\frac{201}{64}, \frac{223}{71}, \frac{245}{78}, \frac{267}{85}, \frac{289}{92}, \frac{311}{99}, \frac{333}{106}, \frac{355}{113}$$
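The recurrences behind these tables are short to code. Here is a Python sketch (using the standard convergent recurrence $p_k = a_k p_{k-1} + p_{k-2}$, $q_k = a_k q_{k-1} + q_{k-2}$, seeded with the $1/0$ convergent; the helper names are my own) that reproduces the convergents and intermediate fractions above:

```python
from fractions import Fraction

def convergents(coeffs):
    """Convergents p_k/q_k of the continued fraction [a0; a1, a2, ...]."""
    p_prev, q_prev = 1, 0          # the "-1'th" convergent 1/0
    p, q = coeffs[0], 1
    result = [Fraction(p, q)]
    for a in coeffs[1:]:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        result.append(Fraction(p, q))
    return result

def intermediates(p_prev, q_prev, p, q, a_next):
    """Intermediate fractions (k*p + p_prev)/(k*q + q_prev), k = 1..a_next-1."""
    return [Fraction(k * p + p_prev, k * q + q_prev) for k in range(1, a_next)]

# pi = [3; 7, 15, 1, ...]
cs = convergents([3, 7, 15, 1])
print([str(c) for c in cs])                        # ['3', '22/7', '333/106', '355/113']
print([str(m) for m in intermediates(1, 0, 3, 1, 7)])  # 4/1, 7/2, ..., 19/6
```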
In general, there is no relation between the absolute value of the error $\epsilon_n$ and the number $\delta_n$!
For example, consider a function $f : \mathbb{R} \rightarrow \mathbb{R}$ which, far away from the desired root, is positive and decays like $\exp(-\lambda x)$ as $x \rightarrow \infty$, where $\lambda > 0$. If the user has picked an initial guess far from the root and inside this bad region, then Newton's method essentially reduces to $x_{n+1} = x_n + \frac{1}{\lambda}$. If $\lambda$ is sufficiently large, then your routine will signal convergence, despite the fact that you are moving away from the root! A good routine should also monitor the residual, i.e. the absolute value of $f(x_n)$, but that will not save you in this case, as $f(x_n) \rightarrow 0$.
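A tiny Python sketch of this failure mode (the value $\lambda = 10^4$ and the starting point are my own toy choices): on $f(x) = e^{-\lambda x}$ the Newton step is exactly $1/\lambda$, small enough to fool a pure step-size test, and the residual is astronomically small as well, even though there is no root at all.

```python
import math

# Toy failure mode: f(x) = exp(-lam * x) has no root, yet both a
# step-size test and a residual test look like "convergence".
lam = 1e4
f = lambda x: math.exp(-lam * x)
fp = lambda x: -lam * math.exp(-lam * x)

x = 0.05                       # deep in the flat tail, far from any root
step = -f(x) / fp(x)           # the Newton step; equals 1/lam here
x_next = x + step

small_step = abs(step) < 1e-3            # naive stopping test: passes
small_residual = abs(f(x_next)) < 1e-12  # residual test: also passes
print(step, small_step, small_residual)
```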
If you want a truly robust code, then you should strive to maintain a bracket, i.e. an interval $[a,b]$ for which you are certain that $f(a)$ and $f(b)$ have different signs. You should look into hybrid methods, such as a combination of the secant method and the bisection method.
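As a concrete sketch of such a hybrid (an illustrative toy, not a production solver; the test function and tolerances are my own choices), one can take a Newton step but fall back to bisection whenever the step would leave the bracket:

```python
# Sketch: Newton's method safeguarded by a bracket [a, b] with a sign
# change; whenever the Newton step leaves the bracket, bisect instead.
def safe_newton(f, fprime, a, b, tol=1e-12, max_iter=100):
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "need a bracket: f(a) and f(b) of opposite signs"
    x = 0.5 * (a + b)
    for _ in range(max_iter):
        fx = f(x)
        # shrink the bracket using the sign of f(x)
        if fa * fx < 0:
            b, fb = x, fx
        else:
            a, fa = x, fx
        x_new = x - fx / fprime(x)    # tentative Newton step
        if not (a < x_new < b):       # step left the bracket: bisect
            x_new = 0.5 * (a + b)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Cube root of 2 via f(x) = x^3 - 2 on the bracket [0, 2].
root = safe_newton(lambda t: t ** 3 - 2, lambda t: 3 * t * t, 0.0, 2.0)
print(root)
```

Because every iterate keeps $f(a)f(b) < 0$, this iteration cannot wander off to infinity the way the pure Newton iteration in the example above does.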
If you can decisively say that you are close to a root, then you can approximate the error as $x - x_{n} \approx -\frac{f(x_n)}{f'(x_n)}$. This estimate is, however, worthless unless you are close to a root, so look for a bracket.
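For completeness, a small Python check of this error estimate (my own toy example, near the root $\sqrt{2}$ of $f(x) = x^2 - 2$): close to a simple root, the Newton correction $-f(x)/f'(x)$ tracks the true remaining error closely.

```python
import math

# Near a simple root r, the Newton correction -f(x)/f'(x) approximates
# the remaining error r - x.
f = lambda t: t * t - 2          # root r = sqrt(2)
fp = lambda t: 2 * t

x = 1.4                          # already close to the root
estimate = -f(x) / fp(x)         # Newton-based error estimate
true_error = math.sqrt(2) - x
print(estimate, true_error)      # both about 0.014
```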