Given a system of $N$ point-particles with positions
${\bf r}_1, \ldots , {\bf r}_N$; with corresponding virtual displacements $\delta{\bf r}_1$, $\ldots $, $\delta{\bf r}_N$; with momenta ${\bf p}_1, \ldots , {\bf p}_N$; and with applied forces ${\bf F}_1^{(a)}, \ldots , {\bf F}_N^{(a)}$. Then D'Alembert's principle states that
$$\tag{1} \sum_{j=1}^N ( {\bf F}_j^{(a)} - \dot{\bf p}_j ) \cdot \delta {\bf r}_j~=~0. $$
The total force
$${\bf F}_j ~=~ {\bf F}_j^{(a)} +{\bf F}^{(ec)}_j+{\bf F}^{(ic)}_j
+ {\bf F}^{(i)}_j + {\bf F}_j^{(o)}$$
on the $j$'th particle can be divided into five types:
applied forces ${\bf F}_j^{(a)}$ (that we keep track of and that are not constraint forces).
an external constraint force ${\bf F}^{(ec)}_j$ from the environment.
an internal constraint force ${\bf F}^{(ic)}_j$ from the $N-1$ other particles.
an internal force ${\bf F}^{(i)}_j$ (that is not an applied or a constraint force of type 1 or 3, respectively) from the $N-1$ other particles.
Other forces ${\bf F}_j^{(o)}$ not already included in type 1, 2, 3 and 4.
Because of Newton's 2nd law ${\bf F}_j= \dot{\bf p}_j$, D'Alembert's principle (1) is equivalent to$^1$
$$\tag{2} \sum_{j=1}^N ( {\bf F}^{(ec)}_j+{\bf F}^{(ic)}_j+{\bf F}^{(i)}_j+{\bf F}_j^{(o)}) \cdot \delta {\bf r}_j~=~0. $$
So OP's question can essentially be rephrased as
Are there examples in classical mechanics where eq. (2) fails?
Eq. (2) could trivially fail, if we have forces ${\bf F}_j^{(o)}$ of type 5, e.g. sliding friction, that we (for some reason) don't count as applied forces of type 1.
However, OP asks specifically about internal forces.
For a rigid body, to exclude pairwise contributions of type 3, one needs the strong Newton's 3rd law, cf. this Phys.SE answer. So if these forces fail to be collinear, this could lead to violation of eq. (2).
For internal forces of type 4, there is in general no reason that they should respect eq. (2).
Example: Consider a system of two point-masses connected by an ideal spring. This system has no constraints, so there are no restrictions to the class of virtual displacements. It is easy to violate eq. (2) if we count the spring force as a type 4 force.
Reference:
H. Goldstein, Classical Mechanics, Chapter 1.
--
$^1$It is tempting to call eq. (2) the Principle of virtual work, but strictly speaking, the principle of virtual work is just D'Alembert's principle (1) for a static system.
The main difference between "real" displacements and "virtual" displacements arise when the constraint itself is time-dependent.
For example, you have an incline that is accelerating (I am saying accelerating because a moving side can be cancelled with galilei transform, accelerating one cannot; but you should imagine simply a moving incline). Then because the slide will move, any instantenous (!!!!) real displacement $d\mathbf{r}$ will not be tangential to the incline. It will consist of a $\delta\mathbf{r}$ displacement that is tangential along the incline, and a displacement $\Delta\mathbf{r}$ caused by the motion of the incline, so $d\mathbf{r}=\delta\mathbf{r}+\Delta\mathbf{r}$.
The $\delta\mathbf{r}$ here is what is called a virtual displacement. Because it doesn't take into account the time-evolution of the constraint (the motion of the incline), it can be calculated by "freezing time" and in the frozen landspace, you evaluate the possible displacements of the system that are allowed by the contraints (eg. in this case, tangent to the incline), hence the Calkin definition.
Best Answer
Well, I think the book is not sufficiently clear. In Classical Mechanics without constraints, everything reduces to solve a system of differential equations of the form: $$\frac{d^2 \vec{x}}{dt^2}= \vec{G}\left(t, \vec{x}(t), \frac{d \vec{x}}{dt}(t)\right)\tag{1}$$ with given initial conditions $$\vec{x}(t_0) = \vec{x}_0\:,\quad \frac{d \vec{x}}{dt}(t) = \vec{v}_0\:.\tag{2}$$ where $\vec{x}=\vec{x}(t) \in \mathbb R^{3N}$ encompasses all the positions of the $N$ points of the system at time $t$, $$\vec{x} = (\vec{x}_1, \ldots, \vec{x}_N)\:.$$ Above, $\vec{x}_i$ is the position vector of the $i$-th point in the rest 3-space of a reference frame. The masses $m_1,\ldots, m_N$ of the points have been embodied in the function $\vec{G}= \vec{G}(t, \vec{x}, \vec{v})$.
Since (1) is in normal form, if the function $\vec{G}$ is known and is sufficiently regular, say $C^2$, (it would be enough jointly continuous in all variables and locally Lipschitz in $(\vec{x}, \vec{v})$), then the Cauchy problem (1)+(2) admits a unique (maximal) solution in a (maximal) interval including $t_0$.
Physically speaking the functional form of $\vec{G}$ is known when it is a superposition of (classically) fundamental forces, like the gravitational one, Lorentz force, etc...
However life is not so easy because there are practical and usual situations where we do not know the functional form of some of the forces superposed to produce $\vec{G}$. This is the case of geometrically constrained points, where, instead of the functional form of the force necessary to make the constraint satisfied, only the analytic equation of the constraint is provided. In this case (1) has to be replaced by
$$\frac{d^2 \vec{x}_i}{dt^2}= \frac{1}{m_i}\vec{F}_i\left(t, \vec{x}(t), \frac{d \vec{x}}{dt}(t)\right)+ \vec{\phi}_i(t)\tag{1'}\quad i=1,\ldots, N$$ where $\vec{\phi}_i(t)$ is the force on the $i$-th point due to the geometric constraint at time $t$, whose functional form is completely unknown and therefore it is a further unknown of the problem exactly as the functions $\vec{x}_i$.
A closer microscopic scrutiny would reveal that $\phi_i$ is of electrical nature, but it is irrelevant here. They also can be mathematically emulated by means of a suitable functional forms, e.g., of conservative forces, taking the form of the constraints into account. However, barring particular cases, these approaches do not make easier the solution of the problem of the constrained motion.
(1)' has to be accompanied by a set of equations describing the constraints $$f_j(t, \vec{x}_1, \ldots, \vec{x}_N)=0, \quad j=1,\ldots, c\tag{1''}$$ with $3N-c >0$ and the functions $f_j$ have to be functionally independent (I will not insist here on this point).
The set of requirements (1')+(1'')+(2) (where (2) must be compatible with the constraints) are generally not yet enough to determine a unique solutions $\vec{x}_i= \vec{x}_i(t)$ together with the values of the functions $\vec{\phi}_i= \vec{\phi}_i(t)$ along every motion of the system.
The remaining necessary information is given by a constitutive equation about the constraints. This is a non-geometrical relation involving the unknown forces $\vec{\phi}_i$. You should know several cases. The most important are two. The standard friction relation about components of the $\vec{\phi}_i$ involving friction coefficients $\mu$ and the D'Alembert's characterization of the ideal constraints. The latter has as the simplest case a frictionless constraint, but also includes the rigidity constraint and the integrable rolling constraint.
The problem consisting of (1')+(1'')+(2)+D'Alembert's characterization of the ideal constraints always admits a unique solution
$$\vec{x}_i= \vec{x}_i(t)\:, \quad \vec{\phi}_i=\vec{\phi}_i(t)\:, i=1,\ldots, N$$
provided all known involved functions are sufficiently regular. This is the main result of Lagrangian formulation of classical mechanics. As a byproduct, the equation of motion are written using just $3N-c$ coordinates (the $c$ constraints as well as the unknown forces $\vec{\phi}_i$ disappear from the Lagrange equations) which can be chosen with a very great arbitrariness as you probably already know.
Addendum 1.
Regarding the charcterization of D'Alembert constarints it is like this. $$\sum_{i=1}^N \vec{\phi}_i \cdot \delta \vec{x}_i =0 \tag{D}$$
where everything is evaluated at every fixed time $t$, on every permitted configuration $\vec{x}$ and each vector $$\delta \vec{x} = (\delta\vec{x}_1, \ldots, \delta\vec{x}_N)$$ is every tangent vector to the submanifold of $\mathbb R^{3N}$ including all permitted configurations at time $t$. If we introduce local free coordinates $q^1,\ldots, q^n$ ($n= 3N-c$) on this manifold, we have $$ \delta \vec{x}_i := \sum_{k=1}^n\frac{\partial \vec{x}_i}{\partial q^k} \delta q^k \tag{E}$$ for every choice of the numbers $\delta q^k \in \mathbb R$. $\delta \vec{x}$ is sometimes called virtual displacement.
(The notation is horrible) $\delta q^k$ are not small, they are arbitrary. I know that some books write "infinitesimal", but it does not mean anything. The right-hand side of (E) (with $i$ fixed) is a generic tangent vector to the manifold of the constraints, the $\delta q^k$ are the generic scalar components of that vector with respect to the basis made of the $n$ vectors $\frac{\partial \vec{x}_i}{\partial q^1}\:, \ldots,\frac{\partial \vec{x}_i}{\partial q^n}$. These vectors are linearly independent because the constraints are assumed to be functionally independent. (D) Says that the set of reactive forces $\phi_i$ is always orthogonal to the manifold of constraints in a generalized sense. This is the geometric meaning of the ideal contraints.
It is possible to prove that (D) includes the case of a frictionless geometric constraints (also when the geometric curves and surfaces change their shape in time), the rigidity constraint, any mixing of these two cases. The case of rolling constraint is also included provided it is integrable, i.e., it can be re-stated into a pure geometric fashion as it happens for a disk rolling on a fixed path. But it is not the case for a sphere rolling on a plane.
Addendum 2.
Regarding the existence and uniqueness of the solution of the arising equations of motion. First of all, (D) and (E) permit us to restate everything in terms of free local coordinates $q^1,\ldots, q^n$. The obtained equations of motion are the known general Euler-Lagrange equations $$\frac{d}{dt} \frac{\partial T}{\partial \dot{q}} - \frac{\partial T}{\partial q} = Q_k\tag{EL1}$$ $$\frac{dq^k}{dt} = \dot{q}^k\tag{EL2}$$ where, for $k=1,\ldots, n$ $$Q_k(t,q, \dot{q}) := \sum_{i=1}^N \vec{F}_i \cdot \frac{\partial \vec{x}_i}{\partial q^k} $$ and $$T(t,q,\dot{q}) = \sum_{i=1}^N \frac{1}{2} m_i \left(\frac{d\vec{x}_i}{dt}\right)^2\:.$$ It is possible to prove that, under our hypotheses of ideal geometric constraints $$T(t,q,\dot{q}) = \sum_{k,h=1}^m a_{hk}(t,q) \dot{q}^h \dot{q}^k + \sum_{k=1}^m b_{k}(t,q) \dot{q}^k + c(t,q)$$ where the matrix $[a_{hk}]$ is non-singular and positive. As a matter of fact
$$\begin{align} a_{hk}(t,q) &= \sum_{i=1}^N \frac{1}{2} m_i \frac{\partial \vec{x}_i}{\partial q^h} \cdot \frac{\partial \vec{x}_i}{\partial q^k} \\ b_{k}(t,q) &= \sum_{i=1}^N m_i \frac{\partial \vec{x}_i}{\partial q^k} \cdot \frac{\partial \vec{x}_i}{\partial t} \\ c(t,q) &= \sum_{i=1}^N \frac{1}{2} m_i \frac{\partial \vec{x}_i}{\partial t} \cdot \frac{\partial \vec{x}_i}{\partial t} \end{align}$$
Using this fact, a complicated computation proves that (EL1)+(EL2) reduces to a system of the from $$\frac{d^2 q^k}{dt^2} = G^k\left(t, q, \frac{d q}{dt} \right) \quad k=1,\ldots$$
This is a system of differential equations of the second order written into its normal form (all highest order derivatives appear separated from the other variables). If the right-hand side is sufficiently regular (as is the case if the constraints and the functional form of the known forces are smooth), the general theorem of existence and uniqueness of the solutions for initial conditions $q^k(t_0)$, $\frac{dq^k}{dt}|_{t_0}$ applies.