It's definitely doable. Let's consider a simpler example first: let $X=[0,1]$, and let $A=\{0\}$.

You can retract $X\times I$ (a square) to $(X\times\{0\})\cup(A\times I)$ (the union of the "bottom" and "left" sides of the square) by projecting each point along the ray from $(2,2)$:

To move this intuition to your example of $X=$ a disk and $A=$ a smaller disk inside $X$, just "swing this around" (as one would to form a solid of revolution) and leave the interior of $A$ alone.

For fun:

PlotACylinder[RadiusOfA_, Height_, theta_, u_] :=
{RadiusOfA*Cos[theta], RadiusOfA*Sin[theta], Height*u}
PlotATop[RadiusOfA_, Height_, theta_, u_] :=
{RadiusOfA*u*Cos[theta], RadiusOfA*u*Sin[theta], Height}
PlotX[RadiusOfX_, theta_, u_] :=
{RadiusOfX*u*Cos[theta], RadiusOfX*u*Sin[theta], 0}
PlotTopSurface[RadiusOfA_, RadiusOfX_, Height_, t_, theta_, u_] :=
Module[{x, y},
x = RadiusOfA + (RadiusOfX - RadiusOfA) u;
y = 2 Height*(1 - (2 RadiusOfX - 2 RadiusOfA)/(2 RadiusOfX - RadiusOfA - x))
+ Height*(2 RadiusOfX - 2 RadiusOfA)/(2 RadiusOfX - RadiusOfA - x);
{(x (1 - t) + RadiusOfA*t)*Cos[theta], (x (1 - t) + RadiusOfA*t)*Sin[theta],
Height (1 - t) + y*t}]
PlotSideSurface[RadiusOfA_, RadiusOfX_, Height_, t_, theta_, u_] :=
Module[{x, y},
y = Height*u;
x = (2 RadiusOfX - RadiusOfA)*(1 - (2 Height/(2 Height - y)))
+ RadiusOfX (2 Height/(2 Height - y));
{(RadiusOfX (1 - t) + x*t)*Cos[theta], (RadiusOfX (1 - t) + x*t)*Sin[theta],
y (1 - t)}]
PlotRetract[RadiusOfA_, RadiusOfX_, Height_, t_] := ParametricPlot3D[
{PlotACylinder[RadiusOfA, Height, theta, u],
PlotATop[RadiusOfA, Height, theta, u],
PlotX[RadiusOfX, theta, u],
PlotTopSurface[RadiusOfA, RadiusOfX, Height, t, theta, u],
PlotSideSurface[RadiusOfA, RadiusOfX, Height, t, theta, u]},
{theta, 0, 2 Pi}, {u, 0, 1}, Mesh -> None, Axes -> None,
Boxed -> False, PlotPoints -> 30,
Lighting -> {{"Directional", White, {{1, 1, 1}, {0, 0, 0}}}},
PlotStyle -> {Gray, Gray, Gray, Directive[Blue, Opacity[0.5]],
Directive[Blue, Opacity[0.5]]}]
Export["animation.gif", Table[PlotRetract[1, 3, 4, Max[0, t]],
{t, -0.1, 0.98, 0.02}], "DisplayDurations" -> {0.125}]

## Best Answer

Let's look at Hatcher's definition:

Thus we Hatcher says that for a map $f_0:X \to Y$ with a subspace $A \subset X$ we are given are homotopy $f_t:A \to Y$ of $f_0|A$ this simply means that there is some other map $f_1:A \to Y$ that is homotopic to $f_0|A$. The homotopy extension property then tells us when this homtopy extends to the whole of $X$ (not just the subspace $A \subset X$)

There are plenty of examples. For example Proposition 0.16 gives a useful one - a CW-pair $(X,A)$ has the homotopy extension property.

This will be especially useful when you learn about cofibrations.