It doesn't differ - this is just the subspace topology on $X$.
Given a topological space $A$ and a subset $B\subseteq A$, we can give $B$ the subspace topology.
Let us suppose $B$ is a closed subset of $A$.
Because we have made $B$ into a topological space via the subspace topology, we speak of subsets of $B$ being open or closed.
By the definition of the subspace topology, $C\subseteq B$ is open in $B$ if and only if there exists some $E\subseteq A$ that is open in $A$ and satisfies $E\cap B = C$.
By the definition of "closed", the complement of $E$ in $A$, namely $A\setminus E$, is a closed subset of $A$.
Because $A\setminus E$ and $B$ are both closed subsets of $A$, their intersection $(A\setminus E)\cap B$ is also a closed subset of $A$.
Their intersection is equal to $B\setminus (B\cap E)=B\setminus C$.
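Conversely, if $B\setminus C$ is closed in $A$, then $E=A\setminus(B\setminus C)$ is open in $A$ and $E\cap B=C$, so $C$ is open in $B$.
Thus, when $B$ is closed in $A$, a subset $C\subseteq B$ is open in $B$ if and only if $B\setminus C$ is a closed subset of $A$.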
In this case, $A=\mathbb{A}^n$ is affine space with the Zariski topology, $B=X$, and $C=U$.
The quote from Shafarevich ends with "$X\setminus U$ is closed", which is ambiguous on its own because it does not say which topological space the set is closed in. It is implicitly meant to be closed in the ambient space, i.e. $A$.
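To see the role of the hypothesis that $B$ is closed, here is a minimal sketch on a finite space. The four-point space and its topology are hypothetical choices made only for illustration; the check compares "open in $B$" with "complement in $B$ is closed in $A$" for every subset of $B$.

```python
from itertools import combinations

def powerset(s):
    """All subsets of s, as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# A hypothetical four-point space A with a hand-picked (nested) topology.
A = frozenset({1, 2, 3, 4})
opens_A = {frozenset(), frozenset({1}), frozenset({1, 2}),
           frozenset({1, 2, 3}), A}
closed_A = {A - E for E in opens_A}

def subspace_opens(B):
    """Open sets of B in the subspace topology: all E & B with E open in A."""
    return {E & B for E in opens_A}

def check(B):
    """True if 'C open in B' matches 'B minus C closed in A' for every C in B."""
    opens_B = subspace_opens(B)
    return all((C in opens_B) == ((B - C) in closed_A) for C in powerset(B))

B_closed = frozenset({3, 4})      # complement {1, 2} is open, so B is closed in A
B_not_closed = frozenset({1, 2})  # complement {3, 4} is not open in A

print(check(B_closed))       # True
print(check(B_not_closed))   # False
```

The second call fails already at $C=\emptyset$: it is open in $B$, but $B\setminus C=B$ is not closed in $A$. So the equivalence really does depend on $B$ being closed.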
To appreciate the Zariski topology it helps to have a fairly broad view about what a topological space is. Topological spaces in full generality are, confusingly, not very topological in the naive sense! As discussed in this math.SE question, I think it is better to think of point-set topology as being about semidecidable properties (which are the open sets). The familiar kind of topology induced by a metric is about the specific property of being close in a metric sense, but other kinds of topologies are about different kinds of properties.
The Zariski topology is about the property of non-vanishing of polynomials. The semidecidable properties here are the properties "at least one polynomial in this set does not vanish here." Intuitively speaking, the reason this is semidecidable is that you can compute the value of a polynomial at a point to finite precision, and once you show that it is sufficiently different from zero it cannot be zero.
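The finite-precision intuition can be sketched with interval arithmetic over the real numbers. Everything here is an illustrative assumption rather than a standard API: a point is modelled as a function `approx(k)` returning a rational within $2^{-k}$ of the true value, and the names `certify_nonvanishing` and `sqrt2` are made up for this sketch. The procedure can confirm non-vanishing after finitely many steps, but it never confirms vanishing.

```python
from fractions import Fraction
from math import isqrt

def interval_mul(a, b):
    """Product of two closed intervals given as (lo, hi) pairs."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))

def eval_on_interval(coeffs, x):
    """Horner evaluation on the interval x; the result encloses f(t) for all t in x.

    coeffs lists the coefficients from highest degree to lowest.
    """
    acc = (Fraction(coeffs[0]), Fraction(coeffs[0]))
    for c in coeffs[1:]:
        lo, hi = interval_mul(acc, x)
        acc = (lo + c, hi + c)
    return acc

def certify_nonvanishing(coeffs, approx, max_k=30):
    """Return a precision k at which f is certified nonzero at the point, else None.

    None means 'no certificate found up to max_k', not 'f vanishes'.
    """
    for k in range(max_k + 1):
        q = approx(k)
        eps = Fraction(1, 2 ** k)
        lo, hi = eval_on_interval(coeffs, (q - eps, q + eps))
        if lo > 0 or hi < 0:
            return k
    return None

def sqrt2(k):
    """A rational within 2**-k of sqrt(2), via integer square roots."""
    return Fraction(isqrt(2 << (2 * k)), 2 ** k)

# x^2 - 3 does not vanish at sqrt(2): some finite precision certifies this.
print(certify_nonvanishing([1, 0, -3], sqrt2) is not None)
# x^2 - 2 does vanish at sqrt(2): no finite precision ever certifies anything.
print(certify_nonvanishing([1, 0, -2], sqrt2))
```

The asymmetry between the two calls is the point: non-vanishing is an open condition, vanishing is a closed one.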
The fact that the Zariski topology isn't Hausdorff isn't a weird property of the Zariski topology; it tells you something important about how vanishing of polynomials behaves, namely that the behavior of a polynomial on a few points can tell you a lot about its behavior at seemingly far-away points. This is intrinsic to the nature of algebraic geometry and pretending that the Zariski topology doesn't exist won't make it go away.
Okay, so what can you actually do with it? Here are a few things:
- If two polynomials agree on a Zariski-dense subset, then they agree identically. This is a surprisingly useful way to prove polynomial identities; for example, it can famously be used to prove the Cayley-Hamilton theorem.
- Moving to the Zariski topology on schemes allows the use of generic points. I am not familiar with examples of this technique in use though.
- Serre famously made use of the Zariski topology to introduce sheaf cohomology to algebraic geometry, which was (as I understand it) a crucial innovation.
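As an illustration of the density argument in the first item, here is a sketch of the Cayley-Hamilton proof. It assumes an algebraically closed field $K$ (an assumption added here for convenience) and identifies $M_n(K)$ with $\mathbb{A}^{n^2}$. Write $p_A(t)=\det(tI-A)$ and let $D$ be the set of matrices whose characteristic polynomial has distinct roots:

$$ D=\{A\in M_n(K)\mid \operatorname{disc}(p_A)\neq 0\}. $$

The discriminant is a polynomial in the entries of $A$, so $D$ is Zariski-open. It is non-empty because it contains a diagonal matrix with distinct entries, and since $\mathbb{A}^{n^2}$ is irreducible, $D$ is dense. Every $A\in D$ is diagonalizable, say $A=P\Lambda P^{-1}$ with $\Lambda=\operatorname{diag}(\lambda_1,\dots,\lambda_n)$, so

$$ p_A(A)=P\,p_A(\Lambda)\,P^{-1}=P\,\operatorname{diag}\bigl(p_A(\lambda_1),\dots,p_A(\lambda_n)\bigr)\,P^{-1}=0. $$

Each entry of $p_A(A)$ is a polynomial in the entries of $A$ that vanishes on the dense set $D$, hence vanishes identically, which gives $p_A(A)=0$ for every $A\in M_n(K)$.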
To really appreciate the Zariski topology it helps to generalize it to arbitrary commutative rings. An important motivational example: if $X$ is a compact Hausdorff space and $C(X)$ is the ring of continuous functions $X \to \mathbb{R}$, then the maximal spectrum of $C(X)$ not only can be identified with $X$, but has the same topology! (This is an exercise in Atiyah-MacDonald.)
The rings one gets in this way are precisely the rings of self-adjoint elements of commutative unital complex C*-algebras, by the commutative Gelfand-Naimark theorem, and in fact you get a (contravariant) equivalence of categories. Moreover, by the Serre-Swan theorem, the category of real vector bundles on $X$ is naturally equivalent to the category of finitely-generated projective modules over $C(X)$.
It helps to think about this example like a physicist. Think of $X$ as the set of possible states of some physical system and the elements of $C(X)$ as observations one can make about the system; the value of a function at a point is the result of the observation in a fixed state. The Zariski topology here captures all semidecidable properties that you can decide using the observations in $C(X)$. For example, if one of the functions in $C(X)$ is called "temperature," there is a corresponding semidecidable property "the temperature of the system is strictly between $0$ and $100$ degrees," which you can confirm by computing the temperature to finite precision. (The interval has to be open: a temperature of exactly $100$ degrees could never be told apart from one slightly above $100$ by a finite-precision measurement.)
(What if $X$ is not compact? Then if you work with the ring $C_b(X)$ of bounded continuous functions on $X$, there are consistent sets of possible values of the observables which do not arise from an actual state of your system; they are points in the Stone-Čech compactification $\beta X$ instead.)
Here's another example that I like: let $B$ be a Boolean ring, which is a ring satisfying $b^2 = b$ for all $b \in B$. Then every element of $B$ can be identified with a subset of its maximal spectrum. This idea can be used to
For a discussion, see my blog post Boolean rings, ultrafilters, and Stone's representation theorem.
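The identification of elements with subsets of the maximal spectrum can be checked by brute force on a small case. The sketch below takes the Boolean ring of all subsets of a three-element set, with symmetric difference as addition and intersection as multiplication; this particular ring and the helper name `support` are choices made here for illustration only.

```python
from itertools import combinations

S = frozenset({0, 1, 2})

def subsets(items):
    """All subsets of the given collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

ring = subsets(S)            # the 8 elements of the Boolean ring
zero, one = frozenset(), S   # additive and multiplicative identities

def is_ideal(I):
    """Contains zero, closed under addition (^), absorbs multiplication (&)."""
    return (zero in I
            and all((a ^ b) in I for a in I for b in I)
            and all((r & a) in I for r in ring for a in I))

# Brute-force search over all 2**8 subsets of the ring.
proper_ideals = [I for I in subsets(ring) if is_ideal(I) and one not in I]
maximal_ideals = [m for m in proper_ideals
                  if not any(m < J for J in proper_ideals)]

def support(b):
    """The subset of the maximal spectrum attached to b: ideals not containing b."""
    return frozenset(m for m in maximal_ideals if b not in m)

print(len(maximal_ideals))                       # 3, one per point of S
print(len({support(b) for b in ring}))           # 8, so b -> support(b) is injective
print(all(support(a & b) == support(a) & support(b)
          for a in ring for b in ring))          # True
print(all(support(a ^ b) == support(a) ^ support(b)
          for a in ring for b in ring))          # True
```

In this example multiplication becomes intersection and addition becomes symmetric difference of the attached subsets, so the ring is recovered as an algebra of subsets of its maximal spectrum.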
Best Answer
Let $Z\subset\mathbb{A}^n=X$ be a closed subset; by definition $$ \exists f_1,\dots,f_r\in\mathbb{K}[x_1,\dots,x_n]=R\mid Z=V(f_1,\dots,f_r)=\bigcap_{i=1}^rV(f_i). $$ Since $Z$ is a closed subset of $V(f_1)$, and a closed subset of a quasi-compact space is quasi-compact, without loss of generality we can assume $r=1$, and we put $f_1=f$ so that $Z=V(f)$. Because $R$ is a U.F.D., \begin{gather*} \exists g_1,\dots,g_s\in R\mid f=g_1\cdot\dots\cdot g_s,\,g_j\,\text{is a prime polynomial}\\ Z=V(g_1\cdot\dots\cdot g_s)=\bigcup_{j=1}^sV(g_j), \end{gather*} and since a finite union of quasi-compact sets is quasi-compact, without loss of generality we can assume $s=1$; in other words, $f$ is a prime polynomial and $Z$ is an irreducible closed subset of $X$.
Let $\{U_i\}_{i\in I}$ be an open covering of $Z$, to be exact: $$ Z\subseteq\bigcup_{i\in I}U_i. $$ Since the principal open sets $D(g)$, $g\in R$, form a basis of the topology, we can assume (without loss of generality) that each $U_i$ is a principal open set: \begin{gather*} \forall i\in I,\exists f_i\in R\mid U_i=D(f_i)=X\setminus V(f_i),\\ \bigcup_{i\in I}U_i=\dots=X\setminus\bigcap_{i\in I}V(f_i)=X\setminus W. \end{gather*} We know that $W$ is a closed subset of $X$, namely $W=V(J)$, where $J=(f_i\mid i\in I)$ is the ideal generated by the $f_i$. By Hilbert's basis theorem $J$ is finitely generated, and since each generator is a finite $R$-linear combination of the $f_i$, finitely many of the $f_i$ already generate $J$: \begin{gather*} \exists I_F\subseteq I\,\text{finite}\mid J=(f_i\mid i\in I_F)\Rightarrow\\ \Rightarrow Z\subseteq\bigcup_{i\in I}U_i=X\setminus V(J)=X\setminus\bigcap_{i\in I_F}V(f_i)=\bigcup_{i\in I_F}U_i, \end{gather*} and the claim follows. $\Box$
Remark: I have used only the hypothesis that $X$ is an affine space over a field, independently of its characteristic and other algebraic properties!
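A small worked instance of the finiteness step, chosen here purely for illustration: take $X=Z=\mathbb{A}^1$, $R=\mathbb{K}[x]$, and the covering by the principal open sets $U_a=D(x-a)$ for $a\in\mathbb{K}$. The ideal generated by all the $x-a$ contains

$$ x-(x-1)=1, $$

so it is all of $R$ and is already generated by the two elements $x$ and $x-1$. Taking $I_F=\{0,1\}$ gives

$$ \bigcap_{a\in I_F}V(x-a)=V(x)\cap V(x-1)=\emptyset,\qquad X=D(x)\cup D(x-1), $$

so two members of the (possibly infinite) covering already suffice.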