Let $V$ be a finite-dimensional vector space over a field $k$ with basis $e_1, \dots, e_n$. Then we may write any vector $v$ uniquely in the form
$$v = \sum c_i e_i$$
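for some coefficients $c_i$. Sending a vector $v$ to the coefficient $c_i$ for fixed $i$ defines a linear functional $e_i^{\ast} : V \to k$. These linear functionals together constitute the basis of $V^{\ast}$ dual to $e_1, \dots, e_n$. What confused me for a long time is that linear functionals do not transform in the same way as vectors under change of coordinates: we say that vectors transform covariantly but linear functionals transform contravariantly, meaning that a linear map $T : V \to W$ pushes vectors forward but pulls functionals back along $T^{\ast} : W^{\ast} \to V^{\ast}$. (Classical tensor terminology describes components rather than vectors, which is why it attaches the two words the other way around.) Before I understood this I was constantly getting confused about the difference between transforming a vector and transforming its components.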
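To make the difference between the two transformation rules concrete, here is a minimal sketch in plain Python. The basis change `P` and the sample components are arbitrary choices made for illustration, not anything from the discussion above. It shows the components of a vector and the components of a functional transforming by mutually inverse matrices, so that the pairing $f(v)$ does not depend on the basis.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(P):
    """Invert a 2x2 matrix with exact rational arithmetic."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# Columns of P are the new basis vectors written in the old basis:
# e1' = e1 + e2,  e2' = e1 + 2 e2  (an arbitrary illustrative choice).
P = [[1, 1],
     [1, 2]]
P_inv = inverse_2x2(P)

v_old = [[3], [5]]   # components of a vector v (a column) in the old basis
f_old = [[2, -1]]    # components of a functional f (a row) in the old dual basis

v_new = matmul(P_inv, v_old)   # vector components transform by P^{-1}
f_new = matmul(f_old, P)       # functional components transform by P

# The number f(v) is the same in either coordinate system.
pairing_old = matmul(f_old, v_old)[0][0]
pairing_new = matmul(f_new, v_new)[0][0]
print(pairing_old, pairing_new)
```

Writing vectors as columns and functionals as rows is just a bookkeeping convention; the point is only that one set of components picks up $P^{-1}$ while the other picks up $P$.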
For an infinite-dimensional example, consider the vector space $k[x]$ of polynomials in one variable over a field $k$. It has a distinguished set of dual vectors given by the functions $[x^n]$ which return the coefficient of $x^n$ in a polynomial. To be suggestive (at least in characteristic $0$) you can write these functions as $\frac{1}{n!} \frac{d^n}{dx^n}_{x = 0}$. It turns out that the dual space $k[x]^{\ast}$ is precisely the product of the one-dimensional spaces spanned by each of these dual vectors, so arbitrary infinite linear combinations of them make sense; for example, the dual space contains vectors that ought to be called
$$(e^{t \frac{d}{dx} })_{x=0} = \sum_{n \ge 0} \frac{t^n}{n!} \frac{d^n}{dx^n}_{x=0}$$
that given a polynomial $f(x)$ return the numerical value of $f(t)$.
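As a rough illustration, the following Python sketch represents a polynomial by its list of coefficients and builds the functionals $[x^n]$ by literally differentiating $n$ times, evaluating at $0$ and dividing by $n!$. The helper names are made up for this example. The "infinite" combination $\sum_n \frac{t^n}{n!} \frac{d^n}{dx^n}_{x=0}$ can be truncated safely, because on any given polynomial only finitely many terms are nonzero.

```python
from fractions import Fraction
from math import factorial

def derivative(poly):
    """Differentiate a polynomial stored as [c0, c1, c2, ...]."""
    return [n * c for n, c in enumerate(poly)][1:]

def coeff_functional(n):
    """The functional [x^n], computed as (1/n!) d^n/dx^n evaluated at x = 0."""
    def functional(poly):
        for _ in range(n):
            poly = derivative(poly)
        constant_term = poly[0] if poly else 0
        return Fraction(constant_term, factorial(n))
    return functional

def evaluation_functional(t, max_degree):
    """The combination sum_n t^n [x^n], truncated at max_degree."""
    def functional(poly):
        return sum(t ** n * coeff_functional(n)(poly)
                   for n in range(max_degree + 1))
    return functional

f = [1, 0, -3, 2]   # f(x) = 1 - 3x^2 + 2x^3
ev2 = evaluation_functional(2, max_degree=len(f) - 1)
value = ev2(f)      # agrees with f(2) = 1 - 12 + 16
print(value)
```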
Thinking of $\frac{d^0}{dx^0}_{x=0}$ as a toy model for the Dirac delta function, you can think of this construction as a toy model for (Schwartz) distributions.
In differential geometry, the dual of a tangent space $T_p(M)$ at a point $p$ on a manifold $M$ is the cotangent space $T_p^{\ast}(M)$ at $p$. Just as the tangent space captures the infinitesimal behavior of smooth functions $\mathbb{R} \to M$ near $p$ (curves), the cotangent space captures the infinitesimal behavior of smooth functions $M \to \mathbb{R}$ near $p$ (coordinates). Just as a nice family of tangent vectors gives a vector field, a nice family of cotangent vectors gives a 1-form. In classical mechanics, the cotangent bundle is the phase space of a classical particle traveling on $M$; cotangent vectors give momenta.
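In local coordinates this pairing can be written out explicitly. The following is the standard computation, included only as a sketch; the coordinate functions $x^1, \dots, x^n$ near $p$, the function $f$ and the curve $\gamma$ are names introduced here for illustration. The differentials $dx^i$ form the basis of $T_p^{\ast}(M)$ dual to the coordinate vectors $\partial/\partial x^j$:

$$dx^i_p\left(\frac{\partial}{\partial x^j}\Big|_p\right) = \delta^i_j, \qquad df_p = \sum_{i=1}^n \frac{\partial f}{\partial x^i}(p)\, dx^i_p.$$

So if $\gamma$ is a curve with $\gamma(0) = p$ and velocity $v = \gamma'(0) = \sum_j v^j \frac{\partial}{\partial x^j}\big|_p$, then

$$df_p(v) = \sum_{i=1}^n \frac{\partial f}{\partial x^i}(p)\, v^i = \frac{d}{dt}\Big|_{t=0} f(\gamma(t)).$$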
For me duality really shines when you combine it with tensor products and start using the language of tensors. Then you can describe any kind of linear-ish thing using a combination of tensor products and duals, at least for finite-dimensional vector spaces:
- What's a linear function $V \to W$? It's an element of $V^{\ast} \otimes W$.
- What's a bilinear form $V \times V \to k$? It's an element of $V^{\ast} \otimes V^{\ast}$.
- What's a multiplication $V \times V \to V$? It's an element of $V^{\ast} \otimes V^{\ast} \otimes V$.
When you have a bunch of linear-ish things around, writing them all as tensors helps you keep track of exactly how you can combine them (using tensor contraction). For example, an endomorphism $V \to V$ is an element of $V^{\ast} \otimes V$, and there is a distinguished dual pairing
$$V^{\ast} \otimes V \to k.$$
What does this do to endomorphisms? It's just the trace!
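Here is a small numerical check of that claim, written as a sketch with arbitrary sample data. An element $\sum_k f_k \otimes v_k$ of $V^{\ast} \otimes V$ acts on $V$ by $w \mapsto \sum_k f_k(w)\, v_k$. Contracting each term to $f_k(v_k)$ gives the same number as the trace of the resulting matrix.

```python
def endomorphism_from_tensor(terms):
    """Matrix of sum_k f_k (x) v_k, acting by w |-> sum_k f_k(w) v_k.

    Each term is a pair (f, v): f lists the components of a functional,
    v lists the components of a vector, both in one fixed basis.
    """
    n = len(terms[0][1])
    return [[sum(v[i] * f[j] for f, v in terms) for j in range(n)]
            for i in range(n)]

def contract(terms):
    """Apply the dual pairing V* (x) V -> k termwise: f (x) v |-> f(v)."""
    return sum(sum(fi * vi for fi, vi in zip(f, v)) for f, v in terms)

def trace(matrix):
    """Sum of the diagonal entries."""
    return sum(matrix[i][i] for i in range(len(matrix)))

# Two arbitrary rank-one terms in dimension 3 (illustrative data only).
terms = [([1, 2, 0], [3, 1, 4]),
         ([0, -1, 5], [2, 2, 1])]
A = endomorphism_from_tensor(terms)
print(trace(A), contract(terms))
```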
Any locally small category $D$ comes naturally equipped, for each object $r$, with two distinct functors into $\text{Set}$: the covariant functor $h^r\colon a\mapsto\text{Hom}(r,a)$ and the contravariant functor $h_r\colon a\mapsto\text{Hom}(a,r).$ There is a duality between them, in the sense that $h_r$ for the category $D$ is $h^r$ for $D^\text{op}.$ Turning the arrows around changes $D$ into $D^\text{op}$ and $h_r$ into $h^r$.
So to each object we can associate a presheaf or a co-presheaf. The covariant and contravariant Yoneda lemmas tell us that both associations are embeddings:
- If $K$ is a covariant functor $D\to \text{Set}$, then we have the natural isomorphism $\text{Hom}(D(r,-),K)\cong K(r)$, with the isomorphism given by $\alpha\mapsto \alpha_r(1_r).$
- If $K$ is a contravariant functor $D\to \text{Set}$, then we have the natural isomorphism $\text{Hom}(D(-,r),K)\cong K(r),$ with the isomorphism given by $\alpha\mapsto \alpha_r(1_r).$
To be even more explicit, let us note that the covariant Yoneda lemma applies to any category. In particular, if you turn around the arrows of $D$, you get the dual category $D^\text{op}$, which has $D^\text{op}(r,s)=D(s,r)$. If $f\colon s\to r$ is an arrow in $D$, then $f^\text{op}\colon r\to s$ is an arrow in $D^\text{op}.$
Now let's apply the Yoneda lemma to this category $D^\text{op}$. If $K$ is a covariant functor $D^\text{op}\to\text{Set},$ then we have $\text{Hom}(D^\text{op}(r,-),K)=\text{Hom}(D(-,r),K)\cong K(r).$ But a covariant functor $D^\text{op}\to\text{Set}$ is nothing but a contravariant functor $D\to\text{Set}.$ Thus we see that the contravariant Yoneda lemma is nothing but the covariant Yoneda lemma applied to the category $D$ with its arrows reversed.
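The covariant Yoneda isomorphism $\alpha\mapsto\alpha_r(1_r)$ can be sketched in Python, under the loose assumption that Python values and functions behave like a category and that lists play the role of the functor $K$ (with $K(f)$ mapping $f$ over a list). None of these names come from a library; they are illustrative only.

```python
def fmap_list(f, xs):
    """Action of the list functor K on an arrow f: K(f) maps f over the list."""
    return [f(x) for x in xs]

def identity(x):
    return x

def to_element(alpha):
    """Yoneda map: send a natural transformation alpha to alpha_r(1_r)."""
    return alpha(identity)

def from_element(xs):
    """Inverse map: send xs in K(r) to the transformation f |-> K(f)(xs)."""
    def alpha(f):
        return fmap_list(f, xs)
    return alpha

xs = [1, 2, 3]
alpha = from_element(xs)
recovered = to_element(alpha)   # evaluating at the identity returns xs

# Naturality check: alpha(g . f) equals K(g) applied to alpha(f).
f = lambda n: n * 10
g = str
lhs = alpha(lambda n: g(f(n)))
rhs = fmap_list(g, alpha(f))
print(recovered, lhs, rhs)
```

The round trip in the other direction (starting from an arbitrary natural $\alpha$) relies on naturality and cannot be checked exhaustively by running code; the sketch only illustrates the shape of the bijection.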
So the covariant and contravariant Yoneda lemmas are dual in the sense that they are the same theorem, stated for two categories which are dual to one another. But another way to get dual statements in category theory is to dualize everything in sight, in every line of the theorem. Any statement in the language of category theory has a corresponding dual statement. For example, the statement that $A\times(B+C)\cong A\times B+A\times C$ is a natural isomorphism (which holds in any distributive category) has the dual statement $A + (B\times C)\cong (A+B)\times(A + C).$ We turn all limits into colimits.
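One point worth checking by hand is that the dual statement holds in the opposite category, not necessarily in the original one. The sketch below counts elements of finite sets (equal cardinality being necessary for a bijection) and suggests that the distributive law holds in $\text{Set}$ while its dual shape fails there. The helper names and the one-element sets are arbitrary illustrative choices.

```python
from itertools import product

def coproduct(*sets):
    """Disjoint union: tag each element with the index of its summand."""
    return {(tag, x) for tag, s in enumerate(sets) for x in s}

def times(a, b):
    """Cartesian product of two finite sets."""
    return set(product(a, b))

A, B, C = {"a"}, {"b"}, {"c"}

# A x (B + C) versus (A x B) + (A x C): same size.
lhs = times(A, coproduct(B, C))
rhs = coproduct(times(A, B), times(A, C))

# Dual shape, A + (B x C) versus (A + B) x (A + C): different sizes in Set.
dual_lhs = coproduct(A, times(B, C))
dual_rhs = times(coproduct(A, B), coproduct(A, C))

print(len(lhs), len(rhs), len(dual_lhs), len(dual_rhs))
```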
To achieve this for the Yoneda lemma, let's not just replace $D$ by its opposite, but also $\text{Set}$ by its opposite, and turn all products into coproducts, etc.
The result is called the co-Yoneda lemma, which states that $$\int^{r\in D}D(r,s)\times K(r)\cong K(s)$$ for a covariant functor $K\colon D\to\text{Set}$ (it also comes in a contravariant version). It is the formal dual of the Yoneda lemma, in the sense that the coend is the colimit dual to the end (which is a limit), and the tensor product is the dual of hom.
To go into a bit more detail on this, note that the set of functions between two sets $\text{Hom}(S,T)$ is isomorphic to a product $T^S=\displaystyle\prod_{s\in S}T.$ Products admit categorical descriptions in terms of universal properties, and as such they have duals, called disjoint unions, or sums, formed by reversing the arrows in the universal diagram.
The dual of the weighted product $\displaystyle\prod_{s\in D}K(s)^{D(r,s)}$ is the weighted sum $\displaystyle\sum_{s\in D}D(s,r)\times K(s).$
The covariant Yoneda lemma says that two functors $D\times \text{Set}^D\to\text{Set}$ are isomorphic, the two functors being "take hom, i.e. weighted product" $(r,K)\mapsto \text{Hom}(h^r,K)$ and "evaluate" $(r,K)\mapsto K(r)$. The contravariant Yoneda lemma says the same thing about the two functors $D^\text{op}\times \text{Set}^{D^\text{op}}\to\text{Set}.$
Dualizing literally everything in sight, we get an isomorphism of two functors $D^\text{op}\times (\text{Set}^\text{op})^{D^\text{op}}\to\text{Set}^\text{op}$ and of two functors $D\times (\text{Set}^\text{op})^{D}\to\text{Set}^\text{op}.$ Weighted products in $\text{Set}^\text{op}$ are weighted sums in $\text{Set},$ therefore these isomorphisms are the co-Yoneda lemma. Instead of taking the dual of our starting category, we take the dual of the functor category in which the isomorphisms live.
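The co-Yoneda isomorphism $\int^{r\in D}D(r,s)\times K(r)\cong K(s)$ can be sketched in the same informal style. Treat Python functions as arrows and lists as the covariant functor $K$ (both are assumptions of this illustration, as are all helper names). An element of the coend is represented by a pair $(f, k)$ with $f\colon r\to s$ and $k\in K(r)$, and the coend identifies $(f\circ g, k)$ with $(f, K(g)(k))$.

```python
def fmap_list(f, xs):
    """Action of the list functor K on an arrow f."""
    return [f(x) for x in xs]

def identity(x):
    return x

def collapse(pair):
    """Send a representative (f, k) of the coend to K(f)(k) in K(s)."""
    f, k = pair
    return fmap_list(f, k)

def embed(k):
    """Send an element k of K(s) to the class of (1_s, k)."""
    return (identity, k)

g = len                     # an arrow r' -> r (strings to ints)
f = lambda n: n % 2 == 0    # an arrow r -> s (ints to bools)
k = ["ab", "cde", ""]       # an element of K(r')

# Two representatives of the same class; collapse agrees on them.
rep1 = (lambda x: f(g(x)), k)
rep2 = (f, fmap_list(g, k))
print(collapse(rep1), collapse(rep2))

# embed followed by collapse is the identity on K(s).
ys = [True, False]
print(collapse(embed(ys)))
```

The code cannot form the quotient itself; it only checks that `collapse` is well defined on the two chosen representatives and that `embed` is a section of it.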
Best Answer
The idea of duality is that, whenever you have a purely categorical statement (or concept), you can look at what that statement means for a category $C$ when applied to $C^{\mathrm{op}}$: this gives you the dual statement or concept. If a statement is true in every category, then its dual statement is necessarily true as well, since the original statement holds in particular for $C^\mathrm{op}$. Finding out the dualized statement is then just a matter of properly writing down the statement in the dual category and interpreting it in the original one (which can often be tedious).
As an example, you can see how the definition of an initial object in $C^{\mathrm{op}}$ coincides with that of a terminal object in $C$! Thus, the dualized statement of "all initial objects are uniquely isomorphic" is "all terminal objects are uniquely isomorphic". The same goes for limits/colimits, continuous/co-continuous functors, left/right adjoints (note here that there are two categories involved, both of which you end up dualizing), left/right Kan extensions, kernels/cokernels, images/coimages in an abelian category, etc. In the end, you can apply duality to whatever statement you want; it is just a matter of practicing that conceptually simple but powerful tool.
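The initial/terminal example can be checked mechanically on a small category. The sketch below uses the divisibility order on $\{1, 2, 3, 6\}$ viewed as a category (one arrow $a \to b$ exactly when $a$ divides $b$); this choice and the helper names are assumptions made for illustration.

```python
objects = [1, 2, 3, 6]

def hom(a, b):
    """Arrows a -> b in the divisibility poset: one arrow if a divides b."""
    return [(a, b)] if b % a == 0 else []

def hom_op(a, b):
    """Arrows a -> b in the opposite category are arrows b -> a in the original."""
    return hom(b, a)

def initial_objects(objs, hom_fn):
    """Objects with exactly one arrow to every object."""
    return [i for i in objs if all(len(hom_fn(i, x)) == 1 for x in objs)]

def terminal_objects(objs, hom_fn):
    """Objects with exactly one arrow from every object."""
    return [t for t in objs if all(len(hom_fn(x, t)) == 1 for x in objs)]

print(initial_objects(objects, hom), terminal_objects(objects, hom))
print(initial_objects(objects, hom_op), terminal_objects(objects, hom_op))
```

Swapping `hom` for `hom_op` is the whole content of dualizing here: the two definitions are exchanged without touching the code that states them.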
I do think though that the lack of details is just typical of users of categories in general, and not just of duality, but that is another topic altogether.