What is the difference between linearly and affinely independent vectors? Why does affine independence not imply linear independence necessarily? Can someone explain using an example?
Related Solutions
No, there is no mistake there.
Consider the set of points:
$$ F = \left\{x = \lambda_1 x_1 + \dots + \lambda_n x_n \in \mathbb{R}^d \ \vert \ \lambda_1 + \dots +\lambda_n = 0 \right\} \ . $$
This set of points is a linear subspace of $ \mathbb{R}^d$, as you can easily check. If you solve for $\lambda_1$ the equation $\lambda_1 + \dots +\lambda_n = 0 $, you find that the vectors of $F$ can be written as
$$ x = -(\lambda_2 + \dots + \lambda_n)x_1 + \lambda_2x_2 + \dots + \lambda_n x_n = \lambda_2(x_2 - x_1) + \dots + \lambda_n (x_n - x_1) \ . $$
That is,
$$ F = \mathrm{span}\left\{ \overrightarrow{x_1x_2}, \dots , \overrightarrow{x_1x_n}\right\} \ . $$
(You could have done the same with any $x_i$ instead of $x_1$ too.)
Now, the following two statements are equivalent:
- Points $x_1, \dots , x_n$ are affinely independent.
- Vectors $\overrightarrow{x_1x_2}, \dots , \overrightarrow{x_1x_n} $ are linearly independent.
$\mathbf{(1) \Longrightarrow (2)}$. Let
$$ \mu_2 \overrightarrow{x_1x_2} + \dots + \mu_n \overrightarrow{x_1x_n} = 0 $$
We have to show that this implies $\mu_2 = \dots = \mu_n = 0$. Indeed,
$$ 0 =\mu_2 \overrightarrow{x_1x_2} + \dots + \mu_n \overrightarrow{x_1x_n} = -(\mu_2 + \dots + \mu_n)x_1 + \mu_2 x_2 + \dots + \mu_n x_n \ . $$
In this expression, the sum of all coefficients is $0$. Since we are assuming $(1)$, this implies $\mu_2 = \dots = \mu_n = 0$.
$\mathbf{(2) \Longrightarrow (1)}$. Let
$$ \lambda_1 x_1 + \dots + \lambda_n x_n = 0 \qquad \text{and} \qquad \lambda_1 + \dots + \lambda_n = 0 \ . $$
We have to show that this implies $\lambda_1 = \dots = \lambda_n = 0$. Indeed, solve the second equation for $\lambda_1$ again and you have
$$ 0 = \lambda_1 x_1 + \dots + \lambda_n x_n = - (\lambda_2 + \dots + \lambda_n) x_1 + \lambda_2 x_2 + \dots + \lambda_n x_n = \lambda_2 \overrightarrow{x_1x_2} + \dots + \lambda_n \overrightarrow{x_1x_n} \ . $$
Since we are assuming $(2)$, this implies $\lambda_2 = \dots = \lambda_n = 0$ and, since $\lambda_1 + \dots + \lambda_n = 0$, we have $\lambda_1 = 0$ too.
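The key algebraic step in both directions is the identity that rewrites a zero-sum combination of the points as a combination of the difference vectors. Here is a small numerical sketch of that identity (the random data and variable names are just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: n = 4 random points x_1, ..., x_4 in R^3 (rows of X).
X = rng.standard_normal((4, 3))
mu = rng.standard_normal(3)              # mu_2, mu_3, mu_4

# Left-hand side: mu_2 (x_2 - x_1) + ... + mu_n (x_n - x_1)
lhs = sum(m * (X[i] - X[0]) for m, i in zip(mu, range(1, 4)))

# Right-hand side: -(mu_2 + ... + mu_n) x_1 + mu_2 x_2 + ... + mu_n x_n,
# i.e. a combination of the points whose coefficients sum to 0.
lam = np.concatenate(([-mu.sum()], mu))
rhs = lam @ X

assert np.isclose(lam.sum(), 0.0)        # the coefficients do sum to 0
assert np.allclose(lhs, rhs)             # both sides agree
```

Any zero-sum combination of the points is thus literally the same vector as a combination of the differences, which is why the two independence notions translate into one another.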
So far so good. Now, let's finish with another trivial remark about a geometrical interpretation of this linear subspace $F$ and that condition $\lambda_1 + \dots + \lambda_n = 0$. Consider the set of points
$$ V = \left\{x = \lambda_1 x_1 + \dots + \lambda_n x_n \in \mathbb{R}^d \ \vert \ \lambda_1 + \dots +\lambda_n = 1 \right\} \ . $$
This set is an affine subspace. Indeed,
$$ V = x_1 + F \ . $$
(You should check this equality and understand that you could put any $x_i$ in the place of $x_1$.)
You can say that $V$ is parallel to the subspace $F$: indeed, $V$ "is" just $F$ translated by $x_1$.
So what? What's so special about $V$? Well, on one hand, $V$ contains all the points $x_1 , \dots , x_n$ (exercise: check it!). On the other hand, it is the smallest affine subspace which contains them; in the sense that, if $W \subset \mathbb{R}^d$ is another affine subspace containing all $x_i$, then $V \subset W$.
Indeed, in general, if you have an affine subspace $W = p + G$ and two points in it $x, y \in W$, then $\overrightarrow{xy} \in G$. So, if $x_1, \dots , x_n \in W$, then $G$ must contain all $\overrightarrow{x_1x_i}$. Hence, $F \subset G$. So $V = x_1 + F \subset x_1 + G = W$.
Summing up: the condition that annoys you, $\lambda_1 + \dots + \lambda_n = 0$, is exactly what makes $V = x_1 + F$ the smallest affine subspace containing all the points $x_1, \dots , x_n$.
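These claims about $V$ are easy to check numerically. The sketch below (with made-up random points) verifies that a random affine combination $x \in V$ satisfies $x - x_1 \in F$, and that every $x_i$ lies in $V$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data: 3 points x_1, x_2, x_3 in R^4 (rows of X).
X = rng.standard_normal((3, 4))
D = (X[1:] - X[0]).T                     # columns x_2 - x_1, x_3 - x_1 span F

# A random affine combination: coefficients summing to 1.
lam = rng.standard_normal(3)
lam[0] = 1.0 - lam[1:].sum()
x = lam @ X                              # a point of V

# x - x_1 should lie in F = span of the difference vectors.
coeffs, *_ = np.linalg.lstsq(D, x - X[0], rcond=None)
assert np.allclose(D @ coeffs, x - X[0])

# Each x_i is itself in V (take lambda_i = 1, the rest 0), so x_i - x_1 in F.
for xi in X:
    c, *_ = np.linalg.lstsq(D, xi - X[0], rcond=None)
    assert np.allclose(D @ c, xi - X[0])
```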
EDIT. I forgot. Perhaps it would be a good exercise to redo everything we have seen here with some specific examples. For instance, take:
- $x_1 = (1,0), x_2 = (0,1)$ in $\mathbb{R}^2$.
- $x_1 = (1,0,0), x_2 = (0,1,0), x_3 = (0,0,1)$ in $\mathbb{R}^3$.
- $x_1 = (1,0), x_2 = (0,1), x_3 = (1/2, 1/2)$ in $\mathbb{R}^2$.
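If you want to check your work on these examples, the rank criterion we just proved (affine independence of the points $\Longleftrightarrow$ linear independence of the differences $\overrightarrow{x_1x_i}$) is straightforward to code. The helper name below is mine, not standard:

```python
import numpy as np

def affinely_independent(points):
    """Points x_1, ..., x_n are affinely independent iff the difference
    vectors x_2 - x_1, ..., x_n - x_1 are linearly independent."""
    P = np.asarray(points, dtype=float)
    D = P[1:] - P[0]
    return np.linalg.matrix_rank(D) == len(P) - 1

ex1 = [(1, 0), (0, 1)]                       # affinely independent
ex2 = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]      # affinely independent
ex3 = [(1, 0), (0, 1), (0.5, 0.5)]           # NOT: x_3 is the midpoint
                                             # of x_1 and x_2

print(affinely_independent(ex1))  # True
print(affinely_independent(ex2))  # True
print(affinely_independent(ex3))  # False
```

Note that `ex1` and `ex2` are also linearly dependent as vectors in the first case? No: `ex1` and `ex2` happen to be linearly independent too, but `ex3` shows the gap, since $x_3 - x_1$ and $x_2 - x_1$ are parallel.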
The definition of a lattice is that it is a discrete additive subgroup of $\mathbb{R}^n$. The requirement that it be discrete gives us the answer to your question!
The "discrete" bit means that there is some $\epsilon > 0$ such that for any two distinct lattice points $x, y \in \Lambda$, $||x-y|| > \epsilon$.
In English: all distinct lattice points must be separated by at least some fixed minimum distance.
What would happen if we tried to define a lattice with the following basis:
$\begin{pmatrix} 1 & 0 & \sqrt{2}\\ 0 & 1 & \sqrt{2} \end{pmatrix}$
There's no way we're ever going to get the third column $C_3$ from integer multiples of the first two columns $C_1$ and $C_2$, so why isn't this a valid lattice basis?
Note that $\begin{pmatrix}1 \\ 1\end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 1\end{pmatrix}$ will be in $\Lambda$, as will $\begin{pmatrix} 0.4142135...\\ 0.4142135...\end{pmatrix} = \begin{pmatrix} \sqrt{2} \\ \sqrt{2} \end{pmatrix} - \begin{pmatrix} 1 \\ 1\end{pmatrix}$
as will $ \begin{pmatrix}0.5857864... \\ 0.5857864... \end{pmatrix}= \begin{pmatrix} 1 \\ 1\end{pmatrix} - \begin{pmatrix} 0.4142135...\\ 0.4142135...\end{pmatrix}$ and so on and so forth.
So if I came to you claiming that our set of 3 vectors forms a basis for a 2-D lattice, I would have to offer up some minimum distance between lattice points, $\epsilon$, per the definition of a lattice. However, you would always be able to find two lattice points $x, y \in \Lambda$ such that $|| x - y || < \epsilon$ for any value of $\epsilon$ that I proposed, proving that I was a liar and our set of vectors doesn't form a lattice basis after all.
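One concrete way to see those ever-smaller points: $(\sqrt{2}-1)^k = a_k + b_k\sqrt{2}$ with integer $a_k, b_k$, so the diagonal point $(a_k + b_k\sqrt{2})\cdot(1,1)$ belongs to the purported lattice for every $k$, yet it shrinks to $0$. A quick sketch (the recurrence for the integer coefficients is just multiplication by $\sqrt{2}-1$):

```python
import math

SQRT2 = math.sqrt(2)

# (sqrt(2) - 1)^k = a + b*sqrt(2) with integer a, b, so t*(1,1) is a
# point of the purported lattice for every k -- yet t -> 0.
a, b = -1, 1                       # k = 1: sqrt(2) - 1 = -1 + 1*sqrt(2)
for k in range(1, 21):
    t = a + b * SQRT2
    assert t > 0                   # a nonzero lattice point on the diagonal
    # multiplying (a + b*sqrt(2)) by (sqrt(2) - 1) gives
    # (2b - a) + (a - b)*sqrt(2): integer coefficients again.
    a, b = 2 * b - a, a - b

print(f"t_20 = {t:.3e}")           # already smaller than any sane epsilon
assert t < 1e-7
```

So no matter what $\epsilon$ I propose, a large enough $k$ produces two lattice points closer together than $\epsilon$.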
Best Answer
To augment Lord Shark's answer, I just wanted to talk a little about the intuition behind it.
Intuitively, a set of vectors is linearly dependent if there are more vectors than necessary to generate their span, i.e. the smallest subspace containing them.
On the other hand, a set of vectors is affinely dependent if there are more vectors than necessary to generate their affine hull, i.e. the smallest flat (translate of a linear space) containing them.
The affine hull of a single vector $v$ is just $\lbrace v \rbrace$, which is the trivial subspace $\lbrace 0 \rbrace$ translated by $v$. But, if $v \neq 0$, the span is the entire line through $0$ and $v$, since $0$ must belong to any subspace. To generate that line as an affine hull, you could look at the list $v, 0$.
So, $v, 0$ are linearly dependent (e.g. $0 = 0 \cdot v + 5 \cdot 0$) as $0$ is not necessary to generate the span (just $v$ would have done fine), but both are necessary to generate the line as the affine hull, so they are affinely independent. To prove this, suppose $\lambda_1 + \lambda_2 = 0$ and,
$$\lambda_1 \cdot v + \lambda_2 \cdot 0 = 0.$$
Then $\lambda_1 \cdot v = 0$, which implies $\lambda_1 = 0$, since $v \neq 0$. Since $\lambda_1 + \lambda_2 = 0$, we therefore also have $\lambda_2 = 0$. This proves affine independence.
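The $v, 0$ example can also be confirmed numerically with the rank criterion from the first answer (linear dependence of the vectors themselves, linear independence of their differences); the specific choice of $v$ below is arbitrary:

```python
import numpy as np

v = np.array([3.0, 1.0])           # any nonzero vector will do
pts = np.stack([v, np.zeros(2)])   # the list v, 0 (rows)

# Linearly dependent: the two vectors span only a 1-dimensional space,
# so v alone would have generated the same span.
assert np.linalg.matrix_rank(pts) == 1

# Affinely independent: the single difference 0 - v = -v is nonzero,
# i.e. the difference vectors are linearly independent.
diffs = pts[1:] - pts[0]
assert np.linalg.matrix_rank(diffs) == len(pts) - 1
```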