Statement A: $v_0,v_1,...,v_k$ are affinely independent.
Statement B: $v_1-v_0,v_2-v_0,...,v_k-v_0$ are linearly independent.
First let us prove that $A \implies B$.
Consider some $\lambda_1,...,\lambda_k$, such that:
$$\sum_{i=1}^{k} \lambda_i (v_i - v_0) = 0 \tag{1}$$
We have to show that, if affine independence holds, all the coefficients ($\lambda$'s) must be zero.
Now, define $\lambda_0 := -\sum_{i=1}^{k} \lambda_i$, so that: $$\sum_{i=0}^{k} \lambda_i = 0\tag{2}$$
Also, we have: $$\sum_{i=0}^{k} \lambda_i v_i = \sum_{i=1}^{k} \lambda_i (v_i - v_0) + (\sum_{i=0}^{k} \lambda_i)v_0 \tag{3}$$
Using equations 1 and 2, we observe that both terms on the RHS of (3) are zero. This means that: $$\sum_{i=0}^{k} \lambda_i v_i = 0 \tag{4}$$
From equations 2 and 4, affine independence gives $\lambda_i = 0$ for all $i$; in particular $\lambda_1 = \dots = \lambda_k = 0$, so B is true.
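Identity (3) does the real work here (and again in the converse below). As a quick sanity check, here is a minimal NumPy sketch with arbitrary data; the variable names are mine and nothing here is part of the proof:

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.standard_normal((5, 3))    # v_0, ..., v_4 in R^3
lam = rng.standard_normal(5)       # arbitrary lambda_0, ..., lambda_4

lhs = lam @ v                                        # sum_i lambda_i v_i
rhs = lam[1:] @ (v[1:] - v[0]) + lam.sum() * v[0]    # RHS of identity (3)
assert np.allclose(lhs, rhs)       # identity (3) holds for any lambda
```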
Now, let us prove the converse, that $B \implies A$.
Consider some $\lambda_0,\lambda_1,...,\lambda_k$, such that: $\sum_{i=0}^{k} \lambda_i v_i = 0$ and $\sum_{i=0}^{k} \lambda_i = 0$.
We have to show that all these coefficients must be zero under the condition of linear independence.
Using equation 3, and the above two conditions, we can conclude that $\sum_{i=1}^{k} \lambda_i (v_i - v_0) = 0$.
Therefore, due to linear independence of the $(v_i - v_0)$, we conclude that: $$\lambda_1 = \lambda_2 = \dots = \lambda_k = 0$$
Also, $\sum_{i=0}^{k} \lambda_i = 0 \implies \lambda_0 = 0$.
This proves that the points are affinely independent (A is true).
We have shown $A \implies B$ and $B \implies A$.
$\therefore A \iff B$.
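For readers who like to experiment, here is a minimal numerical sketch of the equivalence just proved, using NumPy's rank computation. The helper names are mine, and floating-point rank is only a proxy for the exact statement:

```python
import numpy as np

def affinely_independent(points):
    # v_0, ..., v_k are affinely independent iff the lifted vectors
    # (v_i, 1) are linearly independent: a dependence among the lifted
    # vectors is exactly a lambda with sum(l_i v_i) = 0 and sum(l_i) = 0.
    P = np.vstack([np.append(v, 1.0) for v in points])
    return np.linalg.matrix_rank(P) == len(points)

def differences_independent(points):
    # Statement B: v_1 - v_0, ..., v_k - v_0 are linearly independent.
    D = np.vstack([v - points[0] for v in points[1:]])
    return np.linalg.matrix_rank(D) == len(points) - 1

pts = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([2.0, 3.0])]
assert affinely_independent(pts) == differences_independent(pts)  # both True
```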
No, there is no mistake there.
Consider the set of points:
$$
F = \left\{x = \lambda_1 x_1 + \dots + \lambda_n x_n \in \mathbb{R}^d \ \vert \ \lambda_1 + \dots +\lambda_n = 0 \right\} \ .
$$
This set of points is a linear subspace of $\mathbb{R}^d$, as you can easily check. If you solve the equation $\lambda_1 + \dots +\lambda_n = 0$ for $\lambda_1$, you find that the vectors of $F$ can be written as
$$
x = -(\lambda_2 + \dots + \lambda_n)x_1 + \lambda_2x_2 + \dots + \lambda_n x_n = \lambda_2(x_2 - x_1) + \dots + \lambda_n (x_n - x_1) \ .
$$
That is,
$$
F = \mathrm{span}\left\{ \overrightarrow{x_1x_2}, \dots , \overrightarrow{x_1x_n}\right\} \ .
$$
(You could have done the same with any $x_i$ instead of $x_1$ too.)
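Here is a small numerical illustration of that identity (a sketch with made-up data; nothing here is part of the argument): a zero-sum combination of the $x_i$ does lie in the span of the difference vectors, with the same coefficients $\lambda_2, \dots, \lambda_n$.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))     # rows x_1, ..., x_4 in R^3
lam = rng.standard_normal(4)
lam -= lam.mean()                   # enforce lambda_1 + ... + lambda_4 = 0
x = lam @ X                         # x = sum_i lambda_i x_i, a point of F

D = (X[1:] - X[0]).T                # columns x_1x_2, x_1x_3, x_1x_4
mu = np.linalg.lstsq(D, x, rcond=None)[0]
assert np.allclose(D @ mu, x)       # x lies in span{x_1x_i}
assert np.allclose(mu, lam[1:])     # with coefficients lambda_2, ..., lambda_4
```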
Now, the following two statements are equivalent:
- $(1)$ Points $x_1, \dots , x_n$ are affinely independent.
- $(2)$ Vectors $\overrightarrow{x_1x_2}, \dots , \overrightarrow{x_1x_n}$ are linearly independent.
$\mathbf{(1) \Longrightarrow (2)}$. Let
$$
\mu_2 \overrightarrow{x_1x_2} + \dots + \mu_n \overrightarrow{x_1x_n} = 0
$$
We have to show that this implies $\mu_2 = \dots = \mu_n = 0$. Indeed,
$$
0 =\mu_2 \overrightarrow{x_1x_2} + \dots + \mu_n \overrightarrow{x_1x_n} = -(\mu_2 + \dots + \mu_n)x_1 + \mu_2 x_2 + \dots + \mu_n x_n \ .
$$
In this expression, the sum of all coefficients is $0$. Since we are assuming $(1)$, this implies $\mu_2 = \dots = \mu_n = 0$.
$\mathbf{(2) \Longrightarrow (1)}$. Let
$$
\lambda_1 x_1 + \dots + \lambda_n x_n = 0 \qquad \text{and} \qquad \lambda_1 + \dots + \lambda_n = 0 \ .
$$
We have to show that this implies $\lambda_1 = \dots = \lambda_n = 0$. Indeed, solve the second equation for $\lambda_1$ again and you have
$$
0 = \lambda_1 x_1 + \dots + \lambda_n x_n = - (\lambda_2 + \dots + \lambda_n) x_1 + \lambda_2 x_2 + \dots + \lambda_n x_n = \lambda_2 \overrightarrow{x_1x_2} + \dots + \lambda_n \overrightarrow{x_1x_n} \ .
$$
Since we are assuming $(2)$, this implies $\lambda_2 = \dots = \lambda_n = 0$ and, since $\lambda_1 + \dots + \lambda_n = 0$, we have $\lambda_1 = 0$ too.
So far so good. Now, let's finish with another trivial remark about a geometrical interpretation of this linear subspace $F$ and that condition $\lambda_1 + \dots + \lambda_n = 0$. Consider the set of points
$$
V = \left\{x = \lambda_1 x_1 + \dots + \lambda_n x_n \in \mathbb{R}^d \ \vert \ \lambda_1 + \dots +\lambda_n = 1 \right\} \ .
$$
This set is an affine subspace. Indeed,
$$
V = x_1 + F \ .
$$
(You should check this equality and understand that you could put any $x_i$ in the place of $x_1$.)
You can say that $V$ is parallel to the subspace $F$: indeed, $V$ "is" just $F$ translated by $x_1$.
So what? What's so special about $V$? Well, on one hand, $V$ contains all the points $x_1 , \dots , x_n$ (exercise: check it!). On the other hand, it is the smallest affine subspace which contains them; in the sense that, if $W \subset \mathbb{R}^d$ is another affine subspace containing all $x_i$, then $V \subset W$.
Indeed, in general, if you have an affine subspace $W = p + G$ and two points in it $x, y \in W$, then $\overrightarrow{xy} \in G$. So, if $x_1, \dots , x_n \in W$, then $G$ must contain all $\overrightarrow{x_1x_i}$. Hence, $F \subset G$. So $V = x_1 + F \subset x_1 + G = W$.
Summing up: the condition that annoys you, $\lambda_1 + \dots + \lambda_n = 0$, makes the set $V$ the smallest affine subspace which contains all the points $x_1, \dots , x_n$.
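One can also check the decomposition $V = x_1 + F$ numerically (again a sketch with arbitrary data and my own variable names): a combination whose coefficients sum to $1$ differs from $x_1$ by a zero-sum combination, hence by an element of $F$.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 3))     # rows x_1, ..., x_4 in R^3

lam = rng.random(4)
lam /= lam.sum()                    # lambda_1 + ... + lambda_4 = 1, so v is in V
v = lam @ X

# v - x_1 has coefficients lam - e_1, which sum to 0, so it should lie in F.
D = (X[1:] - X[0]).T                # columns x_1x_2, x_1x_3, x_1x_4 spanning F
mu = np.linalg.lstsq(D, v - X[0], rcond=None)[0]
assert np.allclose(D @ mu, v - X[0])   # v = x_1 + (element of F)
```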
EDIT. I forgot. Perhaps it would be a good exercise to redo everything we have seen here with some specific examples. For instance, take:
- $x_1 = (1,0), x_2 = (0,1)$ in $\mathbb{R}^2$.
- $x_1 = (1,0,0), x_2 = (0,1,0), x_3 = (0,0,1)$ in $\mathbb{R}^3$.
- $x_1 = (1,0), x_2 = (0,1), x_3 = (1/2, 1/2)$ in $\mathbb{R}^2$.
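If you want to check these mechanically, here is a short sketch reusing the lifted-vector rank criterion from earlier (the helper is hypothetical, not part of the exercise):

```python
import numpy as np

def affinely_independent(points):
    # Rank test on the lifted vectors (x_i, 1), as in the earlier sketch.
    P = np.vstack([np.append(np.asarray(p, float), 1.0) for p in points])
    return np.linalg.matrix_rank(P) == len(points)

print(affinely_independent([(1, 0), (0, 1)]))                    # True
print(affinely_independent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))   # True
print(affinely_independent([(1, 0), (0, 1), (0.5, 0.5)]))        # False: collinear
```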
Best Answer
Yes, so $n+1$ points in $\mathbb{R}^n$ can be affinely independent, since $n$ vectors can be linearly independent. For example, if $\{e_1,\dots,e_n\}$ denotes the standard basis of $\mathbb{R}^n$, then $\{e_1, 2e_1, e_2+e_1, \dots, e_n+e_1\}$ is an affinely independent set: if we take $x_0 = e_1$, then subtracting it from every other vector generates the vectors $e_1, e_2, \dots, e_n$, which are linearly independent. Hence, the given points are affinely independent.
But if we take more than $n+1$ points, then the differences produce more than $n$ vectors, which cannot be linearly independent in $\mathbb{R}^n$. Hence, at most $n+1$ points can constitute an affinely independent set.
Note: The choice of $x_0$ does not affect affine independence, i.e., if I had chosen some other vector instead of $e_1$ as $x_0$ and taken differences, it would not have affected linear independence. I leave it to you to see why; it is not so difficult.
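A last sketch (helper code is mine, not the answer's) checking both claims at once: the example set above is affinely independent, and the rank of the differences is the same whichever point plays the role of $x_0$.

```python
import numpy as np

n = 4
E = np.eye(n)
# The example set: e_1, 2e_1, e_2 + e_1, ..., e_n + e_1 (n + 1 points in R^n).
pts = np.vstack([E[0], 2 * E[0]] + [E[i] + E[0] for i in range(1, n)])

# Rank of the differences for every possible choice of base point x_0.
ranks = [np.linalg.matrix_rank(np.delete(pts, j, axis=0) - pts[j])
         for j in range(len(pts))]
print(ranks)                  # [4, 4, 4, 4, 4]: full rank n every time
assert len(set(ranks)) == 1   # so the choice of x_0 is irrelevant
```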