Counterexamples exist.
They are essentially based on the fact that the output of the algorithm outlined in the last paragraph of the question may be wasteful in covering all the vectors of $V$. Making "smart" choices may result in a smaller set $\mathcal{A}$ than what we would get by greedily picking the first vector not violating the update rule.
IMHO a most striking way of seeing this is to appeal to the properties of so called perfect codes. It is known (highly non-trivial) that there are darn few perfect codes with covering radius $>1$, but we have the binary Golay code. It is a $12$-dimensional subspace $C$ of $W=GF(2)^{23}$. Perfectness here manifests itself as follows. Given a vector $x\in W$ there is a single vector not differing from $x$ at all ($x$ itself), $\binom {23}1=23$ vectors differing from $x$ at a single component, $\binom {23}2=23\cdot 11$ vectors differing from $x$ at exactly two components, and $\binom {23}3=23\cdot 7\cdot 11$ vectors differing from $x$ at exactly three components. Observe that
$$1+23+23\cdot11+23\cdot7\cdot11=1+23\cdot89=2^{11}.$$
A truly remarkable fact is that these balls of radius three (w.r.t. the Hamming metric) centered at the vectors $x\in C$ cover the entire space $W$ seamlessly without any overlap whatsoever! In other words, all the $2^{23}=2^{12}\cdot 2^{11}$ vectors of $W$ are within Hamming distance three from a unique vector in $C$.
Let $H$ be a parity check matrix of the binary perfect Golay code. Here is one such (unless I miscopied it)
$$
H=\left(
\begin{array}{ccccccccccccccccccccccc}
1&1&1&1&1&0&0&1&0&0&1&0&1&0&0&0&0&0&0&0&0&0&0\\
0&1&1&1&1&1&0&0&1&0&0&1&0&1&0&0&0&0&0&0&0&0&0\\
0&0&1&1&1&1&1&0&0&1&0&0&1&0&1&0&0&0&0&0&0&0&0\\
0&0&0&1&1&1&1&1&0&0&1&0&0&1&0&1&0&0&0&0&0&0&0\\
0&0&0&0&1&1&1&1&1&0&0&1&0&0&1&0&1&0&0&0&0&0&0\\
0&0&0&0&0&1&1&1&1&1&0&0&1&0&0&1&0&1&0&0&0&0&0\\
0&0&0&0&0&0&1&1&1&1&1&0&0&1&0&0&1&0&1&0&0&0&0\\
0&0&0&0&0&0&0&1&1&1&1&1&0&0&1&0&0&1&0&1&0&0&0\\
0&0&0&0&0&0&0&0&1&1&1&1&1&0&0&1&0&0&1&0&1&0&0\\
0&0&0&0&0&0&0&0&0&1&1&1&1&1&0&0&1&0&0&1&0&1&0\\
0&0&0&0&0&0&0&0&0&0&1&1&1&1&1&0&0&1&0&0&1&0&1\\
\end{array}\right).
$$
The subspace $C$ consists of those vectors in $GF(2)^{23}$ that are orthogonal to
all the rows of $H$, i.e. share an even number of $1$s with all the rows of $H$.
On with the counterexample. Let $V=GF(2)^{11}$.
Let $\mathcal{A}$ be the set of $23$ columns of $H$.
The perfectness of $C$ implies that any vector in $V$ is a sum of at most three of those columns. The minimum distance of $C$ is seven, implying that every set of seven columns of $H$ is linearly independent. This is standard coding theory, ask if you want these steps detailed. In particular, every set of four columns of $H$ is linearly independent. Therefore $\mathcal{A}$ is a basis of $V$ with parameter $t=4$.
The above calculation with binomial coefficients implies that:
- Because a set of $K$ vectors of $V$ has $(1+\binom{K}1+\binom{K}2+\binom{K}3)$ linear combinations of at most $t-1=3$ vectors, any basis with parameter $t=4$ must have at least $23$ vectors to express all the $2^{11}$ vectors of $V$ as in condition (2).
- And, if a basis $\mathcal{A}'$ with parameter $t=4$ has only $23$ vectors, then the presentation of vectors as a sum of at most three elements of $\mathcal{A}'$ must be unique.
But, consider the following subset $S$ of $V$
$$
S=\{
10000000000,
01000000000,
00100000000,
00010000000,
11110000000\}.
$$
That is, the first four vectors from the natural basis and their sum. Clearly all subsets of at most $t=4$ elements of $S$ are linearly independent, so the set $S$ has the property (1) with parameter $t=4$. Therefore we can use it as an input to the algorithm of the last paragraph, and keep adding new vectors from $V$ to it without violating property (1) until the property (2) is also satisfied, and we have a basis with parameter $t=4$. The output will be another basis $\mathcal{A}'$.
But, the vector $v=01110000000$ can be written in two obvious ways as a sum of at
most three vectors from $S$. Therefore it can be written in at least two different ways as a sum of at most $t-1=3$ vectors from $\mathcal{A}'$. As per the second bullet we can conclude that $|\mathcal{A}'|>23$.
I'm fairly sure that we can find smaller counterexamples where the idea of seeding the algorithm with a smart and with a less smart initial set $S$ leads to bases with different cardinalities. I used the Golay code to produce a counterexample simpy because I find it kinda neat.
We want to show that $G$ has a fixed point on $V$. In order to do so, we let $0 \neq v \in V$ and consider the $\mathbb{F}_p$-span of the orbit of $v$ under $G$, let's call this space $V'$. This is a finite dimensional vector space over $\mathbb{F}_p$, thus $|V'| = p^n$ for some $n$. The group $G$ acts on $V'$ and we have a partition $$V' = \bigsqcup O_i $$
into $G$-orbits. Note that either $|O_i| = 1$, which means that the single element in $O_i$ is a fixed point or $p$ divides $|O_i|$ (orbit-stabilizer-theorem). The identity $$ \sum_{i} |O_i| \equiv 0 \text{ (mod }p)$$
yields that $\{i \:|\: |O_i| = 1\}$ is either empty or contains at least $p$ elements. However, since $0 \in V'$ is a fixed point, it follows that this set is not empty and thus there exists at least one non-zero fixed point in $V'$. Take $W$ to be the subspace generated by this element and you are done.
Best Answer
A $d$-dimensional vector space over a finite field with $q$ elements has size $q^d$. Since finite fields only occur with prime powers: $q = p^n$ for some $n \in \mathbb{N}$, and so the finite vector space has $(p^n)^d = p^{nd}$ elements.
There are only a couple ways to write $8$ as a power of a prime power: \begin{align*} 8 = (2^1)^3 \qquad &\Longrightarrow \qquad V \cong (\mathbb{F}_2)^3 \text{ is $3$-dim over field with $2$ elements} \\ 8 = (2^3)^1 \qquad &\Longrightarrow \qquad V \cong (\mathbb{F}_{2^3})^1 \text{ is $1$-dim over field with $8 = 2^3$ elements} \end{align*}
But, seeing as how the field $\mathbb{F}_{2^3}$ is itself a $3$-dimensional vector space over $\mathbb{F}_2$, the second construction is essentially the same as the first (you have to "forget" the multiplication inside the bigger field, as vectors don't multiply in a vector space).