The first derivation is correct, but only if you mean to take the difference between the two vectors, $\mathbf{F_1} - \mathbf{F_2}$; the figure would then show $\mathbf{F_D}$ running from the tip of one vector to the tip of the other, across the parallelogram. This is the Law of Cosines, which refers to the angle enclosed by the two sides of the triangle: $$F_D^2=\Vert \mathbf{F_2} - \mathbf{F_1}\Vert^2=\Vert\mathbf{F_1}\Vert^2+\Vert\mathbf{F_2}\Vert^2-2\,\mathbf{F_1}\cdot\mathbf{F_2}=F_1^2+F_2^2-2F_1F_2\cos(\alpha )$$
The second derivation obtains the correct result, but is flawed. Here we follow the figure: to obtain the longer diagonal of the parallelogram we move $\mathbf{F_1}$ to the tip of $\mathbf{F_2}$; the enclosed angle becomes the supplement, $(\pi - \alpha )$, and the vector $\mathbf{F_1}$ should point away from the new vertex.
Now when we evaluate the Law of Cosines we get
$$F_R^2=\Vert\mathbf{F_1}-(-\mathbf{F_2})\Vert^2=\Vert\mathbf{F_1}\Vert^2+\Vert\mathbf{F_2}\Vert^2-2\,\mathbf{F_1}\cdot(-\mathbf{F_2})=F_1^2+F_2^2-2F_1F_2\cos(\pi - \alpha )$$
But $\cos(\pi - \alpha ) = -\cos(\alpha )$, which recovers the second formula from the first: $$F_R^2=\Vert\mathbf{F_1}+\mathbf{F_2}\Vert^2=\Vert\mathbf{F_1}\Vert^2+\Vert\mathbf{F_2}\Vert^2+2\,\mathbf{F_1}\cdot\mathbf{F_2}=F_1^2+F_2^2+2F_1F_2\cos\alpha$$
You can also obtain the second formula from Euclid's Law of Parallelograms, which states that the sum of the squares of the four sides is equal to the sum of the squares of the two diagonals. Subtract the first formula, the Law of Cosines, which gives the square of the length of the short diagonal, from the sum of the squares of the four sides, $$F_R^2 + F_D^2=2\Vert\mathbf{F_1}\Vert^2+2\Vert\mathbf{F_2}\Vert^2=2F_1^2+2F_2^2,$$ and you are left with the second formula, which gives the squared length of the long diagonal of the parallelogram.
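As a quick numerical sanity check (a minimal sketch with arbitrary magnitudes and angle, not taken from the figure), the two diagonal formulas and the parallelogram law can be verified directly:

```python
import math

# Arbitrary test values (assumptions for illustration only).
F1, F2, alpha = 3.0, 5.0, math.radians(40)

# Components of the two force vectors, with alpha the angle between them.
v1 = (F1, 0.0)
v2 = (F2 * math.cos(alpha), F2 * math.sin(alpha))

# Diagonals of the parallelogram: resultant (long) and difference (short).
FR = math.hypot(v1[0] + v2[0], v1[1] + v2[1])
FD = math.hypot(v1[0] - v2[0], v1[1] - v2[1])

# Law of Cosines forms quoted above.
FR_formula = math.sqrt(F1**2 + F2**2 + 2 * F1 * F2 * math.cos(alpha))
FD_formula = math.sqrt(F1**2 + F2**2 - 2 * F1 * F2 * math.cos(alpha))
print(abs(FR - FR_formula) < 1e-12, abs(FD - FD_formula) < 1e-12)   # True True

# Parallelogram law: sum of squared diagonals equals sum of squared sides.
print(abs(FR**2 + FD**2 - 2 * (F1**2 + F2**2)) < 1e-9)              # True
```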
Note: I have edited my original answer to take into account some comments and to add more generality, in particular stressing the superposition principle as a more general principle than the pair-wise additivity, the latter being a particular case of the former.
I wouldn't call the property you are referring to linearity. Its proper name is the superposition principle. In the simplest case, the superposition principle coincides with the pair-wise additivity of the interactions, i.e., we assume that if the force on body $i$ due to body $j$ alone is ${\bf F}_{ij}$, and that due to body $k$ alone is ${\bf F}_{ik}$, then the total force on $i$, due to the simultaneous presence of $j$ and $k$, is
$$
{\bf F}_{i} = {\bf F}_{ij}+{\bf F}_{ik}.
$$
Notice that both in the separate cases and in the combined case, each pair-wise force
(${\bf F}_{ij}$ and ${\bf F}_{ik}$) is a function only of quantities of the corresponding pair.
More generally, for an $n$-body system,
$$
{\bf F}_{i} = \sum_{j=1;j\neq i}^n{\bf F}_{ij}.
$$
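As a concrete illustration (a minimal sketch with made-up masses and positions, using Newtonian gravity purely as an example of a pair-wise force law), the total force on a body is just the vector sum of independent pair contributions:

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2

def pair_force(m_i, r_i, m_j, r_j):
    """Gravitational force on body i due to body j alone (a pair-wise law:
    it depends only on quantities of the pair i, j)."""
    dx, dy = r_j[0] - r_i[0], r_j[1] - r_i[1]
    d = math.hypot(dx, dy)
    f = G * m_i * m_j / d**2
    return (f * dx / d, f * dy / d)

def total_force(i, masses, positions):
    """Superposition as pair-wise additivity: sum the pair forces over j != i."""
    fx = fy = 0.0
    for j in range(len(masses)):
        if j == i:
            continue
        fjx, fjy = pair_force(masses[i], positions[i], masses[j], positions[j])
        fx += fjx
        fy += fjy
    return (fx, fy)

masses = [5.0, 3.0, 8.0]                           # kg, arbitrary
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]   # m, arbitrary
print(total_force(0, masses, positions))
```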
Superposition (or pair-wise additivity) is definitely a mathematical property different from the bi-linearity of the vector sum. The former has to do with the functional dependence of the contributions to the force on the body parameters, the latter with the operations defined on these functions.
Superposition, or the more specific pair-wise additivity of the forces, is often used in Newtonian mechanics and was taken for granted by Newton, but it is not a necessary condition. Indeed, it is quite easy to provide examples of more complicated force laws. Even more importantly, although rarely stressed in textbooks, the most accurate models of the effective forces among atoms or molecules in condensed matter are certainly not pair-wise additive (see the comment at the end).
Probably the simplest example of a force that is not pair-wise additive is the force between two neutral but polarizable particles, say $1$ and $2$. If only these two particles are present, the mutual forces are zero: ${\bf F}_{12}=0$ and ${\bf F}_{21}=0$. However, if we introduce a third, charged particle, say number $3$, both of the original particles acquire an induced electric dipole and, in addition to the dipole-charge interactions with the charged body, ${\bf F}_{21}\neq 0$ and ${\bf F}_{12} \neq 0$, due to the dipole-dipole interaction.
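To make the contrast concrete, here is a deliberately simplified one-dimensional sketch: collinear geometry, invented polarizability and charge values, and the standard collinear dipole-dipole energy $U=-2Kp_1p_2/d^3$; none of these numbers come from the answer above.

```python
# Schematic 1-D model of the non-additivity example (illustrative values only).
K = 8.9875e9       # Coulomb constant, N m^2 / C^2
ALPHA = 1e-40      # assumed polarizability of each neutral particle, C m^2 / V
Q = 1e-9           # assumed charge of particle 3, C

def field_from_charge(q, x_charge, x):
    """Axial electric field at x due to a point charge q sitting at x_charge."""
    d = x - x_charge
    return K * q / d**2 * (1 if d > 0 else -1)

def dipole_dipole_force(p1, p2, d):
    """Force between two collinear dipoles p1, p2 a distance d apart,
    from F = -dU/dd with U = -2 K p1 p2 / d^3 (negative = attractive)."""
    return -6 * K * p1 * p2 / d**4

x1, x2, x3 = 1.0, 2.0, 0.0   # positions on a line, metres

# Without particle 3: no field, no induced dipoles, no mutual force.
print(dipole_dipole_force(0.0, 0.0, x2 - x1))   # zero

# With particle 3 present: each neutral particle acquires p = alpha * E.
p1 = ALPHA * field_from_charge(Q, x3, x1)
p2 = ALPHA * field_from_charge(Q, x3, x2)
print(dipole_dipole_force(p1, p2, x2 - x1))     # non-zero
```

The point is only that the mutual force between $1$ and $2$ now depends on whether (and where) particle $3$ is present, which is exactly what pair-wise additivity forbids.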
A couple of final comments are in order:
- the formal structure of Newtonian mechanics is able to accommodate non-pairwise forces without problems. It is only the expression of the total force on each particle that is more complex. It should be clear that pair-wise non-additivity does not break the second-law relation between total force and acceleration. Simply put, there is nothing like the additivity of the accelerations due to the presence of different external bodies. This has nothing to do with the vector character of accelerations and forces, of course.
- if the example of the polarizable particles in the presence or absence of a charge seems too artificial, one should remember that the effective interactions among atoms in condensed matter always originate from a partial trace over electronic degrees of freedom. An example is the well-known Born-Oppenheimer approximation, where the interatomic interaction energy contains a many-body (i.e., non-pairwise) term corresponding to the ground-state energy of the electrons in the presence of fixed nuclei.
Best Answer
Everything in mathematics is abstract. The number 1 does not exist in the same way the earth does. However, some mathematical ideas can be good models for some parts of physical reality. The counting numbers--0, 1, 2, etc.--are useful if I want to know how many cars are on the road. Negative numbers, while just as valid mathematically, are not useful for this purpose. Negative numbers are useful for talking about altitude: positive numbers for above ground, negative numbers for underground.
As for vectors and forces[1], we can come at this from multiple directions. First, we can say that vectors are a good fit for describing forces through experiment. For example, if you push a block across a floor with 2 N of force at a 45$^\circ$ angle downwards, it will move the same as when you place a 1.4-N weight on top of the block and push it horizontally with a 1.4 N force ($\sqrt{1.4^2 + 1.4^2} \approx 2$). This is an experimental fact that supports describing forces as vectors that can be decomposed into components. Indeed, engineers of all kinds rely on forces acting like vectors in their designs, so the fact that the machines they build work is more experimental confirmation of the use of vectors.
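A quick way to check the arithmetic in this example (a small sketch using the 2 N and 45$^\circ$ values from the paragraph above):

```python
import math

# A 2 N push at 45 degrees below the horizontal has the same components as a
# 1.4 N horizontal push plus roughly 1.4 N of extra downward (weight-like) force.
F = 2.0                            # newtons
angle = math.radians(45)

horizontal = F * math.cos(angle)   # ~1.41 N pushing the block forward
vertical = F * math.sin(angle)     # ~1.41 N pressing the block into the floor

print(horizontal, vertical)              # ~1.414, ~1.414
print(math.hypot(horizontal, vertical))  # recombines to ~2.0 N
```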
Another way to think about vectors is to start with displacement. You can convince yourself that walking from one point to another along a vector will have the same result as walking along two vectors obtained by decomposing the first (that is, head-to-tail addition of vectors). If vectors describe displacement, then they can also describe velocity, since velocity is displacement divided by time. Similarly, since vectors work for velocities, they will work for accelerations, since acceleration is the difference of velocity vectors divided by time. Finally, by Newton's second law, vectors must work for forces, since a force is equal to the mass of the accelerating body times its acceleration.
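For instance, a tiny numerical version of the head-to-tail argument (arbitrary displacements and time interval, chosen only for illustration):

```python
# Walking straight along one displacement vs. walking along its two components
# head-to-tail ends at the same point; dividing by a time interval then turns
# displacement arithmetic into velocity arithmetic, and so on up the chain.
d = (3.0, 4.0)                          # a single displacement, metres
leg1, leg2 = (3.0, 0.0), (0.0, 4.0)     # its decomposition into two legs

end_direct = d
end_two_legs = (leg1[0] + leg2[0], leg1[1] + leg2[1])
print(end_direct == end_two_legs)       # True: same final position

dt = 2.0                                # seconds
v = (d[0] / dt, d[1] / dt)              # velocity components, m/s
print(v)                                # (1.5, 2.0)
```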
[1] As a side note, vectors were not the first mathematical entity used to describe forces and motion. Quaternions came before vectors by about half a century. Vectors turned out to be simpler to work with and were favored by the beginning of the 20th century. Quaternions are still in use today, since they are better at describing rotations; they appear in computer graphics and motion control (like the joints of robotic arms).
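Since the footnote mentions rotations, here is a minimal, hand-rolled sketch of the standard quaternion rotation $q v q^*$ (the helper functions and the example values are purely illustrative):

```python
import math

def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

def rotate(v, axis, angle):
    """Rotate vector v by `angle` about a unit `axis` via the sandwich q v q*."""
    half = angle / 2
    s = math.sin(half)
    q = (math.cos(half), axis[0]*s, axis[1]*s, axis[2]*s)
    q_conj = (q[0], -q[1], -q[2], -q[3])
    _, x, y, z = quat_mul(quat_mul(q, (0.0, *v)), q_conj)
    return (x, y, z)

# Rotating (1, 0, 0) by 90 degrees about the z-axis gives approximately (0, 1, 0).
print(rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.radians(90)))
```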