Is My Description of Systems of Equations Correct?

linear-algebra · logic · solution-verification · systems-of-equations

I am trying to summarize the concept of systems of equations for a formal written piece aimed at students. For this, I attempt to give a formal and complete description of the case where we wish to determine the unique solution of the system (that is, the one-solution case), together with a proof that solving such a system requires as many equations as variables.

My question is: is the description I provide valid and complete? And, as part of that, is the proof I give also valid and complete?

What I have so far follows; it starts with an example for motivation.

Any single valid equation in one variable determines exactly one unknown quantity. So if we were
asked the following question, we would have no way of answering it.

If a car can drive a number of miles equal to four times the amount of fuel it is
carrying in gallons, minus three times the number of minutes it has spent
running idly at traffic lights, and we know that a particular car drove
for one hundred miles after a refuel, how much fuel did the car start
with and how many minutes did it spend running idly since it refueled?

The reason is that we are trying to determine two unknowns with one
equation.
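To make this concrete, write $f$ for the gallons of fuel and $t$ for the idle minutes (variable names chosen here purely for illustration). The scenario gives the single equation

$$4f - 3t = 100,$$

which is satisfied by infinitely many pairs: $f = 25,\ t = 0$ works, but so does $f = 28,\ t = 4$, and so on. One equation cannot pin down two unknowns.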

The above scenario is precisely the reason that a
formal method was developed to determine the various quantities which
may be modeled analytically. As we just discussed, in order to
successfully and accurately represent the information being
queried, the number of equations must somehow be adjusted to account
for an increase in the number of variables being used.

How do we know exactly how many equations are required to correctly
derive the needed information? And, how do we use the various
equations to this end? Finally, do the equations need to satisfy
certain properties individually and collectively?

When we attempt to determine what quantities are represented by the
given equations and variables, we must keep in mind the following.

  1. All of the variables being used must be related to each of the other
    variables.
  2. The equations given must be distinct and consistent, and their number
    must match the number of variables.

Why are these two rules necessary?

For the first, if the variables we are solving for aren't related,
then the situation isn't much different from solving several
unrelated equations side by side, as the example below shows.
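For instance (a toy illustration), the "system"

$$x + 1 = 3, \qquad 2y = 8$$

is really just two independent one-variable problems: nothing links $x$ to $y$, so there is no genuine system to speak of.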

As for the second, while the car example above should provide a fairly
intuitive understanding of why that is so, a more formal treatment may
be summarized as follows.

Consider that a single equation in a single variable describes the
relationship between that variable and the constants present in the
equation. If we add another variable but leave the number of
equations unchanged, the equation holds only conditionally, for
specific simultaneous choices of both variables. If we then add another
equation (also relating the two variables across the system) which is
distinct (not a multiple of the original) and consistent (it neither
contradicts the first nor is self-contradictory), we will now
have a valid system of two equations which fully describes the
relation between both variables. In other words, for two related real
variables $x$ and $y,$ we have that

$$\begin{align} a_1x + a_2y &= c_1 \tag{1}\\ a_3x + a_4y &= c_2 \end{align}$$

where $c_1$ and $c_2$ are real numbers and $a_1, a_2, a_3, a_4$ are
real coefficients such that the system $(1)$ is a valid
system; furthermore, there is no non-zero real constant $m$ for which $a_1 = ma_3,\ a_2 = ma_4, \text{ and } c_1 = mc_2$ hold
simultaneously.
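To see what this condition rules out (numbers chosen here for illustration), consider the pair

$$x + y = 2, \qquad 2x + 2y = 4.$$

Here $m = 1/2$ gives $a_1 = ma_3$, $a_2 = ma_4$, and $c_1 = mc_2$ all at once: the first equation is half the second, so the two equations carry only one equation's worth of information, and infinitely many pairs $(x, y)$ satisfy both. Changing the second equation to $2x + 2y = 5$ instead makes the system inconsistent.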

Without loss of generality, we may assume that $a_1$ and
$a_4$ are non-zero. Thus this system may be solved as follows.

$$\begin{align} x &= (c_1 - a_2y)/a_1 \tag{2}\\ y &= (c_2 - a_3x)/a_4\end{align}$$

And in this way we obtain a complete solution to the system: either
both $a_2$ and $a_3$ are zero, in which case each equation yields its
variable directly, or the expression for one variable can be
substituted back into the equation other than the one it came from,
allowing us to solve for the other variable. A worked example follows.
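As a quick numeric illustration of the substitution in $(2)$ (coefficients chosen arbitrarily), take

$$2x + 3y = 8, \qquad x - y = -1.$$

The second equation gives $x = y - 1$; substituting into the first yields $2(y - 1) + 3y = 8$, so $5y = 10$, hence $y = 2$ and $x = 1$. Both equations check: $2(1) + 3(2) = 8$ and $1 - 2 = -1$.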

Let us now assume, by the principle of induction, that this property
holds for $n-1$ equations in $n-1$ variables, $n$
being an integer greater than one. This tells us that at least as
many coefficients are non-zero as are required for the system
to be valid. That is, enough variables appear in the system, in the
same manner as in $(2)$, to allow us to
solve for the $n-1$ variables in the $n-1$ equation system.

If we now add one more variable across the system, together with a
corresponding valid equation, and make the justified assumption that
the coefficient multiplying the new variable is non-zero in
at least the new equation, we may solve for this new variable in terms
of all the other variables in the new equation. And since, by the
induction hypothesis, we already have the complete solution to the
$n-1$ equation system, we know the value of every other variable
already and can use that information to compute the value of the new
variable from the rearranged form of the new equation. Thus, a valid,
completely solvable system of $n-1$ equations assures a complete
solution to the valid system of $n$ equations in $n$ variables.
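A minimal computational sketch of the claim, using SymPy's `linsolve` (the library call, variable names, and coefficients below are my own illustration, not part of the write-up itself):

```python
from sympy import symbols, Eq, linsolve

x, y, z = symbols('x y z')

# Three distinct, consistent equations in three variables;
# the coefficients are chosen arbitrarily for illustration.
system = [
    Eq(2*x + y - z, 3),
    Eq(x - y + 2*z, 0),
    Eq(3*x + 2*y + z, 7),
]

# linsolve returns the full solution set; for a valid n-by-n
# system it contains exactly one point.
print(linsolve(system, x, y, z))  # {(4/5, 2, 3/5)}
```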

I am very grateful for any help on this. Thank you for reading through what I have written and providing feedback.

Sincerely,

ThisIsNotAnId

Best Answer

I think examples are good for motivation, but I personally might leave the formal justification and “proof writeup” for after we've learned concepts such as linear independence, rank, and elementary row operations (and you can mention this at the start, as a foreshadowing/preview of things to come). After all, that's kind of what those concepts/definitions are for and why they are the standard: they capture the ideas we want to capture.

For example, to justify that the $n \times n$ system $Ax = b$ has a unique solution if and only if the rows of $A$ are linearly independent, we can do it as follows:

Suppose the rows are linearly independent. Note that elementary row operations don’t change the set of solutions, so we may set out to reduce the matrix in an attempt to find the solutions of our system. Furthermore, they don’t change the row space, so the RREF of $A$ must be the identity matrix (otherwise we get a zero row, which implies fewer than $n$ linearly independent rows in the row space, contradicting the assumption of linear independence). In other words, we get a unique solution.

Conversely, if the rows are linearly dependent, then after row-reduction some row must be zero (otherwise we get $n$ linearly independent row vectors, contradicting the assumption of linear dependence). In particular, this means there is at least one free variable, so either our system has no solution or infinitely many solutions. In particular, it cannot have a unique solution.
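A quick way to see both directions computationally (a sketch using SymPy's `Matrix.rref`; the matrices are toy examples of my own):

```python
from sympy import Matrix

# Linearly independent rows: the RREF is the identity,
# so Ax = b has a unique solution for every b.
A = Matrix([[1, 2],
            [3, 4]])
print(A.rref()[0])  # Matrix([[1, 0], [0, 1]])

# Linearly dependent rows (the second row is twice the first):
# the RREF has a zero row, which leaves a free variable, so a
# system with this coefficient matrix has no solution or
# infinitely many.
B = Matrix([[1, 2],
            [2, 4]])
print(B.rref()[0])  # Matrix([[1, 2], [0, 0]])
```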

However, there might be a nice way to explain all this in simple layman's terms, but I don't know how to do it beyond saying something like "the equations have to be independent in a sense; no one of them can be written as a linear combination of the others. Otherwise, we will have some degree of freedom (an underdetermined system), meaning there cannot be a unique solution."
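As a tiny illustration of that dependence (equations invented for the example): in

$$x + y + z = 6, \qquad x - y + z = 2, \qquad 2x + 2z = 8,$$

the third equation is the sum of the first two, so only two of the three are independent. We can deduce $y = 2$ and $x + z = 4$, but $z$ remains a free variable: the system is underdetermined and has infinitely many solutions.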
