Welcome to MSE!
First, your idea regarding Lagrange interpolation is a great one! The key insight is to look at particular coefficients in order to make your life easier.
So, let's let $a_1, \ldots, a_n$ be some distinct complex numbers (distinctness is what makes the interpolation below well defined). Then let's find a polynomial interpolating the points
$(a_1, 1), (a_2, 1), \ldots, (a_n, 1)$.
On the one hand, we know that this is the constant $1$ polynomial. On the other, we know it's some horrible linear combination:
$$
1 = \sum_{k=1}^n \prod_{j \neq k} \frac{x - a_j}{a_k - a_j}
$$
But now let's compare the $x^{n-1}$ coefficient of both sides (assuming $n \geq 2$, so that the left hand side really has no $x^{n-1}$ term)! On the left hand side, we get $0$. On the right hand side, we get $\sum_{k=1}^n \prod_{j \neq k} \frac{1}{a_k - a_j}$.
Similarly, let's compare the $x^0$ coefficients! On the left hand side we get $1$, and on the right hand side we get $\sum_{k=1}^n \prod_{j \neq k} \frac{-a_j}{a_k - a_j} = \sum_{k=1}^n \prod_{j \neq k} \frac{a_j}{a_j - a_k}$.
To see why these are both true, let's separate the numerators and denominators of our polynomial:
$$
1 =
\sum_{k=1}^n
\left (
\prod_{j \neq k} \left ( \frac{1}{a_k - a_j} \right )
\prod_{j \neq k} \left ( x - a_j \right )
\right )
$$
Now, if we want the $x^{n-1}$ term, we need to get the $x^{n-1}$ term of each product. But imagine foiling out $\prod_{j \neq k} (x - a_j)$. The coefficient of $x^{n-1}$ is $1$. If we do this for each summand, we find
$$
0 x^{n-1} = \sum_{k=1}^n \prod_{j \neq k} \left ( \frac{1}{a_k - a_j} \right ) x^{n-1}
$$
which gives the claim.
Similarly, if we want the constant term, we need to get the constant term of each product. Again, if we imagine foiling out the $\prod_{j \neq k} (x - a_j)$, we find the constant term is exactly $\prod_{j \neq k} (-a_j)$. Doing this to each term of the sum shows
$$
1 =
\sum_{k=1}^n
\left (
\prod_{j \neq k} \left ( \frac{1}{a_k - a_j} \right )
\prod_{j \neq k} (- a_j)
\right )
$$
Now, if we recombine these products (and juggle some minus signs: $\prod_{j \neq k} (-a_j) = (-1)^{n-1} \prod_{j \neq k} a_j$, and flipping each of the $n-1$ denominators from $a_k - a_j$ to $a_j - a_k$ absorbs exactly that $(-1)^{n-1}$), we get the desired
$$
1 = \sum_{k=1}^n \prod_{j \neq k} \frac{a_j}{a_j - a_k}.
$$
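As a quick sanity check (independent of the proof), both coefficient identities can be verified with exact rational arithmetic for a concrete choice of distinct nodes; the small `prod` helper below is just a convenience:

```python
from fractions import Fraction
from functools import reduce
from operator import mul

# Exact check of both identities for a sample of distinct rational nodes.
a = [Fraction(v) for v in (1, 2, 5, -3, 7)]
n = len(a)

def prod(xs):
    return reduce(mul, xs, Fraction(1))

# x^(n-1) coefficient of the interpolant of the constant 1: should be 0
lead = sum(prod(1 / (a[k] - a[j]) for j in range(n) if j != k)
           for k in range(n))

# constant coefficient: should be 1
const = sum(prod(a[j] / (a[j] - a[k]) for j in range(n) if j != k)
            for k in range(n))

print(lead, const)  # 0 1
```

Since `Fraction` arithmetic is exact, getting `0` and `1` here is not a floating-point coincidence.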
Now, let's address some of your closing questions.
- Where might one begin with these problems?
Exactly where you began! Working out lots of concrete examples, conjecturing something to be true, and then trying to prove it! Your idea to look at Lagrange interpolation was an excellent one, and it was only the idea to look at the polynomials one coefficient at a time that you were lacking. I'm sure if you spent another day or two with this problem you would have gotten it.
As a slightly more productive answer, you need to know what the right level of generality is when working out "concrete" examples. For instance, it's probably best in this problem to keep the $a_i$ as symbolic constants, rather than plugging in specific numbers like, say, $a_1 = 1, a_2 = 2, a_3 = 3$ (note the $a_i$ have to be distinct for the interpolation to make sense at all). That said, it is fruitful to work with the special cases $n = 1, 2, 3, 4, 5$, say. These can all be worked out without too much hassle (especially if you use a computer algebra system, as you mention you did), and can give you some useful information.
In particular, the notion of a normal form is extremely useful, and whenever you're working with two objects that you know are equal, you should try and put them into some kind of normal form and compare the pieces separately. For this problem, that means expanding the Lagrange polynomial into its normal form
$c_0 + c_1 x + c_2 x^2 + \ldots + c_{n-1} x^{n-1}$. If you had done this for some small examples, you would have doubtless seen the pattern in the coefficients that later became the proof.
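To illustrate, here is a small sketch of that computation in `sympy` (assuming `sympy` is available; the variable names are just illustrative), expanding the interpolant for $n = 3$ with symbolic nodes and reading off its coefficients:

```python
import sympy as sp

x, a1, a2, a3 = sp.symbols('x a1 a2 a3')
a = [a1, a2, a3]
n = len(a)

# Lagrange interpolant through (a_k, 1) for k = 1..n, kept in expanded form
L = sp.expand(sum(
    sp.Mul(*[(x - a[j]) / (a[k] - a[j]) for j in range(n) if j != k])
    for k in range(n)
))

# Read off the coefficients of x^0, x^1, x^2 and simplify each
c = [sp.simplify(L.coeff(x, k)) for k in range(n)]
print(c)  # [1, 0, 0]: only the constant term survives
```

Seeing the higher coefficients simplify to $0$ for a few small $n$ is exactly the pattern that becomes the proof above.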
In an informal way, problems that involve both addition and multiplication simultaneously are "hard". This "prototheorem" can be made precise in a number of ways (see Skolem arithmetic and Presburger arithmetic compared with Robinson arithmetic, for instance), but informally it permeates lots of questions. There are lots of open problems whose difficulty lies, in some sense, in trying to see how addition and multiplication interact. This means that problems that blend addition and multiplication are often trickier than problems which only feature one of the two. I'm sure this is also accurate to your lived experience.
Of course, I'm sure there are some tricks for dealing with these things, but I'm far from an expert, so I don't know any off the top of my head. I tend to tackle these on a fairly ad-hoc basis, and would also love to hear about a more systematic approach!
I hope this helps ^_^
Best Answer
Your conjecture for $p\equiv 1 \pmod{4}$ is correct except that it omits the class number, which makes the smallest counterexample $p=229$. It follows from the class number formula.
Consider the case $p\equiv 1 \pmod{4}$. Let $K=\mathbb{Q}(\sqrt{p})$, let $\varepsilon > 1$ be the fundamental unit, and note that $K$ is associated with the quadratic character $\chi = (\cdot \mid p)$. The factorization of the Dedekind zeta function, $\zeta_K(s) = \zeta(s) L(s,\chi)$, implies
$$ \frac{2h \log \varepsilon}{\sqrt{p}} = L(1,\chi) = -\frac{2G(\chi)}{p} \sum_{k=1}^{(p-1)/2} \chi(k)\log\left(\sin \frac{k \pi}{p}\right)$$ where $h$ is the class number of $K$ and $G(\chi)$ is the Gauss sum $\sum_k \chi(k) e^{2\pi i k /p}$. It is a non-trivial result that $G(\chi) = \sqrt{p}$ (when $p\equiv 1 \pmod{4}$). Some rearrangement (using $\prod_{k=1}^{(p-1)/2} 2\sin\frac{k\pi}{p} = \sqrt{p}$ and the fact that $k^2$ for $1 \le k \le (p-1)/2$ runs over the quadratic residues mod $p$) gives $$\varepsilon^h = \frac{\sqrt p}{2^{(p-1)/2}}\prod_{k=1}^{(p-1)/2}\csc\frac{\pi k^2}p$$
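As a quick numerical sanity check (the function name `rhs` is just illustrative), the right-hand side matches $\varepsilon^h$ for the first two primes $p \equiv 1 \pmod 4$: $p=5$ with $\varepsilon = (1+\sqrt 5)/2$, $h=1$, and $p=13$ with $\varepsilon = (3+\sqrt{13})/2$, $h=1$:

```python
import math

def rhs(p):
    # sqrt(p)/2^((p-1)/2) * prod_{k=1}^{(p-1)/2} csc(pi k^2 / p)
    m = (p - 1) // 2
    prod = 1.0
    for k in range(1, m + 1):
        prod /= math.sin(math.pi * k * k / p)
    return math.sqrt(p) / 2**m * prod

# Compare with epsilon^h = epsilon for p = 5 and p = 13 (both have h = 1)
print(rhs(5), (1 + math.sqrt(5)) / 2)
print(rhs(13), (3 + math.sqrt(13)) / 2)
```

Each line should print a matching pair up to floating-point error; checking $p = 229$ the same way (with $h > 1$) is what exposes the missing class number in the original conjecture.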