For (2), reinstate the expression for the $k_i$'s and see what this gives you (there was a typo in what you wrote, now corrected).
For (3), a strategy here would be to show that
$$-2\ln \mathcal{L}(\beta; y) = -2 \ln f(y ; \beta) +2\ln f(y ; \hat{\beta})$$
(with $\mathcal{L}$ denoting the likelihood ratio) is the difference of two chi-squares, one with one degree of freedom less than the other (sums of independent chi-squares are again chi-square, and a difference like this one is chi-square when the pieces are independent).
We have that
$$-2\ln \mathcal{L}(\beta; y) = \sum_{i=1}^n\left(\frac {y_i -\beta x_i}{\sigma}\right)^2 - \sum_{i=1}^n\left(\frac {y_i -\hat \beta x_i}{\sigma}\right)^2$$
Given the assumptions, the first sum is a sum of $n$ independent squared standard normals, and so a chi-square with $n$ degrees of freedom. So what you have to show is that the second sum is a chi-square with $n-1$ degrees of freedom. Intuitively, we should expect so, because what you have is again a sum of squared normal random variables, but standardized by their estimated mean rather than the true one (so "you lose one degree of freedom"). This can be viewed as an application of the general form of Cochran's Theorem. Can you take it from here?
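If it helps to see the claim in action before proving it, here is a small Monte Carlo sketch (the values of $n$, $\beta$, $\sigma$ and the design are hypothetical): it simulates the no-intercept model and checks that the standardized residual sum of squares behaves like a $\chi^2_{n-1}$.

```python
import numpy as np
from scipy import stats

# Simulate y_i = beta * x_i + eps_i with known sigma and check that the
# residual sum of squares (standardized) looks chi-square with n - 1 df.
rng = np.random.default_rng(0)
n, beta, sigma = 20, 2.0, 1.5            # hypothetical values for illustration
x = rng.uniform(1.0, 3.0, size=n)        # fixed design, reused across replications

def rss_standardized():
    y = beta * x + rng.normal(0.0, sigma, size=n)
    beta_hat = np.sum(x * y) / np.sum(x * x)   # ML/OLS slope through the origin
    return np.sum(((y - beta_hat * x) / sigma) ** 2)

draws = np.array([rss_standardized() for _ in range(50_000)])

# Mean and variance should be close to n - 1 and 2(n - 1), and the KS test
# against chi2(n - 1) should not reject.
print(draws.mean(), draws.var())                      # ~19 and ~38 here
print(stats.kstest(draws, stats.chi2(df=n - 1).cdf))  # large p-value expected
```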
In maximum likelihood estimation, we calculate
$$\hat \beta_{ML}: \sum \frac {\partial \ln f(\epsilon_i)}{\partial \beta} = \mathbf 0 \implies \sum \frac {f'(\epsilon_i)}{f(\epsilon_i)}\mathbf x_i = \mathbf 0$$
the last relation taking into account the linearity structure of the regression equation.
In comparison, the OLS estimator satisfies
$$\sum \epsilon_i\mathbf x_i = \mathbf 0$$
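As a quick numerical illustration of this comparison (hypothetical data, and Gaussian errors assumed for the ML fit), the sketch below checks the OLS orthogonality condition directly and shows that, under normality, maximizing the likelihood reproduces the OLS coefficients:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
y = 0.5 + 2.0 * x + rng.normal(scale=0.7, size=n)
X = np.column_stack([np.ones(n), x])

# OLS: the residuals are orthogonal to the regressors, as in the display above.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_ols
print(X.T @ resid)            # ~ [0, 0]

# ML under Gaussian errors, maximized numerically (sigma parameterized by log).
def neg_loglik(params):
    b0, b1, log_s = params
    r = y - b0 - b1 * x
    s2 = np.exp(2.0 * log_s)
    return 0.5 * n * np.log(2.0 * np.pi * s2) + 0.5 * np.sum(r**2) / s2

beta_ml = minimize(neg_loglik, x0=np.zeros(3), method="BFGS").x[:2]
print(beta_ols, beta_ml)      # the two estimates coincide up to optimizer tolerance
```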
In order to obtain identical algebraic expressions for the slope coefficients we need to have a density for the error term such that
$$\frac {f'(\epsilon_i)}{f(\epsilon_i)} = \pm \;c\epsilon_i \implies f'(\epsilon_i)= \pm \;c\epsilon_if(\epsilon_i)$$
These are differential equations of the form $y' = \pm\; cxy$ that have solutions
$$\int \frac 1 {y}dy = \pm\; c\int x\, dx\implies \ln y = \pm\;\frac 12 cx^2 + \text{const.}$$
$$ \implies y = f(\epsilon) \propto \exp\left \{\pm\;\frac 12 c\epsilon^2\right\}$$
Any function that has this kernel and integrates to unity over an appropriate domain will make the MLE and OLS estimators of the slope coefficients identical. Namely, we are looking for
$$g(x)= A\exp\left \{\pm\;\frac 12 cx^2\right\} : \int_a^b g(x)dx =1$$
Is there such a $g$ that is not the normal density (or the half-normal or the derivative of the error function)?
Certainly. But one more thing one has to consider is the following: if one uses the plus sign in the exponent and, say, a symmetric support around zero, one obtains a density with a unique minimum in the middle and two local maxima at the boundaries of the support, as the sketch below illustrates.
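Here is a minimal numerical sketch of such a construction (the plus sign, $c=1$, and the support $[-1,1]$ are illustrative choices): it normalizes the kernel on a bounded support, confirms it integrates to one, and exhibits the minimum-in-the-middle shape just described.

```python
import numpy as np
from scipy.integrate import quad

# g(x) = A * exp(+ x^2 / 2) on [-a, a], with A chosen so g integrates to one.
a, c = 1.0, 1.0                               # hypothetical support and constant
kernel = lambda t: np.exp(0.5 * c * t**2)
A = 1.0 / quad(kernel, -a, a)[0]
g = lambda t: A * kernel(t)

print(quad(g, -a, a)[0])   # ~ 1.0: a valid (non-normal) density
print(g(0.0), g(a))        # unique minimum at 0, maxima at the boundaries
```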
Best Answer
A maximum likelihood estimator has the nice property that it is invariant under transformations: if $\theta_{MLE}$ is the MLE for $\theta$, then for any function $g$, $g(\theta_{MLE})$ is the MLE for $g(\theta)$.
This can be directly applied to your problem. Hint: what is the MLE for $(\beta, \sigma)$?
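To see the invariance property in code (an illustrative example, not your specific problem): maximize a normal likelihood once in $\sigma$ and once in the reparameterization $v=\sigma^2$, and check that the two optima are linked by $g(\sigma)=\sigma^2$.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Normal sample with known mean 0; true sigma = 1.3 (hypothetical).
rng = np.random.default_rng(2)
y = rng.normal(0.0, 1.3, size=500)

def nll_sigma(s):   # negative log-likelihood in sigma
    return 0.5 * len(y) * np.log(2 * np.pi * s**2) + np.sum(y**2) / (2 * s**2)

def nll_var(v):     # the same likelihood reparameterized in v = sigma^2
    return 0.5 * len(y) * np.log(2 * np.pi * v) + np.sum(y**2) / (2 * v)

s_hat = minimize_scalar(nll_sigma, bounds=(0.1, 10), method="bounded").x
v_hat = minimize_scalar(nll_var, bounds=(0.01, 100), method="bounded").x
print(s_hat**2, v_hat)   # equal up to tolerance: g(MLE) is the MLE of g(theta)
```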