[Math] Given a data set, how do you do a sinusoidal regression on paper? What are the equations, algorithms

regression statistics

Most regressions are easy, even trivial once you know how to do them. Most involve substitutions that transform the problem into a linear regression. But I have yet to figure out how to do a sinusoidal regression. I'm looking for the concept behind the results. I don't need Excel, TI, or CAS answers; I would like to see equations, methods, and so on. How would you do it on paper, if you were actually willing to do it on paper?


For clarification, I'm referring to the most general sinusoidal regression $$ y = A\sin(Bx+C) + D$$

Not some special case assuming the values of one of these constants.

Best Answer

The Gauss-Newton algorithm deals directly with this type of problem. Given $m$ data points $(x_i,y_i)$ and a model function $f$ with $n$ parameters $\vec \beta =(\beta_1,...,\beta_n)$, the goal is $$\min_{\vec \beta }\ S(\vec \beta)\quad \text{where}\quad S(\vec \beta)=\sum_{i=1}^m r_i(\vec \beta)^2=\sum_{i=1}^m \big(y_i-f(\vec \beta,x_i)\big)^2$$

I skip the derivation of the algorithm, which you can find in any textbook (first use a Taylor approximation, then use Newton's method). Each iteration computes a step and updates the parameters: $$\Delta \vec \beta=\big(J^T\ J\big)^{-1}\ J^T\ \vec r$$ $$\vec \beta \leftarrow \vec \beta + \alpha\ \Delta \vec \beta$$ where $\alpha$ is a damping coefficient and $$J=\begin{pmatrix}\bigg(\frac{\partial f}{\partial \beta_1}\bigg)_{x=x_1}&...&\bigg(\frac{\partial f}{\partial \beta_n}\bigg)_{x=x_1}\\ ...&...&...\\ \bigg(\frac{\partial f}{\partial \beta_1}\bigg)_{x=x_m}&...&\bigg(\frac{\partial f}{\partial \beta_n}\bigg)_{x=x_m}\end{pmatrix}\quad \vec r=\begin{pmatrix}y_1-f(\vec \beta,x_1)\\ ... \\ y_m-f(\vec \beta,x_m) \end{pmatrix}$$

For your specific case, $f=A\sin(Bx+C)+D$ with $\vec \beta=(A,B,C,D)$, so the entries in row $i$ of $J$ are $$\frac{\partial f}{\partial A}=\sin(Bx_i+C)$$ $$\frac{\partial f}{\partial B}=Ax_i\cos(Bx_i+C)$$ $$\frac{\partial f}{\partial C}=A\cos(Bx_i+C)$$ $$\frac{\partial f}{\partial D}=1$$
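
For readers who want the skipped step, here is a brief sketch of the standard linearization argument in the notation above (the trial step $\vec \delta$ is a symbol introduced here for illustration only). Near the current estimate, a first-order Taylor expansion gives $$f(\vec \beta+\vec \delta,x_i)\approx f(\vec \beta,x_i)+\sum_{j=1}^n\bigg(\frac{\partial f}{\partial \beta_j}\bigg)_{x=x_i}\delta_j$$ so the residual vector becomes approximately $\vec r-J\vec \delta$, and $$S(\vec \beta+\vec \delta)\approx\|\vec r-J\vec \delta\|^2=\vec r^T\vec r-2\,\vec \delta^TJ^T\vec r+\vec \delta^TJ^TJ\,\vec \delta$$ Setting the gradient with respect to $\vec \delta$ to zero yields the normal equations $$J^TJ\,\vec \delta=J^T\vec r$$ whose solution is the step $\Delta \vec \beta$ given above, assuming $J^TJ$ is invertible. Each iteration is therefore an ordinary linear least-squares problem; for the four-parameter sinusoid that means one $4\times4$ linear system per step, which is what makes the method feasible on paper.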

In Matlab I generated 60 uniformly distributed random sample points and used them to evaluate a known sine curve, $$y=0.5\sin(1.2x+0.3)+0.6$$ I added error terms $N(0,0.2)$ to each point. My initial guess was $A=0.1,\ B=0.5,\ C=0.9,\ D=0.1$, and I set the damping coefficient to 0.01. After 407 iterations the algorithm arrived at the approximation $$\hat y=0.497\sin(1.178x+0.352)+0.580$$ The graph below shows the result (black: original curve $y(x)$; red: curve with error terms $y(x)+\epsilon (x)$; blue: fitted curve $\hat y(x)$).
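
The experiment above could be sketched roughly as follows. This is Python with NumPy, not the original Matlab code, and several details are assumptions rather than facts from the answer: the sampling interval $[0, 10]$, reading $N(0,0.2)$ as a standard deviation of 0.2, the random seed, and the helper names `model`, `jacobian`, and `gauss_newton`. The step is computed with `np.linalg.lstsq`, which solves the same normal equations $J^TJ\,\Delta \vec \beta=J^T\vec r$ without forming the inverse explicitly.

```python
import numpy as np


def model(beta, x):
    """Evaluate y = A*sin(B*x + C) + D for beta = (A, B, C, D)."""
    A, B, C, D = beta
    return A * np.sin(B * x + C) + D


def jacobian(beta, x):
    """m-by-4 matrix of partial derivatives of the model w.r.t. A, B, C, D."""
    A, B, C, _ = beta
    phase = B * x + C
    return np.column_stack([
        np.sin(phase),          # df/dA
        A * x * np.cos(phase),  # df/dB
        A * np.cos(phase),      # df/dC
        np.ones_like(x),        # df/dD
    ])


def gauss_newton(x, y, beta0, alpha=1.0, n_iter=100):
    """Damped Gauss-Newton: beta <- beta + alpha * (J^T J)^{-1} J^T r."""
    beta = np.asarray(beta0, dtype=float).copy()
    for _ in range(n_iter):
        r = y - model(beta, x)   # residual vector
        J = jacobian(beta, x)
        # Least-squares solve of J @ delta = r, equivalent to the normal
        # equations J^T J delta = J^T r when J has full column rank.
        delta, *_ = np.linalg.lstsq(J, r, rcond=None)
        beta = beta + alpha * delta
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)        # seed is an arbitrary choice
    x = rng.uniform(0.0, 10.0, size=60)   # interval [0, 10] is assumed
    noise = rng.normal(0.0, 0.2, size=60) # 0.2 read as standard deviation
    y = model((0.5, 1.2, 0.3, 0.6), x) + noise
    beta_hat = gauss_newton(x, y, beta0=(0.1, 0.5, 0.9, 0.1),
                            alpha=0.01, n_iter=407)
    print("A, B, C, D =", beta_hat)
```

Whether this run lands near the values reported above depends on the random draw and on the sampling interval, neither of which the answer specifies. Gauss-Newton only finds a local minimum, so a starting frequency $B$ far from the true value can lead to a different fit.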

Results