Fitting points to curve $g(t) = \frac{100}{1+\alpha e^{-\beta t}}$ by thinking about projections and inner products

Tags: linear-algebra, multivariable-calculus, numerical-linear-algebra, numerical-methods

This is a reinterpretation of my old question *Fit data to function $g(t) = \frac{100}{1+\alpha e^{-\beta t}}$ by using least squares method (projection/orthogonal families of polynomials)*. I need to understand the problem in terms of orthogonal projections and inner products, but the answers there used common regression techniques.

| $t$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| $F(t)$ | 10 | 15 | 23 | 33 | 45 | 58 | 69 |

Fit $F$ with a function of the form $$g(t) = \frac{100}{1+\alpha
e^{-\beta t}}$$
using the discrete least squares method.

First of all, we cannot work with the function $g(t)$ as it is. The way I'm trying to see the problem is via projections.

So let's try to transform the problem like this:

$$\frac{100}{g(t)}-1 = \alpha e^{-\beta t}\implies \ln \left(\frac{100}{g(t)}-1\right) = \ln \alpha -\beta t$$

Since we want to fit the function to the points, we want to minimize the squared distance between the transformed data and the linear model, that is:

$$\min_{\alpha,\beta} \sum_i\left(\ln\left(\frac{100}{F(t_i)}-1\right)-\ln\alpha + \beta t_i\right)^2$$

Without taking derivatives and setting them to $0$, there's a way to see this as an orthogonal projection problem.

I know I need to end up with something like this:

$$\left\langle \ln\left(\frac{100}{F(t)}-1\right)-\ln\alpha + \beta t,\ 1\right\rangle = 0\\ \left\langle \ln\left(\frac{100}{F(t)}-1\right)-\ln\alpha + \beta t,\ t\right\rangle=0$$

And I know this comes from the fact that the minimizer is an orthogonal projection: the residual must be orthogonal to $\operatorname{span}\{1, t\}$ (the span arising from the parameters $\ln\alpha$ and $\beta$), so its inner product with $1$ and with $t$ must be $0$.

In order to end up with

$$\begin{bmatrix}
\langle 1,1\rangle & \langle t,1\rangle \\
\langle 1,t\rangle & \langle t,t\rangle \\
\end{bmatrix} \begin{bmatrix}
\ln \alpha \\
-\beta \\
\end{bmatrix}= \begin{bmatrix}
\langle \ln\left(\frac{100}{F(t)}-1\right) , 1\rangle \\
\langle \ln\left(\frac{100}{F(t)}-1\right) , t\rangle \\
\end{bmatrix}$$

Where the inner product is

$$\langle f,g\rangle = \sum_i f(t_i)\, g(t_i) $$
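
To make this concrete, here is a minimal numerical sketch of these inner products applied to the data (my own illustration, assuming Python with NumPy; the variable names are hypothetical):

```python
import numpy as np

# Data points from the table above
t = np.arange(7.0)                          # t = 0, 1, ..., 6
F = np.array([10, 15, 23, 33, 45, 58, 69], dtype=float)

# Linearized observations: ln(100/F_i - 1) = ln(alpha) - beta * t_i
h = np.log(100.0 / F - 1.0)

def inner(f, g):
    """Discrete inner product <f, g> = sum_i f(t_i) g(t_i)."""
    return np.sum(f * g)

one = np.ones_like(t)

# Gram matrix and right-hand side of the projection (normal) equations
G = np.array([[inner(one, one), inner(t, one)],
              [inner(one, t),   inner(t, t)]])
rhs = np.array([inner(h, one), inner(h, t)])

ln_alpha, minus_beta = np.linalg.solve(G, rhs)
print(np.exp(ln_alpha), -minus_beta)        # alpha ~ 9.17, beta ~ 0.504
```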

Can someone tell me what reasoning leads to the inner products above (and why this particular inner product?), whether I did everything right, and how to finish the exercise?

Best Answer

$\color{brown}{\textbf{Via linear model}}$

Let $$h(t) = \ln\left(\dfrac{100}{g(t)}-1\right),\tag1$$ then the data table $(2)$ is

| $i$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| $t_i$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
| $g_i$ | 10 | 15 | 23 | 33 | 45 | 58 | 69 |
| $h_i$ | 2.197225 | 1.734631 | 1.208311 | 0.708185 | 0.200671 | $-0.322773$ | $-0.800119$ |
| $h(t_i)$ | 2.215988 | 1.711902 | 1.207816 | 0.703730 | 0.199644 | $-0.304442$ | $-0.808528$ |
| $g(t_i)$ | 9.83239 | 15.29172 | 23.00877 | 33.09858 | 45.02541 | 57.55280 | 69.17958 |
| $r(t_i)$ | 0.16761 | $-0.29172$ | $-0.00877$ | $-0.09858$ | $-0.02541$ | 0.44720 | $-0.17958$ |
| $g_1(t_i)$ | 9.83245 | 15.29853 | 23.02728 | 33.13320 | 45.07696 | 57.61634 | 69.2460 |

The task is to estimate the parameters of the function $h(t)$ in the form $$h(t) = \ln\alpha + \beta_* t,\qquad \beta_* = -\beta.\tag 3$$

The least squares method minimizes the discrepancy function $$d_h(\alpha,\beta_*) = \sum\limits_{i=1}^7 (\ln\alpha + \beta_* t_i - h_i)^2\tag 4$$ as a function of the parameters $\alpha$ and $\beta_*.$

The minimum of this quadratic function is achieved at its single stationary point, which is defined by the system $(d_h)'_{\ln\alpha} = (d_h)'_{\beta_*}= 0,$ or \begin{cases} 2\sum\limits_{i=1}^7 (\ln\alpha + \beta_* t_i - h_i) = 0\\ 2\sum\limits_{i=1}^7 (\ln\alpha + \beta_* t_i - h_i)\,t_i = 0.\tag5 \end{cases}

The system $(5)$ can be written in the form \begin{cases} 7\ln\alpha + a_1 \beta_* = b_1\\ a_1\ln\alpha + a_2 \beta_* = b_2, \end{cases} where $$a_1 = \sum\limits_{i=1}^7 t_i = 21,\quad a_2 = \sum\limits_{i=1}^7 t_i^2 = 91,$$ $$b_1 = \sum\limits_{i=1}^7 h_i = 4.926100,\quad b_2 = \sum\limits_{i=1}^7 t_i h_i = 0.663879.$$ The determinants are $$\Delta = \begin{vmatrix}7 & 21 \\ 21 & 91\end{vmatrix} = 196,$$ $$\Delta_1 = \begin{vmatrix}4.926100 & 21 \\ 0.663879 & 91\end{vmatrix} \approx 434.33364,$$ $$\Delta_2 = \begin{vmatrix} 7 & 4.926100 \\ 21 & 0.663879 \end{vmatrix} \approx -98.80095.$$

Then $$\alpha = e^{\large \frac{\Delta_1}\Delta} \approx 9.170465,\quad \beta = -\beta_* = -\dfrac{\Delta_2}\Delta \approx 0.504086,$$ $$d_h(\alpha, \beta) \approx 0.001295,\quad d_g(\alpha, \beta)\approx 0.355863.$$
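
These values are easy to reproduce numerically; below is a minimal sketch (my own check, assuming Python with NumPy) that forms $a_1, a_2, b_1, b_2$ and applies Cramer's rule:

```python
import numpy as np

t = np.arange(7.0)
g = np.array([10, 15, 23, 33, 45, 58, 69], dtype=float)
h = np.log(100.0 / g - 1.0)             # h_i = ln(100/g_i - 1)

a1, a2 = t.sum(), (t ** 2).sum()        # 21, 91
b1, b2 = h.sum(), (t * h).sum()         # ~ 4.9261, ~ 0.6639

Delta  = 7 * a2 - a1 * a1               # 196
Delta1 = b1 * a2 - a1 * b2              # ~ 434.33
Delta2 = 7 * b2 - b1 * a1               # ~ -98.80

alpha = np.exp(Delta1 / Delta)          # ~ 9.170465
beta  = -Delta2 / Delta                 # ~ 0.504086
print(alpha, beta)
```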

The results of these calculations, shown in table $(2),$ confirm the obtained parameter values.

$\color{brown}{\textbf{Orthogonal projections approach}}$

The method of orthogonal projections is typically used for problems of large dimension. Its essence, for the given data, is that the parameters of the linear model are computed one at a time.

At each stage, the dependence already fitted is subtracted from the data.

In the given case, the data after the first stage show no essential correlations. A linear approximation of the difference $r_i = g_i - g(t_i)$ in the form $$r_i = -0.043425+0.014987\, t_i$$ gives $d_r = 0.349557$.
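
This residual fit can be checked with an ordinary linear least-squares fit (a sketch of mine, assuming NumPy; `r` holds the $r(t_i)$ row of table $(2)$):

```python
import numpy as np

t = np.arange(7.0)
r = np.array([0.16761, -0.29172, -0.00877, -0.09858,
              -0.02541, 0.44720, -0.17958])

# Ordinary least-squares line through the residuals: r ~ c0 + c1 * t
c1, c0 = np.polyfit(t, r, 1)
print(c0, c1)                         # ~ -0.043425, ~ 0.014987

d_r = np.sum((c0 + c1 * t - r) ** 2)  # ~ 0.3496
print(d_r)
```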

$\color{brown}{\textbf{Via gradient descent}}$

The solution obtained via the linear model is not optimal for the discrepancy in the form $$d_g(\alpha,\beta)=\sum\limits_{i=1}^7\left(\dfrac{100}{1+\alpha e^{-\beta t_i}} - g_i\right)^2.$$
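
For instance, summing the squared residuals $r(t_i)$ from table $(2)$ reproduces the value of $d_g$ at the linear-model fit (a quick check, assuming NumPy):

```python
import numpy as np

# Residuals r(t_i) from table (2) at the linear-model fit
r = np.array([0.16761, -0.29172, -0.00877, -0.09858,
              -0.02541, 0.44720, -0.17958])
print(np.sum(r ** 2))   # ~ 0.3559, matching d_g(alpha, beta) above
```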

To verify the orthogonal projections approach, the gradient descent method can be used.

Indeed, the gradient is $$\binom uv = \left(\begin{matrix} \dfrac {\partial d_g}{\partial \alpha}\\[4pt] \dfrac{\partial d_g}{\partial \beta}\end{matrix}\right) = 200\left(\begin{matrix} -\sum\limits_{i=1}^7 \dfrac{e^{-\beta t_i}}{\left(1+\alpha e^{-\beta t_i}\right)^2} \left(\dfrac{100}{1+\alpha e^{-\beta t_i}} - g_i\right)\\[4pt] \sum\limits_{i=1}^7 \dfrac{t_ie^{-\beta t_i}}{\left(1+\alpha e^{-\beta t_i}\right)^2} \left(\dfrac{100}{1+\alpha e^{-\beta t_i}} - g_i\right) \end{matrix}\right),$$ $$\binom uv =\frac1{50}\left(\begin{matrix} \sum\limits_{i=1}^7 e^{-\beta t_i}g^2(t_i)r_i \\[4pt] -\sum\limits_{i=1}^7 t_i e^{-\beta t_i}g^2(t_i)r_i \end{matrix}\right) =\binom{0.26390}{-2.32907}\not=\binom{0}{0}.$$

Optimizing the step length gives the multiplier $\Delta = -0.000223$, so $$\binom{\alpha_1}{\beta_1} = \binom{\alpha}{\beta} +\binom{\Delta\alpha}{\Delta\beta} = \binom\alpha\beta + \Delta\binom uv\approx\binom{9.170406}{0.504605}.$$ Then $$d_g(\alpha_1,\beta_1) \approx 0.349343,\quad \operatorname{grad} d_g(\alpha_1,\beta_1) = \dbinom{-0.036480}{-0.081239}.$$
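
As an independent check, one can run plain gradient descent on $d_g$ directly. The sketch below is my own illustration (assuming Python with NumPy); the analytic gradient is derived from $d_g$ by the chain rule, and the step size `lr` is a hypothetical small value, not the optimized step used above:

```python
import numpy as np

t = np.arange(7.0)
g_data = np.array([10, 15, 23, 33, 45, 58, 69], dtype=float)

def d_g(a, b):
    """Sum of squared residuals of the logistic model."""
    return np.sum((100.0 / (1.0 + a * np.exp(-b * t)) - g_data) ** 2)

def grad(a, b):
    """Analytic gradient of d_g via the chain rule."""
    e = np.exp(-b * t)
    m = 100.0 / (1.0 + a * e)              # model values g(t_i)
    res = m - g_data                       # signed residuals
    da = np.sum(2.0 * res * (-(m ** 2 / 100.0) * e))       # d d_g / d alpha
    db = np.sum(2.0 * res * (m ** 2 / 100.0) * a * t * e)  # d d_g / d beta
    return np.array([da, db])

p = np.array([9.170465, 0.504086])         # start at the linear-model fit
lr = 1e-5                                  # hypothetical small step size
for _ in range(500):
    p = p - lr * grad(*p)

print(p, d_g(*p))   # d_g drifts from ~0.356 toward the ~0.349 reported above
```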

The data in the table $(2)$ confirm the same estimation accuracy.
