Gradient descent might be overkill.
For convenience, use a temperature scale translated so that $T_0=0$ and the model is
$$T(t)=T_s(1-e^{-\alpha t}).$$
You want to minimize
$$E=\sum_i(T_i-T_s(1-e^{-\alpha t_i}))^2.$$
Setting an arbitrary value for $\alpha$, the least-squares estimate of $T_s$ is given by
$$\hat T_s(\alpha)=\frac{\sum_iT_i(1-e^{-\alpha t_i})}{\sum_i(1-e^{-\alpha t_i})^2},$$
from which you deduce
$$\hat E(\alpha)=\sum_i(T_i-\hat T_s(\alpha)(1-e^{-\alpha t_i}))^2.$$
The optimal $\alpha$ is then found by one-dimensional minimization of $\hat E(\alpha)$.
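As a rough illustration of this reduction, here is a minimal Python sketch. The data are synthetic and noise-free (generated with $T_s=80$, $\alpha=0.3$), the bracket $[0.01, 2]$ is an arbitrary guess, and golden-section search is just one possible one-dimensional optimizer, chosen here on the assumption that $\hat E(\alpha)$ has a single minimum in the bracket. The function names are made up for the example.

```python
import math

def ts_hat(alpha, t, T):
    """Closed-form least-squares estimate of T_s for a fixed alpha."""
    g = [1.0 - math.exp(-alpha * ti) for ti in t]
    return sum(Ti * gi for Ti, gi in zip(T, g)) / sum(gi * gi for gi in g)

def e_hat(alpha, t, T):
    """Sum of squared residuals once T_s has been replaced by ts_hat(alpha)."""
    ts = ts_hat(alpha, t, T)
    return sum((Ti - ts * (1.0 - math.exp(-alpha * ti))) ** 2
               for ti, Ti in zip(t, T))

def golden_section(f, lo, hi, tol=1e-8):
    """Minimize f on [lo, hi], assuming f has a single minimum there."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # The minimum lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # The minimum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

# Synthetic data for illustration only (T_s = 80, alpha = 0.3, no noise).
t = [0.5 * k for k in range(1, 21)]
T = [80.0 * (1.0 - math.exp(-0.3 * ti)) for ti in t]

alpha_opt = golden_section(lambda a: e_hat(a, t, T), 0.01, 2.0)
ts_opt = ts_hat(alpha_opt, t, T)
print(alpha_opt, ts_opt)
```

With real, noisy data it is worth plotting $\hat E(\alpha)$ first to pick a sensible bracket.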
Based on $n$ data points $(x_i,y_i)$, you want to fit the model $$y=a+bx^{\alpha}$$ which is nonlinear. This means that you need starting estimates for the nonlinear regression, and the question is then: how do you get these estimates?
If you think about it, the model is nonlinear only because of $\alpha$. So, suppose you fix a value of $\alpha$ and define $t_i=x_i^{\alpha}$; you then just face a linear model $y=a+b t$. So, for this arbitrary value of $\alpha$, you can get $a(\alpha)$, $b(\alpha)$ and $SSQ(\alpha)$ by ordinary linear regression.
So, an idea is to try different values of $\alpha$ and, plotting $SSQ(\alpha)$, to look where it goes more or less through a minimum. Let us call $\alpha_*$ this point. To it correspond $a_*$ and $b_*$. These are your estimates for the nonlinear regression.
If you do not have a nonlinear regression tool, then continue the process, zooming more and more around the minimum.
Is this clear to you? If not, I could elaborate using an example that you could add to the post.
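The scan-and-zoom procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic and noise-free (generated with $a=2$, $b=0.5$, $\alpha=1.7$), all $x_i$ are positive, the starting range $[0.1, 3]$ is a guess that excludes $\alpha=0$ (where $t_i$ would be constant), and the function names are made up for the example.

```python
def linear_fit(t, y):
    """Ordinary least squares for y = a + b*t; returns (a, b, SSQ)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    stt = sum((ti - t_mean) ** 2 for ti in t)
    sty = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    b = sty / stt
    a = y_mean - b * t_mean
    ssq = sum((yi - a - b * ti) ** 2 for ti, yi in zip(t, y))
    return a, b, ssq

def profile(alpha, x, y):
    """a(alpha), b(alpha), SSQ(alpha) for a fixed alpha, using t_i = x_i**alpha."""
    return linear_fit([xi ** alpha for xi in x], y)

def scan(x, y, lo, hi, steps=20, rounds=6):
    """Evaluate SSQ on a grid of alpha values, then zoom around the best one."""
    for _ in range(rounds):
        h = (hi - lo) / steps
        grid = [lo + k * h for k in range(steps + 1)]
        best = min(grid, key=lambda al: profile(al, x, y)[2])
        lo, hi = best - h, best + h  # zoom to the neighbors of the best point
    a, b, ssq = profile(best, x, y)
    return best, a, b, ssq

# Synthetic data for illustration only (a = 2, b = 0.5, alpha = 1.7, no noise).
x = [float(k) for k in range(1, 11)]
y = [2.0 + 0.5 * xi ** 1.7 for xi in x]

alpha_star, a_star, b_star, ssq_star = scan(x, y, 0.1, 3.0)
print(alpha_star, a_star, b_star, ssq_star)
```

A single coarse round is usually enough if the values are only needed as starting estimates for a nonlinear regression; the extra rounds correspond to the zooming step.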
What you can also do, hoping that the errors are not too large, is to take three points (say, for example, points $1$ and $n$ and one somewhere in the middle, say point $m$) and write the three equations
$$y_1=a+bx_1^{\alpha}\qquad
y_m=a+bx_m^{\alpha}\qquad
y_n=a+bx_n^{\alpha}$$ So
$$y_m-y_1=b (x_m^{\alpha}-x_1^{\alpha})\qquad
y_n-y_m=b (x_n^{\alpha}-x_m^{\alpha})$$
$$\frac {y_m-y_1 } {y_n-y_m }=\frac {x_m^{\alpha}-x_1^{\alpha} } { x_n^{\alpha}-x_m^{\alpha}}$$ which is a single equation in $\alpha$ (both $a$ and $b$ have been eliminated) that you can solve using a plot. This gives you $\alpha_*$ and, going back to the three equations, $a_*$ and $b_*$.
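Instead of reading the root off a plot, the single equation in $\alpha$ can also be solved numerically. The following Python sketch uses bisection, which is a substitution for the graphical solution suggested above. The three points are synthetic (taken exactly from $a=2$, $b=0.5$, $\alpha=1.7$), the $x$ values are assumed positive and increasing, and the bracket $[0.1, 5]$ is a guess that must contain a sign change.

```python
def three_point_estimates(p1, pm, pn, lo=0.1, hi=5.0, tol=1e-12):
    """Solve the three-point equation for alpha by bisection, then recover a and b."""
    (x1, y1), (xm, ym), (xn, yn) = p1, pm, pn
    target = (ym - y1) / (yn - ym)

    def f(alpha):
        # Right-hand side ratio minus left-hand side ratio; zero at the solution.
        return (xm ** alpha - x1 ** alpha) / (xn ** alpha - xm ** alpha) - target

    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the bracket; widen or shift it")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    alpha = 0.5 * (lo + hi)
    # Go back to the original equations: first b from points 1 and n, then a.
    b = (yn - y1) / (xn ** alpha - x1 ** alpha)
    a = y1 - b * x1 ** alpha
    return alpha, a, b

# Three synthetic points for illustration only (a = 2, b = 0.5, alpha = 1.7).
points = [(x, 2.0 + 0.5 * x ** 1.7) for x in (1.0, 5.0, 10.0)]
alpha_3, a_3, b_3 = three_point_estimates(*points)
print(alpha_3, a_3, b_3)
```

With noisy data the three chosen points will not lie exactly on the curve, so the result is only a starting guess, which is all that is needed here.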
Now you are ready!
Best Answer
In Mathematica you have to import your data as a table of $x$ and $y$ values. The data can have the form $$\text{data}=\{\{x_1,y_1\},\{x_2,y_2\},\ldots,\{x_n,y_n\}\}$$
You can create your model as
Then you can use FindFit function as