It would be fairly unusual to find power law data for which the variance is actually constant across $x$ values. However, let's take that constant variance as given and proceed with the analysis as a least squares problem.
The easy way to fit this particular model is actually to use GLMs, not NLS! No need for starting values.
I don't have Matlab on this machine, but in R:
x <- seq(4,6.5,0.5)
y <- 1.0e-04 * c(0.011929047861902, 0.026053683604530, 0.057223759612162,
0.117413370572612, 0.242357468772138, 0.462327165034928)
powfit <- glm(y~log(x),family=gaussian(link=log))
summary(powfit)
[snip]
Coefficients:
             Estimate Std. Error  t value Pr(>|t|)    
(Intercept) -25.05254    0.16275  -153.93 1.07e-08 ***
log(x)        8.05095    0.08824    91.24 8.65e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for gaussian family taken to be 6.861052e-14)
Null deviance: 1.5012e-09 on 5 degrees of freedom
Residual deviance: 2.2273e-13 on 4 degrees of freedom
AIC: -162.52
Number of Fisher Scoring iterations: 1
> plot(x,y,log="xy")
> lines(x,fitted(powfit))
I highly recommend using Gaussian GLMs with the relevant link function whenever your NLS model can be cast in that form (it can for exponential and power functions, for example).
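For example, an exponential model $y = e^{a + bx}$ with roughly constant error variance fits the same way, just with $x$ in place of $\log(x)$. A minimal sketch with simulated data (the variable names and values are made up for illustration):

# simulated data from an exponential trend with constant-variance noise
set.seed(1)
x2 <- 1:10
y2 <- exp(0.3 + 0.5 * x2) + rnorm(10, sd = 0.5)
expfit <- glm(y2 ~ x2, family = gaussian(link = log))
coef(expfit)  # estimates of a and b; again, no starting values needed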
As to why you're having trouble with it in Matlab, I'm not sure - could it be to do with your convergence criteria? There's no issue with the data at all, but I suggest you start it off at the linear regression coefficients:
> lm(log(y)~log(x))
Call:
lm(formula = log(y) ~ log(x))
Coefficients:
(Intercept) log(x)
-24.203 7.568
So start your parameters at $\exp(-24.2)$ and $7.57$ and see how that works. Or even better, do it the easy way and you can forget about starting values altogether.
Here are my results in R using nls, first starting with $b$ at the value from the linear log-log fit:
nls(y~a*x^b,start=list(a=exp(-24.2),b=7.57))
Nonlinear regression model
model: y ~ a * x^b
data: parent.frame()
a b
1.290e-11 8.063e+00
residual sum-of-squares: 2.215e-13
Number of iterations to convergence: 10
Achieved convergence tolerance: 1.237e-07
And then starting instead at $b=10$:
nls(y~a*x^b,start=list(a=exp(-24.2),b=10))
Nonlinear regression model
model: y ~ a * x^b
data: parent.frame()
a b
1.290e-11 8.063e+00
residual sum-of-squares: 2.215e-13
Number of iterations to convergence: 19
Achieved convergence tolerance: 9.267e-08
As you see, they're very close.
The GLM didn't quite match the nls parameter estimates, though a stricter convergence criterion should improve that (glm takes only one step above before concluding it has converged). It makes little difference to the fit, though. With a bit of fiddling it's possible to find a closer fit with the GLM (better than the nls one), but then you're down to a very tiny deviance (around $10^{-14}$), and I wonder whether at that scale we're really just shifting numerical error around.
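One way to push glm further is to tighten its convergence tolerance and allow more iterations via glm.control; a minimal sketch (the particular epsilon and maxit values here are arbitrary):

powfit2 <- glm(y ~ log(x), family = gaussian(link = log),
               control = glm.control(epsilon = 1e-14, maxit = 100))
coef(powfit2)  # should land closer to the nls estimates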
According to Wikipedia, it can be calculated exactly in $O(n\log n)$.
Wikipedia points to no fewer than six papers detailing different deterministic or randomized algorithms with $O(n\log n)$ performance, right in the section where the existence of such algorithms is mentioned (along with an even faster one under particular circumstances).
Deterministic:
Cole, Richard; Salowe, Jeffrey S.; Steiger, W. L.; Szemerédi, Endre (1989), "An optimal-time algorithm for slope selection", SIAM Journal on Computing 18 (4): 792–810, doi:10.1137/0218055, MR 1004799.
Katz, Matthew J.; Sharir, Micha (1993), "Optimal slope selection via expanders", Information Processing Letters 47 (3): 115–122, doi:10.1016/0020-0190(93)90234-Z, MR 1237287.
Brönnimann, Hervé; Chazelle, Bernard (1998), "Optimal slope selection via cuttings", Computational Geometry: Theory and Applications 10 (1): 23–29, doi:10.1016/S0925-7721(97)00025-4, MR 1614381.
Randomized:
Dillencourt, Michael B.; Mount, David M.; Netanyahu, Nathan S. (1992), "A randomized algorithm for slope selection", International Journal of Computational Geometry & Applications 2 (1): 1–27, doi:10.1142/S0218195992000020, MR 1159839.
Matoušek, Jiří (1991), "Randomized optimal algorithm for slope selection", Information Processing Letters 39 (4): 183–187, doi:10.1016/0020-0190(91)90177-J, MR 1130747.
Blunck, Henrik; Vahrenhold, Jan (2006), "In-place randomized slope selection", International Symposium on Algorithms and Complexity, Lecture Notes in Computer Science 3998, Berlin: Springer-Verlag, pp. 30–41, doi:10.1007/11758471_6, MR 2263136.
Which did you want?
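For context, the quantity all of these algorithms produce is simply the median of the pairwise slopes; a naive $O(n^2)$ version takes a few lines of R, and the papers above compute the same estimate in $O(n\log n)$:

theil_sen <- function(x, y) {
  ij <- combn(length(x), 2)                  # all index pairs i < j
  s <- (y[ij[2, ]] - y[ij[1, ]]) / (x[ij[2, ]] - x[ij[1, ]])
  b <- median(s[is.finite(s)])               # drop pairs with tied x values
  c(intercept = median(y - b * x), slope = b)
}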
Best Answer
James Phillips' suggestion on how to extend the Theil-Sen algorithm to a second-degree polynomial worked surprisingly well. There were 762 $(x,y)$ points in the dataset I tested. Three distinct points can be chosen from 762 in $\binom{762}{3} \approx 73$ million ways, so instead I put the points into groups of 11 and calculated the median $x$ and $y$ values of each group. That leaves $\binom{762/11}{3} = \binom{69}{3} \approx 52000$ combinations, a much more reasonable number to work with.
For each combination of three points, I calculated the $a$ coefficient for $y = ax^2 + bx + c$ using:
$a = \frac{x_3(y_2 - y_1) + x_2(y_1 - y_3) + x_1(y_3 - y_2)}{(x_1 - x_2)(x_1 - x_3)(x_2 - x_3)}$
Using the median of all $a$'s, for each combination of two points I calculated the slope of the line $y - ax^2 = bx + c$ using:
$b = \frac{(y_2 - ax_2^2) - (y_1 - ax_1^2)}{x_2 - x_1}$
Finally, using the medians of all $a$'s and $b$'s, for each point I calculated the intercept of $y - ax^2 - bx = c$ using:
$c = y_n - ax_n^2 - bx_n$
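Here's a rough R sketch of the whole procedure (my own rendering, not the original C#: it uses the plain median rather than the remedian, and it groups consecutive points after sorting by $x$, which is an assumption about how the grouping was done):

quad_theil_sen <- function(x, y, g = 11) {
  o <- order(x)
  grp <- ceiling(seq_along(o) / g)             # consecutive groups of g points
  xm <- as.numeric(tapply(x[o], grp, median))  # median x of each group
  ym <- as.numeric(tapply(y[o], grp, median))  # median y of each group

  ijk <- combn(length(xm), 3)                  # all triples of group medians
  x1 <- xm[ijk[1, ]]; x2 <- xm[ijk[2, ]]; x3 <- xm[ijk[3, ]]
  y1 <- ym[ijk[1, ]]; y2 <- ym[ijk[2, ]]; y3 <- ym[ijk[3, ]]
  a <- median((x3 * (y2 - y1) + x2 * (y1 - y3) + x1 * (y3 - y2)) /
              ((x1 - x2) * (x1 - x3) * (x2 - x3)), na.rm = TRUE)

  ij <- combn(length(xm), 2)                   # all pairs, for the slope b
  r <- ym - a * xm^2                           # residuals after removing a*x^2
  b <- median((r[ij[2, ]] - r[ij[1, ]]) / (xm[ij[2, ]] - xm[ij[1, ]]),
              na.rm = TRUE)

  c(a = a, b = b, c = median(ym - a * xm^2 - b * xm))
}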
I used the Remedian approximation for the median values, and the algorithm implemented in C# took ~15 ms to run on the 69 group medians. If the initial median filter is reduced to groups of 5 points, the execution time increases to ~110 ms, which is still acceptable. Running the algorithm on all 762 points takes 13 seconds.
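The remedian itself (a recursive median-of-medians) is also easy to sketch; a batch-style approximation in R, with the caveat that the real streaming algorithm carries partial buffers instead of dropping the tail:

remedian <- function(v, b = 11) {
  # repeatedly replace each full buffer of b values by its exact median
  while (length(v) > b) {
    n <- b * (length(v) %/% b)                 # whole buffers only; tail dropped
    v <- apply(matrix(v[1:n], nrow = b), 2, median)
  }
  median(v)
}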
Even with the fast 11-point filter, the results look very good.