[Math] Newton’s Method, and approximating parameters for Bézier curves.

approximationbezier-curveparametric

I've been wanting, for quite a while now, to polish up some source code I wrote for approximating arbitrary Bézier curves to given series of points. I managed to accomplish quite a bit, but I hit a road bump today while trying to write documentation for it.

I'm using Philip J. Schneider's "An Algorithm for Automatically Fitting Digitized Curves" (a copy can be found here) as source material, along with the C code that accompanies it (which can be downloaded here — it exists as FitCurves.c for the inclined inquirer) and an Objective-C port of the C code (which can be found here — again, for the inclined). The thing that's tripping me up is documenting how the algorithm for refining parameters for a given Bézier curve works.

For reference, the gist of the algorithm is this (in pseudocode):

Function RefineParameters (Bézier curve, Point array, Parameter array)
  For every parameter in the list of parameters
    Use Newton's Method to refine the parameter, using the given points

There is enough input to be able to evaluate the Bézier curve at any given parameter (using De Casteljau's algorithm); the array of points is the array we want to fit a Bézier curve to — these are all 'correct' points, in the sense that we're trying to refine the parameters to match these points within some bounds of error; and the parameter array is the list of parameters (normalized from 0 to 1, of course), that we're trying to refine.

What trips me up is the explanation of Newton's Method, which is defined as follows (in terms of the Bézier function, $Q(t)$): $$t' = t – \frac{Q(t)}{Q'(t)}$$

Or, so I would assume. It would make sense that to refine the parameter we have, $t$, we'd use this formula. Using the given information we have, we can easily retrieve or calculate $t$, $Q(t)$, and $Q(t')$.

However, both code libraries (which provide completely correct code), do something different. The original C code provides no comments, and defines the formula to be: $$t' = t – \frac{(Q(t) – Q(t')) \cdot Q'(t)}{(Q(t) – Q(t')) \cdot Q''(t) + Q'(t)^2}$$

The Objective-C port is slightly more helpful by providing some comments, but not by much:

"Use Newton's Method to refine our parameter. In general, that formula is:" $$x' = x – \frac{f(x)}{f'(x)}$$

"In our case:" $$f(x) = (Q(t) – Q(t')) \cdot Q'(t) = 0$$
"Where $Q'(t)$ is tangent to the curve at $Q(t)$ and orthogonal to [$Q(t) – Q(t')$]
Taking the derivative gives us:" $$f'(x) = (Q(t) – Q(t')) \cdot Q''(t) + Q'(t)^2$$

This leads us to the same result, and helps explain it only a little bit more. I half-understand the transition between the original form of Newton's Method and the second line: we've just solved for $f(x)$, and substituted $x$ with $Q(t)$. What I don't understand, though, is how or why $f'(x)$ is substituted with $Q'(t)$ (and if it makes sense to substitute them, why would we need to differentiate in order to find $f'(x)$ again later?).

Knowing this algorithm works, I tried to work backwards and perhaps meet in the middle. We're looking to refine some parameter $t$ so that we can't refine it any more (meaning $f(x)$ becomes $0$). Solving for $f(x)$, we get $f(x) = (Q(t) – Q(t')) \cdot f'(x)$, which is zero if $f'(x)$ is orthogonal to $(Q(t) – Q(t'))$. However, it stops making sense for me right there — why is it legal to just replace $f'(x)$ with $Q'(t)$? I understand that makes them orthogonal, but if that were valid, why the need to solve for $f'(x)$ later?

I could leave the code as-is, but I want to document it. Being able to explain a concept to others helps you understand it so much better yourself. Any help with this?

Best Answer

Like rubber duck problem solving, I've managed to figure this out in my head (with the help of J.M.), where it just wasn't making much sense. I was going about the problem the wrong way.

The more accurate the parameters we get, the closer we get to $$(Q(t) - P) \cdot Q'(t) = 0$$ or in other words, the vector formed by the difference between our Bézier curve evaluated at a parameter $t$ that we want to refine and the real point it corresponds to becomes increasingly perpendicular to the slope of the Bézier curve at that parameter. When we reach the right $t$, they're perfectly perpendicular.

That's why $f(t) = (Q(t) - P) \cdot Q'(t) = 0$ (of which we take the derivative to get our solution). We're trying to better approximate the values of parameters for which this is true. Every iteration of the method will get us closer to the real parameters.

Didn't make much sense to me before (likely because I was bashing my head in for hours trying to figure it out), but now it's pretty clear.