[Math] Check for collinearity between four points that supposedly build a quadrilateral

geometryquadrilateral

Let's say we get four 2D points (let's name these $p1$, $p2$, $p3$ and $p4$) from somewhere and we want to build a quadrilateral (doesn't matter if convex or not). What is the optimal check for collinearity to make sure that it is actually possible to define a quadrilateral using these four points?

So far I have been checking using the cross product (as seen here) for the following triplets: $collinear(p1, p2, p3)$, $collinear(p2, p3, p4)$ and $collinear(p3, p4, p1)$. If any of these three $collinear(.,.,.)$ checks returns $true$ the points cannot be used to define a quadrilateral. Is this enough or do I have to add something else? Note that I also check for equality between all four points.

I use the folling code (in Python 3) for collinearity check between 3 points:

Note: I also use $z$ which is equal to 0 in my case.

@staticmethod
def collinear(p1, p2, p3):
    '''
    Checks if three given points are collinear using cross product
    '''
    # http://www.ambrsoft.com/TrigoCalc/Line3D/LineColinear.htm
    c1 = (p2.y - p1.y)*(p3.z - p1.z) - (p3.y - p1.y)*(p2.z - p1.z)
    c2 = (p3.x - p1.x)*(p2.z - p1.z) - (p2.x - p1.x)*(p3.z - p1.z)
    c3 = (p2.x - p1.x)*(p3.y - p1.y) - (p3.x - p1.x)*(p2.y - p1.y)

    from math import isclose
    return isclose(c1, 0, abs_tol=0.00000001) and isclose(c2, 0, abs_tol=0.00000001) and isclose(c3, 0, abs_tol=0.00000001)

I'm also porting my code to C++ and found a nice article on collinearity on Wolfram Mathworld:

bool Triangle::collinear(QVector2D v1, QVector2D v2, QVector2D v3)
{
    return closeEqual((v1.x()*(v2.y() - v3.y()) + v2.x()*(v3.y() - v1.y()) + v3.x()*(v1.y() - v2.y())), 0.0, FUZZY_FACTOR);
}

Best Answer

What is the optimal check for collinearity to make sure that it is actually possible to define a quadrilateral using these four points?

I don't know about optimal, but the 2D analog of cross product (the test OP is using) is probably the simplest and most robust check to implement for numerical computation. In detail, if $\epsilon$ is the largest positive value you consider zero (machine epsilon), then the three points $(x_a, y_a)$, $(x_b, y_b)$, and $(x_c, y_c)$ are collinear if and only if $$\lvert (x_b - x_a)(y_c - y_a) - (x_c - x_a)(y_b - y_a) \rvert \le \epsilon$$ Computationally, this "costs" two multiplications, five subtractions, and either two comparisons or one comparison and one absolute value function; not much at all.

If the Euclidean distances between the points are already known, then triangle inequality could be used. However, the square root operation is usually much more costly ("slower") computationally than multiplication, and the inequality does not hold for squared lengths, so it is not very attractive for this particular case, where we only know the point coordinates.


[Are my checks] enough or do I have to add something else?

There are four ways to choose three out of four unique items (see binomial coefficient at Wikipedia): $$4 \text{ choose } 3 = \binom{4}{3} = 4$$ This means that if you have four points, you need four tests to verify that no three of them are collinear.

In other words, checking collinear(p1,p2,p3), collinear(p2,p3,p4), and collinear(p1,p3,p4) is not sufficient -- it misses the case where points 1,2, and 4 are collinear -- and you need to also check for collinear(p1,p2,p4).

The test mentioned earlier will also catch the case when two or more points are at the same coordinates, so there is no need to test for equality.


Here is an example of the test in pseudocode. It returns True if at least three of the four points are collinear:

Function Collinear3of4(p1, p2, p3, p4, eps=0.0000005):
    # (p1, p2, p3) are collinear if and only if
    #     abs( (p2.x-p1.x)*(p3.y-p1.y) - 
    #          (p3.x-p1.x)*(p2.y-p1.y) ) <= eps 
    (x12, y12) = (p2.x - p1.x, p2.y - p1.y)
    (x13, y13) = (p3.x - p1.x, p3.y - p1.y)
    (x14, y14) = (p4.x - p1.x, p4.y - p1.y)
    (x23, y23) = (p3.x - p2.x, p3.y - p2.y)
    (x24, y24) = (p4.x - p2.x, p4.y - p2.y)
    # Test each unique triplet.
    # 4 choose 3 = 4 triplets: 123, 124, 134, 234
    Return ( (Abs(x12*y13-x13*y12) < eps) Or
             (Abs(x12*y14-x14*y12) < eps) Or
             (Abs(x13*y14-x14*y13) < eps) Or
             (Abs(x23*y24-x24*y23) < eps) )

In C, the temporary variables could/should be considered premature optimization, as the compiler should be able to do the corresponding optimizations automatically. In Python, member access is slower than local variable access and the documentation recommends using temporary variables over repeated member accesses, so the temporary variables are justified -- not to reduce the number of subtractions, but to use local variables instead of member accesses. In C++, the temporary variables are justified if the members are functions (like in Qt point types); otherwise, the optimization is premature. (C++ compilers should be able to optimize even the member function calls, if they are pure getters, actually.)

In fact, I personally would only accept a version of the function if it has comments making the math it implements clear, and, if it uses any, the temporary variables are well named.

If the temporary variables were named e.g. xA, yA, and so on, it would be near impossible to tell if the math part is correct or not. With these names, one can easily verify if all four sets (123, 124, 134, 234) are covered.

This all loops back to optimal, you see. I value robustness and maintainability: I don't want to, or ever try to remember any details of the code I wrote a year, a month, or even a week ago. I want the code to tell me what it does, and the comments to tell me why it does it, and what the purpose of the code is.

(No, my comments are not nearly as good as I'd like them to be, and it is one of the things I need to work on myself.)

Related Question