[GIS] Clustering trajectories (GPS data of (x,y) points) and mining the data

algorithm, clustering, gps, python


I have two questions about analyzing a GPS dataset.

1) Extracting trajectories

I have a huge database of recorded GPS coordinates of the form (latitude, longitude, date-time). Using the date-time values of consecutive records, I'm trying to extract all trajectories/paths followed by the person. For instance, say that from time M the (x, y) pairs change continuously up until time N. After N, the change in the (x, y) pairs decreases, at which point I conclude that the path taken from time M to N can be called a trajectory. Is that a decent approach to extracting trajectories? Are there any well-known approaches/methods/algorithms you can suggest? Are there any data structures or formats you would suggest for storing those points efficiently? Perhaps computing the velocity and acceleration for each trajectory would be useful?
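To make the idea above concrete, here is a minimal sketch of speed-based segmentation, not a definitive method. It assumes the records are already sorted by time and given as `(lat, lon, t_seconds)` tuples; the names `haversine_m` and `split_trajectories` and the thresholds `min_speed` and `max_gap` are hypothetical and would need tuning for real data.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def split_trajectories(records, min_speed=0.5, max_gap=300.0):
    """Split time-sorted (lat, lon, t_seconds) records into trajectories.

    A step between consecutive records counts as "moving" when its speed is
    at least min_speed (m/s) and its time gap is at most max_gap (s).
    Runs of consecutive moving steps form one trajectory.
    Both thresholds are illustrative assumptions.
    """
    trajectories = []
    current = []
    for prev, cur in zip(records, records[1:]):
        dt = cur[2] - prev[2]
        moving = False
        if 0 < dt <= max_gap:
            speed = haversine_m(prev[0], prev[1], cur[0], cur[1]) / dt
            moving = speed >= min_speed
        if moving:
            if not current:
                current.append(prev)  # the trajectory starts at the last resting point
            current.append(cur)
        elif current:
            trajectories.append(current)  # movement stopped: close the trajectory
            current = []
    if current:
        trajectories.append(current)
    return trajectories


records = [
    (0.0, 0.0, 0), (0.0, 0.0, 10),        # standing still
    (0.0, 0.0001, 20), (0.0, 0.0002, 30),  # moving east at roughly 1.1 m/s
    (0.0, 0.0002, 40), (0.0, 0.0002, 50),  # standing still again
]
trips = split_trajectories(records)
```

Real GPS fixes jitter even when the receiver is stationary, so in practice the speeds would probably need smoothing, or a minimum stop duration, before thresholding.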

2) Mining the trajectories

Once I have all the trajectories followed/paths taken, how can I compare and cluster them? I would like to know whether the start or end points are similar and, if so, how the intermediate paths compare.

How do I compare two paths/routes and decide whether they are similar? Furthermore, how do I cluster similar paths together?

I would highly appreciate it if you could point me to research or something similar on this matter.
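One possible way to frame the comparison, offered only as a sketch: measure the symmetric Hausdorff distance between two paths, then group paths whose distance falls under a cutoff (single-linkage grouping via union-find). This assumes the points have already been projected to planar coordinates in metres; the function names and the `threshold` value are hypothetical. Hausdorff distance ignores the order and direction of travel, so dynamic time warping or Fréchet distance may be better suited when those matter.

```python
from math import hypot


def directed_hausdorff(a, b):
    """Largest distance from any point of path a to its nearest point on path b."""
    return max(min(hypot(px - qx, py - qy) for qx, qy in b) for px, py in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two paths of planar (x, y) points."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def cluster_paths(paths, threshold):
    """Group path indices whose Hausdorff distance is <= threshold.

    Paths are linked transitively (single linkage), so a chain of pairwise
    similar paths ends up in one group. Compares all pairs: O(n^2) distances.
    """
    parent = list(range(len(paths)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(paths)):
        for j in range(i + 1, len(paths)):
            if hausdorff(paths[i], paths[j]) <= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(paths)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


paths = [
    [(0, 0), (1, 0), (2, 0)],
    [(0, 0.1), (1, 0.1), (2, 0.1)],  # nearly the same route as the first
    [(0, 5), (1, 5), (2, 5)],        # a parallel route 5 m away
]
clusters = cluster_paths(paths, threshold=1.0)
```

With the sample data, the first two paths fall into one group and the third stays on its own. For a large number of trajectories the all-pairs comparison becomes expensive, so a spatial index or a pre-filter on start and end points would likely be needed.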

The development will be in Python, but all kinds of library suggestions are welcome.

I've posted the exact same question on Stack Overflow: https://stackoverflow.com/questions/4910510/comparing-clustering-trajectories-gps-data-of-x-y-points-and-mining-the-data. I thought I'd get more answers here…

Best Answer

Here are two articles you would likely be interested in, as their motivations are similar to yours:

Limits of Predictability in Human Mobility by: Chaoming Song, Zehui Qu, Nicholas Blumm, Albert-László Barabási. Science, Vol. 327, No. 5968. (19 February 2010), pp. 1018-1021.

Understanding individual human mobility patterns by: Marta C. González, César A. Hidalgo, Albert-László Barabási. Nature, Vol. 453, No. 7196. (05 June 2008), pp. 779-782.

Note that the two studies use the same data, which is similar to yours but less precise in both space and time. I don't think I would describe what you want to find as a trajectory, but I'm not sure what I would call it either. Why exactly do you want to cluster the beginning/end nodes of your "trajectories"?
