Solved – Calculating means and medians for 2d clusters

data visualization, mean, median

I have a set of data with x and y values (a scatter plot), and I'm trying to find its mean (centroid) and median point. I did this by simply calculating the mean and the median of x and of y separately, and I took those to be the coordinates of the group's mean and median points. However, I did some reading on 2D medians (a.k.a. the 1-median), and it turns out this is a fairly complex math problem that has no closed-form solution in general and can only be approximated numerically. I'm not sure whether that is what I should be using, or whether the calculations I did are good enough for my purposes.

Best Answer

As far as the mean/centroid is concerned, taking the average of $x$ and the average of $y$ is right. Regarding the median, if what you are interested in is the geometric median (i.e., the point that minimizes the sum of Euclidean distances to all the data points), then calculating the median of $x$ and the median of $y$ is not correct in general: the coordinate-wise median minimizes the sum of Manhattan ($L_1$) distances instead, so the two usually differ. The Wikipedia page on the geometric median offers more information, and depending on the software you use, there are packages/libraries that calculate it (for R there's the Gmedian package, for example).
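
To make the difference concrete, here is a minimal Python sketch that compares the three candidate points on a small made-up data set. It approximates the geometric median with Weiszfeld's iteration, a standard iterative scheme, which is not necessarily what the Gmedian package uses internally. The function names `geometric_median` and `total_distance`, the sample points, the tolerance and the iteration cap are all illustrative assumptions, not part of the original question.

```python
import numpy as np


def geometric_median(points, tol=1e-9, max_iter=1000):
    """Approximate the geometric median with Weiszfeld's iteration.

    Illustrative sketch: tol and max_iter are arbitrary choices.
    """
    points = np.asarray(points, dtype=float)
    estimate = points.mean(axis=0)  # start from the centroid
    for _ in range(max_iter):
        dists = np.linalg.norm(points - estimate, axis=1)
        # Avoid division by zero if the estimate lands on a data point.
        dists = np.maximum(dists, 1e-12)
        weights = 1.0 / dists
        # Distance-weighted average of the data points.
        new_estimate = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_estimate - estimate) < tol:
            return new_estimate
        estimate = new_estimate
    return estimate


def total_distance(points, center):
    """Sum of Euclidean distances from center to every point."""
    return np.linalg.norm(np.asarray(points, dtype=float) - center, axis=1).sum()


# Made-up cluster with one outlier at (10, 10).
points = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [10, 10]], dtype=float)

centroid = points.mean(axis=0)               # mean of x, mean of y
coordwise_median = np.median(points, axis=0)  # median of x, median of y
geo_median = geometric_median(points)

for name, p in [("centroid", centroid),
                ("coordinate-wise median", coordwise_median),
                ("geometric median", geo_median)]:
    print(name, p, "total distance:", total_distance(points, p))
```

On this data the geometric median gives a smaller total Euclidean distance than either the centroid or the coordinate-wise median. Whether that difference matters depends on what the median point is being used for: for a quick visual summary of a cluster, the coordinate-wise median may well be good enough.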