Solved – Correlation between two data sets with same x-axis values (year) and different y values

correlation python r

I have two data sets, both covering 1996-2016, but their y-axis values are on completely different scales. The first is mean NDVI expressed as anomalies: 0 corresponds to the overall mean (0.1865), and each year's value is its difference from that mean, ranging from -0.03 to 0.03. The second is the Palmer Drought Severity Index over the same period, with values ranging from -7 to 6.

I want to know the best way to find the correlation between these two data sets, and would prefer to do it in Python. Here is an image of the two plots; both have the same x-axis, the years 1996-2016.

Best Answer

The different ranges of your data are not a problem: scaling either or both series (e.g. to zero mean and unit variance) does not change the correlation between them. However, to correlate the two data vectors you have, they need to have the same length. If your PMDI vector has more data points than your other vector, you need to find a way (e.g. taking the mean over some period) to summarise it in fewer data points.

Calculating correlation in Python: see e.g. https://stackoverflow.com/questions/19428029/how-to-get-correlation-of-two-vectors-in-python
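As a rough illustration of both points, here is a minimal sketch using NumPy. The data are randomly generated stand-ins, not your values, and the assumption that the PMDI series is monthly (12 values per year) is mine; the names `ndvi_anomaly`, `pmdi_monthly`, `pearson_r` and `standardize` are made up for the example. It averages the longer series down to one value per year, then shows that standardizing the series leaves the Pearson correlation unchanged.

```python
import numpy as np

# Hypothetical stand-in data (NOT the asker's values):
# 21 yearly NDVI anomalies and, by assumption, 12 PMDI values per year.
rng = np.random.default_rng(0)
years = np.arange(1996, 2017)  # 1996..2016 inclusive
ndvi_anomaly = rng.uniform(-0.03, 0.03, size=years.size)
pmdi_monthly = rng.uniform(-7, 6, size=years.size * 12)

# Summarise the longer series so both vectors have the same length:
# one row per year, then the mean of each row.
pmdi_yearly = pmdi_monthly.reshape(years.size, 12).mean(axis=1)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length 1-D arrays."""
    return np.corrcoef(x, y)[0, 1]


def standardize(x):
    """Rescale to zero mean and unit variance."""
    return (x - x.mean()) / x.std()


r_raw = pearson_r(ndvi_anomaly, pmdi_yearly)
r_scaled = pearson_r(standardize(ndvi_anomaly), standardize(pmdi_yearly))

# The two values agree up to floating-point error.
print(f"r (raw) = {r_raw:.4f}, r (standardized) = {r_scaled:.4f}")
```

With real data you would replace the two random arrays with your own. If you also want a p-value, `scipy.stats.pearsonr(x, y)` returns the coefficient together with one.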