Solved – the best way to use Latitude and Longitude features in building a Machine Learning model

classification, feature-engineering, machine learning

I am working with a city's crime data and trying to classify the type of crime based on various features. Two of the features are latitude and longitude, and I have been wondering how best to use them in a model. Treating them as ordinary numerical features does not seem right to me intuitively: the differences between nearby latitude and longitude values are numerically tiny (45.0002 vs. 45.0003, etc.), and the values are not really ordinal. What would be the best approach here? Thank you! (One thing to add, though I don't think it is relevant: the other features in this model are categorical, so I have created dummy variables for them.)

Best Answer

Time to read up on cluster analysis and crime.

https://www.ncjrs.gov/html/nij/mapping/ch4_9.html

https://www.icpsr.umich.edu/CrimeStat/files/CrimeStatChapter.6.pdf

http://www.ecostat.unical.it/RePEc/WorkingPapers/WP12_2011.pdf

For lots more, search for "cluster analysis geography and crime type" and similar phrases.

This is what you asked for: one with cluster analysis that talks about predictive crime models and neural nets.
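To make the cluster-analysis suggestion concrete, here is a minimal sketch of one way it could feed a classifier: group the coordinates into spatial clusters, then pass the cluster ID to the model as dummy variables, just like the other categorical features in the question. Everything specific here is an assumption for illustration only: the sample coordinates are made up, `k = 2` is arbitrary, and plain k-means on raw degrees is only a rough stand-in for the hot-spot methods in the links above (at city scale it is usually a tolerable approximation, but degrees of longitude are not the same length as degrees of latitude).

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on (lat, lon) pairs. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the observed points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance in degrees, an approximation).
        labels = [
            min(
                range(k),
                key=lambda j: (p[0] - centroids[j][0]) ** 2
                + (p[1] - centroids[j][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    (
                        sum(m[0] for m in members) / len(members),
                        sum(m[1] for m in members) / len(members),
                    )
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return centroids, labels


def one_hot(label, k):
    """Encode a cluster ID as dummy variables."""
    return [1 if i == label else 0 for i in range(k)]


# Hypothetical (latitude, longitude) pairs: two small groups of incidents.
points = [
    (45.0002, -93.2001),
    (45.0003, -93.2004),
    (45.0001, -93.1999),
    (45.0500, -93.1002),
    (45.0503, -93.1000),
    (45.0498, -93.0997),
]

k = 2  # assumed number of clusters; in practice this needs tuning
centroids, labels = kmeans(points, k)
features = [one_hot(lab, k) for lab in labels]  # one dummy-variable row per incident
```

The rows in `features` can then be concatenated with the other dummy variables. To avoid leaking information, the clusters should be fitted on the training data only, with new points assigned to the nearest existing centroid.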