[GIS] Python with ArcGIS: import scikit-learn fails (bad numpy.dtype)

Tags: arcpy, installation, numpy, scikit-learn

I am trying to install scikit-learn 0.17.1 (the current release) into the Python 2.7.3 installation that ships with ArcGIS 10.2. The installation through easy_install completes without errors, but I get the following error on import:

>>> import sklearn
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "Z:\lib\site-packages\scikit_learn-0.17.1-py2.7-win32.egg\sklearn\__init__.py", line 57, in <module>
    from .base import clone
  File "Z:\lib\site-packages\scikit_learn-0.17.1-py2.7-win32.egg\sklearn\base.py", line 11, in <module>
    from .utils.fixes import signature
  File "Z:\lib\site-packages\scikit_learn-0.17.1-py2.7-win32.egg\sklearn\utils\__init__.py", line 10, in <module>
    from .murmurhash import murmurhash3_32
  File "numpy.pxd", line 155, in init sklearn.utils.murmurhash (sklearn\utils\murmurhash.c:5029)
ValueError: numpy.dtype has the wrong size, try recompiling

After some googling, it looks like a dependency mismatch. I have numpy 1.6.1 and scipy 0.10.0 installed, and both work. The scikit-learn installation guide lists no other dependencies. Did I miss something along the way? How can I make the scikit-learn installation work?

Best Answer

scikit-learn 0.20 was the last release to support Python 2.7, but it does not work with ArcGIS 10.6 or with older ArcGIS versions. I found that scikit-learn 0.18.2 is the newest version that works with ArcGIS 10.6. To get there, I cycled through pip install scikit-learn==<version> and pip uninstall scikit-learn until I hit a version that worked. Running pip install scikit-learn== with nothing after the == makes pip print the available versions; try them one by one until you find one that is supported. The same approach should work to find the appropriate version of scikit-learn for ArcGIS 10.2.
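The install/uninstall cycle described above could be automated with something like the following. This is only a rough sketch, not a tested recipe for ArcGIS: the CANDIDATES list is an illustrative assumption (replace it with the versions pip actually reports), the helper names run and find_working_version are made up for this example, and it assumes pip is available to the ArcGIS Python interpreter that runs the script.

```python
import subprocess
import sys

# Illustrative candidate list, newest first (an assumption; use the
# versions that pip reports as available on your machine).
CANDIDATES = ["0.18.2", "0.18.1", "0.18", "0.17.1", "0.17", "0.16.1"]


def run(args):
    """Run the current interpreter with args; return True on exit code 0."""
    return subprocess.call([sys.executable] + args) == 0


def find_working_version(candidates, run=run):
    """Return the first version that installs and imports cleanly, else None."""
    for version in candidates:
        installed = run(["-m", "pip", "install", "scikit-learn==" + version])
        # Test the import in a fresh process so a failed import cannot
        # leave half-loaded modules behind in this one.
        if installed and run(["-c", "import sklearn"]):
            return version
        # Clean up the failed attempt before trying the next candidate.
        run(["-m", "pip", "uninstall", "-y", "scikit-learn"])
    return None
```

Calling find_working_version(CANDIDATES) from the Python that ships with ArcGIS returns the first version string whose import succeeds, or None if none of the candidates work. A successful import only shows that the compiled extensions load against the installed numpy; it does not guarantee that every part of scikit-learn works.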