[GIS] In_memory workspace in geoprocessing services

Tags: arcgis-server, arcpy, geoprocessing-service, in-memory

I have an arcpy Python script that I publish as a geoprocessing service on ArcGIS Server 10.3.1. The script uses various input feature classes. These feature classes live in geodatabases that are registered with the ArcGIS Server.

My Python script also uses the in_memory workspace to store some temporary data. The in_memory feature classes are not the input or the final output; they are only intermediate data.

When I try to publish the geoprocessing service, the publishing tools want to copy the in_memory feature classes to the server, since the in_memory workspace is not registered with ArcGIS Server. The specific warning I get is: *Data source used by Script MyToolThatDoesStuff is not registered with the server and will be copied to the server: in_memory\myTempData*

I let the publishing tools copy the in_memory data to the server, and my geoprocessing service works as expected. However, I suspect that every time the tool is run, the in_memory data is copied into the arcgisserver\directories\arcgissystem\arcgisinput\MyToolThatDoesStuff.GPServer\extracted\v101\data1.gdb geodatabase and never removed. Over time this geodatabase bloats, slows down the geoprocessing service, and ultimately fills the disk, causing major problems.
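One way to check whether this suspicion holds is simply to watch the size of the service's input directory over a few tool runs. Here is a minimal, hedged sketch of a folder-size check in plain Python; the path in the usage comment is hypothetical and should be pointed at your own server directory:

```python
import os

def folder_size_mb(path):
    """Sum the size of every file under *path*, in megabytes."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if os.path.isfile(fp):
                total += os.path.getsize(fp)
    return total / (1024.0 * 1024.0)

# Hypothetical usage -- substitute your actual server install path:
# size = folder_size_mb(r"C:\arcgisserver\directories\arcgissystem\arcgisinput")
# print("arcgisinput directory: %.1f MB" % size)
```

If the number grows after every run, the intermediate data really is being persisted; if it stays flat, the bloat concern is unfounded.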

My questions:

  1. Is there a way to prevent the publishing tools from copying
    the in_memory workspace to the server?
  2. Is there a way to register the in_memory workspace with ArcGIS Server?
  3. Is using the in_memory workspace in scripts that are published as geoprocessing services not “best practice”? If so, how should temporary, intermediate data be handled in arcpy scripts?

Best Answer

In_memory layers are not written to disk; that is the whole point of the in_memory workspace.

  1. The in_memory workspace is not published to the server.
  2. Registering in_memory on the server wouldn't make sense.
  3. Using in_memory is best practice for handling intermediate data, in both desktop and server geoprocessing workflows.

For publishing purposes, I usually suggest doing a dry run of the tool, in which no code is executed and no data is created. Publish the resulting tool result, and then either uncomment the real code in the published script file, or use a tool parameter whose value defines whether it is a dry run.
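The dry-run idea above can be sketched as follows. This is a minimal, hedged illustration, not ArcGIS API code: `run_tool`, its parameters, and the step strings are all hypothetical stand-ins for your real arcpy calls.

```python
def run_tool(input_fc, output_fc, dry_run=False):
    """Sketch of a GP script guarded by a dry-run flag.

    With dry_run=True (e.g. while capturing a result to publish),
    nothing executes and no data is created. The real geoprocessing
    only happens on the server with dry_run=False.
    """
    steps = []
    if dry_run:
        return steps  # no code executed, no data created

    # The real work would go here -- e.g. arcpy calls writing
    # intermediate feature classes into the in_memory workspace:
    steps.append("copy %s to in_memory" % input_fc)
    steps.append("analyze intermediate in_memory data")
    steps.append("write final result to %s" % output_fc)
    return steps
```

Running the tool once with `dry_run=True` gives you a clean result to publish, with no in_memory data for the publishing tools to flag.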

Even better, keep the published script (the ArcGIS bits: parameter handling and so on) separate from the business code. Please refer to this post for best practices that I found very useful when working with GP services.
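That separation might look like the sketch below. The business rule (`classify_parcel`) and the parameter-accessor injection are hypothetical illustrations; in a real published script the accessors would be `arcpy.GetParameterAsText` and `arcpy.SetParameterAsText`, and only the thin wrapper would ever need republishing.

```python
# --- business module: plain Python, testable outside ArcGIS ---------
def classify_parcel(area_sq_m):
    """Hypothetical business rule the tool delegates to."""
    if area_sq_m < 500:
        return "small"
    if area_sq_m < 5000:
        return "medium"
    return "large"

# --- published script: parameter plumbing only ----------------------
def gp_entry_point(read_param, write_param):
    """Thin wrapper the GP service publishes. In the real script,
    read_param/write_param would be arcpy.GetParameterAsText and
    arcpy.SetParameterAsText; here they are injected so the wrapper
    can be exercised without ArcGIS."""
    area = float(read_param(0))
    write_param(1, classify_parcel(area))
```

Because the business logic never touches arcpy's parameter machinery, it can be unit-tested and reused without republishing the service.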
