[GIS] Python Script to iterate through folder and copy all feature classes in MXDs to new file geodatabases

arcgis-desktop arcpy

I have a folder of MXDs. I would like to iterate through each MXD, copy every SDE feature class it references into a new file geodatabase, and save a new MXD that points to the file geodatabase paths.

For each original MXD, the output should be:

  • a new folder named after the original MXD
  • a new MXD in that folder, with the original MXD name
  • a new file geodatabase containing all feature classes from the MXD

I understand each function I need to write, apart from iterating over the MXDs and copying them and their feature classes. How should this be done with arcpy?

This script walks a folder, opens every MXD it finds, and prints the MXD path and the data source of each layer in it. I would like to extend it so that each feature class found through arcpy.mapping is copied to a file geodatabase, and a new MXD is saved that points to the copied feature classes.

import arcpy, os, datetime

folderPath = r'C:\MXD_test' # raw string so backslashes are never read as escape sequences

#Loop through each MXD file
for root, dirs, files in os.walk(folderPath):
    for file in files: # files is a list of files in the current directory
        if file.lower().endswith(".mxd"):
            fullpath = os.path.join(root, file) # root is the current directory
            #Reference MXD
            mxd = arcpy.mapping.MapDocument(fullpath)

            DFList = arcpy.mapping.ListDataFrames(mxd)
            for df in DFList:
                # Format output values
                if df.description == "":
                    descValue = "None"
                else:
                    descValue = df.description
                # ==== Note the new descName variable
                descName = df.name

                lyrList = arcpy.mapping.ListLayers(mxd, "", df)
                for lyr in lyrList:
                    lyrName = lyr.name
                    if lyr.supports("dataSource"):
                        lyrDatasource = lyr.dataSource
                    else:
                        lyrDatasource = "N/A"


                    print lyrDatasource
                    print fullpath

Best Answer

This update to your script creates a new folder for each MXD and a new file geodatabase inside that folder. As @Vince mentions in a comment above, if different MXDs point to the same large SDE datasets, you'll be copying the same data multiple times. An alternative would be to create a single file geodatabase to store all the data in.
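If you go the single-geodatabase route, the main change is that the "already copied" bookkeeping has to span all MXDs rather than one. A minimal sketch of that bookkeeping, in plain Python with no arcpy calls: `group_by_source` is a hypothetical helper, the paths in the usage example are made up, and data sources are compared as exact strings (so two spellings of the same connection path would count as different sources).

```python
from collections import OrderedDict


def group_by_source(layer_sources):
    """Group (mxd_path, data_source) pairs by data source.

    Returns an OrderedDict {data_source: [mxd_path, ...]} in first-seen
    order, so each distinct source appears exactly once however many
    MXDs reference it.
    """
    grouped = OrderedDict()
    for mxd_path, data_source in layer_sources:
        mxds = grouped.setdefault(data_source, [])
        if mxd_path not in mxds:
            mxds.append(mxd_path)
    return grouped


# Hypothetical pairs, as they might be collected with ListLayers
pairs = [
    (r"C:\MXD_test\a.mxd", r"C:\conn\prod.sde\GIS.DBO.Roads"),
    (r"C:\MXD_test\a.mxd", r"C:\conn\prod.sde\GIS.DBO.Parcels"),
    (r"C:\MXD_test\b.mxd", r"C:\conn\prod.sde\GIS.DBO.Roads"),
]
grouped = group_by_source(pairs)
```

Each key of `grouped` would then be copied once into the shared file geodatabase, and every MXD listed against it repointed there. The rest of this answer sticks to one file geodatabase per MXD.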

This is the basic process the script follows:

  1. Loop through folders/files
  2. Create new folder in output folder using name of MXD (this assumes the output folder is in a different location from the input folder)
  3. Create File Geodatabase in new folder
  4. Access the MXD
  5. Find and loop through Data Frames
  6. Get list of layers in data frame and loop through layers
  7. Get datasource of layer, check that it hasn't already been loaded, and copy to new FGDB
  8. Repoint layer to new FGDB
  9. Record datasource for later loop to avoid re-copying existing data
  10. Save MXD to new folder

import arcpy, os

folderPath = r'C:\MXD_test'
newBasePath = r'C:\MXD_Output'

#Loop through each MXD file
for root, dirs, files in os.walk(folderPath):
    for file in files: # files is a list of files in the current directory
        if file.lower().endswith(".mxd"):
            dataSourceSet = set() # Keep track of already copied datasources

            fullpath = os.path.join(root, file) # root is the current directory
            fileName = os.path.splitext(file)[0] # Get filename for new folder name
            print "FileName = {}".format(fileName)

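            # Geoprocessing tools return Result objects; getOutput(0) gives the output path as a string
            newPath = arcpy.CreateFolder_management(newBasePath, fileName).getOutput(0) # Create new folder for MXD and FGDB
            newFGDB = arcpy.CreateFileGDB_management(newPath, "FGDB.gdb").getOutput(0) # Create new FGDB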

            #Reference MXD
            mxd = arcpy.mapping.MapDocument(fullpath)
            DFList = arcpy.mapping.ListDataFrames(mxd)
            for df in DFList:
                lyrList = arcpy.mapping.ListLayers(mxd, "", df)
                print "Fullpath = {}".format(fullpath)

                for lyr in lyrList:
                    lyrName = lyr.name
                    if lyr.supports("dataSource"):
                        lyrDatasource = lyr.dataSource
                        lyrDataSetName = lyr.datasetName

                        print "lyrName = {}".format(lyrName)
                        print "lyrDatasource = {}".format(lyrDatasource)

                        # SDE dataset names are qualified (database.owner.name); dots are not valid in FGDB names
                        newDataSetName = lyrDataSetName.split(".")[-1]

                        if lyrDatasource not in dataSourceSet: # If datasource not already copied, then copy it
                            arcpy.Copy_management(lyrDatasource, os.path.join(str(newFGDB), newDataSetName))
                            dataSourceSet.add(lyrDatasource) # Keep track of datasource to avoid copying it twice

                        # Repoint layer to the copied dataset in the new FGDB
                        lyr.replaceDataSource(str(newFGDB), "FILEGDB_WORKSPACE", newDataSetName)

            mxd.saveACopy(r"{}\{}".format(newPath, file)) # Save new MXD in new folder
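One detail worth handling when naming the copies: SDE dataset names are usually qualified as database.owner.name, dots are not valid in file geodatabase names, and two owners can each have a feature class with the same bare name. A rough sketch of a naming helper follows. `fgdb_name` is a hypothetical function, not part of arcpy, and it assumes the input is the layer's `datasetName`. Inside ArcGIS, `arcpy.ValidateTableName` does workspace-aware validation and is probably the better tool for the character clean-up part.

```python
import re


def fgdb_name(dataset_name, used_names):
    """Return a unique, FGDB-friendly name for an SDE dataset name.

    dataset_name: e.g. "GIS.DBO.Roads" (assumed database.owner.name form).
    used_names:   set of lower-cased names already taken; updated in place.
    """
    # Keep only the part after the last dot (the bare feature class name)
    bare = dataset_name.split(".")[-1]
    # Replace anything other than letters, digits and underscores
    safe = re.sub(r"[^A-Za-z0-9_]", "_", bare)
    # Make sure the name starts with a letter
    if not safe or not safe[0].isalpha():
        safe = "fc_" + safe
    # Resolve clashes case-insensitively by appending _1, _2, ...
    candidate = safe
    suffix = 1
    while candidate.lower() in used_names:
        candidate = "{}_{}".format(safe, suffix)
        suffix += 1
    used_names.add(candidate.lower())
    return candidate
```

In the script, the set passed as `used_names` would be created once per file geodatabase, and the returned name used both as the copy target and as the dataset name when repointing the layer.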

Things to watch out for:

  • Saving to same folder - os.walk will find the new MXDs if the output location is a subfolder of your folderPath. The script above assumes this is not the case, but it could be worked around if necessary.
  • Definition queries - moving from SDE to a file geodatabase can invalidate definition queries, because field-name delimiters and SQL syntax differ between workspace types (for example, FieldName versus "FieldName" or [FieldName]). arcpy.AddFieldDelimiters returns the correctly delimited field name for a given workspace, so this can probably be worked around if required.
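
For the first point, one possible workaround is to prune the output folder from the walk, so the newly saved MXDs are never visited. This is a sketch only: `find_mxds` and its `skip_dirs` parameter are names made up for this example, and it relies on the documented `os.walk` behaviour that editing `dirs` in place (with the default top-down walk) controls which subfolders are descended into.

```python
import os


def find_mxds(folder_path, skip_dirs=()):
    """Yield full paths of .mxd files under folder_path.

    Any folder listed in skip_dirs (for example the output folder) is
    pruned from the walk, along with everything beneath it.
    """
    skip = set(os.path.normcase(os.path.abspath(d)) for d in skip_dirs)
    for root, dirs, files in os.walk(folder_path):
        # Edit dirs in place so os.walk never descends into skipped folders
        dirs[:] = [
            d for d in dirs
            if os.path.normcase(os.path.abspath(os.path.join(root, d))) not in skip
        ]
        for name in files:
            if name.lower().endswith(".mxd"):
                yield os.path.join(root, name)
```

The outer two loops of the script could then be replaced with something like `for fullpath in find_mxds(folderPath, skip_dirs=[newBasePath]):`.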