[GIS] FME How to read multiple CSV files from many folders and write to geodatabase tables by folder

csvesri-geodatabasefile-geodatabase-apifme

I have 13000+ CSV files, all containing around 13000 rows. These CSV files are arranged in 146 different folders. All CSV files have the same schema with ; as separator. These CSV files contain about 175 million rows of data combined. The folders are named as seen in the picture below (for example …\CSV_folders\5785xxx)


Folder names


CSV files are named like these:

CSV names


I already tried to copy all CSV files to same directory and run them to one GDB with FME. This went well… but resulted too heavy ESRI GDB that takes forever to search through. (175 million rows, around 20GB)

enter image description here


Now I would need to make things a bit differently. I would like to work with FME to create one ESRI Geodatabase. It should contain tables named like these folders containing CSV files. A table in the GDB should have all the CSV files loaded in the table, that are inside the original folder. This would split my 175 million rows to 146 tables, making it a bit faster to search through with Arcmap.

So my problem is, how can I make FME read CSV files from all of the folders and write a GDB with tables named like the folder, each table containing the same information the CSV files have in the original folder combined? I'm not familiar with PostGIS etc. (and it would not work with ArcGIS), so this needs to be done with some kind of ESRI GDB workaround..

Best Answer

This is a pretty straightforward exercise in FME. You'll need to do the following:

  1. Load your CSVs using a dynamic reader (Single Merged Feature Type) and point to the whole folder where your CSVs are stored. Use the advanced browser to select the whole folder: enter image description here

    Ensure you select search subfolders: enter image description here

  2. In the reader, expose the fme_dataset attribute by right clicking on the reader and clicking on Properties. Then go to the Format Attributes tab: enter image description here
  3. In your workbench, add a FilenamePartExtractor transformer and point it to the fme_dataset attribute. The field you need is _dirname.

    enter image description here

    1. Finally, in your writer, you'll want to set your Table name to an expression which includes your _dirname from the FilenamePartExtractor. The reason I did this was because your folder names start with a number and feature classes in a file geodatabase cannot begin with a number. Also note that if you're using the ArcObjects File Geodatabase writer, you will set your geometry to geodb_table. enter image description here If you're using the API writer then you will set your geometry to geodb_no_geom: enter image description here

You can see that I had folders within the CSV folder called 1, 2 and 3. They wrote to the GeoDatabase as tables called Table_1, Table_2, and Table_3:

enter image description here