ArcSDE Data Merge – Merging Edited Data Back into ArcSDE Master Geodatabase

Tags: arcgis-10.0, arcpy, enterprise-geodatabase, fme, sql-server

I need some suggestions/inspiration for a problem we have.

The problem:

We have an ArcSDE master GIS database. We will edit this data almost exclusively outside of ArcGIS and outside of ArcSDE. In other words, we will export some small area, edit it, and then push the data back into another SDE table (a separate diff database).
We will also maintain an extra attribute that records the change status: add, delete, or edit.

What we want to do now is write this diff table back to the master. My question is: what is the best tool for doing this?
Using ArcPy with cursors, using FME, or are there other ArcGIS tools that would help?
I was thinking of using ArcPy and doing something like this (see the sketch after this list):

  • Simply add all objects with status "add".
  • Search for every ID with status "delete" and delete it.
  • Search for all IDs with status "edit" and replace them with the edited versions.
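
A minimal ArcPy sketch of that three-step sync, assuming a STATUS field and a shared SOURCE_ID key (both placeholder names, as are the connection paths and the OWNER attribute), and using the arcpy.da cursors available from 10.1 onwards; on 10.0 the classic arcpy.SearchCursor/InsertCursor/UpdateCursor offer the same operations:

    import arcpy

    # All paths and field names below are placeholders - adjust to your schema.
    diff_fc   = r"Database Connections\diff.sde\dbo.Parcels_diff"
    master_fc = r"Database Connections\master.sde\dbo.Parcels"
    master_ws = r"Database Connections\master.sde"
    fields    = ["SOURCE_ID", "SHAPE@", "OWNER"]   # shared key, geometry, attributes

    # Wrap everything in one edit session so a failed sync can be discarded.
    edit = arcpy.da.Editor(master_ws)
    edit.startEditing(False, True)
    edit.startOperation()

    # 1. Adds: insert every diff feature flagged 'add'.
    with arcpy.da.InsertCursor(master_fc, fields) as i_cur:
        for row in arcpy.da.SearchCursor(diff_fc, fields, "STATUS = 'add'"):
            i_cur.insertRow(row)

    # 2. Deletes: remove master features whose ID is flagged 'delete'.
    delete_ids = {r[0] for r in arcpy.da.SearchCursor(
        diff_fc, ["SOURCE_ID"], "STATUS = 'delete'")}
    with arcpy.da.UpdateCursor(master_fc, ["SOURCE_ID"]) as u_cur:
        for row in u_cur:
            if row[0] in delete_ids:
                u_cur.deleteRow()

    # 3. Edits: overwrite geometry and attributes for IDs flagged 'edit'.
    edited = {r[0]: r for r in arcpy.da.SearchCursor(
        diff_fc, fields, "STATUS = 'edit'")}
    with arcpy.da.UpdateCursor(master_fc, fields) as u_cur:
        for row in u_cur:
            if row[0] in edited:
                u_cur.updateRow(list(edited[row[0]]))

    edit.stopOperation()
    edit.stopEditing(True)

Keeping the three passes inside one edit session means a failed sync can be abandoned before the operator runs their checks.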

What do you think would be the best tool to do this?

  • ArcPy?
  • FME
  • direct SQL (we are using MS SQL)
  • others…

The master database will contain roughly 500,000 objects, so performance may be an issue. At the moment we are thinking of a sync operation where an operator starts the sync and waits for it to finish, so they can run some checks once it has completed.

Bonus question:
We will need to sync dimensions and labels too. The problem is that, in the diff database, we probably cannot store them as ArcGIS dimension feature classes; they will be line and point symbols with the necessary data attached to them. Can we use the same technique to transform and migrate this data so it fits into the dimension feature classes?

Best Answer

An approach that I have used successfully and seems to perform pretty well in FME is to create three columns in the "master" feature class on SDE:

  1. A field containing the feature's original OID (or other unique ID).
  2. (Optional) A field containing the feature's original feature class name (used if you have multiple source feature classes going into the master feature class).
  3. A column containing a CRC (Cyclic Redundancy Check) value for each feature. This value is generated using the CRCCalculator transformer. See also this FMEpedia article detailing how this can be implemented.
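
If you maintain the master with arcpy anyway, adding those three bookkeeping columns is a one-off step; the field names below are just examples, not anything the workflow requires:

    import arcpy

    master_fc = r"Database Connections\master.sde\dbo.Parcels"                   # placeholder path
    arcpy.AddField_management(master_fc, "SRC_OID", "LONG")                      # original OID
    arcpy.AddField_management(master_fc, "SRC_FC", "TEXT", field_length=64)      # source feature class name
    arcpy.AddField_management(master_fc, "CRC_VALUE", "TEXT", field_length=16)   # stored CRC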

I then read in both the source feature class and the master feature class and "join" (using a FeatureMerger) on a concatenation of the original ID and source feature class name.

Depending on which port a feature leaves the FeatureMerger, you know whether it's an add, a delete, or a feature that exists in both places.

You then just need to calculate the CRC again on the incoming feature and compare it against the stored CRC. If the CRC is different, it's an update; if it's the same, the feature hasn't changed.
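
To make that classification concrete outside of FME, here is the same logic as a rough Python sketch; the key structure, field names, and the zlib checksum are illustrative stand-ins for what the FeatureMerger and CRCCalculator do inside the workspace:

    import zlib

    def crc(feature):
        # Checksum over everything that counts as a "change" (attributes + geometry as WKT).
        payload = "|".join(str(feature[k]) for k in sorted(feature))
        return zlib.crc32(payload.encode("utf-8")) & 0xffffffff

    # Keyed by (source feature class, original OID), as in the join described above.
    source = {("Parcels", 101): {"OWNER": "Smith", "WKT": "POINT (1 2)"}}
    master = {("Parcels", 101): {"OWNER": "Jones", "WKT": "POINT (1 2)"},
              ("Parcels", 102): {"OWNER": "Brown", "WKT": "POINT (3 4)"}}

    adds      = [k for k in source if k not in master]                  # only in source
    deletes   = [k for k in master if k not in source]                  # only in master
    updates   = [k for k in source if k in master
                 and crc(source[k]) != crc(master[k])]                  # CRC differs
    unchanged = [k for k in source if k in master
                 and crc(source[k]) == crc(master[k])]                  # CRC matches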

You then know whether to update, delete, add, or skip each feature. Just be careful to route the correct streams (input vs. master) and don't mix up the geodb_oid values, as you'll want to use the master's OIDs for deletes and updates; for adds it doesn't matter. Note the AttributeRenamer step in the FMEpedia example; they used it to avoid the FeatureMerger overwriting existing attributes.

The FMEpedia example sends each database transaction type (update/delete/add) to a separate writer, but there is a neat trick to use only one writer: set the fme_db_operation attribute to UPDATE, INSERT, or DELETE and set the writer mode to UPDATE. This technique is detailed in this FMEpedia article.
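
If you prefer to set that attribute in a PythonCaller rather than an AttributeCreator, a sketch might look like the following; change_type is my own placeholder for however you flagged the feature upstream, and the writer's mode still needs to be set to UPDATE:

    # Function-based PythonCaller: FME passes each feature in, and the
    # modified feature is output automatically after the function returns.
    def tag_operation(feature):
        mapping = {"add": "INSERT", "update": "UPDATE", "delete": "DELETE"}
        change = feature.getAttribute("change_type")   # set upstream (hypothetical name)
        feature.setAttribute("fme_db_operation", mapping.get(change, "INSERT"))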

You can also of course add timestamp columns to record when a feature was first loaded and last modified with the TimeStamper transformer.

Note that this approach would likely allow you to skip the "diff" table step entirely. I believe FME also supports dimension and annotation feature classes just fine, so hopefully the approach works the same or needs only minor modifications for those (I've never used them myself).
