[GIS] ASCII encoding error when updating field content in SQL Server table

arcgis-10.2arcpyasciisql serverunicode

I know there are plenty of posts and blog explanations about unicode errors, but I still can't figure out to handle it in my particular case. So here is my problem: I am writing a Python script to update records in an SQL Server table, with field content from a shapefile that has been edited in ArcPad.

I use an arcpy.da.UpdateCursor to update existing records.

I've put # -*- coding: cp1252 -*- on top of my script as I work with French characters.
Just doing so, I get the ASCII encoding error:

'ascii' codec can't encode characters in position 0-1: ordinal not in
range(128)

I've tried putting u before the field content:

...
with arcpy.da.UpdateCursor(DBtable, DBFields, where_clause) as DBCur:
    for DBrow in DBCur:
        ...
        DBrow[8] = u"{} - {}: {}".format(DBrow[8], date , AProw[8]) # AProw comes from a SearchCursor reading the shapefile.

I can then print a message with the text content, and it doesn't return an error message, but all characters are replaced (I get something like ???4???5????????>???5??? in my database field).

Any accentuated or punctuation character can be present in the fields (there's a free text comment field) so I don't want to check for every possible non-ASCII character and replace it.

I work with ArcGIS 10.2.2, the shapefile is edited in ArcPad 10.2 and the database is SQL Server 2008 R2.

What am I missing?

EDIT: This only occurs if the destination database is SQL Server. No problem with a file gdb. I have to add that the SQL Server table already contains non-ASCII characters.

Best Answer

Finally I've found a solution by bringing all of the following changes to my code:

  • use arcpy.ArcSDESQLExecute() instead of arcpy.da Insert/Update cursors to update/populate the table (I had to do both operations and both types of cursors didn't really work in all situations).

  • use # -*- coding:utf8 -*- in the beginning of the script, which is supposed to be a good practice anyway (# -*- coding: cp1252 -*- allows me to have accents in the messages but doesn't work for editing the database table). Now accents in my messages are replaced, but editing my database works.

  • there was also an issue with single quotes being recognized as string delimiters within my SQL request. To deal with this I had to replace them with double single quotes.

So now my code looks like this:

new_comment  = "{} - {}: {}".format(DBrow[8].replace("'", "''").encode('utf-8'), date , AProw[8]replace("'", "''").encode('utf-8')) 
# DBrow[8] is the comment already present in the table and AProw[8] is the new comment from my shapefile edited in ArcPad
...
sde_conn = arcpy.ArcSDESQLExecute(DB)  
sql = '''
update {} set {} = '{}'
'''.format(DBtable, DBcomment_field, new_comment)

Hope one day it might help anyone meeting this kind of unicode/SQL Server headache.

Related Question