[GIS] ogr2ogr + MapInfo TAB + SQLite (SpatiaLite) = unicode problems

mapinfo ogr2ogr sqlite tilemill unicode

I'm adding MapInfo TAB files to a SQLite (SpatiaLite) database using these arguments:

ogr2ogr -f "SQLite" dataset.sqlite somelayer.tab -dsco spatialite=yes

The source file somelayer.tab has the WindowsCyrillic encoding defined:

!table
!version 450
!charset WindowsCyrillic

Definition Table
  Type NATIVE Charset "WindowsCyrillic"
  Fields 1
    Name Char (100) ;

As a result, dataset.sqlite has a somelayer table with a VARCHAR Name column.

But I need to store the Name values in Unicode.

How can I tell ogr2ogr to use the Unicode NVARCHAR type for the Name column and to convert the text from WindowsCyrillic (Windows-1251) to UTF-8 before storing it?

Best Answer

SQLite does not care about declared column types, so VARCHAR and NVARCHAR mean exactly the same thing to it. See https://stackoverflow.com/questions/3930501/difference-between-varchar-nvarchar-in-sqlite.
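A quick way to check this yourself, as a minimal sketch that assumes the sqlite3 command-line shell is installed. The function name sqlite_text_types and the table t are made up for illustration.

```shell
# Hypothetical helper: create a throwaway in-memory table with one VARCHAR
# and one NVARCHAR column, store the same string in both, and ask SQLite
# which storage class it actually used for each value.
sqlite_text_types() {
    sqlite3 :memory: "
        CREATE TABLE t (a VARCHAR(100), b NVARCHAR(100));
        INSERT INTO t VALUES ('Москва', 'Москва');
        SELECT typeof(a), typeof(b) FROM t;
    "
}
```

Calling sqlite_text_types should print text|text: both declarations get the same TEXT affinity, so the declared type has no effect on how the bytes are stored.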

The real problem is the GDAL MapInfo driver, which does not handle character encodings. There are some workarounds:

  1. Convert the MapInfo data to MID/MIF format with ogr2ogr, then convert the files from Windows-1251 to UTF-8 with iconv. Converting the UTF-8 MID/MIF files to SpatiaLite with ogr2ogr should then work correctly.
  2. Alternatively, convert the MapInfo data to GML with ogr2ogr first, convert the GML file to UTF-8 with iconv, and finally convert the GML to SpatiaLite.
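Workaround 1 could be scripted roughly as follows. This is only a sketch under some assumptions: the function names recode_to_utf8 and tab_to_spatialite_utf8 are made up, ogr2ogr and iconv are assumed to be on the PATH, iconv is assumed to know Windows-1251 under the name CP1251, and the MIF export is assumed to write a .mid file next to the .mif with the same base name.

```shell
# Recode one text file from Windows-1251 (CP1251) to UTF-8.
# Usage: recode_to_utf8 INPUT OUTPUT
recode_to_utf8() {
    iconv -f CP1251 -t UTF-8 "$1" > "$2"
}

# Hypothetical wrapper: TAB -> MIF/MID -> iconv -> SpatiaLite.
# Usage: tab_to_spatialite_utf8 somelayer.tab dataset.sqlite
tab_to_spatialite_utf8() {
    src=$1
    dst=$2
    layer=$(basename "$src" .tab)
    work=$(mktemp -d) || return 1
    mkdir "$work/raw" "$work/utf8" || return 1

    # 1. Export the binary TAB dataset to the text-based MIF/MID pair.
    ogr2ogr -f "MapInfo File" -dsco FORMAT=MIF \
        "$work/raw/$layer.mif" "$src" || return 1

    # 2. Recode both text files (assumes the .mid sits next to the .mif).
    recode_to_utf8 "$work/raw/$layer.mif" "$work/utf8/$layer.mif" || return 1
    recode_to_utf8 "$work/raw/$layer.mid" "$work/utf8/$layer.mid" || return 1

    # 3. Load the UTF-8 MIF/MID pair into the SpatiaLite database,
    #    keeping the original layer name.
    ogr2ogr -f "SQLite" -dsco SPATIALITE=YES \
        "$dst" "$work/utf8/$layer.mif" -nln "$layer" || return 1

    rm -rf "$work"
}
```

With those functions defined, tab_to_spatialite_utf8 somelayer.tab dataset.sqlite would run all three steps. Workaround 2 has the same shape with GML as the intermediate format instead of MIF/MID.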