Shapefile Encoding – Convert Shapefile from Shift_JIS to UTF-8

encoding ogr2ogr shapefile

Long story short, I'm trying to import this ESRI shapefile of Japan into CartoDB. (Sorry, no direct link: to download, click on the orange ファイルのダウンロード button, check 同意する to agree to the T&C, then click on the green 全国市区町村界データのダウンロード button.)

Problem is, the DBF in the file is encoded as Shift_JIS, and CartoDB only likes UTF-8. I've tried the following unsuccessfully:

1) ogr2ogr

ogr2ogr --config SHAPE_ENCODING Shift_JIS japan_ver72_utf8.shp 

No-op: SJIS in, SJIS out.

ogr2ogr --config SHAPE_ENCODING UTF-8 japan_ver72_utf8 japan_ver72.shp

Makes ogr2ogr think the input is UTF-8, meaning I get garbage out.
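
Both attempts above set a single encoding, whereas the conversion needs two: one for reading and one for writing. GDAL's Shapefile driver documents an `ENCODING` open option (`-oo`, GDAL 2.0 or later) for the source and an `ENCODING` layer creation option (`-lco`) for the output. The sketch below only assembles such a command; the helper names are hypothetical, the file names are taken from the commands above, and whether this works on this particular file is untested, especially given the DBF read errors described further down.

```python
import shlex
import subprocess


def build_recode_command(src_shp, dst_shp, src_encoding="Shift_JIS"):
    """Assemble an ogr2ogr call that reads with one encoding and writes UTF-8.

    Hypothetical helper: -oo sets the encoding used to interpret the source DBF,
    -lco sets the encoding recorded for (and used to write) the output DBF.
    """
    return [
        "ogr2ogr",
        "-f", "ESRI Shapefile",
        "-oo", f"ENCODING={src_encoding}",   # how to read the input attributes
        "-lco", "ENCODING=UTF-8",            # how to write the output attributes
        dst_shp,                             # ogr2ogr takes the destination first
        src_shp,
    ]


def recode(src_shp, dst_shp, src_encoding="Shift_JIS"):
    """Run the command; raises CalledProcessError if ogr2ogr fails."""
    subprocess.run(build_recode_command(src_shp, dst_shp, src_encoding), check=True)


if __name__ == "__main__":
    # Print the command for inspection rather than running it.
    print(shlex.join(build_recode_command("japan_ver72.shp", "japan_ver72_utf8.shp")))
```

If the source turns out to use Microsoft's variant of Shift_JIS, passing `src_encoding="CP932"` may be worth trying instead.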

2) QGIS

Load the shapefile into QGIS as ShiftJIS. But while the shapes load fine, QGIS dumps a whole bunch of this on load:

ERROR 1: fread(48623) failed on DBF file.

And inspecting the attribute table just shows a bunch of nulls, so there's no point trying to save as UTF-8.

3) OpenOffice Calc

Load the DBF into OpenOffice, re-export as UTF-8. But OO throws an error when parsing the DBF and refuses to import the file at all.

4) iconv

Run iconv directly on the DBF:

iconv -f Shift_JIS -t UTF-8 japan_ver72_sjis.dbf >japan_ver72.dbf

This "works", in the sense that the Japanese within is correctly recoded as UTF-8, but it destroys the DBF in the process.
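
The likely reason is that a DBF stores each record as fixed-width fields whose widths are declared in a binary header, while iconv recodes the byte stream with no knowledge of those boundaries. Japanese characters that take 2 bytes in Shift_JIS usually take 3 in UTF-8, so every field containing them grows and no longer matches the header. A minimal sketch of the effect, using a made-up field width of 10 bytes rather than anything read from the actual file:

```python
FIELD_WIDTH = 10  # hypothetical width of one character field, as a DBF header would declare it


def pad_field(text, encoding, width=FIELD_WIDTH):
    """Encode text and right-pad it with spaces to the declared field width."""
    raw = text.encode(encoding)
    if len(raw) > width:
        raise ValueError(f"{text!r} needs {len(raw)} bytes, field is {width}")
    return raw.ljust(width, b" ")


def blanket_recode(raw):
    """Recode every byte from Shift_JIS to UTF-8, ignoring field boundaries (as iconv does)."""
    return raw.decode("shift_jis").encode("utf-8")


original = pad_field("東京都", "shift_jis")  # 6 bytes of text + 4 bytes of padding
recoded = blanket_recode(original)           # 9 bytes of text + 4 bytes of padding

print(len(original), len(recoded))
```

The text itself survives the round trip, but the field is now longer than the width the header promises, so a reader that trusts the header slices every following field and record at the wrong offset.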

Ideas?

Best Answer

I recently ran into similar problems reading Chinese characters from a DBF file.

Here is a tool that converts a shapefile to GeoJSON entirely in the web browser, with no server-side code, and supports non-English encodings. Just upload the zip file and set the encoding (Shift_JIS) so that the Japanese text displays correctly.

http://gipong.github.io/shp2geojson.js/

It creates a GeoJSON file, so you can use the result with Leaflet, OpenLayers, or CartoDB.js. Source code: https://github.com/gipong/shp2geojson.js
