OpenStreetMap Data – Why Certain Features from Geofabrik’s OSM Extract Seem to Be Missing?

Tags: geofabrik, openstreetmap, overpass-api, qgis, quickosm

I would like to get all OpenStreetMap objects in Estonia tagged man_made=works (i.e. factories). I can do this using QGIS, but another method using GeoFabrik's OSM extract fails.


QGIS way

I download Estonia's borders from osm-boundaries.com and add them as a QGIS layer. Then I set up a query using the QuickOSM plugin (Overpass API endpoint set as: https://lz4.overpass-api.de/api/, Nominatim API endpoint set as: https://nominatim.qgis.org/search?).

After Run query, I get several factories.

I can export these to GeoJSONs.


OSM extract from GeoFabrik

I am trying to achieve the same thing another way. I download estonia-latest.osm.bz2 from http://download.geofabrik.de/europe/estonia.html:

curl https://download.geofabrik.de/europe/estonia-latest.osm.bz2 -o estonia-latest.osm.bz2

Uncompress file:

bzip2 -dk estonia-latest.osm.bz2 # produces estonia-latest.osm

As this answer also suggests, estonia-latest.osm has the following layers:

points
lines
multilinestrings
multipolygons
other_relations

(I know this by running: ogrinfo estonia-latest.osm | tail -n +3 | awk '{print $2}')

Let's attempt to extract all man_made=works! I do:

for each in $(ogrinfo estonia-latest.osm | tail -n +3 | awk '{print $2}'); do
    rm -f factories_${each}.geojson
    echo ogr2ogr -f GeoJSON factories_${each}.geojson estonia-latest.osm $each -where "other_tags LIKE '%\"man_made\"=>\"works\"%'"
    # this line above^ is to check what ogr2ogr command is running
    ogr2ogr -f GeoJSON factories_${each}.geojson estonia-latest.osm $each -where "other_tags LIKE '%\"man_made\"=>\"works\"%'"
done

Output:

ogr2ogr -f GeoJSON factories_points.geojson estonia-latest.osm points -where other_tags LIKE '%"man_made"=>"works"%'
0...10...20...30...40...50...60...70...80...90...100 - done.
ogr2ogr -f GeoJSON factories_lines.geojson estonia-latest.osm lines -where other_tags LIKE '%"man_made"=>"works"%'
0...10...20...30...40...50...60...70...80...90...100 - done.
ogr2ogr -f GeoJSON factories_multilinestrings.geojson estonia-latest.osm multilinestrings -where other_tags LIKE '%"man_made"=>"works"%'
0...10...20...30...40...50...60...70...80...90...100 - done.
ogr2ogr -f GeoJSON factories_multipolygons.geojson estonia-latest.osm multipolygons -where other_tags LIKE '%"man_made"=>"works"%'
0...10...20...30...40...50...60...70...80...90...100 - done.
ogr2ogr -f GeoJSON factories_other_relations.geojson estonia-latest.osm other_relations -where other_tags LIKE '%"man_made"=>"works"%'
0...10...20...30...40...50...60...70...80...90...100 - done.

Problem: all the files created (factories_points.geojson, factories_lines.geojson, factories_multilinestrings.geojson, factories_multipolygons.geojson, factories_other_relations.geojson) are empty.

If I replace the where clause by "other_tags LIKE '%\"power\"=>\"substation\"%'", I do indeed get many electrical substations.


Either the query for factories is wrong, or GeoFabrik's OSM extract does not contain the factories I am looking for, or something else is going on.

What am I doing wrong?

Best Answer

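man_made is an important tag that has its own field (by default in the points, lines and multipolygons layers). Tags with their own field are not stored in other_tags, which is why the LIKE filter on other_tags finds nothing. Query the field directly: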

ogr2ogr -f GeoJSON factories_${each}.geojson estonia-latest.osm $each -where "man_made='works'"
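Since man_made is only a dedicated field in some layers, the loop from the question could be restricted to those. The following is a minimal sketch, assuming the default osmconf.ini and the file name used in the question; the variable names (layers, where, src, out) are illustrative only, and the guard simply skips the conversion when ogr2ogr or the extract is missing.

```shell
#!/usr/bin/env bash
# Layers that have a dedicated man_made column under the default osmconf.ini
# (assumption: the configuration has not been customised).
layers="points lines multipolygons"
where="man_made='works'"
src="estonia-latest.osm"

for layer in $layers; do
    out="factories_${layer}.geojson"
    rm -f "$out"
    # Only run the conversion when GDAL and the extract are both available.
    if command -v ogr2ogr >/dev/null 2>&1 && [ -f "$src" ]; then
        ogr2ogr -f GeoJSON "$out" "$src" "$layer" -where "$where"
    else
        echo "skipping $layer: ogr2ogr or $src not found"
    fi
done
```

The multilinestrings and other_relations layers are left out because, with the defaults, they have no man_made field to filter on.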

Which tags get their own column is controlled by the GDAL OSM driver configuration (osmconf.ini, located in the GDAL data folder); its defaults can be seen here.
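To see which columns a local installation actually uses, one option is to print the relevant lines of its osmconf.ini. This is a rough sketch, assuming the file sits in the directory named by the GDAL_DATA environment variable or, failing that, the one reported by gdal-config --datadir; the variable names datadir and conf are illustrative.

```shell
#!/usr/bin/env bash
# Locate the GDAL data folder: prefer GDAL_DATA when set, else ask gdal-config.
datadir="${GDAL_DATA:-$(gdal-config --datadir 2>/dev/null)}"
conf="${datadir}/osmconf.ini"

if [ -f "$conf" ]; then
    # Show each layer section header and its uncommented attributes= line.
    grep -E '^\[|^attributes=' "$conf"
else
    echo "osmconf.ini not found under '${datadir}'"
fi
```

Running ogrinfo -so estonia-latest.osm multipolygons should also list the fields of a layer as GDAL exposes them, which is a quick way to confirm whether a tag has its own column before writing the filter.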

At time of writing (02/2023) they are:

points: name, barrier, highway, ref, address, is_in, place, man_made
lines: name, highway, waterway, aerialway, barrier, man_made, railway
multilinestrings: name, type
multipolygons: name, type, aeroway, amenity, admin_level, barrier, boundary, building, craft, geological, historic, land_area, landuse, leisure, man_made, military, natural, office, place, shop, sport, tourism
other_relations: name, type
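The list above also explains why the power=substation filter in the question worked: power has no dedicated column, so it stays in other_tags. As an illustration, the choice between the two filter styles could be sketched with a small helper. The functions dedicated_fields and osm_where are hypothetical, not part of GDAL, and the field lists are simply copied from the defaults quoted above.

```shell
#!/usr/bin/env bash
# Default dedicated columns per layer, copied from the list above (02/2023).
dedicated_fields() {
    case "$1" in
        points) echo "name barrier highway ref address is_in place man_made" ;;
        lines) echo "name highway waterway aerialway barrier man_made railway" ;;
        multipolygons) echo "name type aeroway amenity admin_level barrier boundary building craft geological historic land_area landuse leisure man_made military natural office place shop sport tourism" ;;
        multilinestrings|other_relations) echo "name type" ;;
        *) echo "" ;;
    esac
}

# Hypothetical helper: print a -where clause for key=value on a given layer.
osm_where() {
    layer=$1 key=$2 value=$3
    for field in $(dedicated_fields "$layer"); do
        if [ "$field" = "$key" ]; then
            # The key has its own column: compare it directly.
            printf "%s='%s'\n" "$key" "$value"
            return
        fi
    done
    # Otherwise fall back to matching inside the other_tags column.
    printf "other_tags LIKE '%%\"%s\"=>\"%s\"%%'\n" "$key" "$value"
}

osm_where multipolygons man_made works     # prints: man_made='works'
osm_where multipolygons power substation   # prints: other_tags LIKE '%"power"=>"substation"%'
```

The printed clause can then be passed to ogr2ogr as -where "$(osm_where multipolygons man_made works)". If osmconf.ini has been customised, the lists in dedicated_fields would need to be updated to match.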
