[GIS] How to build PostGIS database schema for flexibility and normalization

postgisschema-architecture

I'm creating a database that will be used by researchers in population genetics with different goals and sampling strategies. The database needs to accommodate locations as points, lines, or polygons and I'm puzzling over whether those should be kept in a single "locations" table, or in separate tables for "point_locations", "line_locations", and "polygon_locations". The single-table design is appealing because it would make it simple to enforce the required one-to-one relationship between samples and their locations with a foreign key constraint. I can imagine that there may be reasons to split locations out into separate tables for points, lines, and polygons, however. If I can keep all locations' spatial data in a single column, then which geography datatype should I use (I know geometry would be more flexible, but that's a topic for another post)? I'm thinking geography(polygon,4326).

Best Answer

The answer depends a bit on what you're going to be using the geometries for. The database is perfectly happy to manage/index/search a generic geometry table:

CREATE TABLE g (
  id Serial Primary Key,
  geog Geography(geometry, 4326)
);

Yes, it looks odd to be declaring a "geography" as a "geometry", but what's a boy to do?

The trouble with generic tables is for client software that expects to only deal in homogeneous types, not mixed. The design question becomes: is it easier to store things separately in the database and have every piece of database logic "type aware", or is it easier to have mapping queries that require homogeneity to deal with their issues?

You can perhaps get some distance by putting views onto your generic table.

CREATE VIEW g_point AS
SELECT * FROM g WHERE ST_GeometryType(g::geometry) = 'ST_Point';

Best of luck!

Related Question