Search code examples
postgisapache-superset

Converting geometry to json in PostGIS creates extra rows


I want to plot municipal borders in Apache Superset. Superset provides maps and provinces using the ISO-3166-2 standard, but I would like to plot municipalities as wel. There is a shape file with these borders and this is converted to a PostGIS table following instructions on this site. I used shp2pgsql to add the shape file to the PostGIS database. That resulted in a table containing some identifying informatie and a binary gemometry column.

The first row of the table is 231.375 characters long and looks like this (abbreviated):

"gid"   "gm_code"   "gm_naam"   "shape_leng"    "shape_area"    "geom"
   1    "GM0034"    "Almere"    122665.358635   109562253.490   "0106000020E61000000300000001030000000100000017000000806CE7FB2F560241408B6CE73E441D41801C5A643A520...

This information is not readable by Superset so I converted it to geojson by this query:

CREATE TABLE gis.gemeente_2021_json AS
  SELECT gm_code, gm_naam, shape_area, 
      json_build_object(
        'type', 'Polygon',
        'geometry', ST_AsGeoJSON(ST_Transform((ST_DUMP(geom)).
            geom::geometry(Polygon, 4326), 4326))::json)::text as geojson

  FROM gis.gemeente_2021_v1;

What happens is that I started with 435 municipalities but my new table has 1140 entries. Some sample output is shown below. The lines are very long so I cut the lines and replaced this by ellipses (...). As you can see municipality "GM0034" has suddenly 3 rows with different entries in the gejson column while I expected just one row with a very long Geojson string.

    gm_code gm_naam shape_area  geojson
    GM0034  Almere  109562253.49    {"type" : "Polygon", "geometry" : {"type":"Polygon","coordinates":[[[150213.998,479503.726],[150087.298999999,479382.379000001],[150000.420000002,479461.258000001],[150000.354600001,479461.317400001],[150000.366300002,479461.327300001],[150001.45630,...]]]}}
    GM0034  Almere  109562253.49    {"type" : "Polygon", "geometry" : {"type":"Polygon","coordinates":[[[141872.969,483192.398800001],[141872.978100002,483192.398499999],[141872.984099999,483192.398499999],[141904.523899999,483191.793400001],[141912.174600001,483191.646699998],[141912.340999998, ...
    GM0034  Almere  109562253.49    {"type" : "Polygon", "geometry" : {"type":"Polygon","coordinates":[[[144312.481800001,492971.460499998],[144312.557,492971.443999998],[144312.633299999,492971.445799999],[144316.953000002,492971.539999999],[144321.4701,492972.263300002],[144321.730999999,492972.305], ...

The data may be Multipolygon though the documentation mentions Polygon. I tried that but that basically yielded the same results.

Checked the Postgres documentation on maximum linelength but that is about 1GB of text per occurrence and although the lines are long, 250K can be easily handled by postgres.

Any suggestion to where I should look further is welcome.

Edit 1

As @JGH rightfully pointed out I'd made quite a mistake in the SRID's. The shape file's SRID is 28992. I deleted the table, created it anew with the correct SRID. It displayed correctly on openstreetmap. It is converted as follows:

CREATE TABLE gis.gemeente_2021_json AS
  SELECT gm_code, gm_naam, shape_area, 
      json_build_object(
        'type', 'Polygon',
        'geometry', ST_AsGeoJSON(ST_Transform((ST_DUMP(geom)).
            geom::geometry(Polygon, 28992), 4326))::json)::text as geojson

  FROM gis.gemeente_2021_v1;

The same error applies, alas. Still 3 rows for GM0034.

Edit 2

I adjusted the query according to the suggestions of @JGH:

CREATE TABLE gis.gemeente_2021_json AS
  SELECT gm_code, gm_naam, shape_area, 
      json_build_object(
        'type', 'MultiPolygon',
        'geometry', ST_AsGeoJSON(ST_Transform(
            geom::geometry(MultiPolygon, 28992), 4326))::json)::text as geojson

  FROM gis.gemeente_2021_v1;

and that worked.


Solution

  • The data show coordinates like [150213.998,479503.726], yet the code contains the line geom::geometry(Polygon, 4326) followed by a useless transform to 4326. The shown coordinates are not in 4326, they have probably been loaded without a defined CRS (so 0 is assigned and the cast works).

    The data is therefore declared to be in lat-long 4326 but contains values for another CRS, you end up with artistic coordinates which can't be properly handled (maybe it goes several times around the earth, maybe it goes to the pole and back etc... anything can happen). Garbage in, garbage out.

    So the first step is to set the proper CRS. Then you can likely apply a -usefull- transform from this other CRS to 4326. If you want single parts, keep using the dump and handle the attributes repetions. If you want multi-parts, remove the dump and set the geojson type to MultiPolygon.