node.js json csv csvtojson

What's the best way to create a JSON array with sub-properties from UTF-8 CSV (English & Chinese characters)?


I tried using the csvtojson module to create a GeoJSON-formatted file, but the nesting isn't working at all. Can anyone point me in the right direction, or do I need to write my own code?

> npx csvtojson input.tsv > output.json

input.tsv

properties.labelTc  properties.labelEn  properties.nameTc   properties.nameEn   properties.zoomifyX properties.zoomifyY geometry.coordinates.1  geometry.coordinates.0  properties.urlEn    properties.urlZh    type
皇城  The Imperial Palace City    明故宫 Ming Palace 105513  -1863   32.038  118.815 https://en.wikipedia.org/wiki/Ming_Palace       Feature
天地壇 Altar of Heaven and Earth       Guanghuamen?    105049  -1000   32.058  118.832     https://baike.baidu.com/item/%E5%A4%A9%E5%9D%9B/19964669    Feature

What I want

{
  "properties": {
    "labelTc": "皇城",
    "labelEn": "The Imperial Palace City",
    ...
  },
  "geometry": {
    "coordinates": [118.815, 32.038]
  },
  "type": "Feature"
}

What I got:

[
  {
    "properties": {
      "labelTc\tproperties": {
        "labelEn\tproperties": {
          "nameTc\tproperties": {
            "nameEn\tproperties": {
              "zoomifyX\tproperties": {
                "zoomifyY\tgeometry": {
                  "coordinates": {
                    "1\tgeometry": {
                      "coordinates": {
                        "0\tproperties": {
                          "urlEn\tproperties": {
                            "urlZh\ttype": "??\tThe Imperial Palace City\t???\tMing Palace\t105513\t-1863\t32.038\t118.815\thttps://en.wikipedia.org/wiki/Ming_Palace\t\tFeature"
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  },

Solution

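  • Two things were going wrong. csvtojson splits on commas by default, so the tab-separated header row was read as a single column (hence the "\t" inside the key names above). I was also running this in Windows PowerShell, which by default uses an extension of the Latin-1 encoding rather than UTF-8, so the Chinese characters came out as "??". I needed two flags: one for the csvtojson library (the delimiter) and a second for PowerShell (the encoding).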

    npx csvtojson --delimiter=\t input.tsv > output.json -encoding utf8
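On the "do I need to write my own code?" part of the question: if the CLI route is not an option, the nesting can be done by hand in plain Node.js. The following is only a minimal sketch, not what csvtojson does internally. The helper names `setPath` and `tsvToNested` are made up for illustration, and it assumes that cells never contain tabs or line breaks, that empty cells should be skipped, and that only the `geometry.coordinates.*` columns should become numbers.

```javascript
// Hypothetical helper: store a value at a dotted path such as "geometry.coordinates.0".
function setPath(target, path, value) {
  const keys = path.split(".");
  let node = target;
  keys.forEach((key, i) => {
    if (i === keys.length - 1) {
      node[key] = value;
      return;
    }
    // Create an array when the next key is numeric, otherwise a plain object.
    if (node[key] === undefined) {
      node[key] = /^\d+$/.test(keys[i + 1]) ? [] : {};
    }
    node = node[key];
  });
}

// Hypothetical helper: turn tab-separated text with dotted headers into nested objects.
function tsvToNested(text) {
  const [headerLine, ...rows] = text
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "");
  const headers = headerLine.split("\t");

  return rows.map((row) => {
    const cells = row.split("\t");
    const feature = {};
    headers.forEach((header, i) => {
      const raw = cells[i] ?? "";
      if (raw === "") return; // assumption: leave empty cells out of the result
      // Assumption: only the coordinate columns are numeric.
      const value = header.startsWith("geometry.coordinates.") ? Number(raw) : raw;
      setPath(feature, header, value);
    });
    return feature;
  });
}

// Usage (reads and writes UTF-8 explicitly, so the shell's encoding is not involved):
// const fs = require("fs");
// const features = tsvToNested(fs.readFileSync("input.tsv", "utf8"));
// fs.writeFileSync("output.json", JSON.stringify(features, null, 2), "utf8");
```

Because the file is read and written with an explicit "utf8" argument rather than through a shell redirect, the PowerShell encoding issue should not come into play with this approach.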