June 20, 2019

Mapping Swiss coordinates (LV95) with d3: Part 1

This is the first of a two-part series about creating a map in SVG based on Swiss geographical data. Each part covers one step of the process:

  1. Converting shapefiles to GeoJSON/TopoJSON with node
  2. Mapping LV95 coordinates with d3-geo

You can see and interact with the result of that process on srf.ch.

🚧 I will try to keep this article free of geographical jargon, as I experienced it as quite a hurdle when I first worked with geodata.

The Swiss Coordinate System

The goal of this tutorial is to work with geodata from the Federal Statistical Office. The keyword to google is “Generalisierte Gemeindegrenzen”. You will land on this download page, where you can choose the year for which you want to download the data.

Why is the year important?

…you might ask yourself. Let me tell you: it’s very, very important. Borders are constantly changing – mainly because small municipalities choose to go together with their neighbors – they merge. This happens several times a year!

👉🏼 If you have data that you want to display, you’ll need to find out which exact status of municipalities it is based on. This is usually specified in the metadata of a dataset. If it’s not, you’ll need to compare the number of municipalities in the dataset and in the different shapefiles.
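If the counts don't match, it helps to diff the municipality numbers instead of only counting them. Here is a minimal sketch of that comparison. It is not part of the workflow described in this article: the name compareMunicipalities and the sample numbers are made up, and it assumes you already have both lists of municipality numbers as plain arrays.

```javascript
// Hypothetical helper: compares the municipality numbers found in your
// dataset with those found in a set of borders.
function compareMunicipalities (datasetIds, borderIds) {
  const inDataset = new Set(datasetIds)
  const inBorders = new Set(borderIds)
  return {
    // numbers your data has but the borders don't
    missingInBorders: [...inDataset].filter(id => !inBorders.has(id)),
    // numbers the borders have but your data doesn't
    missingInDataset: [...inBorders].filter(id => !inDataset.has(id))
  }
}

// Made-up numbers, for illustration only
const diff = compareMunicipalities([1, 2, 3, 4], [1, 2, 4, 5])
console.log(diff)
```

Only if both lists come back empty do your data and the borders describe the same state.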

In the folder you downloaded, you’ll usually find two folders, e.g. ggg_2017 and ggg_2017vz. VZ stands for “Volkszählung” (census) and means that the data reflects the 31st of December of that year. The other folder contains data for the 1st of January. Within the folder you picked, navigate to /shp/LV95/. In there you’ll find groups of files (.cpg, .dbf, .prj, .shp and .shx). If you move these files around, always take all 5 of them. If you rename them, rename all of them: they are interlinked and must always share the same name.

The files starting with “g1” are more detailed (more fine-grained borders) than those with “g2”. The third letter indicates what kind of borders the file contains:

  • g = Gemeinde (municipality)
  • b = Bezirke (districts)
  • k = Kanton (canton)
  • l = Land (country)
  • s = Seen (lakes)
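The naming scheme can be written down as a small lookup. This is only a sketch with a hypothetical helper, describeFile. It assumes that a name consists of “g”, the detail level, the letter for the kind of border and a two-digit year, optionally followed by “vz”. The last two parts are inferred from names like g2g17vz and are not documented here.

```javascript
// Hypothetical helper that decodes names such as "g2g17vz".
// Assumption: "g" + detail level + kind letter + two-digit year (+ "vz").
const KINDS = {
  g: 'municipalities',
  b: 'districts',
  k: 'cantons',
  l: 'country',
  s: 'lakes'
}

function describeFile (name) {
  const match = /^g([12])([gbkls])(\d{2})(vz)?$/.exec(name)
  if (!match) return null
  const [, level, kind, year, vz] = match
  return {
    detail: level === '1' ? 'more detailed' : 'less detailed',
    kind: KINDS[kind],
    year: 2000 + Number(year), // assumes years from 2000 onwards
    census: Boolean(vz) // true = state of the 31st of December
  }
}

console.log(describeFile('g2g17vz'))
```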

We want to show Gemeinden (municipalities) as of the 31st of December, and we don’t need the most detailed form. So g2g17vz.shp is what we need. Feel free to drag & drop it into QGIS (a handy, open source tool for manually working with geodata) to have a look at it:

[Screenshot: the shapefile opened in QGIS]

If you move your mouse around, you’ll see that the coordinates in the data look something like this:

2622800, 1204100

These are Swiss coordinates (sometimes also called CH1903+, LV95 or EPSG:2056). What’s cool: 1 unit is equal to 1 meter. You may have only worked with what you might call “GPS data”, where you have a lat and a long that look something like 46.8066834, 6.7238481.

In Switzerland, the numbers are called east and north, or sometimes just Y and X (yes, actually in that order, as you can read on Wikipedia).

Ok, we have the map that we want. But now we need to bring this shapefile into a format that browsers understand better:
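To get a feeling for how the two systems relate, here is a sketch that converts Swiss coordinates into a lat and a long. It is not part of the original workflow. It relies on the approximation formulas published by swisstopo, so the result is approximate, and the function name lv95ToWgs84 is made up for this example.

```javascript
// Approximate conversion from LV95 (east, north) to WGS84 (lat, long),
// based on the approximation formulas published by swisstopo.
function lv95ToWgs84 (east, north) {
  // offsets from the reference point in Bern, in units of 1000 km
  const y = (east - 2600000) / 1000000
  const x = (north - 1200000) / 1000000

  // longitude and latitude in units of 10000 arc seconds
  const lng = 2.6779094 + 4.728982 * y + 0.791484 * y * x +
    0.1306 * y * x * x - 0.0436 * y * y * y
  const lat = 16.9023892 + 3.238272 * x - 0.270978 * y * y -
    0.002528 * x * x - 0.0447 * y * y * x - 0.0140 * x * x * x

  // convert to degrees
  return { lat: lat * 100 / 36, lng: lng * 100 / 36 }
}

// the example coordinates from above
console.log(lv95ToWgs84(2622800, 1204100))
```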

Introducing GeoJSON

JSON stands for JavaScript Object Notation and is a file format that we can read in with JavaScript. GeoJSON is JSON that follows a fixed structure for geographical data. There are numerous ways to convert shapefiles to GeoJSON. You might have run into this great tutorial series by d3 creator Mike Bostock:

It’s an impressive step-by-step explanation of how to work with Shapefiles, GeoJSON and TopoJSON. If you’re used to working in the command line, I’m sure it’s great. But if you’re more of a coder you might find this a bit intimidating (like me) with all these weird backslashes and pipe characters that do magic stuff. I prefer to write JavaScript. So that’s what we’ll do.

We create a new project folder, paste the geodata and set up node with npm init and add the package shapefile. In the root folder we create a file shp2topo.js with the following content:

var fs = require('fs')
var shapefile = require('shapefile')

// specify the input and output files here
const inputFile = 'g2g17vz.shp'
const outputFile = 'geo.json'

shapefile
  .open(inputFile)
  .then(source => {
    // start with empty geojson skeleton
    let geojson = {
      type: 'FeatureCollection',
      features: [] // into this array we push our features below
    }
    return source.read().then(function log (result) {
      // when done: pass geojson to next function in promise pipeline
      if (result.done) return geojson

      // if not done: add to geojson feature by feature
      const feature = {
        ...result.value,
        // keep from properties only the bfs id
        properties: {
          bfs_id: result.value.properties.GMDNR
        }
      }
      geojson.features.push(feature)
      // continue with next feature/iteration
      return source.read().then(log)
    })
  })
  // write to file; the promise-based writeFile passes errors on to .catch
  .then(fileContent =>
    fs.promises.writeFile(outputFile, JSON.stringify(fileContent))
  )
  .then(() => console.log('The file has been saved!'))
  .catch(console.error)

Let’s quickly go through what’s happening here: After pulling in the packages with require and defining the input and output destination of the shapefile and the GeoJSON, we have a Promise chain. (If you’re not familiar with Promises, read more about them here.)

What’s a bit unusual: shapefile.open(inputFile) doesn’t hand us the whole file at once. It gives us a source from which source.read() yields one area after the next (read more about the package in the docs).

This step-by-step reading is very performant for large files, but a bit inconvenient for us. In the end, we actually just want to have one GeoJSON file.

But no problem, we’ll just construct it ourselves. In the GeoJSON documentation, we find out what our starting point should look like:

{ type: 'FeatureCollection', features: [] }

This is an empty list of areas. So in the following lines, we push each municipality (in GeoJSON these are called “features”) into the empty array features.

Each shape in the shapefile has a geometry, defining its borders, and properties, where we can store data about the area. As there are plenty of data points in the shapefile that we don’t need, we keep only the municipality number (GMDNR) and store it as bfs_id: we copy the feature with the spread syntax and overwrite its properties. (If this syntax is unfamiliar to you, read more about it on MDN.)

At the end, we just write our selfmade GeoJSON into a file that now looks like this:
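If the recursive log function is hard to follow, the same loop can also be written with async/await. This is only a sketch of an alternative, not the code used in the script: collectFeatures is a hypothetical name, and it assumes a source whose read() resolves to { done, value }, as in the script. A tiny fake source with made-up features stands in for the shapefile so that the snippet runs on its own.

```javascript
// Hypothetical alternative to the recursive read loop.
// `source` is anything with a read() method resolving to { done, value }.
async function collectFeatures (source) {
  const geojson = { type: 'FeatureCollection', features: [] }
  while (true) {
    const result = await source.read()
    if (result.done) return geojson
    geojson.features.push({
      ...result.value,
      // keep only the municipality number
      properties: { bfs_id: result.value.properties.GMDNR }
    })
  }
}

// Fake source with made-up features, standing in for shapefile.open()
function fakeSource (features) {
  let i = 0
  return {
    read: async () => i < features.length
      ? { done: false, value: features[i++] }
      : { done: true, value: undefined }
  }
}

const demoSource = fakeSource([
  { type: 'Feature', properties: { GMDNR: 1, NAME: 'A' }, geometry: null },
  { type: 'Feature', properties: { GMDNR: 2, NAME: 'B' }, geometry: null }
])

collectFeatures(demoSource).then(geojson => {
  console.log(geojson.features.map(f => f.properties.bfs_id)) // [ 1, 2 ]
})
```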

{"type":"FeatureCollection","features":[{"type":"Feature","properties":{"bfs_id":1},"geometry":{"type":"Polygon","coordinates":[[[2680806.098200001,1237763.4615000002],[2681154.9166,1237443.4145000018],

But wait. The result is 2.2MB big? Wow, that’s even bigger than our shapefile was. 😟 But don’t give up just yet – to reduce the file size we have different options:

1. Simplifying shapes

One way to reduce the file size of a GeoJSON is to reduce the details of the shapes. This is called simplifying. Let’s quickly drag and drop our shapefile or GeoJSON into the tool mapshaper.

In there we can perform simplification and play around with the level of simplification. Let’s simplify pretty drastically (0.96%) to see how this changes our shapes and the file size:

[Screenshot: the simplified shapes in mapshaper]

When we click export, we get a GeoJSON file that is 889kb.

But almost 1MB? That’s still a lot – and for that, we sacrificed so many details? People will barely recognize the municipalities they’re living in 🙁 Not cool. We’re not gonna go down that road.

If simplification is something that you still want, you could, of course, add it to the node script. You would need to add the package @turf/simplify (the turf packages offer a lot of neat helpers to handle geodata, btw) and add the following line before writing the content to a file. Keep in mind that the tolerance is expressed in the units of the coordinates, so in meters for our data:

// depending on the installed version, this may need to be require('@turf/simplify').default
var simplify = require('@turf/simplify')
// just before write to file add the following line:
.then(geojson => simplify(geojson, { tolerance: 0.05 }))

But let’s continue and check other options that work better:

2. Rounding coordinates

In the first few lines of our GeoJSON, we see that our coordinates are super detailed. Didn’t we say 1 unit equals 1 meter? Why do we need decimal places? That’s way too detailed! So we install geojson-precision and add it to our Promise chain:

var geojsonPrecision = require('geojson-precision')
// just before write to file add the following line:
.then(geojson => geojsonPrecision.parse(geojson, 0))

What it does: it passes our GeoJSON to the precision function. With the number at the end, we define how many decimal places we need. We need none, so 0. (For lat/longs you definitely need more than 0, but for Swiss coordinates, you don’t.)
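To make clear what rounding does to the data, here is a hand-written sketch of the idea. It is not how geojson-precision is implemented: roundCoordinates is a hypothetical helper that walks a nested coordinates array and rounds every number in it.

```javascript
// Hypothetical helper: rounds every number in a (nested) coordinates array.
function roundCoordinates (coords, decimals = 0) {
  if (Array.isArray(coords)) {
    return coords.map(c => roundCoordinates(c, decimals))
  }
  const factor = Math.pow(10, decimals)
  return Math.round(coords * factor) / factor
}

// the first two points of our GeoJSON from above
const ring = [
  [2680806.098200001, 1237763.4615000002],
  [2681154.9166, 1237443.4145000018]
]

const before = JSON.stringify(ring)
const after = JSON.stringify(roundCoordinates(ring))
console.log(after) // [[2680806,1237763],[2681155,1237443]]
console.log(before.length, '→', after.length, 'characters')
```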

This reduces our file size by almost 50% to now 1.2MB. That’s a good start, but can we do more?

3. TopoJSON

Besides GeoJSON, which is really the industry standard for working with geographical data in browsers, there is also a second format: TopoJSON.

What’s cool about it: While in GeoJSON every feature contains all of its coordinates, TopoJSON makes a shared list out of all coordinates. This means that all areas that share a border can then reference it, so you only need to save its coordinates once instead of twice.

Let’s install topojson-server and add it below the precision call. If we would also like to add e.g. the lakes of Switzerland, we could add multiple GeoJSONs here. For simplicity we won’t do this, so we just pass it an object where we call our areas municipalities.
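How much there is to gain depends on how many coordinates are stored more than once. The following sketch counts them for two made-up squares that share a border. countSharedPositions is a hypothetical helper, not a TopoJSON function, and it assumes that every feature is a simple Polygon.

```javascript
// Hypothetical helper: counts positions that occur in more than one feature.
// Assumes every feature is a simple Polygon.
function countSharedPositions (geojson) {
  const seenIn = new Map() // "x,y" → set of feature indexes
  geojson.features.forEach((feature, index) => {
    feature.geometry.coordinates.forEach(ring => {
      ring.forEach(([x, y]) => {
        const key = x + ',' + y
        if (!seenIn.has(key)) seenIn.set(key, new Set())
        seenIn.get(key).add(index)
      })
    })
  })
  return [...seenIn.values()].filter(set => set.size > 1).length
}

// two made-up squares that share the border between (1,0) and (1,1)
const square = x => ({
  type: 'Feature',
  properties: {},
  geometry: {
    type: 'Polygon',
    coordinates: [[[x, 0], [x + 1, 0], [x + 1, 1], [x, 1], [x, 0]]]
  }
})
const neighbors = {
  type: 'FeatureCollection',
  features: [square(0), square(1)]
}

console.log(countSharedPositions(neighbors)) // 2
```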

var topojson = require('topojson-server')
// just before write to file add the following line:
.then(geojson => topojson.topology({ municipalities: geojson }))

The resulting file size is now 814kb. Not too bad, but it will get better with the last step:

4. Quantization

As Mike Bostock, the creator of TopoJSON, explains in an answer on Stack Overflow: While simplification reduces the number of coordinates of a shape, quantization is different:

«Quantization removes information by reducing the precision of each coordinate, effectively snapping each point to a regular grid. This reduces the size of the generated TopoJSON file because each coordinate is represented as an integer (such as between 0 and 9,999) with fewer digits.»

While simplifying also visually distorts our shapes, quantization is practically invisible to the eye, but very effective. We add 1000 as a second argument to our topology call:

topojson.topology({ municipalities: geojson }, 1000)

Which snaps all our coordinates onto a grid of 1000 by 1000 dots. With this added, we reduced our file size to 462kb. Neat, right? We can now even remove geojson-precision again – quantization rounds all our numbers.
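The snapping itself is simple arithmetic. Here is a sketch of the idea, not the code topojson-server runs internally: quantize is a hypothetical helper, and the bounding box is a made-up, rounded box roughly around Switzerland.

```javascript
// Hypothetical helper illustrating the idea of quantization:
// snap a point to one of n × n grid positions inside a bounding box.
function quantize ([x, y], [x0, y0, x1, y1], n) {
  const kx = (n - 1) / (x1 - x0)
  const ky = (n - 1) / (y1 - y0)
  return [Math.round((x - x0) * kx), Math.round((y - y0) * ky)]
}

// made-up, rounded bounding box (LV95, in meters)
const bbox = [2480000, 1070000, 2840000, 1300000]

// the first point of our GeoJSON from above
console.log(quantize([2680806.098200001, 1237763.4615000002], bbox, 1000))

// distance between two neighboring grid positions, in meters
console.log((bbox[2] - bbox[0]) / 999, (bbox[3] - bbox[1]) / 999)
```

In this made-up box, neighboring grid positions are about 360 meters apart horizontally. A larger number gives a finer grid at the cost of file size.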

The beginning of our file now looks like this:

{
  "type": "Topology",
  "objects": {
    "municipalities": {
      "type": "GeometryCollection",
      "geometries": [{
        "type": "Polygon",
        "arcs": [ [ 0, 1, 2, 3, 4, 5 ] ],
        "properties": { "bfs_id": 1 }
      },

Instead of coordinates we now have arcs that reference coordinates by id. If we inspect the result by viewing it in mapshaper we see: Our file is really small but we’re not sacrificing any details:

[Screenshot: the resulting TopoJSON viewed in mapshaper]
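To see what it means that arcs reference coordinates, here is a sketch that turns the arcs of a tiny, hand-made topology back into coordinates. It is not the library code, and in practice a package such as topojson-client does this for you. The topology, decodeArc and decodeRing are made up for illustration, and the sketch assumes a quantized file in which every arc is stored as a start position followed by differences, scaled by transform.

```javascript
// Hypothetical, hand-made mini topology: one square built from two arcs.
const topology = {
  type: 'Topology',
  transform: { scale: [10, 10], translate: [2600000, 1200000] },
  objects: {
    municipalities: {
      type: 'GeometryCollection',
      geometries: [
        { type: 'Polygon', arcs: [[0, 1]], properties: { bfs_id: 1 } }
      ]
    }
  },
  arcs: [
    [[0, 0], [1, 0], [0, 1]], // (0,0) → (1,0) → (1,1), stored as deltas
    [[1, 1], [-1, 0], [0, -1]] // (1,1) → (0,1) → (0,0)
  ]
}

// Turns one arc into absolute coordinates.
// A negative index ~i means: use arc i in reverse direction.
function decodeArc (topology, index) {
  const [sx, sy] = topology.transform.scale
  const [tx, ty] = topology.transform.translate
  let x = 0
  let y = 0
  const arc = topology.arcs[index < 0 ? ~index : index]
  const points = arc.map(([dx, dy]) => {
    x += dx
    y += dy
    return [x * sx + tx, y * sy + ty]
  })
  return index < 0 ? points.reverse() : points
}

// Stitches the arcs of a ring together, dropping the duplicated joints.
function decodeRing (topology, arcIndexes) {
  return arcIndexes.reduce((ring, index, i) =>
    ring.concat(decodeArc(topology, index).slice(i === 0 ? 0 : 1)), [])
}

const polygon = topology.objects.municipalities.geometries[0]
console.log(decodeRing(topology, polygon.arcs[0]))
```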

In our frontend, we’ll later convert this data format back to GeoJSON to map it onto the screen with d3-geo. Part two will follow soon.



Written by Angelo Zehr, data journalist at SRF Data and teacher.

