
osm2pgsql - importing an OpenStreetMap planet file takes very long


I have installed Nominatim on a server dedicated to OSM data, with the following configuration: CentOS 7, 2x Intel Xeon L5420 CPUs @ 2.50GHz (8 CPU cores in total), 16 GB of RAM, and 2x 2TB SATA hard drives.

I've configured PostgreSQL based on the recommendations in the Nominatim install wiki (http://wiki.openstreetmap.org/wiki/Nominatim/Installation#PostgreSQL_Tuning), taking into account that my machine has only 16 GB instead of the 32 GB those settings are recommended for. I've used the following settings:

shared_buffers = 1GB             # recommended for a 32GB machine was 2 GB
maintenance_work_mem = 4GB       # recommended for a 32GB machine was 8 GB
work_mem = 20MB                  # recommended for a 32GB machine was 50 MB
effective_cache_size = 10GB      # recommended for a 32GB machine was 24 GB
synchronous_commit = off
checkpoint_segments = 100
checkpoint_timeout = 10min
checkpoint_completion_target = 0.9
fsync = off
full_page_writes = off
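
To make the scaling above concrete, here is a minimal Python sketch that shrinks the 32 GB values in proportion to the available RAM. Strict linear scaling is an assumption made only for this illustration, not something the wiki prescribes, and the names scale_settings and format_mb are hypothetical. Note that it does not reproduce the work_mem and effective_cache_size values listed above, which were chosen more conservatively.

```python
# Values recommended for a 32 GB machine, in MB, as quoted in the settings above.
RECOMMENDED_32GB_MB = {
    "shared_buffers": 2 * 1024,
    "maintenance_work_mem": 8 * 1024,
    "work_mem": 50,
    "effective_cache_size": 24 * 1024,
}


def scale_settings(ram_gb, reference_ram_gb=32):
    """Scale each memory setting linearly with RAM (an assumption, not a rule)."""
    factor = ram_gb / reference_ram_gb
    return {name: int(mb * factor) for name, mb in RECOMMENDED_32GB_MB.items()}


def format_mb(mb):
    """Render a size in MB using the GB/MB suffixes that postgresql.conf accepts."""
    return f"{mb // 1024}GB" if mb % 1024 == 0 else f"{mb}MB"


if __name__ == "__main__":
    for name, mb in scale_settings(16).items():
        print(f"{name} = {format_mb(mb)}")
```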

First, I tried importing a small country extract (Luxembourg) with a cache size of 6000, using the setup.php file from utils. It imported successfully in under 1 hour.

Second, I deleted the Luxembourg data and, as another test, imported the country extract of Great Britain with a cache size of 8000. It imported successfully as well, in around 2-3 hours.

Today I decided to try importing the whole planet.pbf file, so I deleted the PostgreSQL database, downloaded a planet pbf from one of the official mirror sites, and ran the setup with a cache size of 10000. Beforehand, I read up on some benchmarks to get a rough idea of how much time and space this operation would take.

When the import started, I was very surprised. The nodes were imported at a whopping 1095.6k/s; in the benchmark I had analyzed (a machine with 32 GB of RAM), the rate was only 311.7k/s.

But when the node import finished and the way import started, the speed dropped significantly. The ways were being imported at 0.16k/s (although the rate was slowly rising: it started at 0.05k/s and took 4 hours to reach that value).

I stopped the import and tried to tweak the settings. First I allocated a larger cache (12000), but with no success: the nodes imported at a very high speed, but the ways stayed at 0.10-0.13k/s. I then tried adding a new swap file (the original was 8GB; I allocated another 32GB as a swap file), but that didn't change anything either. Lastly, I edited setup.php, changed --number-processes from 1 to 6, and added the --slim option where osm2pgsql is started from there, but nothing changed.
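
For reference, the options mentioned above correspond roughly to an osm2pgsql invocation like the one assembled in this Python sketch. It is only an approximation: setup.php passes further options that are not reproduced here, and the database name, file name, and the helper build_osm2pgsql_command are hypothetical.

```python
import shlex


def build_osm2pgsql_command(pbf_path, database, cache_mb, processes):
    """Assemble an osm2pgsql argument list for a slim-mode import."""
    return [
        "osm2pgsql",
        "--create",                 # fresh import rather than an update
        "--slim",                   # store temporary data in the database
        "--cache", str(cache_mb),   # node cache size in MB
        "--number-processes", str(processes),
        "--database", database,
        pbf_path,
    ]


if __name__ == "__main__":
    # Database name and file name are placeholders for illustration.
    cmd = build_osm2pgsql_command("planet-latest.osm.pbf", "nominatim", 12000, 6)
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems if the file path contains spaces.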

Right now I am out of ideas. Is this drop in speed normal? Should I upgrade my machine to the recommended memory? I thought that 16 GB of RAM would be enough for the planet pbf. I was aware that it could take more time on this machine than on one with 32 GB, but this seems like far too much. If the whole planet import took no more than 12-15 days, I would be OK with that, but as things look now, with these settings the import would take around 2 months, and that is just too much, considering that an error could occur at any point and I would have to start the whole import process again.
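
A duration estimate like this can be sanity-checked from the rate that osm2pgsql prints. The Python sketch below converts a rate in k/s into days for a given number of items. The item count in the example is a hypothetical round number, not the real number of ways in the planet file, and import_days is a name made up for this illustration.

```python
SECONDS_PER_DAY = 86_400


def import_days(item_count, rate_k_per_s):
    """Days needed to process item_count items at rate_k_per_s thousand items/s."""
    seconds = item_count / (rate_k_per_s * 1000)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    ways = 100_000_000  # hypothetical count, for illustration only
    for rate in (0.05, 0.10, 0.16):  # way import rates reported above
        print(f"{rate:.2f}k/s -> {import_days(ways, rate):.1f} days")
```

Because the time is inversely proportional to the rate, even a small absolute change at these speeds shifts the estimate by weeks.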

Any ideas what could cause this problem, or what other tweaks I could try to speed up the import process?

Thanks


Solution

  • I had a similar performance problem with SATA drives. When I replaced the SATA drives with SSDs, the ways import sped up from 0.02k/s to 8.29k/s. Now I have a very slow relations import, running at a rate of 0.01/s, so I believe memory is also an important factor for a full planet import, but I have not tested it again.