osm2pgsql Features
Portability
Osm2pgsql is a command line program written in modern, portable C++. It works
on Linux, Windows, macOS, and several other operating systems. It is available
on all major Linux distributions and through Homebrew for macOS users, and we
provide binary downloads for Windows. Docker images are provided by a third party.
Scalability
Osm2pgsql scales from a small geographical area to the whole world. You can import the data for a city in seconds. Or you can have a database with all OpenStreetMap data for the entire planet on a single machine in a few hours.
Backwards Compatibility
Keeping osm2pgsql stable is important, because it is used in production in many places. That’s why osm2pgsql is still able to run a ten-year-old configuration. That being said, osm2pgsql is also rapidly gaining new features and we encourage you to take advantage of them.
Some features have been removed in the recently released 2.0 version. The old pgsql backend is now deprecated and will be removed in version 3.
Stay up-to-date with OSM
An osm2pgsql database can be updated from OSM change files. If you want to, you can keep your database current with only a few minutes’ delay from the main OSM database.
Osm2pgsql comes with its own little helper program called osm2pgsql-replication which makes updating your database a snap.
Table layout
Define any number of tables in the database with any number of columns of any type as you need them. Use builtin data conversion or define your own conversion functions to get the data in the format best suited for your applications.
Use PostgreSQL JSON and hstore data types to store the complete set of tags of
an OSM object in a single database column for maximum flexibility or use
specific columns for specific attributes.
- text
- int, int2, int8
- real
- bool
- json(b)
- hstore
- (any other PostgreSQL datatype)
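As an illustration of the table layout options above, here is a minimal sketch of a Lua configuration for the flex output. The table name `restaurants`, its columns, and the choice of tags are hypothetical; only the column types come from the list above.

```lua
-- Sketch only: table name, column names and tag choices are hypothetical.
local restaurants = osm2pgsql.define_node_table('restaurants', {
    { column = 'name',     type = 'text' },
    { column = 'capacity', type = 'int' },
    { column = 'outdoor',  type = 'bool' },
    { column = 'tags',     type = 'jsonb' },  -- complete set of tags
    { column = 'geom',     type = 'point', not_null = true },
})

function osm2pgsql.process_node(object)
    if object.tags.amenity ~= 'restaurant' then
        return
    end

    restaurants:insert({
        name     = object.tags.name,
        -- Own conversion: tonumber() yields nil (NULL) for non-numeric values
        capacity = tonumber(object.tags.capacity),
        outdoor  = object.tags.outdoor_seating == 'yes',
        tags     = object.tags,
        geom     = object:as_point(),
    })
end
```

Specific attributes get their own typed columns here, while the `tags` column keeps everything else available for later queries.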
Configuration using a programming language
Osm2pgsql is endlessly configurable, because it leverages the power of the Lua programming language for its configuration. The database table schema, indexes, tile expiry, and so on are all configured with Lua code, as are the data cleanup and transformations.
The Lua config has access to environment variables, allowing even more flexible
configuration. And with the use of Lua libraries, functionality can be extended
even further.
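A small sketch of how an environment variable might steer the configuration. The variable name `TABLE_PREFIX` and the table layout are assumptions made for this example; `os.getenv` is the standard Lua function.

```lua
-- TABLE_PREFIX is a hypothetical environment variable; fall back to 'osm'.
local prefix = os.getenv('TABLE_PREFIX') or 'osm'

-- Results in a table such as "osm_points" or "test_points".
local points = osm2pgsql.define_node_table(prefix .. '_points', {
    { column = 'tags', type = 'jsonb' },
    { column = 'geom', type = 'point', not_null = true },
})

function osm2pgsql.process_node(object)
    points:insert({
        tags = object.tags,
        geom = object:as_point(),
    })
end
```

This way the same config file could serve, for instance, a test and a production import without being edited.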
Valid geometries
Osm2pgsql creates point, line, polygon, multipolygon, and other types of
geometries from OSM data and makes sure they are always valid. This makes
map rendering and further processing in the database much less error-prone.
Geometry transformations
Frequently used geometry transformations can be done by osm2pgsql while the
data is being imported. Data is not unnecessarily copied into the database in
intermediate steps.
- Centroid, Labelling point
- Simplification
- Splitting of multi-geometries
- …
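The transformations listed above could look roughly like this in a flex Lua config. This is a sketch under assumptions: table names, column names, tag filters, and the tolerance of 10 (in Web Mercator units) are all made up for illustration.

```lua
-- Sketch only: names, filters and the tolerance value are hypothetical.
local roads = osm2pgsql.define_way_table('roads', {
    { column = 'geom',        type = 'linestring', not_null = true },
    { column = 'geom_simple', type = 'linestring' },
})

local buildings = osm2pgsql.define_area_table('buildings', {
    { column = 'geom',   type = 'polygon', not_null = true },
    { column = 'center', type = 'point' },
})

function osm2pgsql.process_way(object)
    if object.tags.highway then
        local geom = object:as_linestring()
        roads:insert({
            geom = geom,
            -- Geometries start out in WGS84, so transform before
            -- simplifying with a tolerance given in Mercator units.
            geom_simple = geom:transform(3857):simplify(10),
        })
    elseif object.tags.building and object.is_closed then
        local geom = object:as_polygon()
        buildings:insert({ geom = geom, center = geom:centroid() })
    end
end

function osm2pgsql.process_relation(object)
    if object.tags.type == 'multipolygon' and object.tags.building then
        -- Split the multipolygon into one row per polygon.
        for part in object:as_multipolygon():geometries() do
            buildings:insert({ geom = part, center = part:centroid() })
        end
    end
end
```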
Projections
While importing the geometries, osm2pgsql can transform the coordinates into any projection you want. It has builtin support for lightning fast conversion of the OSM lon/lat coordinates to Web Mercator, the most popular projection for map tiles. For everything else it uses the Proj library, which can convert coordinates into nearly any projection.
It is possible to use different projections for each table/column if that’s
needed.
- WGS84 (4326)
- Web Mercator (3857)
- Any projection supported by the Proj library
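To illustrate per-column projections, the following sketch stores each point twice, once in Web Mercator and once in WGS84. The table and column names are hypothetical.

```lua
-- Sketch only: table and column names are hypothetical.
local places = osm2pgsql.define_node_table('places', {
    { column = 'name',       type = 'text' },
    { column = 'geom',       type = 'point', projection = 3857 },
    { column = 'geom_wgs84', type = 'point', projection = 4326 },
})

function osm2pgsql.process_node(object)
    if object.tags.place then
        local geom = object:as_point()
        places:insert({
            name       = object.tags.name,
            -- The same geometry is converted to each column's projection
            -- when the row is written.
            geom       = geom,
            geom_wgs84 = geom,
        })
    end
end
```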
OSM file formats
All popular OpenStreetMap file formats are supported when reading the data.
Multiple files can be read at the same time and their contents will be
merged.
- XML
- PBF
- OPL
- O5M
Organize your database
- schemas
- tablespaces
- custom types
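A sketch of how these options might be combined for one table. It assumes that the schema `osm`, the tablespaces `fast_data` and `fast_index`, and the enum type `highway_class` were created in the database beforehand; all of these names are hypothetical.

```lua
-- Sketch only: schema, tablespace and type names are hypothetical and
-- are assumed to exist in the database before the import starts.
local highways = osm2pgsql.define_way_table('highways', {
    -- Custom SQL type instead of the default type for 'text' columns
    { column = 'class', type = 'text', sql_type = 'highway_class' },
    { column = 'geom',  type = 'linestring', not_null = true },
}, {
    schema           = 'osm',
    data_tablespace  = 'fast_data',
    index_tablespace = 'fast_index',
})

-- Only pass on values that the (assumed) enum type accepts.
local known_classes = { motorway = true, primary = true, residential = true }

function osm2pgsql.process_way(object)
    local class = object.tags.highway
    if not known_classes[class] then
        return
    end
    highways:insert({ class = class, geom = object:as_linestring() })
end
```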
Index creation
Osm2pgsql will always automatically create the indexes it needs for updating
the data (if you want to update the data). By default it will also create
a geometry index for the first (or only) geometry in all tables. But you
can change that and tell osm2pgsql exactly what indexes you’ll need and
it will create them. Index creation is run in parallel to speed up the
import.
- btree indexes
- geometry indexes
- unique indexes
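As a sketch of explicit index configuration, the table below requests a geometry index and a btree index on a text column. Table and column names are hypothetical. The geometry index is listed explicitly on the assumption that setting `indexes` replaces the default.

```lua
-- Sketch only: table and column names are hypothetical.
local pois = osm2pgsql.define_node_table('pois', {
    { column = 'name', type = 'text' },
    { column = 'geom', type = 'point', not_null = true },
}, {
    indexes = {
        { column = 'geom', method = 'gist' },   -- geometry index
        { column = 'name', method = 'btree' },  -- lookups by name
    },
})

function osm2pgsql.process_node(object)
    if object.tags.amenity then
        pois:insert({
            name = object.tags.name,
            geom = object:as_point(),
        })
    end
end
```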
Tile expiry
Most online maps use a tile-based approach, where the map is split into
rectangular (raster or vector) tiles that can be created, delivered and updated
independently. Osm2pgsql can create a list of tiles that need updating based on
the changed OSM data.
- expire files
- expire database tables
- use any zoom level or zoom level range
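A minimal sketch of an expire configuration, assuming a recent osm2pgsql version with the flex output. The file name `dirty_tiles.txt`, the zoom range, and the table layout are hypothetical. Expire information is only produced when the database is updated, not on the initial import.

```lua
-- Sketch only: file name, zoom levels and table layout are hypothetical.
local expire_roads = osm2pgsql.define_expire_output({
    minzoom  = 12,
    maxzoom  = 14,
    filename = 'dirty_tiles.txt',  -- use "table = '...'" for a database table
})

local roads = osm2pgsql.define_way_table('roads', {
    { column = 'geom', type = 'linestring', not_null = true,
      -- Changes to this geometry column mark the affected tiles.
      expire = { { output = expire_roads } } },
})

function osm2pgsql.process_way(object)
    if object.tags.highway then
        roads:insert({ geom = object:as_linestring() })
    end
end
```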
Sorting by geometry
By default osm2pgsql will order the tables by geometry on import. This can speed up further processing considerably, because data that’s geographically near to other data will probably be used together.
If not needed, this function can be disabled for a faster import.
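Disabling the ordering for a single table might look like the following sketch, which assumes the `cluster` table option of the flex output. Table and column names are hypothetical.

```lua
-- Sketch only: table and column names are hypothetical.
local buildings = osm2pgsql.define_area_table('buildings', {
    { column = 'geom', type = 'polygon', not_null = true },
}, {
    cluster = 'no',  -- skip ordering by geometry for this table
})

function osm2pgsql.process_way(object)
    if object.tags.building and object.is_closed then
        buildings:insert({ geom = object:as_polygon() })
    end
end
```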
Raw OSM data in database
Osm2pgsql can create and update tables that contain all OSM data including
all tags, all attributes (version, creation timestamp, changeset, user id, user
name) as well as the relationships between ways and their member nodes and
relations and their members.
- nodes
- ways
- relations
Handling untagged objects
For performance reasons, you are usually only interested in tagged OSM objects.
But access to untagged objects is available for the special cases where
you need it.
- process_untagged_node()
- process_untagged_way()
- process_untagged_relation()
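A sketch of how the callbacks listed above could be used to store every node, tagged or not, in one table. The table name and the `tagged` column are hypothetical.

```lua
-- Sketch only: table and column names are hypothetical.
local all_nodes = osm2pgsql.define_node_table('all_nodes', {
    { column = 'tagged', type = 'bool' },
    { column = 'geom',   type = 'point', not_null = true },
})

-- Called for nodes that have tags.
function osm2pgsql.process_node(object)
    all_nodes:insert({ tagged = true, geom = object:as_point() })
end

-- Called for nodes without any tags.
function osm2pgsql.process_untagged_node(object)
    all_nodes:insert({ tagged = false, geom = object:as_point() })
end
```

Leaving out the `process_untagged_node()` function keeps the faster default of processing tagged nodes only.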
Two-stage processing for relations
Osm2pgsql has advanced support for working with OSM relations. Using optional
two-stage processing, tags and other information from relations can be attached
to their member objects. This is useful for route relations, for instance.
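The route relation case can be sketched as follows, assuming the flex output's `select_relation_members()` callback. Table and column names are hypothetical, and the example only handles road routes: each highway way ends up with the refs of all route relations it belongs to.

```lua
-- Sketch only: table and column names are hypothetical.
local highways = osm2pgsql.define_way_table('highways', {
    { column = 'tags',     type = 'jsonb' },
    { column = 'rel_refs', type = 'text' },  -- refs of route relations
    { column = 'geom',     type = 'linestring', not_null = true },
})

-- Maps way id -> { relation id -> ref of that relation }
local way_to_refs = {}

local function is_road_route(tags)
    return tags.type == 'route' and tags.route == 'road'
end

-- Tell osm2pgsql which member ways need to be processed (again)
-- in the second stage.
function osm2pgsql.select_relation_members(relation)
    if is_road_route(relation.tags) then
        return { ways = osm2pgsql.way_member_ids(relation) }
    end
end

-- Remember the ref of each route relation for its member ways.
function osm2pgsql.process_relation(object)
    if is_road_route(object.tags) and object.tags.ref then
        for _, member in ipairs(object.members) do
            if member.type == 'w' then
                way_to_refs[member.ref] = way_to_refs[member.ref] or {}
                way_to_refs[member.ref][object.id] = object.tags.ref
            end
        end
    end
end

function osm2pgsql.process_way(object)
    if not object.tags.highway then
        return
    end

    local row = { tags = object.tags, geom = object:as_linestring() }

    local collected = way_to_refs[object.id]
    if collected then
        local refs = {}
        for _, ref in pairs(collected) do
            refs[#refs + 1] = ref
        end
        table.sort(refs)
        row.rel_refs = table.concat(refs, ',')
    end

    highways:insert(row)
end
```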
Postprocessing (beta)
Osm2pgsql can run several types of postprocessing steps after an initial import of the OSM data or after updates of changed OSM data.
This is currently done with the separate osm2pgsql-gen command.
- Create any indexes
- Run any SQL
- Run any SQL per tile
Generalization (beta)
Geographic data usually needs to be generalized for small zoom levels/small
map scales. This is difficult to do automatically and often quite slow.
Osm2pgsql implements several algorithms for generalization that can be used
at scale.
- Line simplification
- Polygon simplification
- Tile-based generalization
- Discrete Isolation
Themepark (beta)
Themepark is a framework for mixing and matching several configurations into one, allowing you to re-use configurations others have written and merge them with your own configurations to quickly assemble a setup that works for you.
This can be used, for instance, to create different rendering styles from
the same database, or even to use one database for rendering and geocoding.