The ohsome-planet tool transforms OSM (history) PBF files into GeoParquet format. It builds the actual geometries of OSM elements for nodes, ways and relations. The tool can join information from OSM changesets, such as hashtags, the OSM editor or usernames. You can also join country codes to every OSM element by passing a boundary dataset as additional input.

You can use the ohsome-planet data to perform a wide range of geospatial analyses, e.g. using DuckDB, GeoPandas or QGIS. Display the data directly on a map and start playing around!


  • Java 21


First, clone the repository and its submodules. Then, build it with Maven.

git clone --recurse-submodules
cd ohsome-planet
./mvnw clean package -DskipTests


You can download the full planet file (latest or full history), or get PBF files for smaller regions from Geofabrik.

To process a given PBF file, pass it via the --pbf parameter, as in the following example.

java -jar ohsome-planet-cli/target/ohsome-planet.jar contributions \
    --pbf data/karlsruhe.osh.pbf \
    --country-file data/world.csv \
    --output out-karlsruhe

The parameters --country-file, --output and --overwrite are optional. To see all available parameters, call the tool with the --help parameter.

Country Data

By passing the --country-file parameter, you can perform a spatial join that enriches OSM contributions with country codes. The country file should be provided in .csv format, with geometries represented as WKT (well-known text) strings. The current version only supports POLYGON and MULTIPOLYGON geometries.

Basically, the file should look like this:

DEU;POLYGON ((7.954102 49.781264, 11.118164 49.781264, 11.118164 51.563412, 7.954102 51.563412, 7.954102 49.781264))
FRA;POLYGON ((1.186523 45.058001, 4.833984 45.058001, 4.833984 48.545705, 1.186523 48.545705, 1.186523 45.058001))
ITA;POLYGON ((10.766602 41.211722, 14.985352 41.211722, 14.985352 44.024422, 10.766602 44.024422, 10.766602 41.211722))
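
Before starting a long run, it can help to sanity-check the country file. The following Python snippet is a minimal sketch and not part of ohsome-planet. It assumes one row per country with the code and the WKT geometry separated by a semicolon, as in the example above. The function name `check_country_rows` and the `XXX` row are hypothetical, and the check only looks at the geometry type prefix; it does not parse the WKT.

```python
import csv
import io

# Geometry types the README lists as supported.
ALLOWED_PREFIXES = ("POLYGON", "MULTIPOLYGON")


def check_country_rows(text):
    """Return (line_number, problem) tuples; an empty list means no problems found."""
    problems = []
    reader = csv.reader(io.StringIO(text), delimiter=";")
    for line_number, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            problems.append((line_number, f"expected 2 fields, got {len(row)}"))
            continue
        code, wkt = row[0].strip(), row[1].strip()
        if not code:
            problems.append((line_number, "empty country code"))
        if not wkt.upper().startswith(ALLOWED_PREFIXES):
            problems.append((line_number, "geometry is not POLYGON or MULTIPOLYGON"))
    return problems


# First row is taken from the example above, second row is a deliberately bad one.
sample = (
    "DEU;POLYGON ((7.954102 49.781264, 11.118164 49.781264, 11.118164 51.563412, "
    "7.954102 51.563412, 7.954102 49.781264))\n"
    "XXX;POINT (8.4 49.0)\n"
)
print(check_country_rows(sample))
# → [(2, 'geometry is not POLYGON or MULTIPOLYGON')]
```

To check a real file, read it first, e.g. `check_country_rows(open("data/world.csv", encoding="utf-8").read())`.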


We will add the functionality to join OSM changeset metadata soon. Stay tuned!

Output Structure

When using a history PBF file, the output files are split into history and latest contributions. All contributions that are a) not deleted and b) visible in OSM at the timestamp of the extract are considered latest. The remaining contributions, e.g. deleted or old versions, are considered history.

├── contributions
│   ├── history
│   │   ├── node-0-163811-history.parquet
│   │   ├── ...
│   │   ├── way-0-2496473-history.parquet
│   │   ├── ...
│   │   ├── relation-0-12345-history.parquet
│   │   └── ...
│   └── latest
│       ├── node-0-163811-latest.parquet
│       ├── ...
│       ├── way-0-2496473-latest.parquet
│       ├── ...
│       ├── relation-0-12345-latest.parquet
│       └── ...
├── minorNodes (rocksdb)
└── minorWays (rocksdb)
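
To get a quick overview of what a run produced, you can count the Parquet files per partition and OSM type. This is a rough sketch using only the Python standard library. It assumes the layout shown above, i.e. a `contributions` folder with `history` and `latest` subfolders and file names that start with the OSM type. The function name `summarize_output` is hypothetical.

```python
from collections import Counter
from pathlib import Path


def summarize_output(output_dir):
    """Count Parquet files per (partition, OSM type), e.g. ('latest', 'way')."""
    counts = Counter()
    contributions = Path(output_dir) / "contributions"
    for path in sorted(contributions.glob("*/*.parquet")):
        partition = path.parent.name           # "history" or "latest"
        osm_type = path.name.split("-", 1)[0]  # "node", "way" or "relation"
        counts[(partition, osm_type)] += 1
    return counts


if __name__ == "__main__":
    # "out-karlsruhe" is the output folder used in the example command above.
    for (partition, osm_type), n in sorted(summarize_output("out-karlsruhe").items()):
        print(f"{partition:8} {osm_type:9} {n} file(s)")
```

The RocksDB folders (minorNodes, minorWays) are ignored here because the glob only looks below `contributions`.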

Inspect Results

You can inspect your results easily using DuckDB.

-- list all columns
DESCRIBE FROM read_parquet('out-karlsruhe/contributions/*/*.parquet');

-- result
│    column_name    │                                                                column_type                                                                 │  null   │   key   │ default │  extra  │
│      varchar      │                                                                  varchar                                                                   │ varchar │ varchar │ varchar │ varchar │
│ status            │ VARCHAR                                                                                                                                    │ YES     │         │         │         │
│ valid_from        │ TIMESTAMP WITH TIME ZONE                                                                                                                   │ YES     │         │         │         │
│ valid_to          │ TIMESTAMP WITH TIME ZONE                                                                                                                   │ YES     │         │         │         │
│ osm_type          │ VARCHAR                                                                                                                                    │ YES     │         │         │         │
│ osm_id            │ BIGINT                                                                                                                                     │ YES     │         │         │         │
│ osm_version       │ INTEGER                                                                                                                                    │ YES     │         │         │         │
│ osm_minor_version │ INTEGER                                                                                                                                    │ YES     │         │         │         │
│ osm_edits         │ INTEGER                                                                                                                                    │ YES     │         │         │         │
│ osm_last_edit     │ TIMESTAMP WITH TIME ZONE                                                                                                                   │ YES     │         │         │         │
│ user              │ STRUCT(id INTEGER, "name" VARCHAR)                                                                                                         │ YES     │         │         │         │
│ tags              │ MAP(VARCHAR, VARCHAR)                                                                                                                      │ YES     │         │         │         │
│ tags_before       │ MAP(VARCHAR, VARCHAR)                                                                                                                      │ YES     │         │         │         │
│ changeset         │ STRUCT(id BIGINT, created_at TIMESTAMP WITH TIME ZONE, closed_at TIMESTAMP WITH TIME ZONE, tags MAP(VARCHAR, VARCHAR), hashtags VARCHAR[]) │ YES     │         │         │         │
│ bbox              │ STRUCT(xmin DOUBLE, ymin DOUBLE, xmax DOUBLE, ymax DOUBLE)                                                                                 │ YES     │         │         │         │
│ centroid          │ STRUCT(x DOUBLE, y DOUBLE)                                                                                                                 │ YES     │         │         │         │
│ geometry_type     │ VARCHAR                                                                                                                                    │ YES     │         │         │         │
│ geometry          │ BLOB                                                                                                                                       │ YES     │         │         │         │
│ area              │ DOUBLE                                                                                                                                     │ YES     │         │         │         │
│ area_delta        │ DOUBLE                                                                                                                                     │ YES     │         │         │         │
│ length            │ DOUBLE                                                                                                                                     │ YES     │         │         │         │
│ length_delta      │ DOUBLE                                                                                                                                     │ YES     │         │         │         │
│ contrib_type      │ VARCHAR                                                                                                                                    │ YES     │         │         │         │
│ refs              │ BIGINT[]                                                                                                                                   │ YES     │         │         │         │
│ members           │ STRUCT("type" VARCHAR, id BIGINT, "role" VARCHAR, geometry_type VARCHAR, geometry BLOB)[]                                                  │ YES     │         │         │         │
│ country_iso       │ VARCHAR[]                                                                                                                                  │ YES     │         │         │         │
│ build_time        │ BIGINT                                                                                                                                     │ YES     │         │         │         │
│ 26 rows                                                                                                                                                                                      6 columns │
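
If you prefer to query from Python, the same files can be read with the `duckdb` package. The snippet below is only a sketch under a few assumptions: the output folder follows the structure shown earlier, the `osm_type` column is as listed in the schema above, and the helper names `contributions_per_type_query` and `print_contributions_per_type` are hypothetical. The path is inserted into the SQL string without escaping, so only use it with trusted folder names.

```python
def contributions_per_type_query(output_dir, partition="latest"):
    """Build a DuckDB query that counts contributions per OSM type."""
    # Assumed layout: <output_dir>/contributions/<partition>/*.parquet
    pattern = f"{output_dir}/contributions/{partition}/*.parquet"
    return (
        "SELECT osm_type, count(*) AS contributions "
        f"FROM read_parquet('{pattern}') "
        "GROUP BY osm_type ORDER BY osm_type"
    )


def print_contributions_per_type(output_dir, partition="latest"):
    """Run the query and print the result table."""
    import duckdb  # third-party package: pip install duckdb

    duckdb.sql(contributions_per_type_query(output_dir, partition)).show()
```

For the example run above, `print_contributions_per_type("out-karlsruhe")` would print one row per OSM type for the latest contributions; pass `partition="history"` for the history files.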

Getting Started as Developer

This is a list of resources that you might want to look at to get a better understanding of the core concepts used in this project. In general, you should gain some understanding of the raw OSM (history) data format and know how to build geometries from nodes, ways and relations. Knowledge of (Geo)Parquet files is useful as well.

What is the OSM PBF File Format?

What is parquet?

What is RocksDB?

  • RocksDB is a storage engine with key/value interface, where keys and values are arbitrary byte streams. It is a C++ library. It was developed at Facebook based on LevelDB and provides backwards-compatible support for LevelDB APIs.

How to build OSM geometries (for multipolygons)?


