Redefine collection-level metadata and v0.3.0 #39

Open
wants to merge 15 commits into
base: main
27 changes: 25 additions & 2 deletions .github/workflows/test.yaml
Original file line number Diff line number Diff line change
@@ -3,7 +3,7 @@ on:
- push
- pull_request
jobs:
deploy:
docs:
runs-on: ubuntu-latest
steps:
- uses: actions/setup-python@v5
@@ -15,7 +15,30 @@ jobs:
pip install pipenv
pipenv install
- run: pipenv run test-docs
schema:
runs-on: ubuntu-latest
steps:
- uses: actions/setup-python@v5
with:
python-version: '>=3.9'
- uses: actions/checkout@v4
- name: Install pipenv
run: |
pip install pipenv
pipenv install
- run: pipenv run test-schema
examples:
runs-on: ubuntu-latest
needs: schema
steps:
- uses: actions/setup-python@v5
with:
python-version: '>=3.9'
- uses: actions/checkout@v4
- name: Install pipenv
run: |
pip install pipenv
pipenv install
- run: pipenv run test-geojson-features
- run: pipenv run test-geojson-collection
- run: pipenv run test-geoparquet
- run: pipenv run test-geoparquet
27 changes: 17 additions & 10 deletions CHANGELOG.md
@@ -11,27 +11,34 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

- Property `category`
- Property `determination_details`
- Information about the encoding of datatypes at the collection-level

### Changed

- ...

### Deprecated

- ...
- Switched from v0.1.0 to v0.2.0 of the schema language
- Renamed `fiboa_extensions` to `schemas`
- Schemas must be valid HTTP(S) URLs
- GeoParquet: Renamed Parquet metadata key from `fiboa` to `collection`
- GeoJSON: Switched `contentEncoding` for data type `binary` from `binary` to `base64`
- GeoJSON data types: `null` is not allowed any longer, instead omit the property
- GeoJSON FeatureCollection: Collection-level data is provided at the top-level, not in a `fiboa` property

### Removed

- Value `administrative` was removed from `determination_method` in favour of the new property `category`
- Value `administrative` was removed from `determination_method` in favor of the new property `category`
- `fiboa_version` in favor of adding the schema URL of the specification to `schemas`.
- GeoJSON Feature: `links` property

### Fixed

- Various minor clarifications and editorial enhancements
- GeoParquet encoding: Properties that are optional can be omitted if all values are null values
- GeoJSON encoding: Clarify the encoding of the top-level properties (including `links` and `fiboa`)
- GeoJSON encoding: Clarify the use of RFC 7946
- GeoParquet encoding for bounding boxes and objects
- Added descriptions to the allowed values for `determination_method`
- Clarified handling of missing values
- GeoJSON: Clarify the encoding of the top-level properties (including `links` and `fiboa`)
- GeoJSON: Clarify the use of RFC 7946
- GeoParquet: Properties that are optional can be omitted if all values are null values
- GeoParquet: Added encoding for bounding boxes and objects
- GeoParquet: Clarified the use of Map and Struct data types

## [v0.2.0] - 2024-04-10

2 changes: 1 addition & 1 deletion CITATION.cff
@@ -8,7 +8,7 @@ preferred-citation:
type: standard
title: "Field Boundaries for Agriculture (fiboa) specification"
abstract: "Making field boundaries openly available in a unified way."
version: 0.2.0
version: 0.3.0
year: 2024
date-released: 2024-04-10
license: Apache-2.0
3 changes: 3 additions & 0 deletions CONTRIBUTING.md
@@ -40,6 +40,9 @@ We use pipenv to execute the tests.
Start with the following command in the folder where this README is located:
`pip install pipenv --user`

Install the dependencies for the test:
`pipenv install`

Finally, you can run the tests as follows:

- To check the markdown run: `pipenv run test-docs`
25 changes: 20 additions & 5 deletions README.md
@@ -7,7 +7,7 @@ This repository contains the core specification for fiboa, including the data sc
For more context, information on the ecosystem, and points of contact see the
[fiboa github organization](https://github.com/fiboa/).

- Version: **0.2.0**
- Version: **0.3.0**

> [!IMPORTANT]
> The fiboa specification is a work in progress.
@@ -29,15 +29,30 @@ The specification in this repository consists of three parts:
- [GeoJSON Encoding](geojson/README.md)
- [GeoParquet Encoding](geoparquet/README.md)

To completent the specification, there are also best practices and extensions available:
To complement the specification, there are also best practices and extensions available:

- [Best Practices](best-practices/README.md)
- [Extensions](https://github.com/fiboa/extensions/)

The repository also contains additional information about the project:
## Relation to other standards and working groups

- [Changelog](CHANGELOG.md)
- [Citation Details (as CFF file)](CITATION.cff)
fiboa doesn't aim to reinvent the wheel.
Our aim is to align with existing efforts as much as possible.
Some parts of the specification are already based on the work of other initiatives,
e.g. the determination-related fields in the core specification.

Related standards and working groups are:

- [Adapt standard](https://adaptstandard.org), including their [WG17](https://github.com/ADAPT/Standard/issues/97)
- [Varda FieldID](https://www.varda.ag/global-field-id)
- [Deere Boundaries](https://developer.deere.com/dev-docs/boundaries)
- [AgGateway](https://aggateway.org/), including their
[Locking in on Field Boundaries](https://aggateway.org/Portals/1010/WebSite/About%20Us/FIELD%20BOUNDARY%20FLYER%20122123.pdf?ver=2024-01-03-212959-590) initiative

If you think we are missing relevant work here, we'd love to hear from you.
Please get in touch by [opening an issue](https://github.com/fiboa/specification/issues/new)!

## Contributing

The fiboa community strives to provide a welcoming and transparent environment for all of the project’s participants.
You can find additional information about our community best practices and collaborative development processes below:
7 changes: 7 additions & 0 deletions best-practices/README.md
@@ -7,3 +7,10 @@

All properties should be using snake case.
For example a field for a land-use class should be named `landuse_class` instead of `landuseClass`.

## Extension prefixes

All properties in an extension should have a common prefix.
Extensions commonly use the colon (`:`) as separator between prefix and property name, e.g. `crop:name`.
A single underscore (`_`) should not be used as the separator, as it can conflict with other property names (see [Casing](#casing)).
Nevertheless, the separator can be chosen freely by extension authors.
136 changes: 93 additions & 43 deletions core/README.md
@@ -1,57 +1,88 @@
# Core Specification
# Core Specification <!-- omit in toc -->

This specification describes the core data and metadata properties for both at the
Collection and Feature level.
This specification describes the core data and metadata properties that describe a fiboa Feature.
The specification doesn't distinguish between collection-level and feature-level properties;
common definitions are shared across these levels.

- A Collection refers to a group of one or more features.
- A Feature is a single field geometry with additional properties.

> [!NOTE]
> The Core Specification is still work in progress. Feedback is welcome!
- **Schema:** <https://fiboa.github.io/specification/v0.3.0/schema.yaml>

- **Schema:** <https://fiboa.github.io/specification/v0.2.0/schema.yaml>
## Table of Contents <!-- omit in toc -->

## Schema
- [General Properties](#general-properties)
- [schemas](#schemas)
- [id](#id)
- [collection](#collection)
- [category](#category)
- [Spatial Properties](#spatial-properties)
- [area / perimeter](#area--perimeter)
- [Determination Properties](#determination-properties)
- [determination\_datetime](#determination_datetime)
- [determination\_method](#determination_method)
- [Schema Language](#schema-language)

The data types in the following document are defined in
[fiboa Schema](https://github.com/fiboa/schema), v0.1.0.
## General Properties

fiboa Schema defines a (limited) set of data types and a vocabulary to express
additional constraints for these data types.
This allows to define a clear mapping between the core specification and its encodings.
| Property Name | Data Type | Description |
| ------------- | ------------------------------- | ----------- |
| schemas | object\<string, array\<string>> | **REQUIRED.** A list of schemas the collection implements. |
| id | string | **REQUIRED.** An identifier for the field. |
| collection | string | **REQUIRED.** The identifier of the collection. |
| category | array\<string> | A set of categories the field boundary belongs to. |

### schemas

The schemas the collection implements.
Each schema must be a valid HTTP(S) URL to an existing YAML file compliant with fiboa Schema.
The schema for this specification (see above) is required to be provided.

Each `collection` must have a single set of applicable schemas.
The key of the dictionary must be equal to the value provided for the `collection` property.

- [Data types](https://github.com/fiboa/schema/blob/v0.1.0/datatypes.md)
- [Vocabulary](https://github.com/fiboa/schema/blob/v0.1.0/README.md#vocabulary)
The schema URI for fiboa that is listed above is required to be present.

## Collection
**Example for `schemas`:**

Collection-level metadata must be provided in an object that contains the properties below.
The invidiual encodings may decide to embed the collection or make it available separately.
This describes two collections `abc` and `xyz`.

### Properties
```json
{
  "abc": [
    "https://fiboa.github.io/specification/v0.3.0/schema.yaml"
  ],
  "xyz": [
    "https://fiboa.github.io/specification/v0.3.0/schema.yaml",
    "https://fiboa.github.io/crop-extension/v0.1.0/schema.yaml"
  ]
}
```
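
The rules for `schemas` can be checked mechanically. The following is a minimal sketch, not part of the specification: the helper name `check_schemas` and its error messages are hypothetical, and it only checks the URL form and the presence of the core schema. It does not verify that the URLs resolve to existing YAML files.

```python
from urllib.parse import urlparse

# URL of the core schema as listed at the top of the core specification.
CORE_SCHEMA = "https://fiboa.github.io/specification/v0.3.0/schema.yaml"


def check_schemas(schemas, collection):
    """Return a list of problems for one collection (hypothetical helper)."""
    urls = schemas.get(collection)
    if urls is None:
        # The dictionary key must equal the value of the `collection` property.
        return [f"no schemas listed for collection '{collection}'"]

    problems = []
    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"not an HTTP(S) URL: {url}")
    if CORE_SCHEMA not in urls:
        problems.append("core schema is missing")
    return problems


schemas = {
    "abc": ["https://fiboa.github.io/specification/v0.3.0/schema.yaml"],
    "xyz": [
        "https://fiboa.github.io/specification/v0.3.0/schema.yaml",
        "https://fiboa.github.io/crop-extension/v0.1.0/schema.yaml",
    ],
}
print(check_schemas(schemas, "xyz"))
```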

| Property Name | Data Type | Description |
| ---------------- | -------------- | ----------- |
| fiboa_version | string | **REQUIRED.** Version number of the fiboa specification this entity implements. |
| fiboa_extensions | array\<string> | A list of URIs to extensions this entity implements. |
### id

Generally, the version and the extensions must be uniform per Collection.
It must be unique per collection, i.e. `collection` and `id` form a unique identifier.
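
As an illustration of the uniqueness rule, the sketch below reports every (`collection`, `id`) pair that occurs more than once. It assumes features are available as flat dictionaries with `collection` and `id` keys, which is a simplification and not a layout defined by any encoding; `duplicate_identifiers` is a hypothetical name.

```python
from collections import Counter


def duplicate_identifiers(features):
    """Return the (collection, id) pairs that are used by more than one feature."""
    counts = Counter((f["collection"], f["id"]) for f in features)
    return sorted(pair for pair, n in counts.items() if n > 1)


features = [
    {"collection": "abc", "id": "1"},
    {"collection": "xyz", "id": "1"},  # same id, different collection: allowed
    {"collection": "abc", "id": "1"},  # duplicate within `abc`
]
print(duplicate_identifiers(features))
```

The same `id` may appear in different collections; only the combination has to be unique.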

Other properties are also allowed to be provided, but are not described by this specification.
### collection

## Features
A collection is a group of one or more features with a unique identifier, stored in the `collection` property.

### General Properties
Encodings may support storing properties that have the same value across all features at the collection level.
This de-duplicates data for more efficient resource usage, but only applies if more than two features are available for the collection.
The specific location and behavior of collection-level data is defined in the encoding-specific specifications.

| Property Name | Data Type | Description |
| ------------- | -------------- | ----------- |
| id | string | **REQUIRED.** A unique identifier for the field. It must be unique within the [Collection](#collection). |
| collection | string | The identifier of the parent collection. |
| category | array\<string> | A set of categories the field boundary belongs to. |
**Example:**

**collection:** The collection identifier is usually only needed for merged datasets.
You have two different field boundary datasets named `abc` (CC-0 licensed) and `xyz` (CC-BY-4.0 licensed).
If you store the datasets separately, you can store the license in the collection-level data,
as the value of the property is the same for all features.
Once you merge the two datasets, you must ensure that a unique identifier for the collection is provided
(here: `abc` and `xyz`) so that IDs remain unique.
Additionally, you have to add the license property at the feature level, as there are now two different licenses.
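
The merge described in this example could be sketched as follows. The in-memory layout (a `collection_level` dictionary plus a list of `features` per dataset) and the function name `merge_datasets` are assumptions made for illustration only; actual encodings define their own layout.

```python
def merge_datasets(datasets):
    """Merge datasets, copying collection-level values onto every feature."""
    merged = []
    for collection_id, dataset in datasets.items():
        shared = dataset["collection_level"]
        for feature in dataset["features"]:
            # Feature values win over shared values; `collection` is always set.
            merged.append({**shared, **feature, "collection": collection_id})
    return merged


datasets = {
    "abc": {"collection_level": {"license": "CC-0"}, "features": [{"id": "1"}, {"id": "2"}]},
    "xyz": {"collection_level": {"license": "CC-BY-4.0"}, "features": [{"id": "1"}]},
}
merged = merge_datasets(datasets)
print(merged)
```

After merging, both collections contain a feature with `id` `1`, but the pairs (`abc`, `1`) and (`xyz`, `1`) stay distinct, and each feature carries its own license.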

**category:** Choose any (unique) combination of the following values:
### category

Choose any (unique) combination of the following values:

- `conceptual`: This boundary represents how the grower thinks of a field, and what they would share with service
providers to allocate information at the highest level of the field concept within their operation.
@@ -65,7 +96,7 @@ Other properties are also allowed to be provided, but are not described by this

The categories are based on the [definitions of the AgGateway initiative](https://aggateway.org/Portals/1010/WebSite/About%20Us/FIELD%20BOUNDARY%20FLYER%20122123.pdf?ver=2024-01-03-212959-590).

### Spatial Properties
## Spatial Properties

| Property Name | Data Type | Description |
| ------------- | ------------ | ----------- |
@@ -74,29 +105,36 @@ The categories are based on the [definitions of the AgGateway initiative](https:
| area | float | Area of the field, in hectares. Must be > 0 and <= 100,000. |
| perimeter | float | Perimeter of the field, in meters. Must be > 0 and <= 125,000. |

**area/perimeter:** These are derived attributes from the geometry itself,
### area / perimeter

These are derived attributes from the geometry itself,
and must match the geometry's area/perimeter. If they do not match then the
geometry should be considered canonical.
Validators may flag the value as invalid if it exceeds a certain threshold.
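
A validator could implement this check roughly as below. This is a sketch under assumptions: the specification does not define the threshold, so the 1 % relative tolerance is a placeholder, and the area derived from the geometry is passed in rather than computed here. The range limits are taken from the table above.

```python
import math


def check_area(area_ha, geometry_area_ha, rel_tol=0.01):
    """Compare the stated area with the area derived from the geometry."""
    # Range from the core specification: > 0 and <= 100,000 hectares.
    if not (0 < area_ha <= 100_000):
        return "out of range"
    # rel_tol is an assumed threshold; validators may choose their own.
    if not math.isclose(area_ha, geometry_area_ha, rel_tol=rel_tol):
        return "mismatch"
    return "ok"


print(check_area(1.6311, 1.631100058555603))
```

The same pattern applies to `perimeter`, with an upper bound of 125,000 meters.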

### Determination Properties
## Determination Properties

| Property Name | Data Type | Description |
| ---------------------- | --------- | ------------------------------------------------------------ |
| determination_method | string | The boundary creation method, one of the values below. |
| determination_datetime | datetime | The last timestamp at which the field did exist and was observed, in UTC. |
| Property Name | Data Type | Description |
| ---------------------- | --------- | ----------- |
| determination_method | string | The boundary creation method, one of the values below. |
| determination_datetime | datetime | The last timestamp at which the field did exist and was observed. |
| determination_details | string | Further details about the determination, especially the methodology. |

**determination_datetime**: In case the source of the information is an
interval or a set of timestamps, use the end.
### determination_datetime

The last timestamp at which the field did exist and was observed, provided in the UTC timezone.

In case the source of the information is an interval or a set of timestamps, use the end.
For example, for ML you'd use the timestamp of the last image and not the
timestamp of the actual execution.
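
For a set of source timestamps, the rule amounts to taking the latest one and expressing it in UTC. A minimal sketch, assuming timezone-aware `datetime` objects as input; the function name and the sample timestamps are illustrative:

```python
from datetime import datetime, timedelta, timezone


def determination_datetime(observed_at):
    """Return the latest observation as a UTC timestamp string."""
    latest = max(observed_at)  # requires timezone-aware datetimes
    return latest.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


image_timestamps = [
    datetime(2005, 2, 14, 10, 30, tzinfo=timezone.utc),
    datetime(2005, 2, 28, 1, 0, tzinfo=timezone(timedelta(hours=1))),
]
print(determination_datetime(image_timestamps))  # 2005-02-28T00:00:00Z
```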

> [!NOTE]
> We define more temporal properties in the
> [timestamps extension](https://github.com/fiboa/timestamps).

**determination_method**: Must be one of the following values:
### determination_method

The determination method must be one of the following values:

- `manual`: Hand created from imagery, e.g. using a tool to point and click on a map.
- `surveyed`: Determined through a professional land survey measuring the actual distances and angles on the ground.
@@ -107,3 +145,15 @@ timestamp of the actual execution.

The determination methods are based on the definitions of the [AgGateway initiative - WG17](https://aggateway.org/).
The specific values have [not been published yet](https://github.com/fiboa/specification/issues/31).

## Schema Language

The schema language used for fiboa is [fiboa Schema](https://github.com/fiboa/schema), version 0.2.0.

The data types in the tables above are defined in the document
[Data Types](https://github.com/fiboa/schema/blob/v0.2.0/datatypes.md).

fiboa Schema defines a (limited) set of data types and a
[vocabulary](https://github.com/fiboa/schema/blob/v0.2.0/README.md#vocabulary)
to express additional constraints for these data types.
This allows defining a clear mapping between the core specification and its encodings.
19 changes: 18 additions & 1 deletion core/schema/schema.yaml
@@ -1,8 +1,25 @@
$schema: https://fiboa.github.io/schema/v0.1.0/schema.json
$schema: https://fiboa.github.io/schema/v0.2.0/schema.json
required:
- schemas
- id
- collection
- geometry
collection:
schemas: true
id: false
geometry: false
bbox: false
properties:
schemas:
type: array
items:
type: string
format: uri
pattern: ^https?://
contains:
type: string
enum:
- https://fiboa.github.io/specification/v0.3.0/schema.yaml
id:
type: string
minLength: 1
83 changes: 47 additions & 36 deletions geojson/README.md
@@ -1,62 +1,73 @@
# GeoJSON Encoding Specification

The GeoJSON encoding defines how field boundaries compliant to fiboa must be published.
The generic GeoJSON format is defined in
[IETF RFC 7946](https://datatracker.ietf.org/doc/html/rfc7946).
The GeoJSON encoding defines how to encode field boundaries compliant with fiboa as
GeoJSON, as defined in [IETF RFC 7946](https://datatracker.ietf.org/doc/html/rfc7946).

> [!NOTE]
> The GeoJSON encoding is still work in progress. Feedback is welcome!
A single fiboa Feature must be encoded as a GeoJSON [`Feature`](#feature).
Multiple fiboa Features should be provided as a GeoJSON [`FeatureCollection`](#featurecollection).
Other GeoJSON types are not allowed.

- **[Examples](examples/):**
1. [as a FeatureCollection](examples/featurecollection/features.json)
2. [as individual Features with a dedicated Collection](examples/individual-features/)
- **[Datatype mapping](datatypes.md)**
Related documents:

## Collection
- [Examples](examples/)
- [Datatype mapping](datatypes.md)

A [fiboa Collection](../core/README.md#collection) must be provided as a JSON object, either
1. embedded into the GeoJSON in a top-level property named `fiboa` (see example 1), or
2. separately as a JSON file that is linked to from the GeoJSON (see example 2).
## Feature

A fiboa Collection may be a [GeoJSON FeatureCollection](https://datatracker.ietf.org/doc/html/rfc7946#section-3.3).
All features in a FeatureCollection must be fiboa-compliant.
- Example: [individual features](examples/individual-features/)

## Features

Each [fiboa Feature](../core/README.md#features) must be a valid
Each [fiboa Feature](../core/README.md) must be a valid
[GeoJSON Feature](https://datatracker.ietf.org/doc/html/rfc7946#section-3.2).

The following properties are defined for a GeoJSON Feature (at the top-level of the object):

| Property Name | Data Type | Description |
| ------------- | ------------------- | ------------------------------------------------------------ |
| id | string | **REQUIRED. ** See [id](../core/README.md#general-properties) in the core specification, must not be a `number` |
| type | string | **REQUIRED. ** The GeoJSON type, must be: `Feature` |
| Property Name | Data Type | Description |
| ------------- | ------------------- | ----------- |
| id | string | **REQUIRED.** See [id](../core/README.md#id) in the core specification, must not be a `number` |
| type | string | **REQUIRED.** The GeoJSON type, must be: `Feature` |
| geometry | object | **REQUIRED.** A [GeoJSON Geometry Object](https://datatracker.ietf.org/doc/html/rfc7946#section-3.1), must not be `null` |
| bbox | array\<number> | A [GeoJSON Bounding Box](https://datatracker.ietf.org/doc/html/rfc7946#section-5) |
| properties | object | An object with additional properties (see [`properties`](#properties)) |
| links | array\<Link Object> | A list of links (see [`links`](#links)) |
| fiboa | object | An object with the [fiboa Collection](../core/README.md#collection) properties if not provided as a link (see [Collection](#collection)). |
| properties | object | An object with all additional properties (see [`properties`](#properties)) |

The mapping between the GeoJSON data types and the fiboa data types can be found in the
[data type mapping](datatypes.md).

> [!IMPORTANT]
> RFC 7946 doesn't support a property named `crs`, which was only available in an earlier version of GeoJSON (2008).
> The CRS of the GeoJSON geometry and bbox must be WGS 84 / OGC CRS 84,
> see the [RFC 7946, chapter 4](https://datatracker.ietf.org/doc/html/rfc7946#section-4) for details.
> see the [RFC 7946, chapter 4](https://datatracker.ietf.org/doc/html/rfc7946#section-4) for details.
### `properties`
[Collection-level](../core/README.md#collection) data is not supported.
All properties are provided in the JSON object with the key [`properties`](#properties).

Must include any property that is required by the fiboa core specification (currently none).
### properties

Must include any property that is required by the fiboa core specification.
May include any additional property.
All properties defined by the core specification (except for `id`, `geometry` and `bbox`) or extensions should be provided here.

### `links`
## FeatureCollection

- Example: [a feature collection](examples/featurecollection/features.json)

All features in a GeoJSON FeatureCollection must be fiboa-compliant.

Properties can also be stored at the [collection-level](../core/README.md#collection)
if a specific property has the same value in all features.
This de-duplicates data for more efficient resource usage.
All properties are stored on the top-level of the FeatureCollection object as
[foreign members](https://datatracker.ietf.org/doc/html/rfc7946#section-6.1).
The individual features shall not contain any properties that are stored at the collection-level.
Validation must ensure that the collection-level properties are taken into account.

The following properties in Features can't be collection-level properties:

An array of links where each link conforms to the
[Hyperlink Schema](http://schemas.opengis.net/ogcapi/common/part1/1.0/openapi/schemas/link.yaml)
defined in
[OGC API - Common - Part 1](https://docs.ogc.org/is/19-072/19-072.html#_11b9b4f7-42fc-413a-b63a-e7fb060b5e4b).
- `id`
- `geometry`
- `bbox`

The following relation types are commonly used:
Properties with the following names can't be moved to the collection-level due to conflicts with the
FeatureCollection properties defined by GeoJSON:

- `self`: Absolute link to the GeoJSON file itself.
- `collection`: Link to the [Collection](#collection)
- `features`
- `type`
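
To take collection-level properties into account, a reader can copy the foreign members of the FeatureCollection into each feature's `properties` before validating. The sketch below is one possible approach, not a reference implementation. The function name is hypothetical, and skipping a top-level `bbox` is an assumption: RFC 7946 allows a `bbox` member on a FeatureCollection, and `bbox` can't be a collection-level property.

```python
# Top-level members that are GeoJSON structure, not collection-level properties.
GEOJSON_MEMBERS = {"type", "features", "bbox"}


def expand_collection_properties(feature_collection):
    """Return features with collection-level properties merged into `properties`."""
    shared = {k: v for k, v in feature_collection.items() if k not in GEOJSON_MEMBERS}
    expanded = []
    for feature in feature_collection["features"]:
        properties = dict(shared)
        properties.update(feature.get("properties") or {})
        expanded.append({**feature, "properties": properties})
    return expanded


fc = {
    "type": "FeatureCollection",
    "collection": "de_nrw",
    "license": "dl-de/by-2-0",
    "features": [
        {"type": "Feature", "id": "12324", "geometry": None, "properties": {"area": 1.6311}},
    ],
}
print(expand_collection_properties(fc)[0]["properties"])
```

The input is left unmodified; each returned feature is a shallow copy with its own `properties` dictionary.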
49 changes: 27 additions & 22 deletions geojson/datatypes.md
@@ -3,28 +3,33 @@
The following table shows the data types that are used by fiboa in the Property definitions.
It also shows the mapping to the GeoJSON data types.

| fiboa data type | (Geo)JSON |
| --------------------------------------------------- | ------------------------------------------------------------ |
| boolean | boolean |
| int8 | integer<br />minimum: -128<br />maximum: 127 |
| uint8 | integer<br />minimum: 0<br />maximum: 255 |
| int16 | integer<br />minimum: -32768<br />maximum: 32767 |
| uint16 | integer<br />minimum: 0<br />maximum: 65535 |
| int32 | integer<br />minimum: -2147483648<br />maximum: 2147483647 |
| uint32 | integer<br />minimum: 0<br />maximum: 4294967295 |
| int64 | integer<br />minimum: -9223372036854775808<br />maximum: 9223372036854775807 |
| uint64 | integer<br />minimum: 0<br />maximum: 18446744073709551615 |
| float<br />IEEE 32-bit | number<br />minimum: ?<br />maximum: ? |
| double<br />IEEE 64-bit | number<br />minimum: ?<br />maximum: ? |
| binary | string<br />contentEncoding: binary |
| string<br />charset: UTF-8 | string |
| array | array |
| object<br />keys: string<br />values: any | object<br />additionalProperties: false |
| date | string<br />format: date |
| date-time<br />with milliseconds<br />timezone: UTC | string<br />format: date-time<br />pattern: Z$ |
| geometry | [object with schema](https://geojson.org/schema/Geometry.json) |
| bounding-box<br />x and y only, no z | array<br />minItems: 4<br />maxItems: 4<br />items: number |
| *required* (not a datatype) | null |
| fiboa data type | (Geo)JSON | Collection-level |
| --------------------------------------------------- | ------------------------------------------------------------ | ---------------- |
| boolean | boolean | yes |
| int8 | integer<br />minimum: -128<br />maximum: 127 | yes |
| uint8 | integer<br />minimum: 0<br />maximum: 255 | yes |
| int16 | integer<br />minimum: -32768<br />maximum: 32767 | yes |
| uint16 | integer<br />minimum: 0<br />maximum: 65535 | yes |
| int32 | integer<br />minimum: -2147483648<br />maximum: 2147483647 | yes |
| uint32 | integer<br />minimum: 0<br />maximum: 4294967295 | yes |
| int64 | integer<br />minimum: -9223372036854775808<br />maximum: 9223372036854775807 | yes |
| uint64 | integer<br />minimum: 0<br />maximum: 18446744073709551615 | yes |
| float<br />IEEE 32-bit | number<br />minimum: ?<br />maximum: ? | yes |
| double<br />IEEE 64-bit | number<br />minimum: ?<br />maximum: ? | yes |
| binary | string<br />contentEncoding: base64 | yes |
| string<br />charset: UTF-8 | string | yes |
| array | array | yes |
| object<br />keys: string<br />values: any | object<br />additionalProperties: false | yes |
| date | string<br />format: date | yes |
| date-time<br />with milliseconds<br />timezone: UTC | string<br />format: date-time<br />pattern: Z$ | yes |
| geometry | [object with schema](https://geojson.org/schema/Geometry.json) | no |
| bounding-box<br />x and y only, no z | array<br />minItems: 4<br />maxItems: 4<br />items: number | no |

## Missing values

For optional properties, values might be missing.
This is expressed by omitting the JSON property.
The value `null` is not allowed.
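
When converting from a source that uses null values, the affected properties can simply be dropped before writing GeoJSON. A minimal sketch, assuming a flat `properties` dictionary; nested objects are not handled, and `drop_missing` is a hypothetical helper name:

```python
def drop_missing(properties):
    """Omit properties whose value is None (JSON null)."""
    return {key: value for key, value in properties.items() if value is not None}


properties = {"flik": "DENWLI0542130247", "area": 1.6311, "perimeter": None}
print(drop_missing(properties))
```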

## Potential issues in conversion

178 changes: 47 additions & 131 deletions geojson/examples/featurecollection/features.json
@@ -1,4 +1,15 @@
{
"schemas": {
"de_nrw": [
"https://fiboa.github.io/specification/v0.3.0/schema.yaml",
"https://fiboa.github.io/inspire-extension/v0.2.0/schema.yaml",
"https://fiboa.github.io/flik-extension/v0.1.0/schema.yaml",
"https://fiboa.github.io/crop-extension/v0.1.0/schema.yaml"
]
},
"collection": "de_nrw",
"license": "dl-de/by-2-0",
"attribution": "Land Nordrhein-Westfalen / Open.NRW - https://www.opengeodata.nrw.de/produkte/umwelt_klima/bodennutzung/landwirtschaft/",
"type": "FeatureCollection",
"features": [
{
@@ -8,158 +19,63 @@
"inspire:id": "https://geodaten.nrw.de/id/inspire-lc-fb/landcoverunit/12324",
"flik": "DENWLI0542130247",
"determination_datetime": "2005-02-28T00:00:00Z",
"nutz_code": "A",
"nutz_txt": "Ackerland",
"area": 1.631100058555603
"crop:code": "A",
"crop:name": "Ackerland",
"area": 1.6311
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
7.875243329949302,
51.7469574917968
],
[
7.8754156210171224,
51.74865579902567
],
[
7.87559517961007,
51.748657516128716
],
[
7.875727139469757,
51.74864762337336
],
[
7.875865723118926,
51.74861179149097
],
[
7.876160946694515,
51.74853656922356
],
[
7.876274940061089,
51.748526513043004
],
[
7.876646213349393,
51.74852263605798
],
[
7.876669177898854,
51.74759587524452
],
[
7.876683221091441,
51.7470291214554
],
[
7.875243329949302,
51.7469574917968
]
[7.8752433, 51.7469574],
[7.8754156, 51.7486557],
[7.8755951, 51.7486575],
[7.8757271, 51.7486476],
[7.8758657, 51.7486117],
[7.8761609, 51.7485365],
[7.8762749, 51.7485265],
[7.8766462, 51.7485226],
[7.8766691, 51.7475958],
[7.8766832, 51.7470291],
[7.8752433, 51.7469574]
]
]
},
"bbox": [
7.875243329949302,
51.7469574917968,
7.876683221091441,
51.748657516128716
]
"bbox": [7.8752433, 51.7469574, 7.8766832, 51.7486575]
},
{
"id": "2713",
"type": "Feature",
"properties": {
"inspire:id": "https://geodaten.nrw.de/id/inspire-lc-fb/landcoverunit/2713",
"flik": "DENWLI0540210084",
"determination_datetime": "2005-02-28T00:00:00Z",
"nutz_code": "A",
"nutz_txt": "Ackerland",
"area": 1.8975000381469727
"determination_datetime": "2005-02-22T00:00:00Z",
"crop:code": "W",
"crop:name": "Weide",
"area": 1.8975
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
9.279072225112648,
51.925508828714925
],
[
9.279848170539884,
51.92582918268683
],
[
9.280173032315249,
51.925963048968214
],
[
9.280599939130775,
51.92614034991495
],
[
9.280660193987938,
51.926028714865886
],
[
9.280886077078973,
51.9256102896548
],
[
9.281335286046785,
51.924778127406576
],
[
9.281305739341624,
51.92472580957354
],
[
9.280917027691007,
51.92458295033388
],
[
9.279903540966059,
51.92421337118715
],
[
9.279817610187122,
51.92423316888092
],
[
9.279398358118248,
51.92501015234708
],
[
9.279241344298002,
51.925301083950984
],
[
9.279072225112648,
51.925508828714925
]
[9.2790722, 51.9255088],
[9.2798481, 51.9258291],
[9.280173, 51.925963],
[9.2805999, 51.9261403],
[9.2806601, 51.9260287],
[9.280886, 51.9256102],
[9.2813352, 51.9247781],
[9.2813057, 51.9247258],
[9.280917, 51.9245829],
[9.2799035, 51.9242133],
[9.2798176, 51.9242331],
[9.2793983, 51.9250101],
[9.2792413, 51.925301],
[9.2790722, 51.9255088]
]
]
},
"bbox": [
9.279072225112648,
51.92421337118715,
9.281335286046785,
51.92614034991495
]
"bbox": [9.2790722, 51.9242133, 9.2813352, 51.9261403]
}
],
"fiboa": {
"fiboa_version": "0.2.0",
"fiboa_extensions": [
"https://fiboa.github.io/inspire-extension/v0.2.0/schema.yaml"
],
"id": "de_nrw",
"title": "Field boundaries for North Rhine-Westphalia (NRW), Germany",
"license": "dl-de/by-2-0",
"attribution": "Land Nordrhein-Westfalen / Open.NRW - https://www.opengeodata.nrw.de/produkte/umwelt_klima/bodennutzung/landwirtschaft/"
}
}
]
}
75 changes: 21 additions & 54 deletions geojson/examples/individual-features/12324.json
@@ -2,68 +2,35 @@
"id": "12324",
"type": "Feature",
"properties": {
"schemas": {
"example": [
"https://fiboa.github.io/specification/v0.3.0/schema.yaml",
"https://fiboa.github.io/flik-extension/v0.1.0/schema.yaml"
]
},
"collection": "example",
"flik": "DENWLI0542130247",
"determination_datetime": "2005-02-28T00:00:00Z",
"nutz_code": "A",
"nutz_txt": "Ackerland",
"area": 1.631100058555603
"area": 1.6311
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
7.875243329949302,
51.7469574917968
],
[
7.8754156210171224,
51.74865579902567
],
[
7.87559517961007,
51.748657516128716
],
[
7.875727139469757,
51.74864762337336
],
[
7.875865723118926,
51.74861179149097
],
[
7.876160946694515,
51.74853656922356
],
[
7.876274940061089,
51.748526513043004
],
[
7.876646213349393,
51.74852263605798
],
[
7.876669177898854,
51.74759587524452
],
[
7.876683221091441,
51.7470291214554
],
[
7.875243329949302,
51.7469574917968
]
[7.8752433, 51.7469574],
[7.8754156, 51.7486557],
[7.8755951, 51.7486575],
[7.8757271, 51.7486476],
[7.8758657, 51.7486117],
[7.8761609, 51.7485365],
[7.8762749, 51.7485265],
[7.8766462, 51.7485226],
[7.8766691, 51.7475958],
[7.8766832, 51.7470291],
[7.8752433, 51.7469574]
]
]
},
"links": [
{
"href": "collection.json",
"rel": "collection",
"type": "application/json"
}
]
}
}
}
89 changes: 25 additions & 64 deletions geojson/examples/individual-features/2713.json
@@ -2,80 +2,41 @@
"id": "2713",
"type": "Feature",
"properties": {
"schemas": {
"example": [
"https://fiboa.github.io/specification/v0.3.0/schema.yaml",
"https://fiboa.github.io/flik-extension/v0.1.0/schema.yaml"
]
},
"collection": "example",
"flik": "DENWLI0540210084",
"determination_datetime": "2005-02-28T00:00:00Z",
"nutz_code": "A",
"nutz_txt": "Ackerland",
"area": 1.8975000381469727
"area": 1.8975000
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
9.279072225112648,
51.925508828714925
],
[
9.279848170539884,
51.92582918268683
],
[
9.280173032315249,
51.925963048968214
],
[
9.280599939130775,
51.92614034991495
],
[
9.280660193987938,
51.926028714865886
],
[
9.280886077078973,
51.9256102896548
],
[
9.281335286046785,
51.924778127406576
],
[
9.281305739341624,
51.92472580957354
],
[
9.280917027691007,
51.92458295033388
],
[
9.279903540966059,
51.92421337118715
],
[
9.279817610187122,
51.92423316888092
],
[
9.279398358118248,
51.92501015234708
],
[
9.279241344298002,
51.925301083950984
],
[
9.279072225112648,
51.925508828714925
]
[9.2790722, 51.9255088],
[9.2798481, 51.9258291],
[9.2801730, 51.9259630],
[9.2805999, 51.9261403],
[9.2806601, 51.9260287],
[9.2808860, 51.9256102],
[9.2813352, 51.9247781],
[9.2813057, 51.9247258],
[9.2809170, 51.9245829],
[9.2799035, 51.9242133],
[9.2798176, 51.9242331],
[9.2793983, 51.9250101],
[9.2792413, 51.9253010],
[9.2790722, 51.9255088]
]
]
},
"links": [
{
"href": "collection.json",
"rel": "collection",
"type": "application/json"
}
"bbox": [
9.2790722, 51.9242133, 9.2813352, 51.9261403
]
}
}
62 changes: 0 additions & 62 deletions geojson/examples/individual-features/collection.json

This file was deleted.

2 changes: 1 addition & 1 deletion geojson/schema/datatypes.json
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://fiboa.github.io/specification/v0.2.0/geojson/datatypes.json",
"$id": "https://fiboa.github.io/specification/v0.3.0/geojson/datatypes.json",
"$defs": {
"boolean": {
"type": "boolean"
36 changes: 18 additions & 18 deletions geoparquet/README.md
Original file line number Diff line number Diff line change
@@ -1,33 +1,33 @@
# GeoParquet Encoding Specification

The GeoParquet encoding defines how field boundaries compliant with fiboa must be published.
The generic GeoParquet format is defined in the
[OGC GeoParquet specification v1.0.0](https://geoparquet.org/releases/v1.0.0/).
The generic GeoParquet format is defined in the OGC GeoParquet specification,
either version [v1.0.0](https://geoparquet.org/releases/v1.0.0/)
or [v1.1.0](https://geoparquet.org/releases/v1.1.0/).
We aim to support any future version of GeoParquet, too.

> [!NOTE]
> The GeoParquet encoding is still work in progress. Feedback is welcome!
- **[Examples](examples/)**
- **[Data type mapping](datatypes.md)**

## Collection

The GeoParquet file must embed the [fiboa Collection](../core/README.md#collection)
in the Parquet metadata in a property named `fiboa`.

It is recommended to additionally provide the fiboa Collection as a separate JSON file, too.

## Features

Each [fiboa Feature](../core/README.md#features) corresponds to a row in a GeoParquet file.
Each [fiboa Feature](../core/README.md) corresponds to a row in a GeoParquet file.

The properties defined for fiboa Features are made available as individual columns in the GeoParquet file.

Properties that are optional can be omitted if all values are
[null values](https://parquet.apache.org/docs/file-format/nulls/),
i.e. the column can be missing from the GeoParquet file.

Properties can also be stored at the [collection-level](../core/README.md#collection) if all values in a column have the same value.
This de-duplicates data for more efficient resource usage and simplifies the structure of the Parquet file.
The GeoParquet file must embed the properties in the Parquet metadata in a property named `collection`.
The metadata must be JSON-encoded.
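
The de-duplication described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not part of the specification: the helper name `split_collection_level` and the sample rows are hypothetical, and the sketch does not handle `geometry` or `bounding-box` values, which the data type mapping excludes from the collection level.

```python
import json


def split_collection_level(rows):
    """Move properties whose value is identical in every row to the collection level.

    Hypothetical helper for illustration; returns (collection, remaining_rows).
    """
    if not rows:
        return {}, []
    collection = {}
    for key in rows[0]:
        first = rows[0][key]
        # A property qualifies only if every row has it with the same value.
        if all(key in row and row[key] == first for row in rows):
            collection[key] = first
    remaining = [
        {k: v for k, v in row.items() if k not in collection} for row in rows
    ]
    return collection, remaining


# Sample rows loosely based on the example features; values are illustrative.
rows = [
    {"id": "12324", "nutz_code": "A", "area": 1.6311},
    {"id": "2713", "nutz_code": "A", "area": 1.8975},
]
collection, remaining = split_collection_level(rows)

# JSON-encoded value for the Parquet metadata property named `collection`.
metadata = {b"collection": json.dumps(collection).encode("utf-8")}
```

How the resulting key/value pair is attached to the file depends on the Parquet library in use; the sketch only prepares the JSON-encoded value.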

The mapping between the Parquet data types and the fiboa data types can be found in the
[data type mapping](datatypes.md).

Related documents:

- [Examples](examples/)
- [Data type mapping](datatypes.md)

## Best practices

For data with a lot of repetition, brotli compression is recommended.
This applies particularly to merged datasets that don't deduplicate properties to the collection-level.
59 changes: 36 additions & 23 deletions geoparquet/datatypes.md
Original file line number Diff line number Diff line change
@@ -3,32 +3,45 @@
The following table shows the data types that are used by fiboa in the Property definitions.
It also shows the mapping to the GeoParquet data types.

| fiboa Schema data type | (Geo)Parquet |
| --------------------------------------------------- | ------------------------------------------------------------ |
| boolean | BOOLEAN |
| int8 | IntType<br />bitWidth: 8<br />isSigned: true<br />(deprecated: INT_8) |
| uint8 | IntType<br />bitWidth: 8<br />isSigned: false<br />(deprecated: UINT_8) |
| int16 | IntType<br />bitWidth: 16<br />isSigned: true<br />(deprecated: INT_16) |
| uint16 | IntType<br />bitWidth: 16<br />isSigned: false<br />(deprecated: UINT_16) |
| int32 | IntType<br />bitWidth: 32<br />isSigned: true<br />(deprecated: INT_32) |
| uint32 | IntType<br />bitWidth: 32<br />isSigned: false<br />(deprecated: UINT_32) |
| int64 | IntType<br />bitWidth: 64<br />isSigned: true<br />(deprecated: INT_64) |
| uint64 | IntType<br />bitWidth: 64<br />isSigned: false<br />(deprecated: UINT_64) |
| float<br />IEEE 32-bit | FLOAT |
| double<br />IEEE 64-bit | DOUBLE |
| binary | BYTE_ARRAY |
| string<br />charset: UTF-8 | STRING (BYTE_ARRAY) |
| array | LIST |
| object<br />keys: string<br />values: any | STRUCT / MAP |
| date | DATE (INT32) |
| date-time<br />with milliseconds<br />timezone: UTC | TimestampType (INT64)<br />isAdjustedToUTC: true<br />unit: MILLIS<br />(deprecated: TIMESTAMP_MILLIS) |
| geometry | BYTE_ARRAY<br />encoded as WKB |
| bounding-box<br />x and y only, no z | STRUCT(xmin FLOAT, ymin FLOAT, xmax FLOAT, ymax FLOAT) |
| *if a field is not required* | [Nullity](https://parquet.apache.org/docs/file-format/nulls/) |
| fiboa Schema data type | (Geo)Parquet | Collection-level |
| --------------------------------------------------- | ------------------------------------------------------------ | ------------------------------- |
| boolean | BOOLEAN | yes |
| int8 | IntType<br />bitWidth: 8<br />isSigned: true<br />(deprecated: INT_8) | yes |
| uint8 | IntType<br />bitWidth: 8<br />isSigned: false<br />(deprecated: UINT_8) | yes |
| int16 | IntType<br />bitWidth: 16<br />isSigned: true<br />(deprecated: INT_16) | yes |
| uint16 | IntType<br />bitWidth: 16<br />isSigned: false<br />(deprecated: UINT_16) | yes |
| int32 | IntType<br />bitWidth: 32<br />isSigned: true<br />(deprecated: INT_32) | yes |
| uint32 | IntType<br />bitWidth: 32<br />isSigned: false<br />(deprecated: UINT_32) | yes |
| int64 | IntType<br />bitWidth: 64<br />isSigned: true<br />(deprecated: INT_64) | yes |
| uint64 | IntType<br />bitWidth: 64<br />isSigned: false<br />(deprecated: UINT_64) | yes |
| float<br />IEEE 32-bit | FLOAT | yes |
| double<br />IEEE 64-bit | DOUBLE | yes |
| binary | BYTE_ARRAY | as string, base64-encoded |
| string<br />charset: UTF-8 | STRING (BYTE_ARRAY) | yes |
| array | LIST | yes |
| object<br />keys: string<br />values: any | STRUCT or MAP (see below) | yes |
| date | DATE (INT32) | as string, compliant to ISO8601 |
| date-time<br />with milliseconds<br />timezone: UTC | TimestampType (INT64)<br />isAdjustedToUTC: true<br />unit: MILLIS<br />(deprecated: TIMESTAMP_MILLIS) | as string, compliant to ISO8601 |
| geometry | BYTE_ARRAY<br />encoded as WKB | no |
| bounding-box<br />x and y only, no z | STRUCT(xmin FLOAT, ymin FLOAT, xmax FLOAT, ymax FLOAT) | no |

The integer data types and the data type string can also be mapped to the ENUM data type in Parquet
if a pre-defined set of values is available.
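
The "Collection-level" column of the table can be illustrated with a small sketch. It assumes that values are held as Python objects before being JSON-encoded, that date-times are timezone-aware, and that a trailing `Z` with millisecond precision is the desired ISO 8601 form; the function name `to_collection_value` is hypothetical.

```python
import base64
from datetime import date, datetime, timezone


def to_collection_value(value):
    """Convert a Python value to a JSON-compatible collection-level form."""
    # binary: as string, base64-encoded
    if isinstance(value, (bytes, bytearray)):
        return base64.b64encode(bytes(value)).decode("ascii")
    # datetime is a subclass of date, so it must be checked first.
    if isinstance(value, datetime):
        # Assumes a timezone-aware value; naive values are treated as local time.
        utc = value.astimezone(timezone.utc)
        return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    # date: as string, ISO 8601 calendar date
    if isinstance(value, date):
        return value.isoformat()
    # Booleans, numbers, strings, arrays and objects are passed through as-is.
    return value
```

Geometries and bounding boxes are intentionally not handled here, since the table marks them as not available at the collection level.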

## Missing values

For optional properties, values might be missing.
This is expressed by providing the value `null`
(see data type [Nullity](https://parquet.apache.org/docs/file-format/nulls/)).

## Struct vs Map

Parquet has both Map and Struct types. The struct type is similar to a named dictionary while the map type is similar to a list of ordered (key, value) pairs. The main difference is that you need to know up-front the keys for the struct type, while you don't for the map type.

Due to this difference, the **Struct** type can only be used if `additionalProperties` is `false` (the default value) and only `properties` is provided to clearly specify the exact names of the properties.

Any variability in the keys through the use of `additionalProperties` (except for the default `false`) or `patternProperties` requires the use of the **Map** data type. Please note that the order of the Map type is guaranteed to be preserved.
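
The rule above could be expressed roughly as follows. This is a sketch under assumptions: the object schema is available as a Python dictionary, the function name `parquet_type_for_object` is hypothetical, and an object without any listed `properties` is treated as a Map because its keys are not known up-front.

```python
def parquet_type_for_object(schema):
    """Pick STRUCT or MAP for a fiboa object schema given as a dict."""
    # `additionalProperties` defaults to false when it is not provided.
    additional = schema.get("additionalProperties", False)
    has_patterns = bool(schema.get("patternProperties"))
    has_fixed_keys = bool(schema.get("properties"))
    # STRUCT requires that the exact property names are known up-front.
    if additional is False and not has_patterns and has_fixed_keys:
        return "STRUCT"
    # Any variability in the keys requires MAP.
    return "MAP"
```

For example, a schema that only lists `properties` yields `STRUCT`, while adding `patternProperties` or enabling `additionalProperties` yields `MAP`.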

## Unsupported Data Types

The following data types occur in Parquet, but are not currently supported in fiboa:
@@ -45,4 +58,4 @@ The following data types occur in Parquet, but are not currently supported in fi

## Potential issues in conversion

- The micro/nanosecond precision of Datetime / Times may got lost
- The micro/nanosecond precision of Datetime / Times may get lost
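
The precision loss can be illustrated with a short sketch. It assumes a writer that truncates (rather than rounds) to the millisecond unit used by the fiboa `date-time` mapping; the helper name `truncate_to_millis` and the sample timestamp are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def truncate_to_millis(value):
    """Drop everything below millisecond precision from a datetime."""
    return value.replace(microsecond=(value.microsecond // 1000) * 1000)


original = datetime(2005, 2, 28, 12, 0, 0, 123456, tzinfo=timezone.utc)
stored = truncate_to_millis(original)

# The sub-millisecond part cannot be recovered after conversion.
lost = original - stored
```

Python's `datetime` only carries microseconds, so nanosecond-precision sources would lose additional digits beyond what this sketch shows.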