Data model
This page is for users of the Dataset Register: anyone querying the SPARQL endpoint, fetching dataset descriptions in RDF, or building applications on top of the register. It describes the consumer-facing, published data model: the RDF as it appears in the register after fetching, validating, mapping, and storing the providers’ input.
If you are a publisher (a data platform submitting dataset descriptions), you need the input format and validation rules instead; see the Requirements for Datasets.
The register stores descriptions in DCAT, aligned with DCAT-AP-NL 3.0. Schema.org submissions are converted to DCAT at ingest, so consumers see the DCAT form regardless of how the data was originally submitted. The Schema.org ↔ DCAT alignment mostly follows the W3C DCAT 3 Alignment with Schema.org appendix.
Cardinalities reflect the data as stored, including auto-derived and auto-default values.
Property-column tags signal the source vocabulary:
- untagged – profiled by DCAT-AP-NL 3.0 (the default).
- DCAT – defined in DCAT 3.0 but not profiled by DCAT-AP-NL.
- DC – plain Dublin Core (dct:) passthrough; not profiled by DCAT-AP-NL, DCAT-AP, or DCAT 3.0.
- DCAT-AP-NL: Distribution only – DCAT-AP-NL profiles the property only on Distribution; the dataset-level usage is a register convenience.
NAL stands for Named Authority List, the EU Publications Office’s term for the controlled vocabularies it maintains (Languages, Frequency, Access Rights, File Type, etc.).
Dataset
The dcat:Dataset, dcat:Distribution, and
foaf:Agent shapes describe the public dataset description as stored.
dcat:Dataset
When a dataset’s RDF description is fetched and validated, it is stored as a dcat:Dataset in its
own graph. The URL of the graph corresponds to the dataset’s IRI.
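
Because the graph URL equals the dataset IRI, a consumer can scope a query to one dataset description with a `GRAPH` clause. The following is a minimal sketch in Python that only builds the query text; the function name, the selected properties, and the example IRI are illustrative assumptions, not part of the register.

```python
import re

# Characters that SPARQL does not allow inside an <IRI> reference.
_FORBIDDEN_IN_IRI = re.compile(r'[\x00-\x20<>"{}|^`\\]')


def dataset_graph_query(dataset_iri: str) -> str:
    """Build a SPARQL query that reads one dataset from its own named graph."""
    if _FORBIDDEN_IN_IRI.search(dataset_iri):
        raise ValueError(f"not a valid IRI: {dataset_iri!r}")
    # The graph name and the subject are the same IRI.
    return f"""PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct: <http://purl.org/dc/terms/>

SELECT ?title ?description ?distribution
WHERE {{
  GRAPH <{dataset_iri}> {{
    <{dataset_iri}> a dcat:Dataset ;
      dct:title ?title ;
      dct:description ?description .
    OPTIONAL {{ <{dataset_iri}> dcat:distribution ?distribution }}
  }}
}}"""


# Hypothetical dataset IRI, for illustration only.
print(dataset_graph_query("https://example.org/datasets/1"))
```

The resulting string can be sent to the SPARQL endpoint with any HTTP or SPARQL client. Rejecting characters that are illegal in an IRI reference keeps untrusted input from breaking out of the angle brackets.
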
| DCAT term | Data type / notes | Cardinality |
|---|---|---|
| dct:title | rdf:langString | 1..n (one per language) |
| dct:identifier | Auto-derived from the dataset IRI | 1..1 |
| dct:description | rdf:langString | 1..n (one per language) |
| dct:license (DCAT-AP-NL: Distribution only) | IRI or literal in v1; v2.0: IRI required. Inherited by distributions that don’t specify their own. If the dataset has no IRI license, the register denormalises one IRI license from its distributions onto the dataset for query convenience. A license must exist on the dataset or on every distribution; see DistributionLicenseRequiredShape. | 0..1 |
| dct:accessRights | EU Access Rights NAL IRI; defaults to PUBLIC | 1..1 |
| dcat:theme | IRI from a controlled vocabulary; the EU Data Theme NAL value data-theme/EDUC is auto-assigned | 1..n |
| dcat:contactPoint | vcard:Kind with vcard:fn and vcard:hasEmail (mailto: IRI) | 0..1; v2.0: 1..1 |
| dct:language | EU Language Authority IRI | 0..n |
| dcat:keyword | rdf:langString | 0..n |
| dcat:landingPage | IRI | 0..n |
| dct:source | IRI | 0..n |
| dct:created (DC) | xsd:date or xsd:dateTime; the lexical form may not be ISO 8601 in v1. v2.0: ISO 8601 value required | 0..1 |
| dct:issued | xsd:date or xsd:dateTime; the lexical form may not be ISO 8601 in v1. v2.0: ISO 8601 value required | 0..1 |
| dct:modified | xsd:date or xsd:dateTime; the lexical form may not be ISO 8601 in v1. v2.0: ISO 8601 value required | 0..1 |
| dcat:version | xsd:string | 0..1 |
| dct:creator | foaf:Organization or foaf:Person | 0..n; v2.0: 1..n |
| dct:publisher | foaf:Organization or foaf:Person | 0..1; v2.0: 1..1 |
| dct:spatial | IRI (e.g. GeoNames). DCAT-AP-NL also allows dct:Location with dcat:bbox / dcat:centroid / dcat:geometry, but the register stores IRI references only. | 0..n |
| dct:temporal | dct:PeriodOfTime blank node with dcat:startDate and/or dcat:endDate | 0..n |
| dct:isPartOf (DC) | IRI or literal in v1; v2.0: HTTPS IRI required | 0..n |
| dct:hasPart (DCAT) | IRI | 0..n |
| dct:isReferencedBy | IRI | 0..n |
| dct:accrualPeriodicity | EU Frequency NAL IRI | 0..1 |
| dcat:distribution | dcat:Distribution (see below) | 0..n |
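
The license rules in the table can be read as two small lookups: one for the dataset-level value and one for what a distribution effectively carries. The sketch below is an interpretation, not the register's code. The function names are hypothetical, IRIs are recognised by a rough `http(s)` prefix check, and picking the first IRI license found among the distributions is an assumption, since the source only says that "one" is denormalised.

```python
from typing import Optional, Sequence


def is_iri(value: Optional[str]) -> bool:
    """Rough check: treat http(s) values as IRIs, anything else as a literal."""
    return bool(value) and value.startswith(("http://", "https://"))


def dataset_license(own: Optional[str],
                    distribution_licenses: Sequence[Optional[str]]) -> Optional[str]:
    """Dataset-level dct:license as a consumer might see it."""
    if is_iri(own):
        return own
    # Assumption: the first IRI license among the distributions is used.
    for candidate in distribution_licenses:
        if is_iri(candidate):
            return candidate
    return own  # a v1 literal, or None when nothing is available


def distribution_license(own: Optional[str],
                         dataset_level: Optional[str]) -> Optional[str]:
    """A distribution without its own license inherits the dataset's."""
    return own if own is not None else dataset_level


cc0 = "https://creativecommons.org/publicdomain/zero/1.0/"
print(dataset_license(None, [None, cc0]))
print(distribution_license(None, cc0))
```
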
dcat:Distribution
The objects of dcat:distribution dataset properties have type dcat:Distribution.
| DCAT term | Data type / notes | Cardinality |
|---|---|---|
| dcat:accessURL | IRI; v2.0: HTTPS IRI | 1..1 |
| dcat:mediaType | IANA media type IRI. Required for download distributions; APIs use dct:conformsTo instead. Any compression suffix is split off into dcat:compressFormat. | 0..n; v2.0: 0..1 |
| dcat:compressFormat | IANA media type IRI; added when e.g. +gzip is stripped from dcat:mediaType | 0..1 |
| dct:conformsTo | Protocol IRI (e.g. <https://www.w3.org/TR/sparql11-protocol/> for SPARQL endpoints) | 0..1 |
| dct:issued | xsd:date or xsd:dateTime | 0..1 |
| dct:modified | xsd:date or xsd:dateTime | 0..1 |
| dct:title | rdf:langString | 0..n |
| dct:description | rdf:langString | 0..n |
| dct:language | EU Language Authority IRI | 0..1 |
| dct:license | IRI or literal in v1; v2.0: IRI required. Inherited from the dataset if not specified. The register requires a license to exist on the distribution or the dataset via DistributionLicenseRequiredShape. | 0..1 |
| dcat:byteSize | xsd:integer (bytes) | 0..1 |
| foaf:page | IRI to a documentation page (SPARQL UI, download landing page, etc.); v2.0: HTTPS required | 0..n |
| odrl:hasPolicy | ODRL policy associated with the distribution | 0..n |
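
To illustrate how a compression suffix on dcat:mediaType could end up in dcat:compressFormat, here is a minimal sketch. The base IRI, the suffix table, and the function name are assumptions made for the example; the register's actual mapping may recognise other suffixes or use different IRIs.

```python
from typing import Optional, Tuple

# Assumed base IRI for IANA media types.
IANA = "https://www.iana.org/assignments/media-types/"

# Assumed mapping from a "+suffix" to the media type of the compression format.
COMPRESSION_SUFFIXES = {"gzip": "application/gzip"}


def split_compression(media_type_iri: str) -> Tuple[str, Optional[str]]:
    """Return (media type IRI, compress format IRI or None)."""
    base, plus, suffix = media_type_iri.rpartition("+")
    if plus and suffix in COMPRESSION_SUFFIXES:
        return base, IANA + COMPRESSION_SUFFIXES[suffix]
    # Structured-syntax suffixes such as +json are not compression: keep as-is.
    return media_type_iri, None


print(split_compression(IANA + "application/n-triples+gzip"))
print(split_compression(IANA + "application/ld+json"))
```

Only suffixes listed in the table are split off, so media types with structured-syntax suffixes such as `application/ld+json` pass through unchanged.
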
foaf:Agent
The objects of both the dct:creator and dct:publisher dataset properties are foaf:Agent
instances — concretely either foaf:Organization or foaf:Person. The publisher carries
additional properties beyond those available on the creator.
| Property | Data type / notes | Cardinality |
|---|---|---|
| foaf:name | rdf:langString; the organization or person name | 1..n (one per language) |
| foaf:nick | Alternate name (publisher only) | 0..n |
| dct:identifier | Identifier (publisher only) | 0..1 |
| foaf:mbox | Email address as a literal (publisher only) | 0..1 |
| owl:sameAs | Equivalent entity IRI (publisher only) | 0..n |
Registration
The shapes below describe how the register tracks registrations themselves – not the public dataset description that consumers query.
schema:EntryPoint
Any URL registered by clients is added as a schema:EntryPoint to the
Registrations graph.
Datasets are fetched from this URL on registration and when the crawler runs.
| Property | Description |
|---|---|
| schema:additionalType | Computed registration status. |
| schema:datePosted | UTC datetime when the URL was registered. |
| schema:dateRead | UTC datetime when the URL was last read by the application. The crawler updates this value when fetching descriptions. |
| schema:status | The HTTP status code last encountered when fetching the URL. |
| schema:validUntil | If the URL has become invalid, the UTC datetime at which it did so. |
| schema:about | The schema:Datasets found at this URL. A registration URL may describe a single dataset (one entry) or a catalog of multiple datasets (multiple entries). The crawler updates this value when fetching descriptions. |
schema:Dataset
Each dataset that is found at the schema:EntryPoint registration URL gets added as a
schema:Dataset to the
Registrations graph.
| Property | Description |
|---|---|
| schema:dateRead | UTC datetime when the dataset was last read by the application. |
| schema:subjectOf | From which registration URL the dataset was read. |
schema:Rating
A separate named graph keeps a schema:Rating instance for each dataset description, indicating
how complete the description is. Reach it from a dataset via the schema:contentRating property.
| Property | Description |
|---|---|
| schema:bestRating | The highest possible rating. |
| schema:worstRating | The lowest possible rating. |
| schema:ratingValue | Rating for the dataset description. |
| schema:ratingExplanation | Explanation for the rating: which properties are missing? |
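
A rating lookup follows schema:contentRating from the dataset to its schema:Rating. The sketch below builds such a query in Python under several assumptions: the `schema:` prefix is taken to be `http://schema.org/` (exposed as a parameter in case the register uses the `https` form), and each pattern is wrapped in `GRAPH ?var` because this page does not state which named graph holds the schema:contentRating triple. The function name and example IRI are hypothetical.

```python
import re

# Characters that SPARQL does not allow inside an <IRI> reference.
_FORBIDDEN_IN_IRI = re.compile(r'[\x00-\x20<>"{}|^`\\]')


def rating_query(dataset_iri: str, schema_ns: str = "http://schema.org/") -> str:
    """Build a SPARQL query for the completeness rating of one dataset."""
    for iri in (dataset_iri, schema_ns):
        if _FORBIDDEN_IN_IRI.search(iri):
            raise ValueError(f"not a valid IRI: {iri!r}")
    # GRAPH ?var matches the pattern in any named graph (assumption: both the
    # link and the rating itself live in named graphs).
    return f"""PREFIX schema: <{schema_ns}>

SELECT ?value ?best ?worst ?explanation
WHERE {{
  GRAPH ?datasetGraph {{ <{dataset_iri}> schema:contentRating ?rating }}
  GRAPH ?ratingGraph {{
    ?rating schema:ratingValue ?value ;
      schema:bestRating ?best ;
      schema:worstRating ?worst .
    OPTIONAL {{ ?rating schema:ratingExplanation ?explanation }}
  }}
}}"""


# Hypothetical dataset IRI, for illustration only.
print(rating_query("https://example.org/datasets/1"))
```
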
Allow list
A registration URL must be on a domain that is allowed before it can be added to the Register.
The allow list lives in the
https://data.netwerkdigitaalerfgoed.nl/registry/allowed_domain_names RDF graph.
Each entry is a blank node with a single property:
| Property | Description |
|---|---|
| https://data.netwerkdigitaalerfgoed.nl/allowed_domain_names/def/domain_name | Literal: either a registrable domain (example.com) or a specific subdomain (sub.example.com). A registrable domain implicitly covers all its subdomains. |
To modify the allow list, use the REST API (POST /allowed-domains); the SPARQL
endpoint is read-only.
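
The matching rule for allow-list entries can be approximated with a suffix comparison on host names. This is a simplified sketch with hypothetical function names, not the register's implementation. It treats every entry as covering its own subdomains, whereas the rule above only states this for registrable domains; telling the two kinds of entry apart would require the Public Suffix List, which the sketch leaves out.

```python
from typing import Iterable
from urllib.parse import urlsplit


def is_allowed(hostname: str, allowed: Iterable[str]) -> bool:
    """True if hostname equals an entry or is a subdomain of one."""
    host = hostname.lower().rstrip(".")
    for entry in allowed:
        entry = entry.lower().rstrip(".")
        # The leading dot prevents "badexample.com" from matching "example.com".
        if host == entry or host.endswith("." + entry):
            return True
    return False


def registration_url_allowed(url: str, allowed: Iterable[str]) -> bool:
    """Check the host part of a registration URL against the allow list."""
    host = urlsplit(url).hostname
    return host is not None and is_allowed(host, allowed)


allow_list = ["example.com", "sub.example.org"]
print(registration_url_allowed("https://data.example.com/catalog", allow_list))
print(registration_url_allowed("https://other.example.org/catalog", allow_list))
```
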