Dataset Register
The Dataset Register is the list of datasets in the Dutch heritage network. For each dataset, it provides a machine-readable description that includes:
- the dataset’s name, creator and publisher
- information on how to access the dataset’s content, for example data dumps and SPARQL endpoints
- licensing information.
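As an illustration, a description carrying these fields might look like the JSON-LD below, built here as a Python dictionary. This is only a sketch: the choice of the schema.org vocabulary, the property names, the media types and every URL and name are assumptions made for the example. The Requirements for Datasets define what the Register actually accepts.

```python
import json

# Hypothetical dataset description; every URL and name below is a placeholder.
description = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "@id": "https://example.org/datasets/1",
    "name": "Example heritage collection",
    "creator": {"@type": "Organization", "name": "Example Museum"},
    "publisher": {"@type": "Organization", "name": "Example Museum"},
    "license": "https://creativecommons.org/publicdomain/zero/1.0/",
    "distribution": [
        {
            # Access option 1: a downloadable data dump
            "@type": "DataDownload",
            "contentUrl": "https://example.org/datasets/1/dump.nt.gz",
            "encodingFormat": "application/n-triples",
        },
        {
            # Access option 2: a SPARQL endpoint (media type is an assumption)
            "@type": "DataDownload",
            "contentUrl": "https://example.org/sparql",
            "encodingFormat": "application/sparql-query",
        },
    ],
}


def to_json_ld(doc: dict) -> str:
    """Serialise the description as a JSON-LD string."""
    return json.dumps(doc, indent=2, ensure_ascii=False)


print(to_json_ld(description))
```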
Where to go next
If you are publishing a dataset, register your dataset description via the REST API. The Register fetches it, validates it against the Requirements for Datasets, and stores the result. See also Register your dataset for the publisher-side walkthrough.
If you are consuming the Register, query the SPARQL endpoint. The data model describes the DCAT-AP-NL shapes you will encounter in the results.
If you are browsing, the Dataset Register website provides a human-readable search interface.
Keep reading here for an overview of the Dataset Register’s architecture, components and flows.
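The publisher and consumer entry points above could be exercised roughly as follows. This is a minimal sketch, not the Register's documented API: both URLs are placeholders, and the registration payload shape (a JSON-LD object naming the description URL) and its content type are assumptions. The query uses standard DCAT and Dublin Core terms, and the DCAT-AP-NL data model may require different properties. The requests are only built, not sent.

```python
import json
import urllib.parse
import urllib.request

# Assumed placeholders: neither URL is taken from the Dataset Register docs.
REGISTER_API = "https://register.example.org/api/datasets"
SPARQL_ENDPOINT = "https://register.example.org/sparql"


def build_registration_request(description_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST that submits a description URL."""
    # Assumed payload shape: a JSON-LD object identifying the description.
    body = json.dumps({"@id": description_url}).encode("utf-8")
    return urllib.request.Request(
        REGISTER_API,
        data=body,
        headers={"Content-Type": "application/ld+json"},
        method="POST",
    )


def build_sparql_request(query: str) -> urllib.request.Request:
    """Build (but do not send) a SPARQL 1.1 Protocol query via GET."""
    url = SPARQL_ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


# List up to ten datasets with their titles.
QUERY = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?dataset ?title WHERE {
  ?dataset a dcat:Dataset ;
           dct:title ?title .
}
LIMIT 10
"""

registration = build_registration_request("https://example.org/datasets/1")
sparql = build_sparql_request(QUERY)
```

Passing either request to `urllib.request.urlopen` would send it. That call is left out so the sketch does not touch the network.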
Components
The Dataset Register service is made up of several cooperating components:
- a REST API that accepts dataset registrations and validates incoming descriptions;
- a crawler that periodically re-fetches every registered URL, re-validates the response, and writes the result to the store;
- an RDF store that holds all valid dataset descriptions and exposes them via a SPARQL endpoint;
- a website that lets people browse and search the contents of the store.
The diagram below shows how the Dataset Register sits between the publishing side (Data Platforms and publishers’ websites) and the consuming side (the Dataset Register website, the Knowledge Graph, and third-party applications):
Registration flow
To make a dataset description visible on the Dataset Register website, Data Platforms and the Dataset Register cooperate in the following steps.
- A Collection Manager produces a dataset description and publishes it on the web (e.g. on a website or in a SPARQL endpoint).
- The URL of the dataset description is registered with the Dataset Register.
- The Dataset Register validates the dataset description and stores it for later retrieval.
- Periodically, the Dataset Register fetches, validates and stores all dataset descriptions again.
- The NDE Dataset Knowledge Graph periodically fetches valid descriptions from the Dataset Register, analyses linked datasets, and stores their summaries.
- When users consult the Dataset Register website, information from the Dataset Register and the Knowledge Graph is combined.
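The registration and re-crawl steps above can be sketched as a small in-memory model. Everything here is hypothetical: the `RegisterSketch` class, the three required fields standing in for the Requirements for Datasets, and the policy of dropping a description that fails re-validation are assumptions for illustration, not the Register's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set

# Simplified stand-in for the Requirements for Datasets (an assumption).
REQUIRED_FIELDS = ("name", "publisher", "license")


def is_valid(description: Optional[dict]) -> bool:
    """Toy validation: the description exists and has the required fields."""
    return description is not None and all(
        key in description for key in REQUIRED_FIELDS
    )


@dataclass
class RegisterSketch:
    """In-memory model of registration and periodic re-crawling."""

    # Returns the description at a URL, or None when nothing is found there.
    fetch: Callable[[str], Optional[dict]]
    registered_urls: Set[str] = field(default_factory=set)
    # Holds valid descriptions only, keyed by their URL.
    store: Dict[str, dict] = field(default_factory=dict)

    def register(self, url: str) -> bool:
        """Fetch the description at `url`, validate it, and store it."""
        description = self.fetch(url)
        if not is_valid(description):
            return False
        self.registered_urls.add(url)
        self.store[url] = description
        return True

    def crawl(self) -> None:
        """Re-fetch and re-validate every registered URL."""
        for url in sorted(self.registered_urls):
            description = self.fetch(url)
            if is_valid(description):
                self.store[url] = description
            else:
                # Assumed policy: drop descriptions that are no longer valid.
                self.store.pop(url, None)


# A dictionary stands in for the web: URL -> published description.
web = {
    "https://example.org/datasets/1": {
        "name": "Example heritage collection",
        "publisher": "Example Museum",
        "license": "https://creativecommons.org/publicdomain/zero/1.0/",
    }
}
register = RegisterSketch(fetch=web.get)
register.register("https://example.org/datasets/1")
register.crawl()
```

Injecting `fetch` keeps the sketch free of network access and makes the periodic crawl easy to exercise: change or remove an entry in `web`, call `crawl()` again, and the store follows.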
This flow is explained in more detail in the two diagrams below.
Flow chart
Sequence diagram
Source code
The Dataset Register is open source.