Stack
This chapter proposes engineering choices, naming conventions, and operational patterns for the NDE network which are not yet endorsed by NDE. Feedback welcome via the issue tracker.
Introduction
The NDE Stack is the working name given here to the ecosystem of NDE-compatible components and the operational patterns that compose them, grounded in the network’s shared standards. The Stack’s goal is to help software developers in the NDE network solve shared functionality once rather than reinventing it in each project. Most of what appears here is new: proposed components and patterns yet to be built. The rest is existing software given a role in the Stack.
The Stack operationalises the architecture sketched in Van data naar dienst: visie op de ontwikkeling van verbonden erfgoeddiensten (NDE, November 2025), covering the Data and Presentation Layers of a Service Platform. The Stack also includes the Publication Layer and connects to source-side data management. As more of the network’s functionality is named, the Stack grows. Where the report leaves gaps, the Stack fills them.
These chapters are also meant to create a shared language and understanding across the network: a common vocabulary for the components and patterns that builders, operators, and decision-makers can use when discussing what the Stack does and how it fits together. Naming the parts is the prerequisite for talking about them coherently across teams and organisations.
Scope
This is: a bridge from the report’s architectural vocabulary to running code. Names concrete components (existing and proposed), default operational patterns, and the foundational technologies the Stack depends on.
This isn’t: a restatement of the report. The report describes what should happen at each step of a Service Platform; this guidance proposes how, at the engineering level, with named patterns and packages.
Goes beyond the report where useful: some practical engineering concerns are not in the report, such as semantic search, snapshot-CDC (change data capture) deletion handling, per-source outage resilience, declarative standards-backed pipeline configuration. The Stack picks these up as natural extensions of the report’s framework, flagged in context where they appear so a reader can tell report-grounded content from Stack-direction extensions.
Primary audience: builders of Service Platforms and network services within the NDE network.
Status of contents: many components are proposals (@lde/* or @ndes/* packages that do not exist yet). The function-mapping table marks them as such. Patterns labelled “Proposed” have been discussed but not endorsed.
Taxonomy
The Stack uses a small vocabulary consistently.
| Term | Definition | Examples |
|---|---|---|
| Component | Software the Stack provides | @lde/* packages, @ndes/* packages |
| Pattern | Operational mechanic | Blue/green Rebuild, SCHEMA-AP-NDE-first, Ports & Adapters |
| Service | A running instance of a Component, deployed with a specific configuration | A Service Platform running the search projection with its own SHACL and search configuration |
| Network service | A Service operated by NDE, network-wide, addressable as a single canonical endpoint | Dataset Register, Network of Terms, Dataset Knowledge Graph |
| Standard | A network commitment the Stack adopts | SCHEMA-AP-NDE, LDES, IIIF, DCAT-AP 3.0 / Schema.org Dataset |
| Foundational technology | Upstream open-source dependency outside NDE governance | QLever, Typesense, nginx, Fastify, Mercurius |
Components
The Stack provides the components below. They live in the Service Platform side chapter; this catalog is a quick index. The Pipeline chapter shows how the pipeline components compose.
| Component | Layer | Brief |
|---|---|---|
| Search Pipeline (proposed) | Data | Builds a search index of records from selected datasets |
| Knowledge Graph Pipeline | Data | Builds a queryable knowledge graph from selected datasets |
| Search APIs (proposed) | Data | Search and filter API that Presentation Layers consume |
| Knowledge Graph APIs | Data | Query interfaces for the Stack’s knowledge graphs: DKG (operational), Term Backlink Graph (proposed), Knowledge Graph voor Termen (proposed), and any self-operated KG a Service Platform builds |
| Change Stream Producer (proposed) | Data | Publishes a Service Platform’s data changes as a feed other systems can subscribe to |
| Heritage UI Components (future) | Presentation | Reusable display components |
Foundational technologies
The Stack is built on top of mature open-source infrastructure that lives outside NDE/LDE governance. These are dependencies, not Stack components; release cycles, roadmaps, and breaking changes follow upstream projects. The Stack picks opinionated defaults and treats them as exchangeable for any conformant alternative; substitutes plug in behind ports defined by the Ports & adapters pattern, so a swap is a configuration concern rather than a code rewrite.
Defaults favour operational lightness: solutions that are simple to run on LDE’s shared infrastructure, on national infrastructure, and by individual service providers on their own hardware, so the same Stack stays realistic to operate at every scale. That criterion is why the default search engine is Typesense rather than the heavier Elasticsearch.
| Concern | Stack default | Realistic substitutes |
|---|---|---|
| RDF triplestore / SPARQL engine | QLever – read-only-after-load, fast bulk-load, fits blue/green rebuild | Oxigraph, GraphDB, Jena Fuseki |
| Search engine | Typesense – used by the search pipeline | Elasticsearch, OpenSearch (each behind their own adapter at the search pipeline’s engine boundary) |
| Reverse proxy | nginx – default for proxy-level blue/green | HAProxy (runtime API for zero-reload switching), Caddy (admin-API hot reload), Envoy |
| Web / API runtime | Fastify – one runtime for both API styles: REST via @fastify/swagger for the OpenAPI surface (used by the Dataset Register API today), GraphQL via Mercurius (proposed @lde/graphql-server) | REST: any OpenAPI-capable framework. GraphQL: Apollo Server, Yoga, any GraphQL.js-based runtime |
Substrates
Three distinct bodies of source data (substrates) underlie the Stack. Each pipeline rides on exactly one. They differ in source, scale, cadence, and consumers; the layer pages refer back to them by letter.
| Substrate | What’s crawled | Source(s) | Scale / cadence | Feeds |
|---|---|---|---|---|
| A. Dataset descriptions | DCAT-AP metadata about datasets (titles, publishers, distributions, licenses, subjects) | Publishers’ DCAT-AP descriptions, harvested by the NDE Datasetregister | Small, metadata-only. Refresh frequently (daily) | Enumeration for B: which datasets exist and where their distributions live. Catalog Search Pipeline over the descriptions themselves (the Dataset Register browser), enriched with DKG facets |
| B. Metadata records | Metadata records inside each distribution (instances of CreativeWork, Person, Place, …) | Per-dataset SPARQL endpoint or RDF dump | Large, per-dataset. Refresh per source on last modified date | Object Search Pipeline (records inside distributions); Dataset Knowledge Graph; Term Backlink Graph (data-model-agnostic vocab walk) |
| C. Terms | Terms and the relations between them, across terminology sources | Terminology sources, aggregated by the NDE Network of Terms | Medium, vocabulary-scoped. Refresh on vocab updates | The report’s “Knowledge Graph voor Termen” (function 5) |
Observations that fall out of this separation:
- A enumerates B, not C. B’s pipelines read the Register to learn which datasets exist and where their distributions live. C is enumerated separately, from the Network of Terms catalogue of terminology sources. Part of that catalogue may migrate into the Register over time, but external sources like GeoNames, AAT and Wikidata stay outside it, so C keeps its own enumeration.
- Object search over B is the norm; only the register indexes A. A Service Platform’s Search Pipeline reads the register (substrate A) only to enumerate which datasets to crawl; the records it ingests are the objects inside their distributions (substrate B) – each
CreativeWork,Person, orPlace. The dataset descriptions themselves never enter that index. The one exception is the Dataset Register’s own browser: being the catalog, it indexes substrate A directly – the dataset descriptions are its records – enriched with DKG facets. Same pipeline and stages either way; only the substrate and record grain differ. - B carries multiple projections. The same crawl feeds three structurally different sinks: an AP-aware search-document projection (Search Pipeline), a VoID (Vocabulary of Interlinked Datasets) statistical summary (Dataset Knowledge Graph), and a data-model-agnostic term-backlink projection (Term Backlink Graph). All three are enumerated from A but compute over B’s contents: what differs is the projection, not the substrate.
- Change cadences differ. B changes most often: records are added, updated, and removed at source as collections grow. A changes less often: dataset metadata is updated more rarely than the records it describes. C is most stable: vocabularies move slowly. Stack components inherit the change cadence of the substrate they ride on.