
Is it http://schema.org or https://schema.org?


Schema.org is the web’s most widely adopted vocabulary for structured data, supported by Google, Bing, and countless data platforms. It defines types like CreativeWork, Person, and Organization, giving machines a common language to understand content. This is exactly why the NDE application profile is built on Schema.org. Getting it right is what makes cultural heritage data findable and reusable across institutions.

But if you’ve worked with Schema.org in RDF, you’ve run into this: there’s http://schema.org and then there’s https://schema.org. That one-letter difference can cause real problems: SPARQL queries that silently return nothing, SHACL validation that rejects good data or ignores bad data, or datasets that should link up but don’t – especially when combining data from multiple sources, as in NDE.

How did this happen?

Historically, Schema.org identifiers were defined using http://schema.org/. In 2018, as part of the broader web push towards HTTPS, the website moved to HTTPS. Schema.org’s own FAQ had already declared (since 2015) that “both ‘https://schema.org’ and ‘http://schema.org’ are fine.” So people naturally started using https:// – copying it from the site and its examples, and following the FAQ’s advice – even though the vocabulary identifiers were still defined as HTTP.

But a webpage is not the same as an identifier. Your browser just “ends up” at https://schema.org/ either way, but for any tool that processes RDF, http://schema.org/CreativeWork and https://schema.org/CreativeWork are two different things.
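
To make that concrete, here is a minimal sketch in plain Python, with no RDF library involved. The example.org subjects and the subjects_of_type helper are hypothetical illustrations; the point is only that IRIs are compared character by character, so a lookup for one variant never finds data typed with the other.

```python
# Hypothetical mini "triple store": a list of (subject, predicate, object) tuples.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

triples = [
    ("https://example.org/book/1", RDF_TYPE, "http://schema.org/CreativeWork"),
    ("https://example.org/book/2", RDF_TYPE, "https://schema.org/CreativeWork"),
]


def subjects_of_type(data, type_iri):
    """Return subjects typed with exactly this IRI (plain string comparison)."""
    return [s for s, p, o in data if p == RDF_TYPE and o == type_iri]


# Each lookup only sees the subjects typed with the exact same string.
http_matches = subjects_of_type(triples, "http://schema.org/CreativeWork")
https_matches = subjects_of_type(triples, "https://schema.org/CreativeWork")

print(http_matches)
print(https_matches)
```

Neither lookup raises an error; each one just returns half of the data, which is why this kind of mismatch tends to go unnoticed.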

As Tim Berners-Lee wrote in Axioms of Web Architecture: “the significance of identity for a given URI is determined by the person who owns the URI, who first determined what it points to.” So what did Schema.org’s owner determine?

Why “both are fine” is misleading

Schema.org’s FAQ #19 says that “both ‘https://schema.org’ and ‘http://schema.org’ are fine” and that “there should be no urgency about migrating existing data,” while noting the site itself has migrated to HTTPS as the default. That advice is accurate for search-engine consumers like Google, which treat both variants as equivalent. But for JSON-LD processors and RDF tooling, http://schema.org/CreativeWork and https://schema.org/CreativeWork are not the same – and the JSON-LD context still defines the vocabulary with HTTP identifiers.

JSON-LD contexts resolve to HTTP

JSON-LD is a common RDF serialization format. Schema.org uses it in its website examples. When you write Schema.org in JSON-LD, it looks like this:

{
  "@context": "https://schema.org",
  "@type": "CreativeWork"
}

JSON-LD processors follow a standardized mechanism called remote document and context retrieval:

  1. The processor sees @context: "https://schema.org".
  2. It fetches that URL. The server responds with a Link header pointing to the JSON-LD context: </docs/jsonldcontext.jsonld>; rel="alternate"; type="application/ld+json".
  3. The processor follows that link to https://schema.org/docs/jsonldcontext.jsonld.
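
Steps 2 and 3 can be sketched offline in Python. The Link header value is the one quoted above; parse_link_header and resolve_context_location are hypothetical helpers written for this illustration, and the parsing is deliberately simplified (one link per header, no full Web Linking grammar).

```python
from urllib.parse import urljoin

# Values taken from the steps above; no network request is made here.
CONTEXT_URL = "https://schema.org"
LINK_HEADER = '</docs/jsonldcontext.jsonld>; rel="alternate"; type="application/ld+json"'


def parse_link_header(value):
    """Split a single-link Link header into (target, params). Simplified sketch."""
    target_part, *param_parts = [part.strip() for part in value.split(";")]
    target = target_part.strip("<>")
    params = {}
    for part in param_parts:
        key, _, raw = part.partition("=")
        params[key.strip()] = raw.strip().strip('"')
    return target, params


def resolve_context_location(base_url, link_header):
    """Return the absolute context URL, or None if the link does not qualify."""
    target, params = parse_link_header(link_header)
    if params.get("rel") == "alternate" and params.get("type") == "application/ld+json":
        # The link target is relative, so resolve it against the requested URL.
        return urljoin(base_url, target)
    return None


context_location = resolve_context_location(CONTEXT_URL, LINK_HEADER)
print(context_location)
```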

Inside that document you’ll find something important:

{
  "@context": {
    "@vocab": "http://schema.org/"
  }
}

This tells the JSON-LD processor: all Schema.org terms expand to http://schema.org/ URIs. So even if your JSON-LD document uses https://schema.org as the context URL, a processor expands "@type": "CreativeWork" to http://schema.org/CreativeWork.
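
The effect of @vocab can be illustrated with a toy expansion in Python. This is a deliberately simplified sketch, not the JSON-LD expansion algorithm: it handles only a flat node, the context is inlined rather than fetched, and expand_term and expand_node are hypothetical helpers.

```python
# Inlined copy of the relevant part of the context shown above (assumption:
# the real context document contains many more term definitions).
SCHEMA_CONTEXT = {"@vocab": "http://schema.org/"}


def expand_term(term, context):
    """Expand a term against @vocab; leave keywords and absolute IRIs untouched."""
    if term.startswith("@") or "://" in term:
        return term
    return context["@vocab"] + term


def expand_node(node, context):
    """Expand the @type value and the property keys of a flat JSON-LD node."""
    expanded = {}
    for key, value in node.items():
        if key == "@context":
            continue  # the context is consumed during expansion, not emitted
        if key == "@type":
            expanded["@type"] = expand_term(value, context)
        else:
            expanded[expand_term(key, context)] = value
    return expanded


doc = {"@context": "https://schema.org", "@type": "CreativeWork", "name": "Example"}
expanded = expand_node(doc, SCHEMA_CONTEXT)
print(expanded)
```

Note that the https:// in the document’s @context value plays no part in the result: it only says where the context was loaded from, while the resulting identifiers come from @vocab.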

You can see this in action on the command line:

echo '{"@context":"https://schema.org","@type":"CreativeWork","name":"Example"}' | riot --syntax=jsonld --output=nquads
_:B... <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/CreativeWork> .
_:B... <http://schema.org/name> "Example" .

JSON-LD-aware tools – including structured-data validators, knowledge-graph pipelines, and libraries like Apache Jena’s RIOT – therefore expand Schema.org terms to HTTP URIs. They aren’t downgrading security; they’re following the vocabulary definition from the official context. As of Schema.org v29.4, this is still the case.

What does the community say?

The broader community has been debating this for years, with valid arguments on both sides.

Arguments for staying with HTTP:

  • schemaorg#4404 (still open): contributors point out that “identifiers in the schema are technically URIs, not URLs. It’s actually better if they don’t change over time and there is no security implication involved in a predicate or type,” and that “Stable URIs that do not change over time are a MUST in the world of Linked Open Data.” Others describe the current situation as an “ugly intermediate zone” – neither fully HTTP nor fully HTTPS.
  • schemaorg#1325 (closed, not planned): confirmed there would be no mandatory migration to HTTPS.
  • schemaorg#2597: examples were updated to HTTPS, but the context was intentionally kept at HTTP.
  • schemaorg#2853: a v12.0 attempt to switch the context to HTTPS was immediately reverted due to breakage.
  • ro-crate#427 and codemeta#373: both RO-Crate and CodeMeta are discussing moving to HTTPS but still use HTTP.

Arguments for moving to HTTPS:

  • FAQ #19 says “both are fine” and that HTTPS is “our preferred form in examples.”
  • rdflib defines its Schema.org namespace as https://schema.org/ (exported as both SDO and schema).
  • PiCo, an NDE community vocabulary, has already adopted HTTPS in practice.

Even prefix conventions reflect the divide: prefix.cc maps schema: to http://schema.org/ and sdo: to https://schema.org/.

What about security?

A common argument for HTTPS identifiers is security: shouldn’t we fetch vocabulary definitions over a secure connection? In practice, this is a non-issue. If you dereference http://schema.org/CreativeWork, the server redirects you to https://schema.org/CreativeWork – the definition is served over HTTPS regardless. The http:// in the URI is part of the identifier; it does not dictate the transport over which the definition is ultimately delivered.

What about dereferenceability?

Schema.org URIs aren’t just opaque strings – they do resolve to useful definitions, and that dereferenceability is a core Linked Data principle. You can look up what CreativeWork means, what properties it has, and how it relates to other types. This remains fully functional with HTTP identifiers: you still get the definitions, still over a secure connection. What matters for interoperability is that everyone uses the same identifier, whichever scheme it uses – and right now, the canonical one is HTTP.

Cool URIs don’t change

In his essay Cool URIs don’t change, Tim Berners-Lee argued that URIs should be stable over time. Once http://schema.org/CreativeWork became the identifier for that concept, changing it to https://schema.org/CreativeWork would break existing data, queries, and validation rules.

He recently reflected on the HTTP-to-HTTPS transition in his book This is for Everyone (p. 176):

“We hit an unfortunate snag when we changed the URL to add an ‘S’: it broke all existing links on the web! In retrospect, the web would be much more functional and simpler if HTTP and HTTPS pages both used the same URL, just with different protocols.”

The URI scheme change should not have affected identity – but it did. Using HTTP for vocabulary identifiers is not unusual: Dublin Core Terms, for example, uses http://purl.org/dc/terms/ and has never changed to HTTPS.

Making a clear choice

The NDE Schema Application Profile recommends:

  • Publishers SHOULD use http://schema.org/, because that’s what the JSON-LD context resolves to and what aligns with the canonical vocabulary.
  • Consumers MUST accept both HTTP and HTTPS variants. A large amount of Schema.org data already exists in the wild using HTTPS – including vocabularies like PiCo – and refusing it would mean ignoring perfectly valid datasets.
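
One way a consumer might accept both variants is to normalize incoming Schema.org IRIs to the HTTP form before querying or validating. The sketch below is an illustration under stated assumptions, not part of the NDE profile: normalize_schema_iri and normalize_triple are hypothetical helpers, and every term in a triple is treated as an IRI (real code should leave literals alone).

```python
HTTP_NS = "http://schema.org/"
HTTPS_NS = "https://schema.org/"


def normalize_schema_iri(iri):
    """Map an HTTPS Schema.org IRI to its HTTP form; return anything else unchanged."""
    if iri.startswith(HTTPS_NS):
        return HTTP_NS + iri[len(HTTPS_NS):]
    return iri


def normalize_triple(triple):
    """Normalize every term of a (subject, predicate, object) tuple of IRIs."""
    return tuple(normalize_schema_iri(term) for term in triple)


incoming = (
    "https://example.org/book/2",  # hypothetical subject, left untouched
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "https://schema.org/CreativeWork",
)
normalized = normalize_triple(incoming)
print(normalized)
```

Matching on the full namespace, including the trailing slash, keeps unrelated IRIs that merely begin with the same characters out of the rewrite.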

This is an NDE interoperability recommendation tied to the current JSON-LD context behaviour, not a universal policy. If Schema.org migrates its context to HTTPS, this recommendation would change accordingly.

The rationale is documented in schema-profile#45. This pragmatic approach follows the robustness principle: be strict in what you publish, liberal in what you accept.

Conclusion

The community may eventually converge on HTTPS, but until Schema.org’s JSON-LD context reflects that change, using HTTP remains the most interoperable choice. When publishing structured data: use http://schema.org/. When consuming it: accept both.