
Versioning & data packaging

TextRefs versions three things that move at different speeds, and archives each independently:

| Train | Lives in | Tag format | Zenodo concept DOI |
|---|---|---|---|
| Standard + site | textrefs/textrefs.org | vMAJOR.MINOR.PATCH[-pre] | TextRefs Standard |
| Registry data | textrefs/registry | vYYYY.MM.N | TextRefs Registry |
| Data-package version | inside datapackage.json | SemVer without leading v | (carried within registry deposit) |
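As a rough illustration of the two tag shapes, the sketch below classifies a tag string with regular expressions. The exact grammar is an assumption: the source gives only the shapes vMAJOR.MINOR.PATCH[-pre] and vYYYY.MM.N, so the two-digit month, the pre-release character set, and the name `classify_tag` are all hypothetical.

```python
import re

# Assumed patterns; the source specifies only the general tag shapes.
REGISTRY_TAG = re.compile(r"v(\d{4})\.(0[1-9]|1[0-2])\.(\d+)")
STANDARD_TAG = re.compile(
    r"v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(-[0-9A-Za-z.-]+)?"
)


def classify_tag(tag: str) -> str:
    """Return 'registry', 'standard', or 'unknown' for a release tag."""
    # A tag such as v2025.11.1 fits both shapes, so the registry pattern is
    # tried first. In practice the repository a tag comes from disambiguates.
    if REGISTRY_TAG.fullmatch(tag):
        return "registry"
    if STANDARD_TAG.fullmatch(tag):
        return "standard"
    return "unknown"
```

Because the two shapes overlap for some values, a real pipeline would more likely pick the pattern from the repository name than guess from the tag alone.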

The site repository couples the spec, JSON-LD context, Zod schemas, and Astro site under a single tag because pre-1.0 the site is the spec’s reference rendering; splitting them now would create empty changelogs and confuse Zenodo metadata. Registry data is decoupled — record changes flow on their own cadence — and lives in a separate repository because the Zenodo–GitHub integration mints one concept DOI per repository. The two repositories are cross-linked via .zenodo.json related_identifiers.

The site repository includes textrefs/registry as a git submodule at data/. Until the registry cuts its first tagged release, the submodule tracks the registry’s dev branch; afterward, the site pins to specific registry tags for reproducible builds.

Each /standard/* page carries a maturity field in its frontmatter, encoding intent alongside the SemVer tag (which encodes pre-release status):

| Value | Meaning |
|---|---|
| working-draft | Unstable. Data model and prose may change without notice and without a version bump while the core is being settled. |
| candidate-recommendation | Stable enough to implement against. Breaking changes still require a major version bump. |
| recommendation | Stable recommendation. Breaking changes require a major version bump and a new document. |
| superseded | Replaced by a later version. Retained at its tagged URL for archival lookup. |

Transitions:

  • 0.x releases stay working-draft regardless of any -draft suffix on the tag.
  • The first release candidate, 1.0.0-rc.1, enters candidate-recommendation.
  • 1.0.0 enters recommendation.
  • Breaking schema changes require a major version increment.
  • Compatible new fields require a minor version increment.
  • Corrections that do not change schema shape require a patch increment.
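The maturity transitions above can be sketched as a function from a release tag to a maturity value. This is a minimal sketch, not the site's implementation: the name `maturity_for` is hypothetical, and treating every later `rc.N` tag as candidate-recommendation and every later plain release as recommendation is an assumption, since the source names only 1.0.0-rc.1 and 1.0.0.

```python
import re
from typing import Optional

# Accepts an optional leading "v" and an optional pre-release suffix.
_TAG = re.compile(r"v?(\d+)\.(\d+)\.(\d+)(?:-(.+))?")


def maturity_for(tag: str) -> Optional[str]:
    """Map a release tag to a maturity value, or None if undetermined."""
    match = _TAG.fullmatch(tag)
    if match is None:
        return None
    major = int(match.group(1))
    pre_release = match.group(4)
    if major == 0:
        # 0.x stays working-draft regardless of any suffix such as -draft.
        return "working-draft"
    if pre_release is None:
        return "recommendation"
    if pre_release.startswith("rc."):
        return "candidate-recommendation"
    # The source does not define other pre-release suffixes after 0.x.
    return None
```

The `superseded` value is deliberately absent: it depends on whether a later version exists, which cannot be read from a single tag.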

Monthly exports use this directory layout:

registry/exports/YYYY-MM/datapackage.json
registry/exports/YYYY-MM/works.jsonl
registry/exports/YYYY-MM/citation-systems.jsonl
registry/exports/YYYY-MM/references.jsonl
registry/exports/YYYY-MM/mappings.jsonl
registry/exports/YYYY-MM/resolver-targets.jsonl
registry/exports/YYYY-MM/CHANGES.md

Registry exports are organized by object type. This gives consumers stable file names, simple streaming imports, and one predictable place to find each record type. Relationships are represented inside records through standard fields such as key, work_key, citation_system_key, subject, and target. CHANGES.md is a record-level diff against the previous tagged export.
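Because the file names are fixed, a consumer can check an export directory before importing it. The following is a small sketch under that assumption; the helper names `export_dir` and `missing_export_files` are hypothetical, and the file list is copied from the layout above.

```python
from pathlib import Path

# File names taken from the monthly export layout.
EXPORT_FILES = (
    "datapackage.json",
    "works.jsonl",
    "citation-systems.jsonl",
    "references.jsonl",
    "mappings.jsonl",
    "resolver-targets.jsonl",
    "CHANGES.md",
)


def export_dir(root: Path, year: int, month: int) -> Path:
    """Build the registry/exports/YYYY-MM directory path under root."""
    return root / "registry" / "exports" / f"{year:04d}-{month:02d}"


def missing_export_files(directory: Path) -> list[str]:
    """List the expected export files that are absent from directory."""
    return [name for name in EXPORT_FILES if not (directory / name).is_file()]
```

An empty result from `missing_export_files` means every expected file is present; it says nothing about whether the contents are valid.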

GitHub Releases are the primary distribution point for generated registry dumps. Each released tag in either repository is also deposited in the TextRefs Zenodo community for long-term archival preservation and DOI minting. Cite the version DOI when referring to a specific archived dump.

Each datapackage.json MUST include:

  • profile: data-package.
  • name: textrefs-registry.
  • version: SemVer package version.
  • licenses: SPDX identifier CC0-1.0 for registry data.
  • resources: one resource per JSONL file.
  • schema: field descriptors for each resource.
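The required fields lend themselves to a simple structural check. The sketch below is one possible reading, not a normative validator: it assumes `licenses` is a list of objects with a `name` key and that `schema` sits on each resource, as in the Frictionless Data Package convention, and the function name `datapackage_problems` is hypothetical.

```python
import re

# SemVer core with optional pre-release and build metadata, no leading "v".
SEMVER_NO_V = re.compile(
    r"\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?"
)


def datapackage_problems(pkg: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if pkg.get("profile") != "data-package":
        problems.append("profile must be 'data-package'")
    if pkg.get("name") != "textrefs-registry":
        problems.append("name must be 'textrefs-registry'")
    if not SEMVER_NO_V.fullmatch(str(pkg.get("version", ""))):
        problems.append("version must be SemVer without a leading 'v'")
    # Assumption: each license is an object carrying its SPDX id in "name".
    licenses = pkg.get("licenses") or []
    if not any(isinstance(lic, dict) and lic.get("name") == "CC0-1.0"
               for lic in licenses):
        problems.append("licenses must include CC0-1.0")
    resources = pkg.get("resources") or []
    if not resources:
        problems.append("resources must list one entry per JSONL file")
    for res in resources:
        # Assumption: the schema is declared per resource.
        if not isinstance(res, dict) or "schema" not in res:
            name = res.get("name", "?") if isinstance(res, dict) else "?"
            problems.append(f"resource {name} lacks a schema")
    return problems
```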

Records do not carry their own SemVer. The registry is append-only with status transitions (candidate → active → deprecated / withdrawn / blocked). Consumers pin to a registry tag (or its DOI) for reproducibility. Identifier-level changes are expressed via tombstones, below.

Registry identity is permanent: the IRI of a Work, CitationSystem, CanonicalReference, or MappingAssertion MUST continue to resolve once minted. Re-minting (renaming a key, correcting a locator that changes the content-derived UUID, splitting/merging records) MUST be expressed by tombstoning the old record and minting a successor.

Tombstones use one status value, no extra fields. The old record stays in the data tree with status: withdrawn. If a successor exists, a single MappingAssertion with relation: exactMatch, subject: <old IRI>, and target: <new IRI> carries the link. Consumers walk the mapping to find the successor.

Tombstones are full records, not deletions. The old record retains every other field unchanged; only status flips to withdrawn and modified is bumped. If a successor exists, the successor is a separately authored record at the new IRI, and the linking MappingAssertion is committed alongside.
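Walking the mapping to find a successor could look roughly like the sketch below. It assumes records are held in a dict keyed by IRI and that mappings are plain dicts with `relation`, `subject`, and `target`; the function name `resolve_successor` and the cycle guard are illustrative additions, not part of the specification.

```python
def resolve_successor(iri, records, mappings):
    """Follow exactMatch links from withdrawn records to their successor.

    records: dict mapping IRI -> record dict (assumed in-memory shape).
    mappings: iterable of MappingAssertion dicts.
    Returns the first IRI that is not withdrawn, or the last withdrawn IRI
    when no successor link exists.
    """
    successor = {
        m["subject"]: m["target"]
        for m in mappings
        if m.get("relation") == "exactMatch"
    }
    seen = set()
    current = iri
    # Only withdrawn records are followed; an active record is its own answer.
    while (records.get(current, {}).get("status") == "withdrawn"
           and current in successor):
        if current in seen:
            raise ValueError(f"successor cycle detected at {current}")
        seen.add(current)
        current = successor[current]
    return current
```

A record that has been re-minted more than once produces a chain of tombstones, which is why the lookup loops rather than following a single link.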

Old IRI HTML pages render a tombstone banner. The page already lists every MappingAssertion whose subject is this record, so the successor (if any) appears in that list with no extra rendering logic. The .json JSON-LD sibling returns the withdrawn record verbatim. Old IRIs are not hard-redirected: archival consumers MUST be able to inspect the tombstone payload.

Tombstones MUST appear in monthly exports inside the same .jsonl file as their type. The status field is the signal — no separate tombstones.jsonl.
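Since tombstones share a file with live records, a consumer separates them by reading the status field while streaming. A minimal sketch, with hypothetical helper names `iter_records` and `split_tombstones`, and assuming one JSON object per non-empty line:

```python
import json
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def split_tombstones(lines: Iterable[str]) -> tuple[list[dict], list[dict]]:
    """Separate records into (non-withdrawn, withdrawn) by status."""
    live, tombstones = [], []
    for record in iter_records(lines):
        if record.get("status") == "withdrawn":
            tombstones.append(record)
        else:
            live.append(record)
    return live, tombstones
```

Passing an open file object works as well as a list of strings, for example `split_tombstones(open("works.jsonl", encoding="utf-8"))`.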

The compiler enforces: an active CanonicalReference MUST NOT reference a tombstoned Work or CitationSystem through work_key or citation_system_key. MappingAssertions are exempt — successor links from a withdrawn subject to an active target are exactly the documented pattern.
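The rule can be illustrated with a small check over in-memory records. This is a sketch of the idea rather than the compiler's actual code: it treats "tombstoned" as `status: withdrawn`, assumes every record carries a `key` field, and the function name `dangling_references` is hypothetical.

```python
def dangling_references(references, works, citation_systems):
    """Find active references that point at withdrawn works or systems.

    Returns a list of (reference key, field name, target key) tuples.
    """
    withdrawn_works = {
        w["key"] for w in works if w.get("status") == "withdrawn"
    }
    withdrawn_systems = {
        s["key"] for s in citation_systems if s.get("status") == "withdrawn"
    }
    violations = []
    for ref in references:
        # Only active references are constrained by the rule.
        if ref.get("status") != "active":
            continue
        if ref.get("work_key") in withdrawn_works:
            violations.append((ref["key"], "work_key", ref["work_key"]))
        if ref.get("citation_system_key") in withdrawn_systems:
            violations.append(
                (ref["key"], "citation_system_key", ref["citation_system_key"])
            )
    return violations
```

MappingAssertions are not passed in at all, which mirrors their exemption from the rule.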

aliases.json handles presentational URL aliases (multiple paths pointing at the same canonical record). Tombstones handle identity changes (the record itself is no longer canonical). These are distinct mechanisms and MUST NOT be conflated.

Exports MUST NOT contain primary full text, commentary, apparatus, or rights metadata that implies TextRefs may redistribute copyrighted text. Disputed resolver endpoints remain in exports with status: blocked and tombstone rationale fields.