Versioning & data packaging
TextRefs versions three things that move at different speeds, and archives each independently:
| Train | Lives in | Tag format | Zenodo concept DOI |
|---|---|---|---|
| Standard + site | textrefs/textrefs.org | vMAJOR.MINOR.PATCH[-pre] | TextRefs Standard |
| Registry data | textrefs/registry | vYYYY.MM.N | TextRefs Registry |
| Data-package version | inside datapackage.json | SemVer without leading v | (carried within registry deposit) |
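To make the three tag formats concrete, here is a minimal sketch that classifies a version string by release train. The regular expressions are inferred from the "Tag format" column only; the exact grammar (for example, which pre-release suffixes are allowed) is an assumption, and `classify_version` is a hypothetical helper, not part of TextRefs.

```python
import re

# Numeric identifier without leading zeros, as in SemVer.
NUM = r"(?:0|[1-9]\d*)"
# Assumed SemVer shape: MAJOR.MINOR.PATCH with an optional pre-release suffix.
SEMVER = rf"{NUM}\.{NUM}\.{NUM}(?:-[0-9A-Za-z.-]+)?"

STANDARD_TAG = re.compile("v" + SEMVER)                         # vMAJOR.MINOR.PATCH[-pre]
REGISTRY_TAG = re.compile(r"v\d{4}\.(?:0[1-9]|1[0-2])\.\d+")    # vYYYY.MM.N
PACKAGE_VERSION = re.compile(SEMVER)                            # SemVer, no leading v


def classify_version(text: str) -> str:
    """Return the release train a version string appears to belong to."""
    # Check the registry pattern first: a tag such as "v2025.11.2" is also
    # a syntactically valid SemVer tag, so the order of checks matters.
    if REGISTRY_TAG.fullmatch(text):
        return "registry"
    if STANDARD_TAG.fullmatch(text):
        return "standard"
    if PACKAGE_VERSION.fullmatch(text):
        return "data-package"
    return "unknown"
```

In practice the repository a tag comes from already identifies its train; a check like this is mainly useful for linting tags before a release.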
The site repository couples the spec, JSON-LD context, Zod schemas, and Astro site under a single tag because pre-1.0 the site is the spec’s reference rendering; splitting them now would create empty changelogs and confuse Zenodo metadata. Registry data is decoupled — record changes flow on their own cadence — and lives in a separate repository because the Zenodo–GitHub integration mints one concept DOI per repository. The two repositories are cross-linked via .zenodo.json related_identifiers.
The site repository includes textrefs/registry as a git submodule at data/. Until the registry cuts its first tagged release, the submodule tracks the registry’s dev branch; afterward, the site pins to specific registry tags for reproducible builds.
Maturity ladder
Each /standard/* page carries a maturity field in its frontmatter, encoding intent alongside the SemVer tag (which encodes pre-release status):
| Value | Meaning |
|---|---|
| working-draft | Unstable. Data model and prose may change without notice and without a version bump while the core is still being settled. |
| candidate-recommendation | Stable enough to implement against. Breaking changes still require a major version bump. |
| recommendation | Stable recommendation. Breaking changes require a major version bump and a new document. |
| superseded | Replaced by a later version. Retained at its tagged URL for archival lookup. |
Transitions:
- 0.x releases stay working-draft regardless of any -draft suffix on the tag.
- The first release candidate, 1.0.0-rc.1, enters candidate-recommendation.
- 1.0.0 enters recommendation.
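The transitions above can be sketched as a small function from tag to maturity value. The source only names 0.x, 1.0.0-rc.1, and 1.0.0; treating every later `-rc` tag as candidate-recommendation and every later plain release as recommendation is an assumption here, and `maturity_for_tag` is a hypothetical name. The superseded value cannot be derived from a tag alone, so it is not covered.

```python
import re

# Optional leading "v", MAJOR.MINOR.PATCH, optional pre-release suffix.
_TAG = re.compile(r"v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?")


def maturity_for_tag(tag: str) -> str:
    """Map a SemVer tag to the maturity value it implies."""
    match = _TAG.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a SemVer tag: {tag!r}")
    major = int(match.group(1))
    prerelease = match.group(4)

    if major == 0:
        # 0.x stays working-draft whatever the suffix says.
        return "working-draft"
    if prerelease is None:
        return "recommendation"
    if prerelease.startswith("rc"):
        return "candidate-recommendation"
    # The source gives no rule for other pre-release labels at 1.x and later.
    raise ValueError(f"no maturity rule for pre-release {prerelease!r}")
```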
SemVer rules for data packages
- Breaking schema changes require a major version increment.
- Compatible new fields require a minor version increment.
- Corrections that do not change schema shape require a patch increment.
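As an illustration of these three rules, the sketch below bumps a data-package version according to the kind of change. The change labels (`breaking`, `compatible-field`, `correction`) and the function name are hypothetical, and pre-release suffixes are deliberately not handled.

```python
def bump(version: str, change: str) -> str:
    """Return the next data-package version for a given kind of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "breaking":
        # Breaking schema change: major increment, reset minor and patch.
        return f"{major + 1}.0.0"
    if change == "compatible-field":
        # Compatible new field: minor increment, reset patch.
        return f"{major}.{minor + 1}.0"
    if change == "correction":
        # Correction that leaves the schema shape alone: patch increment.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change kind: {change!r}")
```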
Export layout
Monthly exports use this directory layout:

- registry/exports/YYYY-MM/datapackage.json
- registry/exports/YYYY-MM/works.jsonl
- registry/exports/YYYY-MM/citation-systems.jsonl
- registry/exports/YYYY-MM/references.jsonl
- registry/exports/YYYY-MM/mappings.jsonl
- registry/exports/YYYY-MM/resolver-targets.jsonl
- registry/exports/YYYY-MM/CHANGES.md

Registry exports are organized by object type. This gives consumers stable file names, simple streaming imports, and one predictable place to find each record type. Relationships are represented inside records through standard fields such as key, work_key, citation_system_key, subject, and target. CHANGES.md is a record-level diff against the previous tagged export.
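Because the file names are stable, a consumer can compute the paths for any month and stream each JSONL file record by record. The following is a minimal sketch under that assumption; `export_paths` and `iter_jsonl` are hypothetical helpers, and the default root simply mirrors the layout listed above.

```python
import json
from pathlib import PurePosixPath
from typing import Iterable, Iterator

# File names taken from the export layout above.
EXPORT_FILES = (
    "datapackage.json",
    "works.jsonl",
    "citation-systems.jsonl",
    "references.jsonl",
    "mappings.jsonl",
    "resolver-targets.jsonl",
    "CHANGES.md",
)


def export_paths(year: int, month: int, root: str = "registry/exports") -> list[str]:
    """List the expected files of one monthly export."""
    folder = PurePosixPath(root) / f"{year:04d}-{month:02d}"
    return [str(folder / name) for name in EXPORT_FILES]


def iter_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one record per non-empty line, without loading the whole file."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)
```

Passing an open file object to `iter_jsonl` keeps memory use flat, since records are decoded one line at a time.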
Archival copies and DOIs
GitHub Releases are the primary distribution point for generated registry dumps. Each released tag in either repository is also deposited in the TextRefs Zenodo community for long-term archival preservation and DOI minting. Cite the version DOI when referring to a specific archived dump.
Frictionless requirements
Each datapackage.json MUST include:

- profile: data-package.
- name: textrefs-registry.
- version: SemVer package version.
- licenses: SPDX identifier CC0-1.0 for registry data.
- resources: one resource per JSONL file.
- schema: field descriptors for each resource.
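A descriptor that meets these requirements might look like the dictionary below, paired with a small check that reports what is missing. This is a sketch, not the registry's actual descriptor: the version number, the single resource, and its one field are placeholders, and `datapackage_problems` is a hypothetical helper rather than part of any Frictionless library.

```python
REQUIRED_KEYS = ("profile", "name", "version", "licenses", "resources")

# Placeholder descriptor; real exports list one resource per JSONL file.
EXAMPLE_PACKAGE = {
    "profile": "data-package",
    "name": "textrefs-registry",
    "version": "0.1.0",
    "licenses": [{"name": "CC0-1.0"}],
    "resources": [
        {
            "name": "works",
            "path": "works.jsonl",
            "schema": {"fields": [{"name": "key", "type": "string"}]},
        }
    ],
}


def datapackage_problems(package: dict) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if key not in package]
    if package.get("profile", "data-package") != "data-package":
        problems.append("profile must be data-package")
    if package.get("name", "textrefs-registry") != "textrefs-registry":
        problems.append("name must be textrefs-registry")
    license_names = [entry.get("name") for entry in package.get("licenses", [])]
    if "licenses" in package and "CC0-1.0" not in license_names:
        problems.append("licenses must include CC0-1.0")
    for resource in package.get("resources", []):
        if "schema" not in resource:
            problems.append(f"resource {resource.get('name')!r} has no schema")
    return problems
```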
Per-record versioning
Records do not carry their own SemVer. The registry is append-only with status transitions (candidate → active → deprecated / withdrawn / blocked). Consumers pin to a registry tag (or its DOI) for reproducibility. Identifier-level changes are expressed via tombstones, below.
Tombstones and re-minted records
Registry identity is permanent: the IRI of a Work, CitationSystem, CanonicalReference, or MappingAssertion MUST continue to resolve once minted. Re-minting (renaming a key, correcting a locator that changes the content-derived UUID, splitting/merging records) MUST be expressed by tombstoning the old record and minting a successor.
Schema
Tombstones use one status value and no extra fields. The old record stays in the data tree with status: withdrawn. If a successor exists, a single MappingAssertion with relation: exactMatch, subject: <old IRI>, and target: <new IRI> carries the link. Consumers walk the mapping to find the successor.
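Walking the mapping can be sketched as follows. The in-memory shapes (records keyed by IRI, mappings as a list of dictionaries) and the function name are assumptions for illustration. Following a chain of several tombstones is also an assumption: the text only describes a single link from an old IRI to a new one.

```python
from typing import Optional


def find_successor(iri: str, records: dict, mappings: list) -> Optional[str]:
    """Follow exactMatch links from a withdrawn record to a live successor.

    Returns the IRI itself if the record is not withdrawn, and None if a
    withdrawn record has no successor mapping.
    """
    seen = set()
    current = iri
    while records[current]["status"] == "withdrawn":
        if current in seen:
            raise ValueError(f"mapping cycle at {current}")
        seen.add(current)
        target = next(
            (
                m["target"]
                for m in mappings
                if m["relation"] == "exactMatch" and m["subject"] == current
            ),
            None,
        )
        if target is None:
            return None  # tombstone without a successor
        current = target
    return current
```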
On-disk representation
Tombstones are full records, not deletions. The old record retains every other field unchanged; only status flips to withdrawn and modified is bumped. If a successor exists, the successor is a separately authored record at the new IRI, and the linking MappingAssertion is committed alongside.
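In code, tombstoning a record amounts to copying it and changing two fields. The sketch below assumes records are plain dictionaries and that modified holds an ISO 8601 timestamp; neither the timestamp format nor the `tombstone` helper is specified by the source.

```python
from datetime import datetime, timezone
from typing import Optional


def tombstone(record: dict, now: Optional[datetime] = None) -> dict:
    """Return a withdrawn copy of a record; the input is left untouched."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    # Every other field is carried over unchanged.
    return {**record, "status": "withdrawn", "modified": stamp}
```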
HTTP behavior
The HTML page at an old IRI renders a tombstone banner. The page already lists every MappingAssertion whose subject is this record, so the successor (if any) appears in that list with no extra rendering logic. The .json JSON-LD sibling returns the withdrawn record verbatim. Old IRIs are not hard-redirected: archival consumers MUST be able to inspect the tombstone payload.
Export inclusion
Tombstones MUST appear in monthly exports in the same .jsonl file as other records of their type. The status field is the signal; there is no separate tombstones.jsonl.
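Since tombstones share a file with live records, a consumer tells them apart by reading status. A minimal sketch, assuming every line is a JSON object with a status field (`status_counts` is a hypothetical helper):

```python
import json
from collections import Counter
from typing import Iterable


def status_counts(lines: Iterable[str]) -> Counter:
    """Count records per status value in one JSONL export file."""
    return Counter(
        json.loads(line)["status"] for line in lines if line.strip()
    )
```

A consumer that wants only live records would filter on the same field rather than look for a second file.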
Compiler invariants
The compiler enforces: an active CanonicalReference MUST NOT reference a tombstoned Work or CitationSystem through work_key or citation_system_key. MappingAssertions are exempt: successor links from a withdrawn subject to an active target are exactly the documented pattern.
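The invariant can be expressed as a short check over already-loaded records. This is an illustrative sketch, not the compiler's implementation: the function name and the dictionaries keyed by record key are assumptions. MappingAssertions are simply never passed in, which reflects their exemption.

```python
def invariant_violations(references: list, works: dict, citation_systems: dict) -> list[str]:
    """Report active references that point at withdrawn works or citation systems."""
    violations = []
    for ref in references:
        if ref["status"] != "active":
            continue  # only active references are constrained
        work = works.get(ref["work_key"])
        if work is not None and work["status"] == "withdrawn":
            violations.append(f"{ref['key']}: work {ref['work_key']} is withdrawn")
        system = citation_systems.get(ref["citation_system_key"])
        if system is not None and system["status"] == "withdrawn":
            violations.append(
                f"{ref['key']}: citation system {ref['citation_system_key']} is withdrawn"
            )
    return violations
```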
Aliases vs. tombstones
aliases.json handles presentational URL aliases (multiple paths pointing at the same canonical record). Tombstones handle identity changes (the record itself is no longer canonical). These are distinct mechanisms and MUST NOT be conflated.
Rights and content guardrails
Exports MUST NOT contain primary full text, commentary, apparatus, or rights metadata that implies TextRefs may redistribute copyrighted text. Disputed resolver endpoints remain in exports with status: blocked and tombstone rationale fields.