yerty/insights/methodology · v1.0 · last updated June 2026
How the Tribunal Data Explorer is built.
Every figure is reproducible. Every dataset is versioned. Every known limitation is published — including what isn't in the data.
Questions about the methodology? Email methodology@yerty.co.uk.
01 — Sources
Where the data comes from.
The Tribunal Data Explorer is built on the published register of UK employment tribunal decisions. The register is maintained by His Majesty's Courts and Tribunals Service (HMCTS) and is publicly available at gov.uk/employment-tribunal-decisions.
Yerty ingests every decision in the register (currently 100,000+ documents and growing weekly) without modification. Each ingested document is timestamped, hashed, and stored in immutable form before any extraction work begins. This raw ("bronze") copy of every decision is the source of truth for everything downstream.
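As a rough illustration of the ingest step described above, the sketch below hashes and timestamps a document and writes it once, unmodified. The names `ingest_document` and `RawRecord`, the choice of SHA-256, and the file-per-hash layout are assumptions made for this example; the methodology does not state which hash algorithm or storage backend Yerty uses.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass(frozen=True)
class RawRecord:
    """Hypothetical metadata kept alongside each raw document."""
    sha256: str
    ingested_at: str  # ISO 8601 timestamp in UTC
    path: Path


def ingest_document(content: bytes, store: Path) -> RawRecord:
    """Hash, timestamp and store a document without modifying it."""
    store.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(content).hexdigest()
    target = store / f"{digest}.pdf"
    if not target.exists():
        # Mode "xb" refuses to overwrite an existing file,
        # so bytes that were stored once are never changed.
        with open(target, "xb") as fh:
            fh.write(content)
    return RawRecord(
        sha256=digest,
        ingested_at=datetime.now(timezone.utc).isoformat(),
        path=target,
    )
```

Naming the file after its own hash means a re-ingested document maps to the same path, and anyone holding the stored bytes can recompute the digest to check they are unchanged.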
In the dataset
- Reasons documents (the published judgment + reasoning)
- Award documents (where compensation was awarded)
- Cost documents (where costs were ordered)
- Default judgments where published
Not in the dataset
- Settled cases that did not proceed to judgment
- Withdrawn claims
- Proceedings not selected for publication
- Decisions made before the published register began (pre-Feb 2017)
02 — Extraction pipeline
Raw documents → structured records → analytical views.
Every decision passes through a documented extraction pipeline that converts raw published PDFs into structured records and analytical views. Each layer is reproducible and versioned independently.
Raw
Documents ingested verbatim and stored immutably. Content-hashed on receipt. The source of truth for everything downstream.
Structured
Field-level extraction from each document: parties, judges, claims, awards, dates, regions, outcomes. Validated through documented review processes.
Analytical
Aggregated views: medians, distributions, sector-level statistics, time series. Updated weekly as new decisions are processed.
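The step from structured records to an analytical view can be sketched as follows. This is a minimal example only: the `DecisionRecord` fields, the function name, and the sample values are hypothetical and are not taken from the Yerty schema or dataset.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass(frozen=True)
class DecisionRecord:
    """Hypothetical structured record; field names are illustrative."""
    case_id: str
    region: str
    outcome: str
    award_gbp: Optional[float]  # None where no compensation was awarded


def median_award_by_region(records):
    """Group awards by region and return the median of each group."""
    awards = defaultdict(list)
    for rec in records:
        if rec.award_gbp is not None:
            awards[rec.region].append(rec.award_gbp)
    return {region: median(values) for region, values in awards.items()}


# Made-up records, used only to show the shape of the computation.
records = [
    DecisionRecord("A-1", "London", "upheld", 12000.0),
    DecisionRecord("A-2", "London", "upheld", 8000.0),
    DecisionRecord("A-3", "London", "dismissed", None),
    DecisionRecord("A-4", "Leeds", "upheld", 5000.0),
]
print(median_award_by_region(records))
```

Decisions with no award are left out of the median rather than counted as zero. That is a design choice in this sketch, not a statement about how the published figures treat them.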
03 — Coverage
What the dataset can — and cannot — show.
The single biggest limitation of any tribunal dataset is that settled cases — by far the majority — are not in it. We say this loudly because it matters for every figure we publish.
04 — Accuracy & validation
How we know the figures are right.
Critical fields (awards, dates, parties, outcomes) are validated through multiple methods before publication. Lower-stakes fields are validated through periodic sampled audits across claim types and document templates.
We publish the limitations we know about (see section 06, Limitations) and we update them as we discover new ones. We do not publish single-number accuracy figures because accuracy varies by field, by year, and by document template; a single number would mislead more than inform.
05 — Versioning & provenance
Every figure traces back.
The dataset is versioned by quarter. Every figure published on yerty.co.uk references a specific dataset version. When the version updates (weekly ingestion), historical figures are preserved — so any cited figure remains reproducible against the version that produced it.
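One way to picture this guarantee is an append-only archive in which every figure carries the dataset version that produced it. The sketch below is an assumption about how such a store might look, not a description of Yerty's implementation; `Figure`, `FigureArchive`, the metric name and the value are all hypothetical. The version string follows the `2026.Q2` format used in the citation in section 08.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Figure:
    """A published figure pinned to the dataset version that produced it."""
    metric: str
    value: float
    dataset_version: str  # e.g. "2026.Q2"


class FigureArchive:
    """Append-only store: a (metric, version) pair is never overwritten."""

    def __init__(self):
        self._figures = {}

    def publish(self, figure: Figure) -> None:
        key = (figure.metric, figure.dataset_version)
        if key in self._figures:
            raise ValueError(f"{key} already published; use a new version")
        self._figures[key] = figure

    def lookup(self, metric: str, dataset_version: str) -> Figure:
        return self._figures[(metric, dataset_version)]


archive = FigureArchive()
archive.publish(Figure("median_award_gbp", 10000.0, "2026.Q2"))  # placeholder value
print(archive.lookup("median_award_gbp", "2026.Q2").value)
```

Because `publish` refuses to replace an existing entry, a figure cited against a given version can still be retrieved unchanged after later versions are released.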
06 — Limitations
The honest list of what we can't do.
- Settled cases are the majority of disputes; they're not in the data.
- Awards are net of contributory conduct deductions — true claimant compensation may differ.
- "Industry" classifications are derived from Companies House data; some respondents have ambiguous primary industry.
- Judges are canonicalised; minor name variants may exist but are merged conservatively.
- Time-to-hearing is measured from claim filing to the first substantive hearing; preliminary hearings are excluded.
- Older decisions (pre-2020) have less complete metadata extraction.
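The time-to-hearing definition in the list above can be made concrete with a short sketch. The `Hearing` type, the `kind` labels and the example dates are hypothetical; only the rule itself (filing date to first substantive hearing, preliminary hearings excluded) comes from the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Hearing:
    held_on: date
    kind: str  # hypothetical labels: "preliminary" or "substantive"


def time_to_hearing_days(filed_on: date, hearings: list[Hearing]) -> Optional[int]:
    """Days from filing to the earliest substantive hearing, if there is one."""
    substantive = [h.held_on for h in hearings if h.kind == "substantive"]
    if not substantive:
        return None  # no substantive hearing recorded, so no figure
    return (min(substantive) - filed_on).days


# Made-up dates for illustration.
hearings = [
    Hearing(date(2024, 3, 1), "preliminary"),
    Hearing(date(2024, 6, 10), "substantive"),
    Hearing(date(2024, 6, 11), "substantive"),
]
print(time_to_hearing_days(date(2024, 1, 10), hearings))  # 152
```

The preliminary hearing on 1 March is ignored, so the result counts the days from 10 January to 10 June.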
07 — Updates
How often the data refreshes.
The published register updates daily; Yerty ingests weekly on Mondays. Extraction and validation run within 48 hours of ingestion, so new decisions appear in the Explorer by the Wednesday after the Monday ingestion that picks them up.
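The schedule above implies a simple date calculation, sketched below. The function names are hypothetical, and the sketch assumes that a decision published on a Monday is picked up by that same day's ingestion; the text does not say whether that is the case.

```python
from datetime import date, timedelta


def next_monday_ingestion(published_on: date) -> date:
    """Return the first Monday on or after the publication date."""
    days_ahead = (0 - published_on.weekday()) % 7  # Monday is weekday 0
    return published_on + timedelta(days=days_ahead)


def latest_explorer_date(published_on: date) -> date:
    """Ingestion Monday plus the 48-hour extraction and validation window."""
    return next_monday_ingestion(published_on) + timedelta(days=2)


# A decision published on Tuesday 2 June 2026 waits for the next Monday.
print(latest_explorer_date(date(2026, 6, 2)))  # 2026-06-10
```

Under these assumptions, the gap between publication and appearance in the Explorer ranges from two days (published on a Monday) to eight days (published on a Tuesday).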
08 — Citing Yerty
How to cite the dataset.
Academic (APA)
Yerty Insights. (2026). Tribunal Data Explorer (Version 2026.Q2) [Data set]. RightsTech Ltd. https://yerty.co.uk/insights
Journalistic
Source: Yerty Insights, published ET decisions
09 — Ownership and use
The dataset.
The Tribunal Data Explorer dataset, including its structure, organisation, and analytical views, is a database within the meaning of the Copyright and Rights in Databases Regulations 1997, and is the property of RightsTech Ltd. The published tribunal decisions on which it is built are public; the structured database is not.
Use of the dataset is governed by the Terms of Service. Citation and excerpting for journalism, research, and educational purposes is welcome with attribution. Programmatic access for commercial or competitive purposes requires permission — contact licensing@yerty.co.uk.
10 — Questions about the methodology?
Get in touch.
For questions about coverage, extraction accuracy, version-specific reproducibility, or anything else technical — email methodology@yerty.co.uk. We try to answer within 48 hours.
The Tribunal Data Explorer is built on published UK employment tribunal decisions. Yerty Insights is a product of RightsTech Ltd (Companies House No. 16602847, England & Wales). Yerty provides data and analytics, not legal advice.