yerty/insights/methodology · v1.0 · last updated June 2026

How the Tribunal Data Explorer is built.

Every figure is reproducible. Every dataset is versioned. Every known limitation is published — including what isn't in the data.

Questions about the methodology? Email methodology@yerty.co.uk.

01 — Sources

Where the data comes from.

The Tribunal Data Explorer is built on the published register of UK employment tribunal decisions. The register is maintained by His Majesty's Courts and Tribunals Service (HMCTS) and is publicly available at gov.uk/employment-tribunal-decisions.

Yerty ingests every decision in the register (currently 100,000+ documents and growing weekly) without modification. Each ingested document is timestamped, hashed, and stored in immutable form before any extraction work begins. This raw copy of each decision, the "bronze" layer, is the source of truth for everything downstream.
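As an illustration of what "timestamped, hashed, and stored in immutable form" can look like in practice, here is a minimal Python sketch. It is not Yerty's actual ingestion code: SHA-256 is an assumed choice of hash, and the `RawDocument` type, its field names, and the `ingest` and `is_unmodified` functions are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class RawDocument:
    # Hypothetical field names, for illustration only.
    source_url: str
    sha256: str
    ingested_at: datetime
    content: bytes


def ingest(source_url: str, content: bytes) -> RawDocument:
    """Timestamp and hash a document without altering its bytes."""
    digest = hashlib.sha256(content).hexdigest()  # SHA-256 is an assumption
    return RawDocument(source_url, digest, datetime.now(timezone.utc), content)


def is_unmodified(doc: RawDocument) -> bool:
    """Re-hash the stored bytes and compare against the recorded digest."""
    return hashlib.sha256(doc.content).hexdigest() == doc.sha256
```

Recording the digest at ingestion means any later change to the stored bytes can be detected by re-hashing, which is what lets a raw copy act as a stable reference for downstream layers.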

In the dataset

  • Reasons documents (the published judgment + reasoning)
  • Award documents (where compensation was awarded)
  • Cost documents (where costs were ordered)
  • Default judgments where published

Not in the dataset

  • Settled cases that did not proceed to judgment
  • Withdrawn claims
  • Proceedings not selected for publication
  • Decisions made before the published register began (pre-Feb 2017)

02 — Extraction pipeline

Raw documents → structured records → analytical views.

Every decision passes through a documented extraction pipeline that converts raw published PDFs into structured records and analytical views. Each layer is reproducible and versioned independently.

  • 01 Raw: Documents ingested verbatim and stored immutably. Content-hashed on receipt. The source of truth for everything downstream.
  • 02 Structured: Field-level extraction from each document: parties, judges, claims, awards, dates, regions, outcomes. Validated through documented review processes.
  • 03 Analytical: Aggregated views: medians, distributions, sector-level statistics, time series. Updated weekly as new decisions are processed.
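To make the step from structured records to analytical views concrete, the sketch below shows one hypothetical structured record and one aggregated view (median award by claim type). The `DecisionRecord` schema and its field names are assumptions made for illustration; the real schema is not described on this page.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class DecisionRecord:
    # Hypothetical subset of the extracted fields.
    case_id: str
    claim_type: str
    outcome: str
    award_gbp: Optional[float]  # None where no compensation was awarded


def median_award_by_claim_type(records: Iterable[DecisionRecord]) -> Dict[str, float]:
    """Group awarded amounts by claim type and take the median of each group."""
    awards = defaultdict(list)
    for record in records:
        if record.award_gbp is not None:  # skip decisions with no award
            awards[record.claim_type].append(record.award_gbp)
    return {claim: median(values) for claim, values in awards.items()}
```

Decisions without an award are left out of the median rather than counted as zero. That is a design choice in this sketch, and a different choice would produce different figures.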

03 — Coverage

What the dataset can — and cannot — show.

The single biggest limitation of any tribunal dataset is that settled cases — by far the majority — are not in it. We say this loudly because it matters for every figure we publish.

Phase of proceedings     | What's observable           | What's not
Pre-claim                | Nothing observable          | All pre-claim activity
Internal grievance       | Nothing observable          | Outcomes, settlements, dismissals
ACAS conciliation        | Nothing observable          | All ACAS-conciliated outcomes
Settled before hearing   | Nothing observable          | All settlements (the majority of cases)
Withdrawn before hearing | Nothing observable          | All withdrawals
Heard and judged         | Observable                  | n/a
Settled at hearing       | Partial (only if announced) | Confidential settlements
Decided                  | Fully observable            | n/a
Appealed                 | Observable via EAT register | Settlements during appeal

04 — Accuracy & validation

How we know the figures are right.

Critical fields (awards, dates, parties, outcomes) are validated through multiple methods before publication. Lower-stakes fields are validated through periodic sampled audits across claim types and document templates.
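One way a sampled audit "across claim types and document templates" could be drawn is a stratified sample: group records by claim type and template, then pick a fixed number from each group. The sketch below assumes that approach. The function name, the dictionary keys `claim_type` and `template`, and the use of a fixed seed are all hypothetical, not a description of our audit tooling.

```python
import random
from collections import defaultdict


def stratified_audit_sample(records, per_stratum, seed=0):
    """Pick up to `per_stratum` records from each (claim_type, template) group."""
    strata = defaultdict(list)
    for record in records:
        strata[(record["claim_type"], record["template"])].append(record)

    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    sample = []
    for key in sorted(strata):  # sorted for a stable group order
        group = strata[key]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample
```

Sampling per group rather than across the whole dataset keeps rare claim types and unusual templates from being crowded out by the common ones.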

We publish the limitations we know about — see the next section — and we update them as we discover new ones. We do not publish single-number accuracy figures because accuracy varies by field, by year, and by document template; a single number would mislead more than inform.

05 — Versioning & provenance

Every figure traces back.

The dataset is versioned by quarter. Every figure published on yerty.co.uk references a specific dataset version. When the version updates (weekly ingestion), historical figures are preserved — so any cited figure remains reproducible against the version that produced it.
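The idea that a cited figure stays reproducible against the version that produced it can be sketched as follows. This is an illustration under assumptions, not our storage design: `PublishedFigure`, `publish_median`, `reproduces`, and the `archive` mapping from version label to preserved data are hypothetical. The version label format follows the "2026.Q2" style used in the citation example.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List


@dataclass(frozen=True)
class PublishedFigure:
    name: str
    value: float
    dataset_version: str  # e.g. "2026.Q2"


def publish_median(name: str, version: str, archive: Dict[str, List[float]]) -> PublishedFigure:
    """Compute a figure and record which dataset version produced it."""
    return PublishedFigure(name, median(archive[version]), version)


def reproduces(figure: PublishedFigure, archive: Dict[str, List[float]]) -> bool:
    """Recompute against the version the figure cites, not the latest one."""
    return median(archive[figure.dataset_version]) == figure.value
```

Because the figure carries its own version label, later versions can add decisions without invalidating earlier citations.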

06 — Limitations

The honest list of what we can't do.

  • Settled cases are the majority of disputes; they're not in the data.
  • Awards are net of contributory conduct deductions — true claimant compensation may differ.
  • "Industry" classifications are derived from Companies House data; some respondents have ambiguous primary industry.
  • Judge names are canonicalised; variants are merged conservatively, so some minor name variants may remain unmerged.
  • Time-to-hearing is from claim filing to first substantive hearing; preliminary hearings are excluded.
  • Older decisions (pre-2020) have less complete metadata extraction.
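The time-to-hearing definition in the list above (claim filing to first substantive hearing, preliminary hearings excluded) can be expressed as a short calculation. The `Hearing` type and the `"preliminary"` / `"substantive"` labels are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Hearing:
    held_on: date
    kind: str  # hypothetical labels: "preliminary" or "substantive"


def time_to_hearing_days(filed_on: date, hearings: List[Hearing]) -> Optional[int]:
    """Days from claim filing to the earliest substantive hearing."""
    substantive = [h.held_on for h in hearings if h.kind == "substantive"]
    if not substantive:
        return None  # no substantive hearing recorded, so nothing to measure
    return (min(substantive) - filed_on).days
```

Under this definition a case with an early preliminary hearing and a late substantive hearing is measured to the later date, which is worth keeping in mind when comparing against figures that count any first hearing.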

07 — Updates

How often the data refreshes.

The published register updates daily; Yerty ingests weekly, on Mondays. Extraction and validation run within 48 hours of ingestion, so a new decision appears in the Explorer by the Wednesday after the first Monday ingestion that follows its publication.
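The schedule above implies a simple date calculation for when a decision should be visible. The sketch below works it through. One assumption is not stated on this page: a decision published on a Monday is treated as caught by that same Monday's ingestion. The function names are hypothetical.

```python
from datetime import date, timedelta

MONDAY = 0  # date.weekday() numbering: Monday is 0, Sunday is 6


def next_ingestion(published_on: date) -> date:
    """First Monday on or after the publication date.

    Assumption: a decision published on a Monday is ingested that same day.
    """
    days_ahead = (MONDAY - published_on.weekday()) % 7
    return published_on + timedelta(days=days_ahead)


def expected_in_explorer(published_on: date) -> date:
    """Ingestion Monday plus the 48-hour extraction and validation window."""
    return next_ingestion(published_on) + timedelta(days=2)
```

For example, a decision published on Tuesday 2 June 2026 would be ingested on Monday 8 June and expected in the Explorer by Wednesday 10 June, so the lag under this schedule ranges from two to eight days.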

08 — Citing Yerty

How to cite the dataset.

Academic (APA)

Yerty Insights. (2026). Tribunal Data Explorer (Version 2026.Q2)
[Data set]. RightsTech Ltd. https://yerty.co.uk/insights

Journalistic

Source: Yerty Insights, published ET decisions

09 — Ownership and use

The dataset.

The Tribunal Data Explorer dataset, including its structure, organisation, and analytical views, is a database within the meaning of the Copyright and Rights in Databases Regulations 1997 and is the property of RightsTech Ltd. The published tribunal decisions on which it is built are public; the structured database is not.

Use of the dataset is governed by the Terms of Service. Citation and excerpting for journalism, research, and educational purposes is welcome with attribution. Programmatic access for commercial or competitive purposes requires permission — contact licensing@yerty.co.uk.

Read the Terms of Service →

10 — Questions about the methodology?

Get in touch.

For questions about coverage, extraction accuracy, version-specific reproducibility, or anything else technical — email methodology@yerty.co.uk. We try to answer within 48 hours.


The Tribunal Data Explorer is built on published UK employment tribunal decisions. Yerty Insights is a product of RightsTech Ltd (Companies House No. 16602847, England & Wales). Yerty provides data and analytics — not legal advice.