The canonical record
Data and corrections
The record
Complete results, layers that fill in
Every result is known back to 1886. The richer detail thins as the record reaches into the Victorian past; this section shows exactly where, layer by layer and decade by decade.
- Results: every result known, back to 1886
- Complete goalscorer rows: 81%
- Starting XIs: 100%
- Attendances: 99%
Slice: all 6,028 matches in the database, grouped by decade. Every match carries a result row.
Intensity and the cell value are the share of each decade's matches that carry the layer. The all-time column on the right is the same share across the whole record.
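The share calculation described above can be sketched in a few lines. This is an illustration only: the `year`, `scorers`, and `lineup` field names are assumptions, and the sample rows are toy data rather than rows from the record.

```python
from collections import defaultdict

def decade_of(year: int) -> int:
    """Return the decade a year falls in, e.g. 1886 -> 1880."""
    return year - year % 10

def coverage_by_decade(matches, layer):
    """Share of each decade's matches that carry `layer`, plus the all-time share.

    `matches` is an iterable of dicts with a `year` and one boolean per layer;
    both field names are assumptions made for this sketch.
    """
    totals = defaultdict(int)
    covered = defaultdict(int)
    for m in matches:
        d = decade_of(m["year"])
        totals[d] += 1
        covered[d] += bool(m.get(layer))
    if not totals:
        return {}, 0.0
    shares = {d: covered[d] / totals[d] for d in sorted(totals)}
    all_time = sum(covered.values()) / sum(totals.values())
    return shares, all_time

# Toy rows, not real data from the record.
sample = [
    {"year": 1886, "lineup": True, "scorers": False},
    {"year": 1889, "lineup": True, "scorers": True},
    {"year": 1994, "lineup": True, "scorers": True},
]
shares, all_time = coverage_by_decade(sample, "scorers")
print(shares, round(all_time, 2))
```

Each per-decade value uses that decade's own match count as the denominator, while the all-time value divides by the whole record, which is why the two can differ sharply for thinly covered early decades.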
The other cut
Coverage by competition type
The same layers, sliced by competition rather than by decade. League fixtures make up most of the record (4,860 of 6,028 matches); by share, Test Matches and Shields and Super Cups are the thinnest.
- League: 4,860 matches; scorers 3,922 (81%); lineups 4,859 (100%)
- FA Cup: 464 matches; scorers 380 (82%); lineups 464 (100%)
- Europe: 430 matches; scorers 362 (84%); lineups 430 (100%)
- League Cup: 218 matches; scorers 180 (83%); lineups 217 (100%)
- Shields and Super Cups: 40 matches; scorers 31 (78%); lineups 36 (90%)
- Test Matches: 8 matches; scorers 4 (50%); lineups 8 (100%)
- World: 8 matches; scorers 7 (88%); lineups 8 (100%)
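As a quick arithmetic check, the competition figures listed above can be summed and compared with the headline numbers. The counts are copied from this page; the `pct` helper and its round-half-up rule are an assumption about how the published percentages were rounded.

```python
# (matches, with scorers, with lineups) per competition, copied from the figures above.
coverage = {
    "League": (4860, 3922, 4859),
    "FA Cup": (464, 380, 464),
    "Europe": (430, 362, 430),
    "League Cup": (218, 180, 217),
    "Shields and Super Cups": (40, 31, 36),
    "Test Matches": (8, 4, 8),
    "World": (8, 7, 8),
}

def pct(part: int, whole: int) -> int:
    """Whole-number percentage, rounding halves up (87.5 -> 88)."""
    return (200 * part + whole) // (2 * whole)

total_matches = sum(m for m, _, _ in coverage.values())
total_scorers = sum(s for _, s, _ in coverage.values())
total_lineups = sum(l for _, _, l in coverage.values())

print(total_matches)                      # 6028
print(pct(total_scorers, total_matches))  # 81
print(pct(total_lineups, total_matches))  # 100
```

The totals agree with the 6,028 matches and the 81% and 100% headline shares quoted for goalscorer rows and starting XIs.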
Provenance
How the record is built and corrected
Every layered fact cites a source, and corrections follow one contract. The queue below is where the faint cells above turn into work.
Sources
Grouped by upstream source where several use cases share a lineage. Expand a family to see how each use case is applied and an example on file.
High-value gaps
United scored here but the goal-by-goal row isn't complete yet.
Showing 12 of 34. The queue prioritises recent post-war United goalscorer gaps, then opposition goals, lineups, and attendance. Older archive work can still be added whenever a citation is strong.
For developers
Public read-only API
The API serves the same read-only record used by the app. Responses are plain JSON with permissive CORS, pagination on the large lists, and an attribution block that points back to this coverage ledger.
- /api/v1/meta: Dataset metadata and coverage counts
- /api/v1/matches: Paginated matches, filterable by date, season, venue, and opponent
- /api/v1/matches/{id}: One match with events, lineups, Elo, and sources
- /api/v1/seasons: Season summaries by competition
- /api/v1/players: Player totals with pagination
- /api/v1/opponents: Opponent head-to-head records
Treat result rows as the stable core and read facet flags before using event, lineup, assist, card, attendance, or source-derived fields as complete historical totals.
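A minimal sketch of that advice, under several assumptions: the host is a placeholder, the query parameter names (`season`, `page`) are guesses at how the documented filters and pagination are exposed, and the `facets.scorers_complete` flag and `events` shape are hypothetical stand-ins for whatever the real responses contain.

```python
from urllib.parse import urlencode

BASE_URL = "https://example.invalid"  # placeholder host, not the real one

def matches_url(**filters) -> str:
    """Build a /api/v1/matches URL; the query parameter names are guesses."""
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}/api/v1/matches" + (f"?{query}" if query else "")

def goal_events(match: dict):
    """Return goal events only when the match flags its scorer rows complete.

    Returns None when the (hypothetical) `facets.scorers_complete` flag is
    missing or false, so callers cannot mistake a partial list for a total.
    """
    if not match.get("facets", {}).get("scorers_complete"):
        return None
    return [e for e in match.get("events", []) if e.get("type") == "goal"]

# Toy payloads in the assumed shape, not real API responses.
complete = {"facets": {"scorers_complete": True},
            "events": [{"type": "goal"}, {"type": "card"}]}
partial = {"facets": {"scorers_complete": False},
           "events": [{"type": "goal"}]}

print(matches_url(season="1998-99", page=2))
print(goal_events(complete), goal_events(partial))
```

Returning `None` instead of an empty list keeps "no goals recorded yet" distinct from "no goals scored", which is the distinction the facet flags exist to protect.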
Dataset downloads
Each production build exports flat files from the compiled SQLite database, so the downloadable release matches the app and API. Use the manifest first to see file counts and build metadata.
- manifest.json: Release metadata and row counts
- matches.csv: Fixture spine and match facts
- events.csv: Goal, assist, and card event rows
- lineups.csv: Starting, bench, and substitution rows
- elo_history.csv: Pre/post-match ratings and expectancies
- season_summaries.csv: Competition season summaries
- players.csv: All-time player totals
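Reading the manifest first could look like the sketch below, which compares each CSV's data-row count with the count the manifest declares. The `row_counts` key is an assumption about the manifest's layout, not a documented field, so adjust it to match the real file.

```python
import csv
import json
from pathlib import Path

def check_release(folder: Path) -> dict:
    """Compare each CSV's data-row count with the count the manifest declares.

    Assumes manifest.json has a `row_counts` object mapping file name to
    row count; that key is a guess at the real manifest's structure.
    """
    manifest = json.loads((folder / "manifest.json").read_text(encoding="utf-8"))
    report = {}
    for name, expected in manifest.get("row_counts", {}).items():
        with open(folder / name, newline="", encoding="utf-8") as fh:
            # Count parsed records, then drop the header row.
            actual = max(sum(1 for _ in csv.reader(fh)) - 1, 0)
        report[name] = (expected, actual, expected == actual)
    return report
```

Calling `check_release(Path("release"))` on an unpacked download returns one `(expected, actual, matches)` tuple per file, so a truncated or stale CSV shows up before any analysis depends on it.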
Also layered onto lineups: 2,474 matches with used-substitute records and 882 with a named bench.