      Pure Gusto Research · Reference document

      Methodology

      Data sources, definitions, dataset construction, chain classification, and known limitations behind England's Coffee Shops: The Definitive Study.

      Updated 4 June 2026 · Companion to The Definitive Study · England only
      01 · Overview

      What the study is

      This study analyses the distribution, density, and characteristics of coffee venues across England, enriched with official demographic and deprivation data at neighbourhood level. It combines Google Maps business-listing data with Census 2021 population statistics, the Index of Multiple Deprivation 2025, and ONS postcode geography.

      The methodology was designed to produce findings that are reproducible, auditable, and robust enough to withstand journalistic scrutiny.

      02 · Data sources

      Where the data comes from

      Venue data

      Business-listing data was collected from Google Maps via a commercial data aggregator. Each listing includes business name, address, postcode, category, customer rating, review count, price level, claimed status, and operational status. Data reflects Google Maps listings as of April 2026. Coverage depends on businesses having an active Google Maps listing — venues without a listing are not captured.

      Geographic data

      Postcode geography was sourced from the ONS Postcode Directory (ONSPD), linking each postcode to its Lower Super Output Area (LSOA), Middle Super Output Area (MSOA), and Local Authority District (LAD), so venue data can be joined to official demographic statistics at neighbourhood level.

      New Towns data

      LADs are flagged for their relationship to England's post-war New Towns programme based on designations under the New Towns Acts of 1946 to 1980, compiled from the Town and Country Planning Association's “Explore the UK's New Towns” index and cross-checked against the original designation orders on legislation.gov.uk. The 12 candidate towns announced by the March 2026 New Towns Taskforce are excluded as future-state proposals rather than developed settlements.

      Rural-Urban classification data

      LADs are classified by the ONS 2021 Rural-Urban Classification (RUC21), built on Census 2021 outputs. A four-class base — Urban, Intermediate urban, Intermediate rural, Majority rural — is used as the ordinal for cohort tests. LADs created by the 2023 local government reorganisation are reconciled to RUC21 using the ONS post-reorganisation lookups.

      Coastal classification data

      Coastal LAD flags are derived from the ONS Built-Up Area Coastal Classification, February 2024. This is a permissive classification — a LAD with a single coastal Built-Up Area at its edge counts as coastal — so “coastal” here means “geographically touches the coast” rather than “economically coastal-dependent.”

      Demographic data

      All demographic figures derive from the England and Wales Census 2021 (ONS), provided at LSOA level (average population ~1,700): population and household counts, age profile, occupation by NS-SEC, industry of employment, and travel to work.

      Deprivation data

      Deprivation rankings are from the Index of Multiple Deprivation 2025 (MHCLG). IMD deciles run from 1 (most deprived 10%) to 10 (least deprived 10%).

      Earnings data

      Earnings figures are from the Annual Survey of Hours and Earnings (ASHE) 2024 (ONS), at Local Authority District level — every venue in the same local authority receives the same earnings figures.

      House price data

      Residential transaction data is from HM Land Registry Price Paid Data (2022–2026), aggregated to LSOA level.

      Broadband coverage data

      Fixed broadband coverage figures are from Ofcom Connected Nations 2023, aggregated to LSOA.

      Business count data

      Business counts are from the ONS/NOMIS UK Business Counts (IDBR) 2025, at Local Authority District level.

      Wellbeing data

      Personal wellbeing scores are from the ONS Personal Well-being Estimates by Local Authority, 2022-23 (life satisfaction, happiness, worthwhile, anxiety). The anxiety measure is directionally inverted — a higher score indicates greater anxiety.

      Higher education student data

      Student enrolment figures are from the HESA Detailed Table dt051, 2023/24, with provider postcodes from the Office for Students registered providers list. Data covers 264 English HE institutions across 109 local authorities. The Open University (Milton Keynes) is excluded from campus-based footfall analysis as its students study primarily by distance learning.

      03 · Geographic scope

      England only

      This study covers England only. The Index of Multiple Deprivation is England-specific, and the Census and ONS postcode data are used for English areas only. Wales, Scotland, and Northern Ireland are not included.

      04 · Dataset construction

      How the dataset was built

      1. Venue collection

        Venues were collected using targeted category searches for coffee- and café-related venue types across all postcode sectors in England. A supplementary gap-fill collection was run for postcode sectors that returned no results in the primary search, filtered to coffee-relevant categories and used only where the primary collection returned nothing for that sector.

      2. Deduplication and quality filtering

        The following records were excluded from all analysis: permanently closed venues; duplicate listings (only the primary listing retained); venues on terminated or unmatched postcodes; and petrol-station venues, including forecourt café brands operating inside petrol stations (excluded on the grounds that the primary purpose of the visit is fuel, not coffee).

      3. Geographic enrichment

        Each venue's postcode was matched to the ONS Postcode Directory to obtain LSOA, MSOA, and Local Authority District codes and the postcode sector.

      4. Demographic enrichment

        Each venue was enriched with Census 2021 and IMD 2025 at LSOA level; ASHE 2024 earnings, IDBR 2025 business counts, ONS Wellbeing 2022-23, and HESA student enrolments at Local Authority District level; Land Registry Price Paid Data at LSOA level (LSOAs with fewer than 5 transactions set to NULL); and Ofcom Connected Nations 2023 aggregated to LSOA. LAD-level sources assign identical values to all venues in the same local authority and are suitable for cross-LAD comparison, not for comparing neighbourhoods within a council area.
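
      Steps 2 to 4 above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the study's actual code: the field names (`status`, `is_primary_listing`, `is_petrol_station`), the status label, the placeholder postcode and area codes, and the use of a median as the LSOA price aggregate are all hypothetical. Only the exclusion rules and the fewer-than-5-transactions threshold come from the text.

```python
from collections import defaultdict
from statistics import median

MIN_TRANSACTIONS = 5  # suppression threshold stated in step 4


def normalise_postcode(raw: str) -> str:
    """Uppercase and remove whitespace so 'ab1 2cd' and 'AB12CD' compare equal."""
    return "".join(raw.split()).upper()


def postcode_sector(raw: str) -> str:
    """Outward code plus the first character of the inward code, e.g. 'AB1 2'."""
    compact = normalise_postcode(raw)
    return f"{compact[:-3]} {compact[-3]}"


def passes_quality_filter(venue: dict) -> bool:
    """Step 2: drop closed, duplicate, and petrol-station listings."""
    return (
        venue["status"] != "CLOSED_PERMANENTLY"  # assumed status label
        and venue["is_primary_listing"]
        and not venue["is_petrol_station"]
    )


def enrich_geography(venue: dict, directory: dict):
    """Step 3: attach area codes; None signals an unmatched postcode."""
    codes = directory.get(normalise_postcode(venue["postcode"]))
    if codes is None:
        return None
    return {**venue, **codes, "sector": postcode_sector(venue["postcode"])}


def lsoa_prices(transactions) -> dict:
    """Step 4: per-LSOA price (median is an assumption), None below the threshold."""
    by_lsoa = defaultdict(list)
    for lsoa, price in transactions:
        by_lsoa[lsoa].append(price)
    return {
        lsoa: median(prices) if len(prices) >= MIN_TRANSACTIONS else None
        for lsoa, prices in by_lsoa.items()
    }


def build_dataset(venues, directory, transactions) -> list:
    prices = lsoa_prices(transactions)
    dataset = []
    for venue in venues:
        if not passes_quality_filter(venue):
            continue
        enriched = enrich_geography(venue, directory)
        if enriched is None:  # terminated or unmatched postcode
            continue
        enriched["house_price"] = prices.get(enriched["lsoa"])
        dataset.append(enriched)
    return dataset


def make_venue(vid, postcode="ab1 2cd", status="OPERATIONAL", primary=True, petrol=False):
    """Helper for the made-up example rows below."""
    return {"id": vid, "postcode": postcode, "status": status,
            "is_primary_listing": primary, "is_petrol_station": petrol}


# Made-up example data; the postcode and area codes are placeholders.
directory = {"AB12CD": {"lsoa": "LSOA_A", "msoa": "MSOA_A", "lad": "LAD_A"}}
venues = [
    make_venue("v1"),                               # kept
    make_venue("v2", status="CLOSED_PERMANENTLY"),  # closed
    make_venue("v3", primary=False),                # duplicate listing
    make_venue("v4", postcode="ZZ99 9ZZ"),          # not in the directory
    make_venue("v5", petrol=True),                  # petrol station
]
transactions = [("LSOA_A", p) for p in (100_000, 150_000, 200_000, 250_000, 300_000)]
transactions.append(("LSOA_B", 500_000))            # only one sale: suppressed

dataset = build_dataset(venues, directory, transactions)
```

      In this toy run only `v1` survives. Returning `None` for an unmatched postcode or a thin-market LSOA keeps missing values explicit rather than silently imputing them.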

      05 · Definitions

      What we mean by “coffee shop”

      A · Broad definition: “coffee-serving venues”

      All venues in the dataset after quality filtering: dedicated coffee shops and cafés alongside other venue types where coffee is served as part of a wider offer (bakeries, delis, sandwich shops, garden centres, farm shops, community centres).

      47,976 venues · broad

      This captures the full ecosystem of places where people drink coffee in England. It is broader than industry definitions of “coffee shops” and should not be compared directly to industry market-size estimates without adjustment.

      B · Specialist definition: “dedicated coffee shops and cafés”

      A subset of the broad definition, restricted to venues whose Google Maps primary category indicates that coffee or café service is the main purpose of the venue: dedicated coffee shops, cafés, coffee roasters, tea rooms, and closely related café types. Accented and unaccented spellings (café / cafe) are treated as equivalent so the boundary is not an artefact of spelling.

      35,040 venues · specialist

      This definition is closer in scope to commercial industry estimates of the UK coffee shop market, which typically count only specialist operators whose primary revenue is coffee. The remaining gap to industry estimates reflects this study's inclusion of community cafés, institutional cafés, and smaller operators not tracked commercially.
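
      The accent-insensitive category matching used for the specialist definition can be sketched as below. This is an illustrative sketch only: the category strings are an assumed excerpt based on the types named above, not the study's actual category list, and `fold_accents` and `is_specialist` are hypothetical names.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Lowercase and strip diacritics so 'Café' and 'cafe' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()


# Assumed, illustrative category labels; the real list is not published here.
SPECIALIST_CATEGORIES = {
    fold_accents(name)
    for name in ("Coffee shop", "Café", "Coffee roasters", "Tea room")
}


def is_specialist(primary_category: str) -> bool:
    """True when the folded primary category is in the specialist set."""
    return fold_accents(primary_category) in SPECIALIST_CATEGORIES
```

      With this folding, `is_specialist("Cafe")` and `is_specialist("CAFÉ")` give the same answer, so a venue's membership does not depend on how its category happens to be spelled.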

      C · Chain dominance across both universes

      Chain venues account for 16.0% of specialist coffee shops and cafés (5,619 of 35,040), close to the 15.2% chain share under the broad definition. There is no categorisation of England's coffee market under which chains substantially dominate. In the inverse subset (the 12,936 venues serving coffee outside the strict coffee-shop definition), the chain share is 13.0%. Chain operators cluster in a small number of clearly defined commercial niches; independent operators remain dominant across the remainder of the market.

      06 · Chain classification

      How chains were identified

      Venues were classified as chain or independent by title matching against a maintained list of nationally-recognised multi-site operators. The classification is applied once to the full dataset, so every figure in this study uses the same chain/independent labels for the same venues. The list is reviewed periodically; when it changes, the entire dataset is re-classified.

      Any venue not matched against this list is classified as independent — which will include some small regional chains and franchise operators. The independent figure should therefore be read as “not a nationally recognised chain” rather than “single-site operator.”

      The list spans every category in which national operators appear, so the chain/independent split is not skewed by omitting a sector. The largest and most recognisable include specialist coffee chains (Costa, Starbucks, Caffè Nero, Coffee#1, Black Sheep Coffee, 200 Degrees), sandwich-and-coffee chains (Pret A Manger, Subway), national bakeries (Greggs, Gail's, PAUL, Patisserie Valerie), supermarket cafés (Tesco, Sainsbury's, M&S, Morrisons), and pub and restaurant operators that serve coffee at scale (Wetherspoon, The Lounges). The full maintained list is longer; what every figure in this study depends on is the classification rule, applied identically to every venue — not the length of the list.
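
      A title-matching rule of the kind described in this section might look like the sketch below. It is a simplified illustration, not the study's classifier: `CHAIN_NAMES` is a short excerpt of the operators named above, and the whole-word matching strategy and function names are assumptions.

```python
import re
import unicodedata

# Short excerpt of operators named in the text; the maintained list is longer.
CHAIN_NAMES = ["Costa", "Starbucks", "Caffè Nero", "Pret A Manger", "Greggs"]


def normalise(title: str) -> str:
    """Strip accents, lowercase, and collapse punctuation to single spaces."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", no_accents.casefold()).strip()


# Whole-word patterns, so 'costa' does not match inside 'costas'.
CHAIN_PATTERNS = [
    re.compile(rf"\b{re.escape(normalise(name))}\b") for name in CHAIN_NAMES
]


def classify(title: str) -> str:
    """Return 'chain' if any listed operator name appears in the title."""
    folded = normalise(title)
    if any(pattern.search(folded) for pattern in CHAIN_PATTERNS):
        return "chain"
    return "independent"  # i.e. not a nationally recognised chain
```

      For example, `classify("CAFFÈ NERO, Leeds")` returns `"chain"` while `classify("Costas Grill")` returns `"independent"`. A rule like this is still approximate: an unrelated business whose name contains a listed word as a whole word would be matched, which is why title matching is treated as an approximation.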

      07 · Cohorts tested

      How groups of LADs are defined

      Several findings test whether coffee venue patterns differ across groups of LADs:

      • New Towns — LADs where a designated post-war New Town dominates the authority (e.g. Harlow, Basildon, Stevenage), versus those with no designated New Town. Towns expanded under the Town Development Act 1952 (e.g. Basingstoke) are not treated as New Towns, as they expanded existing settlements rather than designating new ones.
      • Rural-urban gradient — the four-class RUC21 base used as an ordinal, tested for a monotonic dose-response rather than a fixed linear step between classes.
      • Coastal — LADs touching the coast versus those that do not, with the caveat that the classification is permissive (a single coastal area qualifies a LAD), so a coastal result is not evidence of a “coastal-economy” effect.
      • University towns — LADs with both a high student-to-resident ratio and a material absolute student population. The Open University's distance-learning enrolment is removed from Milton Keynes before this test, since it does not represent campus footfall.
      • City of London is excluded from all density analysis: its ~9,000 residents against a ~500,000 daytime working population make a resident-denominator density meaningless. The rule applies wherever density enters a ranking, distribution, or cohort comparison.

      The rural and deprivation cohorts are treated as independent dimensions: a cross-test confirms they are only weakly correlated, so the rural rating premium is not a deprivation effect and the deprivation-rating null is not undermined by it.

      08 · Statistical approach

      How comparative claims were tested

      Findings that make comparative claims are supported by formal statistical tests, applied consistently so results stay comparable. Cohort comparisons use non-parametric tests (Mann-Whitney U with Cliff's δ as the effect size), chosen because per-LAD distributions are skewed and contain outliers. Dose-response on the rural-urban ordinal uses Spearman ρ. Per-LAD rating confidence intervals are bootstrapped with the venue as the unit of resampling. Where a family of correlations is tested together, a Bonferroni correction is applied so a single chance result cannot carry a finding.
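
      The effect size and the multiple-comparison correction named above can be illustrated in a few lines. This is a sketch under assumptions: the sample values are made up, the 0.05 significance level is assumed rather than stated in the study, and the p-value calculation for the Mann-Whitney test is omitted.

```python
def mann_whitney_u(xs, ys):
    """U statistic for xs: pairs where x > y count 1, ties count 0.5."""
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in xs
        for y in ys
    )


def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all cross-group pairs."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def bonferroni(p_values, alpha=0.05):
    """Flag each p-value as significant at alpha / m (alpha is an assumption)."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Made-up per-LAD values for two cohorts.
cohort_a = [3, 4, 5]
cohort_b = [1, 2, 3]

u = mann_whitney_u(cohort_a, cohort_b)
delta = cliffs_delta(cohort_a, cohort_b)

# The two statistics are linked: delta = 2U / (n_a * n_b) - 1.
delta_from_u = 2 * u / (len(cohort_a) * len(cohort_b)) - 1

significant = bonferroni([0.004, 0.02, 0.03])
```

      Cliff's delta ranges from -1 to 1 and depends only on the ordering of values, which is why it pairs naturally with a rank-based test on skewed per-LAD distributions. In the Bonferroni example, three tests share the assumed 0.05 level, so only p-values at or below 0.05 / 3 are flagged.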

      Findings were held to a pre-specified standard before being given their place in the study. One prospective primary finding, the relationship between local wellbeing and coffee provision, was tested against a three-part rule set in advance: a minimum effect size, sign coherence, and survival of multiple-comparison correction. It failed all three. Rather than being softened or quietly dropped, it was moved to the supplementary section, and the rule was made precedent for any future finding. This is the standard the study holds itself to: claims that do not clear the bar are demoted, not dressed up.

      09 · Population & density

      How density is calculated

      Population at Local Authority District level is the sum of Census 2021 LSOA populations across every LSOA in the LAD, including residential LSOAs with no coffee venues. This produces a true LAD population total consistent with ONS published figures, used as the denominator for every per-capita calculation so figures are comparable across findings. Coffee density is expressed as venues per 10,000 residents.
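
      The density calculation described above reduces to a sum and a ratio. The sketch below assumes a simple row layout and uses made-up LAD names, populations, and venue counts; the City of London exclusion follows the rule stated in the cohorts section.

```python
from collections import Counter, defaultdict

EXCLUDED_LADS = frozenset({"City of London"})


def lad_density(lsoa_rows, venue_lads):
    """Venues per 10,000 residents for each LAD.

    lsoa_rows: (lad, population) for EVERY LSOA, including those with no venues.
    venue_lads: the LAD of each venue, one entry per venue.
    """
    population = defaultdict(int)
    for lad, residents in lsoa_rows:
        population[lad] += residents
    venue_counts = Counter(venue_lads)
    return {
        lad: 10_000 * venue_counts.get(lad, 0) / residents
        for lad, residents in population.items()
        if lad not in EXCLUDED_LADS and residents > 0
    }


# Made-up example rows.
lsoa_rows = [
    ("LAD_A", 1_600), ("LAD_A", 1_700), ("LAD_A", 1_700),  # 5,000 residents
    ("LAD_B", 2_000),
    ("City of London", 9_000),
]
venue_lads = ["LAD_A"] * 4 + ["LAD_B"] + ["City of London"] * 50

density = lad_density(lsoa_rows, venue_lads)
```

      Here `LAD_A` has 4 venues across 5,000 residents, giving 8.0 venues per 10,000, and the City of London never appears in the output regardless of its venue count.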

      Correction note. Earlier versions of this study (prior to May 2026) used a venue-weighted population estimate that implicitly excluded residential areas with no coffee venues from the denominator, inflating density by roughly 1.5×–3.5× depending on the LAD, with the largest bias in rural and coastal authorities. The corrected resident-population denominator removes this bias and brings density into agreement with ONS LAD population estimates. Density figures published before May 2026 should be treated as superseded.
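
      A toy example shows why the superseded denominator inflated density. This sketch assumes the earlier estimate behaved like a sum over only those LSOAs that contain at least one venue, which is one reading of the correction note rather than a confirmed description of the earlier method; all numbers are made up.

```python
def density_per_10k(venue_count, population):
    """Venues per 10,000 residents."""
    return 10_000 * venue_count / population


# Toy LAD: (LSOA population, venue count) for each LSOA.
lsoas = [(1_500, 4), (1_700, 0), (1_800, 0)]

venues = sum(count for _, count in lsoas)
resident_population = sum(pop for pop, _ in lsoas)                      # all LSOAs
venue_lsoa_population = sum(pop for pop, count in lsoas if count > 0)   # assumed old basis

corrected = density_per_10k(venues, resident_population)
superseded = density_per_10k(venues, venue_lsoa_population)
inflation = superseded / corrected
```

      In this example the corrected density is 8.0 per 10,000, while the venue-only denominator gives roughly 26.7, an inflation of about 3.3×. The bias grows with the share of residents living in LSOAs that have no venues, which fits the note that rural and coastal authorities were most affected.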

      10 · Known limitations

      What this study can and cannot say

      Census 2021 data reflects patterns as of March 2021; areas with significant change since then may show outdated figures.

      Google Maps coverage is not uniform. Rural areas, newer businesses, and unclaimed listings may be under-represented; the study does not correct for coverage gaps.

      Two-source venue collection. Venue data combines a primary business-listings source with a Google Maps gap-fill for postcode sectors where the primary source returned zero results; all published figures are on the combined basis. Empirical analysis of the raw responses found the gap-fill did not in fact surface venues missing from the primary source — every unique venue it returned also appears in the primary dataset, because its radius-based search surfaced already-captured venues from adjacent sectors. Its share by LAD was concentrated in commuter and new-town areas with essentially no correlation to venue density (r = −0.03). The practical effect is a small, localised inflation in a minority of LADs; every finding was tested against this and is robust to it. Blackpool illustrates the magnitude: 11.13 venues per 10,000 on the combined basis, approximately 10.07 on a primary-source-only basis.

      Chain classification is based on title matching and is approximate. Franchised sites of listed brands (e.g. a Costa inside a hospital) are classified as chains; some small regional chains may be classified as independents.

      Ratings and review counts reflect Google Maps user behaviour; areas with lower digital engagement may have fewer reviews. All ratings analysis applies a minimum threshold of 20 reviews per venue.
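
      The review threshold combines with a votes-weighted average as in the sketch below. It assumes "votes-weighted" means each venue's rating is weighted by its review count, which is an interpretation rather than a documented formula; the sample venues are made up.

```python
MIN_REVIEWS = 20  # threshold stated in the text


def votes_weighted_rating(venues):
    """Review-count-weighted mean rating over venues with >= MIN_REVIEWS reviews.

    venues: iterable of (rating, review_count) pairs.
    Returns None when no venue meets the threshold.
    """
    eligible = [(rating, n) for rating, n in venues if n >= MIN_REVIEWS]
    total_reviews = sum(n for _, n in eligible)
    if total_reviews == 0:
        return None
    return sum(rating * n for rating, n in eligible) / total_reviews


# Made-up venues: the third has too few reviews and is ignored.
sample = [(4.8, 100), (4.0, 300), (5.0, 3)]
average = votes_weighted_rating(sample)
```

      Weighting by review count means a venue with 300 reviews moves the average three times as much as one with 100, and a 5.0 rating built on 3 reviews does not move it at all. For the sample above the result is approximately 4.2.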

      England only. No claims are made about Wales, Scotland, or Northern Ireland.

      Coverage gaps. 70 LSOAs have no IMD 2025 data and are excluded from deprivation analysis. ~200 local authorities have no registered HE institution and receive no student-enrichment values.

      Wellbeing data is from 2022-23 self-reported survey responses and may not capture later cost-of-living pressures.

      11 · Key figures for citation

      Numbers to quote

      Headline metrics & data sources at a glance
      Total venues analysed (broad): 47,976
      Total venues analysed (specialist): 35,040
      Local authorities covered: 309
      Postcode sectors covered: 7,152
      LSOAs covered: 15,545
      Chain share (broad definition): 15.2%
      Chain share (specialist definition): 16.0%
      Chain share (broad-not-specialist subset): 13.0%
      Average Google rating (national, votes-weighted, ≥20 reviews): 4.43 ★
      Average review count per venue: 299
      Data collection date: April 2026
      Demographic data source: Census 2021
      Deprivation data source: IoD 2025
      Earnings data source: ASHE 2024 (LAD level)
      Wellbeing data source: ONS Personal Well-being Estimates 2022-23
      Student data source: HESA dt051 2023/24 + OfS register
      HE institutions covered: 264 (109 local authorities)
      England HE students covered: ~2.4 million

      Chain share is computed as venue-weighted national share — total chain venues divided by total filtered venues (broad: 7,299 ÷ 47,976 = 15.2%; specialist: 5,619 ÷ 35,040 = 16.0%; broad-not-specialist: 1,680 ÷ 12,936 = 13.0%). A LAD-weighted average — the mean of per-LAD chain percentages — gives 15.52% for the broad definition and is the appropriate figure for LAD-vs-LAD comparisons. Both are correct under their respective definitions; the choice depends on the claim being made.
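
      The difference between the two chain-share figures can be reproduced in a few lines. The national counts are taken from the paragraph above; the two-LAD example is made up to show how the averages can diverge, and the function names are hypothetical.

```python
def venue_weighted_share(chain_venues, total_venues):
    """National share: total chain venues divided by total venues, as a percentage."""
    return 100 * chain_venues / total_venues


def lad_weighted_share(per_lad_counts):
    """Mean of per-LAD chain percentages; every LAD counts equally."""
    shares = [100 * chain / total for chain, total in per_lad_counts]
    return sum(shares) / len(shares)


# Counts quoted in the text.
broad = round(venue_weighted_share(7_299, 47_976), 1)
specialist = round(venue_weighted_share(5_619, 35_040), 1)
broad_not_specialist = round(venue_weighted_share(1_680, 12_936), 1)

# Made-up LADs: a large one at 10% chains and a small one at 20%.
toy_lads = [(90, 900), (20, 100)]
toy_venue_weighted = venue_weighted_share(110, 1_000)
toy_lad_weighted = lad_weighted_share(toy_lads)
```

      The quoted counts round to 15.2%, 16.0%, and 13.0%. In the toy example the venue-weighted share is 11.0% because the large LAD dominates, while the LAD-weighted share is 15.0% because each authority counts once; the same mechanism explains why the study's two broad-definition figures differ slightly.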

      About Pure Gusto

      Pure Gusto is one of the UK's leading independent coffee suppliers, roasting and supplying coffee beans, equipment and sundries to cafés, restaurants and hospitality businesses across England. We have worked with independent coffee-shop operators for over twenty years.

      puregusto.co.uk →

      Research partner — Sayu

      • Press: press@sayu.co.uk
      • Phone: 01642 664550
      • Scotswood House, Teesdale South
        Thornaby Place, Stockton-on-Tees
        TS17 6SB
      • sayu.co.uk/contact-us