- UK primary care EHR studies rarely address rare genetic diseases: 0.82% (47/5754) of publications identified across five databases.
- EHR analyses deliver population-level insights into phenotype variation, complications and management outcomes for multiple rare genetic conditions.
- Limited diagnostic coding constrains research, yet linkage, cohort designs and database scale support wider use of UK primary care EHRs.
Eur J Hum Genet. 2026 May 20. doi: 10.1038/s41431-026-02114-w. Online ahead of print.
ABSTRACT
Rare disease studies often rely on small, selected cohorts, are resource-intensive and difficult to scale. UK primary care electronic health record (EHR) databases provide population-based, longitudinal data, but their use for rare genetic disease research has not been systematically examined. Through systematic mapping of publications from five UK primary care EHR databases (CPRD, OPCRD, QResearch, SAIL Databank and THIN), we found that only 0.82% (47 of 5754) of studies reported on rare genetic diseases. Of these, 77% (36 of 47) linked to external datasets. Study designs included case-control, cross-sectional and cohort studies. Cohort designs predominated, often with individual-level matched comparators. Case ascertainment was primarily based on routinely recorded diagnostic codes. Most studies examined a single disease, collectively encompassing 23 conditions. There was a skew towards multisystem, neurological, autosomal dominant and single-gene disorders, with relatively higher population frequencies and therapeutic tractability. Rare disease sample sizes ranged from 21 to 5059 (median 392). The studies yielded important insights into phenotypic variation, phenotype expansion, complications and management outcomes, including findings not readily identifiable in traditional studies. Examples include higher prevalence of hereditary haemorrhagic telangiectasia in females, consistent with sex-modified phenotypic expression; non-skeletal complications and premature mortality in X-linked hypophosphataemia; and elevated malignancy risk in myotonic dystrophy type 1 with type 2 diabetes, potentially attenuated by metformin. In conclusion, UK primary care EHR databases are markedly underutilised for rare genetic diseases. For many conditions, limited availability of diagnostic codes is a constraint. However, the databases' demonstrated capacity, scale, scope and population representativeness support wider use.
PMID:42162269 | DOI:10.1038/s41431-026-02114-w