Longitudinal Education Outcomes (LEO)
The currently available Longitudinal Education Outcomes (LEO) dataset contains de-identified information on the characteristics, education, employment, benefits, and earnings of children and young people educated in England, followed up through to adulthood.
This is a unique source of information, with the potential to provide transformative insight and evidence on longer-term labour market outcomes and educational pathways.
This year has seen several key milestones in development work on LEO across members of the ADR UK partnership.
Synthetic Longitudinal Education Outcomes – England
Over the past year, ADR UK has explored the use of synthetic datasets to support research planning, development, training, and collaboration. Synthetic data is data that mimics real administrative data, but doesn’t include real people’s information. It is proving to be a valuable tool for smoothing the researcher journey.
Creating synthetic data for the LEO dataset was a key aim of an ADR England project which seeks to improve the dataset’s usability and usefulness.
This year, the project lead, based at UCL, and colleagues at the Department for Education (DfE) have successfully developed a synthetic version of LEO. The dataset is classified as ‘low fidelity’: it preserves certain properties of the original data (including patterns of missingness) but not the relationships between variables, which makes it fully non-disclosive.
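To make the idea of ‘low fidelity’ concrete, the short Python sketch below shows one simple way such data could be generated: each column is sampled on its own, so the share of each value (including missing values) is roughly retained while any relationship between columns is lost. This is a toy illustration of the general idea only, not the method used by the project; the function name, column names, and values are all hypothetical and are not real LEO variables.

```python
import random


def synthesise_low_fidelity(rows, n_rows, seed=0):
    """Build synthetic rows by sampling every column independently.

    Assumption: this is an illustrative toy, not the LEO project's method.
    """
    rng = random.Random(seed)
    columns = list(rows[0].keys())
    # Pool of observed values per column; None marks a missing value.
    pools = {col: [row[col] for row in rows] for col in columns}
    # Each cell is drawn on its own, so cross-column relationships are lost.
    return [
        {col: rng.choice(pools[col]) for col in columns}
        for _ in range(n_rows)
    ]


# Hypothetical toy records, not real LEO variables or values.
toy = [
    {"qualification": "degree", "earnings_band": "high"},
    {"qualification": "degree", "earnings_band": None},
    {"qualification": "gcse", "earnings_band": "low"},
    {"qualification": None, "earnings_band": "low"},
]

synthetic = synthesise_low_fidelity(toy, n_rows=1000, seed=42)

# Share of missing values in one column; expected to sit near the toy data's 25%.
missing_share = sum(r["earnings_band"] is None for r in synthetic) / len(synthetic)
```

In this sketch a synthetic row can pair any qualification with any earnings band, which is why data of this kind suits exploring structure and drafting code but not estimating relationships between variables.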
The synthetic LEO dataset allows researchers to familiarise themselves with the structure and characteristics of the real data before applying for secure access. This can help them assess dataset suitability and, in some cases, draft code outside the secure environment.
The potential benefits are wide-ranging: researchers are better prepared for projects and need less time within a trusted research environment to complete them; data owners receive more feasible applications; and trusted research environments face fewer unviable requests.
ADR UK is now working with multiple stakeholders to establish mechanisms for routine synthetic data production across all flagship datasets.
Note: LEO Synthetic Data will be made available via the UK Data Service as a safeguarded dataset.
Longitudinal Education Outcomes for Northern Ireland
ADR NI’s creation of the Education Outcomes Linkage 2018/19–2021/22 dataset has laid the foundation for developing LEO NI. Phase one is currently in development.
ADR NI is working with the NI Department for the Economy (DfE) Analytical Services Division and the Department of Education (DE) to develop this dataset. It will be a linked, de-identified dataset combining post-primary school data from DE with apprenticeships, further and higher education data from DfE. Later phases will include employment, earnings, and benefits data.
LEO NI will enable research into career paths from post-primary education through training and into the labour market. For the first time, we will be able to track how attendance, background, and training affect later outcomes - supporting more effective interventions. This modern, integrated approach aligns with UK-wide developments and promises to transform policymaking through evidence-based insights.
Scottish Longitudinal Education Outcomes – Scottish Universities
Scotland has a strong foundation of high-quality data supporting research on children, young people, and learners. In line with Scotland’s policy ambition for all children to grow up loved, safe and respected, work has focused on building a library of research-ready datasets to inform and improve outcomes.
The Scottish LEO dataset, which includes all university students, provides information on earnings, employment, and benefits linked to education records. Covering tax years 2005/06 to 2021/22 (with more years to follow), it contains approximately three million anonymised records. It is standalone, non-linkable, and ready to use without knowledge of SQL or data cleaning.
Scottish LEO opens up new opportunities to explore graduate pathways and labour market outcomes. Research questions it can address include:
- How many learners take conversion degrees such as teaching?
- Do outcomes differ between PhD and bachelor’s degree holders?
- Are some learners more likely than others to work in fields related to their studies?
Next steps include exploring additional variables and releasing the Scottish LEO Modern Apprenticeships dataset.
Insights into education outcomes in Wales
ADR Wales is working to acquire benefits and income datasets for linkage to other datasets within the SAIL Databank, to enable better understanding of longer-term educational outcomes for individuals. A breakthrough this year has been the ability to link HMRC employed income data to education datasets (alongside other de-identified administrative datasets held in SAIL, such as Census 2021, prisons and probation data, and GP and hospital records).
Using this data, accredited researchers can undertake approved, public good projects to investigate educational pathways and labour market outcomes. The team is working to acquire benefits data from the Department for Work and Pensions, which will increase understanding of individual and household economic status and how it relates to other factors.