Job summary
This is an exciting opportunity to join a leading health data engineering team, based at GSTT, but operating across Health Data for London and the AI Centre.
We are looking for a motivated individual with excellent data engineering and infrastructure skills, who can be independently forward deployed across NHS hospitals in London to lead development of infrastructure and data pipelines, working in SQL/dbt and using language AI for curating unstructured data. You will work on a pan-London Snowflake platform, solving key technical challenges to enable data-driven value for a population of >10 million.
Essential Criteria
- Relevant technical degree
- Proficient in SQL, dbt, Python, and orchestration frameworks
- Proficient in at least one modern cloud data platform
- Expertise in on prem/cloud infrastructure management
- Experience working in agile development teams with good development practices
- Expertise in NHS data models / data standards
- Ability to effectively break down complex analyses for non-technical stakeholders
- A strong desire to create real-world positive impacts for patients and the NHS
Desirable Criteria
- Expertise in software engineering, including RESTful API development
- Expertise in FHIR / HL7 development
- Expertise working with NLP pipelines for unstructured medical records
- Experience working with Real-World Data or EHR databases
- Experience building OMOP common data model pipelines
Main duties of the job
The Lead Data Engineer is a senior technical role that will:
- Design and lead on technical objectives for the AI Centre and Health Data for London
- Lead development of cloud data infrastructure and ELT pipelines across London hospitals and platforms
- Work with a dedicated AI Centre team to deploy language AI technologies for extracting and standardising unstructured clinical records
- Provide expert technical support for the standardisation of London data into research data models
- Co-ordinate, support, and upskill local analysts/engineers in cross-London collaborations to ensure alignment on projects and timelines
- Build robust technical solutions for automation of data pipelines and cohort creation for London research data delivery
- Contribute to deployment architectures for live tools built on top of London data platforms
- Contribute to academic publications and stakeholder presentations, and help produce materials, such as blog posts, that support public, patient, and community engagement
About us
AI Centre for Value-Based Healthcare
The AI, Data & Digital Innovation directorate is made up of data and technology experts - based in GSTT but working closely as a team with KCH and KCL.
The team forms part of the Artificial Intelligence Centre for Value-Based Healthcare - a consortium of NHS, academic, and industry partners from across the UK. This consortium offers expert professional technical delivery across data engineering, data science & AI development, and software engineering. Programmes include region-wide infrastructure delivery of cloud and federated platforms, multi-modal Real-World Data engineering, foundation model development, and development of different Language AI solutions.
London / GSTT Snowflake Platform
A secure data and research cloud platform that provides access to some of the broadest and deepest data in the NHS, including low latency patient-level data flows from primary care, linked to Acute Trust data.
Secure Data Environment (SDE) for London
The London SDE is a data, research, and analytics ecosystem that unites data across the London region. Health Data for London is the next iteration of this programme, delivering one of the best and most diverse research data assets in the world, linking data across care pathways for more than 10 million patients. The AI Centre is commissioned to deliver multi-modal data integrations for Health Data for London.
Job description
Job responsibilities
The Lead Data Engineer will be responsible for:
- Owning the building of SQL/Python pipelines (primarily in Snowflake and dbt) to extract data from different databases and raw sources, ending in generation of cohorts for research, analysis, and machine learning
- Designing and leading programmes related to ingestion and standardisation of structured and unstructured data within the London programme
- Ensuring technical outputs of such projects meet deliverables of Health Data for London
- Owning the design and development of data outputs for customers inside the NHS Trust and users of the London SDE
- Leading engagement with technical teams in NHS Hospital Trusts, to build consensus and drive collaboration
- Chairing meetings and technical workshops to update work packages across a complex multi-stakeholder and multi-institution environment across London
- Supporting, supervising, and upskilling more junior team members, either within the SDE programme, or within other NHS analytics teams, through oversight of per-project technical work and outputs
- Maintaining a central repository of reproducible code based on a common data model shared across London regions
Person Specification
Qualifications / Education
Essential
- Relevant technical degree (undergraduate or postgraduate)
Knowledge & experience
Essential
- Experience working in agile development teams with good development practices, including CI/CD, unit testing, version control
- Expertise in NHS data models (e.g. EHR data, SUS, HES, EMIS/TPP primary care data) and NHS data standards (e.g. SNOMED-CT, ICD-10)
- Ability to effectively break down complex analyses for non-technical stakeholders
- A strong desire to create real-world positive impacts for patients and the NHS, across themes such as healthcare inequalities, long-term conditions, and cancer care.
Desirable
- Experience working with Real-World Data or EHR databases
- Experience building OMOP common data model pipelines
Technical Expertise
Essential
- Proficient in SQL, dbt, Python, and orchestration frameworks
- Proficient in at least one modern cloud data platform (e.g. Snowflake / Databricks / BigQuery)
- Expertise in on-premises and cloud infrastructure management
Desirable
- Expertise in software engineering, including RESTful API development
- Expertise in FHIR / HL7 development
- Expertise working with natural language processing pipelines, including for entity extraction from unstructured medical records
Disclosure and Barring Service Check
This post is subject to the Rehabilitation of Offenders Act (Exceptions Order) 1975 and as such it will be necessary for a submission for Disclosure to be made to the Disclosure and Barring Service (formerly known as CRB) to check for any previous criminal convictions.
Applications from job seekers who require current Skilled Worker sponsorship to work in the UK are welcome and will be considered alongside all other applications. For further information visit the UK Visas and Immigration website.
From 6 April 2017, Skilled Worker applicants applying for entry clearance into the UK have had to present a criminal record certificate from each country in which they have resided continuously or cumulatively for 12 months or more in the past 10 years. Adult dependants (over 18 years old) are also subject to this requirement. Guidance can be found here: Criminal records checks for overseas applicants.