TL;DR
Lead Cloud Site Reliability Engineer (Fintech): Strengthen observability, reliability and operational excellence across the cloud estate (GCP and Azure), with an emphasis on SRE principles (SLIs, SLOs, error budgets). The role focuses on leading a team of SREs and driving improvements in observability, incident response and root cause analysis.
Location: Hybrid (2 days in office per week) in Halifax, Leeds or Manchester
Salary: £92,701 - £109,060
Company
Lloyds Banking Group is transforming itself into a modern, innovative, purposeful organisation, investing heavily in cloud, automation and engineering excellence.
What you will do
- Lead, coach, and develop a high‑performing SRE team.
- Partner with Product Owners and Engineering Leads to embed reliability into roadmaps and delivery decisions.
- Apply SRE principles (SLIs, SLOs, error budgets) to ensure services remain highly reliable, performant and scalable.
- Drive improvements in observability across metrics, logs, traces and events.
- Own Infrastructure‑as‑Code and CI/CD‑based environments, implementing enhancements and responding to operational change.
- Lead coordination of incident response and root cause analysis.
Requirements
- Proven experience applying SRE practices within Azure, GCP, or both.
- Strong understanding of SLIs, SLOs and error budgets.
- Experience ensuring reliability of production services, including availability, performance and recoverability.
- Hands‑on or leadership experience in incident and problem management.
- Background in software engineering or cloud engineering, with a good understanding of modern SDLC practices.
- Practical experience with DevOps, CI/CD and automation to improve service reliability.
Nice to have
- Certifications or strong experience with Google Cloud Platform and/or Microsoft Azure.
- Knowledge of Kubernetes, compute services, API management and large‑scale distributed systems.
- Experience with Terraform, Jenkins, or equivalent configuration/pipeline tooling.
- Ability to write and maintain scripts or code in languages such as Python, Bash, PowerShell or Groovy.
Culture & Benefits
- Competitive salary and performance‑related bonus
- 28 days holiday plus bank holidays
- Generous pension contribution
- Private medical insurance
- Flexible benefits to suit your lifestyle
- Hybrid working model and family‑friendly policies
- Access to wellbeing support, training and career development
