TL;DR
L3 Function Head (AI): Leading the establishment and scaling of a global L3 Support Line for server and rack infrastructure across Europe and the US, with an emphasis on incident command for high-severity production events and formal problem management. The role focuses on driving root-cause resolution, converting recurring failures into architectural improvements, and designing enterprise-grade support frameworks for key clients.
Location: Onsite in Amsterdam, Netherlands. This role involves leading a distributed L3 team across Europe and the US.
Company
Nebius is leading a new era in cloud computing to serve the global AI economy, creating tools and resources for customers to solve real-world challenges.
What you will do
- Act as Incident Commander for high-severity infrastructure incidents, driving structured triage and permanent root-cause fixes.
- Identify recurring failure patterns, implement scalable solutions, and lead quarterly reliability reviews.
- Design and scale the L3 operating model, including intake, prioritization, ownership, and escalation.
- Hire and grow a distributed L3 team across the EU and US.
- Define enterprise-grade support processes (SLA handling, escalation paths) and act as a senior escalation interface for customer issues.
Requirements
- Experience building or leading L3 / escalation support for datacenter server infrastructure.
- Strong Incident Commander experience in production environments.
- Background supporting enterprise customers under contractual SLAs.
- Proven ability to build incident and problem management processes from scratch.
- People leadership experience (hiring, coaching, scaling teams).
- Strong English communication skills.
Nice to have
- Deep Linux, hardware, and firmware troubleshooting capability.
- GPU server platform experience (e.g., NVIDIA diagnostics).
- Experience managing ODM/OEM escalations.
- Bash / basic Python scripting.
- Exposure to OCP-based platforms.
Culture & Benefits
- Competitive salary and comprehensive benefits package.
- Opportunities for professional growth.
- Flexible working arrangements.
- A dynamic and collaborative work environment that values initiative and innovation.
- Focus on cutting-edge AI and ML challenges.
