TL;DR

Training Process Management Engineer (Backend): Develop and optimize distributed operating system software that orchestrates and supervises large-scale machine learning training workloads across thousands of machines, with an emphasis on performance, correctness, scalability, and reliability. Focus on designing and debugging high-performance asynchronous systems and managing complex distributed system challenges at frontier AI scale.

Location: London, UK, with a hybrid work model (3 days in office per week) and relocation assistance available

Company

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity by pushing the boundaries of AI capabilities and safely deploying AI products.

What you will do

Work across Python and Rust stacks to build and maintain software for orchestration and monitoring of ML workloads on supercomputers
Profile and optimize the software stack for computation orchestration at frontier scale
Improve reliability, observability, and fault tolerance of long-running jobs
Debug complex distributed system issues across large clusters
Adapt to evolving ML system requirements to support researchers

Requirements

Location: Must be based in or willing to relocate to London, UK
Experience developing distributed systems and strong software engineering skills
Proficiency in Python and in Rust or another systems programming language (e.g., C++)
Solid Linux knowledge with systems-level debugging, performance analysis, and memory profiling
Experience with asynchronous and concurrent systems development
Strong focus on performance, correctness, and reliability

Culture & Benefits

Hybrid work model with 3 days in office per week
Relocation assistance for new employees
Equal opportunity employer with a commitment to diversity and inclusion
Supportive environment valuing engineering ownership and agency
