TL;DR
Director of Network Engineering (AI): Leading the design, development, and operation of network services for AI GPU cloud infrastructure, with an emphasis on high-performance Ethernet and InfiniBand deployments. Focus on setting technical direction, driving operational excellence, and managing cross-team dependencies for scalable and reliable network solutions.
Location: Fully remote (global)
Company
Nscale is the GPU cloud engineered for AI, providing cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers.
What you will do
- Lead the delivery of network infrastructure programmes across Ethernet, InfiniBand, and WAN domains.
- Set technical direction and standards for high-performance Ethernet fabrics and implement them through DevOps practices.
- Provide technical leadership for InfiniBand deployments, guiding architecture, configuration, and performance tuning.
- Own prioritisation and execution planning for the network roadmap, balancing commitments and risks.
- Drive operational excellence and incident reduction across all network domains.
- Manage team execution and technical development, aligning engineers to outcomes and fostering capability.
Requirements
- Proven people-management experience leading a technically strong network engineering function.
- Strong technical depth in high-performance Ethernet fabrics, with hands-on experience in VLANs, VXLAN, EVPN, and BGP.
- Proven capability leading InfiniBand programmes for accelerated compute clusters (NVIDIA Quantum/QM/Mellanox gear).
- Depth in large-scale data centre network design (Clos/Spine-Leaf topologies, SD-WAN).
- Ability to clearly articulate technical trade-offs and practices.
- Leadership in multi-team or project delivery with cross-functional dependencies.
Nice to have
- Experience building and operating SONiC-based switches.
- Production InfiniBand operational experience (SR-IOV, multi-tenancy, isolation).
Culture & Benefits
- Collaborative, supportive, and innovative environment where contributions have real impact.
- Highly competitive package (base + equity) with reviews every 12 months.
- Opportunity to join a fast-growing tech startup at the cutting edge of AI.
- Dynamic progression plan tailored to ambitions, with full support for growth.
- Human-First Flexibility, allowing autonomy to shape your day around life's moments.
- Join a remote-first team with seamless virtual collaboration.
