TL;DR

Member of Technical Staff - Voice Model (AI): Building the world’s best voice AI, delivering smooth, natural, low-latency spoken interactions across devices and real-time scenarios, with an emphasis on massive data curation, premium audio processing, and frontier speech-language pre-training. The focus is on pushing quality, speed, and stability to the limit, ensuring Grok Voice responses are accurate, factually grounded, and conversational.

Location: Palo Alto, CA

Salary: $150,000 - $450,000 USD

Company

xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge.

What you will do

  • Design and execute large-scale speech data curation and processing pipelines.
  • Work on pre-training and post-training of speech-language models.
  • Build and iterate on a comprehensive evaluation framework covering objective metrics, human preference studies, and content factuality assessments.
  • Integrate voice models into applications and real-time environments.

Requirements

  • Python expert with deep proficiency in writing clean, efficient code for AI/ML systems.
  • Hands-on experience processing large-scale datasets using tools like Spark and Ray.
  • Proficiency in pre-training and post-training speech-language models using JAX/PyTorch.
  • Ability to set up and run rigorous evaluation pipelines.
  • Experience building or working with large-scale distributed training and inference systems on Kubernetes.
  • Proactive, self-driven attitude.

Culture & Benefits

  • Equity.
  • Comprehensive medical, vision, and dental coverage.
  • Access to a 401(k) retirement plan.
  • Short- and long-term disability insurance.
  • Life insurance.
  • Various other discounts and perks.