TL;DR
Member of Technical Staff - Voice Model (AI): Build the world’s best voice AI, delivering smooth, natural, low-latency spoken interactions across devices and real-time scenarios, with an emphasis on massive data curation, premium audio processing, and frontier speech-language pre-training. The focus is on pushing quality, speed, and stability to the limit so that Grok Voice responses are accurate, factually grounded, and conversational.
Location: Palo Alto, CA
Salary: $150,000 - $450,000 USD
Company
xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge.
What you will do
- Design and execute large-scale speech data curation and processing pipelines.
- Work on pre-training and post-training of speech-language models.
- Build and iterate on a comprehensive evaluation framework covering objective metrics, human preference studies, and content factuality assessments.
- Integrate voice models into applications and real-time environments.
Requirements
- Python expert with deep proficiency in writing clean, efficient code for AI/ML systems.
- Hands-on experience processing large-scale datasets with tools such as Spark and Ray.
- Proficiency in pre-training and post-training speech-language models using JAX/PyTorch.
- Ability to set up and run rigorous evaluation pipelines.
- Experience building or working with large-scale distributed training and inference systems on Kubernetes.
- Proactive, self-driven attitude.
Culture & Benefits
- Equity.
- Comprehensive medical, vision, and dental coverage.
- Access to a 401(k) retirement plan.
- Short & long-term disability insurance.
- Life insurance.
- Various other discounts and perks.
