TL;DR

Senior Member of Technical Staff (SMTS) Site Reliability Engineer (Cloud Automation): Build and optimize the highly available, active-active, mission-critical cloud infrastructure that powers Salesforce at scale, with an emphasis on maximizing developer velocity through automation-first thinking and a strict "No Ticket-Ops" philosophy. The role focuses on integrating AI agents into GitOps workflows and enterprise WorkOS to build a smart, secure platform.

Location: Must be based in New York, NY or San Francisco, CA

Company

Salesforce's Cloud Platform Engineering team builds and operates highly available, active-active, mission-critical infrastructure. The team treats the internal cloud as a product and uses automation and AI to maximize developer velocity.

What you will do

  • Build, maintain, and scale automated provisioning workflows ("The Vending Machine") that orchestrate the creation of new, fully governed multi-account cloud environments.
  • Author, test, and maintain a library of pre-approved Infrastructure-as-Code templates ("Golden Modules") for internal developers to consume.
  • Partner with enterprise CI/CD teams to plug automated security scanning, Policy-as-Code, and cost-estimation checks into developer Pull Request processes.
  • Implement data-plane-driven automated failover mechanisms and develop integrations connecting provisioning tools to enterprise WorkOS (Slack) for real-time operational intelligence.

Requirements

  • Bachelor's degree in Computer Science, Computer Engineering, or Software Engineering, or equivalent work experience.
  • 7+ years of software engineering or Site Reliability Engineering experience in large-scale cloud environments.
  • Expert-level proficiency in Infrastructure-as-Code (strictly Terraform) and managing state in highly distributed architectures.
  • Strong programming skills in Python, Go, or similar languages used for building automation tooling and API integrations.
  • Proven experience operating multi-region, active-active cloud environments and implementing automated disaster recovery tests.
  • Deep understanding of GitOps workflows and integrating infrastructure guardrails into existing enterprise CI/CD pipelines.

Culture & Benefits

  • Focus on customer satisfaction (internal developers), automation, eradicating manual toil, and a "No Ticket-Ops" philosophy.
  • Belief that security should be "shifted left" and built into the code, not bolted on as an afterthought.
  • SRE mindset: engineering for failure, prioritizing self-healing systems, and maintaining a 99.999% availability standard.
  • Integrating AI agents directly into GitOps workflows and enterprise WorkOS (Slack) for a smart, secure platform.
  • Operating as a lean, innovative team of "T-shaped" engineers who learn from one another.