TL;DR

Senior Site Reliability Engineer (Cloud): Responsible for building and leading processes that ensure the reliability, availability, scalability, and performance of ClickHouse Cloud infrastructure, with an emphasis on incident management and response, post-mortem analysis, and continuous improvement. The role also focuses on developing software platforms and tools that improve the operational and engineering efficiency of ClickHouse Cloud at scale.

Location: Must be based in the United States

Company

ClickHouse is a fast-growing, privately held cloud company that leads the market in real-time analytics, data warehousing, observability, and AI workloads.

What you will do

  • Collaborate with various engineering teams to design and implement scalable, secure, and highly available systems for ClickHouse.
  • Establish and manage service level objectives (SLOs) and service level agreements (SLAs) for ClickHouse Cloud.
  • Ensure all infrastructure components have monitoring and alerting in place so that incidents are detected and resolved promptly.
  • Enhance and refine incident response processes and post-mortem analysis for outages.
  • Continuously improve the reliability and performance of ClickHouse services.
  • Manage on-call processes for responding to performance and reliability issues, and establish best practices for resolving issues and minimizing downtime.

Requirements

  • Bachelor’s or Master’s degree in Computer Science or a related field.
  • At least 8 years of experience in Site Reliability Engineering or a related field.
  • Hands-on experience with Go and/or Python.
  • Strong knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform.
  • Excellent understanding of distributed databases and SQL, particularly ClickHouse.
  • Strong experience with automation and configuration management tools such as Ansible, Terraform, or Puppet.

Culture & Benefits

  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly.
  • Employer contributions towards your healthcare.
  • Every new team member receives stock options.
  • Flexible time off in the US and generous entitlement in other countries.
  • $500 for home office setup if you’re a remote employee.
  • Opportunities to engage with colleagues at company-wide offsites.