TL;DR

Database Reliability Engineer (SRE): Building and improving the reliability, availability, scalability, and performance of ClickHouse Core, with an emphasis on incident response, post-mortem analysis, and chaos engineering. Focus on identifying root causes, implementing bug fixes, and establishing best practices for distributed database operations in the cloud.

Location: Remote from the Netherlands, UK, United States, or Germany

Company

ClickHouse is a fast-growing private cloud company recognized for leadership in real-time analytics, data warehousing, observability, and AI workloads.

What you will do

  • Continuously improve the reliability and performance of ClickHouse Core.
  • Improve and create metrics and alerts to prevent production problems.
  • Identify root causes of customer problems, submit bug fixes, and suggest improvements for ClickHouse Core.
  • Enhance incident response processes and post-mortem analysis for core outages, including communication with customers.
  • Plan, enable, and drive Chaos Engineering initiatives across engineering teams.
  • Manage on-call processes to respond to performance and reliability issues and coordinate escalation.

Requirements

  • Bachelor’s or Master’s degree in Computer Science or a related field.
  • At least 5 years of experience in Reliability Engineering, QA, or customer-facing engineering.
  • Previous experience operating ClickHouse or other SQL databases in production.
  • Excellent understanding of distributed database internals and SQL, with ClickHouse experience a major plus.
  • Scripting experience with Shell or Python, and the ability to read C++ code.
  • Knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform.

Culture & Benefits

  • Flexible, globally distributed, and remote-friendly work environment.
  • Employer contributions towards healthcare.
  • Equity in the company through stock options for new team members.
  • Flexible time off in the US and a generous leave entitlement in other countries.
  • $500 home office setup for remote employees.
  • Opportunities to engage with colleagues at company-wide offsites.