We are seeking an experienced DevOps Team Lead to head a team of talented DevOps engineers working on a cutting-edge product in the DevOps domain. This team plays a critical role in enhancing and maintaining a high-scale SaaS platform, streamlining the delivery lifecycle for multiple products, improving developer experience, and advancing monitoring capabilities.
As part of a company deeply rooted in Kubernetes and GitOps, the DevOps team not only provides expertise in these areas but is also integral to shaping the direction of our product. This is a unique opportunity to make a significant impact in a dynamic, high-growth environment.
What You'll Be Doing:
Managing all aspects of the team's day-to-day work, planning, and prioritizing tasks and agendas while working closely with development teams, support engineers, and R&D leadership.
Leveraging a highly technical background to take on complex tasks as needed.
Ensuring the availability and smooth, cost-efficient operation of our production environment, including leading efficient incident response.
Maintaining and improving the release lifecycle for various products, ensuring frequent and reliable delivery.
Designing and embedding industry best practices for online services, including disaster recovery, business continuity, and service health measurement.
Providing development teams with platforms and tools that streamline their work and increase productivity.
Establishing a robust and reliable monitoring framework for complex distributed systems.
Driving the professional growth and development of a highly motivated team.
Constantly learning and exploring the cutting edge of DevOps tools and practices, and leading their implementation.
Requirements:
5+ years of experience with DevOps and cloud technologies.
2+ years of experience as a team lead.
Exceptional troubleshooting skills and diagnostic intuition for solving challenging problems.
Extensive experience with Kubernetes (both operations and deployment) and Docker.
Strong familiarity with public cloud providers (AWS, GCP, and Azure); proven expertise with more than one is a significant advantage.
Familiarity with continuous monitoring solutions, including Prometheus, the Grafana stack (Loki, Mimir, Pyroscope), and OpenTelemetry.
Experience with event-driven microservices and event buses such as RabbitMQ or Kafka.
Strong knowledge of deployment models, capacity management, and service utilization.
Expertise in the design, architecture, and operation of complex, large-scale online services.
Programming and scripting proficiency in Bash, Python, or Golang.
Familiarity with GitOps principles is a must; strong experience with one or more Argo projects is a plus.
Proven experience with Infrastructure as Code (IaC) tooling like Terraform; experience with Crossplane is an advantage.
Experience working with databases (e.g., NoSQL databases such as MongoDB) and a strong understanding of system and networking concepts and troubleshooting techniques.
Proven ability to collaborate effectively with cross-functional, global, and remote teams from diverse backgrounds.
This position is open to all candidates.