Required: Staff Software Engineer – Distributed Systems Quality Platform
About the role:
We're building the next-generation testing platform that will redefine how complex distributed systems are validated at scale. Our mission is to create an automated, intelligent testing environment that continuously validates the correctness, performance, and resilience of the Data Platform across every layer of the stack. This is an opportunity to reinvent how next-gen infrastructure is tested, leveraging both proven engineering techniques and the latest AI-driven approaches. You'll work on solving some of the hardest challenges in large-scale systems engineering: How do you validate correctness in deeply parallel I/O systems? How do you uncover reliability gaps that appear only under extreme concurrency? How can AI tools help us automatically generate test scenarios, analyze failures, and predict weak spots before they occur?
As a Staff Software Engineer, you'll be at the center of these questions. You'll set the technical direction, design testing frameworks that scale with our product, and partner across engineering to raise the bar for quality.
What You'll Do:
Design and implement a next-generation testing and quality framework for distributed systems, enabling automated validation of functionality, performance, and resilience.
Leverage AI-driven tools to scale the testing environment, including automated test generation, intelligent workload synthesis, and anomaly detection.
Create end-to-end testing environments that simulate real-world scale, stress, and failure conditions.
Define and drive the technical strategy for testing, setting the standard for quality engineering.
Mentor and influence engineers across teams, fostering a culture of technical rigor and reliability obsession.
Collaborate with product and infrastructure teams to ensure the testing platform is deeply integrated into our development lifecycle.
Requirements:
Extensive experience (8+ years) designing and building large-scale distributed systems in domains like storage, networking, or cloud infrastructure.
Deep knowledge of system correctness, concurrency, and reliability, and how to validate them in practice.
Strong programming skills in languages such as Go, C++, Rust, or Python.
Proven ability to design frameworks or platforms that enable other engineers to move faster while improving quality.
Experience with or interest in AI/ML-driven approaches to testing and validation.
Track record of technical leadership: influencing beyond your immediate team, setting vision, and mentoring others.
This position is open to all candidates.