The team is responsible for developing high-performance computing and cloud infrastructure for the world's largest supercomputers and data centers. The work environment is educational, dynamic, and challenging, as our employees work on innovative, next-generation products at the forefront of technology in terms of performance, scalability, and features.
What you'll be doing:
Design and build innovative features for a cloud-native batch scheduler in both private and public cloud environments, enhancing functionality and performance.
Develop a cloud-native batch scheduler that accelerates HPC and AI workloads in cloud environments using our advanced technologies, e.g. DPU, ConnectX and GPU/NVLink.
Take part in developing NVIDIA's state-of-the-art AI supercomputer.
Work closely with other teams on new products and on features and improvements for existing products.
Support, maintain and document software functionality.
What we need to see:
BSc in Computer Science or equivalent program.
8+ years of hands-on experience in software development, preferably with C, Python, Rust and Golang.
Extensive hands-on experience with development and programming in the Kubernetes ecosystem.
Experience with Jenkins, GitLab and/or GitHub.
Deep understanding of batch schedulers, e.g. scheduling algorithms.
Strong background in designing, implementing, and debugging sophisticated software.
Highly motivated, with strong interpersonal skills and the ability to work successfully with multi-functional teams, developers, and architects.
Ability to coordinate effectively across organizational boundaries and geographies.
Strong self-initiative, independence, and flexibility in adopting new technologies.
Ways to stand out from the crowd:
R&D background with Volcano/Kubernetes/Slurm.
R&D background with Kubernetes operator/controller development.
Experience working on open-source projects.
Understanding of HPC/AI systems and related technologies.