What You'll Be Doing:
Designing and developing an automation platform used to provision, configure, and monitor HPC data centers
Implementing scalable, reliable, and maintainable services that enhance cluster visibility and improve operational efficiency
Collaborating closely with internal and external stakeholders to understand requirements and deliver robust full-cycle solutions
Improving stability and performance across the provisioning pipeline through architectural enhancements and code optimizations
Troubleshooting issues in distributed environments and contributing to system observability and reliability improvements
Working cross-functionally with architects, DevOps engineers, product managers, and stakeholders to ensure high-quality releases
Participating in code reviews, technical design discussions, and continuous improvement activities within the team
What We Need to See:
B.Sc. in Computer Science, Engineering, or a related field (or equivalent practical experience).
5+ years of hands-on experience in software development.
Strong understanding of software design patterns, architectural principles, and standard methodologies in complex distributed systems.
Experience with microservices architectures.
Proficiency with version control systems (e.g., Git) and CI/CD pipelines.
Experience working with both Linux and Windows operating systems.
Strong problem-solving and analytical thinking, with the ability to deliver high-quality code and resilient services.
Strong debugging and troubleshooting capabilities in distributed systems or multi-node environments.
Experience in the networking domain.
Excellent communication skills and an ownership-driven mindset.
Ways to Stand Out from the Crowd:
Python proficiency (strongly preferred due to Python-based tooling).
Familiarity with DevOps methodologies and tools (e.g., Jenkins, Ansible).
Hands-on experience with Docker and containerized environments.
Experience with agentic AI development.