Role Description
– Deploy a variety of models (open-source, customer-provided, and in-house) on dedicated hardware platforms, with an emphasis on NLP, computer vision, and LLM workloads.
– Conduct detailed benchmarking and comparative analysis of workloads on dedicated acceleration platforms as well as other inference acceleration platforms.
– Work closely with core members of the AI/ML teams to define test procedures, conduct experiments, and collect and analyze data.
– Automate testing pipelines using Python, Bash, and Jenkins as part of CI/CD workflows.
– Investigate and resolve hardware/software integration issues, analyze logs, run experiments, identify bottlenecks, and systematically improve performance.
– Write clear, concise, and insightful performance analysis and comparison reports.
Who is this role for?
This role is particularly suited to candidates with a strong Python development background who can write high-quality code and build automation and testing infrastructure, as well as to those with curiosity, strong investigative and analytical skills, and a passion for working at the forefront of AI and hardware technologies.
We’re looking for people who thrive at the intersection of software and hardware: engineers who can write complex scripts while also understanding how the model and code affect the hardware’s real-world performance.