Role description:
You will lead a small, talented core R&D team responsible for the following main deliverables:
Designing, analyzing, and optimizing workloads from various sources (open source, customer-provided, home-grown) on the platform. The focus is on workloads for NLP, speech, and computer vision.
Benchmarking and competitive analysis of workloads on other inference-acceleration platforms.
Working directly with customers on new requirements and on the efficient deployment of their workloads on the platform.
Identifying gaps and new SW/HW requirements to improve workload performance and enable efficient deployment.
This is an exciting opportunity to work on cutting-edge and emerging technologies across the multi-disciplinary domains of deep-learning models and computer architectures.
This is not a data science position!
You will lead a small engineering team (3-5 engineers).
Provide both technical and managerial leadership.
Participate in design reviews, perform code reviews, and take part in coding tasks.
Foster a collaborative and innovative team culture, ensuring effective communication and knowledge sharing.
Must-have requirements:
BSc/MSc in Computer Science or Computer Engineering from an accredited university.
Strong hands-on experience with Python programming and DL frameworks (mainly PyTorch).
Proven experience in ML engineering, specifically in developing AI pipelines (composed of pretrained DL models and pre-/post-processing), data streaming, model zoo management, and inference serving in production environments.
Experience as a technical lead of small R&D SW teams.
Advantages:
Experience using Nvidia tools and leveraging CPU+GPU instances, in the cloud or on-premises, for development and production deployment.
Experience with C++ and software engineering principles (e.g., OOP, design patterns).
Experience working with remote (offshore) partners.