Responsibilities:
Play a key role in defining the architecture for the company's parallel compute framework API, and how it will interact with other common parallel compute frameworks.
Research implementations of common parallel compute frameworks and suggest design and implementation concepts for accelerating them on our hardware.
Dive into the codebases of common linear algebra libraries and understand how to integrate them into the company's SDK, with the aim of making the best use of our hardware.
Understand our full software stack end-to-end, and be able to identify performance bottlenecks and implement accelerated solutions.
Work in close collaboration with cross-functional and multidisciplinary teams, including software, hardware, system, research, and apps engineering teams.
Requirements:
6+ years of software engineering experience working on low-level software for heterogeneous compute.
MSc/BSc in Computer Science or other equivalent educational experience.
5+ years of advanced C++ experience, with excellent coding skills.
3+ years of coding experience in parallel compute software acceleration, using one or more of the following frameworks: OpenMP, Kokkos, CUDA, OpenCL, HIP.
Experience with compilers (RISC-V, ARM, and/or x86 backends): a big advantage.
In-depth knowledge of processor architectures such as CPUs (x86, ARM, RISC-V), GPUs, DSPs, or TCUs: an advantage.
Hands-on experience with common accelerated linear algebra libraries (BLAS, FFTW, Eigen, LAPACK): an advantage.
Experience with Linux fundamentals: an advantage.