Posted more than a month ago
Posted online
We are looking for a talented Performance Research Engineer to join our Performance group.

The ideal candidate will profile and analyze AI workloads on large-scale GPU and CPU clusters for distributed deep learning LLM training, focusing on collective communication and networking.

You will work with many types of HW and platforms, such as HCAs, switches, CPUs, GPUs, and systems.

You will experiment with and develop performance analysis tools and methodologies to dive deep into the details and understand performance expectations, limitations, and bottlenecks.

What you'll be doing:

Experiment with and research AI workloads and DL models tailored for large-scale deep learning LLM training on our supercomputers, with a focus on high-performance networking.

Benchmark, profile, and analyze performance to find bottlenecks and identify areas for improvement and optimization, with a strong emphasis on networking aspects.

Implement performance analysis tools.

Collaborate with many teams, from HW to SW, to provide performance analysis insights.

Define performance test plans, set performance expectations for new technologies and solutions, and work to reach the performance targets.

Requirements:
What we need to see:

B.Sc. in Computer Science or Software Engineering.

5+ years of experience with high-performance networking (RDMA, MPI).

Demonstrated performance analysis skills and methodologies.

Experience with our GPUs, CUDA library, deep learning frameworks like TensorFlow or PyTorch, combined with expertise in networking collective communication libraries (such as NCCL) and protocols (such as RoCE and RDMA).

Fast self-learner with strong analytical and problem-solving skills.

Programming languages: Python, Bash, and C.

Experience with Linux distributions.

Team player with good communication and interpersonal skills.

Ways to stand out from the crowd:

In-depth knowledge and experience with AI workloads and benchmarking for distributed LLM training.

Knowledge of the CUDA and NCCL libraries.

Knowledge of congestion control algorithms.

In-depth system knowledge and understanding (Intel / AMD / ARM CPUs, our GPUs, HCA, memory, PCI).

Strong performance analysis skills and methodologies using modern tools.

This position is open to all candidates.