AI and HPC Cluster Group Manager

Ra'anana

Posted more than a month ago

Published online

We pioneered accelerated computing to tackle challenges no one else can solve. Our work in AI and digital twins is transforming the world's largest industries and profoundly impacting society.

It's a unique legacy of innovation that's fuelled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. Doing what's never been done before takes vision, innovation, and the world's best talent. As our employee, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work.

Our Networking organization is looking for an AI & HPC Clusters group manager to join the Cloud Solutions group. In this role, you will build, manage, and maintain the biggest cluster in our Networking R&D to validate and test the next-generation networking cloud technology and reference architectures that are being released to our customers. We are currently working on next-generation Blackwell GPU platform AI clouds with our XDR (800G InfiniBand) and Spectrum-X800 next-generation technology. Come join the team and see how you can make a lasting impact on the world.

What you'll be doing:

Lead a group that is responsible for building, managing, and maintaining SW R&D clusters composed of Linux, Windows, and VMware systems, x86 and ARM CPU, GPU, Ethernet, and InfiniBand technologies.

Work closely with the engineering and architecture teams to understand, plan and build new clusters for validating and testing new Networking technology solutions.

Drive the design and implementation of automatic systems to deploy, configure, maintain, and monitor these clusters.

Drive the design and implementation of resource management systems for multiuser environments with different needs on these clusters.

Manage the R&D lab, including inventory, power, space, and cooling.

Build, expand, and mentor the team to address growing demands and requirements.

Innovate! Influence our Networking cluster management tools so that they shine in our customers' eyes.

Requirements:
What We Need to See:

A degree in Computer Science, Engineering, or a related field.

5+ years of managerial experience, including managing managers.

10+ years of relevant overall professional experience.

Experience in data center management at a multidisciplinary company, including handling power, cooling, and space.

Experience in managing HPC/AI clusters.

Deep understanding of operating systems, computer networks, and high-performance hardware.

Deep knowledge of distributed resource scheduling systems and orchestration tools such as Slurm and K8s.

Strong organizational and project management skills, comfortable with multitasking in a dynamic environment with shifting priorities and changing requirements.

Enthusiastic and ambitious personality, encouraging a positive and productive work environment.

Ways to Stand Out From the Crowd:

Knowledge of HPC and AI solution technologies from CPUs and GPUs to high-speed interconnects and supporting software.

Familiarity with CUDA and managing GPU-accelerated computing systems.

Experience with and knowledge of InfiniBand.

This position is open to all candidates.
