MLOps Engineer – AI Infra Group
Tel Aviv · Full-time
We are looking for someone passionate about building intuitive, production-grade AI infrastructure. This group builds scalable, high-performance AI systems for internal users and external customers, designed to run seamlessly across cloud and on-premise environments using the latest hardware advancements.
Responsibilities
Design, build, and maintain scalable Kubernetes-based infrastructure for ML workloads across on-premise and cloud environments
Architect hybrid infrastructure solutions enabling seamless model flow from on-premise training environments to cloud-based inference deployments
Implement model registry and artifact management strategies that support cross-environment synchronization, versioning, and governance
Design secure, efficient data and model transfer mechanisms between on-premise and cloud (networking, storage replication, caching strategies)
Implement and manage GPU scheduling, resource allocation, and cluster autoscaling for heterogeneous compute environments
Build and maintain CI/CD pipelines for ML systems, including model versioning, testing, and promotion across environments
Develop observability solutions (logging, monitoring, alerting) for ML infrastructure across hybrid deployments
Collaborate with ML Engineers to define infrastructure requirements and SLAs for training and serving workloads
Requirements:
5+ years of experience in infrastructure engineering, platform engineering, or DevOps, preferably supporting ML or data-intensive workloads
Experience designing and operating hybrid cloud architectures (on-premise + cloud) with focus on data/model synchronization
Familiarity with model registry solutions (MLflow or cloud-native registries) and artifact management at scale
Experience with GPU compute infrastructure, device plugins, and resource scheduling (e.g., NVIDIA GPU Operator)
Proficiency in IaC tools (Terraform) and GitOps practices (ArgoCD)
Experience with monitoring and observability stacks (Prometheus, Grafana, ELK)
Familiarity with ML workflows to understand workload characteristics and requirements
This position is open to all candidates.