We are seeking a Senior Data Engineer to own and scale data platform infrastructure.
Lead the architecture and implementation of high-throughput, low-latency data pipelines across teams.
Support cutting-edge GenAI and cybersecurity use cases.
Be part of the Data and AI Algorithms group.
Collaborate closely with AI/ML engineers, architects, development teams, and security researchers.
Ensure the availability, reliability, and agility of our data infrastructure.
Help define best practices for data modeling and orchestration at scale.
Requirements:
6+ years of hands-on experience in building and operating distributed data systems at scale.
Production experience with big-data distributed systems such as Apache Spark and Ray
Production experience in AI/ML model deployment and monitoring
Hands-on experience with modern data lakes and open table formats (Delta Lake, Apache Iceberg)
Strong coding skills in Python. Strong CI/CD and infrastructure-as-code capabilities.
Experience with cloud-native data services (e.g., AWS EMR, Athena, Azure Data Explorer).
Familiarity with orchestration tools like Airflow, Kubeflow, Dagster or similar
Excellent communication skills, ownership mindset, and problem-solving capabilities.
Experience in data modeling for analytics, AI/ML, and real-time applications is an advantage
Experience in stream processing (e.g., Kafka, Flink) and batch data systems is an advantage
This position is open to all candidates.