We are seeking a Senior Data Infra Engineer. You will be responsible for designing and building all data and ML pipelines, data tools, and cloud infrastructure required to transform massive, fragmented data into a format that supports processes and standards. Your work directly empowers business stakeholders to gain comprehensive visibility, automate key processes, and drive strategic impact across the company.
Responsibilities
Design and Build Data Infrastructure: Design, plan, and build all aspects of the platform's data, ML pipelines, and supporting infrastructure.
Optimize Cloud Data Lake: Build and optimize an AWS-based Data Lake using cloud architecture best practices for partitioning, metadata management, and security to support enterprise-scale operations.
Lead Project Delivery: Lead end-to-end data projects from initial infrastructure design through to production monitoring and optimization.
Solve Integration Challenges: Implement optimal ETL/ELT patterns and query techniques to solve challenging integration problems across structured and unstructured data sources.
Requirements
Experience: 5+ years of hands-on experience designing and maintaining big data pipelines in on-premises or hybrid cloud SaaS environments.
Programming & Databases: Proficiency in one or more programming languages (Python, Scala, Java, or Go) and expertise in both SQL and NoSQL databases.
Engineering Practice: Proven experience with software engineering best practices, including testing, code reviews, design documentation, and CI/CD.
AWS Experience: Experience developing data pipelines and maintaining data lakes, specifically on AWS.
Streaming & Orchestration: Familiarity with Kafka and workflow orchestration tools like Airflow.
Preferred Qualifications
Containerization & DevOps: Familiarity with Docker, Kubernetes (K8S), and Terraform.
Modern Data Stack: Familiarity with the following tools is an advantage: Kafka, Databricks, Airflow, Snowflake, MongoDB, and open table formats (Iceberg/Delta).
ML/AI Infrastructure: Experience building and designing ML/AI-driven production infrastructures and pipelines.
This position is open to all candidates.