Responsibilities
Design and Build Data Infrastructure: Design, plan, and build all aspects of the platform's data and ML pipelines and their supporting infrastructure.
Optimize Cloud Data Lake: Build and optimize an AWS-based Data Lake using cloud architecture best practices for partitioning, metadata management, and security to support enterprise-scale operations.
Lead Project Delivery: Lead end-to-end data projects from initial infrastructure design through to production monitoring and optimization.
Solve Integration Challenges: Implement optimal ETL/ELT patterns and query techniques to solve challenging integration problems across structured and unstructured data sources.
Requirements
Experience: 3+ years of hands-on experience designing and maintaining big data pipelines in on-premises or hybrid cloud SaaS environments.
Programming & Databases: Proficiency in one or more programming languages (Python, Scala, Java, or Go) and expertise in both SQL and NoSQL databases.
Engineering Practice: Proven experience with software engineering best practices, including testing, code reviews, design documentation, and CI/CD.
AWS Experience: Experience developing data pipelines and maintaining data lakes, specifically on AWS.
Streaming & Orchestration: Familiarity with Kafka and workflow orchestration tools like Airflow.