What you'll be doing:
Design, build, and maintain foundational infrastructure and key applications that allow people to leverage proprietary and public data effectively
Use modern technologies daily, such as Snowflake, Databricks, Apache Spark, Elasticsearch, AWS, Kafka, and more.
Expand and optimize our data warehouse and lakehouse, weighing tradeoffs between performance, cost, and user experience, across continually larger and more complex datasets.
Model data and define storage strategy to ensure that data is intuitive for researchers to find, and easy and fast for them to access
Enhance analysts' most-used web application, which gives them in-depth insight into every decision we make
Collaborate directly with colleagues from various disciplines: Analysts, Data Scientists, and other Engineers (all of whom write production code!)
Self-manage project planning, milestones, designs, and estimations
Hold yourself and others to a high standard when working with production systems. We take pride in our tests, monitoring, and alerting just as we do in the systems themselves.
Debug complex problems across the whole stack.

What we're looking for:
5+ years of proven experience with designing, implementing, and maintaining large-scale backend production systems
Experience building and maintaining data-intensive applications
Experience with data processing technologies, e.g. Spark, Flink, EMR, Iceberg
Experience with data warehousing frameworks, e.g. Snowflake, Hive, Glue, Databricks Unity Catalog
Experience working in Python
Experience with AWS or other public clouds
Professional proficiency in English