The Challenge
As a Data Engineer on the Data Infrastructure team, you will build platforms and tools that process and analyze terabytes of data. You will build and manage the entire data pipeline, working with technologies such as Apache Spark, Elasticsearch, and BigQuery to deliver scalable infrastructure that serves recommendations to our users in real time.
The pace of our growth is incredible. If you want to tackle hard, interesting problems at scale and make an impact in an entrepreneurial environment, join us!
Roles and Responsibilities
You will work closely with software engineers and ML engineers to build data infrastructure that serves the needs of multiple teams, systems, and products.
You will automate manual processes, optimize data delivery, and build the infrastructure needed to extract, transform, and load data for a wide variety of use cases using Apache Spark.
You will build stream-processing pipelines and tools to support a wide variety of analytics and audit use cases.
You will continuously evaluate relevant technologies and influence and drive architecture and design discussions.
You will work in a cross-functional team and collaborate with peers throughout the entire software development life cycle (SDLC).
Expectations
A minimum of 4 years of experience building data warehouse and BI systems.
Experience in Go or Python (a plus).
Experience with Apache Spark, Hadoop, Redshift, and BigQuery.
Strong understanding of database and storage fundamentals.
Experience with the AWS and Google Cloud stacks.
Ability to design data flows and write complex SQL- or Spark-based transformations.
Experience building real-time streaming data pipelines using Spark Streaming or Apache Storm.