Senior Data Engineer - Remote, SaaS, $125-170K + Bonus

[REMOTE]

27 Feb 2024

Job Brief:
Steppingblocks brings big data analytics to higher education with rich data and interactive visualizations. We enable students to make efficient, data-driven decisions about their education and career journeys. We also help university administrators understand outcomes for their graduates so they can adapt curricula to in-demand skills, engage with employers and alumni, and report to relevant stakeholders.

We are looking for an experienced Senior Data Engineer to join our data team. You will build, improve, and maintain the Python-based ETL data pipeline and analytics infrastructure that powers our products and business decisions. 

This is an opportunity to work with cutting-edge technologies and large volumes of data to solve complex problems. You will transform and enrich unstructured (text) data and tune functions for speed and accuracy at scale. You will work on socio-economic and firmographic data and collaborate with both Product and Business stakeholders to prioritize new features and fixes.

This role requires the ability to quickly learn new technologies and techniques. Strong communication skills are also vital for working with diverse teams across the organization. The role offers great potential for growth into team leadership positions.

Responsibilities:

  • Collect, integrate, and organize raw data from disparate sources into structured formats

  • Design, develop and optimize scalable ETL data pipelines and workflows

  • Build custom algorithms and data analysis processes to generate business insights

  • Work closely with data scientists, analysts and business teams to identify and fulfill new analytics feature requests

  • Monitor and enhance data quality, reliability, and performance

  • Architect new data collection procedures and data stores/databases

  • Contribute to the evolution of the overall data infrastructure and roadmap

Qualifications:

  • 5-10+ years of experience building and optimizing data pipelines, ETL processes and data sets

  • Expert knowledge of Python, including Pandas, NumPy, and other common data manipulation libraries

  • Experience with big data tools like Spark, Elasticsearch, Snowflake, etc.

  • High attention to detail

  • Excellent written and verbal communication skills

  • Knowledge of software engineering best practices including testing, documentation, and code reviews

  • Comfortable working with business stakeholders as well as software engineers

Preferred Qualifications:

  • BSc/MSc in Computer Science, Statistics, Mathematics or another quantitative field

  • Experience with Dask or distributed computation frameworks

  • Knowledge of advanced statistical methods and machine learning techniques

  • Background in economics, social sciences or business analytics

  • Web scraping skills

  • Strong mathematical skills and statistical background

  • Knowledge of other programming languages and performance tools (C, Cython, Numba, Rust)

  • Experience with NoSQL and RDBMS databases

  • Familiarity with Graph Databases and algorithms

  • Familiarity with Data Governance principles

Perks:

  • Unlimited PTO

  • Medical, Vision and Dental benefits

  • Development stipend

  • 401(k)

Mid-Senior Level

Full Time
