Crumbl HQ is seeking a skilled and motivated data engineer to join our growing team. The successful candidate will be responsible for building and maintaining data pipelines using dbt and Prefect to support data-driven decision-making across the organization.
Responsibilities
Design, build, and maintain scalable and reliable data pipelines using ELT/ETL patterns
Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and ensure data quality
Develop and maintain documentation, including data dictionaries, workflow diagrams, and data flow diagrams
Ensure the integrity and security of data by implementing appropriate controls and monitoring
Optimize and tune data pipelines to ensure efficient processing and query performance
Implement and maintain data security policies and procedures, including access controls, encryption, and data masking
Design and implement data processing workflows using dbt and Prefect to support data science and machine learning applications
Develop and maintain data ingestion processes to bring data from external sources into the organization’s data environment
Identify and address performance issues with data pipelines, and work with infrastructure and operations teams to optimize system performance
Conduct testing and validation of data pipelines to ensure they are functioning correctly and meeting business requirements
Participate in code reviews and contribute to the development of best practices for data engineering
Stay current with emerging technologies and trends in data engineering and data science, and identify opportunities to leverage them within the organization
Qualifications
Bachelor’s or Master’s degree in Data Science, Information Systems, or a related field, or equivalent experience
3+ years building and maintaining production data pipelines
Advanced SQL: window functions, CTEs, and query/performance tuning on large datasets
Strong Python for data engineering (modular, testable pipeline code)
Hands-on dbt experience: models, tests, macros, and incremental materializations
Production Snowflake experience: schema design, performance tuning, and warehouse/cost optimization
AWS data services (e.g., S3, Glue, Lambda)
Data quality and observability with dbt and Elementary
Infrastructure-as-code with Terraform and version control with Git
Dimensional data modeling (star/snowflake schemas, slowly changing dimensions) and lakehouse concepts
Strong problem-solving skills and clear communication with analysts, scientists, and stakeholders