Jobless Developer
Jobgether

Posted 1 day ago

Open

Data Engineer

Ireland, Remote, Full-time

About this role

This position is listed on behalf of a partner company, who manages all applications and next steps. Our partner is looking for a Data Engineer based in Ireland.

This role focuses on rebuilding trust in a complex, regulated data environment where existing pipelines are not yet reliable, reproducible, or fully validated. You will be responsible for transforming a newly centralized data lake into a robust, analytics-ready foundation that supports downstream data science and risk modelling use cases. Working within a regulated credit and lending context, you will design and enforce strong data quality, lineage, and governance standards across multiple source systems. The role requires deep hands-on engineering across AWS, Spark, and modern data tooling, with a strong emphasis on correctness, auditability, and reproducibility. You will collaborate closely with data science and engineering stakeholders to define harmonized data models and prepare feature-ready datasets. This is a high-impact foundational role where your work directly enables reliable decision-making in a financial risk environment.

Accountabilities:

  • Rebuild and validate data pipelines to ensure full reproducibility of reporting and descriptive statistics across all datasets
  • Profile, reconcile, and harmonize heterogeneous source schemas across multiple business entities into a unified data model
  • Design and implement dbt-based data models (staging, intermediate, and marts) with strong testing and validation layers
  • Develop and maintain data quality frameworks using tools such as Great Expectations and dbt tests to enforce reliability
  • Build and implement entity resolution and record linkage logic across fragmented customer and account datasets
  • Ensure robust anonymization and pseudonymization processes that meet regulatory and compliance requirements
  • Optimize large-scale Spark-based processing jobs, including partitioning strategies, file formats, and cost-efficient compute usage
  • Orchestrate production-grade pipelines using tools such as Airflow or AWS Step Functions
  • Deliver clean, documented, and feature-ready datasets for downstream data science and risk modelling teams
  • Create clear technical documentation and runbooks to support operational handover and long-term maintainability

Requirements:

  • 4+ years of professional experience in data engineering with strong exposure to large-scale AWS and Spark environments
  • Advanced proficiency in SQL and Python for data processing and transformation at scale
  • Strong experience with AWS data services including S3, Glue, Athena, Redshift, EMR, and orchestration tools
  • Proven experience building and maintaining data models using dbt or similar frameworks
  • Hands-on experience with data quality, validation, and testing frameworks such as Great Expectations
  • Strong understanding of data governance, lineage, and reproducibility in production environments
  • Experience with entity resolution, deduplication, or record linkage across multiple data sources
  • Familiarity with anonymization and pseudonymization techniques in regulated environments
  • Experience working in regulated industries such as BFSI, healthcare, or government is highly valued
  • Ability to work independently or as a lead engineer within a small, fast-moving delivery team
  • Strong written and verbal communication skills in English, with the ability to document and explain complex systems clearly

Benefits:

  • Competitive compensation package aligned with experience and impact
  • Remote-friendly working arrangements within Europe
  • Opportunity to work on a high-impact, regulated data transformation project
  • Exposure to modern AWS data architecture and large-scale Spark processing environments
  • Direct collaboration with data science and engineering leadership on meaningful analytics use cases
  • Strong autonomy in shaping data foundations and engineering standards
  • Opportunity to build robust, production-grade systems from an early-stage data estate
  • International, collaborative environment with distributed teams
