ShopBack

Posted 7 months ago

Open

Data Engineering Intern

Ho Chi Minh, Vietnam | Remote | Internship

AI Summary

Data Engineering Intern focusing on building scalable data pipelines and lakehouse concepts using Spark, Iceberg, and Trino. Works with Airflow, dbt-on-Spark, and AI-assisted tooling to deliver reliable analytics-ready data.

About this role

Our Journey

The ShopBack Group is Asia-Pacific’s leading shopping, rewards, and payments platform, serving over 60 million shoppers across 13 markets. In 2025, the Group continued its global growth with its expansion into North America. Driven by the vision to make every day more rewarding, ShopBack is dedicated to saving members money and time, and delivering delight every day. The platform also enables merchants and brands to engage with their members in a cost-effective manner. Founded in 2014, ShopBack now powers over US$5.5 billion in annual sales for over 20,000 online and in-store partners, and has rewarded shoppers with more than US$800 million (over S$1 billion) in Cashback to date. Through its innovative offerings, ShopBack continues to create value for both members and merchants. Notably, its payment solution, ShopBack Pay, offers members a convenient and rewarding payment option at checkout.

About the role
At ShopBack, our engineering teams build scalable platforms and use leading technologies to deliver a world-class product. You will join a diverse and talented team of aspiring engineers with great ambition to impact the eCommerce landscape. We are seeking team members who strive to solve hard problems, take pride in delivering world-class products, and are strong team players. You will have the opportunity to build scalable data systems that drive the organization’s data-driven decision-making and have a direct impact on growth.
Our Data Platform team builds and maintains the foundation that powers ShopBack’s analytics and decision-making.
We design and operate data pipelines across AWS S3, Apache Iceberg, and Trino, orchestrated through Airflow and modeled via dbt-on-Spark. You’ll work alongside experienced data engineers who value clean data, efficient systems, and thoughtful design, not just working code.

Your Adventure Ahead

  • Build and maintain data models and pipelines using Spark and Apache Airflow
  • Learn to design and optimize Hudi and Iceberg tables for performance and reliability
  • Write and validate SQL transformations consumed in Trino and Metabase
  • Collaborate with senior engineers to improve data quality and observability
  • Use AI tools (e.g., ChatGPT, Cursor, Claude Code) to assist coding and documentation, and learn how to verify their output
  • Document learnings and share improvements through Confluence and Slack

What You Will Learn

  • How modern data lakehouses (Spark + Iceberg + Trino) work in production
  • Building reliable, testable dbt models on top of large datasets
  • End-to-end flow: ingestion → transformation → analytics
  • Practical debugging, observability, and version control in a real system
  • How to collaborate effectively in a hybrid engineering team

Essentials to Succeed

  • Strong interest in data engineering or data systems
  • Familiarity with Python and SQL (school projects or self-taught is fine)
  • Curiosity about data pipelines, storage formats, and data quality
  • Comfort experimenting with AI coding assistants responsibly
  • Clear communication: asking questions early and sharing progress regularly
  • Bonus: exposure to AWS, dbt, Spark, or Trino

Skills

    AI Coding Assistants, Apache Airflow, AWS S3, ChatGPT, Confluence, Data Modeling, Data Pipelines, Data Quality, dbt, Hudi, Iceberg, Metabase, Observability, Python, Slack, Spark, SQL, Trino
