Posted 7 months ago

Data Engineering Intern

Ho Chi Minh, Vietnam | Remote | Internship

AI Summary

Data Engineering Intern role focused on building scalable data pipelines and learning lakehouse concepts with Spark, Iceberg, and Trino. The intern works with Airflow, dbt-on-Spark, and AI-assisted tooling to deliver reliable, analytics-ready data.

About this role

Our Journey

The ShopBack Group is Asia-Pacific’s leading shopping, rewards, and payments platform, serving over 60 million shoppers across 13 markets. In 2025, the Group continued its global growth with its expansion into North America. Driven by the vision to make every day more rewarding, ShopBack is dedicated to saving members money and time, and delivering delight every day. The platform also enables merchants and brands to engage with their members in a cost-effective manner. Founded in 2014, ShopBack now powers over US$5.5 billion in annual sales for over 20,000 online and in-store partners, and has rewarded shoppers with more than US$800 million (over S$1 billion) in Cashback to date. Through its innovative offerings, ShopBack continues to create value for both members and merchants. Notably, its payment solution, ShopBack Pay, offers members a convenient and rewarding payment option at checkout.

About the role

At ShopBack, our engineering teams build scalable platforms and use leading technologies to deliver a world-class product. You will join a diverse and talented team of aspiring engineers with great ambition to impact the eCommerce landscape. We are seeking team members who strive to solve hard problems, take pride in delivering world-class products, and are strong team players. You will get the opportunity to build scalable data systems that support the organization’s data-driven decision-making and have a direct impact on growth.

Our Data Platform team builds and maintains the foundation that powers ShopBack’s analytics and decision-making.

We design and operate data pipelines across AWS S3, Apache Iceberg, and Trino, orchestrated through Airflow and modeled via dbt-on-Spark. You’ll work alongside experienced data engineers who value clean data, efficient systems, and thoughtful design, not just working code.

Your Adventure Ahead

Build and maintain data models and pipelines using Spark and Apache Airflow

Learn to design and optimize Apache Hudi and Iceberg tables for performance and reliability

Write and validate SQL transformations consumed in Trino and Metabase

Collaborate with senior engineers to improve data quality and observability

Use AI tools (e.g., ChatGPT, Cursor, Claude Code) to assist coding and documentation, and learn how to verify their output

Document learnings and share improvements through Confluence and Slack

What you will learn:

How modern data lakehouses (Spark + Iceberg + Trino) work in production

Building reliable, testable dbt models on top of large datasets

End-to-end flow: ingestion → transformation → analytics

Practical debugging, observability, and version control in a real system

How to collaborate effectively in a hybrid engineering team

Essentials to Succeed

Strong interest in data engineering or data systems

Familiarity with Python and SQL (school projects or self-taught is fine)

Curiosity about data pipelines, storage formats, and data quality

Comfort experimenting with AI coding assistants responsibly

Clear communication: asking questions early and sharing progress regularly

Bonus: exposure to AWS, dbt, Spark, or Trino

Skills

AI Coding Assistants, Apache Airflow, AWS S3, ChatGPT, Confluence, Data Modeling, Data Pipelines, Data Quality, dbt, Hudi, Iceberg, Metabase, Observability, Python, Slack, Spark, SQL, Trino
