
Posted 8 months ago
Full-Stack Software Engineer (Data Pipeline Focus)
About this role
Description
In this role, you’ll contribute across the stack: developing ingest pipelines, building scalable REST APIs, and facilitating data exploration and understanding. The platform supports large-scale data ingestion, complex queries, and interactive analysis. While your primary focus will be on the data-pipeline layer, you’ll collaborate closely with other sub-teams to ensure end-to-end functionality and performance. We’re looking for someone excited to work across the system and to improve team processes and tooling, especially for faster integration of new data sources.
Responsibilities
Lead the design and implementation of data-processing workflows
Manage all aspects of the data-processing lifecycle (collection, discovery, analysis, cleaning, modeling, transformation, enrichment, validation)
Develop and maintain data models and JSON Schemas to ensure integrity and consistency
Collaborate with analysts and engineers to meet data requirements
Manage and optimize data storage/retrieval in Elasticsearch and Dgraph (plus MongoDB and Redis)
Orchestrate dataflow using Apache NiFi
Mentor teammates on best practices for data processing and software engineering
Use AI platforms to support hybrid automated/manual data transformation, code generation, and schema management
Work with analysts, product owners, and engineers to ensure solutions meet operational needs
Propose and implement process improvements for faster delivery of new data sources
Required Skills & Experience
Strong data-wrangling and dataflow background (discovery, mining, cleaning, exploration, enrichment, validation)
Proficiency in JSON and JSON Schemas (or similar)
Solid data-modeling experience
Experience with NoSQL databases (Elasticsearch, MongoDB, Redis, graph DBs)
Familiarity with dataflow tools such as Apache NiFi
Extensive experience in Python or Java (both preferred)
Experience using generative AI for code and data transformation
Git for version control; Maven for build automation
Comfortable in a Linux development environment
Familiarity with Atlassian tools (Jira, Confluence)
Strong communication and teamwork skills
Nice to Have
Experience with various corporate data formats
Knowledge of Kafka or RabbitMQ
Proficiency in Java/Spring (Boot, MVC/REST, Security, Data)
AWS (EC2, S3, Lambda) experience
API design for data services
Frontend experience (modern JS + Vue.js or similar)
CI/CD (e.g., Jenkins), automated testing (JUnit)
Docker, Kubernetes, and other container technologies
DevOps tools (Packer, Terraform, Ansible)
Qualifications
12+ years of relevant experience and a B.S. in a technical discipline
(Four additional years of experience may substitute for a degree)