
Posted 5 months ago
Data Engineer (Python)
AI Summary
Data Engineer responsible for prototyping ingestion pipelines (batch and streaming) using NiFi, Kafka, and CDC, building lakehouse datasets, and delivering adoption-ready artifacts and runbooks.
About this role
Company
Orcrist builds the Orcrist Intelligence Platform (OIP), a Kubernetes-based data intelligence system delivered as SaaS or self-hosted/on-prem (including air-gapped deployments). We run streaming and batch pipelines that power search, ML enrichment, and investigative workflows for mission-critical customers.
Role
Rapidly validate new data initiatives end-to-end without sacrificing adoptability. On the Innovation team, you’ll prototype representative connectors and pipelines (batch + streaming), generate credible performance/operability readouts, and ship a handoff package that Foundation or a delivery team can productize.
What you'll do
- Prototype ingestion and connector patterns (batch + streaming) using NiFi, Kafka, Kafka Connect/Streams, and CDC approaches.
- Design “prototype-grade but adoptable” schemas and data models with clear semantics and evolution discipline.
- Build incremental lakehouse datasets (Hudi/Iceberg/Delta patterns) and produce queryable outputs for realistic latency/throughput evaluation.
- Bake in a data quality and provenance mindset early (checks, metadata hooks, operability basics).
- Containerize and deploy prototypes on Kubernetes; deliver minimal runbooks/configs that make adoption straightforward.
- Produce adoption artifacts: schemas, reference implementations, technical design notes, and an integration backlog.
About You
- 3+ years of data engineering experience (level-dependent) with real pipeline delivery beyond ad-hoc scripts.
- Strong Python + SQL; comfortable building transformations, validation tooling, and pipeline glue code.
- Practical streaming/CDC fundamentals (ordering, duplication, replay, idempotency) and Kafka ecosystem experience.
- Familiar with lakehouse/storage and query layers (e.g., Hudi/Iceberg/Delta, Trino/Hive/Postgres) and how to make datasets usable.
- Comfortable working in Kubernetes/container environments and documenting decisions clearly.
- Eligible to work in Germany; EU/NATO citizenship preferred and export-control screening applies.
Nice-to-haves
- Great Expectations or similar data quality tooling; metadata/lineage platforms (OpenMetadata/DataHub/Atlas).
- Experience shipping in on-prem or air-gapped environments; governance/policy awareness for regulated customers.
- German language (B1+) and/or experience with OSINT/GEOINT/multi-INT data shapes.
What We Offer
- Modern data stack with real constraints: Kafka + NiFi + lakehouse + distributed SQL + Kubernetes.
- Remote-first in Germany with regular Berlin prototyping sprints, 30 days of vacation, and an equipment & learning budget.
- High leverage: your prototypes become blueprints multiple teams reuse and productize.