Member of Technical Staff, Training (Paris, London)
AI Summary
Senior engineering role focusing on profiling, bottleneck elimination, and building distributed training systems for large-scale foundation models, from data pipelines to GPU kernels.
About this role
What You’ll Do
Drive down wall-clock time to convergence by profiling and eliminating bottlenecks across the foundation model training stack, from data pipelines to GPU kernels
Design, build, and optimize distributed training systems (PyTorch) for multi-node GPU clusters, ensuring scalability, robustness, and high utilization
Implement efficient low-level code (CUDA, cuDNN, Triton, custom kernels) and integrate it seamlessly into high-level training frameworks
Optimize workloads for hardware efficiency: CPU/GPU compute balance, memory management, data throughput, and networking
Develop monitoring and debugging tools for large-scale runs, enabling rapid diagnosis of performance regressions and failures
What You’ll Bring
Deep experience (8+ years) in distributed systems, ML infrastructure, or high-performance computing
Production-grade expertise in Python
Low-level performance mastery: CUDA/cuDNN/Triton, CPU–GPU interactions, data movement, and kernel optimization
Scaling at the frontier: experience with PyTorch and training jobs using data, context, pipeline, and model parallelism
System-level mindset with a track record of tuning hardware–software interactions for maximum utilization
