Datacentre Operations Engineering
About this role
About Us
We’re a fast-growing GPU-as-a-Service provider, delivering scalable, high-performance compute infrastructure purpose-built for AI and HPC workloads. Operating across global data centres, we run mission-critical environments where uptime, throughput, and ultra-low latency are non-negotiable.
Role Overview
We are seeking a deeply technical, hardware-passionate Datacentre Operations Engineer to execute on-the-ground operations for our Paris-Saclay deployment, Radiant's premier, cutting-edge AI infrastructure site in Europe. This role focuses on delivering precise, repeatable physical practices (including advanced smart-hands support, complex cabling, and hands-on operation of advanced liquid cooling and ultra-high-density compute systems) to guarantee world-class SLAs on next-generation hardware architecture.
Working closely with Infrastructure (HPC) SRE, Network Engineering, and Datacentre Strategy teams, you will uphold uncompromising standards on the data centre floor. You will live and breathe the hardware, maintaining elite facility reliability through hands-on deployment, proactive maintenance, rapid incident response, and structured break/fix execution across advanced liquid cooling systems, busbar-based high-density power distribution, and next-generation GPU compute platforms. The role centres on technical execution and optimised output.
You will turn global engineering standards into flawless, repeatable daily routines, continually honing on-the-ground practices to keep our most advanced hardware running at peak performance. Experience with NVIDIA NVL72-class or busbar/high-density compute is strongly valued; candidates who can demonstrate a strong aptitude and clear willingness to train to operational proficiency on these platforms are equally welcome.
As Radiant expands its EMEA footprint, your relentless drive for hardware perfection and proven field expertise with high-density environments will serve as the operational blueprint to scale execution models efficiently across the region.
What’s in it for you?
Join a team operating some of the world’s most advanced high-performance computing infrastructure. As a Datacentre Operations Engineer, you’ll work hands-on with cutting-edge GPU and CPU platforms — including the latest NVIDIA architectures — powering dense, large-scale compute environments used for AI, machine learning, and next-generation workloads.
This is an opportunity to build expertise at the forefront of modern infrastructure, where reliability, scale, and performance matter every day. You’ll collaborate with experienced engineers across a globally distributed organisation that values openness, inclusion, technical excellence, and continuous learning.
We move quickly, solve meaningful challenges, and give people the space to make an impact. If you thrive in fast-paced environments, enjoy working with advanced technology, and want to help shape the future of high-performance compute, you’ll find both challenge and opportunity here.
You can also expect:
Exposure to industry-leading GPU and AI infrastructure
Opportunities to grow alongside a rapidly scaling global business
A collaborative, inclusive, and supportive engineering culture
Real ownership and the ability to influence operational excellence
Work that sits at the intersection of people, performance, and technology
A modern, flexible, globally connected workplace with ambitious goals
Key Responsibilities
Hardware Operations & Break/Fix
Quickly diagnose and resolve hardware and network issues to maximise uptime; execute structured fault isolation methodologies to drive rapid resolution
Respond to critical hardware alerts via our monitoring and observability platform; contribute to ongoing service improvement by enhancing monitoring capability and alert quality
Deploy and maintain HPC and AI hardware for uninterrupted operations, including hardware troubleshooting, firmware updates, and component replacement
Execute break/fix procedures for advanced hardware platforms, including GPU module exchange, component-level fault isolation, and firmware-level diagnostics
Execute or support break/fix operations on ultra-high-density compute systems, including NVIDIA NVL72-class (GB200 NVLink rack-scale) or equivalent platforms, covering coolant loop isolation, GPU module swap, and busbar connection/disconnection, under the direction of the Lead where qualification is in progress
Liquid Cooling Operations
Operate, monitor, and maintain advanced Direct Liquid Cooling (DLC) systems, including Cooling Distribution Units (CDUs), rear-door heat exchangers, in-row cooling, and associated coolant infrastructure
Execute routine and corrective maintenance on liquid cooling circuits: topping up coolant, monitoring flow rates and temperatures, identifying and reporting leaks, and performing scheduled inspections
Follow and contribute to SOPs for safe working on liquid-cooled compute platforms, including isolation and lock-out/tag-out procedures
Monitor thermal performance and raise anomalies before they escalate into incidents
Capacity Management
Contribute to site-level capacity management operations, maintaining accurate records of power, space, and cooling utilisation
Support capacity planning activities by providing accurate as-built data and flagging infrastructure changes to the Lead and relevant teams
Manage on-the-ground assets from point of purchase and delivery through lifecycle management and disposal, owning asset management within Radiant's CMDB system
Infrastructure & Facilities
Handle RMAs and support requests within Radiant's Service Level Objectives (SLOs) to meet customer contract SLAs
Contribute to ongoing maintenance, fostering compliance and leveraging strong vendor partnerships
Operate cooling, power distribution (including busbar and PDU infrastructure), and other critical data centre technologies to maintain high operational standards
Develop and maintain datacentre/hardware management SOPs, ensuring continual alignment with Radiant's governance and compliance requirements
Service & Operational Excellence
Apply ITSM frameworks: Incident, Major Incident, Change Management, and service improvement
Operate and support services 24x7x365 for production environments, including on-call rotation
Contribute to incident postmortems and root cause analysis, document learnings, and automate remediations
Communicate technical decisions clearly to stakeholders and customers
Champion a culture of: do, document, automate
Be willing to cross-train and upskill in Infrastructure/Platform SRE practices
Be willing to travel across EMEA to support future datacentre onboarding and train in new technologies
Essential Skills & Experience
Degree in Computer Science/Electrical Engineering, or 5+ years of directly relevant industry experience in data centre operations
3+ years of experience in data centre operations, HPC, or related roles
Passion for hardware and upholding the highest operational standards on the ground
Strong communication skills in both French and English
Proven hands-on experience with HPC NVIDIA GPU platforms or equivalent high-density compute systems, high-performance storage, and networking
Practical knowledge of Direct Liquid Cooling (DLC) systems—CDU operation, coolant monitoring, leak detection, and associated maintenance—or strong related cooling infrastructure experience with clear willingness to train on DLC
Experience with, or demonstrable willingness and aptitude to train on, ultra-high-density compute platforms such as NVIDIA NVL72, busbar-based power distribution, or equivalent systems operating at >30kW/rack
Familiarity with structured break/fix practices for complex hardware platforms, including coolant loop isolation, module-level component exchange, and firmware fault isolation
Expertise in hardware installation, network configuration, and low-level system maintenance, including firmware management
Knowledge of data centre environment technologies, including cooling and high-density power distribution
Understanding of capacity management principles: power, space, and cooling tracking
Strong understanding of hardware and spares management; ability to handle RMAs within defined SLOs
Understanding of HPC and AI workloads at a high level
Strong problem-solving abilities and resilience in a fast-paced environment
Strong grasp of ITSM and service operation best practices
Excellent communication skills and ability to collaborate with cross-functional, internationally dispersed teams
Comfortable interfacing with internal stakeholders and external customers
Bonus: Vendor-endorsed qualifications from NVIDIA, HPE, or equivalent OEMs for high-density AI compute or liquid cooling systems
Preferred Qualifications
Knowledge of large-scale private cloud deployments and capacity planning
Qualifications in HVAC management and deployments
Certifications in relevant areas (hardware, networking)
ITIL Foundation level qualification or equivalent experience
