Technical Product Manager - AI Cloud Infrastructure
About this role
Role Summary:
We are seeking a Technical Product Manager – AI Cloud Infrastructure to join our fast-scaling team. In this role, you will embed with engineering to act as the "First Customer," owning the continuous validation, reliability strategy, and technical documentation for our bare-metal, VM, Kubernetes, and ML infrastructure. By treating testability as a core feature and shadowing real-world workflows, you will ensure our compute platform handles the demands of advanced AI training and engineering workloads. This is an opportunity to join a mission-led AI business that is redefining infrastructure, intelligence, and impact for enterprise customers.
Key Responsibilities:
- Execute integration testing in staging environments, work closely with platform engineers to build repeatable test frameworks, and shadow internal and external AI infrastructure engineers to translate their real-world usage patterns into automated in-house test cases.
- Establish strict quality gates, performance SLOs, and scheduling benchmarks that our compute and orchestration services must pass before production deployment.
- Review, refine, and author technical guides, API documentation, and CLI guides, using them as the blueprint to test the platform exactly as an external engineer would.
- Partner with software and platform engineers to design robust validation suites, anticipating complex edge cases and structural failure modes across bare-metal provisioning and Kubernetes cluster lifecycles.
Essential Experience:
- Technical familiarity with bare-metal infrastructure (e.g., PXE booting, IPMI/Redfish), virtualization layers (e.g., KVM), and container orchestration (Kubernetes or similar).
- Track record designing comprehensive test strategies, validation frameworks, and acceptance criteria for highly technical cloud-native, API, or infrastructure-as-a-service (IaaS) products.
- Ability to analyse infrastructure services, CLIs, and APIs from a developer’s perspective to identify friction points, usability gaps, and reliability risks.
- Working knowledge of modern CI/CD pipelines, automated testing, and automation tooling (e.g., GitLab CI, GitHub Actions, Terraform, Ansible) to help engineering shape automated quality gates.
- Proven experience in a highly technical role embedded directly within a core infrastructure or platform engineering team.
One or more would be an advantage:
- Direct exposure to high-performance computing (HPC) setups, large-scale cluster scheduling (e.g., Slurm), or infrastructure optimized for heavy AI/ML training workloads.
- Experience using cloud observability, telemetry, and monitoring tools (e.g., Prometheus, Grafana, Datadog) to track and improve system reliability metrics.
- Experience writing or structuring technical documentation, API reference guides, and developer tutorials from scratch.
Why Join Era4:
You’ll be joining a mission-driven start-up building critical national infrastructure, where operational excellence directly enables growth. This role offers high visibility with leadership, real autonomy, and the chance to shape how a next-generation company operates at scale.
Diversity & Inclusion:
Era4 is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
Era4 develops, owns and operates AI infrastructure across the UK, powered by renewable energy. By converting legacy industrial and energy sites into modern data-centre facilities, Era4 combines brownfield regeneration with cleaner, efficient, scalable compute capacity for healthcare, research, finance, enterprise, and public-sector organisations.