Müller’s Solutions

Posted 4 months ago

Open

Dell AI Infrastructure & MLOps Engineer (6 Months Only)

Dubai · On-site · Full-time

AI Summary

Operates, configures, and maintains AI infrastructure and MLOps platforms (Kubernetes-based workloads) for a 6-month project, serving as a technical advisor and hands-on engineer to deploy a strategic AI environment.

About this role

As an AI Infrastructure & MLOps Engineer at Müller’s Solutions on a 6-month contract, you will work in a role that is primarily operations-focused (90%), with hands-on involvement in the **implementation, configuration, and setup** of AI infrastructure and MLOps workflows.

You will play a key role in managing, operating, and guiding the deployment of a **strategic AI environment**, working closely with the customer as a technical advisor and hands-on engineer.

What are the role's responsibilities?

  • Operate and maintain AI infrastructure and MLOps platforms in a production environment.
  • Monitor, manage, and troubleshoot Kubernetes-based AI workloads.
  • Plan and execute acceptance testing for AI infrastructure and platforms.
  • Ensure stability, performance, and availability of AI systems.
  • Support day-to-day operational tasks across compute, storage, and networking layers.
  • Install and configure NVIDIA Enterprise AI Stack (NVAI).
  • Configure and manage MLOps platforms such as **Kubeflow and MLflow**.
  • Assist in setting up **end-to-end AI workflows**, including data pipelines.
  • Support the initial implementation phase of the AI environment.
  • Act as a technical guide and advisor to the customer during the early stages of their AI adoption.

Requirements

What do you need to fit this role?

Technical Requirements

AI / MLOps Stack

  • Proficiency with the NVIDIA Enterprise AI Stack
  • Familiarity with Ubuntu Linux
  • Experience with Kubernetes
  • Knowledge of Kubeflow / MLflow
  • Experience with QFLOW (an open-source AI data pipeline management tool)

Programming & Automation

  • 4–6 years of practical experience in:

    • Python
    • Jupyter Notebook / JupyterLab
  • Competence in writing, testing, and maintaining operational scripts and AI workflows.

Infrastructure Experience

Practical experience with enterprise infrastructure, encompassing:

  • Dell PowerScale (5 nodes)
  • XE Server (1 node)
  • Dell R570 Servers (5 nodes)
  • Dell Network Switches (2 switches)
  • GPU-based AI servers (in a small-scale environment)

Environment Overview

  • Initial implementation of AI
  • Compact configuration:

    • 1 GPU server
    • 1 PowerScale
    • 5 control plane servers
  • Opportunity to shape best practices from the ground up

Nice to have for this role:

  • Familiarity with data frameworks such as Apache Spark or Hadoop for data processing.
  • Understanding of ML model monitoring and logging practices to ensure system reliability.
  • Experience with security best practices in AI systems.

Skills

AI Security Best Practices, AI Workflow Automation, Apache Spark (nice to have), Data Pipelines, Dell Network Switches, Dell PowerScale, Dell R570 Servers, GPU-based AI Servers, Hadoop (nice to have), JupyterLab, Jupyter Notebook, Kubeflow, Kubernetes, Logging Practices, MLflow, ML Model Monitoring, NVIDIA Enterprise AI Stack, Operational Scripting, Python, QFLOW, Ubuntu Linux, XE Server
