Buzz Solutions

Posted 1 month ago

Open

Applied Machine Learning Platform Engineer

Remote, Full-time

AI Summary

An entry-to-mid-level role focused on building and scaling ML training infrastructure, data pipelines, and tooling to support computer vision models.

About this role

About Us

Buzz is revolutionizing the analytics and maintenance of power grid infrastructure through our advanced AI solutions. Our computer vision systems analyze critical infrastructure to enhance safety, reliability, and operational efficiency across the power grid network.

Job Description

We're looking for an entry/mid-level Applied Machine Learning Platform Engineer to join our computer vision team and help improve the databases, cloud infrastructure, and tooling our team builds on. You'll build tooling and infrastructure to help scale our training and data pipelines. You'll work within a team of experienced ML engineers with the autonomy to drive your own projects and the support to keep growing.

Responsibilities

  • Design, build, and maintain scalable training infrastructure for computer vision workloads
  • Implement and manage distributed training pipelines (multi-GPU, multi-node) to support large-scale model training and hyperparameter tuning
  • Build and maintain robust data pipelines for ML development
  • Design database schemas and storage strategies for managing large training datasets, annotations, and model artifacts
  • Implement and manage feature stores, data versioning, and experiment tracking to support reliable model iteration
  • Automate existing analysis workflows
  • Maintain clear documentation for platform components, data contracts, and deployment processes
  • Communicate infrastructure decisions, tradeoffs, and system limitations clearly to ML engineers and stakeholders
  • Conduct thorough code reviews and write integration tests for ML pipelines

Qualifications & Experience

  • 2-4 years of industry experience in platform, backend, data, or MLOps engineering roles
  • Python proficiency — idiomatic code, type hints, async patterns, packaging, and performance-aware implementation
  • Strong software engineering fundamentals — testing, code review, API design, component-level system design
  • Hands-on experience building and operating distributed cloud machine learning infrastructure
  • Experience designing and maintaining scalable training infrastructure, managing ML platform reliability, and optimizing data pipelines for throughput at scale
  • Experience with database design and data systems for ML workloads — schema design, query optimization, and storage strategies for large-scale datasets
  • Strong skills in workflow orchestration and automation
  • Solid proficiency in Python and core ML tooling:
    • Python ecosystem: pytest, uv, FastAPI, Pydantic
    • Tooling: Git, Docker
    • Tracking: MLflow, Weights & Biases, or equivalent
    • Automation: GitHub Actions, CI/CD, Prefect or equivalent
    • Infrastructure: AWS, GCP, Kubernetes, Helm, Terraform or equivalent
    • Databases: PostgreSQL, DynamoDB, Bigtable

* Buzz Solutions does not provide visa sponsorship for work authorization in the United States at this time *

Skills

API Design, Async Patterns, Automation, Bigtable, CI/CD, Component-level System Design, Database Design for ML Workloads, Data Pipelines, Data Versioning, Distributed Cloud ML Infrastructure, Docker, DynamoDB, Experiment Tracking, Feature Stores, Git, GitHub Actions, Helm, Kubernetes, MLflow, Packaging, Performance-aware Implementation, PostgreSQL, Prefect, Python, Terraform, Testing, Type Hints, Weights & Biases, Workflow Orchestration
