Posted 17 months ago
Senior Network Engineer
About the Role
As a Senior Network Engineer at Together, you are responsible for designing, implementing, and maintaining our network infrastructure to ensure seamless connectivity and optimal performance for all user-facing services and production systems. As both a strategic planner and a hands-on engineer, you apply sound networking principles, operational discipline, and advanced automation to our network environments.
You specialize in networking systems—including routing, switching, network security, and protocols—implementing best practices for availability, reliability, and scalability. You have a keen interest in network design, optimization, and emerging technologies in HPC-based data center networking.
Outstanding problem-solving abilities and a comprehensive understanding of fundamental network theory are also critical to your success.
Requirements
- 8+ years of professional experience building, managing, and supporting large-scale hybrid data center networks (excluding enterprise networks).
- High level of proficiency with TCP/IP networking architecture and technologies such as BGP, OSPF, VXLAN, EVPN, and QoS.
- Experience developing network automation pipelines using Python, Ansible, or other languages/tools utilized in infrastructure automation.
- Proficient in using tools such as Wireshark, tcpdump, nmap, MTR, and curl to identify connectivity issues, latency problems, and network bottlenecks.
- Experience designing and supporting multi-tenant networks.
- Hands-on experience deploying and supporting network devices from Cisco, Arista, Juniper, and Mellanox.
- Experience working with cloud networks such as AWS, GCP, and Azure.
- Solid experience working and troubleshooting in a Linux environment.
Responsibilities
- Design, deploy, manage, and maintain global multi-vendor, multi-protocol high-performance compute networks.
- Analyze data to diagnose network issues and identify their root causes, minimizing downtime.
- Evaluate and recommend network technologies, hardware, and software solutions.
- Participate in design reviews to ensure the proposed network architecture aligns with business needs and is optimized for performance, scalability, and reliability.
- Manage relationships with external vendors and partners to test and verify hardware and software selections.
- Develop and deploy systems and tools to keep all networks running reliably and efficiently.
- Establish and implement industry best practices and contribute to the design of new scalable network solutions.
- Ensure compliance with IT governance standards and best practices.
- Lead projects that address complex technical challenges, contribute directly to roadmaps, and partner with the best engineers in the industry to develop world-class solutions.
Preferred
- Knowledge of RoCE and InfiniBand protocols a plus
- Experience with Docker, Kubernetes, or Slurm a plus
- Understanding of AI training workloads and the demands they exert on networks a plus
About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.
Compensation
We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $190,000 - $250,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.
Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
Please see our privacy policy at https://www.together.ai/privacy