Sr./Staff ML Infrastructure Engineer, Compute (TPU Scheduling) - Foundation Model

Apple Inc

Quick summary

Work type: On-site
Location: Seattle, WA
Salary: $171,600–$302,200 / yr
Posted: 27 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

[Pay chart: similar roles at about $217k, this role at about $237k, on a scale from $146k to $319k, with a shaded band marking where most similar roles pay.]

This role pays more than 66% of similar roles. Most pay $183,487–$249,750 — the shaded band above. At the midpoint, this role pays about $237k versus about $217k for comparable roles.

Based on 240 similar postings.

Employer

About Apple Inc

Apple Inc. is a multinational technology company known for designing and manufacturing consumer electronics, software, and online services, including the iPhone, Mac, iPad, and App Store. Industry: Consumer Electronics & Software

Apple Inc currently has 638 open roles on FindRole.

Listed pay typically runs $171,600–$272,100 across 505 roles with salary data.

At a glance

TL;DR · Sr./Staff ML Infrastructure Engineer, Compute (TPU Scheduling) - Foundation Model

As a Senior/Staff ML Infrastructure Engineer on the Foundation Model Compute Infrastructure team, you will design and develop scheduling and orchestration systems for TPU-based workloads across multi-region clusters. Your day-to-day responsibilities include building topology-aware schedulers to improve utilization and reliability, developing orchestration systems for distributed ML workloads on Kubernetes, and automating provisioning and resource management workflows. You will collaborate with foundation model teams to support advanced frameworks like Pathways and JAX, and mentor engineers while influencing architectural direction across the AI compute platform. The role requires strong programming skills in Python, Go, or C++, experience with Kubernetes and large-scale cluster management systems, and expertise in distributed systems, scalability, reliability, and performance engineering. Familiarity with TPU infrastructure and frameworks such as JAX, TensorFlow, and PyTorch is preferred.

Skills

Python, Kubernetes, Go, TPU, GPU, JAX, PyTorch, TensorFlow, Ray, Pathways, Docker, CI/CD

What you'll do

Design and evolve scheduling systems for TPU-based workloads across multi-region clusters.
Build topology-aware schedulers to enhance utilization and reliability of TPU infrastructure.
Develop orchestration systems for distributed ML workloads on Kubernetes and accelerator hardware.
Automate provisioning, resource management, and recovery handling to improve cluster efficiency.
Mentor engineers and influence architectural direction in Apple’s AI compute platform.

What we're looking for

7+ years of experience building large-scale distributed systems or cloud infrastructure
Strong programming skills in Python, Go, C++, or similar languages
Extensive experience with compute infrastructure and workload scheduling
Expertise in distributed systems, scalability, reliability, and performance engineering
Experience with Kubernetes, container orchestration, or large-scale cluster management systems
Bachelor’s degree in Computer Science, Engineering, or related field

Similar roles

Sr./Staff ML Infrastructure Engineer, Compute (TPU Scheduling) - Foundation Model

Apple Inc

Santa Clara, CA · 27 days ago · $181,100–$318,400

Python, Kubernetes, TPU, Go, C++, Docker, JAX, PyTorch, TensorFlow, Ray, Pathways, Prometheus, Grafana, CI/CD, AWS, Azure, Google Cloud Platform

Sr. / Staff ML Engineer, FM Training Integration - ML Compute

Apple Inc

Santa Clara, CA · 23 days ago · $181,100–$318,400

Python, PyTorch, JAX, Docker, Kubernetes, GPU, TPU, CI/CD, NVIDIA Nsight, PyTorch Profiler, AWS, Azure, GCP, PostgreSQL, MongoDB

AIML - Staff ML Infrastructure Engineer, ML Platform & Technology - Pre-training Infrastructure

Apple Inc

San Francisco, CA · 23 days ago · $181,100–$318,400

Python, Kubernetes, Ray, PySpark, JAX, PyTorch, TensorFlow, CUDA, NCCL, TPU, XLA, Docker, CI/CD, MLOps, Prometheus, Grafana, AWS, GPU, High-performance networking

Senior ML Infrastructure Engineer - VE Algorithms

Apple Inc

San Diego, CA · 45 days ago · $139,500–$258,100

Python, PyTorch, C++, Docker, Kubernetes, AWS, CI/CD, Prometheus, Grafana, PostgreSQL

Staff/Sr. Machine Learning Engineer, Foundation Models - AI, Search & Knowledge Platforms

Apple Inc

San Francisco, CA · 38 days ago · $181,100–$318,400

CUDA, PyTorch, TensorFlow, Golang, Python, NVIDIA TensorRT-LLM, vLLM, DeepSpeed, NVIDIA Triton Server, Triton, CI/CD

Staff ML Infrastructure Engineer (Compute)

General Motors (GM)

Remote (GM Automation - Sunnyvale, US) · 9 days ago · $197,000–$326,000

Kubernetes, Docker, Go, AWS, GCP, Azure, CI/CD, Prometheus, Grafana, Python, PostgreSQL, Terraform, GitLab, HPC, GPU, Telemetry

Remote
