Staff/Sr. Machine Learning Engineer, Foundation Models - AI, Search & Knowledge Platforms

Apple Inc

Quick summary

Work type: On-site
Location: San Francisco, CA
Salary: $181,100–$318,400 / yr
Posted: 38 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: midpoint pay for similar roles ($223k) and this role ($250k) on a $162k–$335k axis, with a shaded band marking where most similar roles pay.]

This role pays more than 78% of similar roles. Most pay $196,750–$249,750 (the shaded band in the chart). At the midpoint, this role pays about $250k, versus about $223k for comparable roles.

Based on 240 similar postings.

Employer

About Apple Inc

Apple Inc. is a multinational technology company known for designing and manufacturing consumer electronics, software, and online services, including the iPhone, Mac, iPad, and App Store.

Industry: Consumer Electronics & Software

Apple Inc currently has 638 open roles on FindRole.

Listed pay typically runs $171,600–$272,100 across 505 roles with salary data.

At a glance

TL;DR · Staff/Sr. Machine Learning Engineer, Foundation Models - AI, Search & Knowledge Platforms

Join the Foundation Model Inference Team within AI, Search & Knowledge Platforms as a Staff/Senior Machine Learning Engineer to build and optimize the inference stack for Apple's foundation models. You will collaborate with the Foundation Model Research team to optimize inference for cutting-edge model architectures and work closely with product teams to deploy production-grade solutions that serve millions of users in real time. You will also develop profiling tools and simulators to identify performance bottlenecks across hardware platforms and mentor engineers within the organization. Ideal candidates have 5+ years of experience leading complex projects, expertise in LLM inference stacks, GPU programming experience with CUDA, and proficiency in frameworks like PyTorch or TensorFlow. Experience with NVIDIA TensorRT-LLM, vLLM, DeepSpeed, and Triton Server is a plus, as is familiarity with high-throughput services at supercomputing scale.

What you'll do

  • Optimize inference for cutting-edge model architectures with the Foundation Model Research team.
  • Build production-grade solutions to launch models serving millions of customers in real time.
  • Develop tools and simulators to identify bottlenecks in inference across various hardware.
  • Mentor engineers within the organization on complex, ambiguous projects.
  • Enhance high-throughput services at supercomputing scale for efficient model deployment.

What we're looking for

  • 5+ years of experience leading and driving complex projects
  • Expertise in LLM inference stack optimization
  • Proficiency with GPU programming using CUDA
  • Experience with ML frameworks like PyTorch or TensorFlow
  • Knowledge of high-throughput services at supercomputing scale
  • BS degree in Computer Science, AI, Machine Learning, Data Science, or related field
  • Familiarity with NVIDIA TensorRT-LLM and other inference optimization tools

More like this

Similar roles

Staff Machine Learning Engineer

Apple Inc

Cupertino, CA · 43 days ago · $212,000–$318,400
Python, PyTorch, JAX, TensorFlow, Spark, Daft, Rust, Java, Go, Kubernetes, Docker, CI/CD, Parquet, Iceberg, Delta, Lance, Ray Data, NVIDIA DALI, WebDataset, Mosaic StreamingDataset, Arrow, DataHub, OpenLineage, Unity Catalog, Polars, DuckDB

Sr Staff Machine Learning Engineer, ML Platform

Apple Inc

Cupertino, CA · 9 days ago · $212,000–$386,300
Python, TensorFlow, PyTorch, Kubernetes, Docker, CI/CD, Prometheus, Grafana, PostgreSQL, AWS, Azure, MLOps, Federated Learning, Differential Privacy, LLMs, Agentic AI

Senior Staff Machine Learning Engineer, Search & Discovery

SpaceX

Remote (US) · 81 days ago · $313,000–$330,500
Machine Learning, Large Language Models, Agentic AI Systems, Recommendation Systems, Ranking Models, Embeddings, Representation Learning, Search Systems, Generative AI, CI/CD, Python, Scalability, Real-Time Systems, Experimentation, Cloud Services, Docker, Kubernetes, Terraform

Staff Machine Learning Engineer, Search Ranking

Snap Inc.

Santa Monica, CA · 2 days ago · $229,000–$343,000
Python, TensorFlow, PyTorch, Spark, Flink, Beam, Java, Scala, C++, JAX, A/B testing, learning-to-rank, LambdaMART, neural ranking models, transformer-based rankers, large-scale data processing, ML infrastructure, online experimentation, model monitoring, feature pipelines, training infrastructure, serving systems, multi-objective optimization, LLMs, foundation models, semantic search, natural language understanding, retrieval-augmented generation