Software Engineer 5 – Model Runtime, AI Platform

Netflix

Remote

Quick summary

Work type: Remote
Location: Remote
Salary: $466,000–$750,000 / yr
Posted: 46 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

Similar roles (midpoint): $179k
This role (midpoint): $608k
Chart range: $68k to $823k, with a shaded band marking where most similar roles pay.

This role pays more than 99% of similar roles. Most pay $142,400–$214,850 — the shaded band above. At the midpoint, this role pays about $608k versus about $179k for comparable roles.

Based on 240 similar postings.

Employer

About Netflix

Netflix is the world's leading streaming entertainment service, offering a vast library of TV series, films, documentaries, and original content to subscribers in over 190 countries.

Industry: Streaming Entertainment & Media

Netflix currently has 117 open roles on FindRole.

Listed pay typically runs $388,000–$619,000 across 113 roles with salary data.

At a glance

TL;DR · Software Engineer 5 – Model Runtime, AI Platform

As a Software Engineer on Netflix's Model Runtime team, you will build machine learning infrastructure, designing systems for reinforcement learning, reward modeling, and preference optimization. You will enable next-generation GenAI workloads by creating scalable distributed training frameworks and optimizing GPU pipelines for real-time inference. Your responsibilities include scaling fault-tolerant training across hundreds of GPUs using FSDP and mixed-precision strategies, and profiling PyTorch operators to improve GPU utilization. The role requires expertise in ML systems engineering, hands-on experience with PyTorch internals, and proficiency in cloud computing, particularly AWS. Ideal candidates have a background in distributed training at scale, inference optimization techniques such as quantization, and GPU performance tuning using CUDA and Nsight. The position involves AI infrastructure challenges that directly affect Netflix's global streaming service.

What you'll do

  • Build alignment and post-training infrastructure for reinforcement learning models.
  • Enable next-generation GenAI workloads including distributed training and serving.
  • Scale distributed training systems using FSDP across hundreds of GPUs.
  • Optimize the full stack, from PyTorch operators to GPU kernels, for efficiency.
  • Evaluate emerging hardware and frameworks to keep Netflix at the efficiency frontier.

What we're looking for

  • Experience in ML systems engineering for large-scale training and inference.
  • Strong skills in systems programming across multiple stack layers, including PyTorch internals.
  • Hands-on experience with distributed training and system-model codesign at scale.
  • Comfort with ambiguity and ability to work across business and technical domains.
  • Expertise with cloud computing providers, preferably AWS.
  • Excellent written and verbal communication skills for remote environments.

More like this

Similar roles

Software Engineer 5 – Model Serving Systems, AI Platform

Netflix

Remote (USA - Remote, US) · 4 days ago · $466,000–$750,000
AWS, Triton Inference Server, TensorRT, Docker, Java, Python, Kubernetes, CI/CD, LLMs, Model Serving Infrastructure, High Availability, Performance Tuning, Deployment Management, Capacity Planning, Observability, Logging

Careers

Qualcomm

San Diego, CA · 46 days ago
Python, C++, C, TensorFlow, PyTorch, ONNX, GPU, NPU, CPU, Computer Vision, Audio, Generative AI, Linux, Windows, CI/CD

AI Software Engineer

Broadcom

Atlanta, GA +2 · 51 days ago · $108,000–$172,800
Java, Spring, GitHub, Git, GitHub Actions, CI/CD, Micrometer, OpenTelemetry, Large Language Models, LLMs, Vector Databases, LangChain4j, Embabel, Anthropic, OpenAI, Amazon Bedrock, Google GenAI, Azure OpenAI, Tanzu Platform 10, Bitnami, Spring AI

AI Software Engineer

Booz Allen Hamilton

Arlington, VA · 72 days ago · $86,800–$198,000
Python, Rust, Go, Scala, Java, RESTful APIs, CI/CD, GitLab CI, Jenkins, Agentic AI solutions, Linux, Docker, AWS, LocalStack, ESXi, Ansible, Kubernetes, SIEMs, Security+, Linux+