Senior Site Reliability Engineer, CORE (Member Experience / Resilience Operations)

Netflix

Remote

Quick summary

Work type: Remote
Location: Remote
Salary: $388,000–$558,000 / yr
Posted: 63 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

Pay chart (scale $87k to $608k): similar roles about $173k; this role about $473k.

This role pays more than 99% of similar roles. Most similar roles pay $142,400–$203,200 (the shaded band in the pay chart). At the midpoint, this role pays about $473k, versus about $173k for comparable roles.

Based on 240 similar postings.

Employer

About Netflix

Netflix is the world's leading streaming entertainment service, offering a vast library of TV series, films, documentaries, and original content to subscribers in over 190 countries.

Industry: Streaming Entertainment & Media

Netflix currently has 117 open roles on FindRole.

Listed pay typically runs $388,000–$619,000 across 113 roles with salary data.

At a glance

TL;DR · Senior Site Reliability Engineer, CORE (Member Experience / Resilience Operations)

The Critical Operations and Reliability Engineering (CORE) team at Netflix seeks a Senior Site Reliability Engineer to enhance system reliability and observability for its global streaming service. This role involves designing fault-tolerant infrastructure, embedding reliability practices in the software development lifecycle, and defining key performance metrics like Service Level Objectives. The ideal candidate will automate deployment processes, manage on-call responsibilities, and lead incident response efforts while fostering a culture of continuous improvement across teams. Strong coding skills in Python, Go, or Java are essential, along with hands-on experience with cloud infrastructure such as AWS, Azure, or GCP. Candidates should have a deep understanding of distributed systems and the ability to balance reliability, velocity, and cost through data-driven decision-making.

Skills

AWS, Python, Go, Kubernetes, Terraform, CI/CD, Prometheus, Grafana, Docker, Service Level Objectives (SLOs), Incident Management, Observability, Performance Tuning, Java

What you'll do

Design and evolve resilient infrastructure for member-facing services, ensuring scalability and fault tolerance.
Embed reliability and observability into the software development lifecycle across multiple teams.
Define and measure Service Level Objectives (SLOs) to guide capacity planning and operational priorities.
Build automated processes for deployment, monitoring, and incident response to ensure reliable operations.
Lead incident response efforts, focusing on learning and systemic fixes to avoid repeat issues.
Identify and reduce sources of instability in distributed systems through production analysis.

What we're looking for

5+ years of experience in SRE or Production Engineering roles for business-critical services.
Proficient in Python, Go, Java, or similar languages for automation and solution development.
Expertise in large-scale cloud environments on AWS, Azure, or GCP, with abstracted compute systems.
Deep understanding of distributed system failures, performance bottlenecks, and resilience design.
Proven ability to identify and mitigate reliability risks through metrics and architecture reviews.
Strong observability skills using metrics, logs, and traces for debugging complex systems.
Experience in incident management, response coordination, and durable improvement follow-through.

Similar roles

Site Reliability Engineer, Senior

Booz Allen Hamilton

Aurora, CO 63 days ago $86,900–$198,000

Linux HP

Site Reliability Engineer, Senior

Booz Allen Hamilton

Chantilly, VA +1 45 days ago $86,800–$198,000

Linux CI/CD

Site Reliability Engineer, Senior

Booz Allen Hamilton

Aurora, CO 77 days ago $86,900–$198,000

Linux CI/CD

Site Reliability Engineer, Senior

Booz Allen Hamilton

Aurora, CO 31 days ago $86,900–$198,000

Linux HP

Senior Site Reliability Engineer

Adobe

San Jose 65 days ago $208,300–$301,600

AWS, Kubernetes, Terraform, Python, Go, CI/CD, Infrastructure as Code, Docker, PostgreSQL, Security hardening, AI-enabled platforms, Cross-team leadership, Developer experience optimization

Senior Site Reliability Engineer

Carta

San Francisco, California +2 69 days ago $181,688–$213,750

AWS, Terraform, Python, Kubernetes, Docker, Postgres, Prometheus, Grafana, CI/CD, gRPC, Ansible, ELK Stack, Datadog, GraphQL

Hybrid
