Sr Principal Site Reliability Engineer

The Walt Disney Company

Actively hiring
Remote (USA - CA - Market St, US) · Posted 51 days ago · $250,500–$335,900 / year

At a glance

AI generated

TL;DR

As a Sr Principal Site Reliability Engineer at Disney Entertainment & ESPN Technology in San Francisco, you will lead the SRE team within Media Engineering, focusing on ensuring high availability and incident-free uptime across the entire platform. Your daily responsibilities include developing robust monitoring and alerting practices, driving redundancy strategies, and partnering with Infrastructure, Operations, Product, and Development teams to implement best practices and conduct audits. You will also be responsible for automating release processes and improving system performance and operational efficiency. Ideal candidates have extensive experience in managing complex globally connected teams, working with large-scale distributed platforms, and a passion for developing high-performing engineering teams. Familiarity with media streaming technologies, CDN integrations, and both backend services and client development is preferred. This role demands expertise in data-driven decision-making and the ability to foster strong cross-functional relationships across various regions.

Skills

Kubernetes, AWS, CI/CD, Docker, Prometheus, Grafana, Python, PostgreSQL, Terraform, Ansible, GitOps, CDN integration, media streaming technologies, content delivery strategies

What you'll do

  • Ensure platform stability and uptime across processing platforms, content supply chains, CDN delivery, and playback.
  • Develop comprehensive instrumentation and alerting practices for all critical data flows.
  • Drive redundancy and resiliency strategies across thousands of servers in datacenter and cloud environments.
  • Lead Media Engineering’s Incident Response process to prevent service incidents through proactive actions.
  • Partner with Infrastructure, Operations, Product, and Development teams to implement best practices and conduct audits.
  • Implement an automation strategy for rapid, safe releases, tighter content SLAs, and operational efficiency improvements.

What we're looking for

  • Minimum 12 years of engineering leadership experience with direct and indirect team management.
  • Bachelor’s degree in Engineering or related field, or equivalent work experience.
  • Experience working across complex globally connected teams with diverse stakeholders.
  • Expertise in large-scale globally distributed platforms including content preparation, distribution, playback, operations, and infrastructure.
  • Proven ability to develop strategies for improving stability, system performance, team capability, and operational efficiency.
  • Strong track record of developing cross-functional and cross-regional relationships.
  • Passion for continuous learning and for developing high-performing teams.

Market check

Salary context

This $250,500–$335,900 range sits above 95% of similar postings on FindRole.

Peer median band

$124,127–$199,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$135,125–$210,375

Middle half of comparable postings.

Based on 239 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About The Walt Disney Company

The Walt Disney Company is a diversified global entertainment and media enterprise operating in segments including Disney Parks, Experiences and Products; Entertainment (ABC, Hulu, Disney+); and ESPN.

Industry: Entertainment & Media

The Walt Disney Company currently has 121 open roles on FindRole.

Listed pay typically runs $141,900–$190,300 across 113 roles with salary data.

More like this

Similar roles

Principal Site Reliability Engineer

The Walt Disney Company

Remote (USA - FL - Disney's Hollywood Studios - Feature Animation Building, US) · 48 days ago
AWS, Azure, GCP, Terraform, CloudFormation, Ansible, Chef, CI/CD, Docker, Kubernetes, Prometheus, Grafana, Python, Linux, Windows, AI, LLM, PCI, DevOps, SRE, SLI, SLO, SLA

Principal Site Reliability Engineer

The Walt Disney Company

Remote (USA - FL - Disney's Hollywood Studios - Feature Animation Building, US) · 41 days ago
Akamai, Kona Site Defender, WAF, Bot Manager, DevOps, CI/CD, Python, Go, Docker, Terraform, AWS, Azure, Google Cloud, PostgreSQL, MongoDB, Redis, Prometheus, Grafana, Kubernetes, Ansible, Jenkins, GitLab, GitHub

Site Reliability Engineer

The Walt Disney Company

Remote (USA - FL - Disney's Hollywood Studios - Feature Animation Building, US) · 49 days ago
Akamai, Splunk, AppDynamics, GitHub, Ansible, Chef, AWS, Azure, GCP, CI/CD, RESTful APIs, Microservices, Cloud computing, Python, JavaScript, Kubernetes, Terraform, Prometheus, Grafana

Site Reliability Engineer

Equifax

USA - Missouri - St. Louis - Lackland, US · 43 days ago
AWS, GCP, Terraform, Jenkins, Python, Bash, Docker, Kubernetes, CI/CD, Prometheus, PostgreSQL, Linux, Windows, Ansible, Chef

Site Reliability Engineer

Shopify

US · 27 days ago
Kubernetes, Docker, CI/CD, Python, Go, PostgreSQL, AWS, GCP, Prometheus, Grafana, Terraform, GitOps