Site Reliability Engineer II

The Walt Disney Company

Remote, USA · New York, NY · Posted 10 days ago · $123,000–$165,000 / year

At a glance

AI generated

TL;DR

As a Site Reliability Engineer II at Disney Entertainment and ESPN Product & Technology, you will join the Streaming SRE squad to enhance the reliability and scalability of critical backend systems. Your day-to-day responsibilities include designing and implementing automation for deployment, monitoring, and operational workflows; collaborating with software engineering teams to enforce SRE best practices; and developing tools to improve observability across metrics, logs, and distributed tracing. You will also participate in incident response and root cause analysis while maintaining Infrastructure-as-Code definitions and cloud environment configurations. The ideal candidate has hands-on experience with AWS, Python, Go, Bash, Docker, Kubernetes, and CI/CD systems like GitHub Actions or GitLab CI. Familiarity with observability stacks such as Prometheus and Grafana is preferred, along with a strong understanding of distributed system fundamentals and performance optimization in large-scale enterprise environments.

Skills

AWS, Kubernetes, Terraform, Python, Go, Docker, CI/CD, Prometheus, Grafana, Bash, Jenkins, Infrastructure-as-Code, GitOps, SLO/SLI, Service mesh, Performance testing, Message queues, AI-assisted development tools

What you'll do

Design and implement systems to enhance reliability, scalability, and performance.
Build automation for deployment, monitoring, alerting, and operational workflows.
Develop tools and dashboards to improve observability across metrics, logs, and distributed tracing.
Participate in incident response, root cause analysis, and corrective actions.
Assist in capacity planning, performance tuning, and scaling strategies for distributed systems.
Maintain Infrastructure-as-Code definitions and cloud environment configurations.

What we're looking for

3+ years of experience in Site Reliability Engineering or a related field.
Hands-on experience with cloud platforms (AWS, GCP, Azure) and with Docker/Kubernetes.
Proficiency in programming and scripting languages: Python, Go, JavaScript, Bash.
Working knowledge of Linux/Unix systems and CI/CD tools like GitHub Actions/GitLab CI.
Experience with Infrastructure-as-Code (Terraform, CloudFormation) and observability stacks.
Strong analytical skills for diagnosing complex system issues.

Market check

Salary context

This $123,000–$165,000 range sits above 42% of similar postings on FindRole.

Peer median band

$119,900–$198,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$126,125–$195,000

Middle half of comparable postings.

Based on 238 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About The Walt Disney Company

The Walt Disney Company is a diversified global entertainment and media enterprise operating in segments including Disney Parks, Experiences and Products; Entertainment (ABC, Hulu, Disney+); and ESPN.

Industry: Entertainment & Media

The Walt Disney Company currently has 118 open roles on FindRole.

Listed pay typically runs $141,900–$190,300 across 110 roles with salary data.

Similar roles

Sr Principal Site Reliability Engineer

The Walt Disney Company

Remote (USA - CA - Market St, US) 52 days ago $250,500–$335,900

Kubernetes, AWS, CI/CD, Docker, Prometheus, Grafana, Python, PostgreSQL, Terraform, Ansible, GitOps, CDN integration, media streaming technologies, content delivery strategies

Site Reliability Engineer II

CME Group

Chicago - 20 S. Wacker, US 30 days ago $93,900–$156,500

Google Cloud Platform, Kubernetes, Python, Bash, OpenTelemetry, Splunk, Prometheus, Grafana, Linux, Distributed systems, Networking (HTTP/TCP/UDP/IP), Message-oriented middleware, Agile methodologies

Site Reliability Engineer II

CME Group

New York - 300 Vesey Street, US 18 days ago $93,900–$156,500

Python, Bash, Google Cloud Platform (GCP), Kubernetes, Prometheus, Grafana, Dynatrace, New Relic, Moogsoft, BigPanda, LLMs, LangChain, LlamaIndex, PagerDuty, AIOps, OpenTelemetry, Splunk, Linux, AI, ML

Site Reliability Engineer

The Walt Disney Company

Remote (USA - FL - Disney's Hollywood Studios - Feature Animation Building, US) 50 days ago

Akamai, Splunk, AppDynamics, GitHub, Ansible, Chef, AWS, Azure, GCP, CI/CD, RESTful APIs, Microservices, Cloud computing, Python, JavaScript, Kubernetes, Terraform, Prometheus, Grafana

Site Reliability Engineer

Equifax

USA - Missouri - St. Louis - Lackland, US 44 days ago

AWS, GCP, Terraform, Jenkins, Python, Bash, Docker, Kubernetes, CI/CD, Prometheus, PostgreSQL, Linux, Windows, Ansible, Chef

Site Reliability Engineer

Shopify

US 28 days ago

Kubernetes, Docker, CI/CD, Python, Go, PostgreSQL, AWS, GCP, Prometheus, Grafana, Terraform, GitOps