Site Reliability Engineer - CTJ - POLY | Microsoft Careers

Microsoft

Actively hiring
US Posted 97 days ago $119,800–$234,700 / year

At a glance

AI generated

TL;DR

As a Senior Site Reliability Engineer (SRE) at Microsoft’s Azure Silver and Sovereign Team within the Azure Data Transfer division, you will apply SRE principles to ensure high availability and performance of mission-critical distributed systems. Your daily tasks include defining service health metrics through SLIs/SLOs, building automation to reduce operational overhead, and enhancing observability across logs, metrics, and traces. You will work closely with cross-functional teams to drive reliability improvements, participate in on-call rotations for incident response, and mentor engineers through design reviews and knowledge sharing. The role requires expertise in cloud reliability practices, Linux systems administration (Rocky 9, Red Hat, Mariner), and automation tools like Ansible, as well as experience with CI/CD pipelines such as Azure DevOps or GitHub Actions. This position involves addressing complex challenges in highly regulated environments, ensuring strict compliance and security standards are met for both public and private sector customers.

Skills

Azure, Kubernetes, Ansible, CI/CD, GitHub Actions, Linux, Rocky 9, Red Hat, Mariner, Python, Go, Terraform, AWS, Prometheus, Grafana, Docker, SLIs/SLOs, Chaos Engineering, Infrastructure as Code, Telemetry, Observability, Metrics, Logs, Traces, Blameless Postmortems

What you'll do

  • Define and improve service health via SLIs/SLOs and error budgets.
  • Implement reliable changes using SRE practices like progressive delivery and safe rollouts.
  • Build observability by expanding metrics, logs, traces, and dashboards to detect incidents.
  • Participate in on-call rotations for complex incidents, leading response and mitigation efforts.
  • Apply secure-by-design principles to operations, monitoring, and automation.

What we're looking for

  • Master's degree in CS/IT or equivalent experience.
  • At least 2 years of technical experience in software engineering, network engineering, or systems administration.
  • Expertise in large-scale cloud or distributed systems.
  • Experience with automation tools like Ansible and CI/CD pipelines.
  • Strong problem-solving skills for complex production environments.
  • Deep expertise in Linux system management and security hardening.

Market check

Salary context

This $119,800–$234,700 range sits above 69% of similar postings on FindRole.

Peer median band

$119,800–$199,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$140,043–$185,937

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Microsoft

Microsoft Corporation is a global technology leader producing software, hardware, and cloud services including Windows, Office 365, Azure cloud platform, Xbox gaming, and Surface devices. Industry: Software & Cloud Computing

Microsoft currently has 451 open roles on FindRole.

Listed pay typically runs $119,800–$234,700 across 417 roles with salary data.

More like this

Similar roles

Site Reliability Engineer - CTJ - POLY

Microsoft

Redmond, WA, US 66 days ago $100,600–$199,000
Azure, Kubernetes, Docker, CI/CD, Python, Go, Terraform, Prometheus, Grafana, AI, ML, Telemetry, SDP, PostgreSQL, SQL, Git, Linux, Windows Server, DevOps, SRE, Cloud Security, Capacity Planning

Senior Site Reliability Engineer | Microsoft Careers

Microsoft

US 106 days ago $119,800–$234,700
Azure, Kubernetes, Terraform, Python, Go, Docker, CI/CD, Prometheus, Grafana, GitOps, Infrastructure-as-Code, DNS, CDN, TLS, Certificate Lifecycle Management, Network Security, Cloud Security Controls, Identity-Driven Security Policies, Microservices Patterns, API Gateways, Global Routing Architectures, Automation Frameworks, Scripting, Distributed Tracing, Metric Analysis, Log Analysis