Microsoft Careers

Microsoft

Hybrid Actively hiring
US Posted 69 days ago

At a glance

AI generated

TL;DR

Join the Silver Edge team as a Site Reliability Engineer working on Azure Local services across three sovereign clouds. You will ensure service reliability across deployment, availability, security, performance, and customer satisfaction while solving complex technical issues in dynamic environments. You will use cloud technologies, AI/ML, and telemetry data to suggest improvements, and engage with product engineering teams through code/design reviews and incident response. You will independently write automation scripts, develop alerts, and optimize resource management using advanced analytics, and you will monitor operations metrics and contribute to monitoring tools that improve product performance at scale. The role demands expertise in distributed systems, cloud technology layers, and industry trends, and requires a Master's or Bachelor's degree plus relevant experience in software engineering, network engineering, or systems administration.

Skills

Azure, Kubernetes, Docker, CI/CD, Python, Go, Terraform, Prometheus, Grafana, AI, ML, Telemetry, SDP, PostgreSQL, SQL, Git, Linux, Windows Server, DevOps, SRE, Cloud Security, Capacity Planning

What you'll do

  • Maintain Azure service reliability, including deployment, availability, security, and performance, for sovereign environments.
  • Engage with product engineering teams through code/design reviews and incident responses to suggest improvements.
  • Independently write scripts to automate operations processes across components of products operating at scale.
  • Develop alerts and instrumentation to monitor product capacity, security risk, and resource demands using telemetry data.
  • Troubleshoot problems affecting availability, security, reliability, performance, and efficiency, proposing solutions to prevent recurring issues.

What we're looking for

  • Master's degree in Computer Science or 2+ years of technical experience in software engineering.
  • Bachelor's degree in Computer Science or equivalent work experience.
  • Expertise in distributed systems design and cloud technology layers.
  • Experience with AI/ML algorithms for telemetry analysis and automation.
  • Ability to troubleshoot complex issues independently using existing tools and models.
  • Strong knowledge of security best practices and capacity planning.

Market check

Salary context

This $100,600–$199,000 range sits above 43% of similar postings on FindRole.

Peer median band

$119,800–$198,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$128,825–$198,859

Middle half of comparable postings.

Based on 239 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Microsoft

Microsoft Corporation is a global technology leader producing software, hardware, and cloud services including Windows, Office 365, Azure cloud platform, Xbox gaming, and Surface devices. Industry: Software & Cloud Computing

Microsoft currently has 534 open roles on FindRole.

Listed pay typically runs $119,800–$234,700 across 488 roles with salary data.

More like this

Similar roles

Site Reliability Engineer - CTJ - POLY | Microsoft Careers

Microsoft

US 100 days ago $119,800–$234,700
Azure, Kubernetes, Ansible, CI/CD, GitHub Actions, Linux, Rocky 9, Redhat, Mariner, Python, Go, Terraform, AWS, Prometheus, Grafana, Docker, SLIs/SLOs, Chaos Engineering, Infrastructure as Code, Telemetry, Observability, Metrics, Logs, Traces, Blameless Postmortems

Site Reliability Engineer

The Walt Disney Company

Remote (Bay Lake, FL) 53 days ago
Akamai, Splunk, AppDynamics, GitHub, Ansible, Chef, AWS, Azure, GCP, CI/CD, RESTful APIs, Microservices, Cloud computing, Python, JavaScript, Kubernetes, Terraform, Prometheus, Grafana
Remote

Site Reliability Engineer

Equifax

St. Louis, Missouri 47 days ago
AWS, GCP, Terraform, Jenkins, Python, Bash, Docker, Kubernetes, CI/CD, Prometheus, PostgreSQL, Linux, Windows, Ansible, Chef
Hybrid

Site Reliability Engineer

Shopify

Europe 31 days ago
Kubernetes, Docker, CI/CD, Python, Go, PostgreSQL, AWS, GCP, Prometheus, Grafana, Terraform, GitOps