MTS - Backend Engineer | Microsoft Careers
Microsoft
At a glance
AI generated
As a Site Reliability Engineer joining our infrastructure team, you will play a crucial role in maintaining the reliability and efficiency of our large-scale distributed AI infrastructure. Your day-to-day responsibilities include ensuring uptime and resiliency for AI model training and inference systems, designing monitoring and alerting systems, optimizing resource utilization across compute, GPU clusters, storage, and networking, building automation tools for deployments and incident response, leading on-call rotations to troubleshoot issues, conducting blameless postmortems, and collaborating with ML engineers to improve developer experience. You will need strong proficiency in Kubernetes, Docker, CI/CD pipelines, public cloud platforms such as Azure, AWS, and GCP, monitoring tools such as Grafana and Datadog, and programming skills in Python or Go. This role involves working on cutting-edge infrastructure that powers the future of Generative AI, impacting millions of users through reliable deployments.
Market check
This $119,800–$234,700 range sits above 64% of similar postings on FindRole.
Peer median band
$120,750–$202,200
Median floor and ceiling across peers.
Typical midpoint (25th–75th percentile)
$142,450–$195,000
Middle half of comparable postings.
Based on 240 comparable postings.
* 240 is the maximum number of comparable postings sampled.
Employer
Microsoft Corporation is a global technology leader producing software, hardware, and cloud services including Windows, Office 365, Azure cloud platform, Xbox gaming, and Surface devices.
Industry: Software & Cloud Computing
Microsoft currently has 451 open roles on FindRole.
Listed pay typically runs $119,800–$234,700 across 417 roles with salary data.