| Microsoft Careers

Microsoft

Hybrid

Quick summary

Work type: Hybrid
Location:
Salary: $165,600–$296,400 / yr
Posted: 62 days ago
Closes: Oct 12, 2026

Market check

Salary context

Above market

How this pay compares to similar roles

Similar roles (midpoint): $188k
This role (midpoint): $231k
Chart scale: $121k to $315k

This role pays more than 92% of similar roles. Most similar roles pay $167,100–$208,800. At the midpoint, this role pays about $231k versus about $188k for comparable roles.

Based on 239 similar postings.

Employer

About Microsoft

Microsoft Corporation is a global technology leader producing software, hardware, and cloud services including Windows, Office 365, Azure cloud platform, Xbox gaming, and Surface devices. Industry: Software & Cloud Computing

Microsoft currently has 1578 open roles on FindRole.

Listed pay typically runs $119,800–$234,700 across 1406 roles with salary data.

At a glance

TL;DR

Join the Monetization Engineering team as a senior engineer specializing in GPU inference optimization for Microsoft’s AI-native surfaces such as Copilot and Search. You will accelerate large-scale deep learning inference across advertising and shopping platforms by optimizing OpenAI LLMs and next-generation architectures, and by connecting advanced GPU technologies to business applications. Key responsibilities include profiling CPU/GPU bottlenecks with tools like Nsight and TensorBoard, improving model compression techniques, and building high-throughput inference serving stacks in Azure environments. The role requires expertise in CUDA, TensorRT, and Triton, plus a deep understanding of LLM/SLM architectures, to deliver solutions for Microsoft’s monetization platforms.

What you'll do

  • Optimize large-scale deep learning inference for Microsoft’s advertising platforms.
  • Accelerate GPU inference performance across various AI-native surfaces like Copilot and Search.
  • Bridge advanced GPU technologies with critical business applications in the monetization platform.
  • Identify and resolve CPU/GPU bottlenecks using profiling tools such as Nsight and TensorBoard.
  • Build high-throughput inference serving stacks with continuous batching and KV-cache optimizations.
  • Enhance model efficiency through techniques like quantization, distillation, and low-rank methods.

What we're looking for

  • Solid experience in GPU inference optimization using CUDA, TensorRT, or custom GPU kernels.
  • Proficiency with profiling tools like Nsight and TensorBoard for identifying CPU/GPU bottlenecks.
  • Deep understanding of LLM/SLM architectures including attention mechanisms, embeddings, MoE, and decoders.
  • Experience optimizing latency-critical online services for real-time performance.
  • Familiarity with model compression techniques such as quantization, distillation, SVD, and low-rank methods.
  • Expertise in building high-throughput inference serving stacks with continuous batching and KV-cache optimizations.

More like this

Similar roles

| Microsoft Careers

Microsoft

WA 77 days ago $142,800–$274,800
CUDA NVIDIA_Triton_Inference_Server TensorRT Kafka Flink Spark_Streaming GPU CPU NUMA Docker CI/CD Prometheus Grafana PostgreSQL Python Go AWS Azure Google_Cloud_Pods Kubernetes Terraform

Principal Software Engineer | Microsoft Careers

Microsoft

US 147 days ago $142,800–$274,800
Python Java Kubernetes AWS Azure CI/CD MLOps Apache Spark Flink Docker Prometheus Grafana PostgreSQL Redis Scalability High-Availability Multi-Agent Systems Reinforcement Learning
Hybrid

Principal Software Engineer | Microsoft Careers

Microsoft

US 103 days ago $139,900–$274,800
C C++ Rust Python JavaScript Java .NET Performance Engineering Large-Scale Software Design Architectural Modernization Legacy Codebase Optimization Performance Tooling Automation AI-Assisted Diagnostics Cross-Team Collaboration Code Reviews
Hybrid

Principal Software Engineer | Microsoft Careers

Microsoft

US 1 day ago $165,600–$296,400
Azure Kubernetes Docker Python Go Java SQL NoSQL CI/CD Prometheus Grafana Git GitHub Terraform AWS Google Cloud Microservices Service-Oriented Architecture LLM Responsible AI DevOps
Hybrid

Principal Software Engineer | Microsoft Careers

Microsoft

Redmond, WA 56 days ago $139,900–$274,800
C# .NET OAuth2 OIDC Distributed Systems High-Performance Runtime Engines Policy/Rules Engines Identity Platforms ESTS Messaging Protocols Zero Trust Principles CI/CD Terraform Azure Kubernetes Prometheus Grafana