Principal Software Engineer | Microsoft Careers

Microsoft

Hybrid Actively hiring
US Posted 47 days ago $163,000–$296,400 / year

At a glance

AI generated

TL;DR

Join the Monetization Engineering team as a senior engineer specializing in GPU inference optimization for Microsoft’s AI-native surfaces such as Copilot and Search. You will accelerate large-scale deep learning inference across advertising and shopping platforms by optimizing OpenAI LLMs and next-generation architectures, and by integrating current GPU technologies with business applications. Key responsibilities include profiling CPU/GPU bottlenecks with tools like Nsight and TensorBoard, improving model efficiency through compression techniques, and building high-throughput inference serving stacks in Azure environments. The role demands expertise in CUDA, TensorRT, and Triton, plus a deep understanding of LLM/SLM architectures, to deliver solutions for Microsoft’s monetization platforms.

Skills

CUDA, TensorRT, Triton, PyTorch, Nsight, Azure, H100, A100, LLM, SLM, MoE, Model Compression, Quantization, Distillation, SVD, Low-Rank Methods, Continuous Batching, KV-Cache Optimizations, Routing, DLIS, Talon

What you'll do

  • Optimize large-scale deep learning inference for Microsoft’s advertising platforms.
  • Accelerate GPU inference performance across various AI-native surfaces like Copilot and Search.
  • Bridge advanced GPU technologies with critical business applications in the monetization platform.
  • Identify and resolve CPU/GPU bottlenecks using profiling tools such as Nsight and TensorBoard.
  • Build high-throughput inference serving stacks with continuous batching and KV-cache optimizations.
  • Enhance model efficiency through techniques like quantization, distillation, and low-rank methods.

What we're looking for

  • Solid experience in GPU inference optimization using CUDA, TensorRT, or custom GPU kernels.
  • Proficiency with profiling tools like Nsight and TensorBoard for identifying CPU/GPU bottlenecks.
  • Deep understanding of LLM/SLM architectures including attention mechanisms, embeddings, MoE, and decoders.
  • Experience optimizing latency-critical online services for real-time performance.
  • Familiarity with model compression techniques such as quantization, distillation, SVD, and low-rank methods.
  • Expertise in building high-throughput inference serving stacks with continuous batching and KV-cache optimizations.

Market check

Salary context

This $163,000–$296,400 range sits above 83% of similar postings on FindRole.

Peer median band

$142,050–$264,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$177,250–$214,500

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Microsoft

Microsoft Corporation is a global technology leader producing software, hardware, and cloud services including Windows, Office 365, Azure cloud platform, Xbox gaming, and Surface devices.

Industry: Software & Cloud Computing

Microsoft currently has 534 open roles on FindRole.

Listed pay typically runs $119,800–$234,700 across 488 roles with salary data.

More like this

Similar roles

Principal Software Engineer | Microsoft Careers

Microsoft

WA 62 days ago $142,800–$274,800
CUDA, NVIDIA Triton Inference Server, TensorRT, Kafka, Flink, Spark Streaming, GPU, CPU, NUMA, Docker, CI/CD, Prometheus, Grafana, PostgreSQL, Python, Go, AWS, Azure, Google Cloud Pods, Kubernetes, Terraform

Principal Software Engineer | Microsoft Careers

Microsoft

Redmond, WA 108 days ago $139,900–$274,800
Python, Java, JavaScript, C#, AI, CI/CD, Kubernetes, Docker, AWS, Azure, PostgreSQL, MongoDB, Git, Jenkins, GitHub, Swagger, RESTful APIs, Microservices, Cloud Native, DevOps, SRE, Observability, Security
Hybrid

Principal Software Engineer | Microsoft Careers

Microsoft

US 131 days ago $139,900–$274,800
Python, Java, Kubernetes, AWS, Azure, CI/CD, MLOps, Apache Spark, Flink, Docker, Prometheus, Grafana, PostgreSQL, Redis, Scalability, High-Availability, Multi-Agent Systems, Reinforcement Learning
Hybrid

Principal Software Engineer | Microsoft Careers

Microsoft

US 87 days ago $139,900–$274,800
C, C++, Rust, Python, JavaScript, Java, .NET, Performance Engineering, Large-Scale Software Design, Architectural Modernization, Legacy Codebase Optimization, Performance Tooling, Automation, AI-Assisted Diagnostics, Cross-Team Collaboration, Code Reviews
Hybrid

Principal Software Engineer | Microsoft Careers

Microsoft

US 9 days ago $165,600–$296,400
Azure, Kubernetes, Docker, CI/CD, Apache Spark, Kafka, PostgreSQL, Redis, GraphQL, Python, JavaScript, TypeScript, React, Node.js, ML/AI, Data pipelines, Microservices, APIs, Schema evolution, Telemetry, Operational excellence
Hybrid