Senior Software Engineer, CoreAI Workload Engines | Microsoft Careers

Microsoft

Actively hiring
US Posted 54 days ago $119,800–$234,700 / year

At a glance

AI generated

TL;DR

As a Senior Engineer on the CoreAI Workloads team at Azure, you will optimize inference engines and APIs for large language models (LLMs), including OpenAI’s, to deliver secure, reliable, and efficient GPU-based inference across diverse workloads. Day to day, you will implement performance improvements, run end-to-end experiments to measure and improve latency, throughput, availability, and cost efficiency, and build robust experimentation capabilities. You will also design scalable inference architectures, extend AI infrastructure abstractions for elastic engines, and collaborate with networking teams on high-performance interconnects. Essential skills include proficiency in C++, Python, and Kubernetes, plus experience with GPU-accelerated stacks such as CUDA. The role calls for expertise in large-scale production systems, performance analysis, and technical leadership to influence platform architecture and drive continuous improvement across Azure’s AI ecosystem.

Skills

Python, Kubernetes, PyTorch, CUDA, Prometheus, Grafana, CI/CD, Docker, PostgreSQL, Redis, OpenAI, LLMs, NVIDIA GPUs, RDMA, InfiniBand, RoCE, Heterogeneous Fleets, Disaggregated Serving, Multi-Token Prediction, KV Offload, Quantization

What you'll do

  • Optimize inference engines for OpenAI and open-source models by implementing performance improvements.
  • Run end-to-end experiments to measure and improve latency, throughput, availability, and cost of AI workloads.
  • Build experimentation capabilities for large-scale AI inference to ensure rapid, safe iteration.
  • Own serving availability and efficiency for Azure OpenAI Service through tiered experimentation and multi-modal utilization.
  • Design and evolve inference serving architectures using techniques like disaggregated serving and quantization.
  • Extend AI infrastructure abstractions to support elastic, heterogeneous inference engines at scale.

What we're looking for

  • Proven ability to design and operate large-scale production inference services.
  • Strong skills in performance analysis, including benchmarking, profiling, and diagnosing regressions.
  • Hands-on experience with Kubernetes for building and operating services.
  • Demonstrated technical leadership and cross-team architectural alignment.
  • Experience optimizing LLM inference in production environments (preferred).
  • Familiarity with GPU-accelerated inference stacks and high-performance networking.

Market check

Salary context

This $119,800–$234,700 range sits above 71% of similar postings on FindRole.

Peer median band

$119,800–$234,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$144,750–$191,687

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Microsoft

Microsoft Corporation is a global technology leader producing software, hardware, and cloud services including Windows, Office 365, Azure cloud platform, Xbox gaming, and Surface devices.

Industry: Software & Cloud Computing

Microsoft currently has 534 open roles on FindRole.

Listed pay typically runs $119,800–$234,700 across 488 roles with salary data.

More like this

Similar roles

Principal Software Engineer, CoreAI Workload Engines | Microsoft Careers

Microsoft

US 54 days ago $142,800–$274,800
Python Kubernetes PyTorch CUDA Prometheus Grafana CI/CD Docker PostgreSQL Redis OpenAI LLM Azure NVIDIA GPUs RDMA InfiniBand RoCE NCCL TensorFlow C++ Java JavaScript Go Git Jenkins Ansible Terraform AWS Google Cloud Platform CI/CD pipelines

Principal Software Engineer, CoreAI | Microsoft Careers

Microsoft

Redmond, WA 69 days ago $139,900–$274,800
C++ Kubernetes CUDA Docker Azure Linux Performance Profiling Tools Debugging Tools CI/CD Multimodal Inferencing LLM Inferencing Infrastructure Service Reliability Engineering OpenAI

Principal Software Engineer, CoreAI | Microsoft Careers

Microsoft

US 75 days ago $142,800–$274,800
Kubernetes Python C C++ Java JavaScript Terraform AWS Azure PostgreSQL CI/CD Prometheus Grafana Docker RDMA InfiniBand NCCL CUDA AKS Dynamic Resource Allocation(DRA)