Principal High-Performance LLM Training Engineer

Nvidia

Quick summary

Work type: On-site
Location: Santa Clara, CA
Salary: $272,000–$431,250 / yr
Posted: 47 days ago
Nearby: 99+ roles within 25 mi

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: pay range for similar roles, from $136k to $463k, with a shaded band marking where most similar roles pay. Midpoints: similar roles $209k, this role $352k.]

This role pays more than 99% of similar roles. Most pay $171,500–$246,150 — the shaded band above. At the midpoint, this role pays about $352k versus about $209k for comparable roles.

Based on 239 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 967 open roles on FindRole.

Listed pay typically runs $168,000–$270,250 across 950 roles with salary data.

At a glance

TL;DR · Principal High-Performance LLM Training Engineer

NVIDIA is hiring a Principal Engineer to lead the optimization of large-scale AI training and post-training workloads on its advanced hardware and software platforms. The role involves analyzing and improving frontier-scale LLM workloads running on thousands of GPUs, driving improvements in frameworks such as PyTorch, JAX, NeMo, and NeMo RL, and shaping future NVIDIA GPU, system, and software roadmaps based on real-world insights. The ideal candidate has extensive experience in large-scale AI training systems, GPU performance optimization, distributed systems, and high-performance computing, with a deep understanding of GPU architecture from individual accelerators to datacenter-scale systems. Candidates should be proficient with profiling, tracing, and benchmarking tools and have strong technical leadership skills to influence multi-functional decisions across NVIDIA’s teams. The role offers the chance to collaborate on cutting-edge AI projects that impact the future of computing and social progress.

Skills

PyTorch, JAX, NeMo, CUDA, Distributed Systems, High-Performance Computing, Mixed Precision Training, Activation Checkpointing, Profiling Tools, Tracing Tools, Benchmarking Tools, GPU Architecture, TensorFlow, Kubernetes, AWS, Azure, Google Cloud Platform, PostgreSQL, CI/CD

What you'll do

Lead end-to-end performance analysis and optimization of large-scale LLM training on NVIDIA platforms.
Identify and eliminate bottlenecks in compute, memory, communication, and scheduling for AI workloads.
Develop software tools and benchmarks to enhance efficiency and developer productivity across AI stacks.
Guide future GPU and system architecture decisions with insights from workload characterizations and simulations.
Serve as a technical expert for AI training performance, collaborating with cross-functional teams at NVIDIA.

What we're looking for

MS or PhD in Computer Science, Electrical Engineering, or related field with 12+ years of relevant experience.
Proven technical impact in large-scale AI training systems, GPU optimization, distributed systems, HPC, ML frameworks, compilers/runtimes, or hardware/software co-design.
Deep hands-on expertise in analyzing and optimizing performance of large-scale deep learning workloads, especially transformer-based models.
Strong understanding of GPU and AI accelerator architecture from individual accelerators to datacenter-scale systems.
Experience with distributed training techniques including various parallelism strategies and mixed precision training.
Extensive use of profiling, tracing, benchmarking, and modeling tools to diagnose complex bottlenecks and drive performance improvements.
Excellent communication and technical leadership skills to influence multi-functional decisions across teams.

Similar roles

Senior High-Performance LLM Training Engineer

Nvidia

Santa Clara, CA · 67 days ago · $184,000–$287,500

Python, C++, CUDA, PyTorch, JAX, GPU, MLPerf, NVIDIA, Deep Learning, Computer Architecture, Performance Modelling, Automation Tools, System Simulators, Cloud Services, Data Centers

Hybrid

Principal Engineer, LLM

Upstart

Remote (Canada) · 53 days ago · $238,400–$330,200

LLM, ONNX, Vector databases, LangChain, LlamaIndex, OpenAI APIs, Kubernetes, Docker, Terraform, Python, FastAPI, React, TypeScript, CI/CD, Cloud-native architectures, PostgreSQL, Redis, Git, GitHub, Jenkins, Prometheus, Grafana

Remote

AI LLM Engineer

Siemens Healthineers

Atlanta, GA · 11 days ago · $93,680–$128,810

Python, LLM Frameworks, Azure OpenAI, Databricks, RAG Pipelines, Prompt Engineering, Snowflake, Power BI, LangChain, Semantic Kernel, Microsoft Copilot Studio, SQL, Vector Databases, Data Modeling, ELT Pipelines, Model Deployment, CI/CD, Power Automate, Multi-Agent Orchestration

Hybrid

Applied LLM Research Engineer, Input Experience

Apple Inc

Cupertino, CA · 58 days ago · $147,400–$272,100

Python, PyTorch, JAX, TensorFlow, SFT, RLHF, Data Synthesis, Parameter-Efficient Fine-Tuning, RLVR, Reward Modeling, Environment Design, Speculative Decoding, CI/CD

Principal SW Engineer - LLM Serving (Cloud AI)

Qualcomm

San Diego, CA · 116 days ago · $200,800–$301,200

PyTorch, Python, C++, LLMs, Multi-modal models, Reasoning models, Neural networks, High performance software, Multicore systems, Performance analysis, Multi-core architecture, SoC architectures, Performance modeling, Machine learning accelerators, Neural network operators, Linear algebra, Math libraries

Senior Lead AI Engineer (LLM Customization and Finetuning)

Capital One Financial

Cambridge, MA +3 · 131 days ago · $229,900–$262,400

Python, TensorFlow, PyTorch, Kubernetes, Docker, AWS, Azure, GCP, CI/CD, Git, PostgreSQL, MongoDB, Redis, Scikit-learn, Pandas, NumPy, Jupyter, Swagger, GraphQL
