Browse tech roles

Basic role filtering by workplace, salary floor, and post age. For full AI matching and advanced filtering, upload your resume using AI Match.

Filters

Workplace: Any, Remote, Hybrid, On-site, Within distance

Minimum salary (USD)

Posted within: Any time, Last 7 days, Last 30 days, Last 90 days

Active filters: Keyword: Senior Inference Engineer, Location: Santa Clara

12 of up to 20 (filtered)

Senior Inference Engineer, AIConfigurator for Dynamo

Nvidia

Remote (Santa Clara, CA) 5 days ago $184,000–$287,500

Actively hiring Posted this week Verified listing Above market

Python Rust Kubernetes TensorRT-LLM vLLM SGLang Triton Inference Server Dynamo CI/CD GPU computing Distributed systems ML infrastructure High-performance model serving Data-driven performance analysis Benchmarking Optimization NVIDIA GPUs Disaggregated serving Prefill/decode separation KV cache management NCCL NIXL NVSHMEM Expert-parallel MoE inference

Remote

Senior Software Engineer, Deep Learning Inference - TensorRT

Nvidia

Santa Clara, CA 8 days ago $152,000–$241,500

Actively hiring Verified listing Competitive pay

C++ Python CUDA TensorRT PyTorch TensorFlow ONNX Runtime NVIDIA GPUs Machine Learning Performance Benchmarking Profiling Optimizations Compiler Development Graph Parsers Optimizers

Hybrid

Senior Software Development Engineer – LLM Inference Framework

AMD

Santa Clara, CA 15 days ago $204,000

Actively hiring Competitive pay

Python C/C++ vLLM SGLang llm-d PyTorch TensorFlow AITER HIPBLAS-LT RCCL ROCm FP8 FP4 FlashAttention MLPerf CI/CD

Senior DL Algorithms Engineer - Inference Performance

Nvidia

Remote (Santa Clara, CA) 43 days ago $152,000–$241,500

Actively hiring Competitive pay

PyTorch NVIDIA TRT-LLM vLLM SGLang FlashInfer GPU architecture CUDA OpenCL Deep Learning Neural Networks Performance profiling HPC Computer Architecture Python C++

Remote

Senior Software Engineer, AI Inference Systems

Nvidia

Santa Clara, CA 50 days ago $184,000–$287,500

Actively hiring Above market

Python C/C++ CUDA Kubernetes Docker Triton PyTorch vLLM SGLang MLIR Linux Go Rust CI/CD AWS GCP Azure Prometheus Grafana GitHub MLOps

Hybrid

Senior System Software Engineer - Dynamo-Triton Inference Server

Nvidia

Remote (Santa Clara, CA) +1 51 days ago $152,000–$241,500

Actively hiring Competitive pay

Rust C++ Python TensorRT PyTorch ONNX OpenVINO vLLM TRT-LLM GPU Distributed Systems GitHub CI/CD Kubernetes Prometheus Grafana NVIDIA Triton Inference Server

Remote

Senior Software Engineer - AI Inference

Nvidia

Remote (Santa Clara, CA) 64 days ago $152,000–$241,500

Actively hiring Competitive pay

Python C++ CUDA vLLM SGLang PyTorch Triton NCCL Dynamo CI/CD GPU InfiniBand Profiling Flamegraphs Microbenchmarks Concurrency Multi-threading Multi-process Kubernetes Docker PostgreSQL

Remote

Senior Software Engineer, Machine Learning Inference

Nvidia

Santa Clara, CA 68 days ago $152,000–$241,500

Actively hiring Competitive pay

C++ Python CUDA Rust TensorRT TensorRT-LLM vLLM SGLang PyTorch JAX Deep Learning Frameworks GPU Programming Performance Analysis Optimization Techniques CI/CD

Hybrid

Senior Software Engineer, Deep Learning Inference - Automotive Safety

Nvidia

Santa Clara, CA 91 days ago $152,000–$241,500

Actively hiring Competitive pay

C++ CUDA Python ISO 26262 ASIL Deep Learning Modern C++ Safety-Critical Software Development Performance Optimization Benchmarking Embedded Systems Compiler Development Systems Programming

Hybrid

Senior Software Engineer, Quantized Inference

Nvidia

Redmond, WA +1 112 days ago $152,000–$241,500

Actively hiring Competitive pay

Python C++ PyTorch vLLM TRT-LLM SGLang CI HuggingFace Megatron-LM Triton PyTorch custom ops autograd ML accelerators model compression PTQ QAT structured sparsity unstructured sparsity

Senior Compiler Engineer, AI Inference Platforms

Nvidia

Remote (Santa Clara, CA) +1 114 days ago $152,000–$241,500

Actively hiring Competitive pay

MLIR LLVM XLA Triton PyTorch JAX Nsight Compute C++ CUDA Python

Remote

Senior DL Algorithms Engineer - Inference Performance

Nvidia

Remote (Santa Clara, CA) 119 days ago $184,000–$287,500

Actively hiring Above market

C++ PyTorch CUDA OpenCL Python Deep Learning Frameworks GPU Architecture Performance Profiling Algorithm Design CI/CD

Remote
