Principal Software Engineer – PyTorch Training Frameworks in San Jose, California | Advanced Micro Devices, Inc

AMD

Quick summary

Work type: Hybrid
Location: San Jose, CA; Seattle, WA; Austin, TX
Salary: $240,000 / yr
Posted: 136 days ago
Closes: Feb 26, 2027

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: similar roles about $211k; this role $240k; scale runs from $162k to $251k, with a shaded band marking where most similar roles pay.]

This role pays more than 77% of similar roles. Most similar roles pay $184,525–$238,025 (the shaded band in the chart). At the midpoint, this role pays about $240k, versus about $211k for comparable roles.

Based on 239 similar postings.

Employer

About AMD

AMD (Advanced Micro Devices) is a semiconductor company that develops high-performance processors, graphics cards, and adaptive computing solutions for gaming, data centers, and embedded markets. Industry: Semiconductors

AMD currently has 64 open roles on FindRole.

Listed pay is typically around $190,000 across 64 roles with salary data.

At a glance

TL;DR

AMD seeks a Principal-level Software Development Engineer with expertise in PyTorch training frameworks to enhance performance and scalability of AI training on AMD Instinct accelerators. This role involves optimizing distributed training, resolving hardware-related issues, and contributing to upstream PyTorch projects. The ideal candidate will lead technical initiatives, mentor engineers, and engage with strategic partners to ensure robust developer experiences. Key skills include deep knowledge of PyTorch internals, proficiency in Python and C/C++, experience with distributed training concepts like DDP and FSDP, and strong performance engineering capabilities. Familiarity with AMD’s ROCm ecosystem and Linux-based environments is essential for driving impactful solutions at scale.

What you'll do

  • Act as technical authority for PyTorch training at AMD.
  • Improve and debug performance in areas such as DDP/FSDP and gradient checkpointing.
  • Partner with ROCm teams to resolve full-stack performance bottlenecks and issues.
  • Contribute to upstream PyTorch by taking part in design discussions and contributing code.
  • Develop and maintain benchmarks and profiling workflows for key models.
  • Lead investigations of performance regressions and correctness issues across teams.

What we're looking for

  • Deep experience with PyTorch internals and distributed training systems
  • Strong performance engineering skills including profiling, tracing, and memory optimization
  • Expertise in Python and C/C++ programming for large codebases
  • Familiarity with PyTorch ecosystem components such as TorchInductor and with the CUDA/HIP programming models
  • Ability to lead technical discussions and influence architectural decisions across teams
  • Experience working in Linux-based environments with OS/hardware integration
  • Clear communication skills for design documentation, code reviews, and stakeholder updates
