AI Accuracy Architect
Qualcomm
At a glance
AI generated
Qualcomm Technologies is hiring a Staff Engineer – AI Model Optimization Architect to join its Cloud AI team, focusing on developing hardware and software platforms for efficient inference of large-scale foundation models. This senior role involves architecting model optimization strategies that transform PyTorch models into accelerator-efficient execution, working closely with compiler, performance, and accuracy teams to ensure optimal throughput, latency, memory usage, and quality across various batch sizes and sequence lengths. Key responsibilities include designing fusion kernels using DSL-based approaches like Triton, profiling and optimizing large language and vision models for inference, enabling continuous batching strategies, and scaling distributed inference across multi-core systems. The ideal candidate has expert-level proficiency in PyTorch, experience with torch.compile, deep knowledge of transformer architectures, and a strong foundation in computer architecture and ML accelerators.
Market check
How this pay compares to similar roles
This role pays less than 57% of similar roles. Most similar roles pay $162,000–$246,150. At the midpoint, this role pays about $198k versus about $204k for comparable roles.
Based on 240 similar postings.
Employer
Qualcomm is a leading American semiconductor and telecommunications company based in San Diego, CA.
Qualcomm currently has 595 open roles on FindRole.
Listed pay typically runs $148,300–$222,500 across 540 roles with salary data.