AI Performance Engineer (Cloud AI Engineering), Sr | Staff | Sr. Staff

Qualcomm

Actively hiring
San Diego, CA · Markham, ON · Posted 51 days ago · $178,400–$267,600 / year

At a glance

AI generated

TL;DR

As an AI Performance Engineer at Qualcomm Technologies, you will join a team developing hardware and software solutions for Cloud AI inference acceleration. Your day-to-day responsibilities include converting and optimizing models using PyTorch and ONNX, analyzing the performance of large language and vision models, and mapping next-generation workloads onto current and future hardware designs. You will collaborate closely with internal teams and customers on engineering solutions that improve the efficiency and scalability of AI workloads. Essential skills for this role include hands-on experience building and optimizing language models, a deep understanding of transformer architectures and attention mechanisms, proficiency in Python, and knowledge of computer architecture and ML accelerators. Familiarity with machine learning compilers such as torch.compile or TorchDynamo, and expertise in neural network operators and mathematical operations, are a plus.

Skills

PyTorch · ONNX · Python · Transformer architectures · Attention mechanisms · Sharding strategies · Parallelism techniques · Computer architecture · ML accelerators · Distributed systems · Linear algebra · Math libraries · Machine learning compilers · torch.compile · TorchDynamo

What you'll do

  • Convert and optimize models for efficient inference using PyTorch and ONNX.
  • Analyze and optimize large language, vision, and diffusion models to meet performance constraints.
  • Map next-generation AI workloads onto current and future hardware designs.
  • Collaborate with internal teams to develop engineering solutions for continuous performance insights.
  • Design high-level kernels in Triton to generate efficient low-level code.
  • Identify new optimization opportunities by understanding advanced algorithms and numerics.

What we're looking for

  • Hands-on experience with PyTorch and ONNX for model optimization.
  • Deep understanding of transformer architectures and attention mechanisms.
  • Experience in workload mapping strategies including sharding and parallelism.
  • Strong Python programming skills and knowledge of ML accelerators.
  • Proactive learning of the latest inference optimization techniques.
  • MS degree in Computer Science, Machine Learning, or related fields.

Market check

Salary context

This $178,400–$267,600 range sits above 57% of similar postings on FindRole.

Peer median band

$170,800–$258,500

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$177,250–$246,150

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Qualcomm

Qualcomm is a leading American semiconductor and telecommunications company based in San Diego, CA.

Qualcomm currently has 567 open roles on FindRole.

Listed pay typically runs $148,300–$226,100 across 534 roles with salary data.

More like this

Similar roles

Sr. Staff SW Engineer, Constrained AI Software

Qualcomm

San Diego, CA, US · 12 days ago · $178,400–$267,600
C/C++ Qualcomm AI Stack TensorFlow PyTorch ONNX Hexagon DSP SDK Linux Android Windows QNN Genie Agile Git CMake Docker Kubernetes CI/CD

Staff AI Development Engineer (Enterprise AI Ecosystem)

Qualcomm

Austin, TX, US · 85 days ago · $134,800–$202,200
Python LangChain LangGraph AsyncIO Pydantic LLMs Embedding Models Vector Stores Distributed Systems Microservices REST Docker Kubernetes CI/CD Milvus Qdrant Chroma Neo4j Ragas DeepEval

Staff Software Development Engineer (Cloud & AI)

CVS Health

Remote (Buffalo Grove-2100 E Lake Cook, US) · 45 days ago · $130,295–$260,590
Azure GCP CI/CD Python Java Go FastAPI PostgreSQL Azure SQL Cosmos DB Apache Kafka Jenkins GitHub Copilot Agile methodologies

Senior Staff Generative AI Engineer - VP

Citi

Remote (388 Greenwich Street - Tower, US) · 18 days ago · $142,320–$213,480
Python OpenShift Generative AI Large Language Models (LLMs) LangChain PostgreSQL pgvector CI/CD Devin and Copilot Docker Kafka React JS Streamlit Spring Boot N8N Flask Udemy for Business Pluralsight