Agentic AI Benchmarking and Evaluation Engineer

Qualcomm

Quick summary

Work type: On-site
Location: San Diego, CA
Salary: $122,800–$184,200 / yr
Posted: 99 days ago
Closes: Aug 24, 2026

Market check

Salary context

Below market

How this pay compares to similar roles

Chart: pay scale from $108k to $265k, marking this role at $154k and similar roles at $204k.

This role pays less than 84% of similar roles. Most similar roles pay $162,000–$246,150. At the midpoint, this role pays about $154k, versus about $204k for comparable roles.

Based on 240 similar postings.
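
The midpoint figures quoted above can be reproduced with a short calculation. This is a minimal sketch that assumes "midpoint" means the simple average of each range's low and high ends; the posting does not state how the figures were derived, and the helper name `midpoint` is purely illustrative.

```python
def midpoint(low: float, high: float) -> float:
    """Average of a pay range's endpoints (assumed meaning of 'midpoint')."""
    return (low + high) / 2


# Ranges as quoted in the posting.
role_mid = midpoint(122_800, 184_200)    # listed salary range for this role
market_mid = midpoint(162_000, 246_150)  # range quoted for most similar roles

# Absolute and relative gap between the two midpoints.
gap = market_mid - role_mid
gap_pct = gap / market_mid * 100

print(f"This role midpoint:     ${role_mid:,.0f}")
print(f"Similar roles midpoint: ${market_mid:,.0f}")
print(f"Gap: ${gap:,.0f} ({gap_pct:.1f}% below the comparable midpoint)")
```

Under that assumption the midpoints come to $153,500 and $204,075, which round to the $154k and $204k shown in the comparison.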

Employer

About Qualcomm

Qualcomm is a leading American semiconductor and telecommunications company based in San Diego, CA.

Qualcomm currently has 615 open roles on FindRole.

Listed pay typically runs $148,300–$222,500 across 556 roles with salary data.

At a glance

TL;DR · Agentic AI Benchmarking and Evaluation Engineer

Join Qualcomm’s AI Research team as an ML/AI algorithm evaluation engineer to work on cutting-edge machine learning technology and tools like the AI Model Efficiency Toolkit. You will collaborate with a multi-disciplinary team of researchers and engineers to design and implement highly optimized solutions for generative AI, focusing on evaluating software tools and algorithms to ensure they meet quality standards across platforms including CPU, GPU, and NPU. Your responsibilities include developing comprehensive evaluation strategies, defining key performance metrics, automating tests, and conducting in-depth benchmarking and model evaluations. You will use PyTorch and TensorFlow to design and implement ML algorithms, while also extending research and identifying areas for optimization. Strong machine learning fundamentals, generative AI experience, and proficiency in Python and software design are essential; familiarity with Qualcomm’s AI Stack products is preferred.

Skills

Python, PyTorch, TensorFlow, LLM, LVM, LMM, NN, Qualcomm AI Stack, QNN, SNPE, QAIRT, LangChain, LlamaIndex, Autogen, Linux, Android, CI/CD

What you'll do

Evaluate ML software tools and algorithms to ensure they meet quality standards.
Develop comprehensive evaluation approaches for AI models on various platforms.
Implement automation and perform qualitative tests based on key performance metrics.
Optimize the performance of AI/ML solutions across multiple hardware targets such as CPU, GPU, and NPU.
Provide detailed analysis and identify areas for further optimization in ML algorithms.

What we're looking for

Strong understanding of machine learning fundamentals and generative AI applications.
Proficiency in designing, implementing, and training ML algorithms using PyTorch and TensorFlow.
Experience with large language models (LLMs), large vision models (LVMs), and large multimodal models (LMMs).
Exceptional software development skills including analytical, debugging, and automation capabilities.
Familiarity with Qualcomm AI Stack products such as QNN, SNPE, and QAIRT is preferred.
Ability to drive cross-functional projects and collaborate closely with AI application teams.
