Agentic AI Evaluation Engineer

Comcast

Quick summary

Work type: On-site
Location: Washington, District of Columbia
Salary: $142,651–$213,977 / yr
Posted: 1 day ago

Market check

Salary context

Below market

How this pay compares to similar roles

[Chart: salary scale from $130k to $262k; similar roles midpoint $203k, this role midpoint $178k]

This role pays less than 68% of similar roles. Most similar roles pay $163,683–$242,312. At the midpoint, this role pays about $178k, versus about $203k for comparable roles.

Based on 240 similar postings.

Employer

About Comcast

Comcast is an American telecommunications and media conglomerate that provides cable TV, internet, and phone services under the Xfinity brand and owns NBCUniversal.

Comcast currently has 42 open roles on FindRole.

Listed pay typically runs $129,515–$195,779 across 22 roles with salary data.

At a glance

TL;DR · Agentic AI Evaluation Engineer

As an Evaluation Engineer on the Agent Evaluation team, you will ensure that AI agents meet high standards of performance and reliability before deployment. Your responsibilities include designing evaluation pipelines across different environments, defining metrics for conversational AI quality, and building automated systems to assess agent behavior. You will manage datasets and annotation workflows, integrate evaluations into CI/CD processes, and run experiments to drive continuous improvement. Working closely with engineering and product teams, you will write technical documentation and mentor junior engineers, drawing on skills in CI/CD, machine learning, large language models, and responsible AI practices.

What you'll do

  • Design and develop agent evaluation pipelines across different environments.
  • Define and standardize metrics for assessing conversational AI quality.
  • Build automated systems to evaluate agent performance continuously.
  • Manage datasets, test sets, and annotation workflows for evaluations.
  • Conduct experiments and A/B testing to improve agent quality.

What we're looking for

  • 5–7 years of relevant work experience in AI or chatbot platforms
  • Expertise in CI/CD pipelines and machine learning models
  • Proficiency in designing evaluation metrics and benchmarks for AI agents
  • Ability to build automated and human-in-the-loop evaluation systems
  • Experience in managing datasets, test sets, and annotation workflows
  • Understanding of responsible AI principles including bias and fairness mitigation

More like this

Similar roles

Agentic AI Test Engineer

Comcast

Mount Laurel, NJ · 1 day ago · $114,985–$172,478
Python, CI/CD, Jenkins, Concourse, React, Streamlit, TypeScript, Java, LLM-as-a-Judge, Agent Evaluation, Web Automation, API Automation

Agentic AI Engineer

Booz Allen Hamilton

Washington, DC +1 · 36 days ago · $99,000–$225,000
LangChain, AutoGen, PydanticAI, CrewAI, LlamaIndex, MCP, A2A, RAG, Neo4j, NebulaGraph, Hugging Face, PEFT, LoRA, Grafana, Langfuse, LangSmith, Phoenix, Docker, Kubernetes, AWS, Azure, ONNX, GGML, Ollama

Agentic AI Engineer

Booz Allen Hamilton

Washington, DC +2 · 51 days ago · $99,000–$225,000
LangChain, AutoGen, LangGraph, Model Context Protocol, MCP, RAG, Knowledge Graphs, Neo4j, NebulaGraph, Hugging Face, PEFT, LoRA, ONNX, GGML, Ollama, ReAct loops, microservice design, edge computing, CI/CD

Agentic AI Engineer

Booz Allen Hamilton

Arlington, Virginia · 14 days ago · $99,000–$225,000
Python, LangChain, AutoGen, CrewAI, RAG, LLMs, GPT-class models, multimodal models, asynchronous programming, event-driven programming, CI/CD, Secret clearance, required skills, cross-functional collaboration

Agentic AI Engineer

Booz Allen Hamilton

McLean, Virginia · 19 days ago · $77,600–$176,000
Python, LangChain, AutoGen, CrewAI, RAG, LLMs, GPT-class models, multimodal models, asynchronous programming, event-driven programming, CI/CD, Secret clearance, required skills, cross-functional collaboration

Distinguished AI Engineer (Agentic AI Platform)

Capital One Financial

San Jose, CA +4 · 35 days ago · $269,100–$307,200
Python, Go, Kubernetes, LangChain, AutoGen, Semantic Kernel, LLMOps, Google Cloud Vertex AI, Amazon SageMaker, Azure Machine Learning, CI/CD, Responsible AI, Data Privacy, Multi-Tenant Security Patterns, Prometheus, Grafana