Senior AI Tools Engineer, SRE Operations - GeForce NOW
At a glance
TL;DR (AI generated)
Join our dynamic Site Reliability Engineering (SRE) Data Team as an AI Tools Engineer and help build sophisticated AI-powered tools to optimize the global GeForce NOW service. You will develop robust ML systems for root cause analysis and predictive maintenance, lead the creation of advanced LLM-based solutions, and manage large-scale data pipelines for model development. Essential skills include Python proficiency, experience with Kubernetes and AWS, and a deep understanding of AI frameworks and current developments in LLMs. Ideal candidates have 5+ years of relevant experience, strong automation expertise, and hands-on knowledge of monitoring tools like Grafana. This role calls for an expert who can apply SRE principles and cloud technologies to ensure long-term technical sustainability and operational excellence at scale.
What you'll do
- Build robust AI/ML tools to analyze production data and identify root causes of complex incidents.
- Lead development of LLM- and Agent-based systems to enhance operational efficiency.
- Establish best practices for managing large-scale data sources critical for model development.
- Improve LLM-based pipelines, drawing on a deep understanding of how advances in LLMs apply to product development.
- Serve as an expert on AI frameworks, recommending optimal platforms and toolsets for long-term sustainability.
What we're looking for
- B.S. in Computer Science, Statistics, or Engineering and 5+ years of AI/ML experience.
- Proficiency in Python; familiarity with Go or other systems languages preferred.
- Strong knowledge of AI frameworks and LLM-based platforms.
- Experience with Kubernetes and cloud environments like AWS.
- Expertise in building and optimizing large-scale data pipelines.
- Hands-on experience with monitoring and visualization tools such as Grafana.
- Understanding of SRE principles and production environment management.
Employer
About Nvidia
Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.
Industry: Semiconductors & AI Computing
Nvidia currently has 825 open roles on FindRole.
Listed pay typically runs $184,000–$287,500 across 813 roles with salary data.
Most-posted roles
- Senior Solutions Architect, AI Infrastructure (4)
- Senior System Software Engineer - AV Platform (4)
- Senior Circuit Design Engineer (3)
- Senior Circuit Methodology Engineer (3)
- Senior Deep Learning Performance Architect (3)