Senior Platform Telemetry Engineer

Nvidia

Remote
Santa Clara, CA Posted 6 days ago $152,000–$241,500 / year

At a glance

AI generated

TL;DR

We are seeking a senior software engineer to join our cutting-edge AI supercomputing team at NVIDIA, where you will drive the development of next-generation fleet management solutions for scaling AI infrastructure using GPUs and Grace processors. Your day-to-day responsibilities include collaborating with customers and architects to define requirements, conducting proof-of-concept validations, writing architecture specs, and ensuring thorough testing and productization. You will leverage your expertise in time series databases like InfluxDB and Prometheus, REST APIs (Redfish preferred), telemetry visualization tools such as Grafana, and C/C++ and Python programming languages to build scalable solutions. Additionally, you should have experience with server platforms, SCM tools like Git, and project management systems like Jira, along with a strong background in firmware optimization and algorithm analysis for time and space complexity. This role involves working on large-scale AI supercomputing platforms, requiring deep knowledge of system architecture and telemetry collection engines.

Skills

C/C++, Python, Prometheus, InfluxDB, Grafana, Redfish, Git, Jira, PagerDuty, REST APIs, CI/CD, ML, multi-variable optimization techniques, x86, ARM, Confidential Compute, OCP, DMTF

What you'll do

  • Drive fleet management solutions for scaling AI infrastructure using GPUs and Nvidia Grace.
  • Design architecture for health monitoring and fault-remediation at scale for AI supercomputing platforms.
  • Write detailed architecture specs and design documents, and conduct code reviews of the implementation.
  • Ensure thorough product testing by enhancing unit tests and creating test plans with development teams.
  • Articulate requirements in Jira and collaborate on end-to-end execution plans with other managers.

What we're looking for

  • 5+ years of hands-on coding experience in C/C++ and Python.
  • Strong knowledge of time-series databases (InfluxDB), telemetry visualization solutions (Grafana), and REST APIs.
  • Proven record of scalable system design and optimization for low-latency APIs.
  • Experience with firmware architecture, server platforms, and project management tools like Jira.
  • Excellent communication skills and a strong sense of teamwork.
  • Familiarity with Confidential Compute and experience in ML and multi-variable optimization techniques.

Market check

Salary context

Above market

How this pay compares to similar roles

Chart: similar roles span $102k to $256k, with most paying within a shaded band; midpoints are $155k for similar roles and $197k for this role.

This role pays more than 87% of similar roles, most of which pay $135,300–$175,500. At the midpoint, this role pays about $197k versus about $155k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 824 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 812 roles with salary data.

More like this

Similar roles

Senior Communications Engineer

The Walt Disney Company

Remote (Anaheim, CA) 43 days ago $117,500–$157,500
AutoCAD, Bluebeam, MS Office, EIA/TIA standards, Kubernetes, Terraform, AWS, CI/CD, Python, PostgreSQL, Cisco IOS, Juniper JUNOS, VMware vSphere, Docker, Prometheus, Grafana
Remote

Senior Telematics Hardware Applications Engineer

Qualcomm

San Diego, CA 89 days ago $103,600–$155,400
High-Speed PCB Design, SoC Interfaces, LPDDR, MIPI Design, USB, PCIe, SGMII, RGMII, Power Delivery Network, Signal Integrity Simulations, DDR, Multi-Layered PCB Design, Schematic and Layout, Root Cause Analysis, Integration Debug, Problem Reproduction

Senior Communication Services Engineer

Intuit

Mountain View, CA 11 days ago $139,500–$188,500
Zoom, AWS, Kubernetes, GitHub, Jira, Python, Shell, CI/CD, Prometheus, Grafana, Ansible, Terraform, Docker, Kaltura, Vimeo, SIP Phones, Network Design, HVAC Systems, AI Tools, ChatGPT

Senior Platform Engineer

Arm Holdings

Austin, TX 15 days ago $161,500–$218,500
Kubernetes Terraform Python Go CI/CD GitOps MCP Model Gateway RAG Systems LLM Observability Service Mesh Policy-as-Code Workload Identity Sandboxing Secure Runtime Environments Multi-Tenant Platform Designs Linux Cloud Infrastructure-as-Code Incident Management Demand Forecasting Production Readiness Practices Security Fundamentals Identity Secrets Management Access Control Network Segmentation Vulnerability Management Audit Logging
Hybrid

Senior Platform Engineer

Equifax

St. Louis, MO 13 days ago
GCP, Terraform, Jenkins, Python, Bash, Docker, Kubernetes, Ansible, CI/CD, PostgreSQL, Linux, DevSecOps, Operational Excellence, Systems Thinking, Troubleshooting
Hybrid

Technology Engineer Senior

PNC

Two PNC Plaza (PA374) 23 days ago
SQL, NoSQL, ETL, Data modeling, Relational databases, Testing frameworks, Application security, Distributed systems, Documentation, Communication skills, Python, Java, Kubernetes, AWS, CI/CD, Git, JIRA, PostgreSQL, MongoDB