Principal TPM - AI Infrastructure

Oracle

Quick summary

Work type: On-site
Location: Austin, TX
Posted: 15 days ago

Market check

Salary context

How this pay compares to similar roles

Similar roles: $206k (scale shown: $156k to $267k)

This listing doesn't post a salary. Most similar roles pay $171,747–$240,800.

Based on 239 similar postings.

Employer

About Oracle

Oracle Corporation is a leading multinational technology company specializing in database software, cloud computing, and enterprise software.

Oracle currently has 774 open roles on FindRole.

Listed pay typically runs $97,500–$209,500 across 584 roles with salary data.

At a glance

TL;DR · Principal TPM - AI Infrastructure

As a Principal TPM in the AI Infrastructure GPU Operations Team at OCI, you will lead cross-functional programs connecting engineering, operations, and leadership teams to drive deployment planning and operational readiness for expanding GPU infrastructure. Your daily responsibilities include owning operating mechanisms for regional deployment readiness, tracking milestones, managing incident governance, and improving operational handoffs across multiple concurrent GPU operations initiatives. You will leverage strong program discipline and business analytics skills to turn ambiguous technical inputs into clear priorities and action plans. Ideal candidates have experience in technical program management, data analysis, and infrastructure operations, with advanced Excel and PowerPoint skills, and knowledge of Jira and Confluence. This role focuses on large-scale GPU fleet reliability, distributed AI workload performance, and the practical use of AI to enhance operational productivity across complex GPU operations programs.

What you'll do

  • Drive availability and reliability of large-scale GPU fleets by identifying systemic issues and leading recovery efforts.
  • Own end-to-end execution of critical AI Infrastructure GPU Operations programs to ensure alignment with business priorities.
  • Manage deployment governance, change review processes, and incident management mechanisms for high-volume activities.
  • Build and maintain executive-level reporting including monthly business reviews and weekly operational KPIs.
  • Improve operations productivity by driving the practical use of AI and automation in GPU operations workflows.

What we're looking for

  • 5+ years of experience in technical program management or a related field
  • Proven ability to lead complex cross-functional initiatives with measurable outcomes
  • Strong operational background including governance mechanisms and KPI reporting
  • Advanced Excel skills for data modeling and financial/operational analysis
  • Experience developing dashboards and automated reports for business visibility
  • Knowledge of cloud infrastructure, AI/ML operations, and GPU fleet management
  • Excellent written and verbal communication skills for executive updates

More like this

Similar roles

Principal TPM - AI Infrastructure

Oracle

Seattle, WA +1 · 27 days ago
AWS, Kubernetes, Terraform, Python, PostgreSQL, CI/CD, Prometheus, Grafana, Ansible, Docker, NVIDIA_H200, NVIDIA_B200, AMD_Instinct_MI300X, Excel, PowerPoint, Jira, Confluence

OCI & AI Infrastructure Pursuits Lead

Oracle

Austin, TX · 14 days ago · $102,300–$209,500
OCI, AWS, Azure, GCP, Python, R, SQL, Pricing, Deal Structuring, Revenue Modeling, CI/CD, Prometheus, Grafana, Kubernetes, Terraform, PostgreSQL, Oracle Cloud Infrastructure, AI Services, Salesforce, JIRA, Confluence

OCI & AI Infrastructure Pursuits Lead

Oracle

Austin, TX · 14 days ago · $102,300–$209,500
Oracle Cloud Infrastructure, AWS, Azure, GCP, AI/ML, HPC, Salesforce, Pricing, Deal Structuring, Revenue Modeling, CI/CD, Prometheus, Grafana, Python, PostgreSQL, Kubernetes, Terraform, Docker, Git, Jira

Principal Technical Program Manager - AI Infrastructure

Microsoft

Redmond, WA +1 · 12 days ago · $142,800–$274,800
Azure, Kubernetes, Docker, CI/CD, Python, PostgreSQL, Prometheus, Grafana, AWS, Terraform, Git, Linux, REST, JSON/WebAPI, Scalability, Security, Reliability, PerformanceOptimization, AIWorkloadsTrainingInference

Lead AI Platform Infrastructure Architect

Rockwell Automation

Remote (Milwaukee, WI) · 8 days ago
AWS, Azure, GCP, Kubernetes, Terraform, Infrastructure as Code, CI/CD, Python, Docker, Prometheus, Grafana, PostgreSQL, AI/ML platforms, GPU acceleration, High-performance storage, Network throughput, Security controls, Identity management, Compliance controls