Senior AI Site Reliability Engineer

Oracle

US. Posted 3 days ago. Actively hiring.

At a glance

TL;DR

As a Senior AI Site Reliability Engineer, you will join a collaborative team that develops and maintains the next-generation Electronic Health Record platform, using AI to improve reliability and operational efficiency. You will design and build highly scalable, secure infrastructure for large-scale analytics workloads, contribute to automation-first operations, and integrate AI-driven solutions for observability, incident response, and lifecycle management. The role calls for expertise in multi-cloud environments (OCI, AWS/Azure), CI/CD and container tooling (Jenkins, Kubernetes), and data technologies such as Vertica and ETL frameworks. Hands-on experience applying Generative AI tools such as LangChain or AutoGPT to DevOps/SRE workflows is essential. You will also need a strong background in cloud infrastructure design and distributed systems, along with the problem-solving skills to keep systems reliable and performant at scale within the healthcare industry's stringent regulatory environment.

Skills

AWS, Azure, OCI, Kubernetes, Terraform, Python, Java, Go, Docker, Prometheus, Grafana, CI/CD, Vertica, Snowflake, Tableau, Power BI, Oracle Analytics, LangChain, AutoGPT, Jenkins

What you'll do

  • Design and build reliable, scalable infrastructure for large-scale analytics workloads.
  • Implement AI-assisted approaches for observability, anomaly detection, and incident remediation.
  • Improve system reliability through automation, monitoring, and performance tuning.
  • Partner with development teams to improve service architecture and operability.
  • Perform root cause analysis and implement long-term fixes for complex production issues.
  • Drive continuous improvement in DevOps/SRE practices, including CI/CD and Infrastructure as Code.

What we're looking for

  • 3+ years of experience in cloud infrastructure design, automation, and SRE practices.
  • Strong hands-on skills with multi-cloud environments (OCI, AWS/Azure) and hybrid architectures.
  • Expertise in AI-native engineering, including Generative AI for observability and incident response.
  • Proficiency in CI/CD pipelines, Infrastructure as Code (Terraform), and observability tools (Prometheus, Grafana).
  • Experience with data technologies such as data warehousing platforms (Vertica, Snowflake) and ETL frameworks.
  • Demonstrated ability to troubleshoot complex production issues and perform root cause analysis.

Employer

About Oracle

Oracle Corporation is a leading multinational technology company specializing in database software, cloud computing, and enterprise software.

Oracle currently has 343 open roles on FindRole.

Listed pay typically runs $97,500–$199,500 across 253 roles with salary data.