Senior Research Scientist, Multimodal Foundation Models and Robotics

Nvidia

Actively hiring
Santa Clara, CA, US Posted 139 days ago $192,000–$304,750 / year

At a glance

AI generated

TL;DR

As a Senior Research Scientist in NVIDIA’s Generalist Embodied Agent Research (GEAR) group, you will work on multimodal foundation models and robotics, designing AI algorithms for humanoid robots and embodied agents. Day-to-day work includes implementing large-scale training methods, optimizing models for physical simulation and robot hardware, and collaborating with cross-functional teams to turn research into practical applications. Ideal candidates hold a Ph.D. in Computer Science/Engineering or have equivalent research experience, with expertise in one of two areas: multimodal foundation models (hands-on experience with LLMs, vision-language models, video generative models, and action-based transformers) or robotics (deep knowledge of robot kinematics, dynamics, control methods, and simulation frameworks such as MuJoCo and Isaac Sim). The role calls for proficiency in Python, C++, CUDA, ROS, and machine learning frameworks such as PyTorch and JAX, along with a commitment to advancing autonomous systems.

Skills

Python, PyTorch, JAX, TensorFlow, C++, CUDA, ROS, MuJoCo, Isaac Sim, Reinforcement Learning, Imitation Learning, PID Control, MPC, Whole-Body Control, Robot Kinematics, Robot Dynamics, Sensor Integration

What you'll do

  • Design and implement novel AI algorithms for general-purpose humanoid robots.
  • Develop large-scale training methods for multimodal foundation models.
  • Optimize and deploy AI models in physical simulations and on robot hardware.
  • Collaborate with cross-functional teams to transfer research into products.
  • Conduct hands-on model training and publish research on advanced AI topics.

What we're looking for

  • Ph.D. in Computer Science/Engineering or equivalent research experience.
  • 5 years of relevant work/research experience in multimodal foundation models or robotics.
  • Hands-on training experience and publications in LLMs, vision-language models, video generative models, or action-based transformers.
  • Proficiency in Python, C++, CUDA, ROS, PyTorch, JAX, TensorFlow, and other ML frameworks.
  • Strong skills in large-scale machine learning systems and compute infrastructure.
  • Deep understanding of robot kinematics, dynamics, sensors, and control methods.

Market check

Salary context

This $192,000–$304,750 range sits above 81% of similar postings on FindRole.

Peer median band

$164,800–$246,900

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$172,375–$246,150

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 802 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 798 roles with salary data.

More like this

Similar roles

Senior Robotics Research Scientist

Nvidia

Seattle, WA, US 139 days ago $192,000–$304,750
Python, ROS2, PyTorch, JAX, CUDA, Warp, Isaac Sim, Isaac Lab, MuJoCo, C++, CVPR, ICRA, IROS, NeurIPS, ICML, ICLR, ECCV, Simulation, Real-to-Sim, Bimanual Manipulation, Mobile Manipulation, Vision-Language-Action Models

Senior Research Engineer, Robotics Systems

Nvidia

Santa Clara, CA, US 22 days ago $184,000–$287,500
Python, Rust, C++, ROS, Gazebo, MuJoCo, Isaac, CI/CD, Docker, Kubernetes, AWS, Git, Terraform, Prometheus, Grafana

Senior Research Scientist, AI-Mediated Reality and Interaction

Nvidia

Santa Clara, CA, US 24 days ago $192,000–$304,750
Python, C++, CUDA, PyTorch, Computer Vision, AI Algorithms, 3D Graphics Development, Deep Learning, Neural Rendering, Generative Models, Large Language Models, Human Behavior Understanding, Digital Human Creation, CVPR, ICCV, ECCV, SIGGRAPH, NeurIPS, ICLR