Job Listings

Role Details

Data Center Technician 2

United States
Abilene, TX, United States

  • Job Identification
    331583
  • Job Category
    Information Technology
  • Posting Date
    04/30/2026, 09:30 AM
  • Role
    Individual Contributor
  • Job Type
    Regular Employee
  • Does this position require a security clearance?
    No
  • Years
    3 to 5+ years
  • Applicants
    Less than 10 applicants
  • Additional Info
    Visa / work permit sponsorship is not available for this position
  • Applicants are required to read, write, and speak the following languages
    English

Job Description

Daily Job Duties:

  • Hardware installation and decommission of enterprise servers and cabling infrastructure.
  • Repair of data center hardware and networking infrastructure within Service Level Agreements (SLAs).
  • Rack and stack of data center equipment, including but not limited to servers, networking devices, monitoring systems and other equipment.
  • Landing racks, cabling, power up and handoff of servers to internal provisioning teams.
  • Handle daily work through internal tooling systems such as JIRA and Confluence.
  • Perform moderately complex problem solving with some assistance and guidance.
  • Document activities and adhere strictly to SOPs.
  • Inventory process: Receive, track, and ship data center equipment as needed.
  • Install, label, and fix fiber/copper/telecom cables as well as patch panel systems.
  • Understand, interpret, and align with local site requirements.
  • Arrive for shifts on time and provide proper handover reports to the next team member(s).
  • Provide on-call emergency support at pre-determined times.
  • Participate and complete training that aligns with corporate objectives to bridge skill gaps and learn new relevant technologies.
  • Support and receive direction from senior team members.
  • Complete standard assignments without assistance, exercising judgment within defined policies and processes to determine the appropriate action.
  • Work on problems of moderate scope where analysis of situations or data requires a review of a variety of factors.
  • Develop professional expertise and apply company policies and procedures to resolve a variety of issues.

Career Level - IC2

Requirements:

  • Knowledge of server/storage/network hardware systems and components.
  • Excellent time management skills.
  • Detail-oriented with excellent organizational skills.
  • Be a good team player.
  • Strong interest in learning new DC concepts and technology.
  • Dependable and trustworthy.
  • Able to work alone and as part of a team
  • Must be able to lift up to 50 lbs.
  • Able to climb a 9 ft ladder.
  • Strong verbal and written communication skills.
  • Rotating shift schedule, including nights, weekends, and holidays.
  • IT Hardware Concepts knowledge (RAID, Linux/Unix, CLI, etc.)
  • Good computer skills
  • Able to stand and walk for extended periods of time
  • Able to work 10-12 hour shifts
  • Any experience with mainstream ticketing systems is a plus
  • Willing to travel as required
  • Client experience is preferred

Note: This is an onsite 24/7/365 position, and includes day/night shift, weekend and holiday work with occasional travel based on business requirements. This position will also require consent to the processing of biometric data for identity verification and access control.

Preferred Qualifications

  • 2-year or 4-year degree or relevant work experience
  • 2 years of data center experience; 1 year of development experience.
  • Linux administration knowledge/Ability to run Linux commands
  • Data center networking knowledge.
  • Experience in Data Center Infrastructure projects.
  • 1+ years of experience in structured copper/fiber cabling and cable testing.
  • 1+ years of experience with cooling and electrical systems concepts inside the data center.
  • Experience in repairing and working on Oracle servers is nice to have but not required.

Employment for all candidates is contingent upon passing a background check.

Does this sound like you? If so, we hope to meet you!

Visa sponsorship is not available for this role. For clarity, this means that Oracle is not in a position, now or in the future, to offer US immigration sponsorship. This includes, but is not limited to, support of H-1B, TN, O-1, or F-1 statuses (e.g., EAD, OPT, CPT, I-20, F-1 visa stamp).

Monitor fundamental server/network capacity. Determine the characteristics of a problem and either resolve it immediately or record, escalate, and track it through to closure, to the customer's satisfaction.

Disclaimer:

Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates.

Range and benefit information provided in this posting is specific to the stated locations only.

US: Hiring Range in USD from $27.07 to $54.13 per hour; from $56,300 to $112,600 per annum. May be eligible for equity.
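
The hourly and annual endpoints of this range are consistent under a standard 2,080-hour work year (40 hours/week × 52 weeks); that convention is an assumption, not stated in the posting:

```python
# Sanity check: convert the posted hourly endpoints to annual pay.
# Assumes the standard 2,080-hour work year (40 h/wk x 52 wk), a
# convention not stated in the posting itself.
HOURS_PER_YEAR = 40 * 52  # 2,080

def annualize(hourly_rate: float) -> float:
    """Annual pay for an hourly rate, rounded to the nearest $100."""
    return round(hourly_rate * HOURS_PER_YEAR, -2)

print(annualize(27.07))  # 56300.0
print(annualize(54.13))  # 112600.0
```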

Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle’s differing products, industries and lines of business.
Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.

Oracle US offers a comprehensive benefits package which includes the following:
1. Medical, dental, and vision insurance, including expert medical opinion
2. Short term disability and long term disability
3. Life insurance and AD&D
4. Supplemental life insurance (Employee/Spouse/Child)
5. Health care and dependent care Flexible Spending Accounts
6. Pre-tax commuter and parking benefits
7. 401(k) Savings and Investment Plan with company match
8. Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
9. 11 paid holidays
10. Paid sick leave: 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
11. Paid parental leave
12. Adoption assistance
13. Employee Stock Purchase Plan
14. Financial planning and group legal
15. Voluntary benefits including auto, homeowner and pet insurance
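
The vacation accrual rules in item 8 can be read as a small function. This is an illustrative sketch of the posted policy, not official plan language; in particular, the proration basis for 20-34 hour weeks is an assumption:

```python
def annual_vacation_days(hours_per_week: float, years_of_service: int) -> float:
    """Vacation days per year under the posted accrual policy (item 8).

    Illustrative reading only. The proration basis for 20-34 hour weeks
    is assumed to be hours/40; the posting does not give the formula.
    """
    if hours_per_week < 20:
        return 0.0  # fewer than 20 hours/week: not eligible for vacation
    base = 13 if years_of_service < 3 else 18  # 13 days first three years, then 18
    if hours_per_week >= 35:
        return float(base)
    return round(base * hours_per_week / 40, 1)  # prorated (assumed basis)
```

For example, a full-time employee in year five would accrue 18 days, and a 20-hour-per-week employee in year one would accrue 6.5 days under the assumed proration.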

The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.

Career Level - IC2

Responsibilities

Perform performance trend analysis and manage server/network capacity. React to potential problems using automation, scheduling, and monitoring tools, escalating to management where appropriate. Participate in configuration and implement technical solutions to enhance and/or troubleshoot the system. Also responsible for support documentation.

Role Details

Data Center Technician 2

United States
Abilene, TX, United States

  • Job Identification
    331590
  • Job Category
    Information Technology
  • Posting Date
    04/30/2026, 08:52 AM
  • Role
    Individual Contributor
  • Job Type
    Regular Employee
  • Does this position require a security clearance?
    No
  • Years
    3 to 5+ years
  • Applicants
    Less than 10 applicants
  • Additional Info
    Visa / work permit sponsorship is not available for this position
  • Applicants are required to read, write, and speak the following languages
    English

Job Description

Daily Job Duties:

  • Hardware installation and decommission of enterprise servers and cabling infrastructure.
  • Repair of data center hardware and networking infrastructure within Service Level Agreements (SLAs).
  • Rack and stack of data center equipment, including but not limited to servers, networking devices, monitoring systems and other equipment.
  • Landing racks, cabling, power up and handoff of servers to internal provisioning teams.
  • Handle daily work through internal tooling systems such as JIRA and Confluence.
  • Perform moderately complex problem solving with some assistance and guidance.
  • Document activities and adhere strictly to SOPs.
  • Inventory process: Receive, track, and ship data center equipment as needed.
  • Install, label, and fix fiber/copper/telecom cables as well as patch panel systems.
  • Understand, interpret, and align with local site requirements.
  • Arrive for shifts on time and provide proper handover reports to the next team member(s).
  • Provide on-call emergency support at pre-determined times.
  • Participate and complete training that aligns with corporate objectives to bridge skill gaps and learn new relevant technologies.
  • Support and receive direction from senior team members.
  • Complete standard assignments without assistance, exercising judgment within defined policies and processes to determine the appropriate action.
  • Work on problems of moderate scope where analysis of situations or data requires a review of a variety of factors.
  • Develop professional expertise and apply company policies and procedures to resolve a variety of issues.

Career Level - IC2

Requirements:

  • Knowledge of server/storage/network hardware systems and components.
  • Excellent time management skills.
  • Detail-oriented with excellent organizational skills.
  • Be a good team player.
  • Strong interest in learning new DC concepts and technology.
  • Dependable and trustworthy.
  • Able to work alone and as part of a team
  • Must be able to lift up to 50 lbs.
  • Able to climb a 9 ft ladder.
  • Strong verbal and written communication skills.
  • Rotating shift schedule, including nights, weekends, and holidays.
  • IT Hardware Concepts knowledge (RAID, Linux/Unix, CLI, etc.)
  • Good computer skills
  • Able to stand and walk for extended periods of time
  • Able to work 10-12 hour shifts
  • Any experience with mainstream ticketing systems is a plus
  • Willing to travel as required
  • Client experience is preferred

Note: This is an onsite 24/7/365 position, and includes day/night shift, weekend and holiday work with occasional travel based on business requirements. This position will also require consent to the processing of biometric data for identity verification and access control.

Preferred Qualifications

  • 2-year or 4-year degree or relevant work experience
  • 2 years of data center experience; 1 year of development experience.
  • Linux administration knowledge/Ability to run Linux commands
  • Data center networking knowledge.
  • Experience in Data Center Infrastructure projects.
  • 1+ years of experience in structured copper/fiber cabling and cable testing.
  • 1+ years of experience with cooling and electrical systems concepts inside the data center.
  • Experience in repairing and working on Oracle servers is nice to have but not required.

Employment for all candidates is contingent upon passing a background check.

Does this sound like you? If so, we hope to meet you!

Visa sponsorship is not available for this role. For clarity, this means that Oracle is not in a position, now or in the future, to offer US immigration sponsorship. This includes, but is not limited to, support of H-1B, TN, O-1, or F-1 statuses (e.g., EAD, OPT, CPT, I-20, F-1 visa stamp).

Monitor fundamental server/network capacity. Determine the characteristics of a problem and either resolve it immediately or record, escalate, and track it through to closure, to the customer's satisfaction.

Disclaimer:

Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates.

Range and benefit information provided in this posting is specific to the stated locations only.

US: Hiring Range in USD from $27.07 to $54.13 per hour; from $56,300 to $112,600 per annum. May be eligible for equity.

Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle’s differing products, industries and lines of business.
Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.

Oracle US offers a comprehensive benefits package which includes the following:
1. Medical, dental, and vision insurance, including expert medical opinion
2. Short term disability and long term disability
3. Life insurance and AD&D
4. Supplemental life insurance (Employee/Spouse/Child)
5. Health care and dependent care Flexible Spending Accounts
6. Pre-tax commuter and parking benefits
7. 401(k) Savings and Investment Plan with company match
8. Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
9. 11 paid holidays
10. Paid sick leave: 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
11. Paid parental leave
12. Adoption assistance
13. Employee Stock Purchase Plan
14. Financial planning and group legal
15. Voluntary benefits including auto, homeowner and pet insurance

The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.

Career Level - IC2

Responsibilities

Perform performance trend analysis and manage server/network capacity. React to potential problems using automation, scheduling, and monitoring tools, escalating to management where appropriate. Participate in configuration and implement technical solutions to enhance and/or troubleshoot the system. Also responsible for support documentation.

Role Details

Are you ready to decode commerce through the power of data? As a Staff Data Scientist at Shopify, you'll work with one of the world's richest commerce datasets - millions of merchants generating billions of transactions across the entire buying journey. You'll deeply understand our products, tell their stories, and figure out how to make them 10x better.

This is data science at unprecedented scale. Every analysis you run, every model you build directly impacts how millions of entrepreneurs grow their businesses and how billions of people shop worldwide. You'll own big problems and collaborate with cross-functional partners to build data-informed solutions that transform commerce from first click to final delivery. As the architect of insights that guide our product journey, you'll determine what we build and how we build it - all while moving at the velocity that makes Shopify legendary.

We're hiring across multiple levels and teams, each offering unique challenges and opportunities: Checkout, Search, Shop, Merchant Services, Infrastructure Data, Revenue Data, Executive Insights, and more. Every team applies different methodologies to solve distinct problems - from real-time ML to causal inference - ensuring you'll grow your skills while shaping how commerce works globally.

Key Responsibilities

  • Unearth insights through advanced analytics and data exploration across billions of transactions to inform product strategy and development
  • Partner with product managers and engineers to translate complex datasets into actionable strategies
  • Apply statistical models, regression analysis, segmentation techniques, and experimental methods to optimize product performance at massive scale
  • Build powerful data products that improve the experience of millions of merchants worldwide - from product discovery through post-purchase
  • Influence decision-making by providing clear, data-backed recommendations that shape how commerce evolves
  • Lead high-impact projects through complex, ambiguous problems while mentoring others
  • Continuously innovate by leveraging cutting-edge analytics techniques, including AI/ML tools where applicable
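
One of the experimental methods named above can be sketched as a two-proportion z-test, the standard significance test for an A/B conversion experiment. This is generic statistics, not a description of Shopify's internal experimentation platform:

```python
from math import erf, sqrt

def ab_test_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Two-proportion z-test for an A/B conversion experiment.

    Returns (z, two_sided_p) using the pooled-proportion standard error.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the normal CDF: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value
```

For instance, `ab_test_z(100, 1000, 150, 1000)` gives z ≈ 3.38 with p well under 0.01, so a lift from 10% to 15% conversion at that sample size would be called significant.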

Qualifications

  • Strong mastery of SQL
  • Proficiency in programming languages such as Python, and experience with data visualization tools
  • Proven experience in statistical analysis, data modelling, and machine learning at scale
  • Skilled in translating complex data into actionable insights that drive business outcomes
  • Ability to own and solve complex product and business problems with data in ambiguous, fast-moving environments
  • Demonstrated ability to thrive working at startup speed with enterprise impact
  • Track record of shipping data products that create value

Ready to transform data into groundbreaking commerce solutions? Join the team that's making commerce better for everyone.

Role Details

Are you ready to decode commerce through the power of data? As a Senior Data Scientist at Shopify, you'll work with one of the world's richest commerce datasets - millions of merchants generating billions of transactions across the entire buying journey. You'll deeply understand our products, tell their stories, and figure out how to make them 10x better.

This is data science at unprecedented scale. Every analysis you run, every model you build directly impacts how millions of entrepreneurs grow their businesses and how billions of people shop worldwide. You'll own big problems and collaborate with cross-functional partners to build data-informed solutions that transform commerce from first click to final delivery. As the architect of insights that guide our product journey, you'll determine what we build and how we build it - all while moving at the velocity that makes Shopify legendary.

We're hiring across multiple levels and teams, each offering unique challenges and opportunities: Checkout, Search, Shop, Merchant Services, Infrastructure Data, Revenue Data, Executive Insights, and more. Every team applies different methodologies to solve distinct problems - from real-time ML to causal inference - ensuring you'll grow your skills while shaping how commerce works globally.

Key Responsibilities

  • Unearth insights through advanced analytics and data exploration across billions of transactions to inform product strategy and development
  • Partner with product managers and engineers to translate complex datasets into actionable strategies
  • Apply statistical models, regression analysis, segmentation techniques, and experimental methods to optimize product performance at massive scale
  • Build powerful data products that improve the experience of millions of merchants worldwide - from product discovery through post-purchase
  • Influence decision-making by providing clear, data-backed recommendations that shape how commerce evolves
  • Continuously innovate by leveraging cutting-edge analytics techniques, including AI/ML tools where applicable

Qualifications

  • Strong mastery of SQL
  • Proficiency in programming languages such as Python, and experience with data visualization tools
  • Proven experience in statistical analysis, data modelling, and machine learning at scale
  • Skilled in translating complex data into actionable insights that drive business outcomes
  • Ability to own and solve complex product and business problems with data in ambiguous, fast-moving environments
  • Demonstrated ability to thrive working at startup speed with enterprise impact
  • Track record of shipping data products that create value

Ready to transform data into groundbreaking commerce solutions? Join the team that's making commerce better for everyone.

Role Details

Great advertising connects merchants with customers who genuinely need what they're selling. As a Staff Software Engineer focused on Ads, you'll build the targeting and personalization technology that makes these meaningful connections happen at scale. You'll develop state of the art ad platform features that help merchants reach the right audience at exactly the right moment, creating advertising experiences that drive real business growth while respecting the customer experience.

Key Responsibilities:

  • Design and optimize real‑time ad serving, auction, and ranking systems with p99 latency targets.
  • Solve complex ad performance problems to identify optimization opportunities that impact millions
  • Collaborate closely with advertising data teams to integrate data driven solutions seamlessly into our ad platform
  • Ship experimentation tooling for ad A/B testing
  • Document technical insights and share best practices across engineering teams
  • Participate in on-call work to ensure reliability and performance of Shopify’s ad systems.
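
The auction systems named above commonly build on sealed-bid second-price mechanics. As an illustration, here is the textbook mechanism; the posting does not describe Shopify's actual auction logic:

```python
def second_price_auction(bids: dict[str, float]) -> tuple[str, float]:
    """Winner and clearing price for a sealed-bid second-price auction.

    The highest bidder wins but pays the runner-up's bid -- the classic
    mechanism underlying many ad auctions. Illustrative only.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    clearing_price = ranked[1][1]
    return winner, clearing_price

print(second_price_auction({"ad_a": 2.50, "ad_b": 1.75, "ad_c": 0.90}))
# ('ad_a', 1.75)
```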

Qualifications:

  • Strong engineering proficiency and a passion for exploring the limits of technology
  • Strong with streaming and low‑latency systems: Kafka, Flink/Spark Streaming; feature stores; high‑QPS caches/KV stores (Redis/Aerospike/RocksDB)
  • Experience in building scalable, ad-centric applications that enhance the commerce experience
  • Demonstrated problem-solving abilities and innovative thinking in ad technology solutions
  • Excellent communication skills to convey technical ideas effectively
  • A growth-oriented mindset, constantly seeking to improve and innovate.

Nice to have:

  • Fluency in ads metrics and levers: CTR, CVR, CPC/CPM/CPA, eCPM, ROAS, LTV, win rate, fill rate, pacing accuracy—and how to move them.
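
These metrics follow standard industry definitions; a minimal sketch (generic formulas, not Shopify-specific tooling):

```python
def ads_metrics(impressions: int, clicks: int, conversions: int,
                spend: float, revenue: float) -> dict[str, float]:
    """Core ad-performance metrics from raw counts and totals."""
    return {
        "CTR":  clicks / impressions,           # click-through rate
        "CVR":  conversions / clicks,           # conversion rate
        "CPC":  spend / clicks,                 # cost per click
        "CPM":  spend / impressions * 1000,     # cost per 1,000 impressions
        "CPA":  spend / conversions,            # cost per acquisition
        "eCPM": revenue / impressions * 1000,   # effective revenue per 1,000
        "ROAS": revenue / spend,                # return on ad spend
    }

m = ads_metrics(impressions=10_000, clicks=200, conversions=10,
                spend=100.0, revenue=400.0)
print(m["CTR"], m["ROAS"])  # 0.02 4.0
```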

*At Shopify, we pride ourselves on moving quickly—not just in shipping, but in our hiring process as well. If you're ready to apply, please be prepared to interview with us within the week. Our goal is to complete the entire interview loop within 30 days. You will be expected to complete a live pair programming session; come prepared with your own IDE.

This role may require on-call work*

Ready to connect merchants with their perfect customers? Join the team that's making commerce better for everyone.

Role Details

Shopify is the commerce platform that powers millions of merchants worldwide. Behind the product experience are ML systems that drive recommendations, search, and personalization at massive scale.

We build the compute and serving layer behind these systems: multi-node GPU training clusters, real-time inference with strict latency budgets, and the performance engineering that keeps it all efficient at scale. Our models serve hundreds of millions of buyers, and the infrastructure we build directly impacts how merchants grow their businesses.

The Role

You will own the core infrastructure that ML Engineers depend on to train and serve models: GPU training clusters, real-time serving systems, and the performance and reliability layer underneath both. You'll sit between ML Engineers who need fast iteration and production systems that need to stay up during events like Black Friday/Cyber Monday, where traffic and stakes peak simultaneously.

This role carries real technical authority. You'll make architectural decisions about how we scale training and serving, set standards for infrastructure quality, and be the person the team relies on when systems need to scale by an order of magnitude. You'll mentor engineers across the team, drive alignment on infrastructure direction across multiple workstreams, and influence technical strategy beyond your immediate team. You'll also raise the engineering bar through hiring and technical reviews.

What You'll Do

Training Infrastructure

  • Design and operate GPU training pipelines on Kubernetes, including multi-node distributed training on GPU clusters
  • Own training reliability: checkpointing, fault tolerance, preemption recovery, and resource scheduling
  • Optimize training performance: mixed precision, kernel tuning, data loading throughput, and cluster utilization. You own compute efficiency; data correctness and freshness are owned by the operations side of the team.
  • Build abstractions that let ML Engineers launch and iterate on training runs with minimal friction
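
The checkpointing and fault-tolerance duties above rest on a simple pattern: persist (step, state) atomically and resume from the last good checkpoint. A plain-Python sketch of that pattern; real training systems use framework-native checkpointing for model and optimizer state:

```python
import os
import pickle
import tempfile

def save_checkpoint(path: str, step: int, state: dict) -> None:
    """Write a checkpoint atomically: a crash mid-write never corrupts it."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "wb") as f:
        pickle.dump({"step": step, "state": state}, f)
    os.replace(tmp, path)  # atomic rename onto the live checkpoint

def resume(path: str) -> tuple[int, dict]:
    """Return (step, state) from the last checkpoint, or a fresh start."""
    if not os.path.exists(path):
        return 0, {}
    with open(path, "rb") as f:
        ckpt = pickle.load(f)
    return ckpt["step"], ckpt["state"]
```

The atomic-rename step is the key design choice: a preempted or crashed writer leaves either the old checkpoint or the new one, never a truncated file.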

Serving Infrastructure

  • Build and maintain model serving infrastructure for real-time recommendation and LLM inference, with strict latency and throughput requirements
  • Optimize serving cost and performance: batching strategies, model compilation, GPU right-sizing, and autoscaling
  • Ensure serving systems meet availability and latency targets under peak traffic

Platform & Developer Experience

  • Build internal tools and platforms that accelerate the model development lifecycle
  • Define infrastructure patterns and best practices adopted across the team
  • Improve the inner loop for ML Engineers: faster iteration from code change to training result to production evaluation

Technical Leadership

  • Drive cross-team technical strategy for ML infrastructure - identify the next set of problems before they become blockers
  • Mentor and up-level engineers on the team through pairing, design reviews, and setting technical standards
  • Contribute to hiring: screen candidates, conduct technical interviews, and calibrate the engineering bar
  • Write technical proposals and RFCs that shape infrastructure direction across the organization

What We're Looking For

Required

  • 7+ years in software engineering, with 5+ years focused on ML infrastructure or distributed systems
  • Deep hands-on experience with GPU training at scale: distributed training, checkpointing, fault recovery, and performance tuning. You've debugged real problems like NCCL hangs, gradient synchronization issues, or data loading bottlenecks.
  • Strong Kubernetes skills: pod specs, GPU scheduling, resource quotas, debugging scheduling failures, and operating stateful GPU workloads
  • Production model serving experience: you've built or operated serving systems behind real user traffic with latency constraints
  • Solid Python and systems fundamentals; comfortable reading and modifying PyTorch training code
  • Experience designing infrastructure abstractions used by other engineers
  • Demonstrated technical leadership: you've driven architecture decisions, written technical proposals, and influenced engineering direction beyond your immediate team
  • Track record of mentoring engineers and raising the technical bar on a team

Preferred

  • Experience with cloud-native ML orchestration (SkyPilot, Ray, or similar)
  • Hands-on with LLM serving stacks (vLLM, TensorRT-LLM, Triton, or equivalent)
  • Experience with model compression in production (quantization, pruning, distillation)
  • Experience operating recommendation or retrieval systems at scale
  • Track record of building internal platforms adopted by other teams

How We Work

  • You'll pair directly with ML Engineers. Understanding their models well enough to build the right infrastructure abstractions is part of the job.
  • We prefer automation over runbooks. If a process can be scripted, it should be.
  • On-call is shared. When you're on rotation, your scope is GPU cluster health, training failures, and serving availability - you own it end to end.
  • You'll profile GPU kernels, chase p99 latency regressions, and care about FLOPS utilization. This is a deeply technical infrastructure role.
  • Research and production are the same codebase. You'll see your infrastructure decisions reflected in real model quality and real merchant outcomes.
  • Shopify operates on high trust and low process. You'll have real ownership and the autonomy to make decisions, not just execute tickets.

What Success Looks Like

  • In 3 months: You've onboarded to training and serving infrastructure, shipped at least one meaningful improvement to reliability or performance, and can independently debug issues across the GPU stack.
  • In 6 months: You own a major infrastructure subsystem (training cluster or serving platform). Researchers are training faster or serving more reliably because of changes you've made.
  • In 12 months: You've shaped the technical roadmap for ML infrastructure and influenced engineering direction beyond the team. Other engineers across the organization come to you for architectural guidance. The platform scales to the next generation of models because of the systems and standards you've put in place. You've made the team stronger through hiring and mentorship.

Role Details

Shopify is the commerce platform that powers millions of merchants worldwide. Behind the product experience are ML systems that drive recommendations, search, and personalization at massive scale.

We build and maintain the operational backbone behind these systems: deployment pipelines, evaluation frameworks, data preprocessing, and the monitoring that keeps models fresh and reliable in production. Our models serve hundreds of millions of buyers, and the pipelines we build directly impact how quickly and safely we can improve merchant outcomes.

The Role

You will own the operational lifecycle of our ML systems: deployment pipelines, evaluation frameworks, data pipelines, and the monitoring and reliability layer that keeps everything running in production. You'll ensure models go from training to production safely, that we can evaluate changes rigorously, and that the data feeding our models is fresh and correct.

This role is the connective tissue between research and production. You'll build the systems that let engineers ship model improvements with confidence and speed, while maintaining the reliability standards required to serve hundreds of millions of buyers - including during peak events like Black Friday/Cyber Monday.

This role carries real technical authority. You'll set the standards for how models get deployed and evaluated, mentor engineers on operational best practices, and drive alignment on reliability and pipeline strategy across the team. You'll influence technical direction beyond your immediate team and raise the engineering bar through hiring and technical reviews.

What You'll Do

Deployment & Rollout

  • Own the model deployment pipeline end to end: export, validation, canary rollout, rollback, and A/B integration
  • Build and maintain CI/CD for ML: automated testing, model validation gates, and progressive delivery
  • Ensure safe, repeatable deployments with clear rollback paths and minimal manual intervention
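The canary-then-rollback decision at the heart of progressive delivery can be reduced to a small gate. A minimal sketch, assuming a hypothetical error-rate metric and regression budget (neither is Shopify's actual pipeline logic):

```python
def canary_decision(baseline_errors, baseline_total,
                    canary_errors, canary_total,
                    max_relative_regression=0.10):
    """Promote the canary if its error rate stays within the allowed
    regression budget relative to the baseline; otherwise roll back."""
    baseline_rate = baseline_errors / baseline_total
    canary_rate = canary_errors / canary_total
    if canary_rate <= baseline_rate * (1 + max_relative_regression):
        return "promote"
    return "rollback"

print(canary_decision(50, 10_000, 5, 1_000))  # 0.5% vs 0.5% → 'promote'
print(canary_decision(50, 10_000, 8, 1_000))  # 0.8% vs 0.55% budget → 'rollback'
```

A production gate would add statistical significance checks and multiple metrics, but the shape (compare canary slice to baseline, automate the verdict) is the same.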

Evaluation & Experimentation

  • Build automated offline evaluation pipelines against production baselines
  • Extend our experimentation framework so ML Engineers can launch and evaluate model changes with minimal friction
  • Maintain evaluation datasets and ensure data freshness and correctness
  • Integrate offline metrics with online A/B testing to close the feedback loop
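An offline evaluation gate against a production baseline can be sketched with a toy ranking metric. The models, eval set, and hit-rate@k metric here are illustrative stand-ins, not the real evaluation framework:

```python
def hit_rate_at_k(model, eval_set, k=3):
    """Fraction of examples whose true item appears in the model's top-k."""
    hits = 0
    for history, true_item in eval_set:
        if true_item in model(history)[:k]:
            hits += 1
    return hits / len(eval_set)

# Frozen eval set: (interaction history, next item actually chosen).
eval_set = [([1, 2], 3), ([4, 5], 6), ([7, 8], 0)]

baseline = lambda history: [history[-1] + 1, 0, 99]    # naive "next id" model
candidate = lambda history: [history[-1] + 1, 99, 42]  # drops the 0 fallback

b = hit_rate_at_k(baseline, eval_set)
c = hit_rate_at_k(candidate, eval_set)
print(b, c, "pass" if c >= b else "fail")
```

Here the candidate silently regresses on one slice of the eval set and the gate catches it before any online traffic is touched; that is the feedback loop the bullets above describe.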

Data Pipelines

  • Own data preprocessing for training: interaction histories, feature stores, and embedding tables
  • Manage workflow orchestration (Airflow or equivalent) for scheduled retraining and data refresh. You trigger and monitor training runs; the underlying GPU compute layer is owned by the infrastructure side of the team.
  • Ensure data quality, lineage tracking, and pipeline idempotency
  • Own data correctness and freshness; partner with infrastructure engineers on data loading throughput and efficiency
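Pipeline idempotency usually comes down to one rule: each run deterministically overwrites its partition rather than appending. A toy sketch (the in-memory `warehouse` dict stands in for a real table store; all names are illustrative):

```python
warehouse = {}

def run_partition(raw_events, date):
    """Rebuild one date partition from source data; safe to run any number of times."""
    rows = sorted({e["user"] for e in raw_events if e["date"] == date})
    warehouse[date] = rows  # overwrite, never append

events = [
    {"date": "2026-05-01", "user": "a"},
    {"date": "2026-05-01", "user": "b"},
    {"date": "2026-05-02", "user": "c"},
]

run_partition(events, "2026-05-01")
run_partition(events, "2026-05-01")  # retry: identical result, no duplicates
print(warehouse["2026-05-01"])  # → ['a', 'b']
```

Because a retry or backfill produces byte-identical output, Airflow can re-run failed tasks freely without corrupting downstream training data.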

Monitoring & Reliability

  • Build monitoring and alerting across training jobs, serving endpoints, and data pipelines
  • Define and maintain SLOs for model freshness, serving latency, and training throughput
  • Participate in incident response and drive post-mortems for ML system failures
  • Identify and eliminate toil through automation
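A model-freshness SLO check is one of the simpler alerts in that stack. A minimal sketch, assuming a hypothetical 24-hour freshness target (the SLO value and function names are illustrative):

```python
from datetime import datetime, timedelta, timezone

FRESHNESS_SLO = timedelta(hours=24)

def freshness_alert(last_refresh, now):
    """Return True when the model has been stale longer than the SLO allows."""
    return now - last_refresh > FRESHNESS_SLO

now = datetime(2026, 5, 1, 12, 0, tzinfo=timezone.utc)
ok = freshness_alert(now - timedelta(hours=6), now)     # fresh → no alert
stale = freshness_alert(now - timedelta(hours=30), now) # stale → alert
print(ok, stale)  # → False True
```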

Technical Leadership

  • Drive cross-team technical strategy for ML operations - identify systemic reliability risks and pipeline bottlenecks before they become incidents
  • Mentor and up-level engineers on the team through pairing, design reviews, and setting operational standards
  • Contribute to hiring: screen candidates, conduct technical interviews, and calibrate the engineering bar
  • Write technical proposals and RFCs that shape operational direction across the organization

What We're Looking For

Required

  • 7+ years in software engineering, with 5+ years focused on MLOps, data engineering, or production ML systems
  • Strong experience with ML deployment pipelines: model export, validation, canary releases, and rollback strategies
  • Experience with workflow orchestration for ML (Airflow, Dagster, Prefect, or similar)
  • Solid Python fundamentals; comfortable working with PyTorch model artifacts and training configurations
  • Production monitoring experience: you've built or operated alerting, dashboards, and SLO frameworks for ML systems
  • Experience with data pipelines at scale: batch processing, feature engineering, and data quality validation
  • Working proficiency with Kubernetes: able to debug pod failures, understand resource scheduling, and navigate GPU workloads
  • Demonstrated technical leadership: you've driven operational strategy, written technical proposals, and influenced engineering direction beyond your immediate team
  • Track record of mentoring engineers and raising the reliability bar on a team

Preferred

  • Experience with large-scale data warehouses (BigQuery or equivalent) for offline evaluation and metrics
  • Hands-on with experiment tracking and A/B testing frameworks
  • Experience operating recommendation or retrieval systems at scale
  • Familiarity with model compression workflows in production (quantization, pruning, distillation)
  • Experience with cloud-native ML orchestration (SkyPilot, Ray, or similar)

How We Work

  • You'll pair directly with ML Engineers. Understanding their models well enough to build the right operational workflows is part of the job.
  • We prefer automation over runbooks. If a process can be scripted, it should be.
  • On-call is shared. When you're on rotation, your scope is pipeline failures, data freshness alerts, deployment rollbacks, and evaluation integrity - you own it end to end.
  • You'll dig into Airflow DAG failures, data drift alerts, and deployment validation issues. This is a deeply operational role with high production stakes.
  • Research and production are the same codebase. You'll see your operational decisions reflected in real model quality and real merchant outcomes.
  • Shopify operates on high trust and low process. You'll have real ownership and the autonomy to make decisions, not just execute tickets.

What Success Looks Like

  • In 3 months: You've onboarded to deployment and evaluation pipelines, shipped at least one meaningful improvement to deployment safety or developer experience, and can independently debug issues across the operational stack.
  • In 6 months: You own a major subsystem (deployment pipeline, evaluation framework, or data pipelines). Researchers are shipping model changes faster or more safely because of improvements you've made.
  • In 12 months: You've shaped the operational roadmap for ML systems and influenced engineering direction beyond the team. Deployments are faster and safer, evaluation is more rigorous, and the team trusts the pipelines you've built. Other engineers across the organization come to you for guidance on ML operational best practices. You've made the team stronger through hiring and mentorship.

Role Details

We're building the future of Real-Time Merchant Analytics at Shopify!

As a Staff Engineer you'll be at the forefront of reimagining how merchant data flows through modern streaming architectures. This isn't your typical infrastructure role – you'll be crafting solutions that challenge conventional approaches to data processing at global scale.

What Makes This Exciting?

You'll work across multiple languages and technologies – Java, Ruby, Python, SQL, Flink, and ClickHouse – choosing the right tool for each challenge, modeling data elegantly, and turning data pipeline development into a configuration exercise rather than a coding marathon.

You'll tackle fascinating problems: How do you architect lightning-fast real-time modeling that seamlessly combines data from multiple tables? How do you handle late-arriving data in distributed streams? What's the most elegant approach to backfilling terabytes while maintaining real-time processing?
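The late-arriving-data question has a classic answer in event-time windowing with allowed lateness: a window stays open until the watermark passes its end plus a lateness budget. A plain-Python sketch of the idea (not Flink's actual API; the window and lateness values are illustrative):

```python
from collections import defaultdict

WINDOW = 60            # 1-minute tumbling windows (seconds)
ALLOWED_LATENESS = 30  # late events within 30s of window close still count

def window_start(ts):
    return ts - ts % WINDOW

def process(events):
    """events: (event_time, value) pairs in arrival order.
    Aggregate per window, accepting late-but-allowed events."""
    windows = defaultdict(int)
    watermark = 0
    for event_time, value in events:
        watermark = max(watermark, event_time)
        w = window_start(event_time)
        if watermark <= w + WINDOW + ALLOWED_LATENESS:
            windows[w] += value  # on time, or late but within the budget
        # else: dropped (or routed to a side output in a real pipeline)
    return dict(windows)

# (50, 1) arrives after (70, 1) but is still counted; (10, 1) arrives
# after the watermark has passed 90s and is dropped.
events = [(5, 1), (70, 1), (50, 1), (130, 1), (10, 1)]
result = process(events)
print(result)  # → {0: 2, 60: 1, 120: 1}
```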

We embrace AI and LLMs to accelerate repetitive tasks, freeing you to focus on the creative problem-solving that makes this work truly rewarding.

If you love turning "impossible" requirements into beautiful solutions, this is your playground.

What You'll Do

  • Architect, build, and refine high-performance streaming infrastructure tailored to large-scale, real-time merchant analytics.
  • Develop tools and frameworks to boost platform efficiency, scalability, and developer experience across the team.
  • Collaborate with cross-functional teams to integrate streaming systems with Shopify's broader data ecosystem.
  • Partner with product and data teams to influence the technical roadmap and shape the future of merchant analytics.
  • Mentor and uplevel engineers on the team, fostering an environment of innovation and technical excellence.

What You'll Need

  • Extensive experience in data infrastructure engineering, particularly in building and scaling real-time data platforms.
  • Strong knowledge of Apache Flink or similar stream processing frameworks (Kafka Streams, Spark Streaming).
  • Proficiency in multiple programming languages (Java, SQL required; Python, Ruby a plus).
  • Experience with analytical databases like ClickHouse or BigQuery.
  • Strong understanding of containerization (Docker, Kubernetes).
  • Deep expertise in handling distributed systems challenges: late-arriving data, exactly-once semantics, backfill strategies, and data consistency.
  • Outstanding problem-solving skills with a focus on complex technical challenges at scale.
  • A collaborative mindset and the ability to thrive in a diverse, dynamic team environment.

Role Details

About the role

Join Shopify's dynamic engineering team, where code is core and innovation drives commerce forward. As a Senior Staff Engineer on the Measurement team, you'll lead the development of cutting-edge marketing analytics tools that empower merchants to optimize customer acquisition and maximize return on ad spend. Collaborate with data engineers, product teams, and partners to build unbiased insights, integrations, and reports that analyze paid advertising performance. Help shape the future of e-commerce analytics in a fast-paced environment, solving complex attribution challenges to make data-driven decisions accessible for entrepreneurs worldwide.

Key Responsibilities:

  • Design scalable data pipelines for marketing analytics and attribution models measuring ad impact.
  • Build integrations with partners (e.g., Google, Meta, Pinterest, TikTok) for real-time ad data processing.
  • Develop analytics systems using streaming tech like Flink and Kafka for large datasets and timely insights.
  • Own end-to-end delivery of measurement tools, from prototyping to production APIs and reports.
  • Optimize data processing efficiency with SQL and ClickHouse to keep query latencies fast.
  • Influence product roadmap and advocate strategic direction with executive stakeholders.
  • Evolve measurement infrastructure for emerging paradigms like multi-touch attribution and AI metrics.
  • Mentor engineers and collaborate on unbiased insights to optimize merchant ad strategies.
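The simplest multi-touch attribution model splits a conversion's credit evenly across the touchpoints that preceded it. A toy sketch for orientation (the channel names and the linear credit rule are illustrative, not Shopify's actual attribution model):

```python
def linear_attribution(touchpoints, conversion_value):
    """Give each touchpoint an equal share of the conversion value."""
    share = conversion_value / len(touchpoints)
    credit = {}
    for channel in touchpoints:
        credit[channel] = credit.get(channel, 0) + share
    return credit

# A buyer saw two Google ads, one Meta ad, and one TikTok ad before a $100 order.
journey = ["google", "meta", "google", "tiktok"]
print(linear_attribution(journey, 100.0))  # → {'google': 50.0, 'meta': 25.0, 'tiktok': 25.0}
```

Production systems replace the equal-share rule with position-based or data-driven weights, but the pipeline shape (join ad touchpoints to conversions, then allocate credit) is the same.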

Qualifications:

  • Proven data engineering experience building scalable analytics systems.
  • Strong skills in streaming tech like Apache Flink, Spark, and Kafka for real-time processing.
  • Expertise in analytics, attribution modeling, and marketing/e-commerce metrics.
  • Experience with APIs for data integrations and strong SQL proficiency.
  • Ruby is a plus but not required.
  • Skills in performance optimization for data pipelines and large datasets.
  • Strong collaboration with teams like data scientists and product managers.

At Shopify, we pride ourselves on moving quickly—not just in shipping, but in our hiring process as well. If you’re ready to apply, please be prepared to interview with us within the week. Our goal is to complete the entire interview loop within 30 days. You will be expected to complete a pair programming interview, using your own IDE.

This role may require on-call work.

Ready to craft the world’s best marketing analytics tools and drive data-powered commerce forward? Join us and make commerce better for everyone.

Role Details

Step into the engine room of Agentic Commerce! Imagine owning the bleeding edge of machine learning at Shopify, where your acceleration, optimization, and scaling of ML inference will shape the experience of millions of merchants, and influence how commerce AI is done worldwide. We’re seeking a Senior Staff Engineer to architect, optimize, and own the high-performance runtime that transforms innovative models into production breakthroughs. Your work will be the engine behind our real-time AI systems, driving game-changing cost and latency reductions, and enabling rapid launches of intelligent features that keep Shopify (and our merchants) years ahead. Join a remote-first team of world-class experts, experiment fearlessly, and see your code move the needle for some of the largest-scale ML workloads in commerce.

Responsibilities

  • Architect, optimize, and own Shopify’s production ML inference, designing for high throughput, ultra-low latency, and global reliability.
  • Leverage and extend technologies like CUDA, TensorRT, Triton, TVM, and custom GPU kernels to deliver state-of-the-art performance and efficiency at scale.
  • Partner with ML, infrastructure, and product teams to seamlessly deploy, benchmark, and scale cutting-edge models powering our platform.
  • Drive cost optimization and system efficiency, reducing cloud spend and carbon footprint by orders of magnitude without sacrificing model quality.
  • Lead deep performance investigations, apply advanced techniques (pruning, quantization, distillation, batching), and implement robust solutions for serving models in production.
  • Set technical strategy and culture for ML inference across Shopify, mentoring others and collaborating with global AI pioneers.
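The quantization lever mentioned above can be shown in miniature: symmetric int8 quantization maps float weights to 8-bit integers with a single scale. This is only the core idea (real stacks like TensorRT use per-channel scales and calibration); the function names are illustrative:

```python
def quantize_int8(weights):
    """Return (int8 values, scale) for a symmetric linear quantization."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.5, -1.27, 0.02, 1.0]
q, s = quantize_int8(w)
restored = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(w, restored))
print(q, round(max_err, 4))
```

The payoff is 4x smaller weights and faster integer math on GPU tensor cores, at the cost of a bounded reconstruction error; quantifying that speed/accuracy tradeoff per model is part of this role.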

Qualifications

  • Proven, hands-on expertise in building and optimizing large-scale ML inference systems, with measurable performance and cost wins.
  • Deep experience in production model serving, runtime optimization, and acceleration, especially leveraging GPUs (CUDA, TensorRT) and high-performance deep learning infrastructure.
  • Strong software engineering skills (Python, C++, and/or other relevant languages) with a robust systems and distributed computing mindset.
  • Demonstrated leadership in architecting or scaling reliable, real-time inference at scale, handling millions of queries per day.
  • Track record of cross-functional impact: working closely with ML research/engineering, infra, and product teams to deliver production results.
  • Advanced understanding of model compression, quantization, efficient deployment, and tradeoffs between speed, cost, and accuracy.

Nice to Haves

  • Open source contributions to inference frameworks (TensorRT, TVM, Triton, DeepSpeed, ONNX, etc.) or technical talks/publications at leading AI conferences.
  • Experience optimizing inference across a variety of hardware (NVIDIA, AMD, ARM, cloud TPUs).
  • Familiarity with building or integrating robust monitoring, observability, and auto-scaling for inference platforms.
  • Experience with modern MLOps pipelines and methodologies.
  • Prior experience in e-commerce, large-scale product infra, or globally distributed inference workloads.

At Shopify, we pride ourselves on moving quickly—not just in shipping, but in our hiring process as well. If you're ready to apply, please be prepared to interview with us within the week. Our goal is to complete the entire interview loop within 30 days. You will be expected to complete a live pair programming session; come prepared with your own IDE.