Site Reliability Engineer - Kubernetes - Data Platforms - Adyen
Site Reliability Engineer - Kubernetes - Data Platforms
Infrastructure
Amsterdam
This is Adyen
Adyen provides payments, data, and financial products in a single solution for customers like Meta, Uber, H&M, and Microsoft - making us the financial technology platform of choice. At Adyen, everything we do is engineered for ambition.
For our teams, we create an environment with opportunities for our people to succeed, backed by the culture and support to ensure they are enabled to truly own their careers. We are motivated individuals who tackle unique technical challenges at scale and solve them as a team. Together, we deliver innovative and ethical solutions that help businesses achieve their ambitions faster.
Platform/Site Reliability Engineer - Kubernetes & Big Data
You will be building the rails of a self-service data platform inside Adyen, creating an ecosystem that is bigger than the sum of its parts. By blending Site Reliability Engineering, Software Engineering, Systems Engineering, and Data Engineering, you will power the many data, machine learning, and GenAI products running across Adyen.
You’ll be joining a dedicated team of 9 engineers, split between Kubernetes cluster management and the core services running on top of those clusters. We work in a flexible, Kanban-style environment, sitting right in the middle of our users. This proximity gives us a direct feedback loop, allowing us to build impactful solutions for both the "happy flow" and the "sad flow."
Beyond operations, you’ll have the opportunity to design, build, and scale infrastructure from the ground up on our on-premise environments, solving for yourself the problems that managed cloud providers typically handle. If you thrive on tackling real-life challenges and reducing manual toil through automation, and you want unparalleled growth opportunities in SWE, Systems, or Data Engineering, this is your team.
What you’ll do
- Design & Build On-Premise (Kubernetes) Infrastructure: Architect and scale modern, cloud-like services from the ground up on our on-premise infrastructure, managing core foundational layers including DNS, TLS, certificate management, and load balancers, and troubleshooting them in depth.
- Cluster Provisioning & Reliability: Build, maintain, and scale new Kubernetes clusters and Big Data services. You will maintain agreed SLOs, ensure high availability, and support end-users by keeping them unblocked.
- Mixed Workload Balancing: Prevent resource starvation by ensuring massive batch compute and ML training jobs do not consume resources required by critical, user-facing GenAI inference services and API gateways.
- Advanced Scheduling & Hardware Management: Enforce strict priority, preemption, and specialized scheduling policies (such as gang scheduling). Orchestrate diverse hardware profiles, managing GPU node pools, drivers, device plugins, and resource slicing to support intensive ML/AI processing.
- Storage & Network Optimization: Scale stateful workloads, Persistent Volumes (PVs), and high-throughput networking interfaces to handle massive data gravity and mitigate I/O bottlenecks.
- FinOps & Security: Implement intelligent autoscaling and interruptible instance management to control bursty infrastructure costs. Apply strict resource quotas, RBAC, and network policies to prevent "noisy neighbor" disruptions and guarantee secure isolation across different tenant teams.
- Automation & Operations: Dedicate time to developing new features, applying releases, and building automations that eliminate unacceptable toil. Participate in an expanding 24x7 on-call roster to support the platform.
Who you are
- Experienced Platform/SRE Professional: You have a strong background in System Administration and Kubernetes management, with proven experience building and operating distributed systems.
- Technical Expertise: You have hands-on experience with K8s, Linux, foundational networking (DNS, TLS, load balancing), and GitOps tooling such as ArgoCD.
- Tooling & Ecosystems: You are highly proficient with configuration management and/or networking tools (Ansible, Puppet, Cilium, HAProxy, Nginx) and/or distributed storage and data systems (Hadoop, MinIO, Ozone, Ceph, Mayastor).
- Observability Mindset: You have experience implementing and managing alerting and monitoring to keep complex systems healthy.
- Good to have: A background in Software Engineering, specialized networking, or GPU management. Familiarity with data ecosystem tools like Airflow and HDFS is highly appreciated.
- Ambitious & Collaborative: You are eager to grow (whether on an IC, Tech Leadership, or People Leadership track) and enjoy working closely with your users and team members to solve complex, scale-driven problems.
Our Diversity, Equity and Inclusion commitments
Our unique approach is a product of our diverse perspectives. This diversity of backgrounds and cultures is essential in helping us maintain our momentum. Our business and technical challenges are unique, and we need as many different voices as possible to join us in solving them - voices like yours. No matter who you are or where you’re from, we welcome you to be your true self at Adyen.
Studies show that women and members of underrepresented communities apply for jobs only if they meet 100% of the qualifications. Does this sound like you? If so, Adyen encourages you to reconsider and apply. We look forward to your application!
What’s next?
Ensuring a smooth and enjoyable candidate experience is critical for us. We aim to get back to you regarding your application within 5 business days. Our interview process tends to take about 4 weeks to complete, but may fluctuate depending on the role. Learn more about our hiring process here. Don’t be afraid to let us know if you need more flexibility.
This role is based out of our Amsterdam office. We are an office-first company and value in-person collaboration; we do not offer remote-only roles.
The Adyen Formula
The way we work is guided by the eight principles of the Adyen Formula. Learn more here.
Tech careers
Our engineers are building the first financial technology platform that combines payments, data, and financial services. We’re looking for more talented problem solvers to help us address a unique set of challenges.
About Adyen
Adyen is a global payments platform that provides end-to-end payment processing, merchant acquiring, and issuing solutions to large global companies, enabling them to accept payments anywhere in the world.
Industry: Payments Technology & Financial Services