Senior ML Infrastructure Engineer job opportunity at Ellison Institute of Technology.



Date: 2025-12-12T14:20:53.315Z
Ellison Institute of Technology Senior ML Infrastructure Engineer
Experience: General
Pattern: full-time

Degree: General
Location: Oxford, United Kingdom

At the Ellison Institute of Technology (EIT), we’re on a mission to translate scientific discovery into real-world impact. We bring together visionary scientists, technologists, policy makers, and entrepreneurs to tackle humanity’s greatest challenges in four transformative areas:

- Health, Medical Science & Generative Biology
- Food Security & Sustainable Agriculture
- Climate Change & Managing CO₂
- Artificial Intelligence & Robotics

This is ambitious work - work that demands curiosity, courage, and a relentless drive to make a difference. At EIT, you’ll join a community built on excellence, innovation, tenacity, trust, and collaboration, where bold ideas become real-world breakthroughs. Together, we push boundaries, embrace complexity, and create solutions to scale ideas from lab to society. Explore more at www.eit.org

Our MLOps team

Join our MLOps team to build the cloud and compute foundation that enables scientific breakthroughs. Deliver reliable, secure platforms and self-service guardrails that accelerate experimentation and turn ideas into results: faster, at scale, and with confidence.

Day-to-day, you might:

- Build, operate, and continuously optimise our high-performance GPU training and inference clusters, focusing on robust, high-availability scheduling, isolation, and automated lifecycle management.
- Drive systems design and implementation for high-throughput data paths, optimising I/O, caching, and data locality across compute and storage (including our current Lustre implementation).
- Proactively benchmark, profile, and resolve performance bottlenecks across the compute, network, and orchestration layers to maximise efficiency for distributed training and inference.
- Establish comprehensive observability, resilience, and automated security controls to ensure compliance and robust operation of sensitive research environments.
- Partner with Research, Data, and Applied teams to forecast capacity and cost for GPU and storage needs, setting quotas and streamlining ML experimentation pipelines.

What makes you a great fit:

- Proven experience leading the design, build, and operation of high-performance ML compute clusters at scale
- A proactive, autonomous approach to systems design and the proven ability and desire to ideate, co-create and implement optimal solutions
- Exposure to migrating or transforming ML infrastructure from traditional schedulers to modern, containerised systems
- Expertise with high-throughput storage systems for ML/HPC workloads
- Expert-level understanding of GPU architecture, high-speed networking for distributed training, and performance profiling to resolve bottlenecks
- A solid grasp of IaC and CI/CD practices (e.g., Terraform, Argo CD)

Other AI Matches

Senior Microbiologist - Pathogen
CMMS Administrator
Head of Metabolic Analysis and Modelling - Plant Biology Institute
Document Controller
Logistics and Distributions Administrator
Research Assistant - Generative Biology Institute
Backend Software Engineer - Pathogen
Group Leader, Cell Based Production (Growth and Morphology) - PBI
Operations Manager - Generative Biology Institute
(Senior) Group Leader, Plant Phenotyping - Plant Biology Institute
Environmental Health and Safety (EHS) Coordinator
Biological Safety Officer
Postdoctoral Research Fellow - Generative Biology Institute
Instrumentation Software Engineer - Pathogen
Senior Scientist, Biotechnology - Pathogen
Platform Engineer - Pathogen
Receptionist
Head of Bioinformatics - Plant Biology Institute
(Or Senior) Data Platform Architect - Generative Biology Institute
(Senior) Group Leader, Nitrogen Fixation - Plant Biology Institute
Senior HR and Payroll Administrator
Multiskilled Maintenance Engineer
Security Operations Engineer