Engineering Manager - Model Development, Machine Learning Platform position at Netflix Inc. | United States of America

Experience: General

Pattern: Onsite

Bachelor's (B.Sc.)

Los Gatos, Cali..., United States of America

Netflix is one of the world's leading entertainment services, with over 300 million paid memberships in over 190 countries enjoying TV series, films and games across a wide variety of genres and languages. Members can play, pause and resume watching as much as they want, anytime, anywhere, and can change their plans at any time.

Machine Learning drives innovation across all product functions and decision-support needs, and building highly scalable and differentiated ML infrastructure is critical to accelerating this innovation. Our Machine Learning Platform (MLP) maximizes the impact of ML by building differentiated, scalable infrastructure that accelerates research and product iteration across recommendations, growth, studio, content understanding, and emerging generative AI use cases.

The Opportunity:

The Model Development & Management (MDM) team builds and evolves the unified developer experience (SDKs, frameworks, and libraries) that powers end-to-end model creation at Netflix. We focus on maximizing practitioner velocity while making infrastructure complexity invisible, integrating tightly with data/feature, training, serving, and evaluation pillars. Our portfolio-with-paved-paths strategy (Metaflow and other libraries exposed through one opinionated SDK) supports teams from a single data scientist to 100+ MLEs and model scales from ~10M to 100B+ parameters, spanning classic personalization, content understanding, and multimodal GenAI.

We are looking for an experienced ML/AI infrastructure engineering leader to manage MDM and drive the next generation of Netflix’s model development platform! You will lead the team to architect, build, test, and launch a cohesive SDK and set of opinionated templates that let practitioners scaffold projects, configure and execute runs (from laptop to tightly coupled multi-node GPU training), track experiments and lineage, package models with evaluation hooks, and promote them confidently.
Your work will enable partners across content, studio, consumer, ads, and games to develop and iterate on large-scale models (including LLMs, recommenders, computer vision, and foundation models) throughout the full lifecycle from early research and experimentation to productization and ongoing optimization. Success will be measured by concrete developer-experience KPIs such as time-to-first successful remote run, run success rate (ex-user code), mean time to actionable diagnosis, adoption of paved paths, and template reuse.

We are a highly collaborative team. You will operate cross-functionally with Training Platform and Offline Inference, Serving Systems, Feature/Data Infrastructure, and MLP Tooling to deliver a seamless, consistent experience end-to-end.

To thrive here, you bring a strong ML infrastructure background (SDK/CLI design, packaging and environments, experiment tracking/lineage, observability), excellent product taste for developer experience, and the judgment to balance paved-path simplicity with power-user control.
You’ll design for extensibility as the space evolves, keep interfaces stable with clear deprecation policies, and prioritize measurable outcomes that lift practitioner velocity across Netflix.

In this role, you will:

- Partner with ML practitioners and adjacent pillars (Feature/Data, Training, Serving, Evaluation) to translate needs into a unified developer experience that hides infrastructure complexity while preserving expert control.
- Drive the strategy and vision of the Model Development SDK, owning the portfolio of existing and new products, making build-vs-buy choices, and integrating libraries/frameworks into the unified platform.
- Build and execute a metrics-led roadmap: define Developer Experience (DX) KPIs, plan incremental delivery and migrations, and demonstrate impact through adoption and reuse.
- Maintain and evolve current product offerings that are widely adopted both in OSS and internally (e.g., Metaflow).
- Communicate progress, milestones, and risks to stakeholders, customers, and senior leadership.
- Hire, grow, and coach a diverse team across Core Frameworks and User Experience pods (and incubate Exploratory Infra as needs emerge), fostering an inclusive, high-ownership culture.

To succeed in this role, you will need:

- 10+ years of software engineering experience and 3+ years building and leading engineering teams.
- Experience leading teams responsible for building state-of-the-art ML model development platforms that cover the full model development lifecycle.
- A track record working on distributed ML infrastructure that spans laptop-to-cluster execution, supports multi-node GPU training, and serves large-scale models (recommenders, computer vision, LLMs, multimodal GenAI).
- Deep familiarity with containerization/orchestration, dependency and environment management (e.g., pinned specs, environment locks), and secure packaging practices for reliable, repeatable runs.
- Proficiency with ML frameworks and commercial ML/AI infrastructure, such as PyTorch, SageMaker, Ray, and Hugging Face.
- Strong technical acumen: act as a credible technical advisor to the team, set and enforce a high-quality bar for code and system design, and mentor engineers across levels.
- A passion for translating the needs of ML practitioners into platform offerings with an emphasis on automation and self-service capabilities.
- Strong communication and collaboration skills, with the ability to build durable relationships with internal customers and external partners.
- Demonstrated ability to develop, drive, and execute a technical vision and roadmap.
- A track record of attracting top talent and growing a high-performing, diverse team of tenured engineers to deliver results in a fast-paced environment.
- Experience managing a hybrid team with partners and team members distributed across U.S. geographies and time zones.

To learn more about our ML Platform, you can review the relevant talks/blog posts on the Netflix ML Platform Research website.

At Netflix, we carefully consider various compensation factors to determine your personal top of market. We rely on market indicators to determine compensation and consider your specific job, skills, and experience to get it right. These considerations can cause your compensation to vary and will also depend on your location.

The overall market range for roles in this area of Netflix is typically $190,000 - $920,000. This market range is based on total compensation (vs. only base salary), which is in line with our compensation philosophy. Netflix has a unique culture and environment. Learn more here.

Inclusion is a Netflix value and we strive to host a meaningful interview experience for all candidates. If you want an accommodation/adjustment for a disability or any other reason during the hiring process, please send a request to your recruiting partner.

We are an equal-opportunity employer and celebrate diversity, recognizing that diversity builds stronger teams. We approach diversity and inclusion seriously and thoughtfully.
We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.
