Manager, Large Language Model Inference job opportunity at NVIDIA.



Date: More than 30 days ago
NVIDIA Manager, Large Language Model Inference
Experience: 3 years
Pattern: full-time


Degree: PhD
Location: US, CA, Santa Clara, United States of America

At NVIDIA, we aren't just powering the AI revolution; we're accelerating it. The TensorRT inference platform is the backbone of modern AI, delivering the industry's fastest and most efficient deployment of cutting-edge deep learning models on every NVIDIA GPU. With demand for AI exploding, particularly in the realm of large language models (LLMs) and vision language models (VLMs, VLAs), we are significantly expanding our team. We're seeking a highly skilled and driven Engineering Manager to take the lead in developing the next generation of LLM/VLM/VLA inference software technologies that will define the future of AI.

This is a high-impact, hands-on leadership role at the intersection of deep technical expertise and world-class management. You won't just manage; you'll architect and guide a brilliant team of engineers who are building the core LLM inference runtime. Your work will be highly collaborative, interfacing directly with NVIDIA Researchers, GPU Architects, and other teams across the company to ensure we ship production-grade, lightning-fast software that sets the global standard for AI performance.

What You’ll Be Doing:
Lead and grow a team responsible for specialized kernel development, runtime optimizations, and frameworks for LLM inference.
Drive the design, development, and delivery of production inference software, targeting NVIDIA's next-generation enterprise and edge hardware platforms.
Integrate cutting-edge technologies developed at NVIDIA and offer an intuitive developer experience for LLM deployment.
Lead software development execution, with responsibility for project planning, milestone delivery, and cross-functional coordination.

What We Need to See:
MS, PhD, or equivalent experience in Computer Science, Computer Engineering, AI, or a related technical field.
7+ years of overall software engineering experience, including 3+ years of technical leadership experience.
Proven ability to lead and scale high-performing engineering teams, especially across distributed and cross-functional groups.
Strong background in C++ or Python, with expertise in software design and delivering production-quality software libraries.
Demonstrated expertise in large language models (LLM) and/or vision language models (VLM).

Ways to Stand Out from the Crowd:
Deep understanding of GPU architecture, CUDA programming, and system-level performance tuning.
Background in LLM inference or working with frameworks such as TensorRT-LLM, vLLM, or SGLang.
Passion for building scalable, user-friendly APIs and enabling developers in the AI ecosystem.
Proven track record of growing and managing a team that encourages idea sharing, empowers team members, and provides opportunities for professional growth.

We are widely considered to be one of the technology world’s most desirable employers, and we have some of the most forward-thinking and hardworking people in the world working with us. Due to outstanding growth, our best-in-class teams are rapidly growing. If you're a creative self-starter with a real passion for technology, then come join us.

#LI-Hybrid

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 2, and 224,000 USD - 356,500 USD for Level 3. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until January 13, 2026. This posting is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
