NVIDIA is the platform for every new AI-powered application. We are seeking a Principal Software Engineer - AI Inference to advance open-source LLM serving. This role involves contributing to upstream inference engines such as vLLM and SGLang, ensuring they run exceptionally well on NVIDIA GPUs and systems…
Hiring in: US, CA, Santa Clara
Full-time · Sourced
High School (S.S.C.E) · 15 years · NVIDIA
We are seeking highly skilled and motivated software engineers to join us and build AI inference systems that serve large-scale models with extreme efficiency. You’ll architect and implement high-performance inference stacks, optimize GPU kernels and compilers, drive industry benchmarks, and scale w…
The vLLM and LLM-D Engineering team at Red Hat is looking for a customer-obsessed developer to join our team as a Forward Deployed Engineer. In this role, you will not just build software; you will be the bridge between our cutting-edge inference platform (LLM-D and vLLM) and our customers' m…
Hiring in: Boston
Full-time · Sourced
Associate · General · Red Hat, ...
- Develop APIs for AI inference that will be used by both internal and external customers
- Benchmark and address bottlenecks throughout our inference stack
- Improve the reliability and observability of our systems and respond to system outages
- Explore novel research and implement LLM i…
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior AI Inference Compiler Engineer in the United States. This role offers the opportunity to advance the performance and efficiency of AI inference engines across GPUs, personal devices, robotics, a…
At Red Hat, we believe the future of AI is open, and we are on a mission to bring the power of open-source LLMs and vLLM to every enterprise. The Red Hat Inference team accelerates AI for the enterprise and brings operational simplicity to GenAI deployments. As leading developers and maintainers of the vLLM …
Hiring in: Boston
Full-time · Sourced
Associate · 10 years · Red Hat, ...
- Develop APIs for AI inference that will be used by both internal and external customers
- Benchmark and address bottlenecks throughout our inference stack
- Improve the reliability and observability of our systems and respond to system outages
- Explore novel research and implement LLM …
Hiring in: San Francisco
Full-time · Sourced
Professional Certificate · 1 year · Perplexit...
NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-d…
Hiring in: US, CA, Santa Clara
Full-time · Sourced
General · 3 years · NVIDIA
Job Summary: At Red Hat, we believe the future of AI is open, and we are on a mission to bring the power of open-source LLMs and vLLM to every enterprise. The Red Hat Inference team accelerates AI for the enterprise and brings operational simplicity to GenAI deployments. As leading developers and maintainers …
Hiring in: Boston
Full-time · Sourced
Associate · General · Red Hat, ...
Modern data centers are transforming into AI factories, and NVIDIA accelerated computing is the engine of artificial intelligence. Our data center platforms integrate CPUs, GPUs, DPUs, networking, and a full-stack software ecosystem to power AI at scale. We are looking for a Senior Technical Marketi…