Senior Backend Engineer, Inference Platform job opportunity at Together AI.



Together AI Senior Backend Engineer, Inference Platform
Experience: 5+ years
Employment type: Full-time

Backend Engineering

Degree: Bachelor's (B.Sc.)
Location: San Francisco, California, United States of America

Together AI is building the Inference Platform that brings the most advanced generative AI models to the world. Our platform powers multi-tenant serverless workloads and dedicated endpoints, enabling developers, enterprises, and researchers to harness the latest LLMs and multimodal, image, audio, video, and speech models at scale.

If you get a thrill from optimizing latency down to the last millisecond, this is your playground. You’ll work hands-on with tens of thousands of GPUs (H100s, H200s, GB200s, and beyond), figuring out how to fully utilize every FLOP and every gigabyte of memory. You’ll collaborate directly with research teams to bring frontier models into production, making breakthroughs usable in the real world. Our team also works closely with the open source community, contributing to and leveraging projects like SGLang, vLLM, and NVIDIA Dynamo to push the boundaries of inference performance and efficiency.

- Build and optimize global and local request routing, ensuring low-latency load balancing across data centers and model engine pods
- Develop auto-scaling systems to dynamically allocate resources and meet strict SLOs across dozens of data centers
- Design systems for multi-tenant traffic shaping, tuning both resource allocation and request handling (including smart rate limiting and regulation) to ensure fairness and a consistent experience across all users
- Engineer trade-offs between latency and throughput to serve diverse workloads efficiently
- Optimize prefix caching to reduce model compute and speed up responses
- Collaborate with ML researchers to bring new model architectures into production at scale
- Continuously profile and analyze system-level performance to identify bottlenecks and implement optimizations

Other AI Matches

Customer Support Engineer, India: Applicants are expected to have solid experience handling customer service tasks.
Revenue Accounting Manager: Applicants are expected to have solid experience handling finance tasks.
LLM Training Resilience Engineer: Applicants are expected to have solid experience handling engineering tasks.
Solutions Architect: Applicants are expected to have solid experience handling solutions architect tasks.
Staff Brand and Content Marketing Manager: Applicants are expected to have solid experience handling content marketing tasks.
Senior Strategic Sourcing & Procurement Lead, Compute: Applicants are expected to have solid experience handling procurement tasks.
Machine Learning Engineer: Applicants are expected to have solid experience handling engineering tasks.
Research Scientist, Post-Training: Applicants are expected to have solid experience handling research tasks.
Machine Learning Engineer - Inference: Applicants are expected to have solid experience handling engineering tasks.
Staff PR & Communications Manager: Applicants are expected to have solid experience handling communications manager tasks.
Senior Systems Administrator: Applicants are expected to have solid experience handling system administrator tasks.
Senior Software Engineer - Together Cloud Platform: Applicants are expected to have solid experience handling software engineer tasks.
Senior Backend Engineer, Inference Platform: Applicants are expected to have solid experience handling backend engineering tasks.
Systems Research Engineer, GPU Programming: Applicants are expected to have solid experience handling research tasks.
Rust Systems Engineer - Inference: Applicants are expected to have solid experience handling system engineer tasks.
LLM Training Dataset and Checkpoint Optimization Engineer: Applicants are expected to have solid experience handling engineering tasks.
Machine Learning, Platform Engineer: Applicants are expected to have solid experience handling engineering and developer tasks.
Infrastructure Engineer, Data Platform: Applicants are expected to have solid experience handling engineering tasks.
Distributed ML Systems Engineer - Inference: Applicants are expected to have solid experience handling engineer tasks.
Senior Software Development Engineer in Test: Applicants are expected to have solid experience handling software engineer tasks.
LLM Inference Frameworks and Optimization Engineer: Applicants are expected to have solid experience handling engineering tasks.
GPU Cluster Resource Scheduling and Optimization Engineer: Applicants are expected to have solid experience handling engineering tasks.
Project Manager, Compute & Business Operations: Applicants are expected to have solid experience handling project manager tasks.