Software Engineer - Observability job opportunity at xAI.



xAI Software Engineer - Observability
Experience: General
Pattern: full-time

Product

Degree: OND
Location: Palo Alto, CA, United States of America

About xAI
xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.

About the Team
The Observability team builds and operates the core infrastructure that enables engineers to monitor, debug, and optimize the performance and reliability of their systems. We handle telemetry at massive scale (billions of time series and petabytes of logs) with strict performance and availability requirements.

About the Role
You will be part of the small, high-impact team responsible for building and maintaining X’s observability platform. You’ll own critical systems that power metrics, logs, tracing, and alerting, enabling engineering teams to operate services at scale, identify issues before they impact users, and drive systemic reliability improvements.

What You’ll Do
Design and implement scalable observability infrastructure for metrics, logging, and tracing.
Build high-performance telemetry pipelines that handle massive ingestion volumes.
Develop APIs, query engines, and UIs that allow engineers to get real-time insights into their services.
Define and enforce best practices for instrumentation, alerting, and reliability across the company.
Partner with infrastructure and product teams to deeply integrate observability into our internal platforms.
Own the reliability, scalability, and performance of the observability stack end-to-end.

Ideal Candidate
Production-level proficiency in Go, Rust, Scala, or a similar language.
Deep understanding of distributed systems and telemetry architecture.
Experience building and operating infrastructure at scale.
Familiarity with observability stacks such as Prometheus, Grafana, OpenTelemetry, VictoriaMetrics, or ClickHouse.
Experience with Kafka, Redis, or large-scale time series databases.
Experience operating observability pipelines in Kubernetes or similar orchestration environments.

Locations
We hire engineers in Palo Alto and San Francisco. Our team usually works from the office 5 days a week but allows work-from-home days when required. Candidates who join in San Francisco must make it to Palo Alto at least twice a week.

Interview Process
After submitting your application, the team reviews your CV and statement of exceptional work. If your application passes this stage, you will be invited to a 15-minute interview (“phone interview”) during which a member of our team will ask some basic questions. If you clear the initial phone interview, you will enter the main process, which consists of 2 technical interviews and 1 project deep-dive interview:
Practical coding assessment in a language of your choice.
Systems design hands-on: Demonstrate practical skills in a live problem-solving session.
Project deep-dive: Present and answer questions about exceptional work that you’ve done.
Meet and greet with the wider team.
Our goal is to finish the main process within one week. Final interviews will be conducted in person.

Annual Salary Range
$180,000 - $440,000 USD

Benefits
Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks.

xAI is an equal opportunity employer. For details on data processing, view our Recruitment Privacy Notice.
