Deep Learning Kernel Software Performance Architect job opportunity at NVIDIA.



Date posted: 21 days ago
NVIDIA Deep Learning Kernel Software Performance Architect
Experience: General
Pattern: full-time
Degree: PhD
Location: Shanghai, China

NVIDIA is seeking Software Performance Architects to optimize GPU kernel performance for state-of-the-art data-center platforms. We build automated, data-driven workflows to detect, explain, and prevent performance regressions across key deep learning workloads, partnering closely with kernel developers, compiler teams, infrastructure, and architecture/performance groups.

What you'll be doing:

Performance analysis + debugging
- Validate and analyze performance of GPU-accelerated kernels and key deep learning building blocks.
- Debug performance issues end-to-end: reproduce, isolate root causes, propose fixes or mitigation paths, and drive closure with the owning teams.
- Build performance narratives using structured evidence: baselines, controlled comparisons, and regression attribution.

Automation + regression infrastructure (Python-heavy)
- Develop and maintain Python-based automation for performance testing and analysis, using modern AI-assisted developer tools (e.g., Cursor/Claude Code/Copilot) to accelerate scripting while keeping code maintainable and reviewable.
- Design and operate performance test workflows: coverage definition, test/workload generation, automated large-scale execution (CI/nightly/on-demand), rerun rules, and reproducibility standards.
- Convert raw run outputs into actionable insight: statistics, noise control, post-processing, visualization, and large-scale result mining.

Cross-team collaboration and operating model
- Work with kernel developers and compiler/rotation teams to ensure performance checks are practical, scalable, and aligned to release needs.
- Partner with SWQA and infrastructure teams for execution at scale and reliable pipelines/dashboards.
- Contribute to clear ownership/triage/routing rules so regressions close quickly and consistently.
- Follow general software engineering best practices, including support for regression testing and CI/CD flows.

What we need to see:
- Master's or PhD degree or equivalent experience in Computer Science, Computer Engineering, Applied Math, or a related field.
- Strong programming ability in Python plus C/C++ (performance-oriented code reading/debugging).
- Solid fundamentals in computer architecture and performance reasoning (latency/throughput, memory hierarchy, parallelism).
- Experience with performance analysis workflows: profiling, measurement methodology, reproducibility, and regression triage.
- Comfortable working across teams and driving issues to decision/closure with clear communication.
- Demonstrated strong C++ programming and software design skills, including debugging, performance analysis, and test design.
- Experience with performance-oriented parallel programming, even if it's not on GPUs (e.g., with OpenMP or pthreads).
- Solid understanding of computer architecture and some experience with assembly programming.
- Ability to identify bottlenecks, optimize resource utilization, and improve throughput.

Ways to stand out from the crowd:
- Experience with high-performance kernels or math libraries (e.g., GEMM/attention, CUTLASS-like concepts).
- Experience building CI/nightly regression systems, dashboards, or large-scale performance analytics.
- GPU programming/perf experience (CUDA or equivalent parallel programming).
- Strong ML/DL workload understanding (training/inference shapes, precision modes, perf bottlenecks).
- Familiarity with simulators/analytical modeling or performance characterization methodology.
