AI Computing Performance Architect Intern, Perf Analysis and Kernel Dev - 2026 job opportunity at NVIDIA.



Date: More than 30 days ago
NVIDIA AI Computing Performance Architect Intern, Perf Analysis and Kernel Dev - 2026
Experience: General
Pattern: full-time
Salary:
Status:


Degree: OND
Location: Shanghai, China

NVIDIA is developing processor and system architectures that accelerate machine learning, automotive, and high-performance computing applications. We are seeking a strong candidate to do performance analysis and kernel development for NVIDIA's new architectures. Your work will play a critical role in shaping the future of deep learning hardware and software, ensuring optimal performance for next-generation AI applications. This position offers the opportunity to make a meaningful impact in a fast-moving, technology-focused company.

What you'll be doing:

- Design, develop, and optimize major layers in LLMs (e.g., attention, GEMM, inter-GPU communication) for NVIDIA's new architectures.
- Implement and fine-tune kernels to achieve optimal performance on NVIDIA GPUs.
- Conduct in-depth performance analysis of GPU kernels, including attention and other critical operations.
- Identify bottlenecks, optimize resource utilization, and improve throughput and power efficiency.
- Create and maintain workloads and micro-benchmark suites to evaluate kernel performance across various hardware and software configurations.
- Generate performance projections, comparisons, and detailed analysis reports for internal and external stakeholders.
- Collaborate with architecture, software, and product teams to guide the development of next-generation deep learning hardware and software.

What we need to see:

- Pursuing a BS, MS, or PhD in a relevant discipline (CS, EE, CE).
- Strong software skills with C/C++, Python, MPI, OpenMP, etc.
- Solid computer science background in software and hardware architecture.
- Experience with DL workloads and operator performance is a plus.
- Familiarity with GPU computing and parallel programming models is a plus.
- Excellent oral and written communication skills.
- Good organizational, time management, and task prioritization skills.

NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.

#deeplearning
