INTERNSHIP DETAILS

AI Computing Software Development Intern - 2026

Company: NVIDIA
Location: Taipei
Work Mode: On Site
Posted: January 14, 2026
Internship Information
Core Responsibilities
As an intern, you will focus on either optimizing inference pipelines for Large Language Models or improving graph transformations for the TensorRT compiler. You will collaborate with various teams to enhance GPU performance for AI inference.
Internship Type: Full time
Company Size: 43,629
Visa Sponsorship: No
Language: English
Working Hours: 40 hours

About The Company
Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.
About the Role

We are now looking for an AI Computing Software Development Intern!

NVIDIA is inviting skilled interns to work on artificial intelligence computing solutions with our AI Compute team in Taiwan. This is your chance to work on some of the world's most advanced AI systems. You will help develop technologies for Large Language Models, Recommender Systems, and Generative AI, and push the limits of GPU performance for AI inference.

What you'll be doing:

As an intern, you’ll focus on one of two specialized tracks: TensorRT-LLM – Inference Optimization (Python / PyTorch) or TensorRT Compiler – Graph Optimization (C++). A short illustrative code sketch follows each track's bullet list below.

For TensorRT-LLM:

  • Build and enhance high‑performance LLM inference pipelines.

  • Analyze and optimize model execution, scalability, and memory use.

  • Collaborate across framework and research teams to deliver efficient multi‑GPU model serving.
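
To give a concrete flavor of this track, here is a minimal sketch of timing a greedy LLM decoding loop in PyTorch. The ToyLM model and generate_greedy helper are hypothetical stand-ins for illustration, not TensorRT-LLM APIs; a production pipeline would reuse a KV cache rather than re-running the full sequence at every step.

    # Illustrative only: ToyLM and generate_greedy are hypothetical stand-ins,
    # not TensorRT-LLM APIs.
    import time
    import torch
    import torch.nn as nn

    class ToyLM(nn.Module):
        # A tiny GPT-style stand-in: embedding -> transformer blocks -> logits.
        def __init__(self, vocab=1000, d_model=128, n_layers=2, n_heads=4):
            super().__init__()
            self.embed = nn.Embedding(vocab, d_model)
            layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.blocks = nn.TransformerEncoder(layer, n_layers)
            self.head = nn.Linear(d_model, vocab)

        def forward(self, ids):
            return self.head(self.blocks(self.embed(ids)))

    @torch.no_grad()
    def generate_greedy(model, prompt_ids, max_new_tokens=32):
        # Naive greedy decoding: re-runs the whole sequence each step.
        # Real inference pipelines cache per-layer keys/values instead.
        ids = prompt_ids
        for _ in range(max_new_tokens):
            logits = model(ids)                        # (batch, seq, vocab)
            next_id = logits[:, -1, :].argmax(-1, keepdim=True)
            ids = torch.cat([ids, next_id], dim=1)
        return ids

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = ToyLM().to(device).eval()
    prompt = torch.randint(0, 1000, (4, 16), device=device)  # 4 prompts, 16 tokens

    start = time.perf_counter()
    out = generate_greedy(model, prompt)
    elapsed = time.perf_counter() - start
    new_tokens = out.shape[0] * (out.shape[1] - prompt.shape[1])
    print(f"{new_tokens / elapsed:.1f} tokens/s on {device}")
    if device == "cuda":
        print(f"peak GPU memory: {torch.cuda.max_memory_allocated() / 2**20:.1f} MiB")

Profiling loops like this are the starting point for the scalability and memory-use analysis described above; the interesting work lies in removing the redundant computation the comments point out.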

For TensorRT Compiler:

  • Work on the TensorRT compiler backend to improve graph transformations and code generation for NVIDIA GPUs.

  • Develop compiler optimization passes, refine operator fusion, and optimize memory usage.

  • Collaborate with CUDA and hardware architecture teams to accelerate Deep Learning inference computations.
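
The compiler track itself is C++ work on TensorRT internals, but the core idea of an operator-fusion pass can be sketched in a few lines of Python with torch.fx. This is an illustrative toy, assuming a (a * b) + c pattern; the hypothetical fuse_mul_add pass rewrites it into a single torch.addcmul call, i.e. one fused op instead of two.

    # Illustrative toy fusion pass using torch.fx; not TensorRT compiler code.
    import operator
    import torch
    import torch.fx as fx

    class M(torch.nn.Module):
        def forward(self, a, b, c):
            return a * b + c  # mul followed by add: fusable into addcmul

    def fuse_mul_add(gm):
        # Rewrite (a * b) + c into torch.addcmul(c, a, b): one fused node
        # replaces a mul node and an add node in the traced graph.
        for node in list(gm.graph.nodes):
            if node.op == "call_function" and node.target in (operator.add, torch.add):
                lhs, rhs = node.args
                if (isinstance(lhs, fx.Node) and lhs.op == "call_function"
                        and lhs.target in (operator.mul, torch.mul)
                        and len(lhs.users) == 1):
                    with gm.graph.inserting_after(node):
                        fused = gm.graph.call_function(
                            torch.addcmul, args=(rhs, lhs.args[0], lhs.args[1]))
                    node.replace_all_uses_with(fused)
                    gm.graph.erase_node(node)   # erase the add first...
                    gm.graph.erase_node(lhs)    # ...then the now-unused mul
        gm.graph.lint()
        gm.recompile()
        return gm

    gm = fuse_mul_add(fx.symbolic_trace(M()))
    a, b, c = (torch.randn(8) for _ in range(3))
    assert torch.allclose(gm(a, b, c), a * b + c)  # fused graph matches original
    print(gm.graph)  # a single addcmul node now stands in for the mul + add pair

Production compiler passes follow the same shape at far larger scale: pattern-match subgraphs, check legality, and emit fused kernels.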

What we need to see:

  • Pursuing an M.S. or Ph.D. in Computer Science, Computer Engineering, Electrical Engineering, Applied Mathematics, or related fields.

  • Excellent problem‑solving ability, curiosity for cutting‑edge AI systems, and passion for GPU computing and deep learning software performance.

  • TensorRT‑LLM: Strong Python programming and experience with PyTorch; solid understanding of LLM inference and GPU acceleration.

  • TensorRT Compiler: Proficient in C++, with experience in compiler or performance optimization.

Join us and play a part in building the AI computing platforms that drive innovation across industries worldwide.

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.

Key Skills
Python, PyTorch, C++, Compiler Optimization, Deep Learning, GPU Acceleration, Model Execution, Memory Optimization, Graph Optimization, Operator Fusion, Problem Solving, AI Systems, Curiosity, Scalability, Multi-GPU Serving
Categories
Technology, Engineering, Data & Analytics, Software, Science & Research