INTERNSHIP DETAILS

MLE Intern, ML Runtime & Optimization (Spring 2026, Master/PhD)

Company: Pony.ai
Location: Fremont
Work Mode: On Site
Posted: December 18, 2025
Internship Information
Core Responsibilities
As a Machine Learning Engineer Intern, you will develop technologies to enhance the training and inference of AI models in autonomous driving systems. This includes optimizing model training and deployment for performance and efficiency.
Internship Type
Full-time
Company Size
509
Visa Sponsorship
No
Language
English
Working Hours
40 hours
About The Company
Pony AI Inc. (“Pony.ai”) is a global leader in the large-scale commercialization of autonomous mobility. Leveraging its vehicle-agnostic Virtual Driver technology, a full-stack autonomous driving technology that seamlessly integrates its proprietary software, hardware, and services, Pony.ai is developing a commercially viable and sustainable business model that enables the mass production and deployment of vehicles across transportation use cases. Founded in 2016, Pony.ai has expanded its presence across China, Europe, East Asia, the Middle East, and other regions, ensuring widespread accessibility to its advanced technology. Pony.ai is among the first in China to obtain licenses to operate fully driverless vehicles in all four Tier-1 cities in China (Beijing, Guangzhou, Shanghai, Shenzhen) and has begun to offer public-facing, fare-charging robotaxi services without safety drivers in Beijing, Guangzhou and Shenzhen. Pony.ai operates a fleet consisting of over 250 robotaxis. To date, Pony.ai has driven nearly 45 million autonomous testing and operation kilometers on open roads worldwide.
About the Role

Founded in 2016 in Silicon Valley, Pony.ai has quickly become a global leader in autonomous mobility and is a pioneer in extending autonomous mobility technologies and services across a rapidly expanding footprint of sites around the world. Operating Robotaxi, Robotruck and Personally Owned Vehicles (POV) business units, Pony.ai is an industry leader in the commercialization of autonomous driving and is committed to developing the safest autonomous driving capabilities on a global scale. Pony.ai’s leading position has been recognized, with CNBC ranking Pony.ai #10 on its CNBC Disruptor list of the 50 most innovative and disruptive tech companies of 2022. In June 2023, Pony.ai was recognized on the XPRIZE and Bessemer Venture Partners inaugural “XB100” 2023 list of the world’s top 100 private deep tech companies, ranking #12 globally. As of August 2023, Pony.ai had accumulated nearly 21 million miles of autonomous driving globally. Pony.ai went public on NASDAQ in November 2024.

Responsibilities

The ML Infrastructure team at Pony.ai provides a set of tools to support and automate the lifecycle of the AI workflow, including model development, evaluation, optimization, deployment and monitoring.

As a Machine Learning Engineer Intern in ML Runtime & Optimization, you will develop technologies to advance the training and inference of AI models in autonomous driving systems.

This includes:

  • Perform in-depth analysis and optimization of model training and deployment to achieve state-of-the-art performance and efficiency in autonomous driving.
  • Work across the entire AI framework/compiler stack (e.g., Torch, CUDA, and TensorRT), support model development, and prototype key deep learning algorithms.
  • Analyze the trade-offs between performance, cost, and energy for autonomous driving.
  • Collaborate closely with diverse groups at Pony.ai to influence next-generation compute platform hardware and software design.
  • Research the latest model architectures, programming models, and hardware.

Requirements

  • Currently enrolled in a Master's or PhD program in a related discipline.
  • Strong programming skills in C/C++ or Python.
  • Solid understanding of CPU or GPU execution models (e.g., threads, registers, caches, memory) and of cost/performance trade-offs.
  • Experience in benchmarking, profiling and validating performance.
  • Strong communication skills and the ability to work cross-functionally between software and hardware teams.

Preferred Qualifications:

Experience in one or more of the following areas is preferred:

  • Experience with parallel programming: CUDA, ROCm, Triton, CUTLASS, etc.
  • Experience in computer vision, image processing, machine learning and deep learning.
  • Experience in model optimization techniques such as quantization, pruning, etc.
  • Experience in optimizing the utilization of compute resources, identifying and resolving compute and data flow bottlenecks.
  • Strong knowledge of software design, programming techniques and algorithms.
  • Strong knowledge of common deep learning frameworks and libraries.
  • Strong knowledge of system performance, GPU optimization, or ML compilers.

Note

  • This position is fully on-site in Fremont for at least 3 months.

Compensation

  • Master: $7,000/month
  • PhD: $10,000/month
Key Skills
C/C++, Python, CPU Execution Model, GPU Execution Model, Benchmarking, Profiling, Performance Validation, Parallel Programming, Computer Vision, Image Processing, Machine Learning, Deep Learning, Model Optimization, Software Design, Programming Techniques, Algorithms, Deep Learning Frameworks
Categories
Technology, Engineering, Data & Analytics, Science & Research, Transportation