INTERNSHIP DETAILS

Research Intern – Reinforcement Learning (RL) - Onsite

Company: Level AI
Location: United States
Work Mode: On Site
Posted: March 28, 2026
Internship Information
Core Responsibilities
The intern will design and build reinforcement learning environments modeling real-world customer interaction workflows and develop RL agents that learn from these environments using real-world data and feedback loops. Responsibilities also include defining reward models, structuring interaction traces for training, experimenting with multi-agent systems, and collaborating on production deployment.
Internship Type
Full-time
Company Size
216
Visa Sponsorship
No
Language
English
Working Hours
40 hours

About The Company
Our state-of-the-art AI-native solutions are designed to drive efficiency, productivity, scale, and excellence in sales and customer service. With a focus on automation, agent empowerment, customer assistance, and strategic business intelligence, we are dedicated to helping our clients exceed customer expectations and drive profitable business growth. Companies such as Affirm, Carta, Vista, Toast, Swiss Re, and ezCater use Level AI to take their business to new heights with less effort.
About the Role

🚀 Build the next generation of Agentic AI with us

Our platform combines conversation intelligence, multimodal understanding, and agentic AI systems to power both human agents and autonomous AI agents across the entire customer experience lifecycle.

A core part of this vision is our investment in custom Small Language Models (SLMs)—purpose-built for CX workflows—paired with reinforcement learning systems that continuously improve decision-making in real-world environments.

We’re looking for a Research Intern (Reinforcement Learning) to join us in shaping this future.


What you’ll do

  • Design and build reinforcement learning environments that model real-world customer interaction workflows

  • Design RL agents that learn from these environments using real-world interaction data, rewards, and feedback loops

  • Define reward models and feedback loops using real-world signals (outcomes and human feedback)

  • Enable learning from production data by structuring interaction traces into training-ready datasets for offline and online learning

  • Experiment with multi-agent systems and simulation frameworks for complex coordination and decision-making

  • Collaborate with engineering and product teams to deploy, evaluate, and iterate on learning systems in production at scale

What we’re looking for

  • Currently pursuing (or recently completed) a degree in Computer Science, AI, Machine Learning, or related field

  • Strong understanding of reinforcement learning fundamentals

  • Familiarity with RL environments and training libraries such as Verl and Tinker

  • Strong foundation in probability, math, and optimization

  • Passion for building real-world AI systems


Nice to have

  • Experience with RLHF, LLM/SLM fine-tuning, or model alignment

  • Exposure to agent-based systems or multi-agent RL

  • Prior research, projects, or publications in RL or applied ML

  • Experience working with large-scale or production datasets

Why Level AI

  • Work on production-grade Agentic AI systems used by leading enterprises

  • Build alongside a team with deep expertise from Amazon, Google, and Meta

  • Be part of a fast-growing Series C AI company

  • Direct exposure to 0→1 AI innovation in CX and decisioning systems

Key Skills
Reinforcement Learning, Agentic AI, Small Language Models, RL Environments, RL Agents, Reward Models, Offline Learning, Online Learning, Multi-Agent Systems, Simulation Frameworks, RLHF, LLM Fine-tuning, Model Alignment, Probability, Optimization
Categories
Science & Research, Engineering, Software, Data & Analytics, Technology