INTERNSHIP DETAILS

Hunyuan Multimodal Reinforcement Learning (RL) Research Intern

Company
Tencent
Location
Singapore
Work Mode
On Site
Posted
April 24, 2026
Internship Information
Core Responsibilities
Conduct research on reinforcement learning algorithms for multimodal models, including diffusion and autoregressive frameworks. Design and develop infrastructure and reward modeling strategies to improve large-scale training efficiency and stability.
Internship Type
Full-time
Company Size
89,057
Visa Sponsorship
No
Language
English
Working Hours
40 hours per week

About the Company
Tencent is a world-leading internet and technology company that develops innovative products and services to improve the quality of life of people around the world. Founded in 1998 with its headquarters in Shenzhen, China, Tencent's guiding principle is to use technology for good. Our communication and social services connect more than one billion people around the world, helping them to keep in touch with friends and family, access transportation, pay for daily necessities, and even be entertained. Tencent also publishes some of the world's most popular video games and other high-quality digital content, enriching interactive entertainment experiences for people around the globe. Tencent also offers a range of services such as cloud computing, advertising, FinTech, and other enterprise services to support our clients' digital transformation and business growth. Tencent has been listed on the Stock Exchange of Hong Kong since 2004.
About the Role

Business Unit

Technology Engineering Group (TEG) is responsible for supporting the company and its business groups on technology and operational platforms, as well as for the construction and operation of R&D management and data centers. TEG provides users with a full range of customer services. As the operator of the largest network, devices, and data centers in Asia, TEG also leads the Tencent Technology Committee in strengthening infrastructure R&D through internal and distributed open source collaboration, constructing new platforms, and supporting business innovation.

What the Role Entails

1. Conduct research on RL algorithms for multimodal models, including diffusion models for image, video, and 3D generation, autoregressive models for multimodal understanding, and potentially unified multimodal frameworks.

2. Design and develop RL infrastructure and reward modeling strategies to enable efficient large-scale training, improve training stability, and mitigate reward hacking and related failure modes.

3. Explore next-generation RL paradigms that more directly and effectively learn from environment feedback.

Who We Look For

1. Currently enrolled as a PhD student in Computer Science or a closely related field.

2. Demonstrated strong research capability, with publications in top-tier conferences such as ICML, NeurIPS, ICLR, CVPR, ICCV, ECCV, and SIGGRAPH.

3. Strong hands-on programming skills, with solid experience in deep learning system implementation, model training and inference optimization, CPU/GPU acceleration, and distributed training and inference.

4. Prior experience with diffusion models, autoregressive models, and/or text-to-image or text-to-video generation is highly preferred.

5. Participation in programming competitions such as ACM-ICPC or NOIP is a strong plus.

Equal Employment Opportunity at Tencent

As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.

Key Skills
Reinforcement Learning, Multimodal Models, Diffusion Models, Autoregressive Models, Deep Learning, Model Training, Inference Optimization, CPU Acceleration, GPU Acceleration, Distributed Training, Image Generation, Video Generation, 3D Generation, Reward Modeling, Python
Categories
Technology, Science & Research, Software, Data & Analytics, Engineering