INTERNSHIP DETAILS
2026 Summer Intern - Clinical Insights and Automation (Early Clinical Development)
Company
Genentech
Location
Daly City
Work Mode
On Site
Posted
February 3, 2026

Internship Information
Core Responsibilities
The intern will contribute to applied research on generative modeling and causal inference for synthetic clinical data. Responsibilities include developing and comparing modeling methods, preparing clinical datasets, and designing evaluation protocols, with an emphasis on reproducibility.
Internship Type
full time
Company Size
18078
Visa Sponsorship
No
Language
English
Working Hours
40 hours
About The Company
About Genentech
We're passionate about finding solutions for people facing the world's most difficult-to-treat conditions. That is why we use cutting-edge science to create and deliver innovative medicines around the globe. To us, science is personal.
Making a difference in the lives of millions starts when you make a change in yours. If you’d like to join our team, view our openings at gene.com/careers.
Our patient resource center is dedicated to getting patients and caregivers to the right resources. You can reach them at 1 (877) GENENTECH (436-3683), Monday-Friday, 6am-5pm PST, or at patientinfo@gene.com.
About the Role
<h3>The Position</h3><p></p><p><b><b>2026 Summer Intern - Clinical Insights and Automation (Early Clinical Development)</b></b></p><p></p><p><b><b>Department Summary</b></b></p><p><br /><span>Within the Clinical Insights and Automation (CI&amp;A) team of the Early Clinical Development (ECD) department at Roche/Genentech, we develop quantitative and AI-driven methods that accelerate study design, evidence generation, and decision-making. This internship will contribute to an applied research effort on generative modeling and causal inference for creating high-fidelity synthetic clinical data, with an emphasis on producing a reproducible outcome by leveraging conditional deep generative models and open-source foundation models. It will also contribute to a use case for an AI-Driven Root Cause Analysis for Clinical Data Queries solution.</span></p><p></p><p><span>This internship position is located in</span><span> </span><b><b>South San Francisco, on-site.</b></b></p><p><br /><b><b>The Opportunity</b></b></p><p><span>We are seeking a PhD student who is excited to pursue publication-quality machine learning research at the intersection of generative modeling, causal inference, and healthcare data. 
In this role, you will collaborate with scientists, analysts, and engineers to develop and rigorously evaluate novel methods and benchmarking protocols, along with accompanying reproducibility artifacts.</span></p><ul><li><p><span>Identify and prepare appropriate clinical datasets, define generation targets, and implement reproducible preprocessing pipelines.</span></p></li><li><p><span>Develop and compare modern generative modeling approaches for patient-level outcomes and trajectories (e.g., diffusion, transformer, and latent-variable models) conditioned on baseline covariates and study design assumptions; and apply unsupervised clustering/topic models to identify clinically meaningful patterns.</span></p></li><li><p><span>Incorporate causal inference considerations (confounding control, covariate balance, estimands) and quantify how synthetic controls impact downstream treatment-effect estimation; and perform manual "gold standard" labeling to create high-quality training datasets.</span></p></li><li><p><span>Design rigorous evaluation protocols for fidelity and utility, including distributional similarity, calibration/uncertainty, fairness and subgroup robustness, and privacy-risk checks, with ablations and sensitivity analyses.</span></p></li><li><p><span>Build end-to-end experiment infrastructure (training and evaluation scripts, configuration management, and experiment tracking) to support reproducibility and efficient iteration.</span></p></li><li><p><span>Co-prepare a conference-quality manuscript, figures, and supplementary materials, including the paper checklist and (where appropriate) anonymized code/data artifacts consistent with reproducibility and ethics expectations.</span></p></li><li><p><span>Communicate progress through regular updates; deliver a final technical report, curated repository, and presentation to cross-functional stakeholders.</span></p></li><li><p><span>Develop and execute Python scripts to ingest, clean, and normalize large 
volumes of unstructured clinical query text and patient-level datasets.</span></p></li><li><p><span>Help translate technical AI findings into a "Recommendations Matrix" that suggests specific site training or system improvements for stakeholders.</span></p></li></ul><p></p><p><b><b>Program Highlights</b></b></p><ul><li><p><b><b>Intensive 12-week, full-time (40 hours per week) paid internship.</b></b></p></li><li><p><b><b>Program start dates are in May/June 2026.</b></b></p></li><li><p><b><b>A stipend, based on location, will be provided to help alleviate costs associated with the internship. </b></b></p></li><li><p><span>Ownership of challenging and impactful business-critical projects.</span></p></li><li><p><span>Work with some of the most talented people in the biotechnology industry.</span></p></li></ul><p><br /><b><b>Who You Are (Required) </b></b></p><p></p><p><b><b>Required Education: </b></b></p><ul><li><p><span>Must be pursuing a PhD (senior enrolled student).</span></p></li></ul><p></p><p><b><b>Required Majors: </b></b><span>Computer Sciences, Artificial Intelligence, Computational Sciences, or a related field with a focus on machine learning systems or similar.</span></p><p></p><p><b><b>Required Skills</b></b></p><ul><li><p><span>Programming proficiency in Python; hands-on experience with PyTorch and scientific computing libraries (NumPy, Pandas).</span></p></li><li><p><span>Experience developing and training deep learning models, with familiarity in modern generative modeling (e.g., diffusion models, VAEs, autoregressive/transformer models).</span></p></li><li><p><span>Strong understanding of statistical machine learning and experimental methodology (ablations, error analysis, and appropriate statistical evaluation).</span></p></li><li><p><span>Foundational understanding of causal inference and counterfactual reasoning (e.g., confounding, estimands, treatment-effect estimation) and how these considerations interact with modeling 
choices.</span></p></li><li><p><span>Demonstrated technical writing skills (e.g., research reports or papers); comfort preparing conference-style manuscripts in LaTeX and presenting results to technical audiences.</span></p></li><li><p><span>Commitment to reproducible research and software engineering best practices (version control, documentation, experiment tracking), with the ability to package artifacts for peer review; collaborative communication skills.</span></p></li></ul><p></p><p><b><b>Preferred Knowledge, Skills, and Qualifications</b></b></p><ul><li><p><span>Excellent communication, collaboration, and interpersonal skills.</span></p></li><li><p><span>Complements our culture and the standards that guide our daily behavior & decisions: Integrity, Courage, and Passion.</span></p></li><li><p><span>Prior publication or strong experience preparing submissions for top-tier ML venues (e.g., NeurIPS/ICML/ICLR), including familiarity with reproducibility expectations (paper checklist, artifact preparation).</span></p></li><li><p><span>Experience working with healthcare/biomedical datasets (EHR, claims, or clinical trial data); familiarity with data standards (OMOP, FHIR) is a plus.</span></p></li><li><p><span>Knowledge of synthetic data evaluation and privacy risk assessment (e.g., memorization tests, membership inference, differential privacy).</span></p></li><li><p><span>Familiarity with causal ML topics such as causal representation learning, domain adaptation, and externally controlled trial methodology.</span></p></li><li><p><span>Experience with scalable training environments (GPUs, distributed computing) and modern ML tooling (Docker, experiment tracking platforms).</span></p></li><li><p><span>Experience leveraging foundation models (LLMs) or structured prompting to incorporate domain knowledge into ML workflows is beneficial, but not required.</span></p></li><li><p><span>Ability to query and extract data from relational databases using SQL and hands-on 
experience with NLP frameworks and clustering algorithms. </span></p></li><li><p><span>Familiarity with clinical trial operations, Electronic Data Capture (EDC) systems, or the regulatory landscape of the pharmaceutical industry is beneficial, but not required. </span></p></li></ul><p></p><p><b><b>Relocation benefits are not available for this job posting. </b></b></p><p><span>The expected salary range for this position based on the primary location of California is $50.00 per hour. Actual pay will be determined based on experience, qualifications, geographic location, and other job-related factors permitted by law. This position also qualifies for paid holiday time off benefits.</span></p><p></p><p style="text-align:inherit"></p><p style="text-align:left"><span>Genentech is an equal opportunity employer. It is our policy and practice to employ, promote, and otherwise treat any and all employees and applicants on the basis of merit, qualifications, and competence. The company's policy prohibits unlawful discrimination, including but not limited to, discrimination on the basis of Protected Veteran status, individuals with disabilities status, and consistent with all federal, state, or local laws.</span></p><p style="text-align:inherit"></p><p style="text-align:left"><span>If you have a disability and need an accommodation in relation to the online application process, please contact us by completing this form <a target="_blank" href="https://docs.google.com/forms/d/e/1FAIpQLSdZWlsbfQOvFVIQgHE_iDzWUTlhZvj6FytIzjS7xq6IGh1H5g/viewform">Accommodations for Applicants</a>.</span></p><p style="text-align:inherit"></p><p style="text-align:inherit"></p>
Key Skills
Python, Machine Learning, Deep Learning, Generative Modeling, Causal Inference, Statistical Analysis, Technical Writing, Collaboration, Data Preparation, Experiment Tracking, NLP, SQL, Healthcare Data, Reproducibility, Data Normalization, AI
Categories
Technology, Healthcare, Science & Research, Data & Analytics, Software
Benefits
Paid Holiday Time Off