Full Transcript

OpenAI’s Chief Scientist on Continual Learning Hype, RL Beyond Code, & Future Alignment Directions

58:47 · 13,171 words · ~66 min read · English · Transcribed Apr 11, 2026

AI Summary

OpenAI's Chief Scientist details the path toward intern-level AI research agents by late 2026 and autonomous researchers by 2028, driven by advances in reinforcement learning and mathematical reasoning. The focus is shifting from simple pattern matching to models capable of driving scientific discovery and long-horizon task execution.

As OpenAI's Chief Scientist, Jakub Pachocki provides the most direct glimpse into the technical priorities and deployment timelines for the models that will define the next phase of the global economy and scientific research.

Section summaries

0:00-1:53

Intro and Background

optional

Host introduces Jakub and sets the stage; skip if you already know who the Chief Scientist of OpenAI is.

1:53-11:21

Timelines and Benchmarks

watch

Contains crucial info on the Sept 2026 and March 2028 goals for AI research autonomy.

11:21-24:23

Business and Developer Strategy

watch

Discusses whether to use RL or context, and how the developer role is shifting.

24:23-37:40

Continual Learning and Science

optional

Technical deep dive into math proofs and scientific discovery; valuable for those in STEM.

37:40-47:57

AI Safety and Alignment

watch

Explains why chains of thought are hidden and the technical path to alignment.

47:57-51:53

OpenAI Internal Culture

optional

Historical look at the shifts within the organization from academic to scaling lab.

51:53-58:43

Quickfire and Societal Impact

watch

Important discussion on wealth concentration, robotics timelines, and governance.

Key points

The Path to Research Autonomy — Pachocki distinguishes between a 'research intern' (specific technical tasks) and a 'full automated researcher' (long-term autonomous goal-setting). OpenAI remains on track for intern-level capabilities by late 2026, using math and coding as 'North Stars' because they are easily verifiable yet arbitrarily difficult.
Chain of Thought Monitoring for Alignment — OpenAI intentionally hides internal 'chains of thought' in products to avoid supervising the reasoning process directly, which preserves the model's private reasoning space as a tool for interpretability. By keeping this space 'unsupervised' during training, researchers can more accurately detect if a model is 'scheming' or hiding objectives.
Generalization as the Alignment Frontier — Long-term alignment is essentially a problem of generalization: determining what values a model falls back on when it is much smarter than its training data or faces a completely new distribution. Pachocki argues that alignment is not a nebulous philosophical problem but one solvable through concrete technical insights and scaling.

“I definitely agree that continual learning is really the thing. It's really the thing that we're building.” — Jakub Pachocki

“We are buying a lot of compute because we still believe... more than ever to some degree.” — Jakub Pachocki

AI-generated from the transcript. May contain errors.