John Schulman
@johnschulman2
Cofounder @openai, lead post-training for ChatGPT and the API. Interested in reinforcement learning, alignment, birds, jazz music
ID:1388977636618080256
02-05-2021 22:05:23
90 Tweets
38.7K Followers
609 Following
I've been enjoying Richard Ngo's sci-fi writing at narrativeark dot xyz. It's a rare feat to combine these three properties: (1) about post-AGI worlds (2) plausible (3) actually fun to read.
Excited to share what I've been working on with John Schulman and Jacob Hilton!
We find that overoptimization of reward models can be modelled by simple functional forms with coefficients that scale smoothly with reward model size.
Paper: arxiv.org/abs/2210.10760
Glad to share that next episode we will be featuring OpenAI founder and researcher John Schulman!
With a focus on his recent work on RL from human feedback.
DM or reply with suggested questions.