Paul Röttger
@paul_rottger
Postdoc @MilaNLProc, working on evaluating and improving LLM safety. Previously PhD @oiioxford & CTO/co-founder @rewire_online
ID:1280235569218494465
https://paulrottger.com/ 06-07-2020 20:22:08
274 Tweets
2.2K Followers
455 Following
If you are working on AI alignment, you should really check out PRISM. It is hard to overstate how rich and exciting this dataset is.
What a great week to be a co-author of Hannah Rose Kirk!
Personalised LLMs are great, but should there be limits to personalisation? If so, who should set these limits?
For answers to these questions and more, check out our paper on the risks and benefits of personalising LLMs, led by Hannah Rose Kirk 👇 out in Nature Machine Intelligence today!
New paper at #NAACL2024 🥳
We present GAHD, an 11k German Adversarial Hate speech Dataset 📜 and show that mixing annotator support strategies for finding adv. examples leads to a more effective dataset!
Great collab with Paul Röttger and Text Crunching Center @UZH!
Highlights below ⬇️