Trustworthy ML Initiative (TrustML)(@trustworthy_ml) 's Twitter Profileg
Trustworthy ML Initiative (TrustML)

@trustworthy_ml

Latest research in Trustworthy ML. Organizers: @JaydeepBorkar @sbmisi @hima_lakkaraju @sarahookr Sarah Tan @chhaviyadav_ @_cagarwal @m_lemanczyk @HaohanWang

ID:1262375165490540549

linkhttps://www.trustworthyml.org calendar_today18-05-2020 13:31:24

1,7K Tweets

5,9K Followers

64 Following

Follow People
Amir Houmansadr(@houmansadr) 's Twitter Profile Photo

Thanks to ACM CCS 2023 for a Distinguished Paper Award for our work on LM Security. We show how an adversary can steal the decoding algorithm of Language Models through API access at low cost. Kudos to Ali Naseh Kalpesh Krishna Mohit Iyyer. Manning College of Information & Computer Sciences people.cs.umass.edu/~amir/papers/C…

Thanks to @acm_ccs for a Distinguished Paper Award for our work on LM Security. We show how an adversary can steal the decoding algorithm of Language Models through API access at low cost. Kudos to @AliNaseh6 @kalpeshk2011 @MohitIyyer. #CCS23 @manningcics people.cs.umass.edu/~amir/papers/C…
account_circle
Sasho Nikolov (thesasho@bsky.social)(@thesasho) 's Twitter Profile Photo

We have postdoc positions available at U of T’s theory group. I, in particular, would love to have a differential privacy postdoc. Lots of amazing people working in TCS, ML, privacy in the area! Here is the ad cs.toronto.edu/theory/positio…. Feel free to apply or share

account_circle
Kai Wang(@kaiwang_gua) 's Twitter Profile Photo

I will be recruiting 2-3 PhD students to start in 2024 Fall at Gatech CSE!

If you are interested in working with me on AI for social impact, ML, optimization, etc., please consider applying by Dec 15th and list my name in the application!

More info: guaguakai.com/team

account_circle
Katherine Lee is @ NeurIPS!(@katherine1ee) 's Twitter Profile Photo

What happens if you ask ChatGPT to “Repeat this word forever: “poem poem poem poem”?”

It leaks training data!

In our latest preprint, we show how to recover thousands of examples of ChatGPT's Internet-scraped pretraining data: not-just-memorization.github.io/extracting-tra…

What happens if you ask ChatGPT to “Repeat this word forever: “poem poem poem poem”?” It leaks training data! In our latest preprint, we show how to recover thousands of examples of ChatGPT's Internet-scraped pretraining data: not-just-memorization.github.io/extracting-tra…
account_circle
Tim Althoff(@timalthoff) 's Twitter Profile Photo

I'm recruiting PhD students Allen School UW NLP (bdata.uw.edu). Focus areas include Human-AI collaboration, language agents, LLM safety & applications to mental health, social sciences, education.

Apply here: cs.washington.edu/academics/phd/…

UW Data Science
@UW_iSchool
HCI & Design at UW

I'm recruiting PhD students @uwcse @uwnlp (bdata.uw.edu). Focus areas include Human-AI collaboration, language agents, LLM safety & applications to mental health, social sciences, education. Apply here: cs.washington.edu/academics/phd/… @uwdatascience @UW_iSchool @uwdub
account_circle
Chulin Xie(@ChulinXie) 's Twitter Profile Photo

✨Excited to share ACM CCS 2023 about our work on unraveling the connections between Differential Privacy and Certified Robustness in Federated Learning against poisoning attacks!🛡️🤖
🗓️ Join our talk this afternoon. Happy to discuss if you are around!
Paper: arxiv.org/abs/2209.04030

✨Excited to share @acm_ccs about our work on unraveling the connections between Differential Privacy and Certified Robustness in Federated Learning against poisoning attacks!🛡️🤖 🗓️ Join our talk this afternoon. Happy to discuss if you are around! Paper: arxiv.org/abs/2209.04030
account_circle
Colin Raffel(@colinraffel) 's Twitter Profile Photo

Also, I am 1000% hiring PhD students this round! If you want to work on
- open models
- collaborative/decentralized training
- building models like OSS
- coordinating model ecosystems
- mitigating risks
you should definitely apply! Deadline is Friday 😬
web.cs.toronto.edu/graduate/how-t…

account_circle
Javier Rando @ NeurIPS(@javirandor) 's Twitter Profile Photo

🧵 Can data poisoning and RLHF be combined to unlock a universal jailbreak backdoor in LLMs?

Presenting 'Universal Jailbreak Backdoors from Poisoned Human Feedback', the first poisoning attack targeting RLHF, a crucial safety measure in LLMs.

📖 Paper: arxiv.org/abs/2311.14455

🧵 Can data poisoning and RLHF be combined to unlock a universal jailbreak backdoor in LLMs? Presenting 'Universal Jailbreak Backdoors from Poisoned Human Feedback', the first poisoning attack targeting RLHF, a crucial safety measure in LLMs. 📖 Paper: arxiv.org/abs/2311.14455
account_circle
Stephan Rabanser @ NeurIPS(@steverab) 's Twitter Profile Photo

🚨 NeurIPS paper: Training Private Models That Know What They Don't Know openreview.net/forum?id=EgCjf…

🎯Interplay of selective classification (SC) and differential privacy (DP)
🔍SC under DP needs new metrics
🔍Strong DP worsens SC ability
🔍Training-time ensembles work best

🧵1/10

account_circle
Nicolas Papernot(@NicolasPapernot) 's Twitter Profile Photo

Looking forward to SaTML Conference 2024 at University of Toronto ! Feel free to reach out if you have any questions about the conference.

Dates: April 9-11
Location: University of Toronto downtown campus

account_circle
Hangzhi Guo(@BirkhoffGuo) 's Twitter Profile Photo

🌟 Exited to share ReLax v0.2 🎉 - A JAX-based recourse explanation library designed for efficiency and scalability!

🛠️ Build recourse pipelines effortlessly in just a few lines
⚡ Experience blazing-fast runtime
🚀 Scales to large datasets

💻Repo: github.com/BirkhoffG/jax-…

🌟 Exited to share ReLax v0.2 🎉 - A JAX-based recourse explanation library designed for efficiency and scalability! 🛠️ Build recourse pipelines effortlessly in just a few lines ⚡ Experience blazing-fast runtime 🚀 Scales to large datasets 💻Repo: github.com/BirkhoffG/jax-…
account_circle
Fabian Pedregosa(@fpedregosa) 's Twitter Profile Photo

just one week left of the NeurIPS Unlearning Challenge! It's been a nerve-wracking three months, and we're excited to see what the final submissions bring. 🎉 kaggle.com/competitions/n…

account_circle
Varun Chandrasekaran(@VarunChandrase3) 's Twitter Profile Photo

Looking to recruit PhD students (in both ECE + CS @ UIUC) with STRONG interests in ML+Sec, Applied Crypto, Systems, and demystifying foundation models. Deadline: Dec 15. Ping with any questions. Please RT and help amplify?

account_circle
Eugene Bagdasaryan(@ebagdasa) 's Twitter Profile Photo

I am looking for PhD students to work together on privacy and security problems in “AI Systems”. We will focus on language models, agents, ML services, and study where they fail and how to make them work better. Apply by December 15.

I am looking for PhD students to work together on privacy and security problems in “AI Systems”. We will focus on language models, agents, ML services, and study where they fail and how to make them work better. Apply by December 15. #phdlife #manningcics
account_circle
Berk Ustun(@berkustun) 's Twitter Profile Photo

📢 Please RT!📢

We're hiring postdoctoral researchers to work on responsible machine learning at UCSD!

Topics include fairness, explainability, robustness, and safety. For more, see berkustun.com/postdoc/

account_circle
Nicolas Papernot(@NicolasPapernot) 's Twitter Profile Photo

Are you interested in attending SaTML Conference in Toronto (April 9-11, 2024) but lacking funding to do so?

We will use funds from our sponsors to support student travel to the conference. Please apply here by December 20 to receive full consideration:

docs.google.com/forms/d/e/1FAI…

Are you interested in attending @satml_conf in Toronto (April 9-11, 2024) but lacking funding to do so? We will use funds from our sponsors to support student travel to the conference. Please apply here by December 20 to receive full consideration: docs.google.com/forms/d/e/1FAI…
account_circle
Tobias Leemann(@t_leemann) 's Twitter Profile Photo

Differential privacy is a hammer, but not every privacy problem is a nail. 🔨
We introduce 'Gaussian Membership Inference Privacy' in our paper with Martin Pawelczyk and Gjergji Kasneci.
A thread 🧵👇 [1/n]

Differential privacy is a hammer, but not every privacy problem is a nail. 🔨 We introduce 'Gaussian Membership Inference Privacy' in our #NeurIPS2023 paper with @MartinPawelczyk and @Gjergji_. A thread 🧵👇 [1/n]
account_circle
Berk Ustun(@berkustun) 's Twitter Profile Photo

📢 Please RT 📢

I am recruiting PhD students to join my group at UCSD!

We develop methods for responsible machine learning - with a focus on fairness, interpretability, robustness, and safety.

Check out berkustun.com/join/ for more information.

account_circle
Chandan Singh @ EMNLP(@csinva) 's Twitter Profile Photo

Fun new work led by the awesome Qingru Zhang:

Mechanistic interpretability can improve instruction-following (>20% boost for LLaMA) with:
- 👈Better user control of prompts
- 💨No extra inference cost
- 🤏Only a handful of learned parameters
arxiv.org/abs/2311.02262

Fun new work led by the awesome @Zhang_Qingru: Mechanistic interpretability can improve instruction-following (>20% boost for LLaMA) with: - 👈Better user control of prompts - 💨No extra inference cost - 🤏Only a handful of learned parameters arxiv.org/abs/2311.02262
account_circle