Lab Horizons (@Lab_Horizons)

Exploring the frontier of AI safety: a new policy forum discussed a roadmap to preempt the risks of advanced AI agents, advocating for robust regulations to secure humanity’s oversight. Read more: labhorizons.co.uk/2024/04/outsma…

Technical AI Safety Conference (TAIS) (@tais_2024)

In his talk at #TAIS2024, Manuel Baltieri (@manuelbaltieri) shared the concepts of active inference and the free energy principle, highlighting their significance in #AIsafety. He explained how these ideas contribute to defining 'what agents are' and 'what agents do', particularly emphasizing the…

AllAboutAI (@AllAboutAicom)

The US and UK unite for AI safety. A new partnership aims to set global standards and ensure AI technologies are developed responsibly.
Hit the link in bio and get ready for your mind to be blown!
#AISafety #AllAboutAI #ArtificialIntelligence

Zhijing Jin@ICLR/COLING/NAACL (@ZhijingJin)

I'll be at #ICLR2024 from Mon to Sun! 

Happy to chat w/ students interested in applying for PhDs, and connect w/ researchers in #LLMs #Causality #AISafety.

I'll be presenting our paper on Correlation-to-Causation (Corr2Cause) Inference for LLMs on Thu: arxiv.org/abs/2306.05836🎉

Sanjay Puri (@spuri)

Is the US set to lead in global tech and innovation? Joe Morelle critiques the nation's R&D strategy in our latest episode, highlighting the need for a cohesive approach to technological innovation. Click for the full episode

Sanjay Puri (@spuri)

How can AI be a force for good? Dive into a conversation with Trooper Sanders of Benefits Data Trust on crafting trustworthy AI. His unique insights provide a roadmap for ethical AI regulation. Click here for more.

Google Cloud Security (@GoogleCloudSec)

Heather Adkins (@argvee) joined a thought-provoking panel of experts to discuss effective ways organizations can harness the power of AI and the state of AI safety with regard to legislation. Stay tuned for more insights from RSA!

#RSAC #GenAI #AISafety #Cybersecurity

NformAI (@Nform_AI)

AI is changing the game in public safety - from predicting crimes to optimizing emergency responses. Read our latest article to find out how. #AISafety #PublicSecurity #TechInnovation

Technical AI Safety Conference (TAIS) (@tais_2024)

At #TAIS2024, Dan Hendrycks (@DanHendrycks), director of the Center for AI Safety (@ai_risks), presented the WMDP Benchmark, focusing on measuring and mitigating malicious use through unlearning. He introduced CUT, a cutting-edge unlearning technique. Watch now: youtu.be/cHPlQTJqtGw
#AIsafety

Technical AI Safety Conference (TAIS) (@tais_2024)

In their talk at #TAIS2024, James Fox (@James_D_Fox) and @mattmacdermott1 explored the interconnectedness of causality, agency and #AIsafety. They illustrated potential real-world implementations of their theoretical insights by presenting their strategies for creating 'agency detectors'.…

Xuanli He (@zodiacJRH)

🚨 New Paper! (arxiv.org/abs/2404.19597)🚨 We uncover significant vulnerabilities in Multilingual LLMs (MLLMs) (e.g., BLOOM, Llama2, Llama3, Gemma, and GPT-3.5-turbo) to cross-lingual transferable backdoor attacks. #AIsafety #LLMs #backdoors

Rick Roane (@ChiefAdversary)

AI deepfake tech and products are exploding with success right now, but so is the threat to innocent bystanders. What more can we do to prevent abuse before product launches and adoption? #Deepfake #AI #ArtificialIntelligence #Auventic #ClearedContact #AISafety @auventic…

SPAR (@SPARexec)

SPAR is now accepting technical safety and AI governance mentees for Summer 2024 (June 14 - Sep 7)!

Apply here: tinyurl.com/SPAR-Mentee by 5/24!

SPAR provides opportunities to work with mentors to develop valuable experience in

Diana Wolf Torres (@wolf95020)

Check out the latest article in my newsletter: Unraveling the Paperclip Alignment Problem: A Cautionary Tale in AI Development linkedin.com/pulse/unraveli… via LinkedIn
