Sourabh Medapati(@activelifetribe) 's Twitter Profile
Sourabh Medapati

@activelifetribe

Research Engineer @ Google DeepMind

ID:1425636229698134019

Joined 12-08-2021 01:52:31

23 Tweets

51 Followers

733 Following

Jeff Dean (@JeffDean) 's Twitter Profile Photo

Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length

Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long…

fly51fly(@fly51fly) 's Twitter Profile Photo

[LG] Grandmaster-Level Chess Without Search
A Ruoss, G Delétang, S Medapati, J Grau-Moya, L K Wenliang, E Catt, J Reid, T Genewein [Google DeepMind] (2024)
arxiv.org/abs/2402.04494

- The authors train transformer models of different sizes (9M, 136M, 270M parameters) on a dataset…

AK(@_akhaliq) 's Twitter Profile Photo

Google DeepMind presents Grandmaster-Level Chess Without Search

paper page: huggingface.co/papers/2402.04ā€¦

largest model reaches a Lichess blitz Elo of 2895 against humans, and successfully solves a series of challenging chess puzzles, without any domain-specific tweaks or explicit…

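The "without search" setup in the two tweets above can be sketched in a few lines. This is a hedged illustration, not the paper's code: `predict_value` below is a hypothetical stand-in for the trained transformer, which in the paper predicts a (discretized) win-probability "action value" for each candidate move; move selection is then a single argmax over legal moves, with no game-tree lookahead.

```python
# Minimal sketch of search-free move selection. `predict_value` is a
# hypothetical stand-in for the trained transformer; the toy scoring below
# only exists so the sketch runs.

def predict_value(fen: str, move: str) -> float:
    """Stand-in for the model: maps (board state, candidate move) to an
    estimated win probability in [0, 1]."""
    return (hash((fen, move)) % 1000) / 1000.0

def select_move(fen: str, legal_moves: list[str]) -> str:
    """Pick the move with the highest predicted action value: one forward
    pass per candidate move, no search."""
    return max(legal_moves, key=lambda m: predict_value(fen, m))
```

The point the papers make is that policy quality comes entirely from the learned value estimates, not from exploring the game tree at play time.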
Jeremy Cohen(@deepcohen) 's Twitter Profile Photo

I'm giving a talk at 2:40pm at the Heavy Tails workshop on Friday. The talk is about DL optimization dynamics in general, and adaptive gradient methods in particular. I'm also around NeurIPS starting Wednesday, and would love to meet people with common interests - send me a DM!

Zachary Nado(@zacharynado) 's Twitter Profile Photo

It's been a privilege to be part of the Gemini pretraining team and overall program, I'm so excited that the world can finally see what we've been up to for most of the past year:

tl;dr we're so back storage.googleapis.com/deepmind-media…

Jeff Dean (@JeffDean) 's Twitter Profile Photo

I'm very excited to share our work on Gemini today! Gemini is a family of multimodal models that demonstrate really strong capabilities across the image, audio, video, and text domains. Our most-capable model, Gemini Ultra, advances the state of the art in 30 of 32 benchmarks,…

Frank Schneider(@frankstefansch1) 's Twitter Profile Photo

After 3 years of hard work, our unprecedented neural network training algorithm competition is finally open! The exciting part starts now, seeing what the community can create. 🏆 Submit, become the next Adam, and bag $50,000 in prizes!
mlcommons.org/2023/11/mlc-alā€¦

George E. Dahl(@GeorgeEDahl) 's Twitter Profile Photo

The community is currently unable to reliably identify training algorithm improvements, or even determine the state-of-the-art training algorithm. This has to change if we want to make progress speeding up neural network training mlcommons.org/2023/11/mlc-alā€¦

Zachary Nado(@zacharynado) 's Twitter Profile Photo

tl;dr submit a training algorithm* that is faster** than Adam*** and win $10,000 💸🚀

*a set of hparams, self-tuning algorithm, and/or update rule
**see rules for how we measure speed
***beat all submissions, currently the best is NAdamW in wallclock and DistShampoo in steps
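For context on the wall-clock leader mentioned above: NAdamW is Adam with Nesterov-style momentum and decoupled (AdamW-style) weight decay. A minimal single-step sketch, assuming flat lists of float parameters; the hyperparameter defaults here are illustrative, not the tuned submission's:

```python
import math

def nadamw_step(params, grads, state, t, lr=1e-3, b1=0.9, b2=0.999,
                eps=1e-8, wd=1e-2):
    """One NAdamW update applied elementwise: Adam's moment estimates,
    a Nesterov-style bias-corrected first moment, and weight decay
    applied directly to the weights (decoupled from the gradient)."""
    m, v = state
    new_params, new_m, new_v = [], [], []
    for p, g, mi, vi in zip(params, grads, m, v):
        mi = b1 * mi + (1 - b1) * g          # first moment
        vi = b2 * vi + (1 - b2) * g * g      # second moment
        # Nesterov-style bias correction looks one momentum step ahead.
        m_hat = (b1 * mi / (1 - b1 ** (t + 1))
                 + (1 - b1) * g / (1 - b1 ** t))
        v_hat = vi / (1 - b2 ** t)
        p = p - lr * (m_hat / (math.sqrt(v_hat) + eps) + wd * p)
        new_params.append(p); new_m.append(mi); new_v.append(vi)
    return new_params, (new_m, new_v)
```

The decoupled `wd * p` term is what distinguishes the "W" variants from plain L2 regularization folded into the gradient.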

Google AI(@GoogleAI) 's Twitter Profile Photo

To highlight the importance of training & algorithmic efficiency, weā€™re excited to provide compute resources to help evaluate the best submissions to the MLCommons AlgoPerf training algorithms competition, w/ a chance to win a prize from MLCommons! goo.gle/3N3sHdD

Zachary Nado(@zacharynado) 's Twitter Profile Photo

I'm very excited that this paper is out, it has been over 2 years in the making! I started at Google Research speeding up neural net training, but was often frustrated when we didn't know how to declare a win over Adam 🚀

AK(@_akhaliq) 's Twitter Profile Photo

Benchmarking Neural Network Training Algorithms

paper page: huggingface.co/papers/2306.07ā€¦

Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., …

Frank Schneider(@frankstefansch1) 's Twitter Profile Photo

Is Adam the best optimizer to train neural networks? 🤔
We don't know. And we won't know until we test training algorithms properly.

🚀 That's why we spent ~2.5 years building AlgoPerf, a competitive, time-to-result training algorithms benchmark using realistic workloads!

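"Time-to-result" in the tweet above means a submission is scored by how long it takes to first reach a fixed validation target, not by its loss at a fixed step budget. A minimal sketch of that measurement loop; `train_step`, `evaluate`, and the target are placeholders for a real workload, not the benchmark's actual API:

```python
import time

def time_to_target(train_step, evaluate, target, max_steps=10_000,
                   eval_every=100):
    """Run train_step repeatedly and return (steps, seconds) at which
    evaluate() first reaches `target`, or (None, elapsed) if the budget
    runs out -- the time-to-result score a benchmark ranks on."""
    start = time.monotonic()
    for step in range(1, max_steps + 1):
        train_step()
        if step % eval_every == 0 and evaluate() >= target:
            return step, time.monotonic() - start
    return None, time.monotonic() - start
```

Ranking on this quantity rewards optimizers that reach the target fast, including evaluation and tuning overhead, rather than ones that merely look good on loss curves.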
Dipankar Niranjan(@dip_niranjan) 's Twitter Profile Photo

#AI173 with 216 passengers stuck in Magadan, Russia since 2 pm IST Tue 6/6 after technical issues. Passengers staying at makeshift accommodation in a nearby school. Conditions have become miserable, no updates from @airindia, bad food. Crew not helping. @PMOIndia @DrSJaishankar

Jeff Dean (@JeffDean) 's Twitter Profile Photo

Bard is now available in the US and UK, w/ more countries to come. It's great to see early Google AI work reflected in it: advances in sequence learning, large neural nets, Transformers, responsible AI techniques, dialog systems & more.

You can try it at bard.google.com

AK(@_akhaliq) 's Twitter Profile Photo

MusicLM: Generating Music From Text

abs: arxiv.org/abs/2301.11325
project page: google-research.github.io/seanet/musiclmā€¦
