Sourabh Medapati(@activelifetribe) 's Twitter Profile
Sourabh Medapati

@activelifetribe

Research Engineer @ Google DeepMind

ID:1425636229698134019

Joined 12-08-2021 01:52:31

23 Tweets

51 Followers

733 Following

Jeff Dean (@JeffDean) 's Twitter Profile Photo

Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length

Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long…

fly51fly(@fly51fly) 's Twitter Profile Photo

[LG] Grandmaster-Level Chess Without Search
A Ruoss, G Delétang, S Medapati, J Grau-Moya, L K Wenliang, E Catt, J Reid, T Genewein [Google DeepMind] (2024)
arxiv.org/abs/2402.04494

- The authors train transformer models of different sizes (9M, 136M, 270M parameters) on a dataset…

AK(@_akhaliq) 's Twitter Profile Photo

Google DeepMind presents Grandmaster-Level Chess Without Search

paper page: huggingface.co/papers/2402.04ā€¦

largest model reaches a Lichess blitz Elo of 2895 against humans, and successfully solves a series of challenging chess puzzles, without any domain-specific tweaks or explicit…

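The "without search" setup in the two tweets above can be sketched in a few lines. This is a hedged illustration, not the paper's code: `predict_value` below is a hypothetical stand-in for the trained transformer, which in the paper predicts a (discretized) win-probability "action value" for each candidate move; move selection is then a single argmax over legal moves, with no game-tree lookahead.

```python
# Minimal sketch of search-free move selection. `predict_value` is a
# hypothetical stand-in for the trained transformer; the toy scoring below
# only exists so the sketch runs.

def predict_value(fen: str, move: str) -> float:
    """Stand-in for the model: maps (board state, candidate move) to an
    estimated win probability in [0, 1]."""
    return (hash((fen, move)) % 1000) / 1000.0

def select_move(fen: str, legal_moves: list[str]) -> str:
    """Pick the move with the highest predicted action value: one forward
    pass per candidate move, no search."""
    return max(legal_moves, key=lambda m: predict_value(fen, m))
```

The point the papers make is that policy quality comes entirely from the learned value estimates, not from exploring the game tree at play time.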
Jeremy Cohen(@deepcohen) 's Twitter Profile Photo

I'm giving a talk at 2:40pm at the Heavy Tails workshop on Friday. The talk is about DL optimization dynamics in general, and adaptive gradient methods in particular. I'm also around NeurIPS starting Wednesday, and would love to meet people with common interests - send me a DM!

Zachary Nado(@zacharynado) 's Twitter Profile Photo

It's been a privilege to be part of the Gemini pretraining team and overall program, I'm so excited that the world can finally see what we've been up to for most of the past year:

tl;dr we're so back storage.googleapis.com/deepmind-media…

Jeff Dean (@JeffDean) 's Twitter Profile Photo

I'm very excited to share our work on Gemini today! Gemini is a family of multimodal models that demonstrate really strong capabilities across the image, audio, video, and text domains. Our most-capable model, Gemini Ultra, advances the state of the art in 30 of 32 benchmarks,…

Frank Schneider(@frankstefansch1) 's Twitter Profile Photo

After 3 years of hard work, our unprecedented neural network training algorithm competition is finally open! The exciting part starts now, seeing what the community can create. 🏆 Submit, become the next Adam, and bag $50,000 in prizes!
mlcommons.org/2023/11/mlc-alā€¦

George E. Dahl(@GeorgeEDahl) 's Twitter Profile Photo

The community is currently unable to reliably identify training algorithm improvements, or even determine the state-of-the-art training algorithm. This has to change if we want to make progress speeding up neural network training mlcommons.org/2023/11/mlc-alā€¦

Zachary Nado(@zacharynado) 's Twitter Profile Photo

tl;dr submit a training algorithm* that is faster** than Adam*** and win $10,000 💸🚀

*a set of hparams, self-tuning algorithm, and/or update rule
**see rules for how we measure speed
***beat all submissions, currently the best is NAdamW in wallclock and DistShampoo in steps
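For context on the wall-clock leader mentioned above: NAdamW is Adam with Nesterov-style momentum and decoupled (AdamW-style) weight decay. A minimal single-step sketch, assuming flat lists of float parameters; the hyperparameter defaults here are illustrative, not the tuned submission's:

```python
import math

def nadamw_step(params, grads, state, t, lr=1e-3, b1=0.9, b2=0.999,
                eps=1e-8, wd=1e-2):
    """One NAdamW update applied elementwise: Adam's moment estimates,
    a Nesterov-style bias-corrected first moment, and weight decay
    applied directly to the weights (decoupled from the gradient)."""
    m, v = state
    new_params, new_m, new_v = [], [], []
    for p, g, mi, vi in zip(params, grads, m, v):
        mi = b1 * mi + (1 - b1) * g          # first moment
        vi = b2 * vi + (1 - b2) * g * g      # second moment
        # Nesterov-style bias correction looks one momentum step ahead.
        m_hat = (b1 * mi / (1 - b1 ** (t + 1))
                 + (1 - b1) * g / (1 - b1 ** t))
        v_hat = vi / (1 - b2 ** t)
        p = p - lr * (m_hat / (math.sqrt(v_hat) + eps) + wd * p)
        new_params.append(p); new_m.append(mi); new_v.append(vi)
    return new_params, (new_m, new_v)
```

The decoupled `wd * p` term is what distinguishes the "W" variants from plain L2 regularization folded into the gradient.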

Google AI(@GoogleAI) 's Twitter Profile Photo

To highlight the importance of training & algorithmic efficiency, weā€™re excited to provide compute resources to help evaluate the best submissions to the MLCommons AlgoPerf training algorithms competition, w/ a chance to win a prize from MLCommons! goo.gle/3N3sHdD

Zachary Nado(@zacharynado) 's Twitter Profile Photo

I'm very excited that this paper is out, it has been over 2 years in the making! I started at Google Research speeding up neural net training, but was often frustrated when we didn't know how to declare a win over Adam 🚀

AK(@_akhaliq) 's Twitter Profile Photo

Benchmarking Neural Network Training Algorithms

paper page: huggingface.co/papers/2306.07ā€¦

Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., …

Frank Schneider(@frankstefansch1) 's Twitter Profile Photo

Is Adam the best optimizer to train neural networks? 🤔
We don't know. And we won't know until we test training algorithms properly.

🚀 That's why we spent ~2.5 years building AlgoPerf, a competitive, time-to-result training algorithms benchmark using realistic workloads!

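"Time-to-result" in the tweet above means a submission is scored by how long it takes to first reach a fixed validation target, not by its loss at a fixed step budget. A minimal sketch of that measurement loop; `train_step`, `evaluate`, and the target are placeholders for a real workload, not the benchmark's actual API:

```python
import time

def time_to_target(train_step, evaluate, target, max_steps=10_000,
                   eval_every=100):
    """Run train_step repeatedly and return (steps, seconds) at which
    evaluate() first reaches `target`, or (None, elapsed) if the budget
    runs out -- the time-to-result score a benchmark ranks on."""
    start = time.monotonic()
    for step in range(1, max_steps + 1):
        train_step()
        if step % eval_every == 0 and evaluate() >= target:
            return step, time.monotonic() - start
    return None, time.monotonic() - start
```

Ranking on this quantity rewards optimizers that reach the target fast, including evaluation and tuning overhead, rather than ones that merely look good on loss curves.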
Dipankar Niranjan(@dip_niranjan) 's Twitter Profile Photo

#AI173 with 216 passengers stuck in Magadan, Russia since 2 pm IST Tue 6/6 after technical issues. Passengers staying at makeshift accommodation in a nearby school. Conditions have become miserable, no updates from @airindia, bad food. Crew not helping. @PMOIndia @DrSJaishankar

Jeff Dean (@JeffDean) 's Twitter Profile Photo

Bard is now available in the US and UK, w/ more countries to come. It's great to see early Google AI work reflected in it: advances in sequence learning, large neural nets, Transformers, responsible AI techniques, dialog systems & more.

You can try it at bard.google.com

AK(@_akhaliq) 's Twitter Profile Photo

MusicLM: Generating Music From Text

abs: arxiv.org/abs/2301.11325
project page: google-research.github.io/seanet/musiclmā€¦
