Abhishek Yadav
@abhishek__AI
Data analyst by day, AI explorer by night. Passionate about all things data and AI.
Let's learn & grow together!
📖,🚘,🎧,⚽,🏊❤️
ID:715774453423206401
http://Www.futureoflife.org 01-04-2016 05:35:03
9.3K Tweets
4.3K Followers
1.0K Following
If you're still struggling to understand how transformers work, here are some amazing resources! (including mine! :))
First of all, Grant Sanderson just released 2 videos covering, in a fair amount of depth, word embeddings, transformers, and their submodules like the embedding mechanism,…
1/ 🥁Scaling Laws for Data Filtering 🥁
TLDR: Data Curation *cannot* be compute agnostic!
In our #CVPR2024 paper, we develop the first scaling laws for heterogeneous & limited web data.
w/ Sachin Goyal, Zachary Lipton, Aditi Raghunathan, Zico Kolter
📝:arxiv.org/abs/2404.07177
🚀 Introducing Mistral-22b-V.01 A breakthrough in AI! 🧠💡
- First-ever MOE to Dense model conversion🔥 #Mistral22bV01
This model is NOT an MOE
(It only has 22B params.)
huggingface.co/Vezora/Mistral…