Hamel Husain (@HamelHusain)'s Twitter Profile
Hamel Husain

@HamelHusain

Researcher focusing on LLMs: https://t.co/iVZDFdIQiE

Previously, dev tools and infra for ML. Ex @Github, @Airbnb, @DataRobot. @fastdotai core contributor.

ID:825766640

Link: http://hamel.dev

Joined: 15-09-2012 18:45:02

8.3K Tweets

17.8K Followers

1.5K Following

Andrei (@abetlen)

Multimodal support in llama-cpp-python powered by LLaVA 1.5

API is compatible with the new gpt-4-vision-preview model and supports JSON mode responses

GitHub:
github.com/abetlen/llama-…

Docs:
llama-cpp-python.readthedocs.io/en/latest/serv…

Hamel Husain (@HamelHusain)

So let me get this right: in addition to creating a new front-end framework every 6 months, we are now also reviving dead ones (Angular)? 😂

Stas Bekman (@StasBekman)

Boom! DeepSpeed implemented Sequence Parallelism (SP)

arxiv.org/abs/2309.14509

For very long sequences, the paper shows that DeepSpeed-Ulysses trains 2.5x faster with 4x longer sequence length than the existing SOTA baseline.

Even if it's not faster, it's…

Jeremy Howard (@jeremyphoward)

Wow, storage for the new OpenAI Assistants API is eye-wateringly expensive! :O

Be careful of what you upload, or your wallet might get a nasty surprise...

Hamel Husain (@HamelHusain)

Agree with Charles. So many optimizations for inference

1. Flash attention
2. Flash decoding
3. Quantization
4. Speculative decoding
5. Model compilation (TensorRT, MLC, etc.)
6. Inference server optimizations like continuous/in-flight batching

Am I missing anything btw?

anton (𝖜𝖆𝖗𝖙𝖎𝖒𝖊) 🏴‍☠️ (@atroyn)

one thing that's conspicuously absent for me, and which i was half expecting, is for openai to release their own evaluation platform.

Hamel Husain (@HamelHusain)

Seems like the new OpenAI features will compete heavily w/ LangChain (RAG, code sandbox, threads, etc.)

Although there is always a market for OSS

Hugo Bowne-Anderson is podcasting again (@hugobowne)

I learned so much from Hamel Husain and emilsedgh about evaluating and productionizing LLMs in this live stream. We talked about the AI assistant Lucy that they've built at Rechat.

Check it out and lmk your thoughts. A few details in 🧵

1/

youtube.com/live/B_DMMlDuJ…

Dreaming Tulpa 🥓👑 (@dreamingtulpa)

Exactly 44 hours ago, 12 hours before Midjourney released the Style Tuner and 100 hours before OpenAI is gonna blow our collective minds, Runway released some major improvements to their text-to-video model.

19 crazy examples that caught my eye:

Michał Krassowski (@krassowski_m)

Hamel Husain, the inline completer was merged in JupyterLab last week (github.com/jupyterlab/jup…) and is slated for release in JupyterLab 4.1/Jupyter Notebook 7.1. Now it is just a matter of connecting LLM providers.
