Jeremy Howard (@jeremyphoward)'s Twitter Profile
Jeremy Howard

@jeremyphoward

🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ;
Hon Professor: @UQSchoolITEE ;
Digital Fellow: @Stanford

ID:175282603

Link: http://answer.ai · Joined: 06-08-2010 04:58:18

54.8K Tweets

220.6K Followers

4.9K Following

JJ (@JosephJacks_)'s Twitter Profile Photo

“Open source AI has risks, but it’s potentially a much bigger risk that one institution controls the most powerful AI” 🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯🎯

Benjamin Warner (@benjamin_warner)'s Twitter Profile Photo

If finetuning Llama 3 w/ Hugging Face, use Transformers 4.37 or 4.40.

Llama & Gemma in 4.38 & 4.39 don't use PyTorch's Flash Attention 2 kernel, leading to high memory usage.

4.40 uses FA2 in eager mode, but not with torch.compile. I'm working with HF to fully fix this.
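The version advice above can be sketched as a small helper. This is a hedged illustration, not Benjamin's code: the helper name is hypothetical, and the version cutoffs are taken directly from the tweet. "sdpa" refers to Transformers' PyTorch `scaled_dot_product_attention` backend, which can dispatch to the Flash Attention 2 kernel; "flash_attention_2" uses the separate `flash-attn` package.

```python
def recommended_attn_implementation(transformers_version: str) -> str:
    """Pick an `attn_implementation` for Llama/Gemma finetuning.

    Per the tweet: Transformers 4.38/4.39 don't use PyTorch's FA2 kernel
    for Llama & Gemma (high memory usage), while 4.37 and 4.40 are fine.
    On the affected releases, requesting the flash-attn package explicitly
    is a workaround.
    """
    major, minor = (int(part) for part in transformers_version.split(".")[:2])
    if (major, minor) in {(4, 38), (4, 39)}:
        # Affected releases: bypass the broken SDPA path.
        return "flash_attention_2"
    return "sdpa"
```

The result would then be passed to `AutoModelForCausalLM.from_pretrained(..., attn_implementation=...)`.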

Jeremy Howard (@jeremyphoward)'s Twitter Profile Photo

> 'I'm not based on LLaMA 3'

I'm surprised that most modern LLMs still aren't being fine tuned to correctly answer basic questions about themselves.

Intuitively, users expect that they can ask an LLM about itself, and they generally trust the answers provided.

Vipul Ved Prakash (@vipulved)'s Twitter Profile Photo

These models are incredible, and a massive step forward for OSS AI. Amazing work from Meta team!

On Together AI now at 350 t/s for full precision on 8B and 150 t/s on 70B.

api.together.xyz/playground/cha…

Jeremy Howard (@jeremyphoward)'s Twitter Profile Photo

Claude has a nice trick where you prefill the start of the assistant response, and it continues from there. Anyone know if Llama 3 can do the same thing?
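Because Llama 3's chat format is plain text, the same trick is possible by ending the prompt mid-way through the assistant turn and letting the model continue. A minimal sketch, assuming Llama 3's documented header tokens; the helper name is hypothetical:

```python
def llama3_prompt_with_prefill(user_msg: str, assistant_prefill: str) -> str:
    """Build a Llama 3 chat prompt whose assistant turn is pre-started.

    The assistant turn deliberately has no closing <|eot_id|>, so
    generation continues from the prefilled text.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_msg}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
        f"{assistant_prefill}"
    )
```

The resulting string would be tokenized without adding special tokens and passed straight to `generate`.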

Maziyar PANAHI (@MaziyarPanahi)'s Twitter Profile Photo

Mixtral-8x22B-Instruct-v0.1, going wild on TOOLS & FUNCTION CALLING:

'<unk>',
'<s>',
'</s>',
'[INST]',
'[/INST]',
'[TOOL_CALLS]',
'[AVAILABLE_TOOLS]',
'[/AVAILABLE_TOOLS]',
'[TOOL_RESULT]',
'[/TOOL_RESULTS]',

Nathan Lambert (@natolambert)'s Twitter Profile Photo

Diff of the Llama 3 license vs. Llama 2: mostly around sharing, "Built with Llama 3" branding, and agreeing to Meta's brand guidelines when distributing under the trademark.
Some minor other differences.

Daniel Han (@danielhanchen)'s Twitter Profile Photo

#LLaMA3 is out! It's the same architecture as Llama-2, except for some differences:
1. 128K Tiktoken vocab vs 32K vocab of Llama-2
2. 15 Trillion tokens instead of 2T
3. 8 billion model uses GQA (unlike Llama 7b)
4. 8K Context Length
5. Chinchilla scaling laws - log linear gains!…

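Point 3 above (GQA) can be made concrete with a little arithmetic: grouped-query attention shrinks the KV cache by `num_heads / num_kv_heads`. A sketch, assuming Llama 3 8B's published config (32 layers, head dim 128, 8 KV heads, 8K context) versus a full multi-head baseline with 32 KV heads:

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Size of the KV cache in bytes: 2x for K and V, fp16/bf16 elements."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# Full MHA baseline (all 32 heads keep their own K/V) vs. GQA with 8 KV heads.
mha = kv_cache_bytes(layers=32, kv_heads=32, head_dim=128, seq_len=8192)
gqa = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=8192)
print(mha // gqa)  # → 4: GQA cuts KV-cache memory 4x at 8K context
```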
jackson petty (@jowenpetty)'s Twitter Profile Photo

Mikel Artetxe I'd just like to interject for a moment. What you're referring to as <model>,
is in fact, Llama 3/<model>, or as I've recently taken to calling it, Llama 3 plus <model>.
