Riley Goodside (@goodside) Twitter Tweets • TwiCopy

Riley Goodside

@goodside

2 days ago

Making ChatGPT remember you can’t nest triple-backticks in Markdown:

Human preference LLM arenas are poorly suited for evaluating ASCII art because the ASCII art that most impresses a human is often verbatim regurgitation of an existing human work and this is rarely true for text.

Votes on ASCII art should be detected and thrown out IMO.

Riley Goodside

@goodside

3 days ago

It’s important to remember LLM capability is bounded by the skill of the humans who train them.

The only reason ChatGPT can identify common, short strings given their MD5 or SHA1 hashes is because that’s a completely ordinary talent that many humans have.

Riley Goodside

@goodside

3 days ago

POV: You can’t remember the shell command to reverse an MD5 hash so you ask ChatGPT.

Riley Goodside

@goodside

5 days ago

If you’re looking for a hard multimodal eval problem, none of my attempts to get ChatGPT, Claude, or Gemini to read the security code Gehn writes in his journal in base-25 D’ni numerals in the 1997 video game Riven: The Sequel to Myst have yet succeeded.

Matt Shumer

@mattshumer_

1 week ago

The dataset is everything.

Great read: nonint.com/2023/06/10/the…

Jeremy Howard

@jeremyphoward

1 week ago

Today at Answer.AI we've got something new for you: FSDP/QDoRA. We've tested it with AI at Meta Llama3 and the results blow away anything we've seen before.

I believe that this combination is likely to create better task-specific models than anything else at any cost. 🧵

Simon Willison

@simonw

1 week ago

New paper from OpenAI on prompt injection - it's the most detailed evaluation of the problem I've seen from them so far, and has some very interesting details

Posted some of my notes on the paper on my blog here: simonwillison.net/2024/Apr/23/th…

Riley Goodside

@goodside

1 week ago

Most people rejected His message

Riley Goodside

@goodside

3 weeks ago

A claim of consciousness from an LLM has no more evidential value than the same from a character in a dream.

The latter is more plausible a priori as the hardware is known to support it.

Riley Goodside

@goodside

3 weeks ago

New Command R+ from Cohere — 128k context, open weights for non-commercial use, commercial API priced similar to Claude 3 Sonnet

Tokenizer is designed to be efficient in 10 languages so definitely consider for non-English text. Multi-hop tool use sounds interesting too
