Eugene Yan (@eugeneyan) Twitter Tweets • TwiCopy

Eugene Yan

4 weeks ago

sota web agent

29 likes, 0 retweets

chris

@hingeloss

4 weeks ago

The best part of doing a startup is getting to choose the right thing over the prettiest thing.

The hardest part is saying no to everyone who just wants the pretty thing.

29 likes, 3 retweets

Hamel Husain

@HamelHusain

4 weeks ago

Love this essay from Eugene Yan

This is especially acute for tools and infra around AI

tobi lutke

@tobi

1 month ago

Sunday rant.

For software engineering, my sense is that the phrase “premature optimization is the root of all evil” has massively backfired. It's from a book on data structures and mainly tried to dissuade people from prematurely writing things in assembler. But the point was to…

Eugene Yan

1 month ago

The doers are the major thinkers. The people that really create the things that change this industry are both the thinker and doer in one person.

Of course it's very easy to take credit for the thinking. The doing is more concrete but it's very easy for somebody to say 'oh I…

26 likes, 2 retweets

Jo Kristian Bergum

@jobergum

1 month ago

imho top performers:

- figure out what needs to be done
- advocate for why it needs to be done
- prioritize what needs to be done versus what should be done
- get it done

Eugene Yan

1 month ago

effective communication is just as high leverage as coding, if not more

29 likes, 1 retweet

Charles 🎉 Frye

@charles_irl

1 month ago

when i looked back at alexnet again in ~2020 and noticed it had model parallelism, i realized that i really needed to spend less time on mathematics and more on software engineering

116 likes, 8 retweets

Eugene Yan

1 month ago

You NEED “privately curated, internal benchmarks for each company’s own use cases. You can’t game your customers.”

27 likes, 3 retweets

Alexandr Wang

@alexandr_wang

1 month ago

How overfit are popular LLMs on public benchmarks?

New research out of @scale_ai SEAL to answer this:

- produced a new eval GSM1k
- evaluated public LLMs for overfitting on GSM8k

VERDICT: Mistral & Phi are overfitting benchmarks, while GPT, Claude, Gemini, and Llama are not.

Hamel Husain

@HamelHusain

1 month ago

I’m getting lots of questions about why this is a bad idea.

Repeatedly peeking at the validation set in the process of optimizing anything makes that validation set very biased

It’s very bad hygiene to intermingle your validation and test/eval set. The consequences of this…

84 likes, 5 retweets
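
A minimal sketch of the validation-peeking point in the last tweet, not taken from the thread itself. It assumes a toy setup: every candidate "model" is a pure coin flip (true accuracy 0.5), and we keep whichever one scores best on the validation set. The function names (`pick_best_on_validation`, `average_gap`) and all sizes are hypothetical choices made for illustration.

```python
import random


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def pick_best_on_validation(rng, n_models=200, n_val=100, n_test=100):
    """Select the best of many coin-flip "models" by validation accuracy.

    Returns (validation accuracy, test accuracy) of the selected model.
    """
    val_labels = [rng.randint(0, 1) for _ in range(n_val)]
    test_labels = [rng.randint(0, 1) for _ in range(n_test)]
    best_val, best_test = -1.0, 0.0
    for _ in range(n_models):
        # Each candidate guesses at random, so its true accuracy is 0.5.
        val_preds = [rng.randint(0, 1) for _ in range(n_val)]
        test_preds = [rng.randint(0, 1) for _ in range(n_test)]
        val_acc = accuracy(val_preds, val_labels)
        if val_acc > best_val:
            # Selection looks only at the validation score.
            best_val = val_acc
            best_test = accuracy(test_preds, test_labels)
    return best_val, best_test


def average_gap(trials=20, seed=0):
    """Average the selected model's validation and test accuracy over trials."""
    rng = random.Random(seed)
    results = [pick_best_on_validation(rng) for _ in range(trials)]
    avg_val = sum(v for v, _ in results) / trials
    avg_test = sum(t for _, t in results) / trials
    return avg_val, avg_test


if __name__ == "__main__":
    avg_val, avg_test = average_gap()
    print(f"validation accuracy of selected model: {avg_val:.3f}")
    print(f"test accuracy of selected model:       {avg_test:.3f}")
```

In this toy setup the selected model's validation score sits well above chance, while its score on the untouched test set stays near 0.5. The more often the validation set is used to choose between candidates, the less its score says about real performance, which is why a separate test set is kept out of the selection loop.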