Ben Schmidt / @benmschmidt@sigmoid.social (@benmschmidt)'s Twitter Profile
Ben Schmidt / @benmschmidt@sigmoid.social

@benmschmidt

VP of Information Design @nomic_ai, building new ways to interpret and shape embedding models. Onetime history/digital humanities prof. @bschmidt.bsky.social

ID:222618390

https://benschmidt.org · 03-12-2010 23:11:42

7,8K Tweets

10,1K Followers

1,2K Following

AndriyMulyar (@andriy_mulyar)

Just a reminder you can literally train nomic embed from scratch, the training data is public.

The only above 62% on mteb below 500M params to do so 😀

Dmitry Kobak (@hippopedoid)

Our paper 'The landscape of biomedical research' is out in Patterns journal! Great job by Rita González Márquez.

cell.com/patterns/fullt…

Amazing interactive explorer by Ben Schmidt / @benmschmidt@sigmoid.social from Nomic AI: static.nomic.ai/pubmed.html

For details see my original Twitter thread: twitter.com/hippopedoid/st….

Ben Schmidt / @benmschmidt@sigmoid.social (@benmschmidt)

The fake accounts here seem to be getting a lot worse, no? I've now got multiple bots with *the same profile picture* liking old reposts. Seems hard for me to believe that this is either useful spamming, *or* something that would be hard for functioning auto-moderation to find.

Jan Lause 🟦 @janlause.bsky.social🦉 (@JanLause)

In their Specious Art paper, Chari & Lior Pachter claim that tSNE/UMAP are as arbitrary as a random elephant shape. But are they?

We show in our comment that this is false and throws the tSNE/UMAP baby out with the bathwater!

Details in 🧵& paper:

biorxiv.org/content/10.110…

1/8

Ben Schmidt / @benmschmidt@sigmoid.social (@benmschmidt)

Biologists concerned about the usefulness of UMAP/T-SNE-type methods for dimensionality reduction need to read this article from Jan Lause, Philipp Berens, & Dmitry Kobak, which patiently explains what metrics they're good at. biorxiv.org/content/10.110…

Andrew Gray | @generalising@mastodon.flooey.org (@generalising)

I have a preprint out! Evidence for extensive appearance of chatGPT/LLM derived text in scholarly papers, signalled by words that mysteriously became a lot more popular in 2023 - eg 'commendable'. I estimate upwards of 60,000 papers last year (& rising...) arxiv.org/abs/2403.16887

AndriyMulyar (@andriy_mulyar)

One under-rated Nomic Atlas capability is that you can just directly upload 2D point clouds.

Useful if you:
- Ran your own dim reduction algo like umap or tsne
- have long/lat coordinate metadata
- want to make your matplotlib plots interactive!

Here is a 2D Clifford attractor…
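
As a rough illustration of the kind of 2D point cloud the post above refers to, the sketch below iterates the standard Clifford map with NumPy. The function name, point count, and the parameter values a, b, c, d are illustrative assumptions, not taken from the post, and the upload step is left out because the post does not show the call it used.

```python
import numpy as np


def clifford_attractor(n_points=10_000, a=-1.4, b=1.6, c=1.0, d=0.7):
    """Iterate the Clifford map and return an (n_points, 2) array.

    The parameter values are arbitrary example choices (an assumption).
    """
    points = np.empty((n_points, 2))
    x, y = 0.0, 0.0
    for i in range(n_points):
        # Both updates use the previous (x, y), hence the tuple assignment.
        x, y = (
            np.sin(a * y) + c * np.cos(a * x),
            np.sin(b * x) + d * np.cos(b * y),
        )
        points[i] = (x, y)
    return points


points = clifford_attractor()
```

Each row of `points` is one (x, y) pair, which is the shape a 2D scatter plot or point-cloud upload would typically expect.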

Dmitry Kobak (@hippopedoid)

Huge thanks once more to all participants for coming together this week, and for Schloss Dagstuhl for hosting us! Looking forward to working on all the ideas that we discussed, and to future meetings!

dagstuhl.de/24122

Alexander Visheratin (@visheratin)

Look at this beauty—four different embeddings on the same map! In another Hugging Face community post, I explore how you can use Nomic AI Atlas to view and clean your dataset. It also has a brief Lightning AI ⚡️ cameo =)

Jared Wilber (@jdwlbr)

Wow - this browsertech episode with Paul Butler + Ben Schmidt / @benmschmidt@sigmoid.social discussing the challenges of working with/visualizing large datasets at Nomic AI is such a great listen. Straight into my veins 🤤

podcast.browsertech.com/episodes/ben-s…

Ben Schmidt / @benmschmidt@sigmoid.social (@benmschmidt)

This is such a cool feature of our embeddings we'd been keeping a little secret--they're trained not just to be nested (so fewer dimensions work really well) but also to work as binary. Unlocks all sorts of amazing in- and out-of-browser small-scale operations.
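
To make "nested" and "binary" concrete, here is a minimal NumPy sketch of the two operations: keeping only the leading dimensions of a vector, and quantizing each dimension to one bit. It uses random stand-in vectors rather than real model output, and the helper names, the 768 and 256 sizes, and the sign threshold are all assumptions for illustration. Random vectors will not show the quality benefit, which depends on how the model was trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: random unit vectors, NOT real model embeddings.
embeddings = rng.normal(size=(5, 768))
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)


def truncate(vectors, dims):
    """Keep the first `dims` dimensions and renormalize to unit length."""
    cut = vectors[:, :dims]
    return cut / np.linalg.norm(cut, axis=1, keepdims=True)


def binarize(vectors):
    """Quantize each dimension to one bit (1 if positive), packed 8 per byte."""
    return np.packbits(vectors > 0, axis=1)


def hamming(a, b):
    """Count the differing bits between two packed binary vectors."""
    return int(np.unpackbits(a ^ b).sum())


small = truncate(embeddings, 256)  # 256 floats per vector
bits = binarize(small)             # 32 bytes per vector
distance = hamming(bits[0], bits[1])
```

At 32 bytes per vector, comparisons reduce to XOR and bit counts, which is why this kind of representation suits small-scale work in a browser.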
