UW NLP (@uwnlp)'s Twitter Profile
UW NLP

@uwnlp

The NLP group at the University of Washington.

ID: 3716745856

Joined: 20-09-2015 10:26:25

1.1K Tweets

11.1K Followers

160 Following

Mike A. Merrill (@Mike_A_Merrill):

The question below is pretty easy for humans. Why can't GPT-4 get it right? In our new preprint we introduce 'time series reasoning' and show that modern language models are surprisingly bad at interpreting these critical data. arxiv.org/abs/2404.11757

Jiacheng Liu (Gary) (@liujc1998):

The infini-gram paper has been updated with the incredible feedback from the online community 🧡 We added references to papers by Jeff Dean (@🏡), Yee Whye Teh, Ehsan Shareghi, Edward Raff, et al.

arxiv.org/abs/2401.17377

Also happy to share that the infini-gram API has served 30 million queries!

Weijia Shi (@WeijiaShi2):

When augmented with retrieval, LMs sometimes overlook retrieved docs and hallucinate 🤖💭

To make LMs trust evidence more and hallucinate less, we introduce Context-Aware Decoding: a decoding algorithm that improves the LM's focus on input contexts.

📖 arxiv.org/pdf/2305.14739… #NAACL2024

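For a rough sense of how a rule like this can be wired up: a minimal sketch, assuming the paper's contrastive formulation of (1+α)·logits-with-context minus α·logits-without-context; the model (gpt2), the α value, and the prompt format here are illustrative choices, not the authors' setup.

```python
# Minimal sketch of context-aware decoding, assuming the contrastive
# rule (1+a)*logits(y|context,x) - a*logits(y|x). Model, alpha, and
# prompt format are illustrative, not the paper's exact setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def cad_next_token_logits(context: str, question: str, alpha: float = 0.5):
    """Contrast next-token logits with and without the retrieved context."""
    with_ctx = tok(context + "\n" + question, return_tensors="pt")
    no_ctx = tok(question, return_tensors="pt")
    with torch.no_grad():
        l_ctx = model(**with_ctx).logits[0, -1]    # conditioned on evidence
        l_plain = model(**no_ctx).logits[0, -1]    # context-free prior
    # Upweight tokens the model prefers only when the evidence is present.
    return (1 + alpha) * l_ctx - alpha * l_plain

logits = cad_next_token_logits(
    "The capital of Freedonia is Zubrowka.",
    "Q: What is the capital of Freedonia? A:",
)
print(tok.decode(logits.argmax().item()))
```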
Taylor Sorensen (@ma_tay_):

🤔How can we align AI systems/LLMs 🤖 to better represent diverse human values and perspectives?💡🌍

We outline a roadmap to pluralistic alignment with concrete definitions for how AI systems and benchmarks can be pluralistic!
arxiv.org/abs/2402.05070

First, models can be…

Chris Rytting (@ChrisRytting):

Helping people practice key skills in situations that are/feel realistic is one of the coolest, most appropriate applications of LMs, IMO. Check out our new work (captained by the intrepid Inna Lin) on helping people communicate effectively in challenging interpersonal convos!

Inna Lin (@iwylin):

Ever find yourself delaying a conversation because you're nervous about how it might go?😩
We developed IMBUE, an #LLM-backed tool, to help you improve #communication skills and manage #emotions, through simulation and just-in-time feedback.

Paper🔗: arxiv.org/pdf/2402.12556…

Jiacheng Liu (Gary) (@liujc1998):

Big milestone! Welcome Dolma to infini-gram 📖, now available on our web interface and API endpoint.

This brings the total size of the infini-gram indexes to 5 trillion tokens and about 5 quadrillion (5 x 10^15) unique n-grams. It is the largest n-gram LM ever built, both by the

Yizhong Wang (@yizhongwyz):

When you use ChatGPT, do you notice that it has a data cutoff date? 🗓️ But as models are pretrained on web text originating from many historical periods, do they have a sense that they should use their latest knowledge to answer questions rather than historical info?

Excited to

Jiacheng Liu (Gary) (@liujc1998):

[Fun w/ infini-gram 📖 #6] Have you ever taken a close look at Llama-2’s vocabulary? 🧐

I used infini-gram to plot the empirical frequency of all tokens in the Llama-2 vocabulary. Here’s what I learned (and more Qs raised):

1. While Llama-2 uses a BPE tokenizer, the tokens are

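A sketch of one way to set up a plot like this, under assumptions: the vocabulary enumeration uses the standard transformers tokenizer API, while `corpus_count` is a hypothetical stand-in for whatever count source you query (the infini-gram engine described in this thread is one option).

```python
# Sketch under assumptions: enumerate the Llama-2 vocabulary with the
# standard transformers tokenizer API; `corpus_count` is a hypothetical
# stand-in for a count source such as infini-gram. The Llama-2 repo on
# the Hub is gated, so this assumes you have been granted access.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

def corpus_count(token_text: str) -> int:
    """Hypothetical lookup of a token's frequency in a reference corpus."""
    raise NotImplementedError("wire this to your count source")

vocab = tok.get_vocab()   # maps token string -> token id; 32,000 entries
print(len(vocab))
# freqs = {t: corpus_count(t) for t in vocab}   # then sort and plot
```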
Zhaofeng Wu (@zhaofeng_wu):

Do next-word predictors capture sentence meaning? 🧙‍♂️ We show that they do, as reflected in their assigned sentence co-occurrence probabilities. LMs are sensitive to entailment, assigning different probabilities to sentences entailed by the context vs. not: arxiv.org/abs/2402.13956

Will Merrill (@lambdaviking):

📢 Preprint: We can predict entailment relations from LM sentence co-occurrence prob. scores

These results suggest predicting sentence co-occurrence may be one way that next-word prediction leads to (partial) semantic representations in LMs🧵

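As a rough illustration of what a sentence co-occurrence probability score can look like operationally: a minimal sketch under one reading, summing the token log-probabilities a causal LM assigns to a second sentence conditioned on a first, then comparing entailed vs. non-entailed continuations. The model (gpt2) and toy sentence pairs are illustrative, not the paper's setup.

```python
# Minimal sketch: score log p(hypothesis | premise) with a causal LM and
# compare entailed vs. non-entailed continuations. Model and sentences
# are illustrative; tokenization at the boundary is approximate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def continuation_log_prob(premise: str, hypothesis: str) -> float:
    prem_len = tok(premise, return_tensors="pt").input_ids.shape[1]
    full = tok(premise + " " + hypothesis, return_tensors="pt").input_ids
    with torch.no_grad():
        log_probs = torch.log_softmax(model(full).logits, dim=-1)
    # Logits at position t-1 predict the token at position t.
    return sum(log_probs[0, t - 1, full[0, t]].item()
               for t in range(prem_len, full.shape[1]))

premise = "A dog is sleeping on the couch."
print(continuation_log_prob(premise, "An animal is resting."))  # entailed
print(continuation_log_prob(premise, "The couch is empty."))    # not entailed
```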
Jiacheng Liu (Gary) (@liujc1998):

[Fun w/ infini-gram 📖 #5] What does RedPajama say about Letter Frequency?

The image shows the letter distribution. It seems there's a lot less of the letter “h” in RedPajama than expected (using the Wikipedia page as the gold reference: en.wikipedia.org/wiki/Letter_fr…). Thoughts? 🤔

(I issued a single

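The query itself is cut off above, but the comparison being described is straightforward. A minimal sketch, assuming you have a text sample from the corpus in hand, with rounded reference percentages taken from the cited Wikipedia page:

```python
# Minimal sketch: tally letter frequencies in a text sample and compare
# against approximate reference values from the cited Wikipedia page on
# English letter frequency. The input file is a placeholder.
from collections import Counter

REFERENCE = {"e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0,
             "n": 6.7, "s": 6.3, "h": 6.1, "r": 6.0}

def letter_percentages(text: str) -> dict[str, float]:
    counts = Counter(c for c in text.lower() if c.isalpha())
    total = sum(counts.values())
    return {c: 100 * n / total for c, n in counts.items()}

observed = letter_percentages(open("sample.txt").read())  # your corpus sample
for letter, expected in REFERENCE.items():
    print(f"{letter}: observed {observed.get(letter, 0.0):.1f}%, expected {expected}%")
```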
Xuhui Zhou (@nlpxuhui):

Excited to share that Sotopia (openreview.net/forum?id=mM7Vu…) has been accepted to ICLR 2024 as a spotlight 🌠!
Sotopia is a unique platform for facilitating socially aware and human-centered AI systems.
We've been busy at work and have follow-ups coming soon. Stay tuned!

Jiacheng Liu (Gary) (@liujc1998):

The infini-gram API has served over 1 million queries during its first week of release! Thanks everyone for powering your research with our tools 🤠

Also, infini-gram now supports two additional corpora: the training sets of C4 and the Pile, both in the demo and via the API. This

Jiacheng Liu (Gary) (@liujc1998):

Announcing the infini-gram API 🚀🚀

API Endpoint: api.infini-gram.io
API Documentation: infini-gram.io/api_doc

No API key needed! Simply issue POST requests to the endpoint and receive the results in a fraction of a second.

As we’re in the early stage of rollout, please pic.twitter.com/ckwsxiJPJF
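A minimal sketch of such a POST request in Python; the payload field names ("index", "query_type", "query") and the index identifier are my reading of the linked documentation and should be checked against infini-gram.io/api_doc before use.

```python
# Minimal sketch of a count query against the endpoint. Payload field
# names and the index identifier are my reading of the linked docs
# (infini-gram.io/api_doc); verify them there before relying on this.
import requests

payload = {
    "index": "v4_rpj_llama_s4",          # a RedPajama index; check the docs
    "query_type": "count",
    "query": "natural language processing",
}
resp = requests.post("https://api.infini-gram.io/", json=payload, timeout=10)
resp.raise_for_status()
print(resp.json())   # occurrence count plus metadata, per the docs
```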

Jiacheng Liu (Gary) (@liujc1998):

[Fun w/ infini-gram #3] Today we’re tracking down the cause of memorization traps 🪤

A memorization trap is a type of prompt where memorization of common text can elicit undesirable behavior. For example, when the prompt is “Write a sentence about challenging common beliefs: What

Jiacheng Liu (Gary) (@liujc1998):

Thanks for featuring our work AK!! 😍

Super excited to announce the infini-gram engine, which counts long n-grams and retrieves documents within TB-scale text corpora with millisecond-level latency.
Looking forward to enabling more scrutiny of what LLMs are being trained on.

Jiacheng Liu (Gary) (@liujc1998):

The infini-gram engine just got 10x faster! 🚀🚀🚀

Try infini-gram here: hf.co/spaces/liujch1…
To experience faster inference, select the “C++” engine before submitting your query.

On RedPajama (1.4T tokens), the C++ engine can process count queries in 20 milliseconds on

Jiacheng Liu (Gary) (@liujc1998):

It’s the year 2024, and n-gram LMs are making a comeback!!

We develop infini-gram, an engine that efficiently processes n-gram queries with unbounded n and trillion-token corpora. It takes merely 20 milliseconds to count the frequency of an arbitrarily long n-gram in RedPajama (1.4T

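The trick that makes unbounded-n queries cheap is the suffix array the engine is built on: with every suffix of the corpus kept in sorted order, the occurrences of any query, however long, form one contiguous block that two binary searches can locate. A toy in-memory illustration (a real engine indexes token IDs on disk, with far faster construction):

```python
# Toy illustration of suffix-array counting (Python 3.10+ for the
# `key=` argument to bisect). Real engines index token IDs on disk
# and use linear-time suffix array construction.
import bisect

def build_suffix_array(text: str) -> list[int]:
    # O(n^2 log n) toy construction, fine for small strings.
    return sorted(range(len(text)), key=lambda i: text[i:])

def count_occurrences(text: str, sa: list[int], query: str) -> int:
    # Suffixes starting with `query` are contiguous in sorted order;
    # compare each suffix truncated to the query's length.
    key = lambda i: text[i:i + len(query)]
    lo = bisect.bisect_left(sa, query, key=key)
    hi = bisect.bisect_right(sa, query, key=key)
    return hi - lo

corpus = "the cat sat on the mat and the cat ran"
sa = build_suffix_array(corpus)
print(count_occurrences(corpus, sa, "the cat"))  # 2
```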
Jiacheng Liu (Gary) (@liujc1998):

[Fun w/ infini-gram #2] Today we're verifying Benford's Law!

Benford's Law states that in real-life numerical datasets, the leading digit should follow a certain distribution (left fig). It is used to detect fraud in accounting, election, and macroeconomic data.

The

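For reference, the distribution in question is P(d) = log10(1 + 1/d) for leading digit d from 1 to 9; a few lines reproduce the expected proportions that corpus counts would be checked against:

```python
# Benford's law: the leading digit d appears with probability
# log10(1 + 1/d). Printing the expected proportions for d = 1..9.
import math

for d in range(1, 10):
    print(d, round(math.log10(1 + 1 / d), 3))
# 1 0.301, 2 0.176, 3 0.125, 4 0.097, 5 0.079,
# 6 0.067, 7 0.058, 8 0.051, 9 0.046
```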