Twitter #bigscience hashtag • TwiCopy

#cliss2024
#BigScience
#clearlakeca

2 days ago

CLISS

#cliss2024
#bigscience
#ClearLakeCA

22 hours ago

CLISS

#cliss2024
#bigscience
#clearlakeca

22 hours ago

BigScience Large Model Training

@BigScienceLLM

2 years ago

How is the BigScience 176B model trained: a visual overview of the hardware and parallelism setup

It's that time of year! Learn all about Harmful Algal Blooms and who's tracking them in Clear Lake at CLISS 2024! Until then, check out the Clear Lake Water Quality page, which posts all current HABs results.

#cliss2024
#BigScience
#ClearLakeCA

BigScience Research Workshop

@BigscienceW

1 year ago

A few days left until the first BigScience follow-up project 🚀 Any guesses?

Ofir Press

@OfirPress

1 year ago

DeepMind's Gopher and BigScience's BLOOM already use relative position embeddings, but most other language models don't. I believe we should all start using relative positioning.

In this new post, I discuss the use case for relative position methods:
ofir.io/The-Use-Case-f…

松xR

@matsu_vr

1 year ago

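I tried running bloomz.cpp, which you could call the BLOOMZ version of the tools for running LLMs locally, on a first-generation M1 MacBook Air (16 GB of memory). It worked! I tested the 7-billion-parameter bigscience/bloomz-7b1. When I asked who the US president is, it answered G. W. Bush, though. Responses are also fast at 7B.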
github.com/NouamaneTazi/b…

AK

@_akhaliq

1 year ago

BigScience: A Case Study in the Social Construction of a Multilingual Large Language Model
abs: arxiv.org/abs/2212.04960

Saulnier Lucile

@LucileSaulnier

1 year ago

Wondering how one can create a dataset of several TB of text data to train a language model?📚

With BigScience Research Workshop, we have been through this exercise and shared everything in our #NeurIPS2022 paper 'The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset'

🧵

Edouard d'Archimbaud

@edarchimbaud

2 years ago

With the ability to train self-supervised models, research is scaling up both data and model size at an impressive rate. Hugging Face BigScience is a new effort to establish good practices in data curation.
bit.ly/3xog6uR
#dcai #SelfSupervised #labeling #data #ai #ml

Taiga

@tg3517

1 year ago

A 170-billion-parameter OSS LLM. Performance looks good too. Seems to be around 300 GB.
bigscience/bloom · Hugging Face huggingface.co/bigscience/blo…

BigScience Research Workshop

@BigscienceW

2 years ago

Excited to announce the BigScience Biomedical Hackathon! Together we're creating an open source, community resource of over 150 biomedical datasets. Join us! 🙌

🌸 Our mission: hfbigbio.github.io
🚀 Contribute: github.com/bigscience-wor…

Anna-Lena Rüland @arueland.bsky.social

@LeniRueland

1 year ago

Great presenting my paper on #conflicts in #BigScience at the Global Transformations and Governance Challenges conference. Thanks to my co-panelists Babak Rezaee, Dr Dominika Czerniawska & Inga Ulnicane for their useful feedback & the lively discussion.

AK

@_akhaliq

1 year ago

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset

abs: arxiv.org/abs/2303.03915

yoheLab

@LabYohe

1 year ago

See the Yohe Lab featured here for contributing to the new Center for Computational Intelligence to Predict Health and Environmental Risks (#CIPHER), with UNC Charlotte College of Computing + Informatics, BioinformaticsUNCC, and UNCC Biological Sciences! 🦇👩‍🔬🦠🔬👩‍💻📊
features.charlotte.edu/laurel-yohe

Aran Komatsuzaki

@arankomatsuzaki

1 year ago

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset

Documents the data creation and curation efforts of ROOTS corpus, a 1.6TB dataset used to train BLOOM

Releases a large initial subset of the corpus

data: huggingface.co/bigscience-data
abs: arxiv.org/abs/2303.03915

CLISS

2 days ago

Tule Talk: Learn about shorelines and the animals that love them at the CLISS 2024!

#cliss2024
#BigScience
#ClearLakeCA