elvis (@omarsar0)'s Twitter Profile
elvis

@omarsar0

Building with LLMs @dair_ai • Prev: Meta AI, Galactica LLM, PapersWithCode, Elastic, PhD • Creator of the Prompting Guide (~4M learners)

ID:3448284313

Link: https://linktr.ee/elvissaravia

Joined: 04-09-2015 12:59:26

10.5K Tweets

188.7K Followers

486 Following

Really enjoyed the SWE-Agent technical paper.

My overall takeaway is the importance of context management for building complex agentic workflows.

There are a lot of key components necessary for LLM-based agents to autonomously complete software engineering tasks. This paper…

A 500B-parameter in-house model from Microsoft?

It wouldn't surprise me. I am also eager to see what they do with the Phi model series.

AlphaMath Almost Zero

Enhances LLMs with Monte Carlo Tree Search (MCTS) to improve mathematical reasoning capabilities.

The MCTS framework extends the LLM to achieve a more effective balance between exploration and exploitation.

For this work, the idea is to generate…

Hallucination of Multimodal LLMs

This new paper presents a comprehensive overview of hallucination in multimodal large language models.

Discusses recent advances in detection, evaluation, and mitigation strategies for hallucination. It also summarizes causes, evaluation…

I am getting a lot of requests for educational materials on how to build AI agents.

Here is a FREE 2-hour workshop to learn about complex AI agents.

It's a good opportunity to learn how to apply AI agents for automating tasks in areas like customer support, marketing,…

DAIR.AI (@dair_ai):

The Top ML Papers of the Week (April 29 - May 5):

- Med-Gemini
- When to Retrieve?
- Kolmogorov-Arnold Networks
- Multimodal LLM Hallucinations
- Self-Play Preference Optimization
- In-Context Learning with Long-Context Models
...

elvis (@omarsar0):

The most exciting LLM paper of the week was the one from Gloeckle et al. that aims to train better and faster LLMs via multi-token prediction.

It's an impressive research paper so I had lots of thoughts as usual, especially because it attempts to push LLMs forward.

I enjoyed…

An Open Source LM Specialized in Evaluating Other LMs

Open-source Prometheus 2 (7B & 8x7B), state-of-the-art open evaluator LLMs that closely mirror human and GPT-4 judgments.

They support both direct assessments and pair-wise ranking formats grouped with user-defined…
