elvis (@omarsar0) Twitter Tweets • TwiCopy

elvis

12 hours ago

Really enjoyed the SWE-Agent technical paper.

My overall takeaway is the importance of context management for building complex agentic workflows.

There are a lot of key components necessary for LLM-based agents to autonomously complete software engineering tasks. This paper…

elvis

@omarsar0

1 day ago

500B-parameter in-house model from Microsoft?

It wouldn't surprise me. I am also eager to see what they do with the Phi model series.

elvis

@omarsar0

1 day ago

AlphaMath Almost Zero

Enhances LLMs with Monte Carlo Tree Search (MCTS) to improve mathematical reasoning capabilities.

The MCTS framework extends the LLM to achieve a more effective balance between exploration and exploitation.

For this work, the idea is to generate…

elvis

@omarsar0

1 day ago

Hallucination of Multimodal LLMs

This new paper presents a comprehensive overview of hallucination in multimodal large language models.

Discusses recent advances in detection, evaluation, and mitigation strategies for hallucination. It also summarizes causes, evaluation…

elvis

@omarsar0

1 day ago

I am getting a lot of requests for educational materials on how to build AI agents.

Here is a FREE 2-hour workshop to learn about complex AI agents.

It's a good opportunity to learn how to apply AI agents for automating tasks in areas like customer support, marketing,…

DAIR.AI

@dair_ai

2 days ago

The Top ML Papers of the Week (April 29 - May 5):

- Med-Gemini
- When to Retrieve?
- Kolmogorov-Arnold Networks
- Multimodal LLM Hallucinations
- Self-Play Preference Optimization
- In-Context Learning with Long-Context Models
...

elvis

@omarsar0

4 days ago

The most exciting LLM paper of the week was the one from Gloeckle et al. that aims to train better and faster LLMs via multi-token prediction.

It's an impressive research paper so I had lots of thoughts as usual, especially because it attempts to push LLMs forward.

I enjoyed…

elvis

@omarsar0

5 days ago

An Open Source LM Specialized in Evaluating Other LMs

Open-source Prometheus 2 (7B & 8x7B), state-of-the-art open evaluator LLMs that closely mirror human and GPT-4 judgments.

They support both direct assessments and pair-wise ranking formats grouped with user-defined…
