Ehsan Shareghi (@EhsanShareghi)'s Twitter Profile
Ehsan Shareghi

@EhsanShareghi

Teaching/working on #NLProc. Currently an assistant professor at Monash University, previously a postdoc at the University of Cambridge. Opinions are my own.

ID:1365097706704703488

https://eehsan.github.io · Joined 26-02-2021 00:33:57

70 Tweets

186 Followers

161 Following

Yinhong Liu (@YinhongLiu2)'s Twitter Profile Photo

🔥New paper!📜
Struggle to align LLM evaluators with human judgements?🤔
Introducing PairS🌟: by exploiting transitivity, we push the potential of pairwise preference for efficient ranking evaluations with better alignment!🧑‍⚖️
📖arxiv.org/abs/2403.16950
💻github.com/cambridgeltl/p…
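The transitivity idea can be sketched in a few lines (a toy illustration, not the PairS implementation; `prefer` stands in for an LLM pairwise judge and uses length as a dummy quality proxy):

```python
def prefer(a: str, b: str) -> bool:
    """Stand-in for an LLM pairwise judgment: is `a` better than `b`?"""
    return len(a) > len(b)  # toy proxy; a real judge would prompt an LLM

def merge_rank(items: list[str]) -> list[str]:
    """Rank items best-first using only pairwise comparisons.

    If the judge's preferences are transitive, merge sort recovers the
    full ranking in O(n log n) comparisons instead of comparing all
    O(n^2) pairs.
    """
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left, right = merge_rank(items[:mid]), merge_rank(items[mid:])
    merged = []
    while left and right:
        merged.append(left.pop(0) if prefer(left[0], right[0]) else right.pop(0))
    return merged + left + right

summaries = ["ok", "a detailed summary", "short one"]
print(merge_rank(summaries))  # best-first under the toy criterion
```

Swapping `prefer` for an actual LLM call is where the alignment gains (and the comparison budget savings) would come from.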

Ehsan Shareghi (@EhsanShareghi)'s Twitter Profile Photo

Simple working idea: take a mixture of training data and train a task router that guides each input to the right mode of solving. A single LoRA (not an MoE) instruction-tuned to make both Task Routing and Task Solving decisions. More: raven-lm.github.io #EACL2024 #NLProc
Ehsan Shareghi (@EhsanShareghi)'s Twitter Profile Photo

Expanding a Language Agent with an uncertainty estimation mechanism not only improves the agent's performance🤯 but also reduces the number of calls it makes to external tools (far more economical🫰). Uncertainty-Aware Language Agent (UALA) paper & code to follow soon!
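The gist can be sketched as follows (a hypothetical illustration of uncertainty gating, not the UALA method itself; the threshold and the logprob-based score are assumptions):

```python
def mean_uncertainty(logprobs: list[float]) -> float:
    """Mean negative log-probability of the answer tokens (higher = less sure)."""
    return -sum(logprobs) / len(logprobs)

def act(question: str, answer_logprobs: list[float], threshold: float = 0.5) -> str:
    """Answer directly when confident; otherwise fall back to an external tool."""
    if mean_uncertainty(answer_logprobs) <= threshold:
        return "direct"   # keep the free-form answer, save a tool call
    return "tool"         # uncertain: route the question to e.g. a search tool

print(act("capital of France?", [-0.01, -0.02]))  # confident -> "direct"
print(act("obscure trivia?", [-1.2, -2.3]))       # uncertain -> "tool"
```

Gating tool use on the model's own confidence is what makes the agent both cheaper and, when the confidence signal is reliable, more accurate.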

Shunyu Yao (@ShunyuYao12)'s Twitter Profile Photo

🧠🦾ReAct -> 🔥FireAct

Most language agents prompt LMs
- ReAct, AutoGPT, ToT, Generative Agents, ...
- which is expensive, slow, and non-robust😢

Most fine-tuned LMs not for agents...

FireAct asks: WHY NOT?

Paper, code, data, ckpts: fireact-agent.github.io

(1/5)

Ehsan Shareghi (@EhsanShareghi)'s Twitter Profile Photo

It is a waste of reviewers' efforts if an AC/SAC chooses to add a meta-review/decision inconsistent with the reviews/scores. This nullifies the rebuttal authors provide too. AC/SAC could allocate better reviewers at the start, or engage in the discussion. Not healthy!

Ehsan Shareghi (@EhsanShareghi)'s Twitter Profile Photo

We (Fangyu Liu, Nigel Collier) reported related observations earlier at CogSci23 (arxiv.org/pdf/2208.11981…) - LLMs (those we tested) do not have a reliable sense of directionality. This is quite problematic for asymmetric relations. Good to see other works in this space.

Ehsan Shareghi (@EhsanShareghi)'s Twitter Profile Photo

RL(HF) has great potential. But forming the right reward, or striking the right balance during optimisation, is quite a challenge. This just scratches the surface (work in progress), showing its potential for improving the quality of Structured Explanations.
Paper: arxiv.org/pdf/2309.08347…

Ehsan Shareghi (@EhsanShareghi)'s Twitter Profile Photo

Speech encoders are foundation models too! We investigate (1) robustness in low-resource conditions, (2) where they potentially capture content and prosodic information, and (3) their representational properties. #INTERSPEECH2023
Paper: arxiv.org/pdf/2305.17733…
Christopher Manning (@chrmanning)'s Twitter Profile Photo

But most AI people work in the quiet middle: We see huge benefits from people using AI in healthcare, education, …, and we see serious AI risks & harms but believe we can minimize them with careful engineering & regulation, just as happened with electricity, cars, planes, ….

Ehsan Shareghi (@EhsanShareghi)'s Twitter Profile Photo

I don't think anyone sane actually assumed small models trained on instruction data from an LLM would clone the LLM's 'capabilities' too. It is just a rewiring step to make them behave similarly (i.e., to follow instructions). That is ground zero, but it unlocks many new possibilities.

Ehsan Shareghi (@EhsanShareghi)'s Twitter Profile Photo

Translating natural language into symbolic first-order logic is very foundational IMO. In this work we used the latest from the NLP/AI world (SFT+RLHF) to train a small LM that both corrects GPT-3.5 and works as a standalone translation tool.
arxiv.org/pdf/2305.15541…

Ehsan Shareghi (@EhsanShareghi)'s Twitter Profile Photo

Need a verifier to auto-correct the outputs of an LLM? PiVe offers a simple recipe to construct data and train such a verifier. It can be used standalone or iteratively with the LLM to improve output accuracy!
Paper: arxiv.org/pdf/2305.12392…
Code: github.com/Jiuzhouh/PiVe

Ehsan Shareghi (@EhsanShareghi)'s Twitter Profile Photo

This may not age well ... have been thinking about this for a while now ... I will learn from your thoughts/views on this

Ehsan Shareghi (@EhsanShareghi)'s Twitter Profile Photo

A categorical dive into commonsense about the physical world, quantifying human norms and comparing them against pre-trained large language models. Associative learning from data is not sufficient for: mereotopology, affordances, non-symmetric relations (cause-effect, troponymy).

Nigel Collier (@nigelhcollier)'s Twitter Profile Photo

Delighted to announce our paper 'On Reality and the Limits of Language Data' in collaboration with Ehsan Shareghi and Fangyu Liu at . We've spent the last 9 months reading and thinking about the limitations of pre-trained language m…lnkd.in/d6RSeVXN lnkd.in/dT-t3n22
