Open Source Startup Podcast🎙 (@OssStartup)

Time series databases have historically been focused on limited data types

In the latest episode of the Open Source Startup Podcast🎙, Robby & Timothy Chen interview Rerun CEO Nikolaus West on the need for multimodal time series databases 💪

AskToRahulSingh©️ (@AskToRahulSingh)

Meta has updated its smart glasses with a new multimodal feature in the US and Canada.

Users can now interact with the glasses using voice commands by saying “Hey Meta,” allowing the AI to analyze visuals through the built-in camera and offer relevant information.…

Roni Rahman (@heyronir)

Meta's new Ray-Ban multimodal smart glasses just dropped.

A smart, budget-friendly, lightweight wearable, unlike Apple Vision Pro.

Here are 7 things you can do with Ray-Ban Meta glasses:

(These are real glasses, not a concept)

Poonam Soni (@CodeByPoonam)

Meta just announced multimodal Ray-Ban glasses and it's INSANE

SPOILER: Apple Vision Pro just got some serious competition.

Here are 7 powerful things you can do with Ray-Ban smart glasses:

Nikita Drobyshev (@NikDrob23)

I am thrilled to announce that my latest paper has been accepted at CVPR 2024:
EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars
🔗 Project Page: neeek2303.github.io/EMOPortraits/

Tanishq Mathew Abraham, Ph.D. (@iScienceLuvr)

Google announces Med-Gemini, a family of Gemini models fine-tuned for medical tasks! 🔬

Achieves SOTA on 10 of the 14 benchmarks, spanning text, multimodal & long-context applications.

Surpasses GPT-4 on all benchmarks!

This paper is super exciting, let's dive in ↓

Rowan Cheung (@rowancheung)

Meta just announced that multimodal capabilities are rolling out to smart glasses.

This means the glasses will be able to understand their visual surroundings through the built-in camera.

Some use cases include translating text, identifying objects, or providing other context-specific information.

Anil Ozturk (@anil_ozturkk)

A repo that collects research on the hallucination problem in multimodal LLMs. The authors have also written their own survey on the topic.

GitHub: github.com/showlab/Awesom…

Min Choi (@minchoi)

Ray-Ban Meta smart glasses just got a massive Multimodal upgrade - Meta AI with Vision

It doesn't just take speech input; it can now answer questions about what you are seeing.

Here are 8 features that are now possible:

1. Ask about what you are seeing

TxDOT (@TxDOT)

Did you know Texas has 77 transit agencies? They provide 205 million trips each year, connecting communities across the state. The Statewide Multimodal Transit Plan aims to keep momentum going for an even more connected tomorrow. More info at ow.ly/t2fl50RrcGf

Tibor Blaho (@btibor91)

After Stable Assistant, Stability AI might introduce Stable Artisan soon (AI Discord bot)

- It's a multimodal generative AI Discord bot that uses Stability AI's API

- You can add the bot to your own servers where users can work together to create and edit images

- Stable…

Gradio (@Gradio)

LLaMA Factory is a Gradio UI that helps you fine-tune LLMs as well as MLLMs 🤯
💪 Fine-tuning multimodal LLMs ⚡ has never been this easy! Links below 👇

AK (@_akhaliq)

Data-Efficient Multimodal Fusion on a Single GPU

The goal of multimodal alignment is to learn a single latent space that is shared between multimodal inputs. The most powerful models in this space have been trained using massive datasets of paired inputs and large-scale

Brian Roemmele (@BrianRoemmele)

Testing this today… Meet OSWorld, a first-of-its-kind scalable, real computer environment for multimodal agents, supporting task setup, execution-based evaluation, and interactive learning across operating systems. It can serve as a unified environment for evaluating open-ended
