St Teresas Primary(@Stteresas_ng8) 's Twitter Profile Photo

Learning all about the amazing life of Saint Francis of Assisi, who inspires us all to take care of our common home!

Anh Nguyen(@anh_ng8) 's Twitter Profile Photo

Based on the zoom insights, we take all existing labeled images that OpenAI's CLIP ViT-L/14 never correctly labels, even given 324 zoom attempts, and form a new benchmark called 🌟 ImageNet-Hard 🌟

7/n
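The selection rule described above can be sketched as a simple filter. Everything here is a hypothetical stand-in: `predict` plays the role of the classifier (CLIP ViT-L/14 in the thread), and `crops` plays the role of the 324-candidate zoom grid; an image joins the hard set only if no crop ever yields the correct label.

```python
def build_hard_set(dataset, predict, crops):
    """Keep only (image, label) pairs that the model never labels
    correctly under any candidate zoom crop (324 in the paper)."""
    hard = []
    for image, label in dataset:
        # If even one zoomed view is classified correctly, the image
        # is not "hard" in this sense.
        if not any(predict(crop(image)) == label for crop in crops):
            hard.append((image, label))
    return hard
```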

Anh Nguyen(@anh_ng8) 's Twitter Profile Photo

For example, zoom-in helps models correctly classify (junco) and (magpie).
Zoom-out also helps, but mostly on abstract images (e.g., ImageNet-Sketch) (hornbill).

3/n

みごん(@mgn_ng8) 's Twitter Profile Photo

Actually, I found it a while back too and was so 😡😡 about it!!
I doubt anyone would buy it, but even if you come across it, please don't buy it... 🥲🥲

Anh Nguyen(@anh_ng8) 's Twitter Profile Photo

Zoom definition: first, uniformly resize, and then crop.

Surprisingly, even 11-year-old AlexNet is good enough to classify > 90% of ImageNet given an *ideal* zoomed crop.

This upper-bound accuracy increases with more recent models (AlexNet < VGG < ResNet < ViT).

2/n
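The zoom operation defined above (uniform resize, then crop) can be sketched as follows. This is a minimal illustration, not the authors' code: `scale`, the crop anchor, and the 224-px output size are assumed parameters, and the clamping of the crop window is my own detail.

```python
from PIL import Image

def zoom(img: Image.Image, scale: float, left: int, top: int,
         out_size: int = 224) -> Image.Image:
    """Zoom = uniformly resize the image by `scale`, then cut a
    fixed-size crop starting at (left, top) in the resized image."""
    w, h = img.size
    resized = img.resize((int(w * scale), int(h * scale)))
    # Clamp the crop window so it stays inside the resized image.
    left = max(0, min(left, resized.width - out_size))
    top = max(0, min(top, resized.height - out_size))
    return resized.crop((left, top, left + out_size, top + out_size))
```

Scanning many (scale, left, top) combinations per image is what yields the grid of zoom attempts the thread refers to.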

Anh Nguyen(@anh_ng8) 's Twitter Profile Photo

Integrating a zoom-in ⨁ process into MEMO (Zhang, Chelsea Finn, Sergey Levine, 2022), a state-of-the-art test-time augmentation method, can increase prediction accuracy even further.

6/n
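For context, MEMO's objective is the entropy of the prediction averaged ("marginalized") over augmented views of the test image; the model is adapted by a gradient step that lowers it. A schematic of just that objective, with zoomed crops as the augmentations, might look like the sketch below; this is not the authors' implementation, only the quantity being minimized.

```python
import numpy as np

def marginal_entropy(probs: np.ndarray) -> float:
    """MEMO-style objective: entropy of the mean class distribution
    across augmented views (rows of `probs`). Lower means the model
    is confident and consistent across the zoomed crops."""
    p = np.mean(probs, axis=0)      # marginal over augmentations
    p = np.clip(p, 1e-12, 1.0)      # guard log(0)
    return float(-np.sum(p * np.log(p)))
```

In the zoom-augmented variant described in the tweet, each row of `probs` would come from classifying a different zoomed crop of the same image.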

Anh Nguyen(@anh_ng8) 's Twitter Profile Photo

By center-zooming to each of the 9 anchor points on every image, we show that ImageNet-A and ObjectNet have a strong center bias ➡ model accuracy increases dramatically (e.g. +15% on IN-A) by simply center-cropping input images.

5/n
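The nine anchor points can be sketched as a 3×3 grid of crop positions: corners, edge midpoints, and the center. The grid layout is my assumption about what "9 anchor points" means here, and the 224-px crop size is an assumed default.

```python
def anchor_crops(width: int, height: int, crop: int = 224):
    """Return (left, top) for nine anchor crops on a 3x3 grid:
    four corners, four edge midpoints, and the center."""
    xs = [0, (width - crop) // 2, width - crop]
    ys = [0, (height - crop) // 2, height - crop]
    return [(x, y) for y in ys for x in xs]
```

The center-bias finding then amounts to the middle entry of this grid giving the best accuracy on ImageNet-A and ObjectNet.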

Anh Nguyen(@anh_ng8) 's Twitter Profile Photo

There are ~200 images (samples below) in ImageNet that are *never* correctly classified by any models even if each model is given 324 zoom attempts ❗️

Looking at this *unclassifiable* set of images prompts the question: 'Are we done with ImageNet-scale OOD benchmarks?'

4/n

Anh Nguyen(@anh_ng8) 's Twitter Profile Photo

How to score > 90% on ImageNet?

Our new study on the spatial biases of ImageNet and related ImageNet-scale OOD benchmarks reveals that all common image classifiers tested can score > 90% if the model looks at the correct crop, i.e.,

⭐️ Zoom 🔎 is all you need! ⭐️

1/n

Anh Nguyen(@anh_ng8) 's Twitter Profile Photo

ImageNet-Hard is a new benchmark that challenges state-of-the-art vision-language models and ImageNet-trained classifiers.

Work led by the amazingly talented taesiri!!! With Giang Nguyen, Sarra Habchi, and Cor-Paul Bezemer.

Code, paper & dataset: taesiri.github.io/ZoomIsAllYouNe…

n/n

Anh Nguyen(@anh_ng8) 's Twitter Profile Photo

On ImageNet-Hard, all 224x224-input models perform below 19%.

Higher-resolution classifiers tend to perform better, with the best SOTA result of 39% from EfficientNet-L2 at 800-px input.
Upscaling the input images with GigaGAN does not improve model accuracy.

8/n
