DeepSPIN (@deep_spin)'s Twitter Profile
DeepSPIN

@deep_spin

Deep structured prediction in NLP. ERC project coordinated by @andre_t_martins. Instituto de Telecomunicações.

ID:1026171704744325120

Link: https://deep-spin.github.io/
Date: 05-08-2018 18:22:54

24 Tweets

362 Followers

76 Following

timorous bestie 😷 (@vnfrombucharest)

Adaptively Sparse Transformers
@emnlp2019 +Gonçalo Correia, André Martins

α-entmax attention
α=1: softmax, α=2: sparsemax, continuous in between.
twist: we learn α for each head, w gradients! Some heads become dense, some sparse.

arxiv.org/abs/1909.00015
github.com/deep-spin/entm…
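
To make the α-entmax description above concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the function name `alpha_entmax` and the bisection solver are assumptions chosen for illustration. It relies on the standard form p_i = [(α-1)·z_i - τ]_+^(1/(α-1)), where the threshold τ is chosen so that the probabilities sum to 1, and it treats α=1 as the softmax limit.

```python
import math


def softmax(z):
    """Dense baseline: every entry gets nonzero probability."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def alpha_entmax(z, alpha, n_iter=60):
    """Illustrative alpha-entmax via bisection on the threshold tau.

    Hypothetical helper, not the deep-spin library API.
    alpha == 1 falls back to softmax; alpha == 2 gives sparsemax.
    """
    if alpha < 1.0:
        raise ValueError("alpha must be >= 1")
    if alpha == 1.0:
        return softmax(z)

    scaled = [(alpha - 1.0) * v for v in z]
    exponent = 1.0 / (alpha - 1.0)
    d = len(z)
    m = max(scaled)

    def probs(tau):
        return [max(s - tau, 0.0) ** exponent for s in scaled]

    # At tau = m - 1 the largest entry alone has mass 1 (sum >= 1);
    # at tau = m - d**-(alpha-1) every entry is at most 1/d (sum <= 1).
    lo, hi = m - 1.0, m - (1.0 / d) ** (alpha - 1.0)
    for _ in range(n_iter):
        mid = (lo + hi) / 2.0
        if sum(probs(mid)) >= 1.0:
            lo = mid  # too much mass: raise the threshold
        else:
            hi = mid  # too little mass: lower the threshold
    p = probs((lo + hi) / 2.0)
    total = sum(p)
    return [v / total for v in p]


scores = [1.0, 0.5, -1.0]
for a in (1.0, 1.5, 2.0):
    print(a, [round(v, 3) for v in alpha_entmax(scores, a)])
```

With these scores, α=1 keeps all three entries nonzero, while α=2 (sparsemax) assigns exactly zero to the lowest score. The paper's twist of learning α per head with gradients is not shown here.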

DeepSPIN (@deep_spin)

A nice write-up of the challenges of lemmatization by DeepSPINner Erick! Multilingual examples reveal different complexities hard to imagine if focusing on English.

Julia Kreutzer (@KreutzerJulia)

CfP for the workshop on structured prediction for NLP at NAACL is out: submit a research/position/overview paper by March 6! structuredprediction.github.io/SPNLP19

DeepSPIN (@deep_spin)

DeepSPIN talks at #emnlp2018!

- Thu, 11:00AM, talk @ Blackbox NLP
Interpretable Structure Induction via Sparse Attention.
Peters/Niculae/Martins.

- Fri, 3:36PM, main conf talk @ ML(3B)
Towards Dynamic Computation Graphs via Sparse Latent Structure.
Niculae/Martins/Cardie.

timorous bestie 😷 (@vnfrombucharest)

Towards Dynamic Computation Graphs via Sparse Latent Structure: #emnlp2018 + André @clairecardie

- marginalize over structured latent vars w/ SparseMAP
- CG a function of discrete structure
- eg latent dependency TreeLSTM

pdf arxiv.org/abs/1809.00653
code github.com/vene/sparsemap…
DeepSPIN (@deep_spin)

'Structure Back in Play, Translation Wants More Context'

DeepSPINner André Martins shares his notes from this year's conferences on the Unbabel R&D blog:

medium.com/unbabel/icml-a…
