DSALT: Distributional Semantics and Linguistic Theory

Gemma Boleda and Denis Paperno

  • Workshop
  • Week: 1
  • Time: 17:00 – 18:30
  • Room: D1.02
  • 15-19 August 2016, Bolzano, Italy

(Please also see the related Composes workshop, co-located with ESSLLI, to be held the day before DSALT starts)

The DSALT workshop seeks to foster discussion at the intersection of distributional semantics and various subfields of theoretical linguistics, with the goal of boosting the impact of distributional semantics on linguistic research beyond lexical semantic phenomena, as well as broadening the empirical basis and theoretical tools used in linguistics. The contributions explore the theoretical interpretation of distributional vector spaces and their application to theoretical morphology, syntax, semantics, and pragmatics.

Program

Monday, August 15

17:00-17:45 Invited talk, Jason Weston (Facebook). Memory Networks for Language Understanding. (slides, tutorial) Abstract: There has been a recent resurgence of interest in combining reasoning, attention and memory for solving tasks, particularly in the research area of machine learning applied to language understanding. I will focus on one of my own group’s contributions, memory networks, an architecture that we have applied to question answering, language modeling and general dialog. As we try to move towards the goal of true language understanding, I will also discuss recent datasets and tests that have been built to assess these models’ abilities and to see how far we have come (hint: there’s still a long way to go!).

17:50-18:10 Jerry R. Hobbs and Jonathan Gordon. Distribution and Inference (slides).

18:10-18:30 William Hamilton, Jure Leskovec, Dan Jurafsky. Distributional approaches to diachronic semantics (slides).

Tuesday, August 16

17:00-17:45 Invited talk, Katrin Erk (University of Texas at Austin). The probabilistic samowar: an attempt at explaining how people can learn from distributional data (slides). Abstract: There is evidence that people can learn the meaning of words from observing them in text. But how would that work; in particular, how would such learning connect a word with the entities in the world that it denotes? In this talk I discuss two proposals for how humans could learn from distributional data, both of which share a number of core assumptions. Both assume that the information distributional data can contribute is property information: words that appear in similar contexts (for suitable definitions of “context”) denote entities with similar properties. Distributional data is noisy and probabilistic; for that reason, both approaches assume that an agent has a probabilistic information state, a probability distribution over worlds that could be the actual world, which can be influenced by textual context data.

17:50-18:30 Poster session 1.

Wednesday, August 17

17:00-17:45 Invited talk, Alessandro Lenci (University of Pisa). Distributional Models of Sentence Comprehension. (slides) Abstract: In this talk I will discuss the modelling of phenomena related to sentence comprehension in a distributional semantic framework. In particular, I will focus on how linguistic and neurocognitive evidence about human sentence processing can be integrated into distributional semantic models to tackle the challenges of compositional and incremental online construction of sentence representations.

17:50-18:10 Gabriella Lapesa, Max Kisselew, Sebastian Padó, Tilmann Pross, Antje Roßdeutscher. Characterizing the pragmatic component of distributional vectors in terms of polarity: Experiments on German über verbs (slides).

18:10-18:30 Enrico Santus, Alessandro Lenci, Qin Lu, Chu-Ren Huang. Squeezing Semantics out of Contexts: Automatic Identification of Semantic Relations in DSMs (slides).

Thursday, August 18

17:00-17:45 Invited talk, Aurélie Herbelot (University of Trento). Where do models come from? (slides) Abstract: When contrasted with formal semantics, distributional semantics is usually described as a natural (and very successful) way to simulate human analogical processes. There is, however, no essential reason to believe that formal semantics should be unable to do similarity. In this talk, I will propose that a) the inability of formal semantics to model relatedness is linked to a notion of ‘model sparsity’, and b) the strength of distributional semantics lies not so much in similarity as in having a cognitively sound basis, which potentially enables us to answer the question ‘Where do models come from?’ I will give an overview of some experimental results which support the idea that a rich model of the world can be acquired from distributional data via soft inference processes.

17:50-18:30 Poster session 2.

Friday, August 19

17:00-17:45 Invited talk, Marco Baroni (University of Trento). Living a discrete life in a continuous world (slides). Abstract: Natural language understanding requires reasoning about sets of discrete discourse entities, updating their status and adding new ones as the discourse unfolds. This fundamental characteristic of linguistic semantics makes it difficult to handle with fully trainable end-to-end architectures, which are not able to learn discrete operations. Inspired by recent proposals such as Stack-RNN (Joulin and Mikolov, 2015) and Memory Networks (Sukhbaatar et al., 2015), where a neural network learns to control a discrete memory through a continuous interface, we introduce a model that learns to create and update discourse referents, represented by distributed vectors, by being trained end-to-end on a reference resolution task. Preliminary results suggest that our approach is viable. (Work in collaboration with Gemma Boleda and Sebastian Padó.)

17:50-18:10 Kristina Gulordava. Measuring distributional semantic effects in syntactic variation (slides).

18:10-18:30 Discussion and wrap-up.

POSTER SESSION 1

Gabor Borbely, Andras Kornai, Marcus Kracht, David Nemeskey. Denoising composition in distributional semantics.
Guy Emerson. Compositional semantics in a probabilistic framework.
Anna Gladkova and Aleksandr Drozd. King – man + woman = queen: the linguistics of “linguistic regularities”.
Dimitri Kartsaklis, Matthew Purver, Mehrnoosh Sadrzadeh. Verb Phrase Ellipsis using Frobenius Algebras in Categorical Compositional Distributional Semantics.
Reinhard Muskens and Mehrnoosh Sadrzadeh. Lambdas and Vectors.
Rossella Varvara, Gabriella Lapesa, Sebastian Padó. Quantifying regularity in morphological processes: An ongoing study on nominalization in German.
Ramon Ziai, Kordula De Kuthy, Detmar Meurers. Approximating Schwarzschild’s Givenness with Distributional Semantics.

POSTER SESSION 2

Aki-Juhani Kyröläinen, M. Juhani Luotolahti, Kai Hakala, Filip Ginter. Modeling cloze probabilities and selectional preferences with neural networks.
Alexander Kuhnle. Investigating the effect of controlled context choice in distributional semantics.
Andrey Kutuzov. Redefining part-of-speech classes with distributional semantic models.
Edoardo Maria Ponti, Elisabetta Jezek, Bernardo Magnini. Grounding the Lexical Sets of Anti-Causative Pairs on a Vector Model.
Pascual Martínez-Gomez, Koji Mineshima, Yusuke Miyao, Daisuke Bekki. Integrating Distributional Similarity as an Abduction Mechanism in Recognizing Textual Entailment.
Michael Repplinger. A Systematic Evaluation of Current Motivation for Explicit Compositionality in Distributional Semantics.
Marijn Schraagen. Towards a dynamic application of distributional semantics.

Programme Committee

Nicholas Asher, Marco Baroni, Emily Bender, Raffaella Bernardi, Robin Cooper, Ann Copestake, Katrin Erk, Ed Grefenstette, Aurélie Herbelot, Germán Kruszewski, Angeliki Lazaridou, Alessandro Lenci, Marco Marelli, Louise McNally, Sebastian Padó, Barbara Partee, Chris Potts, Laura Rimell, Hinrich Schütze, Mark Steedman, Bonnie Webber, Galit Weidman Sassoon, Roberto Zamparelli.

Funding and Endorsements

With funding from the European Union’s Horizon 2020 research and innovation programme under Marie Skłodowska-Curie grant agreement No 655577 (LOVe), as well as from the 7th Framework Programme ERC grant 283554 (COMPOSES).

Endorsed by SIGLEX and SIGSEM of the ACL.


Workshop Description

The DSALT workshop seeks to foster discussion at the intersection of distributional semantics and various subfields of theoretical linguistics, with the goal of boosting the impact of distributional semantics on linguistic research beyond lexical semantic phenomena, as well as broadening the empirical basis and theoretical tools used in linguistics. We welcome contributions regarding the theoretical interpretation of distributional vector spaces and/or their application to theoretical morphology, syntax, semantics, discourse, dialogue, and any other subfield of linguistics. Potential topics of interest include, among others:

  • distributional semantics and morphology: How do results in the distributional semantics-morphology interface impact theoretical accounts of morphology? Can distributional models account for inflectional morphology? Can they shed light on phenomena like productivity and regularity?
  • distributional semantics and syntax: How can compositionality at the semantic level interact with syntactic structure? Can we go beyond the state of the art in accounting for the syntax-semantics interface when it interacts with lexical semantics? How can distributional accounts of gradable syntactic phenomena, e.g. selectional preferences or argument alternations, be integrated into theoretical linguistic accounts?
  • distributional semantics and formal semantics: How can distributional representations be related to the traditional components of a semantics for natural languages, especially reference and truth? Can distributional models be integrated with discourse- or dialogue-oriented semantic theories like file change semantics or inquisitive semantics?
  • distributional semantics and discourse: Distributional semantics has been shown to model some aspects of discourse coherence at a global level (Landauer and Dumais 1997, a.o.); can it also help with other discourse-related phenomena, such as the choice of discourse particles, nominal and verbal anaphora, or the form of referring expressions as discourse unfolds?
  • distributional semantics and dialogue: Distributional semantics has traditionally been mostly static, in the sense that it creates a semantic representation for a word once and for all. Can it be made dynamic so it can help model, for example, phenomena related to Questions Under Discussion (QUDs) in dialogue? Can distributional representations help predict the relations between utterance units in dialogue?
  • distributional semantics and pragmatics: Distributional semantics is based on the statistics of language use, and should therefore include information related to the pragmatics of language. How do distributional models relate to such aspects of pragmatics as focus, pragmatic presupposition, or conversational implicature?

Submissions

Submissions to DSALT do not need to be anonymous. We solicit two-page (plus references) abstracts in at most 11pt font (no other requirements on format or citation style; you can use the ACL stylesheet if you want, but make sure to set the font size to 11pt). No proceedings will be published, so workshop submissions may discuss published as well as unpublished work, and they can report on finished or ongoing work. The abstract submission deadline is April 12, 2016 (extended). Submissions are accepted by email at dsalt2016 AT gmail.com.

Important Dates

Deadline for abstract submission: April 12, 2016
Author notification: May 15, 2016
Workshop dates: August 15-19, 2016