Transcriptions

Note: this content has been automatically generated.
00:00:00
uh_huh
00:00:02
okay huh
00:00:05
Yeah, so this is the talk on the Natural Language Understanding group.
00:00:10
Natural language is a crucial problem for AI. Virtually any thought we can have
00:00:17
we can turn into a sound stream, communicate to someone else, and they can have the same thought.
00:00:22
So this isomorphism between language and thought means that studying
00:00:28
language is in some sense studying the nature of thought.
00:00:32
And one thing we know about language is that, even though the sound stream is
00:00:36
a flat sequence, we go to a lot of trouble to try and create
00:00:40
structured representations, so clearly our thoughts are structured in some way.
00:00:46
What is the nature of that structure? How can we
00:00:51
build a model that, given such a structured representation, would generate a sequence of sounds or a sequence
00:00:58
of characters? And, given the sequence of characters, how can we
00:01:04
infer such a structured thought? These are fundamental problems in AI.
00:01:10
So in the Natural Language Understanding group we're taking a deep learning approach, so
00:01:16
we do deep neural networks and representation learning in pretty much all our work.
00:01:22
We apply that to machine translation, information
00:01:26
access and document classification tasks,
00:01:30
and a new line of work on a fundamental problem
00:01:36
in semantics called entailment; I'll talk about that.
00:01:41
So the group currently is myself, two postdocs and one PhD
00:01:45
student. We have three more PhD students who are starting
00:01:50
in the next few months, and then we have one PhD student who is graduating this month.
00:01:56
We collaborate a lot with Andrei Popescu-Belis, who until last September,
00:02:02
when I arrived, was the head of the NLP group.
00:02:06
So I inherited some of these projects, and some
00:02:11
of this work is actually based on what he started.
00:02:15
We collaborate a lot with the speech recognition group. We also collaborate
00:02:20
with people in Geneva and people in Grenoble, at my former employer.
00:02:30
So we're funded by a number of projects: EU projects,
00:02:35
a US project on machine translation and information access,
00:02:41
and we have two new Swiss National
00:02:46
Science Foundation projects on this semantic entailment problem.
00:02:53
So representation learning cuts across all our projects.
00:02:58
This is basically deep learning: you want to infer some deep hidden representation from some
00:03:05
observations. The kinds of things we study in that context are
00:03:11
attention-based representations. For example, here we have
00:03:15
multiple sentences from a document,
00:03:19
and we use a hierarchical attention model. First you decide, shown in red,
00:03:26
which sentence to pay attention to; then within the
00:03:29
sentence you decide which words to pay attention to.
00:03:33
This is actually an example from when you're trying to
00:03:36
translate a sentence, conditioning on the previous sentences in the same document.
00:03:41
We'll see similar models for word sense disambiguation
00:03:46
and in the production of the output
00:03:49
words when generating output sentences in machine translation.
00:03:55
Another thing is output representation learning. Typically in machine learning
00:04:00
your output classes are just a priori independent atomic symbols.
00:04:07
We often have scenarios where we have a very large number of output classes
00:04:13
but we have a lot of information about those classes, like the word that's used to label them
00:04:18
or some description. We want to learn the similarity space of output classes so we can
00:04:23
generalise better to new outputs, maybe outputs we've never even seen.
00:04:31
And then the new projects
00:04:36
on semantic entailment are based on previous work
00:04:39
I did on entailment vectors. This is an alternative to
00:04:44
normal word-embedding-style vector spaces where, instead of representing
00:04:51
similarity, we're representing entailment. Also in that context, how do you
00:04:59
have a representation where the
00:05:02
number of parameters in the representation grows with the complexity
00:05:07
of the thing you want to represent? Because in our context sentences can be very
00:05:12
long or very short, and you want different-size representations for those.
00:05:20
So the first set of projects I'll talk about are on machine translation. Here we're using
00:05:25
neural machine translation. In this illustration we're given
00:05:29
an input sentence [unclear],
00:05:33
and we want to first embed that in some representation
00:05:38
and then condition on that representation to generate our output
00:05:44
translation. Here we generate one word, then condition
00:05:48
on that word to generate the next one, and so on.
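The generate-one-word-then-condition-on-it loop described here can be sketched as greedy decoding. This is only a toy illustration: the vocabulary, the score table and the function names are made up for the example, and the stand-in "model" ignores the source representation that a real translation system would condition on.

```python
import math

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy vocabulary and next-word scores keyed by the previous word.
VOCAB = ["<s>", "the", "cat", "sleeps", "</s>"]
TOY_SCORES = {
    "<s>":    [0.0, 3.0, 1.0, 0.0, 0.0],
    "the":    [0.0, 0.0, 3.0, 1.0, 0.0],
    "cat":    [0.0, 0.0, 0.0, 3.0, 1.0],
    "sleeps": [0.0, 0.0, 0.0, 0.0, 3.0],
}

def next_word_distribution(source_repr, prev_word):
    # A real model would combine source_repr with the decoder state;
    # this stand-in only looks at the previous output word.
    return softmax(TOY_SCORES[prev_word])

def greedy_decode(source_repr, max_len=10):
    """Generate one word at a time, feeding each word back in as context."""
    output = []
    prev = "<s>"
    for _ in range(max_len):
        probs = next_word_distribution(source_repr, prev)
        best = max(range(len(VOCAB)), key=lambda i: probs[i])
        prev = VOCAB[best]
        if prev == "</s>":
            break
        output.append(prev)
    return output

print(greedy_decode(source_repr=None))
```

Each step picks the most probable next word; practical systems usually keep several candidates (beam search) rather than only the best one.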
00:05:53
So there are three projects on that. The first one looks at this
00:05:57
problem of generating the output sentence, called decoding.
00:06:02
Typically it is just treated as a flat sequence, but that has a strong bias:
00:06:08
it pays much more attention to the recent words and forgets words that it
00:06:14
output a while ago. So we add an attention-shifting mechanism, which
00:06:19
goes back and decides which of those previous words to pay attention to.
00:06:25
The interesting thing about this is that when you train the attention model
00:06:30
you get a pattern of attention that actually reflects some syntax-like structure.
00:06:39
um
00:06:41
As in the earlier example, we also look at
00:06:45
document-level neural machine translation.
00:06:49
This is a paper that will be presented at EMNLP in a couple of months,
00:06:57
where we're conditioning on the previous sentences, here by first
00:07:02
computing an attention over the words of each sentence, computing a representation from that attention,
00:07:08
and then computing another attention representation over the sentences,
00:07:14
given those representations, to compute another representation of the entire history,
00:07:19
and then conditioning on that to predict the next word of the sentence. You do this for every word.
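The two-level attention just described (words within each previous sentence, then the sentences themselves) can be sketched roughly as follows. This is a minimal illustration using plain dot-product attention with one shared query; the function names are invented for the example, and the model in the talk presumably uses learned projections that are not shown here.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(query, vectors):
    """Dot-product attention: return (weights, weighted average of vectors)."""
    weights = softmax([dot(query, v) for v in vectors])
    dim = len(vectors[0])
    summary = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return weights, summary

def hierarchical_attention(query, sentences):
    """sentences: list of previous sentences, each a list of word vectors.

    Level 1 summarises each sentence by attending over its words.
    Level 2 attends over those sentence summaries to summarise the history.
    """
    sentence_summaries = [attend(query, words)[1] for words in sentences]
    sentence_weights, history = attend(query, sentence_summaries)
    return sentence_weights, history

# Toy usage: two previous sentences of two 2-dimensional word vectors each.
query = [1.0, 0.0]
previous = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, 0.0], [0.0, -1.0]],
]
weights, history = hierarchical_attention(query, previous)
print(weights, history)
```

The resulting history vector is what the decoder would condition on when predicting the next word, and it is recomputed for every word.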
00:07:27
um
00:07:29
This helps: conditioning on the translations of
00:07:34
previous sentences helps us with disambiguating and, in general, translating better.
00:07:43
And we find that this kind of conditioning helps both when you're
00:07:47
encoding the source sentence and when you're predicting the target.
00:07:55
um
00:07:56
One of the problems with machine translation is that at every point you're trying to predict the next
00:08:01
word, and there are a lot of words, tens of thousands of words, so
00:08:07
computing your probability estimates over such a big space can be complicated.
00:08:12
So we looked at how to share
00:08:16
parameters between the words that we're predicting and
00:08:20
the words that we're conditioning on. When we condition on the words we've already translated
00:08:25
and we want to predict the next word, those two are from the
00:08:30
same space of words. So one idea is to just say they
00:08:33
have exactly the same parameters; another idea is to say they're just totally
00:08:37
different problems. Both of those turn out to be too extreme.
00:08:41
We devised a more flexible approach where some of the parameters are
00:08:45
shared and some of them are learned, so that
00:08:49
it can learn the amount of sharing that should
00:08:53
happen between input and output, and that performs better.
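One simple way to picture a middle ground between fully tied and fully separate input and output word parameters is a per-dimension gate. This is an illustrative assumption, not the method from the talk: the gate, the `free_vec` parameters and the function names are all hypothetical.

```python
def output_embedding(input_vec, free_vec, gate):
    """Mix a word's input embedding with output-only parameters.

    gate[i] = 1.0 reuses the input embedding in dimension i (full sharing),
    gate[i] = 0.0 uses the separate output parameter (no sharing).
    In a trained model the gate values would themselves be learned.
    """
    return [g * s + (1.0 - g) * f for s, f, g in zip(input_vec, free_vec, gate)]

def next_word_score(hidden_state, input_vec, free_vec, gate):
    """Score a candidate next word against the decoder's hidden state."""
    out = output_embedding(input_vec, free_vec, gate)
    return sum(h * o for h, o in zip(hidden_state, out))

shared = [1.0, 2.0]   # input embedding of some word
free = [3.0, 4.0]     # hypothetical output-only parameters for that word
print(output_embedding(shared, free, [1.0, 1.0]))  # fully tied
print(output_embedding(shared, free, [0.0, 0.0]))  # fully separate
print(output_embedding(shared, free, [0.5, 0.5]))  # partial sharing
```

With the gate at either extreme this reduces to the two baseline ideas mentioned above; values in between let training decide how much sharing to use.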
00:09:00
So our next set of projects are on
00:09:04
indexing and classification of documents. In
00:09:07
general you have lots of documents that are in some space,
00:09:12
for example of what words are in the document, or some semantic space, and
00:09:18
you want to label them or identify which ones are relevant to a common problem.
00:09:25
So here the green ones are business documents, and the yellow, and so on.
00:09:33
So the first project takes the same idea that
00:09:37
we just saw in neural machine translation but
00:09:41
applies it to document classification for medical documents.
00:09:46
Each document can have more than one label, but there are a huge
00:09:50
number of possible labels, many of which you've never seen in the training data.
00:09:56
So
00:10:01
we want to take advantage of the fact that we have knowledge about
00:10:04
these output classes: there's a word that labels them, and there's often a
00:10:10
description that tells you what the definition of that class is. We
00:10:14
can infer some semantic representation of the output class
00:10:18
and use that to generalise better, even to generalise to classes that
00:10:23
didn't occur at all in the training data, or occurred rarely.
00:10:27
But in this particular case we actually managed to improve even on
00:10:32
the classes where we do have lots of data, which typically doesn't happen.
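The idea of scoring a class through a representation of its description, rather than treating the class as an atomic symbol, can be sketched like this. It is a toy illustration only: the word vectors, label names and averaging encoder are assumptions made for the example, whereas a real system would learn the encoders.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def embed_text(words, word_vectors):
    """Average the vectors of the known words (a stand-in for a learned encoder)."""
    dim = len(next(iter(word_vectors.values())))
    known = [word_vectors[w] for w in words if w in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def score_labels(doc_words, label_descriptions, word_vectors):
    """Score every label by comparing the document with the label's description.

    A label with no training documents can still be scored, because its
    representation comes from its description rather than from a per-label weight.
    """
    doc = embed_text(doc_words, word_vectors)
    return {
        label: dot(doc, embed_text(description, word_vectors))
        for label, description in label_descriptions.items()
    }

# Hypothetical toy data.
WORD_VECTORS = {
    "heart": [1.0, 0.0], "cardiac": [1.0, 0.0],
    "lung": [0.0, 1.0], "pulmonary": [0.0, 1.0],
    "disease": [0.5, 0.5],
}
LABELS = {
    "cardiology": ["cardiac", "disease"],
    "pulmonology": ["pulmonary", "disease"],
}
print(score_labels(["heart", "disease"], LABELS, WORD_VECTORS))
```

For multi-label classification each score would typically go through its own sigmoid and threshold, so a document can receive several labels.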
00:10:42
The next one is multilingual language modelling. This is work with the speech recognition group.
00:10:48
One of their problems is that they need to predict
00:10:54
what the likely next words are, what you're probably going to say next.
00:10:59
That's fine when you're working in English, but when you're working in Swahili
00:11:03
it's very hard, because you don't have a lot of data about what those languages are like.
00:11:09
So we approach this problem by training a multilingual language model
00:11:15
where the target language leverages data from
00:11:19
some large-resource source language. You
00:11:23
do that with cross-lingual word embeddings: you come up with
00:11:27
representations of the target words that are in the same space as the source words.
00:11:32
Then you share parameters between the two neural language models. We share the lower
00:11:38
levels, which are closer to where
00:11:42
the embeddings are. The higher levels, which are closer to
00:11:45
the distinct output languages, we train separately for each one.
00:11:51
And so we get improvements there.
00:11:55
So the last area I want to talk about is the semantic entailment area.
00:12:01
Entailment, in particular textual entailment, is the
00:12:07
problem of information inclusion. For example, take "public health
00:12:11
care is less expensive":
00:12:17
the information in that utterance is included in both of these utterances, and in some sense
00:12:22
you're abstracting away from some of the details here to get this.
00:12:27
So you can think about this as an abstract label, and you can use it to cluster: say all
00:12:32
these things are similar in that they all agree on this, they all have this one consensus point.
00:12:38
So you can use it for opinion summarisation, which is our target application.
00:12:44
And this is just a fundamental problem in the theory of natural language semantics.
00:12:53
So the work we published a couple of years ago, and are continuing today,
00:12:58
looks at
00:13:03
how you have vector space representations that are useful for modelling this
00:13:07
notion of entailment. In particular we're looking at lexical entailment, so
00:13:12
for example how do we know that cats are examples of animals,
00:13:17
so anything that's true of an animal should also be true of a cat.
00:13:23
So we devised vector space models that, instead
00:13:27
of representing similarity, represent this information inclusion
00:13:32
by having each bit represent whether you know something or whether it's unknown:
00:13:37
zero if it's unknown, one if it's known, and
00:13:41
a real number represents some probability that it's known.
00:13:45
So then entailment is just a subset relationship between bits, and then we
00:13:49
can derive some variational Bayesian approximations so that you can use continuous numbers.
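The bit-vector view of entailment can be made concrete with a small sketch. The hard version checks the subset relationship directly. The soft version assumes, purely for illustration, that the bits are independent, which is a simpler stand-in for the variational approximations mentioned in the talk; the function names are invented for the example.

```python
def entails_bits(y, x):
    """y entails x if every bit that is known (1) in x is also known in y."""
    return all(yk >= xk for yk, xk in zip(y, x))

def entailment_probability(p_y, p_x):
    """Probability that y entails x, given per-bit probabilities of being known.

    Assumes independent bits: entailment fails in a dimension only when the
    bit is known in x but unknown in y, which has probability p_x * (1 - p_y).
    """
    prob = 1.0
    for py, px in zip(p_y, p_x):
        prob *= 1.0 - px * (1.0 - py)
    return prob

# Everything known about "animal" is also known about "cat", but not the reverse.
cat = [1, 1, 1, 0]
animal = [1, 1, 0, 0]
print(entails_bits(cat, animal))   # cat entails animal
print(entails_bits(animal, cat))   # animal does not entail cat
print(entailment_probability([0.9, 0.8], [0.5, 0.1]))
```

With hard 0/1 values the probability version agrees with the subset test, and with values in between it gives a graded measure of how far entailment holds.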
00:13:55
And we do things like infer what is
00:13:58
the most likely vector, the optimal vector of numbers
00:14:03
here, given that we know that Felix is a cat and cats are animals,
00:14:09
or measure to what extent entailment holds.
00:14:15
And then we apply this framework to computing word embeddings.
00:14:23
So the idea is that if two words occur
00:14:28
close together in a sentence, then they're
00:14:31
probably unified as part of a bigger semantics, so we should be able to unify them.
00:14:37
And by training word embeddings in this way we can
00:14:40
predict these cat-animal hyponymy relationships.
00:14:49
OK, so that's a sample of the kinds of things we do in the Natural Language Understanding group.
00:14:57
The summary is that we focus on deep learning architectures and their
00:15:02
application to NLP problems, in particular machine translation and information access.

Conference program

Introduction by Hervé Bourlard
BOURLARD, Hervé, Idiap Director, EPFL Full Professor
29 Aug. 2018 · 9:03 a.m.
Presentation of the «Speech & Audio Processing» research group
MAGIMAI DOSS, Mathew, Idiap Senior Researcher
29 Aug. 2018 · 9:22 a.m.
Presentation of the «Robot Learning & Interaction» research group
CALINON, Sylvain, Idiap Senior Researcher
29 Aug. 2018 · 9:43 a.m.
Presentation of the «Machine Learning» research group
FLEURET, François, Idiap Senior Researcher, EPFL Maître d'enseignement et de recherche
29 Aug. 2018 · 10:04 a.m.
Presentation of the «Uncertainty Quantification and Optimal Design» research group
GINSBOURGER, David, Idiap Senior Researcher, Bern Titular Professor
29 Aug. 2018 · 11:05 a.m.
Presentation of the «Perception and Activity Understanding» research group
ODOBEZ, Jean-Marc, Idiap Senior Researcher, EPFL Maître d'enseignement et de recherche
29 Aug. 2018 · 11:24 a.m.
Presentation of the «Computational Bioimaging» research group
LIEBLING, Michael, Idiap Senior Researcher, UC Santa Barbara Adjunct Professor
29 Aug. 2018 · 11:45 a.m.
Presentation of the «Natural Language Understanding» research group
HENDERSON, James, Idiap Senior Researcher
29 Aug. 2018 · 2:03 p.m.
Presentation of the «Biometrics Security and Privacy» research group
MARCEL, Sébastien, Idiap Senior Researcher
29 Aug. 2018 · 2:19 p.m.
Presentation of the «Biosignal Processing» research group
RABELLO DOS ANJOS, André, Idiap Researcher
29 Aug. 2018 · 2:43 p.m.
Presentation of the «Social Computing» research group
GATICA-PEREZ, Daniel, Idiap Senior Researcher, EPFL Adjunct Professor
29 Aug. 2018 · 2:59 p.m.
