Transcriptions

Note: this content has been automatically generated.
00:00:00
Hello everyone. My name is [unclear] and I am a student working on
00:00:06
speech recognition with [unclear], so I will be fast.
00:00:13
I think you have learned a lot over these four days about Kaldi and about
00:00:20
neural network architectures, like recurrent
00:00:24
neural networks or convolutional neural networks.
00:00:28
What I will try to present is practical, with some tricks
00:00:34
for when you use PyTorch, to show how we can
00:00:39
run and optimise some of these [unclear].
00:00:45
So, about PyTorch. As they said,
00:00:51
the difference between PyTorch and NumPy is this:
00:00:55
NumPy uses an n-dimensional array, and a tensor is an n-dimensional data structure that
00:01:01
also has autograd, automatic differentiation. So when we do an operation and
00:01:07
then say okay, I want to do backprop, it is easy to use, it is
00:01:12
very easy to use. In PyTorch we have
00:01:16
some libraries. In nn, the neural network library, you have all the
00:01:21
implemented architectures. In nn we see, for example,
00:01:26
nn.Linear, LSTMs, or a recurrent neural network.
00:01:32
We have a package called optim, where we have all the optimisers, such as SGD.
00:01:42
Ah, okay, yeah.
00:01:46
Okay, I didn't see that. And we have some
00:01:55
functions, the nonlinear functions like softmax,
00:01:59
ReLU, et cetera.
00:02:03
So we have all of this stuff,
00:02:07
and you can implement the
00:02:11
neural networks you want with these three packages.
00:02:16
For example, we can check whether a GPU
00:02:20
is available and, if not, use the CPU directly as the device.
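A minimal sketch of the device check described here, assuming PyTorch is installed. The tensor `x` is only an illustration and is not from the talk.

```python
import torch

# Pick the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tensors (and models) are moved to the chosen device with .to(device).
x = torch.zeros(2, 3).to(device)
print(device, x.device)
```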
00:02:27
For example, we have this kind of
00:02:31
format to ask how I can use
00:02:35
a function. PyTorch is good in that respect.
00:02:39
So to see how we can call a function, you can ask
00:02:44
PyTorch like this. For example, for the
00:02:48
linear function, you can call up this example.
00:02:55
You just specify: okay, I want
00:03:00
twenty input features and I want five prediction classes, or thirty.
00:03:07
Then we generate and prepare our data,
00:03:11
for example one hundred and eighty examples with twenty features each, and
00:03:18
say okay, I want to pass this data through the neural network,
00:03:24
and the output is calculated automatically.
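The linear-layer example can be sketched as follows. The sizes follow the numbers in the transcript (twenty input features, five classes, one hundred and eighty examples); since the transcript is automatic, the exact figures on the slide may have differed.

```python
import torch
import torch.nn as nn

# A fully connected layer: 20 input features -> 5 output classes.
layer = nn.Linear(in_features=20, out_features=5)

# Random stand-in data: 180 examples with 20 features each.
data = torch.randn(180, 20)

# Calling the layer runs the forward pass; the batch dimension is kept.
output = layer(data)
print(output.shape)

# help(nn.Linear) prints the documentation, including a usage example.
```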
00:03:31
You can also ask how to use the optimiser, for example with Adam:
00:03:36
how do we set beta one and beta two for Adam?
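As a small illustration of the question above, the two beta coefficients of Adam are passed as a tuple. The model and the values (PyTorch's defaults) are assumptions for the sketch, not taken from the talk.

```python
import torch
import torch.nn as nn

model = nn.Linear(20, 5)  # hypothetical model, only to have parameters

# betas=(beta1, beta2) are the decay rates of the running averages
# of the gradient and of its square.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
print(optimizer.defaults["betas"])
```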
00:03:41
Those are some nice examples. We also need to know,
00:03:45
when working with this kind of framework,
00:03:49
how I should prepare my data. Some of you
00:03:54
are using [unclear], some of you are using Kaldi,
00:03:58
for example, and it is interesting to say: I want to
00:04:02
use an x-vector or an i-vector from Kaldi, or
00:04:06
fbanks extracted from Kaldi, and use a more complex
00:04:10
neural network with PyTorch. How can we do that?
00:04:14
It is important to understand how I should adapt my
00:04:19
data to use this kind of library.
00:04:27
For example, the most common usage
00:04:32
in PyTorch is to put, in the first dimension,
00:04:37
the batch, that is, the examples, and
00:04:42
in the other dimensions we can
00:04:46
define what we need. For example, in the second dimension we can put the features.
00:04:52
So what is the difference between TensorFlow or Keras,
00:04:56
say Keras with a backend,
00:05:00
and PyTorch? In PyTorch, for sequence modelling, we
00:05:05
use the time dimension first, then batch, then the features,
00:05:11
and not batch, then time, then features, as in
00:05:15
TensorFlow. It will be easy; I will show
00:05:18
an example later. It will be easier to use with this kind of
00:05:25
option. With the F library we also have lots of nonlinear functions.
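The time-first layout can be sketched like this, assuming hypothetical sizes (150 frames, batch of 32, 40 features, 64 hidden units). PyTorch's recurrent layers expect (time, batch, features) unless `batch_first=True` is passed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

time_steps, batch, features = 150, 32, 40
x = torch.randn(time_steps, batch, features)  # (time, batch, features)

# Default layout of recurrent layers: time first, then batch.
rnn = nn.RNN(input_size=features, hidden_size=64)
out, h_n = rnn(x)  # out: (time, batch, hidden), h_n: (1, batch, hidden)

# The batch-first layout is available as an option.
rnn_bf = nn.RNN(input_size=features, hidden_size=64, batch_first=True)
out_bf, _ = rnn_bf(x.transpose(0, 1))  # input is (batch, time, features)

# torch.nn.functional (usually imported as F) holds the nonlinear functions.
probs = F.softmax(out, dim=-1)
```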
00:05:32
So, suppose we have, for example, features prepared in Kaldi
00:05:39
and we want to load these features from Kaldi.
00:05:44
PyTorch proposes what we call the Dataset object and DataLoaders.
00:05:49
This Dataset object is pretty
00:05:54
simple to use. We just create a class
00:05:58
that inherits from the Dataset
00:06:03
object coded by the PyTorch developers,
00:06:07
and define three functions: __init__, __getitem__
00:06:11
and __len__. With these three functions
00:06:15
I can read the features. For example, if I want to read features from Kaldi,
00:06:22
I will say okay, I define
00:06:27
the feature parameters and the target parameters.
00:06:32
Then, when I want to ask for some example,
00:06:36
I define in __getitem__
00:06:40
my dictionary to get the data.
00:06:45
And the __len__ function is helpful for the
00:06:50
DataLoader, because when we
00:06:53
need to run our neural network on a
00:06:56
GPU, we usually have limited memory, so we need
00:07:01
to split our database into batches, and it is important to have
00:07:07
this information to split our data into batches.
00:07:12
And this is the example of how we can call the DataLoader directly.
00:07:19
We will see this after knowing how we can process the data
00:07:24
with Kaldi. In this example we can
00:07:31
call the DataLoader on the dataset and tell it that the batch size is
00:07:37
thirty-two and that it should shuffle the examples, so we can pass this kind of extra information.
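A minimal sketch of such a Dataset and DataLoader. The class name `FeatureDataset` and the random in-memory arrays are hypothetical stand-ins; in the talk the features come from Kaldi files, which is not reproduced here.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class FeatureDataset(Dataset):
    """Hypothetical dataset holding features and targets in memory."""

    def __init__(self, features, targets):
        # Keep references to the feature matrix and the target vector.
        self.features = features
        self.targets = targets

    def __len__(self):
        # The DataLoader needs the size to split the data into batches.
        return len(self.features)

    def __getitem__(self, index):
        # Return one example as a dictionary of tensors.
        x = torch.from_numpy(self.features[index])
        y = torch.tensor(self.targets[index])
        return {"features": x, "target": y}


# Random stand-in data: 100 examples, 20 features, 5 classes.
rng = np.random.default_rng(0)
features = rng.standard_normal((100, 20)).astype(np.float32)
targets = rng.integers(0, 5, size=100)

dataset = FeatureDataset(features, targets)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

batch = next(iter(loader))
print(batch["features"].shape, batch["target"].shape)
```

The default collate function stacks the dictionaries returned by `__getitem__` key by key, so each batch is again a dictionary.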
00:07:44
And I will show here, in this section, that
00:07:47
we have a package, kaldi_io for Python.
00:07:51
It is a wrapper to read
00:07:55
the Kaldi format directly into NumPy format.
00:08:01
So we can import this Kaldi library
00:08:07
and say okay, this is my [unclear] file, and
00:08:14
the [unclear] from this [unclear] speech, and we can see that
00:08:19
the data is about one hundred and fifty-five gigabytes. It's a lot.
00:08:26
So what can I do to process it and then optimise
00:08:30
the time during training?
00:08:34
The first approach is: okay, I will
00:08:38
read all the data, if we have enough RAM to load one
00:08:45
hundred and fifty-five gigabytes, but that is not a solution.
00:08:49
For example, when we prepare data with Kaldi,
00:08:55
we have what we call scp files, and we can
00:08:59
get the offset of the matrix, so we have the offset directly.
00:09:04
With this library, kaldi_io, we have access
00:09:09
to take the matrix directly. So reading this
00:09:14
matrix will be implemented directly
00:09:19
in __getitem__, and [unclear] to read the
00:09:26
features and push them to PyTorch for training.
00:09:31
These are some tricks in PyTorch
00:09:35
to optimise your data flow.
00:09:42
So about this, it's okay. When we get
00:09:47
the matrix, it is NumPy, and
00:09:50
I will show you how to convert from NumPy to PyTorch.
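The NumPy to PyTorch conversion can be sketched as below. The random matrix is a hypothetical stand-in for a feature matrix read from Kaldi.

```python
import numpy as np
import torch

# Stand-in for a (frames, features) matrix read from disk.
matrix = np.random.rand(150, 40).astype(np.float32)

# from_numpy wraps the array without copying: both share the same memory.
tensor = torch.from_numpy(matrix)

# Going back to NumPy is also copy-free for CPU tensors.
back = tensor.numpy()

# Because memory is shared, a change in the array is visible in the tensor.
matrix[0, 0] = 1.0
print(tensor[0, 0].item())
```

If an independent copy is wanted, `torch.tensor(matrix)` copies the data instead.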
00:09:55
For example, I will show you how to simply create a
00:10:00
neural network using PyTorch. This one is a linear regression.
00:10:05
You just define the class and inherit from nn.Module, and
00:10:11
define your architecture in the __init__ function.
00:10:16
There you call the superclass constructor for nn.Module
00:10:21
and then directly define your neural network [unclear].
00:10:29
After that, you define your forward
00:10:35
pass, and you don't need to care about the backward pass,
00:10:40
because the backward pass will be computed automatically using autograd.
00:10:46
So this is a very interesting concept.
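A minimal sketch of such a module. The class name and the sizes are assumptions; only `__init__` and `forward` are written by hand, and autograd provides the backward pass.

```python
import torch
import torch.nn as nn


class LinearRegression(nn.Module):
    def __init__(self, input_size, output_size):
        # Call the nn.Module constructor before registering layers.
        super().__init__()
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, x):
        # Only the forward pass is defined; gradients come from autograd.
        return self.linear(x)


net = LinearRegression(input_size=1, output_size=1)
prediction = net(torch.randn(8, 1))  # calling the module runs forward()
print(prediction.shape)
```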
00:10:50
For example, here we just instantiate the network.
00:10:55
The second thing we
00:11:00
need for a good implementation is the criterion, the loss function
00:11:07
that will be good for your task.
00:11:12
For example, this is regression, so I use mean squared error,
00:11:17
and for classification I would use cross-entropy or something else.
00:11:23
When you have the criterion, the loss function, defined,
00:11:27
you instantiate the optimiser, for example SGD,
00:11:32
and so this is my network, the nn object,
00:11:37
and I take my parameters, so .parameters() on the network, to
00:11:42
push all the parameters to the optimiser.
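The criterion and the optimiser can be set up roughly like this. The network and the learning rate are placeholders chosen for the sketch.

```python
import torch
import torch.nn as nn

net = nn.Linear(1, 1)  # stand-in for the regression network

# Regression: mean squared error.
criterion = nn.MSELoss()
# Classification would use, for example, nn.CrossEntropyLoss() instead.

# The optimiser receives every trainable parameter of the network.
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

n_params = sum(len(group["params"]) for group in optimizer.param_groups)
print(n_params)  # weight and bias of the linear layer
```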
00:11:47
Here is a simple way to train the neural network. Okay, I decide
00:11:51
on twenty epochs; it could be
00:11:55
more or less. I load my data using the DataLoader
00:11:59
or Dataset, as I showed you; we will see this
00:12:03
afterwards, how to use the dataset directly
00:12:08
from the Dataset object for Kaldi. Then I say okay,
00:12:14
my optimiser's gradients will be set to zero at each step,
00:12:19
and I do the forward step,
00:12:23
compute the loss function, and call backward.
00:12:27
When I call backward on the loss function, it computes
00:12:30
all the backprop for the parameters that are trainable.
00:12:35
So this is the kind of usage you can have in PyTorch, and
00:12:42
for updating the parameters you just say optimizer.step(),
00:12:47
and it will update your parameters directly.
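The training loop described here can be sketched end to end. The synthetic data (y = 2x + 1), the batch size and the learning rate are assumptions made so that the example runs on its own.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic, noise-free regression data (hypothetical): y = 2x + 1.
x = torch.randn(64, 1)
y = 2 * x + 1
loader = DataLoader(TensorDataset(x, y), batch_size=16)

net = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)

epoch_losses = []
for epoch in range(20):                 # twenty epochs, as in the talk
    total = 0.0
    for inputs, targets in loader:
        optimizer.zero_grad()           # reset gradients at each step
        outputs = net(inputs)           # forward pass
        loss = criterion(outputs, targets)
        loss.backward()                 # backprop for trainable parameters
        optimizer.step()                # update the parameters
        total += loss.item()
    epoch_losses.append(total / len(loader))

print(epoch_losses[0], epoch_losses[-1])
```

Forgetting `zero_grad()` makes gradients accumulate across steps, which is why it sits at the top of the inner loop.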
00:12:51
So this is a kind of simple neural network.
00:12:56
Now, for example, we take the time to
00:13:00
create a simple neural network, and the simple neural network is that
00:13:04
you have an input, you have a linear
00:13:09
function from input to output, you have a softmax, and you have
00:13:13
an output. If you want to have the loop,
00:13:18
the recurrent one, then from your input you
00:13:24
have another linear function, input to hidden,
00:13:28
to get the hidden state, and this hidden state will be used at the next time step,
00:13:34
combined with the input of the next time step, to predict
00:13:38
the output and the next hidden state.
00:13:43
So this is quite easy to develop.
00:13:49
We define a class
00:13:53
with __init__ and say okay, I want the layer,
00:13:59
this one, this one on the left.
00:14:04
At its input it will
00:14:07
have input size and hidden size,
00:14:11
and at its output we have the output size.
00:14:15
And for the hidden state of the next time step,
00:14:19
the layer has input size plus hidden size at its input
00:14:24
and hidden size at its output.
00:14:31
We use a log softmax
00:14:34
on the output of this layer.
00:14:40
For the forward pass, I have the input,
00:14:43
I have the hidden, and I concatenate both.
00:14:47
From this combination of input and hidden
00:14:53
I predict the hidden for the next step and the output for the current step.
00:14:57
Then I say okay, for my output I apply the nonlinearity
00:15:03
on the output.
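The recurrent cell described above can be sketched as follows. The names `i2h`, `i2o` and `init_hidden` and the sizes (40 features, 64 hidden units, 2 classes) are assumptions for illustration; the slide code is not available in the transcript.

```python
import torch
import torch.nn as nn


class SimpleRNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        # Both layers read the concatenation of input and previous hidden.
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.log_softmax = nn.LogSoftmax(dim=1)

    def forward(self, x, hidden):
        combined = torch.cat((x, hidden), dim=1)
        next_hidden = self.i2h(combined)                  # for the next step
        output = self.log_softmax(self.i2o(combined))     # for this step
        return output, next_hidden

    def init_hidden(self, batch_size=1):
        # The first hidden state is a vector of zeros.
        return torch.zeros(batch_size, self.hidden_size)


rnn = SimpleRNN(input_size=40, hidden_size=64, output_size=2)
frame = torch.randn(1, 40)  # one frame of one example
output, hidden = rnn(frame, rnn.init_hidden())
print(output.shape, hidden.shape)
```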
00:15:10
Here is a small example. Imagine that you have fbanks at the input,
00:15:15
hidden layers, and two categories. For example, if you take a word and
00:15:21
want to know whether it is [unclear] or not,
00:15:25
from your database you have the decision of [unclear] word or non-[unclear] word.
00:15:32
You can take this example
00:15:36
and say okay, my first input is an fbank frame,
00:15:41
I initialise the hidden state with a zero
00:15:45
vector, and then I push
00:15:50
my input through the RNN, giving it the hidden state,
00:15:58
and I get the new hidden and the output. This new hidden will be given to the next
00:16:04
step. This is how we train and use
00:16:09
recurrent neural networks: we use the last hidden state with the new input
00:16:16
to predict the output of the current step.
00:16:21
It is presented like this: we use the hidden with the current
00:16:26
input, and we have the output and the hidden for the next step. We also have other
00:16:33
variants like this for the neural network.
00:16:38
For example, we
00:16:41
have two uses of the neural network. In the first one
00:16:46
we take into account only the last output,
00:16:51
so the other outputs don't make
00:16:56
sense, or we don't care about them.
00:16:59
So when we optimise, we optimise only on that one.
00:17:05
This kind of architecture is called many-to-one:
00:17:12
I have a word, and I
00:17:16
predict the intelligibility of the word or the [unclear].
00:17:20
The second is: okay, for each
00:17:25
input I want an output. This is many-to-many,
00:17:30
the sequence-to-sequence model, for
00:17:33
example what Thomas explained for CTC.
00:17:38
For example, here
00:17:43
we use a log softmax, so for the loss criterion
00:17:49
we use the negative log-likelihood with the log softmax.
00:17:54
So we instantiate the RNN, and then okay, I choose my optimiser, which is Adam,
00:18:00
I give it the learning rate, and I choose the number of epochs.
00:18:04
Then okay, I train my model: get the features, get the output,
00:18:09
and then I train on
00:18:14
each example in my input. So I take each
00:18:18
example in my input. For example, I have thirty-two
00:18:22
examples, so that is why I choose
00:18:26
the size one: we have the time step as the first dimension, the batch example is
00:18:31
the second dimension, and the size one is the
00:18:35
[unclear] dimension. And I say okay, for each example
00:18:41
I run my recurrent neural network, and I optimise
00:18:46
only at the end. So I do this.
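A many-to-one training pass can be sketched as below: the network is run over all time steps of one example and the loss is computed only on the last output. The cell class, the sizes and the random data are assumptions for illustration, not the speaker's code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class SimpleRNN(nn.Module):
    """Hand-written recurrent cell; names and sizes are hypothetical."""

    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)

    def forward(self, x, hidden):
        combined = torch.cat((x, hidden), dim=1)
        return torch.log_softmax(self.i2o(combined), dim=1), self.i2h(combined)


time_steps, batch, n_feats, n_classes = 10, 32, 40, 2
features = torch.randn(time_steps, batch, n_feats)   # (time, batch, features)
labels = torch.randint(0, n_classes, (batch,))       # one label per example

rnn = SimpleRNN(n_feats, 64, n_classes)
criterion = nn.NLLLoss()          # negative log-likelihood on log softmax
optimizer = torch.optim.Adam(rnn.parameters(), lr=1e-3)

for i in range(batch):
    sequence = features[:, i:i + 1, :]           # (time, 1, features)
    hidden = torch.zeros(1, rnn.hidden_size)     # zero initial hidden state
    optimizer.zero_grad()
    for t in range(time_steps):
        output, hidden = rnn(sequence[t], hidden)
    # Many-to-one: only the output of the last time step is scored.
    loss = criterion(output, labels[i:i + 1])
    loss.backward()
    optimizer.step()

print(loss.item())
```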
00:18:50
If I want to do many-to-many, I just need
00:18:56
to move my backward, this one,
00:19:02
inside the loop, keep a list of losses, and
00:19:07
call backward on the loss.
00:19:12
So it is very easy to modify our neural network.
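One way to sketch the many-to-many variant is to compute a loss at every time step, keep the losses in a list, and call backward once on their sum. Summing before a single backward call is a choice made for this sketch (it avoids retaining the graph between calls); the cell, sizes and random data are again hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class SimpleRNN(nn.Module):
    """Hand-written recurrent cell; names and sizes are hypothetical."""

    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)

    def forward(self, x, hidden):
        combined = torch.cat((x, hidden), dim=1)
        return torch.log_softmax(self.i2o(combined), dim=1), self.i2h(combined)


time_steps, n_feats, n_classes = 10, 40, 2
sequence = torch.randn(time_steps, 1, n_feats)            # one example
frame_labels = torch.randint(0, n_classes, (time_steps, 1))  # one label per frame

rnn = SimpleRNN(n_feats, 64, n_classes)
criterion = nn.NLLLoss()
optimizer = torch.optim.Adam(rnn.parameters(), lr=1e-3)

optimizer.zero_grad()
hidden = torch.zeros(1, rnn.hidden_size)
losses = []
for t in range(time_steps):
    output, hidden = rnn(sequence[t], hidden)
    # Many-to-many: every time step contributes its own loss.
    losses.append(criterion(output, frame_labels[t]))

total_loss = torch.stack(losses).sum()
total_loss.backward()
optimizer.step()
print(len(losses), total_loss.item())
```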
00:19:20
Okay, this is something [unclear] explained, but it's okay.
00:19:26
So, for the second example, for this one,
00:19:31
we can use a GRU directly. We have
00:19:36
gated recurrent units or LSTMs, and we can use some
00:19:40
of these architectures. For the forward pass, I
00:19:44
give it directly the batch and the
00:19:49
time steps; we don't use the for loop.
00:19:53
The object will detect directly whether you give it one
00:19:58
example and do the for loop over time steps yourself,
00:20:03
or give it directly the batch with the time steps,
00:20:07
in which case the for loop is inside the object,
00:20:11
inside the GRU.
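A short sketch of the two ways of calling `nn.GRU`: once on the whole (time, batch, features) tensor, and once step by step with an explicit loop. The sizes are assumptions; both calls use the same weights, so the outputs should agree up to floating-point error.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

time_steps, batch, n_feats, hidden_size = 150, 32, 40, 64
x = torch.randn(time_steps, batch, n_feats)
gru = nn.GRU(input_size=n_feats, hidden_size=hidden_size)

# Whole sequence at once: the loop over time runs inside the module.
out, h_n = gru(x)

# Equivalent explicit loop, feeding one time step at a time.
h = torch.zeros(1, batch, hidden_size)
steps = []
for t in range(time_steps):
    o, h = gru(x[t:t + 1], h)   # x[t:t+1] keeps the time dimension (size 1)
    steps.append(o)
stepped = torch.cat(steps, dim=0)

print(out.shape, stepped.shape)
```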
00:20:16
We can also have an example using CTC, the CTC loss function.
00:20:20
So I instantiate our encoder. I say
00:20:25
okay, I will have thirty-two examples with
00:20:29
one hundred and fifty time steps and fbanks with deltas [unclear].
00:20:37
For the input lengths, I assume that all the inputs have the same length,
00:20:42
but they can be different. For example, the first example can be two seconds, the second
00:20:49
[unclear] seconds, so we can have different lengths.
00:20:56
And we have the output sequence of characters, so we can have a different length for
00:21:01
each. So it is sequence to sequence.
00:21:07
To train, I just say okay, I put
00:21:12
my input into the encoder, I get my
00:21:15
likelihood matrix, and with this matrix I call my CTC loss:
00:21:21
I pass my log-probs, my
00:21:25
output sequence, my input lengths and my output lengths,
00:21:29
and loss.backward() will compute what Thomas explained
00:21:35
before. That is how we can train a
00:21:40
neural network using CTC.
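A sketch of one CTC training step with `nn.CTCLoss`. The encoder architecture, the feature dimension (120), the number of classes (30, with index 0 as blank) and the random targets are all assumptions; only the batch size of thirty-two and the one hundred and fifty time steps come from the talk.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)


class Encoder(nn.Module):
    """Hypothetical encoder: a GRU followed by a per-frame classifier."""

    def __init__(self, n_feats, hidden_size, n_classes):
        super().__init__()
        self.gru = nn.GRU(n_feats, hidden_size)
        self.out = nn.Linear(hidden_size, n_classes)

    def forward(self, x):
        h, _ = self.gru(x)                          # (time, batch, hidden)
        return F.log_softmax(self.out(h), dim=-1)   # (time, batch, classes)


T, N, n_feats, n_classes, max_target = 150, 32, 120, 30, 20
encoder = Encoder(n_feats, 64, n_classes)
ctc_loss = nn.CTCLoss(blank=0)
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)

inputs = torch.randn(T, N, n_feats)
# Here all inputs have the same length; they could differ per example.
input_lengths = torch.full((N,), T, dtype=torch.long)
# Character targets (1..n_classes-1; 0 is reserved for blank), padded to max_target.
targets = torch.randint(1, n_classes, (N, max_target), dtype=torch.long)
target_lengths = torch.randint(10, max_target + 1, (N,), dtype=torch.long)

optimizer.zero_grad()
log_probs = encoder(inputs)
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()
optimizer.step()
print(loss.item())
```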
00:21:44
For the predictions, we can use greedy search
00:21:48
decoding or prefix beam search; we will see that
00:21:51
in the afternoon. But when we
00:21:56
predict using the neural network object,
00:22:01
we need to call encoder.eval() if we have
00:22:05
dropout or [unclear] parameters. It will
00:22:08
[unclear], and then you can use it to predict.
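Prediction in evaluation mode with greedy CTC decoding can be sketched as follows. The small feed-forward model and the helper `greedy_ctc_decode` are hypothetical; the helper implements the usual greedy rule of collapsing repeated labels and then dropping blanks.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical frame classifier with dropout; class 0 plays the blank role.
model = nn.Sequential(
    nn.Linear(40, 64), nn.ReLU(), nn.Dropout(0.5), nn.Linear(64, 30)
)


def greedy_ctc_decode(path, blank=0):
    """Collapse repeated labels, then remove blanks."""
    decoded = []
    previous = None
    for label in path:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


model.eval()                      # disables dropout for prediction
frames = torch.randn(150, 40)     # (time, features) for one utterance
with torch.no_grad():             # no gradients are needed at test time
    scores = model(frames)
    scores_again = model(frames)  # identical, since dropout is off

best_path = scores.argmax(dim=-1).tolist()   # most likely label per frame
hypothesis = greedy_ctc_decode(best_path)
print(hypothesis)
```

Calling `model.train()` switches dropout back on before training resumes.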
00:22:15
So that is all I will show. This afternoon
00:22:21
we will have an exercise about CTC.

Conference program

Raw Waveform-based Acoustic Modeling and its analysis
Mathew Magimai Doss, Idiap Research Institute
14 Feb. 2019 · 9:12 a.m.
About Sequence Classification for Sound Event Detection and end-to-end ASR
Thomas Pellegrini, IRIT, France
14 Feb. 2019 · 10:14 a.m.
Case study: Weakly-labeled Sound Event Detection
Thomas Pellegrini, IRIT, France
14 Feb. 2019 · 11:05 a.m.
Introduction to Pytorch 1
14 Feb. 2019 · 12:06 p.m.
Introduction to Pytorch 2
14 Feb. 2019 · 12:26 p.m.