Transcriptions

Note: this content has been automatically generated.
00:00:00
but i just can't cut but is that before the meeting and it 'cause i had to lay it's now so
00:00:07
i was back to miami very sharp it's alright
00:00:10
to reveal my first yeah mm hardly spell ah
00:00:18
maybe he and the location should be changed and now and then
00:00:22
great comic to land ah yeah but it should pay them close
00:00:27
ah and vile okay k. is a continuous speech recognition for people with dysarthria just
00:00:32
uh did you well quite a deal my uh last time in court
00:00:37
was like how i'm not object so it's uh is there an
00:00:41
ASR system last because that that's not an object thing but having
00:00:46
just a hot stars daytime recurrent but where the right but it's eight
00:00:50
i was also 'cause strong improving the ASR performance on continuous speech like phrases
00:00:56
and sentences um times not exploding is uh well okay let's re end up costing information
00:01:04
so what has been done nowadays how ah but she
00:01:08
s. of the christian on how to stop it is plastic
00:01:11
rather than i mean it's draw the cotton domain knowledge base it really is the out of the major hardware tonight
00:01:18
i like that it it has that's deliberate speech horrific in it for business was speech recognition
00:01:25
no so ah whaley so that problem how to tell us that they can't make a a
00:01:31
lot of different types engines actions how uh in
00:01:35
tango the exact mark was the word and sentence
00:01:39
uh the shit okay it's always true how to start a continuous speech
00:01:44
recognition you so explore the tried not to because they can language model constraints
00:01:52
ah in his house here we tried to make a work that's names how uh x. during
00:01:58
the trial happy tonight 'cause they can language model
00:02:00
constraints call it doesn't uh speech recognition the background
00:02:05
that one a is how we thought yeah with the other one togo can provide average take long
00:02:11
whatever it's how because the sentences in the training data i get good katie in the test data
00:02:17
that has data oh that means but maybe it's that has good now we will see it in
00:02:22
the training data so the without shouldn't be very
00:02:26
realistic and a move of this work is actually
00:02:30
look at the current beta clark uh let's go to the r. i. ASR system how
00:02:35
what local high and to measure how much of
00:02:38
the cup performance is coming from language model constraints
00:02:42
oh by the way the the thing is we won't get that uh in question how it should be a coating of uh
00:02:51
'cause we know they're charming designing speaker dependent language models
00:02:56
around picnic is intended to cause take models when considering higher
00:03:01
alright i bought a chip chlorine speech high this
00:03:04
means like most of the studies how is uh
00:03:08
uh is because of the speaker dependent acoustic model
00:03:12
well like what for example for typical speech or
00:03:17
huh because there's a speech at the acoustic model it's quite a speaker it
00:03:22
ended up one have landed on it because those we depended on it like
00:03:29
oh okay so the motivation hey there are challenges in this as a uh yes i
00:03:36
like it has passed me and also the tape no from i selected journalists are continuous speech
00:03:44
so i'm so good language models while it was past work uh yes i
00:03:48
which is kind of a lot a whoosh at the matter of it is
00:03:54
only wanted but ah okay back out-of-domain language model originating
00:04:00
from labour speech of a range of vocabularies that it would be a great
00:04:04
assurance house nice only experiment ah it's damn didn't he say he train of
00:04:11
the of the constraints for speakers with varying degrees of dysarthria
00:04:17
i'm here we have the with ah it's like well you know you can't say it's o. d.
00:04:26
x. x. is it in the categories but
00:04:29
have precise estimate than the um the experiments how
00:04:34
over the frame recovery sizes hike from the group k. to
00:04:39
room two hundred k. vocabulary size and hans artist which because
00:04:45
i was lazy lately scrooge trigrams this means
00:04:49
they used in technical training and has only
00:04:52
soulful this man because you know the coverage i think we'll cover more work anyway so it
00:04:58
shouldn't be x. x. that's real fun just there you know there's there's all zero point five
00:05:05
okay so two hundred k. and we can say in
00:05:09
the two gripes equator doesn't actually writing a is there less
00:05:14
complexity the language model is required to have the dress without
00:05:18
high here than back so cosy is the a. h. crowd
00:05:24
and uh for example a male this one this one is that
00:05:29
but most people here this also speaker so when they want to
00:05:34
get the best results that whatever is nice might we write five k.
00:05:39
a lot will look when when i say the the the blue
00:05:44
eyes glanced at least here how this might be the miles speak english
00:05:48
so what it's okay can pass without its need like sh bash take me so much i mean
00:05:53
how long would model constraints should be um smaller then that's your t. v. or the test was quicker
00:06:01
no the quality of the uh acoustic model also have
00:06:05
beers expect of those constraints on i. m. a. oh sorry
00:06:12
they just to clarify divisional mid different acoustic model but it's
00:06:16
on that right so yeah and it's uh have less constrained
00:06:21
panicking and and models and not so many concluded that's not
00:06:27
and may it be wise that acoustic model important ah and
00:06:31
but also just because i uh uh did you can do
00:06:35
that i'll i'll optimal under a kind of three it's also jaime
00:06:39
speaking to pension when configuring is that's missed each other how
00:06:44
okay so these are the things that i have gone interface where
00:06:48
how and how she my current work you know when we
00:06:52
lose our nah no networks which really are there any new there
00:07:00
makes the makes use of paired and unpaired data to
00:07:05
well caustic tonight letter that how mm i. q. state in the nation
00:07:11
diagram no way i input is acoustic and
00:07:15
articulatory data we want to live news like cycle can
00:07:19
be a a a a a show up in colour
00:07:23
and also tries not market has model surrender battle acoustic
00:07:28
to articulatory mapping here and then uh who again there are like a a show at
00:07:35
ooh the the yeah like the t. v. trajectories and also
00:07:39
together with the cars dictators we won't go into the bottle
00:07:43
where where are they when a guy and a cost not only a number to get a
00:07:49
pet or am i averaged out as experiments
00:07:53
in makes man um what's our bash it it's
00:08:00
so i'm not sure how oh what we've gotten jewel is strike is the two so close
00:08:08
how close we have lots of acoustic data has ending at
00:08:12
maybe at hand painted and we have just the screen is
00:08:16
there no like electric data and that's we have is paid
00:08:20
me h. e. acoustic data so like these are large large
00:08:26
space it's not bring it along to their or what we
00:08:31
found one sure and the mapping between us to not opposed
00:08:34
to you know to uh could actually potentially they also want
00:08:38
your and it might be from pete coach speech disasters each
00:08:43
hi and now i am at
00:08:48
a man is um who they're oh how a frown i with computers
00:08:53
give like there and here i actually live in one column is a
00:08:58
task and they had a ah when it occurs to me a and
00:09:03
e. n. h. beta yeah maybe okay so articulate again and also for each
00:09:08
of them they had loading yeah and also the process data how many
00:09:13
percent new plastic paint until he relies spectrum frequent had more information out
00:09:20
to be rejected like the location of oh where oh actually it's really
00:09:28
are and also they were hatchery down process to yeah maybe yeah so it's
00:09:33
quite easy to use later mm mm named um good zero and techniques here when when
00:09:40
we would type cycle gotten channels go get me a new creation not also in culture
00:09:47
huh so for cycle i do eighteen used when
00:09:51
uh we we write x. and write a a step
00:09:57
lately and acoustic data and also the act like
00:10:00
trading one shouldn't use one modality will yeah another modality
00:10:08
during this cycle which consistency loss how and
00:10:13
oh but but would would try this not was actually not very good at the good clothes
00:10:19
uh actually i was gonna basher or ideas from voice conversion so
00:10:23
the um the greyhound right i where ah m. that tomb ability you
00:10:28
have quite to the last page but was great because data collection again
00:10:33
yeah yeah i'm very preference one is gene frequency of a show respect to
00:10:41
space and another is just the location distance there so
00:10:46
yeah i write much then what we will do next to
00:10:49
explore how issue explore expect to include this kind of
00:10:55
recharge and another thing we want to use this be easy
00:11:00
one night a shaky like really acoustic data that we have a lot of these acoustic data
00:11:08
so we were in an encoder and decoder no solar very
00:11:13
nice ah goodness representation heat
00:11:19
and then we will like fixed decoding here a
00:11:23
a a and tried to lose weight huck actually
00:11:28
actually i don't think that together and to learn
00:11:32
encoder and decoder here and then we will uh
00:11:38
strange ways natives representations to have no interest in the
00:11:42
representation from how close acoustically screen is imparted lashes grazed
00:11:47
and then maybe this unit is my presentation can goes into the ASR
00:11:51
t. ten to activate and we'll include in the final yes i replied yet but
00:12:01
have comments where it can can also input we can
00:12:06
use that and tape it up because now lots of yeah
00:12:09
it's hard work on in shadows they wouldn't have lots
00:12:13
of grey haired ten how how does the maybe we can
00:12:17
make some within the care and share data how like
00:12:20
first uh in in his diagram a chain but yeah just
00:12:25
wanted don't have books as that regulated upgraded version and
00:12:31
i was then the is that paired data how to train
00:12:36
re change a latent representations and then use articulates right
00:12:40
a straight into the hot rhetoric data and it's grew right
00:12:44
generated never magicians fish or something else you um h. t.
00:12:54
such sometimes question how how the uh so
00:12:57
extension conference while i'm actually this year i
00:13:03
tend ah twenty amount you're supposed to do
00:13:08
or one there and sam rampage courses in attendance
00:13:12
and then it's no so i actually and i passed actually is good marriage berman and
00:13:19
and right hand what happened uh planting the well well well well try change my clothes judge
00:13:28
or condemn you should go home and checked out i was uh which i'm probably gonna
00:13:34
sit next to come back and walked in and worked for three months and then our ah
00:13:41
action and to compare it to screech compression yeah i just got ten standard ten minutes a
00:13:48
day of professional speakers questions has been an
00:13:51
h. and maybe even get some complicated document activities
00:13:57
uh like the department outrage where where there's been there's that
00:14:01
uh that's true option my or so that's what i've heard i would and
00:14:08
that's cool you attention to it

