Transcriptions

Note: this content has been automatically generated.
00:00:01
Hi everyone, I'm Xue Wei, so Wei is my first name.
00:00:05
Some of you already know me from the last two times. I am studying for my
00:00:10
PhD at Radboud University, so I am from
00:00:14
the same department as Henk, and my supervisor is Helmer.
00:00:20
I also have two more supervisors: Catia, who is also here today,
00:00:24
and Roeland, who also helps me a lot with academic writing and the statistical analysis.
00:00:31
So my project is about developing valid measurement procedures
00:00:37
for pathological speech intelligibility. Last time we
00:00:43
broke this project down into two main goals.
00:00:46
One is to explore which deviations in pathological speech
00:00:51
influence speech intelligibility, and the other is to explore
00:00:56
the possibility of measuring speech intelligibility
00:00:59
automatically or semi-automatically.
00:01:04
The problem in measuring speech intelligibility is
00:01:08
that there are normally two ways: subjective and objective.
00:01:13
The subjective way is normally very time-consuming and costly, because
00:01:17
you have to recruit a panel of judges to listen to the speech and give their ratings.
00:01:22
Another problem for our study is that
00:01:27
many different rating methods have been used in different studies,
00:01:32
so it is very hard to compare the results from different corpora.
00:01:37
So based on these two problems, there
00:01:40
may be a need for objective measurements of speech intelligibility
00:01:45
that can be computed automatically and
00:01:50
that are understandable to therapists, so that therapists could build on these features or measurements
00:01:56
to give feedback to patients to improve their speech intelligibility.
00:02:02
So, given all these three issues, we have set a concrete goal:
00:02:09
we want to explore whether there are any interpretable objective measurements
00:02:15
that are correlated with speech intelligibility, or that
00:02:19
could be used to predict speech intelligibility,
00:02:23
across different rating methods, across different
00:02:27
languages, different speaker types, and different speech stimuli.
00:02:33
So in the past year we have done two studies.
00:02:36
The first one explores the effects of acoustic
00:02:41
characteristics on pathological speech intelligibility,
00:02:46
which has been accepted as an abstract at
00:02:48
the International Symposium on Monolingual and Bilingual
00:02:52
Speech, and was presented by Catia last week in Greece.
00:02:58
In this study we included three different corpora: the TORGO
00:03:03
corpus, the COPAS corpus, and a small one that our group has published
00:03:08
to explore different subjective ratings at different levels,
00:03:14
which we named IS2016 because it
00:03:20
was published at Interspeech 2016. In these
00:03:24
three corpora, different rating methods were used, as I just explained.
00:03:30
In TORGO they used the Frenchay Dysarthria Assessment,
00:03:35
FDA for short, and COPAS used
00:03:38
the Dutch Intelligibility Assessment, the DIA, and in IS2016
00:03:44
many different methods were explored, for example the visual analogue scale,
00:03:49
the seven-point Likert scale, and also transcription at word level or
00:03:54
subword level, but here we only chose the visual analogue scale.
00:03:59
The language in these three corpora is also
00:04:03
different: TORGO is in English, COPAS is
00:04:07
in Flemish, which is a variety of Dutch, and IS2016 is also in Dutch.
00:04:15
I will come back to this later. The other
00:04:18
study explores the acoustic correlates of speech
00:04:22
intelligibility: the usability of the eGeMAPS feature set for
00:04:26
atypical speech. This will be presented at the
00:04:30
workshop on Speech and Language Technology in Education,
00:04:35
after Interspeech later this month. In this study we
00:04:41
basically focus on COPAS. We involved
00:04:45
two different types of speech stimuli.
00:04:48
One is the word list, basically the word list used in the DIA,
00:04:53
and the other is a running text, "Papa en Marloes".
00:04:57
"Papa en Marloes" is a phonetically balanced text, and that is the text we use.
00:05:03
It was chosen because it allows comparison with other studies.
00:05:09
Also, two speaker types are involved: the
00:05:14
dysarthric speakers and the normal speakers, who actually serve as the reference speakers.
00:05:21
So here I want to share some interesting results from our studies.
00:05:25
In the first one, as I just said, we involved three different corpora,
00:05:30
and you can see here that they are quite different. TORGO and IS2016
00:05:35
are quite small: in
00:05:38
total there are only fifteen and twelve speakers, respectively.
00:05:43
COPAS, however, is quite large:
00:05:46
we selected one hundred thirty speakers.
00:05:51
But
00:05:54
probably a limitation here
00:05:58
is that it only has intelligibility at the phoneme level,
00:06:03
so it is much more detailed, while the others are more general,
00:06:03
global ratings of speech intelligibility. And compared to
00:06:09
the others, which are on a percentage-like scale,
00:06:14
the FDA gives a-to-e levels
00:06:16
of intelligibility, which we converted to nine to one,
00:06:21
also because there are some a/b or
00:06:24
b/c levels in between. Okay, so
00:06:32
we calculated some correlations between the
00:06:36
features and the intelligibility in these three corpora.
00:06:40
Each triangle is a correlation plot. For the dots on it,
00:06:44
blue is positive and red is negative, and the larger the dot
00:06:48
and the darker the colour, the stronger the correlation.
00:06:54
On your left-hand side, at the top of the triangle,
00:06:59
the features are vowel-related features,
00:07:04
which I extracted with VowelTriangle, published by Rob van Son,
00:07:09
and on your right-hand side the features
00:07:14
are more global features related to pitch, intensity, and formants.
00:07:20
These features were selected because some of them,
00:07:23
or rather many of them, have been reported to correlate with speech intelligibility,
00:07:28
and some of them have been shown to
00:07:31
predict speech intelligibility. If you look at this
00:07:37
plot here, the top line is the vowel space area.
00:07:43
Interestingly, if you look at the middle one and the right-hand one, which
00:07:47
are IS2016 and COPAS, you can see a strong or medium-strong
00:07:53
positive correlation between the vowel space area and intelligibility,
00:07:57
which is not the case for TORGO. Similarly, if you look at
00:08:03
the distance between the vowel /u/ and the centre of the vowel space area,
00:08:09
a similar pattern can be found in IS2016 and COPAS,
00:08:14
but it is not the same for TORGO. The results here indicate that
00:08:20
there could be a difference, in that the
00:08:22
deviations in different languages have different relationships with intelligibility.
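
The two vowel measures mentioned here can be illustrated with a minimal sketch. It is not taken from the talk or from the extraction tool the speaker used: it assumes the vowel space is the triangle spanned by the corner vowels /i/, /a/, /u/ in the (F1, F2) plane, that its "centre" is the mean of the three corners, and the formant values are hypothetical numbers chosen only for illustration.

```python
import math

def vowel_space_area(corners):
    """Area of the polygon spanned by (F1, F2) points, via the shoelace formula."""
    total = 0.0
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def centre_of(corners):
    """Assumed definition of the centre: the mean of the corner points."""
    n = len(corners)
    return (sum(p[0] for p in corners) / n, sum(p[1] for p in corners) / n)

def distance_to_centre(vowel, corners):
    """Euclidean distance from one vowel to the centre of the vowel space."""
    cx, cy = centre_of(corners)
    return math.hypot(vowel[0] - cx, vowel[1] - cy)

# Hypothetical mean formants (F1, F2) in Hz for one speaker.
corner_vowels = {"i": (300.0, 2300.0), "a": (800.0, 1300.0), "u": (300.0, 800.0)}
points = [corner_vowels[v] for v in ("i", "a", "u")]

area = vowel_space_area(points)                          # in Hz^2
dist_u = distance_to_centre(corner_vowels["u"], points)  # in Hz
print(area, dist_u)
```

A reduced vowel space gives a smaller area and a shorter distance from /u/ to the centre, which is the kind of deviation the correlations above relate to intelligibility.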
00:08:28
We also applied stepwise multiple linear regression
00:08:33
on these features and, as we just saw,
00:08:41
the vowel space area correlates with speech
00:08:44
intelligibility, even though it might differ by language.
00:08:48
As you can see here, the positions of the corner vowels
00:08:55
have also been selected by the linear regression model, which means that
00:09:00
the positions of the corner vowels could be potential predictors of speech intelligibility.
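
To make the idea of "features being selected" by a stepwise regression concrete, here is a minimal sketch. It is an assumption-laden simplification, not the procedure used in the study: it only adds features (forward selection), it uses the gain in R-squared with a hypothetical threshold `min_gain` as the stopping rule (statistical packages typically use criteria such as AIC or p-values), and the feature names and numbers are made up.

```python
def ols_fitted(columns, y):
    """Fitted values of a least-squares fit with intercept (normal equations).
    Assumes the predictors are not perfectly collinear."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented normal equations: (A^T A | A^T y)
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))]
           for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(k):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return [sum(b * x for b, x in zip(beta, row)) for row in rows]

def r_squared(y, fitted):
    """Proportion of the variation in y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1.0 - ss_res / ss_tot

def forward_select(features, y, min_gain=0.01):
    """Greedily add the feature that raises R-squared most; stop on small gains."""
    selected, best_r2 = [], 0.0
    remaining = list(features)
    while remaining:
        scores = {
            name: r_squared(y, ols_fitted([features[s] for s in selected + [name]], y))
            for name in remaining
        }
        best = max(scores, key=scores.get)
        if scores[best] - best_r2 < min_gain:
            break
        selected.append(best)
        remaining.remove(best)
        best_r2 = scores[best]
    return selected, best_r2

# Hypothetical data: intelligibility depends only on the first feature.
features = {
    "vowel_space_area": [1.0, 2.0, 3.0, 4.0, 5.0],
    "other_feature": [2.0, 1.0, 4.0, 3.0, 6.0],
}
intelligibility = [3.0, 5.0, 7.0, 9.0, 11.0]

selected, r2 = forward_select(features, intelligibility)
print(selected, r2)
```

In this toy data only the first feature survives selection, which mirrors the statement above that a feature kept by the model is a potential predictor.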
00:09:08
It also seems that the intelligibility might be different,
00:09:13
so there is a need for speech
00:09:17
intelligibility ratings at different levels of granularity.
00:09:24
In the second study we focus more on the COPAS data,
00:09:29
and we included the DIA, that is, the
00:09:32
word list, and also the text Marloes, TM for short.
00:09:39
So if you focus here,
00:09:44
you can see the multiple R-
00:09:46
squared score, which is the proportion of the
00:09:53
variation in the dependent variable that can be explained by
00:09:56
the predictors. This means that when the score is higher,
00:10:00
the predictors explain more of the variation in the dependent variable,
00:10:04
which means they can better predict the dependent variable. So higher is better.
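
Written as a formula, the score described here is the standard coefficient of determination. The symbols are assumptions introduced for illustration and do not appear in the talk: $y_i$ is the observed intelligibility score of speaker $i$, $\hat{y}_i$ the value fitted by the regression model, and $\bar{y}$ the mean of the observed scores.

```latex
R^2 \;=\; 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

The numerator is the variation left unexplained by the predictors and the denominator is the total variation, so a value closer to 1 means the predictors account for more of the variation in intelligibility.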
00:10:10
You can see that the score for
00:10:13
dysarthric speech is higher than that for the normal one.
00:10:18
We first thought it could be that the variation in intelligibility
00:10:24
influenced the results, so we added
00:10:27
a small extra experiment in which we only selected the
00:10:32
dysarthric speech that has the same range of intelligibility.
00:10:36
It then shows that the multiple R-squared score is
00:10:40
still higher for the dysarthric speech than for the normal speech.
00:10:45
So we conclude that the eGeMAPS feature set seems to be effective
00:10:50
for dysarthric speech, but that is not the case for normal speech.
00:10:55
We have also plotted the residuals and
00:10:59
fitted values of the final models we obtained.
00:11:04
If you focus first on the left-hand side, on the dysarthric speech, you
00:11:09
can see that there are larger residuals in the running text. Similarly, in the table
00:11:16
you can see that the multiple R-squared also shows that
00:11:20
the score is much higher on the word list than on the running text.
00:11:25
So it seems that the phoneme intelligibility,
00:11:29
which is obtained from the DIA,
00:11:33
reveals just one aspect of general speech intelligibility. So it could be that intelligibility
00:11:41
is a combination of different levels of intelligibility.
00:11:49
And if you focus on the normal speech, similar patterns can be found in the DIA and the running
00:11:54
text. Similarly, in the table you can see that
00:11:57
the score does not change very much; it is quite similar.
00:12:02
So it could be that the representation of speech intelligibility in normal speech
00:12:08
does not vary much across different material,
00:12:12
so it is the opposite of the dysarthric speech.
00:12:18
For the running text, because its representation in terms of
00:12:25
acoustic features is very different from that of a word list, we added another, extra
00:12:31
feature set as a complement to eGeMAPS,
00:12:35
which is related to speech rate: a temporal feature set.
00:12:39
You can see that the multiple R-squared score has increased,
00:12:44
which means that in the running text, speech rate is an important
00:12:50
complementary factor to the eGeMAPS feature set.
00:12:56
So,
00:12:59
from the results we have obtained so far,
00:13:02
it seems necessary to conduct studies
00:13:07
with intelligibility ratings collected with different
00:13:12
rating methods, from different speakers, and with different speech stimuli.
00:13:17
Based on this, we have made our future plan.
00:13:23
First, as you can see,
00:13:26
TORGO and IS2016 are quite small datasets,
00:13:31
so there is a low-resource issue in intelligibility research,
00:13:36
and we want to explore the possibility of applying
00:13:41
transfer learning across languages like Flemish and Dutch.
00:13:47
Second, we want to establish reliable subjective measurements.
00:13:53
This is because even if you use an automatic model, like ASR or other learning-based methods,
00:13:59
to predict speech intelligibility, these still rely on subjective ratings.
00:14:05
So validated measurements of subjective ratings are very necessary and important.
00:14:11
We have proposed a procedure for measuring that, and
00:14:15
it has been set up as a web application using Django.
00:14:19
We will start later this month,
00:14:22
or maybe next month, to collect the new intelligibility ratings.
00:14:27
Third, we want to
00:14:29
explore whether there are any shared objective measurements or features
00:14:34
that perform very well in predicting dysarthric speech
00:14:38
intelligibility. This is in cooperation with my secondment in Antwerp,
00:14:43
with the group that
00:14:46
established the COPAS database.
00:14:51
Finally, to achieve the final goal,
00:14:56
once we have all this knowledge about subjective ratings and objective measurements,
00:15:01
we want to explore whether there are any interpretable objective features
00:15:05
related to speech intelligibility that can be understood by
00:15:10
therapists and used to give feedback to patients.
00:15:16
Thanks for listening. Any questions?
