Transcriptions

Note: this content has been automatically generated.
00:00:03
Okay, hello everybody. I'm Camilo Vásquez, I'm ESR number seven.
00:00:08
I'm working on modelling the progression of patients with neurodegenerative disorders,
00:00:14
especially patients with Parkinson's disease, based on speech. I'm at the Friedrich-Alexander University
00:00:20
Erlangen-Nürnberg, and also in cooperation with the University of Antioquia, Colombia.
00:00:25
So first, what I did so far this year: these are the publications
00:00:30
already accepted or under review, three papers and one chapter
00:00:36
already in press, and these conference proceedings already accepted, some
00:00:41
for Interspeech, some for other
00:00:45
proceedings. But this talk will be focused specifically on these
00:00:51
two papers already accepted. The first one is my paper for Interspeech, called Phonet, a
00:00:59
toolkit to extract phonological features from speech, and the second one is the
00:01:05
paper based on unsupervised representation learning for classification of pathological speech.
00:01:14
So the first one
00:01:17
I'm going to talk about is Phonet, a toolkit based on gated recurrent
00:01:22
neural networks, using bidirectional GRUs, to extract
00:01:25
phonological posteriors, phonological features, from speech. The motivation of
00:01:32
this paper, of this approach, is that in general
00:01:37
high-dimensional feature sets like mel-frequency cepstral coefficients
00:01:41
or embeddings taken directly from a neural network are rarely used in the medical
00:01:47
community due to their lack of interpretability, although they also carry
00:01:53
suitable information to characterize pathological speech. On the other hand,
00:02:00
phonological features can be more comprehensible for
00:02:03
clinicians than the traditional high-dimensional acoustic features.
00:02:09
Also, these features are commonly understood by clinicians, since they are related specifically to the
00:02:15
movement of muscles within the vocal tract and to the manner and place of articulation.
00:02:23
So the aim of this study is to map
00:02:26
high-dimensional feature vectors, typically mel filter banks, into a set of
00:02:33
different phonological posteriors that can be comprehensible and understandable
00:02:38
for the medical community, and also that the toolkit is open source and available for
00:02:44
the research community to use for their own experiments or their own analyses.
00:02:51
Well, first we defined a set of eighteen different phonological classes.
00:02:57
This set was designed specifically for the Spanish language,
00:03:01
which is the first one that I considered. I have different classes for vowels and consonants:
00:03:08
I have vocalic, nasal, stop, strident, lateral, labial,
00:03:13
dental, depending on the mode and manner of articulation of the different phonemes.
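The grouping of phonemes into phonological classes described here can be sketched as a lookup table that turns a frame-level phoneme sequence into one binary label track per class. This is only an illustration: the table `PHONOLOGICAL_CLASSES`, the function `frame_labels`, and the phoneme memberships are hypothetical and cover seven of the class names mentioned in the talk, not the eighteen classes of the actual toolkit.

```python
# Hypothetical, partial phoneme-to-class table for illustration only.
# The talk mentions eighteen classes for Spanish; the memberships below
# are assumptions and are not taken from the toolkit.
PHONOLOGICAL_CLASSES = {
    "vocalic": {"a", "e", "i", "o", "u"},
    "nasal": {"m", "n"},
    "stop": {"p", "t", "k", "b", "d", "g"},
    "strident": {"s", "f"},
    "lateral": {"l"},
    "labial": {"m", "p", "b", "f"},
    "dental": {"t", "d"},
}


def frame_labels(aligned_phonemes):
    """Return one binary label track per class for a frame-level phoneme sequence."""
    return {
        name: [1 if phoneme in members else 0 for phoneme in aligned_phonemes]
        for name, members in PHONOLOGICAL_CLASSES.items()
    }


# Seven frames of a made-up alignment: /m/ /m/ /a/ /a/ /a/ /s/ /s/
frames = ["m", "m", "a", "a", "a", "s", "s"]
labels = frame_labels(frames)
```

Note that one phoneme can belong to several classes at once (here /m/ is both nasal and labial), which is why each class gets its own binary track.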
00:03:20
And how does Phonet work?
00:03:24
It's a bank of parallel recurrent neural networks with bidirectional gated recurrent units.
00:03:32
They were trained with the CIEMPIESS corpus, which is an
00:03:37
open-source database in the Spanish language. It has seventeen hours of
00:03:44
radio podcasts in Mexican Spanish, and we did a forced
00:03:48
alignment at phoneme level and used it as labels for training
00:03:53
the phonological feature extractor. So this is the architecture
00:03:58
of the RNN we implemented, and we have eighteen
00:04:02
different networks to detect each of the phonological classes.
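As a rough sketch of the idea of a bank of parallel bidirectional GRU detectors, one per phonological class, the following pure-Python code runs a forward and a backward GRU over the frames and maps the concatenated states to a per-frame posterior with a sigmoid. Everything here is an assumption for illustration: the class names `GRUCell` and `ClassDetector`, the layer sizes, the absence of bias terms, and the random untrained weights. It is not the architecture or the code of the actual toolkit.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Minimal GRU cell without biases (update gate z, reset gate r, candidate c)."""

    def __init__(self, n_in, n_hid, rng):
        self.n_hid = n_hid
        self.W = {g: rand_matrix(n_hid, n_in, rng) for g in "zrc"}   # input weights
        self.U = {g: rand_matrix(n_hid, n_hid, rng) for g in "zrc"}  # recurrent weights

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        c = [math.tanh(a + b) for a, b in zip(matvec(self.W["c"], x), matvec(self.U["c"], rh))]
        # Interpolate between the previous state and the candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]

    def run(self, frames):
        h = [0.0] * self.n_hid
        states = []
        for x in frames:
            h = self.step(x, h)
            states.append(h)
        return states


class ClassDetector:
    """One bidirectional GRU plus a sigmoid output for a single phonological class."""

    def __init__(self, n_in, n_hid, rng):
        self.fwd = GRUCell(n_in, n_hid, rng)
        self.bwd = GRUCell(n_in, n_hid, rng)
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(2 * n_hid)]

    def posteriors(self, frames):
        forward = self.fwd.run(frames)
        backward = self.bwd.run(frames[::-1])[::-1]  # re-align to forward time order
        states = [f + b for f, b in zip(forward, backward)]  # concatenate both directions
        return [sigmoid(sum(w * s for w, s in zip(self.w_out, st))) for st in states]


rng = random.Random(0)
class_names = ["vocalic", "nasal", "stop"]  # the talk uses eighteen classes
bank = {name: ClassDetector(n_in=4, n_hid=3, rng=rng) for name in class_names}
frames = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]  # toy filter bank frames
profile = {name: det.posteriors(frames) for name, det in bank.items()}
```

Each entry of `profile` is a per-frame posterior track for one class. With untrained weights the values carry no meaning; the sketch only shows the data flow.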
00:04:09
So for the results of this first study, these are the
00:04:13
F1 scores for the different phonological classes that we have
00:04:19
right now. Some of them are very accurate to
00:04:23
detect; for instance, the nasal speech sounds are very
00:04:27
accurately detected, and the same for laterals and
00:04:30
stridents. But there are others that can still be improved
00:04:36
by some percentage.
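For reference, the F1 score used to evaluate each class detector is the harmonic mean of precision and recall. A minimal sketch for binary frame-level labels follows; the function name `f1_score` and the toy label sequences are made up for illustration and are not results from the talk.

```python
def f1_score(y_true, y_pred):
    """F1 score for binary labels (1 = class present in the frame)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: 2 true positives, 1 false positive, 1 false negative.
truth = [1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 1, 0]
score = f1_score(truth, pred)
```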
00:04:39
This is an example of the
00:04:42
output of Phonet. On the left part I have a healthy control speaker
00:04:48
who pronounces the Spanish sentence "mi casa tiene", and
00:04:53
these are the different profiles of the phonological features
00:04:56
for the nasal, the vocalic, the
00:05:01
stop consonants, and the strident that corresponds to the /s/ sound.
00:05:08
So in this case, for the healthy control subject, the phonological features are very well defined.
00:05:13
They mark very well where the phoneme
00:05:17
starts and where the phoneme finishes,
00:05:21
and also the posterior probability is quite confident about
00:05:26
the detection of the phonological class. And on the right
00:05:32
side of the slide I have the same sentence, but uttered by a Parkinson's disease patient.
00:05:40
Note that in this case the phonological features are not as
00:05:43
well depicted as in the case of the control subject,
00:05:47
which is related mainly to the presence of the disease in the patient.
00:05:52
For instance, the nasal sound has these peaks here;
00:05:57
also the occlusive sound related to the pronunciation of the consonant is not well defined.
00:06:05
So we aim to detect the difference between pathological
00:06:09
and healthy voices based on these profiles of the phonological classes.
00:06:15
As for ongoing work related to the development of the toolkit,
00:06:20
future versions will include the estimation of phonological classes in
00:06:26
other languages, specifically German and Dutch. I already have the data to
00:06:32
train these models as a result of collaboration with other members of the TAPAS network.
00:06:38
The trained models are currently available as an open-source toolkit
00:06:43
that can be used by all researchers from the network and from the research community.
00:06:51
We also aim to use these phonological features, these phonological posteriors,
00:06:56
to classify Parkinson's disease patients and healthy control subjects,
00:06:59
and also to evaluate the disorder severity of the patients, the neurological state of the patients,
00:07:05
based on the profile of the phonological features. I
00:07:09
have preliminary results regarding this topic: I
00:07:13
have an accuracy around seventy-seven percent classifying Parkinson's disease patients and healthy control subjects,
00:07:19
and also correlation with the disorder severity, the speech severity, of the patients,
00:07:25
with a Spearman correlation around 0.6 between
00:07:29
the features extracted from the phonological profiles and the
00:07:33
severity of the patients. Also, for those who attended
00:07:40
the conference last week, we also developed Apkinson, an Android application
00:07:47
to continuously monitor the motor symptoms and the
00:07:52
severity of the speech deficits of patients with Parkinson's disease.
00:07:57
So we aim also to deploy Phonet into Apkinson,
00:08:01
so we can evaluate the pronunciation of the different phoneme sounds in
00:08:07
a local environment, in the mobile phone of the patients.
00:08:13
On the other hand, this is the second study that I would like
00:08:18
to talk about today. It's called parallel representation learning to classify pathological speech.
00:08:23
The motivation of this work is that, in general,
00:08:27
hand-crafted features extracted in the literature, like jitter, shimmer,
00:08:32
formant frequencies, or prosody features, may not contain enough information to characterize
00:08:39
the speech signals associated with different voice disorders, for instance Parkinson's
00:08:44
disease or patients with cleft lip and palate.
00:08:51
Because of that, methods based on feature representation learning,
00:08:55
especially unsupervised feature representation learning, have the
00:08:58
potential to extract more informative and more suitable features
00:09:03
than those commonly computed in the literature. So we aim
00:09:09
to propose a novel strategy based on unsupervised
00:09:12
representation learning using different architectures of autoencoders to
00:09:16
classify patients with different pathologies, with different voice disorders.
00:09:21
We propose a recurrent autoencoder and a convolutional autoencoder,
00:09:26
which are trained to extract informative features to characterize
00:09:29
the presence of different pathologies in the speech.
00:09:33
In addition, we propose a novel feature set based on
00:09:36
the reconstruction error of the autoencoders in different frequency bands.
00:09:42
The first architecture is a convolutional autoencoder that maps
00:09:47
a mel filter bank spectrogram with one hundred and twenty-eight filters,
00:09:54
computed for a time frame of five hundred milliseconds.
00:09:58
So the aim of this convolutional autoencoder is to map, or
00:10:02
learn, the representation of the spatial distribution of the energy contained within the spectrogram.
00:10:09
The bottleneck representation here, a feature vector with two hundred fifty-six dimensions,
00:10:16
represents a compression ratio of ninety-eight percent compared to the size of the input spectrogram.
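The compression figure can be checked with simple arithmetic once the number of time frames in the 500 ms spectrogram is known. The talk gives the 128 mel filters and the 256-dimensional bottleneck but not the frame count, so the value of 100 frames below (a 5 ms hop) is purely an assumption chosen for illustration; `compression_ratio` is a hypothetical helper name.

```python
def compression_ratio(n_mel, n_frames, bottleneck_dim):
    """Fraction of the input size removed by the bottleneck."""
    input_size = n_mel * n_frames
    return 1.0 - bottleneck_dim / input_size


n_mel = 128          # from the talk
bottleneck_dim = 256  # from the talk
n_frames = 100       # ASSUMPTION: 500 ms at a 5 ms hop; not stated in the talk

ratio = compression_ratio(n_mel, n_frames, bottleneck_dim)
print(f"input size: {n_mel * n_frames}, compression: {ratio:.1%}")
```

Under this assumption the input has 12,800 values and the bottleneck keeps 2 percent of them. A different hop size changes the exact figure but stays close to the ninety-eight percent quoted in the talk for frame counts in this range.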
00:10:23
This is an example of what the spectrogram looks like after training: I can
00:10:28
reconstruct the input spectrogram, removing some unnecessary
00:10:33
information like background noise and other features.
00:10:39
On the other hand, we also propose a recurrent autoencoder to map,
00:10:45
or to model, the temporal evolution of the spectral components of a speech frame.
00:10:51
In each case the input and the output are the same: I have a mel filter bank
00:10:56
spectrogram with one hundred and twenty-eight filters at the input and
00:11:00
at the output, and the bottleneck representation also
00:11:07
has two hundred fifty-six dimensions, which also represents
00:11:11
ninety-eight percent compression with respect to the size of the input spectrogram.
00:11:17
So from both autoencoders, the convolutional and the recurrent
00:11:21
one, we extract two different feature sets. The first one
00:11:24
is the classical one, the bottleneck features, which is the
00:11:28
compressed representation in the middle layer of the autoencoders.
00:11:34
I also propose a new feature set based on the
00:11:38
reconstruction error of the autoencoder in different frequency regions.
00:11:43
Basically, as I have a mel filter bank with one hundred and twenty-eight filters,
00:11:49
we want to extract the reconstruction error, the mean squared
00:11:53
error, in each of the one hundred and twenty-eight frequency bands.
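The error-based feature set can be sketched as one mean squared error per mel band, averaged over the time frames of the spectrogram. The function name `band_errors` and the tiny three-band example are hypothetical; in the setup described in the talk there would be 128 bands, and the reconstruction would come from the trained autoencoder rather than from hand-written numbers.

```python
def band_errors(spec, recon):
    """Mean squared reconstruction error per frequency band.

    spec and recon are lists of time frames; each frame is a list of
    mel-band energies of equal length.
    """
    n_frames = len(spec)
    n_bands = len(spec[0])
    return [
        sum((spec[t][b] - recon[t][b]) ** 2 for t in range(n_frames)) / n_frames
        for b in range(n_bands)
    ]


# Toy example with 2 frames and 3 bands (made-up values).
spec = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
recon = [[1.0, 2.5, 1.0], [1.0, 1.5, 3.0]]
errors = band_errors(spec, recon)  # one feature per band
```

The resulting vector has one entry per band, so it can be used directly as a feature vector for a classifier.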
00:12:00
We want to use those features to classify
00:12:04
different speech pathologies. The autoencoders were trained
00:12:09
in this case as well with the CIEMPIESS corpus,
00:12:12
the corpus of Mexican Spanish with seventeen hours that I described previously.
00:12:19
I have two different experiments with two different test datasets. The first one
00:12:25
is children with cleft lip and palate. There I have one hundred thirty-five children
00:12:31
with surgically repaired cleft lip and palate, and a group of healthy
00:12:36
control subjects. These patients are Colombian
00:12:40
Spanish native speakers, and the children spoke
00:12:44
basically isolated Spanish words that move the different
00:12:49
articulators that can be related to the movement of the velum, specifically
00:12:55
impaired by cleft lip and palate. On the other
00:12:59
hand, I have data for Parkinson's disease classification,
00:13:02
specifically the PC-GITA corpus, which has fifty patients with Parkinson's disease
00:13:07
and fifty healthy control subjects, also Colombian Spanish native speakers.
00:13:13
In this case the speakers pronounced different exercises, like
00:13:18
the diadochokinetic task, which is the rapid
00:13:21
repetition of syllables like /pa-ta-ka/, read sentences,
00:13:25
or a monologue about their daily activities.
00:13:31
First I want to observe how the reconstruction
00:13:35
error of the autoencoders behaves in different frequency regions.
00:13:38
This is for the convolutional autoencoder,
00:13:42
and how different the reconstruction error is for a
00:13:45
Parkinson's disease patient and for a healthy control subject. In this case, for the convolutional autoencoder, there is no
00:13:53
observable difference between the healthy controls and the Parkinson's disease patients. But
00:13:57
for the patients from the other dataset, the cleft lip and palate
00:14:02
patients and the healthy control children, we observe that there is a
00:14:07
difference in the reconstruction error of the autoencoder:
00:14:13
the reconstruction error for the healthy control children is lower than
00:14:17
the one observed for the patients in this case.
00:14:23
And for the recurrent autoencoder we observe that in this case
00:14:28
there is a difference that appears in the case of Parkinson's disease patients
00:14:33
with respect to the healthy control subjects. In this case there is
00:14:38
the effect that, especially in the low-frequency regions,
00:14:43
the spectrogram from the patients is reconstructed with
00:14:47
lower error than for the healthy control subjects. This is mainly because, or we
00:14:54
believe that it's mainly because, patients with Parkinson's disease have
00:14:57
monotonic speech, and monotonic speech
00:15:03
should be easier to reconstruct by the autoencoder than normal healthy speech. And for the case of
00:15:11
cleft lip and palate speech we observe the
00:15:14
same effect as in the previous case: for the children
00:15:18
with the disease, children with cleft lip and palate, the error is higher than for the healthy control subjects.
00:15:25
Regarding the classification of both cleft lip and palate versus healthy
00:15:30
controls and patients with Parkinson's disease versus healthy controls,
00:15:35
we classified the different feature sets. With the convolutional autoencoder, only with the
00:15:40
bottleneck features, we have an area under the ROC curve of
00:15:46
ninety-five percent for children with cleft lip and palate versus healthy controls, and up to
00:15:52
seventy-two percent for patients with Parkinson's disease versus healthy controls.
00:15:56
The highest result in this case for children with cleft lip and palate is observed
00:16:03
with the combination of the error-based features and the bottleneck features
00:16:08
using only the recurrent autoencoder, where I have an area under the ROC curve of 0.95.
00:16:16
With respect to Parkinson's disease versus healthy control subjects, the best result is obtained
00:16:22
with the combination of the recurrent autoencoder, which models the temporal evolution of the spectrogram,
00:16:29
and the convolutional autoencoder, using
00:16:32
both feature sets, the error-based features and the bottleneck
00:16:36
features. In this case we have an area under the ROC curve of 0.84.
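The area under the ROC curve reported for these experiments can be computed without plotting the curve, as the probability that a randomly chosen patient receives a higher classifier score than a randomly chosen control (ties count one half). The sketch below uses that rank-based formulation; the function name `roc_auc` and the toy scores are invented for illustration and are unrelated to the results in the talk.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for patient, 0 for healthy control.
    scores: classifier outputs, higher meaning more likely patient.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: 3 patients and 3 controls with made-up scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

In this toy case 8 of the 9 patient-control pairs are ranked correctly, so the value is 8/9. The pairwise loop is quadratic in the number of speakers, which is fine for corpora of the size described here.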
00:16:42
Regarding this topic, what is the ongoing and
00:16:46
further work? We aim not only to classify
00:16:50
the presence of the disease but also to evaluate the severity of the patients:
00:16:55
for the case of Parkinson's disease patients, the disorder severity,
00:17:00
and for the case of children with cleft lip and palate,
00:17:03
the hypernasality level of the patients.
00:17:07
And also to evaluate other diseases, in this case like
00:17:12
patients with Huntington's disease, larynx cancer, cochlear implant users, and others.
00:17:19
That's everything from me. Thank you. Do you have questions?
