
Transcriptions

Note: this content has been automatically generated.
00:00:00
Okay, welcome everybody. I'm going to say some words about e-therapy.
00:00:07
Tonight I'll present two examples of disorders, then say something about
00:00:15
speech technology, go on to some aspects of designing e-therapy,
00:00:21
and then summarise. Of course I can't cover everything, and I'm very thankful for the talk
00:00:27
right before, which said a lot about therapy; the question here is: how can speech technology help?
00:00:34
So, just for clarification: we have
00:00:41
planning, respiration, phonation and articulation. When we talk about dysfunction
00:00:45
of the neurological process, we talk about language
00:00:47
disorders; when we talk about dysfunction of the phonation, that is a voice disorder;
00:00:53
and finally, for dysfunction of the articulation, we speak about speech disorders.
00:00:58
Now, I picked two different examples, so to speak
00:01:03
on the different ends of these disorders.
00:01:07
The first one is sigmatism. It's a
00:01:12
phonetic disorder, a mispronunciation of the /s/, and the
00:01:15
intelligibility is not very strongly compromised; it's
00:01:20
still normal until the age of five,
00:01:24
or during the second dentition, and it comes from wrong positioning
00:01:28
of the tip of the tongue and the teeth.
00:01:34
So that would be the right way to pronounce the German word "Zebra", and you have it in the dental
00:01:43
and lateral variants.
00:01:44
So,
00:01:46
the thing is: what can e-therapy even do in this situation? Of course, what we need is,
00:01:52
we would tell the person to say a certain word, so that
00:02:00
Oh,
00:02:03
I'm sorry, I took that out before. Well, let's give it a try:
00:02:15
(plays audio examples) so yeah, you hear the slight differences. So basically,
00:02:25
what would the e-therapy do? Well, first of all, we
00:02:28
need to have the person say the word: we tell them, say that
00:02:31
word, and the person says the word, so you need a speech recogniser;
00:02:35
and the speech recogniser already knows what was supposed to be said, so you need the
00:02:41
time alignment; and then you need to go into the place where you say: this is where I'm going
00:02:47
to look, and ask: is the /s/ there, is there an improvement, is that correct, and so on.
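The pipeline just described (recognise the prompted word, time-align it, then inspect the target phone) can be sketched as follows. This is a minimal sketch: the alignment tuple format, the word "Zebra", the scores and the 0.5 threshold are illustrative assumptions, not the speaker's actual system.

```python
# Sketch of the e-therapy check described above: the recogniser already
# knows the prompted word, a forced alignment locates each phone, and we
# inspect only the phone under therapy. The alignment format
# (phone, start_sec, end_sec, acoustic_score) and the 0.5 threshold are
# illustrative assumptions.

def check_target_phone(alignment, target="s", threshold=0.5):
    """Return (found, ok): was the target phone realised, and well enough?"""
    for phone, start, end, score in alignment:
        if phone == target:
            return True, score >= threshold
    return False, False

# Hypothetical alignment of the prompted word "Zebra" /ts e b r a/,
# as a forced aligner might emit it.
alignment = [
    ("ts", 0.00, 0.12, 0.81),
    ("e",  0.12, 0.25, 0.92),
    ("b",  0.25, 0.31, 0.88),
    ("r",  0.31, 0.40, 0.74),
    ("a",  0.40, 0.55, 0.95),
]
found, ok = check_target_phone(alignment, target="ts")
print(found, ok)
```

In a real system the alignment would come from the recogniser's forced-alignment pass against the known prompt, and the score threshold would be tuned per exercise.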
00:02:53
Now let me look at another, very different one:
00:02:59
an example of a different disorder. (Plays a speech sample of a patient.)
00:03:38
What can speech recognition do here? It's a very,
00:03:42
very much more difficult problem, because you don't really ask whether it was pronounced
00:03:49
correctly, whether the articulation was in order; the problem is: did he say a word
00:03:53
at all? And that gives you, as a speech recognition system, the problem:
00:03:58
did he say a word out of my lexicon, or is it something out of vocabulary?
00:04:03
I need a very different approach to the technology if I want to
00:04:07
use that for a patient with something like that.
00:04:12
And, you know, depending on the disorders
00:04:16
that we have (voice, articulation, semantics, morphology, pragmatics),
00:04:21
some of them might be affected and others not, and that has
00:04:26
an impact on the speech technology that you're going to use for the therapy in that case.
00:04:31
Because in some cases it suffices to say: okay, I'm going to
00:04:37
tell you, this is the exercise that you have to do, repeat these words.
00:04:43
And in other situations it doesn't matter so much, for instance if you want
00:04:47
to do therapy on things like memory skills or so;
00:04:51
in that case, memory-skill exercises place a very different requirement on word recognition.
00:05:00
And if we look at what speech technology can do in the medical domain:
00:05:06
well, for diagnosis, "how intelligible is the patient" would be a holistic impression.
00:05:13
Okay, if I just want to find out how intelligible, then I need an evaluation measure; if
00:05:18
I want to find out whether, say, the palate is paralysed, that's a very distinct task,
00:05:23
and in that case I very often need a different measure. But
00:05:27
for intelligibility we have shown, and many groups have shown, that
00:05:31
a speech recogniser can replace the human listener, the naive listener: you just give
00:05:39
the person a text, he reads it, and you just see how many words were recognised correctly,
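The measure just described (have the patient read a known text and count how many words the recogniser gets right) can be sketched with a standard word-level Levenshtein alignment; the example sentences are invented, using the familiar "North Wind and the Sun" opening mentioned later in the talk.

```python
# Sketch of the intelligibility measure described above: the patient
# reads a known text, a recogniser transcribes it, and we count how many
# reference words come out correctly via a word-level edit distance.

def word_accuracy(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution / match
    return 1.0 - d[len(ref)][len(hyp)] / len(ref)

ref = "the north wind and the sun were disputing"
hyp = "the north wind on the sun were disputing"   # recogniser output
print(word_accuracy(ref, hyp))
```

A higher word accuracy of the naive recogniser then stands in for higher intelligibility of the speaker.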
00:05:42
and it's pretty good at that. Now, if
00:05:46
you want to analyse, say, nasality, for that aspect you might need
00:05:50
not only "can I recognise it, how intelligible is it"; I also need phonological features, I need
00:05:57
features that give me phonological realisations, trained on plus/minus nasal.
00:06:03
Okay, so that's the diagnosis. Then there is therapy
00:06:09
control: has the situation of the patient improved during therapy?
00:06:13
Well, in some way that's the same thing as diagnosis, except now I need a user model, where you
00:06:21
say: okay, you started here, and this is where he's going. But otherwise,
00:06:25
apart from that, it is the same thing as diagnosis:
00:06:29
you diagnose the person at a
00:06:32
certain moment, and then three weeks later, and another three weeks later,
00:06:37
with the same thing, and then you have the user model to say: okay, now I can detect a change.
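The incremental user model just described can be sketched as a per-patient session history with a change test on top. The scores and the simple "two standard deviations from baseline" criterion are illustrative assumptions, not the speaker's actual model.

```python
# Sketch of the incremental user model: store one evaluation score per
# session (e.g. every three weeks, same test) and report whether the
# latest score has moved away from the starting point.

class UserModel:
    def __init__(self, patient_id):
        self.patient_id = patient_id
        self.scores = []                     # one evaluation score per session

    def add_session(self, score):
        self.scores.append(score)

    def change_detected(self, k=2.0):
        """Has the latest score moved k baseline-std away from session one?"""
        if len(self.scores) < 3:
            return False
        baseline = self.scores[0]
        spread = _std(self.scores[:-1])
        return abs(self.scores[-1] - baseline) > k * max(spread, 1e-9)

def _std(xs):
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

# Hypothetical intelligibility scores from four sessions of one patient.
um = UserModel("patient-001")
for score in [55.0, 56.0, 54.5, 63.0]:
    um.add_session(score)
print(um.change_detected())
```

The same structure serves therapy control (one patient over time) and, with two groups of such histories, comparing therapy methods.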
00:06:44
Of course, when you have different therapy methods, you can say which therapy
00:06:48
method leads to the best result for a group of patients, which is
00:06:52
again, in some way, therapy control, just now with two groups.
00:06:58
Monitoring: is there a change in the patient's situation? In a way, the same
00:07:04
underlying technology; you just need to tweak the parameters, because
00:07:08
here you're looking for mild, for minute changes, whereas in the therapy you're looking for "okay, I can see a strong change".
00:07:16
Screening: does the person show early signs of, for instance, a depression, yes or no?
00:07:22
Screening is a little bit of an outlier here, because
00:07:25
here you look much more at controls versus patients. And
00:07:30
finally, that's what we're talking about here: computer-assisted therapy.
00:07:33
Did the patient perform the exercise correctly? Of course that has a lot to do
00:07:39
with diagnosis, and people call the underlying technologies the same, but
00:07:44
what is new here, what is really different, is that the patient matters much more,
00:07:50
and what you say to the patient matters: you know, I
00:07:57
have to encourage the person to keep on going, with gaming effects or so, you know;
00:08:02
the motivation is something which is much more important, and out of
00:08:06
that we see that the graphical user interface is
00:08:10
much more important than in a situation of, say, therapy control.
00:08:15
In therapy control I have to graphically show
00:08:20
the therapist: okay, you know, the person was here, now he is
00:08:25
here; with the patient, that's on a whole different level.
00:08:32
So let's say a little bit about speech technology.
00:08:35
I will address word and phoneme recognition,
00:08:40
acoustic speaker modelling, prosodic analysis and visualisation,
00:08:44
and I will say a little bit here and there; you can vary the technology: I can
00:08:49
swap this net for that classifier, it doesn't really
00:08:54
matter that much; I mean, the idea is the underlying approach.
00:09:00
And I like this figure, which is from
00:09:06
Michael Picheny, who was at IBM:
00:09:10
how, from 1992, when Switchboard was
00:09:13
introduced, over thirty years the recognition error
00:09:18
changed. Now, this is a recognition problem that is,
00:09:23
I'd say, well known; it's difficult, and the recognition rate is still far removed from the
00:09:30
medical domain, but still. And granted that this is a logarithmic scale:
00:09:36
what we see is, we started thirty years ago with about eighty percent error,
00:09:41
and then it improved fast with adaptation of the hidden
00:09:46
Markov models and improved hidden Markov model training, and it got down to about
00:09:52
twelve percent; and now, in the last three or four years, the impact of deep learning really got
00:09:58
us down to five percent. And granted, the recognition rate on other
00:10:04
tasks, that we're not as familiar with, that weren't worked on that hard for thirty years, where not as
00:10:10
many people have worked, might be a little bit worse;
00:10:16
but on other tasks too it did bring us closer to human performance. So
00:10:21
what we have right now is technology that really is close to
00:10:26
human performance, and that we can start really thinking about using.
00:10:32
And, you know, for speech technology, when you have a speech or phoneme recogniser:
00:10:36
there are end-to-end systems by now, but still the majority is hidden-Markov-
00:10:41
model-based off-the-shelf technology; you have hidden Markov models,
00:10:47
and in the medical domain it might be helpful to be able to adapt
00:10:54
them with a small amount of acoustic and language-model training data;
00:11:01
and typically you take mel-cepstrum coefficients and energy and their derivatives.
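The front end just named (mel-cepstrum coefficients plus energy and their derivatives) can be sketched in plain NumPy; all parameter values here (25 ms windows, 10 ms shift, 24 filters, 12 cepstra) are common defaults, not the speaker's configuration.

```python
import numpy as np

# Sketch of a standard MFCC front end: framing, power spectrum, mel
# filterbank, log, DCT, plus log-energy and first-order deltas.

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_with_deltas(signal, sr=16000, n_filt=24, n_ceps=12):
    n_win, n_hop, n_fft = int(0.025 * sr), int(0.010 * sr), 512
    frames = np.array([signal[s:s + n_win] * np.hamming(n_win)
                       for s in range(0, len(signal) - n_win + 1, n_hop)])
    spec = np.abs(np.fft.rfft(frames, n_fft)) ** 2            # power spectrum
    # Triangular mel filterbank between 0 Hz and Nyquist
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filt + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fbank = np.zeros((n_filt, n_fft // 2 + 1))
    for i in range(1, n_filt + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    logmel = np.log(spec @ fbank.T + 1e-10)
    # DCT-II decorrelates the filterbank outputs -> cepstral coefficients
    n = np.arange(n_filt)
    dct = np.cos(np.pi * np.outer(np.arange(1, n_ceps + 1), 2 * n + 1) / (2 * n_filt))
    ceps = logmel @ dct.T
    energy = np.log(spec.sum(axis=1) + 1e-10)[:, None]
    feats = np.hstack([ceps, energy])
    # First-order derivatives ("deltas"), zero-padded at the start
    delta = np.vstack([np.zeros((1, feats.shape[1])), np.diff(feats, axis=0)])
    return np.hstack([feats, delta])

t = np.arange(16000) / 16000.0
feats = mfcc_with_deltas(np.sin(2 * np.pi * 220 * t))         # 1 s test tone
print(feats.shape)
```

Each 10 ms frame thus becomes a 26-dimensional vector (12 cepstra + energy, each with a delta), which is the kind of observation stream the hidden Markov models above consume.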
00:11:07
But the good thing is, there's really good word recognition out there that you can use.
00:11:16
For example, one of our partners, the EML, the
00:11:20
European Media Lab: if you really want to use a recogniser,
00:11:24
some of you PhD students might think about spending some time there; they
00:11:29
will allow you to use a speech recognition system and to adapt the language model.
00:11:37
Well,
00:11:39
the possibilities are: you either train
00:11:44
your own, or you use an off-the-shelf
00:11:46
recogniser, and you might want, or have, to do some adaptation of the acoustic modelling.
00:11:54
If in the therapy you need to learn to do some of the acoustic modelling, it might be
00:12:01
beneficial to have access to the acoustic model, which in this case might mean it is good to
00:12:09
train a standard recogniser, using Kaldi, and then take some
00:12:15
of your pathologic speech samples and adapt with those, which
00:12:21
means that your speech recogniser now goes from a naive listener,
00:12:29
who is trained on a human speech model, for instance for German or for Dutch,
00:12:36
to a more expert listener, who will be trained like
00:12:40
people who already have heard some of the pathology that you're looking for.
00:12:46
And typically you use standard speaker adaptation for that.
00:12:52
Now, language modelling depends on the kind of spoken text. If you have a known text,
00:12:58
you might want to restrict the vocabulary, a restricted vocabulary, because
00:13:04
you know he's always reading "The North Wind and the Sun",
00:13:07
you know he's saying isolated words, you know he's always reading sentences;
00:13:10
so you might as well restrict your vocabulary
00:13:19
to the words of that test and still see how good the recognition is.
00:13:24
If you have an unknown text, on the other hand, you can
00:13:31
replace one of the human raters by a medium-size-vocabulary
00:13:36
recognition system in that language; a medium size will do the job.
00:13:42
You might, in that case, just use a very stupid language
00:13:46
model, because the language model tells you what sequence of words
00:13:51
comes in what order; and in that case, if you're looking at
00:13:55
the acoustics, you want to put the emphasis on how he realises the sentence
00:14:00
if I give him the sentence to read. So you might switch the language model.
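The "stupid language model" point can be sketched like this: a recogniser scores each candidate as acoustic log-score plus language-model log-probability, so a uniform LM over a restricted vocabulary makes the acoustics alone decide. The candidate words and all scores are invented for illustration.

```python
from math import log

# Sketch: decoding as argmax of acoustic + language-model log-score.
# With a uniform LM the LM term is constant, so only the acoustics count.

def decode(candidates, acoustic, lm):
    """Pick the candidate maximising acoustic + LM log-score."""
    return max(candidates, key=lambda w: acoustic[w] + log(lm[w]))

vocab = ["zebra", "sebra", "thebra"]          # tiny restricted vocabulary
acoustic = {"zebra": -4.0, "sebra": -3.5, "thebra": -6.0}   # invented scores

uniform_lm = {w: 1.0 / len(vocab) for w in vocab}
print(decode(vocab, acoustic, uniform_lm))    # acoustics alone decide

biased_lm = {"zebra": 0.98, "sebra": 0.01, "thebra": 0.01}
print(decode(vocab, acoustic, biased_lm))     # a strong LM can override
```

This is exactly why, when you care about how the sentence was realised acoustically, you weaken the language model rather than let it paper over the patient's deviations.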
00:14:06
If, on the other hand, you want to see how a person with dementia is saying
00:14:11
these words, of course the language model will be very important, and you can't do that. So spontaneous speech
00:14:17
means basically you want to see how the person produces the speech:
00:14:23
what are the words, what is the underlying structure,
00:14:26
what is the underlying semantics; in that case you want to be as
00:14:31
error-free as possible and then do an analysis based on the transcription.
00:14:37
So, based on what you do in your therapy, it has an impact on
00:14:43
how we're going to design the phoneme or word recognition part.
00:14:50
Now,
00:14:52
another thing is that you might not want to go and use word recognition, but rather look at the
00:14:59
acoustics directly. In this case you model the acoustic space of speakers: you can
00:15:07
model an acoustic space, where I have all the people who speak English, then I
00:15:12
have some non-natives, some women, some men; I can model the variation.
00:15:17
So the space represents the multidimensional characteristics of voice and speech,
00:15:24
and the degree of pathology varies in this acoustic space; it's one dimension in the acoustics.
00:15:31
And if I keep certain other ones constant, so I only look at
00:15:36
young people, and I only look at, say, young controls and others,
00:15:43
then I can say: okay, if they
00:15:45
all say the same thing, then one of the variations in the acoustic space will be according to the pathology,
00:15:52
and then i want to try to find characteristics of the degree of the speech or speeches or
00:15:58
and so as i said you know one when i hear it is taken from speaker recognition systems
00:16:05
and and you can replace now the u. b. m. with the
00:16:09
neural network but basically they all do the same idea
00:16:13
they say i'm gonna model the space and i'm gonna model the deviation
00:16:17
whether that is an eye vector or a change in a universal background model it doesn't matter you know
00:16:24
um so you you one of the acoustics for instance back out
00:16:28
mixture model you train a universal background model with normal speakers
00:16:33
your training gaussian mixture model with a have a lot six pieces and then you transform
00:16:40
this model into a vector and then you do a classification or regression
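The GMM-UBM supervector idea just outlined can be sketched with scikit-learn and synthetic data: train a UBM on "normal" frames, MAP-adapt the mixture means to each speaker, and stack the adapted means into one fixed-dimensional vector. The relevance factor, mixture count and data are conventional or invented choices, not the speaker's setup.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Sketch of the GMM-UBM supervector: UBM on normal speakers, relevance-
# MAP adaptation of the means per speaker, means stacked into a vector.

rng = np.random.default_rng(0)

def map_adapt_means(ubm, X, r=16.0):
    """Relevance-MAP adaptation of the UBM means to speaker data X."""
    post = ubm.predict_proba(X)              # responsibilities, (T, K)
    n_k = post.sum(axis=0)                   # soft counts per mixture
    ex_k = post.T @ X                        # first-order statistics
    alpha = (n_k / (n_k + r))[:, None]       # adaptation weights
    mu_hat = ex_k / np.maximum(n_k, 1e-8)[:, None]
    return alpha * mu_hat + (1 - alpha) * ubm.means_

def supervector(ubm, X):
    return map_adapt_means(ubm, X).ravel()   # stack adapted means

# "Normal" frames: the UBM training pool (synthetic 5-dim features).
normal = rng.normal(0.0, 1.0, size=(2000, 5))
ubm = GaussianMixture(n_components=8, random_state=0).fit(normal)

# One healthy and one "pathological" speaker (shifted acoustics).
healthy = rng.normal(0.0, 1.0, size=(300, 5))
patho = rng.normal(1.5, 1.0, size=(300, 5))

sv_h, sv_p = supervector(ubm, healthy), supervector(ubm, patho)
print(sv_h.shape)                            # 8 mixtures x 5 dims = 40
```

The pathological speaker's supervector sits further from the UBM means than the healthy one's, which is exactly the deviation the classifier or regressor then exploits.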
00:16:46
So, say this is my feature dimension:
00:17:00
okay, I start training a system, and basically the variation here is
00:17:07
the acoustic variation of the underlying phones, because that's the strongest deviation.
00:17:14
So now I have a model: I have my mean vectors and my deviations, my covariance matrices.
00:17:22
Then I just take some pathology (could be one person, or a pathology pool)
00:17:31
and I train the adaptation: how do these underlying phonemes now deviate?
00:17:36
Then I have this model, and I can transform it, just taking all the mean
00:17:40
vectors, or all the covariance matrices, and putting them into one big vector.
00:17:45
So now I ask the person to speak something, and out of a certain utterance
00:17:51
I have a fixed-dimension vector, and now I can use that
00:17:58
to look at the differences between my groups. Now, either I have my supervectors and I have two
00:18:04
groups, like the pathology you want to detect and probably a control group, and I want to classify:
00:18:11
in this case I pick some classifier, which will train me two subspaces; I have a new
00:18:19
person, I just look what type he is, and I classify him based on that.
00:18:19
The other thing is: notice that the dimensions, the
00:18:27
dimensions of my supervector space, are in this case only the axes,
00:18:31
and the pathology is the variable.
00:18:36
And now what I want to estimate is not to classify healthy or not; I want to classify how strong the degree is.
00:18:45
Again I train my supervectors, but now I train some kind of regression: I
00:18:53
create a supervector for the patient's speech, and I estimate the degree of pathology.
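The regression step can be sketched as follows: severity is assumed to shift the supervectors along one direction of the space, and a support vector regressor learns to read the degree back off. The data is synthetic and the "expert severity scores" are invented.

```python
import numpy as np
from sklearn.svm import SVR

# Sketch: estimate the *degree* of pathology from supervectors, instead
# of a healthy/pathological classification. Synthetic data: severity
# moves the vectors along one direction, plus noise.

rng = np.random.default_rng(1)
dim = 40
direction = rng.normal(size=dim)
direction /= np.linalg.norm(direction)

severity = rng.uniform(0, 1, size=200)            # invented expert scores
X = severity[:, None] * direction + rng.normal(scale=0.1, size=(200, dim))

reg = SVR(kernel="linear").fit(X[:150], severity[:150])
pred = reg.predict(X[150:])
rho = np.corrcoef(pred, severity[150:])[0, 1]     # agreement with "experts"
print(round(rho, 2))
```

The same trained regressor, applied to the same patient week after week, is what turns the acoustic model into a therapy-control measure.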
00:19:01
Okay, now this is the underlying technology that you use to say:
00:19:06
did you perform it correctly, has it improved from the last session to the new one?
00:19:11
Basically, you measure the degree. Well, what do you need then for
00:19:16
the e-therapy? Well, you need your user model to say:
00:19:22
the last time you performed these exercises like that, now we've improved; and
00:19:29
the underlying technology is a regression that estimates the degree for the week.
00:19:37
Um,
00:19:39
things that we do in this direction
00:19:43
are that we calculate acoustic or prosodic
00:19:48
or phonological features, and these are the underlying dimensions for our supervector.
00:19:55
So, prosody is rhythm, intonation and stress-related attributes; we
00:20:02
compute these on word level, across several words, across
00:20:06
syllable nuclei, in which case we need
00:20:09
automatic speech recognition. What we also can do is just say, well, the hell with it, we'll just
00:20:16
measure them over ten-millisecond frames and then compute functionals over that.
00:20:21
So in that case we have something like the mean F
00:20:25
zero and the standard deviation, and
00:20:31
we can have local features, like parts of before and after segments;
00:20:35
we can then calculate those functionals, like mean, standard deviation, maximum, minimum.
00:20:42
We can have global features, so independent of what the person
00:20:48
says I'm going to have something like the jitter and the shimmer,
00:20:52
or the voiced/unvoiced characteristics. I get about two hundred features per
00:20:58
utterance, and then I take the functionals, and I get about
00:21:03
four, five, six thousand features; you get functionals like the position of the third quartile,
00:21:10
the first quartile, standard deviations, kurtosis and so on; you can get a huge feature vector.
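The "measure every ten milliseconds, then compute functionals" idea can be sketched for an F0 track; the synthetic track, the zero-means-unvoiced convention, and the simplified jitter-like measure are illustrative assumptions (real F0 would come from a pitch tracker).

```python
import numpy as np

# Sketch: utterance-level functionals over a frame-wise F0 track
# (0 = unvoiced frame): mean, spread, extremes, a jitter-like
# frame-to-frame perturbation, and the voiced fraction.

def f0_functionals(f0):
    voiced = f0[f0 > 0]
    diffs = np.abs(np.diff(voiced))
    return {
        "mean_f0": voiced.mean(),
        "std_f0": voiced.std(),
        "min_f0": voiced.min(),
        "max_f0": voiced.max(),
        # mean absolute frame-to-frame change relative to mean F0
        # (a simplified, jitter-like perturbation measure)
        "jitter_like": diffs.mean() / voiced.mean(),
        "voiced_fraction": len(voiced) / len(f0),
    }

t = np.arange(200) * 0.010                       # 2 s of 10 ms frames
f0 = 120 + 10 * np.sin(2 * np.pi * 0.5 * t)      # slowly varying pitch
f0[::7] = 0.0                                    # some unvoiced frames
feats = f0_functionals(f0)
print(sorted(feats))
```

Stacking such functionals over many low-level descriptors (F0, energy, spectral measures, voicing) is how the roughly two hundred per-frame features become the several-thousand-dimensional utterance vector mentioned above.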
00:21:17
And if you look at one of our partners, the University of Augsburg, you cannot go
00:21:23
wrong: there you can get a feature vector that will do the job for you.
00:21:28
It's a six-thousand-dimensional feature vector; it's a good start, and then you can look at which
00:21:35
part of this vector is helping. Okay, but it's a good start: you can download openSMILE.
00:21:42
It's been used for ten years in the
00:21:48
paralinguistic challenges and always produced competitive results, and then
00:21:53
the work begins, because then you have to say
00:21:56
which feature really shows me something in my situation, and can then be used
00:22:01
as an indicator that you did that correctly, or that there's an improvement.
00:22:07
And another one: if you go to this platform here, you can
00:22:13
get phonological features; granted, they were trained on English, but they work amazingly
00:22:18
well even in other languages, and if you do
00:22:23
have good data, you can retrain the whole thing on your own data.
00:22:30
Again, what you basically get is: you can transform a certain part of the speech signal into
00:22:37
a feature vector that says how nasalised was that sound, how /s/-like was that sound,
00:22:43
and then you create functionals on top and say how strong the nasalisation is over the complete utterance.
00:22:51
So, basically, what do we use for evaluation? Well,
00:22:56
word accuracy and word correctness for intelligibility; we have
00:23:01
calculated features based on the acoustic models and on the prosodic models;
00:23:07
we can use correlations like Spearman or Pearson,
00:23:13
based on these calculated features or on the word accuracy and
00:23:16
word correctness, and compare with human listeners;
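The comparison with human listeners can be sketched with plain NumPy: correlate a machine measure (per-speaker word accuracy) with listener ratings using Pearson, and Spearman as Pearson on the ranks. The scores below are made-up illustration values, not data from the talk.

```python
import numpy as np

# Sketch: validate a machine intelligibility measure against human
# listener ratings with Pearson and Spearman correlation coefficients.

def pearson(x, y):
    return np.corrcoef(x, y)[0, 1]

def spearman(x, y):
    # Spearman = Pearson computed on the ranks of the data
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return pearson(rank(np.asarray(x)), rank(np.asarray(y)))

word_accuracy = [92.0, 85.5, 70.2, 55.0, 40.3, 88.1]   # machine, per speaker
listener_score = [4.8, 4.1, 3.2, 2.0, 1.5, 4.5]        # human ratings

print(round(pearson(word_accuracy, listener_score), 2))
print(round(spearman(word_accuracy, listener_score), 2))
```

A high correlation on held-out speakers is what justifies replacing the naive human listener with the recogniser in the first place.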
00:23:20
we can classify based on these calculated features, or interpret which of
00:23:26
these features are relevant, which of these features were the most relevant.
00:23:31
Now, the other thing,
00:23:35
or the last thing, I want to get to within this section is
00:23:40
how to visualise. You know, if a therapist asks "why do you
00:23:44
say that's the problem, have they improved enough, tell me",
00:23:47
well, "because I can look at the variation in the seventh mel-cepstrum coefficient": he's not going to believe that.
00:23:54
Okay, so basically visualisation is important, but keep in mind the visualisation
00:23:59
can be either for the therapist or for the patient.
00:24:07
So basically, when we transform the speaker into a vector, this speaker
00:24:14
can be seen as a point in a very high-dimensional space.
00:24:18
Okay, I have a huge vector; it has mel-cepstrum features, prosodic features, whatever.
00:24:24
I find out, okay, this is an important one; but even if I reduce
00:24:27
the dimensions, I still have a hundred-dimensional feature space.
00:24:32
That's pretty hard to argue with. So basically, what helps a lot,
00:24:36
what I can recommend, is you transform that into a lower-dimensional space,
00:24:41
and then you explain it; you say, okay, I need a lot of dimensions
00:24:47
to get into that, but now, if I try to reduce it to two
00:24:51
with certain techniques, then I can show you a group of people, and
00:24:57
where my patient is. Okay, and I did that as an example.
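The projection idea can be sketched with plain PCA via the SVD (the talk leaves the exact dimensionality-reduction technique open); the speaker vectors are synthetic, with one group shifted to stand in for a pathology.

```python
import numpy as np

# Sketch: learn a 2-D projection of high-dimensional speaker vectors
# from reference groups, then drop a new patient into the same map.

rng = np.random.default_rng(2)
young = rng.normal(0.0, 1.0, size=(50, 100))      # reference group
patho = rng.normal(2.0, 1.0, size=(50, 100))      # pathological group

X = np.vstack([young, patho])
mean = X.mean(axis=0)
# PCA: top-2 right singular vectors of the centred data
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
project = lambda v: (v - mean) @ vt[:2].T

Y = project(X)                                     # 2-D map of both groups
new_patient = rng.normal(2.0, 1.0, size=100)       # projected into the map
p = project(new_patient)
print(Y.shape, p.shape)
```

Plotting `Y` with one colour per group, plus the point `p`, gives exactly the kind of map a therapist can read: here are the groups, here is your patient, here is where the therapy moved him.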
00:25:03
We started with a young reference group, and we took some
00:25:13
mel-cepstrum coefficients, because with those both voice and speech are characterised. Now,
00:25:21
it's interesting to see: okay, here are the males, here are the females; the dimension nicely separates the two groups.
00:25:24
So now we added
00:25:29
an old reference group, and in that case only male speakers.
00:25:33
So now you still have that
00:25:35
subspace,
00:25:42
and then the space for the new speakers: how do they come out? Well, the acoustic difference
00:25:50
gives them a new subspace that is close to the male subspace, because they're all elderly men. Okay,
00:25:55
now I place people with a voice pathology, in this case
00:26:01
speakers after a laryngectomy, so
00:26:06
of course most of them are elderly,
00:26:13
and in this case they were all male; that's why we added the old control speakers. And where do they land? Well, they cannot
00:26:22
be right in it, but somewhere close to them, which is very nice: yeah, that's where my laryngectomees come out.
00:26:22
Now I add some chronically hoarse
00:26:26
people; well, you know, I don't expect them to be here, I expect them somewhere here, and that's exactly what happens.
00:26:34
See: here are the laryngeal voices, here the elderly speakers,
00:26:41
here the young speakers, here your chronically hoarse speakers, nicely separating the male from the female.
00:26:49
And if I have a transformation like that, and I project a new patient into it,
00:26:55
I can show that to somebody who doesn't know anything about mel-cepstrum coefficients;
00:27:01
I can still say, or tell them: okay, this is where you started, and this is where the therapy brought you.
00:27:08
Okay. So basically, you need to find a way to transform
00:27:17
a speaker's state, or the speaker space, which is very high-dimensional, into something low-dimensional,
00:27:23
and then you say: okay, this is speaker group one, this is speaker group two, this is speaker group three;
00:27:29
this is where you started, this is where you were after three months, after six months.
00:27:34
So this visualisation is very, very important, because you cannot argue
00:27:39
with certain features, unless you have a very good feature for exactly one exercise.
00:27:45
When you say, okay, say that isolated vowel, say it as long as you
00:27:50
can, and then you say, okay, your phonation time was five seconds, six seconds, seven: fine.
00:27:56
But very often you have a high-dimensional space, in which case visualisation is important.
00:28:05
So, when we have an exercise with isolated vowels, we need phoneme recognition, and we compute
00:28:12
acoustic and prosodic features, like, you know, jitter or shimmer. When we have unknown text,
00:28:20
technology can give us a naive listener: how intelligible is that
00:28:27
unknown text to a listener (of course, not to the speaker). When we have
00:28:31
known text, we kind of can simulate an expert listener;
00:28:35
we need word recognition, we need acoustic speaker modelling, we need prosodic analysis.
00:28:42
And when we have spontaneous speech, we might need prosodic analysis,
00:28:47
as we saw in the dementia showcase: you know, how long the pauses are is very, very important;
00:28:53
we also need a syntactic-semantic analysis, based on the word recognition.
00:28:58
So word recognition is only the tool; you then look at the
00:29:01
words, and you simulate: okay, assume that I have
00:29:06
sixty, seventy, eighty, ninety percent correct.
00:29:12
In my opinion it's still unsolved how good the word recognition needs to be; for instance, for
00:29:17
dementia detection we don't need a hundred percent word accuracy.
00:29:21
I think, you know, I
00:29:25
don't have the data, but I would predict that
00:29:28
somewhere at eighty-five percent you can stop,
00:29:32
because of those fifteen percent... don't try to go to a hundred percent, you know,
00:29:38
don't strive for word recognition that doesn't make errors, and then still assume "I have a hundred percent".
00:29:46
On the other hand, you know, if you have...
00:29:51
and that leads me to the next subject:
00:29:55
if you have a two-class problem, like "did the patient
00:30:01
perform something correctly", it's a two-class problem, and you have eighty-five
00:30:06
percent, that's pretty good, or eighty percent; but eighty percent means in one
00:30:10
out of five practice exercises you tell the person the wrong thing,
00:30:15
and that is wrong,
00:30:17
because the patient will notice. So it's much more important how you do the
00:30:22
intervention. Okay, so, you know, if you look
00:30:28
at what a teacher does: what does a good teacher do? He immediately adapts to the level of the person,
00:30:34
and then he makes corrections in a certain amount. Like, we had a
00:30:41
language teaching tool, and we asked people, the teachers, to say: when would you intervene?
00:30:47
And
00:30:49
they listened to the material and they said, okay, I would not accept this pronunciation; and the agreement was horrible.
00:30:56
What was interesting, though: every single teacher, on average
00:31:01
over everything, got something like five percent,
00:31:07
which means, you know, on average my teacher will interrupt and say "no,
00:31:11
that was not correct" in one out of twenty cases.
00:31:16
Okay, so the five percent mark was the same when you look at all the teachers, which means
00:31:23
you can let a wrong pronunciation pass sometimes and
00:31:28
still be encouraging; and that's another thing: you
00:31:32
don't want to do the same thing for the therapist; what you tell the therapist, so the visualisation, has to be different.
00:31:40
Okay, so let me come to some aspects of when we design e-therapy. We designed, a couple of years ago, a tool.
00:31:46
It had to be child-appropriate,
00:31:51
according to the age; so, you
00:31:56
know, the children loved animals, so all
00:32:00
the words that we trained had something to do with animals; it was that kind of
00:32:07
easy interface, and it was done together with the therapists.
00:32:14
And of course, you know, you need a child-appropriate response;
00:32:19
we played around a lot with smileys, you know.
00:32:24
And the speech technology we used was phoneme and word recognition: basically we took
00:32:30
the word (did you say the word?), then we went to the phoneme:
00:32:34
how good was the phoneme? And some phonological knowledge.
00:32:40
What we're currently working on is a new e-therapy tool,
00:32:46
and I'm just going to show some of the work we're planning; we're currently implementing it.
00:32:53
What we say is: it's got to be big enough, so we don't train on the smartphone, we use a tablet.
00:33:00
And what we're planning to do is make it simple for the people:
00:33:04
so we have the training, which first fetches the data, and the upload then goes onto the server;
00:33:13
and the training always tells us: where are you in the
00:33:17
training, what is the next exercise? And then let's start.
00:33:23
And then, you know, you have a display where the therapist
00:33:30
gives you the exercise, and then you read it.
00:33:35
And we have a whole battery of exercises, and what we believe is that it's got to be good
00:33:43
for the therapist and for the patient, so the therapist should be able to say: okay, I can personalise it.
00:33:50
So, in order to personalise it, you say: okay, we have a whole big battery
00:33:54
of therapy exercises, so you have something like type-token ratio, articulation,
00:34:00
lip closure, or sustained vowel (how long?), perturbation, lip shaping and so on.
00:34:05
So in this case we have a camera and a microphone,
00:34:10
and the therapist says: okay, do this exercise; and
00:34:14
then you see what you are supposed to do, and you do it.
00:34:21
The patient can view certain aspects, and like I said, in
00:34:26
that case it's got to be very simple, like the duration of the vowel;
00:34:30
so it's got to be something where you say: okay, you know, you get that now for
00:34:36
seven weeks, and every week you get a bit better; you started here, and now, very good.
00:34:43
So the patient can view the results and compare the performance
00:34:47
over time, and it helps to keep them motivated. Okay.
00:34:53
The results are also visible, both for the physician and the speech therapist;
00:35:00
and what we plan to do, for privacy reasons, is do the evaluation only on the device.
00:35:09
Um, so,
00:35:12
it's important to keep the motivation high; it's much more important to keep
00:35:18
the motivation high than to say "you got this wrong, listen to me, let me repeat".
00:35:22
So when you set the threshold, you should rather say
00:35:27
"okay, good, let's continue", instead of saying "repeat it" three, four, five
00:35:32
times, because then the person says "I can't do it", and that's it.
00:35:37
It's also important to give the feedback: you know, you're doing well, you're getting better.
00:35:43
Now, for the therapist it's also important to monitor the patient
00:35:47
over time, but also to project them into this patient space,
00:35:50
as I indicated before: where is he, where was he, is he making progress,
00:35:54
is this improvement good after three months, or is he,
00:36:01
compared to the other people, much slower in the improvement, or is he more constant?
00:36:08
And so you have the results for two
00:36:12
different evaluations: for the patient, the results relative to the previous
00:36:17
results of the same patient; and for the therapist, you take into account also the patient group.
00:36:25
um
00:36:27
i think it important thing is also the the the the training and have to have some
00:36:33
some uh uh update information so the the so it it it should be able you should be able to contact
00:36:40
the therapist the therapist should be able to download after he
00:36:45
sees a result download the exercises and and it should
00:36:49
also have an alarm from some function hey you didn't do your exercise if you wanna do it now
00:36:55
so new patients have to be registered easily, so the interface for the therapist:
00:37:02
very often therapists are not very keen on technology, so it should be easy to register
00:37:09
and it should be easy to download the training files, so we
00:37:14
have a hierarchy of exercises, and what we're planning to do
00:37:19
is to say: okay, now I want to practise this, I start with this exercise,
00:37:25
let me individualise the plan for the next four or five weeks, for the words
00:37:32
so let me summarise: the disorders affect different linguistic levels,
00:37:39
speech technology can guide through therapy, for the e-therapy we need an incremental
00:37:45
user model, which we don't need for diagnosis and don't need
00:37:49
that much for therapy control, so the incremental user model has to update
00:37:55
itself after each week or each exercise, the patient has to be encouraged,
00:38:02
too many corrections are discouraging, so rather have,
00:38:07
you know, remember the faces: the one that
00:38:09
smiled, the one that didn't smile, and the one in between, "uh-huh, but okay, let's keep on"
00:38:16
third, it has to have the possibility to individualise the therapy, so
00:38:21
I don't believe in a stand-alone therapy that you buy and that's it and you do it,
00:38:26
that's not going to work, you need somebody to look over it and individualise it, and the performance analysis for
00:38:34
the patient has to be different from that for the therapist, and with that I thank you for
00:38:40
listening
00:39:11
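The incremental user model the summary calls for, one that updates itself after every exercise, could be kept as a per-sound running score. This is a minimal sketch under assumed details (the exponential-average update, the score scale, and all names are mine, not the speaker's):

```python
class IncrementalUserModel:
    """Minimal sketch: a running skill estimate per trained item
    (e.g. per phoneme), updated after each completed exercise."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha   # weight given to the newest result
        self.skill = {}      # item -> running score in [0, 1]

    def update(self, item, score):
        # Exponential moving average: old estimate decays, new result mixes in.
        old = self.skill.get(item, score)
        self.skill[item] = (1 - self.alpha) * old + self.alpha * score

    def weakest(self):
        # The item to schedule next: where the patient scores worst.
        return min(self.skill, key=self.skill.get)
```

Such a model supports both uses mentioned in the talk: picking the next exercise for the patient and showing the therapist where progress is slow.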
oh absolutely
00:39:18
I agree with you, but the thing is, I can do something with eighty percent, even if I have only eighty percent
00:39:26
on the top-five problem, because the only thing I need to do is have them do an exercise ten times
00:39:32
and then my eighty percent will say whether he's doing it right or wrong
00:39:37
okay so
00:39:39
even with eighty percent, okay, I should not give
00:39:43
the individual feedback too often, I mean, I agree with
00:39:47
you that, you know, if it's very clear, that's what I said, the teacher will correct;
00:39:54
in the trial on language learning they corrected on average five percent, some with three percent,
00:39:59
some with eight percent, but on average, you know, if
00:40:02
you look at the curve, it was all around there
00:40:06
for twenty teachers, we had twenty teachers, and they all had the correction rate around five percent over all the students
00:40:13
so, you know, if you're sure, then intervene, but the tendency, even
00:40:19
with the eighty percent, you can give a good tendency, because,
00:40:22
you know, it's getting pretty good: I mean, if you do it ten times, that means in eight out of
00:40:28
ten cases he's doing the right thing, and then you're much more sure about
00:40:33
what you do, and then the therapist gets the feedback: he
00:40:36
didn't do that well enough, that's the one where he still has problems,
00:40:40
where he got the worst score, the therapist can control it and then say, okay, let's keep on doing that,
00:40:47
and I think that's an important aspect, because otherwise you tend to discourage them and then they stop
00:41:12
and what is the feedback, in effect? Yeah, I think
00:41:18
what we need to do is have not a two-class problem,
00:41:23
but break that into: I'm definitely going to say "that was wrong, let me
00:41:28
play it again", then something like "hmm, could you please repeat",
00:41:34
and the other one, "good, let's go on", and we need to do that, but I think, you
00:41:40
know, you play around with users, or say, I'm going to
00:41:47
be a severe teacher or a soft teacher, you know,
00:41:53
and where you shift these thresholds, when do you make that decision, on what do you
00:41:58
make that decision, that is something that we should do
00:42:14
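The three-way split described here, "definitely wrong, replay it" / "please repeat" / "good, let's go on", with thresholds that shift between a severe and a soft teacher, could be sketched as follows. The threshold values and profile names are made-up illustrations, not numbers from the talk.

```python
# Hypothetical (low, high) score thresholds per teacher profile:
# below low -> wrong, between -> ask to repeat, above high -> good.
PROFILES = {"severe": (0.5, 0.8), "soft": (0.3, 0.6)}

def three_way_feedback(score, profile="soft"):
    """Map a classifier score in [0, 1] to one of three feedback messages."""
    low, high = PROFILES[profile]
    if score < low:
        return "That was wrong, let me play it again."
    if score < high:
        return "Hmm, could you please repeat?"
    return "Good, let's go on!"
```

Shifting from "soft" to "severe" moves both thresholds up, so the same utterance can pass for one profile and trigger a repeat for the other.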
the therapist has to do that, I don't think the e-therapy can do that, a human is
00:42:22
so much better at it, at any time and at their place: if I know this is a person who will
00:42:28
give up easily, I'd rather take the soft teacher,
00:42:33
and if I know this guy really, really wants to get better, you know, I can give him my hard teacher;
00:42:39
I think, you know, I would never touch that right now with a machine, I think the therapist has to,
00:42:46
I mean, basically, what's the difference between a therapist and
00:42:51
machine learning? The machine learning is trained on the seen data,
00:42:55
it doesn't know that this is a person who will give up easily,
00:43:00
a therapist, you know, uses all his common sense and evaluates the person
00:43:05
in many different dimensions, and these dimensions are not known to the statistical model,
00:43:10
so don't leave that to the machine, give that to the teacher, you know
00:43:28
you
00:43:35
that
00:43:36
Matthew, I agree, but that's big data analysis, and that's
00:43:41
not something the machine can do, because that information is not there for the machine to learn
00:43:59
yep, in our therapy system what we do is:
00:44:07
the therapist does the exercise and it's stored,
00:44:12
and then, you know, in the case of a correction it says "let me show you again" and then it shows you again

Conference Program

Introduction to Phonetics and Speech
Rob van Son, Amsterdam
Sept. 24, 2018 · 9:02 a.m.
Dysarthria
Marc de Bodt, Antwerp
Sept. 24, 2018 · 9:45 a.m.
Children’s speech: development, pathologies and processing
Alberto Abad, Lisbon
Sept. 24, 2018 · 2 p.m.
Speech after Treatment for Head and Neck Cancers
Michiel van den Brekel, Amsterdam
Sept. 24, 2018 · 2:45 p.m.
Speech therapy
Marc de Bodt, Antwerp
Sept. 25, 2018 · 11 a.m.
eTherapy
Elmar Nöth, Erlangen-Nürnberg
Sept. 25, 2018 · 11:45 a.m.
Assessment in speech disorders
Virginie Woisard, Toulouse
Sept. 25, 2018 · 2:45 p.m.
