Transcriptions

Note: this content has been automatically generated.
00:00:00
Good morning. My name is Juan Rafael Orozco-Arroyave; I am from Colombia, and I did my PhD with Elmar Nöth in Erlangen. Currently I am a professor at the University of Antioquia in Colombia, and I am also still affiliated with the lab in Erlangen. Today I am going to talk about speech signal representation using linear prediction, the so-called LPC.
00:00:29
So we are going to talk a little bit about the vocal tract model, then about the source-filter model, then about what we can do with this kind of model, and at the end about what we can do with the residual signal and about PLP. Most of the information here comes from the book listed in the references.
00:00:57
Okay, so what is the vocal tract? Typically it is divided into three pieces. This area here, the part of the tract below the glottis, is called the subglottal tract, and it is mainly the lungs. The part between the glottis and the lips is called the vocal tract, and the part through the nasal cavities is called the nasal tract. So when we talk about the vocal tract we mainly talk about this area, but we have to keep in mind all of these parts, because that is where the energy for producing speech comes from, and also the nasal cavity, because it is important for producing nasal sounds, which matter for instance in Portuguese and also in French.
00:01:55
And here in the vocal tract we have to keep in mind the so-called articulators. The articulators are mainly the velum, the tongue, the lips, and the jaw. The task of the velum is mainly to open or close the passage of air through the nasal cavity; so when you do not have the velum, or you are not able to control it, your speech sounds different.
00:02:30
Okay, and this is just a zoom of the previous figure, to highlight where the vocal folds are: exactly in the glottis. Basically, the air pressure is produced in the lungs, and the pressure here starts to go up, to the point where the vocal folds have to open to let the air pass through. Then the pressure starts to go down, the vocal folds close again, and they open again when the pressure is high. That is why the vocal folds are vibrating when you are speaking. So the vocal folds are crucial for producing speech. Then the tongue is very important, because depending on where you put the tongue you can produce different sounds; basically, when you model the vocal tract you try to model the resonances here in the vocal cavity. Then there are the upper and lower lips, which are also important, because with them you change the shape of the vocal tract, and depending on the shape you get different sounds and different resonances. And again, this is the velum, which lets the air pass through the nasal cavity.
00:03:51
Okay, so we model all of this, the vocal tract, with a linear filter. Linear in this case means that we are not considering changes due to thermal effects or due to the viscosity inside the cavities. We consider the input of the filter to be the air coming through the vocal folds, and the filter itself models the resonances in the vocal tract. Note that we are not considering the nasal cavity; so if we want to model phenomena related to nasality, or problems controlling the velum, as in children with cleft lip and palate, we have to introduce something else.
00:04:49
Okay, so here we have the excitation signal, this is the linear filter, and the output is the speech signal. This is what I said before, and it was also shown earlier: the glottal volume velocity starts to increase when the pressure here is increasing, up to the point where the vocal folds open; air passes through, then the velocity starts to go down again, the vocal folds close, and then they open again. It is a periodic, or rather quasi-periodic, phenomenon. We also assume in this model of the vocal tract that the excitation is a mechanical signal, a plane wave propagating through the tract towards the lips; that assumption is important for the model.
00:05:50
So here, as I said, the shape of the vocal tract changes while you are speaking, and this figure shows those changes: the cross-sectional area of the tube changes over time and over distance, where the distance here runs from the glottis to the lips. So, from the glottis to the lips, for the model we assume that this tube can be modelled by an array of concatenated small slices of tube, without any losses among them. The only thing we consider is the change in area, with no change over time; we are not including the time domain. We assume these slices, and now we are going to model what happens inside one such slice.
00:06:53
So we take one of those slices, and we consider that there is a fluid moving in the tube, which in this case is the air. There is a certain pressure at the beginning and at the end of this small slice, and we assume it is the same on both sides; the slice is just a short piece of the distance. If we take another slice a little bit larger, by some delta, then there is a difference in pressure; again there are changes over the distance, not over time. When we solve the fluid dynamics equations, we get this equation system, which basically models the changes in pressure over time and the changes in volume velocity over the distance.
00:07:53
When we solve that equation system, we find these two equations, one for the volume velocity and one for the pressure. The plus sign is basically telling us about the air moving from this slice to the next one, and the minus sign is telling us about the air coming back from the next slice into this one. The length of each slice is constant, so we are not considering different shapes for the slices.
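For reference, in the standard lossless-tube derivation (my notation, following textbook treatments rather than the speaker's slide, which is not in the transcript) the coupled equations for pressure p and volume velocity u in one slice are:

```latex
\frac{\partial p}{\partial x} = -\frac{\rho}{A}\,\frac{\partial u}{\partial t},
\qquad
\frac{\partial u}{\partial x} = -\frac{A}{\rho c^{2}}\,\frac{\partial p}{\partial t}
```

with travelling-wave solutions

```latex
u(x,t) = u^{+}\!\left(t-\frac{x}{c}\right) - u^{-}\!\left(t+\frac{x}{c}\right),
\qquad
p(x,t) = \frac{\rho c}{A}\left[u^{+}\!\left(t-\frac{x}{c}\right) + u^{-}\!\left(t+\frac{x}{c}\right)\right]
```

where u^+ is the forward-travelling wave (the plus term the speaker mentions), u^- the backward-travelling wave, A the slice area, rho the air density, and c the speed of sound.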
00:08:32
Now, when we take the z-transform of these two equations and discretise the model, we find this all-pole model, which is the so-called linear predictive coding, LPC. Basically, this is modelling the transfer function of the vocal tract, and the input, as was shown, can be the glottal excitation, which is quasi-periodic, or Gaussian white noise; depending on which phoneme you are going to produce, you have either one or the other, or a combination of both.
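In the usual notation (assumed here, since the slide itself is not in the transcript), the all-pole model the speaker describes is:

```latex
H(z) = \frac{S(z)}{E(z)} = \frac{G}{1 - \sum_{k=1}^{p} a_{k}\, z^{-k}}
```

where E(z) is the excitation (glottal pulses or noise), S(z) is the speech signal, G is a gain, and the a_k are the p linear prediction coefficients.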
00:09:14
And the output is the speech signal. So, when we take the inverse z-transform of this all-pole model, we get this expression, where this term is the error that you make when you try to model the speech signal. What we are doing with this kind of filter is trying to predict how the speech signal behaves at one sample, considering the past p samples, where p is the order of the model. That means we are going to predict a sample of the speech signal using the previous p samples of the same signal, which we already have. So the prediction error is the difference between the current speech sample and the sample that we are predicting.
00:10:22
This is expressed like this, and we want to minimise the prediction error. In order to minimise it, what we do is find the coefficients, the linear prediction coefficients, that give us the minimal error. To compute the total prediction error, we sum over all of the available samples; this is the expression that we have to minimise, and the coefficients that minimise it are called the LPC coefficients. So, in order to find the optimal coefficients, we take the derivative with respect to the a_i, which are the linear prediction coefficients, and set it equal to zero.
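Written out in standard notation (an assumption on my part, since the slide is not in the transcript), the minimisation is:

```latex
\hat{s}[n] = \sum_{k=1}^{p} a_{k}\, s[n-k], \qquad
e[n] = s[n] - \hat{s}[n], \qquad
E = \sum_{n} e^{2}[n]
```

and setting the partial derivatives to zero,

```latex
\frac{\partial E}{\partial a_{i}} = 0
\;\;\Longrightarrow\;\;
\sum_{k=1}^{p} a_{k}\, R(|i-k|) = R(i), \qquad i = 1,\dots,p
```

where R is the autocorrelation function that the speaker introduces next.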
00:11:18
After taking the derivative, we find this expression, and we can see that, instead of having this sum, we have a set of linear equations here. Then we can play around with this a little bit: we put this term here inside this one, and we can see that this multiplication is actually a correlation function. If we change the expression to use the correlation coefficients, what we have is the correlation, in which the indices i and j appear here and here, multiplied by the sum over the p coefficients. This expression is well known as the Yule-Walker equations, and it can be solved efficiently following an algorithm proposed by Levinson and Durbin.
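A sketch of the Levinson-Durbin recursion the speaker refers to (pure Python, my own function and variable names; the convention assumed is the predictor s_hat[n] = sum_k a_k * s[n-k], so the normal equations are sum_k a_k * R(|i-k|) = R(i)):

```python
def levinson_durbin(r, p):
    """Solve sum_k a_k * r[|i-k|] = r[i] (i = 1..p) for the LPC coefficients
    a_1..a_p, given autocorrelation values r[0..p]."""
    a = [0.0] * (p + 1)      # a[0] is a placeholder; a[1..p] hold the result
    err = r[0]               # prediction-error energy of the order-0 model
    for i in range(1, p + 1):
        # reflection coefficient for order i
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):            # update the lower-order coefficients
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k               # the error never grows with the order
    return a[1:], err

coeffs, err = levinson_durbin([1.0, 0.5, 0.25], 2)
```

Exploiting the Toeplitz symmetry of the autocorrelation matrix, which the speaker discusses below, brings the cost down to O(p^2), against O(p^3) for a generic linear solver.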
00:12:38
So, one of the methods to solve that equation system is the autocorrelation method. In order to apply it, the first thing we have to do is take only one interval of the signal, delimited by the window, and assume that the signal is zero outside that interval. Notice that up to now we had not defined any interval, any window, to be analysed; now we are saying that we are going to take N samples of the signal. That means that for the total prediction error we have N samples of the speech signal and N samples of the predicted signal; and remember that we use a filter with p past samples to predict the next one, so in total we have N plus p terms. That is the new total error that we have to minimise.
00:13:36
If we rewrite the previous equation, we find that the correlation over the two samples is basically the autocorrelation with a certain delay: the same signal correlated with itself, with a certain delay. It can then be written as follows, which is just the definition of the autocorrelation of the signal.
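A minimal sketch of the short-time autocorrelation used here (pure Python, my own names; the frame is assumed to be zero outside the window, exactly as the autocorrelation method requires):

```python
def autocorr(s, max_lag):
    """Short-time autocorrelation R(k) = sum_n s[n]*s[n+k] of a windowed frame
    s, assumed zero outside the frame, for lags k = 0 .. max_lag."""
    n = len(s)
    return [sum(s[i] * s[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

r = autocorr([1.0, 2.0, 3.0], 2)  # R(0) is the energy of the frame
```

Feeding r[0..p] into a Yule-Walker solver then yields the LPC coefficients for that frame.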
00:14:07
And if we use the matrix representation, we find this, and you can see that most of the entries of the matrix are the same along each diagonal. That symmetry is called the Toeplitz symmetry, and the Levinson-Durbin algorithm simply takes advantage of this symmetry in order to efficiently find the a_j coefficients, the coefficients of the linear filter that allows us to model the vocal tract.
00:14:51
Okay, so now we have efficiently found the optimal coefficients that allow us to model the vocal tract. The question now is what we can do with those coefficients.
00:15:09
We can do several things, but before talking about them, I will recall what Elmar started talking about earlier: the importance of the time window in speech processing, in this case in linear prediction. First, let us assume we have a vowel from a person with a pitch of 110 Hz; that means a fundamental period of about 9 milliseconds.
00:15:43
And here we can see the resulting spectrum using rectangular windows, here with 30 milliseconds and here with 15 milliseconds, and you can see the harmonics that Elmar was talking about. But here, when we use a Hamming window that is not long enough, that is, if we do not include at least two fundamental periods, then we cannot see the harmonics properly. This is a requirement of Hamming windowing, which is not the case for the rectangular window. So if we are using Hamming windows, we have to make sure that we include at least two fundamental periods.
00:16:39
And we can see the same here: this is a recording of my own voice, with a different pitch of course. When we use a Hamming window that does not include at least two fundamental periods, you do not see all of the harmonics, but when we take a longer window, you start to see all the harmonics.
00:17:09
Another phenomenon that appears in the spectrum when you window the signal is leakage, which Elmar was also talking about. Due to the discontinuity introduced at the edges of the window, what you introduce in the spectrum are additional spectral components that are not part of the speech signal. When you convolve that with the spectrum of the speech signal, the result is that you cancel components, or you add components that you do not want to see. That is what is happening here: here we used rectangular windows, and it does not matter whether you take a long or a short window, the leakage always appears because of the discontinuity. And here, when you take Hamming windows, since the changes at the edges are softer, you can observe that the harmonics are clearly described in the spectrum. So it is important to keep both things in mind: how long the window has to be, and which kind of window you want to choose. Normally, if you want to be on the safe side, you go for 25 to 30 milliseconds of windowing.
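The two rules of thumb above can be sketched as follows (pure Python, my own function names; the 110 Hz pitch is the example from the talk):

```python
import math

def min_window_ms(pitch_hz, n_periods=2):
    """Shortest window (in ms) that covers n_periods fundamental periods."""
    return n_periods * 1000.0 / pitch_hz

def hamming(n):
    """Hamming window of length n (the usual 0.54 - 0.46*cos form), whose
    tapered edges soften the discontinuity that causes leakage."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (n - 1))
            for k in range(n)]

need_ms = min_window_ms(110.0)  # 110 Hz pitch: about 18.2 ms minimum
```

A 25 to 30 ms window, as recommended above, comfortably covers two periods for any typical pitch.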
00:18:44
And this is the leakage in the case of my voice, using a rectangular window of 30 milliseconds.
00:18:54
Okay, now for the LPC analysis. This is again a sustained /a/, and this is a portion of 30 milliseconds of the signal. We take the spectrum over that window, we compute the LPC coefficients, and this is the transfer function of the resulting filter. The important thing for us is not only the fundamental frequency but also these peaks in the spectrum, because those peaks are very much related to the resonances in the vocal tract. So, using the information about the positions of these formants, which is the name of these peaks, you can infer which kind of vowel, which kind of open sound, you are producing.
00:19:54
How can we do that? We know that there are these peaks, and if you go to the z-domain representation, you can find all of these resonances easily by taking the position, the angle, of the poles. We can identify different vowel sounds using that representation. That means that, computationally, you do not need to find peaks over the representation of the envelope; you can go here and pick the exact ones.
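A crude sketch of the envelope-peak route (pure Python, my own names). The pole-angle route the speaker prefers would take the angles of the roots of A(z) instead; here, to stay self-contained, the envelope 1/|A(e^{jw})| is simply scanned for local maxima:

```python
import cmath
import math

def lpc_envelope_peaks(a, fs, n_grid=1024):
    """Crude formant estimates: local maxima of the LPC envelope 1/|A(e^{jw})|,
    where A(z) = 1 - sum_k a[k] * z^-(k+1). Returns peak frequencies in Hz."""
    mags = []
    for m in range(n_grid):
        w = math.pi * m / n_grid              # 0 .. pi, i.e. 0 .. fs/2
        z = cmath.exp(1j * w)
        A = 1.0 - sum(ak * z ** (-(k + 1)) for k, ak in enumerate(a))
        mags.append(1.0 / abs(A))
    return [m * fs / (2.0 * n_grid)
            for m in range(1, n_grid - 1)
            if mags[m - 1] < mags[m] > mags[m + 1]]

# A single known resonance: pole pair at 500 Hz, radius 0.95, fs = 8 kHz.
r, theta = 0.95, 2.0 * math.pi * 500.0 / 8000.0
peaks = lpc_envelope_peaks([2.0 * r * math.cos(theta), -r * r], 8000.0)
```

With the poles in hand, the same resonance frequency follows directly from f = angle(pole) * fs / (2 * pi), which is the "exact" pick the speaker describes.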
00:20:36
How can we identify different vowels using the knowledge of the vowel space? This is the /a/ sound, this is the /u/, this is the /i/; the rest are other vowels, and these three vowels are called the corner vowels. They are very important because, to some extent, they represent us: it does not matter which language you speak, they represent the whole range of possibilities of moving the tongue, independently of the language.
00:21:12
So, for instance, for the vowel /a/ the average first formant, F1, is 850 Hz, and the second one is about 1600 Hz. If you look here, for my voice, we are at around 600 Hz for the first formant, and around 1100 or 1200 Hz for the second one, so we are around here and here for the /a/. For the vowel /i/ we are around 200 Hz for the first formant, and for the second one a little bit above 2000 Hz; in my case I am about here. For the vowel /u/ the average is 250 Hz; in my case I am more or less at 250 Hz, and the second formant is around here, close to 600 Hz.
00:22:21
Okay. So you can use that information not only to confirm that the person is producing a certain vowel; it is also very useful to know how capable a person is of moving the tongue properly, and we will see an example of that. We can also track the stability of the vocal fold vibration, and the capability of the person to keep the tongue in a certain position during a certain amount of time.
00:22:54
So, for instance, for the /a/ vowel, this is the time signal, this is the fundamental frequency, and this is the first formant; you can do the same with the second formant. This is for my voice, and this is for a person with Parkinson's disease, and you can see that the fundamental frequency is chaotic, and also, from the first formant, that it is hard for them to keep the tongue in a certain position. So you can use, for instance, just the standard deviation of this curve to model how able these people are to move the tongue properly.
00:23:36
You can also plot the vocal triangle. As I said, the corner vowels, the /a/, /i/ and /u/, are very informative, and the area of the triangle gives you information about the articulation capability of a person. Note here: this is the speech of a person close to sixty years old (not my speech, but another speaker), and this is another person with Parkinson's disease; the axes are the same in both plots, and you can see the compression of the vocal triangle. So just the area of the triangle gives you enough information about their articulation capability.
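The triangle area the speaker uses as a feature can be computed with the shoelace formula; a minimal sketch (pure Python, with illustrative (F1, F2) values roughly matching the averages quoted in the talk):

```python
def triangle_area(v1, v2, v3):
    """Area of the vocal triangle from (F1, F2) pairs of the corner vowels,
    via the shoelace formula (units: Hz^2)."""
    (x1, y1), (x2, y2), (x3, y3) = v1, v2, v3
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2.0

# Illustrative (F1, F2) values in Hz for /a/, /i/, /u/; a compressed triangle,
# as in the Parkinson's example, simply yields a smaller area.
area = triangle_area((850.0, 1600.0), (200.0, 2200.0), (250.0, 600.0))
```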
00:24:24
Okay, now let's talk about the residual. When you model the vocal tract, you get this transfer function; if you take the inverse of that filter, then you can obtain the residual. You can compare the original signal with the reconstructed one, and the difference is the error, or the residual. You can see that, for a sustained vowel, there is always a peak at the beginning of each fundamental period, so that is useful to detect when the voicing starts, like here, for instance, where we have the transition from an /s/.
00:25:17
Okay, so this is just an /s/; this is an /s/ and this is an /a/, and so on. We are interested in modelling, or in understanding, the transition here, or put differently, in how useful the prediction error, the residual signal, is for finding that transition. You can see here that for the /s/ sound there are basically no peaks, but as soon as the vocal folds start to vibrate to produce the /a/ sound, a peak appears here, and it appears here again when another period starts, and so on. So the prediction error is useful to detect the starting points of the vocal fold vibration.
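A minimal sketch of the inverse filtering that produces the residual (pure Python, my own names; a toy frame is synthesised from a known model so that the residual recovers exactly the excitation pulses):

```python
def lpc_residual(s, a):
    """Inverse filtering: e[n] = s[n] - sum_k a[k] * s[n-1-k], i.e. the
    prediction error. Samples before the start of the frame are taken as zero."""
    p = len(a)
    return [s[n] - sum(a[k] * s[n - 1 - k] for k in range(p) if n - 1 - k >= 0)
            for n in range(len(s))]

# Synthesise a toy frame from a known model and a pulse-train excitation;
# inverse filtering with the same coefficients must give the pulses back.
a = [0.5]
excitation = [1.0, 0.0, 0.0, 1.0, 0.0]
s = []
for n, e in enumerate(excitation):
    s.append(e + sum(a[k] * s[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0))
residual = lpc_residual(s, a)
```

On real speech the coefficients are only estimates, so the residual is not a clean pulse train, but the glottal closure instants still show up as the peaks the speaker points at.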
00:26:17
Now, the LPC coefficients are useful and very important, but, as I said, you make several assumptions, like linearity in the representation of the vocal tract. You also do not consider that the human auditory system has less resolution above about 800 Hz: at lower frequencies humans hear very well, but above 800 Hz not so well, so we do not need the same resolution in the upper part of the spectrum. In short, linear prediction is not properly considering the characteristics of human perception.
00:27:17
So, in order to work on these problems, those weak points of the modelling, Hynek Hermansky proposed a different way of doing it, and that is PLP, which stands for perceptual linear prediction. It considers these psychoacoustic aspects of human hearing, and it consists of taking the speech signal and applying the concept of critical bands; here you can use Bark bands, as in the original PLP, but you can also use mel bands with no problem, or a different perceptual scale, although the Bark bands are typical in this case. Then you do a process of equal-loudness pre-emphasis, then the conversion from intensity to loudness, and then you take the inverse Fourier transform and again find the coefficients of the linear representation; so what you find is again an all-pole model.
00:28:21
So here is the representation of the critical bands following the Bark scale and, as I showed before, here in the low part of the spectrum you have a better resolution, and here in the upper part of the spectrum you have a coarser resolution of the signal. So this part of the spectrum is important, and this part is not so important, so you use a coarser representation there.
00:28:56
The equal-loudness pre-emphasis is basically there to approximate what humans do, the non-equal sensitivity when you are listening: you are not listening equally to all of the frequency bands. So, in order to compensate for that way of listening, this pre-emphasis approximates what happens in human hearing. And the intensity-to-loudness conversion consists basically of taking the intensity to the power one third, in order to get the loudness instead of the intensity. That is mainly done to reduce the dynamics of the amplitude in the spectrum, and then you have a better resolution of the changes in the spectrum.
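Two of the steps above can be sketched as follows (pure Python; the Bark formula used here is the Zwicker-Terhardt approximation, which is an assumption on my part, since the original PLP paper uses a slightly different expression):

```python
import math

def hz_to_bark(f):
    """Zwicker-Terhardt approximation of the Bark critical-band scale
    (an assumption here; Hermansky's PLP uses a slightly different formula)."""
    return 13.0 * math.atan(0.00076 * f) + 3.5 * math.atan((f / 7500.0) ** 2)

def intensity_to_loudness(power):
    """Cube-root amplitude compression (Stevens' power law), as used in PLP
    to shrink the dynamic range of the spectrum."""
    return power ** (1.0 / 3.0)

# One Bark covers about 100 Hz at the low end of the spectrum but several
# hundred Hz near 5 kHz, which is exactly the coarser high-frequency resolution
# the speaker describes.
```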
00:29:52
What are the advantages of PLP over LPC? Basically, it considers psychoacoustic characteristics of human hearing, and it has shown good results in speaker-independent speech recognition, additionally with a reduced number of coefficients, because of what I said about the coarser resolution of the changes in the speech spectrum. For the same reason it is also more sensitive to certain phonetic units, like nasals, and it is better able to model the bandwidths of the formants, which are important for nasalised and nasal vowels.
00:30:37
Okay, so that is all I have. These are the references; as I said, most of the information in the slides comes from this book, and this is also a very handy source of information.



Conference Program

Speech analysis and characterisation
Elmar Nöth, Erlangen-Nürnberg
Feb. 11, 2019 · 9:18 a.m.
Voice source analysis
Prof Juan Rafael Orozco - Arroyave, Colombia
Feb. 11, 2019 · 10:10 a.m.
Speech synthesis 1
Peter Steiner, TU Dresden, Germany
Feb. 11, 2019 · 11:12 a.m.
Speech synthesis 2
Peter Steiner, TU Dresden, Germany
Feb. 11, 2019 · 11:39 a.m.
