for for summer
i'm not
one uh_huh
uh_huh
oh okay
uh_huh
thanks thanks for joining me for this um talk this is very
informal um please stop me if you have any questions
it's very much an overview of what we've been doing um so there are
lots and lots of references but i'm really going to
try to give you a picture of what we've been doing in south africa i'm going to talk a little bit about our group
who we are and what we do about the work that we've done in south africa on south african
languages um some of the challenges we faced and what we've been doing to get hold of resources
and then just touch on the work that we've been doing on the babel project and under-resourced means something different to us
um the number that was given here i think in the first talk
yeah it was something like a hundred thousand that was considered
under-resourced well we sort of think of um ten hours as sort of the point where we've got something we can work with
so um that's really the environment that i'd like to talk about
so first of all given that we're here i should show you where
we are so um i think you probably know your way around there
and our university is situated here you see three
dots because there are three campuses um and
it really is a multilingual university as well courses are given in three different languages
um and those dots are about two hundred kilometres apart so it's not
a very um close-knit campus and we are even further
our research lab sits here which is a thousand four hundred kilometres from the nearest
campus
so we're in a very small town called hermanus
which happens to be a really quiet place to do really focused research and we found that
our visitors appreciate that and um
yeah we really welcome visitors to our shores
when it's cold in winter here in the northern hemisphere we're all sunny so um
we do um we really do invite collaboration
it's only a small group our group consists of um five
senior researchers and then students who come and go
um all the senior researchers are based in hermanus
the students are all over but sometimes come and visit us for a while um
our focus really is to work on speech technology for the least resourced languages
and as um uh some of you have mentioned earlier south
africa really is multilingual there are eleven official languages officially recognised
um i had one student who could speak all eleven at varying levels
of proficiency but most people speak two to four which is um
so the
amount of code switching happening and the amount of language
mixing is just um phenomenal um our group works
from basic research through technology development to applications i'll show you how that all fits together a bit later but in
a small uh um field of course really when we
get to applications it's on speech recognition and um
our basic research is all very much on pattern recognition and speech processing we try to
provide an environment where young people pass through work on real world projects with us
learn from working on a project and linking up with
international researchers we really see that as part of our role as you see there's a long way to go
from where we are to where most of the speech research in the world is happening
and then for senior researchers we try to provide an environment
where distractions are minimised with no committee meetings
people can really focus on the research and um that is why
it's a very nice environment for someone taking a sabbatical
okay so how does it all fit together on the application side um
most of our work currently is actually focused around this project which is a um speech transcription
project for parliament which is similar to something that you've been doing here as well um
some time ago already um we're sponsored by government to create this speech transcription
platform it's in its final stages we're doing our first
full end user usability test with the
um parliamentary transcription unit this july um
and in a way
it takes in many of the other bits and pieces that we've done and ties them into that platform
a big part of our work has really been using techniques that exist technologies
that exist and creating the resources to make it work in our environment
so there are loads of um uh databases and corpora that we've collected all of these
are made available as open content free of charge if you
are able to use any of this it's available there
um these are specialist pronunciation dictionaries playing with what
happens if people pronounce proper names
across languages and what happens to those pronunciations um
this one was a really under-resourced corpus lwazi was really where we started when we did that in two
thousand and seven to nine there really were no resources to get started with in these languages
there were a few proprietary um resources but if you wanted to download something and
get going text dictionary speech it just wasn't there so we built initial ten hour
um corpora for all the languages and also bootstrapped the
pronunciation lexicons got text corpora to get going
and then later we extended that to proper corpora that really
can now uh support applications those are all sixty to two hundred hours
um this was a telephony channel this one is nicer and um you can do a lot with that
um so a lot of our speech technology development
is also supportive of the um resource collection
we created a number of corpus collection and validation tools i'll show you some of those
uh a bit later in the presentation um practical tools to assist you in creating corpora
on the speech processing side we're using stuff that's available the one thing that's difficult
for under-resourced languages is pronunciation modelling so much of our work has focused on that as well
how to do g2p or graphemic systems what you do in under-resourced environments multilingual proper nouns
i'll touch on some of those just to give a feel for the type of work that we do
and our main projects at the moment um the speech transcription platform
i mentioned that before babel this one that's finished but
um as i said i'll touch on that and then we've just um
finished completing the collection of a new multilingual voice corpus which is
which is a new thing and that will also be released free
of charge all as open content all the data so what we have there are different voices
producing phonetically balanced sentences in different languages that are all somewhat comparable
so it's a very interesting um resource to work with especially if you're doing parametric t. t. s.
and you want to create different versions of a voice speaking different languages
okay so back to um
what we do first of all i'm going to introduce the languages of south africa
these are all the um official ones um
these are officially recognised that's the number of speakers who
um use it as home language which is very difficult to identify what a home language
is in many instances and these show the families that they come from
so um basically right at the top is about twelve million speakers and ndebele here at the bottom at about
um one million and you'll see that english although it's widely spoken really is
not the most um prominent language when it comes to home language
most of the languages are from the southern bantu family um basically
splitting up into these two groups nguni and sotho with some other
different ones in between and then english and afrikaans stand out as the two germanic languages
this also means there's a lot of interesting analysis with regard to language families and how we
can share across them what is possible to increase the gain with um with that specific language mix
so the n. c. h. l. t. corpus the big one that we collected we basically used mobile
phone based software for this so people would go out with smartphones that have a small app
that displays the prompt up front the person records it
it's stored and um from that we create our corpora and
it works it's cheap it's an easy way to collect data it works really well um
so what can go wrong anyone who's done actual resource collection out
in the field knows that these are some of the typical problems that happen
people um mispronounce the word they repeat what they have to say they're tired
they start to um and ah like i'm doing at the moment
and then one that was a bit more unexpected to us was that
often people really struggled reading these prompts so in many of the languages
people really are not used to reading aloud in their own language
reading is done in english speaking is done in a um in a
language that you're really comfortable with but reading aloud is not um
as common so in our later work we actually um
when we created the prompts we used children's books to get the prompts from to
make them more easily readable and that seems to um improve the quality of the data
so the other problems the normal ones background speech where we are lots
of cows chickens and goats this one was um
really a lot of a problem
um and then
especially when the session got a bit long people rushing to finish so ah there's one example i wanted
to play which is an error that is not one of the ones that i've just mentioned um
the speaker um had to say that phrase
and i'll play you
what got recorded
this can now and stuff
i think it keeps going on for a long time
so some errors are expected some unexpected but in the end
it's important to have quick and simple tools that you can use preferably while out in the field while you're in your small village
collecting your data set so that um you know that the data is of good enough quality and that you don't have to go back
now um
many confidence scoring algorithms exist
many of the posterior based or n-best algorithms really have some reliance on a language
model and that's very difficult to get to work if your language model is really small
so um that's why we went for phone-based scoring
um the basic idea is very standard lots of people do that you compare your phone alignment
to a free phone hypothesis you see whether they match and above some threshold you're okay
um the p. d. p. algorithm that we've been using um adds a few
nuances to that that make it really really um effective
we simplify the phone set so that you don't have all this issue with the stops and
affricates that sometimes match and sometimes don't um we have a um
the normal free phone decode that we use when we create the observed string
but for the reference string we created a very specialised garbage model i'll show that on the next slide
we train our scoring matrix to estimate the errors we make sure
that we adjust for things that you cannot
so easily get directly from the matrix we normalise by the number of phones and we call that p. d. p.
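The comparison of a reference phone string with a free phone decode, scored by dynamic programming and normalised by the number of phones, can be sketched as follows. This is a minimal illustration and not the speaker's implementation: the function names, the match/mismatch/gap values, the optional `sub_score` dictionary standing in for a trained scoring matrix, the choice to normalise by the reference length and the threshold are all assumptions.

```python
def pdp_score(ref, hyp, sub_score=None, match=1.0, mismatch=-1.0, gap=-1.0):
    """Align two phone sequences with dynamic programming and return the
    best alignment score divided by the number of reference phones.

    sub_score is an optional dict {(ref_phone, hyp_phone): score}; it stands
    in for a trained scoring matrix. Without it the scoring is flat, which
    behaves much like an edit distance.
    """
    def s(a, b):
        if sub_score is not None and (a, b) in sub_score:
            return sub_score[(a, b)]
        return match if a == b else mismatch

    n, m = len(ref), len(hyp)
    # dp[i][j] holds the best score for aligning ref[:i] with hyp[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + s(ref[i - 1], hyp[j - 1]),  # match or substitution
                dp[i - 1][j] + gap,                            # phone missing from hyp
                dp[i][j - 1] + gap,                            # extra phone in hyp
            )
    return dp[n][m] / max(n, 1)  # assumption: normalise by reference length


def accept(ref, hyp, threshold=0.5, **kwargs):
    """Keep an utterance when its normalised score reaches the threshold."""
    return pdp_score(ref, hyp, **kwargs) >= threshold
```

With flat scoring a single substituted phone in a three-phone prompt is rejected at this threshold, while a matrix that treats the two phones as close (for example `{("b", "x"): 0.5}`) lets it through, which is the kind of difference a trained matrix is meant to make.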
evaluating that uh against a number of other word based um
algorithms gave us very useful results but evaluating is not that simple uh this is the
garbage model first so i should show that so it's the
one here this is very similar to the h. t. k. s. p. model
except that in this case you absorb actual speech here
so in the garbage model you have a begin state and then
your normal emitting states but then you're really allowed to jump
you can go through it you don't have to go through it you can jump across this whole thing so that
really when you're aligning you're trying to find those pieces of speech that are
useful and you try to jump over anything else that is there so um
getting back to the evaluation protocol it all depends on what you want
typically you really want to
accept everything that's correct and reject everything that's wrong whether
it's an almost match or wrong words or bad audio or deleted words
that comes in that goes out and that's your typical evaluation protocol
or you could be much more lenient and say well exact matches you want to keep these things
you really do not want and some of these might be useful some of these might not so you don't really care
or you could have this evaluation protocol which is actually the one that we've been aiming
for where these are really all acceptable because everything is useful and these really all get rejected
and using the p. d. p. scores um this is an example of the d. e. t. curve that we could get um
i just wanted to show you the difference that's part of the whole phone-based dynamic programming scores the p. d. p. scores
that is what difference it makes to train your own scoring matrix
on your training data so all the training happens on your training data you're
trying to clean your training data so these lines are um
using just a normal flat matrix you're just scoring like levenshtein distance and here you're using
a trained matrix and you see how much nicer the trained one performs
um some of the other languages really had a lot of background noise or problems in
pronunciation um here is one of them and you see that
you do not get the same quality um
when you apply the technique
but that allowed us to really take all our data score everything
and select the actual two hundred hours that are really
considered to be um consistent according to the scores so this is
what our final corpus looked like what we actually did is we
packaged the really clean data as one set one corpus but then
added these additional parts it depends what you want to work with
if you want something really clean you work with that if you want more or if you're actually interested in the errors
um the other versions are also released um all with their uh scores so that you know what you're getting
okay any questions here or shall i keep going
so some thoughts on pronunciation modelling
uh you've all seen pictures like this where you try and see
the difference between systems when you're building a system this one
i think this was forty hours so this is where we go from zero to ten twenty thirty forty hours
what happens if your um systems are purely graphemic that was the worst performing one the red line
um a grapheme-to-phoneme converter somewhere in the middle
and the manual phoneme based results right at the bottom the best that you
could do with this corpus um so in one of our studies
we took this picture and really tried to understand what's
happening inside based on um word categories
so you see if you look at the types of
categories of words that are observed in the training data
your generic words actually did pretty well that's word error rate and that's a very small word error rate
but this is not for the phonemic system this is just the graphemic system so you understand why
spelled out characters really there's not much that a graphemic
system can do with that um acronyms perform poorly um foreign words
really do not do well proper names the system struggles spelling errors
um these become more usable in a graphemic system but still
it's the normal standard words where the system does really really well so
actually this one means more than one category applies it might have been uh
a word that was both foreign and a proper name that was spelled out something
like that so it was just it was difficult to classify um
now this was just the graphemic system now if you
compare the three different systems you see that
green means it does the best on that specific category yellow in between red
worst and now you can see what's really happening so your graphemic system um
actually does pretty okay on um many of the
categories but fails spectacularly on some of the others
whereas your gold dictionary um does better even
on some of the strange ones and the g2p is weaker and i'm
going to come back to that because that seems like an unexpected result um
just to explain so again graphemic g2p
and the gold dict where really every single word is in the dictionary
the g2p one was trained on generic words only and not on anything else
so what is happening here some things
are really just very difficult to recognise
spelled out characters are really um short it's like an h. or a d. um
and some things are really easy to recognise
so what we did in this work was to really um identify these
categories prior to system development and regularise the spelling and put it back into the
graphemic system so you still have a graphemic system but all the strange words get respelled
say you just fix the spelling um because you're really only trying to fix these things
and not these ones that are really doing pretty well
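The idea of keeping a graphemic lexicon but regularising the spelling of the awkward categories first can be sketched like this. It is only an illustration under stated assumptions: `LETTER_NAMES`, `build_lexicon` and the example respellings are hypothetical, and a real table of letter names would depend on the language and its orthography.

```python
# Hypothetical respellings of letter names, used for spelled-out words.
LETTER_NAMES = {"b": "bee", "c": "see", "d": "dee", "h": "aitch"}


def graphemic_pron(word):
    """Graphemic pronunciation: every letter is its own acoustic unit."""
    return [ch for ch in word.lower() if ch.isalpha()]


def build_lexicon(words, spelled_out=(), respell=None):
    """Build a graphemic lexicon, respelling flagged words beforehand.

    spelled_out: words that are read letter by letter (e.g. "BBC").
    respell: optional {word: regularised spelling} for foreign words or
             proper names, e.g. produced by a transliteration model.
    """
    respell = respell or {}
    lexicon = {}
    for w in words:
        if w in respell:
            source = respell[w]
        elif w in spelled_out:
            # Unknown letters fall back to the letter itself.
            source = "".join(LETTER_NAMES.get(ch, ch) for ch in w.lower())
        else:
            source = w  # generic words are left untouched
        lexicon[w] = graphemic_pron(source)
    return lexicon
```

For example, `build_lexicon(["house", "BBC"], spelled_out={"BBC"})` leaves "house" as its own letters and expands "BBC" to the letters of "beebeesee", so only the flagged category changes.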
and here are some examples um
of how words could be respelled back into the graphemic system of the language that we've been
working with in this case um uh that actually looks to me like an english one
so for this system we tried and tested different
techniques but in the end a simple joint sequence model
worked very well we just used a second order one because we did not want
to capture all the detail we just wanted to capture the broad patterns and um
change these strange words into something that a system could work with um
here is what happens to our word error rate if we
actually transliterate one category at a time that was the graphemic system the red one to start with
then when we transliterate foreign words proper names spelled out words everything that we can
think of to transliterate that was um how close we got to the phonemic system which is
the last one here without building an entire phonemic system um
how do you identify them you know
what a spelled out character looks like you know what a proper name looks like the big thing is identifying foreign words
for foreign words we also use j. s. m. based language identification
for word identification we found that works really really well if you've got word lists obviously that's
best but for these languages we often do not even have proper word lists um
some of the languages are highly agglutinative so these are words that just keep on
um you just keep on adding to the vocabulary the vocabularies keep on growing
so this is something that we've been doing our systems are all
graphemic we find the strange words transliterate them and you get
almost as good as um having the whole phone based system yes
um that one was about the way it's written well the spelled out
words they're written the same for a word like b. b. c. um
an acronym is something that looks like that as well
but that you don't say as separate letters you say it as a single word
so it's also capitalised it's also it looks the same as a spelled out
word but it's pronounced as a word instead of as spelled out letters
yeah well enough yet
there was
it says
itself out enough to last often that if
i'll just say he's okay with me
so but the the the categories that you brought to be handled outside a good thing
i would actually
um thinks they
i would expect g2p
to be good on acronyms and and spelled out words
unless it's a lexus bottled was really all um is it would
it would you put it it hadn't been identified as well
so that's like b. b. c. but thirty to be with try to the book
so another thing that is also worth noticing here is that on proper names
there is really not much difference amongst these and on foreign words as well um
what the different systems do there
is also not as different as you would have expected you'd really expect the gold dict to be really excellent
so one of the things we looked at with proper names is really
what happens when people are pronouncing foreign names pronouncing words that um
the speaker might not be exactly sure how to pronounce or each language group might have
its own way of pronouncing it the gold dict has the
english pronunciation of this word if it's english but actually within language groups there are completely
different pronunciations that have become standardised and that's not included in our gold dict
so what we looked at this was um an
evaluation of uh a lot of proper names specifically
produced um by speakers of afrikaans english setswana
and isizulu of names that are afrikaans
english setswana and isizulu so the speakers are in the um
the languages are the rows and the columns
so the speakers are the rows the names are the columns here
so what we're seeing here is an english person
pronouncing an english name this percent of the time
got it correct what correct is now
is also something that we've established from the data so what we did is
we looked at within-language groups of utterances so you had
a person speaking x pronouncing a name from language x and we're looking at what those um
what those pronunciations sound like we had them um phonetically transcribed
then we say what is the most common pronunciation that comes from this set and we say okay that's the answer
this is what a first language speaker who is saying this word pronounces it like
so this word correct means something very specific in this
um in this table it means that we take that word
from the data and how close the speakers normally get to
um that single pronunciation that we select and within a language it is really high um
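The procedure of taking the most common first-language pronunciation as the reference, and then measuring how often other speakers reach it, could look like the sketch below. The function names are hypothetical, and exact-match agreement is an assumption: the talk only says the measure reflects how close speakers get to the selected pronunciation, so a phone-level distance might equally have been used.

```python
from collections import Counter


def reference_pronunciation(prons):
    """Return the most frequent pronunciation (as a tuple of phones)."""
    counts = Counter(tuple(p) for p in prons)
    return counts.most_common(1)[0][0]


def agreement(prons, reference):
    """Fraction of pronunciations identical to the reference."""
    if not prons:
        return 0.0
    hits = sum(1 for p in prons if tuple(p) == tuple(reference))
    return hits / len(prons)


def within_group_consistency(prons):
    """Agreement of a speaker group with its own most common pronunciation."""
    return agreement(prons, reference_pronunciation(prons))
```

Comparing `agreement(l2_prons, reference_pronunciation(l1_prons))` with `within_group_consistency(l2_prons)` separates how well a second-language group approximates the first-language reference from how consistent that group is on its own terms.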
some of the um differences here are because um it has to do with
the vowel system with its high and low um
versions of a vowel which people sometimes produce and
sometimes not and sometimes it's transcribed correctly and sometimes not
so that is um mostly the reason for this discrepancy
now what happens is people who um
um speak afrikaans try to say english words they did much worse
uh especially with the setswana and isizulu names
english people cannot pronounce them the way they are in this database
so you get a feel for how well um people
approximate the true pronunciation the one that we're looking at here
when it gets produced by speakers of other languages
what was interesting when we say okay let's forget about what is correct according to the
um first language speakers let's see how consistent the pronunciations are within the second
language group we saw that in all cases people did a lot better
so here there was much more consistency in how afrikaans speakers
pronounced isizulu names than in their ability to
approximate the true pronunciation which just says you can really
use this information and that it is also valuable even though it's
not correct in the sense of what a first language speaker would say
so um basically i'm not going to go through the detail
of this but the whole idea is that even though um
getting
the speaker specific pronunciation is hard
it's very close to this one if the speakers come from the same language um
if you train a first language g2p system
to try and approximate that ideal pronunciation it
doesn't do that well but it actually does a lot better as a proxy when we measure our speakers against this
um g2p pronunciation so the g2p is really
predicting some of the errors that the speakers are making cross-lingually
which brought us to the concept of intended language as well
that really when people speak there is the language of the speaker there is the language that the word comes from
the origin of that word that they're speaking and there is the language that they're thinking of when they speak like
i'm trying to produce an english word or i'm trying to produce an isizulu word and that often
makes a difference in the g2p system
so we're now modelling those three as parameters when we do g2p
um word language speaker language and intended language and trying to pull apart these different factors
so that's been a way to get
all these ideal pronunciations that we can then transliterate
into our graphemic system in order to deal with
pretty complex phenomena
okay so this is very short i just wanted to do an aside on code switching um
wanted to give you a feel for just how big the problem is we had a um
a student of mine really look into uh uh the type of data that's usually
available over the radio and um we studied a radio broadcast corpus
we really just tried to see how frequently this code
switching happens now the first thing to notice is that
it's often difficult to say what is an english word um there's obviously this continuum
between really just using an english word inside your sepedi sentence
um which is not usually used you just pop it in to something that
has become part of the language a loan word to something that has been changed um
so that its spelling has actually changed so
here are some examples these are now english words
that have been um that have become part of the
language using some of these um um modifiers now
those modifiers can be used with words anyway so you have the
same modifier with words where you then pronounce part of it
in uh english fashion and part of it as sepedi
but that was actually a smaller part i do have uh an uh uh i
won't show it now but in the analysis it often does happen um
but most of the time when we talk about code switching it is real
english that happens in between in the middle of a sentence
so this was really the frequency of code switching we saw um
these are different rows for the different um news broadcasts
and when actors performed from an actual script it was about four percent
um uh that was the you know that was the prepared one the um
other case we find that forty percent of the time there was english and
um and we say it was english if a phrase by our definition of a
phrase contained an english word and the number of those foreign words per
sentence is this one here so um this goes up to twenty words and that's a
very small number but between five and ten it becomes very significant
and it's hard to ignore the fact that up to two hundred and fifty sentences were like that
just with that rate
so that's a lot of code switching that needs to be modelled um
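The counting rule described here, where a phrase counts as containing English if at least one of its words is English, can be sketched as below. The function name and the use of a plain English word list are assumptions; as the talk points out, deciding what counts as an English word is itself hard, so a real analysis would rely on manual tags rather than a lookup.

```python
from collections import Counter


def code_switch_stats(phrases, english_words):
    """Return (fraction of phrases with at least one English word,
    histogram of English words per code-switched phrase)."""
    per_phrase = [
        sum(1 for w in p.lower().split() if w in english_words)
        for p in phrases
    ]
    switched = sum(1 for n in per_phrase if n > 0)
    fraction = switched / len(phrases) if phrases else 0.0
    histogram = Counter(n for n in per_phrase if n > 0)
    return fraction, histogram
```

On a toy input such as `["ke a leboga", "ke batla airtime", "re tla bona budget speech"]` with the word list `{"airtime", "budget", "speech"}`, two of the three phrases count as code-switched, one with a single English word and one with two.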
one of the things we saw though when we tried to actually analyse this um
in the study was how the english vowels are really realised
in this environment we specifically looked at all the phones but the vowels
are really the interesting ones um and then we had transcribers mark is this
phone this one or that one and
our inter transcriber agreement was low people didn't agree and actually when we
looked at this i'm not sure if this is visible um
what i've plotted here is for some of the um samples
the uh f. one and f. two the two formants and what the
tag was so the auto tag uh was an a. s. r. based
one um so it's like a direct decode of the um
of the target phone um whether we really think it's an a. or an o.
um and this one is for the three vowels um
and you'll see that on the left side is the manual tag on the right
side is the um auto tagged version and here
they really agree the i.'s are very clearly around here that's very simple but in
other places well around here
um it's much more difficult to identify and here even the vowels become um
overlapping and here some of them as well so there are some patterns
and it's something you can use but it's not a clear thing and um
do we trust the algorithm or do we trust individuals
it's difficult to decide it's not um it's not a clearly solved problem
but using this um we played around with it a bit and i just included this slide because of um
of the earlier discussion i'm not going to go into the detail but we did something that is
somewhat similar to trying to um incorporate acoustic information into the into the process but
in a in a much more ad hoc fashion than your very nice framework so um
basically we modified the lexicon to use classes inferred from the acoustics
so again you do the forced alignment of the audio to a hypothesis
of what it's supposed to be you see what the confusable candidates are
if those might be candidates you just push them all in as possible variants and we see what the tagger thinks which one is best
and from that you can really generate additional variants and we did
get some improvements from that and this is word accuracy
it's very low here because it's for the subset of these words which is really the problem um
and using these additional variants we could get some improvement but it
is small so it might be interesting to see what
we can combine from our work with the stuff that you've been doing
and um
so last thing um
back to babel so for those of you
who are not familiar with it um it was a um
four and a half year project um sponsored by i. a. r. p. a. that was run as a
challenge where we really tried to solve spoken term detection for under-resourced languages
we were part of the babelon team and our group
within babelon really focused on um the pronunciation modelling
trying to see what we can do with limited resources in this environment
or just to put it in a bit more context
there were a few more languages each year we got this bunch
of languages it started with four then five and then six
every time at the end we had one surprise language that we had to build the system for and in the end
that whole system from when you get the data till you submit your results had to be done within a week
so a lot of the focus was on techniques that are language independent that
you prepare before and that you can then apply to the new language um
to see what happens now this is a slide from the team
so this is just for babelon for the whole team um the
program target initially when the project started the program target was an
a. t. w. v. of point three that's the actual term weighted value
for those that are unfamiliar with spoken term detection it's a really convoluted way to get at
how well spoken term detection works but just think of it as a value between zero and one where one is perfect
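For readers who want the measure spelled out, term weighted value is usually computed per keyword from the miss and false alarm probabilities and then averaged, following the NIST spoken term detection evaluations. The sketch below uses that common definition; the function name, the input layout and the default `beta=999.9` are assumptions for illustration, and the actual scoring tools handle many details that are omitted here.

```python
def term_weighted_value(stats, speech_seconds, beta=999.9):
    """Term weighted value over a set of keywords.

    stats: {keyword: (n_true, n_correct, n_false_alarm)}
    speech_seconds: total duration of the searched audio, used as the
                    number of trials (one per second of speech).
    Keywords that never occur in the reference are skipped.
    """
    total = 0.0
    used = 0
    for n_true, n_correct, n_false_alarm in stats.values():
        if n_true == 0:
            continue
        p_miss = 1.0 - n_correct / n_true
        p_fa = n_false_alarm / (speech_seconds - n_true)
        total += p_miss + beta * p_fa
        used += 1
    if used == 0:
        raise ValueError("no keywords with reference occurrences")
    return 1.0 - total / used
```

A system that finds every occurrence with no false alarms scores 1.0. Because false alarms are weighted so heavily, the value can drop below zero, so "between zero and one" is a simplification.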
um so when the program started it was set at point three and then after
year one the results were so good that they um upped it to point six okay
and in the final year we could get that on all the systems which was um really tricky
so um that's the context what i really wanted to show here because i thought this was quite interesting
from the beginning of the program um many different um speech
recognition techniques were developed and these are the ones
where um the babelon team got its gains so from the initial baseline on this one um
adding each of these added
um those points and these gains are each on top of the other one um
and these different techniques were complementary which
means that the value of one wasn't wiped out by
one of the others and there are a lot of other techniques that are not on this slide because
they didn't work um these are the ones where at the time these things really worked um
those are the um bottleneck features this is when d. n. n.s got into the picture
um these were lots of different um keyword spotting techniques from
b. b. n. very specialised and that augmentation
so um you would've seen this before as well you take your whole database and you change it in some way
um in this case um the speed was changed so you had a second copy
of the whole thing of its level but force that we add noise
the double back the um is in a lot of different um
there are lots of different coding formats um lots of different um background sounds
the silent portions from the corpus were taken from different corpora and
added to others and all of this was thrown into the pot
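
A minimal sketch of the two simplest augmentations mentioned here, changing the speed of a recording and mixing in noise, assuming audio is held as a NumPy array of samples. The function names, the linear-interpolation resampling, and the example factors and noise level are assumptions for illustration; the talk does not say how the project implemented them.

```python
import numpy as np


def change_speed(signal, factor):
    """Resample by linear interpolation; factor > 1 gives a shorter, faster copy."""
    n_out = int(round(len(signal) / factor))
    old_positions = np.arange(len(signal))
    new_positions = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(new_positions, old_positions, signal)


def add_noise(signal, noise, snr_db):
    """Mix noise into signal at a target signal-to-noise ratio in decibels."""
    noise = np.resize(noise, len(signal))  # repeat or trim noise to match length
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    return signal + scale * noise


rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # one second stand-in for speech
background = rng.standard_normal(4000)                       # stand-in for recorded noise

# each variant would be added to the training set alongside the original
augmented = [change_speed(clean, f) for f in (0.9, 1.1)]
augmented.append(add_noise(clean, background, snr_db=10))
```

Playing the resampled copy back at the original sample rate changes both tempo and pitch, which is the usual behaviour of simple speed perturbation.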
um the multilingual bottleneck features really made a huge difference 'cause this was now getting to the
end of the process where the last wins um that we could find in places
so that is where you train not only on data from the target language but on different languages as many languages
as you have i think after about eleven or twelve the gains started dropping um
but about twelve of those languages to train the bottleneck features and then apply
so um that both gives the win and makes it very quick once you start to train and
build your system you have your features already because you don't have to train another one
this was web data um
and that was from the web automatically there were queries that went out and got lots of data to try and improve the language model
um joint alignment decoding when you actually work on the hypothesis label when you combine will stop the
two common action at um by combining mysterious rather than systems um i mean stop word models
that caught up to ten point four out of a cat
so that's where the different gains come from i'm
not going to talk about any of this
except one interesting one from the subword models that
we worked on which is um automatic syllabification
which is a very simple technique and actually made quite a difference to the system
so um basically um we were responsible to generate some of the
g. two p. maps with lots of complicated
pronunciation modelling and it made very little difference you could get a point here a
point there but um in the the wins were very very slim um
um whereas the syllabification was quite a simple process it was based on assigning consonants and
vowels to all the words and then splitting them into syllables based on uh automatic detection of
the clusters of consonants by starting from what's available at the beginning of words and looking
at the runs of consonants where you could split it in different ways
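
One way to read the process just described, as a hedged sketch rather than the project's actual algorithm: mark each letter as consonant or vowel, collect the consonant clusters seen at the start of words, then split each run of consonants between two vowels so that the following syllable gets the longest cluster already seen word-initially. The vowel set, the function names, and the choice to treat adjacent vowels as one nucleus are all assumptions made for this illustration.

```python
VOWELS = set("aeiou")  # assumption: orthographic vowels, to be set per language


def cv_pattern(word, vowels=VOWELS):
    """Label each letter as V (vowel) or C (consonant)."""
    return "".join("V" if ch in vowels else "C" for ch in word)


def learn_onsets(words, vowels=VOWELS):
    """Collect the consonant clusters that appear at the start of words."""
    onsets = {""}
    for w in words:
        i = 0
        while i < len(w) and w[i] not in vowels:
            i += 1
        if i < len(w):  # ignore words that contain no vowel
            onsets.add(w[:i])
    return onsets


def syllabify(word, onsets, vowels=VOWELS):
    """Split a word so each syllable starts with the longest known onset."""
    # locate the vowel groups (nuclei) as (start, end) spans
    nuclei = []
    i = 0
    while i < len(word):
        if word[i] in vowels:
            j = i
            while j < len(word) and word[j] in vowels:
                j += 1
            nuclei.append((i, j))
            i = j
        else:
            i += 1
    if len(nuclei) < 2:
        return [word]

    pieces = []
    prev = 0
    for (_, end_prev), (start_next, _) in zip(nuclei, nuclei[1:]):
        cluster = word[end_prev:start_next]
        # k = 0 tries the whole cluster first, so the first hit is the longest suffix
        for k in range(len(cluster) + 1):
            if cluster[k:] in onsets:
                pieces.append(word[prev:end_prev + k])
                prev = end_prev + k
                break
    pieces.append(word[prev:])
    return pieces


onsets = learn_onsets(["trap", "pan", "stop"])
print(cv_pattern("pantra"))         # CVCCCV
print(syllabify("pantra", onsets))  # ['pan', 'tra']
```

Because the vowel set is just a parameter, changing which letters count as vowels changes how large the resulting units are without touching the rest of the code.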
also while we so um this looks good but we want to
pull 'em syllables what it more units to work with
so we thought we'd just reclassify vowels as consonants just to get bigger chunks so this is a very
brain dead algorithm and that's why i wanted to show the results because they're quite interesting
so um the top one is two different variations of the algorithm
doesn't make much difference but here where we start um
increasing the syllable length constantly the performance um and this is
both in and out of vocabulary keeps increasing
and if you look at the uh i'm changing the number of syllables you see that
every time we increase these are the sign petitions we know increasing the syllable linking ace
all syllables of getting one mole your syllables is the right she's given between the syllables and the number of words
in the vocabulary now um you see we're getting to the whole word system you know so we're finding a
somewhat in a good way to get to a whole word system and if you look at the result
you see the um whole word system that's the pink one
these are a lot of the different languages that's the best um whole word system for in vocabulary
you'd expect that to be the best you know um whereas
the syllable approach um in many cases
on out of vocabulary
it does even beat
these ones on now the one that ms that's suitable but it's
so um that's still not the most interesting result because these are all
somewhat comparable only to combine everything it does like that
the um what mall themes these can fry mighty that
were uh i'm sure we'll for change right
um how's with the syllables and the combination with canned beans system
so yes actually the interesting part um while i see this all really um
it approximate elder performance but it still gives you something to work with that's not
always something a bit smaller but more chunky that um i realise useful um
but the interesting thing is that syllables are a lot easier to extract on the last
day to see its way um the last artistic to say many and uh
or maybe that's just a match people very very short time off the youth on the deck that
all the preparation and family whips it you not to syllable the file these results as well
and training the morphological um process on text which means that um
these are the um in vocabulary this is the out of vocabulary
um results for morphemes and syllables and you see that these
um out of vocabulary results are very similar um in
vocabulary it's a lot better because you've got the bigger chunks um but then
when we got to a million weird remote phoenix systems were not it wasn't possible to developing
which means that i'm an nice going with was um in the final system georgian with uh
mm evaluation so that was a surprise that got dropped on us and then in the final
system the syllables really could help because they were just so quick and simple to build
i think in conclusion um just in the whole field of under resourced languages there are
many challenges many issues still to be resolved many things
that are still not working well still not understood
um the south african environment is really an interesting one we have a very um
interesting language mix and a growing set of resources that's been just thought with you don't
have to go out and we i they're all pro before that
and here's more info if anyone's interested
i think that's a rough ah ha
so yep yeah oh
stuff it was itself with you guys system having thing
the many different support units that the product
and you would see that if the uh oh i think i mention it they who once the um
once you get him to a fairly lot with corpus it was seven out of the capri words in
you will spur content depiction pacing in in g. you end up with a book every system
uh_huh
'cause that's if that s. f. plus
i think the actual final system um which was pull it apart plummet
being really pretty good it was to null it's probably going to
so it's off
um so the
there was some language specific work but that in fact provided very very little gain
so the multilingual aspect of the babel project was first of all that
the um the the techniques really should be applicable across
all these languages so they had to be very not tuned too specifically
but the wins from the multilingual using the multilingual
data really came from the multilingual features
they will lots of different things that would be if they're especially um
my beauty um but the the big when he yeah uh
they it was really there um multilingual speech
and that's bottleneck is
losses sold with it and it has this um
ah got the says
yeah oh yes
yeah i'll have this be honest and
i'm not sure what we came to me and i remained it has a strange all
all that will serve as transvaal i'm elated to sound relationship that we found difficult
the t. not find the sign
the the big thing with this um a program was that we were not allowed to use the big
so uh they will lexicons i'm right at the start and then they got right now um
and that means that any time there is some um
discrepancy it is very difficult to fix that
um but on it this specifically i think i'll have to skip that on so
i'm actually not sure and i left to go look mine redid mine out
uh_huh
that the um but it's still um but the standard
is that we stand so once a um
so part of the process of creating text resources would be
to verify um the the spelling of them
because often the official standard spelling might not necessarily be the one
that is used in the data that we work with
there are for example um some of the languages would also be cross border so
sesotho from south africa and sesotho from lesotho would have slightly different spellings
and if you just get sesotho text you don't know which one it is

Multilingual speech recognition in under-resourced environments
Marelie Davel, North-West University, South Africa
2 June 2017 · 11:05 a.m.