Transcriptions

Note: this content has been automatically generated.
00:00:00
Thank you all for coming here. I'm not quite sure if we are complete, but we will notice.
00:00:07
My task here is to make it a bit
00:00:09
more practical than the earlier two presentations,
00:00:14
and to start out with this: you saw the previous speakers
00:00:19
just accepting this mic, with only someone saying it will be recorded.
00:00:26
Now, I did so as well. So I would say, from the GDPR point of view:
00:00:31
what would be the legal basis for that? We have, you
00:00:36
know, two or three or several bases, six actually,
00:00:41
one of them being public interest, one of them being consent.
00:00:47
And now this mic here: they just told me it's
00:00:51
for quality, I have no idea what it is. So I would say
00:00:58
perhaps the university thinks this is public interest, that they
00:01:01
are allowed, because of public interest, to record these sessions
00:01:06
and perhaps broadcast them over the internet,
00:01:09
or do research on them. But is this public interest?
00:01:15
I would say this is a clear case where you would need
00:01:18
the consent of the speakers, and to make clear what it is used for.
00:01:23
But I'm sure we will get the consent form later today.
00:01:27
That is where things get more practical,
00:01:32
and this is what I would like to tell you a bit about. My background:
00:01:36
as you were already told, I am director of the Centre for Language and Speech Technology,
00:01:42
and in my faculty that's the place where they collect the most data.
00:01:46
And what happens then is: they need a research data officer, they
00:01:52
go to our department and say, you do something with data, don't you?
00:01:56
And this is how it started for me, and I
00:01:59
got more involved in this whole area. Apart from that I'm also
00:02:05
head of the Humanities Lab department
00:02:09
at our university's Faculty of Arts,
00:02:13
where we have many researchers needing more and
00:02:16
more ICT support nowadays. We also run an experimental
00:02:20
lab where they do experiments with eye tracking, EEG, and so forth.
00:02:27
But you also see, which may interest you, that when it comes
00:02:31
to the humanities, more and more ICT support is needed once we talk about big data.
00:02:37
I'm a member of the steering group of DELAD, about which I will tell you a little bit more.
00:02:43
I'm coordinating the CLARIN Knowledge Centre for Atypical Communication Expertise,
00:02:48
and I do data curation for CLARIN. And this puts my position
00:02:54
somehow between the lawyers and the researchers. Being a researcher myself,
00:03:00
I now know about the reflexes of the lawyers, and they of course
00:03:05
are always saying: be careful, be careful. And this can really
00:03:11
hamper your research. You want to do
00:03:14
some research and then the lawyer immediately says: whoa.
00:03:19
Now, there was a very interesting question from you earlier,
00:03:24
asking: you would like to make some public recordings and
00:03:27
use them for research, so what should we do?
00:03:31
I've had the same question from several of my researchers:
00:03:34
I want to do this, I'm doing an observation experiment,
00:03:38
so people should not know at the time, but we are observing them and seeing what happens.
00:03:43
So I did not have the same reflexes as the lawyers; I said, let's have a talk.
00:03:49
Let's see what your research questions are and how you are doing this,
00:03:55
whether you have any idea of how you would like to do this technically,
00:03:58
to have recordings that you can use in the end for your research.
00:04:03
And going through these aspects, it was
00:04:06
quite clear that we needed the consent of those involved, which,
00:04:11
given the context of the research they wanted to do, was impossible.
00:04:15
So I came to the same conclusion, that we needed consent for that,
00:04:19
but I did not say so right from the beginning, because I really try to
00:04:24
put myself on the side of the researcher and then see what
00:04:28
limitations are unavoidable. Sometimes you really have to consult the lawyers, but
00:04:33
I'm trying to hold this middle position and at least also defend
00:04:37
the position of researchers in some respects, where I can.
00:04:46
So this is an overview of my presentation: some brief remarks on the DELAD
00:04:52
network, some notes on the GDPR, and this is one slide, but
00:04:57
I could talk about that slide for the remainder of the session, because
00:05:01
it really comes down to the point where we are doing research,
00:05:07
and then three use cases; I think we will get to two, given the time we have,
00:05:12
and then various options to access and process sensitive language and speech data.
00:05:20
You have now also learned that what we call sensitive data is, in the GDPR,
00:05:25
special category data, but I will use the term sensitive data as well.
00:05:30
Then the CLARIN network, what it offers for data storage, and this
00:05:36
knowledge centre within it that we are now running, for Atypical Communication Expertise, which perhaps
00:05:42
could also give you a way of
00:05:45
sharing your data, because it's all about
00:05:50
collecting, producing and sharing this type of data, and how
00:05:55
you can do this given the GDPR.
00:06:00
So we have a kind of strange paradox with our type of data. First,
00:06:05
here we see we are, let's say, working with kinds of disordered speech and language.
00:06:12
It is very difficult to find the collections, and likewise they are very
00:06:17
hard to share, so reuse is difficult. These are small-
00:06:22
sized data sets; they are very difficult to get, costly, et cetera.
00:06:27
And to do further research on them you need to combine these data, because they are all
00:06:32
made for a particular purpose, and not for the purpose you would like to use them
00:06:37
for. So we really would like to collect lots of data, available for other people
00:06:42
to use for their research. So this is the
00:06:46
type of data which you most need to reuse, and it is the most difficult to get.
00:06:51
So that's the paradox that I see here, and that was also the
00:06:57
basis and the motivation for starting this DELAD
00:07:00
group. We started it in 2015
00:07:05
with the idea of how we can share this type of
00:07:10
data and how we can work together from an
00:07:13
international perspective to get this done.
00:07:19
I have permission to share this picture of the group.
00:07:25
And we would like to stimulate that,
00:07:29
perhaps not even on a bilateral basis, data can be shared,
00:07:34
best of all perhaps in a portal where they are stored
00:07:37
and where we know the GDPR regulations have
00:07:41
been followed in collecting them and in making them shareable.
00:07:48
So I'm not going to go on about this. The interesting thing is that
00:07:51
in the workshop that we had, the last one, in November
00:07:58
this year, we clearly involved the CLARIN network, which is a
00:08:04
network for language resources infrastructure, where they have the possibility to
00:08:11
store data and tools and make them shareable.
00:08:16
Who of you has worked with CLARIN, or knows it? One
00:08:21
raised hand, and no one else, so perhaps I should...
00:08:25
Yes, that's another one, I see. So perhaps I should tell something
00:08:30
more about it than I planned; we'll see how far I get.
00:08:34
We have a collaboration with TAPAS. During these last two workshops we had
00:08:40
a number of people from TAPAS. I saw
00:08:43
Heidi sitting there; yes, I believe she has been there,
00:08:47
and other members as well, so we already have some close connections.
00:08:52
So it is nice that from DELAD I can come to you
00:08:57
and tell you what we have been doing and what the idea behind it is.
00:09:01
So about this last workshop, just so you know what
00:09:07
we are doing there, and I hope this gives you an idea of the relevance.
00:09:11
We want to increase participation in the initiative.
00:09:16
We discussed the legal aspects of these types
00:09:20
of corpora and of sharing them under the GDPR.
00:09:25
We were looking at ways to share the data and how we can do this,
00:09:30
and in particular we had invited one person from CLARIN to tell us
00:09:36
what that network could mean for us in that respect.
00:09:40
And so if you see the issues that we were discussing here, it's
00:09:46
IPR, ethics, the format of the data, the types of annotation
00:09:49
that are involved, metadata, levels of anonymisation, levels of public access,
00:09:55
and the technical details of ways to share the data
00:09:59
in the proper way.
00:10:03
So I think this is relevant information to share with you as well.
00:10:09
In my presentation everything that we learned there will come back. I won't
00:10:15
specifically say "I owe this to them" or something like that,
00:10:18
but it is relevant from this perspective. So
00:10:23
I'm happy to share that information with you. And I have just noted a few things which are, on a
00:10:29
practical basis, relevant for your dealing with the GDPR
00:10:33
and which have also been stressed by the previous speaker.
00:10:40
Let's go to the first one: fully anonymous data is not subject to
00:10:45
the GDPR, as has been stressed as well, but pseudonymised data in fact is.
00:10:52
So if you have the key leading to the personal information of the person, then this is still
00:10:59
covered by the GDPR. You could share the other things, but as long as
00:11:03
you have the key, it is subject to the GDPR, and that poses very interesting questions.
00:11:12
I will come back to that at the fifth point here.
00:11:18
Another thing is: what are the legal bases on which we work
00:11:22
with our research data? Two of them have been mentioned: public interest,
00:11:28
or legitimate interest, and consent. But you should be aware that there's a third one
00:11:34
which could be relevant to you: this is the contract.
00:11:38
You might want to have someone perform for you,
00:11:43
filling your database with, say, strange sounds
00:11:46
or something specific, for sign language for example,
00:11:50
and you make a contract with this person, and everything that is needed to get
00:11:55
your data, to share it, et cetera, to put it in the database, is in that contract.
00:12:01
But there's another way in which contracts are involved: that's your own contract with the university.
00:12:06
What does it say about your data, about your lectures? If people want to use your lectures for their
00:12:12
research, and we have very interesting examples of lecturers giving their lectures
00:12:16
both in Dutch and in English, very interesting to compare that material,
00:12:22
then who is the owner of the data? The researcher? The university? It should be in the contract
00:12:28
somewhere. Do you have a question? I never know. Okay, just stretching, that's good.
00:12:38
So that's something to look at in your
00:12:42
own contract when you're teaching. Interesting.
00:12:48
Then, how to anonymise speech and video data?
00:12:54
So anonymisation: how does it work?
00:12:59
Perhaps you can do something with your metadata,
00:13:04
where the personal data is. You anonymise that by keeping the names,
00:13:09
the addresses of the people, emails, bank accounts
00:13:13
apart, or hidden at some point. But how is
00:13:18
this with the speech data or video data that most of you are collecting?
00:13:24
In video data I think you can clearly recognise the people who
00:13:27
are there, so can you share it just like that?
00:13:34
Now, there is a way to make it anonymous, that is to blur the faces, to blur the voices, and to
00:13:40
put beeps at every point where a name is mentioned in the data,
00:13:45
and this is most of the time typically what we do not want as researchers.
00:13:50
Is that, in your research data, what you want?
00:13:57
So how to continue on this? Well, you want to have the data, and if you want to share them
00:14:03
you go for a kind of consent, which you would need from the
00:14:08
participant. Now the GDPR, as has been said already, is not about ethics as such;
00:14:16
it is a law giving you the framework
00:14:22
to collect, process and share personal data, but
00:14:28
ethics is another domain, as the previous speaker, I think, clearly
00:14:33
pointed out. We had a case in the Netherlands with a funeral insurance company.
00:14:38
This company collected fingerprints of their, say, deceased customers,
00:14:45
and with these fingerprints they went to the nearest
00:14:52
kin and said: we can offer you jewels with the fingerprint of your loved one.
00:14:58
So are you interested in that? And then people said: this
00:15:04
is awful, how can you do this, this is against the GDPR.
00:15:08
But it isn't, because the GDPR does not apply to the dead.
00:15:14
But it was very unethical; I hope you agree on that point.
00:15:19
It became even more unethical when it appeared that the fingerprints were not those of the
00:15:24
deceased one but of one of the employees. That is
00:15:29
more unethical, but still, if this employee was aware of that, it could have
00:15:33
been perfectly right under the GDPR. Okay. What
00:15:40
restrictions are placed on participants requesting deletion of
00:15:45
collected data used for research? I think this is an important one.
00:15:50
It's not the case that if you have started your research, you've published
00:15:54
it, you've published the data set, or even just started a collection, that at any
00:16:02
point a data subject can come back and say: I want my data removed from
00:16:06
that data set. There are clear regulations on that; that does not
00:16:13
need to be the case.
00:16:17
Of course a data subject can end his participation in any
00:16:21
following interviews or experiments, but there are rules
00:16:29
for us as universities under which we can keep the data and use it for research.
00:16:34
So that takes care of it, if you like: there is a kind of regulation under which you
00:16:39
can do this. I put the article there in case you might need it tomorrow or so. And by the way,
00:16:47
do you often have people saying that they want to be removed from the collection? Has anyone
00:16:53
ever had such a request? No one here? Yes, you did?
00:17:01
yeah
00:17:05
Now, you are in a specific field, though; that is a very special interest group,
00:17:10
because we're always talking about the risks of what we're doing, and that's an important thing. Another one?
00:17:22
how
00:17:48
yeah
00:17:54
yeah and that that's what you see in in taxes
00:17:59
yeah
00:18:02
I have asked people in the room before,
00:18:07
and no, I seldom get people raising their hands.
00:18:12
In one of the previous workshops, not at TAPAS, one person raised his hand
00:18:17
and said yes, I've had one. I said, oh, this is interesting. Yes, he said, it was a student
00:18:21
who was doing something on the GDPR and data
00:18:28
collection, and had the task to try that out. Okay.
00:18:35
But I mean, we have to see things in perspective, that's what I want to say.
00:18:42
Now coming to point five: when we talk about
00:18:46
practical things, these are things to consider.
00:18:51
You have your data, speech recorded in separate interviews, and you have pseudonymised it,
00:18:58
so you have the key leading to the personal data.
00:19:01
And then the best way to remove that
00:19:07
from the GDPR in some way, as research data, is to just discard
00:19:11
this whole file, or the key, or whatever, because then it's impossible to make the link.
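To illustrate the point about the key, here is a minimal Python sketch, assuming a simple setup that is not described in the talk: interview records carry a name and an email, pseudonyms are random tokens, and the key table is an in-memory dictionary. The function names `pseudonymise` and `reidentify` and all field names are hypothetical. It only shows the mechanism; as noted right after this, the recording itself may still be recognisable.

```python
import secrets


def pseudonymise(records):
    """Split records into research data and a separate key table.

    The research data carries only a random pseudonym; the key table
    maps each pseudonym back to the personal details.
    """
    key_table = {}
    research_data = []
    for rec in records:
        pseudonym = "P" + secrets.token_hex(4)
        while pseudonym in key_table:  # avoid the (unlikely) collision
            pseudonym = "P" + secrets.token_hex(4)
        key_table[pseudonym] = {"name": rec["name"], "email": rec["email"]}
        research_data.append({"speaker": pseudonym, "recording": rec["recording"]})
    return research_data, key_table


def reidentify(pseudonym, key_table):
    """Return the personal details for a pseudonym, or None if no key exists."""
    return key_table.get(pseudonym)


# Hypothetical example records.
records = [
    {"name": "Anna Example", "email": "anna@example.org", "recording": "interview_01.wav"},
    {"name": "Ben Example", "email": "ben@example.org", "recording": "interview_02.wav"},
]

research_data, key_table = pseudonymise(records)
first = research_data[0]["speaker"]

print(reidentify(first, key_table))  # the link to the person still exists
key_table.clear()                    # discard the key
print(reidentify(first, key_table))  # the link can no longer be made
```

Once the key table is gone, a request such as "remove my data" can no longer be matched to a recording through the metadata, which is exactly the trade-off discussed next.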
00:19:18
Now this can be dangerous in several ways, because if it's a
00:19:22
recording, people could recognise themselves; that is tricky. But
00:19:28
you could come to a point where you say: okay, you asked me to remove your personal data, but I can't,
00:19:37
because I removed the key. I'm interested in what you would
00:19:42
say to that. But that is a way to,
00:19:47
well, kind of deal with it. But in that case I would really
00:19:51
advise you to keep the consent forms, in whatever way, because
00:19:57
there's also a responsibility for you as a researcher to prove
00:20:01
that you really had persons sitting there and that you're not faking your data.
00:20:08
Faking data is a very interesting topic in its own right, but when it comes to,
00:20:13
say, committees on replication research, and they really want to know whether you and
00:20:19
your participants were in that research, it's good that you can show a consent
00:20:23
form, and not say: I had one, but I lost it when I removed the key.
00:20:27
So there you are moving in between the various
00:20:31
responsibilities that you may have.
00:20:35
So that would be your proof. So
00:20:41
in my own view this is an option, but the lawyers say:
00:20:44
be careful, come on, we're not going to implement this right now.
00:20:49
Then special category data, which I mentioned already. What
00:20:56
you can really say about special category data is: get the consent of the participant,
00:21:01
not only for ethical reasons but also because the GDPR asks for it,
00:21:05
and it's the safest way to go. Most of the time you do experiments
00:21:10
where you invite people to come over and you make
00:21:14
recordings, so you have them there anyway.
00:21:20
And at our university we try to combine it. So we have the consent
00:21:24
form, which you see here, in which people have choices. They can for example
00:21:30
tick whether they want the data also to be used
00:21:33
for educational purposes, or on websites or so.
00:21:38
But for the research part we really clearly state what our public interest is,
00:21:42
and we say: we're going to save the data for this long, we are
00:21:45
using it for research in this area, typically beyond the project with which it is connected.
00:21:52
This is the information that people get, and if they don't agree to that they can
00:21:57
just drop out and say: okay, I won't participate. On the other hand, when they do, they know
00:22:03
what we do with the data. So you could
00:22:07
see that more as a kind of public interest; we are not
00:22:11
going into a debate with the participants on: would you
00:22:15
like to have it this way, or only this one, or would that work, or what
00:22:18
flavour would you like for your participation.
00:22:23
There we just say: this is what we are going to do.
00:22:26
And so we combine, say, the ethical side with
00:22:31
the GDPR: data sharing, collection and processing.
00:22:36
And I think those are the two things that are combined here, and that's relevant to take into account.
00:22:43
Then, when you also work together with others, please
00:22:47
check the processor-controller agreements that you enter into, because
00:22:50
you are then the controller, and perhaps a company or another university is going to work on your data,
00:22:56
and you need to check what templates they have
00:23:00
and how you have to do that, because that's relevant.
00:23:05
Then the DPIA form; as was also pointed out, it is
00:23:09
relevant. At our university we're still struggling with that,
00:23:13
to get this in place, and I think
00:23:17
it is very relevant for special category data. So I think
00:23:21
we are on the right track, but we should really have protocols for this.
00:23:27
Check the policy on data leaks; that's also relevant.
00:23:33
The biggest risk, of course... well, I'd say the damage to your participants is most of the time
00:23:40
not that great, though you really should take care of that, it is very important, no question about this.
00:23:47
But for a university, when you come in the
00:23:50
news like that, it's an enormous damage to your reputation as well.
00:23:56
So be aware of that as well, as a very practical remark.
00:24:03
yep
00:24:05
Okay, so I'm going to present a couple of use cases to you; I think two would be enough,
00:24:12
and show you how researchers dealt with the GDPR,
00:24:19
data sharing and so on, and what was the right way to go. So here we
00:24:24
have acoustic data from Scottish children; they kindly
00:24:29
allowed me to use the slides,
00:24:32
with speech disorders. I think perhaps it's in the line of
00:24:36
your own research. So we have a control group, and then we have
00:24:40
several groups with speech sound disorders,
00:24:45
with ultrasound recordings but also audio
00:24:50
recordings, and this is the data type.
00:24:58
They also have a lot of tools, developed by them to analyse the data as well.
00:25:05
And most of the data is not directly available. If you
00:25:10
look at this, they have a website for this, which I can also show.
00:25:15
So this is the GitHub site where you can collect the tools
00:25:20
for processing the data,
00:25:23
but the data themselves are only available via a link,
00:25:27
because GitHub is not really suited to store data, right? And so they have it on a
00:25:35
separate server, and you really have to ask for access to get
00:25:39
there. And here they have a typical consent form, because you have participants in
00:25:44
this case. You need consent forms, but also from the parents, of
00:25:49
course, which also poses an interesting question for the next use case.
00:25:55
But okay, this is what they do: they have a consent form like this,
00:26:00
and they have specified it. Just to show you: university teaching is one of the things,
00:26:07
and the participants get a check box for this, whether they want it or not. Then
00:26:15
perhaps also for public demonstrations
00:26:20
or lectures or broadcast news,
00:26:26
then for analysis by other researchers outside the university.
00:26:32
And this goes on, and it gets really
00:26:35
personal, I think.
00:26:40
And then: to access the records, to contact
00:26:45
the GP, the doctor, and even
00:26:51
the speech therapist. So this really goes far. Mostly
00:26:57
you won't need that, but it's just to show you the level of consent here.
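The tick-box structure of such a consent form can be sketched as a small data structure. This is a minimal Python sketch under stated assumptions: the purpose labels, the `ConsentRecord` class and the `usable_for` helper are hypothetical and only loosely follow the options listed above; they are not taken from the actual form.

```python
from dataclasses import dataclass, field

# Hypothetical purpose labels, one per tick box on the form.
PURPOSES = {
    "research",
    "university_teaching",
    "public_demonstration",
    "external_researchers",
}


@dataclass
class ConsentRecord:
    speaker: str                               # pseudonym, not a real name
    granted: set = field(default_factory=set)  # purposes the participant ticked

    def allows(self, purpose: str) -> bool:
        """True only if the participant ticked the box for this purpose."""
        if purpose not in PURPOSES:
            raise ValueError(f"unknown purpose: {purpose}")
        return purpose in self.granted


def usable_for(recordings, consents, purpose):
    """Return the files whose speaker consented to the given purpose."""
    return [r["file"] for r in recordings if consents[r["speaker"]].allows(purpose)]


# Hypothetical example data.
consents = {
    "P01": ConsentRecord("P01", {"research", "university_teaching"}),
    "P02": ConsentRecord("P02", {"research"}),
}
recordings = [
    {"speaker": "P01", "file": "p01_session1.wav"},
    {"speaker": "P02", "file": "p02_session1.wav"},
]

print(usable_for(recordings, consents, "university_teaching"))
```

Keeping the granted purposes per participant makes it possible to filter a collection before each reuse, instead of treating consent as a single yes or no.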
00:27:05
And also note: in this consent form it
00:27:09
explicitly mentions that there's a possibility that the speech sound can be recognised. So
00:27:13
I mean, it is anonymised in the sense of
00:27:17
the metadata and so on, but of course it's still speech,
00:27:20
and so they make this remark, so people are aware that if they
00:27:26
tick these boxes, this option, then they agree to it and give consent
00:27:31
to that. If you give consent it is okay. So consent is key,
00:27:38
but you should minimise as well, so always
00:27:41
keep in mind: do I really need this? Is it something I really need?
00:27:45
If you really need it and you can motivate it, this personal data can be collected.
00:27:50
If you can motivate it, the GDPR is not obstructing you; it is just telling you that you should motivate it properly.
00:27:58
uh_huh
00:28:02
And if it's not an accepted motivation, then of course you have a problem.
00:28:06
So let's go to the second case; this is archival speech.
00:28:15
yeah yeah yeah sure
00:28:18
yeah
00:28:26
Yeah, it's the cookie example, right? You go to a site and just say yes; if you want something else you can go
00:28:32
to this other page and then you have to... okay, the easy option is to forget it and just say okay.
00:28:41
Yeah, it could well be. This is why I'm really not going to that
00:28:45
level of detail; I would just give the basic things and then explain what we do with the data.
00:28:50
But in this case it perhaps was needed for proper research to make these connections to all kinds
00:28:56
of people, and then you have to do it, because then you can motivate it, right?
00:29:17
understood
00:29:19
yeah
00:29:21
yeah
00:29:25
exactly yeah
00:29:30
So that's one thing: it should be in a language that can
00:29:34
be understood. So if you work with people who also have, say,
00:29:41
mental problems, and that comes in here as well, this poses
00:29:46
special problems: how to make it clear to these people that they are giving you the data.
00:29:50
So I think it would be an option to do research on that, on how to state
00:29:57
consent effectively for them. It's a good point, yeah.
00:30:04
Archival data: it has been collected and has been there for a long time,
00:30:09
and at the time of its creation different regulations were in place.
00:30:15
The data are valuable, but there is a need to clarify access and sharing channels. So
00:30:20
here we have an example of recordings of Polish hearing-impaired children.
00:30:27
They
00:30:30
were taught to speak while simultaneously using another modality of
00:30:34
communication, so this was a total communication approach,
00:30:37
but I think only the speech recordings were collected.
00:30:45
And okay, there's a picture of that. So
00:30:50
there's lots of information that they had, which was collected, with the number of
00:30:55
children here, but also about specific children they have information: the beginning of the speech therapy,
00:31:01
presence of other disorders, continuity, degree of hearing loss, moment
00:31:06
of hearing loss. So they have audio recordings and transcriptions,
00:31:12
and even more, some very specific data on health. This is
00:31:17
health data, so special category data, in case we had not seen that yet.
00:31:23
Now, we are dealing with
00:31:27
speakers who were children at the time of recording and are now adults.
00:31:32
The consent was based on oral agreements with the school directors rather than with the
00:31:37
adults or the children themselves. There's no contact with the speakers as adults.
00:31:43
What makes the case of this corpus more problematic is the
00:31:46
sensitive nature of the metadata collected along with the recordings I just showed you.
00:31:51
The legal status is not clear with respect to data predating the present regulation. So what can you do?
00:32:00
Is there a basis, or is it
00:32:04
impossible? Yeah?
00:32:19
So consent is impossible, I think we would agree on this, but
00:32:23
you could revert to public or legitimate interest, which is very relevant.
00:32:28
But even if you do that and you say: okay,
00:32:32
we'll make the data available, in fact you still have the obligation
00:32:37
to let the participants know what you did with their data.
00:32:42
But still this is impossible. And what you can do, and
00:32:46
I think it has been mentioned, is that in this case
00:32:50
you weigh up whether this would be a proportional amount of effort,
00:32:55
to inform the people about what's going to happen with the data.
00:33:00
And you can as a research institute refer to that and say:
00:33:04
in this case, it's Recital 62 or something, I
00:33:09
am going to use the data and claim that I am
00:33:14
not able to make this disproportionate amount of effort that I would have
00:33:18
to make to go back to the persons.
00:33:24
There are a lot of complications, because the children have grown up, et cetera.
00:33:29
And you can also say, for example, that the boys could have completely different
00:33:33
voices now as men, so they are not recognisable, or at least far less recognisable
00:33:39
than then. So I think you would be on the safe side,
00:33:44
but it is never the case that I can say to you: you are completely safe.
00:33:51
And there is a third use case, which I will skip because of the time. How long do I have?
00:33:57
Do I
00:33:59
have infinite time? That's good.
00:34:03
So there are various options to share sensitive speech data,
00:34:10
provided you have the proper consent and everything. I think one of the
00:34:17
main dichotomies is: are the data stored somewhere and then the
00:34:27
researcher downloads them and uses their own tools on them, et cetera,
00:34:32
or are you in a situation where the data owner holds the data on location
00:34:37
and says they can't leave this place, and when you want to do research on them you have to bring your tools
00:34:45
to the data. So that's the other way,
00:34:49
and then they can create, say, a virtual research environment for you.
00:34:54
We have several projects in which it goes like
00:34:58
that, and sometimes it's quite difficult because they have to create that environment
00:35:02
in which our tools can run, so a proper operating
00:35:06
system et cetera, so that can be quite a nuisance.
00:35:11
There are also systems which are being developed
00:35:14
typically for that, for example at hospitals, where they have
00:35:18
this virtual research environment, with the data staying at that
00:35:21
place, and researchers can go there and there's a set of
00:35:26
tools they can use, R, statistics, et cetera, so they can go on and do this.
00:35:31
But sometimes the tools you really need are not there, and it's difficult to
00:35:35
bring them in. So these are the several options that you have.
00:35:40
Then the SSHOC project: we are now making an inventory of these.
00:35:45
The SSHOC project is a European project
00:35:48
on the infrastructure for data
00:35:52
sharing, in which CLARIN also takes part, and one of the work
00:35:58
packages is directed at sensitive data
00:36:03
and how to share them, and we as a university have a part in that task.
00:36:13
and how much time do
00:36:19
five five minutes so i'm not going to show you that
00:36:23
that directly um because i think the most important things were the the
00:36:27
practical things that i wanted to use most time to
00:36:30
do this and now we um i'm sure you've you've just and yeah
00:36:36
have to skip a couple of slides uh i've got these from
00:36:40
my colleague dieter van uytvanck who is the technical director of clarin and and so
00:36:47
they have a a big infrastructure um uh they have a say
00:36:53
kind of sustainable status as an eric this european research uh uh
00:37:01
consortium and where they have a longstanding time to to work
00:37:07
um and it's based on uh countries becoming a member of clarin and then
00:37:14
using their services and typically in a kind of federated way where uh all
00:37:21
the countries participating have their own data centres uh but the data and the metadata
00:37:27
especially the metadata are brought together harvested and made visible in one portal so that you can see
00:37:34
what you can get from every country and what type of data is
00:37:38
not that the data connected to it um apart from data they also provide tools
00:37:45
to work on data um and to scan come separately um
00:37:51
but uh this is an overview of the members
00:37:55
currently um so it's spreading all over europe and there's even one um
00:38:02
member in the u. k. which is new and interesting for us i think because
00:38:09
basically it is talkbank the clinical banks from brian macwhinney you know talkbank
00:38:16
who presents when those people only even more than in there and it's about the same it's
00:38:21
also about the same people yeah um so um and this is also clear that this thing um
00:38:32
so they collect text and speech data the the
00:38:35
do with federated aces i'm just a senior um so
00:38:45
i was talking about the tools they also are developing now with course which what's when
00:38:50
you have pipelines of tools which you can connect in order to have your data processed
00:38:55
um so uh we ourselves are now working on speech recognition so many
00:39:00
uh researchers have audio recordings uh interviews or whatever interviews are a typical case
00:39:05
of uh speech data that you see across different disciplines and so we are developing a pipeline
00:39:12
of uh doing uh automatic speech recognition on this and then further processing the data
00:39:20
um so they have several uh examples uh which they provided to me
00:39:25
um but uh here it is a clear that this isn't done
00:39:30
when you want access to this data you need to fill in a form
00:39:34
uh in order to apply to to comply with the license that
00:39:38
they have for sharing the data so they would check manually with you
00:39:44
to do that and it goes and this is another example but this is for
00:39:50
educational purposes so this uh there's really a a basis of the showing in and
00:39:56
making the dated uh and available short parts of every police so you can just
00:40:04
quite a quite different purpose
00:40:07
uh_huh yes
00:40:11
so if you want to make your data uh available to others and
00:40:18
it's interesting uh what clarin can offer they have
00:40:21
authentication systems uh preferably single sign on and really make clear um
00:40:28
what authorisation is needed from the beginning not that if you click on click on and
00:40:34
you have new hurdles all the time and in the end it appears that you cannot access
00:40:39
the whole data which you were interested in so please make clear from the beginning what what people
00:40:46
yeah
00:40:49
yeah so uh just uh going to this k-centre this is a new initiative following from clarin
00:40:57
and at our institute we now have set up a
00:41:00
uh knowledge centre in clarin context for atypical communication expertise
00:41:05
and that involves uh language acquisition and development data uh sign language
00:41:11
but also uh language disorder data and so relevant for here um
00:41:18
and this is where say the work of delad and clarin uh
00:41:22
and uh and maybe tapas can come together for really sharing data but
00:41:28
uh this is about data that can be shared it's not
00:41:31
about data that uh that must stay on location and
00:41:35
that you should access by bringing your tools there uh that's not a few
00:41:39
doesn't explain this is where you when you have this possibility sharing data
00:41:44
clarin can provide the facilities that you need uh that that to access
00:41:51
the data according to what you have communicated to your participants and for which of course
00:42:00
um so we have a larger uh audience as a as a clarin knowledge centre
00:42:07
but clearly also enables a you so what we're going to have a website where we have a
00:42:14
examples of consent forms and how they can look at several places
00:42:19
um because there's also one thing that you should be aware of and i think also been highlighted days
00:42:25
we have the g. d. p. r. but you also have the implementation rules of different countries so this differs across
00:42:31
countries in the netherlands there we have a very light implementation it is really
00:42:36
uh very lenient towards researchers uh but in other countries that's not the
00:42:41
case so this is really or something or what are the uh regulations the the rules
00:42:48
affect your own country following from e. okay so we
00:42:52
can help you with uh questions on all kinds of topics technical
00:42:56
assistance on making the database but also for curating them
00:43:00
and then i think in this respect so the yeah uh
00:43:04
we can work together with one of the clarin data centres which is also in nijmegen it's
00:43:09
the language archive and data can be sent there and they have this data store that we need
00:43:15
uh with all kinds of access layers different types of
00:43:19
data metadata raw data uh which you can access
00:43:23
um they have application forms uh on the web where you can say i would like to
00:43:28
see the raw data my identity is this and it is checked and so and it's also important that
00:43:35
at the t. l. a. you can not only store data from
00:43:38
the dutch uh dutch region but also from international
00:43:43
other locations so really it's also relevant for you i think so with this data centre with
00:43:48
the knowledge centre and and this data centre i think perhaps we have a good a good way
00:43:53
some of you also mentioned talkbank because it's very visible many uh people know
00:43:58
about talkbank and go there if they want to access data of a specific type
00:44:04
and so we are now establishing uh a cooperation with um brian macwhinney uh as well to get
00:44:11
this done at the clinical banks so that the data can also be put there and one of the uh
00:44:18
things that in the delad community they had uh worries about was the way the
00:44:23
data uh this type of data is shared because they are stored on american soil
00:44:29
and and the uh the authentication principles are quite weak so
00:44:34
we're now um figuring out and working collaborating with talkbank on
00:44:41
just storing data at the t. l. a. on
00:44:44
european soil and have the uh authentication mechanisms surrounding that
00:44:50
for accessing the data which will give uh much more peace
00:44:54
of mind to a lot of researchers um so i think
00:45:00
that should be it maybe there is time for some questions thank you
00:45:32
so you run the risk that people uh uh then say this is clearly me
00:45:36
and i want it to be removed from the database that is uh i think what what
00:45:41
is a but which conceded there but in the end of course it's it's the universe is over data
00:45:49
yes the cons
00:45:57
will be regulated from the you just then perhaps i will be national i guess
00:46:04
so because this is relevant because it depends on what the regulations in your country loss and you can be on the page
00:46:11
uh and and wouldn't come then you show what you have done to safeguard this
00:46:16
uh and when he said that's not sufficient i still think that yet okay then you problem and then uh you can go into a cost
00:46:23
with the to clean in that once they say when you do it like this and they can
00:46:29
say this is not allowed and uh they will not send you claims uh this is in the netherlands
00:46:35
so and they can say you should remove it it's it's too risky you shouldn't do it
00:46:41
and remove it but they will not uh uh make a big deal of it and put you in the in the in the naming and shaming
00:46:48
part of the newspapers and give you a fine not not like that
00:46:52
depends a bit on so that's the risks you have to take into account
00:47:00
yeah
00:47:10
yeah it's a uh i've i've and not my state park in effect because
00:47:16
i removed the data and you're not identifiable anymore on the basis
00:47:21
of the of this data and then um you can say you and
00:47:26
so i know no no bit or that bit so if them yeah but but my
00:47:29
day case there and i'm which wasn't count because you don't know what your data is
00:47:35
and we have removed the key we have made it virtually impossible uh that anyone will
00:47:40
we'll do that um and uh and then uh it's impossible
00:47:43
to get that person be access to data and find out
00:47:47
where they case it would because then you should have this the other day it as well which is impossible so yeah yeah
00:47:57
yes that's my that's my point i think you can
00:48:06
i know ah yeah yeah yeah yeah yeah yeah i well
00:48:18
so that that's that's something else uh uh the um did that could be something to take into account
00:48:26
so but it's really really uh depending on case by case what the most
00:48:31
sensible thing to do is um have spent as a comment on this or
00:49:00
uh huh
00:49:11
yeah
00:49:15
yeah
00:49:21
yeah
00:49:28
you want to add or something or or it
00:49:43
uh
00:49:47
oh
00:49:49
i have to go and which wrote actually
00:49:54
yeah yeah
00:49:56
and what would you like to do with this then what would you like to do with that data
00:50:11
yeah
00:50:14
uh
00:50:19
yeah yeah so that's so it makes it easy to identify when they talk about
00:50:22
personal things but still on the voice alone they could be identifiable so really
00:50:27
i would go for the consent thing uh telling what you do with the data
00:50:33
yeah
00:50:35
and then when when you've clicked and just you knew what we do
00:50:38
in our uh form so we say you can withdraw from this data collection
00:50:43
uh within two weeks after having participated in this experiment after that
00:50:50
we're starting to analyse and then it's not possible for you to withdraw your data
00:50:54
from that because it's a public interest you have to continue now and see it so this this
00:51:00
i mean this is not uh the g. d. p. r. saying it's two weeks or whatever but this is the way we try to find a
00:51:06
way here to avoid all the difficulties that we will encounter if
00:51:11
we let people randomly say i want to uh get out of this data set
00:51:15
the data is not replicable any more we have the obligation
00:51:20
that um our experiments should be replicable uh others should be able to do the same
00:51:27
and then all of a sudden that's not possible anymore because one of the subjects there
00:51:31
so we have more obligations uh and we have to find a balance between
