Thank you very much for the kind introduction. I will present a little bit about the VISCERAL project and Evaluation-as-a-Service, which is more a concept that came out of the project, some reflections based on the work we have done. And since we work in medical imaging, I have plenty of nice pictures to show you.
When I started preparing for this, the topic being reproducibility, I looked around at what actually exists on reproducibility, and in line with the keynote this morning about open science, there are actually quite a few initiatives. There is the Center for Open Science in the US, a non-profit organisation that really looks at open, transparent, reproducible science, because everybody should have an interest in this, and there are many other initiatives. There is a group at Stanford University around John Ioannidis, who has written a couple of articles that are very critical about the reproducibility of medical research, one of them called "Why Most Published Research Findings Are False". He also has a meta-research centre, the Meta-Research Innovation Center, and I saw on your slides that there is even now a Stanford Center for Reproducible Neuroscience. So many universities around the world really invest in this domain, and it shows that in the long run this is something that will make research better and stronger.
But with respect to what we saw this morning, there are also critical voices. There was an article in Nature on the risks of the replication drive, and something we need to take into account, in particular in experimental sciences: in computational sciences we can quite easily reproduce the results of experiments, but if we are running, say, medication tests, small changes in experimental conditions can lead to different results. So if we want to show that medical results are not reproducible, that is very easy to achieve: just make small changes, modify things a little bit. There is a risk in that, and sometimes it is simply impossible to reproduce exactly the same conditions. We need to take that into account when we talk about reproducibility.
This is something I looked into because the scientific environment is very competitive. Everybody wants to publish, we want to get funding, and quite often we are fighting against each other. Sometimes it is good to just sit back and look at each other and ask how we can do better together, and how we can use this to advance science. This is something I would also like to see come out of this meeting, because we are two institutions very close to each other, and I would like to see more of us actually working together to get results.
In terms of competitions, what is often called a scientific competition I would rather call "coopetition": a mix where everybody wants to get good results but we also look at cooperation, really working together. There is a citation from 1968, from Richard Hamming when he received the Turing Award. He said: Newton said, "If I have seen further than others, it is because I stood on the shoulders of giants." Today we stand on each other's feet. Perhaps the central problem we face in all of computer science is how we are to get to the situation where we build on top of the work of others rather than redoing so much of it in a trivially different way. That was 1968, fifty years ago, and very often I feel a little bit like that: we all make things slightly differently, we all have pressure to publish, so we modify things a little bit, but much of it is only trivially different.
When I started my PhD in content-based image retrieval, I started reading papers and realised I actually had no idea whether these algorithms were good or not. I reprogrammed several algorithms from other people's papers, and they did not work on my data; it was impossible to compare to other approaches. So I started working on evaluation, making data sets available, pushing people to compare. Everybody said that's great, that's what we need to do, but in the end almost no one came and compared results, so I was a little bit frustrated. That is when we created what amounts to, I think, about fifty scientific challenges over the years. It is quite interesting to see, because then everybody has exactly the same environment, the same data, the same experimental setup to compare results.
Over the years we have seen several results, and as we also saw this morning, people have pushed towards making code available and making data available. We did a bit of bibliometric analysis: papers with available code tend to get more citations, and in particular papers that make data available usually get many more citations than papers that don't. It doesn't happen within a year or two, but rather over a longer period of time, say five to ten years. There are also commercial platforms like Kaggle that have made data sets and scientific environments available, with plenty of challenges with prize money, so people participate in those. In our scientific challenges we don't have prize money; it is more of a scientific goal.
One of the things that struck me at some point is the quote "the plumbing needed to move data is an unavoidable part of practising data science." So moving data was seen as something important. But one of the things we realised over the years is: yes, we want to share data, we want access to large data sets, and we want them well annotated. Here is the paper of the group I mentioned at the beginning: they found that results are very often based on very small data sets, and the more competitive the domain, the more wrong the findings, because under pressure people don't take the time to analyse properly but publish results very quickly.
There is also the NIH, which really pushes people to make data sets from projects available: every NIH-funded project has to make its data available at the end of the project. The problem is that you can make data available in many ways that make it unusable, and that is quite often what happens: the data is not well annotated, and sometimes important parts are missing, such as how the data were created. So there are many problems.
So over the years we realised that there are many difficulties in organising challenges. One is extremely large data sets. As the quote said, shipping the data around, the plumbing, is unavoidable. I just downloaded five terabytes from the National Library of Medicine, a large data set: it took about three weeks for five million zip files, and then another two weeks to unzip them. So it is messy and not very convenient. And it always depends on your connection: we have pretty fast lines in Switzerland, but if you happen to be at, say, a university in Algeria and want to do the same, it will take you two years to download the data, so it is just not possible.
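The back-of-envelope numbers behind this are easy to check. A small sketch (the link speeds are illustrative assumptions, not measurements from the talk):

```python
# Rough transfer times for a 5 TB data set at different link speeds.
# Figures are illustrative; real sustained throughput is usually lower
# than the nominal line rate.
DATASET_BITS = 5 * 8e12  # 5 terabytes expressed in bits (decimal TB)

def days_to_download(mbit_per_s):
    """Days needed at a sustained rate of `mbit_per_s` megabits/second."""
    seconds = DATASET_BITS / (mbit_per_s * 1e6)
    return seconds / 86400

for name, rate in [("1 Gbit/s university link", 1000),
                   ("100 Mbit/s office link", 100),
                   ("2 Mbit/s link", 2)]:
    print(f"{name}: {days_to_download(rate):.1f} days")
```

At a gigabit the transfer takes well under a day; at a few megabits per second it stretches into the better part of a year, which is the asymmetry the talk describes.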
I have also had hard disks shipped: there is a data set from the National Library of Medicine in the US, about three terabytes I think. They sent the hard disks, and the housing was broken when they arrived. We managed to reconstruct most of the data, but there are problems with that approach too.
Another problem: I work on medical data, and medical data is in general confidential, so it is hard to distribute. Small data sets can often be anonymised and checked manually, in particular text documents and images. But even then there is always a small risk that something is missed. Prostheses, for example, have unique identification numbers, so if there is a prosthesis somewhere in a radiographic image, you can actually read the number and re-identify the person. There are always small risks that we might not see and only realise afterwards, and if you have a small problem and multiply it by a large amount of data, you actually have a pretty big problem.
So we need to look at how we can deal with that. The same holds in other domains: in enterprise search, can we make companies' email archives available? In the investigative domain, can police forces make available data on terrorism, screening and things like that? Telephone companies have a lot of GPS data from people, and you can see where those people live, so you cannot really distribute the data to researchers even though they would really like to use it. And apart from Kaggle or similar platforms, there is no place to actually run things on such data.
Another problem is quickly changing data. In some domains new data arrives constantly, and we would like to evaluate the algorithms on the latest data. If we create a test data set, make it available on a platform, let people work on it and get the results evaluated, that takes nine months to a year, which in some domains is quite a lot.
Then there is cheating, which, as I mentioned, is actually a problem on platforms like Kaggle where there is prize money: can people just annotate the test data and then train with it to get better results? In a purely scientific setting we never really worried about this; in most of the challenges we ran, if a published result is not reproducible, at some point people might realise it because they cannot reproduce the results. But if there is a competition like the lung cancer one with a million dollars of prize money, there are real incentives to get the best possible results by any means.
Another problem is that groups with a lot of computing power have advantages: you can run more complex models, you can run more training, so potentially you can get better results. Can we actually normalise that? Can we make things really comparable?
This was one idea to get rid of these problems, so we looked into cloud computing. This whole project started in 2012, I think, when the cloud was just coming up, and we said: OK, we want to make a competition where the participants actually get a virtual machine. They can work on a small data set to adapt their algorithms and train their systems, segmentation systems or retrieval systems, and then for testing we take over the virtual machines, we cut the access, and we run them on a large data set. Some of what you have seen this morning with the BEAT platform has quite similar objectives to what we had in mind, although ours is not as integrated; that is also why I asked the question about which domain it is for. Medical imaging people use totally different tools, and that is hard to prescribe; with a virtual machine they can use their favourite environment, they can use Matlab if they want to, they can use Windows or Linux machines and install whatever they need. So this was the idea, and this is what we had in the project proposal, I think in 2010 or 2011, when we wrote the proposal.
Then we started annotating data. The first competition we ran was on medical image segmentation; that was the cover picture you have seen. We annotated twenty organs in eighty images; there were forty images for training and another forty that we then used for testing. Some of the organs are not visible in all modalities, so we could only annotate what is actually visible in the data.
Just as an example, these are people with these organs annotated. You might think that organ segmentation is a pretty simple problem, but in fact humans are pretty different when you look at it: some have big livers, some small ones, small lungs, big lungs, so there is a lot of variability. When we want to process medical data, when we actually want to make sure that we are comparing the same things, we need to be able to look into these kinds of things automatically.
And these are some of the results. In grey we have marked what the radiologists annotated, and then we have the different participant algorithms, so this is what we can compare, in three dimensions. You can see that sometimes the automatic algorithms fail completely; some organs are really difficult to annotate in the data. Another thing we realised is that even humans show a lot of variation, so everything was annotated by several humans. We then ran comparisons, so we also have a baseline that we consider the maximum performance: if an algorithm's agreement with the annotators is as high as the annotators' agreement with each other, we considered that an optimal segmentation.
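Segmentation quality in challenges like this is typically scored with overlap measures, commonly the Dice coefficient. A minimal sketch for binary masks (an illustration, not the project's exact evaluation code):

```python
import numpy as np

def dice(mask_a, mask_b):
    """Dice overlap between two binary segmentation masks (1.0 = identical)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:          # both masks empty: define as perfect agreement
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / denom
```

Scoring each algorithm against the radiologist masks, and the radiologists against each other, gives exactly the human-agreement ceiling used as the optimal-performance baseline.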
Another task we ran within the VISCERAL project was lesion detection: detecting small lesions in bones, the liver, the brain, the lungs and in lymph nodes. Here too we could make data sets available that people could then run on. Unfortunately this is a very hard task, and for this task basically nobody participated in the competition.
Another part was similar case retrieval. The idea was that we give a case with a region of interest marked (we mark which organ is actually interesting), we provide an automatic segmentation, and we have the radiology reports. But the text documents are hard to distribute, because anonymisation is not as easy as with images or structured data, so we extracted semantic terms from these documents. The participants then had to find similar cases in a database of, I think, three or four thousand volumes. That was the third task we ran within the project.
Another advantage of having the participants' code is that we can create something we call a silver corpus. As an example, we took our outcomes and ran the algorithms on new data: we have 120 volumes that were manually annotated, but a few thousand volumes for which we have no annotations. By using several algorithms and simple label fusion, we can create something that is fairly similar to what a human might annotate. Like that we can create additional training data that can then be released and used to train the algorithms again and run on the test data, so hopefully, with an iterative but automatic approach, we can improve our algorithms. It can also serve to get annotations, even if they are not perfect, for a very large number of cases.
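A minimal version of such label fusion is a per-voxel majority vote over the participant outputs. Real silver-corpus pipelines can weight algorithms by their measured performance; this sketch treats them all equally:

```python
import numpy as np

def majority_vote_fusion(label_maps):
    """Fuse several integer label volumes into one by per-voxel majority vote.

    label_maps: list of equally shaped integer arrays (0 = background,
    1..K = organ labels). Ties go to the lowest label id.
    """
    stack = np.stack(label_maps)               # (n_algorithms, *volume_shape)
    n_labels = int(stack.max()) + 1
    # count, for every voxel, how many algorithms voted for each label
    votes = np.stack([(stack == lbl).sum(axis=0) for lbl in range(n_labels)])
    return votes.argmax(axis=0)                # winning label per voxel
```

The fused map can then serve as the (imperfect but large-scale) annotation for the unlabelled volumes.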
This also allows us, for example, to choose for manual annotation those cases where the participant algorithms produce the most different results. Even if all of the participant algorithms are wrong, we gain the most information for ranking the algorithms from the cases where they actually differ. So rather than choosing cases randomly, we look for the cases that allow us to separate the approaches.
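Picking the cases to annotate next can then be as simple as ranking them by how much the algorithms disagree. This is a hypothetical sketch of that idea; any proper active-learning criterion could replace it:

```python
import numpy as np

def disagreement(label_maps):
    """Fraction of voxels where the participant algorithms do not all agree."""
    stack = np.stack(label_maps)
    all_agree = (stack == stack[0]).all(axis=0)
    return 1.0 - all_agree.mean()

def cases_to_annotate_first(cases):
    """Order case ids by algorithm disagreement, most contested first.

    cases: dict mapping case id -> list of label maps from the algorithms.
    """
    return sorted(cases, key=lambda cid: disagreement(cases[cid]), reverse=True)
```

Cases where all systems agree contribute little to the ranking, so the annotation budget goes to the contested ones first.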
In the end, this is pretty much what we had at the end of the project, and it got a lot more complex. We still have virtual machines for the participants, with a registration system: people can register automatically, and when they sign the license agreement they get a virtual machine assigned. We have an analysis system, so when they have trained and say "OK, now I submit my system", the link is cut, the analysis system runs on the test data and then submits the results to us, the organisers, and to the participants. Also, all of the annotation actually happens in the system, so we don't need to move data anymore: all of the data remains in the same place, and all the annotation happens there. With time we also realised that virtual machines are not as portable as we thought: moving them from an Azure cloud to an Amazon cloud is nontrivial.
That is why we have actually moved towards Docker, and I was really happy to see that BEAT is going towards Docker as well, because Docker is a really lightweight approach. Code is getting increasingly mobile in this respect. We have had several projects in the US, and we are also discussing this in Switzerland: this way we can work not only on data that is in the cloud, which is usually anonymised data; we can actually move the code further. We can move it to hospitals, so we can work directly on live data, because we no longer have the problem of having to anonymise the data or move it out: only the code sees the data, no researcher ever sees it, and the researchers only get the evaluated results back. With the National Cancer Institute we have also worked on how to generalise this, so they can run their data challenges in a distributed way, potentially sending the code to several institutions where the results are evaluated and then aggregated for further analysis. This is just an overview, but really, Docker is a lot lighter than virtual machines, it is much more mobile, it is easy to move around, and it also avoids overhead for groups that would otherwise have to reinstall a whole software stack, because they can just build a local Docker container and move that over.
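As an illustration, a participant's whole environment can be captured in a few lines of Dockerfile. The file and image names here are hypothetical placeholders, not VISCERAL's actual setup:

```dockerfile
# Hypothetical participant container: the organiser runs it next to the
# data, mounting the (unseen) test set read-only into /data.
FROM python:3.10-slim
RUN pip install --no-cache-dir numpy SimpleITK
COPY segment.py /app/segment.py
# Reads volumes from /data, writes label maps to /output.
ENTRYPOINT ["python", "/app/segment.py", "--input", "/data", "--output", "/output"]
```

The organiser then runs something like `docker run -v /testset:/data:ro -v /results:/output participant-image`, so the code travels to the data rather than the other way around.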
There are a couple of other lessons we learned from this project. Cloud space is actually pretty expensive. Fortunately (thank you, Microsoft) we received, I think, 150,000 dollars in cloud computing resources over the years. We used this to run things and evaluate them, but if you leave virtual machines running without doing anything, it costs quite a lot, so we developed methods to shut them down, to stop them, because this matters in terms of sustainability. Also, as with the BEAT platform, if you have people running things on your platform, you are responsible for keeping it sustainable, so we are looking into ways to control the cost and perhaps also to limit the amount of computation any one person can take. The installation is additional work too. We also had requests for GPUs: in 2012 the Azure cloud had no GPUs; now it does, so we could more easily support that, but at the time we couldn't, so the groups that wanted to use deep learning were pretty much excluded.
Also, when we run competitions, or coopetitions, it takes quite a while to build up. When we ran the challenge for the first time there were five participants, then twelve, then seventeen. Now, I think three years after the project finished, we have over two hundred groups registered on the registration system, and we have run their code on the data. That shows that we really need to think long term if we want to have an impact, and I think that is really important when we think about open science: you will not get gratification for your data or your code within a year; it is really over a span of about five years that you see whether people pick it up, whether people use it.
We also realised that troubleshooting the participants' virtual machines is actually a problem, because people cannot see what happens on the test data. Sometimes the systems failed, for example with out-of-memory errors. With a small number of participants that is fine, because we can go in, check manually and supply feedback, but for larger numbers you need something more automatic, just like the BEAT platform, for example.
But I think we have already solved quite a few of the problems. We can make large data sets available without the shipping problem: in the era of tens of terabytes, we bring the algorithms to the data instead of moving the data the other way around. We can run on confidential data. We can always use the latest data: sometimes the ground truth in our data set changed, because somebody saw that a segmentation was incorrect, so we modified the data and ran everything again; if we have the code, that is trivial to do. If we have more cases, we can rerun everything, and like that we can also see how stable the ranking is, because there is variability: it depends on the topic, it depends on the exact images, and very often the differences between the best systems are within the error margin, so quite often there is no statistically significant difference.
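Whether two systems differ beyond that noise can be checked, for example, with a paired bootstrap over the per-case scores. This is one of several reasonable choices, sketched here, not the project's actual test:

```python
import numpy as np

def paired_bootstrap_p(scores_a, scores_b, n_boot=10_000, seed=0):
    """Two-sided bootstrap p-value for 'systems A and B perform the same'.

    scores_a, scores_b: per-case scores (e.g. Dice per volume) for the
    same cases, in the same order.
    """
    rng = np.random.default_rng(seed)
    diffs = np.asarray(scores_a, float) - np.asarray(scores_b, float)
    # resample cases with replacement and look at the mean difference
    idx = rng.integers(0, len(diffs), size=(n_boot, len(diffs)))
    boot_means = diffs[idx].mean(axis=1)
    p = 2 * min((boot_means <= 0).mean(), (boot_means >= 0).mean())
    return min(p, 1.0)
```

If the bootstrap distribution of the mean difference straddles zero, the observed ranking of the two systems could easily flip on a different sample of cases.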
We can reproduce things, because with virtual machines it is relatively easy: even if we are working on different hardware, everything is virtual, so results are relatively easy to reproduce. We can also avoid cheating, because nobody sees the test data: nobody can annotate it, nobody can optimise manually by looking at it. And we removed bias in computation time: we know how long everything took, and everybody got the same virtual machine, so we can see that some algorithms run for two minutes while others run for twenty-four hours for exactly the same task. There are huge differences, on the order of a factor of a hundred, in that respect.
Still, reuse of components is not trivial, and making people collaborate more on components was something we found quite hard; really pushing people to work together was difficult. I think those were the two parts that turned out differently than hoped. Based on these outcomes we asked: how can we push this further, and what are the obstacles to making this approach more general and simple? So we organised one workshop here in March 2015, and another one in Boston in November 2015, with very different stakeholders: from industry, with people from Intel and Microsoft, from funding organisations, from academia, user groups, and so on, to look at the existing approaches and how we can move forward.
There is a white paper that we put on arXiv, so it is openly accessible. It is not peer-reviewed, but people can have a look at it. It covers the different ways of doing this, using APIs or executable code, and looks at the different stakeholders. We can see, for example, that the National Cancer Institute pushed towards the organisation of a challenge, the Digital Mammography DREAM Challenge. It is organised by Sage Bionetworks, an American organisation based in Seattle that runs scientific challenges, and they picked up exactly this model: they use Docker containers, participants have a limited amount of computation available, and it runs on IBM and Amazon infrastructure. They have a million-dollar prize, so if you are interested, it is quite attractive. And they have what is called a community phase: initially everybody submits results separately, but in the end they really want people to collaborate, to work together to get the best results. So the best way to win one of the prizes is very likely to have good results early and then team up with others, which I quite liked.
There are also business models: there is a company in medical computer vision that also makes data available in Docker containers, and the idea is that if the algorithms work well, they can commercialise the results and share the benefits. I think it is an interesting model, and it shows that business models are possible here too. Microsoft showed that they want to make resources available: every two months, I think, they have a deadline for project submissions where you can apply for computing power, so it is relatively easy to get. And there is a lot of discussion among the various stakeholders, including in politics. The situation in the US may have changed a little bit, but there was definitely a push towards making things available, for example the Cancer Moonshot, which asks researchers to make data available and share data; those were things that researchers in Boston pushed for. The same goes for Intel: they have what is called the Collaborative Cancer Cloud, and the model is really that hospitals will have computing infrastructures in the future, so we can move the code there and work directly on the data. The data then does not need to leave the institution, which avoids the risk that, particularly in the US where many data sets can be bought, even protected data is matched with other data sources, from voter records to various registries, to re-identify patients. By keeping the data in the institutions, you avoid these kinds of problems.
On institutional support: the National Cancer Institute is participating, and they are really pushing for running challenges and making things available. This is one of the projects I am involved in, the Quantitative Imaging Network of the National Cancer Institute, where they decided that every evaluation of data in this context is to be run via Docker containers; the containers are kept available and run in the cloud, to make sure that everything is absolutely reproducible, from data creation onwards. They are working on image data creation and feature extraction, because even standard features, like features based on co-occurrence matrices for texture extraction, gave extremely different results. They had seven hospitals participating in one of these tests, and they realised that even very basic features did not correlate at all, depending on the installation people were using and on whether and how they were normalising. I think it is really important to look at what we start with, because even the image creation was very different: the same machine installed in two different hospitals did not deliver the same image, and they were physically shipping a phantom around and taking images of it. If the images are not the same even with the same parameters, then when we extract quantitative features, everything depends on what was done before; in terms of reproducibility that is really important.
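The normalisation issue is easy to reproduce: the same contrast feature computed from a raw versus a normalised co-occurrence matrix differs by exactly the number of pixel pairs counted. A toy sketch, not the QIN pipeline:

```python
import numpy as np

def glcm(img, levels, dx=1, dy=0, normalise=True):
    """Grey-level co-occurrence counts for one pixel offset (dx, dy).

    img: 2-D integer array with values in [0, levels).
    """
    m = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            m[img[y, x], img[y + dy, x + dx]] += 1
    return m / m.sum() if normalise else m

def contrast(p):
    """Haralick-style contrast: sum over (i - j)^2 * p(i, j)."""
    i, j = np.indices(p.shape)
    return float(((i - j) ** 2 * p).sum())
```

On the same image, `contrast(glcm(img, L, normalise=True))` and the unnormalised variant disagree by a factor that grows with image size: exactly the kind of implementation detail that can make features from different sites uncorrelated.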
As I mentioned, this is the political part of my talk: when we look at Evaluation-as-a-Service as a concept, we really need to look at the stakeholders. We have funding institutions, like the National Science Foundation; we have companies that have their part in it; we have scientists, and everybody has different constraints; and we have people with a problem, like the clinicians in our case, who would like to use data science tools to help them solve it. How can we bring these different interests together? Because everybody would have an interest in a system that is more efficient and more effective in the end. There is more on all this in the white paper.
OK, I have five minutes left, as always. For me, one of the messages for today is that many things are centred around data, and it was mentioned this morning that data sets are essential. Nature now has a journal called Scientific Data, which publishes data sets only. It is an open-access journal: if you have a nice data set, you can submit it there and expose it to the community. They have a very good review process and give very good feedback, and I think it is important to make data available in meaningful ways. We have one of our data sets published there, and the reviewers actually downloaded the data and read the instructions on how to use it; they even sent corrections. I was quite impressed: they really take it seriously that the data is made available in a meaningful way, so people can actually rely on it and reuse it. I think we also need to look into how we can share infrastructures. One of the difficult parts is really how to make this sustainable, for example a platform like BEAT: how can we make it sustainable independently of project funding? Because if you depend on project funding, it is not easy to keep something running long term. Our code will definitely become more portable, and Docker really helps with that, but I think we also need to look into public-private partnerships: how can we bring infrastructure providers in? Now, with the EPFL/ETH data science centre, maybe there will be a science cloud for Switzerland that is easy to use; that is something to look into. If we really want long-term reproducibility and sharing of tools, we need to look at how we can create an infrastructure that is flexible enough, maybe different infrastructures for different domains, because it would be much more efficient; it just doesn't make sense that so many people develop in different directions.
I think data science really requires an infrastructure. Everybody is working, in one way or another, on artificial intelligence, and everything revolves around data. In the medical field I think we need to work on routine data: currently, whenever we produce data for a clinical trial, we have perfect image quality and everything is very controlled. Routine data is not like that: you don't want the best possible image, you want the lowest radiation dose for your patients, so you try to limit things, and the quality is not the same. If we want our tools to be usable in a real setting, we need to look at how we can handle that. We need to look at ways to limit the amount of manual annotation: we can use active learning, we can use weakly annotated data to learn, and we might get much larger data sets from hospitals that actually allow us to do that. And then we need to share infrastructures: most domains want to build decision support, in the medical field but in many others as well, and it can really help somebody. There is a lot more available on the various web pages, and don't hesitate to contact me if you have any questions.

Conference program

Welcome
Sébastien Marcel, Senior Researcher, IDIAP, Director of the Swiss Center for Biometrics Research and Testing
24 March 2017 · 9:17 a.m.
Keynote - Reproducibility and Open Science @ EPFL
Pierre Vandergheynst, EPFL VP for Education
24 March 2017 · 9:20 a.m.
Q&A: Keynote - Reproducibility and Open Science @ EPFL
Pierre Vandergheynst, EPFL VP for Education
24 March 2017 · 9:54 a.m.
VISCERAL and Evaluation-as-a-Service
Henning Müller, prof. HES-SO Valais-Wallis (unité e-health)
24 March 2017 · 11:35 a.m.
Q&A - VISCERAL and Evaluation-as-a-Service
Henning Müller, prof. HES-SO Valais-Wallis (unité e-health)
24 March 2017 · 12:07 p.m.
