
Transcriptions

Note: this content has been automatically generated.
00:00:00
and this talk is about reconstructing neurons in 3D.
00:00:04
And the context is that, having acquired a specimen,
00:00:10
for example of neural tissue from a mouse brain,
00:00:17
a neuroscientist would like to verify a
00:00:21
hypothesis about the propagation of
00:00:25
impulses in the neural tissue.
00:00:30
So basically a physical experiment can be performed,
00:00:33
where we attach electrodes to the specimen,
00:00:37
excite some neurons, and observe the propagation of the signal, of the impulses.
00:00:43
But we would also like to have a visual model of this neural network,
00:00:50
to verify all of that in a simulation setup.
00:00:58
Or imagine we want to see
00:01:02
how a certain neurodegenerative disease
00:01:05
influences the topology
00:01:09
of the neural tissue,
00:01:12
or maybe we simply want to verify the hypothesis that
00:01:18
learning consists in creating new connections between neurons.
00:01:24
So in each of these scenarios we need to
00:01:29
construct a model of the neural tissue
00:01:32
from 3D observations. Normally the observation will be a microscopy image:
00:01:37
a 3D stack of images of the neural tissue.
00:01:42
And what the image looks
00:01:47
like is depicted on the left,
00:01:52
and the model that would be
00:01:56
of use for simulations takes the form of a graph;
00:01:59
it is overlaid in green on the right side of the slide.
00:02:04
And we perform such reconstructions usually in two steps:
00:02:11
first we have a method for segmenting the
00:02:15
volume into
00:02:17
axons and dendrites versus everything that is neither,
00:02:23
and then on top of that we construct a graph.
00:02:28
The details of this method are not essential, but in general
00:02:33
it consists in creating an overcomplete graph: we subsample
00:02:37
the space, putting graph nodes more or less evenly everywhere,
00:02:42
we join the nodes
00:02:45
within a certain distance with edges,
00:02:49
and then there is an optimisation scheme that enables us to
00:02:54
discard the edges that do not really represent neural connections.
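The overcomplete-graph step described here can be sketched in a few lines. This is a minimal illustration only, not the speaker's actual method; the function name and the `spacing` and `radius` parameters are assumptions:

```python
import numpy as np
from itertools import combinations

def overcomplete_graph(mask, spacing=4, radius=6.0):
    """Build an overcomplete graph over a binary segmentation mask.

    Nodes are placed on an evenly sub-sampled grid restricted to the
    foreground; edges join every pair of nodes closer than `radius`.
    A subsequent optimisation (not shown) would discard the edges that
    do not correspond to real neural connections.
    """
    # Sub-sample the volume: keep foreground voxels on a coarse grid.
    zs, ys, xs = np.nonzero(mask)
    on_grid = (zs % spacing == 0) & (ys % spacing == 0) & (xs % spacing == 0)
    nodes = np.stack([zs[on_grid], ys[on_grid], xs[on_grid]], axis=1)

    # Join nodes within `radius` with edges.
    edges = [(i, j) for i, j in combinations(range(len(nodes)), 2)
             if np.linalg.norm(nodes[i] - nodes[j]) <= radius]
    return nodes, edges
```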
00:03:00
And this talk is focused on the segmentation part of this pipeline,
00:03:04
because there are some interesting problems to be solved there.
00:03:09
Most often the segmentation is performed
00:03:14
by a convolutional neural network.
00:03:18
And the disadvantage of this approach is
00:03:22
that to get a highly
00:03:25
performant network we need a lot of training data,
00:03:30
and the data needs to come with annotations,
00:03:35
and these annotations are of three-dimensional volumes,
00:03:40
so as you can expect they are costly to produce:
00:03:43
first because annotating in three dimensions is
00:03:49
time consuming, and second because
00:03:53
only trained experts can actually
00:03:57
perform this annotation correctly.
00:04:03
And we were thinking how to address this problem, and within this context
00:04:10
we were actually looking at an interface, a software interface designed
00:04:13
for annotating these dendrites and axons.
00:04:18
Of course it is a computer interface, so what the user sees
00:04:21
on the screen is basically a 2D image.
00:04:25
And the way these images are created is called a maximum intensity projection:
00:04:31
basically you take your volume,
00:04:36
the user has the opportunity to rotate it around the three coordinate axes,
00:04:43
and what appears on the screen is an
00:04:47
image that is constructed by taking the maximum
00:04:51
along a ray that crosses the volume and that is basically perpendicular to the computer screen,
00:04:57
and the pixel through which this
00:05:02
ray has been shot contains the maximum value that the ray crosses.
00:05:08
So because these images have this nice property that the
00:05:11
structures of interest are brighter than the background,
00:05:14
you get to see all the important connections in an image like that.
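The projection just described reduces to a single maximum over one axis of the volume. A minimal sketch, assuming an axis-aligned view (the interface additionally lets the user rotate the volume first); the helper name is illustrative:

```python
import numpy as np

def max_intensity_projection(volume, axis=0):
    """Render the 2D image the annotator sees on screen.

    Every output pixel holds the largest value along the ray that crosses
    the volume perpendicular to the screen; here the rays run along one
    coordinate `axis`. Because structures of interest are brighter than
    the background, they stay visible in the projection.
    """
    return np.asarray(volume).max(axis=axis)
```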
00:05:20
Now, to annotate such a volume, of course
00:05:24
you need to click, and given the 2D coordinates of the
00:05:29
computer screen, one of two things will happen:
00:05:32
either you will adjust the depth manually,
00:05:37
because there is a depth coordinate that the 2D
00:05:40
position of your mouse does not correspond to,
00:05:43
or the depth will be selected automatically. In the first case the process is
00:05:51
time consuming, and it is really not easy to navigate in a 3D
00:05:55
volume using a 2D interface, and in the second case
00:05:58
this depth selection is often based on heuristics that are prone to failure.
00:06:06
So the question we are asking ourselves is: instead
00:06:11
of actually collecting these 3D annotations, which is costly, can't we use
00:06:18
annotations of
00:06:24
the image that the user sees on the screen to train the neural net?
00:06:28
So basically, let's say we take three maximum intensity projections, depicted in the
00:06:35
top left corner of the slide, we annotate
00:06:38
them, and we use them to train our network.
00:06:43
This annotation process is a lot less
00:06:48
costly, more practical hopefully, and the problem now consists
00:06:55
in formulating a loss function that can accommodate on one side the
00:07:01
volumetric prediction and on the other side annotations of its projections.
00:07:07
And that can be addressed in a very simple way: basically we can project
00:07:14
the volumetric prediction in the same way as the
00:07:19
annotated input images were projected, and compare the projections.
00:07:28
And voilà: the projection of the prediction is performed, as I said, according to the same
00:07:34
principle as the projection of the volume when it is
00:07:38
annotated, and in consequence the
00:07:43
loss function has the following property: there is
00:07:49
one cost computed for each pixel of
00:07:54
each of these projections, and if a pixel is labelled as background,
00:08:00
the loss actually depends on the largest element
00:08:05
of the column, or row, or tube, of the prediction that
00:08:12
projects to this pixel. And the nice property here
00:08:18
is that for a background pixel we actually want to
00:08:25
minimise the largest value in the mentioned column,
00:08:30
and that corresponds to minimising an upper bound on the whole column.
00:08:36
So you can say that a background pixel is actually constraining
00:08:41
a whole row or column of voxels in the prediction.
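One way to realise the loss just described is to max-project the prediction and score the result against the 2D labels. The sketch below uses binary cross-entropy as the pixel-wise cost, which is an assumption (the talk does not name the exact cost); the property described still holds: for a background pixel the cost depends only on the largest value of the corresponding column, so minimising it pushes down an upper bound on every voxel along that ray.

```python
import numpy as np

def projection_loss(pred, annot2d, axis=0, eps=1e-7):
    """Score a volumetric prediction against a 2D annotation of its projection.

    `pred` holds voxel-wise foreground probabilities. It is projected with
    the same maximum operation used to render the image the user annotated,
    and the projection is compared to the 2D labels with binary
    cross-entropy (an assumed choice of pixel-wise cost).
    """
    # Project the prediction exactly like the maximum intensity projection.
    proj = np.clip(pred.max(axis=axis), eps, 1.0 - eps)
    # Per-pixel cost: for a background pixel this is -log(1 - column max).
    return float(-np.mean(annot2d * np.log(proj)
                          + (1.0 - annot2d) * np.log(1.0 - proj)))
```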
00:08:48
And there is a connection between this and
00:08:52
the classical method of 3D reconstruction called space carving,
00:08:56
where we have a number of cameras,
00:09:00
we segment the images of the scene into foreground and background,
00:09:04
and then the reconstruction process consists, in abstract terms, of shooting
00:09:10
rays from each of the cameras: if a ray passes through
00:09:15
a background pixel, we remove all the voxels along the ray's path.
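The carving process can be sketched with axis-aligned orthographic views, which match the projections used in this talk; real space carving uses calibrated cameras, and the function name is illustrative:

```python
import numpy as np

def space_carve(shape, masks_and_axes):
    """Carve a volume from foreground/background masks of several views.

    Start from a fully occupied volume; for every background pixel of a
    view, remove all voxels along the ray through that pixel. Each view
    here is an axis-aligned orthographic projection of the volume.
    """
    vol = np.ones(shape, dtype=bool)
    for mask, axis in masks_and_axes:
        # Broadcasting the 2D mask along `axis` clears the background rays.
        vol &= np.expand_dims(np.asarray(mask, dtype=bool), axis)
    return vol
```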
00:09:23
So this is the basic concept of
00:09:27
this loss function; however, it has one problem.
00:09:33
Typically the volumes that we would like to annotate should be large,
00:09:39
because this is more time efficient, and also
00:09:42
the topology of the neural network is
00:09:46
better seen in large volumes. However, normally
00:09:52
when training a neural network we can only
00:09:55
forward through the network a
00:10:01
cube of a limited size, due to constrained
00:10:06
memory. Now imagine that we have annotated the volume
00:10:11
presented on the slide, but we can only
00:10:15
crop, and forward through the neural network,
00:10:19
the cube marked in red.
00:10:23
The problem is that even though we can crop
00:10:30
the maximum intensity projection and the corresponding annotations to the corresponding size,
00:10:37
the projection will still contain images
00:10:43
of structures that come from
00:10:49
outside of the cropped volume. So you
00:10:53
see these bright dendrites
00:10:56
here: even though they are outside of this volume, they will still appear in the projection,
00:11:02
and the same applies to the annotations, of course. So this is an
00:11:08
annotation that does not correspond to our training data, and we found out that you can actually address this problem,
00:11:15
at least partly, again by drawing from the field of
00:11:21
3D reconstruction and space carving, and using the
00:11:25
construct called the visual hull.
00:11:29
So,
00:11:34
I am afraid this is not really well visible on the screen, but
00:11:40
at the top left you have a volume with three
00:11:46
foreground voxels; then in the middle there are
00:11:50
annotations, three of them, and
00:11:54
the foreground voxels are in the corners of the cube, but it is completely invisible on the screen, I am sorry for that.
00:12:00
And then there are the projections of the cube, with annotations of the projections;
00:12:06
basically each of these voxels is visible in each of the projections.
00:12:19
And the visual hull is basically an intersection of
00:12:24
the back-projections of these annotations into the volume.
00:12:28
So as you see, it contains all the original
00:12:32
foreground voxels, but it can also contain additional voxels: it is a superset of all the
00:12:39
volumes that explain these annotations. But it has a nice property:
00:12:46
if a voxel is marked as foreground in the visual hull, it
00:12:51
means it has been marked as foreground in all the annotations.
00:12:58
So we can use this property by observing that if we just cut out half
00:13:02
of this cube, the left one with just one foreground voxel,
00:13:06
and we crop the annotations accordingly, we will have
00:13:11
two annotations with just a single foreground voxel marked,
00:13:17
and a third one with all three of them marked. However, we can
00:13:21
reconstruct a visual hull from the cropped
00:13:27
annotations, obtaining only one foreground voxel, because the
00:13:33
other two are not consistent in 3D
00:13:37
with the annotation crops, and that enables us to
00:13:41
prune some false-positive annotations. However, in
00:13:45
cases where, and this is a separate
00:13:48
image, in cases where
00:13:53
occlusions make the annotations mark a voxel that does not
00:13:58
exist in reality, even by reconstructing the visual hull
00:14:03
we cannot really recover
00:14:07
the correct annotations.
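The pruning step just described can be sketched as follows: intersect the back-projected cropped annotations to obtain the visual hull, then re-project the hull; annotated pixels that no remaining 3D voxel can explain disappear. The function name and the axis-aligned orthographic projections are assumptions:

```python
import numpy as np

def prune_annotations(annots):
    """Prune projection annotations that are inconsistent in 3D.

    `annots` maps a projection axis to its cropped binary 2D annotation.
    The visual hull is the intersection of the back-projections: a voxel
    survives only if it projects to foreground in every annotation.
    Re-projecting the hull then drops annotated pixels that no surviving
    voxel can explain.
    """
    # Back-project each annotation along its axis and intersect.
    hull = None
    for axis, mask in annots.items():
        back = np.expand_dims(np.asarray(mask, dtype=bool), axis)
        hull = back if hull is None else (hull & back)
    # Re-project the hull to obtain 3D-consistent annotations.
    return {axis: hull.max(axis=axis) for axis in annots}
```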
00:14:15
So we have addressed the presented problems,
00:14:20
but now the basic question to ask is
00:14:25
whether it is really better to annotate in 2D than in 3D.
00:14:32
So to shed light on the answer to this question, we have run a
00:14:38
small user study with fifteen users, who were asked to annotate
00:14:42
microscopy volumes, both in 2D and in 3D.
00:14:49
And you see the results in this plot: the time of the 3D annotation is on the
00:14:53
x axis and the time of the 2D annotation on the y axis,
00:14:57
and each point corresponds to a volume which has been annotated once in 2D and once in 3D,
00:15:04
in random order and by different people. And you see that, on average,
00:15:09
annotating in 3D tends to be faster than annotating,
00:15:15
sorry, I got it the wrong way round: annotating in 2D tends to be faster than annotating in 3D.
00:15:22
And in total, if you sum the annotation times, it took eight
00:15:26
working hours to annotate all
00:15:31
the volumes in 3D, and five hours to annotate them in 2D.
00:15:39
And to sum up what the users said: they
00:15:43
liked annotating in 2D; however,
00:15:48
the 3D view seemed, to some of the users,
00:15:52
to be useful to disambiguate
00:15:55
whether the pixels they are looking at actually belong to a neurite or are just noise.
00:16:00
And the other conclusion of the study was that
00:16:06
a very important factor is delivering an interface which is
00:16:11
intuitive, simple and fast, and basically that is the game changer.
00:16:18
And we evaluated the method
00:16:21
experimentally, just to verify that
00:16:25
this supervision, which is
00:16:31
less strict in terms of the
00:16:33
consistency of the projections, does not lead to results
00:16:38
that are much, much worse than the
00:16:41
performance of networks trained with
00:16:46
3D annotations. And indeed, on a dataset of
00:16:53
confocal, no, actually this is two-photon microscopy images of
00:16:59
a mouse brain, we got even better results when training on the
00:17:05
maximum intensity projection annotations than
00:17:09
by training on the 3D annotations. And why it is
00:17:13
so is not completely clear to me, but I guess
00:17:17
we should not draw far-reaching conclusions out of a single experiment; but we
00:17:21
can safely say that the method performs well. What is also interesting is
00:17:26
that the performance does not degrade catastrophically when the number of annotated
00:17:31
projections is decreased.
00:17:45
Yes, yes, so, I mean, this is an important point:
00:17:52
indeed, the 2D
00:17:56
annotations here were obtained by projecting the 3D annotations,
00:18:01
which is in some sense cheating, because when
00:18:06
annotating in 2D you would get a bit different annotations.
00:18:10
However, since evaluation is in 3D, according
00:18:15
to the annotations that come with the dataset,
00:18:18
it somehow makes sense to consistently use these same annotations in 2D.
00:18:23
However, for this dataset we
00:18:26
also ran an experiment where we annotated the projections again,
00:18:31
without looking at the original 3D annotations, as well as we could,
00:18:36
and the performance dropped by 1.5
00:18:41
percentage points, so not much.
00:18:44
And as for the baselines used here: actually
00:18:48
the basic baseline is annotating slices of the
00:18:52
volume, because you can argue that you can put as much effort
00:18:55
into annotating slices as into annotating projections, and maybe you will be
00:19:00
as good or even better. And the answer is that you do slightly better if you annotate the
00:19:06
projections than if you annotate the slices. And these two are some existing methods that
00:19:12
I will not describe in detail. And we did the same series
00:19:17
of experiments with a dataset of confocal microscopy images of
00:19:24
cells, and again the network trained
00:19:28
on projection annotations performs not much worse than the one trained in 3D.
00:19:34
And we did the same for a dataset of
00:19:38
magnetic resonance angiography images of brain vasculature,
00:19:43
with the same conclusion: the performance of a network trained on the projection
00:19:49
annotations is acceptable. And to conclude the whole presentation,
00:19:56
I would say that we managed to considerably lower
00:20:02
the 3D annotation effort
00:20:07
without really compromising performance.
00:20:13
And as a final remark, this method
00:20:19
is actually quite unique in that we can only apply it to data which
00:20:23
shows well in maximum intensity projections,
00:20:28
and which is sparse enough that
00:20:30
you can use the property that a background pixel constrains a whole ray.

Conference program

Learning to Segment 3D Linear Structures Using Only 2D Annotations
Dr. Mateusz Kozinski, EPFL
19 April 2018 · 11:33 a.m.
