
Transcriptions

Note: this content has been automatically generated.
00:00:01
So we will be focusing more on the transparency of 3D CNNs
00:00:06
and their structure, and on how having more structured representations
00:00:11
and operations may increase transparency.
00:00:17
So I will first introduce the interplay between transparency and how
00:00:21
we think of it, especially for CNNs,
00:00:25
and how invariance is traditionally learned in CNNs.
00:00:30
I will then present group-equivariant CNNs, which incorporate
00:00:35
equivariance in the network, and how this increases the transparency of the network.
00:00:42
I will then present our work on local rotation invariance with steerable
00:00:47
filters for medical imaging, for pulmonary nodule analysis in 3D,
00:00:53
and we will go through the details, experiments and results.
00:00:59
So first, we think of this interplay between two main parts.
00:01:03
One side is transparency, which includes simulatability,
00:01:07
that is, the simplicity of the algorithm;
00:01:11
decomposability, that is, how we understand each part of the
00:01:15
network: the inputs, the features, each parameter;
00:01:20
and algorithmic transparency, that is, how we
00:01:23
understand the optimisation of the model and how
00:01:27
we expect it to generalise to new data.
00:01:33
The other side is post-hoc explainability,
00:01:35
which is what we have mostly seen today: the visualisations, the
00:01:40
saliency maps, the examples of inputs, and the
00:01:45
natural-language explanations that can explain a pre-trained network.
00:01:51
So transparency is more the by-design, ante-hoc side, as it is sometimes called,
00:01:56
and we will focus on decomposability and algorithmic transparency.
00:02:03
So in CNNs, and in deep learning in general,
00:02:06
you have this trade-off between depth and transparency, among others:
00:02:10
if we increase the depth, the transparency as I defined it before usually reduces, because the model becomes more complex.
00:02:17
Now, the weight sharing and the local connectivity of the convolution operation
00:02:23
bring some degree of decomposability and algorithmic transparency as compared to dense networks,
00:02:28
and this comes together with the designed
00:02:31
equivariance to translation that we use in most
00:02:36
image analysis. This is the equivariance to
00:02:40
translations: if we have an input that is translated to the right, the response map is also translated to the right.
00:02:47
This holds by definition of the convolution operation, and very basic interpretability methods, such as visualisations,
00:02:53
derive directly from this design; we need this design to use these methods.
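The translation equivariance just described can be checked numerically. Below is a minimal sketch of my own (not the speaker's code), using circular cross-correlation so that the identity is exact at the boundaries:

```python
import numpy as np

def ccorr(x, k):
    """Circular cross-correlation of image x with an odd-sized kernel k,
    indexed from the kernel centre (wrap-around boundaries)."""
    c = k.shape[0] // 2
    y = np.zeros_like(x, dtype=float)
    for u in range(k.shape[0]):
        for v in range(k.shape[1]):
            # shift the image so that kernel tap (u, v) aligns with the centre
            y += k[u, v] * np.roll(x, (c - u, c - v), axis=(0, 1))
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 16))
k = rng.standard_normal((5, 5))

# Translating the input translates the response map identically:
shifted_then_filtered = ccorr(np.roll(x, (3, 2), axis=(0, 1)), k)
filtered_then_shifted = np.roll(ccorr(x, k), (3, 2), axis=(0, 1))
print(np.allclose(shifted_then_filtered, filtered_then_shifted))  # True
```

The wrap-around boundary condition is what makes the identity hold exactly; with zero padding it holds only away from the image border.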
00:03:02
Despite what I said before, convolutional filters are scale- and rotation-selective, among others,
00:03:07
and this is a problem because the hidden features vary a lot with
00:03:12
respect to geometric transformations of the images: if we have a larger
00:03:16
object, the internal representation varies, and it also varies with rotations.
00:03:24
The way this is usually tackled during training is
00:03:27
to input various representations of the same images at different orientations and scales.
00:03:33
The problem with this approach is that if we have few parameters in the network,
00:03:42
inputting these rotated versions deteriorates the directional
00:03:46
sensitivity: we end up with
00:03:49
isotropic filters, because this is the only way to obtain a rotation-invariant network
00:03:56
with few convolution operations. If we have many parameters,
00:04:01
we will learn rotated copies of the filters, like in these first layers of
00:04:04
CNNs that are very common, but at the cost
00:04:07
of heavy computation and an uncontrolled setting.
00:04:15
Now, encoding this equivariance inside the CNN has been
00:04:18
done already in the last years: we use a prior
00:04:23
on data symmetry to exploit it and hard-code it
00:04:29
in a CNN architecture. The typical example is the group-equivariant CNN,
00:04:34
in which we apply filters at different orientations to the same features, so we need to learn only one filter
00:04:41
and apply it at different orientations. In 2D, as an
00:04:44
example, we have the group of 90-degree rotations,
00:04:49
and we apply the same filter at all four 90-degree orientations, so we
00:04:53
obtain a response map for each rotation.
00:05:00
We can then obtain invariance from
00:05:03
these different responses by pooling over the orientation channels.
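The scheme just described (one learned filter applied at the four 90-degree orientations, followed by pooling over the orientation channels) can be sketched in a few lines. This is my own 2D illustration with circular boundaries, not the speaker's implementation:

```python
import numpy as np

def ccorr(x, k):
    """Circular cross-correlation with an odd-sized, centre-indexed kernel."""
    c = k.shape[0] // 2
    y = np.zeros_like(x, dtype=float)
    for u in range(k.shape[0]):
        for v in range(k.shape[1]):
            y += k[u, v] * np.roll(x, (c - u, c - v), axis=(0, 1))
    return y

def orientation_pool(x, k):
    """Apply one filter at all four 90-degree orientations, then
    max-pool over the orientation channels at every location."""
    responses = np.stack([ccorr(x, np.rot90(k, r)) for r in range(4)])
    return responses.max(axis=0)

rng = np.random.default_rng(1)
x = rng.standard_normal((16, 16))
k = rng.standard_normal((5, 5))

# Rotating the input only rotates the pooled map (equivariance) ...
print(np.allclose(orientation_pool(np.rot90(x), k),
                  np.rot90(orientation_pool(x, k))))  # True
# ... so a global spatial pooling on top of it is rotation invariant:
print(np.isclose(orientation_pool(np.rot90(x), k).max(),
                 orientation_pool(x, k).max()))  # True
```

Rotating the input permutes the orientation channels and rotates each map spatially, which is why pooling over orientations and then over space yields invariance.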
00:05:08
In 3D this becomes more complex, because we have many more 90-degree orientations: there are twenty-four right-angle
00:05:14
rotations. The drawbacks of this approach are that it only works with right-angle rotations,
00:05:22
and that we have a large number of convolutions, because we need to apply a convolution for every rotated filter.
00:05:27
These are the two drawbacks that we will address in the
00:05:31
method that we developed. But first, the benefits of group equivariance.
00:05:38
First of all, it simplifies the learning process. These are filters that are learned, as I said before,
00:05:44
on one side with a rotation-equivariant approach and on the other with a standard
00:05:49
CNN and data augmentation. With augmentation we see that the filters become almost completely
00:05:54
isotropic, that is, not sensitive to edges or basic shapes,
00:06:01
and this strongly reduces the capacity of the network; whereas here we can
00:06:07
see edges, and it may be much more efficient to classify or, in general, learn from images.
00:06:16
It also increases the transparency. In terms of algorithmic transparency
00:06:22
and decomposability, we have an enforced symmetry in the hidden features
00:06:28
that results in
00:06:32
a predictable transformation of the activations: when we
00:06:36
have a rotation of the image, so we have the image I and we apply a rotation r here,
00:06:41
and then we have a feature map obtained by convolution,
00:06:45
we know that applying the operator to the rotated image
00:06:49
gives the
00:06:51
same result as first convolving the image and then rotating
00:06:57
the result. So we basically have equivariance to rotation, and this makes
00:07:03
the model more transparent: we know that
00:07:07
we control the responses based on the input transformations.
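In symbols, the predictable transformation just described is the equivariance property. With $\Phi$ a convolutional feature extractor, $I$ the image, and $\pi_r$ the action of a rotation $r$ (this notation is mine, chosen to match the verbal description):

```latex
% Rotating the input, then filtering, equals filtering, then
% transforming the output:
\Phi(\pi_r I) = \pi'_r \, \Phi(I)
% where \pi'_r rotates the feature maps spatially and permutes their
% orientation channels; pooling over those channels yields invariance.
```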
00:07:13
The feature maps become functions of x,
00:07:17
the spatial location, and of the orientation,
00:07:22
so we can analyse, depending on which orientation channel has responded,
00:07:27
the responses that we obtain.
00:07:33
It also improves post-hoc interpretability. Here we show
00:07:38
a standard CNN on rotated digits and faces, and the class activation maps that have been
00:07:44
presented a lot. We see that with a standard CNN the map is less stable
00:07:48
than if we have rotation-equivariant responses: the
00:07:53
important region is much more stable with the rotation-equivariant architecture
00:07:59
in both cases, so we always focus on the same region of the image.
00:08:06
This indicates more stable representations and a better understanding of what
00:08:10
is happening in the network. Now, this was
00:08:15
work that was done previously by other researchers, and it was for global rotation invariance;
00:08:21
we will now talk about local rotation invariance, which we
00:08:24
obtain with the 3D steerable filters that I will present now.
00:08:28
What is local rotation invariance? We have an example in 2D, but remember that we work in 3D
00:08:33
medical imaging. We have local structures that appear at various rotations: for
00:08:38
example, in this image we see directions here that
00:08:43
seem to have the same pattern but occur at different orientations. We would like to
00:08:47
detect these as one single pattern and treat
00:08:52
them all together as the same information.
00:08:57
And generally we do not need global rotation invariance, as we would in
00:09:02
ImageNet or normal object recognition tasks, because we usually have a controlled setting in which
00:09:08
we know that we will acquire the body, for example, in
00:09:11
a certain direction, and we do not expect global rotations, or at least they are controlled.
00:09:16
So we have the need for local rotation invariance,
00:09:20
which we can describe better with this image. We have
00:09:25
an input image with three simple patterns, and a response map with the same three
00:09:32
responses here. If we had global rotation invariance and rotated the whole image,
00:09:37
the responses would be unchanged; but if we have
00:09:40
local rotation invariance, we rotate the small patterns inside the image
00:09:45
and we still obtain the same responses. This is local rotation invariance.
00:09:50
So, as a summary of our approach, for 3D images:
00:09:56
we convolve with a set of basis filters instead of learning 3D filters entirely,
00:10:03
and we learn filters in the span of
00:10:06
these basis filters, the spherical harmonics.
00:10:11
We can then steer the responses by combining the responses
00:10:15
to our basis filters; I will go into more detail in a moment.
00:10:20
Once we have steered the responses, that is, responses to filters at different orientations,
00:10:26
we max-pool the responses to obtain local rotation invariance,
00:10:31
and we can train this network end to end with standard optimisers. So,
00:10:37
to recap: first we had data augmentation with a standard CNN, where we
00:10:41
learn these sorts of filters; then we had group equivariance, where
00:10:45
we filter with rotated versions of the filters;
00:10:48
and now what we propose, instead of filtering with rotated versions,
00:10:53
is to convolve with the set of basis filters
00:10:56
and recombine them to obtain responses at any orientation.
00:11:02
So what are these steerable filters? They are based on spherical
00:11:08
harmonics. The spherical harmonics form an orthonormal basis on the sphere;
00:11:13
we can interpret them as the extension of the circular harmonics, which are similar to the
00:11:17
Fourier basis, onto the sphere. So we have a function on the sphere, and
00:11:25
these are the spherical harmonics, organised by degree and
00:11:32
order m: this is
00:11:34
degree zero, degree one, degree two, and so on.
00:11:41
When the maximum degree tends to infinity,
00:11:46
this basis is dense, in the sense that by combining
00:11:50
these basis functions we can represent any function on the sphere.
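The orthonormal-basis property mentioned here can be checked numerically for the lowest degrees. A small sketch of my own, with two real spherical harmonics written out by hand:

```python
import numpy as np

# Quadrature grid on the sphere: polar angle t in [0, pi],
# azimuth p in [0, 2*pi) (periodic, so no endpoint)
t = np.linspace(0, np.pi, 400)
p = np.linspace(0, 2 * np.pi, 400, endpoint=False)
T, P = np.meshgrid(t, p, indexing="ij")
dA = np.sin(T) * (t[1] - t[0]) * (p[1] - p[0])  # surface element

Y00 = np.full_like(T, 1 / np.sqrt(4 * np.pi))  # degree 0, order 0
Y10 = np.sqrt(3 / (4 * np.pi)) * np.cos(T)     # degree 1, order 0

norm_Y10 = (Y10 * Y10 * dA).sum()  # unit norm -> close to 1
cross = (Y00 * Y10 * dA).sum()     # orthogonality -> close to 0
print(abs(norm_Y10 - 1) < 1e-3, abs(cross) < 1e-6)  # True True
```

The same check extends to any pair of degrees and orders; orthonormality is what makes the expansion coefficients well defined.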
00:11:57
Now, how do we form 3D filters
00:12:01
using this basis? We form a polar-separable steerable filter
00:12:07
by using a set of degrees, up to a maximum degree that we need to choose.
00:12:12
So we have this function on
00:12:16
the sphere, the harmonics here, and we combine them with weights,
00:12:21
the coefficients for each harmonic, and we project this
00:12:25
function on the sphere into a 3D volume by using
00:12:29
a radial profile, which is a function of
00:12:34
the distance to the centre instead of a function on the sphere.
00:12:38
So this forms a 3D filter,
00:12:42
and this is what we use in our convolutional network.
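Putting the pieces together, the polar-separable filter described here combines a 1D radial profile with a weighted sum of spherical harmonics (the notation is my own, chosen to match the verbal description):

```latex
% 3D steerable filter: radial profile h times an angular part expanded
% in spherical harmonics Y_n^m with learned coefficients c_{n,m}
\kappa(\mathbf{x}) \;=\; h\big(\lVert \mathbf{x} \rVert\big)
\sum_{n=0}^{N} \sum_{m=-n}^{n} c_{n,m}\,
Y_n^m\!\left(\frac{\mathbf{x}}{\lVert \mathbf{x} \rVert}\right)
```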
00:12:46
So we have this parametric representation instead of a plain 3D filter, and the good thing with steerability is that we can
00:12:53
steer it, in the sense that if we modify these coefficients here, we can obtain the filter at any orientation we want.
00:13:01
So we can convolve the input with the basis filters here, the harmonics,
00:13:10
and we can obtain the response to the filter at any
00:13:13
orientation by simply steering: this is a simple
00:13:18
matrix multiplication, which is very efficient, and we obtain any orientation we want, all in 3D.
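Steering by modifying coefficients is easiest to see in the 2D analogue with circular harmonics (in 3D the same role is played by matrices acting on the spherical-harmonic coefficients). A sketch of my own, not the speaker's code: rotating a filter h(r)·Re(c·e^{imφ}) by an angle θ amounts to multiplying the coefficient c by e^{-imθ}:

```python
import numpy as np

m, theta = 2, 0.7   # harmonic order and rotation angle
c = 1.5 - 0.8j      # learned complex coefficient for this harmonic

# Sampling grid and polar coordinates
xs = np.linspace(-3, 3, 64)
X, Y = np.meshgrid(xs, xs, indexing="ij")
R, PHI = np.hypot(X, Y), np.arctan2(Y, X)
h = np.exp(-R**2)   # radial profile (a 1D function of the radius)

def filt(coeff, phi):
    """Polar-separable filter: radial profile times circular harmonic."""
    return h * np.real(coeff * np.exp(1j * m * phi))

# Rotating the filter (evaluating it on rotated coordinates) ...
rotated = filt(c, PHI - theta)
# ... equals steering: multiplying the coefficient by exp(-1j*m*theta)
steered = filt(c * np.exp(-1j * m * theta), PHI)
print(np.allclose(rotated, steered))  # True
```

Because convolution is linear, the responses steer in exactly the same way: the response to the rotated filter is the same recombination of the responses to the basis filters, with no extra convolution.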
00:13:26
In practice we have multiple filters, as we would
00:13:31
in a standard CNN; each output is indexed by i here,
00:13:37
so different radial profiles h_i and coefficients c_i.
00:13:41
The benefit of this is also that we reduce the number of trainable parameters, because we only learn the radial profile, which is 1D,
00:13:48
and the scalar coefficients, instead of a full 3D filter.
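To make the parameter saving concrete, here is the arithmetic for illustrative sizes (a 7x7x7 filter, a radial profile sampled at 4 radii, harmonics up to degree 3; these particular numbers are mine, not the paper's):

```python
# A dense 3D filter learns every voxel weight:
dense_params = 7 * 7 * 7  # 343

# The steerable parameterisation learns a 1D radial profile plus one
# coefficient per spherical harmonic (2n + 1 of them at degree n):
radial_params = 4
coeff_params = sum(2 * n + 1 for n in range(4))  # 1 + 3 + 5 + 7 = 16
steerable_params = radial_params + coeff_params  # 20

print(dense_params, steerable_params)  # 343 20
```

The gap widens quickly with the filter size, since the dense count grows cubically while the steerable count grows only with the number of radii and degrees.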
00:13:54
We also have a limited number of convolutions, because we do not need to convolve
00:13:57
with rotated versions of the filters; we only convolve with the basis.
00:14:02
We then max-pool over the orientations after the first layer to obtain
00:14:05
this local rotation invariance, so the response of the first layer is locally rotation invariant.
00:14:13
We then follow this first layer by a global
00:14:18
average pooling and standard fully connected layers for the
00:14:24
classification. Now, the experiments. We have experiments on two
00:14:28
datasets. One is a synthetic 3D dataset that we created:
00:14:32
we have different patterns that we manually rotate and assign to different classes,
00:14:38
and we add some variability to
00:14:42
introduce overlapping, interpolation and different densities, which bring some challenge to the dataset.
00:14:48
The second dataset is a 3D CT pulmonary nodule classification dataset:
00:14:53
we want to classify benign versus malignant pulmonary
00:14:57
nodules. These are slices from the 3D volumes.
00:15:03
What we see, mostly against the baseline, a standard
00:15:09
3D CNN, is that when we increase the number of orientations, because we
00:15:12
can steer the responses to any orientation we want,
00:15:18
we increase the accuracy of our network. We also
00:15:20
have a significant, drastic reduction of the number of parameters:
00:15:25
this is a very shallow network, but we manage to obtain very good accuracy with only forty parameters in this case.
00:15:31
This is on the nodule dataset, the pulmonary nodule classification, where we have much the same
00:15:37
results: when we increase the number of orientations, overall the accuracy increases, but it saturates
00:15:43
a bit earlier, so this dataset may not
00:15:47
need a very precise orientation sampling.
00:15:55
But the good thing is that when we increase the number of rotations, we do not increase the number
00:15:59
of parameters, so we can really have a very dense sampling of our orientations.
00:16:08
So, to get back to the transparency idea: we
00:16:12
showed before that a prior on data symmetry can be
00:16:16
included in the networks to
00:16:20
increase the transparency, so we have predictable responses both
00:16:23
for spatial locations of the input images and for
00:16:28
orientations of the patterns that are in the image.
00:16:32
We developed this 3D locally rotation-invariant CNN using steerable filters,
00:16:38
which significantly reduces the number of parameters and limits the number of convolutions.
00:16:44
The decomposability also increases: I mentioned translation and rotation, but we also have
00:16:50
responses for different frequencies, thanks to the
00:16:54
basis of spherical harmonics,
00:16:58
so we can understand which frequencies responded
00:17:02
most, and this can give us information on how the network works.
00:17:09
In future work, we are considering computing
00:17:12
invariants from the harmonics directly, instead of having to steer
00:17:15
to orientations, and using non-polar-separable filters, but there is not much time to discuss that.



Conference Program

Methods for Rule and Knowledge Extraction from Deep Neural Networks
Keynote speech: Prof. Pena Carlos Andrés, HEIG-VD
3 May 2019 · 9:10 a.m.
349 views
Q&A - Keynote speech: Prof. Pena Carlos Andrés
Keynote speech: Prof. Pena Carlos Andrés, HEIG-VD
3 May 2019 · 10:08 a.m.
Visualizing and understanding raw speech modeling with convolutional neural networks
Hannah Muckenhirn, Idiap Research Institute
3 May 2019 · 10:15 a.m.
Q&A - Hannah Muckenhirn
Hannah Muckenhirn, Idiap Research Institute
3 May 2019 · 10:28 a.m.
Concept Measures to Explain Deep Learning Predictions in Medical Imaging
Mara Graziani, HES-SO Valais-Wallis
3 May 2019 · 10:32 a.m.
What do neural network saliency maps encode?
Suraj Srinivas, Idiap Research Institute
3 May 2019 · 10:53 a.m.
Transparency of rotation-equivariant CNNs via local geometric priors
Dr Vincent Andrearczyk, HES-SO Valais-Wallis
3 May 2019 · 11:30 a.m.
Q&A - Dr Vincent Andrearczyk
Dr Vincent Andrearczyk, HES-SO Valais-Wallis
3 May 2019 · 11:48 a.m.
Interpretable models of robot motion learned from few demonstrations
Dr Sylvain Calinon, Idiap Research Institute
3 May 2019 · 11:50 a.m.
Q&A - Sylvain Calinon
Dr Sylvain Calinon, Idiap Research Institute
3 May 2019 · 12:06 p.m.
The HyperBagGraph DataEdron: An Enriched Browsing Experience of Scientific Publication Databa
Xavier Ouvrard, University of Geneva / CERN
3 May 2019 · 12:08 p.m.
Improving robustness to build more interpretable classifiers
Seyed Moosavi, Signal Processing Laboratory 4 (LTS4), EPFL
3 May 2019 · 12:21 p.m.
Q&A - Seyed Moosavi
Seyed Moosavi, Signal Processing Laboratory 4 (LTS4), EPFL
3 May 2019 · 12:34 p.m.