Transcriptions

Note: this content has been automatically generated.
00:00:00
Well, everybody, it is nice to be here today to talk about improving robustness for better interpretability. I will talk about one possible connection between the interpretability and the robustness of classifiers. Specifically, I will present a method that we recently developed to improve the robustness of classifiers and, at the same time, to obtain a more interpretable classifier.
00:00:35
I am sure many of you have already heard about adversarial perturbations, but let me quickly recap what an adversarial perturbation, or adversarial example, is. Essentially, no matter how accurate your image classifier is, we can always find small, and most of the time imperceptible, perturbations that, when added to your image, make your network get things wrong. Here you see an image of a school bus, which a well-trained classifier classifies correctly; yet if you add a noise that is optimised, not random, the image is classified as an ostrich, a concept completely different from the original one.
00:01:28
This is a symptom that these classifiers might rely on something people call surface statistical features: features that are not really important, at least for humans, in deciding whether an image is a school bus or an ostrich. For some reason that we still do not completely understand, classifiers rely on these invisible, non-interpretable, meaningless features.
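The speaker stresses that the noise is optimised, not random. The standard one-step recipe for finding such a perturbation is the fast gradient sign method (FGSM), which the talk does not name explicitly; here is a minimal sketch on a made-up linear softmax classifier (all weights and data are illustrative, not from the talk):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def fgsm_perturbation(W, b, x, label, eps):
    """One-step L-infinity attack: move every input dimension by
    +/- eps in the direction that increases the cross-entropy loss
    of the given label."""
    p = softmax(W @ x + b)
    # gradient of the cross-entropy loss w.r.t. the input x
    grad_x = W.T @ (p - np.eye(len(b))[label])
    return eps * np.sign(grad_x)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))            # toy 3-class model on an 8-pixel "image"
b = np.zeros(3)
x = rng.normal(size=8)
y = int(np.argmax(softmax(W @ x + b)))  # attack whatever the model predicts

delta = fgsm_perturbation(W, b, x, y, eps=0.5)
print("prediction before:", y)
print("prediction after :", int(np.argmax(softmax(W @ (x + delta) + b))))
```

Note that the perturbation is bounded by eps in every coordinate, which is exactly the "small, imperceptible" budget the talk refers to.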
00:01:58
As another example, a recent work presented at AAAI this year showed that you can find two examples that look essentially the same, yet whose saliency maps are very different. Here, if you look at the saliency map explaining why my classifier decided to classify this image as a butterfly, a monarch, the network seems to have relied on features in a region where we humans do not see any butterfly at all. Why should a network rely on such regions to detect a butterfly, or anything else? So the argument underlying this work is that if networks rely too much on such features, we cannot have a very good interpretation, at least in some cases.
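The saliency maps discussed here are, in their vanilla form, just the magnitude of the gradient of the class score with respect to the input pixels. A small sketch, assuming a tiny made-up one-hidden-layer network and numerical differentiation rather than any particular deep-learning framework:

```python
import numpy as np

def saliency_map(score, x, h=1e-5):
    """Central-difference gradient of a scalar score w.r.t. each input
    dimension; its magnitude is the vanilla gradient saliency map."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (score(x + e) - score(x - e)) / (2 * h)
    return np.abs(g)

rng = np.random.default_rng(1)
W1 = rng.normal(size=(5, 8))   # hypothetical hidden layer
w2 = rng.normal(size=5)        # hypothetical output weights
x = rng.normal(size=8)

# score of the tiny network for one class
score = lambda v: w2 @ np.tanh(W1 @ v)
sal = saliency_map(score, x)
print("most salient pixel:", int(np.argmax(sal)))
```

The point of the AAAI example is that on a standard network this map can highlight regions a human would consider irrelevant, and can change drastically between visually identical inputs.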
00:03:13
So the idea now is: maybe if we improve the robustness of networks, somehow preventing them from relying on these surface statistical features, these irrelevant and invisible features, we can also address the interpretability problem. In some sense, as networks become more robust, they would focus on the genuinely relevant features that we need to discriminate between different classes, rather than on what essentially look like random changes.
00:04:00
But the question is how to do that. The best tool we actually have so far to improve the robustness of classifiers is what is called adversarial training. It is maybe the simplest method one can think of: the only difference with normal, standard training is that instead of training on your batches of training data, you train on adversarially perturbed batches. The hope is that, when you repeat this procedure for many iterations, at the end your network is robust to adversarial perturbations. It has been shown to be the most effective method; most of the other methods fail to deliver what they promise in terms of robustness, so this is the best method, by quite a large margin actually.
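The procedure just described, replace each clean batch by an adversarially perturbed one and then take an ordinary gradient step, can be sketched on a toy softmax classifier. A one-step FGSM inner attack is assumed here purely for brevity; practical adversarial training typically uses a stronger multi-step attack, and all data below are synthetic:

```python
import numpy as np

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def adversarial_train(X, y, n_classes, eps=0.1, lr=0.5, epochs=200):
    """Adversarial training on a toy linear softmax classifier:
    each epoch first perturbs the batch with a one-step sign attack,
    then takes a normal gradient step on the perturbed batch."""
    n, d = X.shape
    W = np.zeros((n_classes, d))
    Y = np.eye(n_classes)[y]
    for _ in range(epochs):
        # inner step: craft an adversarial batch against the current W
        P = softmax(X @ W.T)
        grad_X = (P - Y) @ W                  # dCE/dX, row per example
        X_adv = X + eps * np.sign(grad_X)
        # outer step: standard gradient descent on the perturbed batch
        P_adv = softmax(X_adv @ W.T)
        grad_W = (P_adv - Y).T @ X_adv / n
        W -= lr * grad_W
    return W

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-1, 0.3, size=(50, 2)),
               rng.normal(+1, 0.3, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)
W = adversarial_train(X, y, n_classes=2, eps=0.2)
acc = np.mean(np.argmax(softmax(X @ W.T), axis=1) == y)
print("clean accuracy:", acc)
```

The inefficiency the speaker complains about later is visible even here: every training step contains an extra attack computation, and with a multi-step attack that inner loop dominates the cost.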
00:05:03
So let us now assume that we have a network that is adversarially trained and a network that is normally trained, using just the standard training we do every day. If you look at the adversarial examples corresponding to the normally trained network and to the adversarially trained network, what is interesting is that the normally trained network seems, as I said, to rely on unimportant or meaningless features: this image is actually a four, but both images are classified as an eight by the classifier. If you now look at the adversarially trained model, you see that its perturbation is really meaningful: it makes the four genuinely look like an eight. Or here, for the seven and the nine, you can see that for the normal network we have perturbations in the black areas, which should not matter for the classification since there is nothing there, yet the network has become sensitive to those features. The adversarially trained one instead tries to remove this straight line and really make the nine look like a seven.
00:06:22
So, although we have not defined precisely what we mean by being more interpretable, it seems that improving robustness gives better interpretability, at least in terms of the features the network looks at in order to discriminate between different classes. In another example, on the CIFAR-10 dataset, for an image of an aeroplane, normal training leads to a perturbation that just looks like random noise; but when you do adversarial training, the perturbation really tries to change the object, so that it looks like a bird, or, for a dog, you can see some sort of antlers appear and it becomes a deer.
00:07:10
So, to take stock: improving robustness seems to improve interpretability, in the sense we have just discussed. But the question is how to improve robustness efficiently. I talked about adversarial training, which was the first thing people proposed, but it is very inefficient: you have one optimisation for training, which we know takes a lot of time on large datasets, and another optimisation to find the perturbations. So it is not really an efficient method to improve robustness.
00:07:49
It would therefore be nice to have a sort of regulariser: we have the cross-entropy term to train the network, but maybe we can find an additional term, a regulariser, such that we can do normal training with this additional term in our loss function and obtain a more robust network. One thing we should look at, then, is taking an adversarially trained model and a normally trained model and trying to see what the difference between the two is, what adversarial training has done to the model.
00:08:32
Our approach is a geometric one. In this plot you see, for two networks trained on the CIFAR-10 dataset, one adversarially trained and one normally trained, the eigenvalue profile of the Hessian of the loss with respect to the input. You can see that the adversarially trained model has smaller eigenvalues, that is, smaller curvature. Informally speaking, one can say that adversarial training tries to flatten the decision boundaries of the network.
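One way to probe this input curvature without ever forming the full Hessian, which would be far too large for image inputs, is to combine finite-difference Hessian-vector products with power iteration. A toy sketch on a quadratic loss whose curvature is known exactly (the quadratic and its gradient are made up for illustration; they are not the speaker's loss):

```python
import numpy as np

def input_hessian_top_eig(loss_grad, x, iters=50, h=1e-4):
    """Estimate the dominant eigenvalue of the Hessian of the loss
    w.r.t. the input x, using finite-difference Hessian-vector
    products inside a power iteration."""
    rng = np.random.default_rng(0)
    v = rng.normal(size=x.shape)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        # Hv ~= (grad(x + h v) - grad(x - h v)) / (2 h)
        Hv = (loss_grad(x + h * v) - loss_grad(x - h * v)) / (2 * h)
        lam = v @ Hv
        v = Hv / (np.linalg.norm(Hv) + 1e-12)
    return lam

# toy loss with a known Hessian: loss(x) = 0.5 * x^T A x
A = np.diag([4.0, 1.0, 0.25])
loss_grad = lambda x: A @ x      # analytic input gradient of the toy loss
x0 = np.ones(3)
lam_est = input_hessian_top_eig(loss_grad, x0)
print("top curvature ~", lam_est)
```

Plotting such eigenvalue estimates for a normally trained versus an adversarially trained network is the kind of comparison the plot in the talk shows: the robust model exhibits markedly smaller values.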
00:09:11
So the idea is: let us design a regulariser that encourages the decision boundaries to be flat; maybe using that we can achieve robustness. The benefit of doing so is that adversarial training was, in a sense, regularising only implicitly, and it might be better to regularise explicitly: when we regularise we have a tuning parameter, so we have more control over the amount of regularisation. It is also computationally faster, depending on the method used for adversarial training. And it gives us close to state-of-the-art results: it cannot beat adversarial training, but it at least beats the other methods and gives results very similar to adversarial training.
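The talk does not spell out the exact form of the regulariser, so the following only sketches the generic shape of such a penalised objective: cross-entropy plus a hypothetical `curvature_penalty` that measures how quickly the input gradient changes along its own direction, a finite-difference proxy for curvature. All names, weights, and the tuning parameter `lam` below are illustrative:

```python
import numpy as np

def grad_ce(W, x, y_onehot):
    """Gradient of softmax cross-entropy w.r.t. the input x."""
    z = W @ x
    p = np.exp(z - z.max())
    p /= p.sum()
    return W.T @ (p - y_onehot)

def curvature_penalty(W, x, y_onehot, h=1e-2):
    """Hypothetical penalty: how much the input gradient changes a
    short step h along its own direction (flat boundary -> small)."""
    g = grad_ce(W, x, y_onehot)
    d = g / (np.linalg.norm(g) + 1e-12)
    return np.linalg.norm(grad_ce(W, x + h * d, y_onehot) - g) ** 2

def regularized_loss(W, x, y_onehot, lam=1.0):
    """Cross-entropy plus lam times the curvature proxy; lam is the
    knob that controls the amount of regularisation."""
    z = W @ x
    p = np.exp(z - z.max())
    p /= p.sum()
    ce = -np.log(p @ y_onehot + 1e-12)
    return ce + lam * curvature_penalty(W, x, y_onehot)

rng = np.random.default_rng(3)
W = rng.normal(size=(3, 8))
x = rng.normal(size=8)
y = np.eye(3)[0]
print("penalised loss:", regularized_loss(W, x, y))
```

This is exactly the "extra term in the loss" picture from the talk: standard training machinery is untouched, and setting `lam` to zero recovers plain cross-entropy.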
00:10:09
Let me show you an example. Take, say, a ResNet-18 and train it on CIFAR-10. With normal training you get around ninety-five percent accuracy; the state of the art is a bit higher, but ninety-five percent is pretty good. If you then attack that model with bounded L-infinity perturbations, you get almost zero percent accuracy, so you can fool the network a hundred percent of the time. However, if you explicitly regularise the model, saying that you want it to have flat decision boundaries, your clean accuracy drops a bit, but you get a significant boost in robustness to L-infinity adversarial perturbations. If you look at the same numbers for adversarial training, we get more or less similar figures; adversarial training gives slightly higher robustness.
00:11:11
So what I wanted to tell you today is that the models we normally train seem to look at surface statistical features, meaningless and most of the time invisible features in the images, to discriminate between different classes. We can mitigate this problem by improving robustness, so the ball is now in the court of finding efficient methods to improve robustness, and here I briefly introduced a sort of geometric regularisation technique to improve the robustness of deep networks.
00:11:54
The take-home message, in case you did not follow anything else of my talk: the most important thing I want to tell you today is that adversarial examples are not just about security, safety-critical applications, or some fancy problems; they have real implications for the everyday usage of deep networks. I gave you one example today of how adversarial robustness and interpretability can be linked, but there are also other connections, which we can discuss offline if you are interested. That is what I wanted to deliver today.

Improving robustness to build more interpretable classifiers
Seyed Moosavi, Signal Processing Laboratory 4 (LTS4), EPFL
3 May 2019 · 12:21 p.m.