Airworthy AI; challenges of certification, part two

Transcriptions

Note: this content has been automatically generated.

00:00:01

prove that the neural network in a certain application is safe

00:00:07

so before i spoil the creativity with my hunches

00:00:12

maybe somebody has worked on this or plans to work on this

00:00:19

simulated doable stuff

00:00:31

okay so you take a toy version of the problem you have the neural networks

00:00:35

solve that and show that it solves that as well as the artificial handcrafted one

00:00:44

right um so we do have this simulated input as ground truth as it

00:00:50

is a major ingredient in any approach we take or any other

00:00:56

well well discussed this at length does the rest of my talk them but i didn't mean to talk happen are

00:01:03

about this but so far um so let's first narrow ourselves

00:01:06

down to this reference system um it's it's uh

00:01:11

uh actually something we are building um so we have cameras

00:01:14

coming in we have a pretty high resolution picture

00:01:19

coming in we preprocess it so that it goes to lower resolution we deal with the artifacts in the image

00:01:24

like bad pixels and stuff also we posterize it a bit so that the neural network already sees

00:01:30

something that looks as if it was in a game computer rather than the full spectrum of reality

00:01:35

uh which we hope will help later then it goes through the uh resnet thirty

00:01:39

four or whatever we want and it outputs the segmentation of the image
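
The preprocessing chain described here (repair bad pixels, reduce resolution, posterize) can be sketched as follows. This is a minimal illustration in plain Python on a grayscale image stored as nested lists; the function names, the 4-neighbour repair rule, the block-average downscale and the number of intensity levels are all assumptions made for the example, not details given in the talk.

```python
# Hypothetical preprocessing sketch; names and parameters are illustrative only.

def repair_bad_pixels(img, bad):
    """Replace each flagged pixel with the integer mean of its valid 4-neighbours."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for (r, c) in bad:
        neigh = [img[rr][cc]
                 for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                 if 0 <= rr < h and 0 <= cc < w and (rr, cc) not in bad]
        if neigh:
            out[r][c] = sum(neigh) // len(neigh)
    return out

def downscale(img, factor):
    """Average non-overlapping factor x factor blocks (sizes must divide evenly)."""
    h, w = len(img), len(img[0])
    return [[sum(img[r + dr][c + dc]
                 for dr in range(factor) for dc in range(factor)) // (factor * factor)
             for c in range(0, w, factor)]
            for r in range(0, h, factor)]

def posterize(img, levels):
    """Quantise 8-bit intensities down to `levels` evenly spaced values."""
    step = 256 // levels
    return [[(p // step) * step for p in row] for row in img]

def preprocess(img, bad, factor=2, levels=4):
    """Repair, downscale, then posterize, in that order."""
    return posterize(downscale(repair_bad_pixels(img, bad), factor), levels)

image = [[10, 10, 200, 200],
         [10, 255, 200, 200],   # (1, 1) is a stuck-bright pixel
         [90, 90, 130, 130],
         [90, 90, 130, 130]]
print(preprocess(image, bad={(1, 1)}))  # [[0, 192], [64, 128]]
```

The point of the last step is that the network only ever sees a handful of intensity levels, which is one way to make camera input resemble rendered input.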

00:01:42

um we have two major applications one is recognising landing strips

00:01:47

um so there's the the black things with white lines in the middle or the circles with an h they're relatively easy uh an

00:01:53

extension of that is look anywhere and see the difference

00:01:57

between safe and unsafe but this is the more

00:02:02

tangible it's more an edge case 'cause we can actually see if you can see a landing strip or not um

00:02:08

uh and the other one is uh recognising things in the sky like drones and birds

00:02:12

non cooperative traffic we call them um it's important because uh humans actually

00:02:18

do not have good enough eyesight to spot a small drone at six hundred

00:02:22

meters or six seconds away if you're travelling a hundred meters per second um so the same

00:02:28

the same infrastructure uh same hardware very similar

00:02:33

network topology but in one case the false

00:02:35

positives are obviously far worse than the false negatives and in the other one the other way around yes

00:02:44

ah very good question that's why we have um well

00:02:49

first of all good they don't have to they just have to pass the exam

00:02:51

fume pilots we have um a corporation with the that and it's really

00:02:57

uh a whole chiffon competition shove them into two or they

00:03:00

have a department dedicated to avionics and they're exactly

00:03:04

um uh doing a project like this to figure out how good humans are at this at all

00:03:11

um so you could say well as soon as we prove that we're better than this set of humans over there in the lab

00:03:17

we we are done but we'd have to go to the authorities and they say that that's very nice

00:03:22

lots large story and still you have to um be better that you would have to

00:03:27

be as good as we expect from aviation systems and not as humans um

00:03:48

yes so these companies that have been done so humans are actually dollar beam and best

00:03:55

uh and so a a a human uh messes up once per ten to the seven hours

00:04:01

ah uh if you have two humans it is not the square because they can talk

00:04:06

to each other so and all these design flaw and uh and the cockpit

00:04:11

uh is that a pilot can talk is are actually it airbus internal joke is to replace the

00:04:15

second pilot with a dog will bark at the first part of the price thirty but

00:04:21

ah i'm there could be a credible step on the way to full autonomy um

00:04:26

so um so if you just look at the isn't database

00:04:32

you see uh uh the probability of a pilot dying

00:04:35

uh wise on hand is this number so we know and then afterwards there's the analysis of what went wrong

00:04:42

i think the fundamental problem there is that it's not a very huge database because there's not that many planes flying there's not many

00:04:47

people flying not many hours of flight have been flown not ten to

00:04:51

the nine hours that's a lot of hours um it's clearly a

00:04:57

yeah um if you have a very very experienced pilot with twenty thousand

00:05:02

hours of experience twenty thousand flight hours that's a very experienced pilot

00:05:07

um there's a couple of things we can do better though uh we can share our knowledge uh which

00:05:12

humans can only do you know over beer or in an informal setting like this

00:05:17

um we can pool the learning um and we can do offline learning that is also something we

00:05:23

can clearly do better than humans we can record everything for later do learning online and um

00:05:30

uh well i'm gonna have myself so this is our references and i won't

00:05:33

talk about this uh as of now is the system for which we

00:05:38

like to go to us and say look we improve the thing that was good enough therefore the system as a whole is good enough

00:05:43

so um preprocessing so it's it's sort of a game world in

00:05:48

reality remembered segmentation then this is the single shot network

00:05:52

if you have any recurrence or any buildup of state in there

00:05:56

it becomes dependent on the history of images you show it

00:06:00

i think i have no idea how to track any how to make any

00:06:04

meaningful statement about that anymore so let's focus on the easy one one picture

00:06:08

uh one segmentation comes out but we can do post processing so with every

00:06:12

new frame of our fifty frames per second it is a complete

00:06:16

surprise to the neural network and it comes up with an independent estimate

00:06:21

after you moved for a couple of seconds you have a completely independent image anyway

00:06:26

and we can do classical radar tracking algorithms you know kalman filters and the rest

00:06:30

or something that expects some consistency of uh of the object or

00:06:36

if you apply this to wire detection wires are a major cause of accidents in uh in flying

00:06:42

um if we have just seen the wire once right we can have a post processing that remembers that

00:06:49

it saw a wire there and until we are absolutely sure it was a mistake we will not go there

00:06:55

that's the way that you can increase the uh the the the precision and recall of the whole

00:07:01

system um even if your network in the middle has a finite chance of getting it wrong
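
The "remember the wire until we are sure it was a mistake" idea can be sketched as a small stateful filter that sits outside the stateless network. The class name, the `clear_after` parameter and the independence assumption in `miss_probability` are illustrative choices for this sketch, not something specified in the talk.

```python
class HazardLatch:
    """Keep a hazard flagged until `clear_after` consecutive frames report it absent."""

    def __init__(self, clear_after):
        self.clear_after = clear_after
        self.active = False
        self.clear_streak = 0

    def update(self, detected):
        """Feed one per-frame detection; return whether the hazard is still flagged."""
        if detected:
            self.active = True
            self.clear_streak = 0
        elif self.active:
            self.clear_streak += 1
            if self.clear_streak >= self.clear_after:
                self.active = False
                self.clear_streak = 0
        return self.active


def miss_probability(per_frame_miss, frames):
    """Chance that all `frames` looks miss the hazard, ASSUMING independent frames."""
    return per_frame_miss ** frames


latch = HazardLatch(clear_after=3)
per_frame = [False, True, False, False, True, False, False, False, False]
print([latch.update(d) for d in per_frame])
# [False, True, True, True, True, True, True, False, False]
```

A single detection is enough to set the flag, while clearing it takes a run of clean frames, so the asymmetry between the two error types is set by the post-processing rather than by the network itself.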

00:07:07

um so the important constraints that we set ourselves now we we restrict ourselves to the corner

00:07:13

of the design space where we have no adaptive online learning so we will not have

00:07:17

uh we we're not trying to make an actual biological entity that

00:07:21

wants to fly 'cause turns out they make terrible pilots

00:07:24

um uh we are not in the control loop which of course simplifies the whole thing and

00:07:30

we have no buildup of states we have no recurrence um in the network that makes it remember anything

00:07:37

yeah if we can do this we're already we can do pretty cool

00:07:41

things um okay so if you go over the way that a

00:07:46

f. a. a. and the others reason about this it's all about uncertainty and

00:07:50

uncertainty is caused by randomness um i mean this is not a

00:07:55

deep statement so what are the sources of randomness and uncertainty in our neural net and the application thereof

00:08:02

so the most obvious one is that there can be noisy input and we've all heard these fantastic

00:08:07

examples of um adversarial attacks where you try to find a little noise pattern that

00:08:11

makes the outcome completely different um so we can't have any of that

00:08:17

um that's the only thing that really happens at run time for the system uh at inference time

00:08:22

um the rest are all things related to uh the training

00:08:27

or the design phase as aerospace would call it

00:08:31

so far as we see it the first problem is that your test set

00:08:36

is a sample drawn from reality and you have the sampling problem that it has to be representative

00:08:42

a fundamental problem for anything in statistics but for us it's

00:08:46

actually worse because there isn't that much

00:08:50

footage of planes flying in extremely dangerous conditions not only did you not tend to carry recordings the

00:08:58

recording the there's no video and see no it doesn't happen that often it's very rare events

00:09:03

statistics um these accidents um then when we get to the second step so this this

00:09:10

real data is far too valuable to waste on training the network you need

00:09:14

orders of magnitude more than we have available for testing at all so you wanna

00:09:18

train exclusively in the simulator we think that's the only feasible way so these

00:09:25

training sets are synthesised from randomness actually the better the randomness

00:09:29

the better they are but how do you so there

00:09:33

you would have to argue that the randomness leads to more certainty um which is something we have to get a grip on

00:09:39

and then you randomise the training process in different runs

00:09:44

you would end up with different neural networks

00:09:47

um so are they meaningfully different uh in which case that could be totally

00:09:51

valid and there could be multiple good solutions to the same problem

00:09:55

or maybe they're essentially the same because there's a high degree of symmetry

00:09:59

representational network usually graph um user what's it um um so

00:10:08

if you

00:10:10

even run this process a couple of times even if the end result is a good system to argue that it is good enough you have

00:10:16

to open up the box and show you understand what's in there so that's the challenge we have set ourselves

00:10:22

known problems for the design we're talking about his determinism sir

00:10:26

showed the same image i get same output will probable

00:10:29

um we excluded ourselves from modifying the weights in flight because

00:10:35

no that would be a very bad idea for all kinds of reasons not only would it be hard to prove it safe enough i also think it would not be safe enough

00:10:42

so it's good that we can't prove it safe enough because we'd be lying um

00:10:46

buildup of state is also something we avoid and the code to evaluate the network

00:10:53

that's just you know well written c. plus plus we know how to do that actually we

00:10:57

did it because the the first computer i showed you comes with a very ancient

00:11:01

um g. p. u. on which tensorflow doesn't work so we rolled our own implementation of that there's a

00:11:07

tensorflow uh evaluation library which is you know classical solid hardcore

00:11:14

writing of software that we know how to do so

00:11:19

um the run time determinism that was the first court problem actually i've

00:11:24

oh i'm gonna go over these four points i mentioned earlier briefly um so the determinism is nice but we need

00:11:31

stability really if we change the input a little bit we should not get wildly different output

00:11:36

now my hunch is you know i'm just an old theoretical physicist that

00:11:42

stumbled into this field but if the whole thing is really differentiable it seems to me you have everything you need

00:11:48

mathematically to come up with the concept of d. output divided

00:11:52

by d. input and put some bounds on this
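
One standard way to put a number on "d output divided by d input" is a Lipschitz upper bound. For a feed-forward network with ReLU activations, the change in output (measured in the max norm) is at most the product of the layers' induced infinity norms times the change in input. The sketch below uses a tiny hand-written network in plain Python; the weights are made up, and the bound shown is a generic textbook bound that is usually very loose, not a method attributed to the speaker.

```python
def matvec(W, x):
    """Matrix-vector product with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(layers, x):
    """Tiny ReLU network: `layers` is a list of (W, b); no activation after the last layer."""
    for i, (W, b) in enumerate(layers):
        x = [a + bi for a, bi in zip(matvec(W, x), b)]
        if i < len(layers) - 1:
            x = [max(0.0, a) for a in x]
    return x

def inf_norm(W):
    """Induced infinity norm of a matrix: the largest absolute row sum."""
    return max(sum(abs(w) for w in row) for row in W)

def lipschitz_upper_bound(layers):
    """Product of per-layer norms; ReLU is 1-Lipschitz and biases do not matter."""
    bound = 1.0
    for W, _ in layers:
        bound *= inf_norm(W)
    return bound

def max_abs_diff(u, v):
    """Max-norm distance between two vectors of equal length."""
    return max(abs(a - b) for a, b in zip(u, v))

# Made-up weights, for illustration only.
layers = [([[1.0, -2.0], [0.5, 0.5]], [0.0, 0.1]),
          ([[1.0, 1.0]], [0.0])]
x, x_perturbed = [0.3, -0.2], [0.31, -0.21]

bound = lipschitz_upper_bound(layers)
observed = max_abs_diff(forward(layers, x), forward(layers, x_perturbed))
allowed = bound * max_abs_diff(x, x_perturbed)
print(bound, observed <= allowed)  # 6.0 True
```

For a deep network this product grows quickly, which is why tighter, input-region-specific bounds are an open research question rather than a solved engineering step.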

00:11:56

anybody has a theoretical framework to do work on this does not um you read a lot about

00:12:04

uh adversarial methods that try to um fool a network can we use these to exclude

00:12:11

such things uh is there a way to uh learn in the

00:12:14

sense that you desensitise um um yourself to noise um uh or

00:12:21

could ensemble methods be used here to get greater stability and also

00:12:26

to get a grip on what stability means those are the questions

00:12:31

and somebody ah

00:12:35

it's

00:12:39

00:12:44

00:12:55

no so my question one what are the mathematical right

00:13:02

and so the old really you see get this training set you have this

00:13:07

test set and you measure the precision and recall and they are this number

00:13:13

right there this number on the test set is this test set a sufficient representation of reality um

00:13:22

oh well maybe i know how would we go about proving that

00:13:37

i i i

00:13:47

yeah

00:13:58

00:14:01

yeah

00:14:17

exactly that is a pro as i said i've come here to ask questions not to get

00:14:23

yeah um

00:14:36

00:15:02

uh_huh

00:15:21

00:15:41

00:15:49

ooh

00:15:54

right

00:16:02

i think this is a call to arms for for research groups that are looking for the next cool thing

00:16:09

um basically so basically what we want to avoid is that somebody can

00:16:14

put an orange or pink ball in the corner of the

00:16:17

field of view of the aircraft to make you think that there's a landing site over there that would be bad

00:16:26

okay well uh that would be nice

00:16:39

so we could start and also so if we have a bunch of these uh uh networks trained separately when they are separate

00:16:46

uh we can maybe use that i'll get to that later so this is the only one where it's a property of the network

00:16:52

that comes out that we apply in there that we wanna have properties of that applies at run time the rest is really about

00:16:58

opening the uh the box of the the design phase and

00:17:04

and uh uh that those were the the the sampling error in the test set i think there

00:17:12

uh we might actually have a chance so we're not gonna get ten to the nine hours of flight

00:17:18

uh but for the case of the landing strips there's probably fifty thousand landing strips in the world

00:17:25

so we could probably get photos of forty thousand of them and then we can

00:17:30

parametrise what they look like by you know how far away which angle

00:17:34

uh we're looking uh and uh and then we can take a very large sample those or um

00:17:41

uh and then we have to multiply this by what they look like in the weather so

00:17:47

there's five relevant parameters of the weather that is the

00:17:50

uh um really relevant uh aspects of the

00:17:55

lighting conditions it's the translucency of the air and how much light there is to begin with

00:18:00

um so that's a seven dimensional space that you can quantify and

00:18:04

we can take samples and and then we can see

00:18:07

if we have over or under sampled hard parts so
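
Checking for over- or under-sampled regions of a parameter space can be sketched by cutting each dimension into bins and counting test samples per grid cell. The example below uses two made-up dimensions (distance and air translucency) instead of seven to stay readable; the bin edges, the `min_count` threshold and the function names are assumptions made for this illustration.

```python
from collections import Counter
from itertools import product

def cell_of(sample, edges):
    """Map a sample (tuple of floats) to a grid cell: one bin index per dimension."""
    return tuple(sum(1 for e in dim_edges if value >= e)
                 for value, dim_edges in zip(sample, edges))

def undersampled_cells(samples, edges, min_count):
    """Return {cell: count} for every grid cell holding fewer than `min_count` samples."""
    counts = Counter(cell_of(s, edges) for s in samples)
    all_cells = product(*(range(len(e) + 1) for e in edges))
    return {cell: counts.get(cell, 0) for cell in all_cells
            if counts.get(cell, 0) < min_count}

# Hypothetical dimensions: distance to the strip in metres, translucency in [0, 1].
edges = [[1000.0], [0.5]]
samples = [(200.0, 0.9), (300.0, 0.8), (1500.0, 0.9), (250.0, 0.2)]
print(undersampled_cells(samples, edges, min_count=2))
# {(0, 0): 1, (1, 0): 0, (1, 1): 1}
```

With seven dimensions the number of cells grows multiplicatively, so in practice the bin resolution has to be chosen with the available sample count in mind.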

00:18:12

i have good hope there and so it's not even

00:18:14

that big data um fifty thousand times say um

00:18:20

thousands of weather conditions that is uh sounds like a big number but currently there's

00:18:26

a five hundred hours of video getting uploaded to youtube per minute

00:18:31

uh they're not all about flying but um that means that ten to the seven hours is actually

00:18:39

about forty days of youtube uploads um is that thirty the the factor

00:18:44

of real time to upload is thirty three thousand or something so
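
The back-of-envelope numbers in this passage can be checked in a few lines. The upload rate is the figure quoted in the talk; the value of one thousand weather conditions per landing strip is an assumption standing in for "thousands". Taken at face value, the quoted rate gives roughly two weeks of uploads for ten to the seven hours, so the "forty days" in the automatic transcript may be a mis-hearing of "fourteen".

```python
UPLOAD_HOURS_PER_MINUTE = 500                 # figure quoted in the talk
speedup = UPLOAD_HOURS_PER_MINUTE * 60        # hours of video per wall-clock hour

def days_of_uploads(video_hours):
    """Wall-clock days of uploads that add up to `video_hours` of video."""
    return video_hours / speedup / 24

LANDING_STRIPS = 50_000                       # figure quoted in the talk
WEATHER_CONDITIONS = 1_000                    # assumed stand-in for "thousands"

print(speedup)                                # 30000
print(round(days_of_uploads(1e7), 1))         # 13.9
print(LANDING_STRIPS * WEATHER_CONDITIONS)    # 50000000
```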

00:18:51

um so we don't even call a bigger setting or a means that to go get it but um and then maybe we

00:18:58

can make some statements about the quality and the the variance in

00:19:03

the um sample set based on bootstrapping and i know

00:19:09

there are statisticians in the audience that say oh we always do it like that
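
The bootstrapping mentioned here can be sketched as a percentile bootstrap on the recall measured over a labelled test set. The data below is synthetic and the function names are made up for the example. Note that the interval only captures sampling variability within the test set; it says nothing about whether the test set is representative of reality.

```python
import random

def recall(pairs):
    """Recall over (truth, prediction) boolean pairs; NaN if there are no positives."""
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    return tp / (tp + fn) if (tp + fn) else float("nan")

def bootstrap_interval(pairs, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for recall, resampling pairs with replacement."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_resamples):
        resample = [rng.choice(pairs) for _ in pairs]
        r = recall(resample)
        if r == r:                      # drop NaN from resamples without positives
            stats.append(r)
    stats.sort()
    low = stats[int((alpha / 2) * len(stats))]
    high = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return low, high

# Synthetic test set: 100 true landing strips (90 found), 100 negatives.
pairs = [(True, True)] * 90 + [(True, False)] * 10 + [(False, False)] * 100
print(recall(pairs))                    # 0.9
print(bootstrap_interval(pairs))
```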

00:19:15

okay actually uh um the more interesting one is that uh as i said in fact there is no real data to train on

00:19:23

and training uh to put it differently it would be a giant waste to uh use real data

00:19:30

for training um we only want it for testing so we wanna set aside all the training

00:19:35

and then it's nice if in the end it works on the testing set but easa is gonna ask so if

00:19:41

you trained on that you know how does that if you synthesised ten to the nine hours there

00:19:48

how have you not trained on something that could be irrelevant and

00:19:51

um why uh should we trust that this means anything

00:19:58

so what have you taught this network um and specifically how have we

00:20:04

made sure we don't overfit on synthesised aspects of the

00:20:09

uh of the of the training set that we happen to not pick up on

00:20:14

in a um in the real data uh does that uh um again

00:20:21

so these uh adversarial things seem to be all the rage um can we uh the

00:20:26

somebody suggested uh an approach where you learn a on a

00:20:31

i think you told me this uh you have this uh adversarial

00:20:34

head on uh on your network which can tell the difference between

00:20:39

uh um real and fake and you only learn by reverse propagating

00:20:44

um through the network the features uh that we can use to distinguish those

00:20:50

uh we already have a very obvious one a preprocessor that makes

00:20:55

the world look a bit more like a game engine

00:20:58

uh in real time so in the real time application the network sees something that looks very much like a game engine

00:21:05

um more fundamentally we have to define some form of similarity of the training and

00:21:11

the test set if we wanna say you know it makes sense that this test set

00:21:15

objectively tests what we trained uh because they are the same in this way so what

00:21:20

does it mean for the sets to be different or the same that is um

00:21:25

an interesting concept to have a number on um then
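
One simple candidate for "a number" on how similar two data sets are is the total variation distance between histograms of some scalar feature: 0 means the binned distributions are identical, 1 means they do not overlap at all. The talk does not name a metric, so this is only an illustration; the feature (for instance mean image brightness), the bin width and the function names are assumptions.

```python
from collections import Counter

def histogram(values, bin_width):
    """Normalised histogram as {bin index: fraction of values}."""
    counts = Counter(int(v // bin_width) for v in values)
    return {b: c / len(values) for b, c in counts.items()}

def total_variation(a, b, bin_width):
    """Total variation distance between the binned distributions of a and b."""
    ha, hb = histogram(a, bin_width), histogram(b, bin_width)
    return 0.5 * sum(abs(ha.get(k, 0.0) - hb.get(k, 0.0)) for k in set(ha) | set(hb))

# Made-up scalar feature values for a synthetic and a real data set.
synthetic = [0.1, 0.2, 1.1, 1.3]
real = [0.1, 1.2, 1.4, 1.6]
print(total_variation(synthetic, real, bin_width=1.0))       # 0.25
print(total_variation(synthetic, synthetic, bin_width=1.0))  # 0.0
```

A single scalar feature is far too weak for images; the same idea would need to be applied to richer features, which is part of the open question raised here.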

00:21:32

so if we if you have a p. i. d. controller and you have a process that sets the uh the three

00:21:37

gains and every time you run it it comes up with three different numbers that all sort of work the same

00:21:43

yeah it's not gonna be enough to convince someone to trust their jet engine control to it so you

00:21:49

wanna show that if you train the network several

00:21:52

times and you do get different results they are

00:21:56

different for good reason and that they're objectively still good solutions

00:22:00

uh or that they are essentially the same modulo um permutation of all the weights and

00:22:06

all the uh um uh activations all the connections so um uh um
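
The permutation symmetry mentioned here is easy to demonstrate: reordering the hidden units of a layer, together with the matching columns of the next layer, changes every weight matrix but not the function the network computes. The tiny network and its weights below are made up for the illustration.

```python
def matvec(W, x):
    """Matrix-vector product with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def mlp(W1, b1, W2, b2, x):
    """One hidden ReLU layer followed by a linear output layer."""
    hidden = [max(0.0, h + b) for h, b in zip(matvec(W1, x), b1)]
    return [o + b for o, b in zip(matvec(W2, hidden), b2)]

def permute_hidden(W1, b1, W2, perm):
    """Reorder hidden units: rows of W1 and entries of b1, and the matching columns of W2."""
    W1p = [W1[i] for i in perm]
    b1p = [b1[i] for i in perm]
    W2p = [[row[i] for i in perm] for row in W2]
    return W1p, b1p, W2p

# Made-up weights: 2 inputs, 3 hidden units, 1 output.
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.0]]
b1 = [0.0, 0.1, 0.2]
W2 = [[1.0, 2.0, 3.0]]
b2 = [0.5]
x = [0.4, -0.3]

W1p, b1p, W2p = permute_hidden(W1, b1, W2, perm=[2, 0, 1])
print(W1p != W1)                                       # True: the weights differ
print(abs(mlp(W1, b1, W2, b2, x)[0] - mlp(W1p, b1p, W2p, b2, x)[0]) < 1e-12)  # True
```

So two training runs can produce weight files that look unrelated and still encode the same function; comparing networks has to factor out this symmetry first.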

00:22:16

if we do have

00:22:18

objectively good but dissimilar networks uh which you could maybe force by

00:22:24

making different topologies uh if we can somehow prove they

00:22:28

are independent that's actually a strong safety tool 'cause we could

00:22:32

put seventeen of them in a box in a committee an ensemble

00:22:35

and we have the expected amount of disagreement and if they suddenly all agree

00:22:40

uh then there's probably a common mode error that's the typical fault in these aviation systems is you have these

00:22:45

redundancies with this redundancy built in but uh there's still a way that they can all

00:22:51

fail at the same time because they use one thing that's the same for all that's the common mode error

00:22:57

and people are quite allergic to it so if you have a fault in a in aviation

00:23:02

you know the rude call you in everything is brown and so you have the

00:23:05

root cause analysis isn't root cause analysis that some common common mode failure

00:23:09

people are gonna be upset um if we have an ensemble and all this and they all agree and of sense

00:23:15

it was probably something fishy going on at the same time if they all disagree so all of the seventeen

00:23:21

you now have the predictions all over the place that

00:23:24

that's also a good intrinsic a measure of um

00:23:29

uh how much we should trust what it is doing at run time so we can use this to monitor during

00:23:33

run time how much we trust the system um then um so to summarise
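
The committee idea can be sketched as a run-time monitor that compares the observed disagreement among ensemble members with an expected band: too little disagreement hints at a common-mode error, too much means the predictions are all over the place. The thresholds, labels and function names below are hypothetical, and the sketch assumes each member outputs a single class label per frame.

```python
from collections import Counter

def disagreement(predictions):
    """Fraction of committee members that differ from the majority label."""
    votes = Counter(predictions).most_common(1)[0][1]
    return 1.0 - votes / len(predictions)

def monitor(frames, expected_low, expected_high):
    """Compare mean disagreement over a window of frames with an expected band."""
    mean = sum(disagreement(f) for f in frames) / len(frames)
    if mean < expected_low:
        return "suspicious_agreement"   # possible common-mode error
    if mean > expected_high:
        return "low_confidence"         # members scatter across labels
    return "nominal"

# Seventeen members, five frames per window; labels are made up.
unanimous = [["strip"] * 17 for _ in range(5)]
typical = [["strip"] * 15 + ["none"] * 2 for _ in range(5)]
scattered = [["strip"] * 6 + ["none"] * 6 + ["bird"] * 5 for _ in range(5)]

for window in (unanimous, typical, scattered):
    print(monitor(window, expected_low=0.02, expected_high=0.3))
# suspicious_agreement
# nominal
# low_confidence
```

What the expected band should be depends on how independent the members really are, which is exactly the property the talk says would have to be proven.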

00:23:41

if we wanna convince the f. a. a. that it's good enough it's not a

00:23:46

good enough to show that it works on this test set because the test

00:23:49

set is small it's not good enough to show that we can do better than humans humans aren't that good and there aren't so many of them and it's not

00:23:55

that many hours of flight we really have to open this box and see what goes on the inside so that we

00:24:01

can compare networks layer by layer and say this means the same as this and this is what it means

00:24:07

uh and so that we can compare data sets um given the same network perhaps uh to say

00:24:13

what we've learned here is a generalisation of uh what we measure there

00:24:20

um then a random thing that i wanted to put on some slides if you start with a pretrained network

00:24:25

uh it will have seen lots of things

00:24:29

that you never actually see in your training or your

00:24:32

validation set and in fact there will be dead nodes that you could optimise out um to get a

00:24:39

nice small network what you would do in an aviation application i would propose is you actually

00:24:45

you don't hook them up to the next layer you hook them up to um

00:24:49

a separate monitoring equipment and if these nodes ever fire you have seen

00:24:54

something that you haven't trained on so that's a big thing anyway
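
The dead-node monitor proposed here can be sketched in two steps: find the units that never activated on the training and validation data, then raise an alarm at run time if any of them fires. The activation vectors below are made up, and the threshold `eps` and function names are assumptions for the illustration.

```python
def find_dead_units(activations_per_sample, eps=0.0):
    """Indices of units whose activation never exceeded `eps` on the given samples."""
    n_units = len(activations_per_sample[0])
    return [j for j in range(n_units)
            if all(sample[j] <= eps for sample in activations_per_sample)]

def novelty_alarm(activations, dead_units, eps=0.0):
    """True if any unit that stayed silent on the reference data fires now."""
    return any(activations[j] > eps for j in dead_units)

# Made-up post-ReLU activations of one layer on the training/validation data.
reference = [[0.0, 1.2, 0.0],
             [0.0, 0.3, 0.5]]
dead = find_dead_units(reference)
print(dead)                                      # [0]
print(novelty_alarm([0.0, 0.9, 0.1], dead))      # False
print(novelty_alarm([0.7, 0.0, 0.0], dead))      # True
```

Instead of pruning the silent units, the monitor keeps them and routes their outputs to a separate check, so that an unfamiliar input has a chance of being noticed.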

00:25:01

um to wrap it up we need to get a grip on reproducibility of

00:25:05

the training process if we wanna build the argument that we have trained

00:25:11

these fittingly numbers uh adequately so we need to have a framework

00:25:16

for what goes on inside the network so that we

00:25:18

can compare these networks and compare data sets and as far as i see it that's all we have to tackle

00:25:26

so you're helps a lot

Conference Program

25:27

Airworthy AI; challenges of certification, part one
Dr. Luuk van Dijk, Founder and CEO of Daedalean
Oct. 12, 2018 · 2:05 p.m.

350 views

25:34

Airworthy AI; challenges of certification, part two
Dr. Luuk van Dijk, Founder and CEO of Daedalean
Oct. 12, 2018 · 2:30 p.m.

107 views

06:33

Airworthy AI; challenges of certification, Q&A
Dr. Luuk van Dijk, Founder and CEO of Daedalean
Oct. 12, 2018 · 2:56 p.m.

181 views
