TensorFlow 3 and Day 3 Questions and Answers session

Transcriptions

Note: this content has been automatically generated.

00:00:00

So we

00:01:20

are starting the last socialise this

00:01:22

through trouble us okay by me hide a

00:01:26

about on software again on that was

00:01:28

about we more ways you only had a user

00:01:32

speakers have to be the only so yeah

00:01:36

Okay, so I guess I'll start the third

00:01:53

talk. Just as usual, feel free to

00:01:57

interrupt me if there are any

00:01:58

questions. We will have time for

00:02:00

questions at the end of this talk and

00:02:01

also for the panel. And please shout out if

00:02:04

there are some technical problems, if you can't

00:02:06

hear me or can't see something on the

00:02:08

slide. So let's start talking about

00:02:12

neural networks in TensorFlow, because

00:02:14

the example that we have had so far was

00:02:16

very good as

00:02:18

a pedagogical example, but you wouldn't

00:02:21

really use that in practice, and I'm

00:02:22

gonna show you some examples of

00:02:25

what you can actually use at home

00:02:27

or at work with TensorFlow.

00:02:31

So if you're here you probably know

00:02:35

what neural networks are, and especially

00:02:37

deep neural networks, and they usually

00:02:39

look like this. The idea is this

00:02:42

hierarchical structure: you always have this

00:02:44

input layer in which you feed the input,

00:02:47

in this case the picture of, I hope I

00:02:49

have that right. It's the bulk okay so

00:02:51

obviously I'm as good as the network,

00:02:53

perfect. So now, once you have the input,

00:02:59

you always propagate information

00:03:00

forward, and then during learning the

00:03:02

gradients flow backward. So this is kind

00:03:04

of the thirty-second introduction to

00:03:08

neural networks, and now we're here

00:03:12

to discuss how neural networks can be

00:03:15

used in TensorFlow. You have three

00:03:17

options, depending on how flexible

00:03:20

you want to be and how much work you

00:03:22

want to put into this. So you might be

00:03:24

able to use already trained models. This

00:03:27

is very easy; for this you don't

00:03:28

actually have to spend that much

00:03:30

computational power. Or you might be able

00:03:33

to retrain one layer of an already

00:03:36

trained model; we will look at that also

00:03:37

in a bit. Or you can use a

00:03:40

high-level API that is built on top

00:03:42

of TensorFlow. Or you can define

00:03:44

the entire neural network directly in

00:03:46

TensorFlow, as we saw with the linear

00:03:48

regression example. So for each of these

00:03:50

I will go into detail and

00:03:51

actually have a notebook to see

00:03:53

exactly what they entail, so that

00:03:55

you get a feel for what in particular you

00:03:57

would want to do. So if you want to use

00:04:01

already trained models, you only have to

00:04:04

do two steps: one, you load the graph

00:04:05

of the trained model, and two, you start

00:04:08

doing inference. So the steps look

00:04:12

exactly like this. It's not that

00:04:14

much code, as you can see. The first part

00:04:16

just loads the model from a path that

00:04:18

is given. So the

00:04:21

only thing that is new here is that we

00:04:23

have import_graph_def. So

00:04:24

before, in the example in the last

00:04:27

lecture, I always defined the graph and

00:04:30

used it in the same program, right?

00:04:33

But in this case I don't define the

00:04:36

graph in this program. I actually just

00:04:38

use it here. So I load it from

00:04:42

what someone else has defined, and we

00:04:45

are able to easily load it. Now we

00:04:48

want to use this graph, and here again

00:04:50

two things are different for the same

00:04:52

reason: I don't have the code that

00:04:55

defines the graph here, but I have the

00:04:58

graph itself. So as we saw in the last

00:05:01

lecture, when you want to run operations

00:05:04

on the graph with TensorFlow, we have to

00:05:06

initialise a session, so this has not

00:05:07

changed. But now, instead of calling

00:05:10

session.run with the variable that

00:05:13

you have defined in the same program,

00:05:14

you have to ask the graph to give you the

00:05:18

tensor for a particular name. Tensors

00:05:20

or operations are associated

00:05:23

with a particular name that the

00:05:26

programmer can define. And here you can

00:05:28

actually query: I want the

00:05:30

softmax, because this is what you use

00:05:33

to get the probability distribution out

00:05:35

of a supervised model, for example, which

00:05:37

is what we will look at right now. And

00:05:39

then in session.run, again for

00:05:41

the placeholder, this is the same

00:05:44

syntax that we used before, when

00:05:46

you have to tell the network what to

00:05:48

run through the graph, right? We

00:05:50

have done that: we have to tell

00:05:52

TensorFlow what it should run through the graph. And

00:05:54

here again we can't say this is the

00:05:58

placeholder X that we

00:05:59

had before, because that's not defined in

00:06:01

this program, but we can give the name of

00:06:03

the placeholder and just call

00:06:06

session.run as before, and this

00:06:08

does exactly what you would expect. So

00:06:10

this is very short and works

00:06:14

quite well. I actually have a short

00:06:17

example for this.

00:06:23

This example works with

00:06:26

ImageNet. So this is a very big network that

00:06:29

we use to recognise objects or

00:06:32

categories from images. And with

00:06:35

TensorFlow there's actually a very small

00:06:36

script that downloads the model for you

00:06:39

and does inference, and we're gonna look

00:06:41

at some examples, so you also kind of

00:06:43

get an intuition of how well it works and

00:06:45

how easy it is to use. So let me start

00:06:50

running it. This image

00:06:52

is the stock image that

00:06:54

comes by default when you want to run

00:06:56

the example. I'm gonna try that one

00:06:57

first. So right now I'm running the

00:07:02

code, and it is actually downloading the

00:07:03

model and then it's running the

00:07:05

inference. And we will see what the

00:07:08

results are. So, as we expect,

00:07:12

the returned labels are giant

00:07:15

panda, panda, panda bear, coon bear and

00:07:17

so on. Looks right: I'm not a zoologist,

00:07:20

but this looks like a panda to me.

00:07:23

Now if we move to another image,

00:07:26

this one is actually not part of that;

00:07:29

the picture above is probably part of

00:07:31

the test set of ImageNet, and this is a

00:07:32

picture that I personally took. And

00:07:35

I'm gonna show you exactly how to do

00:07:37

the same thing for this picture. In this

00:07:39

case, for the TensorFlow example, we

00:07:42

just have to specify the image file as

00:07:46

an argument, and I'm running it again.

00:07:48

So the output of

00:07:57

the classification is tabby

00:08:00

cat. And if you're like me, before

00:08:03

running this you didn't know what that

00:08:04

is. Okay, so

00:08:11

this is actually a tabby cat because of

00:08:13

the stripes, so this is correct. But you

00:08:16

might actually want or what what I

00:08:17

believe if I want to clarify that a

00:08:19

little bit. So this also works. I'm

00:08:21

now also gonna show you another example,

00:08:22

again my own picture. It's

00:08:24

not that I really like

00:08:25

the Golden Gate Bridge, but this is

00:08:27

something that the network has

00:08:29

not seen before, right? It's not

00:08:30

part of any dataset as of yet.

00:08:33

So I'm just gonna start running this,

00:08:37

and this is an example where the

00:08:39

network makes a mistake, but some sort

00:08:42

of reasonable mistake, I would claim. So

00:08:48

right now it's running the inference

00:08:53

and it's loading the graph exactly like

00:08:55

in the code that I showed before. And it

00:08:57

thinks this is a pier. It's not

00:08:59

actually a pier. So the second

00:09:01

suggestion is a suspension bridge, that

00:09:03

is correct, but the first suggestion is

00:09:06

not really right. So as you can see, it

00:09:07

kind of gets the gist of it but it's

00:09:09

not really correct. Now how hard is

00:09:14

it to do this? You can

00:09:16

clearly see that you can use this

00:09:19

right away: we're doing it live, it takes

00:09:20

very little time, I didn't have to train a

00:09:22

model. But how hard is it to actually

00:09:24

write this Python classify_image

00:09:27

script? Obviously this is part of the

00:09:29

open source, so you can have a look

00:09:30

at it. But this file is two hundred

00:09:33

lines, which contain downloading the

00:09:36

model, unpacking the model,

00:09:40

doing inference, and some nice

00:09:44

transformation from the probabilities

00:09:46

that the model gives you to the English

00:09:47

labels, because the model does not give

00:09:49

you the strings, right? The model gives

00:09:50

you some tensor back. And also some

00:09:54

boilerplate for imports and comments, and

00:09:56

the entire thing is two hundred lines.
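
The last step mentioned above, turning the probabilities the model returns into readable English labels, can be sketched in plain Python. This is only an illustration under stated assumptions: the `top_k_labels` helper, the label list and the probability values are hypothetical, and the real script works on a tensor returned by the model rather than on a Python list.

```python
def top_k_labels(probabilities, labels, k=2):
    """Return the k most probable (label, probability) pairs, best first."""
    if len(probabilities) != len(labels):
        raise ValueError("need exactly one label per probability")
    # Sort the class indices by probability, highest first.
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i],
                    reverse=True)
    return [(labels[i], probabilities[i]) for i in ranked[:k]]


# Hypothetical labels and model output, for illustration only.
labels = ["pier", "suspension bridge", "tabby cat", "giant panda"]
probabilities = [0.55, 0.35, 0.04, 0.06]

for label, p in top_k_labels(probabilities, labels):
    print(f"{label}: {p:.2f}")
```

Keeping this mapping separate from inference means the same helper works for any model that returns one probability per class.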

00:09:59

So it's very easy to do this

00:10:01

with the already trained models.

00:10:06

Right, so you

00:10:20

want to do that in the case where the

00:10:22

labels that you want are exactly the

00:10:24

labels that come with ImageNet, right? If you

00:10:26

want to classify some of the labels

00:10:29

that are already there, or if you don't

00:10:31

have your own dataset. But let's

00:10:33

imagine that you want to classify

00:10:35

flower species, or you're really passionate to

00:10:38

find out if in a picture there's a dog or

00:10:40

a cat. So in that case you might not

00:10:42

want to just use the model as is; you

00:10:46

might want to tweak it a little bit. But

00:10:47

the good news is that you

00:10:49

actually don't have to retrain the

00:10:51

entire model or train a whole one

00:10:53

only with your data. So you can

00:10:56

actually take advantage of the features

00:10:59

that the big model learned using the

00:11:02

ImageNet dataset and just tweak

00:11:05

the last layer. So just tweak the

00:11:08

classification layer of the model,

00:11:11

so that you learn the classes that

00:11:12

you want to see. And this is

00:11:16

also very easy, and I hope it's very

00:11:18

clear why you would want to

00:11:25

do this: because it saves you a lot of

00:11:27

computational power. So again, you are

00:11:28

not doing the whole computationally

00:11:30

intensive task of learning that this is

00:11:33

kind of the distribution of natural

00:11:34

images, this is what I should expect to

00:11:37

see as input; you are just tweaking the

00:11:39

last part that does the classification.
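
The idea described above, keeping the learned features fixed and fitting only the final classification layer, can be sketched in plain Python. This is a toy under loud assumptions: `frozen_features` is a hypothetical stand-in for the pretrained layers, the data is made up, and a two-class logistic layer replaces the real classification layer. It is not the TensorFlow retraining command the talk refers to.

```python
import math


def frozen_features(x):
    """Hypothetical stand-in for the pretrained layers: fixed, never updated."""
    return [x[0] + x[1], x[0] - x[1]]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_last_layer(inputs, targets, steps=500, lr=0.5):
    """Fit only the final logistic layer on top of the frozen features."""
    features = [frozen_features(x) for x in inputs]  # computed once, reused
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(steps):
        for f, t in zip(features, targets):
            p = sigmoid(sum(w * v for w, v in zip(weights, f)) + bias)
            error = p - t  # gradient of the log loss w.r.t. the pre-activation
            weights = [w - lr * error * v for w, v in zip(weights, f)]
            bias -= lr * error
    return weights, bias


def predict(x, weights, bias):
    f = frozen_features(x)
    score = sum(w * v for w, v in zip(weights, f)) + bias
    return 1 if score >= 0.0 else 0


# Made-up two-class data, for illustration only.
inputs = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 0.8)]
targets = [0, 0, 1, 1]

weights, bias = train_last_layer(inputs, targets)
print([predict(x, weights, bias) for x in inputs])
```

Only `weights` and `bias` change during training; the feature function is evaluated once per example, which is where the saving in computation comes from.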

00:11:41

And there's also an example of how to

00:11:44

do this, but basically it's one command

00:11:46

line with TensorFlow. So if you have an

00:11:49

image classification problem, you

00:11:51

can just use your data and retrain

00:11:55

the last layer of an Inception model

00:11:57

and do it in one line. That's

00:12:00

pretty cool. And something I want to

00:12:03

mention here: there's a lot of

00:12:04

documentation for all of these, so in

00:12:06

case you decide on one of these three

00:12:09

methods that I'm explaining, there are

00:12:10

plenty of resources online for you to

00:12:13

look at. Now let's see the

00:12:18

second use case: you've looked at this,

00:12:20

but you think that reusing an

00:12:22

already trained network is not really

00:12:23

for you. You have a very specific use

00:12:25

case, so you want to try something else.

00:12:28

And you say: I want to build my own

00:12:30

network in TensorFlow, but I want to

00:12:33

do it very fast and not spend a lot of

00:12:35

time on it, or I don't really want to

00:12:37

tweak the accuracy to the

00:12:40

0.01 level. Well, then you

00:12:42

can actually use TF Learn, which is a

00:12:45

high-level API for TensorFlow. It is

00:12:47

part of TensorFlow, so when you

00:12:49

install TensorFlow you also get TF Learn.

00:12:51

And it implements the scikit-learn

00:12:53

API. For those of you familiar with

00:12:55

Python, you might know this is a very

00:12:57

popular Python API for machine

00:13:00

learning that implements a lot of

00:13:01

models, including SVMs and random

00:13:04

forests and so on. And this TF Learn

00:13:07

API allows you to

00:13:11

create the computational graph that I was

00:13:13

talking about before for neural

00:13:15

networks, for example feed-forward

00:13:18

or convolutional neural networks,

00:13:20

in one line. So this is

00:13:23

pretty cool, it's very easy to use. It

00:13:25

looks exactly like this. This

00:13:27

classifier holds the computational

00:13:29

graph that we saw before. So now I'm

00:13:34

actually gonna show you an example of

00:13:35

this too, live: an example that looks at

00:13:40

digits. There's no machine learning

00:13:42

tutorial without putting up MNIST,

00:13:44

right? So I'm gonna start running the

00:13:48

code and I'm gonna explain to you,

00:13:49

one bit at a time, what I'm

00:13:51

doing. So the first thing, as usual,

00:14:01

is to do all the necessary

00:14:02

imports. And here I'm actually getting

00:14:05

the data and doing some preprocessing,

00:14:08

because I wanted to make a

00:14:10

point to actually use the scikit-learn

00:14:12

API to get the MNIST data

00:14:14

here. Because we're operating under the

00:14:17

assumption that we like the

00:14:19

scikit-learn API and want to use it. And

00:14:21

I'm downloading the MNIST

00:14:23

original data, and this one comes with

00:14:26

pixel values from 0 to

00:14:29

255, which is not really great for

00:14:31

neural networks; it's better if they're

00:14:33

scaled. So in this case I'm dividing each

00:14:36

pixel by 255 so that they're

00:14:38

between zero and one, and then I'm

00:14:41

splitting the data into training and

00:14:43

testing. So this is pretty

00:14:45

straightforward. In case someone here is

00:14:48

not familiar with this, this is how it

00:14:51

looks: twenty-eight by twenty-eight

00:14:53

images. I can also run this a few times to

00:14:56

see multiple examples; these are just

00:14:59

random images from the training set.
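
The preprocessing described above (dividing each pixel by 255 and splitting the data into training and test sets) can be sketched in plain Python. The helper names, the tiny four-pixel "images" and the 80/20 split are assumptions for illustration; the talk itself fetches the real data through the scikit-learn API.

```python
import random


def scale_pixels(image):
    """Map integer pixel values in [0, 255] to floats in [0, 1]."""
    return [pixel / 255.0 for pixel in image]


def train_test_split(examples, labels, test_fraction=0.2, seed=0):
    """Shuffle the (example, label) pairs and split them into two sets."""
    pairs = list(zip(examples, labels))
    random.Random(seed).shuffle(pairs)
    n_test = int(len(pairs) * test_fraction)
    return pairs[n_test:], pairs[:n_test]  # (train, test)


# Tiny made-up "images" of 4 pixels each instead of real 28 x 28 digits.
raw_images = [[0, 51, 102, 255], [255, 255, 0, 0], [10, 20, 30, 40],
              [0, 0, 0, 0], [128, 64, 32, 16]]
raw_labels = [3, 1, 7, 0, 9]

images = [scale_pixels(img) for img in raw_images]
train, test = train_test_split(images, raw_labels)
print(len(train), "training examples,", len(test), "test examples")
```

Shuffling before the split matters when the raw data is ordered by label, otherwise the test set could end up containing only a few of the classes.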

00:15:01

And this is the code that does the

00:15:05

creation of the graph. So this is four

00:15:08

lines, and this is because I wanted to

00:15:10

be trendy and not use the default

00:15:12

optimiser, so actually you can remove

00:15:14

three lines here. So you can do it in

00:15:16

one line if you want. And I'm just

00:15:19

gonna go ahead and create this model

00:15:22

and then I'm gonna train the model. It's

00:15:25

gonna take around one or two minutes, in

00:15:28

which you have time to ask me questions.

00:15:31

So now I'm just starting the model;

00:15:34

this is how much it takes, just for

00:15:36

reference, to do everything that you

00:15:39

need, including downloading the data,

00:15:40

processing it, visualising it a little bit.

00:15:43

And so the difficult part

00:15:46

is four lines, and now the model

00:15:50

is training. And while it's doing that

00:15:56

you have the option to ask a couple of

00:15:58

questions. Not for this example; I have it

00:16:04

for another example coming along.

00:16:15

Sorry, so you don't

00:16:19

need to handle sessions anymore when

00:16:21

you use this API? No, it does that

00:16:23

for you; it does that under the hood.

00:16:25

So this first part creates

00:16:28

the classifier, but the second

00:16:31

part deals with creating the execution

00:16:35

environment. And what

00:16:38

if I want to run this on a GPU? I

00:16:41

think there are options to specify this,

00:16:43

yes. I mean, no one does

00:16:48

things without them being able to run

00:16:51

on the GPU anymore, so it can learn how

00:16:58

to recognise the digits faster on your

00:17:00

machine. Anyway, it's gonna be

00:17:05

done very soon, and then I'm

00:17:06

actually gonna show you a bit of what

00:17:08

kind of mistakes it makes, because I

00:17:10

think that's pretty interesting.

00:17:13

So any other questions? So maybe

00:17:16

one: what is the model?

00:17:18

So this is a completely fair

00:17:21

point. This is just a feed-forward

00:17:23

classifier with two layers and a hundred

00:17:26

hidden units each.

00:17:29

No, it is not a convnet, but you

00:17:30

can equally easily define a

00:17:32

convnet. But this is just a

00:17:34

feed-forward neural network with the

00:17:36

ReLU activation. And it's done. And

00:17:41

now I'm gonna run and see how well it

00:17:43

did. So I had the result ready because

00:17:47

I ran it before, but the point

00:17:51

is that it's pretty good. It's actually

00:17:53

making only two mistakes in a hundred

00:17:55

examples, in the time it took to ask a couple

00:17:58

of questions. So it's very easy to

00:18:01

use and gives relatively good results.

00:18:04

Obviously this is not state of the art

00:18:06

or anything, but I just want to give

00:18:09

a clear example. And now let's look at

00:18:12

how well it does on a random image in the

00:18:16

test set. So I'm just gonna do it from

00:18:19

here, so that you also get an idea of how

00:18:25

easy it is later to look at

00:18:27

these results and visualise things

00:18:30

because, again, since we're in

00:18:33

Python you can use all of your

00:18:35

favourite plotting libraries. So let's

00:18:40

get a random test example and plot it

00:18:46

too. So this one is

00:19:04

correct, it says it's a one. And I

00:19:08

mean, you should expect this: if it's

00:19:09

wrong two out of a hundred times, when I

00:19:11

pick one at random it's probably gonna

00:19:13

be correct, right? So here we see an

00:19:15

example of a correct classification, but

00:19:17

how about if I want to see one where it

00:19:19

does something wrong, to see: does it

00:19:21

make mistakes on very trivial

00:19:23

examples, or does it make mistakes

00:19:25

on trickier examples? I'm gonna

00:19:28

do this in a not so efficient way, but I'm

00:19:31

just gonna check when these two vectors

00:19:35

are equal, and I'm gonna get the

00:19:39

index of the first one where they're

00:19:41

not. Index around one one four four,

00:19:48

so I'm gonna say this

00:19:54

is the index of a mistake, right? So here

00:19:56

it looks at when the classifier does

00:20:00

not agree with what we know to be the

00:20:03

actual values. And I'm also gonna plot

00:20:07

that, and what it predicted. So now

00:20:25

we know there should be a mistake. Yeah,

00:20:27

so probably this guy should be a two

00:20:32

and the network says it's a one. So

00:20:35

this is a mistake, but it doesn't

00:20:39

really look like a two to me. But in any case,

00:20:42

this is how you can get an

00:20:44

intuition of when the model does well

00:20:47

and when the model does not, and

00:20:48

you can do this again in four lines of

00:20:51

code, in fact. Now suppose that with

00:21:08

this I still didn't convince you. Suppose

00:21:10

you are really interested in starting

00:21:11

things from scratch, or perhaps you're a

00:21:13

researcher who wants to come up with

00:21:15

new ideas and new models. Then you

00:21:17

actually maybe want to use TensorFlow

00:21:19

and go to actually the lowest level

00:21:23

possible, or to mix some of the

00:21:26

lower-level variants with some of the

00:21:28

higher-level ones. So TensorFlow has a lot of

00:21:31

support for existing activation

00:21:33

functions; you probably know ReLU,

00:21:35

sigmoid, tanh and so on. The same goes for

00:21:36

cost functions, different

00:21:39

normalisation techniques and

00:21:41

embeddings. So all these things that are

00:21:43

now widely used are already there.
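
As a reminder of what such built-in activation functions compute, here is a plain-Python sketch of three common ones. The choice of ReLU, sigmoid and tanh is an assumption for illustration, and these are not TensorFlow's implementations, which operate on whole tensors.

```python
import math


def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic function, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Hyperbolic tangent, squashing any real number into (-1, 1)."""
    return math.tanh(x)


for name, fn in [("relu", relu), ("sigmoid", sigmoid), ("tanh", tanh)]:
    print(name, [round(fn(x), 3) for x in (-2.0, 0.0, 2.0)])
```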

00:21:45

And if you want to do this, it's more

00:21:50

complex, so it's not gonna be three

00:21:52

lines of code, but it gives you a lot

00:21:53

more flexibility. So it depends again on

00:21:55

the spectrum: you always have to think

00:21:57

where you are on the spectrum. How much

00:21:58

time do I want to spend on this? How

00:22:00

much do I care about improving the

00:22:02

model? Do I want to potentially

00:22:04

define a new layer type? And this is

00:22:07

ideal for researchers. Now I

00:22:11

actually have an example to go through

00:22:13

this too, for completeness, and it's,

00:22:18

surprise, also MNIST digits. So I'm gonna

00:22:24

clear the output, so that you

00:22:26

actually trust me that I'm running

00:22:29

everything live. Okay, so everything

00:22:32

disappears, there's no output. Great. So

00:22:36

same idea: I'm gonna get the dataset,

00:22:39

this time from TensorFlow,

00:22:42

because I'm no longer using the

00:22:43

scikit-learn API and the APIs are a

00:22:46

bit different, but TensorFlow also

00:22:47

gives you the MNIST dataset. So you

00:22:50

see it downloaded it, extracted it. Again

00:22:53

we can visualise a couple of examples,

00:22:56

but by now you already kind of get an idea

00:22:58

of what this looks like. And now we

00:23:03

get to the part about defining the

00:23:05

graph, right? So before, defining the

00:23:07

graph was a couple of lines of code in

00:23:09

which we defined the optimiser, and one

00:23:12

line for actually defining the

00:23:14

feed-forward network classifier. In this case

00:23:18

this is the graph definition. So we

00:23:21

have to choose how many layers we want

00:23:24

to define. We have to choose the batch

00:23:26

size, the layer sizes; so we're using, as

00:23:29

before, a hundred neurons per layer for

00:23:33

two layers. The input size is given by

00:23:35

the dataset, so twenty-eight

00:23:37

times twenty-eight equals seven hundred

00:23:39

eighty-four, and the number of classes

00:23:42

is ten, because that's how many digits,

00:23:43

how many labels, we want to be able to

00:23:47

classify. And now this shouldn't be a

00:23:50

surprise given the talk that you just

00:23:53

heard before: we have to define two

00:23:54

placeholders. This is a supervised

00:23:56

learning setting, so we have the

00:23:58

examples and the labels

00:24:00

associated with them. So in this case X

00:24:02

will stand for the images, so something

00:24:07

like this, and Y will stand for the

00:24:10

associated labels, so this is for example,

00:24:12

in this case, the three. So this is the

00:24:15

placeholder definition. And now if you

00:24:18

want to define the graph, you actually

00:24:20

have to define the weights and the biases,

00:24:22

very similar to the linear

00:24:23

regression example. So for neural

00:24:27

networks the weights are

00:24:29

matrices and the biases are vectors.

00:24:31

The output is defined as you expect:

00:24:34

there's an activation function which is

00:24:35

applied to a matrix multiplication

00:24:37

between the previous layer values and

00:24:40

the weights, to which you add the bias.

00:24:42

So this is just the standard neural

00:24:44

network formula. And because I'm doing

00:24:46

this in a loop (and why am I doing this

00:24:48

in a loop? Because I don't wanna copy and

00:24:49

paste the code twice in case I

00:24:51

change my mind and use three layers), I

00:24:54

have to keep track of what my previous

00:24:56

layer sizes and values are. So this

00:24:59

defines the hidden layers and how

00:25:01

you do the computation from the input

00:25:03

layer to the first hidden layer and from the

00:25:05

first hidden layer to the second hidden layer.
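
The loop described above (each hidden layer applies an activation to the previous layer's values multiplied by a weight matrix, plus a bias vector, while tracking the previous layer's size and values) can be sketched in plain Python. The helper names and the Gaussian initialisation are assumptions for illustration; the talk builds the same structure out of TensorFlow variables and operations.

```python
import random


def relu(values):
    return [max(0.0, v) for v in values]


def matvec_add(inputs, weights, biases):
    """Compute inputs @ weights + biases for a single example.

    `weights` holds one row per input unit and one column per output unit.
    """
    return [sum(inputs[i] * weights[i][j] for i in range(len(inputs))) + biases[j]
            for j in range(len(biases))]


def build_hidden_layers(input_size, layer_sizes, seed=0):
    """Create (weights, biases) for each hidden layer."""
    rng = random.Random(seed)
    layers = []
    prev_size = input_size  # keep track of the previous layer's size
    for size in layer_sizes:
        weights = [[rng.gauss(0.0, 0.1) for _ in range(size)]
                   for _ in range(prev_size)]
        biases = [0.0] * size
        layers.append((weights, biases))
        prev_size = size
    return layers


def forward(x, layers):
    prev_values = x  # keep track of the previous layer's values
    for weights, biases in layers:
        prev_values = relu(matvec_add(prev_values, weights, biases))
    return prev_values


input_size = 28 * 28  # 784 pixels per image
layers = build_hidden_layers(input_size, [100, 100])
hidden = forward([0.5] * input_size, layers)
print(len(hidden))
```

Because the layer sizes live in a list, switching from two hidden layers to three only means changing `[100, 100]` to `[100, 100, 100]`.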

00:25:07

But how about the output? So here we

00:25:09

want to use the softmax to get the

00:25:11

probability distribution over

00:25:13

the possible digits. So again we

00:25:16

define the weights, also a matrix, the biases,

00:25:19

the logits. So very similar: matrix

00:25:24

multiplication plus addition,

00:25:26

but here we don't have the ReLU

00:25:28

because we don't use this

00:25:30

activation function directly. So here

00:25:32

comes the trick: for numerical

00:25:34

stability you don't use softmax and

00:25:37

then cross-entropy; there's a function

00:25:39

called softmax_cross_entropy_with_logits

00:25:41

that just takes the

00:25:43

logits, to avoid the numerical instability

00:25:46

that comes with softmax. You could

00:25:48

just use softmax, but you might suffer

00:25:50

because of this:

00:25:53

you might see a lot of NaNs

00:25:55

coming your way. And again you have to

00:25:58

define the optimiser; this is exactly the

00:26:00

same three lines as we saw before.

00:26:03

So this does the same thing, but it does

00:26:07

it explicitly; before, the code did

00:26:09

exactly the same thing. Maybe the

00:26:12

initialisation that they use by

00:26:14

default is a bit different to the one I

00:26:16

have here, but the concept is exactly

00:26:18

the same. So now I have actually run

00:26:21

the code which creates this graph. So

00:26:24

again, this just creates the computational

00:26:25

graph. You have to have in mind the

00:26:27

same picture that we saw when we looked

00:26:30

at the TensorBoard demo;

00:26:32

this is the same kind of thing for a neural

00:26:34

network. And now, as before, whenever I want

00:26:39

to start doing some

00:26:41

computations, I have to create a

00:26:42

session. That's created. And now again

00:26:47

some waiting time, because I'm gonna go

00:26:48

through some examples and actually

00:26:50

start training. So the training is

00:26:55

happening right now, and someone asked

00:26:57

about TensorBoard, so I have here

00:27:00

the code that just creates the

00:27:03

summary writer. I did not add any

00:27:05

specific metric that we want to

00:27:09

see during training, so we will not see

00:27:11

the accuracy going up or down, but

00:27:13

we can actually visualise the graph, so

00:27:15

exactly this graph that we are creating

00:27:16

here. At the end I will show you how

00:27:19

we can look at it. This is

00:27:22

also taking a little bit, but I just

00:27:24

wanna show that it is the same idea, same

00:27:26

principle. So any other questions in the

00:27:28

meantime, while this other network is

00:27:30

learning how to classify digits? Is

00:27:44

TensorBoard actually live, or do you

00:27:47

visualise it only after you have

00:27:50

finished training? So it depends how

00:27:52

you want to do it. So here,

00:27:54

TensorBoard just reads

00:27:57

from the logs. So if you launch it at

00:28:01

the beginning you can look at it as

00:28:03

the training happens, yes. In this case I

00:28:09

didn't do this because I'm

00:28:11

actually not recording any of the

00:28:13

accuracy, and we already saw the demo of

00:28:16

that, but I just want to show also how

00:28:18

easy it is. So this is just one

00:28:21

line here and one line here, and this

00:28:23

is in the notebook; I have the bash

00:28:26

example here. So it's done training,

00:28:28

great. Sorry, one question: I saw

00:28:33

that you were using a for loop inside

00:28:35

TensorFlow. Is it okay to do this? Because I

00:28:38

know that in general we should not do

00:28:39

this in Theano. Well, you

00:28:42

have another function for that. So here you

00:28:44

can also do so; Theano has the scan

00:28:46

function. So you can do this, or you

00:28:48

can also use the equivalent of a for loop, or

00:28:50

the equivalent of scan, in

00:28:53

TensorFlow. Is it the same amount of

00:28:54

performance for you? I don't think

00:28:57

this has the same problems that Theano

00:28:58

does. Okay, so now that this is done,

00:29:07

here I'm also computing the precision

00:29:08

myself, without taking advantage of

00:29:11

the other functions. And it's also

00:29:14

doing pretty well, so it's right also

00:29:17

around ninety-eight times out of a

00:29:20

hundred. So within a couple of minutes we

00:29:22

managed to train two different networks

00:29:24

to do digit recognition. And now I'm

00:29:28

gonna start TensorBoard.

00:29:32

Sorry, one stupid question: you're training

00:29:34

on your laptop? Yes, so the training

00:29:36

is on my laptop, on the CPU. I'm

00:29:38

minor what's if you with this you of my

00:29:43

son. Okay, this is

00:29:55

actually a very important point: this

00:29:56

did not run on the GPU, so if you

00:30:00

run on the GPU it's gonna be faster. So

00:30:02

because I didn't add any scalar

00:30:04

data we won't see that; we already

00:30:06

saw an example. But here you

00:30:09

have some helpful messages in case

00:30:10

you're stuck when you're setting up

00:30:12

your scalar values. But we can see the

00:30:16

graph, so this is what the graph looks like

00:30:18

for our network, and I'm just gonna zoom

00:30:20

in. Okay, so you can see exactly how

00:30:36

it looks: you have a ReLU, you

00:30:38

have the gradients associated, you always

00:30:40

have the matrix multiplication, you also

00:30:42

see your placeholder variables. So you

00:30:45

get a very good idea of what you

00:30:48

actually created when you defined your

00:30:50

graph. This is very useful, especially in

00:30:52

the case where you're defining the

00:30:54

graph yourself rather than using a

00:30:55

higher-level API. So that's enough now about

00:31:12

feed-forward networks. We saw kind of

00:31:15

the ideas and the principles

00:31:18

that you can use to figure out what's

00:31:20

good for you. Obviously this

00:31:22

doesn't only apply to MNIST digits, and

00:31:25

it doesn't only apply to feed-forward

00:31:26

networks; it's just the general idea: where

00:31:28

do you fit into this landscape, how far

00:31:30

do you want to go? But you also

00:31:34

might care about state-of-the-art

00:31:36

model support in TensorFlow, so I'm

00:31:38

gonna talk a little bit about that now.

00:31:40

So you might be aware of the sequence

00:31:46

sequence model which are very much is

00:31:49

for for translation just days there was

00:31:51

a paper by or your being as and two

00:31:54

thousand fifteen two thousand fourteen

00:31:56

and two thousand fifteen from your job

00:31:59

angels that they sent and on those

00:32:00

results of how to use this kind of

00:32:02

models for translation. And the idea is

00:32:04

pretty simple you have an encoder that

00:32:06

is usually, for translation, an LSTM or

00:32:10

another RNN that encodes the input

00:32:12

sentence and you have the decoder that

00:32:15

decodes the sentence into the target

00:32:18

language so in this case we're

00:32:19

translating from French to English,

00:32:21

which might be helpful if

00:32:22

you're here today and so this is again

00:32:27

a very useful and relatively new model

00:32:30

and I want to show how much it takes to

00:32:32

implement that: in principle,

00:32:33

actually, one line. So this is the basic

00:32:36

one: you just have to say I want the graph

00:32:39

for a basic RNN sequence-to-sequence

00:32:42

model, and these are the inputs

00:32:45

that you feed to the encoder,

00:32:46

these are the inputs that you feed to

00:32:47

the decoder, and I want to use this kind

00:32:50

of cell. So the cell is the kind of

00:32:53

RNN cell that you want to use.

00:32:55

And this is it this is all that you

00:32:57

need to do to get this kind of model,

00:33:01

and this model also has a lot of other

00:33:05

variants. For example, you might want

00:33:06

to use the one with embeddings in the

00:33:09

encoder, or you might want to use

00:33:10

attention, and that's also only one

00:33:13

line. So it's pretty straightforward.
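
The encoder/decoder idea described above can be sketched in plain Python. This is an illustrative toy and not TensorFlow's API: `BasicRNNCell`, `basic_seq2seq`, `make_matrix` and `matvec` are hypothetical names chosen to mirror the call shape mentioned in the talk (encoder inputs, decoder inputs, a cell). The weights are random and untrained, and the encoder and decoder share one cell here purely for brevity, which is an assumption of this sketch.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class BasicRNNCell:
    """Hypothetical stand-in for an RNN cell: h' = tanh(W_x x + W_h h)."""

    def __init__(self, input_size, state_size, rng):
        self.state_size = state_size
        self.w_x = make_matrix(state_size, input_size, rng)
        self.w_h = make_matrix(state_size, state_size, rng)

    def __call__(self, x, h):
        pre = [a + b for a, b in zip(matvec(self.w_x, x), matvec(self.w_h, h))]
        new_h = [math.tanh(p) for p in pre]
        return new_h, new_h  # (output, new state)


def basic_seq2seq(encoder_inputs, decoder_inputs, cell):
    """Encode the whole input sequence into one state, then decode from it."""
    state = [0.0] * cell.state_size
    for x in encoder_inputs:          # encoder: only the final state is kept
        _, state = cell(x, state)
    outputs = []
    for y in decoder_inputs:          # decoder: starts from the encoder state
        out, state = cell(y, state)
        outputs.append(out)
    return outputs, state


rng = random.Random(0)
cell = BasicRNNCell(input_size=3, state_size=4, rng=rng)
encoder_inputs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # two source tokens
decoder_inputs = [[0.0, 0.0, 1.0]] * 3                # three target steps
outputs, final_state = basic_seq2seq(encoder_inputs, decoder_inputs, cell)
```

The point of the sketch is the data flow: the input sentence is compressed into a single state vector, and the decoder produces one output per target step starting from that state. Swapping the cell type changes the recurrence without touching `basic_seq2seq`.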

00:33:17

And for RNN cells: they kind

00:33:20

of look like this, but they can also be

00:33:21

easily created using one line, so

00:33:24

you can say I want the basic LSTM cell

00:33:26

of this size and feed that into the

00:33:29

sequence-to-sequence model. So that was a

00:33:33

bit about recurrent neural models, but

00:33:37

how about the inception architectures

00:33:40

also called GoogLeNet, which is

00:33:45

also available, so you can actually get

00:33:47

the trained model, as we saw before, but you

00:33:49

can also look at the code. Recently we

00:33:52

open-sourced the code for dependency

00:33:55

parsing also, which is the task where, if you

00:33:58

want to find out in the language what's

00:34:00

the action, how are certain

00:34:03

things corresponding to each

00:34:10

other, what is the object, what is the

00:34:11

preposition. So you can actually do

00:34:14

that quite easily. You can do that with

00:34:16

your data, or there's already a trained

00:34:18

model for English available; it's called

00:34:19

Parsey McParseface. So you can use

00:34:22

that if you want to look at English but

00:34:25

the code is also available, so

00:34:27

you can just start playing with it and

00:34:29

understand more about language and this

00:34:32

was just a couple of examples that are

00:34:34

specifically given a by utterance of

00:34:37

low and of real people in their morsel

00:34:40

out encoders in syntax and also

00:34:43

present. But I also want to mention a

00:34:45

couple of really great community

00:34:47

examples in the hope that you will also

00:34:49

contribute and see how nice it is again

00:34:52

also coming back to this this point if

00:34:54

you have a community you can really

00:34:56

push the the field forward. So these

00:34:59

are a couple of examples and I'm gonna

00:35:02

go through each of them and discuss a

00:35:03

little bit what they do and again these

00:35:06

are all given by contributors, so not

00:35:10

part of TensorFlow itself. So you

00:35:13

might be aware of deep Q-networks:

00:35:16

they're a way to integrate the deep

00:35:18

learning framework into reinforcement

00:35:20

learning. And if you want to use that

00:35:23

with TensorFlow, there is a repository

00:35:25

available. So it's also very easy to

00:35:28

start with that if you want to play

00:35:31

around with the neural art examples:

00:35:34

this has been very popular, also helping

00:35:36

popularise deep learning outside of

00:35:38

the machine learning community, because it's

00:35:41

very intuitive you have a picture and

00:35:44

you have a painting. And you input them

00:35:46

both to the model and the model returns

00:35:48

the picture kind of looking like

00:35:51

that painting. So if you want to do

00:35:54

this in TensorFlow it's also

00:35:56

definitely possible. Char-RNN, the

00:35:59

idea of feeding into a recurrent neural

00:36:01

network one character at a time, is also

00:36:04

possible to do in TensorFlow.

00:36:07

Also Keras, I mean their contribution:

00:36:10

you might know Keras, it's a deep

00:36:11

learning high-level library. It used to

00:36:15

work with Theano as the backend; now it also

00:36:16

supports TensorFlow. Neural caption

00:36:21

generation and also this is an

00:36:22

implementation of the show and tell

00:36:24

paper if you are familiar with it so

00:36:27

that the basic idea is that you ask the

00:36:31

neural network to describe the picture

00:36:33

that you give. So it's a combination

00:36:34

of a convolutional network, where the

00:36:37

encoder is the convolutional network, and

00:36:38

an RNN as the decoder. So you

00:36:41

can also use this with TensorFlow.

00:36:43

English-to-Chinese translation is also

00:36:47

available in TensorFlow if you're

00:36:49

interested. And this kind of sums

00:36:54

up the examples that I wanted to show. So as

00:36:56

you can see they're very diverse, going

00:36:59

from images to translation to art

00:37:04

examples. So the community is really

00:37:06

interested in bringing these

00:37:07

things to TensorFlow, and you can

00:37:10

also help with that and I'm gonna talk

00:37:13

a bit about another aspect also for

00:37:16

potentially more advanced users, but

00:37:18

it might be very, very important: how

00:37:20

to create your own operation. So

00:37:23

TensorFlow does a lot of things, it's

00:37:25

very flexible but maybe you want to do

00:37:28

something else right. And I'm gonna

00:37:33

show a little bit of what's the way to

00:37:37

do that. But firstly, when should you

00:37:39

create your own operation? So the

00:37:42

first obvious case is when you want to do

00:37:44

something and you can't do this by

00:37:45

composition of already existing

00:37:47

operations. The second use case is: there

00:37:51

are ops that do what you want, so you

00:37:53

can combine them to achieve your

00:37:56

goal, but you want to speed up the

00:37:58

computation. So you might

00:38:00

know a clever way, instead of

00:38:03

calling one op after the other, to

00:38:06

speed up the code if you just do it in

00:38:08

one op. Or you might want a memory-

00:38:11

efficient implementation, which can

00:38:12

also be made better by combining

00:38:16

operations or you can have a more

00:38:18

numerically stable implementation, and we

00:38:20

already saw this with the softmax

00:38:22

cross-entropy example, right? So you can

00:38:25

have the softmax op, you can have the

00:38:27

cross-entropy op, you can apply them

00:38:29

sequentially, but for numerical

00:38:31

stability reasons it's better to just

00:38:33

call them in one operation. So these

00:38:37

are the steps to create your own operation,

00:38:40

the steps for creating your

00:38:41

own operation in TensorFlow. Some of

00:38:43

them are optional again depends on your

00:38:45

use case and what you want to do with

00:38:47

your operation. But usually

00:38:50

it kind of goes like this: you want to

00:38:52

register your operation in a C plus

00:38:54

plus file, so you have to tell

00:38:56

TensorFlow: I'm registering

00:38:58

my operation, this is how it looks

00:39:00

like you have to implement it obviously

00:39:03

and the implementation of an

00:39:05

operation is called the kernel, and if

00:39:07

you want your kernel to run on multiple

00:39:09

devices you might have to implement

00:39:10

multiple kernels. Optional: if you want

00:39:14

your op to be used in Python you can

00:39:17

use the loader provided by TensorFlow

00:39:20

and create the Python wrapper, also in

00:39:22

one line. And you can write the

00:39:23

function to compute the gradients of

00:39:25

your op. So if you want to use the

00:39:28

differentiation that comes with TensorFlow,

00:39:30

if you want to

00:39:33

integrate your operation in training

00:39:35

the neural network, then you need to

00:39:37

define the gradients of the output of

00:39:40

your op with respect to your inputs. And if

00:39:43

you want to benefit from shape inference

00:39:45

you can also write the function that

00:39:48

describes the input and output shapes.

00:39:50

And obviously always, always test your

00:39:52

code. So now let's go back to

00:39:56

the steps. First you register the op; it

00:39:59

looks a bit like this: I have my own

00:40:01

operation, I don't have a good name for

00:40:03

it so I'm gonna name it my own op. This

00:40:05

is the input, it's gonna be of

00:40:07

type int thirty-two, and this is

00:40:09

the output. This is a very simple step.

00:40:12

then you know what you want to do so

00:40:16

here you just have to define the

00:40:17

implementation of the operation that

00:40:19

you have decided that you want to

00:40:21

implement. And it comes with a

00:40:23

standard API that you have to use, but

00:40:26

it's very simple: you just take the

00:40:27

input from the OpKernelContext

00:40:30

object provided and you allocate the

00:40:32

output but the computation is something

00:40:35

that should be easy or relatively

00:40:37

straightforward because you know what

00:40:38

you want to do. And then you have to

00:40:42

00:40:44

did before, we actually registered the

00:40:45

operation. But now we have to say that

00:40:48

this operation has this kernel

00:40:52

registered to it, so this is

00:40:54

the CPU implementation associated with

00:40:56

my operation. And then you can build

00:41:00

your kernel using your favourite

00:41:01

compiler or with the build system that

00:41:04

comes with TensorFlow, which is Bazel.

00:41:05

So now if you want to make your

00:41:09

operation run on the GPU you actually

00:41:12

have to define the CUDA implementation

00:41:14

of it and register it, so this is the

00:41:17

bit more tricky part. You actually

00:41:20

have to write CUDA if you want your

00:41:21

operations to run on the GPU. If you

00:41:25

want to use an operation in Python then you

00:41:27

don't have to do much. Basically you

00:41:29

just have to take the .so

00:41:30

file that was created by building

00:41:35

the C plus plus code, and just load it,

00:41:38

and now you have your module in which

00:41:39

you can call the op that you

00:41:42

defined. So this is pretty

00:41:44

straightforward and you might also want

00:41:48

again to use the function to compute

00:41:51

the gradient. So if you want to apply

00:41:54

your operation in a neural network

00:41:55

setting, you want

00:41:57

to do this. And this is also simple, and

00:42:00

up to you: I don't think that TensorFlow

00:42:02

can help with this. If you're

00:42:03

defining the new operation you have to

00:42:05

know what the gradients of the output

00:42:09

with respect to each of the inputs is,

00:42:11

and from here you can just let TensorFlow

00:42:15

do its magic, and you can

00:42:16

combine your op with all other

00:42:18

operations. And please, if you have a

00:42:21

really strange need for an

00:42:24

operation and you have implemented it, just

00:42:25

send a pull request so that other people

00:42:27

can also benefit. But in

00:42:29

principle these are kind of the steps.
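
The steps above (register an op, implement its kernel, register its gradient, test it) can be illustrated with a small plain-Python analogue. This is a sketch under stated assumptions and not TensorFlow's C++ or Python API: the `KERNELS` and `GRADIENTS` tables, the `register` decorator and the op name `"SoftmaxXent"` are hypothetical. The op chosen is the fused softmax cross-entropy mentioned earlier in the talk, because it also shows the numerical-stability motivation: applying softmax and then the log separately would overflow for a logit of 1000, while the fused form does not.

```python
import math

# Hypothetical registries, standing in for a framework's kernel and gradient tables.
KERNELS = {}
GRADIENTS = {}


def register(table, name):
    """Decorator that stores a function in the given table under an op name."""
    def wrap(fn):
        table[name] = fn
        return fn
    return wrap


@register(KERNELS, "SoftmaxXent")
def softmax_xent(logits, label):
    """Fused, stable loss: logsumexp(logits) - logits[label]."""
    m = max(logits)  # subtracting the max keeps exp() from overflowing
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return lse - logits[label]


@register(GRADIENTS, "SoftmaxXent")
def softmax_xent_grad(logits, label, upstream=1.0):
    """Gradient w.r.t. each logit: softmax(logits) - one_hot(label)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [upstream * (e / total - (1.0 if i == label else 0.0))
            for i, e in enumerate(exps)]


def numeric_grad(fn, logits, label, eps=1e-6):
    """Central finite differences, used only to test the analytic gradient."""
    grads = []
    for i in range(len(logits)):
        up = list(logits)
        down = list(logits)
        up[i] += eps
        down[i] -= eps
        grads.append((fn(up, label) - fn(down, label)) / (2 * eps))
    return grads


logits = [2.0, -1.0, 0.5]
analytic = GRADIENTS["SoftmaxXent"](logits, 0)
numeric = numeric_grad(KERNELS["SoftmaxXent"], logits, 0)
extreme_loss = KERNELS["SoftmaxXent"]([1000.0, 0.0], 0)  # no overflow here
```

Comparing `analytic` against `numeric` is the "always test your code" step: if the hand-written gradient disagrees with finite differences, training with the op would silently go wrong.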

00:42:32

And the most difficult part is to know

00:42:34

what you want once you know what you

00:42:36

want you just have to implement that

00:42:38

and compute the gradients for that

00:42:41

operation. And just something

00:42:49

that I want to stress about TensorFlow:

00:42:52

it's a general machine learning framework,

00:42:55

actually more, a general

00:42:57

computation framework. So

00:43:00

it can be used for problems that

00:43:01

require differentiation, optimisation

00:43:03

or linear algebra computation.
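
As a small illustration of that point, an optimisation problem does not need to involve a neural network at all. The sketch below fits a line by gradient descent in plain Python; the gradients are written by hand here, whereas a framework with automatic differentiation would derive them from the loss. The function name `fit_line`, the data and the hyperparameters are assumptions made up for this example.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Minimise mean((w*x + b - y)^2) over w and b by gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Hand-derived gradients of the mean squared error.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]   # points generated from y = 2x + 1
w, b = fit_line(xs, ys)
```

The same loop structure (compute a loss, take its gradient, update the parameters) is what a computation-graph framework automates once the loss is expressed as a graph.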

00:43:06

But keep in mind that it is made with

00:43:09

deep learning in mind so most of the

00:43:10

API support and the feature requests

00:43:13

that will be addressed are actually

00:43:14

looking at at deep learning but you can

00:43:17

use it for for other other problems as

00:43:19

well, and actually there is a

00:43:21

differential equation solver in TensorFlow,

00:43:23

if you want to play with that a

00:43:24

little bit; there's a tutorial available

00:43:27

online I'm not gonna go through this

00:43:28

right now. So this brings

00:43:33

me to the conclusion of this talk and

00:43:40

the general idea of the TensorFlow

00:43:42

talks. So you're probably here because

00:43:45

you want to find out what's the best

00:43:48

thing for you right. And this again

00:43:50

depends a lot on your use case; it

00:43:52

depends on how much time you want to

00:43:54

spend on this. If you actually don't

00:43:57

know anything about deep learning and you

00:43:59

wanna play around a little bit,

00:44:01

you can train neural networks in your

00:44:03

browser. So there is playground.tensorflow.org;

00:44:05

I'll show you a

00:44:07

little bit. And then here you see,

00:44:12

with this UI you can

00:44:15

actually train very simple neural

00:44:17

networks just to get an intuition

00:44:19

of how this looks, so you can

00:44:22

increase the number of neurons decrease

00:44:24

them, both here. You can change the

00:44:27

activation function, your

00:44:29

regularisation, and you can change the problem

00:44:32

type and you can also start learning

00:44:34

and you can see how your model is

00:44:36

converging. So if you're not at all

00:44:38

familiar with deep learning this might

00:44:40

be a nice way to spend thirty

00:44:42

minutes or so. But this was just a toy

00:44:53

example, right? So if you want to

00:44:55

start using machine learning for

00:44:58

real life problems you have multiple

00:45:01

options, including using cloud-based APIs, so

00:45:04

this is something that I have not

00:45:05

discussed before. And as in the neural

00:45:08

network case, the more complicated you

00:45:12

go, it's more flexible but you need to

00:45:14

spend more time and effort to learn the

00:45:16

framework, to learn about the quirks of

00:45:19

the models right so it depends where

00:45:22

you are on the scale. And now let's

00:45:25

look a bit at this. So if you just

00:45:28

want not to really deal even with

00:45:31

TensorFlow, right, you can just use

00:45:34

some cloud-based API to get some of

00:45:37

the outputs that you want. So there's the

00:45:38

Google Translate API for translation,

00:45:42

for speech, for vision; you can do

00:45:45

also sentiment analysis with the

00:45:47

celtics API so there plenty of options

00:45:50

available without getting your hands

00:45:53

dirty with the actual machine learning

00:45:56

frameworks. So let's look a bit at

00:45:59

how it would look with the

00:46:01

Cloud Vision API. So if you ask the

00:46:03

Vision API what's in the picture, it

00:46:06

can tell you that so for example here

00:46:07

clearly there's a lot of people running

00:46:09

and there's a marathon. So it can tell

00:46:11

you that, and it can also give the scores

00:46:13

associated with it. Or you can find

00:46:16

out what emotions people are

00:46:19

displaying, so here it looks like

00:46:21

someone is joyful. And you

00:46:24

can also find out what's the text in

00:46:26

the picture and what's the language of

00:46:28

that text. So again you

00:46:31

don't have to actually write the code

00:46:33

yourself or even load already trained

00:46:36

models. The second option, also very

00:46:38

simple, is using a pretrained model with

00:46:41

TensorFlow. We've already looked into

00:46:43

this, so I'm not gonna spend much time on

00:46:45

it. But the third and fourth

00:46:48

options are training your models or

00:46:52

creating and producing the models, and

00:46:54

for this you have multiple options. So

00:46:57

you can run the TensorFlow open

00:46:59

source release on your physical machines

00:47:01

or on your virtual machines in a

00:47:03

cloud environment, or you can use

00:47:05

the Cloud Machine Learning API, which allows

00:47:07

you to use TensorFlow but also a lot of

00:47:09

the other developer tools that Google

00:47:13

provides. So again you have to think

00:47:15

where do you want to be here do you

00:47:17

want if you have your own physical

00:47:18

machines then you probably are here. So

00:47:21

depending on your use case: if you're

00:47:22

already using a lot of the Google

00:47:24

developer tools then you probably want

00:47:26

to go here. Now if you want to develop

00:47:30

your own machine learning model, then you saw

00:47:32

it's relatively easy to define the

00:47:34

computational graph it's very flexible

00:47:36

and once you define it you create

00:47:38

a session and you start training. And

00:47:41

this is an example of how to use

00:47:44

this for robotics for making robot arms

00:47:49

to pick up things we can have a look if

00:47:52

you're interested, which

00:47:55

obviously uses TensorFlow. And just

00:47:59

to include a bit of an advertisement.

00:48:02

So we recently announced the Google

00:48:05

Europe research centre based in Zurich.

00:48:09

And we are encouraging people to apply

00:48:13

as software engineers or research

00:48:15

scientists, and there's also a lot of

00:48:17

internship positions available so if

00:48:19

you're interested just check out the

00:48:21

the website. And that's pretty much

00:48:25

it from me. Thank you very much

00:48:27

for listening and I hope you learned

00:48:28

something from this the at some maybe

00:48:44

maybe you can directly move to the but

00:48:46

the more panel because there is a need

00:48:48

more to on the I mean I'd are draining

00:48:52

so maybe you can keep your question for

00:48:54

one minute the time that we we sit yeah

00:48:58

yes okay I think okay oh so maybe we

00:50:23

should move directly to question so any

00:50:26

question? Yeah, so just two

00:50:35

quick questions. Is it straightforward to

00:50:39

create a siamese or triplet

00:50:42

network using TensorFlow,

00:50:45

as easy as was shown in the example? And

00:50:49

another question: can I create an

00:50:52

operation at the Python

00:50:54

level? So one logistical thing: could you please

00:50:58

stand up when you answer the question

00:51:00

because I don't know oh okay no I know

00:51:04

in which direction to to look at so the

00:51:07

first question: is it easy to create

00:51:09

a siamese network or triplet

00:51:11

network, I mean is it as simple as what you

00:51:13

showed in your examples? Well, it depends;

00:51:16

basically yes, the short answer is

00:51:19

yes. And what was the second

00:51:22

question? Can I create operations at the

00:51:24

Python level? You showed some examples in C

00:51:27

plus plus, and you do the bindings

00:51:28

afterwards. The operations you

00:51:31

have to register in C plus plus, so

00:51:32

there is some C plus plus part

00:51:34

you definitely have to write. If you

00:51:35

want the CPU implementation, the kernel

00:51:38

has to be okay hi. Um can you give us

00:51:54

some insights on how you usually scale

00:51:57

the inference with TensorFlow? So

00:52:01

the very like mean I'm I'm not asking

00:52:03

for like a getting me that or not sure

00:52:06

but I'm just asking what technology to

00:52:10

use in order to scale the serving

00:52:13

of TensorFlow. I know there's TensorFlow

00:52:15

Serving itself, but do you combine it,

00:52:19

what do you combine it with in order to

00:52:20

achieve a highly scalable system?

00:52:23

Again, I'm not gonna discuss Google internal

00:52:26

information. If you want to see how to

00:52:28

serve TensorFlow, TensorFlow

00:52:30

Serving is where to look. Yeah, the

00:52:37

TensorFlow Serving out of the box is

00:52:39

pretty sequential so I mean if you

00:52:41

tried like with that I mean if one

00:52:44

tries it with that tutorial based

00:52:46

version spit it sequentially you can

00:52:49

submit one query at the time I think my

00:52:52

question is that probably do something

00:52:54

else in order to to make it scale for

00:52:56

for work and I'm not gonna discuss it

00:52:59

okay, just look at what TensorFlow Serving

00:53:01

supports and then that's the open

00:53:03

source version yeah that concerns one

00:53:08

of your slides about this cloud

00:53:12

service for machine learning so if I

00:53:15

put data there what actually happens to

00:53:17

the data do actually guaranteed Reese

00:53:19

you know the the service it's going to

00:53:22

be sent to respect for example pry

00:53:24

possible there's okay so I actually the

00:53:38

question so ye if I'm correct you are

00:53:42

running benchmarking me sing so there

00:53:45

were some speed differences between the

00:53:47

values framework some point. I don't

00:53:49

know where we also moment and if we all

00:53:51

Stevie the LC differences or the you to

00:53:54

to something which is good you don't if

00:53:56

I'd or design tracy's or something

00:53:58

which is going to implement both I or

00:54:00

what's the situation? So when TensorFlow

00:54:06

started there was a speed

00:54:09

difference, but now actually TensorFlow

00:54:12

is doing equally well as the other

00:54:16

frameworks, except for Neon, which

00:54:18

is a bit faster than the rest. But the

00:54:23

problem with my benchmarks is that they

00:54:25

only cover convnets. So it only covers

00:54:30

the use cases that people have for

00:54:32

convnets. So what we want to know, though,

00:54:36

is that the framework supports a

00:54:38

recurrent nets, convnets, like

00:54:40

different like speech models and so on

00:54:43

us so I am working but to kinds of flow

00:54:48

case the TN and gender guys everyone in

00:54:51

the community to release a new set of

00:54:54

benchmarks that will capture the use

00:54:57

cases more uniformly of all the

00:54:59

researchers out there. And we will know

00:55:03

so and how all the frameworks perform

00:55:06

and that is the general skills as well

00:55:10

but just for convnets, actually

00:55:12

TensorFlow is doing as well as other

00:55:15

frameworks, except for Neon. So at

00:55:22

the moment putting aside the benchmarks

00:55:24

a little or you you will not aware of

00:55:27

major differences in the design that

00:55:29

would that we know we backed

00:55:32

performance in some way or another. So

00:55:35

from my perspective the only also and

00:55:37

of skating up on something kind of like

00:55:39

the cool so they listen to it all ties

00:55:42

with you strongly comes off to as being

00:55:44

able to take advantage of really

00:55:47

village crystals or out of CGPS is it's

00:55:50

a case will talk so what is it you yes

00:55:53

I think in terms of the design so

00:56:00

there's basically two philosophies at

00:56:02

this moment in all frameworks: one is

00:56:05

the whole write your neural network as

00:56:09

a computational graph. And then give it

00:56:12

to an execution engine that will just

00:56:15

treated appropriately and so on. And

00:56:18

there is another philosophy which is

00:56:21

that almost all of the distributed

00:56:24

computation that you do with neural

00:56:26

networks in general can be

00:56:28

expressed in terms of

00:56:32

computations that the high performance

00:56:35

computing groups have been doing for

00:56:37

many years which is called the MPI

00:56:40

class of collectives. So Torch takes,

00:56:46

at this moment at least, like

00:56:50

the distributed package for

00:56:52

example, takes the approach where

00:56:55

you just use MPI, which does the

00:56:58

distribution for you, and it's abstracted

00:57:00

away, and there is no separate execution

00:57:04

engine and so on. And TensorFlow and

00:57:07

Caffe, Caffe2, Chainer, these frameworks

00:57:13

take the other approach, which is that

00:57:15

you have an execution engine and the

00:57:17

computation graph. And you would want to

00:57:20

do it in a more generalised manner.

00:57:24

I want to ask you even notice little

00:57:28

does in to be a very good distribution

00:57:31

framework in that that's really

00:57:34

important nowadays that doesn't mean

00:57:37

that to there's not a lot of focus on

00:57:39

making it great on one machine so

00:57:40

there's a as the he said there's been a

00:57:43

lot of improvement since they need to

00:57:44

launch and there's still lot of work on

00:57:46

this so the in is definitely tool not

00:57:49

only focus on the distributed side; it's

00:57:52

great to have and it's very useful but

00:57:54

the you should be able to have great

00:57:56

performances of thousands of labels on

00:57:58

one machine so we are time you're so

00:58:10

you've done okay sorry I cannot start

00:58:14

it I don't know I it if it's really a

00:58:19

question or more comment or something I

00:58:23

was working on but it was new and it

00:58:24

works for also several decades already

00:58:27

than I do to say I really like this

00:58:29

workshop I want us to say thank you for

00:58:32

organising this and I think was a quite

00:58:35

great success and it's quite hard

00:58:40

receptive audience and I realised also

00:58:42

that they're people from many

00:58:44

communities here some of them using the

00:58:46

library some of them very new to neural

00:58:50

networks. And it's very hard to find

00:58:52

something which is good for everyone.

00:58:55

And also for me there was some more

00:58:58

information which was new for me so it

00:59:00

was quite nice as thing and it's

00:59:04

difficult to decide what's to really

00:59:05

cover because it's called deep learning

00:59:08

methods and tools and how much to speak

00:59:10

about depending how much of a mess

00:59:12

that's how multiple tools we spoke you

00:59:14

mean about messes tools and

00:59:16

technologies the deep learning aspect

00:59:19

in general was a bit covered in

00:59:21

intonational presentation. But that was

00:59:24

only a very claims of the history and

00:59:27

this was in my face up in in a little

00:59:31

bit too few if you wanna cover history

00:59:34

whether it be maybe a bit more seems at

00:59:37

the moment that that was like two

00:59:38

thousand six the various a lot of

00:59:41

people learning and there are the names

00:59:44

just to mention Ivakhnenko,

00:59:48

who is considered as a father of deep

00:59:51

learning, and of course, you know, there's

00:59:54

the well I in there just some other

00:59:56

names would have been nice to be

00:59:57

mentioned as well. And maybe it might

01:00:00

be a suggestion because of putting the

01:00:02

slides online or something to just give

01:00:04

a small page or something of the

01:00:07

history of deep learning as well. Okay

01:00:12

well so thank you for one the for the

01:00:25

conference your workshop the I have you

01:00:29

and requested on the two older yeah I

01:00:32

recording tools and so and do you

01:00:36

expect that there will come more

01:00:40

graphical tools for non-engineers,

01:00:44

so they can be involved more in deep

01:00:47

learning, and not programmers? Just

01:00:50

graphical tools to create your networks

01:00:57

for now I mean more graphical input you

01:01:04

you so have you can but this is

01:01:08

inherently my pieces and apparently a

01:01:10

programming task if you'll have to

01:01:13

debug it at some point you still have

01:01:15

to look at the code, right? So you

01:01:17

still need someone to be familiar with

01:01:20

some some high level language such as

01:01:23

while or I think because it's not about

01:01:25

what happens if you see if you have

01:01:26

this if I understand the question

01:01:28

correctly if you have a graphical

01:01:30

interface it's harder to it at used

01:01:33

maybe I'm too much of a software

01:01:35

engineer. It's harder for me to see how

01:01:38

that would work in my daily work well I

01:01:44

assume you might be referencing to

01:01:46

something like simulating maybe like

01:01:49

you know where you have you can plug in

01:01:52

graphical things and then you run them

01:01:54

and double click on one of the things

01:01:56

and change the code or is that what you

01:01:59

are thinking about I mean more getting

01:02:03

more quick. Um I mean writing code is

01:02:10

kind of go close that's very flexible

01:02:13

learns all about it's a very to do is

01:02:17

where you're working if you know

01:02:20

different things that you want to do it

01:02:22

and you just want to take on the menu

01:02:25

in and select certain options and then

01:02:29

you can have the variability that you

01:02:32

need maybe for certain problems where

01:02:35

you can have a hierarchy of graphical

01:02:38

other choices so and so so from from my

01:02:44

interactions that many people in the

01:02:45

community there are a few projects in

01:02:50

the works in this direction. I think

01:02:52

defining a graphical tool that is very

01:02:56

effective for a new field is a is

01:03:01

somewhat of a a test a few type things

01:03:05

and converse to something that works

01:03:07

for most people and and there is a

01:03:10

project that is being put in the torch

01:03:14

community to link together neural

01:03:18

networks graphically, and it's being

01:03:21

done by a UI PhD student who's also

01:03:26

interested in neural networks. And like

01:03:27

that, there's something

01:03:29

built on top of Caffe that is similar,

01:03:32

by NVIDIA, actually called NVIDIA

01:03:34

DIGITS, where you can just, with a

01:03:37

few drop-down menus, train a neural network.

01:03:41

I think getting the power of defining

01:03:46

your most complex networks in a

01:03:48

graphical way is somewhat hard, from at

01:03:52

least our perspective as programmers

01:03:54

because the number of choices you can

01:03:56

take at each step is so large that it's

01:03:59

very hard to be expressed graphically.

01:04:02

But I think as more UI researchers come

01:04:07

into the loop there will probably eventually

01:04:10

be some solution that would be

01:04:12

effective for most people. No more

01:04:25

questions okay you then maybe we can

01:04:30

stop you on the thanks again for your


Conference Program

59:34

Deep Supervised Learning of Representations
Yoshua Bengio, University of Montreal, Canada
July 4, 2016 · 2:01 p.m.

2368 views

55:38

Hardware & software update from NVIDIA, Enabling Deep Learning
Alison B Lowndes, NVIDIA
July 4, 2016 · 3:20 p.m.

427 views

01:01:02

Day 1 - Questions and Answers
Panel
July 4, 2016 · 4:16 p.m.

330 views

55:14

Torch 1
Soumith Chintala, Facebook
July 5, 2016 · 10:02 a.m.

815 views

55:57

Torch 2
Soumith Chintala, Facebook
July 5, 2016 · 11:21 a.m.

342 views

01:08:04

Deep Generative Models
Yoshua Bengio, University of Montreal, Canada
July 5, 2016 · 1:59 p.m.

2156 views

49:29

Torch 3
Soumith Chintala, Facebook
July 5, 2016 · 3:28 p.m.

275 views

52:43

Day 2 - Questions and Answers
Panel
July 5, 2016 · 4:21 p.m.

151 views

45:40

TensorFlow 1
Mihaela Rosca, Google
July 6, 2016 · 10 a.m.

2659 views

52:33

TensorFlow 2
Mihaela Rosca, Google
July 6, 2016 · 11:19 a.m.

1704 views

01:05:51

AMD's Open Compute and Open Source cross platform solutions for Machine Learning
Mauricio Breternitz, AMD
July 6, 2016 · 1:59 p.m.

1406 views

01:04:41

TensorFlow 3 and Day 3 Questions and Answers session
Mihaela Rosca, Google
July 6, 2016 · 3:21 p.m.

2250 views

