Transcriptions

Note: this content has been automatically generated.
00:00:00
Uh yeah yeah oh Yeah yeah yeah oh so we
00:01:20
are starting the last socialise this
00:01:22
through trouble us okay by me hide a
00:01:26
about on software again on that was
00:01:28
about we more ways you only had a user
00:01:32
speakers have to be the only so yeah
00:01:36
Okay, so I guess I'll start the third
00:01:53
talk. Just as usual, feel free to
00:01:57
interrupt me if there are any
00:01:58
questions; we will have time for
00:02:00
questions at the end of this talk and
00:02:01
also for the panel. And please shout out if
00:02:04
there are technical problems, if you can't
00:02:06
hear me or can't see something on the
00:02:08
slide. So let's start talking about
00:02:12
neural networks in TensorFlow, because
00:02:14
the example that we have had so far was
00:02:16
very good as
00:02:18
a pedagogical example, but you wouldn't
00:02:21
really use that in practice, and I'm
00:02:22
gonna show you some examples of
00:02:25
what you can actually use at home
00:02:27
or at work with TensorFlow.
00:02:31
So if you're here you probably know
00:02:35
what neural networks are, and especially
00:02:37
deep neural networks, but they usually
00:02:39
look like this. The idea is this
00:02:42
hierarchical structure: you always have this
00:02:44
input layer in which you feed the input
00:02:47
in this case the picture of I hope I
00:02:49
have that right. It's the bulk okay so
00:02:51
obviously I'm as good as the network,
00:02:53
perfect. So now once you have the input
00:02:59
you always propagate information
00:03:00
forward, and then during learning the
00:03:02
gradients flow backward. So this is kind
00:03:04
of the thirty-second introduction
00:03:08
to neural networks, and now we're here
00:03:12
to discuss how neural networks can be
00:03:15
used in TensorFlow, and you have three
00:03:17
options depending on how flexible
00:03:20
you want to be and how much work you
00:03:22
want to put into this. So you might be
00:03:24
able to use already trained models. This
00:03:27
is very easy; for this you don't
00:03:28
actually have to spend that much
00:03:30
computational power. Or you might be able
00:03:33
to retrain one layer of an already
00:03:36
trained model; we will look at that also
00:03:37
in a bit. Or you can use
00:03:40
high-level APIs built on top of
00:03:42
TensorFlow. Or you can define the
00:03:44
entire network directly in
00:03:46
TensorFlow, as we saw with the linear
00:03:48
regression example. So for each of these
00:03:50
I will go into detail and
00:03:51
actually have a notebook to see
00:03:53
exactly what they entail, so that
00:03:55
you get a feel for what in particular you
00:03:57
would want to do. So if you want to use
00:04:01
already trained models you only have to
00:04:04
do two steps: one, you load the graph
00:04:05
of the trained model, and two, you start
00:04:08
doing inference. So the steps look
00:04:12
exactly like this. It's not that
00:04:14
much code, as you can see. The first part
00:04:16
just loads the model from a path that
00:04:18
is given. So the
00:04:21
only thing that is new here is that we
00:04:23
have import_graph_def. So
00:04:24
before, in the example in the last
00:04:27
lecture, I always defined the graph and
00:04:30
used it in the same program, right.
00:04:33
But in this case I don't define the
00:04:36
graph in this program. I actually just
00:04:38
use it here. So I load
00:04:42
what someone else has defined, and we
00:04:45
are able to easily load it. Now we
00:04:48
want to use this graph, and here again
00:04:50
two things are different for the same
00:04:52
reason: I don't have the code that
00:04:55
defines the graph here, but I have the
00:04:58
graph itself. So as we saw in the last
00:05:01
lecture, when you want to run operations
00:05:04
on the graph with TensorFlow, you have to
00:05:06
initialise a session, so this has not
00:05:07
changed. But now, instead of calling
00:05:10
session.run with the variable that
00:05:13
you have defined in the same program,
00:05:14
you have to ask the graph to give you the
00:05:18
tensor for a particular name.
00:05:20
Tensors or operations are associated
00:05:23
with a particular name that the
00:05:26
programmer can define. And here you can
00:05:28
actually query: I want the
00:05:30
softmax, because this is what you use
00:05:33
to get the probability distribution out
00:05:35
of a supervised model, for example, which
00:05:37
is what we will look at right now. And
00:05:39
then in session.run, again for
00:05:41
the placeholder, this is the same
00:05:44
feeding syntax that you used before, when
00:05:46
you have to tell the network what to
00:05:48
run through the graph, right.
00:05:50
We have to tell TensorFlow
00:05:52
what it should run through the graph, and
00:05:54
here again we can't say this is the
00:05:58
placeholder that we
00:05:59
had before, because that's not defined in
00:06:01
this program, but we can give the name of
00:06:03
the placeholder and just call
00:06:06
session.run as before, and this
00:06:08
does exactly what you would expect. So
00:06:10
this is very, very short and works
00:06:14
quite well. I actually have a short
00:06:17
example for this. So
00:06:23
this example works with
00:06:26
ImageNet. So this is a very big network that
00:06:29
is trained to recognise object
00:06:32
categories from images. And with TensorFlow
00:06:35
there's actually a very small
00:06:36
script that downloads the model for you.
00:06:39
It also does inference, and we're gonna look
00:06:41
at some examples so that you also kind of
00:06:43
get an intuition of how well it works and
00:06:45
how easy it is to use. So let me start
00:06:50
running it. This image,
00:06:52
this is the stock image that
00:06:54
comes by default when you want to run
00:06:56
the example. I'm gonna try that one
00:06:57
first. So right now I'm running the
00:07:02
code, and it is actually downloading the
00:07:03
model and then it's running the
00:07:05
inference. And we will see what the
00:07:08
results are. So as we expect,
00:07:12
the returned labels are giant panda,
00:07:15
panda, panda bear, and
00:07:17
so on. Looks right; I'm not a zoologist,
00:07:20
but this looks like a panda to me.
00:07:23
Now if we move to another image,
00:07:26
this is actually not part of the dataset;
00:07:29
the picture above probably is part of
00:07:31
the test set of ImageNet. This is a
00:07:32
picture that I personally took. And
00:07:35
I'm gonna show you exactly how to do
00:07:37
the same thing for this picture. In this
00:07:39
case, for the TensorFlow example, we
00:07:42
just have to specify the image file as
00:07:46
an argument and I'm running it again.
00:07:48
So the output of
00:07:57
the classification is tabby
00:08:00
cat. And if you're like me, before
00:08:03
running this you didn't know what that
00:08:04
is, but okay, so
00:08:11
this is actually a tabby cat because of
00:08:13
the stripes so this is correct. But you
00:08:16
might actually want or what what I
00:08:17
believe if I want to clarify that a
00:08:19
little bit. So the this also works I'm
00:08:21
now also gonna show you another example,
00:08:22
again my own picture. So it's
00:08:24
not that I actually really like
00:08:25
the Golden Gate Bridge, but this is
00:08:27
something that the network has
00:08:29
not seen before, right; it's not
00:08:30
part of any dataset as of yet.
00:08:33
So I'm just gonna start running this,
00:08:37
and this is an example where the
00:08:39
network makes a mistake, but a sort
00:08:42
of reasonable mistake, I would claim. So
00:08:48
right now it's running the inference
00:08:53
and it's loading the graph exactly like
00:08:55
in the code that I showed before. And it
00:08:57
thinks this is a pier. It's not
00:08:59
actually a pier. So the second
00:09:01
suggestion is a suspension bridge; that
00:09:03
is correct, but the first suggestion is
00:09:06
not really right. So as you can see, it
00:09:07
kind of gets the gist of it, but it's
00:09:09
not really correct. Now how hard is it
00:09:14
to do this? You can
00:09:16
clearly see that you can use this
00:09:19
right away; we're doing it live, it takes
00:09:20
very little time, I didn't have to train a
00:09:22
model. But how hard is it to actually
00:09:24
write this Python classify_image
00:09:27
script? Obviously this is part of TensorFlow,
00:09:29
which is open source, so you can have a look
00:09:30
at it. This file is two hundred
00:09:33
lines, which contain downloading the
00:09:36
model, unpacking the model,
00:09:40
doing inference, and some nice
00:09:44
transformation from the probabilities
00:09:46
that the model gives you to the English
00:09:47
labels, because the model does not give
00:09:49
you the strings, right; the model gives
00:09:50
you some tensor back. And also some
00:09:54
boilerplate for imports and comments, and
00:09:56
the entire thing is two hundred lines.
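
[Editor's sketch] The step described above, turning the probabilities the model returns into English labels, can be illustrated in a few lines of plain Python. This is not the code of the script shown in the talk; the function name `top_k_labels`, the four labels, and the scores are all made up for illustration.

```python
def top_k_labels(probabilities, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    # Sort class indices by descending probability, then keep the first k.
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    return [(labels[i], probabilities[i]) for i in ranked[:k]]


# Hypothetical model output for a four-class toy problem (not real ImageNet scores).
labels = ["giant panda", "tabby cat", "pier", "suspension bridge"]
probabilities = [0.05, 0.80, 0.05, 0.10]

for label, p in top_k_labels(probabilities, labels, k=2):
    print(f"{label} (score = {p:.2f})")
```

The model only ever sees class indices; the mapping from index to human-readable string is ordinary bookkeeping kept outside the graph.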
00:09:59
So it's very, very easy to do this
00:10:01
with the already trained models.
00:10:06
Right, so this is what you
00:10:20
want to do in the case where the
00:10:22
labels that you want are exactly the
00:10:24
labels that come with ImageNet, right, if you
00:10:26
want to classify some of the labels
00:10:29
that are already there, or if you don't
00:10:31
have your own dataset. But let's say, for instance,
00:10:33
that you want to classify
00:10:35
flowers, or you're passionate about
00:10:38
finding out if in a picture there's a dog or
00:10:40
a cat. So in that case you might not
00:10:42
want to just use the model as is; you
00:10:46
might want to tweak it a little bit. But
00:10:47
the good news is that you
00:10:49
actually don't have to retrain the
00:10:51
entire model or train a whole one
00:10:53
only with your data. So you can
00:10:56
actually take advantage of the features
00:10:59
that the big model learned using the
00:11:02
ImageNet dataset and just tweak
00:11:05
the last layer, so just the
00:11:08
classification layer of the model,
00:11:11
so that you learn the classes that
00:11:12
you want to see. And this is
00:11:16
also very easy, and I hope it's very
00:11:18
clear why you would want to
00:11:25
do this: because it saves you a lot of
00:11:27
computational power. So again, you are
00:11:28
not doing all the computationally
00:11:30
intensive task of learning that this is
00:11:33
kind of the distribution of natural
00:11:34
images, this is what I should expect to
00:11:37
see as input; you are just tweaking the
00:11:39
last part that does the classification.
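
[Editor's sketch] The idea of keeping the learned features fixed and retraining only the classification layer can be shown on a toy problem. This is a minimal sketch, not the retraining tool mentioned in the talk: `frozen_features` is a hypothetical stand-in for the pretrained network's penultimate layer, and the data, learning rate, and epoch count are arbitrary assumptions.

```python
import math


def frozen_features(x):
    """Hypothetical stand-in for the pretrained layers; never updated."""
    return [x[0], x[1], x[0] * x[1], 1.0]  # last entry acts as a bias feature


def softmax(logits):
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_last_layer(inputs, targets, n_classes, epochs=300, lr=0.1):
    """Fit only the final softmax layer on top of the frozen features."""
    feats = [frozen_features(x) for x in inputs]  # computed once, then reused
    n_feats = len(feats[0])
    weights = [[0.0] * n_feats for _ in range(n_classes)]
    for _ in range(epochs):
        for f, t in zip(feats, targets):
            probs = softmax([sum(w * v for w, v in zip(row, f)) for row in weights])
            for c in range(n_classes):
                # Gradient of the cross-entropy loss with respect to logit c.
                grad = probs[c] - (1.0 if c == t else 0.0)
                for j in range(n_feats):
                    weights[c][j] -= lr * grad * f[j]
    return weights


def predict(weights, x):
    f = frozen_features(x)
    scores = [sum(w * v for w, v in zip(row, f)) for row in weights]
    return scores.index(max(scores))


# Toy two-class data: class 0 near the origin, class 1 near (1, 1).
inputs = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1), (0.9, 1.0), (1.0, 0.8), (0.8, 0.9)]
targets = [0, 0, 0, 1, 1, 1]
weights = train_last_layer(inputs, targets, n_classes=2)
```

Because the features are computed once and only one small weight matrix is updated, the cost is a tiny fraction of training the whole network, which is the saving described above.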
00:11:41
And there's also an example of how to
00:11:44
do this, but basically it's one command
00:11:46
line with TensorFlow. So if you have an
00:11:49
image classification problem, you
00:11:51
can just use your data and retrain
00:11:55
the last layer of an Inception model,
00:11:57
and do it in one line; that's
00:12:00
pretty cool. And something I want to
00:12:03
mention here: there's a lot of
00:12:04
documentation for all of these, so in
00:12:06
case you decide on one of these three
00:12:09
methods that I'm explaining now, there's
00:12:10
plenty of resources online for you to
00:12:13
look at. Now let's say that, for the
00:12:18
second use case, you've looked at this
00:12:20
but you think that using an
00:12:22
already trained network is not really
00:12:23
for you. You have a very specific use
00:12:25
case so you want to try something else.
00:12:28
And you say, I want to build my own
00:12:30
network in TensorFlow, but I want to
00:12:33
build it very fast and not spend a lot of
00:12:35
time on it, or I don't really want to
00:12:37
tweak the accuracy to the
00:12:40
zero point zero one level. Well, then you
00:12:42
can actually use TF Learn, which is a
00:12:45
high-level API for TensorFlow. It is
00:12:47
part of TensorFlow, so when you
00:12:49
install TensorFlow you also get TF Learn.
00:12:51
And it implements the scikit-learn
00:12:53
API. For those of you familiar with
00:12:55
Python, you might know this is a very
00:12:57
popular Python API for machine
00:13:00
learning that implements a lot of
00:13:01
models, including SVMs and random
00:13:04
forests. And this TF Learn
00:13:07
API allows you to
00:13:11
create this computational graph that I was
00:13:13
talking about before, for neural
00:13:15
networks, for feed-forward
00:13:18
networks, convolutional or recurrent neural networks,
00:13:20
in one line. So this is
00:13:23
pretty cool; it's very easy to use. So it
00:13:25
looks exactly like this. This
00:13:27
classifier holds the computational
00:13:29
graph that we saw before. So now I'm
00:13:34
actually gonna show you an example of
00:13:35
this too, live, an example that looks at
00:13:40
digits; there's no machine learning
00:13:42
tutorial without MNIST,
00:13:44
right. So I'm gonna start running the
00:13:48
code, and I'm gonna explain to you
00:13:49
one bit at a time what I'm
00:13:51
doing. So the first thing, as usual, the
00:14:01
thing we do is all the necessary
00:14:02
imports. And here I'm actually getting
00:14:05
the data and doing some preprocessing,
00:14:08
because I wanted to make a
00:14:10
point of actually using the scikit-learn
00:14:12
API to get the MNIST data
00:14:14
here, because we're operating under the
00:14:17
assumption that we like the scikit-learn
00:14:19
API and want to use that. And
00:14:21
I'm downloading the MNIST
00:14:23
original data, and this one comes with
00:14:26
pixel values from zero to two hundred
00:14:29
fifty-five, which is not really great for
00:14:31
neural networks; it's better if they're
00:14:33
scaled. So in this case I'm dividing each
00:14:36
pixel by two hundred fifty-five so that they're
00:14:38
between zero and one, and then I'm
00:14:41
splitting the data into training and
00:14:43
testing. So this is pretty
00:14:45
straightforward. In case someone here is
00:14:48
not familiar with this, this is how it
00:14:51
looks: twenty-eight by twenty-eight
00:14:53
images. I can also run this a few times to
00:14:56
see multiple examples; these are just
00:14:59
random images from the training set.
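
[Editor's sketch] The preprocessing described above (divide each pixel by 255 so values lie between 0 and 1, then split into training and test sets) looks roughly like this in plain Python. The talk uses the scikit-learn API for this; the helper names, the 80/20 split, and the tiny four-pixel "images" below are assumptions made only so the sketch runs without any downloads.

```python
import random


def scale_pixels(images, max_value=255.0):
    """Map integer pixel intensities in [0, 255] to floats in [0, 1]."""
    return [[pixel / max_value for pixel in image] for image in images]


def train_test_split(examples, labels, test_fraction=0.2, seed=0):
    """Shuffle the indices once, then cut them into train and test parts."""
    indices = list(range(len(examples)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(rows, idx):
        return [rows[i] for i in idx]

    return (pick(examples, train_idx), pick(examples, test_idx),
            pick(labels, train_idx), pick(labels, test_idx))


# Tiny stand-in for MNIST: ten flattened "images" of four pixels each.
raw_images = [[0, 64, 128, 255] for _ in range(10)]
raw_labels = list(range(10))

images = scale_pixels(raw_images)
x_train, x_test, y_train, y_test = train_test_split(images, raw_labels)
```

Shuffling indices rather than the data itself keeps each image aligned with its label across the split.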
00:15:01
And this is the code that does the
00:15:05
creation of the graph. So this is four
00:15:08
lines, and this is because I wanted to
00:15:10
be trendy and not use the default
00:15:12
optimiser, so you can actually remove
00:15:14
three lines here. So you can do it in
00:15:16
one line if you want. And I'm just
00:15:19
gonna go ahead and create this model
00:15:22
and then I'm gonna train the model. It's
00:15:25
gonna take around one or two minutes, in
00:15:28
which you have time to ask me questions.
00:15:31
So now I'm just training the model.
00:15:34
This is how much it takes, just for
00:15:36
reference to do everything that you
00:15:39
need including downloading the data
00:15:40
processing it visualising a little bit.
00:15:43
And so the difficult part
00:15:46
is just four lines, and now the model
00:15:50
is training. And while it's doing that
00:15:56
you have the option to ask a couple of
00:15:58
questions. Not for this example; I have it
00:16:04
for another example coming along.
00:16:15
Sorry, so you don't
00:16:19
need to handle sessions anymore when
00:16:21
you use this API? No, it does that
00:16:23
for you. It does that under the hood.
00:16:25
So this first part creates
00:16:28
the classifier, and the second
00:16:31
part deals with creating the execution
00:16:35
environment. And what about the GPU,
00:16:38
if I want to run this on a GPU? I
00:16:41
think there are options to specify this,
00:16:43
yes. I mean, no one does
00:16:48
things without being able to run on
00:16:51
the GPU anymore. So it can learn how
00:16:58
to recognise the digits faster.
00:17:00
Yeah, anyway, it's gonna be
00:17:05
done very soon, and then I'm
00:17:06
actually gonna show you a bit of what
00:17:08
kind of mistakes it makes, because I
00:17:10
think that that's pretty interesting.
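
[Editor's sketch] Looking at a classifier's mistakes comes down to comparing the vector of predicted labels with the vector of true labels. The snippet below is a minimal illustration in plain Python, not the notebook code from the talk; the two label lists are invented.

```python
def accuracy(predicted, actual):
    """Fraction of positions where the prediction matches the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def first_mistake(predicted, actual):
    """Index of the first disagreement, or None if every prediction is right."""
    for i, (p, a) in enumerate(zip(predicted, actual)):
        if p != a:
            return i
    return None


# Hypothetical outputs: the classifier reads the digit at index 3 (a 2) as a 1.
actual = [7, 2, 1, 2, 4]
predicted = [7, 2, 1, 1, 4]

print(accuracy(predicted, actual))       # share of correct predictions
print(first_mistake(predicted, actual))  # position of the first wrong one
```

The index returned by `first_mistake` is what you would then use to pull out and plot the offending image.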
00:17:13
So any other questions? So
00:17:16
maybe one: what is the model?
00:17:18
So this is a completely fair
00:17:21
point. This is just a feed-forward
00:17:23
classifier with two layers and a hundred
00:17:26
hidden units each.
00:17:29
It is not a convolutional one, but you
00:17:30
can also equally easily define a
00:17:32
convnet. But this is just a feed-forward
00:17:34
neural network with
00:17:36
ReLU activation. And it's done. And
00:17:41
now I'm gonna run it and see how well it
00:17:43
does. So I had the result ready because
00:17:47
I ran it before, but the point
00:17:51
is that it's pretty good. It's actually
00:17:53
making only two mistakes in a hundred
00:17:55
examples, in about the time it takes to ask a couple
00:17:58
of questions. So it's very easy to
00:18:01
use and gives relatively good results.
00:18:04
Obviously this is not state of the art
00:18:06
or anything, but I just wanted to give
00:18:09
a clear example. And now let's look at
00:18:12
how well it does on a random image in the
00:18:16
test set. So I'm just gonna do it from
00:18:19
here, so that you also get an idea of
00:18:25
how easy it is later to look at
00:18:27
these results and visualise things,
00:18:30
because, again, as it's
00:18:33
Python, you can use all of your
00:18:35
favourite plotting libraries. So let's
00:18:40
get a random test example and plot it.
00:18:46
So this one is
00:19:04
correct: it says it's a one. And I
00:19:08
mean, you should expect this: if it's
00:19:09
wrong two out of a hundred times, when I
00:19:11
pick one at random it's probably gonna
00:19:13
be correct, right. So here we see an
00:19:15
example of a correct classification. But
00:19:17
how about if I want to see one where it
00:19:19
does something wrong, to see: does it
00:19:21
make mistakes on very trivial
00:19:23
examples, or does it make mistakes
00:19:25
on trickier examples? I'm gonna
00:19:28
do this in a not so efficient way, but I'm
00:19:31
just gonna check when these two vectors
00:19:35
are equal, and I'm gonna get the
00:19:39
index of the first one where they're
00:19:41
not: index one one four four
00:19:48
four. So this
00:19:54
is the index of a mistake, right. So here
00:19:56
it looks at when the classifier does
00:20:00
not agree with what we know to be the
00:20:03
actual values. And I'm also gonna plot
00:20:07
that, as before, with the predicted label. So now
00:20:25
we know there should be a mistake. Yeah,
00:20:27
so probably this guy should be a two
00:20:32
and the network says it's a one. So
00:20:35
this is a mistake, but it doesn't
00:20:39
really look like a two to me. But in any case,
00:20:42
this is how you can get an
00:20:44
intuition of when the model does well
00:20:47
and when the model goes wrong, and
00:20:48
you can do this again in four lines of
00:20:51
code, in fact. Now suppose that with
00:21:08
this I still didn't convince you. Suppose
00:21:10
you are really interested in starting
00:21:11
things from scratch, or perhaps you're a
00:21:13
researcher who wants to come up with
00:21:15
new ideas and new models. Then you
00:21:17
actually maybe want to go in TensorFlow
00:21:19
to the lowest level
00:21:23
possible, or to mix some of the lower
00:21:26
level variants with some of the higher
00:21:28
level ones. So TensorFlow has a lot of
00:21:31
support for existing activation
00:21:33
functions: you probably know ReLU,
00:21:35
sigmoid, tanh; someone asked before
00:21:36
about cost functions; different
00:21:39
normalisation techniques, and
00:21:41
embeddings. So all these things that are
00:21:43
now widely used, they are already there.
00:21:45
And if you want to do this it's more
00:21:50
complex, so it's not gonna be three
00:21:52
lines of code, but it gives you a lot
00:21:53
more flexibility. So it depends again on
00:21:55
the spectrum: you always have to think
00:21:57
where you are on the spectrum. How much
00:21:58
time do I want to spend on this? How
00:22:00
much do I care about improving the
00:22:02
model? Do I want to potentially
00:22:04
define a new layer type? And this is
00:22:07
ideal for researchers. Now I
00:22:11
actually have an example to go through
00:22:13
this too, for completeness, and it's,
00:22:18
surprise, also MNIST digits. So I'm gonna
00:22:24
clear the output, you know, so that you
00:22:26
actually trust me that I'm running
00:22:29
everything live. Okay, so everything
00:22:32
disappears, there's no output. Great. So
00:22:36
same idea: I'm gonna get the dataset,
00:22:39
this time from TensorFlow,
00:22:42
because I'm no longer using the
00:22:43
scikit-learn API, and the APIs are a
00:22:46
bit different, but TensorFlow also
00:22:47
gives you the MNIST dataset. So you
00:22:50
see it downloaded it, extracted it. Again
00:22:53
we can visualise a couple of examples,
00:22:56
but by now you already kind of get an idea
00:22:58
of how this looks. And now we
00:23:03
get to the part about defining the
00:23:05
graph, right. So before, defining the
00:23:07
graph was a couple of lines of code in
00:23:09
which we defined the optimiser. And one
00:23:12
line for actually defining the feed-forward
00:23:14
network classifier. In this case,
00:23:18
this is the graph definition. So we
00:23:21
have to choose how many layers we want
00:23:24
to define. We have to choose the batch
00:23:26
size, the layer sizes, so we're using, as
00:23:29
before, a hundred neurons per layer for
00:23:33
two layers. The input size is given by
00:23:35
the dataset, so twenty-eight
00:23:37
times twenty-eight equals seven hundred
00:23:39
eighty-four, and the number of classes
00:23:42
is ten, because that's how many
00:23:43
labels we want to be able to
00:23:47
classify. And now this shouldn't be a
00:23:50
surprise given the talk that you just
00:23:53
heard before: we have to define two
00:23:54
placeholders. This is a supervised
00:23:56
learning setting so we have the
00:23:58
examples and the labels
00:24:00
associated with them. So in this case x
00:24:02
will stand for the images, so something
00:24:07
like this, and y will stand for the
00:24:10
associated labels, so this is, for example,
00:24:12
in this case the three. So this is the
00:24:15
placeholder definition. And now if you
00:24:18
want to define the graph you actually
00:24:20
have to define the weights and the biases,
00:24:22
very similar to the linear
00:24:23
regression example. So for neural
00:24:27
networks the weights are
00:24:29
matrices and the biases are vectors.
00:24:31
The output is defined as you expect:
00:24:34
there's an activation function which is
00:24:35
applied to a matrix multiplication
00:24:37
between the previous layer values and
00:24:40
the weights, and you add the biases.
00:24:42
So this is just the standard neural
00:24:44
network formula. And because I'm doing
00:24:46
this in a loop, and why am I doing this in a loop?
00:24:48
Because I don't wanna copy
00:24:49
and paste the code twice in case I
00:24:51
change my mind and use three layers, I
00:24:54
have to keep track of what my previous
00:24:56
layer sizes and values are. So
00:24:59
this defines the hidden layers and how
00:25:01
you do the computation from the input
00:25:03
layer to the first hidden layer, and from the
00:25:05
first hidden layer to the second hidden layer.
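
[Editor's sketch] The hidden-layer loop described above applies the standard formula, activation(previous layer values times weights, plus biases), once per layer while tracking the previous layer's size. The plain-Python sketch below mirrors that structure with the sizes from the talk (784 inputs, two hidden layers of 100, 10 classes) and also shows a numerically stable cross-entropy computed directly from the output layer's raw scores (logits), the topic the talk turns to next. It is an illustration only, not the notebook's TensorFlow code; the function names and the random initialisation are assumptions.

```python
import math
import random


def relu(values):
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """inputs @ weights + biases, with weights stored as [n_in][n_out]."""
    return [sum(x * row[j] for x, row in zip(inputs, weights)) + biases[j]
            for j in range(len(biases))]


def init_layer(n_in, n_out, rng):
    """Small random weight matrix and a zero bias vector (assumed init)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out


def forward(x, hidden_layers, output_layer):
    """Apply ReLU(prev @ W + b) per hidden layer, then return raw logits."""
    previous = x
    for weights, biases in hidden_layers:
        previous = relu(dense(previous, weights, biases))
    return dense(previous, *output_layer)  # no activation on the logits


def cross_entropy_from_logits(logits, label):
    """-log softmax(logits)[label], computed with the log-sum-exp trick."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]


rng = random.Random(0)
input_size, layer_sizes, num_classes = 784, [100, 100], 10

# Build the hidden layers in a loop, tracking the previous layer's size.
hidden_layers, previous_size = [], input_size
for size in layer_sizes:
    hidden_layers.append(init_layer(previous_size, size, rng))
    previous_size = size
output_layer = init_layer(previous_size, num_classes, rng)

logits = forward([0.5] * input_size, hidden_layers, output_layer)
loss = cross_entropy_from_logits(logits, label=3)
```

Computing the loss from the logits avoids exponentiating large numbers and taking the log of a probability that has underflowed to zero, which is where NaNs typically come from.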
00:25:07
But how about the output? So here we
00:25:09
want to use the softmax to get the
00:25:11
probability distribution over
00:25:13
the possible digits. So again we
00:25:16
define the weights, also a matrix, the biases,
00:25:19
and the logits. So a very similar matrix
00:25:24
multiplication plus addition,
00:25:26
but here we don't have the ReLU,
00:25:28
because we don't use the
00:25:30
activation function directly. So here
00:25:32
comes the trick: for numerical
00:25:34
stability you don't use softmax and
00:25:37
then cross-entropy. There's a function
00:25:39
that does softmax cross
00:25:41
entropy with logits, that just takes the
00:25:43
logits, to avoid the numerical instability
00:25:46
that comes with softmax. If you
00:25:48
just use softmax, you might have suffered
00:25:50
because of this, as I have:
00:25:53
you might see a lot of NaNs
00:25:55
coming your way. And again you have to
00:25:58
define the optimiser. This is exactly the
00:26:00
same three lines as we saw before,
00:26:03
and this does the same thing, but it does
00:26:07
it explicitly. The code does
00:26:09
exactly the same thing; maybe the
00:26:12
initialisation that they use by
00:26:14
default is a bit different to the one I
00:26:16
have here but the concept is exactly
00:26:18
the same. So now I have actually run
00:26:21
the code which creates this graph. So
00:26:24
again, this just creates the computational
00:26:25
graph. You have to have in mind the
00:26:27
same picture that we saw when we looked
00:26:30
at TensorBoard this morning;
00:26:32
this is the same kind of thing for a neural
00:26:34
network. And now, as before, whenever I want
00:26:39
to start doing some
00:26:41
computations I have to create a
00:26:42
session. That's created. And now again
00:26:47
some waiting time, because I'm gonna go
00:26:48
through some examples and actually
00:26:50
start training. So the training is
00:26:55
happening right now, and someone asked
00:26:57
about TensorBoard, so I have here
00:27:00
the code that would just create the
00:27:03
summary writer. I did not add any
00:27:05
specific metrics that we want to
00:27:09
see during training, so we will not see
00:27:11
the accuracy going up or down, but
00:27:13
we can actually visualise the graph, so
00:27:15
exactly this graph that we are creating
00:27:16
here. At the end I will show you
00:27:19
how we can look at it. This is
00:27:22
also taking a little bit but I just
00:27:24
wanna show that it is the same idea, the same
00:27:26
principle. So any other questions in the
00:27:28
meantime while this other network is
00:27:30
learning how to classify digits? Is
00:27:44
TensorBoard actually live, or do you
00:27:47
visualise it only after you have
00:27:50
finished training? So it depends how
00:27:52
you want to do it. So here,
00:27:54
TensorBoard just reads
00:27:57
from the logs. So if you launch it at
00:28:01
the beginning, you can look at it as
00:28:03
the training happens, yes. In this case I
00:28:09
didn't do this because I'm
00:28:11
actually not recording any of the
00:28:13
accuracy, as we already saw some of
00:28:16
that, but I just wanted to show also how
00:28:18
easy it is. So this is just
00:28:21
one line here and one line here, and this
00:28:23
is in the prompt; I have the bash
00:28:26
example here. So it's done training,
00:28:28
great. I'm sorry, one question: I saw
00:28:33
that you were using a for loop inside to
00:28:35
build the graph. Is it okay to do this? Because
00:28:38
I know that in general we should not do
00:28:39
this. Well, you
00:28:42
have another function there, so here you
00:28:44
can also do it. Theano has the scan
00:28:46
function, so you can do this or you
00:28:48
can also use
00:28:50
the equivalent of scan in TensorFlow.
00:28:53
Is it the same in terms of
00:28:54
performance? I don't think
00:28:57
this has the same problems that Theano
00:28:58
does. Okay, so now the training is done, and
00:29:07
here I'm also computing the precision
00:29:08
myself, without taking advantage of
00:29:11
the other functions. And it
00:29:14
does pretty well too: it's right
00:29:17
around ninety-eight times out of a
00:29:20
hundred. So in a couple of minutes we
00:29:22
managed to train two different networks
00:29:24
to do digit recognition. And now I'm
00:29:28
gonna start TensorBoard.
00:29:32
Sorry, one stupid question: you're training
00:29:34
on your laptop? Yes, yes, so the training
00:29:36
is done on my laptop, on the CPU. I'm
00:29:38
minor what's if you with this you of my
00:29:43
son okay that that that this is
00:29:55
actually a very important point: this
00:29:56
did not run on the GPU, so if you
00:30:00
run on the GPU it's gonna be faster. So,
00:30:02
because I didn't add any scalar
00:30:04
data, as we already saw
00:30:06
such an example, here you would
00:30:09
see some helpful messages in case
00:30:10
you're stuck when you're setting up
00:30:12
your scalar values. But we can see the
00:30:16
graph. So this is how the graph looks
00:30:18
for a network and I'm just gonna zoom
00:30:20
in. Okay, so you can see exactly what
00:30:36
you have: you have a ReLU, you
00:30:38
have the gradient associated, you always
00:30:40
have the matrix multiplication, you also
00:30:42
see your placeholder variables. So you
00:30:45
get a very good idea of what you
00:30:48
actually created when you defined your
00:30:50
graph. This is very useful, especially in
00:30:52
the case where you're defining the
00:30:54
graph yourself rather than using a
00:30:55
higher-level API. So that's enough now
00:31:12
about feed-forward networks. So we saw kind of
00:31:15
the ideas and the principles
00:31:18
that you can use to figure out what's
00:31:20
good for you. Obviously this
00:31:22
doesn't only apply to MNIST digits, and
00:31:25
this doesn't only apply to feed-forward
00:31:26
networks; it's just the general idea: where
00:31:28
do you fit into this landscape, how far
00:31:30
do you want to go. But you also
00:31:34
might care about the state of the art
00:31:36
model support in TensorFlow, so I'm
00:31:38
gonna talk a little bit about that now.
00:31:40
So you might be aware of the sequence to
00:31:46
sequence models, which are very much used
00:31:49
for translation these days. There was
00:31:51
a paper by Oriol Vinyals in two
00:31:54
thousand fifteen, two thousand fourteen,
00:31:56
and two thousand fifteen from Yoshua
00:31:59
Bengio's lab, that presented
00:32:00
results on how to use this kind of
00:32:02
models for translation. And the idea is
00:32:04
pretty simple you have an encoder that
00:32:06
is usually for translation on LSDM or
00:32:10
another are in that encodes the input
00:32:12
sentence and you have the decoder that
00:32:15
decodes the sentence into the target
00:32:18
language so in this case we're
00:32:19
translating from french to English
00:32:21
which might be helpful helpful if
00:32:22
you're here today and so this is again
00:32:27
a very useful in relatively new model
00:32:30
and I want to show how much it takes to
00:32:32
implement that in principle and
00:32:33
actually one line. So this is the basic
00:32:36
one: you just have to say I want a graph
00:32:39
for a basic RNN sequence-to-
00:32:42
sequence model, and these are the inputs
00:32:45
that you feed to the encoder,
00:32:46
these are the inputs that you feed to
00:32:47
the decoder, and I want to use this kind
00:32:50
of cell. So the cell is the kind of
00:32:53
RNN cell that you want to use.
00:32:55
And this is it, this is all that you
00:32:57
need to do to get this kind of model.
00:33:01
And this model also has a lot of other
00:33:05
variants. For example, you might want
00:33:06
to use the one with embeddings in the
00:33:09
encoder, or you might want to use
00:33:10
attention, and that's also only one
00:33:13
line. So it's pretty straightforward.
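To make the encoder/decoder idea described above concrete, here is a minimal sketch in plain Python. It is not the one-line TensorFlow call from the slide and does not use TensorFlow at all: `make_cell`, `cell_step` and `toy_seq2seq` are hypothetical names, the cell is a simple tanh RNN rather than an LSTM, and the weights are random and untrained. It only shows how the encoder's final state is handed to the decoder.

```python
import math
import random


def make_cell(input_size, state_size, rng):
    """Create random weights for a plain tanh RNN cell (hypothetical toy cell)."""
    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {"W_x": rand_matrix(state_size, input_size),
            "W_h": rand_matrix(state_size, state_size)}


def cell_step(cell, x, h):
    """One recurrent step: new_h = tanh(W_x * x + W_h * h)."""
    new_h = []
    for w_x_row, w_h_row in zip(cell["W_x"], cell["W_h"]):
        total = sum(w * xi for w, xi in zip(w_x_row, x))
        total += sum(w * hi for w, hi in zip(w_h_row, h))
        new_h.append(math.tanh(total))
    return new_h


def toy_seq2seq(encoder_inputs, decoder_inputs, enc_cell, dec_cell, state_size):
    """Encode the source sequence, then decode starting from the encoder state."""
    state = [0.0] * state_size
    for x in encoder_inputs:          # the encoder reads the input sentence
        state = cell_step(enc_cell, x, state)
    outputs = []
    for x in decoder_inputs:          # the decoder starts from the encoder's summary
        state = cell_step(dec_cell, x, state)
        outputs.append(state)
    return outputs, state


rng = random.Random(0)
enc_cell = make_cell(3, 4, rng)
dec_cell = make_cell(3, 4, rng)
source = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]                          # 2 source tokens
target_inputs = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # 3 decoder steps
outputs, final_state = toy_seq2seq(source, target_inputs, enc_cell, dec_cell, 4)
print(len(outputs), len(outputs[0]))
```

In a real translation model the decoder outputs would be projected onto the target vocabulary and the whole thing trained end to end; the sketch leaves that out on purpose.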
00:33:17
And for RNN cells: they kind
00:33:20
of look like this, but they also can be
00:33:21
easily created using one line, so
00:33:24
you can say I want the basic LSTM cell
00:33:26
of this size and feed that into the
00:33:29
sequence-to-sequence model. So that was a
00:33:33
bit about recurrent neural models. But
00:33:37
how about the Inception architectures?
00:33:40
They are also implemented and
00:33:45
also available, so you can actually get
00:33:47
the trained model, as we saw before, but you
00:33:49
can also look at the code. Recently we
00:33:52
open-sourced the code for dependency
00:33:55
parsing also, which is the task where you
00:33:58
want to find out, in the language, what's
00:34:00
the action, how certain things
00:34:03
correspond to each
00:34:10
other, what is the object, what is the
00:34:11
preposition. So you can actually do
00:34:14
that quite easily: you can do that with
00:34:16
your data, or there's already a trained
00:34:18
model for English available, it's called
00:34:19
Parsey McParseface. So you can use
00:34:22
that if you want to look at English, but
00:34:25
also the code is available, so
00:34:27
you can just start playing with it and
00:34:29
understand more about language. And these
00:34:32
were just a couple of examples that are
00:34:34
specifically given by authors of
00:34:37
TensorFlow and Google people, and there are more, so
00:34:40
autoencoders and SyntaxNet are also
00:34:43
present. But I also want to mention a
00:34:45
couple of really great community
00:34:47
examples in the hope that you will also
00:34:49
contribute and see how nice it is again
00:34:52
also coming back to this this point if
00:34:54
you have a community you can really
00:34:56
push the field forward. So these
00:34:59
are a couple of examples and I'm gonna
00:35:02
go through each of them and discuss a
00:35:03
little bit what they do and again these
00:35:06
are all given by contributors who are not
00:35:10
part of the TensorFlow team. So you
00:35:13
might be aware of deep Q-networks:
00:35:16
they're a way to integrate the deep
00:35:18
learning framework into reinforcement
00:35:20
learning. And if you want to use that
00:35:23
with TensorFlow, there is a repository
00:35:25
available. So it's also very easy to
00:35:28
start with that. If you want to play
00:35:31
around with neural art examples,
00:35:34
this has been very popular, also helping
00:35:36
popularising deep learning outside of
00:35:38
the machine learning community, because it's
00:35:41
very intuitive you have a picture and
00:35:44
you have a painting. And you input them
00:35:46
both to the model and the model returns
00:35:48
the picture kind of looking like
00:35:51
that painting. So if you want to do
00:35:54
this in TensorFlow, it's also
00:35:56
definitely possible. Char-RNN, the
00:35:59
idea of feeding into a recurrent neural
00:36:01
network one character at a time, is also
00:36:04
possible to do in TensorFlow.
00:36:07
Also, another community contribution:
00:36:10
you might know Keras, it's a deep
00:36:11
learning high-level library. It used to
00:36:15
work with Theano as the backend; now it also
00:36:16
supports TensorFlow. Neural caption
00:36:21
generation: this is an
00:36:22
implementation of the Show and Tell
00:36:24
paper, if you are familiar with it. So
00:36:27
the basic idea is that you ask the
00:36:31
neural network to describe the picture
00:36:33
that you give. So it's a combination
00:36:34
of a convolutional network, the
00:36:37
encoder is the convolutional network, and
00:36:38
an RNN as the decoder. So you
00:36:41
can also use this with TensorFlow.
00:36:43
English to Chinese translation is also
00:36:47
available in TensorFlow if you're
00:36:49
interested. And this kind of sums
00:36:54
up the examples that I wanted to show. So as
00:36:56
you can see they're very diverse, going
00:36:59
from images to translation to art
00:37:04
examples. So the community is really
00:37:06
interested in bringing these
00:37:07
things to TensorFlow, and you can
00:37:10
also help with that. And I'm gonna talk
00:37:13
a bit about another aspect, also for
00:37:16
potentially more advanced users, but
00:37:18
it might be very important: how
00:37:20
to create your own operation. So TensorFlow
00:37:23
has a lot of things, it's
00:37:25
very flexible but maybe you want to do
00:37:28
something else right. And I'm gonna
00:37:33
show a little bit of what's the way
00:37:37
to do that. But firstly, when should you
00:37:39
create your own operation? So the
00:37:42
first obvious case is when you want to do
00:37:44
something and you can't do this by
00:37:45
composition of already existing
00:37:47
operations. The second use case is: there
00:37:51
are ops that do what you want, so you
00:37:53
can combine them to achieve your
00:37:56
goal, but you want to speed up the
00:37:58
computation. So you might
00:38:00
know a clever way, so that instead of
00:38:03
calling one op after the other, you
00:38:06
speed up the code if you just do it in
00:38:08
one op. Or you might want a more memory-
00:38:11
efficient implementation; that can
00:38:12
also be made better by combining
00:38:16
operations. Or you can have a more
00:38:18
numerically stable implementation, and we
00:38:20
already saw this with the softmax
00:38:22
cross-entropy example, right? So you can
00:38:25
have the softmax op, you can have the
00:38:27
cross-entropy op, you can apply them
00:38:29
sequentially, but for numerical
00:38:31
stability reasons it's better to just
00:38:33
call them in one operation. So these
00:38:37
are the steps to create your own operation.
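The numerical-stability argument for fusing softmax and cross-entropy, mentioned just before this point, can be illustrated with a small plain-Python sketch. This is not TensorFlow's implementation; the function names are hypothetical and the fused version simply applies the standard log-sum-exp trick.

```python
import math


def naive_softmax(logits):
    """Softmax as a stand-alone step; math.exp overflows for large logits."""
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def naive_cross_entropy(probs, label):
    """Cross-entropy as a second step; fails if the probability underflowed to 0."""
    return -math.log(probs[label])


def fused_softmax_cross_entropy(logits, label):
    """Both steps in one function, using the log-sum-exp trick.

    loss = log(sum_j exp(z_j)) - z_label, with the largest logit subtracted
    inside the exponentials so that none of them can overflow.
    """
    shift = max(logits)
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[label]


def try_naive(logits, label):
    """Return the two-step loss, or None if the two-step version breaks down."""
    try:
        return naive_cross_entropy(naive_softmax(logits), label)
    except (OverflowError, ValueError):
        return None


moderate = [2.0, 1.0, 0.1]
extreme = [0.0, -1000.0]

print(try_naive(moderate, 0), fused_softmax_cross_entropy(moderate, 0))
print(try_naive(extreme, 1), fused_softmax_cross_entropy(extreme, 1))
```

For moderate logits both versions agree; for the extreme logits the two-step version fails (the probability underflows to zero before the logarithm is taken), while the fused version still returns a finite loss.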
00:38:40
The steps to create your
00:38:41
own operation in TensorFlow: some of
00:38:43
them are optional; again, it depends on your
00:38:45
use case and what you want to do with
00:38:47
your operation. But usually
00:38:50
it kind of goes like this: you want to
00:38:52
register your operation in a C plus
00:38:54
plus file, so you have to tell
00:38:56
TensorFlow: I'm registering
00:38:58
my operation, this is how it looks
00:39:00
like. You have to implement it, obviously,
00:39:03
and the implementation of an
00:39:05
operation is called the kernel. And if
00:39:07
you want your op to run on multiple
00:39:09
devices you might have to implement
00:39:10
multiple kernels. Optionally, if you want
00:39:14
your op to be used in Python, you can
00:39:17
use the loader provided by TensorFlow
00:39:20
and create the Python wrapper, also in
00:39:22
one line. And you can write the
00:39:23
function to compute the gradients of
00:39:25
your op. So if you want to use the
00:39:28
differentiation that comes with TensorFlow,
00:39:30
right, if you want to
00:39:33
integrate your operation in training
00:39:35
a neural network, then you need to
00:39:37
define the gradients of the output of
00:39:40
your op with respect to your inputs. And if
00:39:43
you want to benefit from shape inference,
00:39:45
you can also write the function that
00:39:48
describes the input and output shapes.
00:39:50
And obviously, always test your
00:39:52
code. So now let's go back to
00:39:56
the steps. First you register the op; it
00:39:59
looks a bit like this: I have my own
00:40:01
operation, I don't have a good name for
00:40:03
it so I'm gonna name it my own op. This
00:40:05
is the input, it's gonna be of
00:40:07
type int thirty-two, and this is
00:40:09
the output. This is a very simple step.
00:40:12
Then, you know what you want to do, so
00:40:16
here you just have to define the
00:40:17
implementation of the operation that
00:40:19
you have decided that you want to
00:40:21
implement. And it comes with a
00:40:23
standard API that you have to use, but
00:40:26
it's very simple: you just take the
00:40:27
input from the op kernel context
00:40:30
object provided and you allocate the
00:40:32
output. But the computation is something
00:40:35
that should be easy or relatively
00:40:37
straightforward because you know what
00:40:38
you want to do. And then you have to
00:40:42
register your kernel. So what we
00:40:44
did before was actually registering the
00:40:45
operation. But now we have to say that
00:40:48
this operation has this kernel
00:40:52
registered to it, so this is
00:40:54
the CPU implementation associated with
00:40:56
my operation. And then you can build
00:41:00
your kernel using your favourite
00:41:01
compiler or with the build system that
00:41:04
comes with TensorFlow, which is Bazel.
00:41:05
So now if you want to make your
00:41:09
operation run on the GPU you actually
00:41:12
have to define the CUDA implementation
00:41:14
of it and register it. So this is the
00:41:17
bit more tricky part: you actually
00:41:20
have to write CUDA if you want your
00:41:21
operations to run on the GPU. If you
00:41:25
want to use an operation in Python, then you
00:41:27
don't have to do much; basically you
00:41:29
just have to take the .so
00:41:30
file that was created by building
00:41:35
the C plus plus code and just load it,
00:41:38
and now you have your module in which
00:41:39
you can call the op that you
00:41:42
defined. So this is pretty
00:41:44
straightforward. And you might also want,
00:41:48
again, to write the function to compute
00:41:51
the gradients. So if you want to apply
00:41:54
your operation in a neural network
00:41:55
setting, you want
00:41:57
to do this. And this is also simple and
00:42:00
up to you; I don't think that TensorFlow
00:42:02
can help with this. If you're
00:42:03
defining a new operation, you have to
00:42:05
know what the gradients of the output
00:42:09
with respect to each of the inputs are,
00:42:11
and from here you can just let
00:42:15
TensorFlow do its magic, and you can
00:42:16
combine your op with all other
00:42:18
operations. And please, if you have a
00:42:21
really strange need for an
00:42:24
operation and you have implemented it, just
00:42:25
send a pull request so that other people
00:42:27
can also benefit. But in
00:42:29
principle these are kind of the steps.
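The sequence of steps described above (register the op, register a kernel per device, supply a gradient function, supply a shape function, test) can be mimicked with a toy registry in plain Python. This is only an analogy: `ToyOpRegistry` and all of its methods are hypothetical and are not TensorFlow's C++ or Python API. The example op squares each element, and the test step compares the hand-written gradient against finite differences.

```python
class ToyOpRegistry:
    """Hypothetical stand-in for a framework's op registry (not TensorFlow's API)."""

    def __init__(self):
        self.signatures = {}   # op name -> (input names, output names)
        self.kernels = {}      # (op name, device) -> implementation
        self.gradients = {}    # op name -> gradient function
        self.shape_fns = {}    # op name -> shape function

    def register_op(self, name, inputs, outputs):
        self.signatures[name] = (tuple(inputs), tuple(outputs))

    def register_kernel(self, name, device, fn):
        if name not in self.signatures:
            raise KeyError("register the op before its kernels: " + name)
        self.kernels[(name, device)] = fn

    def register_gradient(self, name, fn):
        self.gradients[name] = fn

    def register_shape(self, name, fn):
        self.shape_fns[name] = fn

    def run(self, name, values, device="CPU"):
        # Raises KeyError if no kernel was registered for this device.
        return self.kernels[(name, device)](values)


registry = ToyOpRegistry()

# Step 1: declare the op's interface.
registry.register_op("MyOwnOp", inputs=["x"], outputs=["y"])


# Step 2: implement a kernel and attach it to the op for one device.
def my_own_op_cpu(x):
    return [v * v for v in x]


registry.register_kernel("MyOwnOp", "CPU", my_own_op_cpu)


# Step 3: gradient of the output w.r.t. the input, chained with the incoming gradient.
def my_own_op_grad(x, grad_y):
    return [2.0 * v * g for v, g in zip(x, grad_y)]


registry.register_gradient("MyOwnOp", my_own_op_grad)

# Step 4: shape function; an elementwise op keeps the input shape.
registry.register_shape("MyOwnOp", lambda input_shape: input_shape)


# Step 5: test the gradient against central finite differences of sum(y).
def numeric_gradient(x, eps=1e-6):
    grads = []
    for i in range(len(x)):
        plus = list(x)
        plus[i] += eps
        minus = list(x)
        minus[i] -= eps
        loss_plus = sum(registry.run("MyOwnOp", plus))
        loss_minus = sum(registry.run("MyOwnOp", minus))
        grads.append((loss_plus - loss_minus) / (2 * eps))
    return grads


x = [1.0, -2.0, 3.0]
analytic = registry.gradients["MyOwnOp"](x, [1.0] * len(x))
numeric = numeric_gradient(x)
print(analytic, numeric)
```

Asking the toy registry to run the op on a device with no registered kernel fails with a `KeyError`, which mirrors the point that each device needs its own kernel.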
00:42:32
And the most difficult part is to know
00:42:34
what you want once you know what you
00:42:36
want you just have to implement that
00:42:38
and compute the gradients for that
00:42:41
operation. And also, just something
00:42:49
that I want to stress about TensorFlow:
00:42:52
it's a general machine learning framework,
00:42:55
or actually, more generally, a general
00:42:57
computation framework. So
00:43:00
it can be used for problems that
00:43:01
require differentiation,
00:43:03
optimisation or linear algebra computation.
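As a rough illustration of what "differentiation and optimisation on a computation graph" means outside of deep learning, here is a tiny reverse-mode differentiation sketch in plain Python. It is not how TensorFlow is implemented; `Node`, `add`, `mul` and `backward` are hypothetical names, and only addition and multiplication are supported.

```python
class Node:
    """A value in a small computation graph, with links to the nodes it came from."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # tuple of (parent_node, local_derivative)
        self.grad = 0.0


def add(a, b):
    return Node(a.value + b.value, ((a, 1.0), (b, 1.0)))


def mul(a, b):
    return Node(a.value * b.value, ((a, b.value), (b, a.value)))


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node in the graph."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)       # parents are appended before their consumers

    visit(output)
    output.grad = 1.0
    for node in reversed(order):  # consumers first, then the nodes they depend on
        for parent, local in node.parents:
            parent.grad += node.grad * local


# Differentiate f(x, y) = x * y + x * x at x = 3, y = 4.
x, y = Node(3.0), Node(4.0)
f = add(mul(x, y), mul(x, x))
backward(f)
print(f.value, x.grad, y.grad)

# Use the same machinery for optimisation: minimise (w - 5)^2 = w*w - 10*w + 25.
w = 0.0
for _ in range(100):
    wn = Node(w)                  # fresh graph each step, so gradients start at zero
    loss = add(add(mul(wn, wn), mul(Node(-10.0), wn)), Node(25.0))
    backward(loss)
    w -= 0.1 * wn.grad            # plain gradient descent step
print(w)
```

The first part computes the partial derivatives of a small expression; the second reuses the same gradient computation inside a gradient descent loop, which is the pattern that carries over to non-neural-network problems.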
00:43:06
But bear in mind that it is made with
00:43:09
deep learning in mind, so most of the
00:43:10
API support and the feature requests
00:43:13
that will be addressed are actually
00:43:14
looking at deep learning. But you can
00:43:17
use it for other problems as
00:43:19
well, and actually there is a
00:43:21
differential equation solver in TensorFlow,
00:43:23
if you want to play with that a
00:43:24
little bit; there's a tutorial available
00:43:27
online. I'm not gonna go through this
00:43:28
right now. So this brings
00:43:33
me to the conclusion of this talk and
00:43:40
the general idea of the TensorFlow
00:43:42
talks. So you're probably here because
00:43:45
you want to find out what's the best
00:43:48
thing for you right. And this again
00:43:50
depends a lot on your use case; it
00:43:52
depends on how much time you want to
00:43:54
spend on this. If you actually don't
00:43:57
know anything about deep learning and you
00:43:59
wanna play around a little bit, you can
00:44:01
train neural networks in your
00:44:03
browser. So there is
00:44:05
playground.tensorflow.org, and I'll show you a
00:44:07
little bit. And then here you can see, so
00:44:12
with this, you can
00:44:15
actually train very simple neural
00:44:17
networks just to get an intuition
00:44:19
of what this looks like. So you can
00:44:22
increase the number of neurons, decrease
00:44:24
them, both here; you can change the
00:44:27
activation function, your
00:44:29
regularisation, and you can change the problem
00:44:32
type, and you can also start learning
00:44:34
and you can see how your model is
00:44:36
converging. So if you're not at all
00:44:38
familiar with deep learning this might
00:44:40
be a nice way to spend thirty
00:44:42
minutes or so. But this was just a toy
00:44:53
example, right. So if you want to
00:44:55
start using machine learning for
00:44:58
real life problems you have multiple
00:45:01
options, including using cloud APIs. So
00:45:04
this is something that I've not
00:45:05
discussed before. And, as in the neural
00:45:08
network case, the more complicated you
00:45:12
go, it's more flexible, but you need to
00:45:14
spend more time and effort to learn the
00:45:16
framework, to learn about the quirks of
00:45:19
the models, right? So it depends where
00:45:22
you are on the scale. And now let's
00:45:25
look a bit at this. So if you just
00:45:28
want not to really deal even with
00:45:31
TensorFlow, right, you can just use
00:45:34
some cloud-based API to get some of
00:45:37
the outputs that you want. So there's the
00:45:38
Google Translate API for translation,
00:45:42
for speech, for vision; you can do
00:45:45
also sentiment analysis with the
00:45:47
celtics API. So there are plenty of options
00:45:50
available without getting your hands
00:45:53
dirty with the actual machine learning
00:45:56
frameworks. So let's look a bit at
00:45:59
what it would look like with the
00:46:01
Cloud Vision API. So if you ask the
00:46:03
Cloud Vision API what's in the picture, it
00:46:06
can tell you that. So for example here
00:46:07
clearly there's a lot of people running
00:46:09
and there's a marathon. So it can tell
00:46:11
you that, and it can also give the score
00:46:13
associated with it. Or you can find
00:46:16
out what emotions people are
00:46:19
displaying, so here it looks like
00:46:21
someone's very joyful. And you
00:46:24
can also find out what's the text in
00:46:26
the picture and what's the language
00:46:28
in that text. So again, you
00:46:31
don't have to actually write the model
00:46:33
yourself or even load already trained
00:46:36
models. The second option, also very
00:46:38
simple, is using a pre-trained model with
00:46:41
TensorFlow. We've already looked into
00:46:43
this one, so I'm not gonna spend much time on
00:46:45
it. But the third and fourth options
00:46:48
were training your models or
00:46:52
creating and producing the models, and
00:46:54
for this you have multiple options. So
00:46:57
you can either run TensorFlow open
00:46:59
source on your physical machines
00:47:01
or in your virtual machines in a
00:47:03
cloud environment, or you can use
00:47:05
the Cloud Machine Learning API, which allows
00:47:07
you to use TensorFlow but also a lot of
00:47:09
the other developer tools that Google
00:47:13
provides. So again you have to think
00:47:15
where do you want to be here.
00:47:17
If you have your own physical
00:47:18
machines then you probably are here. So
00:47:21
depending on your use case, if you're
00:47:22
already using a lot of the Google
00:47:24
developer tools then you probably want
00:47:26
to go here. Now if you want to develop
00:47:30
your own machine learning model, then, as we saw,
00:47:32
it's relatively easy to define the
00:47:34
computational graph, it's very flexible,
00:47:36
and once you define it you create
00:47:38
a session and you start training. And
00:47:41
this is an example of how to use
00:47:44
this for robotics, for making robot arms
00:47:49
pick up things; you can have a look if
00:47:52
you're interested. And so this work
00:47:55
obviously uses TensorFlow. And just to
00:47:59
include a bit of advertisement:
00:48:02
So we recently announced the Google
00:48:05
Europe research centre based in Zurich.
00:48:09
And we are encouraging people to apply
00:48:13
as software engineers or research
00:48:15
scientists, and there's also a lot of
00:48:17
internship positions available, so if
00:48:19
you're interested just check out
00:48:21
the website. And that's pretty much
00:48:25
it from me. Thank you very much
00:48:27
for listening and I hope you learned
00:48:28
something from this the at some maybe
00:48:44
maybe you can directly move to the but
00:48:46
the more panel because there is a need
00:48:48
more to on the I mean I'd are draining
00:48:52
so maybe you can keep your question for
00:48:54
one minute the time that we we sit yeah
00:48:58
yes okay I think okay oh so maybe we
00:50:23
should move directly to question so any
00:50:26
question? Yeah. So just two
00:50:35
quick questions. Is it straightforward to
00:50:39
create a siamese or triplet
00:50:42
network using TensorFlow,
00:50:45
as easy as was shown in the examples? And
00:50:49
another question: can I create an
00:50:52
operation at the Python
00:50:54
level? So one logistical thing: can you please
00:50:58
stand up when you ask the question,
00:51:00
because I don't know... oh okay, now I know
00:51:04
in which direction to look. So the
00:51:07
first question: is it easy to create
00:51:09
a siamese network or triplet
00:51:11
network, I mean, is it as simple as you
00:51:13
showed in your examples? Well, it depends.
00:51:16
Yes, basically yes, the short answer is
00:51:19
yes. And what was the second
00:51:22
question? Can I create operations at the
00:51:24
Python level? You showed some examples in C
00:51:27
plus plus, and you do the bindings
00:51:28
afterwards. The operations you
00:51:31
have to register in C plus plus, so
00:51:32
there is some C plus plus part
00:51:34
you definitely have to write. If you
00:51:35
want the CPU implementation, the kernel
00:51:38
has to be okay hi. Um can you give us
00:51:54
some insights on how you usually scale
00:51:57
the inference with TensorFlow?
00:52:01
I mean, I'm not asking
00:52:03
for like a getting me that or not sure
00:52:06
but I'm just asking what technology to
00:52:10
use in order to scale the serving
00:52:13
of TensorFlow. I know there's TensorFlow
00:52:15
Serving itself, but what do you combine it
00:52:19
with in order to
00:52:20
achieve a highly scalable system?
00:52:23
Again, I'm not gonna discuss Google internal
00:52:26
information. If you want to see how to
00:52:28
serve TensorFlow, exactly, TensorFlow
00:52:30
Serving is where to look. Yeah, the
00:52:37
TensorFlow Serving out of the box is
00:52:39
pretty sequential, so I mean if you
00:52:41
tried, I mean if one
00:52:44
tries it with the tutorial-based
00:52:46
version, it's sequential: you can
00:52:49
submit one query at a time. I think my
00:52:52
question is that you probably do something
00:52:54
else in order to make it scale for
00:52:56
your work. And I'm not gonna discuss it,
00:52:59
okay? Just look at what TensorFlow Serving
00:53:01
supports, and that's the open
00:53:03
source version. Yeah, this concerns one
00:53:08
of your slides, about this cloud
00:53:12
service for machine learning. So if I
00:53:15
put data there, what actually happens to
00:53:17
the data do actually guaranteed Reese
00:53:19
you know the the service it's going to
00:53:22
be sent to respect for example pry
00:53:24
possible there's okay so I actually the
00:53:38
question so ye if I'm correct you are
00:53:42
running benchmarking me sing so there
00:53:45
were some speed differences between the
00:53:47
values framework some point. I don't
00:53:49
know where we also moment and if we all
00:53:51
Stevie the LC differences or the you to
00:53:54
to something which is good you don't if
00:53:56
I'd or design tracy's or something
00:53:58
which is going to implement both I or
00:54:00
what's the situation? So, when TensorFlow
00:54:06
started there was a speed
00:54:09
difference, but now actually TensorFlow
00:54:12
is doing equally well as the other
00:54:16
frameworks, except for Neon, which
00:54:18
is a bit faster than the rest. But the
00:54:23
problem with my benchmarks is that they
00:54:25
only cover convnets. So it only covers
00:54:30
the use cases that people have for
00:54:32
convnets. What we want to know, though,
00:54:36
is how the frameworks support
00:54:38
recurrent nets, convnets, and
00:54:40
different things like speech models and so on.
00:54:43
So I am working with the TensorFlow,
00:54:48
Caffe, Theano and Chainer guys, everyone in
00:54:51
the community, to release a new set of
00:54:54
benchmarks that will capture the use
00:54:57
cases more uniformly of all the
00:54:59
researchers out there. And we will know
00:55:03
then how all the frameworks perform
00:55:06
in the general use cases as well.
00:55:10
But just for convnets, actually
00:55:12
TensorFlow's doing as well as other
00:55:15
frameworks except for Neon. So at
00:55:22
the moment putting aside the benchmarks
00:55:24
a little or you you will not aware of
00:55:27
major differences in the design that
00:55:29
would that we know we backed
00:55:32
performance in some way or another. So
00:55:35
from my perspective the only also and
00:55:37
of skating up on something kind of like
00:55:39
the cool so they listen to it all ties
00:55:42
with you strongly comes off to as being
00:55:44
able to take advantage of really
00:55:47
village crystals or out of CGPS is it's
00:55:50
a case will talk so what is it you yes
00:55:53
I think in terms of the design so
00:56:00
there's basically two philosophies at
00:56:02
this moment in all frameworks. One is
00:56:05
the whole write your neural network as
00:56:09
a computational graph, and then give it
00:56:12
to an execution engine that will just
00:56:15
distribute it appropriately and so on. And
00:56:18
there is another philosophy, which is
00:56:21
that almost all of the distributed
00:56:24
computation that you do with neural
00:56:26
networks in general can be
00:56:28
expressed in terms of
00:56:32
computations that the high-performance
00:56:35
computing groups have been doing for
00:56:37
many years, which is called the MPI
00:56:40
class of collectives. So Torch takes,
00:56:46
at this moment at least, like
00:56:50
Twitter's distributed package for
00:56:52
example, takes the approach where
00:56:55
you just use MPI, which does the
00:56:58
distribution for you, and it's all abstracted
00:57:00
away, and there is no separate execution
00:57:04
engine and so on. And TensorFlow and
00:57:07
Caffe, Caffe2, Chainer, these frameworks
00:57:13
take the other approach, which is that
00:57:15
you have an execution engine and the
00:57:17
computation graph, and you would want to
00:57:20
do it in a more generalised manner.
00:57:24
I want to add: even though TensorFlow
00:57:28
is designed to be a very good distributed
00:57:31
framework, and that's really
00:57:34
important nowadays, that doesn't mean
00:57:37
that there's not a lot of focus on
00:57:39
making it great on one machine. So
00:57:40
as he said, there's been a
00:57:43
lot of improvement since the initial
00:57:44
launch and there's still a lot of work on
00:57:46
this. So the aim is definitely to not
00:57:49
only focus on the distributed side; it's
00:57:52
great to have and it's very useful, but
00:57:54
you should be able to have great
00:57:56
performances of thousands of labels on
00:57:58
one machine so we are time you're so
00:58:10
you've done okay sorry I cannot start
00:58:14
it I don't know I it if it's really a
00:58:19
question or more comment or something I
00:58:23
was working on but it was new and it
00:58:24
works for also several decades already
00:58:27
than I do to say I really like this
00:58:29
workshop I want us to say thank you for
00:58:32
organising this and I think was a quite
00:58:35
great success and it's quite hard
00:58:40
receptive audience and I realised also
00:58:42
that they're people from many
00:58:44
communities here some of them using the
00:58:46
library some of them very new to neural
00:58:50
networks. And it's very hard to find
00:58:52
something which is good for everyone.
00:58:55
And also for me there was some more
00:58:58
information which was new for me so it
00:59:00
was quite nice as thing and it's
00:59:04
difficult to decide what's to really
00:59:05
cover because it's called deep learning
00:59:08
methods and tools and how much to speak
00:59:10
about depending how much of a mess
00:59:12
that's how multiple tools we spoke you
00:59:14
mean about messes tools and
00:59:16
technologies the deep learning aspect
00:59:19
in general was a bit covered in
00:59:21
intonational presentation. But that was
00:59:24
only a very claims of the history and
00:59:27
this was in my face up in in a little
00:59:31
bit too few if you wanna cover history
00:59:34
whether it be maybe a bit more seems at
00:59:37
the moment that that was like two
00:59:38
thousand six the various a lot of
00:59:41
people learning and there are the names
00:59:44
just to mention I eva Nicole and ankle
00:59:48
who is considered as a father of the
00:59:51
learning an of course you know there's
00:59:54
the well I in there just some other
00:59:56
names would have been nice to be
00:59:57
mentioned as well. And maybe it might
01:00:00
be a suggestion because of putting the
01:00:02
slides online or something to just give
01:00:04
a small page or something of the
01:00:07
history of the planning as well Okay
01:00:12
well so thank you for one the for the
01:00:25
conference your workshop the I have you
01:00:29
and requested on the two older yeah I
01:00:32
recording tools and so and do you
01:00:36
expect that there will come more
01:00:40
graphical tools for Norman engineers.
01:00:44
So they can be involved more indeed
01:00:47
cloning and knocks programmers just
01:00:50
graphical tools to create your letters
01:00:57
for now I mean more graphical input you
01:01:04
you so have you can but this is
01:01:08
inherently my pieces and apparently a
01:01:10
programming task if you'll have to
01:01:13
debug it at some point you still have
01:01:15
to look at the at the cold right so you
01:01:17
still need someone to be familiar with
01:01:20
some some high level language such as
01:01:23
while or I think because it's not about
01:01:25
what happens if you see if you have
01:01:26
this if I understand the question
01:01:28
correctly if you have a graphical
01:01:30
interface it's harder to it at used
01:01:33
maybe I'm too much of a software
01:01:35
engineer. It's harder for me to see how
01:01:38
that would work in my daily work well I
01:01:44
assume you might be referencing to
01:01:46
something like simulating maybe like
01:01:49
you know where you have you can plug in
01:01:52
graphical things and then you run them
01:01:54
and double click on one of the things
01:01:56
and change the code or is that what you
01:01:59
are thinking about I mean more getting
01:02:03
more quick. Um I mean writing code is
01:02:10
kind of go close that's very flexible
01:02:13
learns all about it's a very to do is
01:02:17
where you're working if you know
01:02:20
different things that you want to do it
01:02:22
and you just want to take on the menu
01:02:25
in and select certain options and then
01:02:29
you can have the variability that you
01:02:32
need maybe for certain problems where
01:02:35
you can have a hierarchy of graphical
01:02:38
other choices so and so so from from my
01:02:44
interactions that many people in the
01:02:45
community there are a few projects in
01:02:50
the works in this direction. I think
01:02:52
defining a graphical tool that is very
01:02:56
effective for a new field is
01:03:01
somewhat of a "test a few things
01:03:05
and converge to something that works
01:03:07
for most people" task. And there is a
01:03:10
project that is being built in the Torch
01:03:14
community to link together neural
01:03:18
networks graphically, and it's being
01:03:21
done by a UI PhD student who's also
01:03:26
interested in neural networks. And like
01:03:27
that, there's something
01:03:29
built on top of Caffe that is similar,
01:03:32
built by NVIDIA, actually, called NVIDIA
01:03:34
DIGITS, where you can just, with a
01:03:37
few drop-down menus, train a neural network.
01:03:41
I think getting the power of defining
01:03:46
your most complex networks in a
01:03:48
graphical way is somewhat hard, from at
01:03:52
least our perspective as programmers,
01:03:54
because the number of choices you can
01:03:56
take at each step is so large that it's
01:03:59
very hard to be expressed graphically.
01:04:02
But I think as more UI researchers come
01:04:07
into the loop, there will probably eventually
01:04:10
be some solution that would be
01:04:12
effective for most people. No more
01:04:25
questions okay you then maybe we can
01:04:30
stop you on the thanks again for your

Conference program

Deep Supervised Learning of Representations
Yoshua Bengio, University of Montreal, Canada
4 July 2016 · 2:01 p.m.
Hardware & software update from NVIDIA, Enabling Deep Learning
Alison B Lowndes, NVIDIA
4 July 2016 · 3:20 p.m.
Day 1 - Questions and Answers
Panel
4 July 2016 · 4:16 p.m.
Torch 1
Soumith Chintala, Facebook
5 July 2016 · 10:02 a.m.
Torch 2
Soumith Chintala, Facebook
5 July 2016 · 11:21 a.m.
Deep Generative Models
Yoshua Bengio, University of Montreal, Canada
5 July 2016 · 1:59 p.m.
Torch 3
Soumith Chintala, Facebook
5 July 2016 · 3:28 p.m.
Day 2 - Questions and Answers
Panel
5 July 2016 · 4:21 p.m.
TensorFlow 1
Mihaela Rosca, Google
6 July 2016 · 10 a.m.
TensorFlow 2
Mihaela Rosca, Google
6 July 2016 · 11:19 a.m.
TensorFlow 3 and Day 3 Questions and Answers session
Mihaela Rosca, Google
6 July 2016 · 3:21 p.m.
