Note: this content has been automatically generated.
00:00:00
Not that long ago I was working and going in to work, so it's really great to be back in Switzerland and
00:00:05
talk about one of my favourite tools out there, so I'm definitely gonna enjoy myself and I hope you will too.
00:00:11
And yeah, let's get going. Feel free to interrupt me if you have questions; there's plenty of
00:00:16
time for questions at the end, but if there's something really strange, just go for it.
00:00:22
So for those of you who don't know what TensorFlow is, what am I gonna spend the next
00:00:27
half an hour talking about? Well, TensorFlow is an open source machine learning library,
00:00:33
and it's primarily aimed at deep learning. Deep learning is this
00:00:36
new branch of machine learning that has basically been sweeping
00:00:40
the world away with its ability to solve new problems. A lot of
00:00:45
problems that people thought were not solvable five years ago have already been solved using deep learning.
00:00:52
And one of the really core features of TensorFlow, and I really cannot stress this enough, is that it is built for both
00:00:57
research and production. The same framework is applicable very
00:01:03
well for research work, but you can also take
00:01:06
that model that the researcher made and just deploy it.
00:01:10
And it comes with a very flexible license, so you can just take it and
00:01:13
use it in your product or your project however you like.
00:01:18
So just as a brief introduction, I'm gonna talk to you a bit about
00:01:21
what machine learning has become at Google, and especially how that integrates
00:01:26
with what you could potentially solve with machine learning, just to give a couple of examples.
00:01:32
So as you probably know, Google's mission is to
00:01:37
organise the world's information and make it accessible and useful, and it turns
00:01:41
out that deep learning has become very crucial towards that mission.
00:01:46
So if we look at the number of projects that have been using deep
00:01:49
learning in the last couple of years, this has grown so much,
00:01:53
and this ranges across a lot of areas that you know and products that you really like.
00:01:59
And I'm just gonna go through a couple of these briefly. Oh, I
00:02:02
have to mention AlphaGo. You all probably know it; it's
00:02:06
really amazing. People did not expect to have
00:02:11
a machine being able to beat a 9-dan player without a handicap for
00:02:17
a while. A handicap means that you are actually favouring the machine a little bit
00:02:22
when you're playing, but this was a fair game and AlphaGo beat one
00:02:27
of the best players in the world four to one, so that's really impressive.
00:02:31
And moving on to the product side,
00:02:35
I hope you all use Smart Reply in Inbox.
00:02:38
I really like it. Smart Reply is a really nice feature that allows you to
00:02:44
save time, especially when you're on the go, when you're on mobile answering an email. I'm just hopping onto the train,
00:02:49
almost missing it, or the tram, and I just want to press something
00:02:54
and then, okay, the email was sent.
00:02:57
Smart Reply really leverages machine learning and NLP to predict, from the
00:03:03
incoming email, firstly whether it should suggest an automated reply and secondly
00:03:09
what that reply could be. And this feature has been really popular,
00:03:14
accounting for around ten percent of responses sent on mobile.
00:03:21
Clearly recommendation and clustering are a really good fit for machine learning, and Google
00:03:25
Play Music makes use of that.
00:03:30
Again, one of my favourites is Google Photos.
00:03:34
We're having more and more powerful phones,
00:03:37
right? I remember the time when I was younger and always had to copy my photos
00:03:42
onto my computer because I never had space, but now I have thousands and thousands of
00:03:46
pictures on my phone and I just wanna search for a picture. For example, when I travel:
00:03:52
I took some really nice pictures in Japan, and when I show them to my friends I don't have to scroll through
00:03:57
thousands and thousands of pictures. I can just say, okay Google, show me my pictures from Japan, and then it
00:04:03
will. This is really cool, and it is made possible by machine learning, and deep learning in particular.
00:04:11
Also related to travel and to vision is this idea:
00:04:15
when you're travelling, let's say
00:04:20
I have no clue how to read Japanese script, so I can't even type in a
00:04:25
search for what that means. I can just point my phone at something and it'll automatically translate.
00:04:34
These are just a couple of examples, and just think, when you're trying
00:04:39
to solve a problem: is machine learning applicable to what you're trying to solve?
00:04:43
And if it is, and you decide to use machine learning, then you have a couple of
00:04:49
sources of complexity that you have to deal with. Firstly, you're gonna train your model on
00:04:55
maybe a couple of CPUs, or maybe GPUs, or maybe some specialised hardware,
00:05:00
but your users are not gonna use the same platforms as you. You want to potentially deploy
00:05:06
on mobile, and there are multiple platforms there: we have Android and iOS and so on.
00:05:11
And these models are also really big
00:05:15
these days, so you might want to
00:05:18
take advantage of a cluster to train your model.
00:05:23
And you also want it to be expressive, to be able to specify these models.
00:05:28
The image that I have here is the state of the art,
00:05:32
or almost state of the art, image recognition model, and it has a lot of layers
00:05:37
as you can see. You want to be able to specify
00:05:39
this kind of model easily; you don't want to have to take months
00:05:44
to write such a thing because then you're really slowed down.
00:05:48
And the really good thing is that TensorFlow handles all this for you so that
00:05:52
you can focus on what you're trying to solve, right? You don't care about
00:05:56
making sure that your model is portable everywhere; you just wanna solve your problem and
00:06:01
ship it to users or play around with it or whatever you want to do.
00:06:05
TensorFlow really deals with all these sources of complexity.
00:06:11
How does it do that? How does it do all these really nice things for us?
00:06:15
So, one of the core things to know about TensorFlow, and to always think about: there are two main concepts.
00:06:22
One is the concept of a tensor. A tensor is just a multidimensional array, so if you have a vector,
00:06:28
that's a tensor; a matrix is a tensor; if you go to three dimensions,
00:06:31
that's also a tensor; if you go to n dimensions, that's also a tensor.
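To make the "multidimensional array" idea concrete, here is a minimal plain-Python sketch. It does not use TensorFlow; it only assumes that a tensor can be modelled as nested lists, and the helper name `shape` is hypothetical, chosen for this illustration.

```python
def shape(tensor):
    """Return the shape of a nested-list 'tensor' as a tuple of dimension sizes."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        # Descend into the first element to measure the next dimension.
        tensor = tensor[0] if tensor else None
    return tuple(dims)


scalar = 3.0                                              # rank 0
vector = [1.0, 2.0, 3.0]                                  # rank 1
matrix = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]             # rank 2
cube = [[[0.0] * 4 for _ in range(3)] for _ in range(2)]  # rank 3

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, "rank", len(shape(t)), "shape", shape(t))
```

The rank is just the number of dimensions, so a vector, a matrix and an n-dimensional array are all the same kind of object with shapes of different lengths.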
00:06:35
And these tensors are actually flowing through a computational graph of operations,
00:06:40
hence the name TensorFlow. This is how computation is described
00:06:45
in TensorFlow, so it's really important to think, when you're programming in TensorFlow or when
00:06:50
you're structuring your problem, about what your computational graph looks like. You always,
00:06:56
well, most likely, when you define a machine learning problem, would want to optimise
00:07:01
a loss, and you would define that loss in terms of mathematical operations that will become part of the graph.
00:07:10
And the general procedure when working with TensorFlow is to define your graph,
00:07:14
as I said, usually in a high-level language such as Python,
00:07:18
and then this graph is compiled and optimised by the TensorFlow
00:07:23
execution system and then executed, either the entire graph or, for example if you have
00:07:29
training and testing graphs or different things that you might want to do, only part of it. So at testing time
00:07:35
you don't want to execute the part that does the training. And you execute that on the available devices.
00:07:43
How does this work? The core of TensorFlow is
00:07:46
in C++, but if you don't want to deal with that,
00:07:50
you don't have to. You can specify your computation either in C++ or
00:07:55
in Python, and these are the officially supported front ends, but I
00:08:00
think there are some others available right now. So this is one of the really nice
00:08:04
things about being open source: there's a great community out there that really helps with
00:08:09
whatever you're trying to do. Probably someone is also thinking about doing that, and there are a lot of TensorFlow
00:08:14
users out there that can help each other, and this is building a really great community.
00:08:20
So now we're going into one of the sources of complexity
00:08:24
that I mentioned before, and this is distribution. The models
00:08:29
that you're training these days, for example for image recognition and so on, as I showed you, can be quite big,
00:08:35
and you might want to take advantage of the two or three machines that you might have, or maybe the
00:08:41
twenty, or the cluster that you have, or multiple GPUs, and
00:08:45
you want a flexible platform that is able to
00:08:48
deal with that. You don't want to spend months trying to
00:08:51
scale up your training from one machine to multiple machines.
00:08:57
And TensorFlow really deals with this quite nicely. It supports both, so you can switch between GPU and CPU
00:09:02
just with the change of a flag, without having to change your code.
00:09:06
And GPUs are really crucial to deep learning because
00:09:11
the kind of computation that we do when we train neural networks is very well suited to GPUs,
00:09:16
and this is actually one of the reasons why we have all these breakthroughs:
00:09:21
training a model on a GPU is much faster than
00:09:24
on a CPU. So if you have a GPU available, you can just
00:09:27
change how your neural network is run, and that's it, you're training on the GPU.
00:09:35
And you can also use multiple cores or multiple graphics cards, and if you have
00:09:39
a cluster or multiple machines you can also take advantage of that and
00:09:44
really scale down your experiment time. And now let's just look at some numbers,
00:09:51
'cause I'm selling it here but I wanna prove that
00:09:54
it actually does this. This is a graph of training Inception, so this is the
00:10:00
really big model that I showed, which does image classification: you have
00:10:05
an image and it recognises if a dog is in that image or another object is in that image.
00:10:11
And let's say I wanna train this model until I have 0.5
00:10:15
accuracy, so I'm right half of the time,
00:10:19
right? And let's see how long that takes with one CPU, or with ten or fifty. Oh, GPUs, sorry.
00:10:27
So if I do this with one GPU, it's gonna take around eighty hours
00:10:32
to train this model, so that takes quite some time. But if I use fifty
00:10:37
GPUs then it's only 2.6 hours, so that makes people
00:10:41
able to experiment a lot faster and advance the field faster, and
00:10:45
also test your ideas way faster, right? So this is a big, big scale
00:10:51
difference. And the same between ten and fifty GPUs:
00:10:54
you get around a four times improvement if
00:10:57
you fix the precision, because that's what you're trying to get, as good a precision as possible.
00:11:05
And as you notice here, you don't get linearly more performance
00:11:12
if you scale the number of machines, because there is an overhead of communication, right? So you
00:11:18
should expect this overhead, especially if you have a cluster, where there's the network bandwidth, but even if you use two GPUs on
00:11:24
the same machine there is a little bit of overhead there. And this graph is just trying to show you
00:11:31
how distributed training scales. So when you increase the number of workers, if you
00:11:35
have around a hundred workers you would get around a 56 times speed-up
00:11:39
over using only one worker. So it's definitely way
00:11:43
faster, but expect this distribution overhead, which
00:11:47
comes from the fact that you have only one network, so only one set of parameters, that you have to synchronise,
00:11:53
or asynchronously communicate between these workers: okay, what is our current set of parameters of the model?
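The numbers quoted above can be turned into speed-up and parallel-efficiency figures with a little arithmetic. The sketch below uses only the figures mentioned in the talk (80 hours on one GPU, 2.6 hours on fifty GPUs, a 56 times speed-up on a hundred workers); the function names are hypothetical, and "efficiency" here simply means speed-up divided by the number of workers.

```python
def speedup(baseline_hours, parallel_hours):
    """How many times faster the parallel run is than the baseline."""
    return baseline_hours / parallel_hours


def efficiency(speedup_factor, workers):
    """Fraction of the ideal (linear) speed-up that is actually achieved."""
    return speedup_factor / workers


# Figures quoted in the talk: 1 GPU takes ~80 h, 50 GPUs take ~2.6 h.
gpu_speedup = speedup(80.0, 2.6)
gpu_efficiency = efficiency(gpu_speedup, 50)

# Figures quoted in the talk: ~56x speed-up with ~100 workers.
worker_efficiency = efficiency(56.0, 100)

print("50 GPUs: %.1fx faster, %.0f%% of linear scaling"
      % (gpu_speedup, 100 * gpu_efficiency))
print("100 workers: %.0f%% of linear scaling" % (100 * worker_efficiency))
```

Both cases land well below 100% of linear scaling, which is the communication overhead the speaker describes.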
00:12:02
Another source of complexity that I mentioned previously is this idea of having
00:12:07
heterogeneous systems and complex architectures, right? So
00:12:12
I mentioned GPUs just now, and distributed systems, but you might
00:12:17
also want to put your model on a Raspberry Pi, why not?
00:12:20
Or, we also have certain machine learning specialised
00:12:24
hardware which you might want to use. You don't want
00:12:27
to have to tweak that every time and change your code
00:12:31
to be able to run your model on all these platforms.
00:12:34
And TensorFlow does that for you. So if you
00:12:37
have your model that you trained on ten or a hundred machines,
00:12:41
you can just take that model and put it on an iPhone or an Android device, and that's really good.
00:12:49
And especially for mobile support, there are a lot of
00:12:53
resources out there, and there's even a
00:12:57
Google blog post about how to just take an already trained model,
00:13:02
load that and create an app that recognises what's in an image.
00:13:06
And this is really nice because actually for this model you don't even have to train the
00:13:10
network, because TensorFlow comes with some trained networks. So you can just download the network,
00:13:15
plug it into your app, and I think in half an hour you have
00:13:18
this app that is able to recognise things, so that's pretty cool.
00:13:26
And so why are all these things important? In the end we care about
00:13:33
one thing specifically: bringing these great features to users. But if that takes us years, like,
00:13:39
this researcher has a really great idea, it works in research, but then you have to
00:13:43
port it to the production code and then you have to make sure that it works
00:13:47
on multiple platforms, that takes so much time. Then users are always so far behind
00:13:52
compared to the deep learning research, right? And they're not benefiting from the best possible things that they could benefit from.
00:13:59
And that's the really cool thing about TensorFlow: because you have all these things available,
00:14:03
you can just go very fast from
00:14:07
research to production. And one of the things that
00:14:10
was really nice is that it only took four months
00:14:14
to take something where people weren't sure, does this even work,
00:14:18
to actually launching it to users. So that's a pretty good
00:14:23
thing to have in mind when you're thinking about these things.
00:14:29
And now I'm gonna show you a very simple example of how to do linear regression in TensorFlow. If you want to do
00:14:35
linear regression, maybe this is not the best way; it's just
00:14:38
a pedagogical example. It does work, but
00:14:42
there are better ways out there, just as a disclaimer. I'm
00:14:45
just trying to show the basic features of TensorFlow, and
00:14:49
feel free to interrupt again if you have any questions regarding this.
00:14:53
So this is my problem statement: I have a couple of points
00:14:57
and I'm trying to fit a line to these points. So what's the best line that I could fit? In this case I know
00:15:03
that this is the line, because I cheated: I had the line first and then I generated the data from it.
00:15:09
But let's assume that we don't know the line. So we have
00:15:15
the points and I'm trying to find this line. And then you want to use
00:15:21
this, for example, for testing. So if I give you a new point on the x axis, where
00:15:25
should you predict that point to be? So if I ask you, at this coordinate,
00:15:29
where should my point be? Well, I learned this line, so I'm gonna predict this point here on the line.
00:15:35
And in order to learn the line you have to find the slope and intercept of the line, so it's defined
00:15:40
by two variables. This is the line equation; w and b are the things that we're gonna learn.
00:15:46
x is the input, so this is the thing on the
00:15:49
x axis, and y is what you predict. So at training time
00:15:56
you know what y should be because you have these points:
00:16:00
for this x you know that your y should be here,
00:16:03
for this x you know that your y should be here, and so on. But
00:16:06
at test time you would not know what your y should be,
00:16:09
but you just use this line that you learned to predict it. Now, going back to one of the core things that I mentioned before,
00:16:19
the computational graph: the thing about any TensorFlow program,
00:16:24
be it something really trivial like this or be it a huge
00:16:28
neural network with a hundred layers doing something really fancy,
00:16:32
is that there's always a computational graph behind it that you define and that powers the computation in TensorFlow.
00:16:37
And at its core it looks something like this. This reminds me actually a
00:16:44
lot of when I learned operator precedence in school:
00:16:50
when you evaluate something, multiplication comes first and then plus, and they made us draw these trees.
00:16:56
It always reminds me of that. But if you
00:17:01
think about what you're trying to do, you're trying to learn w and
00:17:04
b, and these things are variables, right? You don't know them beforehand; you
00:17:10
just initialise them with something and tell TensorFlow, well, these are things
00:17:15
that you will learn, and you can start with one. I just chose one at random here,
00:17:19
but as we go through learning we will improve all these things.
00:17:24
There's another concept here, which is a placeholder. This is the value that I had before on
00:17:30
the x axis, this is the x here, and with this I'm telling TensorFlow: well,
00:17:38
this will be an input for the program. I don't know the value now,
00:17:43
but when you define the graph, just know that there will be a value in here.
00:17:47
Right, so it's basically holding a place for something that will come later.
00:17:52
And then y is very simple, what we expect: w x plus b.
00:17:56
And the graph looks like this; this is the graph that we defined. Note that we're not doing any
00:18:01
learning right now; I'm just showing a very simple example of how you can compute w x plus
00:18:06
b in TensorFlow. One of the core things to know is: this is the graph, but
00:18:13
the graph is static. The graph doesn't compute an output for you. You are doing this because you want an output,
00:18:18
otherwise we wouldn't be doing any training, right? But the graph just describes what
00:18:25
the computation looks like; it doesn't actually do any computation. If you want to do
00:18:30
computation you have to create a session and run things through the graph. This is very simple to create,
00:18:36
just one line of Python, and this is what the entire program
00:18:42
looks like. This is really important to stress because all TensorFlow programs will look like this,
00:18:46
more or less, whether it's neural network training or something very simple.
00:18:50
You first create the graph; in the case of the neural network you will replace this with something more complex.
00:18:56
And this is the graph that we created before for w x plus b. You initialise the session, you
00:19:04
tell TensorFlow to initialise all the variables, so this is giving w and b the value of one,
00:19:10
and then I'm using the session to run things through the graph.
00:19:14
And something that's important to mention is that I have to tell TensorFlow what the value of x is. I didn't tell
00:19:19
it before; I just told it, well, there's gonna be something here, and wait for it until I
00:19:26
ask you for something. And here I am asking for something: I want the output, so I'm actually gonna give
00:19:32
you something for that. Here what I'm actually doing is
00:19:35
computing three times one plus one, which is four, and
00:19:40
you will just be able to get your output. This is a NumPy tensor, so for those of
00:19:47
you familiar with Python, both the output and what you can feed in through the dictionary are NumPy tensors,
00:19:54
and that's just the standard way to go about these things. So this was very, very simple: we just
00:20:03
computed y if we know w and b, but we don't
00:20:07
know w and b; that's why we're doing some learning.
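The slide code is not captured in this transcript, so the following is only a plain-Python sketch of the idea just described: define a static graph for w x plus b with variables and a placeholder, then evaluate it later by feeding a value for x. It deliberately does not use the TensorFlow API; the `Node` class and the `variable`, `placeholder`, `mul`, `add` and `run` helpers are hypothetical names invented for this illustration.

```python
class Node:
    """A node in a tiny static computation graph."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # "variable", "placeholder", "mul" or "add"
        self.inputs = inputs  # upstream nodes this node depends on
        self.value = value    # only used by variables


def variable(initial):
    return Node("variable", value=initial)


def placeholder():
    return Node("placeholder")


def mul(a, b):
    return Node("mul", (a, b))


def add(a, b):
    return Node("add", (a, b))


def run(node, feed):
    """Evaluate `node`; placeholder values are looked up in the `feed` dict."""
    if node.op == "variable":
        return node.value
    if node.op == "placeholder":
        return feed[node]
    left, right = (run(i, feed) for i in node.inputs)
    if node.op == "mul":
        return left * right
    if node.op == "add":
        return left + right
    raise ValueError("unknown op: %s" % node.op)


# Building the graph computes nothing; it only describes y = w * x + b.
w = variable(1.0)
b = variable(1.0)
x = placeholder()
y = add(mul(w, x), b)

# Only now, with a value fed in for x, is anything computed: 1 * 3 + 1.
result = run(y, {x: 3.0})
print(result)  # 4.0
```

The split between building `y` and calling `run` mirrors the graph-then-session structure the speaker stresses; in real TensorFlow the session additionally decides on which devices the operations execute.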
00:20:10
And always when you're doing machine learning you have this error that you're trying to minimise, and that's up
00:20:17
to you to define, right? So in the case of classification
00:20:21
I want to tell the network, well, if this image
00:20:24
is of a cat and you're saying that it's a dog, then you're wrong, and there are ways to define that.
00:20:29
But here, for just a simple example, I'm choosing the Euclidean distance, so these things that I
00:20:34
plotted in yellow, and I'm using that for my entire dataset.
00:20:39
So if the line was this one, what would the error be? Well,
00:20:43
it would be the sum of these, because this is how far off
00:20:47
my model is. When I give my model this x, it would predict this point here,
00:20:53
but actually I know that the point is here, so it's off by this much. Same for this point and for all the other
00:20:59
points in my training set, and I'm averaging that and this is my error. And now I want to minimise that, and
00:21:06
how do you do that? Well, very often in
00:21:10
machine learning we use gradient-based methods. There are some
00:21:14
theorems in mathematical analysis that say that if you go opposite to the direction of the gradient
00:21:19
you are going in the direction of a local minimum, at least at a very small scale. So there are a lot
00:21:25
of these iterative algorithms. This is something important to stress: these algorithms are iterative, so they tell you,
00:21:31
go in this direction a little bit, then you compute the gradient again, then go in this direction a little bit,
00:21:36
and you hope that at the end you will end up in the minimum. And you don't have to care about the gradient, you don't even have to care about the optimiser:
00:21:45
you just tell TensorFlow, well, I want to use this optimiser
00:21:48
for my error. So this is exactly that, just putting
00:21:52
it all together. This is what we had before. So I'm defining the distance:
00:21:56
labels is what I know to be the true value,
00:22:01
y is what I've predicted with the computational graph that we defined before,
00:22:07
this is the cost, so the average error that the model is making,
00:22:11
and this is the whole optimisation step. So this computes the
00:22:14
gradients for you and gives you an update operation
00:22:18
that would do one step. This again does not run anything; this just defines the computational graph.
00:22:23
It just tells TensorFlow, okay, this is what we will be doing and this is what the graph
00:22:28
looks like. And under the hood it computes the gradients
00:22:33
with respect to the variables that we have defined.
00:22:36
So what are we trying to optimise? w and b, and these are the things that you
00:22:40
would need to compute the gradient with respect to, and TensorFlow does that for you.
00:22:44
And now you're defining the session; as always, any TensorFlow program will have that.
00:22:49
And as I said, this is an iterative algorithm so I have to iterate. Here I chose a
00:22:53
hundred times; you can iterate more. There's the whole machine learning theory about how long
00:22:59
you iterate and so on, so don't take this hundred for granted;
00:23:02
it's just an illustrative example. And then I'm running this update step,
00:23:09
and again I have to tell TensorFlow what my input was and what my output
00:23:13
was, so these are basically my points in the graph that I had before.
00:23:17
And every time I run the update step, w and b change
00:23:22
such that my cost is smaller. They're changing every time I
00:23:25
run this. For the first loop, w and b were one initially; after I run this step they're not gonna be
00:23:31
one anymore. They're gonna move in the direction of the point that minimises my cost, and
00:23:37
my cost is a bit smaller now. And I do that until my cost doesn't decrease, or until I overfit, et cetera.
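The training loop described above can be sketched in plain Python with the gradients written out by hand, which is the part TensorFlow automates. This is not the speaker's code: it assumes a mean squared error cost in place of the averaged distance on the slide, and the true line (slope 2, intercept -1), the data points and the learning rate of 0.1 are all made up for illustration. The hundred iterations and the starting value of one for w and b follow the talk.

```python
def predict(w, b, x):
    """The line equation y = w * x + b."""
    return w * x + b


def mean_squared_error(w, b, xs, labels):
    """Average squared distance between predictions and true values."""
    return sum((predict(w, b, x) - t) ** 2 for x, t in zip(xs, labels)) / len(xs)


def gradient_step(w, b, xs, labels, learning_rate):
    """One update: move w and b a little against the gradient of the cost."""
    n = len(xs)
    grad_w = sum(2 * (predict(w, b, x) - t) * x for x, t in zip(xs, labels)) / n
    grad_b = sum(2 * (predict(w, b, x) - t) for x, t in zip(xs, labels)) / n
    return w - learning_rate * grad_w, b - learning_rate * grad_b


# "Cheat" like in the talk: pick a line first, then generate data from it.
true_w, true_b = 2.0, -1.0                 # hypothetical values
xs = [i / 10.0 for i in range(-10, 11)]    # x values from -1.0 to 1.0
labels = [true_w * x + true_b for x in xs]

w, b = 1.0, 1.0                            # both start at one, as in the talk
initial_cost = mean_squared_error(w, b, xs, labels)
for step in range(100):                    # a hundred iterations, as in the talk
    w, b = gradient_step(w, b, xs, labels, learning_rate=0.1)
final_cost = mean_squared_error(w, b, xs, labels)

print("w = %.3f, b = %.3f" % (w, b))
print("cost went from %.4f to %.8f" % (initial_cost, final_cost))
```

Each pass through the loop plays the role of one run of the update operation: w and b move a little, and the cost shrinks.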
00:23:42
But this is what a very simple example of linear
00:23:45
regression looks like. And now neural networks. I'm
00:23:52
not gonna go into as much detail on how to implement a neural network in TensorFlow because it might take some time,
00:23:57
but I do want to show you the options that you have when implementing neural networks in TensorFlow,
00:24:01
because there are quite a few, and I think this is also one of the
00:24:04
real core strengths of TensorFlow, and this is also what allows it to be
00:24:08
a great platform both if you wanna do research and production. So what's a neural network?
00:24:16
A neural network is a very nice model
00:24:20
that allows feature learning in a very nice
00:24:24
and efficient way. Before deep learning, in machine learning there was a lot of feature engineering,
00:24:28
right? So if you had an image as input to your
00:24:32
model, you had to somehow transform the image from
00:24:37
the sequence of pixels, which is what an image is, into something that the model would understand.
00:24:43
And what people did is they handcrafted these features for every problem, so they had human experts
00:24:49
saying, okay, for cat versus dog I think it should be this, and then they
00:24:53
were redoing it for every kind of problem. And that's very expensive: first, you need
00:24:57
human experts to do that; second, it doesn't scale, right? You're just
00:25:02
trying to do this again and again. But the real core thing about neural networks is that they learn the features themselves.
00:25:07
So you're actually feeding into the model the raw pixels that you have, so you
00:25:13
don't have to do feature engineering; you don't have to
00:25:15
be an expert in cat versus dog recognition to be able
00:25:22
to do this. And one of the core
00:25:25
reasons why it can do that is because of this hierarchical structure. What it does
00:25:29
is that it looks at the input, looks at the pixels, but here
00:25:33
it looks at, for example, edges or strokes, and here it looks at even more high-level features and so on. So it turns out that
00:25:40
not only do they learn this quite well, they also do it in an intuitive manner. This is what people would expect,
00:25:45
and this is what people were doing before: they were trying to extract edges from images, but instead of doing it
00:25:52
automatically they had handcrafted Gabor filters or something similar, and then they
00:25:57
got here, and then from here they tried to get here. But
00:25:59
now you only have this one model that learns it all together.
00:26:04
And you do something very similar to what we did before with
00:26:07
gradient descent: you define your cost here, which in my case was the distance between the points; here it would be
00:26:13
defined a bit differently. And then you backpropagate the
00:26:17
gradient, again the same core thing, the gradient, back
00:26:21
and learn the weights between these layers. Now, if you
00:26:27
want to do neural networks in TensorFlow you have multiple options, and
00:26:31
there's no right choice; it depends on where you are on the scale of
00:26:35
"I just wanna play with this" or "I want something that works", or "I'm willing to spend two months on this" versus
00:26:40
"I'm willing to spend a week on this". And the options are the following. Firstly, you can use already trained models.
00:26:47
For the image example that I've shown, you can actually
00:26:52
go to the website to download it, and then you're
00:26:57
good to go. So you just load it, you run it, you don't
00:27:01
have to care that there is this complicated beast behind it and that
00:27:04
someone spent a lot of time figuring out how to define this.
00:27:08
You just use it for your purposes, and I actually
00:27:11
have an example of that. Let's see if it works. Oh no,
00:27:19
not this one. I don't know if you can see this. I have two examples here.
00:27:27
This is just using a script that comes with TensorFlow, so it even comes with all
00:27:31
these nice examples of how to use this. Let's classify this image of
00:27:36
a panda. Well, this runs; it's downloading
00:27:44
the model right now, but while this is running I'm just gonna talk about
00:27:48
the script. So the script is two hundred lines and it does quite
00:27:52
a few things: it downloads the model, it
00:27:55
does a prediction, it's very clean, so it has a lot of comments and things like that, and it also
00:28:01
transforms the output of the network into something
00:28:05
human readable, because the network is telling you,
00:28:09
it's giving a probability distribution, so the network says 0.1, 0.4,
00:28:14
0.5, and the script kind of gives us something really nice back.
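As a rough illustration of that last step, turning a probability distribution into something human readable, here is a small plain-Python sketch. It is not the script from the demo: the function name `top_predictions` and the label list are hypothetical, and only the three probabilities come from the talk.

```python
def top_predictions(probabilities, labels, k=3):
    """Pair each probability with its label and return the k most likely ones."""
    ranked = sorted(zip(probabilities, labels), reverse=True)
    return [(label, p) for p, label in ranked[:k]]


# Hypothetical class names; the probabilities are the ones quoted in the talk.
labels = ["rabbit", "tabby cat", "giant panda"]
probabilities = [0.1, 0.4, 0.5]

for label, p in top_predictions(probabilities, labels, k=2):
    print("%s (score = %.2f)" % (label, p))
```

A real classification script would typically read its label names from a file shipped alongside the trained model rather than hard-coding them as done here.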
00:28:19
And this is only two hundred lines, and there are plenty of other examples
00:28:22
out there, so if you really just want something that recognises images quickly, there
00:28:27
is something out there. And here, did this work? Well, it says this is a giant panda,
00:28:32
panda bear, so this looks correct to me. This is a picture
00:28:39
of a different resolution, just to show that
00:28:42
it works at different scales.
00:28:55
And again it's correct. For those of you who don't know this, and I have to
00:28:59
confess I did not know this until I saw the result, this is correct because this is
00:29:03
a tabby cat because of the stripes. It is a cat, but it is actually a
00:29:08
tabby cat because it has these stripes here, so that's what I learned from
00:29:13
this. So this is one example, and of course there's not only the trained model
00:29:25
for images there's also model called heart seemed par
00:29:29
fate with parts phase which actually does um
00:29:33
um semantic parsing for you in english so you can't uh kind of see the cement
00:29:38
three and also this is a noun this is a an adjective and so now
00:29:44
what if you're not lucky enough that there's already a model out there
00:29:48
So one of the things is that even if it's not in the core TensorFlow
00:29:52
or in the library of models that are released as part of TensorFlow,
00:29:57
TensorFlow is open source, so there's a huge number of people actually open sourcing
00:30:01
their work. So even if it's not in TensorFlow, just have a look
00:30:05
online and see, okay, maybe someone actually had this problem, this model is useful for me, and
00:30:09
I can just load the model. But if you don't find that
00:30:13
anyone else solved your problem, when implementing your own model there's another
00:30:18
option that's very high level and that implements the
00:30:21
scikit-learn API. So for those of you who have done a bit of machine learning in Python, this is a pretty
00:30:25
popular framework for doing machine learning; it's very nice, clean, easy to use.
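The scikit-learn convention mentioned here comes down to constructing an estimator object and calling fit and predict on it. Below is a minimal sketch using scikit-learn's own MLPClassifier, which is a stand-in and not the TensorFlow wrapper used in the talk. The synthetic data, the layer sizes and the class names in the comments are assumptions made for illustration.

```python
from sklearn.datasets import make_blobs
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data: 300 points, 4 features, 3 classes (say cat, dog, rabbit).
X, y = make_blobs(n_samples=300, centers=3, n_features=4, random_state=0)

# A small network with three hidden layers; the sizes 10, 20, 10 are illustrative only.
classifier = MLPClassifier(hidden_layer_sizes=(10, 20, 10), max_iter=1000, random_state=0)

# fit() builds and trains the network; the caller never touches a graph or a session.
classifier.fit(X, y)

# score() reports mean accuracy, here measured on the training data itself.
accuracy = classifier.score(X, y)
print(f"training accuracy: {accuracy:.2f}")
```

The point of the convention is that swapping in a different model usually means changing only the line that constructs the estimator.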
00:30:30
And as part of TensorFlow we have TF Learn, which builds on top of
00:30:34
this API, and actually you don't even have to worry about the graph
00:30:38
because it creates the graph for you. So here, if I want to create a neural network, uh, with this random
00:30:45
set of units, this means that I have three layers, so in the graph that I showed you before I have the
00:30:49
input, where the pixels will be, and then I have, uh, the layers going up and outputting
00:30:55
three classes. So let's say I want to distinguish between cat, dog or rabbit; then I can define this classifier
00:31:02
and then I can fit my data afterwards. So this is also pretty simple, and again here you wouldn't have to worry necessarily about,
00:31:08
okay, this is the computational graph, this is the session. This creates the graph for you, so this
00:31:13
is all that you have to specify. And I actually have an example of that too, and I'm going to
00:31:19
train one of these models to recognise digits right
00:31:24
now using TF Learn. This is going to
00:31:28
run on my computer live. It's on a CPU, not on a GPU,
00:31:32
and it'll take a couple of minutes, so maybe we can take questions in the meantime, but, um, this is
00:31:39
basically all it takes, and I made it long because I wanted to showcase some things. This is how
00:31:46
much it takes to learn how to recognise digits using
00:31:50
TF Learn. The first lines are imports. I'm
00:31:55
going to get the data from scikit-learn, and I'm doing a bit of preprocessing because,
00:32:02
uh, the data comes from zero to two five five, but actually neural networks work
00:32:06
better when the input is kind of between zero and one, or relatively small,
00:32:11
so I'm just doing that. And now let's visualise the problem a bit.
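The preprocessing step described here, bringing pixel values from the 0 to 255 range down to 0 to 1, might look like this in NumPy. It is a sketch under assumptions: the helper name `scale_pixels` and the tiny example array are hypothetical, and the notebook from the talk may do the scaling differently.

```python
import numpy as np


def scale_pixels(images):
    """Map 8-bit pixel intensities (0..255) to floats in [0.0, 1.0]."""
    return np.asarray(images, dtype=np.float32) / 255.0


# Hypothetical 2x3 "image" standing in for a real digit.
raw = np.array([[0, 128, 255], [64, 32, 16]], dtype=np.uint8)
scaled = scale_pixels(raw)

# Smallest and largest value after scaling.
print(scaled.min(), scaled.max())
```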
00:32:16
This is how the digits look: this is a six, this
00:32:21
is, I would say, a seven, not sure. So what we're
00:32:27
going to do now is actually train the model to see this picture and say, okay, this is the digit in this picture.
00:32:33
And this is, by the way, the hello world of machine learning; any algorithm or
00:32:38
any benchmark will try this. It's overdone to the maximum, but, um,
00:32:45
I had to do it, I had to do it, it was too tempting. So
00:32:52
this is defining the network, and here I make it a bit more interesting just to show
00:32:57
you that even if it's very high level you're actually not losing a lot
00:33:01
of the flexibility that you get with TensorFlow. So here I'm actually saying:
00:33:06
I define this classifier and use this optimizer. So if
00:33:09
you remember, before I showed you gradient descent and Adam,
00:33:13
and there are others that come with TensorFlow, and here I'm saying, actually,
00:33:18
the default I think is Adagrad, but I'm not going to use that, I want to use momentum. And this is
00:33:24
four lines of code, and this is just defining the model. Also notice there's no graph or session here.
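As a reminder of what choosing momentum changes compared with plain gradient descent, here is a minimal pure-Python sketch of the classical momentum update on a one-dimensional toy problem. The function name `minimize`, the objective and every hyperparameter are assumptions chosen for illustration; this is not the optimizer code that ships with TensorFlow.

```python
def minimize(grad, x0, learning_rate=0.1, momentum=0.0, steps=50):
    """Gradient descent with an optional momentum term.

    With momentum=0.0 this reduces to plain gradient descent.
    """
    x, velocity = x0, 0.0
    for _ in range(steps):
        velocity = momentum * velocity + grad(x)  # accumulate past gradients
        x -= learning_rate * velocity             # step along the accumulated direction
    return x


def grad(x):
    """Gradient of the toy objective f(x) = (x - 3)^2, whose minimum is at x = 3."""
    return 2.0 * (x - 3.0)


plain = minimize(grad, x0=0.0, learning_rate=0.01, momentum=0.0)
with_momentum = minimize(grad, x0=0.0, learning_rate=0.01, momentum=0.9)

# Distance from the minimum after the same number of steps.
print(abs(plain - 3.0), abs(with_momentum - 3.0))
```

On this particular problem and with these settings the momentum run ends closer to the minimum after 50 steps; that is a property of the toy setup, not a general guarantee.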
00:33:31
This actually doesn't run any computation; this just defines the graph.
00:33:38
And now I'm actually telling it: well, with this graph, fit the data.
00:33:44
For the data, I'm taking all these images, they are, uh, sixty thousand
00:33:49
in total, um, and I'm actually training right now for
00:33:57
sixty thousand steps, and this is again training live on my machine right now,
00:34:02
and it's going to take a bit. Turns out it takes more than two
00:34:05
minutes to learn how to recognise digits, so if you have any questions now,
00:34:10
um, while this is training, feel free to, uh, ask.

An introduction to TensorFlow
Mihaela Rosca, Google / London, England
26 Nov. 2016 · 2:01 p.m.