Day 2 - Questions and Answers | Panel | 05.07.2016 at 16:21 | Part of Deep Learning, Tools and Methods workshop

Transcriptions

Note: this content has been automatically generated.

00:00:00

Hmmm oh okay so we'll start the session

00:00:22

with one more participants so midnight

00:00:25

are all skies formal and she will be

00:00:27

talking to more about that on soft so

00:00:30

on so we can ask questions about

00:00:31

frameworks both so Bout top shown

00:00:34

console to user disown around right no

00:00:39

she's you know okay so that's why a day

00:00:46

is is that a bunch of questions that we

00:00:48

ask hua websites so I may sometime or

00:00:52

like today would ask by one of those

00:00:54

one because it's it's okay yeah I one

00:00:57

so what would be the most that was the

00:01:00

best chance there are goes are much

00:01:02

learning techniques so what about

00:01:04

decision trees for instance yeah you

00:01:07

sure okay if you if you where we we we

00:01:09

can you could yes exactly so something

00:01:14

when you you are not allowed to have

00:01:16

gradients on the back pop on offence ah

00:01:21

okay so is a microphone I'm trying to

00:01:27

understand by problem. And realise it

00:01:31

and I think there is a general

00:01:34

principle behind backprop which is

00:01:36

trying to find ways to do credit

00:01:38

assignment you can think about what's

00:01:40

going on and some are L reports betting

00:01:42

incumbents back prop is is great I mean

00:01:48

it's it's working quite well but but

00:01:50

maybe there's some more general

00:01:52

principles that could be applied, that

00:01:55

would work even when the changes we

00:01:59

care about are not infinitesimal, which

00:02:00

is one of the weaknesses of

00:02:02

backprop. So yeah, I don't wanna talk more

00:02:08

about it, but for me, deep learning is

00:02:12

not backprop; deep learning is about

00:02:14

learning representations,

00:02:19

distributed representations, learning good

00:02:21

representations, and backprop is the

00:02:23

best we have now, but I hope we can find

00:02:25

something better. So actually for

00:02:34

a lot of NLP tasks,

00:02:37

deep learning is not a very good tool;

00:02:41

usually just bag of words followed

00:02:45

by some SVM is faster, with

00:02:51

either the same performance or better,

00:02:54

and it scales very nicely. And also word2vec

00:02:56

is not deep learning either, and

00:02:59

it's like used everywhere. And it's

00:03:02

really effective but things. Um okay

00:03:10

And also k-means. So word2vec is

00:03:19

not deep learning but it's it's shallow

00:03:22

but it's representation learning and

00:03:24

just distributed representations. But

00:03:27

yeah it's Monty obviously I mean how

00:03:30

much make alone would keep saying where

00:03:33

do back is not defined things that I

00:03:35

agree makes sense. But then like k-means, then

00:03:39

like you know all these clustering

00:03:40

algorithms that we use every day followed

00:03:42

by some kind of SVM or something like

00:03:45

this, are still the most effective in

00:03:47

like a lot of cases that I I don't

00:03:50

think there we go so here's an example

00:03:53

where k-means is very useful: sometimes

00:03:56

you want to take quick hard decisions.

00:04:00

For example, we're working on

00:04:03

scaling memory nets and you want

00:04:05

something like hashing or clustering

00:04:08

that can quickly, you know, find some

00:04:12

sort of good answers. If we use the

00:04:17

normal soft attention it's just computationally

00:04:21

too expensive, but there are other

00:04:23

algorithms that can help us there. So
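
As a rough illustration of the trade-off described here, the sketch below compares soft attention over every memory slot with a hard, clustering-style lookup that first picks one bucket and only attends inside it. It is not the panelists' system: the function names, the toy vectors and the dot-product scoring are all assumptions made for the example, and the centroids are simply given by hand rather than learned by k-means.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_attention(query, memories):
    """Score every memory slot: cost grows linearly with len(memories)."""
    weights = softmax([dot(query, m) for m in memories])
    dim = len(memories[0])
    return [sum(w * m[i] for w, m in zip(weights, memories)) for i in range(dim)]

def build_buckets(memories, centroids):
    """Hard-assign each memory to its best-scoring centroid (done once, offline)."""
    buckets = {c: [] for c in range(len(centroids))}
    for m in memories:
        best = max(range(len(centroids)), key=lambda c: dot(m, centroids[c]))
        buckets[best].append(m)
    return buckets

def clustered_attention(query, centroids, buckets):
    """Take one hard decision (pick a bucket), then attend only inside it."""
    best = max(range(len(centroids)), key=lambda c: dot(query, centroids[c]))
    candidates = buckets[best]
    if not candidates:  # empty bucket: nothing to attend over
        return None, 0
    return soft_attention(query, candidates), len(candidates)

# Toy data (hypothetical): four memories, two hand-picked centroids.
memories = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
centroids = [[1.0, 0.0], [0.0, 1.0]]  # assumed to come from k-means or similar
buckets = build_buckets(memories, centroids)
query = [2.0, 0.0]

full = soft_attention(query, memories)
approx, n_scored = clustered_attention(query, centroids, buckets)
print(n_scored, "of", len(memories), "memories scored")
```

The hard lookup only scores the memories in one bucket, which is where the saving would come from at scale; the price is that the bucket choice is a non-differentiable decision and relevant memories in other buckets are never seen.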

00:04:29

trees? No, I wrote a paper about how

00:04:33

trees are bad because they generalise

00:04:41

locally and so they can essentially be

00:04:43

killed by the curse of dimensionality.
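
To make the "local generalisation" point concrete, here is a minimal sketch under strong assumptions. It does not train a real decision tree: a lookup table with one entry per observed input cell stands in for a learner that can only predict well inside regions where it has seen data. The parity target, the fixed budget of 64 examples and all names are illustrative choices, not something taken from the talk.

```python
from itertools import product

def parity(bits):
    return sum(bits) % 2

def fit_cells(train_points, target):
    """One 'leaf' per observed input cell: memorise the label seen there."""
    return {p: target(p) for p in train_points}

def predict(table, point, default=0):
    # A purely local learner has nothing to say about a cell it never saw.
    return table.get(point, default)

def accuracy(table, points, target):
    return sum(predict(table, p) == target(p) for p in points) / len(points)

n_train = 64  # fixed budget of training examples (assumption)
for d in (6, 10, 14):
    cells = list(product((0, 1), repeat=d))  # all 2**d input cells
    train = cells[:n_train]
    unseen = cells[n_train:]
    table = fit_cells(train, parity)
    coverage = len(train) / len(cells)       # fraction of cells with any data
    unseen_acc = accuracy(table, unseen, parity) if unseen else None
    print(d, coverage, accuracy(table, train, parity), unseen_acc)
```

With a fixed number of examples, the fraction of cells that contain any data shrinks as 2**-d, and on the cells it never saw the table can do no better than its default guess. That is the sense in which a learner that only generalises locally runs into the curse of dimensionality.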

00:04:46

Now when you do a forest or any kind of

00:04:50

combination of trees like boosting you

00:04:54

actually go deeper by one level and you

00:04:56

get a nice kind of distributed

00:04:58

representation, because if you think

00:05:00

about it, each tree is tuned to sort of

00:05:04

one aspect of the problem, and

00:05:08

it's the composition of all the

00:05:11

leaves that you've selected for each

00:05:14

tree which is a representation of your

00:05:16

data. So it's actually pretty powerful

00:05:18

representation. The problem is, right now

00:05:21

it's not clear how to go

00:05:23

beyond these two levels, and also, except

00:05:26

for boosting, it's not clear

00:05:30

how to train these things jointly, for

00:05:31

example. But yeah, so it's not

00:05:35

something that's the that's a centre of

00:05:37

so if your topic up users press easy to

00:05:42

drink to bring into trees this ability

00:05:45

to to extract a certain type of

00:05:48

information that which is trying to

00:05:50

that to another little trees which is

00:05:53

to duty some sort of the tree you could

00:05:57

do that but I think you first you would

00:06:01

need to do some kind of yeah and if

00:06:05

it's not you know how you could

00:06:07

optimise jointly all the trees. So so

00:06:09

here's an example where you know there

00:06:10

are things we'd like to optimise but

00:06:13

backprop can't be used to optimise

00:06:14

them. And so trees are kind of greedy

00:06:16

things that have greedy algorithms, but

00:06:19

you can't

00:06:22

generalise to, you know, trees on trees

00:06:24

on trees because we don't have any greedy

00:06:26

algorithms for that. Maybe questions

00:06:31

in the room oh I I was wondering

00:06:45

whether there is a mean when it comes

00:06:48

to frameworks. So and today we saw a

00:06:51

nice going by and in that in that

00:06:54

slides where we see that TensorFlow

00:06:57

is closer to

00:06:59

production while Torch is closer to

00:07:03

research is there I don't see before

00:07:05

going because everybody's doing. And

00:07:08

research but there are but there's a

00:07:13

lot of hope that it's got could be

00:07:14

turned it to production it was actual

00:07:16

actual existence and it it is difficult

00:07:19

to change frameworks as we go from

00:07:22

research to production is there a

00:07:25

recipe for going the whole cycle from

00:07:28

from research to production with one

00:07:32

framework everybody could that be on or

00:07:36

is it like something very preferential

00:07:38

or subject So the question is is there

00:07:41

a framework that researchers, data

00:07:43

scientists and developers can use, and I

00:07:46

think we will see a lot more about this

00:07:48

tomorrow, but this is the point of

00:07:50

TensorFlow, and it was built with this

00:07:52

in mind, because there are

00:07:54

a lot of very good researchers and a

00:07:56

lot of developers that want to put

00:07:57

these research ideas into production.

00:08:00

And the field, deep learning,

00:08:02

right now is moving so fast that if you

00:08:04

have two different systems you end up

00:08:07

with your ideas in production being

00:08:08

completely out of date. So this is what

00:08:11

TensorFlow actually aims to do: to be the system

00:08:13

that researchers use, and this is what's

00:08:14

happening right now: researchers use

00:08:16

TensorFlow and the same models are

00:08:18

being production also but then several

00:08:20

easily. other questions okay then I

00:08:33

will go through my oh yeah hello. Um my

00:08:39

question is concerned about

00:08:41

unsupervised learning and the

00:08:44

understanding what are the features

00:08:45

that will there be a simplified

00:08:48

unsupervised learning because if you

00:08:50

have images and then we get the like

00:08:52

with a generative model we can see the

00:08:55

images that we generate but if we have

00:08:56

other types of data which don't have a

00:08:59

visual interpretation how do we go

00:09:01

about the assessing that live in fact

00:09:05

also the generative model. And the

00:09:08

secondary how could we use this to

00:09:10

perform some kind of clustering or

00:09:12

understanding of the data it's a very

00:09:15

good question. Even for images it's not

00:09:19

completely satisfactory to only

00:09:21

look at what's generated, and there are

00:09:25

some nice discussions about whether,

00:09:28

you know, having good generation

00:09:31

doesn't necessarily mean we have good

00:09:33

features, in the sense of using

00:09:35

them for a particular task. It's not clear,

00:09:40

even if you stick with

00:09:43

generation, it's not clear what are the

00:09:44

right measures we should use to know

00:09:47

that we have a good generator. So there's a

00:09:50

lot of open problems about, you

00:09:51

know, how do we evaluate unsupervised

00:09:53

learning in general. This is really a

00:09:55

field where papers are being written

00:09:57

these days, and you know, we don't

00:09:59

know what the right answers are. The

00:10:01

classical answer to the question is

00:10:03

simply to take the unsupervised

00:10:06

learning as a helper for some

00:10:10

supervised learning task, right. So you

00:10:12

could do semi-supervised learning, you

00:10:13

could do, well, the pre-training,

00:10:15

which is a form of semi-supervised

00:10:16

learning. Things like transfer

00:10:21

learning with unsupervised learning have

00:10:22

been done before. So you basically

00:10:25

define another task which hopefully

00:10:27

can be helped by using the

00:10:30

features or the regulariser or

00:10:33

whatever is coming from unsupervised learning. So

00:10:35

that's kind of not completely

00:10:38

satisfactory because it may measure

00:10:39

some aspects and maybe not other

00:10:41

aspects, but yeah, that's what we have

00:10:43

now. So an ideal answer to your

00:10:48

question, from a conceptual point of

00:10:50

view, would be something like not a

00:10:53

single task but a very rich, wide

00:10:57

family of tasks. So

00:11:02

let me give you a concrete example.

00:11:03

Let's say that I could have a task

00:11:10

which is something like visual question

00:11:13

answering. So you're

00:11:15

given an image and somebody asks a

00:11:17

question in natural language and you have

00:11:18

to answer. And if the questions are

00:11:21

really completely open in, you

00:11:24

know, the kinds of questions, then

00:11:26

presumably, if you're doing a good job,

00:11:28

if you're able to answer any, you know,

00:11:31

semantic understanding question about

00:11:33

the image expressed in natural language,

00:11:36

well, this presumably means

00:11:38

you've extracted the right information

00:11:39

from the image. So I think we can

00:11:44

conceive of very broad tasks that go at

00:11:49

all of the aspects of the data, because

00:11:51

if I throw away some aspect of the

00:11:53

image somebody may come up with a

00:11:55

question that I'm not gonna be able to

00:11:57

answer, right, if my features throw away things

00:12:01

that the question could ask about.

00:12:04

Now this is biased still in this particular

00:12:06

example, because obviously humans are

00:12:08

not gonna ask, you know, is the pixel

00:12:11

three twenty one, seventy six greater

00:12:15

than pixel little but that's not the

00:12:17

kind of question you're gonna get. Yes, I

00:12:32

have a question about the many people I

00:12:36

speak with the is comes to from signal

00:12:39

processing have been working with the

00:12:41

image processing and different things

00:12:47

before and they usually see that are

00:12:52

very sceptical of deep learning because

00:12:55

it serves as a black box and

00:12:57

everything so within it. And which is

00:13:02

about yeah yeah if if features but what

00:13:07

usually in in the it class together the

00:13:11

signal processing you have very good

00:13:14

knowledge of the lower levels. And

00:13:17

there are very good to models of that

00:13:20

also, with the edge and line detectors, the

00:13:25

way the structures and things like

00:13:28

that. But the problem has been the

00:13:32

semantic gap. But couldn't we

00:13:36

inherit the lower levels and

00:13:39

the knowledge from all the research that

00:13:42

has been done there into deep

00:13:45

learning and concentrate more on the

00:13:48

semantic solutions? I think that's

00:13:53

already what's happened. A lot of the

00:13:56

early research with convolutional nets,

00:13:59

especially the period where we used

00:14:01

a lot of unsupervised learning, was actually

00:14:03

focusing on that: the evaluation metric

00:14:07

was, does it look like we're getting

00:14:09

Gabor filters, does it look

00:14:13

like filters that you would expect

00:14:14

using, you know, sensible signal

00:14:16

processing. And that was of course

00:14:19

a qualitative evaluation. That's my first part of

00:14:23

the answer the second part is it's not

00:14:24

where people focus right now people

00:14:26

focus you know the design of the

00:14:27

architecture isn't anymore like how to

00:14:29

do the first two layers; that's not where

00:14:32

the action is the action is precisely

00:14:34

where you're talking about the semantic

00:14:36

aspect thinking about objects thinking

00:14:38

about scenes and you know a high level

00:14:41

relationships and so on but the still

00:14:44

many yeah many deep learning past is

00:14:49

very much about creating data and

00:14:52

augment the data with the different you

00:14:55

your metrics. And that is you know way

00:14:59

already solidly in the lower levels if

00:15:02

you have a proper models and so I think

00:15:05

you want to take a step above that yeah

00:15:08

people have tried that so the reason

00:15:10

we're using these convnets with

00:15:12

different nations is because before

00:15:15

people were doing exactly what you say:

00:15:17

they were taking handcrafted features

00:15:20

which were invariant to all kinds of

00:15:22

things and sticking some machine

00:15:23

learning on top, including neural nets. But

00:15:26

it turns out that it works better if

00:15:28

you learn the whole thing end to end. And so

00:15:32

what you're suggesting is, you know,

00:15:34

something we should do; it is something we

00:15:36

have done. And maybe we can do it

00:15:38

better, but it has been tried;

00:15:40

it's exactly where we come from.

00:15:43

Right. So at Facebook we do

00:15:49

have some research going on on this side,

00:15:53

where we want to clearly learn

00:15:55

from all of the research in signal

00:15:59

processing. Last year we published a

00:16:03

paper called the complex-valued

00:16:05

convnets, which was basically inspired

00:16:08

by wavelet packet transforms. They

00:16:12

don't work as well,

00:16:15

but regardless I think it's important

00:16:18

to understand why applying

00:16:23

traditional signal processing methods

00:16:25

directly just doesn't work as well.

00:16:29

We do have collaborations with

00:16:34

NYU professors, for example David

00:16:36

here man there's a lot of ongoing work

00:16:40

but we at this moment we don't see

00:16:42

anything promising enough to be excited

00:16:46

about oh one question actually about

00:16:59

the batteries is there a successful way

00:17:02

to integrate the notion of time in it,

00:17:05

the like some successful application

00:17:08

like time series and predicting that

00:17:10

that's that's fine. Yes, three letters:

00:17:20

RNN, recurrent nets, and there

00:17:26

are many forms. They're just designed

00:17:29

exactly for that. And they're working

00:17:31

beautifully well. I have some more

00:17:37

advanced questions to this island I've

00:17:41

seen. a nice idea of by interpolation I

00:17:49

that And there are and for an and yeah

00:18:04

cool Marcus investigations and capacity

00:18:08

into thousand fifteen is correct yeah

00:18:13

but in general how okay yes quite well

00:18:17

do you have some wrestlers beyond the

00:18:26

things that already exists I don't know

00:18:30

I think personally I would be

00:18:34

interested in understanding more the

00:18:37

structure of the dynamics: how the

00:18:46

eigenspectrum of the Jacobian changes,

00:18:49

both during training and along the

00:18:52

sequence. Another interesting question

00:18:56

is what information is preserved in the

00:18:59

state. So if you think about what a

00:19:02

recurrent net does, it reads a

00:19:05

sequence and at any point it has a

00:19:09

state vector that is a function of that,

00:19:12

and of course it has to throw away some

00:19:14

information and and it's gonna keep

00:19:15

some information. So we could use some

00:19:18

kind of monitoring devices maybe to try

00:19:20

to figure out what a particular

00:19:22

recurrent net is remembering and what

00:19:24

it's forgetting about the input but

00:19:29

yeah, I think it's a great question

00:19:31

and more could be done to understand

00:19:34

what's going on, and it would maybe help

00:19:36

us design better recurrent nets as

00:19:38

well, better architectures. So at some

00:19:47

point soon as you were saying that for Facebook

00:19:51

ImageNet is a small data set. So

00:19:54

so what the future be of the

00:19:56

interaction between the getting X on

00:19:58

the on the combined because it is a is

00:20:00

a is a bit of feeling that's we soon if

00:20:04

not already well not being the same

00:20:06

category. So we go to conferences and

00:20:09

you have to wonder about we have if you

00:20:12

you have that's a good question there

00:20:17

are actually larger datasets, like the Flickr

00:20:22

hundred million dataset. It's a hundred

00:20:26

times larger than ImageNet. But

00:20:28

it has weak labels, and that's something

00:20:33

we've been really interested in. And it

00:20:37

is true our data sets are larger by

00:20:41

magnitudes, but we also don't publish

00:20:43

any research on our data sets. We play a

00:20:47

level game in the sense that we play

00:20:49

the game that the academics are playing:

00:20:52

we work on the datasets that are

00:20:54

public and there's work published

00:20:57

around it. It's not only a question

00:20:59

of data, it's a question of

00:21:01

computation power also. I discuss

00:21:03

sometime people were private companies

00:21:07

are the at at the at the hurt of

00:21:10

getting good results especially when

00:21:12

you play the rock ending game. For

00:21:15

instance, if you guys do grid search with

00:21:18

one thousand GPUs, we do grid search

00:21:21

with two GPUs and it's over.

00:21:25

So again, what would be the way of

00:21:29

dealing with this. this is something

00:21:31

that I didn't think about a lot because

00:21:35

when I go visit research labs

00:21:38

and also some of my interns come in, and

00:21:42

there's a disconnect there

00:21:45

between industry and academia. I mean

00:21:48

some labs, for example Yoshua's lab,

00:21:50

are, you know, rich enough, or,

00:21:56

you know, they get donations for

00:21:59

like a lot of GPUs, but like most

00:22:01

labs I know, there's like one or two

00:22:03

GPUs where you do most of the

00:22:05

research. And there is no clear answer.

00:22:10

Like at Facebook we are trying to

00:22:12

bridge this disconnect: we are donating

00:22:15

machines to many research labs in

00:22:18

Europe. And it's an ongoing program.

00:22:23

That's one way to bridge the gap, but if

00:22:26

you ask, how do you bridge

00:22:31

the income gap in the real

00:22:33

world, how do you fix the

00:22:38

disparity between the rich and the

00:22:39

poor? I think it's a hard problem,

00:22:43

and it's a

00:22:48

hard question as well to answer your

00:22:52

cat. So here's a simple

00:22:53

suggestion: you know, make your tax

00:22:58

returns public. In other words, you know,

00:23:01

declare in your paper how many GPUs

00:23:03

you're using. We actually do talk about

00:23:07

how many GPUs. I think it's something

00:23:10

that should become a habit,

00:23:13

and that reviewers would take that

00:23:14

into account in their judgement, because

00:23:17

you can't compare two papers where, you

00:23:19

know, one has one or two GPUs and the other

00:23:21

has a hundred for the same job. So this

00:23:25

is what I wanted to to cause not a

00:23:27

simple because somebody could fake that

00:23:29

they only have two GPUs, right, but

00:23:31

people can I but that was some do do

00:23:33

you think it would make sense I suggest

00:23:34

is really too few people I was hoping

00:23:36

to but right no it's it doesn't seem to

00:23:39

to me so interesting to others but

00:23:42

would it make sense that people have to

00:23:43

declare the amount of FLOPs they burnt

00:23:46

for the paper, including the grid search,

00:23:48

including everything, so as to have a rough,

00:23:50

even a very rough, estimate? That

00:23:52

would kind of be helpful. No, I think it

00:23:59

makes no sense at all because doing

00:24:06

better research is not a function of

00:24:08

the FLOPs you burn. I mean, we could

00:24:11

burn a billion FLOPs and still publish a

00:24:15

bad paper. It's not enough to have

00:24:17

computers, but it can help to give

00:24:19

you the little edge for, you know,

00:24:22

beating the benchmark. So I think there

00:24:25

is another way on which we could both agree,

00:24:30

which is something I talked about

00:24:31

yesterday: change the focus of the

00:24:35

evaluation from purely the numbers to

00:24:38

something more about the ideas. And I

00:24:40

know it's harder; it's much easier to say, well,

00:24:42

you're far from the benchmark, reject, than to

00:24:46

try to actually think about, oh, is this

00:24:48

really interesting? You know,

00:24:51

what do we feel about this idea, is it

00:24:54

reasonable, and is this something

00:24:56

that could have an impact if it works?

00:24:58

It's much harder to do the evaluation. But

00:25:02

here's another thing: I

00:25:04

think what's gonna happen naturally is

00:25:07

a segregation of the tasks, or of the

00:25:10

research goals, right. So one of the

00:25:14

main reasons for doing the kind of

00:25:16

research we're doing in my lab, with

00:25:18

unsupervised learning and generative models, is that you

00:25:20

don't need to have a million examples

00:25:22

to do that. You can develop new

00:25:25

ideas and test them on small datasets.

00:25:28

In fact most new ideas fail on MNIST,

00:25:33

and so you don't need to go very far to

00:25:34

know that it doesn't work.

00:25:38

So I think we'll see some sort of

00:25:40

research topics that are gonna be more

00:25:43

explored by academia, and some research

00:25:46

topics that require doing things like,

00:25:49

you know, producing the state of the art

00:25:51

in some difficult computer

00:25:53

vision task, focused more in

00:25:56

industrial labs. It's gonna be sad but

00:25:59

I think that's where it might be going.

00:26:01

So the other alternative is we come in

00:26:05

those two people sitting there to to

00:26:08

make that kind of attitude is So I just

00:26:14

wanna say that I think I bit also what

00:26:17

the yours was that it's I think less as

00:26:19

a competition, more as a symbiotic

00:26:21

relationship. Academia and industry

00:26:24

can complement each other, and from

00:26:26

Google's side there are hundreds of

00:26:28

grants given to research labs every

00:26:32

year, visiting scientists that just

00:26:34

come and work there for months, and

00:26:38

also open sourcing TensorFlow helps

00:26:40

as well, right. I mean the idea is to

00:26:42

help the community, not only help

00:26:44

ourselves. Yeah but okay that that may turned

00:26:46

a bit to put it together but I I I

00:26:48

understand that's the phrase book

00:26:50

Conger sees a solution has faced book

00:26:52

on Google funding the results but I

00:26:55

mean we there are so many things to

00:26:57

discuss here you wanted to add

00:27:00

something okay can okay Of first yeah

00:27:04

well we actually a totally agree a

00:27:06

specific with that donating to

00:27:07

universities even though you know we're

00:27:09

not Intel rain D so we actually usually

00:27:12

one or two you use actually have been

00:27:14

doing these mostly in the US but

00:27:16

however you know we're open to

00:27:17

corporation you're also going to being

00:27:20

a few other aspects and I should be in

00:27:22

here in this conference and there's

00:27:23

another node centric approach even you

00:27:26

S rises anymore tightly you were

00:27:28

there's also these came out machine

00:27:29

learning you think your own things like

00:27:32

how do approach to park actual I'll

00:27:34

talk about "'em" or have done some work

00:27:36

with the rice university which she you

00:27:38

know each accelerate practise practise

00:27:40

and you use in that again takes you

00:27:42

will for dirty to be able to handle

00:27:44

bigger datasets. So again there's one

00:27:46

interesting direction to to what you're

00:27:48

looking at the down thinking also for

00:27:51

this there's tired I was like a graph

00:27:52

lab in the US to know there's a company

00:27:54

called actually about comedies are both

00:27:56

know probably this came out you know a

00:28:00

holiday to many companies that also

00:28:02

work in that space And so two things

00:28:07

NVIDIA, and I'm sure AMD to some

00:28:10

extent as well, give out a lot of GPUs.

00:28:13

We've got a hardware academic grant, so

00:28:16

if there is anyone, you just go online

00:28:18

and put a proposal through, and we tend to

00:28:21

I'm actually I'm actually we now for

00:28:23

giving away way too remote look nine

00:28:25

sales. Um so there is that but the

00:28:29

other thing is we're all kind of just

00:28:31

assuming that deep learning is gonna

00:28:33

continue the way it's going,

00:28:37

which is really, really intensive

00:28:39

training, and then you have your

00:28:41

inference. And there are already

00:28:43

researchers looking to flip that entire work-

00:28:45

load. I won't mention names,

00:28:48

but I mean I'm I'm talking to people

00:28:50

who have upbringing and and things that

00:28:54

expectation minimise asian and where

00:28:56

you're going more the biological we

00:28:59

wear and this a space comes back to

00:29:01

attention models things like that where

00:29:03

you're recognising the features before

00:29:05

you even then go to the training which

00:29:08

and this flipping of the workload means

00:29:09

that you don't need as many GPUs, and

00:29:12

you can also have massive, massive data

00:29:15

sets because you're not doing this

00:29:17

intensive training it so you actually

00:29:19

found and that's just one thing: deep

00:29:22

learning is not gonna remain as it

00:29:24

currently is. We're not there, and there

00:29:26

are, as far as

00:29:29

I know, only about one or two people

00:29:31

actually looking at this, because the

00:29:33

majority of people just assume deep learning

00:29:36

is this huge training and then the

00:29:38

inference. But that's gonna change

00:29:42

the field you have in Macy's these

00:29:44

people make at way that that will

00:29:45

change the field for GPUs, but

00:29:48

there's maybe another solution that will

00:29:53

come to us from hardware guys. So there

00:30:01

are lots of companies who are trying to

00:30:03

compete with NVIDIA and build the

00:30:06

next generation of neural net chips.

00:30:10

This could give us a hundredfold speed-

00:30:13

up in the next couple of years. And it

00:30:17

could level the playing field if

00:30:20

these chips are also sold in commodity

00:30:24

products. And they're gonna be cheap.

00:30:27

And it's gonna make it hopefully much

00:30:29

easier for research. That's a

00:30:31

possibility that I hope will happen

00:30:33

yeah I think we both on the speaker

00:30:36

that actually some interesting because

00:30:37

what's something that you mentioned

00:30:39

like if I were to commit to hardware

00:30:41

today's neural networks, by the time I had

00:30:44

it ready it'd be obsolete. So we have

00:30:46

extra yeah I don't think I don't think

00:30:48

so. I think I think a lot of the

00:30:50

building blocks will be there actually

00:30:53

shows but I agree with you that's

00:30:54

exactly it and we need to identify the

00:30:56

the building blocks a making those

00:30:58

available in making those programmable

00:30:59

yeah that's that's actually that's but

00:31:03

the other thing is that the differences

00:31:05

in hardware very reminds you know for

00:31:08

example Pascal was like three years of

00:31:11

R&D and it's what should

00:31:13

bring it to to market. So that's very

00:31:15

very slow compared to the advances that

00:31:18

you make in in software and I'd have to

00:31:20

get on is numbers you know on a really

00:31:23

quick. And and and it's not so much the

00:31:25

the neon framework it's the actual

00:31:28

shape that that developing even though

00:31:29

it's still okay I think we have to get

00:31:32

it out next year yeah yeah basically

00:31:34

they use the RG B.s right now to do the

00:31:36

the simulation but you know when it

00:31:38

comes out I mean this is a dedicated

00:31:39

chip so you know where well where

00:31:41

everyone and all tabs are actually

00:31:43

working this got grey and yeah be it

00:31:45

could be used for training not just for

00:31:47

inference. Yeah, but the point is that

00:31:51

the software advances I think you you

00:31:54

have to realise that that is gonna

00:31:55

happen a lot quicker than hardware you

00:31:58

know, I mean, we're still talking next year

00:31:59

before Nervana bring the chip

00:32:02

out. And in the meantime, you know, there's

00:32:04

people doing tests only on CPU at

00:32:07

the moment, they're claiming two hundred

00:32:09

times speed-up using genetic

00:32:11

algorithms and neural networks, and

00:32:15

there's gonna be way more advances in

00:32:17

software even before the

00:32:18

hardware comes in. Sorry, I'm very

00:32:24

curious to hear a less busy numbers and

00:32:26

well to do. So I'm speaking as loud as

00:32:30

I can honestly I'm really curious to

00:32:34

hear cast position on this and what

00:32:36

Google is doing a with that you you the

00:32:41

question is a curious to see what the

00:32:44

goes doing with the new UITP you yeah

00:32:47

so obviously it's not a secret that

00:32:49

Google uses a lot of deep learning. And

00:32:53

speeding up the training is very

00:32:57

important, so GPUs are obviously

00:32:59

still very much in use, and TPUs are,

00:33:04

as you probably know, already used by

00:33:07

products for one year. So RankBrain

00:33:10

uses TPUs, and RankBrain is one of

00:33:12

the search out. So it's cranking out

00:33:17

where they're just not the very simple

00:33:19

base as you can imagine and part of it

00:33:22

is neural network that helps with the

00:33:25

ranking, and that uses TPUs, for example,

00:33:27

so we definitely see neural network

00:33:32

or machine learning specific

00:33:34

hardware helping but again. It's not a

00:33:37

focus on hardware versus software; they're

00:33:39

both that that thing and it's not

00:33:41

facing a bit rather it's getting at at

00:33:45

that thing at the same time. And as

00:33:47

researchers, we want to have the

00:33:49

hardware to enable us to do the best

00:33:52

possible research. Cool. I had a

00:33:58

general question ah yeah yeah it's it's

00:34:05

more towards Yoshua's presentation.

00:34:07

So is there some sort of intuition

00:34:10

behind why GANs work better than

00:34:12

variational autoencoders? Because

00:34:15

variational autoencoders have a nice

00:34:16

elegant formulation, but GANs simply

00:34:19

before in the past 'cause they can't

00:34:20

'cause variational autoencoders

00:34:22

can scale this over safari believe can

00:34:25

sell so it's a good question I think

00:34:27

different researchers may have

00:34:28

different opinions about this that what

00:34:32

happens with variational autoencoders is

00:34:34

that it tends, as I said, to lose

00:34:38

too much information about the input in

00:34:40

their latent representation by adding

00:34:42

too much noise somehow and even if you

00:34:47

just yeah and then what happens is

00:34:52

that the decoder sees the same

00:34:57

representation being associated to

00:34:59

different x's, right? So it's

00:35:04

trying to do a one to many mapping and

00:35:08

it does it by having a deterministic

00:35:11

function followed by some Gaussian

00:35:14

noise. So what you're getting is

00:35:16

that the mean of that Gaussian is going

00:35:19

to be somehow in the middle of many

00:35:22

images that correspond to the same

00:35:26

latent representation, roughly speaking.
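
A small numeric sketch of the averaging effect just described, written for this transcript rather than taken from the talk. It assumes a toy setting in which one latent code is paired with two different four-pixel "images", and it checks that the single prediction with the lowest squared error (the Gaussian-noise case) is their pixel-wise mean. The data and the function names are made up for illustration.

```python
def mean_squared_error(prediction, targets):
    """Average squared distance between one prediction and every target."""
    total = 0.0
    for target in targets:
        total += sum((p - t) ** 2 for p, t in zip(prediction, target))
    return total / len(targets)


def pixelwise_mean(targets):
    """Per-pixel average of equally sized images given as flat lists."""
    count = len(targets)
    return [sum(pixels) / count for pixels in zip(*targets)]


# Two sharp toy images that share the same (hypothetical) latent code:
# one is bright on the left, the other is bright on the right.
sharp_images = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]

# A deterministic decoder can only output one image for that code.
blurry = pixelwise_mean(sharp_images)

errors = {
    "mean": mean_squared_error(blurry, sharp_images),
    "first image": mean_squared_error(sharp_images[0], sharp_images),
    "second image": mean_squared_error(sharp_images[1], sharp_images),
}

print(blurry)   # every pixel is 0.5: neither sharp image, but a grey blur
print(errors)   # the mean scores 1.0, copying either sharp image scores 2.0
```

The grey output is worse than either sharp image to a human eye, yet it is the best answer under squared error, which is one way to read the blurriness argument.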

00:35:28

So what happens is that you get

00:35:29

blurred images, because

00:35:32

the average of a bunch of images is

00:35:35

a kind of blurry image, whereas a GAN

00:35:40

doesn't have this issue at all it can

00:35:44

produce very very sharp images but it

00:35:46

has other issues: it may miss modes,

00:35:49

or rather it may give zero probability

00:35:52

to things that should happen in the

00:35:54

world of course when you generate

00:35:56

samples you don't necessarily see this

00:35:58

that there is a whole world that is missing.
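
To make the "zero probability" point concrete, here is a minimal sketch under stated assumptions: a tiny discrete model stands in for a generative model (a real GAN does not expose explicit probabilities like this), and the outcome names and numbers are invented. It shows that a model which never produces one observed outcome gets a log-likelihood of minus infinity, even though each sample it does produce may look fine.

```python
import math


def average_log_likelihood(model_probs, observations):
    """Mean log-probability the model assigns to the observed outcomes.

    Returns -inf as soon as one observation has probability zero.
    """
    total = 0.0
    for outcome in observations:
        p = model_probs.get(outcome, 0.0)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
    return total / len(observations)


# Hypothetical held-out data containing three kinds of outcome.
observations = ["cat", "dog", "bird", "cat"]

# A model that spreads probability over everything it has seen.
covers_all = {"cat": 0.5, "dog": 0.3, "bird": 0.2}

# A model that has dropped a mode: it never generates "bird".
misses_mode = {"cat": 0.6, "dog": 0.4}

full = average_log_likelihood(covers_all, observations)
dropped = average_log_likelihood(misses_mode, observations)

print(full)     # a finite negative number
print(dropped)  # -inf
```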

00:36:00

So it looks nice but if you were to

00:36:05

compute the log-likelihood of a GAN you'd

00:36:07

get infinitely bad log-likelihood. So yeah,

00:36:12

they have their advantages and

00:36:14

disadvantages. So maybe again a more

00:36:24

general question. There is a bit the

00:36:26

feeling that the AlphaGo result was a bit

00:36:29

of a surprise for even the people in

00:36:31

the field. So what what would be

00:36:35

according to each of you something

00:36:37

which is as clearly defined,

00:36:39

because this is pretty clearly

00:36:41

defined, that, let's say, you

00:36:45

don't expect to happen before ten years

00:36:47

and if it was happening before ten

00:36:49

years you would be very surprised well

00:36:52

is is just too much of a perspective

00:36:54

question is that okay before okay two

00:37:00

years two years. I would say is

00:37:06

StarCraft. If it gets solved

00:37:10

within two years that would be that

00:37:14

would be very very impressive so

00:37:17

with AlphaGo I think there was

00:37:19

sentiment the your for all I got was so

00:37:24

I went initial paper came out that I'll

00:37:29

forego I mean that goal will be thought

00:37:33

to because the initial results really

00:37:35

promising just what that's the building

00:37:38

this about it. With StarCraft I think

00:37:45

there are very hard problems in it to

00:37:49

solve first, for example doing

00:37:53

simulation inside the model.

00:37:56

Like, AlphaGo has the advantage of having

00:38:00

a simulator: it can predict different

00:38:02

moves and then see if they're valid or

00:38:05

not. That's not applicable either to

00:38:09

StarCraft or to the real world, but

00:38:12

basically doing planning in this latent

00:38:14

space. And another thing is also that the

00:38:16

action spaces are much larger which

00:38:18

means we need a system that can

00:38:23

do hierarchical actions really

00:38:25

effectively, or even infer the

00:38:28

hierarchy of actions automatically and

00:38:30

I would say that if in two years

00:38:33

something like this happens that would

00:38:35

be amazing and surprising. Natural

00:38:41

language understanding yeah I don't

00:38:44

know what's the benchmark there is one

00:38:49

of course the Turing test, but the

00:38:52

problem with the Turing test is that

00:38:55

it's not just about natural language

00:38:56

understanding, it's also about full AI,

00:38:59

so you have to understand you know

00:39:02

everything about the world that humans

00:39:03

typically know about but you could

00:39:05

imagine a Turing test geared at a

00:39:08

particular domain like be able to

00:39:13

answer technical questions about Linux

00:39:15

or Ubuntu, for example. There's a data

00:39:18

set for this, too small, but doing it

00:39:22

as well as a human. I think that's

00:39:24

something. That's not impossible but I

00:39:28

doubt that we'll have it in two years,

00:39:30

but if we do it would be a great

00:39:32

success. I think I would be very

00:39:38

surprised if we get in use machine

00:39:40

learning algorithms to generalise as

00:39:42

well as we do so also a bit related to

00:39:44

transfer learning. If you show a two

00:39:46

year old two pictures of a T. rex, the next

00:39:49

day they will run around the house

00:39:51

and will say this is a T. rex, this is not

00:39:53

a T. rex, right? We are very far from

00:39:56

that now I I'm not at liberty to guy.

00:40:02

But but no not a bad idea is just two

00:40:08

examples are enough for a child,

00:40:10

because, as Yoshua said in his talk, right,

00:40:12

we understand things about the world

00:40:14

anyway easily able to generalise and

00:40:16

right now we're very very far from that

00:40:18

should read some papers that came out

00:40:22

recently using the Omniglot dataset

00:40:26

where it looks like you're you know

00:40:29

we're able to do a fairly good job with

00:40:32

one or two or three examples using sort

00:40:36

of one trouble on different you one

00:40:38

shot learning techniques. And I don't think

00:40:40

it's solved this problem at all, right, but

00:40:42

but there's been some recent progress.

00:40:44

So we could see more of that in the

00:40:47

next and of course the magic comes from

00:40:49

the fact that you've already seen

00:40:52

hundreds of other similar in this case

00:40:57

similar alphabets and then you can

00:41:00

generalise to a new alphabet with, you

00:41:03

know, new ways of writing specific

00:41:06

characters. But we shouldn't exclude this,

00:41:10

because also the child has the

00:41:11

knowledge about the world so you

00:41:13

shouldn't assume that learning will not

00:41:15

come from nothing, right? We just want

00:41:17

more generalisation. So within the next

00:41:24

two years, roughly, I don't think it

00:41:28

will happen but possibly within the

00:41:31

next five. But I really

00:41:34

would like to see and this is strange

00:41:36

thing fair estimates that from an very

00:41:39

neuromorphic chips and

00:41:40

developments that and that's really

00:41:43

bringing down the power budget but

00:41:46

also ramping up the capability. Um I

00:41:50

don't exactly know where this Moore's

00:41:53

law and capability is going to sort

00:41:57

of peak and get its next sort of second

00:42:01

wind from, I guess, but neuromorphic

00:42:04

chips are probably very very important

00:42:07

for getting to AGI, and I think, you

00:42:11

know, getting to AGI is a really

00:42:13

important thing ignoring all the scary

00:42:17

stuff and what could go wrong, et cetera,

00:42:20

but we need to get that kind of

00:42:21

capability. And I suppose if

00:42:24

you go away from deep learning for one

00:42:27

second, the other thing would be the

00:42:31

space program pushing this to you know

00:42:33

to for the for the or getting passed

00:42:37

for the for the limits you know things

00:42:39

like you know you know most things.

00:42:40

Well next year will fly twenty eighty

00:42:47

to to get a know that that kind of

00:42:50

focus is is gonna really yeah take this

00:42:54

feels it's a difference different

00:42:57

sectors I think I actually just

00:43:02

summations visit those speak if brought

00:43:04

their crops are no the the reason for

00:43:07

having it. Actually, an Xbox comes out at

00:43:09

Christmas, a teraflop-type machine;

00:43:12

it would have been on the Top 500

00:43:13

list a few years ago. And

00:43:16

actually, right now the US

00:43:17

Department of Energy is having this

00:43:19

exascale program and needs to be able to

00:43:22

have an exaflop system in twenty

00:43:24

thirty two or so. But again we need to

00:43:27

bring those flops to there hopefully

00:43:28

system no questions and you know yeah

00:43:41

but yeah sorry I'm asking many

00:43:45

questions but this time or maybe to the

00:43:47

hardware producers. I'm a fan of recurrent

00:43:51

neural networks and especially might it

00:44:00

used is not because they're not good

00:44:02

but because they're not so really fast

00:44:04

there are in some labs good

00:44:07

implementations, but I wonder if there is some,

00:44:10

I don't know, native support

00:44:14

coming soon for such architectures.

00:44:17

So I know Jürgen very well. And

00:44:22

from from what I know he brought out

00:44:26

there that LSTM simply because there

00:44:29

wasn't any decent way to parallelise with

00:44:33

GPUs, and this is like the pretty case

00:44:35

it's a a lot of where the we started

00:44:37

doing, and, you know, cuDNN v5 is

00:44:38

obviously now offering I mean it's only

00:44:41

single or and then at the moment. But

00:44:43

we are working on it but that paper.

00:44:46

And what really surprises me and and it

00:44:48

does all the time actually that it's

00:44:52

yeah is is so quiet about what they do

00:44:54

you know that it is but it it which is

00:44:56

surprising you know when we know what

00:44:59

your is like, right? But that paper

00:45:02

probably should have been pushed

00:45:03

around that a whole lot more because

00:45:05

it was done purposely so that you

00:45:07

could use it with GPUs, and it's

00:45:09

really just another very elegant

00:45:13

solution because it it this full

00:45:16

concepts of the actual volumetric data.

00:45:20

And then you know if you if you haven't

00:45:21

read the PyraMiD-LSTM paper, just take

00:45:24

a look at it this. It's useful but it

00:45:26

it did drive us and to be honest when I

00:45:30

first started at NVIDIA I did say, you

00:45:33

know, why are we not covering RNNs?

00:45:35

But to be honest, I think a year ago

00:45:39

there wasn't that much activity with

00:45:41

with RNNs, especially in the

00:45:42

wider field anyway. So I was pushing

00:45:44

you know we need to double more but

00:45:46

again it takes time; we've only got

00:45:48

a finite number of people, so we'll get there,

00:45:50

but it was done purposefully so that you

00:45:53

could implement on GPUs, 'cause they've

00:45:55

been using GPUs, and have been a

00:45:57

big proponent of those for a long

00:45:59

time. I don't know why

00:46:07

you're saying that RNNs aren't used

00:46:08

that much; in my lab they're used all over

00:46:12

the place. I mean some you know variant

00:46:15

of L already oh I mean if you're in the

00:46:20

via research later for and the

00:46:23

publication means CNN and so the much

00:46:25

more here's switching conference you go

00:46:28

if you go to CVPR maybe you don't see

00:46:30

that much, but if you go to

00:46:31

language-related conferences, that

00:46:33

gives you a different picture. But these

00:46:34

are the one I mentioned I mentioned the

00:46:36

multidimensional oh the

00:46:37

multidimensional yes oh yeah well it's

00:46:47

because you I was only lasted that and

00:46:49

that they were that paper is stalling

00:46:51

and and and you can so the same then

00:46:55

there's but a lot more check out you

00:46:59

know three D volumetric data for GPAXS

00:47:03

to the the is quite a lot different

00:47:04

type is that that are out there now so

00:47:09

Fausto Milletari, he gave a paper

00:47:12

at the GTC conference, and I think that

00:47:15

what is going on base he's so the jump

00:47:18

strangest also because of that one

00:47:19

paper. But there is quite a lot;

00:47:21

3D volumetric is really ramping up now

00:47:23

because of, obviously, the medical

00:47:25

applications. Um but again it's you

00:47:28

know, we're right at the beginning of this,

00:47:29

you know, when you get to 3D.

00:47:31

OK, so maybe one more

00:47:43

question and then we can stop here. So

00:47:45

the question is for the framework people.

00:47:48

So it's one of the questions that

00:47:51

got quite a lot of votes on the

00:47:53

website: what are the

00:47:56

inevitable trade-offs in a framework? So

00:47:59

you showed this kind of

00:48:02

gradient between and destroy and work

00:48:04

too when we see that a sex to be to one

00:48:07

side to be changes are on maybe like it

00:48:10

when question which was asked at the

00:48:11

beginning it ye it's I was loose and

00:48:16

nobody the right of. So is it impossible

00:48:17

to have the best of both worlds, or is

00:48:20

it simply that we have not yet come up

00:48:24

with the right overall

00:48:26

thing or what we call them so I think

00:48:33

regarding the initial trade-off

00:48:36

question something that clearly comes

00:48:39

to mind is what's the right level of

00:48:41

abstraction. So ideally you want to

00:48:44

have the things always being composed

00:48:48

of different operations and have the

00:48:50

operations and everything being very

00:48:52

modular but sometimes if you do that

00:48:55

you make your code slower, right,

00:48:58

because you can't optimise for example

00:48:59

if you spend your time actually writing

00:49:01

your end-to-end program, let's say in C or

00:49:04

C++, you can really optimise the

00:49:06

bare or to put that same for speed

00:49:09

memory usage and so on but if you want

00:49:11

to have compositionality then you

00:49:15

trade off a little bit of the speed,

00:49:17

and also this comes up with numerical

00:49:21

stability. So as we know, softmax is

00:49:24

not really that numerically stable

00:49:26

if you do softmax and then cross

00:49:28

entropy. So for example in TensorFlow you

00:49:29

have softmax with cross entropy in one,

00:49:32

not ideal for generalisation, but

00:49:34

sometimes you have to combine these

00:49:37

operations. So I think this is something to keep in mind.
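
The stability issue mentioned here can be sketched in a few lines. This is an illustrative, standard-library-only version written for this transcript, not TensorFlow's implementation; the function names are hypothetical. It compares a modular "softmax, then log" computation with a fused version that uses the log-sum-exp trick, which is the usual reason frameworks offer the two operations as one (TensorFlow's `tf.nn.softmax_cross_entropy_with_logits` is presumably the op the speaker has in mind).

```python
import math


def naive_cross_entropy(logits, target):
    """Softmax first, then log, as two separate steps.

    Can overflow in exp() for large logits, or hit log(0) for very
    negative ones.
    """
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -math.log(probs[target])


def fused_cross_entropy(logits, target):
    """Softmax and cross-entropy combined via the log-sum-exp trick."""
    largest = max(logits)
    # Subtracting the maximum keeps every exponent at or below zero.
    log_sum_exp = largest + math.log(sum(math.exp(z - largest) for z in logits))
    return log_sum_exp - logits[target]


# On well-behaved inputs both versions agree.
small_logits = [2.0, 1.0, 0.1]
naive_small = naive_cross_entropy(small_logits, 0)
fused_small = fused_cross_entropy(small_logits, 0)

# On large logits the modular version fails while the fused one is fine.
large_logits = [1000.0, 0.0]
try:
    naive_cross_entropy(large_logits, 0)
    naive_failed = False
except OverflowError:
    naive_failed = True

fused_large = fused_cross_entropy(large_logits, 0)
print(naive_small, fused_small, naive_failed, fused_large)
```

The price of the fused version is exactly the one raised in the answer: softmax and cross-entropy stop being independent building blocks.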

00:49:39

When you design a deep learning

00:49:41

framework, this is one of the hard

00:49:42

questions: where do you stop, where

00:49:44

is the level of compositionality

00:49:46

that you want to go with and for

00:49:48

example TensorFlow, compared

00:49:51

to DistBelief, the first deep learning system at

00:49:54

Google, it's more compositional. And it

00:49:56

turns out that if you do it right you

00:49:58

can even make it faster so it's faster

00:49:59

than the first system and also more

00:50:03

modular but I think overall this is one

00:50:06

this is the first thing that comes to

00:50:07

mind. I think there is of course this

00:50:13

trade-off, it exists, but there are some

00:50:15

tools to really improve on both fronts

00:50:19

and one of them is a very old one it's

00:50:21

called the compiler and the compiler

00:50:24

allows you to have a lot of flexibility

00:50:26

and and modularity but you know once

00:50:30

you've specified the computation you

00:50:31

can use the compiler's intelligence,

00:50:36

which could use machine learning,

00:50:39

you know, to make it efficient.
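
As a toy illustration of the compiler idea, assuming nothing about how Theano or TensorFlow actually work: the user writes a computation as a modular expression graph, and a separate simplification pass makes it cheaper without the user hand-optimising anything. All class and function names below are hypothetical, and the `x * 0` rule ignores NaN and infinity for brevity.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Var, Add, Mul]


def simplify(expr):
    """One bottom-up pass: fold constants, drop x + 0, x * 1 and x * 0."""
    if isinstance(expr, (Const, Var)):
        return expr
    left, right = simplify(expr.left), simplify(expr.right)
    if isinstance(left, Const) and isinstance(right, Const):
        if isinstance(expr, Add):
            return Const(left.value + right.value)
        return Const(left.value * right.value)
    if isinstance(expr, Add):
        if left == Const(0.0):
            return right
        if right == Const(0.0):
            return left
        return Add(left, right)
    # Remaining case: multiplication with at least one non-constant operand.
    if left == Const(1.0):
        return right
    if right == Const(1.0):
        return left
    if Const(0.0) in (left, right):
        return Const(0.0)
    return Mul(left, right)


def evaluate(expr, env):
    """Interpret an expression given variable values in env."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Var):
        return env[expr.name]
    left, right = evaluate(expr.left, env), evaluate(expr.right, env)
    return left + right if isinstance(expr, Add) else left * right


# Written modularly by the user: x * (0.5 + 0.5) + 0 * y
written = Add(
    Mul(Var("x"), Add(Const(0.5), Const(0.5))),
    Mul(Const(0.0), Var("y")),
)
optimised = simplify(written)
env = {"x": 3.0, "y": 7.0}
print(optimised)  # Var(name='x')
print(evaluate(written, env), evaluate(optimised, env))
```

The user keeps the flexible, composable description; the efficiency comes from the pass, which is the division of labour the speaker is arguing for.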

00:50:41

Instead of having a human design it, right?

00:50:43

Theano, you know, has been designed to try to

00:50:48

you know make it easy for putting

00:50:50

compiler technology but you know now I

00:50:55

think we could do a lot better if we

00:50:56

put in like professional compiler

00:50:57

writers to do these kinds of things

00:51:00

hopefully TensorFlow will get there,

00:51:02

but I think this is a direction where

00:51:04

we could have both ease of you know

00:51:09

design flexibility and efficiency, and

00:51:14

you know efficient implementation and

00:51:15

production ready think the remote there

00:51:22

but make good points and it there's a

00:51:24

comment the in their where we are not

00:51:30

like I mean your question was do we

00:51:33

always have to make these trade offs

00:51:35

between research and the production

00:51:38

like speed and flexibility. In an ideal

00:51:42

world we don't; you actually should be

00:51:45

able to write the most flexible most

00:51:48

fastest thing. But yeah, we are

00:51:54

not there yet; the research on

00:51:58

compilers and graph placement, and

00:52:02

basically a bunch of systems research, is

00:52:04

still not there yet to build such a

00:52:07

tool and that's why people do these

00:52:10

things by hand. Um when that research

00:52:14

catches up and I'm sure it will catch

00:52:16

up, I think we will move closer

00:52:16

towards, like, a unified system that does both

00:52:20

things really well. OK, so, any

00:52:30

question okay so maybe that's enough

Conference Program

59:34

Deep Supervised Learning of Representations
Yoshua Bengio, University of Montreal, Canada
July 4, 2016 · 2:01 p.m.

2370 views

55:38

Hardware & software update from NVIDIA, Enabling Deep Learning
Alison B Lowndes, NVIDIA
July 4, 2016 · 3:20 p.m.

427 views

01:01:02

Day 1 - Questions and Answers
Panel
July 4, 2016 · 4:16 p.m.

331 views

55:14

Torch 1
Soumith Chintala, Facebook
July 5, 2016 · 10:02 a.m.

815 views

55:57

Torch 2
Soumith Chintala, Facebook
July 5, 2016 · 11:21 a.m.

342 views

01:08:04

Deep Generative Models
Yoshua Bengio, University of Montreal, Canada
July 5, 2016 · 1:59 p.m.

2157 views

49:29

Torch 3
Soumith Chintala, Facebook
July 5, 2016 · 3:28 p.m.

275 views

52:43

Day 2 - Questions and Answers
Panel
July 5, 2016 · 4:21 p.m.

151 views

45:40

TensorFlow 1
Mihaela Rosca, Google
July 6, 2016 · 10 a.m.

2660 views

52:33

TensorFlow 2
Mihaela Rosca, Google
July 6, 2016 · 11:19 a.m.

1705 views

01:05:51

AMD's Open Compute and Open Source cross platform solutions for Machine Learning
Mauricio Breternitz, AMD
July 6, 2016 · 1:59 p.m.

1406 views

01:04:41

TensorFlow 3 and Day 3 Questions and Answers session
Mihaela Rosca, Google
July 6, 2016 · 3:21 p.m.

2251 views
