Player is loading...

Embed

Copy embed code

Transcriptions

Note: this content has been automatically generated.
00:00:00
'kay
00:00:01
well uh a morning still thank you for coming uh right before lunch to talk
00:00:08
uh so yeah my name is non over and uh so i was a p.
00:00:12
h. d. student at a martin and ask you flat uh working on programming languages
00:00:17
and then after i finish my p. h. t. i. meandered around a bit did some uh research at microsoft in yeah
00:00:23
but then uh you know i decided i couldn't there was enough so
00:00:26
uh i i would switch thing i would switch to the industry and uh
00:00:33
you know the main the main reason to switch for the industry for me was when it's basic common sense you own a lot more money right
00:00:39
um but no i'm sorry would find out that the true academic that i was already compare you know
00:00:45
do i really on more money and how does it stand how does it stack up a with respect to uh
00:00:51
i'm sorry with respect to i guess uh address the people so you know i went and asks my colleague backup pay a yak on
00:00:58
how much you aren't yeah so i i'd already been in the industry for while and um i was a bit surprised romano asked me because
00:01:05
we're we talk about this in the industry so what we did is uh
00:01:12
um okay sorry rubber of slides uh we want to talk a t. l. c. d.
00:01:17
o. uh timit are and asked them like uh what what we should do any suggested
00:01:22
well you know what yeah instead of revealing everyone sorry why don't we just try to get an average
00:01:26
um so we came up with this basic protocol cat cardigan average
00:01:31
problem here is it doesn't really solve the problem that uh we that
00:01:35
we could meet that we would like to hide our individual salaries um
00:01:39
either we need to all reveal it all we need to trust the third party to have access to all of our salaries
00:01:46
so instead i we can understand why did not want to reveal salaries while i'm instead
00:01:53
suggested we could come up with the a different kind of protocol protocol that is privacy preserving
00:01:59
so uh with some that ever met mathematics or we can uh can't
00:02:04
get the average without revealing our individual salaries so here's how this protocol goes
00:02:11
um every one of us we have a secret number we would like to to keep secret which is the
00:02:17
number in which are the numbers in red you see here um we each also generate a three random numbers
00:02:26
or numbers must be random but the total sum must be zero get to that in a minute
00:02:33
how to remodel listens colour well the first thing
00:02:37
we should introduce is a two types one the
00:02:41
the secret value we would like to represent in this case it's just an integer in this example
00:02:46
and then uh the concept of a shared number so it's the shared number is the list
00:02:51
of ticket numbers it corresponds to what you've seen on the left here uh the sequence of numbers
00:02:59
and the key part here is in this more once colour is every element
00:03:03
of the list wrist is only accessible by one uh by the corresponding a person
00:03:10
then um well we have a function that generates these random
00:03:14
numbers and we generate three uh one for them it are one for myself in one form on or
00:03:21
uh in the table view how would this look like um
00:03:25
like this we have again the the parties involved in the computation
00:03:30
on every row you have the uh the shared zero so b. j.
00:03:34
n. m. are actually all some up to zero and i'll secret data
00:03:39
um then the protocol goes as follows we each sum up our shares
00:03:46
of uh of zero so our own share and what we have received from other places
00:03:54
this gives us a shared some this value here is published to every other player
00:04:02
and the final some all values will equal the sum of all our private data
00:04:11
but um yes as the years we consume collapse uh was i just mentioned
00:04:16
um as in numeric example might be more destructive to see how it works we're going to the security in a second
00:04:24
we we see every row indeed sums up to zero all numbers you have been randomly generate generated
00:04:30
and if we sum up all the the shared psalms we arrive at
00:04:36
the same some as if we hatch are summed up our all our data
00:04:41
um now what does this work the key thing again here is
00:04:44
that all the shared zeroes a sort all all the shared um
00:04:48
the the the triplets we generated they all some up to zero so i mean the total sum that gets cancelled out
00:04:55
and why is it secure well it's because no player has for information if
00:05:01
we look at the point of view of the much are in the tar
00:05:04
knows what's in this row because these are the numbers that you generated any
00:05:07
knows what's in the column because these are the numbers that he has received
00:05:11
for myself that i ever different the view of the world
00:05:16
and manner again so unless someone is able to intercept or communication
00:05:21
there's no way to uh to find the original a secret data
00:05:27
our model this uh exchanges colour
00:05:31
well we need to introduce what it means to add numbers first
00:05:34
so there's one concept which is adding shares so adding a parts
00:05:40
adding essentially two numbers uh d. and jane what that means
00:05:44
it's just an element wise edition of the individual secret numbers
00:05:48
and then we also need to introduce the concept of what it means to reveal a value and
00:05:53
that again is just summing up all parts of
00:05:58
the uh of the the or individual secret shares
00:06:02
um then on the right of the table you'll see one to one mapping
00:06:05
of the sections of what i just described in how we can more latin scholar
00:06:08
so again the green part we generate a zero shares the red part we model the secret data we have
00:06:15
which would not like to reveal and then finally in the shared some we use the additions we just defined
00:06:21
to compute the final some under be able to uh to
00:06:24
the world um this for the sake of the demonstration here
00:06:29
uh we have edition and a secret addition to find on element wise of course with some meters
00:06:36
colour tricks we can make this look much nicer but for the demo we we did not do this
00:06:41
so i guess i was too now i know that the average salaries i also learned
00:06:47
something you know what you have just seen is an example of what is known as
00:06:53
a secure multiparty computation so this is actually
00:06:57
a subfield of typography uh where the goal is
00:07:01
to create protocols are algorithms uh for different parties to
00:07:06
compute a joint function such that without revealing their private data
00:07:12
so to illustrate this again what it means is you had secret shape data that is sort of hidden
00:07:20
only i know my share but when we joined together if we
00:07:25
get something we get a result so we can of course goal
00:07:29
back and forth right the idea is to jot computer joint function
00:07:34
which gives you something uh that you are looking for all together
00:07:38
um so well how does in p. c. work so what we have seen
00:07:41
is an example of additions so as you said an average is reveals some
00:07:46
and you get the and uh and then you compute the average by dividing by the number of people
00:07:52
well edition is great uh but it's not enough really is it we may need a bit more so
00:07:58
well not multiplication of secret values some modification is a nonlinear function uh how we do that well
00:08:06
uh i i won't go too much into the exact details about it uh but the general idea is that
00:08:12
like you we can also do multiplication and please come talk to us to find out more
00:08:17
uh the main idea is that we can do multiplication but to make it a bit more efficient
00:08:22
uh in terms of apollo in terms of performance we introduce any the concept of a trusted in his so
00:08:29
uh remember the oh yeah example so uh yeah cup gonna turn myself we
00:08:34
created our own shares of zero and then we exchanged it amongst each other
00:08:39
uh in this case we rely on a misty question mark here
00:08:44
to do that work for us so much a question mark is only involved in giving the shares to as
00:08:50
but when we actually do the computation he's out of the picture so this is what we mattress to the
00:08:55
uh and what does what did the shares allow us to do well basically
00:09:02
for multiplication dismissed the question mark needs to give us some free computed
00:09:06
values which we use uh for doing something just called mice can reveal
00:09:12
and in the case of multiplication this is known as beaver triplets is a lot of new works
00:09:17
uh don't worry about it you don't need to know much about the words just as i said come talk to as to no
00:09:23
more i just put these here because uh you know when you see the videos it's nice for you to go and we keep even
00:09:30
um right so we have edition of secret values we have multiplication of sick of values
00:09:37
which means we can actually do quite a lot of things because with addition multiplication we can express
00:09:43
any polynomial so which means we can do actually serious interesting stuff
00:09:48
in particular we can do things like linear regression and logistic regression because
00:09:54
uh you know they can be expressed as paul loneliness wait how is exponential
00:10:00
a polynomial right well i don't know if you remember your calculus classes for me also it was a while back
00:10:06
thankfully i also like that's the most uh in addition to my average
00:10:12
exponential you can uh you can approximate next make special function using uh something like telephone on
00:10:18
your so again back to point on us so no kind of works out quite nikki and so
00:10:24
these are just mathematical formulas but what you really can't do once you have such functions is classification right
00:10:32
so i mean i wouldn't want to be a a one of those she well is there
00:10:38
in public would be be quite uh quite unfortunate you get
00:10:43
confused the muffin yeah i'd like to be get confused with nothing
00:10:49
uh yeah so well when we get those so we get the
00:10:51
full enormous then we can actually go to the real world and
00:10:56
men p. c.'s actually valid and valuable use case has values case in the real world yeah
00:11:02
so if a general picture of the different kinds of use cases that we may have here
00:11:07
ah yeah different sectors financial services healthcare engine on x. digital advertising
00:11:13
defence and manufacturing so basically anytime you want to compute something uh
00:11:20
but you're not allowed to share data for various reasons right the data is sensitive they can be legal they
00:11:26
can be just computational they can be anything uh you can use a technique like ours like m. p. c.
00:11:33
uh to do this so that's going to you know one of the bank cases is the case that we studied with i. n. g.
00:11:40
uh example i use a bank it has a very
00:11:44
rich customers in some of the richest countries in europe and
00:11:48
uh what it wants to do is uh build a some sort of credit scoring model uh on on its clients
00:11:55
but the problem is the user data can't cross the country because you know of various laws in particular
00:12:03
i'm sure you've heard of g. d. p. r. which is you know yeah what is one of the one of the
00:12:06
biggest uh things in the last two years in terms of
00:12:10
privacy ah privacy related computations uh legally speaking of course so
00:12:18
uh if you have something like uh and multiparty competition engine then
00:12:22
you could still build your model without the data having to cross borders
00:12:29
let's look at a slightly more fun example uh that's a you have you know
00:12:33
the usual russians and americans maybe chinese now because that's what it was like kindly
00:12:38
website lights on knife and they don't wanna be where the positions the
00:12:41
satellites are yet they don't want to create a destruction in outer space
00:12:48
uh so you may want to uh see quickly computers you could detect collisions and this is not
00:12:55
uh an imaginary example we had such a thing happen in two thousand nine
00:13:00
uh and uh you know it's just nice to avoid such things because this actually
00:13:06
uh then uh resulted in the international space station having to move or because it just to avoid and
00:13:12
avoided every and stuff like that so this is not ah a a joke really it is serious stuff
00:13:19
and uh so of course we also uh you know we had we
00:13:22
know that computationally um we can preserve privacy we can have the security
00:13:27
it's also nice to know that we know consulted a lawyer is uh and they can
00:13:32
mackenzie are experts in this area and they have concluded as well that uh we don't
00:13:38
no cross uh we don't violate the g. d. p. r. so that's also great
00:13:43
news legally speaking if your interest insulation uh this is definitely one way to go
00:13:48
so
00:13:50
i we show us short example of embedding is kyle uh we
00:13:54
have done a bit more work in our uh prototype here we have
00:13:59
multiplication involve so even if you don't wanna come to talk to as you can definitely see the code and find out how it works
00:14:06
great well thank you very much for it to have a direct embedding so what is the direct embedding give us once more
00:14:19
what have we done here we've taken code now we can write code in a a declarative or machine learning
00:14:26
style we can write machine learning algorithms as if they were written in sky or python of your favourite language
00:14:33
and due to the embedding we were able
00:14:35
to get the security multiparty computation protocols a running
00:14:41
uh but actually in reality weeny bit more right because in reality the computations are
00:14:48
distribute it they're not just an array and so we must tiger run time which one
00:14:53
securely in each of the parties so that's one reason why it's not enough just having bedding
00:14:58
um the other reason actually a more fundamentally mathematically speaking is we need
00:15:04
to do some static analysis of our programs such that we can sort of
00:15:11
alton eyes for memory and uh for communication purposes
00:15:16
and we also want to be able to computers and statistical
00:15:19
distributions which uh we will get into yeah in a short while
00:15:26
yes so rather than having only library we you decided to write a compiler
00:15:31
um the the idea here is that we will take a
00:15:36
high level language something uh data scientists would be familiar with
00:15:40
uh something that looks maybe like python matlab so in a linear algebra style and uh
00:15:49
at this uh in high level language will abstract away completely of any
00:15:52
kind of low level n. p. c. details so anyone could use this language to write
00:15:57
a program as they would you would you would usually do the compiler will then generate a
00:16:03
um the transformers program into low level primitives that get distributed and actually run
00:16:09
so i am that that's that a touch of the case for the the real life deployment and then
00:16:14
of course there's also lots of static analysis that is done a kind of what men are just mentioned
00:16:20
uh i'll i'll give you a quick overview to uh of the the system we
00:16:24
have it in for and how to compile a fits into this distributed a reality check
00:16:30
so there are a couple components involved to do a real life a secret computation
00:16:37
well i will start going or are from the bottom work our way to the top in terms of obstruction
00:16:42
so at the bottom we have a a players too so players are
00:16:47
is the technical term for uh parties of data um we have
00:16:50
other players who have access to some data which they would not like
00:16:54
to reveal each one of them will run a virtual machine on the
00:16:59
system that has access the data this virtual machine can be verified that
00:17:04
uh there there is no uh data ever leaked out um
00:17:09
uh the then these these but so these virtual machines will do the number crunching bees will actually run the secret compute algorithm
00:17:16
to receive the true that the algorithm this is where the compiler fits into the picture
00:17:22
uh we also have the trust the below which is what we
00:17:25
mention what manner mentioned earlier to support any some more advanced uh computations
00:17:30
and just to make it a little bit easier to use there is
00:17:33
also a nice front and the compiler is a hosted by in for
00:17:38
uh the engines things running the programs are of course hosted by customers
00:17:43
so how would look like if we want to run it a computation
00:17:47
well the analysis through a front end with semitic invitation
00:17:51
to the compiler compiler compiles a program distributes amongst all players
00:17:56
then the trust the dealer generates random numbers sends them to the
00:18:01
players so the trust the dealer knows of course what random numbers too
00:18:04
to generate a jeep thanks to the compiler undercover cops or
00:18:08
cooperation between the compiler and uh some engine components um and then
00:18:13
there is the the the the computation actually starts amongst the engines is the communication exchange of data
00:18:19
and finally the end result gets sent back to uh the one is submitted the operation now
00:18:25
this uh trust the deal i'm just wanna take two minutes to dig into detail your wireless is still secure because you could think like a
00:18:32
we generate random numbers why should anyone trust us we could spy on you right after that and the key thing here is
00:18:39
uh the you'll notice that after the trust the dealer has sent out numbers there's no more communication with the trust the people
00:18:45
so that means we do not have any knowledge of any data that leaves the the data sources
00:18:51
so that even if we had a malicious intent and knew all the random numbers that we generated
00:18:56
we would not know any intermediate results in there would be no way for us to reverse what comes out of
00:19:02
so that was um the uh the distributed part the overview of the architecture now um
00:19:09
well why else in your compiler so we're talking about uh you know
00:19:12
the actual mathematical uh properties behind this so this is going to be
00:19:17
uh the most technical part of this talk uh but i just
00:19:21
wanna give you an intuition about what it means to mask numbers
00:19:25
so what do we mean by masking numbers i will only an example we had
00:19:29
we will masking our salary by generating these random numbers and adding them up such that
00:19:34
when i look at a number i couldn't really make out
00:19:37
what's the the initial or the original plain text value was
00:19:43
uh so of course we did with into into ages right so how do we mask into ages well
00:19:51
it's a relatively easy um easy thing to do um and integer let's
00:19:56
look at uh the sixty four bit intuition so we have a fixed
00:20:00
uh we have a fixed size with a fixed range and which means that i
00:20:04
can uniformly at random pick any integer and add it to my plain text and
00:20:12
um so what is so how do i know that uh how do i know that
00:20:15
this is secure enough well the security model is given to a given to mosque to values
00:20:22
can i differentiate them can i distinguish one from the other and in this info
00:20:28
in in this integer case i can't because it's everything is pick uniformly at random so
00:20:35
i'm not otherwise it really with mass values and this gives as something a property which is very uh very
00:20:41
nice just call information to erratic security so this is sort of the best of the best that i can do
00:20:48
unfortunately uh in real life we have floating point numbers
00:20:52
and those are slightly higher harder to mask so i'm gonna
00:20:56
try to get a bit of intuition as to why that is well in floating point numbers the first thing is
00:21:01
the fixed range is something that does not exist we have uh basically it's it's actually real numbers
00:21:06
and so how can i ask real numbers well the best i can really do is to uh take
00:21:12
my plain text and uh have a normal distribution around it and then pick something uh in there now
00:21:20
if i pick a normal if i think a distribution that is too small then i may be
00:21:25
able to distinguish two values and that's sort of on the left with the double can hump here
00:21:30
so i think off a me picking two different values i think it's sort of zero and one
00:21:35
and i put the small distribution not a big variance and um so
00:21:42
the only the only way these two values are distinguishable is if
00:21:45
i fall you know between the likes of the canon here uh
00:21:49
if if i'm if i'm able to pick something uh that's in this
00:21:54
uh in this space then i'm good that means i can't distinguish the values
00:21:57
however if i happened to have generated my uh zero in this area in the
00:22:03
left hand and my one in the right hand that means i can clearly distinguishable of
00:22:08
and that's a problem because that means an attacker can i know was and you've
00:22:11
leaking uh you eating your data right um sell what you note also is that
00:22:18
well so these humps i large so that means i have they've high probability here of falling
00:22:24
uh into a case where i can easily distinguish mine on this so what is the solution there
00:22:31
the solution is on the right so is it taking a small distribution i take a very large one i take sums
00:22:37
the okay i take things that are sufficiently large uh you see that the scales have considerably changed on the right hand side
00:22:44
such that the humps are very very small
00:22:49
or sufficiently small so what is sufficiently small mean
00:22:55
so sufficiently small means that i have a very low probability of falling in those humps and
00:23:01
i get computation security so i want so my attacker
00:23:05
could actually if maybe distinguish the value but to do that
00:23:10
he would have to do a lot of work because you'd have to send up
00:23:13
to a a large number of queries just to be able to distinguish the values so
00:23:19
we we don't have information kinetic security when none the wiser with the mass values however
00:23:25
uh what we have here is to making the work of an
00:23:27
attacker much more difficult by making the humps as small as possible
00:23:33
and ah and yeah that's that's what you get with computation security uh but however the problem is now as i told
00:23:40
you we have this uh we have these homes let's go back to these fonts so if i think of a large distribution
00:23:47
that means that what in impact this what does it mean it means that if i have
00:23:50
a floating point let's say that's ten bits long so it's smaller than about one thousand three four
00:23:55
i i didn't take a sufficiently large distributions so i need to mask using maybe forty extra bits because
00:24:01
i want my attacker to be able to be work to work very hard so this for ten number
00:24:09
i'm using a forty bit number so i find multiply to such nonetheless
00:24:14
then you know i can do a lot right so if i had when i multiply plain text i have
00:24:19
attended number uh and i multiply with another technique then but then i can get twenty that number at most
00:24:25
twenty that non this is fine because it's a floating point it can still fit
00:24:28
and the sixty four bits but it fourteen forty well you know i'm really blowing up
00:24:33
and so that's an issue so those are kind of things that ah you know we uh basically need to
00:24:41
on the compiler side and allies and estimate says that we can solve and show that we don't belong
00:24:48
so there is enough uh you know what we do is we don't we use a floating point representation
00:24:55
oh you something that's called a fixed point representation to represent a flow a fling points in our backend
00:25:01
and these different number representation sort of a help says
00:25:06
avoiding this format explosion or we please don't explode as quickly
00:25:10
and this is what we and ally statically in the compiler so that we can propagate the
00:25:16
statistical bounce correctly to the whole program before we sending to the engine
00:25:24
yes exactly so uh um uh our give a quick overview no uh now
00:25:29
of uh what the compile actually does and how the input language looks like
00:25:35
let's jump right into it this is uh the some source code for my our language uh looks
00:25:41
pretty much likes colour except a semi colon is still required i'm sorry well why it's a long story
00:25:51
um uh the it it doesn't look really uh exceptional and
00:25:56
that's by design we want it to be as easy to use
00:25:59
for anyone who does not know anything about m. p. c. so it
00:26:03
it really looks like some of us some some linear algebra or code
00:26:07
the only thing you'll note is there are these exhort bought things
00:26:14
these are you can think of them as a calls to standard library
00:26:18
or some primitive instructions this is these things are known why
00:26:24
the engines known by the machines that have access to data um
00:26:29
but not so so so they're not they're not known quite like that's
00:26:33
what we're again not showing here all these parameters these masking parameters sizes
00:26:38
which which are not of relevance for a a developer but
00:26:41
of course very important uh in the actual uh run time
00:26:47
so on top of these primitives them we build a language on it it is strongly typed in fact it
00:26:54
everything is always in lined everything uh gets constant folded
00:26:59
so that uh that these primitive set we can in for a parameters
00:27:04
such as masking sizes to the primitive to me actually generate the code
00:27:08
we get into that in a interest in the second so when
00:27:12
you compile a program or are you just submitted as a source file
00:27:17
and the compiler is a good cause for various phases
00:27:22
but like a traditional compiler for around eight phases so
00:27:26
we do things like all parsing type checking et cetera
00:27:29
and then we get into our so some more uh details where we actually
00:27:33
do the crypt graphic card and this is where we need to check that
00:27:37
uh for example if we multiply two numbers uh that we the them that although with the the resulting
00:27:43
number um maybe will will be larger that the mosque does not just naively get doubled maybe we can uh
00:27:50
because we know something about the distribution of data we can uh take a small mosque and just be much more efficient
00:27:58
this is what comes out of the compiler
00:28:01
so it might look a little daunting and i it looks a bit like is
00:28:05
simply because this is what it is it's assembly for b. m. p. c. engines
00:28:10
but will walk you through through some uh what happens here um so that you can understand uh what's going on
00:28:18
the first part is memory management so right in the beginning
00:28:23
we said that one of the jobs of the compiler is too
00:28:26
uh to make sure we're as efficient as possible and we're doing matrix multiplication these things require a lot of memory
00:28:33
so we need to make sure that we can actually so that
00:28:35
we allocate and the allocate memory when the variables i no longer use
00:28:40
so in blue you see currently all the uh the crew or create
00:28:44
colds of so these are like it this is like a matlock essentially
00:28:47
uh delete is not on the slide it's happened just below but don't worry about that then we have uh the bill tents
00:28:56
so bill tens always work on containers container is again like uh think would like a variable these uh built
00:29:03
ins know however so these buildings map to do what i showed here the x. or the these language buildings
00:29:11
the difference however is that we now show it or built ins
00:29:17
have some extra parameters passed to them and these are the result of the compiler
00:29:22
there's also all dimensions of any input variables so these other things
00:29:28
you seen brown here they all get folded they all get uh
00:29:33
replaced by the actual values so in the end so the the example i'm showing here is
00:29:38
we have a a number of rows in the number of columns these are taken from the input values
00:29:44
but however at run time uh since it's the distributed computation
00:29:49
we need to uh interviews a fashion as possible we need to know the
00:29:52
sizes of all all input uh in advance and that gets propagated through the program
00:29:58
and then finally in red this is actually the key part of the compiler
00:30:02
is inferring these parameters so these are these uh
00:30:06
of magical masking values that we try to minimise
00:30:10
in the compiler to make is the the computations as efficient
00:30:13
as possible now you could technically right this by hand right
00:30:17
but of course uh and that's actually what we have done right in the beginning but uh using a compiler also allows us
00:30:24
to compose programs much remote much better so rather than
00:30:27
writing up repackaged linear regression you know assembly routine you could
00:30:32
you can actually uh build a full a full a mathematical
00:30:36
library with these these operations and compose them to your liking
00:30:41
so what we currently have is the compiler only works with m. p. c. so this is the the protocol we we
00:30:47
talked about however since it's a compiler and we can work one on this again these high level uh what uh languages
00:30:54
it would be nice if in the future we can integrate other privacy preserving techniques it this is
00:31:00
uh something more after racial that we would like to get to carly we are focusing on m. p. c.
00:31:07
but the compiler is in a in a prime spot to be able to work as an intermediate for all these technologies
00:31:14
i'm finally i only give a quick what about uh our teamwork hardly six people
00:31:20
working on the compiler at in for a company where uh quite some more um
00:31:27
well actually growing and uh the salary is not as bad as we mentioned in the example
00:31:35
so yes please come talk just your more detailed
00:31:46
i think we have a enough time for questions
00:31:49
yeah quick yeah we're we're actually tens in one one for these
00:31:56
yes plastic megaphone uh_huh yeah that's pretty cool things to talk about the i mean i
00:32:04
guess one of many questionable contributor but uh i'm i'm not the type system then you
00:32:08
in for the metrics dimension i guess yes so so if i if i have
00:32:12
any special the metrics mention doing compile time you will tell me but that's the good
00:32:17
yeah well so the the compiler is uh it's it's not a turing complete language everything must terminate
00:32:23
and we need to know a lot of information about the input so the shapes we need to know
00:32:27
uh sizes of the inputs are actually when you see this
00:32:30
uh x. or input built in you you see your this x.
00:32:34
uh needs to be known by we yeah we need to know the yeah yeah please okay sounds good so so we get it's important
00:32:40
to emphasise that we need to know just the properties of the
00:32:43
data and if there's not they'd act as as is still private basically
00:32:46
yeah not a and then there's value but you you're in for that or for uh you know this
00:32:51
a complex of multiparty competition you use it as a part of the touches them
00:32:55
to something even for a sport of the type checker so uh that's actually a a
00:33:02
of if a trick question and that complex questions let me this ambiguity that a little bit um
00:33:07
so we don't encoded in the type system in our implementation uh what we do
00:33:14
instead is uh you know to the multiple phases with a bit of partial evaluation insulin
00:33:19
um but yeah with i mean i guess it's got three little types it would be very fun
00:33:24
with a oppressive you thought about writers we it is good for that matter if this would be
00:33:30
so that when it isn't something that i would like to experiment as well to play around with
00:33:34
um but i think the key uh really his yes you still at some point need something like a compiler so you
00:33:40
need to go from sort of a you know shallow embedding to deepen bedding and actually actually have access to your trees
00:33:47
uh because in particular i mean if you look at a phase for here uh
00:33:53
as they say is actually very critical for us because uh we want to generate
00:33:58
efficient assembly such that memories properly allocated in the allocated and so you need to convert
00:34:04
to some uh nice form so that you can uh you can do to
00:34:09
uh yeah it makes sense anyway you want most to uh matthias because you have very specific to
00:34:14
exaggerate before one so the uh it's actually the the key part what we're focusing mostly is anything after
00:34:20
round face eight anything before that is general more compiler construction
00:34:24
the um the the front end like how the language looks
00:34:27
this is something we could imagine changing infinity multiple of them wrote put together to potentially you thank you thank you
00:34:36
oh have a few questions yeah
00:34:40
uh yeah i was so questions first is a house i wasn't sensitive
00:34:44
and too noisy with somebody you want to use apply a mission or any
00:34:49
using some back propagation available uh using exponential functions even just use
00:34:55
approximation uh then you add uh some run the numbers into the inputs
00:35:00
uh which might make the uh back to our publishing have problems how these so this
00:35:07
i'm currently we don't know so this is uh definitely something uh you were working on um
00:35:13
so you you whatever you need a row to probably get to
00:35:16
get some results back from another computation as input your new computation
00:35:20
we need to submit a new program at the moment and save intermediate results to another place uh it
00:35:26
we we uh working on uh maybe making this more seamless mine too
00:35:31
and and and and and with and with regards to a precision uh right 'cause
00:35:35
europe approximating but uh what uh we try to we try very hard to do is
00:35:40
uh we want a out of the same precision whether you would do it in
00:35:45
plain text or with l. p. c. and so are kind benchmarks are that were
00:35:49
uh what and a seven to eight decimal point after the
00:35:54
after the the unit so which is what uh our customers also expect us to have
00:36:00
and that's already uh that's that's uh okay so it's a man that's you
00:36:03
can always keep these push change yeah uh regard uh what kind of imports
00:36:10
yeah so so that's that's what uh the the the the key or the most difficult part rises with a static analysis
00:36:16
okay uh this thing get that my next question is uh oh what is the uh
00:36:21
inputs for the chance adidas in general uh for all these uh for for your libraries
00:36:28
um the included yeah the would maybe input to the trust and uh what kind of information you should uh in general
00:36:34
you provide for the transit years yet so the uh the trusted so the this uh compile program which comes out of it
00:36:41
is uh you can think of it as a a graph it's
00:36:43
a back essentially um and the this gets fed into the trust
00:36:48
that you look to the trust the dealer knows how much memory
00:36:51
like so how many secret shares are have a need to be generated
00:36:55
generates them and then sends them to the the engines e. we give the trusted you'll become a program
00:37:02
oh okay okay
00:37:05
um okay um i we can discuss it it has some more questions so my and my last question is
00:37:12
uh because these are the summer commanding a right uh a lot if there's
00:37:17
some thirty years happens seeing one of the nodes on a high again to websites
00:37:21
i um so the uh currently uh the system is designed to handle large amounts
00:37:27
of data not large amounts of parties and so will have i mean we can have
00:37:32
so just as ever parties but we wouldn't have like thousands of parties
00:37:35
in computation um so if something goes wrong we would just restart the computation
00:37:40
uh and the how how do you well if i like uh oh
00:37:43
it is quite some knots right uh yeah that's a good question um
00:37:49
uh the uh yeah but maybe we can just get a that's a good that's a ah yeah yeah yeah flush okay
00:37:56
uh so
00:37:58
oh okay yeah
00:38:01
oh uh i think it's uh i just go to summarise all the
00:38:05
morning directives animal which protocols that could independently verify the correctness of the competition
00:38:12
in just to determine yeah so one of the the protocols
00:38:14
that they're used is used also in this capability currency z. trash
00:38:19
it's called the keys knocks you can do work yeah
00:38:24
so um yeah i mean uh i don't have proper last because i'm i'm like whether they
00:38:31
had no i'll pop into bob chair when we investigate the yeah the cat problems like anybody
00:38:38
so when i was mad inoperative at the well if you want to provide approves or
00:38:43
even once maybe yeah just work with need is that for one community you could be like
00:38:48
uh it means they can still generates approves you know so well you can you can provide
00:38:54
who's the supreme supreme picnics in which we can provide pools we were made it with her
00:38:59
uh external company especially is is in a moment but does your mortgage protocols
00:39:05
so for example the problem that you saw gear up about program we can give you a room that convinces
00:39:12
but but but but brightens animal that should prove that
00:39:16
the function use minimise the objective function machine when does minimalist
00:39:21
so if i may interrupt yes please take this off line because a few more questions at the back at the picture had a question i can't thank you whereas
00:39:32
i think this may be going to the uh to the hard part uh but so do i infer
00:39:36
correctly that you you need to estimate the ranges of these uh or or a real number approximations that you're
00:39:43
encountering yes exactly so we have a um actually what one of
00:39:47
the input is not only necessary size also distribution of the data
00:39:52
so for certain types of uh additions we need to know that okay so the the the type systems
00:39:59
l. that or do you feel that kind of upset interpretation the medical domains or a better match yeah um
00:40:07
so i think i think honestly it's a big snakes of both again uh for us since um
00:40:12
so so we're using sort of a a cheated partial evaluation
00:40:16
to do this uh but sort of the calculus itself is um
00:40:20
uh so we haven't finalised this but i think something like i
00:40:23
mean you need you need a the actual sort of uh distributions there
00:40:28
uh maybe one uh of oneself on so that
00:40:32
gives us a problem from the future technique is sometimes
00:40:35
a statically estimating them using static a mathematical knowledge of distributions
00:40:39
may not be enough so you may need to do some complications at compile time
00:40:44
uh some random sampling about them to get you better uh sounds such as some simulation
00:40:50
yeah isn't yeah so i don't know if that answers the question thank you with it
00:40:57
could you please uh point some uh public results
00:41:00
uh that explain no they agreed with you you
00:41:05
uh so yeah definitely so there is a a the sort of the the main reference for
00:41:09
what uh we do is uh how the the advice again and it's yeah the flights but
00:41:14
it's called the best speeds protocol so high speed easy for the for office
00:41:19
uh that that that brought this and this is uh this is sort of the setting that we use which is
00:41:25
uh you know on this but curious uh with trusted the
00:41:30
um now that would be a good a good place to start another good place to start i would say is uh
00:41:36
uh check out so actually check out our people because we've posted about links and that is one of
00:41:41
the tutorial by uh morton down uh he goes into some of the details it's very uh very educated
00:41:48
yeah we have
00:41:51
so he see the the sorry the other language do we want to open source it
00:41:56
so that's uh um as a very uh adjusting question and sort of tough question because actually
00:42:04
we had this um yeah so yeah the the problem is uh so
00:42:08
i security model uh uh as of tolerate somethings but there are other attacks
00:42:14
so particularly if i i can uh send multiple uh sort of orchestrated values
00:42:21
to the secret computation says that i can recover uh the other person's data
00:42:26
and so this is a where something like a privacy budgets uh would be
00:42:32
nice so you wanna have a static analysis which tells you hey you've sort of
00:42:37
you you walk the competition you're doing with the really really data and we don't have this for now and so
00:42:42
uh since our cost of on our customers are fairly a half very sensitive
00:42:47
data we don't want them to shoot themselves in the foot just yes it an
00:42:51
example would be you just to know all the identity or even just transpose
00:42:55
the matrix and give me the result back or you can trivially we for that
00:42:58
so we don't have this concept yet but uh it's something which would be super exciting if we could do some research into that
00:43:06
and you know um so if there are some correctly though some trade off
00:43:12
between like your whole complex or masking ease and at home security security's versus
00:43:19
how much your computations good time but with the competitions and
00:43:23
we're going to take so how do you choose that a trail
00:43:28
he um i don't know actually the choice into your johnson you'd hardly a forty bit
00:43:33
yeah so so so so cute is actually paramount for yeah i mean i let me that are uh confirm that but the idea is
00:43:39
uh get customers are uh you know the needs as the actual some them
00:43:44
the security which in this case is the computational security i was mentioning earlier
00:43:49
so we need to do we need to make sure that all the algorithms and protocols um match this security requirement
00:43:55
that's the tradeoff music so so basically if you think about standard symmetry
00:44:00
conclusion now with these these standards for how many bits of security you want
00:44:05
uh in the uh what is the amount of work that you do not want to
00:44:09
be gusty what example a yes so people advice make certain your body to bits of security
00:44:16
so based on that these standards you couldn't find your when compiling parameters
00:44:22
that are going to guarantee security before the political problem and we have um i think
00:44:29
so it's wrong but it's to to the forty uh bits about
00:44:33
forty bits for online uh operation so interactive queries which is largely sufficient
00:44:38
for any uh data address that you could like if you were to take it and analysts that much more secure like a
00:44:44
yes uh one two to one twenty eight or something yes all
00:44:48
so you want to match the security of a us one twenty
00:44:52
okay does that actually a funny names of the our product called x. or
00:44:57
that many reasons one of them is x. or is uh from one time pad via information theoretically security
00:45:03
i mean it sort of austin used in many cryptic graphic uh operations on the question the back yeah
00:45:10
alright thanks uh i have a question regarding the g. d. p. r. because it is right to be forgotten
00:45:16
i enjoyed it they are and they're from the algorithm itself it looks like
00:45:22
one of the am police likely in first example exercises the right to be forgotten then i
00:45:28
need to re computed the various for everyone else and it feels like a very computation heidi
00:45:35
in case you have a lot of users how do you tackle that
00:45:39
um well i uh so my answer to this is that i'm not can't use cases uh
00:45:46
uh we don't uh so we don't have the kind you haven't faces issue directly
00:45:51
because most of i can't use cases are about as of one entity
00:45:54
managing multiple a distinct uh but sensitive entities right like a back for instance
00:46:00
it's all the data sources i belong to them uh they just a lot this allows them to sort of exchange things
00:46:07
but i do a user of the bank and i can request my data to be forgotten forever by the bank
00:46:12
so i don't want it to be used in the training of the models i don't want it to be used everywhere and it's my g. d. p. uh right
00:46:19
so at feels like in that case the bank needs will be native like may computing
00:46:23
a lot of things so that's why we need to make a very efficient i guess
00:46:28
yeah we we don't uh i just this problem iridium privacy preserving so i don't
00:46:33
know maybe this i i don't know the exact legal details of g. d. p. r.
00:46:36
but uh is it not it may be sufficient if you and and the
00:46:39
normal mice maybe user handles or anything uh before they get a good share
00:46:45
so what is important is that your bleep a member please the bank never leaves the source with this protocol
00:46:52
that's would be good big mackenzie estimated the this falls outside of the scope of g. d. p. or
00:47:00
or any other questions or are you hungry oh yeah one more question
00:47:05
the look and how does you a solution currently compares to the um well
00:47:12
they can go to the what the decision fall
00:47:15
a policy preserving the competition like uh what you
00:47:19
uh you can uh you know uh done so full figured he'd and comes in mind
00:47:26
sorry oh so helpless also motions for multiple
00:47:30
topics i compare store principle was not the question
00:47:35
um well it would be more on the uh commit
00:47:38
suicide because uh you'll you'll complete details to the uh
00:47:43
uh people like it and so for that and so full but the the yeah the good good
00:47:48
and the using tons of lucidity you can do a seat you a multiparty competition like
00:47:55
like the i see the questions also this is our miserable distinction between all mobile multiparty
00:48:01
competition and validated links all pretty rigid lump
00:48:04
is a protocol where we could all competition
00:48:09
here everybody in this room so everybody computes locally on advice
00:48:14
from sam's a local does all apple server that is due in
00:48:17
part to get the result will be completely should so i'll obviously
00:48:22
this is this is really the from them secure multiple to completion
00:48:25
well you know i'm going to people
00:48:30
your jeff just computer company couldn't example is distributed
00:48:36
yeah but the competition you're getting better as old yeah second chance so you don't have to result
00:48:43
two but probably probably the analyst so it's it's
00:48:47
different but but we're in a very different used pieces
00:48:51
and your situation where for example you want will compute between
00:48:56
action for cost controls them seeker multiparty competitions for for many years
00:49:01
mathematical functions as much more appropriate them them but they
00:49:05
did learning mobile exam posted invalidated learning obviously because these are
00:49:09
the warmth many many many buttons to do a collective competition are
00:49:14
you from from a security point of view it's it's it's very different

Share this talk: 


Conference Program

Welcome!
June 11, 2019 · 5:03 p.m.
1574 views
A Tour of Scala 3
Martin Odersky, Professor EPFL, Co-founder Lightbend
June 11, 2019 · 5:15 p.m.
8337 views
A story of unification: from Apache Spark to MLflow
Reynold Xin, Databricks
June 12, 2019 · 9:15 a.m.
1267 views
In Types We Trust
Bill Venners, Artima, Inc
June 12, 2019 · 10:15 a.m.
1569 views
Creating Native iOS and Android Apps in Scala without tears
Zahari Dichev, Bullet.io
June 12, 2019 · 10:16 a.m.
2231 views
Techniques for Teaching Scala
Noel Welsh, Inner Product and Underscore
June 12, 2019 · 10:17 a.m.
1295 views
Future-proofing Scala: the TASTY intermediate representation
Guillaume Martres, student at EPFL
June 12, 2019 · 10:18 a.m.
1156 views
Metals: rich code editing for Scala in VS Code, Vim, Emacs and beyond
Ólafur Páll Geirsson, Scala Center
June 12, 2019 · 11:15 a.m.
4695 views
Akka Streams to the Extreme
Heiko Seeberger, independent consultant
June 12, 2019 · 11:16 a.m.
1552 views
Scala First: Lessons from 3 student generations
Bjorn Regnell, Lund Univ., Sweden.
June 12, 2019 · 11:17 a.m.
577 views
Cellular Automata: How to become an artist with a few lines
Maciej Gorywoda, Wire, Berlin
June 12, 2019 · 11:18 a.m.
386 views
Why Netflix ❤'s Scala for Machine Learning
Jeremy Smith & Aish, Netflix
June 12, 2019 · 12:15 p.m.
5026 views
Massively Parallel Distributed Scala Compilation... And You!
Stu Hood, Twitter
June 12, 2019 · 12:16 p.m.
958 views
Polymorphism in Scala
Petra Bierleutgeb
June 12, 2019 · 12:17 p.m.
1113 views
sbt core concepts
Eugene Yokota, Scala Team at Lightbend
June 12, 2019 · 12:18 p.m.
1655 views
Double your performance: Scala's missing optimizing compiler
Li Haoyi, author Ammonite, Mill, FastParse, uPickle, and many more.
June 12, 2019 · 2:30 p.m.
837 views
Making Our Future Better
Viktor Klang, Lightbend
June 12, 2019 · 2:31 p.m.
1682 views
Testing in the postapocalyptic future
Daniel Westheide, INNOQ
June 12, 2019 · 2:32 p.m.
498 views
Context Buddy: the tool that knows your code better than you
Krzysztof Romanowski, sphere.it conference
June 12, 2019 · 2:33 p.m.
393 views
The Shape(less) of Type Class Derivation in Scala 3
Miles Sabin, Underscore Consulting
June 12, 2019 · 3:30 p.m.
2321 views
Refactor all the things!
Daniela Sfregola, organizer of the London Scala User Group meetup
June 12, 2019 · 3:31 p.m.
514 views
Integrating Developer Experiences - Build Server Protocol
Justin Kaeser, IntelliJ Scala
June 12, 2019 · 3:32 p.m.
551 views
Managing an Akka Cluster on Kubernetes
Markus Jura, MOIA
June 12, 2019 · 3:33 p.m.
735 views
Serverless Scala - Functions as SuperDuperMicroServices
Josh Suereth, Donna Malayeri & James Ward, Author of Scala In Depth; Google ; Google
June 12, 2019 · 4:45 p.m.
936 views
How are we going to migrate to Scala 3.0, aka Dotty?
Lukas Rytz, Lightbend
June 12, 2019 · 4:46 p.m.
709 views
Concurrent programming in 2019: Akka, Monix or ZIO?
Adam Warski, co-founders of SoftwareMill
June 12, 2019 · 4:47 p.m.
1974 views
ScalaJS and Typescript: an unlikely romance
Jeremy Hughes, Lightbend
June 12, 2019 · 4:48 p.m.
1377 views
Pure Functional Database Programming‚ without JDBC
Rob Norris
June 12, 2019 · 5:45 p.m.
6374 views
Why you need to be reviewing open source code
Gris Cuevas Zambrano & Holden Karau, Google Cloud;
June 12, 2019 · 5:46 p.m.
484 views
Develop seamless web services with Mu
Oli Makhasoeva, 47 Degrees
June 12, 2019 · 5:47 p.m.
785 views
Implementing the Scala 2.13 collections
Stefan Zeiger, Lightbend
June 12, 2019 · 5:48 p.m.
810 views
Introduction to day 2
June 13, 2019 · 9:10 a.m.
250 views
Sustaining open source digital infrastructure
Bogdan Vasilescu, Assistant Professor at Carnegie Mellon University's School of Computer Science, USA
June 13, 2019 · 9:16 a.m.
374 views
Building a Better Scala Community
Kelley Robinson, Developer Evangelist at Twilio
June 13, 2019 · 10:15 a.m.
245 views
Run Scala Faster with GraalVM on any Platform
Vojin Jovanovic, Oracle
June 13, 2019 · 10:16 a.m.
1340 views
ScalaClean - full program static analysis at scale
Rory Graves
June 13, 2019 · 10:17 a.m.
463 views
Flare & Lantern: Accelerators for Spark and Deep Learning
Tiark Rompf, Assistant Professor at Purdue University
June 13, 2019 · 10:18 a.m.
380 views
Metaprogramming in Dotty
Nicolas Stucki, Ph.D. student at LAMP
June 13, 2019 · 11:15 a.m.
1250 views
Fast, Simple Concurrency with Scala Native
Richard Whaling, data engineer based in Chicago
June 13, 2019 · 11:16 a.m.
624 views
Pick your number type with Spire
Denis Rosset, postdoctoral researcher at Perimeter Institute
June 13, 2019 · 11:17 a.m.
245 views
Scala.js and WebAssembly, a tale of the dangers of the sea
Sébastien Doeraene, Executive director of the Scala Center
June 13, 2019 · 11:18 a.m.
661 views
Performance tuning Twitter services with Graal and ML
Chris Thalinger, Twitter
June 13, 2019 · 12:15 p.m.
2003 views
Supporting the Scala Ecosystem: Stories from the Line
Justin Pihony, Lightbend
June 13, 2019 · 12:16 p.m.
163 views
Compiling to preserve our privacy
Manohar Jonnalagedda and Jakob Odersky, Inpher
June 13, 2019 · 12:17 p.m.
301 views
Building Scala with Bazel
Natan Silnitsky, wix.com
June 13, 2019 · 12:18 p.m.
565 views
244 views
Asynchronous streams in direct style with and without macros
Philipp Haller, KTH Royal Institute of Technology in Stockholm
June 13, 2019 · 3:45 p.m.
304 views
Interactive Computing with Jupyter and Almond
Sören Brunk, USU Software AG
June 13, 2019 · 3:46 p.m.
681 views
Scala best practices I wish someone'd told me about
Nicolas Rinaudo, CTO of Besedo
June 13, 2019 · 3:47 p.m.
2702 views
High performance Privacy By Design using Matryoshka & Spark
Wiem Zine El Abidine and Olivier Girardot, Scala Backend Developer at MOIA / co-founder of Lateral Thoughts
June 13, 2019 · 3:48 p.m.
753 views
Immutable Sequential Maps – Keeping order while hashed
Odd Möller
June 13, 2019 · 4:45 p.m.
276 views
All the fancy things flexible dependency management can do
Alexandre Archambault, engineer at the Scala Center
June 13, 2019 · 4:46 p.m.
389 views
ScalaWebTest - integration testing made easy
Dani Rey, Unic AG
June 13, 2019 · 4:47 p.m.
468 views
Mellite: An Integrated Development Environment for Sound
Hanns Holger Rutz, Institute of Electronic Music and Acoustics (IEM), Graz
June 13, 2019 · 4:48 p.m.
213 views
Closing panel
Panel
June 13, 2019 · 5:54 p.m.
400 views