Embed code
Note: this content has been automatically generated.
well thank you all for coming and it's five pleasure to and to
talk to about using formal methods and i'm going to concentrate on
uh how you get started with what formal methods and some of the things that um
that you can apply for so um it's only fair to start off by saying what what are formal methods
a formal methods are a mathematically based or logically based
approach to uh to writing software and by writing software i mean
writing specifications that say what the software supposed to do a writing the code that's actually going to
to execute on the machine and then trying to connect those two pieces in in some sort
of way that is that through verification to check that the code is really doing what what
the specifications are telling us that the uh what we intend for the software to do
so it's um it's also good to explain the various words i had on my
longs title slide so formal methods was one and young software engineers was another
so what are young software engineers today well they're probably
engineers software engineers were just starting college maybe
i'm maybe they've gotten started in college maybe there are few years um until they start college
how can these people be um be getting exposed to formal methods so that they can use it in fact
is as they're developing software um and in industry or for uh for any other purpose so um
um formal methods are mathematically based that is logically based or they are precise
in some ways but why is that something that we would like
well there are a fair number of of of considerations
in software uh production today and one is correctness
when you when you write the program you um you would like to to make sure that doesn't crash that
is perhaps one of the one of the first things that you would think about software uh and um
in the um but twenty years ago we started
seeing in the press affecting serve everyone's lives
various is a source of um security attacks uh there were typically at that point um
it came through buffer overruns a buffer over on is just
uh your computer accessing memory word word shouldn't be so these
are program errors that allow certain kinds of security attacks
so memory safety is something that we would like well memory safety is something
that you can obtain if you start using the language that um
that is type safe that does not allow you to access any part of of memory like for example
c. program but there are other other things that you would consider as well you have um
you have run time exceptions so if you run a java program for example
um the program may it may all the sudden and because you've uh you've attempted some operation that was not allowed
so the good thing with that is that as the program and
setup point you're not uh you're not inviting further security attacks
but uh but having such problems in your software means that your your customers
are going to be um a bit worried by by this behaviour
and perhaps customers go to different uh different platform different application
and stop using stop using your apps stop using your web sites uh
they don't trust you anymore they they want to do something different
but it's also costly to and to fix these things later now
if it's yes is a crash you can't fix them by um
by reporting by collecting various kinds of uh a crash reports
that various apps and um and the software programs would report
back to you so because of the the software deployment
that process today you can get these things out to customers rather quickly for example if you have
uh if your um if errors are on the on the website you can just update your software on
the website you don't need to update that millions of apps all um all of the world
but then there's one more step which is that of functional correctness so why would you want to go all
the way to functional correctness to make sure that that your program is correct well sometimes you don't
some programs it's okay if they don't if they don't always work
um and wise that well there's lots of priorities uh
you um you may want to release the product just as quickly as you can um you may want to um
um to spend less money because you want to see that you are these
the features that the customers would want to have in the first place
no that's um these are strategies that's some uh some groups
and software companies uh it may choose to to apply
so i would not expect in the future that all software is going to be verified
to be functionally correct but there are places where functional correct this really matters
no one place where you would like to go beyond just the
the simple crashed for unison programs is the area of security
and when you when you deal with security there is no such thing
as as having a program that is just partially secure that's
it's not going to be just a little bit secure it's either secure or does and uh if it if it
has folder abilities that attackers can uh can make use of
then your data um uh is is not safe so
putting a price at obtaining security is something that um
uh that you may want to do that that is if you um if you have a resource budget for
for your programmers for your software engineers putting effort into making sure
that things are secure uh is a good thing and um
so security is that is a really difficult thing even for experts and um if you have all of
the the literature in the last uh twenty thirty years you will found a a number of
uh problems in uh and encrypted graphic protocol sort of number breakage is
and uh so even with the experts these are difficult things
to do but if you're if you're an amateur just getting used to things then uh then you probably shouldn't attempt it
but if you do you see i'm paying a special attention to all of all areas of correctness is is really key
you know these are things that that you may see if you're
if you're going to develop programs in uh in and say i'm in a company setting
uh you have some products that you're that you're putting out there but what about
what about all the rest of us who are writing software in in various other ways that perhaps are not
as important we think well you still need you still need to think about getting a program correct
even if your program is not the highly the most highly critical software
that that you're producing you still need to have some mindset
uh some uh some techniques everything about your program that lets you figure out
is this program really going to work or not but it's it's it's not okay to just
had various statements the program and then hope that one day perhaps it is going to
be cracked you need to actually think about what these things are just like
other kinds of engineers would have various methods of calculating if we're
bridges gonna stand up and and so forth there are methodology
sinner techniques in software engineering that let you think about programs to make sure that they that they really are okay
and the good thing here is that there's plenty of tools support today um there wasn't a tool support
for these things say twenty twenty five years ago a lot of things have happened since then
and um so you can get a lot of help and that help can be their balls for
teaching people about how to how to reason about software and also using it in fact this
okay so i'm in the future the new software engineers the young software engineers of today
uh or a likely to see these sort of uh the first a crack this insecurity issues a lot more
um the um with data being stored uh even more so when in the cloud
uh you worry about security and making sure that that everything is um
uh is kept safe uh and um and the tools can help with that
so um what about the cost if you mean it would be really great to be able to have a ball that that says
using formal methods is going to make your software development keeper is going to let you
get there faster is going to let you compete better as well um then
many of us would have that feeling that we can get there but if you look at measurements i think
there's only one project that really has a lot of these measurements and jim uh mention this project project
it's the s. c. l. for project in them from australia
which build a um a a micro kernel and um
uh that colonel has um when it started it was about ten
thousand lines just under ten thousand lines of of c. code
which is not a lot of code it's something that you could you would think are sure i can write that in a weekend
but uh can you write it correctly the question and to begin with uh
they had about two hundred thousand lines of specifications and prove scripts
so that is about twenty times the the number of lines of executable
code well today that number is is even higher because because
this group has has had a lot of things so in fact i think that these numbers a little bit outdated even so
but the but you can see that there's a large ratio here of lots of
lines of specifications and prove steps to the number of of lines of code
so now you might say cheese this what i'm in for well that one thing is that we're all
learning how to do these sorts of things and uh they have had a particular tool set here
they have a dealt with a number of especially tricky problems but they've also done
measurements over these many years and i think that this group has done
that these measurements uh breast but it's actually no one else has and what they have shown that compared
to other less packed this is approaches that even with these numbers that you may think are
our um extremely um large of of a large gap between the ten thousand
in the five hundred thousand lines of of uh specs improves is that
compared to other that's packed this as they found that they're only somewhere between a factor
of two and three away from the cost of developing something similar you saying
uh other best back this is like a testing and and things like that and of course when
you write test suites those add up to a lot of lines of code as well so
if you believe these numbers are representative and if you believe that we can do better
than the the ratio of well as i'm showing here fifty fifty two to one
uh if we could do to improve that ratio only to by a factor of two or three
then maybe we couldn't set here and make the the argument that is always going to
be caught more cost effective for these sorts of projects to to apply formal methods
so you know the case it's showing that when you when you
strive for tools uh correctness when you want to make
sure that things really are correct and secure there are techniques that can help you get in in in that direction
okay so there are pieces of code that you would like
even today uh let's say you're writing a variation of
binary binary so a binary search and instead of i'm wanting to find
exact elements the exact index where something is you want to find
the place where you want to insert something if it isn't actually in the in the right that's like
so maybe have or a library routine for doing this or maybe you're gonna write it yourself your writings such
a routine yourself and you could imagine a lot of other routines that you would write of this work
there is no reason for you to to have any errors in this code at all
would you probably want to do is you're gonna test it a few times if you're good software engineer you
will write um a test suite that you that you incorporate with them with every building every every change
but you can do even better than that have not too much cost if you understand
what these methods are now i'll show you some of of that as well
and um the formal methods as i mentioned before just that
the reasoning process itself the thinking process that goes on
uh that when you use formal methods is um is going to help you be be a better software dinner
so well how is it on there are three areas would be may think about here um verification specification and development
i had these on in a different word or uh in that we're not defined formal methods on the previous slide them good putting them
in this war because i think that this is perhaps how most people think of of things they might think of verification first
well i have some piece of code i'd like to make sure that it's correct how can
i make sure it's correct well i'm i might wish for just the push button tool
another or in such areas where uh being able to run the pushbutton two
or maybe running it tool but some specification that some expert road
um and it one example of this is the uh the slam project that
took place um in fact in them a gym there's group and um
when norm well almost twenty years ago now at uh at microsoft and uh this is the kind of a pushbutton
technique where you could have a a tool that uh that tries to find yours in a program for you
but if you want to extend this to your all sorts of things outside a particular domain
chances are that you were going to have to interact with the with the verify or
and i claim that this is a really good thing because uh if we never interact with the verify or how are we going
to get any thinking help from the verify or we would like to the tool support that we have to help us think
and if we never interact with that if it just says yes or no when the and
uh then how are we going to to be able to to think better because of the tool
so um we like things to be as automatic it as possible but when there's a net
interactivity i think that is that is can be used as a really good thing
so that's a verification side of things but really there are many other parts of formal methods that that help you a lot too
specifications is one that some people would think of specifications as being
the thing that you do will formal methods and maybe never
connected with code because the specifications help you abstract
and help you model things and help you think about what is it that these that that my software is really supposed to do
and as we all know this sooner in the development process you
can detect errors and a fatal flaws of your your architecture
or what you're trying to do the the cheaper just to to change gears and do something that is going to work
so writing specifications is a really good thing when you when you write the
specification for let's say class or module in whatever language you have
if you can think of how to write that specification in a way that the that you expected
all callers will be able to read the statement of of what's going on and understand it
that probably means that you had exposed enough of the abstraction to your callers and that is a really good thing
um you might find when you're writing specifications that all they're probably some hidden states in here that i need to tell
all the clients of of the class or the model about those things can come out when you're writing specifications themselves
and then in the development um that is something that that a number
researchers have had as a goal that's formal methods are used
in development that you write specifications and somehow you automatically you push a button and you
synthesise the code or you um you do some hand holding with the um
hand guidance of the the tool and and eventually you get the software well
that's that's something that can work in some domains but as of
try to emphasise several times using that using the formal methods themselves the reasoning
techniques as the thinking tool is something that is going to help you
okay so i'd like to just mention this if you milestones um and that there there's so
many so i mean i'm just going to mention three little things and they will
a support little bit what i'm i'm what i'm talking about that are there so
many other um that i'm leaving else it's it's not like this is a
complete history of of things but um if you look in uh the early seventies
in fact things started really in the in the early sixties where um
we're um many people started thinking about the correctness of software and and um how can you
what sort of mathematical techniques could you have and so twenty four
published the program in nineteen seventy one there were some similar programs that were being written as well
and um these programs had specifications in them uh this is i'm using really small fancier
but the um um but the program that you published has invariance it has
preconditions it has supporting arguments for why this piece of code is correct that is
it has stuff in it that the compiler is not going to use
and that's a big step for for any software engineer to take the step of saying i'm going to supply things
uh in my program text that is more than is needed by the compiler that's a step
that you must take a before you um can read any of the other benefits
so now uh a lot of people dreamt about these things in the
in the nineteen seventies and the mechanical tools that existed at
that time were just not there to support the the full vision of of what people expected to be able to do
in the mid eighties another milestone is bertram myers uh that i
fall programming language so i've full balls use object around orientation
which was um uh was ramping up and um and also put
what he called a contractor design by contract into the language
no this is not the first language that had contracts in that but this is really a piece that i think a popular rice this
idea and and made it into the whole language into the whole language
designed to have things like p. impose conditions in the language
so they're not the foreign part that you're trying to add later but you have it there
as your writings saw for the first place and that makes a big difference um
then in the um in the mid to late nineties
um java started becoming a quite popular and if you look
at how java as a job as libraries were were
designed at that point how they were documented you could see that that there were hints of specifications in them
and there was a there was a mindset that you should try to get your programs to fail as quickly as possible
that is both with the run time checks that the language itself imposes but also in
the the specifications that you would uh that you would write down or imagined for
for the software that for the libraries you'd like to have things fail as
quickly as possible because in that case you could find errors more quickly
i'll show you another on on another slide an example of this and
around this time the java modelling language uh which here elevens
that pioneered and and designed um from the help of a lot of different people
uh became a specification language um for for java in particular and
that sort of subsumed uh a lot of other such efforts
and serve incorporated them into into gem elsevier was one standard way
to write a formal methods for for java like programs
okay so here's a um yourself um
a one page where part of page from the specification the java that you till vector um
class and it's the insert element aftermath that doesn't really matter so much
but it is the highlighted some things in in yellow here
in the text itself it says something about the parameter called index
and it says the index that must be somewhere between zero role and the size the current size of the vector
now that's something that you might write in a in a piece of documentation well this is something that you can also form license some sort of way
the way that this was done in java was that a job would check these um these
sorts of preconditions i call the preconditions they would call them parameter about validation is probably
and it's as as you can see here it's as if the index is less than zero or if the index is
greater than the size then this description of the methods tells you that you're going to get some exception back
now what i interpret that i interpreted that that as a precondition i should
not be doing this that is i should make sure that i
don't get this exception i should make sure that i that i pass in in an index that is um that is within range
so the good thing with this is that that the specifications were riding yeah were written down so that you could always read them and
see what they were uh even though they were for for parameter
validation which is just a small small part of of preconditions
but it also helped uh the young software engineers at that time
because as a software engineer when you look at these um as at this documentation this a. p. i. documentation
that that describes what interfaces doing you can um you can see the style in which it's written
and so other java code would also start doing these uh fail fast
uh kind of paths that is we would thrown exception if if
the parameter values were not what you expected and this is one of the things i expect to happen with formal methods as well
that is when you see formal methods being used if it is in place in such a
way that the young software engineer can see them for example in documentation when tool tips
or in some programming language that uh that becomes popular so that it there are there
used then uh that i think that the software engineers will pick these techniques up
i don't think that there's anything um that's stopping them for example that that it's too hard to think about
or it's just a matter of of habit so getting this the young software engineer into these things
uh is one thing that we can do by by placing them visibly um
in documentation and as i said tool tips other things in the languages
okay so let me go through um three things from formal methods preconditions pose conditions and
the variance and then i'll put these together just to show you the flavour of
of how i see these tools being used for uh for software during
so let me start would preconditions the precondition for a method where routine or procedure um
is is a condition that you expect to hold when that method is called
so now the question is alright so if it's expected to hold
when it's called who is supposed to establish this condition
for precondition is always the caller that is expected to establish the precondition
and then the inside implementation you could just assume that this condition holes
that's is but a precondition is and i think that preconditions are the most
important kind of specification that you have anywhere if we could teach our
computer science students um to come out with their degrees and they
all know of very well exactly with the precondition s.
and they can write down the precondition i think we would be in a much better situation with writing software than we are today
and that seems like a small thing um the having a precondition means that we divide up
the the responsibilities who's going to check these conditions to make sure they that they hold
if they're not mentioned in the preconditions then maybe it's something that implementation is supposed
to be checking it and uh having preconditions another specifications they hardened all
the interfaces which means uh that that we take on this sort of fail
fast mentality if you will that is we say what supposed to hold
that is when you run your test suites you you can check not
only that the final result of running your entire program is correct
but you can check that the execution all all all along the way was what you expected it to be
so what is my advice here well i'm my advice to a
young software engineer would be start with parameter to validation
if you start with parameter validation you can get a feel for what our preconditions and these are really important things
to getting the parameter validation in your program if it's um if you if you throw an exception that's one
way of doing it here i'm showing one line from the dot net a coat contracts which is um
the way that you in your programming language uh in this case c. sharp can just include a run time checks
for these are the sorts of things so this is one way for for the software here to get started
okay post conditions well they are sort of the the the dual of the preconditions
they tell you what is expected to hold at the end of a of
a procedure body a word at the end of the method body
and the the the um the dual also holes that it's the
implementation that is supposed to to establish these things and
that means that the caller can assume these things after the call and that's a that's a very nice thing
um if you want to go all the way to specify functional correct this you need post conditions
but if you're just getting started chances are that you're just gonna stick with preconditions for awhile preconditions are
or more useful to get started with everywhere um but if you want to go all the
way to functional correctness the post conditions are going to say a lot of different things
um there are also a lot of things in a p. ice if you if you look at them
that maybe just incidental that the people start relying on that is made it so happens that
the routine might in some cases return a negative value well maybe you detect that it looks like
it's always returning minus one and maybe in your code you start comparing with minus one
well that may be a bad idea because maybe maybe you can get the different negative number
in in other cases so having a specification that states what that contract
is between the cauldron call the it's a really good thing
so my advice here is that if you're young software engineer and
you start fighting pose conditions doled go crazy with them
what i mean by that is it's really easy to try to move up your ambitions to
try to bite post conditions that are very elaborate imagine having routine that is going to
put a dialogue on the screen well what is that was the pose commissioner that
well that you have something else on the screen that serve me make
some button that says okay and so forth and you could try
to write these things don't do that um try to rights much
smaller pose conditions and ride larger ones as the need arises
um that's going to keep you say is also going to keep you
locked into a good engineering position which is the um finding
divide trade off between um between the routers that you're getting on your investment on your last time
okay so the third thing is invariance so invariance um our conditions that hold always
well they don't always all day the whole at important times
for example maybe if you're in an object oriented setting they hold whatever no one is operating in
on the object that is when the whole method is executing the object at that point stable
uh if you're in um concurrent setting and you you use something like monitors
yeah that is locked that protects some data that the monitor invariant is
something that holds when no one is operating on that data
so um so the very into something that holds always or at least at certain points under
many different kinds of invariance um system invariance a
monitor variance loop invariance data structure variance
and these invariance are mainly for the implement or um that is when you
implement the data structure you won't have it clear in your head
what you what you want things to uh to be to be um for the representation of what you're trying to do
so what is my advice here well i'm at mit uh and
i think this came out of the the clue days
um the with the the clue programming language a barbell this coven don't go to again and many others um
there was um you would write a rat okay method so replicate method is something
that's just want to check that your data structure isn't a good shape
that is that the representation is in an okay state and then if you write
such a method you can then call this method anytime that you like
and this is one way to checked invariance you're writing down invariant essentially in your
replicate method and then you're calling it when you wanted to to hold
um so if you're trying to learn to do to do verification using invariance then there's a concept called
an interactive invariance which means that invariant needs to say enough things so that you can prove six
and this is similar to if you have a if you're doing mathematics and you want an induction hypothesis sometimes we know that
you need to strengthen that induction hypothesis to be able to prove what what you
want to prove so to learn that there i would i would recommend
uh starting off polluted areas because they will show you that they've just sort of things that you need to know about uh verifying
things with the variance okay so what i want to do now is i'm going to show you these things at work
and so we'll do an example and um as i was riding through the uh the swiss landscape
yes ray by train uh i was thinking ah let's do the saddle back search
example so um so um this is going to be one example but i can show you
some preconditions post conditions and any variance and um let me explain what the problem is
so in saddle back search you have a you're given a two dimensional array
uh something like this and that we're going to search for an element somewhere in this right that's why it's in that buys the search
but it's called the saddle back because we're given a some properties about this right
we're given that the values if you go up the uh um if you take a
particular point in the x. axis and you go up the the y. axes
things and roll monotonically that is they they keep on growing as you as you go up
it's also the case that that the holes in the other direction so we're given that the over the matrix
the rated two dimensional array that were given has these properties that values role when these these directions
so what does it mean now to find an element so we're gonna
find an element as somewhere in the uh uh in this matrix
so the typical thing would be that you would start at let's say the origin
there we have it and you would you would search for the element and you could
imagine that the search would go something like that then ooh we found that um
so uh that's something that you that you might think of operationally but what does this
mean in terms of invariant what is the invariant of such such a thing
well it is that we have some area of the matrix that we have still to explore
that is as we go on their certain parts of the matrix that we um that we don't need to look at anymore
so let's say that we are given that the element upper looking for it does exist
somewhere in the in the matrix in that case uh and show in animation here
the as we do the search were blocking off certain parts of the
of the matrix that we no longer need to look at
so the invariant in this case is that the white area is
the button white area contains the element that we're looking for that is the marriott
so if we want to move the yellow circle that is the current
position if we want to move that somewhere uh in this
way that the way i did it if you want to move it to the right then you may get better make sure
that the elements we're looking for is nowhere to the left of what we're removing the point too
that is unlikely
density p. yet so that's alright so well it turns out that if you're going to do saddle back search there's
a moment of creativity that you that you need and i'm going to show you that moment now and then
after that moment of creativity everything is actually just simple um at the moment of creative give it that you
need to what you need to realise is that the way that i showed it on the slide
it's it's not a good technique for getting there um would you like to do is the following
uh you realise that this model tenacity well it's sort of like the
them the money in the city the values flow up words
in this direction and what does this remind you of
in switzerland well of course uh the beautiful landscape and hills and so forth
and uh so here i'm trying to depict that these arrows go up in the direction of the elevations as you move
around in this matrix um we're getting higher and higher if
you go uh up in the extraction worldwide direction
alright so uh well now what well now we can think of the contour
lines and the contour lines of this landscape look something like this
and it turns out that now if we want to find something in this
array one way to do it is that we start not that the
origin but we start at one of the diagonals we can start in
the upper left hand corner or in the lower right hand corner
um so um so if we do that then we can move this point
around and we have the same kind of invariant as we did
before in that were blocking things off to um that were saying
in those blue blue reaches we no longer have to log
and this is uh this next to program a bold doable at all and it also makes for nice invariant
okay so i hope that by this um animation that i've shown you what invariant is supposed to be so
now all we have to do is is right the code like so um we will do that
so i will write the code in um in a language called daphne so daphne was uh
does is the programming language that is designed for reasoning so
it was designed to make it possible for you to
uh to balls incorporate specifications into into the programs but also have
the verify around uh as you as you go along
alright so i'm going to write the um uh sell back search
and uh let's evaluate how is a um a um
how's that font sizes the museum and that's alright
okay so all of the um the method is um takes and takes in an array
that we had discussed and it's going to return to sex is going to return
uh the place the the coordinate m. and where we're going to find that that's this flag
uh all yes and of course we need also to know what what is that that's with flat what is
the key that we're looking for supper going want to search for key in that didn't national right
alright so oh well i said preconditions are the most important thing but actually
let me for the demonstration here let me write the post commission first
so you introduce suppose conditioned by the ensures keywords i hear i'm saying here comes the post condition
and the post condition is going to be more or less that if you look at
the at the re at that visit position you're going to see the key
so here i'm saying you to rely on after calling this routine that the admin and
that there are returned that's how parameters at that location you can find the key
well here we get it get some complaints already from from the verify or so
the verify as running in the background as some typing in this program
and it's complaining that the in the c. c. r. m. and may not be in the c. c. into data right
so i think that that nice strings and this by saying um that these are in
the um the indices into that the right and um it's alright um thanks
about that alright and i can see that that the number of fires happy
we're saying that okay this seems like an okay uh pose condition
but of course if we're gonna be able to ensure dispose condition there had better be such an element in the first place
so we introduce a precondition which we do with the requires keyword samples say that there exists a position i ga
in the re such that well essentially is that these um i'm a things all
and i i i this is for someone doing this example and from the crowd away so um these um
make tensions you can find my errors probably introduce some alright so that's the precondition well that's one of the preconditions
remember that we're also saying that this is only gonna work if we
have those monitor missy properties so we need some additional preconditions
that you can see that the preconditions i'm writing in this example they have on the fires uh like
this that that's probably not the first example that you would that you would show your um
you're young software engineers but i trust that that i can charities and that you're okay but that
so here i'm going to say well if we have uh some
cornet in the um uh in that direction and we have a
j. and ha prime in the other direction there to indices
and then uh it is the case that a. i. j.
is less or equal to a high j. prime okay
and we also are going to need it for the other directions let me write up on as well
so we haven't high and then i prime and um
if we see the seas or in this order
i owe much to to type but it all be worthwhile shirley alright okay i like that
that leaves something out this is going to allow us to write subtle bikes or snow
but we think that that
you know yes good uh_huh yes alright one week response in the back ah
alright so what are you not understanding about the specification should be um
so they're preconditions impose conditions preconditions are introduced by the keyword require suppose conditions by the keyword
ensures i'm trying to write down here what we want saddle back search to do
so writing down the spec is really important we need to get this right
we don't get the spec right mom we're not gonna get the program right that's going to do what we want
right so okay i did um i'm using in this specification
open fires which are more advanced and you um
they use mathematics and logic but the but i'm also writing a post condition which
is just a condition that you can write in in any language alright
okay let's let's plough ahead with guns with the um
writing some code and see where where we go okay so um
someone to write some some code and remember we're going to start here in
the in the upper left hand corner so i'm going to start m.
uh at zero and i'm going to start an ad well here we have this a decision to make
should we start and just one about the um just one about the or the the matrix
or should we always make sure that and and is an index into the matrix
well this is a a design decision that you also would make typically if you start from origin you would do it a
little bit differently but here i'm going to make sure that it's you did it all these years and i'm an index
so we're going to have a loop and and do is going to continue as long as the thing that we have is not the key
it's not clear that if this loop is ever going to terminate it would terminate when this condition doesn't hold
in other words it would terminate when we have found the key so if we can ever get there then this is going to do good thing
alright so um what sort of um invariance would be right for the loop
so a loop invariant is a condition that holds on every
iteration just before the loop guard is is evaluated
so we've said here already that m. we would like em to always be
uh in index into the right and would like and to always be an index into the right as well
okay so we're writing that down as a design decision that is worth thinking
these things never had that and then should always be in the seas
so if you writing this programming and you didn't think clearly about this you may very easily
step outside the bounds or do things in a program that's not gonna be quite right
okay so now we have another part of the invariant which
is remember those blue shaded regions in my my diagram
well they are essentially saying that we have a variation of of this
precondition so i'm i'm a copy that precondition down to to here
someone to say that there exist an element that is equal to the key that i
don't want to say that it exists somewhere in the in the whole ray
because we want to to be able to shrink that area to have a smaller area as we
go along with more loop iterations so um okay who can tell me how to change this
um hum this loop invariant to be what we want to be the blue shaded regions from from the diagrams
uh_huh uh_huh
thank you very much so replace zero by am
so what we're saying here is that here we saying that
there exists an index somewhere between am and the um
and the and the the x. x. axis such that this holes great
okay and what we say about the um the other direction
um one subject oh
ah yes ah thank you so the suggestion is let's replace eight out um
uh the things by an and actually remember that we're we're going to um
yeah actually what we can see here this seems like a good suggesting 'cause that's going in that direction
here are getting in an error message that that complains about loop invariant violation
well but this is saying is that uh that this cannot be established initially and it is because
and usually usually we have we always use half open interval is but in this case is initialised and to be an
index so actually we want all jason are in uh that
or uh that can be as large as an itself
okay so this is not our goal we're gonna maintain this condition with every loop iteration
um alright well um how are we gonna do that to her writing the body of the loop now so and
now i'm getting a complaint that he can prove termination well of course i'm not doing
anything in them so we're not certainly this loop is not going to terminate
but it also reminds us of what what is the what is
what is going to make this terminate very often the tool would figure out that itself
but in this case it's little bit trickier so i'll i'll have to supply it
so here i'm going to supply a termination metric something that is going to decrease with every step through
and of course if you remember that diagram the diagram shrunk up the little white region
um each time so i can say here something that is going
to decrease always is what remains in the x. axis
plus what remains in the y. axis because why goes in the um uh goes
down in this case okay so now we're going to write the code
so how do we do that well we're certainly going to compare that current element with it um
with the with the key so it could be that it's less than the
key um or it could be that it's greater than the key
so if it is less than the key well then we would like to do
something that is going to eat increase the value of a and and
so we have two things to do we could either fiddle with am where we can fiddle with an
i remember an is going in the direction from zero upwards and and the scoring from the last
index downwards so which one of these to do we want to change in this case
uh_huh and ran
yeah okay alright so am goes upwards and uh well certainly and the other one must
be an right uh and there it is that is the correctness of the program
you know if you're writing this program and you didn't have this tool
uh let me guess what you might do you perhaps you didn't write down the specification to begin with
maybe you didn't explain carefully in the in the documentation that you needed these
two preconditions for saddle back search to to work in the first place
yeah so in particular you bought a precondition says that we're assuming that the element exists
we would have written a slightly different program if we didn't assume that the that the element existed in the first place
so that's one good thing but then once once we wrote down those loot
invariance it was and uh in this case the termination metric as well
it was clear what what operations that we had a available to us were we could increment m. or we could recommend and
now it's not always the case that that just putting those in like this would would
know all magically lead to the right program but in this case it does
but if you were true that you're doing it right now you're thinking in your head is
is really like it i think in some cases where this is not really right
well that that's the sort of thing that we do but if you know how to reason in
terms of invariance you reestablished invariant with every loop iteration and we prove that the loop terminates
and from the loop invariant uh and the mediation regards fall as opposed to this we want
so this program is now known to be correct and it's been proved to be correct
so my point here is that i think that the source of tools are the
ones that uh that young software engineers as they are learning to program
a can can make use of um and again this was more complicated than
the many of the introductory programs because i was using one fires
okay so um that made the move on will but that and but let me re emphasise the point again remember i said the
there are some routines that you're right that there's just no reason to making
a mistake but and the this i would say it's one of those
okay so let's see so in them the ten minutes i have left i'm going to say a few things about security um so
um i've talked about correctness so far would preconditions post conditions and invariance
and you also want to make sure that your programs are secure depending what sorts of programs or writing
and that this goes beyond the functional correctness in the following way that
if we're just writing a program ourselves we are our own any that is um
it's just we're trying to protect ourselves from making a mistake in the program
but when you deal with security you also have to me to to checks that no
add the series no attacker can somehow intercept what you what the program is doing
for example by looking at the messages that you're setting across the the network
and then try to do bad things with the program so
that goes beyond the the source of um a simple
a fungal correctness so i'm going to give you one little example um
and the these examples always have an allison a ball and so this does
too so the ideas alice is going to send the message to bob
and uh that's easy enough to do but now if someone reads what's going on on that on that
network you would immediately get am so we want to encode am in some sort of way
well we could write some function of them and send that across um that seems
good now we're not sending am so this uh perhaps has some some problems
and um but of course not all such functions are good ideas
for example if we just decide to um increment m. by one and thinking that that that's um
a a secret way of uh of getting the message to the other side we're probably mistaken
so what it comes down to is that with like
the whatever goes across the wire to look as if we're just sending something random
that is with like an uh an eavesdrop or to be able to
it should not be able to distinguish the actual execution from
uh from something that sense just a random number across that's uh essentially
it means um are part of it in a nutshell for you
um so uh in this case but we can't prove that these two programs are um are indistinguishable in eavesdrop
or would be able to distinguish these two programs but that's the that's that's the idea so um
we can we can do such security reasoning uh using program reasoning so if you know program
a program logic send you understand what the semantics of programs you can do this kind of reasoning
uh it's otherwise uh it's off and on for example in game based techniques or
what their number techniques i'm not going to go through them here but
if you want to do this would program reasoning that essentially what you're doing is you're
riding the specification and the specification is itself a program and it is one that
that you you would be happy with if that's what eavesdrop or looks at so for example
something that really sensor raw true random number across the wire you would be happy but
and then you implement uh your program and then you prove that you are indistinguishable
and there are various footnotes with indistinguishable you want to say that there
pull the normal time um indistinguishable that is that that if you give
the attacker only him resources computational resources that are a little mill
in the size of the problem that they would be able they will not be able to to figure out what's going on
i'll skip over those of footnotes so uh we build the
tool uh called quibble of that text equivalents that were
checks the in distinguish ability of programs and uh it has to back ends and both are basing formal methods
uh it uses the rose that back and and it uses rose
that as a as a down the model checkers actually
so that means that uh when uh when you feed a oh yeah i'm going to say
the inputs to quit like it's not just the program but it's two programs and it's a whole um
it's a proof that you think it's a proof for why uh why your program is correct
and we have like an checked that in two different ways can check it with words that were with actually
so what was at rose that is working like a bounded model checker which means that it's going to check this
uh this equivalence up to certain bounce then that means that it might find errors in that case was that
usually find some pretty quickly or it may not find any errors and done well maybe it is correct
and uh once you uh you get enough made easy uh you're you have enough
confidence that you might want to spend the effort trying to really prove it
so then you can use daphne in this case daphne is really use
just as a logical foundation as a proof um as a um
prevention for it and it can then with your help uh it gets too yes answer if you're
protocol really is correct so things look a little bit like this in the in the quit long notation
that you have the um the ideal program in this case the left uh where
you can see that what is being sent this just a a random number
and then on the on the right i'm using the uh the well
in this case faulty program that's going to send em plus one
and you can see in the query language that you can set up an adversary as and the adversaries are uh
uh you um you tell the pro you you make it part of
your program where an adversary can them um can be part of
um of things in this case we're saying that in the network itself
is the adversary so um anything can happen at some point
uh alright so the way that you prove indistinguishable lydia programs in gridlock is that
you start with the ideal program and you do program transformations on it
and it's these program transformations art that are checked so this is
a this is not an um as automatic process as
it was to to write settle back surgeon in that i think you have to work a lot more here um
and the i'm giving you some examples on the slide as to what these
transformations are for example you might re factor these objects in some ways
you might have some assumptions lying around that you can appeal to
and you might um establish some some data invariance which are
really just like the variance are talked about and you might make use of those somewhere in the in these proves developments
so that is essentially what what a quick like those so you can use
formal methods uh not only for functional correctness but also going
beyond that for for security and that's really important
so let me conclude so at the heart of formal methods are specifications
we write specifications for what we want one things to do
um some people stop it just riding specifications in the model them in some way where they
check to specifications for error and then they think of the code writing process as
being a separate process that's still making you some some formal methods but you might also
try to connect it to get the code to be formally verified mechanically verified
to um to live up to the specifications as we saw in the in the daffy them all
but in any case it helps you with that reasoning because when you write down
the the loop invariance when you write down the various preconditions you're thinking
in terms of what is it they're really need here what should all that which points in the program and that makes you better software engineer
so um my advice to the the young software engineers is
trial quantities um programmer fires uh for example daffy has a
tutorial you can try online or a install it
on your own machine is much more fun 'cause then it's uh it runs um as your um
at your typing in the program which really changes your mind as to i mean what what these things really can do for you what you can
you get disappointed when your c. p. u. doesn't does not do this for you and in other i. d.s afterwards
then if your teacher i would suggest uh might vices incorporate these sorts of things that is when
you think about algorithms when you think about programming and data structures uh loops think about invariance
that point output the preconditions are we need preconditions we need to think in terms of
these things to to get our programs to work and um that having that
uh that common framework for it uh and setting that early i think would be a really great thing
so um for the future software engineering i think that we have a have a great
future and uh so i'd like to thank you for attention and program safely
i think you know it
her daughter will have time for some questions uh
but the high tech for a ride a bus are towards up here
mike's or order yeah whatever once i guess
the extra stuff for a for a wonderful talk i teach software engineering undergrad software
engineering and actually actually start using daphne no perfect them so this last right
um so i have two questions uh one is that you are good at an interactive for
him was to wonder just yes no uh is more useful because of shape thinking
for south exit or nice but what about other kind of cold like stuff that's more t.
v. connected to each of our like you're writing a a file system are out
you know where you end up having to really spend a lot of time on week
let's think about a lot of the calls conditions so things become very complex and does it
become distracting especially when you have or some fireworks like it uses a particular you
know first order logic inside why did you kind of forget what you're gonna write interface
e. you can get lost in the in using the tool and the tools
uh even though they they're really very good these days they they still have a
lot of wrinkles so um um but the the um i think that one
one difference between the those are the pushbutton and the then the interactive then i should say i should make it clear that
if we could get more things to be automatic uh of course i'm all for for that
i mean there's no reason to to add more work onto the already overloaded software engineer
but i i just uh want to argue that the interactivity that is there when it is
needed is not is not an overall that thing so when you write something like um
if you're trying to write specifications for something that someone else already wrote
then trying to in for them and having as much automation as possible is
is a really good thing 'cause then there is some very tricky to
figure out with these things are afterwards but if it's cool that you're right then i think that interactivity is is a really nice thing
and if you have something like a file system that these things do get
complicated and then that's one formal methods can help me even more
because you you abstract and you try to simplify and you write down uh the things that that matter
um but but yes or wrinkles in the tools to thank you and if i may ask a follow up question huh
so that they're what to show that's is in some sense stacked even below right the spec yep right the coding
mind of task depending on where you write the test and then the call this size but the tests yeah
um and that is my understanding that the really take off a
while impact is because software engineers under time pressure and
like what s. dive right into writing the code uh_huh and you know if they have to spend
right the time reading task before handing the like yeah so that's that's kind of the human angle
this is something you already know how has that shaped you're thinking about
proposing spectral and development and are there ways to mitigate that every human desire to
your deadline is yeah um when i was a young software a software engineer
a i a lot about these techniques i thought they were really great and i
thought all wonderful now i can do all my development as this way
that i think is a mistake yeah that is um but it's it's
going to take us a long time before all software engineering is
is that is done this way if for no other reason that
that correctness they're correct this is a spectrum um and um
i think that the way to to approach that is that um that
there's certain routines that you learn are that you want you
know these sorts of techniques they're certain routines that you think in your had this one i just don't wanna make any mistakes
and then you apply these techniques and the other cases where uh let's say that you write lot of java coder lots
of 'em delegates being passed around here and there uh i don't know how to write to specs um they're
testing is probably more effective there uh to uh to figure things out so finding those engineering trade offs is something
that that um i mean that we all need to learn as a community but also as a software dinners
thank you thank you or style and here
right that'd huh thank you yes or it was a worry in his buying the
example that you gave me one thing i i thought about when you say that the the
you you could environmental does god terminates all the calls conditions on the whole uh_huh
i was thinking impact is uh it's not you know that the whole will terminate eventually
yeah you did you to terminate it in that in some reasonable time so
ah what about the like a simple to complex in general in
about the fourth performance reasoning uh in the state of the art it
and you do some things like like you did on the screen
for performance that's right yeah thank you for that price in the so
even when you um it for daphne in particular daphne checks that that your
program is correct and that it will eventually terminate it will not check
uh at that the uh the time in which it terminates is something that
you have to spend a that is you don't you may not
the light this on may not be shining anymore when that when your program terminates uh it also
does not check that you have the space requirements needed those are the things that it
doesn't jack so with with daphne if you try to do something like that what you would
do is you would set up something that you that would let let you count
the number of loop iterations which in this case you can because them the decreases gloss was a it was an image or
so you can just compute that what it is a up front which is the the some of the two dimensions
and uh so in that case you can you can find of performance down but in
many other cases you may um that maybe much trickier to uh to compute
so in principle you could do it with a bike instrument thing a a daffy
program but there are also other tools that that would do it for you
and those things are certain important that also for like real time programs that thank you yep x.
i also the question yes thanks a lot for the interesting talk um i have a question about the uh really sure of
um basically the lines of code in quotes uh between specifications and uh
actual a code that you are specifying the specifications because you mentioned
uh one example where you have ten thousand lines of code uh and that is
basically specified by five hundred yeah i wasn't i think so yeah uh yeah
lines of specs yeah staggering numbers yeah and and
you doesn't example was also and not that's that's started difference but uh
also you had a lot more characters specifying than coding now um
at some point don't you get a a problem that you have too much of box
much more bucks in your spec then your program yep so in the what when you do the development
it very often happens that uh let's say for you leave something out of the the the specification
but you realise that then at some point during the development work when you're trying to to reason about the loop
that you either forget some conjoined or that you write them incorrectly for example it was very easy to have written
yeah for the j. if you remember that to say let's strictly less than a ducklings
one that rather than less or equal to which is what this example needs
so many of those things happen and you correct as many things in the specs as you would in in the programs
um that s. for the ratio i i think that the uh daphne has has seen a
lot better ratios then uh then uh many other tools um there been some some measurements
especially in the uh in the ironclad unearthly projects that use a daffy
for for system a system code uh that was the um
um so there the well the the overhead there was about five soaps for
between four and six lines of overhead for every line of executable code
so i mean that's that seems a lot better but but even as i said you know
with this staggering five hundred thousand lines for ten thousand executable i mean it just seems
seems i'm so out of proportion but even with those numbers uh that were measured as i said that the
the additional cost was just two or three times which means that that in
the best track this is you spend a lot of time trying to
debug things and um and figure out what the errors are and and
things like that um but yeah that's specs can be unwieldy to
so so the basically based on the on those findings box in the spec are easier to find
and the bugs in the code no i'm an hour they i are easier i'm not sure uh
um i have written pieces of code that where the code has happened to be correct
but the this there's a some error in this back and i'm tearing my hair
out trying to figure out what's going wrong and then i realise all
well i forgot to say something so i'm not sure if it's easier in the in the program or not i'm not you you could also try
checking these at run time which is a which is a fine idea that when you get to more complicated data structures
um with um let's say a recursive data structures or something that uses these qualifiers then
you might take quite a bit of a performance it to to do it but
if you if you just check for for parameter validation things at run time that's a
that's a really good idea to uh for checking these things and it's cheap
uh unfortunately we're a little bit uh behind the schedule as often happens with software engineering informal methods that
well i think in the interest of time or like to think i our first speaker today

Share this talk: 


Conference program

Welcome address
Andreas Mortensen, Vice President for Research, EPFL
7 June 2018 · 9:49 a.m.
Introduction
Jim Larus, Dean of IC School, EPFL
7 June 2018 · 10 a.m.
The Young Software Engineer’s Guide to Using Formal Methods
K. Rustan M. Leino, Amazon
7 June 2018 · 10:16 a.m.
Safely Disrupting Computer Networks with Software
Katerina Argyraki, EPFL
7 June 2018 · 11:25 a.m.
Short IC Research Presentation 2: Gamified Rehabilitation with Tangible Robots
Arzu Guneysu Ozgur, EPFL (CHILI)
7 June 2018 · 12:15 p.m.
Short IC Research Presentation 3: kickoff.ai
Lucas Maystre, Victor Kristof, EPFL (LCA)
7 June 2018 · 12:19 p.m.
Short IC Research Presentation 5: CleanM
Stella Giannakopoulo, EPFL (DIAS)
7 June 2018 · 12:25 p.m.
Short IC Research Presentation 6: Understanding Cities through Data
Eleni Tzirita Zacharatou, EPFL (DIAS)
7 June 2018 · 12:27 p.m.
Short IC Research Presentation 7: Datagrowth and application trends
Matthias Olma, EPFL (DIAS)
7 June 2018 · 12:31 p.m.
Short IC Research Presentation 8: Point Cloud, a new source of knowledge
Mirjana Pavlovic, EPFL (DIAS)
7 June 2018 · 12:34 p.m.
Short IC Research Presentation 9: To Click or not to Click?
Eleni Tzirita Zacharatou, EPFL (DIAS)
7 June 2018 · 12:37 p.m.
20s pitch 1: Cost and Energy Efficient Data Management
Utku Sirin, (DIAS)
7 June 2018 · 2:20 p.m.
20s pitch 2: Gamification of Rehabilitation
Arzu Guneysu Ozgur, EPFL (CHILI)
7 June 2018 · 2:21 p.m.
20s pitch 4: Neural Network Guided Expression Transformation
Romain Edelmann, EPFL (LARA)
7 June 2018 · 2:21 p.m.
20s pitch 5: Unified, High Performance Data Cleaning
Stella Giannakopoulo, EPFL (DIAS)
7 June 2018 · 2:21 p.m.
20s pitch 6: Interactive Exploration of Urban Data with GPUs
Eleni Tzirita Zacharatou, EPFL (DIAS)
7 June 2018 · 2:22 p.m.
20s pitch 7: Interactive Data Exploration
Matthias Olma, EPFL (DIAS)
7 June 2018 · 2:22 p.m.
20s pitch 8: Efficient Point Cloud Processing
Mirjana Pavlovic, EPFL (DIAS)
7 June 2018 · 2:23 p.m.
20s pitch 9: To Click or not to Click?
Eleni Tzirita Zacharatou, EPFL (DIAS)
7 June 2018 · 2:24 p.m.
20s pitch 10: RaaSS Reliability as a Software Service
Maaz Mohiuddlin, LCA2, IC-EPFL
7 June 2018 · 2:24 p.m.
20s pitch 11: Adversarial Machine Learning in Byzantium
El Mahdi El Mhamdi, EPFL (LPD)
7 June 2018 · 2:24 p.m.
Machine Learning: Alchemy for the Modern Computer Scientist
Erik Meijer, Facebook
7 June 2018 · 2:29 p.m.

Recommended talks

Pitch Headswap
James Murray
27 April 2017 · 2:06 p.m.