[00:00.000 --> 00:18.520] The next presentation will be by Fabien Chouteau. [00:18.520 --> 00:25.640] He will give an introduction to formal verification and will teach us how to mathematically prove the absence of [00:25.640 --> 00:36.200] this error box in your software. Thank you. I need the timer because I have quite a lot [00:36.200 --> 00:42.960] of things to say. Hi everyone, I'm Fabien, and today I want to talk about formal verification, [00:42.960 --> 00:49.760] open source formal verification. First, a disclaimer: I'm not an expert in formal verification, [00:49.760 --> 00:55.680] but I'm a user of this technology. What I'm an expert at is embedded software, but I [00:55.680 --> 01:00.560] use formal verification, and I will explain later our work at a company that's developing [01:00.560 --> 01:08.520] formal verification solutions. If we look at what Wikipedia says, formal verification [01:08.520 --> 01:14.760] is the act of proving or disproving the correctness of intended algorithms using formal methods [01:14.760 --> 01:24.880] of mathematics. So in practice, what does that mean? Let's take a very trivial example. [01:24.880 --> 01:30.760] If we look at this line of code and we want to prove that it's correct, that it never [01:30.760 --> 01:36.440] fails, first we have to look at what can go wrong. So for instance here we can say, [01:36.440 --> 01:42.640] well, there's potentially a division by zero, right? That's bad. So if we want to prove [01:42.640 --> 01:51.360] that the line is correct, we have to prove that x minus 10 is different from zero, which [01:51.360 --> 02:00.880] in turn means we have to prove that x is different from 10. This is known as a verification [02:00.880 --> 02:10.040] condition: something that has to be true for our program to be correct. Now let's look [02:10.040 --> 02:16.800] at this line of code in a context, just a trivial example. Here we see that there is [02:16.800 --> 02:22.720] an if statement guarding the expression. So we know that x is always different from [02:22.720 --> 02:29.160] 10, and therefore we know that there's no possible division by zero in this piece of [02:29.160 --> 02:37.840] code. So that was easy, right? But now let's look at another very trivial piece of code. [02:37.840 --> 02:42.280] Are you able to spot all the verification conditions? Are you able to check whether they [02:42.280 --> 02:51.880] are respected or not? This is actually very, very difficult. Most programmers will know [02:51.880 --> 02:58.040] some of the things that can go wrong, but most of the time we will forget what they are. [02:58.040 --> 03:05.720] I'm looking at you, integer overflow. And that's why programming correctly is very difficult [03:05.720 --> 03:11.120] for human beings: we are not able to keep in mind all the verification conditions, [03:11.120 --> 03:15.560] play with them, and check them all the time. Again, this is a very simple piece of [03:15.560 --> 03:24.480] code. And so with formal verification, and in particular automatic formal verification, [03:24.480 --> 03:30.320] the goal is to have tools that will extract the verification conditions from the code [03:30.320 --> 03:38.160] and then run a mathematical proof to check that they are respected. And so today I want [03:38.160 --> 03:47.360] to talk about one framework for automatic formal verification, which is called SPARK. [03:47.360 --> 03:57.000] So SPARK is both a set of tools and a language to perform automatic formal verification.
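[Editor's note: the slide code isn't included in the transcript. Below is a minimal SPARK sketch of the kind of line being discussed; the names, the constant 100, and the use of Natural are illustrative assumptions, not the speaker's slide.]

```ada
--  A hypothetical reconstruction of the slide example.
procedure Compute (X : in Natural; Y : out Integer)
  with SPARK_Mode
is
begin
   --  Verification condition: X - 10 /= 0, i.e. X /= 10.
   --  GNATprove cannot discharge it here, since X may be any
   --  value of its type, so it reports a possible division by zero.
   Y := 100 / (X - 10);
end Compute;
```

Wrapping the assignment in an `if X /= 10 then ... end if;` guard, as described in the talk, gives the prover exactly the fact it needs on that path, and the check is discharged.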
[03:57.000 --> 04:02.560] So on the tool side, we have different tools that are working together to achieve this [04:02.560 --> 04:09.560] goal. The first one is GNATprove. It's developed by the company I work for, AdaCore. It's [04:09.560 --> 04:16.280] going to take SPARK code as the input and translate it to another language called WhyML. [04:16.280 --> 04:26.000] Then we have the tool itself, Why3, developed by Inria, a research institute in France, [04:26.000 --> 04:34.680] which, again, translates the code, extracts the verification conditions, and calls the different [04:34.680 --> 04:42.520] solvers, which are on the right here, to ask them to prove the verification conditions. [04:42.520 --> 04:47.560] So we have different solvers, each with strengths on different kinds of problems: [04:47.560 --> 04:54.960] Alt-Ergo, CVC5, Z3. And so this full toolchain is open source and developed by different [04:54.960 --> 05:01.760] entities, as I mentioned. And the solvers, for instance, are not only used for SPARK; [05:01.760 --> 05:07.880] they are used for other formal verification frameworks. But all of them work together [05:07.880 --> 05:17.360] in this framework. And so on the other side, SPARK is also a language. Actually, SPARK [05:17.360 --> 05:25.200] is a subset of the Ada programming language. And so the question you may ask is, why would [05:25.200 --> 05:32.680] you use Ada? Why would you use a subset of Ada for formal verification? So I'm going [05:32.680 --> 05:43.120] to take just two simple examples of why Ada is great when you want to do formal verification. [05:43.120 --> 05:52.320] Well, this language provides a lot of specification power. The developers can express very precisely [05:52.320 --> 05:59.920] what they want from the program, from the code, which the formal verification framework [05:59.920 --> 06:06.480] will then be able to check. Just a simple example: if you program in any other language and you [06:06.480 --> 06:11.720] want to have in your application a percentage value, for instance for the completion of [06:11.720 --> 06:19.800] a process or whatever, usually we'll say, OK, my percentage is a float, and I'm going [06:19.800 --> 06:26.000] to use the values from 0 to 1. And if you are an extremely good programmer, [06:26.000 --> 06:30.200] you're going to write that in a comment. Like, you know, my lowest value is 0, my highest [06:30.200 --> 06:36.320] value is 1. And that's just a comment. And then if five years down the line you want [06:36.320 --> 06:41.600] to say, oh, actually, it's better if it's from 0 to 100, what happens to the comment? [06:41.600 --> 06:45.760] Maybe you update it. Maybe you don't. How do you make sure that all the code [06:45.760 --> 06:51.840] is updated to follow this new rule? That's very difficult to maintain. So one example: [06:51.840 --> 06:57.360] with Ada, you just define your own type and you say it's a float. It's a new float, [06:57.360 --> 07:05.880] a different kind of float, that only has valid values between 0 and 1. And so what's great [07:05.880 --> 07:12.560] here is that the solvers that will try to prove the code get a lot of information [07:12.560 --> 07:19.520] from this. First, they will add verification conditions when you try to convert from a regular [07:19.520 --> 07:24.360] float to this percentage value, because if the value is not within the range, that's [07:24.360 --> 07:30.240] a bug in your program.
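[Editor's note: as a sketch of the range-typed percentage just described, with names chosen for illustration rather than taken from the slides:]

```ada
package Percentages
  with SPARK_Mode
is
   --  A distinct floating-point type whose only valid values lie
   --  between 0.0 and 1.0. Converting a plain Float into Percentage
   --  adds a range check that GNATprove must prove or report.
   type Percentage is new Float range 0.0 .. 1.0;

   --  The provers also know that the product of two percentages
   --  stays within 0.0 .. 1.0, so the range check on the result
   --  can typically be discharged automatically.
   function Scale (A, B : Percentage) return Percentage is (A * B);
end Percentages;
```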
And you want to know it. Also, you can extract information: [07:30.240 --> 07:34.840] if you take two percentage values and you multiply them, you also know that the result is between [07:34.840 --> 07:40.560] 0 and 1. And so, in turn, that means a lot of information for the provers to do their work. [07:40.560 --> 07:52.160] And so they will be able to reason more automatically and prove that the program is correct. [07:52.160 --> 08:00.520] Another example is the contract-based programming of Ada. So you can have preconditions and [08:00.520 --> 08:07.280] postconditions on your subprograms. A precondition is something that has to be true when you [08:07.280 --> 08:13.960] call the function or the procedure. A postcondition is something that has to be true when you [08:13.960 --> 08:21.360] return from the subprogram. So, a very simple example here with a stack. We implement [08:21.360 --> 08:27.360] the stack. We have functions to know if the stack is full or if it's empty. And we implement [08:27.360 --> 08:33.560] the push. And obviously, well, not obviously, but in this API, we say it doesn't make sense [08:33.560 --> 08:38.640] to push something on the stack if the stack is already full. So never do that. That's [08:38.640 --> 08:45.800] the contract that the procedure is asking from the caller. And then the implementation [08:45.800 --> 08:53.080] says if you push something on the stack, well, it's not empty anymore at the return. [08:53.080 --> 08:59.160] And so with SPARK, what's great here is that you have formal verification both on the [08:59.160 --> 09:07.800] caller side, so SPARK will prove that you never call the push procedure when the stack [09:07.800 --> 09:13.600] is full, and on the implementation side, where it's going to prove that when you return from [09:13.600 --> 09:20.280] the push procedure, the stack is never empty. And so that's going into functional correctness [09:20.280 --> 09:28.480] verification, which means your software is doing what it's supposed to do and only what [09:28.480 --> 09:35.960] it's supposed to do. And so with the integrated pre- and postconditions in Ada and other features [09:35.960 --> 09:43.240] that I don't have time to mention, well, this makes Ada a great language for formal [09:43.240 --> 09:53.680] verification. So why should I care about SPARK, and, I would say, formal verification in general? [09:53.680 --> 10:00.000] Well, with SPARK, you can have a mathematical proof that there are no possible vulnerabilities [10:00.000 --> 10:07.280] in your application for any possible input. That means no buffer overflow, no division [10:07.280 --> 10:15.920] by zero, no integer overflow, and so on. If you go beyond that and you use contracts, you can [10:15.920 --> 10:21.720] also prove, as I mentioned, functional correctness. So the program does what it's [10:21.720 --> 10:27.080] supposed to do and only what it's supposed to do. And in turn, that means you can avoid [10:27.080 --> 10:33.360] some of the testing effort, because, for instance, unit testing is more or less trying to achieve [10:33.360 --> 10:40.080] the same goal. So if you already have a mathematical proof of the functional correctness of your [10:40.080 --> 10:47.400] code, you don't need to do unit testing anymore. And so you're going to save time on that.
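[Editor's note: a minimal sketch of the stack API just described, with the contracts spelled out; the capacity, names, and element type are assumptions for illustration:]

```ada
package Stacks
  with SPARK_Mode
is
   Max : constant := 100;  --  capacity chosen for this sketch

   type Stack is private;

   function Is_Full  (S : Stack) return Boolean;
   function Is_Empty (S : Stack) return Boolean;

   --  Contract: the caller must never push onto a full stack,
   --  and after a push the stack is never empty.
   procedure Push (S : in out Stack; V : Integer)
     with Pre  => not Is_Full (S),
          Post => not Is_Empty (S);

private
   subtype Count is Natural range 0 .. Max;
   type Contents is array (1 .. Max) of Integer;

   type Stack is record
      Data : Contents := (others => 0);
      Top  : Count    := 0;
   end record;

   function Is_Full  (S : Stack) return Boolean is (S.Top = Max);
   function Is_Empty (S : Stack) return Boolean is (S.Top = 0);
end Stacks;

package body Stacks
  with SPARK_Mode
is
   procedure Push (S : in out Stack; V : Integer) is
   begin
      --  The precondition guarantees S.Top < Max, so the increment
      --  stays in range and the array index is valid; Top is at
      --  least 1 afterwards, which discharges the postcondition.
      S.Top := S.Top + 1;
      S.Data (S.Top) := V;
   end Push;
end Stacks;
```

GNATprove then checks every call site against the precondition and the body against the postcondition: exactly the two directions described above.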
[10:47.400 --> 10:58.320] Recently, we published a case study from NVIDIA. So a few years ago, the NVIDIA security team [10:58.320 --> 11:04.920] was questioning their methodology for security and how to achieve their goals in terms of [11:04.920 --> 11:11.200] security. And they said testing security is pretty much impossible: you cannot test [11:11.200 --> 11:15.680] all possible combinations of all possible values for your application. So they decided [11:15.680 --> 11:23.280] to try provability, and they selected SPARK as an experiment. And now they are deploying [11:23.280 --> 11:29.960] more and more SPARK in their GPUs. So if you get the latest, greatest NVIDIA GPU, there [11:29.960 --> 11:39.360] should be some SPARK-proven code embedded in the firmware. That actually lets them focus [11:39.360 --> 11:48.040] on other parts of security, on more interesting verifications and more interesting [11:48.040 --> 11:52.680] security properties of the application. They don't have to deal with buffer overflows and [11:52.680 --> 11:56.760] integer overflows; all the low-level stuff is already taken care of, and they can focus [11:56.760 --> 12:07.200] on more interesting points. So now let's do some proof. For the Ada/SPARK programming [12:07.200 --> 12:13.160] language, we have a package manager called Alire. So here are a few instructions to write [12:13.160 --> 12:19.800] and prove your first piece of SPARK code. First, download and install the package [12:19.800 --> 12:26.120] manager. Then, from the command line, we start by creating a project, or a crate, with Alire. [12:26.120 --> 12:33.880] We enter the directory. We add the GNATprove tool suite as a dependency, and Alire is [12:33.880 --> 12:39.280] going to download everything and set it up for you. Then we write some piece of code. [12:39.280 --> 12:49.680] You can recognize our very nice equation here. As for the declaration of the X constant, [12:49.680 --> 12:55.840] it doesn't matter what it is. I'm just taking an integer value that SPARK doesn't know, [12:55.840 --> 13:03.840] just to make sure I'm not cheating or anything. We go to the console again. We run [13:03.840 --> 13:10.280] gnatprove. And gnatprove will tell us, well, there might be a division by zero error at [13:10.280 --> 13:16.560] this point. So as you can see, the message is pretty clear. Actually, it can be even [13:16.560 --> 13:22.560] better than that, because the tool can provide counterexamples. So if we add the switch [13:22.560 --> 13:29.840] --counterexamples=on, gnatprove will say division by zero might fail, for instance when X equals [13:29.840 --> 13:39.480] 10. And that's pretty easy to fix: we just add this if statement. And we run the tool [13:39.480 --> 13:49.720] again. And that's it. We proved our first piece of code. So as you can see, it was easy. [13:49.720 --> 13:59.120] If you want to try and learn a little bit of SPARK, we have an online website, learn.adacore.com, [13:59.120 --> 14:04.920] an online, interactive website. So you don't even have to install what I showed before just [14:04.920 --> 14:12.560] to learn and try the tools. There are different chapters, and one specific to SPARK. [14:12.560 --> 14:20.160] So that's one way to get started. And for those who wondered about the piece of code [14:20.160 --> 14:27.600] from before: there are seven potential bugs or errors in that one. So I'll leave it to you as an [14:27.600 --> 14:43.400] exercise to fix it. Thank you very much. [14:43.400 --> 15:01.360] Thank you for the presentation.
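[Editor's note: the command-line session itself isn't reproduced in the transcript. Assuming a current Alire setup, the flow is roughly `alr init --bin my_proof`, `cd my_proof`, `alr with gnatprove`, then running `gnatprove` on the project, for instance via `alr exec -- gnatprove -P my_proof.gpr`; crate and project names here are placeholders. A sketch of the fixed unit, with X as a parameter so SPARK must assume any possible value, as with the unknown constant in the talk:]

```ada
--  Illustrative reconstruction of the proved example; names and
--  the parameter profile are assumptions, not the slide code.
procedure Guarded (X : in Natural; Y : out Integer)
  with SPARK_Mode
is
begin
   if X /= 10 then
      --  On this path the prover knows X /= 10, so the
      --  division-by-zero check is discharged.
      Y := 100 / (X - 10);
   else
      Y := 0;
   end if;
end Guarded;
```

With the guard in place, gnatprove reports no remaining unproved checks for this unit.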
Let me [inaudible]. [15:01.360 --> 15:16.480] Perhaps someone might have a question about that.