[00:00.000 --> 00:18.520] The next presentation will be by Fabien Chouteau. [00:18.520 --> 00:25.640] He will give an introduction to formal verification and will teach us how to mathematically prove the absence of [00:25.640 --> 00:36.200] this error box in your software. Thank you. I need the timer because I have quite a lot [00:36.200 --> 00:42.960] of things to say. Hi everyone, I'm Fabien, and today I want to talk about formal verification, [00:42.960 --> 00:49.760] open source formal verification. First, a disclaimer: I'm not an expert in formal verification, [00:49.760 --> 00:55.680] but I'm a user of this technology. What I'm an expert at is embedded software, but I [00:55.680 --> 01:00.560] use formal verification, and I will explain later our work at a company that's developing [01:00.560 --> 01:08.520] formal verification solutions. If we look at what Wikipedia says, formal verification [01:08.520 --> 01:14.760] is the act of proving or disproving the correctness of intended algorithms using formal methods [01:14.760 --> 01:24.880] of mathematics. So in practice, what does that mean? Let's take a very trivial example. [01:24.880 --> 01:30.760] If we look at this line of code and we want to prove that it's correct, that it never [01:30.760 --> 01:36.440] fails, first we have to look at what can go wrong. So for instance here we can say, [01:36.440 --> 01:42.640] well, there's potentially a division by zero, right? That's bad. So if we want to prove [01:42.640 --> 01:51.360] that the line is correct, we have to prove that x minus 10 is different from zero, which [01:51.360 --> 02:00.880] in turn means we have to prove that x is different from 10. This is known as a verification [02:00.880 --> 02:10.040] condition: something that has to be true for our program to be correct. Now let's look [02:10.040 --> 02:16.800] at this line of code in a context, just a trivial example. Here we see that there is [02:16.800 --> 02:22.720] an if statement guarding the expression. So we know that x is always different from [02:22.720 --> 02:29.160] 10, and therefore we know that there's no possible division by zero in this piece of [02:29.160 --> 02:37.840] code. So that was easy, right? But now let's look at another very trivial piece of code. [02:37.840 --> 02:42.280] Are you able to spot all the verification conditions? Are you able to check whether they [02:42.280 --> 02:51.880] are respected or not? This is actually very, very difficult. Most programmers will know [02:51.880 --> 02:58.040] some of the things that can go wrong, but most of the time we will forget what they are. [02:58.040 --> 03:05.720] I'm looking at you, integer overflow. And that's why programming correctly is very difficult [03:05.720 --> 03:11.120] for human beings: we are not able to keep in mind all the verification conditions, [03:11.120 --> 03:15.560] play with them, and check them all the time. Again, this is a very simple piece of [03:15.560 --> 03:24.480] code. And so with formal verification, and in particular automatic formal verification, [03:24.480 --> 03:30.320] the goal is to have tools that will extract the verification conditions from the code [03:30.320 --> 03:38.160] and then run a mathematical proof to check that they are respected. And so today I want [03:38.160 --> 03:47.360] to talk about one framework for automatic formal verification, which is called SPARK. [03:47.360 --> 03:57.000] So SPARK is both a set of tools and a language to perform automatic formal verification.
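[Editor's note: the slide code isn't included in the transcript. Below is a minimal SPARK sketch of the kind of line being discussed; the names, the constant 100, and the use of Natural are illustrative assumptions, not the speaker's slide.]

```ada
--  A hypothetical reconstruction of the slide example.
procedure Compute (X : in Natural; Y : out Integer)
  with SPARK_Mode
is
begin
   --  Verification condition: X - 10 /= 0, i.e. X /= 10.
   --  GNATprove cannot discharge it here, since X may be any
   --  value of its type, so it reports a possible division by zero.
   Y := 100 / (X - 10);
end Compute;
```

Wrapping the assignment in an `if X /= 10 then ... end if;` guard, as described in the talk, gives the prover exactly the fact it needs on that path, and the check is discharged.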
[03:57.000 --> 04:02.560] So on the tool side, we have different tools that are working together to achieve this [04:02.560 --> 04:09.560] goal. The first one is GNATprove. It's developed by the company I work for, AdaCore. It's [04:09.560 --> 04:16.280] going to take SPARK code as the input and translate it to another language called WhyML. [04:16.280 --> 04:26.000] Then we have the tool itself, Why3, developed by Inria, a research institute in France, [04:26.000 --> 04:34.680] which, again, translates the code, extracts the verification conditions, and calls the different [04:34.680 --> 04:42.520] solvers, which are on the right here, to ask them to prove the verification conditions. [04:42.520 --> 04:47.560] So we have different solvers, each with strengths on different kinds of problems: [04:47.560 --> 04:54.960] Alt-Ergo, CVC5, Z3. And so this full toolchain is open source and developed by different [04:54.960 --> 05:01.760] entities, as I mentioned. And the solvers, for instance, are not only used for SPARK; [05:01.760 --> 05:07.880] they are used for other formal verification frameworks. But all of them work together [05:07.880 --> 05:17.360] in this framework. And so on the other side, SPARK is also a language. Actually, SPARK [05:17.360 --> 05:25.200] is a subset of the Ada programming language. And so the question you may ask is, why would [05:25.200 --> 05:32.680] you use Ada? Why would you use a subset of Ada for formal verification? So I'm going [05:32.680 --> 05:43.120] to take just two simple examples of why Ada is great when you want to do formal verification. [05:43.120 --> 05:52.320] Well, this language provides a lot of specification power. The developers can express very precisely [05:52.320 --> 05:59.920] what they want from the program, from the code, which the formal verification framework [05:59.920 --> 06:06.480] will then be able to check. Just a simple example: if you program in any other language and you [06:06.480 --> 06:11.720] want to have in your application a percentage value, for instance for the completion of [06:11.720 --> 06:19.800] a process or whatever, usually we'll say, OK, my percentage is a float, and I'm going [06:19.800 --> 06:26.000] to use the values from 0 to 1. And if you are an extremely good programmer, [06:26.000 --> 06:30.200] you're going to write that in a comment. Like, you know, my lowest value is 0, my highest [06:30.200 --> 06:36.320] value is 1. And that's just a comment. And then if five years down the line you want [06:36.320 --> 06:41.600] to say, oh, actually, it's better if it's from 0 to 100, what happens to the comment? [06:41.600 --> 06:45.760] Maybe you update it. Maybe you don't. How do you make sure that all the code [06:45.760 --> 06:51.840] is updated to follow this new rule? That's very difficult to maintain. So one example: [06:51.840 --> 06:57.360] with Ada, you just define your own type and you say it's a float. It's a new float, [06:57.360 --> 07:05.880] a different kind of float, that only has valid values between 0 and 1. And so what's great [07:05.880 --> 07:12.560] here is that the solvers that will try to prove the code get a lot of information [07:12.560 --> 07:19.520] from this. First, they will add verification conditions when you try to convert from a regular [07:19.520 --> 07:24.360] float to this percentage value, because if the value is not within the range, that's [07:24.360 --> 07:30.240] a bug in your program.
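[Editor's note: as a sketch of the range-typed percentage just described, with names chosen for illustration rather than taken from the slides:]

```ada
package Percentages
  with SPARK_Mode
is
   --  A distinct floating-point type whose only valid values lie
   --  between 0.0 and 1.0. Converting a plain Float into Percentage
   --  adds a range check that GNATprove must prove or report.
   type Percentage is new Float range 0.0 .. 1.0;

   --  The provers also know that the product of two percentages
   --  stays within 0.0 .. 1.0, so the range check on the result
   --  can typically be discharged automatically.
   function Scale (A, B : Percentage) return Percentage is (A * B);
end Percentages;
```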
And you want to know it. Also, you can extract information: [07:30.240 --> 07:34.840] if you take two percentage values and you multiply them, you also know that the result is between [07:34.840 --> 07:40.560] 0 and 1. And so, in turn, that means a lot of information for the provers to do their work. [07:40.560 --> 07:52.160] And so they will be able to reason more automatically and prove that the program is correct. [07:52.160 --> 08:00.520] Another example is the contract-based programming of Ada. So you can have preconditions and [08:00.520 --> 08:07.280] postconditions on your subprograms. A precondition is something that has to be true when you [08:07.280 --> 08:13.960] call the function or the procedure. A postcondition is something that has to be true when you [08:13.960 --> 08:21.360] return from the subprogram. So, a very simple example here with a stack. We implement [08:21.360 --> 08:27.360] the stack. We have functions to know if the stack is full or if it's empty. And we implement [08:27.360 --> 08:33.560] the push. And obviously, well, not obviously, but in this API, we say it doesn't make sense [08:33.560 --> 08:38.640] to push something on the stack if the stack is already full. So never do that. That's [08:38.640 --> 08:45.800] the contract that the procedure is asking from the caller. And then the implementation [08:45.800 --> 08:53.080] says if you push something on the stack, well, it's not empty anymore at the return. [08:53.080 --> 08:59.160] And so with SPARK, what's great here is that you have formal verification both on the [08:59.160 --> 09:07.800] caller side, so SPARK will prove that you never call the push procedure when the stack [09:07.800 --> 09:13.600] is full, and on the implementation side, where it's going to prove that when you return from [09:13.600 --> 09:20.280] the push procedure, the stack is never empty. And so that's going into functional correctness [09:20.280 --> 09:28.480] verification, which means your software is doing what it's supposed to do and only what [09:28.480 --> 09:35.960] it's supposed to do. And so with the integrated pre- and postconditions in Ada and other features [09:35.960 --> 09:43.240] that I don't have time to mention, well, this makes Ada a great language for formal [09:43.240 --> 09:53.680] verification. So why should I care about SPARK, and, I would say, formal verification in general? [09:53.680 --> 10:00.000] Well, with SPARK, you can have a mathematical proof that there are no possible vulnerabilities [10:00.000 --> 10:07.280] in your application for any possible input. That means no buffer overflow, no division [10:07.280 --> 10:15.920] by zero, no integer overflow, and so on. If you go beyond that and you use contracts, you can [10:15.920 --> 10:21.720] also prove, as I mentioned, functional correctness. So the program does what it's [10:21.720 --> 10:27.080] supposed to do and only what it's supposed to do. And in turn, that means you can avoid [10:27.080 --> 10:33.360] some of the testing effort, because, for instance, unit testing is more or less trying to achieve [10:33.360 --> 10:40.080] the same goal. So if you already have a mathematical proof of the functional correctness of your [10:40.080 --> 10:47.400] code, you don't need to do unit testing anymore. And so you're going to save time on that.
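[Editor's note: a minimal sketch of the stack API just described, with the contracts spelled out; the capacity, names, and element type are assumptions for illustration:]

```ada
package Stacks
  with SPARK_Mode
is
   Max : constant := 100;  --  capacity chosen for this sketch

   type Stack is private;

   function Is_Full  (S : Stack) return Boolean;
   function Is_Empty (S : Stack) return Boolean;

   --  Contract: the caller must never push onto a full stack,
   --  and after a push the stack is never empty.
   procedure Push (S : in out Stack; V : Integer)
     with Pre  => not Is_Full (S),
          Post => not Is_Empty (S);

private
   subtype Count is Natural range 0 .. Max;
   type Contents is array (1 .. Max) of Integer;

   type Stack is record
      Data : Contents := (others => 0);
      Top  : Count    := 0;
   end record;

   function Is_Full  (S : Stack) return Boolean is (S.Top = Max);
   function Is_Empty (S : Stack) return Boolean is (S.Top = 0);
end Stacks;

package body Stacks
  with SPARK_Mode
is
   procedure Push (S : in out Stack; V : Integer) is
   begin
      --  The precondition guarantees S.Top < Max, so the increment
      --  stays in range and the array index is valid; Top is at
      --  least 1 afterwards, which discharges the postcondition.
      S.Top := S.Top + 1;
      S.Data (S.Top) := V;
   end Push;
end Stacks;
```

GNATprove then checks every call site against the precondition and the body against the postcondition: exactly the two directions described above.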
[10:47.400 --> 10:58.320] Recently, we published a case study from NVIDIA. So a few years ago, the NVIDIA security team [10:58.320 --> 11:04.920] was questioning their methodology for security and how to achieve their goals in terms of [11:04.920 --> 11:11.200] security. And they said testing security is pretty much impossible: you cannot test [11:11.200 --> 11:15.680] all possible combinations of all possible values for your application. So they decided [11:15.680 --> 11:23.280] to try provability, and they selected SPARK as an experiment. And now they are deploying [11:23.280 --> 11:29.960] more and more SPARK in their GPUs. So if you get the latest, greatest NVIDIA GPU, there [11:29.960 --> 11:39.360] should be some SPARK-proven code embedded in the firmware. That actually lets them focus [11:39.360 --> 11:48.040] on other parts of security, on more interesting verifications and more interesting [11:48.040 --> 11:52.680] security properties of the application. They don't have to deal with buffer overflows and [11:52.680 --> 11:56.760] integer overflows; all the low-level stuff is already taken care of, and they can focus [11:56.760 --> 12:07.200] on more interesting points. So now let's do some proof. For the Ada/SPARK programming [12:07.200 --> 12:13.160] language, we have a package manager called Alire. So here are a few instructions to write [12:13.160 --> 12:19.800] and prove your first piece of SPARK code. First, download and install the package [12:19.800 --> 12:26.120] manager. Then, from the command line, we start by creating a project, or a crate, with Alire. [12:26.120 --> 12:33.880] We enter the directory. We add the GNATprove tool suite as a dependency, and Alire is [12:33.880 --> 12:39.280] going to download everything and set it up for you. Then we write some piece of code. [12:39.280 --> 12:49.680] You can recognize our very nice equation here. As for the declaration of the X constant, [12:49.680 --> 12:55.840] it doesn't matter what it is. I'm just taking an integer value that SPARK doesn't know, [12:55.840 --> 13:03.840] just to make sure I'm not cheating or anything. We go to the console again. We run [13:03.840 --> 13:10.280] gnatprove. And gnatprove will tell us, well, there might be a division by zero error at [13:10.280 --> 13:16.560] this point. So as you can see, the message is pretty clear. Actually, it can be even [13:16.560 --> 13:22.560] better than that, because the tool can provide counterexamples. So if we add the switch [13:22.560 --> 13:29.840] --counterexamples=on, gnatprove will say division by zero might fail, for instance when X equals [13:29.840 --> 13:39.480] 10. And that's pretty easy to fix: we just add this if statement. And we run the tool [13:39.480 --> 13:49.720] again. And that's it. We proved our first piece of code. So as you can see, it was easy. [13:49.720 --> 13:59.120] If you want to try and learn a little bit of SPARK, we have an online website, learn.adacore.com, [13:59.120 --> 14:04.920] an online, interactive website. So you don't even have to install what I showed before just [14:04.920 --> 14:12.560] to learn and try the tools. There are different chapters, and one specific to SPARK. [14:12.560 --> 14:20.160] So that's one way to get started. And for those who wondered about the piece of code [14:20.160 --> 14:27.600] from before: there are seven potential bugs or errors in that one. So I'll leave it to you as an [14:27.600 --> 14:43.400] exercise to fix it. Thank you very much. [14:43.400 --> 15:01.360] Thank you for the presentation.
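[Editor's note: the command-line session itself isn't reproduced in the transcript. Assuming a current Alire setup, the flow is roughly `alr init --bin my_proof`, `cd my_proof`, `alr with gnatprove`, then running `gnatprove` on the project, for instance via `alr exec -- gnatprove -P my_proof.gpr`; crate and project names here are placeholders. A sketch of the fixed unit, with X as a parameter so SPARK must assume any possible value, as with the unknown constant in the talk:]

```ada
--  Illustrative reconstruction of the proved example; names and
--  the parameter profile are assumptions, not the slide code.
procedure Guarded (X : in Natural; Y : out Integer)
  with SPARK_Mode
is
begin
   if X /= 10 then
      --  On this path the prover knows X /= 10, so the
      --  division-by-zero check is discharged.
      Y := 100 / (X - 10);
   else
      Y := 0;
   end if;
end Guarded;
```

With the guard in place, gnatprove reports no remaining unproved checks for this unit.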
Let me [inaudible]. [15:01.360 --> 15:16.480] Perhaps someone might have a question about that.