[00:00.000 --> 00:10.000] Please welcome our next speaker, Govah. [00:10.000 --> 00:20.600] Hi, so my name is Govah, which is a really not name, so you can just call me Govah. [00:20.600 --> 00:23.600] And I'm going to talk about this great thing for the talk. [00:23.600 --> 00:30.320] So before that, I was sponsored by the MLNet Foundation with a grant that is financed through [00:30.320 --> 00:34.000] the European Commission called NGISU, a generation internet. [00:34.000 --> 00:38.240] So if you have ideas and just want money, just go for it, it's actually a really cool [00:38.240 --> 00:39.240] thing. [00:39.240 --> 00:41.960] So now, what's this before? [00:41.960 --> 00:48.080] So I guess the academic definition would be the programming protocol independent packet [00:48.080 --> 00:49.080] processor. [00:49.080 --> 00:51.920] It's a domain-specific land-range for network devices. [00:51.920 --> 00:57.040] But it's fine, our data plane devices, switches, microtors, filters, as well, process packets. [00:57.040 --> 00:58.040] OK, great. [00:58.040 --> 00:59.680] But what does this actually mean? [00:59.680 --> 01:05.840] OK, so it's an ungrade, basically, for hardware-optimized network processing. [01:05.840 --> 01:11.000] Like, think SIMD for network, so you can just have, like, those FPGAs, DPUs, whatever, [01:11.000 --> 01:15.760] and you can make them process network packets for you at an accelerated rate compared to [01:15.760 --> 01:17.640] software. [01:17.640 --> 01:23.000] So it roughly looks like C, in a way. [01:23.000 --> 01:28.080] As you can see, like, it has the same kind of syntax, like, arguments, whatnot. [01:28.080 --> 01:30.680] The thing is, there's a few things, there's a few oddities. [01:30.680 --> 01:35.400] Like, you can see it doesn't, it has the state thing, it doesn't have a return. [01:35.400 --> 01:38.600] You have transition select, what is this? [01:38.600 --> 01:39.600] It looks a bit weird. [01:39.600 --> 01:44.040] So let's explain this a bit. [01:44.040 --> 01:49.520] Funds in P4 are replaced, mostly, by things called parser, control and package. [01:49.520 --> 01:50.840] So, what is parser? [01:50.840 --> 01:55.760] A parser is a function that is going to pass an incoming packet according to, well, the [01:55.760 --> 02:01.120] same thing as in C, which means structs, type the effect, etc., like, say, I'm going to [02:01.120 --> 02:05.000] define a struct, I'm going to say, if this element of a struct is this, then I'm going [02:05.000 --> 02:06.000] to call this function. [02:06.000 --> 02:08.080] Basically, that's what the parser function does. [02:08.080 --> 02:10.880] You have then the control functions. [02:10.880 --> 02:17.600] You modify your past packet, say, for example, you have this really long packet, which says, [02:17.600 --> 02:22.680] I have up x, y, z, and then you want to remove the up x, because you're on up x, you're going [02:22.680 --> 02:25.440] to do this in a control function. [02:25.440 --> 02:30.120] And then, you have package, which defines basically the binding logic between the hardware, [02:30.120 --> 02:36.640] like, oh, am I going, say, I implement a firewall, oh, am I going to add new rules, am I going [02:36.640 --> 02:40.120] to, like, load new definitions, and whatnot. [02:40.120 --> 02:42.000] So, that's basically it. [02:42.000 --> 02:46.640] There are very interesting keywords, there are other things, like, that I'm not going [02:46.640 --> 02:53.040] to explain over this, because P4 is kind of, like, that this would basically be out of [02:53.040 --> 02:58.400] scope for the talk, and P4 is already complex enough as is. [02:58.400 --> 03:02.240] So, how do you want to use P4 in NICs? [03:02.240 --> 03:04.240] There's a few ideas. [03:04.240 --> 03:06.440] My take on it is, let's make a transpiler. [03:06.440 --> 03:07.440] Why? [03:07.440 --> 03:14.000] A transpiler allows you to reuse for concepts within NICs, say, because P4 is a bit verbose, [03:14.000 --> 03:18.120] because, say, you want to pass this IP, TCP packet, something. [03:18.120 --> 03:23.600] You need each time to redefine the IP header thing is, so you want to redefine this track, [03:23.600 --> 03:28.160] you want then to have it passed in a way, and then sometimes you want this thing, sometimes [03:28.160 --> 03:32.320] you have this other IP strike, because you don't want to pass, like, some optional data [03:32.320 --> 03:35.240] in the header that would slow down the process, et cetera. [03:35.240 --> 03:39.920] So it's a bit verbose, let's just use NICs to basically implement parts of it and have [03:39.920 --> 03:42.200] it in a nice packet. [03:42.200 --> 03:45.040] So what is a transpiler? [03:45.040 --> 03:50.520] In our case, it's a NICs to P4 translator, which means you can define things in NICs, [03:50.520 --> 03:55.120] strats, enums, whatnot, and it generates P4 code. [03:55.120 --> 03:59.760] Then you have the P4 compiler, which actually processes the code that's generated, followed [03:59.760 --> 04:06.880] by the target compiler, which, in this case, allows you to deploy your thing to, say, an [04:06.880 --> 04:10.680] FPGA, a DPU, et cetera. [04:10.680 --> 04:14.040] Basically the thing that allows you to run your thing, your program. [04:14.040 --> 04:18.840] So what does it look like in action? [04:18.840 --> 04:25.920] Basically this is a NICs file, which allows you to define some P4 concepts. [04:25.920 --> 04:30.720] In this case, I define a header flag, which contains a few counts, like, max-ops, standards, [04:30.720 --> 04:31.920] ops, et cetera. [04:31.920 --> 04:34.120] And then a few headers. [04:34.120 --> 04:40.760] In this case, I'm just defining standard T, which has two flags, source and destination, [04:40.760 --> 04:44.200] both of type bit 8. [04:44.200 --> 04:50.240] That's basically just a way to redefine the annoying thing of P4, then you can also have [04:50.240 --> 04:51.240] the include. [04:51.240 --> 04:56.240] So this is processed, then, by this function called run-thread-spiler, mcat-transpiler, [04:56.240 --> 04:58.040] whatever I call it now. [04:58.040 --> 05:03.080] And you just run this, and it generates code for you, and that's great. [05:03.080 --> 05:09.760] So we can simplify this, obviously, because that will be annoying to each type, like, [05:09.760 --> 05:12.080] this header, this header, et cetera. [05:12.080 --> 05:15.880] And the whole point of making this whole thing is to make it less verbose and to reuse code [05:15.880 --> 05:20.200] so you can just have your own info, mostly in NICs and, like, a few bits and packages [05:20.200 --> 05:24.360] in P4, so you can basically put everything in the same place. [05:24.360 --> 05:28.720] In this case, you can just innovate things that I define in helpers, and you can do the [05:28.720 --> 05:30.960] same thing, and you get your P4 source. [05:30.960 --> 05:33.240] So that's a lot simpler. [05:33.240 --> 05:36.600] And we don't have to define eternity each time, we don't have to define mac addresses [05:36.600 --> 05:42.080] each time, which is a thing in P4, because if you use, like, standard P4, it's going [05:42.080 --> 05:47.800] to make you have to redefine all of this at each program you make. [05:47.800 --> 05:53.760] Which is thanks to helpers that I wrote down in my packages. [05:53.760 --> 05:55.480] So what does the end result look like? [05:55.480 --> 05:59.400] Because I'm saying, oh, this is the NIC code, this is what it looks like, et cetera, yeah, [05:59.400 --> 06:00.400] it's great. [06:00.400 --> 06:01.400] Okay. [06:01.400 --> 06:05.760] So this is basically the transpire. [06:05.760 --> 06:12.520] It's kind of dirty NICs in a way, like, you can see it's a lot of messy inventions, but [06:12.520 --> 06:18.040] basically the idea is that I define a module, NICSOS module, which verifies the types of [06:18.040 --> 06:22.440] what I give it, in this case the other, I give it a default type, then you can have, [06:22.440 --> 06:27.920] like, the union, the content, the one-off, et cetera, which are then passed by the transpire, [06:27.920 --> 06:36.240] which are these huge, like, nested functions, which basically just map strokes like NICSOS [06:36.240 --> 06:41.080] at Simeon into just, like, P4 materials, and then just write it. [06:41.080 --> 06:42.840] It just starts to start to transpire. [06:42.840 --> 06:46.640] It's nothing fancy, but it's fine. [06:46.640 --> 06:49.720] And so what does the end result look like after the transpire? [06:49.720 --> 06:54.080] It looks like P4 code, but pretty clean. [06:54.080 --> 06:59.440] You have, like, the include, the define, the ops, max ops, standard metadata, then you [06:59.440 --> 07:08.240] have your structure, et cetera, and then you can include, like, your own code. [07:08.240 --> 07:16.640] Then, and then, this whole P4 code is then processed by the target compiler, which then [07:16.640 --> 07:21.640] processes it to its own platform, say, it can actually generate, like, say, EBPF code, [07:21.640 --> 07:23.200] so it runs on Linux. [07:23.200 --> 07:30.440] EBPF is basically, like, this canal thingy that you can run to have more privileges. [07:30.440 --> 07:32.720] I explain it later if I have the time. [07:32.720 --> 07:36.600] And this is BMV2, and BMV2 is just a simple model architecture. [07:36.600 --> 07:38.400] You'll also talk about it later. [07:38.400 --> 07:42.640] So basically, it starts to suspend compiler that has multiple status and allows you to [07:42.640 --> 07:45.800] specify P4 functions, and we use concepts. [07:45.800 --> 07:49.000] It's nothing fancy, but it's useful. [07:49.000 --> 07:53.120] So now that I talked about BMV2, I think I have time to do it. [07:53.120 --> 07:54.600] So what is it? [07:54.600 --> 07:56.520] Glad you asked. [07:56.520 --> 08:02.040] So the simple switch architecture is the de facto architecture that is used to basically [08:02.040 --> 08:04.800] test P4C, which is the main compiler. [08:04.800 --> 08:10.680] So it's basically, like, this kind of standard test setup that you can use to just see if [08:10.680 --> 08:12.680] your code works, but it's not fast. [08:12.680 --> 08:14.480] You can't really target it to anywhere. [08:14.480 --> 08:19.120] It's basically, like, this low user-length thing that you can use to just test your program [08:19.120 --> 08:24.640] that is not fast, but does mostly what you want. [08:24.640 --> 08:28.240] It's basically just, like, this interface for targeting the switch. [08:28.240 --> 08:35.160] It's an abstract interface, and it's also used to just verify how P4C works. [08:35.160 --> 08:38.320] So now to continue a bit more on P4C. [08:38.320 --> 08:43.680] To set up, like, to use P4, you need a target, and the main targets that are currently implemented [08:43.680 --> 08:48.120] in P4C are user-length, which is, in this case, EBPF. [08:48.120 --> 08:51.720] Some people are going to cringe at this because EBPF is technically kernel, but what I mean [08:51.720 --> 08:54.680] by user-length is I mean it runs on a computer. [08:54.680 --> 09:00.520] The DPDK, which as, and this is why I'm saying it's user-length, its own user-length code [09:00.520 --> 09:07.680] that can run, but the DPDK also is like this, how do you call that? [09:07.680 --> 09:08.680] An API, I guess. [09:08.680 --> 09:11.280] It's an API that can target a bunch of devices. [09:11.280 --> 09:15.040] So it's not only user-length, but it has a user-length point. [09:15.040 --> 09:20.240] Then hardware, because you can use DPDK to also target hardware, say FPGAs, which are [09:20.240 --> 09:25.600] like, for people that don't know, things that you can use to, say, program CPUs or whatnot. [09:25.600 --> 09:30.200] You can basically just reconfigure the electrical level, electrical gates, et cetera, and then [09:30.200 --> 09:33.960] Q-Storm Asics, which are like Q-Storm processors and whatnot, basically. [09:33.960 --> 09:36.800] And then just emulated. [09:36.800 --> 09:39.200] I'm going to call it emulated because of how slow it is. [09:39.200 --> 09:43.760] It just basically, let's test the thing, BMW 2, I told you about it already. [09:43.760 --> 09:49.200] So obviously, all of us need some kind of interface to work on. [09:49.200 --> 09:56.560] And usually, you use the abstract switch interface with a few per-device changes and some, say, [09:56.560 --> 10:01.680] some control functions usually have a definition specific for EBBA than we like. [10:01.680 --> 10:05.920] So there are a few changes, but, overall, you can target basically all of these with [10:05.920 --> 10:08.960] mostly the same code, which is great. [10:08.960 --> 10:10.920] But this also needs changes to the transpiler. [10:10.920 --> 10:15.280] So you need to have more options, so you can target every different one. [10:15.280 --> 10:20.200] So you can, say, have the role, mess, auto-generated by a transpire. [10:20.200 --> 10:25.040] So, say, you can just, in NICS, define, hey, I want target X, and it's going to do it automatically [10:25.040 --> 10:29.080] for you without having to change the P4 code, which is mess. [10:29.080 --> 10:32.560] So now, introducing FPGAs on NICS. [10:32.560 --> 10:34.720] And I'm actually not kidding. [10:34.720 --> 10:38.160] The thing is, I forgot to actually take the picture before going to FOSM, so imagine it [10:38.160 --> 10:39.160] is really hard. [10:39.160 --> 10:44.000] Basically, an FPGA connected to a computer who, say, USB or whatnot, and then connected [10:44.000 --> 10:45.800] to the network for Ethernet. [10:45.800 --> 10:51.840] The idea is that we are not running NICS on FPGAs. [10:51.840 --> 10:58.160] We are using NICS to define what the FPGA wants, in this case, the network stack. [10:58.160 --> 11:02.800] So we need to define, like, say, the devices we have in NICS directly. [11:02.800 --> 11:12.320] So say I'm going to have, like, hardware.alastor.type is my FPGA, and then I'm going to explain [11:12.320 --> 11:15.120] how I can define it. [11:15.120 --> 11:20.720] Then you need some kind of auto-reload-deploy mechanism, usually, like, most FPGA providers [11:20.720 --> 11:23.280] just provide you a way for USB. [11:23.280 --> 11:28.040] You can also have just an auto-reload of tools through the control plane in P4 directly, [11:28.040 --> 11:34.320] if you want to go this way, which means basically no downtime network stack, which is nice. [11:34.320 --> 11:39.520] And yeah, you can have a data plane mechanism for feeding data to the host, right. [11:39.520 --> 11:41.840] All of this is a work in progress for now. [11:41.840 --> 11:44.000] So the transpiler is done. [11:44.000 --> 11:50.640] Basically, most of the source-to-source transpiler, including the targets are done, the targets [11:50.640 --> 11:55.760] are packaged, you can use BNV2 nowadays, you can use DPDK and whatnot, but the hardware [11:55.760 --> 11:57.480] definitions are currently work in progress. [11:57.480 --> 12:02.760] I haven't had the time to work on them yet, but it's close enough, close enough. [12:02.760 --> 12:03.760] And software works. [12:03.760 --> 12:07.400] So yeah, that's basically it. [12:07.400 --> 12:08.400] Any questions? [12:08.400 --> 12:09.400] Yeah? [12:09.400 --> 12:12.160] How can you get a switch that runs P4? [12:12.160 --> 12:16.440] So the question is, oh, can I get a switch that runs P4? [12:16.440 --> 12:17.840] P4. [12:17.840 --> 12:26.000] So you don't technically need a switch that runs P4. [12:26.000 --> 12:31.400] You can just use any, say, FPGA or DPU that you can reprogram, and that is targeted by [12:31.400 --> 12:32.400] DPDK. [12:32.400 --> 12:36.520] You also have, like, some routers that have A6 that are reprogrammable, as long as you [12:36.520 --> 12:41.400] can target it with DPDK, which is basically the API you can target. [12:41.400 --> 12:44.960] You can just use it, and when P4 on it, basically. [12:44.960 --> 12:55.280] Maybe I missed it, but what is it that makes NICs particularly happy for this compared [12:55.280 --> 12:59.360] to other things that might be different? [12:59.360 --> 13:06.240] Okay, so the question is, what makes NICs particularly nice in this case for P4? [13:06.240 --> 13:11.640] So what makes NICs, in this case, useful is that it's at the border of basically this [13:11.640 --> 13:18.760] bowing code to generate per target, and you also need to actually target an infrastructure. [13:18.760 --> 13:24.520] And basically, NICs is this great thing where you can just define an interface, an infrastructure, [13:24.520 --> 13:28.080] I mean, and you can then write automated code. [13:28.080 --> 13:32.160] So NICs, in this case, has two key roles. [13:32.160 --> 13:37.240] It's allowing you to write code with less, not having to rewrite every time thing, making [13:37.240 --> 13:39.640] it just basically this handy source tool. [13:39.640 --> 13:44.240] And the thing is to define your network stack in the same place as you would define your [13:44.240 --> 13:47.920] standard infrastructure, which means, say, I want to define this server, okay, cool, [13:47.920 --> 13:51.520] but now my server has like 100 million requests, I know. [13:51.520 --> 13:55.200] And I need to actually, like, huffload these things for some degree, so I'm just going [13:55.200 --> 14:01.520] to say, hey, hardware-definition.fga is my PGA, and now I have must-be magically, well, [14:01.520 --> 14:11.320] you need to pay for the PGA and program thing, but yeah, magically. [14:11.320 --> 14:12.320] Any other questions? [14:12.320 --> 14:13.320] Yeah? [14:13.320 --> 14:18.320] This is useful for things that are not network-based, right, so you can do things that have nothing [14:18.320 --> 14:22.800] to do with networking for your target thing, your PGA, is there a scope for this, or is [14:22.800 --> 14:25.000] it basically just for networking stuff? [14:25.000 --> 14:30.440] So the question is, is this going to be useful for other things than networking? [14:30.440 --> 14:31.800] And the answer is yes or no. [14:31.800 --> 14:37.160] So the FPGA deployment mechanism, yes, is going to be useful in this case, because you [14:37.160 --> 14:42.120] can just reuse this thing and use it outside of the scope of P4. [14:42.120 --> 14:46.880] The reason now is that everything else is basically P4-specific, but yes, the last part [14:46.880 --> 14:51.800] is probably going to be reusable for other things than P4, first of all. [14:51.800 --> 14:57.560] And second of all, there's going to be a few different FPGA targets, probably, like there's [14:57.560 --> 15:01.760] always going to be the proprietary ones with DPDK that have their own, say, deployment [15:01.760 --> 15:07.400] binaries that are clusters that are pain, but you can always see, like, try to implement, [15:07.400 --> 15:13.240] say, your these thing is and whatnot, so to have an open source FPGA setup, sorry. [15:13.240 --> 15:18.880] So yeah, you can use it outside of P4, basically. [15:18.880 --> 15:19.880] Yes? [15:19.880 --> 15:29.880] The P4 language is heavily exposed to the non-network thing, like, a lot of people using it for non-network. [15:29.880 --> 15:35.880] Like, Western digital, like, tax-coherent bus and the bloody thing. [15:35.880 --> 15:41.880] If you have any kind of stream that you can process, if you can express it in a way that [15:41.880 --> 15:45.880] it can be processed, if you'll find you can process it in a way that really remains specific [15:45.880 --> 15:52.880] for DPUs and network processors, like you have CUDA, OpenSQL, and so forth for DPUs. [15:52.880 --> 15:54.520] That's a really good point. [15:54.520 --> 15:59.360] Just for the live people, basically what was said is that you can use P4 to process any [15:59.360 --> 16:05.240] kind of stream and modify it, and it's used by other companies to process this in an accelerated [16:05.240 --> 16:08.120] version, which is very true and very handy. [16:08.120 --> 16:12.880] It's not the most common use of P4, but P4 is always not very common, so. [16:12.880 --> 16:15.880] We have two minutes left. [16:15.880 --> 16:18.880] Did you want to do a demo? [16:18.880 --> 16:21.880] Okay, let's do that. [16:21.880 --> 16:22.880] So, thank you. [16:22.880 --> 16:23.880] That was me. [16:23.880 --> 16:24.880] Go and find. [16:24.880 --> 16:25.880] Here's my website. [16:25.880 --> 16:26.880] Here's my email. [16:26.880 --> 16:29.880] And, yeah, we have one last thing to show. [16:29.880 --> 16:30.880] So, hmm? [16:30.880 --> 16:34.880] I want to take a photo of the P4, just like that. [16:34.880 --> 16:35.880] Ah, sure. [16:35.880 --> 16:37.880] It should be recorded, so. [16:37.880 --> 16:38.880] Yeah. [16:38.880 --> 16:39.880] Okay. [16:39.880 --> 16:43.880] Did I expect people to actually care about me, honestly? [16:43.880 --> 16:44.880] So. [16:44.880 --> 16:49.880] The demo is a bit of a strange thing, because the next talk is about secure boot. [16:49.880 --> 16:50.880] Yes. [16:50.880 --> 16:52.880] So, it's a demo about secure boot. [16:52.880 --> 16:55.880] There's, yeah, there's one thing I forgot to mention. [16:55.880 --> 16:59.880] I'm working P4 on this, but I'm also working on the secure boot thing, at least. [16:59.880 --> 17:02.880] So, what I wanted to show you is that. [17:02.880 --> 17:06.880] Let me just do that, do that, so I can just do this. [17:06.880 --> 17:13.880] I started this small PR the other day, which is like grab support for secure boot. [17:13.880 --> 17:18.880] So, yeah, I think now we can actually boot secure boot on XOS using Grubb. [17:18.880 --> 17:26.880] And to add on to that, let me just show it there, because it's, I have too many windows. [17:26.880 --> 17:28.880] There we go. [17:28.880 --> 17:31.880] Let me just, yeah, there we go. [17:31.880 --> 17:36.880] Welcome to Grubb. [17:36.880 --> 17:39.880] And I'm going to, I'm just going to zoom for the live. [17:39.880 --> 17:40.880] Okay. [17:40.880 --> 17:41.880] Yeah. [17:41.880 --> 17:42.880] There we go. [17:42.880 --> 17:45.880] And basically just, this whole thing is working on the secure boot. [17:45.880 --> 17:48.880] And as you can see, it's panicking, but it's close enough. [17:48.880 --> 17:49.880] It's close enough. [17:49.880 --> 17:50.880] So now. [17:50.880 --> 17:51.880] So now. [17:51.880 --> 18:02.880] I'm going to leave the next speaker explaining what Lensabuti is.