[00:00.000 --> 00:13.240] Good morning and welcome to the very first image based Linux and secure measured boot [00:13.240 --> 00:14.240] Devon. [00:14.240 --> 00:16.940] Bit of a mouthful we'll try a shorter one next year. [00:16.940 --> 00:19.080] So let me start by introducing myself. [00:19.080 --> 00:20.840] My name is Luca. [00:20.840 --> 00:25.960] By day I am a software engineer in the Linux systems group at Microsoft where I work on [00:25.960 --> 00:28.140] the Azure infrastructure. [00:28.140 --> 00:31.020] And by night I contribute to various open source projects. [00:31.020 --> 00:36.580] I am assistant demantainer, Debian developer, DPKLTS maintainer, a bunch of other stuff [00:36.580 --> 00:39.140] that I forget and neglect. [00:39.140 --> 00:44.420] So I will give you the introduction to the Devroom and an overview of all the topics [00:44.420 --> 00:50.220] that we touch on to hopefully give you a holistic view of what image based Linux is. [00:50.220 --> 00:55.900] So let me start by saying thank you to all the organizers and co-organizers for this [00:55.900 --> 01:01.100] Devroom, especially to Thilo, who unfortunately could not make it to Brussels this year, but [01:01.100 --> 01:06.340] he did most of the work on the FOSD website and CFP and so on. [01:06.340 --> 01:08.940] So thank you Thilo if you are watching. [01:08.940 --> 01:11.660] Now some boring logistics. [01:11.660 --> 01:17.500] This Devroom has a five minutes break between talks to allow to switch some spillover. [01:17.500 --> 01:24.180] We have a 10 minutes break at 10 past 12 and we finish at 20 past 2. [01:24.180 --> 01:27.500] Next Devroom starts at half past 2. [01:27.500 --> 01:32.700] Now in case this is your first FOSD or it's not but you never noticed, everything is live [01:32.700 --> 01:34.340] streamed and recorded. [01:34.340 --> 01:38.460] If you're not comfortable with your back of your head recorded or live streamed, best [01:38.460 --> 01:40.940] I can suggest is to sit at the sides. [01:40.940 --> 01:45.340] If you ask questions, remember again there will be live streamed and recorded. [01:45.340 --> 01:47.900] If you're not comfortable with that, there's a matrix chat. [01:47.900 --> 01:53.060] You can ask a question there and our Devroom organizers will repeat it for you. [01:53.060 --> 01:57.220] We do want people to ask questions, please do so. [01:57.220 --> 02:02.620] Please do not just start shouting at the presenter, raise your hand and then we will come to you [02:02.620 --> 02:04.540] with a microphone. [02:04.540 --> 02:10.540] If you're a speaker and people shout a question at you, please first repeat it and then answer [02:10.540 --> 02:12.060] it. [02:12.060 --> 02:14.100] And I think that's it. [02:14.100 --> 02:16.700] Now let's get into the interesting stuff. [02:16.700 --> 02:18.980] What is image-based Linux? [02:18.980 --> 02:23.940] Now if you're an embedded person or radiation to that word, you're probably thinking, what [02:23.940 --> 02:24.940] are these guys talking about? [02:24.940 --> 02:28.460] We've been doing image-based Linux for like 30 years, it's nothing new. [02:28.460 --> 02:30.700] And you wouldn't be completely wrong. [02:30.700 --> 02:36.580] Now the difference is that our focus within these people, this ecosystem, is on the security [02:36.580 --> 02:37.780] aspect. [02:37.780 --> 02:41.220] Because let's face it, Linux runs everywhere, right? [02:41.220 --> 02:43.740] Most of our infrastructure and economy runs on Linux these days. [02:43.740 --> 02:46.860] All the public clouds run on Linux, even Azure. [02:46.860 --> 02:51.500] So we want to get our security story straight. [02:51.500 --> 02:52.500] What does that mean? [02:52.500 --> 02:54.900] What are some of the basic concepts? [02:54.900 --> 03:00.940] First of all, we want to have first-class support for at least one, if not both, of [03:00.940 --> 03:07.100] Ufisecureboot and TPM-based measurements, hopefully both. [03:07.100 --> 03:11.420] Because the goal here is to extend the chain of trust at boot. [03:11.420 --> 03:18.660] Now if you're using a generic Linux distribution like Debian or Ubuntu or Fedora, the story [03:18.660 --> 03:25.020] in your firmware to kernel chain of trust is pretty well buttoned up by now. [03:25.020 --> 03:28.700] Because a lot of people did a lot of work in the past 12 years to get that story straight [03:28.700 --> 03:31.780] and they keep doing that to maintain it. [03:31.780 --> 03:37.820] So in your generous distribution, you will have your firmware which verifies the first-stage [03:37.820 --> 03:42.380] boot loader, which will be SHIM signed by Microsoft, and then SHIM, the first-stage [03:42.380 --> 03:46.780] boot loader, verifies the second-stage boot loader, and verifies the kernel, and the kernel [03:46.780 --> 03:49.420] verifies the kernel modules. [03:49.420 --> 03:54.420] So if you are within ring zero, the chain of trust is pretty solid. [03:54.420 --> 03:59.460] There is this small little thing to the side called user space where things are not so [03:59.460 --> 04:03.060] pitchy, and that is what we are trying to improve. [04:03.060 --> 04:08.300] So just a quick summary, we'll go into more details, but we want EATRDs to be signed. [04:08.300 --> 04:12.180] EATRDs are completely unprotected right now in most distributions. [04:12.180 --> 04:13.980] They are built locally. [04:13.980 --> 04:19.660] If an attacker or malware can get right access to that, they can embed their malware in there [04:19.660 --> 04:21.660] and you will be known the wiser. [04:21.660 --> 04:23.140] That's a bit of a problem. [04:23.140 --> 04:24.140] Same thing for your rootFS. [04:24.140 --> 04:29.460] It probably is encrypted these days, but that helps for offline attacks, not online ones. [04:29.460 --> 04:31.700] We want to do better there. [04:31.700 --> 04:37.260] One of the key requirements to use any of the specification infrastructure tools that [04:37.260 --> 04:41.900] we'll see is you need to have an hermetic Zesh user. [04:41.900 --> 04:42.900] What does that mean? [04:42.900 --> 04:46.060] It means your vendor tree must be within Zesh user. [04:46.060 --> 04:51.940] If you are in one of those handful of distributions that still have the top-level beans bean [04:51.940 --> 04:56.100] or lib directories that are not sim links, it's time to move on. [04:56.100 --> 05:02.900] Then Debbie and Marge finally kicking and screaming to get into merged users. [05:02.900 --> 05:05.460] That is our core requirement. [05:05.460 --> 05:12.220] The people who work on this stuff kind of got together from various distributions, companies, [05:12.220 --> 05:15.220] projects, and we created this UAPI group. [05:15.220 --> 05:16.780] We have a nice website. [05:16.780 --> 05:20.700] We have a GitHub organization with a discussion tab. [05:20.700 --> 05:23.980] There's already quite some interesting discussions going on there, so I encourage people who [05:23.980 --> 05:27.100] are interested in these topics to check that out. [05:27.100 --> 05:30.660] Now, what does actual image-based Linux mean? [05:30.660 --> 05:35.900] This is my personal understanding and analysis from my point of view. [05:35.900 --> 05:39.300] I see at least three different flavors of this. [05:39.300 --> 05:41.420] There's the pure image-based one. [05:41.420 --> 05:47.140] It's the one that I do in Azure where you build images, all images, and you ship to the machines. [05:47.140 --> 05:50.660] You have DM Verity to cover the root file system. [05:50.660 --> 05:51.820] I want to explain what that is. [05:51.820 --> 05:57.340] The next talk, we go into details, and then we have AB schemes for upgrade-downgrades. [05:57.340 --> 05:58.340] Nothing groundbreaking. [05:58.340 --> 06:01.500] Android has been doing this for 15 years or whatever. [06:01.500 --> 06:07.900] The second camp is the OS 3 one, pure or RPM-based, like Fedora Core, for example. [06:07.900 --> 06:13.140] What they do there is they build either packages or OS 3 snapshots, and then they apply them [06:13.140 --> 06:14.140] locally. [06:14.140 --> 06:16.300] You're a boot into the next snapshot or a different one. [06:16.300 --> 06:19.260] It's like having a Git tree for your file system. [06:19.260 --> 06:24.500] The root file system there is either a femoral or immutable runtime. [06:24.500 --> 06:25.500] I cannot remember. [06:25.500 --> 06:26.500] You cannot change it. [06:26.500 --> 06:28.420] You're a boot into the new one. [06:28.420 --> 06:30.660] ButterFest camp, very similar. [06:30.660 --> 06:34.500] Instead of OS 3, you use the ButterFest snapshot in capabilities. [06:34.500 --> 06:38.020] So you install a package that doesn't get installed in your root FS. [06:38.020 --> 06:41.820] You install it into the new snapshot and then reboot into it. [06:41.820 --> 06:46.620] The core thing I want you to take away from this is that there are different flavors, ways [06:46.620 --> 06:48.980] of stringing things up together. [06:48.980 --> 06:49.700] That's okay. [06:49.700 --> 06:53.140] That's what Linux is great at, at this diversity. [06:53.140 --> 06:57.580] But the core important concept is that we share goals. [06:57.580 --> 07:04.300] We share tools, infrastructure, and specifications, because we want the same thing in different [07:04.300 --> 07:05.300] ways. [07:05.300 --> 07:08.500] So let's look at some of these specifications. [07:08.500 --> 07:11.300] Fair warning, there's a lot of acronyms coming your way now. [07:11.300 --> 07:12.900] I apologize in advance. [07:12.900 --> 07:18.180] Now, UKI, you will hear a lot about this today, unified kernel image. [07:18.180 --> 07:19.180] What is this? [07:19.180 --> 07:24.340] A UKI is a single P binary, a UFI executable. [07:24.340 --> 07:25.340] Why is it good? [07:25.340 --> 07:29.620] Because you mix a UFI stub, a kernel, and an NTRD. [07:29.620 --> 07:32.940] And then you can sign it for secure boot. [07:32.940 --> 07:36.380] Remember I talked about the NTRD being unsecured before? [07:36.380 --> 07:37.820] This is how we fix it. [07:37.820 --> 07:40.420] The NTRD is no longer built locally. [07:40.420 --> 07:43.180] It's built by the vendor and shipped inside the UKI. [07:43.180 --> 07:47.700] So it's signed and it's verified as part of the boot process. [07:47.700 --> 07:51.380] I won't go into details in this process because one of the next talks will tell you everything [07:51.380 --> 07:53.140] about this. [07:53.140 --> 07:56.980] So the UKI is dropped into the boot partition or ESP. [07:56.980 --> 08:03.660] And then it's out to discover by bootloaders implementing the BLS, bootloader specification. [08:03.660 --> 08:08.500] What does that mean is that you don't need to configure your system to pick the UKI up [08:08.500 --> 08:10.140] when you boot. [08:10.140 --> 08:16.060] The bootloader, we parse what's available, get information out of it from the UKI itself [08:16.060 --> 08:17.260] and present you a menu. [08:17.700 --> 08:21.060] It's drop and plug and play basically. [08:21.060 --> 08:25.140] So this is supported by system reboot and there are patches for grab as well. [08:25.140 --> 08:31.620] I think Fedora will ship with those patches and hopefully they make it their way upstream. [08:31.620 --> 08:39.220] Another good thing about the UKIs is that not just we sign them and verify them as one, [08:39.220 --> 08:45.020] but also we can then predictably measure them into TPM in PCR 11. [08:45.020 --> 08:46.780] So the hashes will always match. [08:46.780 --> 08:49.540] If that doesn't make any sense to you, that's okay. [08:49.540 --> 08:53.300] And later we tell you everything about TPM and measurements. [08:53.300 --> 08:59.460] I just mentioned it here, so you have an overarching view of why this stuff is good. [08:59.460 --> 09:03.180] And we want to do some future work here, but the important thing is the specification is [09:03.180 --> 09:05.420] at this URL. [09:05.420 --> 09:06.540] And that's for UKIs. [09:06.540 --> 09:12.620] Now, next one, this is my favorite one, DDI, discoverable disk image. [09:12.620 --> 09:13.620] What is this thing? [09:13.620 --> 09:17.740] It's just a raw image wrapped into a GPT partition table. [09:17.740 --> 09:20.980] The good thing is that it is self-described. [09:20.980 --> 09:26.540] Each partition is tagged with a UID that is fixed and tells you what the purpose of the [09:26.540 --> 09:27.820] partition is. [09:27.820 --> 09:34.060] You don't need to say root equal devsDA5 because the partition is tagged with UID that says [09:34.060 --> 09:35.780] I'm the root of s. [09:35.780 --> 09:44.780] So also because security is important to us, this natively first class supports the unverity [09:44.780 --> 09:45.780] for protection. [09:45.780 --> 09:47.780] Again, the unverity will be delved into later. [09:47.780 --> 09:53.820] I won't tell you what it is, it's for securing the partition against tampering. [09:53.820 --> 09:59.060] All tools that support DDIs support the unverity natively. [09:59.060 --> 10:05.260] The other important thing is that given they are self-described, you just pass them to the [10:05.260 --> 10:09.980] right tool and they do the right thing that you expect out of the box. [10:09.980 --> 10:16.580] You put it in, if your disk where the SP was is a DDI, system D will automatically find [10:16.580 --> 10:20.340] the root partition by looking at UID and boot it from the internal D. [10:20.340 --> 10:26.580] If you pass the DDI to N-spawn, it will automatically use it for the root for a system or the container [10:26.580 --> 10:27.580] you're booting. [10:27.580 --> 10:32.780] You pass it to portable D or the system service as root image and it will automatically use [10:32.780 --> 10:37.140] for the root for a system of that amount namespace. [10:37.140 --> 10:42.580] You pass it to CZEXT and it will be automatically used to extend the root for a system. [10:42.580 --> 10:44.300] We'll see in an example that means. [10:44.300 --> 10:51.580] So one image format, self-described, give it to many tools, they do the right thing automatically. [10:51.580 --> 10:53.900] Security as first class concept. [10:53.900 --> 10:54.900] Now what's a CZEXT? [10:54.900 --> 10:58.300] This is important for the interdetalk later. [10:58.300 --> 11:01.420] So it's a specific type of DDI. [11:01.420 --> 11:04.460] It can be used to extend a root for a system. [11:04.460 --> 11:09.940] So it will ship the user hierarchy or the shop if you're a third party vendor and it's [11:09.940 --> 11:14.860] identified by the extension release file which matches the format of the OS release file [11:14.860 --> 11:17.180] that you're probably familiar with. [11:17.180 --> 11:20.780] Specification of this is that URL. [11:20.780 --> 11:30.180] So you get a root FS DDI, bunch of CZEXT DIs and bam, you get another AFS read-only that [11:30.180 --> 11:33.060] sums the content of all those images. [11:33.060 --> 11:36.380] And again, this is important for the later talk. [11:36.380 --> 11:40.380] Again, security, first class citizen, the dmware is supported for all of these and all [11:40.380 --> 11:43.900] the tools that use these CZEXT DIs. [11:43.900 --> 11:49.020] Same idea as before, you pass it to the right tool, it does the right thing by default. [11:49.020 --> 11:51.460] If it's your ESP, we'll see how it's be. [11:51.460 --> 11:53.900] It will be used to extend the internal D. [11:53.900 --> 11:58.700] If it's on var or ETC, system D will use it to extend your root FS. [11:58.700 --> 12:01.660] You pass it to portable D or to a system service. [12:01.660 --> 12:07.220] It will extend the root FS of the service or portable service. [12:07.220 --> 12:12.460] Again, with security in mind, so it's all protected read-only and enforced by the kernel. [12:12.460 --> 12:15.660] You pass it to EnSpawn and nothing happens because we don't support it yet. [12:15.660 --> 12:18.140] We should add that probably at some point. [12:18.140 --> 12:21.340] Specification of this URL. [12:21.340 --> 12:28.540] Now, all of these might sound like the clear and humbling of a raven lunatic, but I swear [12:28.540 --> 12:29.540] it's real. [12:29.540 --> 12:30.540] It exists. [12:30.540 --> 12:31.540] It's used in production. [12:31.540 --> 12:34.220] So what you can see here is the stuff that I work on. [12:34.220 --> 12:40.420] It's a PCI express card that has an ARM64 system on a chip. [12:40.420 --> 12:42.500] It's used in the Azure hosts. [12:42.500 --> 12:48.940] So the machines that run Azure virtual machines have these cards plugged in and the Linux [12:48.940 --> 12:54.700] R&D OS provides offloading and acceleration for the Azure nodes. [12:54.700 --> 12:58.140] So if you use Azure, you already use this DDI stuff. [12:58.140 --> 13:02.980] You just don't know about it because we use DDIs extensively and we are looking into UKIs [13:02.980 --> 13:03.980] as well. [13:03.980 --> 13:04.980] So this is all real. [13:04.980 --> 13:06.980] It's all true. [13:06.980 --> 13:10.540] Now, to conclude, come talk to us. [13:10.540 --> 13:12.980] We usually don't bite. [13:12.980 --> 13:18.860] Here's again the URL and for the website and for the GitHub organization. [13:18.860 --> 13:24.100] So we want you to join us and embrace some of the secure way of doing Linux. [13:24.100 --> 13:30.820] We want you to help us extend the specifications and also we want to finally get a work class [13:30.820 --> 13:33.940] of security bugs extinguished. [13:33.940 --> 13:37.620] And any questions? [13:37.620 --> 13:42.620] Yes, microphone. [13:42.620 --> 13:53.340] Hi, how would you compare UKIs to fit images from UBoot which also support signing and [13:53.340 --> 13:55.700] packaging all these parts into one single image? [13:55.700 --> 13:56.700] Yes. [13:56.700 --> 13:58.020] It is actually quite similar. [13:58.020 --> 13:59.420] Now, of course, the UBoot team. [13:59.420 --> 14:08.180] So UBoot is a firmware slash boot order environment used by embedded devices, essentially. [14:08.180 --> 14:15.580] They support this fit format, flattened image table and they have very similar concept absolutely. [14:15.580 --> 14:19.420] The main difference is done with TPMs in mind. [14:19.420 --> 14:23.860] I'm not sure UBoot support that and measure everything. [14:23.860 --> 14:24.460] Okay. [14:24.460 --> 14:28.300] I'm not very familiar with that, but they are very similar concepts. [14:28.300 --> 14:30.620] I don't know what the main difference would be. [14:30.620 --> 14:32.980] It's just different environments. [14:32.980 --> 14:42.220] I guess this is mainly for UFI, not UBoot is a specific boot loader environment, right? [14:42.220 --> 14:44.140] All right, okay. [14:44.140 --> 14:50.500] They are very similar, I guess, then, yeah. [14:50.500 --> 14:52.300] Thank you for the talk. [14:52.300 --> 15:00.420] From my understanding, we often in usual distribution have a SHIM sign and GRUB sign, [15:00.420 --> 15:05.460] but we don't have system UBoot sign or EXT Linux sign or UBoot sign. [15:05.460 --> 15:11.300] What is the plan in the future to have those sign maybe? [15:11.300 --> 15:12.860] Excellent question. [15:12.860 --> 15:21.500] Now, there is a group of people working on this problem of UFI and UFI 2.0 and everything [15:21.500 --> 15:24.540] that happened with the secure core PCs. [15:24.540 --> 15:26.540] Things are moving. [15:26.540 --> 15:29.940] I can't tell you much more than that right now. [15:29.940 --> 15:35.820] We do want to get as the boot sign for some internal use cases, so we want to get that [15:35.820 --> 15:43.140] allow listed to be allowed to be used as a payload for a second stage loader for SHIM. [15:43.140 --> 15:44.540] We have not done that yet. [15:44.540 --> 15:48.420] We would like to have that done at some point in the near future. [15:48.420 --> 15:49.420] Thank you. [15:49.420 --> 15:51.420] Can I make a comment? [15:51.420 --> 16:04.380] So, a kind of intermediate option is to have it signed by a certificate that is provided [16:04.380 --> 16:11.620] by the distribution and it's protected by the hardware security measurements and so [16:11.620 --> 16:17.140] on, but it's not trusted by SHIM, and then you can self-enroll on the first boot and [16:17.140 --> 16:19.940] have a trust on the first boot thing. [16:20.460 --> 16:26.260] We've done a bunch of work on this boot to allow self-enrollment on first use, so you [16:26.260 --> 16:27.260] can always do that. [16:27.260 --> 16:28.740] Of course, it doesn't work by default. [16:28.740 --> 16:31.700] You need to do the self-enrollment, but it's a step in the direction. [16:31.700 --> 16:34.060] Yes, thank you. [16:34.060 --> 16:35.060] Anything else? [16:35.060 --> 16:39.660] Can you pass me this, thank you. [16:39.660 --> 16:45.300] If you compile yourself, Linus Kernel, what you have to do then? [16:45.300 --> 16:56.220] So, in System 8.253, we ship a tool called U-key-fi, U-K-fi, U-K-fi, whatever you want, [16:56.220 --> 17:01.540] and that will, this one, will allow you to easily put together a UKI. [17:01.540 --> 17:05.300] Of course, in that case, you cannot sign with your off-site key. [17:05.300 --> 17:09.940] You need to do self-key enrollment and whatnot, but it is possible to build a UKI locally, [17:09.940 --> 17:10.940] absolutely. [17:10.940 --> 17:14.140] You need to sort out the signature by yourself, of course. [17:14.140 --> 17:20.900] I think that was at the back, yeah. [17:20.900 --> 17:25.580] You mentioned, well, I saw in one of your slides the abbreviation DPS. [17:25.580 --> 17:26.580] DPS, yes. [17:26.580 --> 17:27.580] What does that mean? [17:27.580 --> 17:30.740] Sorry, yes, I should have said that. [17:30.740 --> 17:34.500] Discoverable partition specification, yes, told you there were a lot of acronyms, so [17:34.500 --> 17:39.740] that is the list of all the UIDs and what they define, root of s, a verity, var, tmp, [17:39.740 --> 17:40.740] blah, blah, blah, et cetera. [17:40.740 --> 17:41.740] Thank you. [17:41.740 --> 17:42.740] Thank you. [17:42.740 --> 17:43.740] I completely forgot about that. [17:43.740 --> 17:45.220] I think we are three minutes. [17:45.220 --> 17:48.700] Any more questions? [17:48.700 --> 17:49.700] Anything online? [17:49.700 --> 17:50.700] Nope. [17:50.700 --> 17:56.700] Going once, going twice, well, thank you very much then.