[00:00.000 --> 00:08.640] Okay, we can start a bit early. [00:08.640 --> 00:11.040] So next speaker is Vananthau David. [00:11.040 --> 00:18.240] He will give a technical overview of Ubuntu Core, so thanks for having me. [00:18.240 --> 00:26.560] Yes, so this will be a very deep technical talk about how we do things in Ubuntu Core. [00:26.560 --> 00:34.960] Ubuntu Core is a distro. It's based on Ubuntu builds, so the Ubuntu packages, [00:34.960 --> 00:40.960] but you can't install or remove packages because we removed APT and DPKG and everything. [00:43.120 --> 00:51.680] The system is split in four atomic snaps, so there are four parts that you can upgrade independently. [00:51.680 --> 00:58.000] If you want to install anything on it, you need to install it as a snap. [00:59.680 --> 01:07.760] Ubuntu Core targets internal things for the month, so it's not a distro for desktop. [01:09.600 --> 01:14.480] You can have graphical isn't your first button, but it's not ready for desktop. [01:14.480 --> 01:24.800] Because it's targeting IoT, it supports lots of bootloaders, but in this talk, [01:24.800 --> 01:29.040] I will focus on UEFI because this is what is really interesting the rest is. [01:34.480 --> 01:40.480] Ubuntu Core has been doing a secret boot with fuller disconscription using the TPM for a while, [01:40.480 --> 01:44.640] and this was done before. SystemD had lots of nice tools to do that, [01:47.200 --> 01:52.560] so this is why I thought it was interesting to talk about how things were done because [01:52.560 --> 01:55.760] they might seem peculiar or it was a different approach. [01:58.720 --> 02:05.200] So what is a snap? A snap is, I mean, there is a different, you will see that there is a [02:05.200 --> 02:10.480] different type of snap, but what is command for a snap is just that it's a squash fs image, and [02:10.480 --> 02:18.080] there is a specific file that is metadata that describes what this image is for, and then after [02:18.080 --> 02:22.080] they will be depending on the type, there will be more information that you can get on that. [02:23.440 --> 02:35.040] The type of snaps first is the application, so this ship, the application and its run [02:35.040 --> 02:43.760] time, and it expects to have another snap to run on top, which is the base, [02:43.760 --> 02:49.360] which is the root file system, so it doesn't have the root file system, it just have the run time [02:49.360 --> 02:59.360] for the application itself. Typically, application exports services or commands. In the desktop [02:59.360 --> 03:05.200] world, we have also applications, desktop applications, but for Ubuntu Core, we don't care about these [03:05.200 --> 03:13.440] kinds of things, and those applications run confined. There is all type of application in [03:13.440 --> 03:21.840] Snap that we don't support in Ubuntu Core, which are not confined. We only support confined in [03:21.840 --> 03:30.320] Ubuntu Core. Then after, we have the base snaps, and those are the root file system, [03:31.440 --> 03:37.360] and they are used for the applications, but they're also used for the bootable system, [03:37.920 --> 03:45.120] so it has system in it. An application that runs on Ubuntu Core doesn't need [03:45.120 --> 03:50.560] necessary to have the same root file system as the host itself, it can just use a different [03:50.560 --> 03:57.760] version if it was built for another version. Then we have a snap D, because to handle the snap, [03:57.760 --> 04:06.880] how to unpack it and to install it, there is a demo, and this is distributed as a snap itself. [04:06.880 --> 04:13.840] It's not an application snap because it has specific things, so it has its own type. Then for [04:13.840 --> 04:23.920] Ubuntu Core, we have the kernel snap, which provides the kernel, which is a UKI, so it's [04:23.920 --> 04:36.560] a UFI kernel that is signed and has the entity in it. The snap also provides modules and firmware files, [04:36.560 --> 04:46.640] and then the final type is Gadget. I don't know why Gadget, but it provides the bootloader, [04:46.640 --> 04:52.880] so in our case, it's shim and grub, and it has also lots of configuration for snapd, [04:53.760 --> 05:02.240] how to make the image. Cloudy needs some initialization, and then we have also, [05:02.240 --> 05:12.240] if you want to have extra command line in your kernel. The disk starts, the image starts with [05:12.240 --> 05:17.840] just one partition, which is the ESP, when we call it the seed, and it contains grub and shim, [05:17.840 --> 05:26.080] of course, and it has the seed snaps, it's like the factory snaps that you can reverse to if you [05:26.080 --> 05:34.080] need it. It also has a seed kernel that has to be unpacked from the seed kernel snap, because [05:34.720 --> 05:46.080] grub cannot read from there, or I don't know, and it has also another file, it's a signed metadata [05:46.080 --> 05:57.120] file. How to explain, it explains how to update things, and it's some authority file. Once you [05:57.120 --> 06:00.640] have done the installation, I will not explain how the installation happens, because I don't think [06:00.640 --> 06:06.880] it's fun. I will explain how we run normally, but once it's installed, we have three other [06:06.880 --> 06:16.960] partitions that will appear in your disk. The second one is the boot, which also contains a grub, [06:18.080 --> 06:23.840] and typically the grub will change from the seed one, so you will always boot from the seed, [06:24.480 --> 06:28.160] and if it finds that it doesn't have to do recovery or anything, it has to do normal run, [06:28.160 --> 06:33.680] it will go to the second one. The second one will have the current active kernel that you have [06:33.680 --> 06:42.800] installed, and it will also have a seal key for the data partition, because at the time I think [06:42.800 --> 06:48.880] it was not common to you to have this seal key on to the header, looks to header, so it was done [06:48.880 --> 06:53.840] like that. Then after we have two partitions that are writable and they are encrypted, [06:53.840 --> 06:58.720] there's a safe partition, basically is to identify the device, it's not much on it, [06:58.720 --> 07:06.560] it's a very small partition, and then we have the data which contains most of the writable data. [07:07.120 --> 07:09.680] To get to have a runnable system, [07:15.040 --> 07:21.360] we have to do things in a trimfs, so first of all we use the systemd on both the trimfs [07:21.360 --> 07:28.880] and the main boot, but here I'm going to show the few things that we do that might be different [07:28.880 --> 07:34.800] to other systems, so one of the first steps that we have to do is mount all the disks that we have, [07:36.240 --> 07:44.160] and the first thing we do is measure the epoch, for now I think it's always just [07:44.160 --> 07:49.840] measure zero, it's something that has been done probably for revocation, if we need to revoke [07:49.840 --> 07:56.560] the code we can just increment that, but I don't know how useful it is because I don't think it [07:56.560 --> 08:01.760] has been touched. Then we mount the boot and seal partitions, those are not encrypted, [08:03.200 --> 08:08.560] from the seal partition we find the model and we will measure it in the PCR. [08:08.560 --> 08:22.800] After that we will find the seal key from the boot partition and we'll unseal it and open and mount [08:22.800 --> 08:29.760] the data partition, then we will do the same with the safe partition, the seal key for the [08:29.760 --> 08:35.360] safe partition is on the data partition, so we have to monitor the data partition before we open [08:35.360 --> 08:47.360] the safe, and last we will find the base snap that we need to mount and this we find it from the [08:47.360 --> 08:54.000] data partition, we will find a file that is described what is installed and we mount the base, [08:54.000 --> 09:02.000] the kernel and the gadget, so the base will be our sysroute and the kernel is needed because we need [09:02.000 --> 09:08.240] the modules and firmware and the gadget, there's some configuration there that we need, [09:09.200 --> 09:14.320] optionally we need snapd to be mounted also, but I will talk about this after, and then there's [09:14.320 --> 09:22.960] some looking of the seal keys so we don't unseal them again. Once we have mounted all the disks [09:23.840 --> 09:29.120] we have to prepare the file system, so first we'll buy mount the base into the slash sysroute, [09:29.120 --> 09:35.520] where we will do the switch route, then after we have to buy mount the user-lib modules and [09:35.520 --> 09:43.120] user-lib firmware, we do mounting of the seal and boot partition within boot, there's a specific [09:44.320 --> 09:49.520] way to do it, and then after we have to, from the data we have to do bind mounting of specific [09:49.520 --> 09:59.040] paths onto the root file system, so typically for example you want the slash of wire to be mounted, [09:59.040 --> 10:06.320] to be writable and your your your base snap is not writable because it's a squash fs, so you want [10:06.320 --> 10:17.040] to want to buy mount, we have a script for now that reads a list of paths that we have to do [10:17.040 --> 10:22.400] buy mount, and we also can configure saying that if it's not, if it doesn't exist we will initialize [10:22.400 --> 10:35.120] with the data that is, that is on the base a snap, or have it empty. This script will probably be [10:35.120 --> 10:47.440] replaced by using tmv files d and fstab, but it was written like that, but it wasn't there. [10:49.360 --> 10:55.280] We don't buy, we don't buy mount slash var and slash atc directly as writable, because most of it [10:55.280 --> 11:02.960] is not writable, this is to, here's the update, because we contract everything that needs to be [11:02.960 --> 11:08.560] updated in atc, and if someone modifies some file in atc we don't, it's very hard to track, [11:08.560 --> 11:14.720] so we have a specific list of paths that are bindable that you can write. [11:16.480 --> 11:21.360] So an example is atc systemd, this one we need to install [11:21.360 --> 11:33.680] services and mounts for all the snaps, so this needs to be, this path needs to be writable, [11:34.880 --> 11:41.760] but there are some things that are really annoying for us doing this, for example slash [11:41.760 --> 11:51.200] atc hostname, because systemd does some atomic write, so that means that it would make a temporary file [11:52.960 --> 11:59.840] in the directory slash atc slash osnab does some temporary name and do a switch, [12:00.640 --> 12:05.360] but it's not possible if slash atc is not writable, so there's some patches for that, [12:05.360 --> 12:14.080] and atc local time is even worse because it's a sim link that has to be rewritten, and we need [12:15.760 --> 12:18.960] to follow the sim links until we find the sim link that is writable, [12:21.120 --> 12:24.960] so it's not, all these things are a bit confusing and it's a bit annoying, so [12:24.960 --> 12:35.040] yes, and this was the initial decision during the initial time of fs, and then after we [12:35.040 --> 12:39.680] switched to the normal boot, and most of what happens is just systemd, [12:43.520 --> 12:48.960] there is two main things that will happen, is done by snapd, is mounting the active snaps, [12:48.960 --> 12:55.600] so and starting the snap service from the application that are installed, and those are [12:55.600 --> 13:02.880] just units in systemd, so they are installed by snapd inside slash atc slash systemd, [13:04.080 --> 13:14.480] and systemd will just start them for us, we just have snapd installing those things there, [13:14.480 --> 13:18.000] but the problem is that in the first boot snapd is not installed, [13:18.000 --> 13:22.640] that means that we have something special for the first boot where we find that snapd is not [13:22.640 --> 13:30.160] installed, so we have this process where we have to find a snapd that was mounted from the inner [13:30.160 --> 13:36.640] trim fs, run it to tell it to install itself so that we can just continue, so this was called [13:36.640 --> 13:51.280] the the the seeding of the snapd, so the disks are encrypted using lockstove, we use a tpm, [13:53.440 --> 14:01.600] and the thing that we the pcr that we use for for the policy are four seven and twelve, if you [14:01.600 --> 14:06.720] don't know what four and seven is, it's from the system, you don't have to worry about that, [14:06.720 --> 14:15.760] twelve is something that is that we deal with, so we do our measurements, and we might have [14:16.480 --> 14:22.560] several values, expected values, because we have parameters that the current parameter that change [14:22.560 --> 14:30.080] if we want to do recovery or some other things that needs to still mount the the five systems [14:30.080 --> 14:37.600] that are encrypted, so we have I think there's a policy or there's several values for the pcr, [14:37.600 --> 14:43.440] and then we have another in the policy, another thing is that we have a counter [14:44.160 --> 14:55.440] that we use for evocation, this is because we we need to reseal our keys every time we [14:55.440 --> 15:03.920] change what the values of the pcr can be, and to not allow all their values to be able to boot [15:03.920 --> 15:09.360] again, there is also a non-volatile counter that is that is used that we increment each time we [15:09.360 --> 15:17.360] have to change, what we measure in pcr12 is one is done by system distub because we use a system [15:17.360 --> 15:26.160] distub to make the uki, this is the kernel command line, and it's important for us because [15:27.280 --> 15:35.120] yeah, and then we have as I said before in the try my first we have the epoch and the model, [15:37.760 --> 15:44.240] there are some interesting things that we have, what happens when you have a failure that happens [15:44.240 --> 15:51.680] and you want to have the emergencies a shell to to happen, there is two things, first Ubuntu [15:51.680 --> 15:56.080] Core doesn't have, we try to not have password, you can have password but by default you don't [15:56.080 --> 16:04.800] have password, but even if we add we don't we have the intramfs that is built for all [16:05.760 --> 16:12.640] that has to be signed in the uki, it can't it can be only built for everyone, one built for [16:12.640 --> 16:21.040] everybody right, so that means that there is no, in the intramfs there is no, there is no [16:21.040 --> 16:25.920] password but we still need to have an emergency shell if we want to be able to debug things, [16:26.960 --> 16:36.320] to allow that we only have this emergency shell if we have a specific kernel command [16:36.320 --> 16:44.640] line parameter which is dangerous, and you have to remember that since we measure the kernel [16:44.640 --> 16:52.080] command line in pcr12 that means that you will not be able to unlock with the tpm your disk [16:52.080 --> 17:00.400] if you are in that shell, so you will need to use a recovery key to be able to unlock if you want, [17:00.400 --> 17:09.360] but we can do this, I mean we can do some debugging and figure things but you have to [17:09.360 --> 17:18.480] change things and to disallow you do, to unlock, to use a tpm, but maybe there is all the things, [17:18.480 --> 17:23.200] the other ways we could have done things, maybe we could have done some specific measurement for [17:23.200 --> 17:34.960] when there is any form of emergency, I don't know, oh yeah, let's see quickly, when we update the [17:34.960 --> 17:49.440] gadget, the interesting thing to know is that the gadget provides, the gadget snap provides the [17:49.440 --> 17:56.720] shim and grub and we might want to update it in different partitions, so the seed or the boot, [17:56.720 --> 18:02.160] usually we want only the boot, but if we might want to do the seed which does the recovery, [18:03.440 --> 18:10.400] there is some versioning that we use to do that, that's very important, then when we upgrade the [18:10.400 --> 18:19.440] kernel snap, we copy the new kernel in the specific way, in the boot partition we update a grub environment [18:19.440 --> 18:24.560] file, the grub will boot and see that there is something to try, try to boot and if it found [18:24.560 --> 18:30.000] that it did try to boot and it didn't work it will roll back, then after if snapd found that it [18:30.000 --> 18:34.800] managed to boot with the new kernel that it knows that it has finished the update and will update [18:34.800 --> 18:41.520] everything, this we can skip, this is not that important, if you have any questions. [18:53.120 --> 19:03.680] So I'm from the Bluetouch team and the Bluetouch team still use the writeable path system, [19:03.680 --> 19:11.360] the same thing that is still in use in Mudu core today, my question is that does Mudu core have [19:11.360 --> 19:20.160] any plan to move on from the system or are you still planning to use the system in the foreseeable [19:20.160 --> 19:38.240] future or any update on that? My plan is to move to TMP files and fstab, so use just tools that [19:38.240 --> 19:45.200] comes from systemd because we can do those things, it means there is a bit of duplication in how you [19:45.200 --> 19:51.280] write the things but that means that we have one less shell script to maintain that is not maybe [19:51.280 --> 19:59.840] the nicest thing, so but there is no, I don't think there is any date or when it will happen, [19:59.840 --> 20:06.720] it's just there is, we know it works and we have to decide when we move to that. [20:06.720 --> 20:28.480] Thank you. You mentioned resigning or resealing on updates, could you do away with that and use [20:28.480 --> 20:38.480] the trick with binding to a hash of a certificate and then using the certificate to sign whatever [20:38.480 --> 20:44.880] needs to be signed? Yeah, I mean that would be better, I mean what I showed you is how the [20:44.880 --> 20:52.880] state is now and it would be better not to have to reseal but we didn't have the, [20:52.880 --> 21:00.320] there was not that much experience I think in the community about that at that moment [21:00.320 --> 21:05.120] that we could just do that which was better so when it's been designed I think people didn't [21:05.120 --> 21:13.600] realize that could do something better but yeah I mean at some point we will get there but I don't, [21:14.640 --> 21:21.600] I don't think we have any fourth concept on Ubuntu Core using this yet, we have talked about it, [21:21.600 --> 21:29.360] that's it. We still have time for questions? Yes, last question. [21:32.320 --> 21:40.640] So it feels to me like using SNAPs in this way is a bit complicated, my question would be [21:41.360 --> 21:47.920] what benefits do you get from this approach compared to something like system D, CIS extensions [21:47.920 --> 21:57.360] that were talked about earlier today? You mean those, because the point is that [21:58.320 --> 22:02.480] the application SNAP is the important thing for the user experience because they want to have [22:02.480 --> 22:07.840] their own application and they can do quite a lot of things and this is very interesting and having [22:07.840 --> 22:14.800] everything to update in the same way makes things simpler for them, I mean it might not, because I [22:14.800 --> 22:20.960] showed you what was the complex things behind the curtain but I think the point of having [22:20.960 --> 22:29.200] SNAPs there is just for the simplicity of the users, people who make image for their applications [22:30.640 --> 22:34.240] they have to deal just with SNAPs and not many different technologies. [22:34.240 --> 22:47.680] Yeah, we could probably, I don't know how to use all the things and make it look like it's a SNAP, [22:47.680 --> 22:52.800] I'm not sure, maybe there is a way but which would be nice, I mean because if we have less [22:52.800 --> 22:58.080] code to maintain and we just have a wrapper to something that something else is with, [22:58.080 --> 23:02.960] it's much better but for the user I believe that it has to look like a SNAP. [23:02.960 --> 23:06.960] Okay, round of applause.