[00:00.000 --> 00:18.760]  So, hello everyone, first of all, this is not my talk.
[00:18.760 --> 00:24.840]  I've received this talk because my colleague didn't manage to get the visa on time.
[00:24.840 --> 00:27.880]  So I'm sorry, I don't know anything about Kubernetes.
[00:27.880 --> 00:31.920]  I'm usually more into low-level stuff, kernel, and embedded.
[00:31.920 --> 00:35.600]  But I will deliver the talk with the notes that I received, and if you have questions,
[00:35.600 --> 00:38.200]  you can direct them by email to my colleague.
[00:38.200 --> 00:40.200]  I wouldn't be able to answer.
[00:40.200 --> 00:42.640]  I'm sorry for this in advance.
[00:42.640 --> 00:43.960]  Okay.
[00:43.960 --> 00:50.880]  So before getting to the architecture and principles of KubeOS, let's define what
[00:50.880 --> 00:51.880]  it's all about.
[00:51.880 --> 00:57.680]  So there is cloud-native development that is encouraged by the Docker and Kubernetes communities.
[00:57.680 --> 01:02.720]  And a lot of infrastructure is being cloudified.
[01:02.720 --> 01:07.760]  But some of the problems with the general-purpose operating systems reappear in this cloud-native
[01:07.760 --> 01:08.760]  environment.
[01:08.760 --> 01:14.320]  So you have container management, workload scheduling, automatic service deployment,
[01:14.320 --> 01:16.720]  rollbacks of updates, and so on.
[01:16.720 --> 01:23.480]  Those are all capabilities that are provided by Kubernetes, but it is unable to control
[01:23.480 --> 01:26.880]  the cluster-node operating system directly.
[01:26.880 --> 01:33.520]  So the first problem in cloud-native environments is the desynchronization between OS and Kubernetes
[01:33.520 --> 01:38.600]  that are managed and controlled completely separately.
[01:38.600 --> 01:43.000]  Also Kubernetes, like the operating system, needs management: upgrades, user access
[01:43.000 --> 01:45.920]  control, all these things.
[01:45.920 --> 01:53.000]  And then you have the ops guys, the operations people, sorry, that need to
[01:53.000 --> 01:57.120]  complete redundant tasks between the two systems.
[01:57.120 --> 02:03.360]  The maintenance is therefore usually poorly synchronized, and the upgrade or modification of
[02:03.360 --> 02:09.800]  OS components can affect the availability of the OS, which requires additional monitoring
[02:09.800 --> 02:11.880]  from Kubernetes.
[02:11.880 --> 02:18.680]  So an example is that operations staff must block the nodes to stop new workloads
[02:18.680 --> 02:23.880]  from arriving in order to upgrade the OS without interfering with Kubernetes.
[02:23.880 --> 02:30.000]  And after everything is clear and everything is updated, you can unblock the node again.
[02:30.000 --> 02:35.040]  So this makes it complicated and expensive.
[02:35.040 --> 02:39.920]  So another issue is the OS version management.
[02:39.920 --> 02:49.560]  So if you have a standard package manager, you can add, remove, and modify packages independently
[02:49.560 --> 02:53.640]  on the OS. At the beginning you have an image which is clean, but then your different
[02:53.640 --> 02:56.240]  instances start diverging.
[02:56.240 --> 03:00.120]  So you have what they call OS version splitting.
[03:00.120 --> 03:04.440]  So you will have different packages installed on different nodes.
[03:04.440 --> 03:10.960]  The versions of these packages can also differ, with security updates and all that stuff.
[03:10.960 --> 03:14.120]  So you have this divergence that appears over time.
[03:14.120 --> 03:20.320]  So if you want to ensure some integrity and consistency for your OS nodes,
[03:20.320 --> 03:25.640]  this divergence works against it.
[03:25.640 --> 03:32.360]  And yes, so if you also want to update to a major version, it's also more difficult.
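
The version splitting described above can be made concrete with a small sketch. This is only an illustration, not something shown in the talk: the node names, package names, and versions are made up, and hashing the package set is just one simple way to compare nodes.

```python
import hashlib

def os_fingerprint(packages):
    """Hash a node's installed package set (name -> version) into a short ID."""
    canonical = ";".join(f"{name}={version}" for name, version in sorted(packages.items()))
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

def find_version_splits(nodes):
    """Group node names by fingerprint; more than one group means the nodes diverged."""
    groups = {}
    for node, packages in nodes.items():
        groups.setdefault(os_fingerprint(packages), []).append(node)
    return groups

# Hypothetical cluster: node-b received a security update that the others did not.
nodes = {
    "node-a": {"openssl": "3.0.8", "curl": "8.0.1"},
    "node-b": {"openssl": "3.0.9", "curl": "8.0.1"},
    "node-c": {"openssl": "3.0.8", "curl": "8.0.1"},
}
splits = find_version_splits(nodes)
print(len(splits), "distinct OS states")
```

With a per-node package manager, nothing prevents the number of distinct states from growing over time; an image-based OS keeps it at one by construction.
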
[03:32.360 --> 03:36.320]  So other people have worked on this problem.
[03:36.320 --> 03:47.680]  So rebuilding the operating system is an approach that has been taken to solve these problems.
[03:47.680 --> 03:52.880]  So many packages that were previously part of the OS are moving to
[03:52.880 --> 03:54.080]  containers.
[03:54.080 --> 03:58.440]  So the host OS is less relied upon.
[03:58.440 --> 04:03.000]  We rely less on the host OS, so it can be replaced by a lightweight operating system
[04:03.000 --> 04:06.040]  with fewer services running, and so on.
[04:06.040 --> 04:13.040]  So a container OS is a lightweight operating system designed to run containers.
[04:13.040 --> 04:19.080]  And so, as in the figure on the right, it is the host OS, and it's not the OS running inside
[04:19.080 --> 04:20.240]  the container.
[04:20.240 --> 04:26.640]  So you have three important aspects: minimalism, immutability, and atomic updates.
[04:26.640 --> 04:36.840]  Minimalism means that you only include what you really need as components in the host OS.
[04:36.840 --> 04:43.280]  So the container OS requires a Linux kernel, container engines like Docker and containerd,
[04:43.280 --> 04:47.800]  and security mechanisms such as SELinux to ensure security.
[04:47.800 --> 04:52.840]  And all other applications run in containers, because you don't need
[04:52.840 --> 04:55.880]  them in the host.
[04:55.880 --> 05:01.040]  And this can also reduce the attack surface because you have less in the host OS.
[05:01.040 --> 05:06.280]  Immutability means that you use a read-only file system that can be configured at the start
[05:06.280 --> 05:09.440]  of the deployment, which also reduces the risk.
[05:09.440 --> 05:14.360]  And the atomic update means that you do the upgrade for the entire OS and not individually for
[05:14.360 --> 05:16.640]  packages.
[05:16.640 --> 05:22.720]  So CoreOS was started in 2013 and was the first widely used container operating
[05:22.720 --> 05:23.720]  system.
[05:23.720 --> 05:35.600]  You also have systems like AWS Bottlerocket, Flatcar, and Container-Optimized OS.
[05:35.600 --> 05:40.800]  So KubeOS is a container operating system built on openEuler, which is a distribution
[05:40.800 --> 05:42.720]  maintained by Huawei.
[05:42.720 --> 05:47.400]  So the main design concept of KubeOS is to use Kubernetes to manage the operating system.
[05:47.400 --> 05:52.680]  Once KubeOS has been installed on a cluster, the user only needs the kubectl
[05:52.680 --> 05:55.520]  command and YAML files on the master node,
[05:55.520 --> 06:00.480]  and the OS of the cluster worker nodes can be managed.
[06:00.480 --> 06:06.840]  And the OS in KubeOS is connected to the cluster as a Kubernetes component, putting it in the
[06:06.840 --> 06:09.960]  same position as the other resources in the cluster.
[06:09.960 --> 06:16.080]  And containers and the operating system can be managed in a unified way through Kubernetes.
[06:16.080 --> 06:21.440]  So an openEuler-based reconstruction is used so that the operating system can be updated
[06:21.440 --> 06:26.760]  atomically, to avoid the problems I introduced before.
[06:26.760 --> 06:32.360]  So now we are going to go into a little more depth about KubeOS.
[06:32.360 --> 06:39.880]  So the first feature is the ability to manage the OS directly through Kubernetes.
[06:39.880 --> 06:49.160]  So we use the API extension mechanism, custom resources (CRD), to design an OS resource and register it in the cluster.
[06:49.160 --> 06:54.200]  We use the Kubernetes operator framework to create a customized controller for the OS to monitor
[06:54.200 --> 06:58.120]  and manage it.
[06:58.120 --> 07:08.520]  So this Kubernetes operator framework, we use it to create the custom controller.
[07:08.520 --> 07:14.840]  So the user only needs to modify this CR, entering the expected OS status into the cluster, and
[07:14.840 --> 07:24.600]  KubeOS and Kubernetes handle this, and you only have to manage it in the control plane.
[07:24.600 --> 07:28.560]  So the next one is atomicity management of the OS.
[07:28.560 --> 07:31.680]  The KubeOS upgrade is an atomic dual-zone upgrade.
[07:31.680 --> 07:35.160]  It does not include a package manager.
[07:35.160 --> 07:40.240]  The change of each software package corresponds to the change of the operating system version.
[07:40.240 --> 07:45.680]  Then the OS version corresponds to a specific OS image or RPM package combination.
[07:45.680 --> 07:51.880]  Each software update as shown in this diagram is an OS version update.
[07:51.880 --> 07:57.000]  So you avoid the version splitting problems, and the cluster nodes remain consistent at
[07:57.000 --> 07:59.720]  all times.
[07:59.720 --> 08:06.200]  So KubeOS is lightweight, with unnecessary components removed to reduce the attack surface and enable
[08:06.200 --> 08:10.000]  faster start-up and upgrade.
[08:10.000 --> 08:13.280]  So this is a diagram of the overall KubeOS architecture.
[08:13.280 --> 08:14.960]  So you have two main parts.
[08:14.960 --> 08:21.000]  The first has three different components, OS operator, OS proxy, and OS agent,
[08:21.000 --> 08:26.320]  in the red box at the top of the diagram, which are used to connect to the Kubernetes cluster and to complete
[08:26.320 --> 08:28.280]  OS monitoring and management.
[08:28.280 --> 08:32.440]  And the second part is the KubeOS image creation tool.
[08:32.440 --> 08:37.880]  The user can use KubeOS scripts to generate KubeOS images from the openEuler repo source,
[08:37.880 --> 08:44.040]  which supports the generation of container images, virtual machine images, and so on.
[08:44.040 --> 08:49.720]  So the three main components I mentioned, OS operator, proxy, and agent, are critical
[08:49.720 --> 08:53.320]  to the ability to manage the cluster using Kubernetes.
[08:53.320 --> 08:58.040]  The OS operator and proxy are the operators we mentioned earlier.
[08:58.040 --> 09:04.120]  They will be deployed in the cluster as a Deployment and a DaemonSet, and will communicate
[09:04.120 --> 09:08.040]  with Kubernetes to issue upgrade instructions.
[09:08.040 --> 09:12.240]  The operator is a global OS manager that monitors all cluster nodes.
[09:12.240 --> 09:16.560]  When new OS version information is configured by the user, it determines whether
[09:16.560 --> 09:20.000]  to upgrade and sends the upgrade task to each node.
[09:20.000 --> 09:25.040]  The proxy is a single-node operating system manager that monitors the current node information.
[09:25.040 --> 09:29.160]  When the operator sends the upgrade notification, it will lock the node, expel the pods, and
[09:29.160 --> 09:32.440]  forward the OS information to the agent.
[09:32.440 --> 09:37.040]  The agent is not included in the Kubernetes cluster.
[09:37.040 --> 09:44.800]  It is the real executor of the OS management: it communicates with the proxy via Unix domain sockets, receives
[09:44.800 --> 09:51.320]  messages from the proxy, and performs the upgrade, rollback, and configuration operations.
[09:51.320 --> 09:59.600]  So for the upgrade process, we will use the workflow as an example.
[09:59.600 --> 10:03.040]  So we consider how the different components communicate and interact.
[10:03.040 --> 10:07.920]  First the user configures the OS information to be upgraded via kubectl and YAML files,
[10:07.920 --> 10:13.000]  such as OS version, address of the OS image, number of nodes to be upgraded concurrently,
[10:13.000 --> 10:14.280]  and so on.
[10:14.280 --> 10:20.040]  Then when the OS instance changes, the operator begins the upgrade process, labels the nodes
[10:20.040 --> 10:23.880]  that must be upgraded, and limits the number of nodes to be upgraded each time to the number
[10:23.880 --> 10:25.880]  specified by the user.
[10:25.880 --> 10:30.960]  Then the proxy checks to see if the current node is marked as an upgrade node, locks the
[10:30.960 --> 10:37.080]  node to expel the pods, and retrieves the OS information from the cluster before sending
[10:37.080 --> 10:39.040]  it to the OS agent.
[10:39.040 --> 10:43.480]  After receiving the message, the agent will download the upgrade package from the address
[10:43.480 --> 10:47.760]  specified by the user, complete the upgrade, and restart.
[10:47.760 --> 10:52.280]  After restarting, the proxy will detect that the node OS version has reached the expected
[10:52.280 --> 10:56.080]  version and will unlock the node and remove the upgrade label of the node.
[10:56.080 --> 11:01.760]  So this is the complete upgrade process.
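
The upgrade flow just described can be simulated in a few lines to show how the pieces interact. This is a toy model under stated assumptions, not the real implementation: the label name, the `Node` class, and the function names are invented for illustration, and the download, install, and reboot are collapsed into a single assignment.

```python
from dataclasses import dataclass

UPGRADE_LABEL = "upgrading"  # hypothetical label name

@dataclass
class Node:
    name: str
    os_version: str
    labels: set
    schedulable: bool = True

def operator_step(nodes, target, max_concurrent):
    """Operator: label out-of-date nodes, never exceeding the concurrency limit."""
    in_progress = sum(UPGRADE_LABEL in node.labels for node in nodes)
    for node in nodes:
        if in_progress >= max_concurrent:
            break
        if node.os_version != target and UPGRADE_LABEL not in node.labels:
            node.labels.add(UPGRADE_LABEL)
            in_progress += 1

def proxy_and_agent_step(node, target):
    """Proxy locks a labelled node, the agent upgrades it, the proxy unlocks it."""
    if UPGRADE_LABEL not in node.labels:
        return
    node.schedulable = False        # lock the node; pods would be expelled here
    node.os_version = target        # agent: download image, install, restart
    if node.os_version == target:   # proxy: version check after the restart
        node.schedulable = True
        node.labels.discard(UPGRADE_LABEL)

nodes = [Node(f"worker-{i}", "1.0.0", set()) for i in range(1, 4)]
rounds = 0
while any(node.os_version != "1.0.1" for node in nodes):
    operator_step(nodes, "1.0.1", max_concurrent=2)
    for node in nodes:
        proxy_and_agent_step(node, "1.0.1")
    rounds += 1
print(rounds, "rounds needed")
```

With three nodes and at most two upgraded concurrently, the loop needs two rounds, which is the effect of the user-specified concurrency limit mentioned above.
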
[11:01.760 --> 11:04.560]  Then finally the file system.
[11:04.560 --> 11:10.040]  So how do we design and upgrade the file system in KubeOS?
[11:10.040 --> 11:16.160]  It adopts a dual-area upgrade, as mentioned earlier, to upgrade the OS, so you have two
[11:16.160 --> 11:22.440]  root partitions. An upgrade from partition A downloads the updated image to partition
[11:22.440 --> 11:27.560]  B, and then sets partition B as the bootloader default, and then you restart
[11:27.560 --> 11:32.280]  from B by default, and the opposite happens for the next upgrade.
[11:32.280 --> 11:37.360]  So it's a classical dual image thing.
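
The dual-partition scheme can be sketched as a tiny state machine. This is an illustrative model only: the class and image names are hypothetical, and a real implementation would write a disk image and update the bootloader configuration rather than set Python attributes.

```python
from dataclasses import dataclass, field

@dataclass
class DualRootDisk:
    """Two root partitions; only the inactive one is written during an upgrade."""
    images: dict = field(default_factory=lambda: {"A": "os-1.0.0", "B": None})
    boot_default: str = "A"

    def inactive(self):
        return "B" if self.boot_default == "A" else "A"

    def upgrade(self, new_image):
        target = self.inactive()
        self.images[target] = new_image  # write the new image to the standby partition
        self.boot_default = target       # bootloader default; takes effect on restart

    def rollback(self):
        """Boot the previous partition again, provided it still holds an image."""
        previous = self.inactive()
        if self.images[previous] is None:
            raise RuntimeError("no previous image to roll back to")
        self.boot_default = previous

disk = DualRootDisk()
disk.upgrade("os-1.0.1")  # running from A, so the image goes to B
disk.upgrade("os-1.0.2")  # running from B, so the image goes to A
print(disk.boot_default, disk.images)
```

Because the running partition is never modified, a failed or unwanted upgrade can be undone by pointing the bootloader back at the other partition, which is what `rollback` models here.
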
[11:37.360 --> 11:43.360]  The file system of KubeOS is read-only, which improves security, but we also support
[11:43.360 --> 11:46.040]  persistent data partitions.
[11:46.040 --> 11:50.040]  There is the union path, which is mounted as an overlay, so the files in the image other than the
[11:50.040 --> 11:53.120]  ones the user changed can still be seen.
[11:53.120 --> 11:59.600]  There is a writable path, which adds a writable file layer on top of the image using bind mounts.
[11:59.600 --> 12:04.680]  There the files in the image are not displayed, only user data is stored. And there is also
[12:04.680 --> 12:09.760]  the boot partition, which contains the bootloader files.
[12:09.760 --> 12:15.120]  So we have defined the main concept and design of KubeOS, and implemented a set of components
[12:15.120 --> 12:20.320]  to complete the OS management, and we intend to continue adding more functions based
[12:20.320 --> 12:22.440]  on this.
[12:22.440 --> 12:27.680]  One thing is the ability to provide configuration. Like in the upgrade process, the configuration
[12:27.680 --> 12:32.520]  is delivered to the nodes via the Kubernetes cluster control plane to ensure
[12:32.520 --> 12:37.280]  the consistency of the configurations of the nodes. And given that some of the configuration
[12:37.280 --> 12:42.400]  must be complete before the nodes join the cluster, more configuration capabilities for
[12:42.400 --> 12:45.440]  the KubeOS image creation are planned.
[12:45.440 --> 12:48.440]  Then there is the improved upgrade capability.
[12:48.440 --> 12:53.000]  We have realized the basic OS upgrade function, and we will provide upgrade strategies that
[12:53.000 --> 12:59.000]  users can customize, such as upgrading based on the cluster node label, to provide more
[12:59.000 --> 13:00.400]  upgrade solutions.
[13:00.400 --> 13:06.600]  In addition to enriching the functions, we intend to improve the usability of KubeOS by displaying
[13:06.600 --> 13:11.600]  the upgrade or configuration process and improving the image creation tool so that users can
[13:11.600 --> 13:13.600]  more easily customize the image.
[13:13.600 --> 13:18.600]  Okay, and that's it.
[13:18.600 --> 13:27.760]  Sorry again about the questions, but for any question, you can always shoot my colleague
[13:27.760 --> 13:45.760]  an email.