[00:00.000 --> 00:26.000] Thank you, let's welcome Francesco on the present and future of the OpenStack and Ceph integration. [00:26.000 --> 00:32.000] Okay, thanks everyone. This session is about OpenStack and Ceph integration, [00:32.000 --> 00:39.000] with a quick glance at the Kubernetes world. It is a quick agenda: [00:39.000 --> 00:46.000] I'm going to talk about what the integration with Ceph has looked like in the OpenStack ecosystem [00:46.000 --> 00:50.000] and in the OpenStack community in general, what the state of the art is, [00:50.000 --> 00:57.000] what has changed with cephadm in the bare-metal world, and what having Kubernetes in this picture means. [00:57.000 --> 01:03.000] There is also a demo; I'm not going to show it live, but it's linked in the session, [01:03.000 --> 01:06.000] so feel free to go through it later. [01:06.000 --> 01:12.000] So, for those not familiar with the OpenStack project: it's pretty old at this point. [01:12.000 --> 01:15.000] It's infrastructure as a service. [01:15.000 --> 01:22.000] As you can see on the right side, there are a lot of projects that are part of the OpenStack ecosystem. [01:22.000 --> 01:30.000] Each of them provides interfaces to the others, and they cooperate to provide processing, storage, and network resources. [01:30.000 --> 01:36.000] You can basically build your cloud infrastructure using all these projects together. [01:36.000 --> 01:46.000] It's open source, it's now about 13 years old, so a lot of these projects are stable now. [01:46.000 --> 01:53.000] And Ceph is part of this picture. You probably can't see this slide very well, [01:53.000 --> 02:03.000] but both in the compute and in the storage part we have Ceph, which can basically be used as a backend [02:03.000 --> 02:11.000] for Nova, which is the compute component, so we can provide ephemeral disks using Ceph as a backend. [02:11.000 --> 02:18.000] We have Manila providing file shares and Cinder providing block storage, [02:18.000 --> 02:25.000] Swift providing object storage, where there is a long story of integration with the RADOS Gateway, [02:25.000 --> 02:32.000] and Glance for images. So basically all these components you see there in the storage area [02:32.000 --> 02:41.000] can use Ceph as a backend, and this is the justification for the integration of these two technologies. [02:41.000 --> 02:47.000] So, why the integration, and why is it relevant? [02:47.000 --> 02:51.000] There is HCI, the hyperconverged infrastructure. [02:51.000 --> 03:04.000] Operators can save hardware resources by co-locating the compute part of the infrastructure and the OSD nodes. [03:04.000 --> 03:14.000] This means you can save hardware and have both technologies together serving a fully operational infrastructure. [03:14.000 --> 03:29.000] This is funny: I just asked ChatGPT why Ceph is relevant and why the integration of these two projects is interesting. [03:29.000 --> 03:42.000] I just put it there; you probably cannot see this part on the right, but it's basically my marketing sentence on why this thing makes sense. [03:42.000 --> 03:51.000] And there is scalability, resiliency, and all those kinds of keywords that we like a lot. [03:51.000 --> 03:58.000] But this session is about orchestrating services, deploying OpenStack and Ceph together.
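As a rough sketch of what that backend wiring usually looks like on the Ceph side (not shown in the talk; the pool and client names below follow the upstream Ceph/OpenStack documentation conventions and are assumptions here), a deployment typically creates a few RBD pools and a capability-restricted client key:

```bash
# Illustrative pool layout for the services mentioned above (names are the
# conventional ones; adjust to your deployment).
ceph osd pool create volumes        # Cinder block volumes
ceph osd pool create images         # Glance images
ceph osd pool create vms            # Nova ephemeral disks
ceph osd pool application enable volumes rbd
ceph osd pool application enable images rbd
ceph osd pool application enable vms rbd

# A capability-restricted key the OpenStack services can use to reach the cluster.
ceph auth get-or-create client.cinder \
  mon 'profile rbd' \
  osd 'profile rbd pool=volumes, profile rbd pool=vms, profile rbd-read-only pool=images'
```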
[03:58.000 --> 04:07.000] There have been a lot of deployment tools and strategies in the past, and I just want to focus on ceph-ansible and cephadm, [04:07.000 --> 04:14.000] because ceph-ansible has been the official tool, the most used tool, in the past, and now it's cephadm. [04:14.000 --> 04:18.000] Things have changed a bit, especially in the OpenStack ecosystem. [04:18.000 --> 04:27.000] So, previously the approach was: I need my infrastructure as a service, I need to deploy OpenStack and Ceph, [04:27.000 --> 04:37.000] I describe my entire cluster with a ton of variables, and I push the magic button that makes it so. [04:37.000 --> 04:44.000] So ceph-ansible was there in this picture during the deployment of OpenStack. [04:44.000 --> 04:50.000] There was this particular step where ceph-ansible was triggered to deploy Ceph behind the scenes. [04:50.000 --> 05:01.000] The drawback of that solution is that if you need to change anything in your Ceph cluster, you have to run the playbook again, [05:01.000 --> 05:10.000] the ceph-ansible playbook, because there is this Ansible layer that manages everything for you and needs to stay consistent with the state of the cluster. [05:10.000 --> 05:19.000] So basically it's all variables: the human operator is supposed to provide those variables and maintain them. [05:19.000 --> 05:29.000] It also affects scale-down and scale-up operations and day-2 operations, especially day-2 operations. [05:29.000 --> 05:40.000] If I want to change something in my Ceph cluster, I need to run the deployment again, because I can rely on Ansible idempotency, basically. [05:40.000 --> 05:49.000] But with cephadm things are a bit different: the state of the cluster is not made of tons of Ansible variables. [05:49.000 --> 05:57.000] There is cephadm, which is able to keep the state of the cluster, watch the cluster, and take action. [05:57.000 --> 06:02.000] I want to deploy a new OSD whenever I attach a new disk. [06:02.000 --> 06:14.000] I can do that, and I don't have to run the deployment again or do any fancy day-2 operation with the tool that is supposed to manage my infrastructure in general, [06:14.000 --> 06:20.000] which is made of both OpenStack and Ceph. [06:20.000 --> 06:30.000] And this changes things a bit, because we had the ceph-ansible world where we described our whole cloud infrastructure, [06:30.000 --> 06:37.000] we pushed the magic button, and everything was deployed, and sometimes broken. [06:37.000 --> 06:46.000] But now operators are more aware of the steps. Things have changed because you have to provide networks, [06:46.000 --> 06:54.000] and networks means that you want to manage your hardware, you want to build your networks, you want to use... [06:54.000 --> 07:04.000] This is specifically about the TripleO project, where we integrated ceph-ansible in the past and now we're moving to cephadm. [07:04.000 --> 07:12.000] Now people provide the networks, provide the metal, the description of the nodes, and these are just separate steps. [07:12.000 --> 07:23.000] The storage, Ceph, is deployed first, so you can start deploying the OpenStack infrastructure with a minimal Ceph cluster already in place. [07:23.000 --> 07:32.000] And when you deploy OpenStack, you can say: I have Cinder, I need volumes, I need the volumes pool, and I can finalize my cluster, [07:32.000 --> 07:35.000] creating and configuring the Ceph cluster accordingly.
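As a minimal sketch of the cephadm model described above, where the cluster reacts to newly attached disks instead of requiring a new deployment run (the service id and host label below are placeholders), an OSD service can be declared once and left in place:

```bash
# Simplest form: let cephadm create an OSD on any available device it discovers,
# now and whenever a new disk shows up.
ceph orch apply osd --all-available-devices

# The same idea as a declarative service spec, limited to hosts carrying a label.
cat > osd_spec.yaml <<EOF
service_type: osd
service_id: default_drive_group    # placeholder name
placement:
  label: osd                       # only hosts labeled 'osd'
data_devices:
  all: true
EOF
ceph orch apply -i osd_spec.yaml
```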
[07:35.000 --> 07:42.000] Similarly: I have Manila, I need CephFS, I can deploy the MDS, doing other stuff behind the scenes. [07:42.000 --> 07:57.000] But you're basically aware that, depending on the services you're deploying in your OpenStack infrastructure, you can enable pieces of your Ceph cluster and just tune it accordingly. [07:57.000 --> 08:08.000] At that point we still have the Ansible layer managing OpenStack and the infrastructure in general, but the Ceph cluster is seen as a separate entity. [08:08.000 --> 08:16.000] So it's like having an external Ceph cluster, even though you have the same tool deploying both technologies together. [08:16.000 --> 08:28.000] And cephadm and the manager, the orchestrator, are the pieces in Ceph that are supposed to provide the interfaces the operators can interact with. [08:28.000 --> 08:43.000] And that's basically this slide: we still have Ansible doing everything on top, but the orchestrator interface is what you can rely on to make changes in your cluster [08:43.000 --> 08:53.000] without having to redeploy everything again through Ansible, which can take a lot of time if your infrastructure is large. [08:53.000 --> 09:01.000] And the operator is not supposed to keep and maintain any variables in the Ansible system. [09:01.000 --> 09:16.000] You can just interact with the cephadm CLI and create a spec, which is a small definition of a Ceph service; it will be translated by the orchestrator and deployed as a new daemon in the Ceph cluster. [09:16.000 --> 09:28.000] So this is about why it's so easy: you have Ansible, and at some point you can just bootstrap a minimal Ceph cluster with this command, [09:28.000 --> 09:34.000] bootstrap, providing a monitor IP address, because the networks are already there at that stage. [09:34.000 --> 09:43.000] And you can create a spec file, which is basically the description of the nodes that should be enrolled in Ceph, and just apply it. [09:43.000 --> 09:53.000] It's even easier than running Ansible with a lot of roles and a rather long execution time, [09:53.000 --> 09:56.000] where day-2 operations are complicated. [09:56.000 --> 10:06.000] With cephadm, instead, and this slide shows the interaction with the cephadm CLI, you can query the cluster and see the health status. [10:06.000 --> 10:15.000] You can see where daemons are placed, you can list the hosts, manage their labels, and assign roles to these hosts. [10:15.000 --> 10:17.000] You can do a lot of things. [10:17.000 --> 10:30.000] But the real point here is that with cephadm we don't need to run the whole deployment of the OpenStack infrastructure again. [10:30.000 --> 10:36.000] An example of how projects are impacted by this change is Manila. [10:36.000 --> 10:47.000] It's not just that we need a new version of librbd to be compatible, or that we are changing from ceph-ansible to cephadm, [10:47.000 --> 10:53.000] but that we are making architectural changes to the use cases provided by OpenStack. [10:53.000 --> 11:01.000] Manila is the service that carves out CephFS volumes and provides them to the virtual machines within tenants, [11:01.000 --> 11:06.000] which means you have a dedicated network that you can use to mount your shares. [11:06.000 --> 11:11.000] And behind the scenes we have CephFS, or NFS with Ganesha. [11:11.000 --> 11:18.000] In the past, Manila was supposed to use DBus to interact with NFS Ganesha.
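A condensed sketch of the bootstrap-and-spec workflow described above, with placeholder IPs and hostnames (the spec files TripleO generates look different; this is just the plain cephadm flow):

```bash
# Bootstrap a minimal cluster (one mon plus mgr) on the first node.
cephadm bootstrap --mon-ip 192.168.24.1

# Enroll the remaining nodes and label them for their roles.
ceph orch host add ceph-node-1 192.168.24.11 --labels osd
ceph orch host add ceph-node-2 192.168.24.12 --labels osd

# Day-2 interaction goes through the orchestrator instead of a redeployment.
ceph -s                               # overall health
ceph orch host ls                     # hosts and their labels
ceph orch ps                          # where the daemons are placed
ceph orch host label add ceph-node-1 mon
```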
[11:18.000 --> 11:25.000] And it was complicated, because we had to run privileged containers. [11:25.000 --> 11:34.000] We had to use that interface to update and manage shares, using Ganesha as a gateway. [11:34.000 --> 11:43.000] And from an architectural point of view, we had an active-passive model built with Pacemaker and systemd. [11:43.000 --> 11:49.000] So you basically had Pacemaker owning the virtual IP as an entry point and then one active Ganesha, [11:49.000 --> 11:52.000] even though you had more than one instance, [11:52.000 --> 11:57.000] and with some systemd constraints. [11:57.000 --> 12:03.000] Now with cephadm there is an interface in the manager, the NFS cluster, [12:03.000 --> 12:09.000] and Manila can use a new driver to interact with this component. [12:09.000 --> 12:15.000] We don't need DBus anymore; we don't have to talk over DBus to the Ganesha container anymore. [12:15.000 --> 12:23.000] And that's the new model, where we rely on the ingress daemon provided by cephadm, [12:23.000 --> 12:27.000] and this ingress daemon is made of HAProxy and keepalived: [12:27.000 --> 12:33.000] keepalived owning the VIP as an entry point, HAProxy for load balancing, [12:33.000 --> 12:38.000] distributing the load across the NFS Ganesha servers. [12:38.000 --> 12:42.000] It's a better approach, but we still have some limitations in this area, [12:42.000 --> 12:47.000] because, considering that Manila is an infrastructure service for OpenStack [12:47.000 --> 12:54.000] that provides shares to tenant virtual machines over a dedicated network, [12:54.000 --> 13:01.000] we need client restrictions to prevent other tenants from mounting the same share. [13:01.000 --> 13:09.000] And there is an effort around the proxy protocol in Ganesha to make sure that we can [13:09.000 --> 13:16.000] use client restrictions with this new model, which is basically the current limitation. [13:16.000 --> 13:23.000] There is also some effort to provide floating, stable IP addresses to the ingress daemon [13:23.000 --> 13:28.000] and skip the HAProxy layer, which is always an additional hop. [13:28.000 --> 13:36.000] And in terms of performance, this is something that can have an impact, of course. [13:36.000 --> 13:45.000] Lastly: we had ceph-ansible, we have cephadm, so what does Kubernetes mean in this world? [13:45.000 --> 13:54.000] We have Rook as a way to deploy Ceph within Kubernetes, and regardless of how OpenStack is deployed, [13:54.000 --> 13:58.000] we have several possible combinations of these two things together. [13:58.000 --> 14:04.000] You can have a converged infrastructure where the OpenStack control plane is virtualized, [14:04.000 --> 14:12.000] or it can be containerized, so basically using the same model, [14:12.000 --> 14:15.000] the same approach, to deploy both. It can be useful, [14:15.000 --> 14:19.000] because it's a unified approach to managing your infrastructure. [14:19.000 --> 14:29.000] It's easily deployable and reproducible, because Kubernetes imposes a standard approach to deploying things, [14:29.000 --> 14:37.000] so we no longer have the flexibility that TripleO gives us today, but it's easier from that point of view. [14:37.000 --> 14:42.000] And the same Ceph cluster can be shared between OpenStack and Kubernetes. [14:42.000 --> 14:44.000] We have different workloads: [14:44.000 --> 14:54.000] Kubernetes mostly consumes the PVC interface provided by Rook.
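Coming back to the Manila part for a moment, here is a minimal sketch of the NFS cluster and ingress daemon described above, assuming a cephadm-managed cluster on a recent Ceph release; the cluster name, placement label, and virtual IP are placeholders:

```bash
# Ask the mgr to create a Ganesha cluster fronted by the ingress daemon
# (HAProxy plus keepalived owning the virtual IP).
ceph nfs cluster create mynfs "label:nfs" --ingress --virtual_ip 192.168.122.200/24

# Shows the virtual IP and the backing Ganesha daemons; Manila's driver talks
# to this mgr interface rather than over DBus to the Ganesha containers.
ceph nfs cluster info mynfs
```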
OpenStack, instead, is mostly RBD, and your workload runs in virtual machines, [14:54.000 --> 15:01.000] which usually live outside, so from the compute nodes you have to reach the Rook cluster, [15:01.000 --> 15:05.000] the Ceph cluster deployed by Rook within Kubernetes, [15:05.000 --> 15:14.000] which poses some networking challenges; they can be managed using host networking, though. [15:14.000 --> 15:21.000] So you're using Kubernetes as a platform to deploy your Ceph cluster, [15:21.000 --> 15:28.000] but you're still relying on host networking to reach it and provide RBD to the outside workload, [15:28.000 --> 15:32.000] and that's the point of this slide. [15:32.000 --> 15:38.000] There are some things that are not mentioned here, like some tuning in Rook [15:38.000 --> 15:44.000] that is supposed to be done to account for the fact that there is Kubernetes in the middle, [15:44.000 --> 15:51.000] so it's not the native Ceph cluster we usually have. [15:51.000 --> 15:58.000] So the point is that we should do some tuning, especially in the HCI world, [15:58.000 --> 16:03.000] because hyperconverged is still one of the most popular use cases, [16:03.000 --> 16:07.000] and HCI is provided out of the box by Kubernetes. [16:07.000 --> 16:13.000] You can tag your infra nodes, deploy Rook, and assign those nodes to OSDs. [16:13.000 --> 16:15.000] That's it. [16:15.000 --> 16:24.000] But at that point you have to make sure that both your cluster and the connection are available to the outside workload. [16:24.000 --> 16:28.000] This can be done, and it's done in this demo. [16:28.000 --> 16:34.000] I'm not going to show it, but it's all there, it's all described; [16:34.000 --> 16:37.000] it's everything I was describing in this slide. [16:37.000 --> 16:44.000] So you can have your OpenStack infrastructure deployed with DevStack or TripleO, [16:44.000 --> 16:52.000] still on bare metal, and it can consume a Ceph cluster deployed by Rook using RBD. [16:52.000 --> 16:58.000] You can still use the CSI, actually, to provide the PVC interface. [16:58.000 --> 17:11.000] It's not something that is mutually exclusive; it's just a good exercise to see how these two technologies can probably work together in the future. [17:11.000 --> 17:24.000] And yeah, just some additional resources for those interested in looking at these slides offline, and some contacts for people in the OpenStack world [17:24.000 --> 17:29.000] if you want to dig deeper into these experiments. [17:29.000 --> 17:42.000] And that's it. Thank you very much.
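For those looking at the linked demo, here is a rough sketch of the Rook side of that setup: a CephCluster exposed over host networking so that the bare-metal OpenStack nodes can reach the mons and OSDs directly. The namespace, image tag, and sizing are placeholders, and the Rook operator is assumed to be installed already.

```bash
kubectl apply -f - <<EOF
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v17      # placeholder release
  dataDirHostPath: /var/lib/rook
  network:
    provider: host                    # mons/OSDs bind on node IPs, reachable from outside
  mon:
    count: 3
  storage:
    useAllNodes: true                 # HCI-style: OSDs co-located on the tagged nodes
    useAllDevices: true
EOF
```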