[00:00.000 --> 00:10.000] Hi, everyone. How are you? How is hosting the weekend? Good? [00:10.000 --> 00:11.000] Yes. [00:11.000 --> 00:16.140] That's nice. I'm happy to be here. It's my first time in Europe and it's the first time [00:16.140 --> 00:23.400] that I will talk in English for a first event in person. This is pretty nice. My name is [00:23.400 --> 00:31.120] Edet Buja. I am a technology evangelist at Percona and this is a very basic and friendly [00:31.120 --> 00:43.040] introduction about databases and containers. About me, I am from Peru in South America. [00:43.040 --> 00:49.800] I am working as Six Months in Percona. It's an open source company. We create open source [00:49.800 --> 00:59.800] databases free. I am a Google woman tech maker. I was nominated as a docker captain last year [00:59.800 --> 01:06.280] and I am a database and container enthusiast. You can follow me on Twitter and LinkedIn. [01:06.280 --> 01:18.360] I used to post about containers, Kubernetes, open source. For the agenda today, we are going [01:18.360 --> 01:28.120] to see about containers. We will see docker architecture. We will see the workflow between [01:28.120 --> 01:35.240] the components of docker. We are going to have two examples of how we are running a single [01:35.240 --> 01:41.160] Percona server MySQL container and we are going to run multiple containers for Percona [01:41.160 --> 01:48.960] server MySQL. We will see the docker volume, how this is important in this work of databases [01:48.960 --> 01:58.480] on containers. We will see backups, restores of databases and best practices. Let's start [01:58.480 --> 02:10.520] it. What's a container? How many of you knows what's with docker? Yeah, a lot. Okay. That's [02:10.520 --> 02:16.960] nice. Docker or do you use other tools? Yeah, there are different kinds of tools for container [02:16.960 --> 02:27.760] application. But a container is like a single unit, lightweight unit of software that package [02:27.760 --> 02:32.400] everything that you need for your application. When we run application, when we build application, [02:32.400 --> 02:37.120] we know that we need a lot of packages. If you are running, for example, if you are building [02:37.120 --> 02:44.640] a Java application, you need libraries, dependencies, many things to run your application. So everything [02:44.640 --> 02:50.480] have to be containerized in a single unit of software and this is going to be isolated [02:50.480 --> 02:56.440] for other things like your infrastructure. And the good thing is that your container [02:56.440 --> 03:02.800] can run on different platforms in your laptop, in your server, in your cloud. With this, [03:02.800 --> 03:08.920] we end with a problem that we have when we say, hey, your program runs. Yes, this works [03:08.920 --> 03:17.040] just on my computer. But no, it has to run in different platforms. We don't need to have [03:17.040 --> 03:24.600] this problem to dependencies and other kind of things when we test our application in [03:24.600 --> 03:32.680] other platforms. There are different tools, as I say, for containerization. We have container [03:32.680 --> 03:39.560] interface, for example. We have container D and we have Docker that is the tool that [03:39.560 --> 03:48.800] we are going to focus now. All these tools are also in the cloud native computing foundation [03:48.800 --> 03:54.520] ecosystem. If you see the landscape, you will see a lot of tools there. There is a part [03:54.520 --> 04:02.760] for containerization and there are more than three. There are a lot of tools for them. [04:02.760 --> 04:11.840] The Docker architecture, it works like a client-server model. We have the Docker DMO, which is going [04:11.840 --> 04:20.040] to process all the commands. It's going to start to listen to the client always and the [04:20.040 --> 04:29.200] client is going to send a request to the DMO through the REST app. With this model, the [04:29.200 --> 04:36.600] Docker DMO can also manage network containers, images, and Docker volumes. If we go more [04:36.600 --> 04:43.040] in detail, we will see that we have the client, the DMO that is also called the engine of [04:43.040 --> 04:48.480] Docker, and we have another component that could be your Docker registry, the public, [04:48.480 --> 04:55.840] which is Docker Hub, where all the official images are published, and also we can have [04:55.840 --> 05:02.360] our own private registry in case we don't want to share it with the public. In this [05:02.360 --> 05:09.680] case, this is the flow of a component. For example, if we do a pull, we are going to [05:09.680 --> 05:18.840] try to bring the image from the Docker Hub into the Docker DMO cache. If the Docker DMO [05:18.840 --> 05:24.120] doesn't find the image in cache, it's going to bring it from the Docker Hub. But if this [05:24.120 --> 05:28.600] is in cache, it's going to take it just that and start to process. The same with Docker [05:28.600 --> 05:37.640] build. When we run Docker build from the client, the Docker DMO will try to take a Docker file. [05:37.640 --> 05:45.720] A Docker file is a recipe with a lot of instructions where we put all the commands to run our application [05:45.720 --> 05:51.400] and deploy it. So I'm going to, the Docker DMO is going to take the Docker file and build [05:51.400 --> 05:59.000] it, build the image, and if you want, we can also run it. We run, we will create a container. [05:59.000 --> 06:06.480] The container is our application that is already alive and is ready to make connections [06:06.480 --> 06:15.840] of petitions. One more thing here is that we can have everything in our host or we can [06:15.840 --> 06:31.600] have clients, remote clients that could make petitions to the Docker DMO. Container benefits. [06:31.600 --> 06:37.080] There are pros and cons, but now I'm going to focus on these benefits, the containers [06:37.080 --> 06:44.920] give us. So one of these is we can reduce costs with this because we can run several containers [06:44.920 --> 06:51.240] in a single infrastructure. That's infrastructure that we have because of the technology of [06:51.240 --> 06:58.560] containers is different than the virtualization. In virtualization, we use the hypervisor and [06:58.560 --> 07:06.440] when you create virtual machines, it consumes more resources from your, from your infrastructure, [07:06.440 --> 07:11.160] but when you use containers, it's very different. You are using that technology, a container [07:11.160 --> 07:16.680] would make it possible to run different, a lot of containers in a single machine. So [07:16.680 --> 07:23.040] for that reason, it's possible to reduce costs. Also, the containers are very friendly with [07:23.040 --> 07:28.760] continuous integration and continuous delivery process. If you have like a big application, [07:28.760 --> 07:35.680] a monolithic application, this, and you want to, you want to run container, you want to [07:35.680 --> 07:42.080] integrate it in the DevOps process. This is going to be hard. We have to work like microservices [07:42.080 --> 07:50.320] to make each service as a container and included in the continuous integration and continuous [07:50.320 --> 07:56.440] delivery process. It's easy. When we build, when we build our application over a container, [07:56.440 --> 08:01.120] it's easy to kill it. It's easy to create it again. It's easy to fail and the process [08:01.120 --> 08:11.480] is faster. Another benefit is the multicloud compatibility with the time several companies [08:11.480 --> 08:17.760] try to migrate to a hybrid cloud. They just don't, don't want to have everything on premise. [08:17.760 --> 08:27.640] They also want to scale. They want to grow. So for a reason, they opt for cloud and containers [08:27.640 --> 08:40.520] fit very good in this. You can install Docker. I know you did it. You can choose your distro. [08:40.520 --> 08:46.120] You are, you use Debian, the CentOS, everything. So you can go to the official Docker documentation [08:46.120 --> 08:52.680] and easily look all the steps. When you install this, it will install it, the Docker client, [08:52.680 --> 09:01.360] the Docker DMO and other tools that you will need to use Docker in your local matching. [09:01.360 --> 09:08.720] We already talk about containers, right? But this talk is about exploring database on containers. [09:08.720 --> 09:14.400] We are going to talk about my SQL, which is at this base relational database. We know [09:14.400 --> 09:22.560] that it's a database. And to run my SQL on containers, we need to understand how volumes [09:22.560 --> 09:28.680] works because the most important thing running databases on containers is the data. If we [09:28.680 --> 09:38.200] lost the data, we lost everything. For the next slides, we are going to focus in this [09:38.200 --> 09:47.760] part. We will use the image of Percona server for my SQL. This Percona server for my SQL [09:47.760 --> 09:55.800] is open source. It's like my SQL, but with more nice things. You can use it. It's open [09:55.800 --> 10:02.920] source. It's in Docker Hub. So we will use this image and we will create a Docker container. [10:02.920 --> 10:08.240] We will see how it works with all volumes. We will see the layers in Docker and then [10:08.240 --> 10:16.120] we will create a persistent volume and we will see how it changes in the layers of Docker. [10:16.120 --> 10:25.720] So just here to see that if you want to have an image, it's necessary to have a Docker file. [10:25.720 --> 10:33.360] You can use a Docker file before by yourself. That's good. A Docker file is a recipe where [10:33.360 --> 10:39.920] you will put everything for your application. So you need this to create an image. Then [10:39.920 --> 10:47.040] you need an image to create your Docker container. There are three essential steps here to remember [10:47.040 --> 11:00.840] how Docker works. We will run a single Percona server for my SQL container. We will use Docker [11:00.840 --> 11:10.680] run to create the image. No. We don't use Docker run to create the image. We use Docker [11:10.680 --> 11:19.520] run to create a container. So we use this to create a container. So we will do dash [11:19.520 --> 11:28.960] D to say run this container in the background. I don't want to use the terminal. And I will [11:28.960 --> 11:35.880] call it Percona server for my Percona server one. I will pass it like the environment variable, [11:35.880 --> 11:41.880] for the root. This is not a good practice here. This is just to show how we are going [11:41.880 --> 11:46.880] to create a container. And we will use this official Percona server for my SQL. With this [11:46.880 --> 11:53.800] I am creating a container, right? I'm creating a container with this one. Okay? So if we [11:53.800 --> 12:03.000] go to Docker image LS, this is going to pull the image of Percona server and then it will [12:03.000 --> 12:09.160] create the container. That command is going to do two things. It's going to bring the [12:09.160 --> 12:15.600] image from the official Dockerfab and it's going to create a container. So if we see [12:15.600 --> 12:27.400] Docker container PS, our container is up. Okay. After we have the database, we need [12:27.400 --> 12:33.120] to add data. We will add databases, we will add data, we will change registers, we will [12:33.120 --> 12:43.080] have transactions, many things that we can do like a regular database. Okay. If we run [12:43.080 --> 12:50.520] a single Percona server in my SQL container, we know how it works in layers. If we see [12:50.520 --> 12:57.600] this in green, there are layers from Percona, Percona server image. This is the image that [12:57.600 --> 13:04.040] we pull it, that we can change. This is just react only. We can change this, but in top [13:04.040 --> 13:10.040] of that, it's going to be created a layer, a new layer. This layer, this layer is react [13:10.040 --> 13:16.320] only. I can add data. This layer is the one that will contain all the things that I am [13:16.320 --> 13:23.720] doing in Docker on that image, on that container. I added a new database. Yes. I create a new [13:23.720 --> 13:29.120] registry. I delete it. I add the transactions. All this is going to save it here. But what [13:29.120 --> 13:35.720] happens if I don't have volume? My container is ephemeral, right? It could die. It could [13:35.720 --> 13:41.480] crash. My machine could crash. And all my data is going to be lost. I will, I will [13:41.480 --> 13:48.880] lose all the data. We will see how it works with multiple containers. To run multiple [13:48.880 --> 13:53.960] containers with the same image, if we see this is the same image, the same version of [13:53.960 --> 13:58.600] the image, we will just change the name of this container. Also, we can change another [13:58.600 --> 14:09.320] thing because this is a database, right? What thing we can change? They run in a port, right? [14:09.320 --> 14:17.600] In which port my SQL used to run? Yeah. So I need to change the port for the other container [14:17.600 --> 14:29.760] to avoid the conflict. Okay. How it works in layers. The same. We will use the same layer. [14:29.760 --> 14:35.280] We will use the same layer for Percona, Percona server, which can, we can modify. But in top [14:35.280 --> 14:40.160] of that, we are going to have two layers more. One of the first containers that I created [14:40.160 --> 14:45.880] and the second for the other that I can add. I can add data. I can change things. But once [14:45.880 --> 14:51.040] again, if I don't have volume, this is going to die. But this is how to work if we want [14:51.040 --> 15:02.280] to create an application when it doesn't matter if we save the state of this application. [15:02.280 --> 15:13.760] This is important. Persist data in databases is really important for this kind of application [15:13.760 --> 15:20.840] because sometimes we think that, like Kubernetes, since it was created for a state less application, [15:20.840 --> 15:26.480] but now we have options to use stateful applications on containers. And this is one of the reasons. [15:26.480 --> 15:33.240] Create volumes. So it's pretty easy to create volume. We can create a volume just with dash [15:33.240 --> 15:41.920] V or dash, dash volume. And we can say it, we can create a local volume with local run [15:41.920 --> 15:49.240] and detach. We will call it Percona server. The same process. And when we say dash V, [15:49.240 --> 15:56.320] we are saying, okay, this will be my volume in a host, in my local data directory. And [15:56.320 --> 16:11.640] this one is going to be inside my container. So this is like a mirror from this image. [16:11.640 --> 16:17.880] And how it works. In layers, we have the same, the layer that we can modify. And in top of [16:17.880 --> 16:23.280] that, we are going to create another layer. But in this case, we are adding, we are creating [16:23.280 --> 16:33.760] the mounted volume in BarLivMySQL. There are other directories that we can create the volume. [16:33.760 --> 16:40.360] I am just adding, as an example, this, because in MySQL, we have configuration files. We [16:40.360 --> 16:46.400] have logs. We have another things. But for that, we want to create these volumes for [16:46.400 --> 16:51.800] all of that things. I am just adding, as an example, BarLivMySQL, which is also a directory [16:51.800 --> 16:57.840] that is very important. And this local directory is the one that could be in my host. But it [16:57.840 --> 17:02.720] is not recommended, because if your host crashes, everything crashes too with your volumes. [17:02.720 --> 17:21.680] It is preferable to run it in a remote host. Okay. Two backups. Who here make backups? [17:21.680 --> 17:29.760] Okay. I use the very easy way to make backups. I use it for logical backups, my SQL dump [17:29.760 --> 17:36.520] used in the container. And for physical backups, we use in the company PerconextraVacup, which [17:36.520 --> 17:44.640] is, have more features to have that physical backup. And for restore, I will use also my [17:44.640 --> 17:54.040] SQL dump. And we don't use PerconextraVacup in this case, because it has a lot of pins. [17:54.040 --> 18:03.400] For backup, I will execute a backup in a container that is already running. PerconaserverVacup [18:03.400 --> 18:10.280] is already running. Let's see that we created. And we are executing Docker exit, it, to enter [18:10.280 --> 18:19.920] into the Percona in that container and type that common, my SQL dump, to create a backup [18:19.920 --> 18:27.520] of the database. So the backup is going to be in that file, dump SQL. And the same process [18:27.520 --> 18:35.240] with restore, we can take that backup. And this is a different container. I'm going to [18:35.240 --> 18:42.480] restore the dot SQL file in a different container. In this case, in PerconaserverRestore, using [18:42.480 --> 18:52.040] my SQL, use that command, my SQL. Okay. Best practices or some recommendation to use containers [18:52.040 --> 19:01.120] in database. Okay. And one of this is that we can keep constantly monitoring our database [19:01.120 --> 19:05.840] and the whole system, because we don't know when we are going to don't have enough resources [19:05.840 --> 19:11.960] for our containers. We should be aware of that or have notifications to say, hey, you [19:11.960 --> 19:17.280] don't have a note disk, you don't have a note memory, so provision or try to scale in your [19:17.280 --> 19:22.880] resources. So we should keep monitoring. Using some tools for that, for example, is PMM. [19:22.880 --> 19:30.960] We can use open source monitors to monitor our databases on containers. And we can store [19:30.960 --> 19:36.000] this data in persistent volume outside the container. It recommended no inside the container, [19:36.000 --> 19:44.720] because it's easy to create plans for recovery. We can restore the data easily also and fast. [19:44.720 --> 19:52.640] We should limit the resources of utilization of our containers. Our containers, we know [19:52.640 --> 20:01.280] that they are small, but also we should limit when they are a lot. And we should regularly [20:01.280 --> 20:09.800] have backups of the database and store these backups in a different location. And have [20:09.800 --> 20:16.400] a plan of migration and disaster recovery is really great. In that case, having a monitoring [20:16.400 --> 20:28.160] tool helps a lot. And what more? That's all. You can find me in LinkedIn and Twitter. [20:28.160 --> 20:43.520] Okay, we have time for questions. If you absolutely need to leave and you can't wait until the [20:43.520 --> 20:58.800] talk is over, please do so as quietly as possible so we can understand the questions. Thanks. [20:58.800 --> 21:02.960] Hi. Thank you so much for your talk. It was really interesting. I'm wondering what kind [21:02.960 --> 21:08.320] of limitations do you see when you're speaking about having a databases arriving in containers? [21:08.320 --> 21:16.080] There is storage limitations, CPU, or something else? Guys, can you please be a little quiet [21:16.080 --> 21:21.840] so we can understand the question? All right, I will try it with the microphone. [21:21.840 --> 21:27.120] Yeah, you. The people can you. Thank you. I was wondering maybe, first of all, really [21:27.120 --> 21:31.760] cool talk. Thank you so much. My question would be, could you maybe talk us through some kind [21:31.760 --> 21:39.440] of limitations that you can see when you're running databases from containers? You didn't [21:39.440 --> 21:46.880] understand it? Thank you so much for the talk. It was really cool. Maybe you can share with [21:46.880 --> 21:52.480] us some kind of limitations that you see when you're running to the solution of running databases [21:52.480 --> 21:57.120] inside containers, right? You cannot really run very big database. You probably will have [21:57.120 --> 22:03.200] a problem with that. What kind of limitations do you see? [22:03.200 --> 22:10.720] So, yeah, the question is about sorry, the question is about what limitations you can [22:10.720 --> 22:17.840] run into with database containers? Yeah, I don't want to say this, but it depends really of the [22:18.560 --> 22:24.000] business. Okay, if you want to invest a lot of money in infrastructure, but because at the end, [22:24.000 --> 22:30.320] your database, the volume that you have is not going to be part of your container, [22:30.320 --> 22:35.200] it's going to be outside. And this depends on you. You want to invest a lot of money [22:35.200 --> 22:42.240] to save that data. It's good. You want to replicate it? Please try and be quiet while we [22:42.240 --> 22:56.320] are asking questions. Are there any more questions? [22:56.320 --> 23:13.680] There is one more question from the back, so please be quiet. [23:13.680 --> 23:22.800] Thank you. Hello. I wanted to ask, did you notice any kind of performance issues? [23:22.800 --> 23:29.840] Did you benchmark things? Did you identify some kind of overheads going on when you [23:29.840 --> 23:57.200] containerize a database like MySQL or other kind of databases really? Sorry, I didn't get your [23:57.200 --> 24:03.360] question. All right, I'm just going to ask you. When you containerize a database, [24:03.360 --> 24:09.760] be it MySQL or Postgres or any kind of open source database that you may have tested on this kind [24:09.760 --> 24:18.560] of setup, did you notice any kind of overheads, compute, memory, or disk, essentially, where [24:18.560 --> 24:24.960] you can see that the database performance or operation is significantly affected by the fact [24:24.960 --> 24:33.920] of being containerized? I'm not sure about that, but if you use open source to monitor [24:33.920 --> 24:40.800] your containers on databases, you can have a visualization of these things if you don't have [24:40.800 --> 24:47.120] enough resources so it can show you alerts or things like that where you can figure out where [24:47.120 --> 24:55.520] exactly is your limitation. Okay, so for example, did you run Benchmark? [25:02.400 --> 25:06.960] Could you help me? Okay, could you help me to answer? Okay, my friend is going to help me to [25:06.960 --> 25:14.880] answer this. All right, thank you. Yeah, thank you to you. Hey, so usually the performance [25:14.880 --> 25:22.240] degradation is around two, three, four percent. The issue is more about how you configure the [25:22.240 --> 25:29.280] database, kind of storage, if it's local or network storage, but the virtualization is [25:29.280 --> 25:38.080] minimal. It's like running on a EC2 instance. Okay, so there is an impact, miserable, at least you [25:38.080 --> 25:45.280] say around four or five percent, but you say that's not going to be the, that there are configurations [25:45.280 --> 25:49.840] we can do to try to avoid that. Do you have any kind of paper or any kind of resources that we might [25:49.840 --> 26:02.080] use to avoid those kind of bottlenecks? If I got correctly, not much. The measure that we do in [26:02.080 --> 26:09.680] databases is measuring TPS. So you will notice on, if we're running Benchmarks, we've seen Bench, [26:09.680 --> 26:17.120] for example, three percent, like if you are running 1,000 credits per second, you will get [26:17.760 --> 26:27.280] 980, 990 credits per second when containerized. Okay, and do you have any kind of recommendations, [26:27.280 --> 26:32.000] kind of generic recommendations you can do so that when you run a database in a container, [26:32.000 --> 26:36.480] here is what you can do to try and negate some of the performance bottleneck that you guys have [26:36.480 --> 26:48.240] noticed? To be honest, on real-day activities, I would say 99 percent of the performance will come [26:48.240 --> 26:57.520] from how you configure my SQL, not the containerization is like just a small piece of the game. [26:59.760 --> 27:03.680] You can make more effect by modifying the database configuration. [27:04.480 --> 27:05.760] All right, thank you very much. [27:05.760 --> 27:19.040] Thanks to you.