[00:00.000 --> 00:10.000]  Hi, everyone. How are you? How is the weekend going? Good?
[00:10.000 --> 00:11.000]  Yes.
[00:11.000 --> 00:16.140]  That's nice. I'm happy to be here. It's my first time in Europe and it's the first time
[00:16.140 --> 00:23.400]  that I will give a talk in English at an in-person event. This is pretty nice. My name is
[00:23.400 --> 00:31.120]  Edith Puclla. I am a technology evangelist at Percona and this is a very basic and friendly
[00:31.120 --> 00:43.040]  introduction to databases and containers. About me: I am from Peru, in South America.
[00:43.040 --> 00:49.800]  I have been working at Percona for six months. It's an open source company. We create free, open source
[00:49.800 --> 00:59.800]  databases. I am a Google Women Techmaker. I was nominated as a Docker Captain last year
[00:59.800 --> 01:06.280]  and I am a database and container enthusiast. You can follow me on Twitter and LinkedIn.
[01:06.280 --> 01:18.360]  I usually post about containers, Kubernetes, and open source. For the agenda today, we are going
[01:18.360 --> 01:28.120]  to look at containers. We will see the Docker architecture. We will see the workflow between
[01:28.120 --> 01:35.240]  the components of Docker. We are going to have two examples: how to run a single
[01:35.240 --> 01:41.160]  Percona Server for MySQL container, and how to run multiple containers for Percona
[01:41.160 --> 01:48.960]  Server for MySQL. We will see Docker volumes and why they are important in this world of databases
[01:48.960 --> 01:58.480]  on containers. We will see backups and restores of databases, and best practices. Let's start.
[01:58.480 --> 02:10.520]  What's a container? How many of you know what Docker is? Yeah, a lot. Okay. That's
[02:10.520 --> 02:16.960]  nice. Docker, or do you use other tools? Yeah, there are different kinds of tools for containerizing
[02:16.960 --> 02:27.760]  applications. But a container is like a single, lightweight unit of software that packages
[02:27.760 --> 02:32.400]  everything that you need for your application. When we run or build an application,
[02:32.400 --> 02:37.120]  we know that we need a lot of packages. For example, if you are building
[02:37.120 --> 02:44.640]  a Java application, you need libraries, dependencies, many things to run your application. So everything
[02:44.640 --> 02:50.480]  has to be containerized in a single unit of software, and this is going to be isolated
[02:50.480 --> 02:56.440]  from other things, like your infrastructure. And the good thing is that your container
[02:56.440 --> 03:02.800]  can run on different platforms: on your laptop, on your server, in your cloud. With this,
[03:02.800 --> 03:08.920]  we put an end to the problem we have when we ask, hey, does your program run? Yes, it works,
[03:08.920 --> 03:17.040]  but just on my computer. No, it has to run on different platforms. We don't need to have
[03:17.040 --> 03:24.600]  these problems with dependencies and other kinds of things when we test our application on
[03:24.600 --> 03:32.680]  other platforms. There are different tools, as I said, for containerization. We have container
[03:32.680 --> 03:39.560]  interface, for example. We have containerd, and we have Docker, which is the tool that
[03:39.560 --> 03:48.800]  we are going to focus on now. All these tools are also in the Cloud Native Computing Foundation
[03:48.800 --> 03:54.520]  ecosystem. If you see the landscape, you will see a lot of tools there. There is a part
[03:54.520 --> 04:02.760]  for containerization and there are more than three. There are a lot of tools for that.
[04:02.760 --> 04:11.840]  The Docker architecture works as a client-server model. We have the Docker daemon, which is going
[04:11.840 --> 04:20.040]  to process all the commands. It is always listening for the client, and the
[04:20.040 --> 04:29.200]  client is going to send requests to the daemon through the REST API. With this model, the
[04:29.200 --> 04:36.600]  Docker daemon can also manage networks, containers, images, and Docker volumes. If we go into more
[04:36.600 --> 04:43.040]  detail, we will see that we have the client, the daemon, which is also called the Docker
[04:43.040 --> 04:48.480]  engine, and we have another component, the Docker registry. There is the public one,
[04:48.480 --> 04:55.840]  which is Docker Hub, where all the official images are published, and we can also have
[04:55.840 --> 05:02.360]  our own private registry in case we don't want to share images with the public. In this
[05:02.360 --> 05:09.680]  case, this is the flow between the components. For example, if we do a pull, we are going to
[05:09.680 --> 05:18.840]  try to bring the image from Docker Hub into the Docker daemon cache. If the Docker daemon
[05:18.840 --> 05:24.120]  doesn't find the image in cache, it's going to bring it from Docker Hub. But if it
[05:24.120 --> 05:28.600]  is in cache, it's going to take it from there and start processing. The same with docker
[05:28.600 --> 05:37.640]  build. When we run docker build from the client, the Docker daemon will take a Dockerfile.
[05:37.640 --> 05:45.720]  A Dockerfile is a recipe with a lot of instructions, where we put all the commands to run our application
[05:45.720 --> 05:51.400]  and deploy it. So the Docker daemon is going to take the Dockerfile and build
[05:51.400 --> 05:59.000]  the image, and if we want, we can also run it. When we run it, we create a container.
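The flow described above (look in the local cache first, fall back to the registry, then build and run) can be sketched in Python. This is a simplified model, not the daemon's real implementation: the image tag `percona/percona-server:8.0`, the tag `my-app:dev`, and the helper names are assumptions made for illustration, and the commands are only assembled and printed, not executed.

```python
import shlex

# Assumed image name and tag; the talk only says "the official Percona Server image".
IMAGE = "percona/percona-server:8.0"


def resolve_image(image, local_cache, registry):
    """Simplified model of the lookup: use the local cache, else fetch from the registry."""
    if image in local_cache:
        return "cache"
    if image not in registry:
        raise LookupError(f"{image} not found in registry")
    local_cache.add(image)  # after a pull, the image is available locally
    return "registry"


def build_cmd(tag, context_dir="."):
    """docker build reads the Dockerfile in context_dir and produces an image."""
    return ["docker", "build", "-t", tag, context_dir]


def run_cmd(image, name):
    """docker run creates and starts a container from an image."""
    return ["docker", "run", "-d", "--name", name, image]


cache = set()
hub = {IMAGE}
first = resolve_image(IMAGE, cache, hub)   # not cached yet, so it comes from the registry
second = resolve_image(IMAGE, cache, hub)  # the second lookup is served from the cache
print(first, second)
print(shlex.join(build_cmd("my-app:dev")))          # hypothetical tag
print(shlex.join(run_cmd("my-app:dev", "my-app")))  # hypothetical container name
```

To actually execute one of these commands, the argument list could be passed to `subprocess.run` on a machine where Docker is installed.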
[05:59.000 --> 06:06.480]  The container is our application, already alive and ready to accept connections
[06:06.480 --> 06:15.840]  or requests. One more thing here is that we can have everything on our host, or we can
[06:15.840 --> 06:31.600]  have remote clients that make requests to the Docker daemon. Container benefits.
[06:31.600 --> 06:37.080]  There are pros and cons, but now I'm going to focus on the benefits that containers
[06:37.080 --> 06:44.920]  give us. One of these is that we can reduce costs, because we can run several containers
[06:44.920 --> 06:51.240]  on the same infrastructure that we already have, because the technology of
[06:51.240 --> 06:58.560]  containers is different from virtualization. In virtualization, we use a hypervisor, and
[06:58.560 --> 07:06.440]  when you create virtual machines, they consume more resources from your infrastructure.
[07:06.440 --> 07:11.160]  But when you use containers, it's very different. Container technology
[07:11.160 --> 07:16.680]  makes it possible to run a lot of containers on a single machine. So
[07:16.680 --> 07:23.040]  for that reason, it's possible to reduce costs. Also, containers are very friendly with
[07:23.040 --> 07:28.760]  continuous integration and continuous delivery processes. If you have a big,
[07:28.760 --> 07:35.680]  monolithic application and you want to run it in a container and
[07:35.680 --> 07:42.080]  integrate it into the DevOps process, this is going to be hard. We have to work with microservices,
[07:42.080 --> 07:50.320]  making each service a container and including it in the continuous integration and continuous
[07:50.320 --> 07:56.440]  delivery process. Then it's easy. When we build our application on a container,
[07:56.440 --> 08:01.120]  it's easy to kill it. It's easy to create it again. It's easy to fail, and the process
[08:01.120 --> 08:11.480]  is faster. Another benefit is multi-cloud compatibility. Over time, many companies
[08:11.480 --> 08:17.760]  try to migrate to a hybrid cloud. They don't want to have everything on premises.
[08:17.760 --> 08:27.640]  They also want to scale. They want to grow. So for that reason, they opt for the cloud, and containers
[08:27.640 --> 08:40.520]  fit very well there. You can install Docker. I know you did it. You can choose your distro.
[08:40.520 --> 08:46.120]  Whether you use Debian, CentOS, or anything else, you can go to the official Docker documentation
[08:46.120 --> 08:52.680]  and easily find all the steps. When you install it, it will install the Docker client,
[08:52.680 --> 09:01.360]  the Docker daemon, and other tools that you will need to use Docker on your local machine.
[09:01.360 --> 09:08.720]  We already talked about containers, right? But this talk is about exploring databases on containers.
[09:08.720 --> 09:14.400]  We are going to talk about MySQL, which is a relational database. We know
[09:14.400 --> 09:22.560]  that it's a database. And to run MySQL on containers, we need to understand how volumes
[09:22.560 --> 09:28.680]  work, because the most important thing when running databases on containers is the data. If we
[09:28.680 --> 09:38.200]  lose the data, we lose everything. For the next slides, we are going to focus on this
[09:38.200 --> 09:47.760]  part. We will use the image of Percona Server for MySQL. Percona Server for MySQL
[09:47.760 --> 09:55.800]  is open source. It's like MySQL, but with more nice things. You can use it. It's open
[09:55.800 --> 10:02.920]  source. It's on Docker Hub. So we will use this image and we will create a Docker container.
[10:02.920 --> 10:08.240]  We will see how it works without volumes. We will see the layers in Docker, and then
[10:08.240 --> 10:16.120]  we will create a persistent volume and see how that changes the layers in Docker.
[10:16.120 --> 10:25.720]  Just a note here: if you want to have an image, you need a Dockerfile.
[10:25.720 --> 10:33.360]  You may have written a Dockerfile yourself before. That's good. A Dockerfile is a recipe where
[10:33.360 --> 10:39.920]  you put everything for your application. So you need this to create an image. Then
[10:39.920 --> 10:47.040]  you need an image to create your Docker container. These are the three essential steps to remember
[10:47.040 --> 11:00.840]  how Docker works. We will run a single Percona Server for MySQL container. We will use docker
[11:00.840 --> 11:10.680]  run to create the image. No. We don't use docker run to create the image. We use docker
[11:10.680 --> 11:19.520]  run to create a container. So we use this to create a container. We will pass
[11:19.520 --> 11:28.960]  -d to say, run this container in the background; I don't want it to take over the terminal. And I will
[11:28.960 --> 11:35.880]  call it Percona Server one. I will pass the root password as an environment variable.
[11:35.880 --> 11:41.880]  This is not a good practice. This is just to show how we are going
[11:41.880 --> 11:46.880]  to create a container. And we will use the official Percona Server for MySQL image. With this
[11:46.880 --> 11:53.800]  I am creating a container, right? I'm creating a container with this one. Okay? So if we
[11:53.800 --> 12:03.000]  go to docker image ls, we see that this pulled the image of Percona Server and then
[12:03.000 --> 12:09.160]  created the container. That command is going to do two things. It's going to bring the
[12:09.160 --> 12:15.600]  image from the official Docker Hub and it's going to create a container. So if we look at
[12:15.600 --> 12:27.400]  docker container ps, our container is up. Okay. After we have the database, we need
[12:27.400 --> 12:33.120]  to add data. We will add databases, we will add data, we will change records, we will
[12:33.120 --> 12:43.080]  have transactions, many things that we can do as with a regular database. Okay. If we run
[12:43.080 --> 12:50.520]  a single Percona Server for MySQL container, we can see how it works in layers. If we look at
[12:50.520 --> 12:57.600]  the ones in green, these are the layers from the Percona Server image. This is the image that
[12:57.600 --> 13:04.040]  we pulled, which we can't change. It is read-only. We can't change it, but on top
[13:04.040 --> 13:10.040]  of that, a new layer is going to be created. This layer is read-write.
[13:10.040 --> 13:16.320]  I can add data. This layer is the one that will contain all the things that I am
[13:16.320 --> 13:23.720]  doing in Docker on that container. I add a new database. Yes. I create a new
[13:23.720 --> 13:29.120]  record. I delete it. I add transactions. All of this is going to be saved here. But what
[13:29.120 --> 13:35.720]  happens if I don't have a volume? My container is ephemeral, right? It could die. It could
[13:35.720 --> 13:41.480]  crash. My machine could crash. And all my data is going to be lost. I will
[13:41.480 --> 13:48.880]  lose all the data. Now we will see how it works with multiple containers. To run multiple
[13:48.880 --> 13:53.960]  containers with the same image, and as you can see this is the same image, the same version of
[13:53.960 --> 13:58.600]  the image, we just change the name of the container. We also have to change another
[13:58.600 --> 14:09.320]  thing, because this is a database, right? What can we change? It runs on a port, right?
[14:09.320 --> 14:17.600]  On which port does MySQL usually run? Yeah. So I need to change the port for the other container
[14:17.600 --> 14:29.760]  to avoid a conflict. Okay. How does it work in layers? The same. We will use the same layer.
[14:29.760 --> 14:35.280]  We will use the same layer for Percona Server, which we can't modify. But on top
[14:35.280 --> 14:40.160]  of that, we are going to have two more layers: one for the first container that I created
[14:40.160 --> 14:45.880]  and a second for the other one. I can add data. I can change things. But once
[14:45.880 --> 14:51.040]  again, if I don't have a volume, this is going to die. Still, this is how it can work if we want
[14:51.040 --> 15:02.280]  to create an application where it doesn't matter whether we save the state of the application.
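As a rough sketch of the commands discussed here, the Python below assembles one `docker run` invocation per container from the same image, changing only the container name and the published host port. The image tag, the container names `percona-server-1` and `percona-server-2`, host port 3307, and the root password value are assumptions for illustration; `MYSQL_ROOT_PASSWORD` is the variable the MySQL-family images read, and passing it on the command line is, as the talk notes, not a good practice. The commands are printed rather than executed.

```python
import shlex

IMAGE = "percona/percona-server:8.0"  # assumed image name and tag
MYSQL_PORT = 3306                     # MySQL's default port inside each container


def run_cmd(name, host_port, root_password="root"):
    """Build a docker run command that publishes MySQL on the given host port."""
    return [
        "docker", "run", "-d",                         # -d: run in the background
        "--name", name,
        "-e", f"MYSQL_ROOT_PASSWORD={root_password}",  # demo only, not a good practice
        "-p", f"{host_port}:{MYSQL_PORT}",             # host port : container port
        IMAGE,
    ]


# Hypothetical container names mapped to distinct host ports.
containers = {"percona-server-1": 3306, "percona-server-2": 3307}
if len(set(containers.values())) != len(containers):
    raise ValueError("each container needs its own host port")

commands = [run_cmd(name, port) for name, port in containers.items()]
for cmd in commands:
    print(shlex.join(cmd))
```

Both containers share the read-only image layers and each gets its own writable layer, so anything written inside either one is still lost when that container is removed.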
[15:02.280 --> 15:13.760]  This is important. Persisting data is really important for this kind of application,
[15:13.760 --> 15:20.840]  because sometimes we think of containers, like Kubernetes, as created for stateless applications,
[15:20.840 --> 15:26.480]  but now we have options to run stateful applications on containers. And this is one of the reasons.
[15:26.480 --> 15:33.240]  Creating volumes. It's pretty easy to create a volume. We can create a volume just with
[15:33.240 --> 15:41.920]  -v or --volume. We can create a local volume with docker run
[15:41.920 --> 15:49.240]  in detached mode. We will call it Percona Server. The same process. And when we say -v,
[15:49.240 --> 15:56.320]  we are saying, okay, this will be my volume on the host, in my local data directory. And
[15:56.320 --> 16:11.640]  this one is going to be inside my container. So this is like a mirror from this image.
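A minimal sketch of the `-v` usage described above, in Python. The host directory `/srv/percona-data`, the container name, and the image tag are hypothetical; `/var/lib/mysql` is MySQL's data directory inside the container. The check on the host path reflects that a `-v` source which is not an absolute path is treated by Docker as a named volume rather than a host directory.

```python
import shlex

IMAGE = "percona/percona-server:8.0"  # assumed image name and tag
MYSQL_DATA_DIR = "/var/lib/mysql"     # where MySQL keeps its data inside the container


def run_with_volume_cmd(name, host_dir, root_password="root"):
    """Build a docker run command that mounts host_dir over the MySQL data directory."""
    if not host_dir.startswith("/"):
        # A relative source would be interpreted as a named volume, not a host path.
        raise ValueError("a bind mount needs an absolute host path")
    return [
        "docker", "run", "-d",
        "--name", name,
        "-e", f"MYSQL_ROOT_PASSWORD={root_password}",  # demo only
        "-v", f"{host_dir}:{MYSQL_DATA_DIR}",          # host path : container path
        IMAGE,
    ]


cmd = run_with_volume_cmd("percona-server", "/srv/percona-data")  # hypothetical host path
print(shlex.join(cmd))
```

With this mount in place, data written by MySQL lands in the host directory and survives removal of the container.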
[16:11.640 --> 16:17.880]  And how does it work in layers? We have the same layer that we can't modify. And on top of
[16:17.880 --> 16:23.280]  that, we are going to create another layer. But in this case, we are creating
[16:23.280 --> 16:33.760]  the mounted volume at /var/lib/mysql. There are other directories for which we can create volumes.
[16:33.760 --> 16:40.360]  I am just adding this one as an example, because in MySQL we have configuration files. We
[16:40.360 --> 16:46.400]  have logs. We have other things. We may want to create volumes for
[16:46.400 --> 16:51.800]  all of those things. I am just adding /var/lib/mysql as an example, which is a directory
[16:51.800 --> 16:57.840]  that is very important. And this local directory is the one that could be on my host. But that
[16:57.840 --> 17:02.720]  is not recommended, because if your host crashes, everything crashes with it, including your volumes.
[17:02.720 --> 17:21.680]  It is preferable to keep it on a remote host. Okay. On to backups. Who here makes backups?
[17:21.680 --> 17:29.760]  Okay. I use a very easy way to make backups. For logical backups, I use mysqldump
[17:29.760 --> 17:36.520]  in the container. And for physical backups, in the company we use Percona XtraBackup, which
[17:36.520 --> 17:44.640]  has more features for physical backups. And for restore, I will also use the
[17:44.640 --> 17:54.040]  mysqldump file. We don't use Percona XtraBackup in this case, because it involves a lot of things.
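The logical backup and restore covered in this part of the talk can be sketched as two commands: `mysqldump` run inside the source container with its output redirected to a file, and the `mysql` client run inside the target container with that file on standard input. In the Python below, the container names `percona-server-backup` and `percona-server-restore`, the credentials, and the file name `dump.sql` are assumptions. The sketch omits `-t` on `docker exec` (the talk uses `-it`), since a TTY can alter redirected output.

```python
import shlex


def backup_cmd(container, user="root", password="root"):
    """mysqldump inside a running container; the caller redirects stdout to a file."""
    return [
        "docker", "exec", container,
        "mysqldump", f"--user={user}", f"--password={password}",  # demo credentials
        "--all-databases",
    ]


def restore_cmd(container, user="root", password="root"):
    """mysql client inside the target container; -i keeps stdin open for the dump."""
    return [
        "docker", "exec", "-i", container,
        "mysql", f"--user={user}", f"--password={password}",
    ]


# Shell equivalents, with the redirections the Python caller would otherwise set up.
print(shlex.join(backup_cmd("percona-server-backup")), "> dump.sql")
print(shlex.join(restore_cmd("percona-server-restore")), "< dump.sql")
```

When driving this from Python instead of a shell, `subprocess.run` accepts open file objects for `stdout` and `stdin`, which replaces the `>` and `<` redirections.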
[17:54.040 --> 18:03.400]  For the backup, I will execute it in a container that is already running. Percona Server backup
[18:03.400 --> 18:10.280]  is already running. Let's say that we created it. And we are executing docker exec -it to enter
[18:10.280 --> 18:19.920]  that container and type that command, mysqldump, to create a backup
[18:19.920 --> 18:27.520]  of the database. So the backup is going to be in that file, dump.sql. And it is the same process
[18:27.520 --> 18:35.240]  with restore: we can take that backup. And this is a different container. I'm going to
[18:35.240 --> 18:42.480]  restore the .sql file in a different container, in this case Percona Server restore, using
[18:42.480 --> 18:52.040]  that command, mysql. Okay. Best practices, or some recommendations for using containers
[18:52.040 --> 19:01.120]  with databases. Okay. One of these is that we should constantly monitor our database
[19:01.120 --> 19:05.840]  and the whole system, because we don't know when we are going to run out of resources
[19:05.840 --> 19:11.960]  for our containers. We should be aware of that, or have notifications that say, hey, you
[19:11.960 --> 19:17.280]  don't have enough disk, you don't have enough memory, so provision or try to scale your
[19:17.280 --> 19:22.880]  resources. So we should keep monitoring, using some tool for that; one example is PMM.
[19:22.880 --> 19:30.960]  We can use open source monitoring tools to monitor our databases on containers. And we can store
[19:30.960 --> 19:36.000]  the data in a persistent volume outside the container. It is recommended not to keep it inside the container,
[19:36.000 --> 19:44.720]  because then it's easy to create recovery plans. We can also restore the data easily and fast.
[19:44.720 --> 19:52.640]  We should limit the resource utilization of our containers. Our containers, we know
[19:52.640 --> 20:01.280]  that they are small, but we should still set limits when there are a lot of them. And we should regularly
[20:01.280 --> 20:09.800]  take backups of the database and store these backups in a different location. And having
[20:09.800 --> 20:16.400]  a plan for migration and disaster recovery is really great. In that case, having a monitoring
[20:16.400 --> 20:28.160]  tool helps a lot. And what more? That's all. You can find me on LinkedIn and Twitter.
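One of the recommendations above, limiting the resources a container may use, maps onto the `--memory` and `--cpus` options of `docker run`. The Python below is a small sketch that assembles such a command; the limit values, the container name, and the image tag are assumptions chosen for illustration, not tuning advice from the talk.

```python
import shlex

IMAGE = "percona/percona-server:8.0"  # assumed image name and tag


def limited_run_cmd(name, memory="1g", cpus="1.5", root_password="root"):
    """Build a docker run command with a memory cap and a CPU cap."""
    return [
        "docker", "run", "-d",
        "--name", name,
        "--memory", memory,  # hard memory limit for the container
        "--cpus", cpus,      # how many CPUs' worth of time the container may use
        "-e", f"MYSQL_ROOT_PASSWORD={root_password}",  # demo only
        IMAGE,
    ]


cmd = limited_run_cmd("percona-server-limited")  # hypothetical container name
print(shlex.join(cmd))
```

A monitoring tool can then show how close each container runs to these limits, which ties this recommendation back to the first one.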
[20:28.160 --> 20:43.520]  Okay, we have time for questions. If you absolutely need to leave and you can't wait until the
[20:43.520 --> 20:58.800]  talk is over, please do so as quietly as possible so we can understand the questions. Thanks.
[20:58.800 --> 21:02.960]  Hi. Thank you so much for your talk. It was really interesting. I'm wondering what kind
[21:02.960 --> 21:08.320]  of limitations you see when you're speaking about having databases running in containers.
[21:08.320 --> 21:16.080]  Are there storage limitations, CPU, or something else? Guys, can you please be a little quiet
[21:16.080 --> 21:21.840]  so we can understand the question? All right, I will try it with the microphone.
[21:21.840 --> 21:27.120]  Yeah. Now people can hear you. Thank you. I was wondering maybe, first of all, really
[21:27.120 --> 21:31.760]  cool talk. Thank you so much. My question would be, could you maybe talk us through some of the
[21:31.760 --> 21:39.440]  limitations that you see when you're running databases in containers? You didn't
[21:39.440 --> 21:46.880]  understand it? Thank you so much for the talk. It was really cool. Maybe you can share with
[21:46.880 --> 21:52.480]  us some of the limitations that you see when you go with the solution of running databases
[21:52.480 --> 21:57.120]  inside containers, right? You cannot really run a very big database. You probably will have
[21:57.120 --> 22:03.200]  a problem with that. What kind of limitations do you see?
[22:03.200 --> 22:10.720]  So, yeah, the question is about, sorry, the question is about what limitations you can
[22:10.720 --> 22:17.840]  run into with database containers? Yeah, I don't want to say this, but it really depends on the
[22:18.560 --> 22:24.000]  business. Okay, it depends on whether you want to invest a lot of money in infrastructure, because in the end,
[22:24.000 --> 22:30.320]  the volume that you have for your database is not going to be part of your container;
[22:30.320 --> 22:35.200]  it's going to be outside. And this depends on you. If you want to invest a lot of money
[22:35.200 --> 22:42.240]  to save that data, that's good. You want to replicate it? Please try and be quiet while we
[22:42.240 --> 22:56.320]  are asking questions. Are there any more questions?
[22:56.320 --> 23:13.680]  There is one more question from the back, so please be quiet.
[23:13.680 --> 23:22.800]  Thank you. Hello. I wanted to ask, did you notice any kind of performance issues?
[23:22.800 --> 23:29.840]  Did you benchmark things? Did you identify some kind of overheads going on when you
[23:29.840 --> 23:57.200]  containerize a database like MySQL or other kinds of databases, really? Sorry, I didn't get your
[23:57.200 --> 24:03.360]  question. All right, I'm just going to ask you. When you containerize a database,
[24:03.360 --> 24:09.760]  be it MySQL or Postgres or any kind of open source database that you may have tested on this kind
[24:09.760 --> 24:18.560]  of setup, did you notice any kind of overheads, compute, memory, or disk, essentially, where
[24:18.560 --> 24:24.960]  you can see that the database performance or operation is significantly affected by the fact
[24:24.960 --> 24:33.920]  of being containerized? I'm not sure about that, but if you use open source tools to monitor
[24:33.920 --> 24:40.800]  your databases on containers, you can have a visualization of these things. If you don't have
[24:40.800 --> 24:47.120]  enough resources, it can show you alerts or things like that, so you can figure out where
[24:47.120 --> 24:55.520]  exactly your limitation is. Okay, so for example, did you run benchmarks?
[25:02.400 --> 25:06.960]  Could you help me? Okay, could you help me to answer? Okay, my friend is going to help me to
[25:06.960 --> 25:14.880]  answer this. All right, thank you. Yeah, thank you to you. Hey, so usually the performance
[25:14.880 --> 25:22.240]  degradation is around two, three, four percent. The issue is more about how you configure the
[25:22.240 --> 25:29.280]  database and the kind of storage, whether it's local or network storage, but the virtualization overhead is
[25:29.280 --> 25:38.080]  minimal. It's like running on an EC2 instance. Okay, so there is an impact, measurable, at least; you
[25:38.080 --> 25:45.280]  say around four or five percent, but you say that there are configurations
[25:45.280 --> 25:49.840]  we can do to try to avoid that. Do you have any kind of paper or any kind of resources that we might
[25:49.840 --> 26:02.080]  use to avoid those kinds of bottlenecks? If I understood correctly, not much. The measure that we use in
[26:02.080 --> 26:09.680]  databases is TPS. So you will notice, if we're running benchmarks with sysbench,
[26:09.680 --> 26:17.120]  for example, three percent: if you are running 1,000 queries per second, you will get
[26:17.760 --> 26:27.280]  980, 990 queries per second when containerized. Okay, and do you have any kind of recommendations,
[26:27.280 --> 26:32.000]  kind of generic recommendations, so that when you run a database in a container,
[26:32.000 --> 26:36.480]  here is what you can do to try and negate some of the performance bottlenecks that you guys have
[26:36.480 --> 26:48.240]  noticed? To be honest, in day-to-day activities, I would say 99 percent of the performance will come
[26:48.240 --> 26:57.520]  from how you configure MySQL, not from containerization, which is just a small piece of the game.
[26:59.760 --> 27:03.680]  You can have more effect by modifying the database configuration.
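To put the overhead figures from this answer side by side, the arithmetic can be sketched as below. The function name is made up for illustration; the baseline of 1,000 operations per second and the percentages come from the discussion. Note that the quoted 980 to 990 per second corresponds to an overhead of about one to two percent, while three percent would give 970.

```python
def containerized_throughput(baseline_per_second, overhead_percent):
    """Throughput left after subtracting a percentage overhead from the baseline."""
    return baseline_per_second * (100 - overhead_percent) / 100


# Overheads mentioned in the discussion range from about one to four percent.
for pct in (1, 2, 3, 4):
    print(f"{pct}% overhead: {containerized_throughput(1000, pct)} per second")
```

Either way the loss is small compared with what database configuration and storage choices can change, which is the point the answer makes.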
[27:04.480 --> 27:05.760]  All right, thank you very much.
[27:05.760 --> 27:19.040]  Thanks to you.