[00:00.000 --> 00:13.000] Okay. Our next talk is going to start right now. Mark's already on stage. He's going to [00:13.000 --> 00:18.080] talk about automating secret rotation in Kubernetes, and please quiet down so we can [00:18.080 --> 00:41.080] understand him. Okay. Hello. Can you hear me? All right. So thank you for joining here today. My name is Mark. I'm an engineering [00:41.080 --> 00:49.080] tech lead at Cisco. For the last couple of years, or maybe the better part of a decade, my primary job was helping [00:49.080 --> 00:57.080] engineering teams run their business applications on Kubernetes and helping them succeed without having to get into too [00:57.080 --> 01:08.080] much detail about Kubernetes. Let me start with a story. I'm pretty sure this will sound familiar to a lot of us here. [01:08.080 --> 01:18.080] A couple of years ago, I was in the middle of a debugging session. It was already the middle of the night. Everyone was tired. And finally, we [01:18.080 --> 01:29.080] found the problem. I committed the change, pushed the code, and then suddenly all the alarms went off. We received an e-mail from [01:29.080 --> 01:39.080] AWS that a pair of credentials was committed in a public repository. Who has done something like that before? Come on. I'm pretty [01:39.080 --> 01:49.080] sure it's more than that. There's no shame in that. Everyone has to go through that once. So we obviously had to revoke the [01:49.080 --> 01:57.080] credentials, generate a new pair, and deploy it to production. And we were able to do that because we had a good [01:57.080 --> 02:05.080] secret management pipeline in place.
And this kind of hints at why rotating secrets or being able to rotate secrets is [02:05.080 --> 02:12.080] important, because if you have an incident like this, you have to be able to act quickly and rotate those secrets, [02:12.080 --> 02:21.080] because, well, in a worst-case scenario, people may steal your data. In a better scenario, on AWS, someone might start [02:21.080 --> 02:30.080] mining Bitcoin. But you have to be able to react quickly. Another reason why this is a very important topic is we often have to [02:30.080 --> 02:40.080] meet certain compliance requirements that require us to rotate every secret we have, like, every 90 days. I'm pretty sure many of [02:40.080 --> 02:49.080] us have to deal with that. But the worst of all, the worst situation of all is when you don't even know that a secret has [02:49.080 --> 02:57.080] been leaked. Or maybe an angry ex-employee took something home. And you don't even know that happened. And they are [02:57.080 --> 03:05.080] stealing your data. They are stealing your customers' data. Or they are mining Bitcoin in a better situation. [03:05.080 --> 03:12.080] All right. So probably nobody disputes that secret rotation is important. But unfortunately, it comes with its own [03:12.080 --> 03:21.080] set of challenges, which often turn people away from actually caring about this. And obviously, secret rotation or managing [03:21.080 --> 03:27.080] secrets or configuration is a very complex problem, especially in a Kubernetes environment where you may have multiple [03:27.080 --> 03:33.080] different clusters, multiple different namespaces where you have to deploy these secrets, many different secrets and [03:33.080 --> 03:44.080] integrations, which means it takes a lot of time to do it right. And it's still an error-prone process. And in an ideal [03:44.080 --> 03:53.080] scenario, if you screw something up, it may not result in an actual outage or incident.
But it may, which obviously [03:53.080 --> 04:01.080] would affect the business, which is what we wanted to avoid in the first place by rotating these secrets. So all [04:01.080 --> 04:08.080] right. So I'm going to talk about some of the key challenges. The first important point is that secret rotation [04:08.080 --> 04:16.080] should be possible. I mean, it's probably always possible. But I've seen situations where rotating certain secrets would [04:16.080 --> 04:23.080] have been very, very hard. Like it would have taken hours, which is a problem. So it should be possible. And [04:23.080 --> 04:30.080] you should be able to do it relatively quickly. Secret rotation should also be automated as much as [04:30.080 --> 04:40.080] possible. We humans are not really trustworthy, we make mistakes, exhibit A. So it should be automated as much as [04:40.080 --> 04:48.080] possible. And humans should interact with secrets and secret rotation as little as possible. And finally, secret [04:48.080 --> 04:55.080] rotation should happen periodically. Like you shouldn't have a secret that you use for years, because as I mentioned, you [04:55.080 --> 05:04.080] don't know if it's been leaked. And if you don't know if it's been leaked, how do you know if your system is secure or not? [05:04.080 --> 05:13.080] So what does secret rotation look like in general? We are not even talking about Kubernetes here. First, you need to [05:13.080 --> 05:20.080] have a secret store. If you don't have a secret store, then the whole thing is a lot more complex than it should be. You [05:20.080 --> 05:26.080] have a secret store where you store your secrets, and then you have some solution to deploy those secrets to your production [05:26.080 --> 05:33.080] environment or production environments.
Now, when you need to change a secret, depending on what type of secret that is, [05:33.080 --> 05:41.080] you have to go to the secret provider, which may be a third-party provider like AWS or GitHub or anything like that. You [05:41.080 --> 05:48.080] have to issue a new pair of credentials or generate a new secret, change that in the secret store, and then you need some [05:48.080 --> 05:58.080] sort of mechanism to deploy the new secret. That probably should be an automatic process that notices the secret [05:58.080 --> 06:04.080] change, and it should deploy the secrets for you in your production environment. Now, in some cases, if you have a secret [06:04.080 --> 06:09.080] store that supports that, for example, HashiCorp Vault, your secret store may be able to [06:09.080 --> 06:18.080] automatically issue credentials for you, for example, for AWS, your database, or whatever else Vault supports, so you [06:18.080 --> 06:26.080] don't even need to do that manually. Vault takes care of that, and that's like the best-case scenario. Now, what does [06:26.080 --> 06:33.080] this look like in Kubernetes? First of all, you have to decide whether you want to use Kubernetes secrets at all or not. There [06:33.080 --> 06:41.080] are options where you don't have to use Kubernetes secrets, but that's probably the easiest way to manage secrets in [06:41.080 --> 06:48.080] Kubernetes, and the reason why people generally don't like using Kubernetes secrets is because they have this notion that [06:48.080 --> 06:56.080] Kubernetes secrets are not secure because they are base64-encoded, and that's not secure. So that's an entirely [06:56.080 --> 07:05.080] different conversation. The bottom line is if you have envelope encryption enabled, which is disabled by default, then you're [07:05.080 --> 07:13.080] probably safe using Kubernetes secrets.
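The base64 concern mentioned above is easy to see in a few lines. The following is a minimal sketch in Python, assuming a made-up secret value (`hello`); it only illustrates that base64 is an encoding that anyone can reverse without a key, which is why the speaker points to encryption rather than the encoding as the thing that matters.

```python
import base64

# Hypothetical value, encoded the way it would appear under `data:` in a
# Kubernetes Secret manifest. The plaintext is made up for illustration.
encoded = base64.b64encode(b"hello").decode("ascii")


def decode_secret_value(value: str) -> str:
    """Decode a base64-encoded Secret value. No key or password is involved."""
    return base64.b64decode(value).decode("utf-8")


print(encoded)                       # the encoded form stored in the manifest
print(decode_secret_value(encoded))  # the original plaintext, recovered trivially
```

Anyone who can read the Secret object can run the equivalent of `decode_secret_value`, so access control and encryption of the stored data carry the security, not the encoding.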
Now, if you decide to use Kubernetes secrets, then you need something that [07:13.080 --> 07:20.080] deploys the secrets from your secret store to Kubernetes, and this could be, for example, the External Secrets [07:20.080 --> 07:27.080] Operator. There are other solutions, but this is probably the one that the community has been organizing around a lot lately. So [07:27.080 --> 07:36.080] External Secrets Operator is able to synchronize your secrets from an external store, external to Kubernetes in this [07:36.080 --> 07:43.080] case. For example, HashiCorp Vault or AWS Secrets Manager or whatever else you have, External Secrets Operator is able to [07:43.080 --> 07:51.080] synchronize secrets to Kubernetes secrets, and it's also able to pick up changes. It doesn't actively monitor changes, but [07:51.080 --> 07:58.080] periodically it takes a look at the secrets, and if something changes, then it synchronizes the changes to Kubernetes. So [07:58.080 --> 08:05.080] we have that part covered, and then you can use the Kubernetes secrets, either as environment variables or mount them as [08:05.080 --> 08:17.080] files, however you want to use them. Now, the secrets change. What then? So if you mount secrets as files, and your [08:17.080 --> 08:26.080] application is able to pick up that change, then you don't have anything to do. Your application will already reload the [08:26.080 --> 08:33.080] configuration, and you have the whole thing covered. Now, if your application can't do that, or if your application [08:33.080 --> 08:41.080] uses environment variables, if you expose secrets as environment variables, then that's a more difficult problem, and for years we [08:41.080 --> 08:48.080] didn't really have a solution for that other than manual restarts.
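The "application picks up the change itself" case for file-mounted secrets can be sketched as follows. This is a minimal illustration in Python, not something shown in the talk: the class name `FileSecretWatcher`, the callback, and the example path in the note below are all assumptions, and the polling approach simply re-reads the file and compares its content with the last value seen.

```python
from pathlib import Path
from typing import Callable, Optional


class FileSecretWatcher:
    """Re-reads a secret file on demand and reports when its content changed.

    Hypothetical helper for illustration; a real application would call
    poll() from its own loop or timer.
    """

    def __init__(self, path: str, on_change: Callable[[str], None]) -> None:
        self._path = Path(path)
        self._on_change = on_change
        self._last: Optional[str] = None

    def poll(self) -> bool:
        """Return True (and invoke the callback) if the value is new."""
        current = self._path.read_text().strip()
        if current == self._last:
            return False
        self._last = current
        self._on_change(current)
        return True
```

An application would construct the watcher with the path where the secret is mounted (for example a hypothetical `/etc/secrets/hello`) and call `poll()` every few seconds. The first call always reports a change because it performs the initial load. Environment variables offer no equivalent hook, since they are fixed when the process starts, which is the harder case the speaker turns to next.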
A couple of years ago, this component called Reloader [08:48.080 --> 08:58.080] appeared on the market, which basically watches workloads that reference secrets, and it also watches the [08:58.080 --> 09:06.080] secrets, obviously, and when it detects a change, it triggers a standard workload rollout, similar to how you would do [09:06.080 --> 09:12.080] that with kubectl rollout, for example. So it may change the annotation of the workload, and that would result in the [09:12.080 --> 09:19.080] workload being rolled out, which means that it would run with the new environment variables, and it would remount the [09:19.080 --> 09:28.080] secret with the changed file. And if we take a look at the whole process from the previous diagram, we don't have one component that [09:28.080 --> 09:35.080] takes care of the deployment, in this case, but we have two, one that synchronizes the secrets from the secret store to [09:35.080 --> 09:43.080] Kubernetes, and the other one that takes care of the rollouts, or making sure that the workloads notice the secret [09:43.080 --> 09:53.080] change. Well, let's take a look at a very quick demo of how that looks in action, and I have a repository [09:53.080 --> 10:00.080] prepared, you can go ahead and try it if you want to, and I have a Kubernetes cluster running here with both External [10:00.080 --> 10:10.080] Secrets and Reloader installed, and in addition to that, we have like a simple echo server, which just, I believe it's, [10:10.080 --> 10:21.080] yeah, it just outputs something. So let's take a look at how we configure External Secrets first. So as I mentioned, [10:21.080 --> 10:27.080] you configure External Secrets, or maybe I don't need to mention, I don't know, but you configure External Secrets [10:27.080 --> 10:36.080] via custom resources, which means you create, can you see it from the back? Okay, cool.
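The rollout trigger described above (notice that a secret changed, then change an annotation on the workload so that Kubernetes rolls it out) can be sketched roughly like this. It is a simplified illustration in Python and not Reloader's actual implementation: the annotation key `example.com/secret-hash`, the function names, and the hashing scheme are all assumptions made for the example.

```python
import hashlib
import json
from typing import Dict, Optional

# Hypothetical annotation key, chosen for this sketch only.
HASH_ANNOTATION = "example.com/secret-hash"


def secret_digest(data: Dict[str, str]) -> str:
    """Stable SHA-256 digest of the secret's key/value pairs."""
    canonical = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def rollout_patch(
    secret_data: Dict[str, str],
    pod_template_annotations: Dict[str, str],
) -> Optional[dict]:
    """Return a patch for the workload if the secret changed, else None.

    Changing an annotation on the pod template alters the template itself,
    which is what makes a Deployment start a new rollout.
    """
    digest = secret_digest(secret_data)
    if pod_template_annotations.get(HASH_ANNOTATION) == digest:
        return None  # workload already runs with this secret content
    return {
        "spec": {
            "template": {
                "metadata": {"annotations": {HASH_ANNOTATION: digest}}
            }
        }
    }
```

A controller built on this idea would call `rollout_patch` whenever it sees a secret event and apply the returned patch to the Deployment. Because new pods are created, both environment variables and mounted files pick up the new value.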
So you configure External [10:36.080 --> 10:45.080] Secrets via a custom resource called ExternalSecret, and you tell External Secrets [10:45.080 --> 10:52.080] from which external store it should synchronize secrets, and where it should put them. So in this case, we are [10:52.080 --> 11:00.080] telling External Secrets to synchronize secrets from a store I created and called fake. This is basically a static [11:00.080 --> 11:10.080] secret store in this case. It synchronizes secrets into a secret called foo bar, and it's going to synchronize from the [11:10.080 --> 11:21.080] fake secret store, from under the key foo/bar, to a key called hello in the Kubernetes secret. [11:21.080 --> 11:33.080] So let's take a look at whether we do, in fact, have that secret there. So we have a foo bar secret. That's good so far. [11:33.080 --> 11:49.080] And we have a hello key here. I'm not sure if you can see that. Now, if I change this secret right now, this, [11:49.080 --> 12:00.080] this is just a command that patches the external or the fake store to change the secret value. If I go back and check the [12:00.080 --> 12:11.080] secret value, it should be changed to everyone. Now, if I try to curl the service again, there are no changes here. So if I [12:11.080 --> 12:22.080] manually restart the pod, let's see, do I have the command here? Yeah, I have a rollout command. If I manually restart the pod [12:22.080 --> 12:35.080] and restart the port forward as well, then I should see that the secret value is in fact changed. Maybe I haven't shown [12:35.080 --> 12:50.080] you, but I do have the application deployment here that references the foo bar secret. All right. So now we have the secret [12:50.080 --> 13:00.080] synchronization part covered. Now, let's see how it works if I want the workload to be automatically rolled out when the [13:00.080 --> 13:11.080] secret changes.
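For readers without the demo repository at hand, the ExternalSecret resource walked through above might look roughly like the manifest built below. This is a sketch reconstructed from the spoken description, not the speaker's actual file: the resource and target name (`foo-bar`), the store kind (`ClusterSecretStore`), the refresh interval, and the `external-secrets.io/v1beta1` API version are all assumptions. It is written in Python and emits JSON, which `kubectl apply -f -` accepts just like YAML.

```python
import json


def build_external_secret(
    name: str,
    store_name: str,
    remote_key: str,
    secret_key: str,
    refresh_interval: str = "10s",  # assumed polling interval
) -> dict:
    """Build an ExternalSecret manifest as a plain dictionary."""
    return {
        "apiVersion": "external-secrets.io/v1beta1",  # assumed API version
        "kind": "ExternalSecret",
        "metadata": {"name": name},
        "spec": {
            # How often the operator re-reads the external store.
            "refreshInterval": refresh_interval,
            # Which store to read from (store kind is an assumption).
            "secretStoreRef": {"name": store_name, "kind": "ClusterSecretStore"},
            # Name of the Kubernetes Secret that gets created or updated.
            "target": {"name": name},
            # Mapping from a key in the store to a key in the Secret.
            "data": [
                {"secretKey": secret_key, "remoteRef": {"key": remote_key}}
            ],
        },
    }


manifest = build_external_secret(
    name="foo-bar", store_name="fake", remote_key="foo/bar", secret_key="hello"
)
print(json.dumps(manifest, indent=2))
```

The `refreshInterval` field is what drives the periodic check the speaker mentioned: the operator does not get notified by the store, it re-reads it on that schedule.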
So I can annotate the echo server with this Reloader annotation, which will make Reloader start [13:11.080 --> 13:23.080] watching this workload and the secrets mounted in it. So nothing changed yet. I should still see everyone. That's fine. And now [13:23.080 --> 13:39.080] let's change the secret again to FOSDEM. So if I, yeah, the secret is changed to FOSDEM. And if we take a look at the, [13:39.080 --> 13:49.080] I probably have to restart this. If we take a look at the service, it should now say hello FOSDEM. So in this case, I [13:49.080 --> 13:55.080] didn't have to restart the workload manually because Reloader did that for me when I changed the secret. When I changed the [13:55.080 --> 14:01.080] secret in the store, External Secrets synchronized that into the Kubernetes secret and Reloader noticed that [14:01.080 --> 14:12.080] change, so it rolled out the deployment. So that's what I wanted to show you today. If you have any questions, I'm happy to [14:12.080 --> 14:22.080] answer them. [14:22.080 --> 14:43.080] Hi. Thanks for your presentation. Can we use Reloader? Can you speak up, please, because I can't hear you. Please stay [14:43.080 --> 14:54.080] quiet. Thank you. Can we use Reloader without Kubernetes secrets? Because we're one of, can we use Reloader without [14:54.080 --> 15:03.080] syncing to Kubernetes secrets? I mean, you absolutely can. So with Reloader, you can watch either secrets or [15:03.080 --> 15:10.080] ConfigMaps or both if you want to. But you need to use Kubernetes secrets and ConfigMaps. How you change secrets [15:10.080 --> 15:16.080] is up to you. If you don't want to automatically synchronize, you don't have to. You can use Reloader just to trigger [15:16.080 --> 15:21.080] a reload without using External Secrets or synchronized secrets. So if you want to do that manually, you can [15:21.080 --> 15:36.080] absolutely do that. Does it answer your question? No.
[15:36.080 --> 15:45.080] I would like to do something like synchronize secrets right into volumes, for example, like skipping Kubernetes secrets [15:45.080 --> 15:50.080] totally, because we don't want to, like, persist that in etcd. [15:50.080 --> 15:57.080] So no, probably Reloader is not really useful in that case. But I see what you mean. So if you, for example, if you [15:57.080 --> 16:07.080] use something like vault-env and you grab the secrets directly from within the pod and you want to trigger a reload, [16:07.080 --> 16:15.080] then no, Reloader can't be used that way. But we are actually working, so I'm from Cisco and before that I was [16:15.080 --> 16:21.080] working for Banzai Cloud and we are working on a solution right now exactly for that so we can have, like, [16:21.080 --> 16:28.080] a component that watches secrets that have external Vault references and reloads a component or [16:28.080 --> 16:38.080] triggers reloads for workloads based on those changes. But none of these tools support that at the moment. [16:38.080 --> 16:52.080] So are there some risks of using this method instead of using, for example, a secret vault? I mean, with a secret [16:52.080 --> 17:00.080] vault, if you watch for a file and if you watch for a secret that should be written in a file or somewhere, [17:00.080 --> 17:09.080] if the secret changes, Vault usually emits a signal like a SIGHUP to reload the process. [17:09.080 --> 17:19.080] So when the secret changes, usually Vault emits a signal, a HUP signal, to reload the process and load the [17:19.080 --> 17:28.080] configuration. In this way you are reloading the whole container so there are some risks. [17:28.080 --> 17:33.080] The problem is that only works if you talk to Vault directly from your workloads and with this solution you [17:33.080 --> 17:40.080] don't have to integrate Vault directly, like you can use whatever secret store you want to.
And the problem is that [17:40.080 --> 17:48.080] Vault doesn't actually know where it should send its signal. So in this case you may deploy the secrets to a [17:48.080 --> 17:52.080] number of different clusters and Vault wouldn't know where to send those signals. [17:52.080 --> 17:58.080] So the main advantage is that this solution is fully transparent. I don't know. [17:58.080 --> 18:03.080] We have time for one more question. [18:03.080 --> 18:12.080] Any advice about some tools to do the rotation on the other part, like, for example, rotate the standard database credentials, [18:12.080 --> 18:19.080] something like that, that will automatically update in the secret store then trigger the workload? [18:19.080 --> 18:24.080] The problem with that is that secret providers, like, there are many different secret providers. [18:24.080 --> 18:30.080] So it's really hard to build a central solution for that. But HashiCorp Vault is one. [18:30.080 --> 18:36.080] HashiCorp Vault has a bunch of, I think it's called secret backends or something like that, that you can use to issue [18:36.080 --> 18:42.080] credentials, for example, to a Postgres database. And that credential can actually have a TTL, a deadline. [18:42.080 --> 18:49.080] And then after a certain time, Vault would issue a new pair of credentials and then External Secrets would be [18:49.080 --> 18:53.080] able to synchronize those credentials. We actually use that with the AWS backend. [18:53.080 --> 18:57.080] And that's how we rotate database credentials every 90 days. [18:57.080 --> 19:02.080] Okay. Thank you so much for the talk. Thank you for all the questions. Thank you for staying quiet. [19:02.080 --> 19:12.080] Thank you.