[00:00.000 --> 00:13.000]  Okay. Our next talk is going to start right now. Mark's already on stage. He's going to
[00:13.000 --> 00:18.080]  talk about automating secrets, rotation, and Kubernetes, and please quiet down so we can
[00:18.080 --> 00:41.080]  understand him. Okay. Hello. Can you hear me? All right. So thank you for joining here today. My name is Mark. I'm an engineer
[00:41.080 --> 00:49.080]  tech lead at Cisco. For the last couple of years, or maybe the better part of the decade, my primary job was helping
[00:49.080 --> 00:57.080]  engineering teams around their business applications and Kubernetes and helping them succeed without having to get into too
[00:57.080 --> 01:08.080]  much details about Kubernetes. Let me start with the story. I'm pretty sure this will sound familiar to a lot of us here.
[01:08.080 --> 01:18.080]  A couple of years ago, I was in the middle of debugging session. It was already the middle of the night. Everyone was tired. And finally, we
[01:18.080 --> 01:29.080]  found a problem. I committed the change, pushed the code, and then suddenly all the buzz went off. We received an e-mail from
[01:29.080 --> 01:39.080]  AWS that a pair of credentials was committed in a public repository. Who did something like that before? Come on. I'm pretty
[01:39.080 --> 01:49.080]  sure it's more than that. There's no shame in that. Everyone has to go through that once. So we obviously had to revoke the
[01:49.080 --> 01:57.080]  credentials, generate a new pair, and deploy it to production. And we were able to do that because we had, like, good
[01:57.080 --> 02:05.080]  secret management pipeline in place. And this kind of hints at why rotating secrets or being able to rotate secrets is
[02:05.080 --> 02:12.080]  important, because if you have an incident like this, you have to be able to act quickly and rotate those secrets and make
[02:12.080 --> 02:21.080]  sure that, well, in a first-case scenario, people may steal your data in a better scenario than AWS. Someone might start
[02:21.080 --> 02:30.080]  mining Bitcoin. But you have to be able to react quickly. Another reason why this is a very important topic is we often have to
[02:30.080 --> 02:40.080]  meet certain compliance requirements that require us to rotate every secret we have, like, every 90 days. I'm pretty sure many of
[02:40.080 --> 02:49.080]  us have to deal with that. But the worst of all, the worst situation of all is when you don't even know that a secret has
[02:49.080 --> 02:57.080]  been leaked. Or maybe an angry ex-employee took something with home. And you don't even know that happened. And they are
[02:57.080 --> 03:05.080]  stealing your data. They are stealing your customer's data. Or they are mining Bitcoin in a better situation.
[03:05.080 --> 03:12.080]  All right. So probably nobody disputes that secret rotation is important. But unfortunately, it comes with its own
[03:12.080 --> 03:21.080]  self-challenges, which often turns people away from actually caring about this. And obviously, secret rotation or managing
[03:21.080 --> 03:27.080]  secrets or configuration is a very complex problem, especially in a Kubernetes environment where you may have multiple
[03:27.080 --> 03:33.080]  different clusters, multiple different in-spaces where you have to deploy these secrets, many different secrets and
[03:33.080 --> 03:44.080]  integration, which means it takes a lot of time to do it right. And it's still an error-prone process. And in an idea
[03:44.080 --> 03:53.080]  scenario, if you screw something up, it may not result in an actual outage or incident. But it may, which is obviously, it
[03:53.080 --> 04:01.080]  would affect the business, which is what we wanted to avoid in the first place by making these secret rotations. So all
[04:01.080 --> 04:08.080]  right. So I'm going to talk about some of the key challenges and why it's important points to that secret rotation
[04:08.080 --> 04:16.080]  should be possible. I mean, it's probably always possible. But I've seen situations where rotating certain secrets would
[04:16.080 --> 04:23.080]  have been very, very hard. Like it would have taken like hours, which is a problem. But so it should be possible. And
[04:23.080 --> 04:30.080]  you should be able to do it relatively quickly. Secret rotation should also be as much automated as
[04:30.080 --> 04:40.080]  possible. Like we are not really trustworthy, like we make mistakes, exhibit A. So it should be ultimately as much as
[04:40.080 --> 04:48.080]  possible. And humans should interact with secrets and secret rotation as little as possible. And finally, secret
[04:48.080 --> 04:55.080]  rotation should happen periodically. Like you shouldn't have a secret that you use for years, because as I mentioned, you
[04:55.080 --> 05:04.080]  don't know if it's been leaked. And if you don't know if it's been leaked, how do you know if your system is secure or not?
[05:04.080 --> 05:13.080]  So how does secret rotation look like in general? We are not even talking about Kubernetes here. First, you need to
[05:13.080 --> 05:20.080]  have a secret store. If you don't have a secret store, then the whole thing is a lot more complex than it should be. You
[05:20.080 --> 05:26.080]  have a secret store where you store your secrets, and then you have some solution to deploy those secrets to your production
[05:26.080 --> 05:33.080]  environment or production environments. Now, when you need to change a secret, depending on what type of secret that is,
[05:33.080 --> 05:41.080]  you have to go to the secret provider, which may be a third-party provider like AWS or GitHub or anything like that. You
[05:41.080 --> 05:48.080]  have to issue a new pair of credentials or generate a new secret, change that in the secret store, and then you need some
[05:48.080 --> 05:58.080]  sort of mechanism to deploy the new secret. That probably should be an automatic process that notices the secret
[05:58.080 --> 06:04.080]  change, and it should deploy the secrets for you in your production environment. Now, in some cases, if you have a secret
[06:04.080 --> 06:09.080]  store that supports that, for example, Hashicorp's vault, your secret store may be able to
[06:09.080 --> 06:18.080]  automatically issue credentials for you, for example, for AWS, your database, or whatever else Hashicorp's vault supports, so you
[06:18.080 --> 06:26.080]  don't even need to do that manually. Hashicorp's vault takes care of that, and that's like the best case scenario. Now, how does
[06:26.080 --> 06:33.080]  this look like in Kubernetes? First of all, you have to decide whether you want to use Kubernetes secrets at all or not. There
[06:33.080 --> 06:41.080]  are options when you don't have to use Kubernetes secrets, but that's probably the easiest way to many secrets in
[06:41.080 --> 06:48.080]  Kubernetes, and the reason why generally people don't like using Kubernetes secrets is because they have this notion that
[06:48.080 --> 06:56.080]  Kubernetes secrets are not secure because they are base 64 encoded, and that's not secure. So that's an entirely
[06:56.080 --> 07:05.080]  different conversation. The bottom line is if you have envelope encryption enabled, which is disabled by default, then you're
[07:05.080 --> 07:13.080]  probably safe using Kubernetes secrets. Now, if you decided to use Kubernetes secrets, then you need something that
[07:13.080 --> 07:20.080]  deploys the secrets from your secret store to Kubernetes, and this could be, for example, the external secrets
[07:20.080 --> 07:27.080]  operator. There are other solutions, but this is probably the one that the community organizes around a lot lately. So
[07:27.080 --> 07:36.080]  external secrets operator is able to synchronize your secrets from an external store, external being to Kubernetes in this
[07:36.080 --> 07:43.080]  case. For example, Hashicorp's vault or AWS secret manager or whatever else you have, external secrets operator is able to
[07:43.080 --> 07:51.080]  synchronize secrets to Kubernetes secrets, and it's also able to pick up changes. It doesn't actively monitor changes, but
[07:51.080 --> 07:58.080]  periodically it takes a look at the secrets, and if something changes, then it synchronizes the changes to Kubernetes. So
[07:58.080 --> 08:05.080]  we have that part covered, and then you can use the Kubernetes secrets, either as environment variables or mount them as
[08:05.080 --> 08:17.080]  files, however you want to use them. Now, the secrets change. What then? So if you mount secrets as files, and your
[08:17.080 --> 08:26.080]  application is able to pick up that change, then you don't have anything to do. Your application will already reload the
[08:26.080 --> 08:33.080]  configuration, and you have the whole thing covered. Now, if your application can't do that, or if your application
[08:33.080 --> 08:41.080]  uses environment variables, you mount secrets as environment variables, but that's a more difficult problem, and for years we
[08:41.080 --> 08:48.080]  didn't really have a solution for that other than manual restarts. A couple of years ago, this component called reloader
[08:48.080 --> 08:58.080]  appeared on the market, which basically watches workloads that have, that references secrets, and it also watches the
[08:58.080 --> 09:06.080]  secrets, obviously, and when it detects a change, it triggers a standard workload rollout, similarly to how you would do
[09:06.080 --> 09:12.080]  that with kubectl rollout, for example. So it may change the annotation of the workload, and that would result in the
[09:12.080 --> 09:19.080]  workload being rolled out, which means that it would run with the new environment variables, and it would remount the
[09:19.080 --> 09:28.080]  secret with the changed file. And if we take a look at the whole process from the previous diagram, we don't have one component that
[09:28.080 --> 09:35.080]  takes care of the deployment, in this case, but we have two, one that synchronizes the secrets from the secrets store to
[09:35.080 --> 09:43.080]  Kubernetes, and the other one that takes care of the rollouts, or making sure that the workloads notice the secret
[09:43.080 --> 09:53.080]  change. Well, let's take a look at a very quick demo, how that looks like in action, and I have a repository
[09:53.080 --> 10:00.080]  prepared, you can go ahead and try it if you want to, and I have a Kubernetes cluster running here with both external
[10:00.080 --> 10:10.080]  secrets and reloader installed, and in addition to that, we have like a simple echo server, which just, I believe it's,
[10:10.080 --> 10:21.080]  yeah, we just output something. So let's take a look at how we configure external secrets first. So as I mentioned,
[10:21.080 --> 10:27.080]  you configure external secrets, or maybe I don't need to mention, I don't know, but you configure external secrets
[10:27.080 --> 10:36.080]  via custom resources, which means you create, can you see it from the back? Okay, cool. So you configure external
[10:36.080 --> 10:45.080]  secrets via custom resource called external secret, and you tell external secrets to, you tell external secrets how to,
[10:45.080 --> 10:52.080]  and from which external store should it synchronize secrets from, and where it should put it. So in this case, we are
[10:52.080 --> 11:00.080]  telling external secrets to synchronize secrets from a store I created and called as fake. This is basically a static
[11:00.080 --> 11:10.080]  secret store in this case. It synchronizes secrets into a secret called full bar, and it's going to synchronize from the
[11:10.080 --> 11:21.080]  fake secret store under the key, from under the key full slash bar to a key under hello in the Kubernetes secret.
[11:21.080 --> 11:33.080]  So let's take a look at, if we do, in fact, have that secret there. So we have a full bar secret. That's good so far.
[11:33.080 --> 11:49.080]  And we have a hello key here. I'm sure if you can see that. Now, if I change this secret right now, this,
[11:49.080 --> 12:00.080]  this is just a command that patches the external or the fake store to change the secret value. If I go back and check the
[12:00.080 --> 12:11.080]  secret value, it should be changed to everyone. Now, if I try to curdle the service again, there are no changes here. So if I
[12:11.080 --> 12:22.080]  manually restart the pod, let's see, do I have the command here? Yeah, I have a rollout command. If I manually restart the pod
[12:22.080 --> 12:35.080]  and restart the port forward as well, then I should see that the secret value is in fact changed. Maybe I haven't shown
[12:35.080 --> 12:50.080]  you, but I do have the application deployment here that references the full bar secret. All right. So now we have the secret
[12:50.080 --> 13:00.080]  synchronization part covered. Now, let's see how it works if I want the workload to be automatically rolled out when the
[13:00.080 --> 13:11.080]  secret changes. So I can annotate the echo server with this reloader annotation, which will make reloader start
[13:11.080 --> 13:23.080]  watching this workload and the secrets mounted in it. So nothing changed yet. I should still see everyone. That's fine. And now
[13:23.080 --> 13:39.080]  let's change the secret again to fuzz them. So if I, yeah, the secret is changed to fuzz them. And if we take a look at the,
[13:39.080 --> 13:49.080]  I probably have to restart this. If we take a look at the service, it should now say hello fuzz them. So in this case, I
[13:49.080 --> 13:55.080]  didn't have to restart the virtual manually because reloader did that for me when I changed the secret. When I changed the
[13:55.080 --> 14:01.080]  secret in the store, that external secret synchronized into the Kubernetes secret and reloader noticed that
[14:01.080 --> 14:12.080]  change, so it rolled out the deployment. So that's what I wanted to show you today. If you have any questions, I'm happy to
[14:12.080 --> 14:22.080]  answer them.
[14:22.080 --> 14:43.080]  Hi. Thanks for your presentation. Can we use a reloader? Can you speak up, please, because I can't hear you. Please stay
[14:43.080 --> 14:54.080]  quiet. Thank you. Can we use reloader without Kubernetes secrets? Because we're one of, can we use reloader without
[14:54.080 --> 15:03.080]  syncing to Kubernetes secrets? I mean, you absolutely can. So with reloader, you can watch either secrets or
[15:03.080 --> 15:10.080]  config maps or both if you want to. But you need to use Kubernetes secrets and config maps. How do you change secrets
[15:10.080 --> 15:16.080]  is up to you. If you don't want to automatically synchronize, you don't have to. You can use reloader just to trigger
[15:16.080 --> 15:21.080]  a reload without using external secrets or synchronized secrets. So if you want to do that manually, you can
[15:21.080 --> 15:36.080]  absolutely do that. Does it answer your question? No.
[15:36.080 --> 15:45.080]  I would like to do something like synchronize secrets right into volumes, for example, like skipping Kubernetes secrets
[15:45.080 --> 15:50.080]  totally, because we don't want to, like, resist that in that CD.
[15:50.080 --> 15:57.080]  So no, probably reloader is not really useful in that case. But I see what you mean. So if you, for example, if you
[15:57.080 --> 16:07.080]  use something like bolt-amp and you grab the secrets directly from within the pod and you want to trigger a reload,
[16:07.080 --> 16:15.080]  then no, reloader can't be used that way. But we are actually working, so I'm from Cisco and before that I was
[16:15.080 --> 16:21.080]  working for Banzai Cloud and we are working on a solution right now exactly for that so we can have, like,
[16:21.080 --> 16:28.080]  a component that watches secrets that have external bolt references and reloads a component or
[16:28.080 --> 16:38.080]  trigger reloads for workloads based on those changes. But none of these tools support that at the moment.
[16:38.080 --> 16:52.080]  So are there some risks of using this method instead of using, for example, a secret vault? I mean, with a secret
[16:52.080 --> 17:00.080]  vault, if you watch for a file and if you watch for a secret that should be written in a file or somewhere,
[17:00.080 --> 17:09.080]  if the secret change vault usually emits a signal like a sig up to reload the process.
[17:09.080 --> 17:19.080]  So what when the secret changes? Usually vault emits a signal, an up signal to reload the process and load the
[17:19.080 --> 17:28.080]  configuration. In this way you are reloading the whole container so there are some risks.
[17:28.080 --> 17:33.080]  The problem is that only works if you talk to a vault directly from your workloads and with the solution you
[17:33.080 --> 17:40.080]  don't have to integrate vault directly, like you can use whatever secret story you want to. And the problem is that
[17:40.080 --> 17:48.080]  vault doesn't actually know where it should set its signal. So in this case you may deploy the secrets to a
[17:48.080 --> 17:52.080]  number of different clusters and the logists wouldn't know where to send those signals.
[17:52.080 --> 17:58.080]  So the minor advantage is that it's fully transparent to the solution. I don't know.
[17:58.080 --> 18:03.080]  We have time for one more question.
[18:03.080 --> 18:12.080]  Any advice about some tools to do the rotation on the other part, like, for example, rotate the standard database credentials,
[18:12.080 --> 18:19.080]  something like that, that will automatically update in the secret store then trigger the workshop?
[18:19.080 --> 18:24.080]  The problem with that is that secret providers, like, there are many different secret providers.
[18:24.080 --> 18:30.080]  So it's really hard to build a central solution for that. But hashicorp vault is one.
[18:30.080 --> 18:36.080]  Hashicorp vault has a bunch of, I think it's called old backends or something like that, that you can use to issue
[18:36.080 --> 18:42.080]  credentials, for example, to a Postgres database. And that credential can actually have a TTL, a deadline.
[18:42.080 --> 18:49.080]  And then after a certain time, hashicorp's vault would issue a new pair of credentials and then external secrets would be
[18:49.080 --> 18:53.080]  able to synchronize those credentials. We actually use that with AWS back end.
[18:53.080 --> 18:57.080]  And that's how we rotate database credentials every 90 days.
[18:57.080 --> 19:02.080]  Okay. Thank you so much for the talk. Thank you for all the questions. Thank you for staying quiet.
[19:02.080 --> 19:12.080]  Thank you.