Hi everyone, I'm here today to talk about delivering a Crossplane-based platform.

A few words about myself: my name is Maximilian Blatt. I'm a Kubernetes and Crossplane developer and consultant at Accenture in Germany. I've been working with Crossplane for two years now, and I'm the maintainer of several Crossplane-related open source projects, including provider-aws, provider-styra and provider-argocd, and I've contributed to many more, including Crossplane itself.

Since this is the CI/CD dev room, I don't know if everyone is familiar with Crossplane, so I just want to spend a minute or two explaining what it is. Crossplane is essentially an extension of the Kubernetes API, and it allows you to create cloud resources the way you would create resources in Kubernetes. The thing on the left is something most of you have probably seen once or twice: a Kubernetes pod, a very common resource that basically just schedules a container where you can run an application. On the right you see a bucket as you would create it with Crossplane, and it represents an actual bucket on AWS S3. If you look at both of these objects, you see that they are very similar, because they both live inside the Kubernetes cluster and they share the same kind of structure: you have the API version and the kind, you have the metadata that comes with every Kubernetes object, you have a declarative spec where you describe the desired state of the resource, and then you have the status with information about the resource itself. That is the first thing Crossplane does for you: it connects external APIs — any kind of external API — with Kubernetes and lets you manage your whole cloud infrastructure through one Kubernetes cluster.

The second very powerful feature of Crossplane is that it allows you to create your own custom Kubernetes APIs using something called compositions, which is the thing you can see in the middle. It's a very rough and simplified graph of the way Crossplane works, and it essentially always works like this: the user claims a resource from your API, which you have defined using a so-called XRD, a composite resource definition; the claim is passed to a composition; and the composition spawns a number of managed resources. A managed resource is what you saw on the slide before: a bucket, or any other kind of external resource on any other kind of external API.
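To make that comparison concrete, here is a minimal sketch of the kind of bucket manifest I mean. The field names follow provider-aws's S3 API, but treat the exact schema as illustrative rather than authoritative:

```yaml
# A Crossplane managed resource: an S3 bucket declared like any
# other Kubernetes object (illustrative schema based on provider-aws).
apiVersion: s3.aws.crossplane.io/v1beta1
kind: Bucket
metadata:
  name: my-example-bucket
spec:
  forProvider:
    locationConstraint: eu-central-1   # the AWS region for the bucket
  providerConfigRef:
    name: default                      # which AWS credentials to use
status:
  # The status is written back by Crossplane, not by the user.
  conditions:
  - type: Ready
    status: "True"
```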
Today I want to talk mostly about XRDs and compositions, because that is what you work on most of the time when you are working with Crossplane.

Now, developing a platform with Crossplane. If you look at a simple CI/CD pipeline, you usually have build, test and then deploy, and for most software projects that is very easy to understand. But Crossplane is a bit different, and you do different things inside these steps. With Crossplane, you first build and push a package: you are not writing code, but YAML objects, which are applied to the cluster and then handled and treated like data by Crossplane. When you test your Crossplane platform, you apply all your compositions and XRDs to a test cluster, claim them, and see whether they work. If that is okay, you deploy them, which is just doing the same on a production cluster.

I don't want to talk about the deployment today, because it is very simple: much like a Kubernetes deployment, you build an OCI image, push it, and install it on a cluster using Crossplane. That's it — there is not much to tell. Instead, I want to talk about the building and the testing.

Let's start with the building. If you have worked with Crossplane before, this is probably very familiar. On the left you see an XRD as you would write it, and on the right you see a composition. As I said, an XRD basically just defines the API your users interact with, and it's very similar to the custom resource definitions you write in plain Kubernetes: you have your API schema in the spec of the XRD. In the composition you then define the resources that should be created when the user claims this API, and that can be an arbitrary number of resources — you don't have to create just one, you can create dozens. I've written compositions that create 30 or more resources at once. Essentially, you specify a base resource and then modify it by copying information from the user's claim into the resource you want to create. That is what you do the whole time you are working with Crossplane: you write an XRD, then you write a composition — or multiple compositions — and then the user can claim it and choose the composition they want.
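As a minimal sketch of those two artifacts — assuming a simple bucket API, where names like XBucket and example.org are made up for illustration:

```yaml
# XRD: defines the user-facing API (the schema users can claim).
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
  name: xbuckets.example.org          # must be <plural>.<group>
spec:
  group: example.org
  names:
    kind: XBucket
    plural: xbuckets
  claimNames:
    kind: BucketClaim
    plural: bucketclaims
  versions:
  - name: v1alpha1
    served: true
    referenceable: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              region:
                type: string
---
# Composition: the resources spawned when someone claims the API.
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: xbuckets-aws
spec:
  compositeTypeRef:                   # must match the XRD's kind/version
    apiVersion: example.org/v1alpha1
    kind: XBucket
  resources:
  - name: bucket
    base:                             # the managed resource to create
      apiVersion: s3.aws.crossplane.io/v1beta1
      kind: Bucket
      spec:
        forProvider: {}
    patches:                          # copy data from the claim...
    - fromFieldPath: spec.region
      toFieldPath: spec.forProvider.locationConstraint  # ...into the MR
```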
That looks easy at first, but when you do this at an enterprise level, you very easily end up with compositions that are thousands of lines of code, creating dozens of objects. And because you are dealing with pure YAML, you really start to hit its limits, because compositions contain a lot of repetition. You have very similar structures, say, when you spawn a lot of similar objects from different compositions. You sometimes have the same patches that you reuse over and over — for example, if you just want to patch the name of a resource with what the user has given you, you repeat that patch for every resource in every file you write. And sometimes you have compositions that only vary in details: if you have different environments, for example different AWS accounts, and you only want resources to appear in specific accounts, or you have different values like the region, or static resources you want to reference like the account ID, then you have to write the same composition over and over, just with different values. You end up with something that gets really, really complicated, because you're doing a lot of copy and paste. So you need something to generate the YAML dynamically.

Over these two years I've spent a lot of thought on how to simplify this process, and I have experimented with a bunch of things. We tried out CUE, a JSON-like configuration language that allows you to build structures and have them validated, but it's very complex and not very easy for newcomers: if you have new developers in your teams, it's a bit hard to onboard them, because the error messages are not very helpful in many cases. The tool that we ended up establishing was Helm. I'm not the biggest fan of Helm, because it's a bit quirky to use, and when you have errors, it's sometimes hard to detect where the error actually is — it just tells you there's something wrong with your YAML, but you don't know where exactly it happened. But the good thing about Helm is that it can do everything we need: you can replace common code blocks such as constants with values written out in your values.yaml, you can use templates to parameterize patches and save lines of code, and you can even generate the API schemas of your XRDs, which is a really, really cool thing.
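For example, a named Helm template can hold a patch that would otherwise be copied into every resource, and a loop can stamp out one composition per environment. This is a rough sketch with made-up template and value names, not our actual chart:

```yaml
# _patches.tpl — a reusable patch block (hypothetical helper name).
{{- define "platform.namePatch" -}}
- fromFieldPath: metadata.labels[crossplane.io/claim-name]
  toFieldPath: metadata.annotations[crossplane.io/external-name]
{{- end }}

# composition.yaml — one Composition per environment from values.yaml
# instead of copy-pasting near-identical files.
{{- range .Values.environments }}
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: xbuckets-{{ .name }}
spec:
  compositeTypeRef:
    apiVersion: example.org/v1alpha1
    kind: XBucket
  resources:
  - name: bucket
    base:
      apiVersion: s3.aws.crossplane.io/v1beta1
      kind: Bucket
      spec:
        forProvider:
          locationConstraint: {{ .region }}   # differs per environment
    patches:
    {{- include "platform.namePatch" . | nindent 4 }}
---
{{- end }}
```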
To give you a sense of scale: I just checked the code in our repository, and we have about 10,000 lines of code for Helm, and from that we generate 200,000 lines of composition YAML that are then applied to our clusters.

If you are generating code for Crossplane with Helm or any other kind of code generation tool, I recommend you check the generated YAML into your Git repository, because as it turned out, it's very hard to detect unintended changes with your bare eyes: if you change one value or a template somewhere, it might have side effects that you don't see so easily. So I really recommend you check the generated YAML code into Git and do not treat it as a build artifact. Then in your CI — this is what we do, and it is really helpful — you regenerate your package and all your generated YAML and check whether any diff appears. If there is a diff, you should treat it as an error and abort; if there is no diff, you are okay and can continue and push your package to the OCI registry.
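A sketch of that check as a CI job — GitLab CI syntax here purely as an example; the chart and output paths are made up:

```yaml
# Regenerate the YAML and fail the pipeline if it differs from
# what is checked into Git (paths and job names are illustrative).
check-generated-yaml:
  stage: build
  script:
    - helm template ./charts/platform --output-dir ./generated
    # git diff exits non-zero if the regenerated files differ from
    # the committed ones, which aborts the pipeline.
    - git diff --exit-code -- ./generated
```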
So much for the building; now let's look at the testing. The first thing you probably do when you start working with Crossplane is write your composition, apply it to a cluster, claim it, and see whether it works — whether all the resources become ready and you can use them. That is all manual, and it's very easy because it requires no additional setup; you can just use the cluster you have. But when you really want automated, enterprise-level testing, that is not enough. With manual steps the outcome is not reproducible, because you are doing everything yourself, and you also haven't defined what the expected outcome actually is — sometimes, even if a resource becomes healthy, it doesn't mean the resource is configured the way you want.

So here too we tried and tested a few things. We started with Go tests, but that turned out to be much more complicated, because you have to write a lot of boilerplate code. We ended up using kuttl — I don't know if people here know it. It's basically a Kubernetes testing toolkit that allows you to define all your test cases in YAML and then let kuttl do all the work of applying the YAML to the server; afterwards you define the resources you expect. If you picture the graph I showed you before — you have the composition, you claim it, and a number of managed resources are spawned — then you can have the claim as the input, define the resources you expect to be created as the output, and let kuttl handle all the rest for you. It can even run things in parallel and so on. It's a really, really great tool, so I recommend kuttl.

Just to show you an example of what these tests look like: sticking with the simple bucket example, on the left you have your bucket claim, which is your test case, and on the right you define all the objects you expect. There's the bucket claim itself, whose status should become ready; the composite resource, an internal object created by Crossplane where it stores some reconciling information, which should also become ready; and the actual Bucket managed resource, which has the properties you expect it to have, and again a status. That is all you need to test Crossplane with kuttl. One thing I want to highlight: in Crossplane, the names of composite resources are always generated by the Kubernetes API server, so every time you claim an API the name is different — always different — and you cannot influence it. What you can do with kuttl is let it identify the objects you expect via labels: you don't pass a name, but instead just declare in YAML that you want an object with certain properties and labels set, and kuttl will look for any object on the server that satisfies those constraints. If there is one, you are good to go.
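A kuttl test step for the bucket example might look roughly like this, following kuttl's step/assert file convention; the API group and labels are carried over from the earlier sketches. Note that the expected Bucket has no name, only labels, so kuttl matches any object of that kind whose state fits:

```yaml
# 00-claim.yaml — the test input: claim our bucket API.
apiVersion: example.org/v1alpha1
kind: BucketClaim
metadata:
  name: test-bucket
spec:
  region: eu-central-1
```

```yaml
# 00-assert.yaml — what we expect to exist afterwards.
# The claim itself should become ready...
apiVersion: example.org/v1alpha1
kind: BucketClaim
metadata:
  name: test-bucket
status:
  conditions:
  - type: Ready
    status: "True"
---
# ...and a Bucket managed resource should be spawned. No name is
# given: kuttl matches any Bucket whose labels and fields fit,
# which sidesteps the API-server-generated names.
apiVersion: s3.aws.crossplane.io/v1beta1
kind: Bucket
metadata:
  labels:
    crossplane.io/claim-name: test-bucket
spec:
  forProvider:
    locationConstraint: eu-central-1
status:
  conditions:
  - type: Ready
    status: "True"
```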
One other thing we've found to work very well: you should run your tests in a separate cluster for every pipeline run. We use virtual clusters — vclusters — for that, which run inside a physical cluster. Of course you could create your own physical cluster each time, but spinning up physical clusters, at least on EKS, can take up to 30 minutes, which is not something you want for every test, and it also costs a lot of money. So instead we spin up virtual clusters, which are Kubernetes control planes running as pods inside a cluster. There you can install Crossplane and its providers, apply the compositions, and run all the tests with kuttl; once you are done with the tests, you just delete the virtual cluster and everything is fine. You also don't get any interference between two different pipelines, which matters because compositions are cluster-scoped and would most likely override each other.

Now, I've been talking a lot about end-to-end tests, and they are really good — I recommend you write end-to-end tests when you are building a Crossplane platform — but they also take a lot of time to run. Consider that you have an API that creates real physical cloud resources: you always have to wait for your resource to actually come up, and after some time it may tell you that something is misconfigured, and then you have to go looking for the error. If you're doing development, that really slows you down, because there is always this 10-, 15-, 20-minute gap between something happening. And there are a lot of mistakes you can make when writing compositions, so I just want to highlight a few. You have the compositeTypeRef, which links the composition with the XRD — the two have to match, and they are only validated at runtime. You have the group names, which have to match the XRD name. You have an unstructured OpenAPI schema, because Kubernetes does not support recursive API schemas yet — maybe that will come in the future, but as of now it's not supported — and the same goes for the resource base, which can also contain any kind of field. And then you have the resource patches: by default, if you patch from one field to another and the path of your source does not exist, Crossplane's default behavior is to just ignore the patch — it will not throw an error or anything. So you can easily swallow errors: you're left wondering why things don't work, when you just have a typo in your patch.
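Here is that pitfall in miniature. This sketch uses Crossplane's patch policy field: the default policy silently skips a missing source path, while Required turns the silent skip into a visible error:

```yaml
patches:
- type: FromCompositeFieldPath
  fromFieldPath: spec.regoin        # typo: this path never exists on the claim
  toFieldPath: spec.forProvider.locationConstraint
  # With the default policy (Optional), the missing source path means
  # the patch is silently skipped. Declaring it Required makes
  # Crossplane surface an error instead:
  policy:
    fromFieldPath: Required
```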
Finding mistakes like that is really hard when you have two thousand lines of YAML. Then you have types that must match: if the user inputs a string, you have to make sure a string is actually expected — and not an integer — on the actual bucket API, for example. And then there's indentation, which is my big personal problem when writing YAML files: I always mess up the indentation, and then things get all messy.

So we needed something to detect these errors sooner, because the sooner you detect an error, the easier it is to fix. Since there was nothing out there — at least we couldn't find anything — we developed a linter for Crossplane compositions. It loads the actual XRD and CRD schemas, compares them with the compositions, and applies a set of rules: ensuring that the composition actually references a valid XRD type; ensuring that you don't have duplicate objects, which can sometimes happen, especially when you generate things with Helm; and, most importantly, validating the patches against the CRD and XRD schemas. That is really, really helpful: the first time we ran it against our production code, it turned out we had, I think, 800 errors that nobody had noticed — yet somehow our platform still worked. Another cool thing about our linter is that it's a pure CLI: you don't need a Kubernetes cluster or a Crossplane installation, you can just run it locally without setting anything else up. It takes maybe a minute or two, and then all your compositions are linted, which is really great. If you're wondering where to get it: there will be a link on the last slide where you can find the code.

Summing things up, this is the CI/CD pipeline we arrived at after a couple of years of testing and failing: we use Helm to write and build our compositions and generate the YAML code dynamically, we use our self-written linter to lint our compositions, we use kuttl to run all the end-to-end tests, and then we push everything with crane or whatever other OCI tool comes in handy. Here's a QR code for the linter — we are actually making it open source today, so you are the first to see the code besides us. Thank you. Do we have time for questions? Okay — any questions?
Audience question: My question is more about Crossplane itself — this looks really good, though. How does Crossplane compare to things like Cluster API and the CRDs that it introduces? Where is the distinction between the two, if you're familiar with Cluster API?

Answer: Crossplane makes use of CRDs under the hood. When you apply your XRDs to the cluster, Crossplane generates CRDs, which are then used as the API that the user can claim.

If there are no more questions, then thank you — we're going to take a five-minute break.