[00:00.000 --> 00:11.440] Hello. How's it going, everybody? A lot of people in this room. We didn't really expect [00:11.440 --> 00:15.640] so many. This is wonderful. Thank you for coming to see us. I just want to say that [00:15.640 --> 00:19.640] we want to talk today about mutation testing. That's what we're here for. If you like this [00:19.640 --> 00:27.480] penguin, does anyone not like this penguin? Just you. Okay. Personal vendetta noted. [00:27.480 --> 00:33.200] This is a penguin generated by DALL·E. Hopefully, it's friendly enough because this is going [00:33.200 --> 00:36.640] to be part of our talk. We're going to see a lot of penguins in this talk. If anyone [00:36.640 --> 00:40.640] has a personal objection to penguins, please speak now. Otherwise, if you like penguins, [00:40.640 --> 00:44.280] can I get a hand up just to see if we're cool with that? Awesome. I've never seen so many [00:44.280 --> 00:47.880] people want to put their hands up but not really be sure. I absolutely love the energy [00:47.880 --> 00:55.240] in this room. My name's Max. I'm this guy. As you can tell, I'm also this guy. I'm here [00:55.240 --> 00:59.440] to talk to you about mutation testing. I work for a company called Vonage and I'm a Python [00:59.440 --> 01:04.600] developer advocate there. Now, what that means is that I maintain our Python tooling. I'm [01:04.600 --> 01:07.760] here to talk about mutation testing because I've just kind of gone through this process [01:07.760 --> 01:11.760] myself of understanding all this stuff and applying it to my own work. I want to show [01:11.760 --> 01:16.160] you kind of how that went. But with me, not only do I have the tallest person in the room. [01:16.160 --> 01:22.360] Stand up straight. Stand up straight. This person is 196 centimeters tall. I'm like 177. [01:22.360 --> 01:28.040] I'm not sure. I promise, I'm average in Britain. In this place, right? 
This person knows a [01:28.040 --> 01:31.000] lot more about mutation testing than me. I'm really not the expert here but I just want [01:31.000 --> 01:36.520] to say this is Paco. Yes. I'm Paco. I work for OpenValue, a small consultancy company [01:36.520 --> 01:41.880] in the Netherlands. I got into mutation testing via my thesis. When I wrote my thesis on test [01:41.880 --> 01:47.360] effectiveness, I wanted to learn more about mutation testing. Also, after that, I got into speaking [01:47.360 --> 01:51.440] at conferences and spreading the word about this, which is quite awesome, too. I hope that [01:51.440 --> 01:57.680] at the end of the talk, you have another cool tool in your toolbox to write better code. [01:57.680 --> 02:02.440] Awesome. If we're cool with that, we do have to do the obligatory sponsor bit. These companies paid [02:02.440 --> 02:06.680] for us to come here and paid for our flights and stuff. What my company does, I'll just [02:06.680 --> 02:11.640] quickly tell you. We do communications APIs as a service, basically. Things like SMS, [02:11.640 --> 02:16.000] like voice calls, like video chats, like two-factor authentication, all via API. That's kind [02:16.000 --> 02:20.200] of what we do. That's really just what I want to say. It is relevant because I will show [02:20.200 --> 02:24.120] you what I actually applied this to, which was one of our SDKs. [02:24.120 --> 02:27.960] For me, we don't actually have a product to sell. Also, I definitely didn't fly here from [02:27.960 --> 02:34.200] the Netherlands, just to make that clear. It's just a two-hour car drive. No, we're [02:34.200 --> 02:38.000] just a consultancy company, and we really like to share knowledge. That's mostly the [02:38.000 --> 02:41.520] reason why I'm here, to tell you more and teach you more. It's quite simple. [02:41.520 --> 02:47.120] Yeah. He doesn't have the funding crutch that I do, unfortunately. Luckily, we're all good. [02:47.120 --> 02:51.240] There's two of us on this talk. 
There's two of us here, and actually, there is a third [02:51.240 --> 02:55.000] person in this talk. We've seen a hint about this person already, but this person's really [02:55.000 --> 02:58.680] the thing that's going to tie this whole talk together, and it's going to get us all feeling [02:58.680 --> 03:03.280] good about mutation testing. This person's very important, so say hello to Henry. This [03:03.280 --> 03:05.520] is Henry. Look at his little face. [03:05.520 --> 03:11.920] Thank you. Hands up if you think Henry's a cute AF penguin. [03:11.920 --> 03:17.280] That's all of you, thank you very much. Yes, I'm glad we agree. I'm glad we're on the same page. [03:17.280 --> 03:21.200] Now, just some quick audience participation, because if you can't tell, we're quite big [03:21.200 --> 03:25.720] on audience participation. So, quick question here. Who has heard of this stock photo, but [03:25.720 --> 03:29.600] more importantly, who's heard of testing? This is just a check to see if we found the [03:29.600 --> 03:35.040] right room. Thank you very much. Great stuff. Okay. Who's heard about code coverage? A lot of [03:35.040 --> 03:37.680] people, maybe not everybody, and that's okay if you haven't. We're going to talk about [03:37.680 --> 03:41.680] code coverage, so please don't worry if you haven't. But yeah, it's awesome to know that [03:41.680 --> 03:46.200] some people have. That's a good starting point too. Okay, final one. I'm going to say, other [03:46.200 --> 03:51.880] than by knowing about this talk, who's heard of mutation testing? Oh, quite a few. Yeah. [03:51.880 --> 03:59.520] And now, quick follow-up. Who was actually already using mutation testing? Ah, nice. There are [03:59.520 --> 04:03.160] enough quick wins here, and hopefully you have some good experiences. [04:03.160 --> 04:07.680] Yeah. 
So, really nice to see that people are familiar with the concept, but if you're not, [04:07.680 --> 04:10.200] it's also okay, because we're going to go through this like you don't know anything [04:10.200 --> 04:13.440] at all, because when I started doing this, you know, a few months ago, I didn't know [04:13.440 --> 04:16.160] anything at all, and so I want to take you through that journey as well, and that's what [04:16.160 --> 04:21.280] we're going to do. But before that, what I want to do first is give us some background, [04:21.280 --> 04:23.920] and what I actually really want to do is pass to Paco, who knows a lot more about this than [04:23.920 --> 04:28.560] me, so I'm going to pass to you right now. Yes. This is going to be some improvising. [04:28.560 --> 04:34.080] Good work. Good luck. I'm going to drink water with this. I'll feed you. Yeah. Nice. Great. [04:34.080 --> 04:37.720] So, yeah, we're first going to talk a bit about testing in general, and then we're going [04:37.720 --> 04:43.040] to more specifically talk about unit testing. So, just a quick check. Does anybody know [04:43.040 --> 04:47.920] what a unit test is? That's great. I don't have to explain that part. For those who don't [04:47.920 --> 04:52.040] know, it's the smallest possible test you can write in your code base, just in one method, [04:52.040 --> 04:56.280] and you write one test for it to test the outcome of that method. Now, there are many [04:56.280 --> 05:01.920] different reasons why we're writing unit tests, and I think one of them, my favorite or the [05:01.920 --> 05:06.800] most used one is for maintenance. We write tests because we want to be confident in the [05:06.800 --> 05:10.600] changes we make to our code base. So, whenever we make a small change, we add a new field [05:10.600 --> 05:15.120] to some endpoint that we know that we didn't completely break the database integration [05:15.120 --> 05:21.680] because it can happen at times. 
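To make the definition above concrete, here is a minimal sketch of a unit test in that sense: one small function, one test asserting on its outcome. The function and test names are made up for illustration; they are not from the talk's slides.

```python
def add(a, b):
    """The smallest possible unit under test: a single method."""
    return a + b

def test_add_returns_sum():
    # A unit test checks the outcome of just this one method.
    assert add(2, 3) == 5
```

A test runner such as pytest would discover and run `test_add_returns_sum` automatically.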
So, yeah, that's very important: maintenance, regression [05:21.680 --> 05:28.200] testing. But there are more reasons. One I also like a lot is that tests can actually serve [05:28.200 --> 05:35.440] as documentation. You can use tests to describe certain scenarios [05:35.440 --> 05:39.640] in your code base, so that when you have a specific test for that, it already makes clear this [05:39.640 --> 05:46.240] is intended behavior. I have an example for this. I worked for a company where [05:46.240 --> 05:52.920] we had an endpoint that returned warehouses, and these warehouses, just a domain object, [05:52.920 --> 05:56.320] had a soft delete. So, there was a flag in there that indicated whether it was deleted [05:56.320 --> 06:02.960] or not. So, this endpoint returned both deleted and non-deleted warehouses, and [06:02.960 --> 06:08.080] at some point over time, as we were working on it, a new guy came in and looked at it [06:08.080 --> 06:12.320] and said, hmm, that's strange. Why are we returning deleted warehouses? Why would you [06:12.320 --> 06:17.360] want that? It was a fair question because we had also forgotten, and there was only one test, [06:17.360 --> 06:21.440] which tested the success flow, and you can already kind of guess here a bit. So, the [06:21.440 --> 06:25.960] success flow in this case meant the test only covered non-deleted warehouses. [06:25.960 --> 06:30.480] So he made the changes, and we all thought, oh, this makes sense. It looks broken. Of [06:30.480 --> 06:34.120] course, he didn't check with product management or the product team, deployed it, and then you [06:34.120 --> 06:39.240] can guess: of course, this was broken, so we had to revert it. 
And the whole lesson here [06:39.240 --> 06:44.040] was that just one test which also included a negative scenario, with warehouses that were [06:44.040 --> 06:48.120] deleted, could have already been a trigger to think, hey, this behavior is intended. [06:48.120 --> 06:53.400] And that's how tests can serve a sort of documentation purpose. [06:53.400 --> 06:57.080] They're also very useful in getting to learn a new code base. So, whenever you're in a new code [06:57.080 --> 07:01.720] base and you have this very complicated method, a test can help you step through the method [07:01.720 --> 07:05.760] to sort of explain what's going on, for example, while debugging it. [07:05.760 --> 07:11.640] Now, another one, and this one is here for the consultants. So, who here works as a consultant? [07:11.640 --> 07:18.480] Oh, not that many. Wow. Because we're sort of the root of all evil, always. We tend to [07:18.480 --> 07:24.400] run to the next project, and we don't have to maintain our own code, often, not always. [07:24.400 --> 07:32.200] So I have this nice quote that's mostly also for us. Keep in mind that you're not doing [07:32.200 --> 07:39.160] this only for yourself. A colleague once told me: you always [07:39.160 --> 07:42.280] have this point in your development process where you think, okay, should I write a unit [07:42.280 --> 07:45.960] test for this? It's going to be a painful unit test. I know that it works. Do I really [07:45.960 --> 07:50.720] have to document it? We all know how it works. Yeah, sure, we all know how it works, but [07:50.720 --> 07:56.520] we also leave the project and then go on to another project. We as consultants. [07:56.520 --> 08:01.680] And I ask myself, what would I do if I were the next person? So what would [08:01.680 --> 08:06.480] I do if I were the next John or Jane Doe working on this project? 
So tests are not there just [08:06.480 --> 08:11.040] for you, but also for the next person working on the code. I would actually like to jump in here, because [08:11.040 --> 08:15.920] I've been that person. Thank you. I've been the person who works on a project after someone's [08:15.920 --> 08:20.440] left it. And honestly, if you have good documentation, or if you don't have that, if you have good [08:20.440 --> 08:25.120] testing, thank you, you do your water break. So if you have good testing, it can really [08:25.120 --> 08:29.600] help you understand what a project does. And so when I came to a certain project recently, [08:29.600 --> 08:33.160] I didn't have necessarily the kind of testing that I would have liked to really document [08:33.160 --> 08:37.000] my code that well. And so like, honestly, if I'd had someone like Paco, who actually [08:37.000 --> 08:40.160] was a bit more conscientious with what they tested, that would have really helped me get [08:40.160 --> 08:43.560] on board with the project quickly. But as it was, this was a real problem for me. And [08:43.560 --> 08:46.920] it was something that we want to hopefully avoid other people having to deal with as well. [08:46.920 --> 08:50.720] Like, quick question, actually. Has anybody ever taken over a code base that they may be [08:50.720 --> 08:56.000] looking at and gone, what the heck is this? Okay, so you know the point of this slide, [08:56.000 --> 08:58.800] right? You know why we're saying this. We know this is important. Now, let's stop that [08:58.800 --> 09:03.400] from happening to the next generation of very pained developers, right? Let's stop that happening. [09:03.400 --> 09:09.320] Yes, so write tests. And if all these reasons haven't convinced you, there's often maybe [09:09.320 --> 09:13.960] a team lead or a boss or somebody else who's telling you to write tests. In most cases; [09:13.960 --> 09:21.120] there are always, of course, exceptions. Ah, okay. Wow. This is annoying. 
So at the end [09:21.120 --> 09:26.240] of the day, we're all writing tests; if it's not for ourselves, it's for someone else. [09:26.240 --> 09:29.600] And even though we're now sort of happily all adding tests, we also have to [09:29.600 --> 09:35.200] sort of sketch a problem scenario here. And this problem is that as projects evolve and [09:35.200 --> 09:41.200] grow, our tests also evolve and grow. But the thing is that we reflect a lot on, [09:41.200 --> 09:44.640] and spend a lot of time, keeping our production code clean and well monitored. We [09:44.640 --> 09:49.000] have lots of metrics, where on the other hand for tests, what you can see on long-living [09:49.000 --> 09:52.800] projects is that sometimes you just get tests that are nothing more than a blank setup and [09:52.800 --> 09:59.720] teardown and some mocking going on, because the functionality already moved long ago. [09:59.720 --> 10:03.120] It gets to the point that test code is often not monitored. Test code is sort of [10:03.120 --> 10:08.920] the kid that didn't get all the attention it needed. So there is still one metric [10:08.920 --> 10:15.920] for testing. What do you think is the most used metric for test code? Yes, [10:15.920 --> 10:20.920] yeah, we sort of gave it away already in the intro, but yes: code coverage. [10:20.920 --> 10:25.680] Code coverage tells you how much of the code is executed when you run the test suite. And [10:25.680 --> 10:30.120] I personally really like code coverage because it already helps you write more and better [10:30.120 --> 10:36.600] tests. And I want to go through a simple example here to show you how it can already help you. [10:36.600 --> 10:42.160] So here we have a submit method. So this is the Python guy. I'm the Java guy. Yeah, he [10:42.160 --> 10:48.280] said simple example, but I don't, I don't think so. Yeah. 
So the context is you are at [10:48.280 --> 10:54.680] the conference and you have a service where you can submit proposals. You can only have, [10:54.680 --> 10:59.080] you can't have more than three, three or more over proposals and you can submit after the [10:59.080 --> 11:05.240] deadline. If you do that, there will be a failure and otherwise you will get success. So quite [11:05.240 --> 11:09.360] a simple method with everything as a parameter just to make it easy to explain. So if you [11:09.360 --> 11:13.680] would take method coverage, method coverage is the simplest coverage metric we can get [11:13.680 --> 11:19.480] which checks is this method coverage as or no, we can add one simple test called a test [11:19.480 --> 11:25.480] X which submits a proposal. There are no open proposals, which is good. And we have a deadline [11:25.480 --> 11:32.160] that's 999 seconds in the future. So great. Now we can get a step further. We can get [11:32.160 --> 11:36.760] into statement coverage and with statement coverage, we check, well, if each statement [11:36.760 --> 11:42.400] was executed and now we see, hey, we didn't cover our unhappy flow. So we need to add [11:42.400 --> 11:47.840] another test. In this case, we add another test which has five over proposals, which [11:47.840 --> 11:54.160] means this check evaluates the true and we have a negative scenario. Now we can even [11:54.160 --> 11:59.440] go one step further through, for example, condition coverage. And with condition coverage, [11:59.440 --> 12:05.320] we check if each Boolean sub-expression has been evaluated to both true and false because [12:05.320 --> 12:09.680] what we don't know now is whether our deadline check is actually working. We just know that [12:09.680 --> 12:14.640] it returns false, but we haven't seen it return true yet. So we add one more test now with [12:14.640 --> 12:21.440] a deadline that is 999 seconds in the past. And now we have three tests. 
And this is already [12:21.440 --> 12:26.080] why I like code coverage so much, because it really helps you write proper tests. [12:26.080 --> 12:31.120] Let me get on to the good parts here. As I said, it helps you write [12:31.120 --> 12:36.160] better and more tests. Code coverage is really easy and cheap to measure. In, I think, [12:36.160 --> 12:40.200] most of the languages, it's just a matter of instrumenting the code. You run the test [12:40.200 --> 12:43.920] suite and you get a nice report out of it that everybody can quickly see, and you can [12:43.920 --> 12:51.280] quickly see the pain points of where you're lacking in testing. But to go a bit further: [12:51.280 --> 12:56.640] as I mentioned, it shows you what you didn't test. But the only guarantee, [12:56.640 --> 13:00.560] and I'm going to get to the bad parts next, is that what [13:00.560 --> 13:06.160] you did test didn't crash. It doesn't actually guarantee anything about functionality, because [13:06.160 --> 13:12.080] code coverage can actually be quite misleading. It doesn't guarantee any test quality. So [13:12.080 --> 13:16.400] if I take this method, for example: this is a unit test, a valid unit test, and this test [13:16.400 --> 13:21.160] generates coverage. It calls a method, but there is no assertion on the result, which [13:21.160 --> 13:26.240] makes this test, for example, generate 80% coverage, yet the test actually only guarantees [13:26.240 --> 13:30.520] that the method doesn't crash. It doesn't tell us whether it returned true, false [13:30.520 --> 13:36.760] or anything. And this is the pain point of code coverage, which brings us to something [13:36.760 --> 13:40.240] nice which Max told me about, which is called Goodhart's law. So can you maybe explain [13:40.240 --> 13:47.120] a bit about that? Can I grab your clicker? Can I explain about [13:47.120 --> 13:55.240] Goodhart's law? 
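The misleading case described above might look like this sketch: a test that executes the code, so coverage tools count every line, but never asserts on the result. The names here are hypothetical, not from the talk's slides.

```python
def is_adult(age):
    # Production code under test.
    return age >= 17

def test_is_adult_no_assertion():
    # This call executes every line of is_adult, so coverage tools
    # count it, yet it only proves the call doesn't crash. There is
    # no assertion on the returned value, so a broken implementation
    # that returned the wrong Boolean would still pass.
    is_adult(21)
```

A coverage report would show `is_adult` as fully covered, which is exactly why coverage alone says nothing about test quality.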
No, sorry. I can't. Just kidding. Okay, so when a metric becomes a target, it [13:55.240 --> 13:59.240] ceases to be a good metric. So quick question, has anyone ever written a unit test just to [13:59.240 --> 14:03.920] get coverage up rather than because the test was useful? Come on, let's be honest. This [14:03.920 --> 14:10.600] is the safe space. Okay. Microphone, okay. Hello, everybody. Welcome to the live stream. [14:10.600 --> 14:17.040] This is our radio announcer voice. Right. So this is something, I'll be honest, I've [14:17.040 --> 14:20.800] done this. We now know a lot of people in the room have done this. But that's what we don't [14:20.800 --> 14:23.920] want to have with code coverage. It's supposed to tell us something about our code. But if [14:23.920 --> 14:27.840] instead we turn that into a target, that can really limit, you know, [14:27.840 --> 14:31.920] the kind of useful tests that we actually create. And that leads to a few quite big questions [14:31.920 --> 14:37.720] that we do genuinely care about. So I'll wait for that photo if you. Cool. Sorry, I'm very [14:37.720 --> 14:41.680] audience participation. I'm very sorry. So the next question that we ask there is how [14:41.680 --> 14:45.480] do we know if our tests are high quality? How do we know if these tests are actually [14:45.480 --> 14:50.620] good quality tests? We test them. We test them. Great, great answer. I've got a further [14:50.620 --> 14:56.000] follow-up question for you. How can we understand what our tests are really doing? Same answer, [14:56.000 --> 15:04.680] if anyone, I see a hand. I literally had a code base where I could delete half the tests [15:04.680 --> 15:09.320] and nothing changed. And they all, yeah. So, delete the other half then. Hello. [15:09.320 --> 15:29.200] Yes. So just for the live stream, I'll just repeat that because that's a really good [15:29.200 --> 15:32.240] point. 
I won't repeat the swearing, but I do understand and appreciate the, you know, [15:32.240 --> 15:37.080] the emotion behind it. If you, if you end up, you know, shipping some code that does, [15:37.080 --> 15:40.520] does not do what it's supposed to do, you end up with users getting very angry at you. [15:40.520 --> 15:44.240] And yeah, that's a problem, right? That's going to be an issue. And that is a way of [15:44.240 --> 15:47.680] finding out, but I guess the real question we're asking here is how do we know if we [15:47.680 --> 15:52.240] can trust our tests? That's really the crux of this, this problem, right? And so as it [15:52.240 --> 15:57.880] turns out, the very, the very famous Roman poet Juvenal, actually in 100 AD after he'd [15:57.880 --> 16:02.680] had a few drinks, he was able to summarize this in such a beautiful way. And this was [16:02.680 --> 16:06.280] something that maybe wasn't appreciated at the time because, you know, obviously he [16:06.280 --> 16:10.720] was talking about mutation testing 2,000 years before it was relevant. But I will mention [16:10.720 --> 16:15.800] it here. It's "who watches the watchers," right? And this is the question. Who's testing our [16:15.800 --> 16:20.000] tests? Who cares about that? How do we actually gain trustworthiness for our tests? [16:20.000 --> 16:23.080] And I see there's, there's people who've had bugs in production, people who understand [16:23.080 --> 16:28.000] here that this is a really big deal. Luckily, we have a two-word answer for you, which is [16:28.000 --> 16:36.640] the reason we're all in this room. Mutation testing. So, spot the odd one out. You might [16:36.640 --> 16:40.280] see here, that's, that's Henry. He's having a great time, but maybe he shouldn't be sitting [16:40.280 --> 16:44.160] in a row of pigeons. 
But more importantly right now, I'll just explain the basic premise [16:44.160 --> 16:48.040] and then Paco here will explain in a little more detail how it's actually kind of done. [16:48.040 --> 16:51.400] So first of all, mutation testing, this is a really quick summary. What you do is you [16:51.400 --> 16:54.520] introduce some faults in your code, so just a few little things that you change. And for [16:54.520 --> 17:00.280] each of those little changes, that's a mutant version of your code. Once you've got that, [17:00.280 --> 17:04.960] you run your test suite against those mutant versions of your code. And if they fail, awesome, [17:04.960 --> 17:10.240] because that means that, awesome, because that means that your, your tests have actually [17:10.240 --> 17:13.880] picked up that change. And that's a good thing, right? That's, that's good. We want those tests [17:13.880 --> 17:18.680] to fail if our code changes, right? But if they don't fail, that's a bad time, because [17:18.680 --> 17:23.000] that means those tests didn't test that change. It didn't test for that. And so that's something [17:23.000 --> 17:27.600] that could have made it to production. So what mutation testing kind of gives you is [17:27.600 --> 17:32.080] a way to evaluate that test quality. But this is very abstract. So let's look at penguins. [17:32.080 --> 17:36.200] I like penguins. So Henry here, he's a great example and he's going to, he's going to bring [17:36.200 --> 17:40.720] all this home. So I was kind of unfamiliar with the topic, so I kind of created some analogies [17:40.720 --> 17:44.000] with penguins that really helped me. So I'll share those with you. So the way I kind of [17:44.000 --> 17:47.880] imagine my software is: we do lots of stuff with messaging. And so I imagine software [17:47.880 --> 17:52.240] that works properly to be like a pigeon or a dove, like a bird that can fly. 
I've used [17:52.240 --> 17:56.040] a dove here because Paco has a deadly fear of pigeons. He's terrified of them. [17:56.040 --> 17:57.360] Not fear. Vendetta. [17:57.360 --> 18:00.960] He has a personal vendetta against pigeons. Sorry. He doesn't like them. So I've used [18:00.960 --> 18:04.960] a dove here. But ideally we want something where I can tie a message to the bird's leg [18:04.960 --> 18:08.600] and it can go and deal with that message for me, right? So it can go, it can go do something [18:08.600 --> 18:14.080] like that. Now, one of the key features of penguins is that they're not very good at flying, [18:14.080 --> 18:18.320] right? I think we, can we all agree that that's probably not the best. If you want to tie [18:18.320 --> 18:22.120] a message to a bird's leg and get it to deliver it, a penguin might not be the bird you choose, [18:22.120 --> 18:26.800] unless you're maybe delivering something underwater. So this is the kind of example here where [18:26.800 --> 18:30.160] we've got a bird, but it's not the kind of thing that performs the way we expect [18:30.160 --> 18:33.720] it to. And this would cause some serious problems if we tried to use this kind of thing in production. [18:33.720 --> 18:37.520] If we wanted to send a message via a penguin, we're going to have a tough time, right? [18:37.520 --> 18:41.240] So Paco, I'd like you, if possible, to explain this in a way that makes more sense than what [18:41.240 --> 18:42.240] I just did. [18:42.240 --> 18:43.240] Good luck. [18:43.240 --> 18:50.720] We have one mic. It's a bit, it's a bit, yeah. So let's get into the process of mutation testing. [18:50.720 --> 18:55.240] The first step of mutation testing: what Max just taught you is about introducing [18:55.240 --> 19:00.960] faults. So you can introduce faults manually, but doing it manually [19:00.960 --> 19:04.160] means it's a lot of work and it's usually also not that reproducible. 
You don't [19:04.160 --> 19:08.200] want to do it manually. We want to do this in an automated manner. And this is where [19:08.200 --> 19:12.120] mutation testing comes in. In the first step of mutation testing, we're going to generate [19:12.120 --> 19:18.160] mutants. And each mutant is just a very slightly changed version of the production code. Mutation testing [19:18.160 --> 19:22.760] works with the concept of mutators. And mutators are the ones making these very small [19:22.760 --> 19:30.520] changes. So what we have in this case: we have a perfectly fine dove, which is the production [19:30.520 --> 19:36.640] code. And then at the end of it, we have a mutator, which makes a tiny change [19:36.640 --> 19:40.160] that kind of transforms this into Henry, our penguin who can't fly, and we want our [19:40.160 --> 19:44.800] software to fly. So this would be a bad thing. So how does it look? Because this is still [19:44.800 --> 19:50.720] a bit abstract. I'm going to give you some examples. This would be an example here. [19:50.720 --> 19:55.200] So for the Dutch, and I think for other countries as well, you have to be 17 years or older [19:55.200 --> 20:00.080] to apply for a driving licence. This could be code that's in your code base, which will [20:00.080 --> 20:06.040] fly, which is good. Now, for the mutant, the entire code base stays the same, and [20:06.040 --> 20:09.800] just this little piece changes. So here we inverted the logic. This is, of course, [20:09.800 --> 20:15.120] a bug. This is something we don't want to make it into production. And actually, [20:15.120 --> 20:20.240] just from this single line, we can already generate quite some mutants, because we can [20:20.240 --> 20:26.160] not only invert the conditional operator, we can also change the conditional boundaries. 
[20:26.160 --> 20:31.520] So this means that we now have "age larger than 17", which is a very nice bug that would [20:31.520 --> 20:36.520] force us to test the edge cases, the, the famous off-by-one errors, whether we forgot [20:36.520 --> 20:42.360] the equals in our conditional check. This, this will help you find that one. But [20:42.360 --> 20:45.920] it can also just always return true or false. We can generate quite some mutants for this, [20:45.920 --> 20:49.440] and we can do the same for, for example, mathematical operations. We can turn each [20:49.440 --> 20:57.040] plus into a minus, each multiplication into a division, etc. Furthermore, we also have [20:57.040 --> 21:01.760] the ability to remove statements. So in this case, we have a method that adds a published [21:01.760 --> 21:06.680] date to some object, and we can also just remove the whole setter. And now this means [21:06.680 --> 21:12.040] that we have a bug in which we don't set this attribute anymore, which is something that, [21:12.040 --> 21:16.760] of course, we don't want to make it to production. What's important to note here is that with [21:16.760 --> 21:20.680] mutation testing it's always important that the code actually compiles, because we're not testing [21:20.680 --> 21:25.720] the compiler. We're testing the code. The compiler is definitely out of scope here. Now at the [21:25.720 --> 21:31.960] end of step one, we have a lot of Henrys. We have a lot of mutants. And now Henry is [21:31.960 --> 21:41.120] going to try to fly. So he already got his wings ready to try to fly. And now for each [21:41.120 --> 21:46.040] Henry, we're going to run the test suite. And if this test suite fails, as Max already [21:46.040 --> 21:49.960] mentioned, then it's good, because then we exposed Henry for what he is, which [21:49.960 --> 21:55.240] is just a penguin, something that can't fly. So this is great. 
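The mutators described above can be illustrated with tiny hand-written Python variants of the driving-licence check. The exact mutants a real tool produces depend on the tool; these are made up for illustration.

```python
# Original production code: you may apply at 17 or older.
def may_apply(age):
    return age >= 17

# Mutant: inverted conditional operator.
def may_apply_inverted(age):
    return age < 17

# Mutant: changed conditional boundary (the classic off-by-one).
# Only a test at exactly age 17 can kill this one.
def may_apply_boundary(age):
    return age > 17

# Mutant: condition replaced by a constant.
def may_apply_always_true(age):
    return True
```

A suite that checks the boundary (`may_apply(17)` is true, `may_apply(16)` is false) kills all three mutants; a suite that only tries ages far from 17 would let the boundary mutant survive.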
The not-so-happy scenario [21:55.240 --> 22:00.120] is where the tests passed, which means that Henry made it into production. And, [22:00.120 --> 22:06.280] well, assuming that it also got through the PR, of course, we have more than just tests, [22:06.280 --> 22:10.320] that is a problem, because Henry is not supposed to fly, and then we have a bug in production. [22:10.320 --> 22:16.000] So this is something that you don't want. So this is the theory of mutation testing. [22:16.000 --> 22:19.880] And now, Max, you can tell a bit more about the frameworks. [22:19.880 --> 22:27.520] Sure. It works for me. Alrighty. So first of all, I just want to say I'm so proud of [22:27.520 --> 22:31.360] this prompt. I don't know why DALL·E chose this, but I'm really happy. Like, I think I typed [22:31.360 --> 22:37.480] in "penguin trying to be a pigeon". And it came up with this. And I'm very happy. Okay. So [22:37.480 --> 22:42.360] moving on, yeah, frameworks. So this is going to get a little bit more specific to, you [22:42.360 --> 22:47.960] know, to actually implementing this stuff. So anyone here is a Python developer? Heck [22:47.960 --> 22:53.200] yeah. All right. Awesome. So I'm going to show you what I did in Python. So as you [22:53.200 --> 22:57.160] can see, you know, Paco's a Java developer, he'll explain Java in a sec. But I'll just [22:57.160 --> 23:01.200] show you the kind of basic concepts, but using my code and using what I did. So there's [23:01.200 --> 23:05.600] two kind of main supported packages that you can use in Python. It's not like, you know, [23:05.600 --> 23:08.560] in Java, where there's like an enterprise thing you can get. In Python, it's very community [23:08.560 --> 23:12.760] supported. So you're not, you know, you're not going to get big products. But what we [23:12.760 --> 23:17.280] do have are these kind of like nice and supported repos for mutation testing, which have just [23:17.280 --> 23:21.800] these packages. 
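The killed-versus-survived logic described above can be sketched as a toy loop. Real mutation testing tools automate all of this; every name in this sketch is made up, and the "weak" test suite is deliberately missing a boundary check so that one mutant survives.

```python
# Production behaviour: may apply for a licence at 17 or older.
original = lambda age: age >= 17

# Two hand-made mutants of that one-liner.
mutants = {
    "inverted operator": lambda age: age < 17,
    "boundary change":   lambda age: age > 17,
}

# A weak test suite: it never checks the boundary value 17.
tests = [
    lambda impl: impl(30) is True,
    lambda impl: impl(5) is False,
]

def run_tests(impl):
    """True if every test passes against the given implementation."""
    return all(test(impl) for test in tests)

for name, mutant in mutants.items():
    # If any test fails, the mutant is killed; if all pass, it survived.
    status = "survived" if run_tests(mutant) else "killed"
    print(f"{name}: {status}")
```

The inverted-operator mutant is killed, but the boundary-change mutant survives, which is exactly the signal that the suite is missing a test at `age == 17`.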
So I am not a professional, you know, in this, I'm not a doctor, I'm not [23:21.800 --> 23:25.720] a lawyer, I'm not a professional financial advisor. I'm just a person who, you know, has [23:25.720 --> 23:29.920] a certain opinion. And so, my opinion of those two frameworks I showed you, there's mutmut [23:29.920 --> 23:36.200] and Cosmic Ray. And personally, I prefer mutmut, because it's easy to get going. Oh, angry, [23:36.200 --> 23:44.240] angry face, shaking heads. You don't like mutmut. We will talk later. [23:44.240 --> 23:52.000] So if we have time, we'll have a third presenter very shortly. So for now, while I've still [23:52.000 --> 23:57.000] got the mic, while I'm still, you know, while I'm still here, we'll talk about mutmut. [23:57.000 --> 24:01.520] And so this framework is quite simple to use. You know, the reason I kind of like it [24:01.520 --> 24:04.720] is because it's very much: you install it and you run it. You know, there's a bit of config [24:04.720 --> 24:07.960] you can do. But really, it's quite simple just to get an idea of your code base and [24:07.960 --> 24:12.360] what's going on. So I want to show you this slide. This is the SDK that [24:12.360 --> 24:16.240] I maintain. And I'm showing you this because it's what I've applied my mutation testing [24:16.240 --> 24:21.880] to, so it's where my examples come from. But basically, what we do is when we go here, [24:21.880 --> 24:27.280] I had this locally, first of all. So I installed mutmut with pip install. It's that simple. [24:27.280 --> 24:30.080] It's a Python package. It's what we do. If you went to my talk on malware earlier, you [24:30.080 --> 24:36.040] know why that's a bad idea, but I did it. So after we do that, we've got mutmut run, [24:36.040 --> 24:40.560] which just runs those tests for you. So when we do that, I'll show you what my output was. [24:40.560 --> 24:44.520] So when I ran this myself, I actually got a whole lot of this output.
But really what's [24:44.520 --> 24:48.280] important here is that, first of all, it ran my entire test suite. And the reason it ran [24:48.280 --> 24:52.560] my entire test suite is just to check how long that's supposed to take and just to make [24:52.560 --> 24:55.880] sure everything does work as expected, because there's various types of mutants to do with [24:55.880 --> 25:01.160] timeouts as well that we might want to consider. After it's done that, what it will do is it [25:01.160 --> 25:06.520] will generate mutants based on lines of code in my code base. That's what it will do. And [25:06.520 --> 25:11.120] once it's done that, it will run my tests against those. So there's a few different types, and [25:11.120 --> 25:14.920] it can characterize them like this. So the first type is mutants that we've caught, not [25:14.920 --> 25:18.720] killed. We never kill a penguin. We love penguins. We catch them. We've caught them and put them [25:18.720 --> 25:23.680] back into the zoo. In this case, we've managed to say, yep, our test failed. [25:23.680 --> 25:28.760] That's great. But it could be the case where the mutant's timed out. So it's taken way [25:28.760 --> 25:32.240] too long for this code to run, or it's taken enough time that we're not [25:32.240 --> 25:37.320] feeling so great about that code. Alternatively, we might end up in a situation where the mutant [25:37.320 --> 25:41.520] survived and made it through our test code. In that case, it corresponds to a bug that [25:41.520 --> 25:46.920] might make it to production. So when I ran this on my particular SDK, what I saw was [25:46.920 --> 25:53.000] this: it created 682 mutants, versions of my code with changes in them. [25:53.000 --> 26:00.080] And of those, it managed to catch 512, but it missed 170 of them. Now, whether [26:00.080 --> 26:03.280] that's a good number or a bad number, we'll talk about later.
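Those numbers translate into a mutation score with simple arithmetic:

```python
caught = 512      # mutants where at least one test failed
survived = 170    # mutants that slipped through the suite
total = caught + survived
score = caught / total
print(f"{total} mutants, mutation score {score:.1%}")  # 682 mutants, about 75%
```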
But what's important now [26:03.280 --> 26:08.360] is let's just look at some of those mutants. So first of all, the ones that we actually [26:08.360 --> 26:12.120] did catch. Here's a couple of examples. So here's a line where basically we say: here [26:12.120 --> 26:15.840] are some valid message channels. So for our messages API, here are some valid [26:15.840 --> 26:20.640] ways you can send messages, right? But what's important here is that this mutant basically removed the ability [26:20.640 --> 26:25.440] to send an SMS. And so when I tried to test that, it failed, which is what we want to [26:25.440 --> 26:30.960] see. Here's another one. Again, this is Python. So if you're a Java dev, don't worry, we'll [26:30.960 --> 26:36.920] look after you soon. And here's another one. We've got a decorator here, which basically [26:36.920 --> 26:41.200] runs this method. And we can see when we remove that, that will never happen. This is actually [26:41.200 --> 26:45.320] through Pydantic, if anyone has used that before. But basically, it means that we're [26:45.320 --> 26:49.040] not going to round a number anymore. And so when we test for that, a number doesn't get [26:49.040 --> 26:54.880] rounded and we catch that. But that is not really very interesting. That doesn't tell [26:54.880 --> 26:57.960] us anything. That tells us about this much, right? It doesn't tell us much at all. And [26:57.960 --> 27:01.400] the reason for that is that we kind of know that our tests work. We kind of know that [27:01.400 --> 27:05.720] our tests work for that. Thank you very much. I'll do the M&M thing. So we kind of know [27:05.720 --> 27:09.800] that our tests work for that. And so what's kind of useful is to see, if we do mutmut [27:09.800 --> 27:16.400] show, we can see the mutants that we didn't catch. We can also do mutmut html, which [27:16.400 --> 27:21.560] shows us essentially an HTML coverage output as well.
So we can see in a list all of the [27:21.560 --> 27:26.400] mutants that we didn't catch. So with mutmut show, on that code base that I just [27:26.400 --> 27:30.800] showed you, we can see the 170 mutants that survived. It shows you the indices of these, [27:30.800 --> 27:36.600] and then we can manually specify the ones we want to look at. So here we can see, for [27:36.600 --> 27:39.880] example, that we changed the authentication method to fail. And we can see in this case [27:39.880 --> 27:44.760] we caught that, because we did a test for authentication and it failed, so that's great. [27:44.760 --> 27:48.680] But more important is that you get this HTML output, which you can then explore. You [27:48.680 --> 27:53.440] can explore every method, every sort of module that you have. You can explore all the methods [27:53.440 --> 28:00.040] inside of there and which ones were and weren't caught. And you do that with the html command. [28:00.040 --> 28:03.160] So to do that, I'll just show you: this is a mutant that we did not catch. And I want [28:03.160 --> 28:06.080] to show you why we didn't catch it and what it's going to do. And I'll just do that for [28:06.080 --> 28:11.200] a few, just so you get some context, if that's cool. So first of all, what this mutant did [28:11.200 --> 28:16.040] was it renamed the logger. Now, I think logging is out of scope of my test code, so personally [28:16.040 --> 28:19.880] I don't care too much about anything related to logging. So I don't mind if this one slips [28:19.880 --> 28:26.840] through. Here's another one. In this case, we've slightly changed the value [28:26.840 --> 28:30.880] of a constant. This is just part of a function signature. And again, we don't care about [28:30.880 --> 28:36.040] this that much. It isn't something that I really mind about. What's more important, though, [28:36.040 --> 28:40.800] is this mutant here.
Because this is from our client class, where we instantiate all [28:40.800 --> 28:45.920] of our different API classes. And you can see we actually set voice to None, so we completely [28:45.920 --> 28:51.640] remove that instantiation. And our tests are still passing. So the reason that actually [28:51.640 --> 28:56.520] still works, our code base still works even though this isn't testing that case, is because [28:56.520 --> 29:01.440] our tests actually test the voice API separately. They call it manually. But if our client [29:01.440 --> 29:04.560] is calling it like this, maybe we should have a test for this as well. So this tells [29:04.560 --> 29:09.320] me, hey, maybe my test suite does need to be expanded. Does that make sense? I'm seeing [29:09.320 --> 29:14.040] some very, very like, yeah, yeah, that makes sense. I like it. Awesome. Okay, so if you [29:14.040 --> 29:16.680] are a Python dev, this isn't the end of the talk, by the way. You know, we've [29:16.680 --> 29:19.760] got some more context and we'll show you about CI. But if you are interested, then, you know, [29:19.760 --> 29:25.360] feel free to scan this. You've got like four seconds before I move slides. And as I move [29:25.360 --> 29:30.640] slides, in very slow motion I'll be passing over this microphone. Because this was just [29:30.640 --> 29:36.720] Python, of course. And I think there are more non-Python devs here. Just not Python. [29:36.720 --> 29:42.280] Let's see. We, of course, have more frameworks. There are frameworks for more languages out there, [29:42.280 --> 29:46.760] but I think these are the most important ones, that I like personally. And pretty much the [29:46.760 --> 29:52.240] only really good one for Java is Pitest. And we also have Stryker. And Stryker is one that [29:52.240 --> 29:57.080] supports quite some languages. It supports JavaScript, C#, Scala. Of course, it doesn't [29:57.080 --> 30:02.880] do this in one tool.
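A minimal sketch of that surviving mutant, with class names simplified from the real SDK (`Voice` and `Client` here are illustrative stand-ins): the voice API is tested directly, so nulling out the client's attribute survives until a client-level test is added.

```python
class Voice:
    def call(self, number):
        return f"calling {number}"

class Client:
    def __init__(self):
        self.voice = Voice()   # the mutant replaced this with: self.voice = None

# Tests that exercise Voice directly still pass against that mutant...
def test_voice_directly():
    assert Voice().call("123") == "calling 123"

# ...so add a test that goes through the client, which the mutant cannot survive:
def test_client_wires_up_voice():
    client = Client()
    assert client.voice is not None
    assert client.voice.call("123") == "calling 123"

test_voice_directly()
test_client_wires_up_voice()
```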
Each one has its own dependencies, because you can't have one solution [30:02.880 --> 30:07.800] for all. And what I particularly like about it is that it supports JavaScript. Mutation testing brings [30:07.800 --> 30:13.640] this kind of back-end-heavy tooling to the front end. Front-end code, I think, can [30:13.640 --> 30:18.760] often use some love when it comes to testing. And this also brings the testing frameworks [30:18.760 --> 30:23.200] and the testing quality more to the front end. So that's what I really, really like. [30:23.200 --> 30:28.400] But we wanted to discuss a bit more. Max already sort of introduced it. So what is [30:28.400 --> 30:35.440] a good mutation score? We had Goodhart's law, where we sort of saw that code coverage [30:35.440 --> 30:41.560] can also lead to people implementing tests just to improve coverage, which sort of [30:41.560 --> 30:45.440] defeats the purpose. You're doing it just for the metric, not for the actual purpose. [30:45.440 --> 30:51.920] So how does this work with the mutation score? Now, first, here's a picture of how a Pitest [30:51.920 --> 30:57.920] report looks. Not to bash on Python, but it's much prettier and much clearer. [30:57.920 --> 31:01.640] What is particularly interesting about this one is that it shows you both the line coverage [31:01.640 --> 31:05.960] and the mutation coverage. We can ignore the test strength column. And this shows us the sweet spots [31:05.960 --> 31:10.840] in a report. Because at the end, we have generated a lot of mutants. We have a lot of classes. [31:10.840 --> 31:13.760] And we only have very little time. So where are we going to look when investigating this [31:13.760 --> 31:18.640] report to see where the hotspots are? And the one that's the least interesting here is the [31:18.640 --> 31:23.160] notification service. The notification service also doesn't have any coverage.
And if there [31:23.160 --> 31:26.280] is no coverage, then the mutants are also not interesting, because you have a bigger problem [31:26.280 --> 31:31.000] here, which is that you don't have tests at all for this. Then you have a choice. You have [31:31.000 --> 31:34.280] the proposal service and proposal service 2. Now, the fact that they are named similarly [31:34.280 --> 31:39.480] is because they're from another example. But proposal service 2 is the one that has 100% [31:39.480 --> 31:43.600] coverage and yet it didn't kill a single mutant. And this is the sweet spot, because this means [31:43.600 --> 31:48.200] that we have code that looks well tested. Or at least there are tests covering this [31:48.200 --> 31:52.520] piece of code, but not a single bug was caught. So this deserves some attention, [31:52.520 --> 31:56.600] because it means that we didn't fully test this. So these are the hotspots when you [31:56.600 --> 32:00.960] open a report. The ones with high line coverage and low mutation coverage, those are [32:00.960 --> 32:04.760] the ones you really want to go through. Those are the ones that give you the findings to [32:04.760 --> 32:08.960] go to your team and say, hey, see, we need mutation testing, because here, just these [32:08.960 --> 32:14.440] two classes alone already showed me that we need to improve our quality. Now, back [32:14.440 --> 32:25.440] to the score. So in the example we had, we managed to kill 512 out of 682 mutants, which is about [32:25.440 --> 32:36.520] a 75% score. Now, the question is, is this a good score? Yes, yes, the golden answer: [32:36.520 --> 32:41.960] it depends. I love that answer. We already saw that 100% doesn't make sense. Things like [32:41.960 --> 32:46.600] logging, and there are more things like generated code, et cetera, things that you don't necessarily [32:46.600 --> 32:50.920] want to test, even though there are mutants generated for them.
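The hotspot rule from that report boils down to: high line coverage, low mutation coverage. A small sketch with invented numbers (the class names mirror the example; the thresholds are arbitrary illustrations, not part of any tool):

```python
# (class name, line coverage %, mutation coverage %) -- invented example values
report = [
    ("NotificationService", 0, 0),     # no tests at all: a different problem
    ("ProposalService", 80, 70),       # reasonably healthy
    ("ProposalService2", 100, 0),      # the sweet spot: covered but toothless
]

hotspots = [
    name for name, line_cov, mut_cov in report
    if line_cov >= 80 and mut_cov < 50   # well covered, yet mutants survive
]
print(hotspots)   # ProposalService2 is where to spend your review time
```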
Now, there are a couple [32:50.920 --> 32:54.600] of things you can, of course, do. Depending on the language and the framework [32:54.600 --> 32:59.600] you use, you can tweak the mutation testing framework quite a bit. For example, Pitest [32:59.600 --> 33:04.600] actually, out of the box, already ignores and doesn't mutate any logging lines. [33:04.600 --> 33:10.400] All the big logging frameworks are known to the tool. So anything that goes to SLF4J, it doesn't [33:10.400 --> 33:14.840] mutate. So it also doesn't appear in your report, which is quite nice. And you can easily [33:14.840 --> 33:19.440] add things: if you have a custom metrics facade somewhere, also typically something [33:19.440 --> 33:24.960] you don't want to cover in unit tests, you can add that as well. So the thing here is that [33:24.960 --> 33:28.480] mutation testing is not really about a score you want to achieve. It's more that the report can [33:28.480 --> 33:32.840] be interesting to look at and gives you sort of the interesting spots. And once you've completely [33:32.840 --> 33:36.160] set it up nicely and you're familiar with the report, you can maybe start looking at the [33:36.160 --> 33:41.200] score, but it definitely shouldn't become an 80% goal or something, like it was with code [33:41.200 --> 33:49.800] coverage. Just go through the report instead. So now we've sort of discussed all [33:49.800 --> 33:54.880] the tools you need. We have discussed the frameworks. We have discussed the [33:54.880 --> 34:03.200] technology. And now it's time, of course, for you to fly. So, how would you [34:03.200 --> 34:08.240] get started on this? And the thing I think that's important here is if you want to start, [34:08.240 --> 34:13.920] so you now think, oh, this was a great talk, I want to start with mutation testing: depending [34:13.920 --> 34:18.360] on the size of your project, it might be wise to just start with a single package.
I've done this [34:18.360 --> 34:23.520] on projects that are a couple of, say, a thousand lines big. And even though [34:23.520 --> 34:27.640] in Max's example we had 682 mutants, this can, depending on the kind of code you [34:27.640 --> 34:31.880] have there, easily grow to tens of thousands of mutants, which can be quite slow. It can [34:31.880 --> 34:35.080] also be that there's something weird in your code base that doesn't really work well with [34:35.080 --> 34:41.320] mutation testing, or something that's just extremely slow. An example that I ran into [34:41.320 --> 34:48.360] is good to keep in mind, so let me take a quick sidestep. The mutation testing framework [34:48.360 --> 34:53.800] also measures, at the beginning, for each individual test which code it covers. So there's a nice [34:53.800 --> 34:59.520] mapping from production code to the tests. This helps us optimize, because if we were [34:59.520 --> 35:03.440] to run the entire test suite, all the tests, for every single mutant, it's going to take [35:03.440 --> 35:08.000] forever. Instead, because we know the coverage, we can see: if we mutate this one line, [35:08.000 --> 35:13.280] we know which tests cover it. So we only need to execute those few tests. But what [35:13.280 --> 35:16.880] if you have tests that actually cover half your code base? For example, one of the [35:16.880 --> 35:21.440] things you can do in Java, if you're doing things with Spring, is you can actually boot [35:21.440 --> 35:25.520] up the entire Spring application and start doing acceptance tests from your unit tests, [35:25.520 --> 35:30.080] which is not necessarily the worst thing to do, but you now have a [35:30.080 --> 35:34.640] very slow test that does cover half your code base and that will be executed for every single [35:34.640 --> 35:38.360] mutant. So these are things you want to get rid of.
You want to exclude this acceptance [35:38.360 --> 35:44.120] test, because otherwise you're going to be waiting endlessly. So my point about starting [35:44.120 --> 35:47.920] locally and starting small was: start with just one package. Start with the utility package [35:47.920 --> 35:52.080] to see if it works, see if the report works for you. And then from there, you can [35:52.080 --> 35:58.160] add more packages, and you can also see, oh, now it's taking 10 times as long. Why is this? [35:58.160 --> 36:03.960] And you can find the painful packages there. So as I mentioned, you can exclude some tests, [36:03.960 --> 36:07.960] and there are also often candidates, certain pieces of code, you might want to exclude. [36:07.960 --> 36:13.320] For example, there's no use in testing generated code, but it might also be that you have certain [36:13.320 --> 36:19.640] domain packages that contain just all your domain objects, your POJOs, which are just [36:19.640 --> 36:23.880] setters and getters, something that you also typically want to exclude in your coverage [36:23.880 --> 36:31.440] report. You might want to exclude these from mutation testing as well. [36:31.440 --> 36:36.680] And now that's done. So we talked about running it on your machine. We [36:36.680 --> 36:43.720] can also do this in the cloud, of course. Thank you. So as you can see, there's a pigeon [36:43.720 --> 36:47.760] on the slide, and Paco, as we've said, has a personal vendetta, so I've taken over this [36:47.760 --> 36:53.760] section. So here we can see that we're going to run off our machine. So why would you want [36:53.760 --> 36:57.200] to run off your machine rather than on your machine? Any questions? Any ideas? [36:57.200 --> 36:59.800] What happens in the background? [36:59.800 --> 37:03.440] Yes. So what happens in the background is what was said there. Any other reason you [37:03.440 --> 37:08.640] might want to run non-locally? No. I've got a couple.
Oh, oh, hand. [37:08.640 --> 37:09.640] CI. [37:09.640 --> 37:12.760] CI. Yeah, you might want it in your CI system. In fact, that's what we'll be showing you. [37:12.760 --> 37:19.040] So, foreshadowing. I like it. So yeah, it takes some time. And if you're using a CI [37:19.040 --> 37:22.960] system, you get to use those cloud resources. And what's also important is that, [37:22.960 --> 37:26.920] if you've got code which is maybe dependent on different OSes and might behave differently, [37:26.920 --> 37:31.400] you can specify different versions and platforms to run on as well. [37:31.400 --> 37:35.680] So, stop talking, I hear you cry. Well, I'm afraid this is what we're here for, so unfortunately [37:35.680 --> 37:40.200] I will keep talking. But what I will do is show you a bit of an example. So I applied [37:40.200 --> 37:45.160] this to my own code base myself, in my CI system. So you can see here, this [37:45.160 --> 37:49.640] is GitHub Actions. And I've got a piece of YAML, essentially. I've got this mutation [37:49.640 --> 37:55.520] test .yaml file. And what that does is set up an action for me to use. So this is something [37:55.520 --> 38:00.760] that I manually run. And I can do this here. So I manually run that, and what it will do is [38:00.760 --> 38:05.840] do the mutation test non-locally, and it will produce some HTML output for me to look at. [38:05.840 --> 38:10.320] I'll go a little bit into what the YAML does, but it seems like something [38:10.320 --> 38:14.400] that everyone should be able to do themselves if they want to. So GitHub Actions, the reason [38:14.400 --> 38:17.760] I show that is partly because it's what we use, but also it's free for open-source projects. [38:17.760 --> 38:21.920] So, you know, it's been useful for me because I've not had to pay for it. So, you know, [38:21.920 --> 38:25.920] just a heads up. So, I'll be showing you this with GitHub Actions really quickly.
And [38:25.920 --> 38:29.040] I'll show you the YAML, I'll show you what I did. Hopefully by the end of the next [38:29.040 --> 38:33.400] couple of slides, you will see how easy it is actually to do this, and why this [38:33.400 --> 38:37.000] is all good, and maybe you'll want to try this yourself when you get home. [38:37.000 --> 38:42.160] So here's some YAML. First of all, this is our mutation test YAML. It's got one job. [38:42.160 --> 38:45.600] It's pretty simple. All we're doing is running on Ubuntu. We're running one specific [38:45.600 --> 38:50.360] Python version to do this, depending on what your test base is. Oh, they're having a great [38:50.360 --> 38:56.240] time in there. Oh, there's thunder. So basically, yeah, we're testing on one version, [38:56.240 --> 38:59.760] for me, because my code just doesn't vary enough between versions and OSes. So for me, [38:59.760 --> 39:04.320] it's not relevant to do that. But if we look at this next slide, I'll actually show you [39:04.320 --> 39:09.040] the workflow it goes through when I actually run this action. So first of all, we check [39:09.040 --> 39:14.520] out the code. Then we set up a version of Python with it. Once we've done that, we [39:14.520 --> 39:18.320] install our dependencies, now including mutmut as well as our regular dependencies. So now [39:18.320 --> 39:21.840] we've got the mutation testing framework installed here as well on this kind of test [39:21.840 --> 39:27.280] runner. Then what we do is we run the mutation test. So we do that with mutmut run. But because [39:27.280 --> 39:31.240] we're running in a CI system, we don't want insanely long logs, and due to how it's outputted, [39:31.240 --> 39:34.560] we use a no-progress flag there so that we're not seeing every line of output. [39:34.560 --> 39:39.360] We just see the important parts.
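A stripped-down sketch of such a workflow file. The action versions, Python version, and report path are illustrative assumptions; the no-progress and CI flags are the ones described in the talk:

```yaml
# .github/workflows/mutation-test.yaml -- illustrative sketch, not the exact file
name: Mutation test
on: workflow_dispatch        # run manually, not on every push or PR

jobs:
  mutation-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install dependencies
        run: pip install -r requirements.txt mutmut
      - name: Run mutation tests
        run: mutmut run --no-progress --CI
      - name: Generate HTML report
        run: mutmut html
      - name: Upload report
        uses: actions/upload-artifact@v4
        with:
          name: mutation-report
          path: html/
```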
We also have the CI flag, which is one of my only contributions [39:39.360 --> 39:45.000] to open source, but I added that and I'm kind of proud of myself. So that basically [39:45.000 --> 39:50.000] means that you get a good, sensible output, like a sensible return code, when you run in a CI system. [39:50.000 --> 39:52.840] Because by default, depending on the types of mutants that occur, mutmut will [39:52.840 --> 39:57.800] give you a different non-zero exit code. So you kind of need to consider that, or [39:57.800 --> 40:02.120] suppress it with some scary, scary bash. That's why I wrote [40:02.120 --> 40:06.800] the flag. So once we've done that, we save it as HTML and we upload it so that you can [40:06.800 --> 40:11.400] access that yourself as well. So that's it. That's the whole piece of YAML. It's 35 lines, [40:11.400 --> 40:14.720] and that set up the entire mutation test for my suite. So you can see, hopefully, does [40:14.720 --> 40:19.080] this seem kind of easy? I think it seems pretty gentle to do, at least in this sort of scope. [40:19.080 --> 40:23.480] If you're a Java dev with a 20,000-line project, you might want to be a bit more careful, but [40:23.480 --> 40:28.480] if you've got like a Python hobby thing, try it out, right? Try it out. What I would say is, [40:28.480 --> 40:33.000] there are some more concerns. So first of all, I chose to run this manually, when I want [40:33.000 --> 40:38.240] to run it. I chose not to run this on push or PR. I chose to run this manually. And the [40:38.240 --> 40:41.720] reason for that is that I don't expect my code base to sufficiently change between [40:41.720 --> 40:45.960] small commits. And what I really don't want is to use the mutation test score, you [40:45.960 --> 40:49.920] know, that 75%, as a metric that [40:49.920 --> 40:53.840] I've just turned into a target.
I want it to stay as just a good idea, an indicator [40:53.840 --> 40:57.680] of what my tests are doing and what I could be doing better. So for me, I don't want to [40:57.680 --> 41:01.040] run it every time, partly because it takes a blooming long time, especially if I'm using [41:01.040 --> 41:06.360] multiple versions, which we'd also have to factor in. So you might want to do that. I didn't. [41:06.360 --> 41:09.440] I just ran on Ubuntu, and that was fine for me. But yeah, depending on what your code [41:09.440 --> 41:12.680] is, you might want to run on different platforms, right? So do factor that in, and that will [41:12.680 --> 41:17.240] help you a lot if you're in a CI system. So the other question there is just: should [41:17.240 --> 41:20.480] we run on push or PR? My opinion is no. I think there'll be people in this room who disagree [41:20.480 --> 41:24.040] with me, who maybe say on a PR you should run that, or maybe there's some kind of metric [41:24.040 --> 41:28.080] you want to associate with the score that you then want to look at in some way. For me, [41:28.080 --> 41:31.360] that's not how I use mutation testing. And I think what I want to get out of this is: [41:31.360 --> 41:35.560] we don't want a situation where mutation testing becomes a new target, where we've [41:35.560 --> 41:38.360] got to get a certain score, because then we're just recreating that problem of code [41:38.360 --> 41:41.480] coverage targets. We're just doing that all over again, right? So we're trying to avoid [41:41.480 --> 41:47.720] that. So the final question here is one I'll ask Paco to answer: Paco, do you think [41:47.720 --> 41:51.120] I should use mutation testing, you know, in my role as an audience member right now? [41:51.120 --> 41:58.320] What do you reckon? Yes. Well, as we said already: it depends. There are some things you [41:58.320 --> 42:03.880] can ask yourself, because whether you need it is a real question.
So mutation testing is of course definitely [42:03.880 --> 42:09.000] not a silver bullet. The reports take quite some time to go through, [42:09.000 --> 42:16.800] and of course it's quite computationally expensive to run the process. So a couple of questions [42:16.800 --> 42:22.640] you can ask yourself that are quite obvious: is this a project with a really high [42:22.640 --> 42:26.920] quality bar, where people could die, or a lot of money could be lost, or a combination of those [42:26.920 --> 42:32.440] two? So just to check, how many of you are working on a project that fits in these three? [42:32.440 --> 42:41.000] Okay, then you need this yesterday. Yes. But for the rest of the room, including me, there [42:41.000 --> 42:45.240] are some other questions you can ask yourself. And I think one of the important ones is: are [42:45.240 --> 42:48.960] you using code coverage? Because if you're not using code coverage, let's start with [42:48.960 --> 42:53.760] that, and let's first get coverage and see how many tests you have. Then the next [42:53.760 --> 42:58.640] question is: how much value do you put into this? How much value do you get out of [42:58.640 --> 43:04.160] this code coverage? And what I mean by that is: do you make decisions based on it? Is [43:04.160 --> 43:09.120] it like a definition of done in your sprint, or will a build fail if there's less than 80% coverage? Or there's the case of due diligence, [43:09.120 --> 43:14.880] when you're selling a company: then you would also want to know how good [43:14.880 --> 43:19.440] the software you're buying is, or how good the software you're working on is. [43:19.440 --> 43:24.440] So here I would say, if you're using code coverage [43:24.440 --> 43:28.360] and you're making decisions based on that code coverage, then yes, you should at least [43:28.360 --> 43:33.240] have a look at mutation testing to see what the state is.
You don't have to do this always. [43:33.240 --> 43:37.640] You don't have to put it in CI; just once a year, go home and run it on your computer [43:37.640 --> 43:40.680] once, just to see what the current state of your team is. Because it can very well be [43:40.680 --> 43:44.680] that you're on a high-performing team which already has its PRs and everything set up so well [43:44.680 --> 43:49.520] that it's maybe not worth the time. The mutation testing report [43:49.520 --> 43:54.920] might even confirm that, with the fact that you killed all the mutants. So that would be great. [43:54.920 --> 44:00.760] And there's another question that I like: what's the cost of fixing a bug? And I have [44:00.760 --> 44:05.280] two stories for this. My first example is from the first company I worked for. [44:05.280 --> 44:09.760] This was an enterprise company that built software that was running on-premise at the [44:09.760 --> 44:16.800] customer, and the customer was government. And then you're in line with all these [44:16.800 --> 44:20.880] big integrators, which means you have feature freezes and set moments where you can actually [44:20.880 --> 44:24.560] go to the customer and deploy your software, which is quite expensive. It also means [44:24.560 --> 44:31.040] that if you get a bug after this feature freeze, or after this upgrade window, you have a serious [44:31.040 --> 44:34.840] issue, because you need to go to the customer and you need to explain what went wrong. It's [44:34.840 --> 44:39.880] a very costly issue. So here, definitely, mutation testing [44:39.880 --> 44:44.040] can be quite interesting, because a lot of money and reputation can be involved. [44:44.040 --> 44:49.240] The other example that I had was more of a greenfield project, which had more of the [44:49.240 --> 44:54.320] startup vibes, where it was really a fail fast and fix fast mentality.
So this was a [44:54.320 --> 45:00.160] project where, rather than focusing on getting our quality monitoring up to speed, we were [45:00.160 --> 45:07.160] mostly focusing on making sure that we could very quickly fix bugs as well. It was of course [45:07.160 --> 45:11.680] not running on-premises but in the cloud, so we could control it. And the most important goal there [45:11.680 --> 45:15.720] was to just click a button and be in production again in 10 minutes, and have active monitoring [45:15.720 --> 45:20.200] to see if anything goes wrong. Here the cost of fixing a bug is already a lot lower, which [45:20.200 --> 45:25.800] means that the reason to consider it might be a bit weaker, especially if you're, again, [45:25.800 --> 45:30.080] in, for example, a high-performing team, where you're all well attuned to each other, you [45:30.080 --> 45:33.560] know what you're doing, and you know you can trust each other because you're really [45:33.560 --> 45:38.200] all professionals. Then maybe it's not worth spending half a day going through a [45:38.200 --> 45:41.520] mutation testing report if you already know what the outcome is probably going to be. [45:41.520 --> 45:45.640] Then again, still do it once. These are the things to consider when deciding whether you want to use it. [45:45.640 --> 45:50.760] So the thing I want to leave you with is: don't go into it blindly; just [45:50.760 --> 45:57.200] ask yourself, should I really use it? And then, yeah, for the last part, I'd just like [45:57.200 --> 46:02.720] to sum up. So hopefully, if we've gotten here, we've shown you what [46:02.720 --> 46:06.240] mutation testing is, why you might want to consider using it, how you could [46:06.240 --> 46:10.280] get going with running it, and also why you should. So now that we're here, I just [46:10.280 --> 46:14.520] want to summarize. First of all, I'm sorry I used this penguin as an evil penguin earlier.
[46:14.520 --> 46:18.040] It is adorable. I just like the DALL-E image. When I asked it to give it some fake wings, it [46:18.040 --> 46:23.080] gave it three. It gave it this extra flipper here. I'm not sure what that was for. But [46:23.080 --> 46:28.200] what I'd like to do is just quickly summarize what we talked about today. First of all, [46:28.200 --> 46:33.760] mutation testing is a way to test your tests. It helps you beat the Goodhart's-law problem [46:33.760 --> 46:37.400] with coverage, right? It saves you from turning [46:37.400 --> 46:41.400] coverage into a target. You don't want to have [46:41.400 --> 46:45.320] "code coverage has got to be above this threshold or we don't merge." That's not where we want [46:45.320 --> 46:52.080] to be. What we want to do is write good tests. So if you are going to do this yourself, an [46:52.080 --> 46:56.360] important part is to start small. So start locally on your machine. If you've got a big [46:56.360 --> 47:00.400] code base, then what you really need to do is run on a subset of that code base. If you've [47:00.400 --> 47:04.600] got a smaller code base like me, you're probably okay. But either way, start locally on your [47:04.600 --> 47:09.840] machine. You can also run it in CI: if you want asynchronous reports, if you want [47:09.840 --> 47:14.800] to use the resources available on a CI system, you can run mutation testing there. So do [47:14.800 --> 47:20.560] consider that if your stuff is in CI. And finally, I just want to say that, hopefully, [47:20.560 --> 47:23.800] we've demonstrated that mutants are like adorable penguins, right? They're valuable and they [47:23.800 --> 47:26.920] are wonderful, right? They're really great to use. They can tell you so much about your [47:26.920 --> 47:31.920] code. They're extremely useful. So don't fear them, because you should love them.
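That Goodhart's-law point about coverage is easy to demonstrate with a small, self-contained sketch (the function and tests below are invented for illustration): a test can execute every line without asserting anything, so coverage reports 100% while every mutant survives.

```python
# A function under test, plus two tests: one with full line coverage but
# no assertions, and one that would actually kill mutants.

def apply_discount(price: float, percent: float) -> float:
    """Return price reduced by the given percentage."""
    return price * (1 - percent / 100)

def test_weak():
    # Executes every line of apply_discount, so line coverage is 100%...
    apply_discount(100.0, 20.0)  # ...but the result is never checked.

def test_strong():
    # Pins down the value, so mutants such as `1 + percent / 100` or
    # `price * percent / 100` make this test fail and get killed.
    assert apply_discount(100.0, 20.0) == 80.0

test_weak()
test_strong()
```

A coverage tool rates both tests identically; only a mutation run reveals that `test_weak` proves nothing about the code's behavior.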
Thank you [47:31.920 --> 47:32.920] very much. [47:32.920 --> 48:02.160] If there are any questions, comments, objections, love mail, hate mail, anything, shout at me. [48:02.160 --> 48:06.440] So the question there was just if we can give some more examples of the kind of range of [48:06.440 --> 48:10.880] things that are possible to mutate. So essentially, the short answer is anything that will still [48:10.880 --> 48:16.240] make the code run. So in the Java case, the code has to compile; in my case, the code has to run. So [48:16.240 --> 48:21.120] in this situation, I'll give you some Python examples. For example, changing [48:21.120 --> 48:25.120] a variable from a certain type to another, so you might typecast something. [48:25.120 --> 48:28.520] With a mathematical expression, you might add extra terms to that expression. You might [48:28.520 --> 48:34.480] change return types, error types. You might set things to None at any given point. You might [48:34.480 --> 48:38.760] call something and, yeah, remove parts of it, set things to zero. There's other stuff. [48:38.760 --> 48:41.280] Paco, can you think of any mutation testing Java examples? [48:41.280 --> 48:44.960] Yeah. So I think the examples you gave, it sort of depends on the mutators [48:44.960 --> 48:49.000] you use. For each framework, you can also go through the list [48:49.000 --> 48:53.040] of mutators to see what kind of mutators are out there. What's good to keep in mind is that [48:53.040 --> 48:57.960] it does use some basic, fundamental strategies to determine if something can be mutated. Because, [48:57.960 --> 49:01.880] for example, if you have a stream, and in this stream you do some operations which [49:01.880 --> 49:07.640] you could in theory cut out, you're still using the return value, which means that the [49:07.640 --> 49:11.080] mutation testing framework thinks, okay, let's keep that intact.
The same goes for if you're [49:11.080 --> 49:16.280] using the Spring Reactor framework. You could do lots and lots of smart mutations in there, [49:16.280 --> 49:20.840] but it's not really there yet. It's really the rudimentary things, the conditional logic and [49:20.840 --> 49:26.120] the mathematical logic, I think, that are the two main things you'll see. And actually, those also [49:26.120 --> 49:34.320] account for what are often the most typical programming errors, I would say. Awesome. I mean, anything [49:34.320 --> 49:37.240] you'd like to mutate, you know, because I guess a lot of these things are open source, [49:37.240 --> 49:42.240] you know, anything that you think would be good if it did exist. Any other questions? [49:42.240 --> 49:57.240] So, two questions. The first one: could you comment on some frameworks for C and C++? And [49:57.240 --> 50:08.240] the second: what do you think about the idea to require developers [50:08.240 --> 50:15.240] to mutation-test only the code which they have actually changed, just to save computational power [50:15.240 --> 50:18.240] on their own machine and on the server side? [50:18.240 --> 50:22.680] Okay. So, the question there, just for the live stream, was two things. One is, are there [50:22.680 --> 50:27.720] any mutation testing frameworks for C or C++? I will say, personally, I don't know. I haven't [50:27.720 --> 50:31.480] used C++ since my physics degree, so I couldn't tell you. I don't know if you know anything [50:31.480 --> 50:32.680] about that, Paco. [50:32.680 --> 50:39.800] I just did a quick Google search. That's all. So, I see there are some frameworks available [50:39.800 --> 50:40.800] for you. [50:40.800 --> 50:45.800] There is a project by the University of Luxembourg, which is called FAQAS. [50:45.800 --> 50:46.800] FAQAS. [50:46.800 --> 50:58.280] And it's not quite there yet, but it's something for C and also a bit for C++.
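To make the mutator examples above concrete, here is a hand-written Python sketch of the kinds of rewrites mentioned (a relational-operator swap, a constant change, and returning None). A real framework such as mutmut or PIT generates rewrites like these automatically; the mutant functions here are purely illustrative.

```python
# Hand-written illustrations of typical mutants for one small function.
# A mutation testing tool would generate these rewrites automatically.

def is_adult(age: int) -> bool:
    return age >= 18

# Mutant 1: relational operator swapped (>= becomes >).
def is_adult_mut_op(age: int) -> bool:
    return age > 18  # survives unless a test checks age == 18 exactly

# Mutant 2: constant nudged (18 becomes 19).
def is_adult_mut_const(age: int) -> bool:
    return age >= 19

# Mutant 3: return value replaced with None.
def is_adult_mut_none(age: int):
    return None

def test_boundary():
    # Testing exactly at the boundary kills all three mutants,
    # because each mutant disagrees with the original on age == 18.
    assert is_adult(18) is True
    assert is_adult(17) is False

test_boundary()
```

Running `test_boundary` against each mutant in turn would fail for all three, which is exactly what a "killed" mutant means in the reports discussed earlier.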
Regarding [50:58.280 --> 51:03.280] your other question, by the way: should you do it as a git hook? [51:03.280 --> 51:04.280] That was the question, right? [51:04.280 --> 51:11.280] Yeah, the idea was basically to require developers to run those mutation tests, but not the full [51:11.280 --> 51:17.280] set, only the mutation tests touching the change, the unit tests [51:17.280 --> 51:20.280] testing the code which was modified in this commit. [51:20.280 --> 51:24.000] Yeah. So, actually, depending on the framework, some have incremental-report features, [51:24.000 --> 51:28.600] where they can just store the last state, then do [51:28.600 --> 51:33.800] a diff and use the results from your last execution to not execute all mutants and not [51:33.800 --> 51:37.480] generate all mutants, because it knows: I only changed these production lines, so I only [51:37.480 --> 51:42.040] need to generate mutants for these, and I only changed these tests, so I only need to rerun [51:42.040 --> 51:47.280] the tests for these mutants, which can tremendously speed it up. But still, using it as a git [51:47.280 --> 51:52.800] hook, I'm not sure. You can, by the way, use the same logic in CI as well, to use the incremental [51:52.800 --> 51:55.680] reporting, which saves a bit; PIT also supports this. [51:55.680 --> 52:00.640] Yeah. So, with what you have, you have caching, so you can cache those tests that you've run [52:00.640 --> 52:05.480] already, and if those cases aren't touched, then you're sort of good, if the changes to [52:05.480 --> 52:10.920] your code don't affect them. So, that is an option. I would say, yeah, thank you.
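The "only mutate what changed" idea described above can also be approximated by hand: feed only the changed files to the tool. The sketch below builds such a command from `git diff` output. The `--paths-to-mutate` option follows mutmut's CLI, but treat the exact flag, the base branch name, and the repo layout as assumptions and check your framework's documentation.

```python
# Sketch: approximate incremental mutation testing by restricting the
# run to files changed relative to a base branch.
import subprocess

def changed_python_files(base: str = "main") -> list[str]:
    """List .py files changed relative to `base` (empty list on error)."""
    result = subprocess.run(
        ["git", "diff", "--name-only", base, "--", "*.py"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:  # not a repo, unknown branch, etc.
        return []
    return [line for line in result.stdout.splitlines() if line]

def build_mutmut_command(files: list[str]) -> list[str]:
    """Build a mutmut invocation restricted to the given files."""
    if not files:
        return []  # nothing changed, nothing to mutate
    return ["mutmut", "run", "--paths-to-mutate", ",".join(files)]

# Example: only mutate two touched modules.
cmd = build_mutmut_command(["pkg/client.py", "pkg/models.py"])
# cmd == ["mutmut", "run", "--paths-to-mutate", "pkg/client.py,pkg/models.py"]
```

A wrapper like this could run in a pre-push hook or a CI job, with the framework's own caching still deduplicating work across runs.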
My [52:10.920 --> 52:16.800] opinion is, again, that maybe you don't want to explicitly mandate this on every run, and [52:16.800 --> 52:20.040] the reason for that is that it can then become a metric that you [52:20.040 --> 52:23.320] try to optimize for, whereas really, I think the nice way to [52:23.320 --> 52:27.000] use it is every now and then, I would say. I think if you've got a super critical [52:27.000 --> 52:31.320] project where that's really important, you may want to run it like that. For me, I don't [52:31.320 --> 52:35.360] need to, but I think that's really up to you as an implementer, what you want [52:35.360 --> 52:38.520] to do, and I think there's definitely a use case to do it in that way if that was important [52:38.520 --> 52:39.520] to you. [52:39.520 --> 52:56.080] Hand over here, hello. [Inaudible audience question about excluding code from mutation.] Yes, yes. The short answer is yes. The long answer is, depending on the actual [52:56.080 --> 52:59.880] framework, it might be that you add a comment to ignore it. Alternatively, there is a config [52:59.880 --> 53:04.280] file setup as well in Python where you can say: only mutate these paths, only do these [53:04.280 --> 53:14.680] things. What language do you use? That would be Stryker. I would say yes. I haven't looked [53:14.680 --> 53:18.680] that much into Stryker, but I think they make quite some nice stuff. It's quite generic [53:18.680 --> 53:23.880] across frameworks. Excluding code from mutation, definitely yes. Depending on the framework, [53:23.880 --> 53:29.240] some even have nice things like "do not mutate any calls to these classes," which [53:29.240 --> 53:32.760] is interesting for logging, for example. Do not mutate any calls to this logging class. [53:32.760 --> 53:36.520] The same you can do for packages, class paths, et cetera. [53:36.520 --> 53:41.840] I'd say so with Stryker as well.
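As a concrete illustration of the comment-marker exclusion mentioned for Python: mutmut recognizes a `# pragma: no mutate` comment. That marker is mutmut-specific, and other frameworks use their own annotations or config entries, so verify it against your tool's documentation.

```python
# Sketch: excluding a single line from mutation with a comment marker.
# "# pragma: no mutate" follows mutmut's convention; other frameworks
# use different markers or configuration entries.
import logging

logger = logging.getLogger(__name__)

def withdraw(balance: float, amount: float) -> float:
    if amount > balance:
        raise ValueError("insufficient funds")
    # Mutating log messages only produces noise, so skip mutants here:
    logger.debug("withdrawing %s", amount)  # pragma: no mutate
    return balance - amount
```

This mirrors the class-level exclusions described for the Java side: calls whose behavior you deliberately don't test, like logging, are the usual candidates.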
One of my colleagues uses Stryker because he maintains our .NET SDK, [53:41.840 --> 53:45.480] and he's actually also got mutation testing there with Stryker. It does seem very performant, [53:45.480 --> 53:49.080] and it seems like it does have a lot of those features as well. Honestly, if you're interested [53:49.080 --> 53:51.200] in TypeScript, I think there is something for you there. [53:51.200 --> 53:52.200] Cool. [53:52.200 --> 53:54.400] I think it might be free on open source repos. Sorry, another question. [53:54.400 --> 54:09.720] Yeah. Are specific mutation runs reproducible for debugging purposes? If you have a run [54:09.720 --> 54:14.720] and you see something you don't expect, can you reproduce that specific run with a given [54:14.720 --> 54:15.720] seed or something? [54:15.720 --> 54:18.000] Well, so the question is, how reproducible are the mutants? If you find one, then next [54:18.000 --> 54:22.640] run, is it still there? As far as I know, there shouldn't be any randomness in the [54:22.640 --> 54:26.280] mutant generation. It just goes over the code. Any condition that it finds that it [54:26.280 --> 54:31.680] can mutate, it will mutate. So the next time you run it, the same mutant should be there [54:31.680 --> 54:35.800] at the same place. So you could also see whether you killed it the next time. So yes, it's [54:35.800 --> 54:36.800] reproducible. [54:36.800 --> 55:00.800] I think that's the person who was first, sorry. [55:00.800 --> 55:14.200] That's a good question. I'll repeat that one. That's a good one. So the question there [55:14.200 --> 55:18.000] was: so, mutation testing, we've talked a big game, we've come up here and been [55:18.000 --> 55:21.080] like, hey, look, this is important, right? That's what we've talked about. And the question, [55:21.080 --> 55:23.680] which is a very valid question, is: hey, if it's so important, why is no one supporting [55:23.680 --> 55:27.120] this in Python?
Why is this all open source stuff, right? And you know what? I agree. [55:27.120 --> 55:31.080] That's a really good question. It's one I asked as well, to be honest. So no, I totally [55:31.080 --> 55:35.160] support the question. And to put the question properly: yeah, why aren't employers [55:35.160 --> 55:40.680] supporting this? The short answer, I think, is to do with ROI, unfortunately. And that [55:40.680 --> 55:45.480] sucks, honestly, because I would like us to invest more time in certain things. And I [55:45.480 --> 55:49.880] think it's just to do with company priorities, right? So I would like to spend more time. [55:49.880 --> 55:52.560] Honestly, I had quite a lot of fun adding the one feature I did get to add. I'd quite [55:52.560 --> 55:57.920] like to do some more. But again, I've got this API to implement, so do I have time? Well, [55:57.920 --> 56:01.800] no one's funding me to do it. So unfortunately, unless there's an obvious [56:01.800 --> 56:04.680] ROI, this just seems to be the way things go. Unfortunately, that's the way we've kind [56:04.680 --> 56:10.400] of structured our platforms and so on. So I gave a talk earlier on PyPI and malware. [56:10.400 --> 56:14.600] And the reason that that kind of thing is so prevalent and so possible on [56:14.600 --> 56:21.040] PyPI is because PyPI hasn't really implemented many ways to actually protect against malware [56:21.040 --> 56:24.520] being uploaded. So currently, I've uploaded some malware to PyPI that you can get yourself. [56:24.520 --> 56:29.560] It's not real malware, to be clear, it's a rickroll. [56:29.560 --> 56:35.240] But you saw that.
But basically, what I'm trying to say here is that that project kind [56:35.240 --> 56:38.840] of didn't really get off the ground in terms of protecting users, just because I think originally [56:38.840 --> 56:42.560] Facebook were funding it, and they stopped funding, and it just didn't continue. [56:42.560 --> 56:45.440] So unfortunately, yeah, this is just kind of the way that things are in open source right [56:45.440 --> 56:49.120] now. And yeah, I do feel your pain. I do understand. But that's all I can really say, I'm afraid. [56:49.120 --> 56:54.400] Yeah, to quickly add to this, by the way: Stryker, for example, is actually funded, [56:54.400 --> 56:59.480] is backed by a company who, for example, lets interns work on it as well. So some [56:59.480 --> 57:03.120] frameworks actually are backed, and there are people already investing in it. So it's not always [57:03.120 --> 57:04.920] bad. But sorry, next, let's go to that side. [57:04.920 --> 57:09.920] So you showed some HTML reports for the results of the mutation tests. [57:09.920 --> 57:10.920] Yes. [57:10.920 --> 57:15.920] We all know managers and product teams love their KPIs. So I'm wondering, is there any [57:15.920 --> 57:22.880] integration or plugins to export the mutation test results to SonarCloud or other platforms? [57:22.880 --> 57:26.720] That's a really good question. So I'll answer quickly for Python and then I'll pass it over. [57:26.720 --> 57:31.320] Because in Python, the answer is quite short. The answer is unfortunately no. The maintainer [57:31.320 --> 57:35.880] of mutmut is not really a big fan of the CI system stuff and the report stuff. [57:35.880 --> 57:38.960] I think the premise there is, you know, I like running this locally, [57:38.960 --> 57:43.040] and, you know, that's fair. And that is really how you can get started and get an idea. So [57:43.040 --> 57:46.720] in Python, unfortunately, the answer is no. But I think that Paco might have a more positive [57:46.720 --> 57:47.720]
But I think that Paco might have a more positive [57:46.720 --> 57:47.720] answer for you. [57:47.720 --> 57:51.480] Yeah. So let's also ask the, you were the maintainer of the other framework. So how [57:51.480 --> 57:54.480] does it go for the other part of the framework? [57:54.480 --> 58:03.480] So, okay. So I talked about not having that facility, that feature in Cosmic Ray. Is that [58:03.480 --> 58:13.480] a bit un-maintained? I don't want to say names, but it is a very, very large, 450 maybe vendor [58:13.480 --> 58:18.480] that uses it. And we asked them, can you fund development? They said, you know, no. And [58:18.480 --> 58:19.480] yeah. [58:19.480 --> 58:23.480] So they have shown this around at large events, like in front of thousands, thousands of people. [58:23.480 --> 58:28.480] But yeah. They're like, okay, we keep all the data stores there for whatever we find [58:28.480 --> 58:29.480] as it is. [58:29.480 --> 58:32.480] Yeah. So for the Python frameworks. [58:32.480 --> 58:33.480] Yeah. [58:33.480 --> 58:34.480] Yeah. [58:34.480 --> 58:35.480] Yeah. [58:35.480 --> 58:36.480] Yeah. [58:36.480 --> 58:47.480] So for the Python frameworks, there's not really CI plug-in support. I do know that, for [58:47.480 --> 58:52.680] example, for PyTest, there is support for Jenkins and Sonar. And I'm not sure about Stryker, [58:52.680 --> 58:57.880] but I know it's there. And usually these things are relatively easy to build yourself here [58:57.880 --> 59:02.280] as well, because all you have to do is, if there is a report in some JSON file, you can [59:02.280 --> 59:06.680] quite easily parse it and make a nice HTML form about this. Because again, they're all [59:06.680 --> 59:10.280] open for contributions. Do we have time for one last? [59:10.280 --> 59:12.280] I want to just add to that a little bit. [59:12.280 --> 59:13.280] Okay. [59:13.280 --> 59:17.840] Okay. Really quickly. 
First of all, with your question, yeah, when I originally implemented [59:17.840 --> 59:21.120] my mutmut thing, I did do it on PRs. And in that case, I got, you know, an action [59:21.120 --> 59:24.720] that would comment my coverage in a nice metric-y way. So it's quite simple to [59:24.720 --> 59:25.720] do. [59:25.720 --> 59:29.120] So about Cosmic Ray: first of all, that sucks. And I'm sorry. That's [59:29.120 --> 59:34.360] blumming awful. Like, yeah, sadly, it does seem that a lot of what we've kind [59:34.360 --> 59:36.920] of been discussing on this side of the room is just like, man, you [59:36.920 --> 59:39.600] know, we all agree this is important, right? And it's useful for a lot of things. [59:39.600 --> 59:43.600] It'd be great if someone funded it. So I think, unfortunately, with Python, that is the state [59:43.600 --> 59:49.120] of play. And it does suck. But yes, I get you. Any other questions? Finally, I think [59:49.120 --> 59:50.120] one, yes, hello. [59:50.120 --> 59:56.800] Can you write custom mutations to mutate your code with custom logic? [59:56.800 --> 01:00:02.480] That's a really good question. So, sorry, I will now repeat your really good [01:00:02.480 --> 01:00:08.040] question. The question was: if I have a certain type of mutant [01:00:08.040 --> 01:00:14.080] that I want to make, can I do that? So I would say, with the stuff that I used in Python, [01:00:14.080 --> 01:00:18.520] the answer is you'd need to actually, you know, take the version you've downloaded, [01:00:18.520 --> 01:00:21.720] edit it yourself and add that stuff. So sadly, there's not an easy customizable way. That [01:00:21.720 --> 01:00:25.320] would be an awesome enhancement, though, that I would like to see, you know, that [01:00:25.320 --> 01:00:28.040] would be cool. On other platforms, Paco, any other?
[01:00:28.040 --> 01:00:32.280] I do know that I think PIT did have some extension points. So it really depends. I [01:00:32.280 --> 01:00:36.760] know that the company I currently work for, called Picnic, they're also working on extending [01:00:36.760 --> 01:00:42.960] it, for example, for reactive code. So there are often some extension points. So in short, [01:00:42.960 --> 01:00:47.440] it depends on the framework and how easy it is. [01:00:47.440 --> 01:00:50.960] Are we... we're done. Okay. We're at time. Thank you so much. This has been a really nice discussion [01:00:50.960 --> 01:00:51.960] as well. So thank you for sharing this. [01:00:51.960 --> 01:00:55.960] Thank you.