[00:00.000 --> 00:15.960] Hi. Hello. I hope you had good beers yesterday. Thank you for coming this morning. I'm going [00:15.960 --> 00:21.120] to talk to you about a work that a few of us started a while back, which is reimplementing [00:21.120 --> 00:26.480] one of the pieces of software that we all have in our computers, so the coreutils. So [00:26.480 --> 00:31.800] I will go through the history of that project, explain what we are trying to do, the why, [00:31.800 --> 00:38.720] and maybe do a demo, let's see what happens. So who I am. I'm doing a lot of things, way [00:38.720 --> 00:43.920] too many things according to my partner. I have been a Debian developer for like 15 years, LLVM [00:43.920 --> 00:51.000] for 10 years. Also, my actual job is that I'm a director at Mozilla. I'm doing 2,000 things [00:51.000 --> 00:55.760] every day. But that work is clearly unrelated to what we are doing at Mozilla. Don't tweet [00:55.760 --> 00:59.200] saying Mozilla is working on that stuff. I will get in trouble. I don't want to get into [00:59.200 --> 01:04.560] trouble. It's not a Mozilla project. But I have been working with Rust developers [01:04.560 --> 01:11.040] for a long time. I manage some key people in the Rust project. And in Paris, we had the [01:11.040 --> 01:16.080] chance to have a bunch of people who worked on Rust for 10 years. So I have been in [01:16.080 --> 01:23.400] touch with those developers for a long time. I also uploaded the initial version of rustc [01:23.400 --> 01:30.880] in Debian a long time ago. And if you don't know about packaging, you can package a piece of software [01:30.880 --> 01:34.920] when you are not an expert in the language it has been written in. I know that sounds [01:34.920 --> 01:39.400] crazy, but I'm not a C++ compiler developer, and I have been maintaining clang for like 10 years [01:39.400 --> 01:45.800] or so. 
And I have also been maintaining some of the most common Rust packages in the Debian archive [01:45.800 --> 01:52.160] for a long time, and therefore in Ubuntu. But yeah, let's talk about what happened. If you [01:52.160 --> 01:57.600] remember, something weird happened three years ago now. Most of the planet went on lockdown, [01:57.600 --> 02:02.440] in our country, sorry, in France. And I think it was the same in Belgium and Italy and some [02:02.440 --> 02:07.360] other countries. So they decided to close everything. So I don't know what you have done on your [02:07.360 --> 02:13.560] side, but myself, I asked myself, what can I do with that free time that I had? So [02:13.560 --> 02:20.280] some people made bread. So I stole a picture from Julien Danjou. He made some fancy bread. [02:20.280 --> 02:28.720] Who did bread here? A few. Cool. Some others did some woodworking. So I stole a picture [02:28.720 --> 02:33.400] from someone who used to work at Red Hat, but now he's working at Mozilla. He did some [02:33.400 --> 02:37.680] woodworking. The picture is ugly. It's not my fault, but some people did that stuff. [02:37.680 --> 02:44.720] Some people did gardening. And myself, what I've done, it's my son on the top right. [02:44.720 --> 02:49.440] He loves Lego, but he's destroying everything. Everything was put in a single bucket and [02:49.440 --> 02:55.400] we decided to rebuild the Lego. So it kept us busy for like three or four days. And then [02:55.400 --> 03:01.880] we still had to work, but my partner, we are 40-something, and she decided to rewatch [03:01.880 --> 03:07.680] Buffy the Vampire Slayer. I have a good memory for TV shows, so I was like, yeah, I don't [03:07.680 --> 03:15.120] want to watch it again. So, and then she did also that. So you can do the math. It's like [03:15.120 --> 03:21.800] 204 hours. 
And basically, I used that time to work on Rust, because she [03:21.800 --> 03:27.400] was watching that stuff and I already saw those episodes when I was younger. So yeah, I wanted [03:27.400 --> 03:34.400] to learn Rust. So before, what I've done, 10 years ago, I worked on that fancy project [03:34.400 --> 03:40.400] which was at the beginning of Clang, when it was just starting to support some basic C++. [03:40.400 --> 03:45.280] I packaged it into Debian and then I rebuilt the Debian archive; instead of GCC, I used [03:45.280 --> 03:52.080] Clang, and it got me a job at Mozilla in the end, and a lot of fun. I'm still doing that stuff [03:52.080 --> 03:59.480] even if I should stop at some point. So the idea of that project was how can you replace [03:59.480 --> 04:03.520] a compiler by another one? I was like, yeah, I'd like to do the same with Rust. I'm [04:03.520 --> 04:07.240] not a student anymore, so I don't want to do projects that are useless. So I want to [04:07.240 --> 04:12.040] work on something interesting. So I started thinking about what can I do? So the first [04:12.040 --> 04:17.680] one is, do I want to rewrite the glibc in Rust? Maybe not. It's probably too hard. So [04:17.680 --> 04:23.360] Clang, LLVM, it's crazy. There is no way I can do that. And nobody is going to be interested [04:23.360 --> 04:28.160] in those projects. So I was like, what about the coreutils? So one of the things of the [04:28.160 --> 04:31.560] coreutils is that initially, I didn't know anything about that stuff. So I was like, oh, it's [04:31.560 --> 04:36.240] probably full of assembly. I don't want to learn assembly. I don't want to read assembly. [04:36.240 --> 04:42.640] But in the end, there is no assembly in coreutils. There is just one file, and I don't [04:42.640 --> 04:48.680] care about that one anyway. I'm good. But yeah, some people are going to say, why are [04:48.680 --> 04:54.040] you doing that? It's pointless. So yeah, it makes sense. 
You can think about that stuff. [04:54.040 --> 04:58.760] But the first one is why not? Like, we all had crazy ideas in our careers and this one [04:58.760 --> 05:04.840] is one of them. Rust is amazing. So you can bring some value, and you will hear me, during [05:04.840 --> 05:12.440] that talk, repeating that stuff many times. I will repeat that stuff [05:12.440 --> 05:17.960] a few times. But the GNU implementation is fantastic. People doing that work are amazing. [05:17.960 --> 05:24.160] They are giants. And we always hear that Rust is amazing at security. At Mozilla, we keep [05:24.160 --> 05:30.360] repeating that stuff. But for the GNU implementation, it's not an argument. There are only 17 CVEs [05:30.360 --> 05:36.600] for the last 20 years. So the reimplementation is not about security. And it's not about [05:36.600 --> 05:41.720] the license. I know that some companies care a lot about the license. Myself, as soon as I [05:41.720 --> 05:48.080] can upload it into Debian, I'm fine. But some people love debating how the GPL or MIT are [05:48.080 --> 05:53.720] amazing or how they suck, depending on who you talk to. I am not interested in having that debate. [05:53.720 --> 05:58.880] I leave that debate to Reddit or the other one. And last but not least, it is super [05:58.880 --> 06:02.680] interesting. I hope during that presentation, I will be able to convince you to contribute [06:02.680 --> 06:11.320] and write patches. It's not that hard. I will even do a demo of fixing a bug live, hopefully. [06:11.320 --> 06:15.840] So I keep talking about that stuff. But I think you all have a basic understanding at [06:15.840 --> 06:21.880] least of what the coreutils are. So we'll start with a quiz. So who was born before [06:21.880 --> 06:29.480] 2000 in that room? I see a lot of gray hair. Oh, yeah, a bunch of people. After 90, who [06:29.480 --> 06:41.000] was born after 90? After 80? After 71? Yeah. 
So congrats, you are younger than the initial [06:41.000 --> 06:50.160] implementation of the coreutils. So the first version was published by Ken Thompson in '70. [06:50.160 --> 06:57.440] So thanks to Software Heritage, the archive done by Inria and a lot of other actors, you can see [06:57.440 --> 07:04.520] the sources of the initial implementation. As you can see, the .s means assembly. I won't [07:04.520 --> 07:08.120] share any assembly code today, don't worry. But you can look at the source and it's pretty [07:08.120 --> 07:17.320] amazing to see that that code was written 53 years ago. So Ken Thompson and Dennis Ritchie, [07:17.320 --> 07:21.560] they worked on that stuff a long time ago. And what we are doing right now, and you can [07:21.560 --> 07:26.160] generalize to most of the things in tech, is that we are building stuff on the shoulders [07:26.160 --> 07:30.320] of those giants. So those two folks invented things that we are still using on a daily [07:30.320 --> 07:35.600] basis, like we all use cp, mv and all that stuff. Even if you don't know about it, your system [07:35.600 --> 07:40.480] is probably going to use it behind the scenes. I will also mention a podcast that is very good, [07:40.480 --> 07:46.120] from Adam Gordon Bell, who is interviewing Brian Kernighan, talking about the history of [07:46.120 --> 07:51.880] the Unix operating system. So those folks wrote a new implementation [07:51.880 --> 08:00.520] of those commands in '72. So you can see that, for example, cp and the if command were written [08:00.520 --> 08:08.360] in C. The code is surprisingly easy to read. So this one is the source from, again, 50 years [08:08.360 --> 08:13.480] ago, of chmod. I'm sure that if you know a bit of C, you can read it. And I find it [08:13.480 --> 08:18.520] fascinating to see that the code those folks wrote 50 years ago, I'm showing it to you for the first [08:18.520 --> 08:26.040] time in 2023, and it's still relevant, and people can relate to that code. 
And you can [08:26.040 --> 08:30.960] ask yourself, is the code you are writing today still going to be valuable in 50 years? [08:30.960 --> 08:34.960] Probably not. But those folks, they made it. And it's probably going to stay for a long [08:34.960 --> 08:41.000] time. And this one is the actual implementation of the chmod function, written [08:41.000 --> 08:47.560] in '72. So it's not crazy code. It's full of bugs, probably. But it worked, and it is [08:47.560 --> 08:54.120] what started Unix a long time ago. And what I found surprising listening to that podcast [08:54.120 --> 09:01.840] is also what an amazing programming language the coreutils and those commands are. We take [09:01.840 --> 09:08.160] that for granted. But when you think about it, when you use sort, uniq, cat, and all [09:08.160 --> 09:12.880] that stuff, it allows you to do some crazy things very quickly. So let's take a few seconds [09:12.880 --> 09:18.600] and think: I give you a text file, and I want you to tell me what are [09:18.600 --> 09:25.800] the five most common words in the Shakespeare books, longer than six characters. We can all do [09:25.800 --> 09:32.480] that stuff. It's probably six, seven lines of Python, same in Rust and same in other [09:32.480 --> 09:39.480] languages. But if you do that with coreutils and bash, it is that command. And when you [09:39.480 --> 09:44.760] think about it, it's very impressive. The pipe and the redirection are the key things [09:44.760 --> 09:49.480] that we are using, and they make it easy to program on a daily basis. All your system is running [09:49.480 --> 09:53.640] that kind of stuff on a daily basis, and it makes it super easy. I'm not saying that it [09:53.640 --> 09:58.840] is great. It's not fault-tolerant. If you have a single error in that command, everything [09:58.840 --> 10:02.640] is going to break, and you are not going to get what you want. 
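As an illustration of the "six, seven lines" claim above, here is a minimal Rust sketch of the word-count exercise. It is not the command shown on the slide and not code from the project; the function name `top_words`, its parameters, and the tie-breaking rule are assumptions made for this example.

```rust
use std::collections::HashMap;

/// Return the `n` most frequent words in `text` that are longer than
/// `min_len` characters, most frequent first.
/// Hypothetical helper: the name and signature are illustrative only.
fn top_words(text: &str, min_len: usize, n: usize) -> Vec<(String, usize)> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    // Split on anything that is not a letter, and count case-insensitively.
    for word in text.split(|c: char| !c.is_alphabetic()) {
        if word.chars().count() > min_len {
            *counts.entry(word.to_lowercase()).or_insert(0) += 1;
        }
    }
    let mut ranked: Vec<(String, usize)> = counts.into_iter().collect();
    // Descending count, then alphabetical order so the result is deterministic.
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked.truncate(n);
    ranked
}
```

Calling `top_words(&contents, 6, 5)` on the contents of a text file would answer the question as posed in the talk. Unlike the shell pipeline, nothing here is streamed: the whole text is held in memory, which is a simplification for the sketch.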
But still, you can do that [10:02.640 --> 10:07.760] kind of thing very quickly and very easily. So by the way, the results are those ones. [10:07.760 --> 10:13.800] So in Shakespeare, with more than five letters, it's "should", "further", "exeunt". I don't know how [10:13.800 --> 10:17.520] to pronounce that one. So I had to Google what it means. It means that you leave the [10:17.520 --> 10:21.600] scene, or that is what it means in English for Shakespeare. Then "before" and "master". And it's [10:21.600 --> 10:25.320] pretty funny to do that stuff. I did it up to eight characters, and it's quite interesting [10:25.320 --> 10:31.600] to see what Shakespeare used in terms of words. So now, let's talk about today. My brother [10:31.600 --> 10:38.320] is a history teacher. I'm not. So I will talk about what we have now. So we have 105 commands [10:38.320 --> 10:43.800] in the implementation, in the GNU implementation, and we are trying to reach that level. You are [10:43.800 --> 10:48.600] very familiar with many of those. Some of them you probably never heard of. And I'd [10:48.600 --> 10:55.280] like also to remind you that what is in the coreutils can be weird. So sometimes you don't have [10:55.280 --> 10:59.920] find; in the coreutils you don't have find, less, top and all those commands, [10:59.920 --> 11:07.480] but you have some other things. And most of these commands come with arguments [11:07.480 --> 11:10.920] which sometimes are conflicting with each other, and sometimes are completely changing the [11:10.920 --> 11:19.040] behavior of the command depending on what you enter. So second quiz. So who [11:19.040 --> 11:25.400] knows about those commands? So ls, cp, mv. Everybody, sure. And then this one, probably [11:25.400 --> 11:30.520] too. Now we are starting with the hard stuff. numfmt. So it's a command that I, yeah, [11:30.520 --> 11:35.560] that is one of the maintainers of the project, with myself. So of course he knows. 
But pretty [11:35.560 --> 11:38.240] much he's the only one. So it's the kind of stuff that we have to deal with because we [11:38.240 --> 11:43.640] want to be a drop-in replacement for the GNU project, but we have those kinds of things. [11:43.640 --> 11:52.600] And who knows about pr? Yeah, one guy. So someone else. Convert text files for printing. So [11:52.600 --> 11:56.960] it is one of those commands, and it has a huge number of arguments which are probably conflicting [11:56.960 --> 12:02.880] with each other. So csplit, who knows about csplit? Daniel knows. Yeah, just a few people [12:02.880 --> 12:08.880] in that room. So it is to split a file into sections determined by context lines. Yeah, [12:08.880 --> 12:14.960] it's scripting, right? Yeah, it's weird. And we have plenty of others. So we have factor [12:14.960 --> 12:19.040] to do math. We have pinky. I don't remember what it's doing. tsort is doing some kind [12:19.040 --> 12:26.480] of sort. shred is to delete, to really remove the data of a file on the drive. I think it's [12:26.480 --> 12:34.800] more common, but still I rarely see that one in scripts. And so we have a bunch of implementations [12:34.800 --> 12:39.520] of the coreutils available on the market. So the most common, that everybody knows, is the GNU [12:39.520 --> 12:45.080] implementation. There is BSD, which is used on Mac, for example. BusyBox is the one you [12:45.080 --> 12:51.480] want to use on embedded devices or when you want to recover a system. Toybox: one of [12:51.480 --> 12:55.840] the core developers of BusyBox decided to write Toybox because he was sick of license [12:55.840 --> 13:00.880] discussions. I learned recently that there is a V lang implementation. Don't ask me what [13:00.880 --> 13:06.240] V lang is. I don't know. And if you are aware of other implementations, please let me know. [13:06.240 --> 13:13.640] I will tell you why I want to know. So let's talk about our implementation. 
So it was started [13:13.640 --> 13:23.480] by Jordi, I will butcher his name, Boggiano, in 2013, before version 1.0. I sent an email [13:23.480 --> 13:29.840] to Jordi, because he has a .be email address, saying I'm going to present the work that you started [13:29.840 --> 13:35.440] 10 years ago, and he said, cool, glad to hear that this project is still alive. Myself, [13:35.440 --> 13:41.040] I found it in early 2020, before COVID, and then I started contributing in April. Remember [13:41.040 --> 13:47.360] that COVID started in Europe in March. It's not a coincidence. Now, here is the size of [13:47.360 --> 13:56.560] the project. So we have 13,000 stars and we have 350 contributors. The second contributor [13:56.560 --> 14:00.720] to that project, you see his picture with a white background, is over there. Well done, [14:00.720 --> 14:06.360] Terts, for your amazing work. He's the one reviewing our PRs. I'm doing the easy ones. I'm [14:06.360 --> 14:12.280] not a very good developer in general. Now, it's packaged in most of the distros. Obviously, [14:12.280 --> 14:18.440] it's not a coincidence, in Debian and Ubuntu, but also Fedora, Gentoo, and most of them. [14:18.440 --> 14:26.600] It is shipping in Apertis, which is a Linux distro for cars, which, between you and me, [14:26.600 --> 14:33.760] I find scary, that they are using our work in production. But I always have impostor [14:33.760 --> 14:39.440] syndrome in terms of development. And it's used by a social network through the Yocto [14:39.440 --> 14:44.280] project. So it's not Facebook, the famous social network, it's another one, and they are [14:44.280 --> 14:49.040] making glasses and so on. So I think you can guess who they are. But they are using that [14:49.040 --> 14:56.320] to take pictures in the glasses. And this one is for license reasons. So now I'm doing [14:56.320 --> 15:02.520] some product placement for one of the Mozilla achievements, Rust. 
So why do we want to do [15:02.520 --> 15:08.680] it in Rust? You don't have to worry about security issues. At Mozilla, on Firefox [15:08.680 --> 15:16.800] in particular, we see security issues caused by C and C++, not on a daily basis, but almost. [15:16.800 --> 15:23.600] And you should not do C and C++ anymore if you care about security. It's very portable. [15:23.600 --> 15:28.080] One of the things that I learned with that project is that we are supporting a lot of configurations, [15:28.080 --> 15:34.560] a lot of operating systems. And Rust is really amazing for that. It was one of my big discoveries. [15:34.560 --> 15:40.400] It's obvious probably for some of the Rust developers in the room. But for me, it was a surprise. [15:40.400 --> 15:46.200] And I really don't like to reinvent the wheel. So we can leverage a lot of crates which have [15:46.200 --> 15:51.200] been developed by very talented people over the years. So lscolors, for example, is used [15:51.200 --> 15:56.800] by lsd or exa, probably, to provide the same colors as ls. And we are using that stuff in [15:56.800 --> 16:03.840] the Rust coreutils. walkdir, to do a recursive operation on a directory: we are using that [16:03.840 --> 16:09.760] crate so we don't have to worry about that one. tempfile, we use it for mktemp, for example. [16:09.760 --> 16:15.200] So if you look at the sources of the GNU coreutils, they have to implement everything [16:15.200 --> 16:19.720] by themselves. Sometimes they use some libraries, but they have to rewrite a [16:19.720 --> 16:26.560] lot of things. While for us, we can reuse what others have been doing. And last but [16:26.560 --> 16:31.360] not least, we have amazing performance. I will do a demo later of some of the performance [16:31.360 --> 16:36.840] that we have. But surprisingly, in some cases, we are significantly faster than [16:36.840 --> 16:46.200] the GNU implementation. And it's a very popular language. 
No need to explain why, but we have [16:46.200 --> 16:51.280] a lot of contributors. And sometimes we are struggling to keep up with the number of requests [16:51.280 --> 16:56.840] just because everybody wants to learn Rust. And you should if you don't. But it's very [16:56.840 --> 17:06.360] popular. So what is the goal of this specific implementation? So when I took over that project [17:06.360 --> 17:12.280] a few years ago, I had exactly the same idea in mind as Chris Lattner and Apple had back [17:12.280 --> 17:18.480] then with Clang, which is to be a drop-in replacement. So if you are not aware of that story, when [17:18.480 --> 17:24.200] Apple decided to work on Clang, one of their goals was to be a pure drop-in replacement [17:24.200 --> 17:29.680] for GCC in general, for most of the options. And it has been one of the successes: you just [17:29.680 --> 17:35.800] had to override the CC or CXX variable and you could use Clang directly. And if it was [17:35.800 --> 17:41.720] not working, it was a bug in most of the cases. So it works surprisingly well. Now [17:41.720 --> 17:47.160] Clang is a de facto standard for compiling most of the very complex applications like [17:47.160 --> 17:55.400] Chrome or Firefox. What can we do to replicate that? So the first thing is that we focused on that [17:55.400 --> 18:04.320] one, to be a drop-in replacement. We want that stuff to be cross-platform. It was decided [18:04.320 --> 18:08.360] before my time as the leader of the project, but I love the idea. So we support those operating [18:08.360 --> 18:15.440] systems. We are struggling with the FreeBSD CI because it sucks on GitHub. But besides [18:15.440 --> 18:21.040] that, it's working pretty well. Also, except for Fuchsia, we have CI for every one of them. [18:21.040 --> 18:30.760] So for every PR, we run a lot of tests on those. It's very easy to test. So on my laptop, [18:30.760 --> 18:36.240] running the full test suite takes less than a minute. 
And it's covering a large part [18:36.240 --> 18:42.320] of the code. I will share some of the code coverage information. On the license: I don't care about it. [18:42.320 --> 18:46.840] Some people do. For some people, it's a strength. For some people, it's a weakness. But it's [18:46.840 --> 18:54.320] an MIT license, so that social network can reuse our code to save money. Anyway, I'm [18:54.320 --> 19:00.440] not interested in having that debate. And in my opinion, and I haven't seen anyone in [19:00.440 --> 19:06.920] the community using that argument, it's not a fight against the GNU project or the FSF. [19:06.920 --> 19:10.840] The GNU project has been doing a lot of good things for us in the open source world for [19:10.840 --> 19:19.200] 20 or 30 years. We are standing on their shoulders. It's not a fight. I know that some GNU [19:19.200 --> 19:24.160] coreutils developers have been monitoring that project for a long time and have been very friendly and [19:24.160 --> 19:30.240] so on. And I met Jim Meyering 10 years or 20 years ago. And I was very impressed by that [19:30.240 --> 19:37.480] person. So, yeah, it's not about fighting. It's about collaboration. So when I started [19:37.480 --> 19:42.000] that project two or three years ago, my initial goal was to be able to boot a Debian [19:42.000 --> 19:48.840] on it, so my laptop here is running the Rust coreutils now, for example. So I'm not lying, [19:48.840 --> 19:56.360] it's working well now. Then it was to install the top 1,000 packages in Debian. So why I [19:56.360 --> 20:01.680] wanted to do that is that Debian has a lot of scripts to configure the packages, [20:01.680 --> 20:10.960] the postinst, and they are usually written in bash and use sort and cp and mv and install. [20:10.960 --> 20:17.800] It's exercising a lot of features of the GNU coreutils. And one of the goals was also to [20:17.800 --> 20:25.520] build three of the big projects I care about. So Linux, LLVM and Firefox, obviously. 
So we [20:25.520 --> 20:31.640] don't use that much scripting, so bash or coreutils, but I still found some bugs, building [20:31.640 --> 20:38.440] the Linux kernel, some corner cases. And of course, package it into Debian and Ubuntu. [20:38.440 --> 20:45.520] I published some blog posts about it. They have been shared in a bunch of places. I've had [20:45.520 --> 20:53.120] some interesting comments, some of them not very interesting. Anyway, so to achieve those [20:53.120 --> 20:58.400] goals, we had to deploy a CI, add code coverage support, improve the code coverage. So it's [20:58.400 --> 21:02.760] one of the ways to get familiar with a project. I wrote a lot of unit tests. So now the code [21:02.760 --> 21:09.600] coverage of our implementation is 80%. If you don't know much about code coverage, 80% [21:09.600 --> 21:13.680] is usually what we are trying to achieve in a project. It means that the code coverage [21:13.680 --> 21:20.120] is very good. It's very hard to reach 100% and sometimes it's a waste of time, but 80 [21:20.120 --> 21:24.960] is usually considered as being a very good code coverage. I think on Firefox, we are [21:24.960 --> 21:32.680] at 65 or 70, something like that. And we plugged in a bunch of tools. I mean, I love static [21:32.680 --> 21:38.160] analysis, linting, so of course I had to do that for that project. And we also documented [21:38.160 --> 21:43.800] a lot of those processes. Everybody loves docs, so we wrote plenty of docs. And [21:43.800 --> 21:48.600] it took about a year to reach that state. So now the current status. So what we have [21:48.600 --> 21:56.080] here is CI that is running for every PR. And we run our implementation against the GNU [21:56.080 --> 22:01.840] test suite. This is the latest graph, so you can see that we have been working a lot on [22:01.840 --> 22:08.160] improving the compatibility. We are not there yet. I will fix one of them with you later [22:08.160 --> 22:12.840] during that presentation. 
But there is no silver bullet. For many of those, you have [22:12.840 --> 22:19.840] to spend a few hours to fix one. And you improve one by one. Before you ask the question, the [22:19.840 --> 22:26.280] skips are mostly SELinux. So cp, mv and some commands like chcon are using [22:26.280 --> 22:32.640] SELinux, and GitHub Actions uses Ubuntu, and Ubuntu doesn't have it by default, so it's [22:32.640 --> 22:36.600] tricky to test that stuff in the CI. If you know how to do it, please reach out to [22:36.600 --> 22:44.040] us, we'd like to fix that stuff. So how do we work? As I said many times, we want that stuff [22:44.040 --> 22:50.440] to be a drop-in replacement. So we wrote a mini wrapper to make it super easy. So the [22:50.440 --> 22:59.480] command that I share is going to run the not-owner test for the touch command from GNU. So it's [22:59.480 --> 23:07.000] going to use the GNU test and run it against our implementation. So it's super easy to [23:07.000 --> 23:13.960] test. And we wrote some scripts to make our life easier. So we have a Python script which [23:13.960 --> 23:18.160] is going to give us the list. If you run it, it's several pages because we still have [23:18.160 --> 23:27.880] a lot of tests to fix. And we also wrote a fancy page. I can show you what it looks like. [23:27.880 --> 23:34.560] So here it is. We have the list of all the tests for each command. And of course, the [23:34.560 --> 23:41.920] big one is misc, where we have a mix. So sometimes it's just a one-line change in the code. Sometimes [23:41.920 --> 23:45.800] it's a big refactor. It depends. It's part of the fun. You never know what you are going [23:45.800 --> 23:57.080] to get. So let's use an example, for example, for mkdir. So this is the GNU output. [23:57.080 --> 24:03.920] So you do -p. Who knows what -p is in that one? Okay, cool. So it's create [24:03.920 --> 24:08.360] recursively, and -v is verbose. Everybody knows that. 
So in the GNU implementation, they [24:08.360 --> 24:13.600] decided to do it directory by directory. You can argue that maybe it's not a good idea. It's [24:13.600 --> 24:19.480] not smart. Maybe it is. Who cares? It's legacy. We have to deal with legacy. So our implementation [24:19.480 --> 24:26.480] was that one. So you can argue that maybe ours was better or worse. Who cares? But we [24:26.480 --> 24:31.800] updated that code, so we match exactly what GNU is doing. So if you look, I will share [24:31.800 --> 24:37.800] the slides after. You can look at the change. It's pretty easy to understand. This one is [24:37.800 --> 24:44.640] one of my favorites. So in Debian, when you install LibreOffice, it is using AppArmor. [24:44.640 --> 24:49.520] And someone decided that instead of doing a touch, you do install /dev/null, which creates [24:49.520 --> 24:56.520] an empty file. It's legacy also. Do we want to fix legacy everywhere? Probably not. So it [24:56.520 --> 25:00.720] was one of my favorite bugs. So I started investigating. So here is some Rust code [25:00.720 --> 25:07.240] which reproduces that issue. So if you do a copy of /dev/null into a text file, it is failing [25:07.240 --> 25:14.120] with "the source path is not an existing regular file". So I opened a bug upstream. The thread [25:14.120 --> 25:18.240] is quite interesting. Like, people in the Rust community are very passionate about that [25:18.240 --> 25:23.720] kind of thing. Too long, didn't read: it hasn't been fixed. So here is a workaround. It's [25:23.720 --> 25:30.800] ugly. But unless you know a better way to do it, besides fixing Rust and dealing with the fallout [25:30.800 --> 25:36.920] of the fix, this is the best that we have. So we are checking if the input is /dev/null. [25:36.920 --> 25:43.040] If it is /dev/null, we are creating an empty file. This is what it takes to deal with legacy. [25:43.040 --> 25:50.040] So let's do a demo together. So please bear with me. It's going to be fun with the mic. 
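The /dev/null workaround described above (check whether the input is /dev/null and, if so, create an empty file instead of copying) could look roughly like the following. This is a minimal sketch and not the project's actual code: the function name `copy_or_create_empty` is hypothetical, and the comparison is a plain path match that does not resolve symlinks or other aliases of the device.

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;

/// Copy `from` to `to`, special-casing /dev/null as the source.
/// Hypothetical helper for illustration; not the function used in the project.
fn copy_or_create_empty(from: &Path, to: &Path) -> io::Result<u64> {
    if from == Path::new("/dev/null") {
        // Creating (or truncating) the destination gives the same result
        // as copying zero bytes out of /dev/null.
        File::create(to)?;
        Ok(0)
    } else {
        // Every other source goes through the standard library copy.
        fs::copy(from, to)
    }
}
```

The design choice is the one stated in the talk: rather than waiting for a change in the standard library, the caller handles the one legacy input that scripts rely on and leaves the normal copy path untouched.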
[25:50.040 --> 26:17.040] Hello. So, ah, uppercase. Sure. Ah crap. I just switched [26:17.040 --> 26:24.040] to Alacritty and I never remember the shortcut to increase the font. Ha ha. This one. Better. [26:24.040 --> 26:37.040] Cool. So this is not the GNU version that I'm running. You can see I'm running our implementation. [26:37.040 --> 26:43.720] So we released 0.0.17 a week ago, something like that. So [26:43.720 --> 26:50.720] we are using that implementation. So when I took the train to come here, I looked [26:50.720 --> 26:56.960] at the test suite and I tried to find a cool bug for the demo. So I found one and I want [26:56.960 --> 27:06.960] to show you what it is. Because I don't always remember the commands when I'm in front of [27:06.960 --> 27:19.960] a crowd. I don't want to look stupid. So I did a Post-it, in French. So here it is building [27:19.960 --> 27:28.960] some stuff. So what I'm testing right now is the sort command. The sort command has some [27:28.960 --> 27:34.960] fancy flags. So this is a test from the GNU project. So I will show you what the test [27:34.960 --> 27:51.960] looks like on the GNU side. So here is the test. So of course we have the GPLv3 on top [27:51.960 --> 28:01.560] and then there is a list of files with version names. So on the top we have the input, [28:01.560 --> 28:12.200] and below we have the expected one. So the command that they are testing is a [28:12.200 --> 28:19.560] stable sort with a sort by version. So sort, and same for ls. So you can specify what kind [28:19.560 --> 28:24.800] of sort you want. And of course doing sort by version is super complex and we can debate [28:24.800 --> 28:30.440] about that stuff for hours. I love it. So here is what they are testing. So of course, [28:30.440 --> 28:39.120] as you can see, it failed on our side. And why did it fail? 
So it seems that it is complaining [28:39.120 --> 28:51.480] that 5.4.0 is not sorted the same way as 5.0.4. Super interesting, isn't it? [28:51.480 --> 28:55.960] So basically it means that we are sorting that version differently. Of course it's so [28:55.960 --> 29:00.520] obvious that it is the same version and it should be equal, but when you are sorting, in [29:00.520 --> 29:05.360] those cases equal doesn't mean anything, so you need to make a decision. And of course [29:05.360 --> 29:10.640] we decided to do otherwise, because I think the person who did the implementation didn't [29:10.640 --> 29:17.160] realize that GNU was doing something different. So let's try to fix that together. So what [29:17.160 --> 29:23.400] I like to do, I don't know how other contributors are doing it, is that I like to have test cases. [29:23.400 --> 29:31.480] So what I'm doing is I'm creating some basic command to be able to reproduce that easily [29:31.480 --> 29:35.400] so that I don't have to run the test. So I spare you the details. So here is the test [29:35.400 --> 30:01.000] case. Let me do that right now. So I've got my test file. So what I'm doing is I'm forcing [30:01.000 --> 30:05.720] the full path to use the GNU implementation. I used the two arguments that we mentioned [30:05.720 --> 30:19.760] earlier and now I have the input and the output. In theory if I do a diff it should be empty. [30:19.760 --> 30:26.800] And it is. I love when demos work well. It's just the beginning, so I will probably fail [30:26.800 --> 30:39.960] at some point. So now I want to test our implementation. So here it is. So now [30:39.960 --> 30:43.640] I have a simple test case that I can work on. So I don't have to run the test suite. [30:43.640 --> 30:47.480] I don't have to do anything else. 
It's one of the things that I love with that project [30:47.480 --> 30:51.320] and that's why I'm not doing Rust code at Mozilla, because I'm not a very good developer [30:51.320 --> 30:56.080] and Firefox is super complex and that stuff is pretty easy to do. You can do that on the [30:56.080 --> 31:00.000] train to come here. For example, the one I'm taking, this one, is probably going to [31:00.000 --> 31:05.600] take 20 minutes to fix. Well, there is a weird thing at the end but I'm not going to spoil [31:05.600 --> 31:11.200] the surprise yet. Anyway, this is the code that I have. So let's dive in now into the actual [31:11.200 --> 31:26.320] code. So let's use GDB. So GDB and Rust work very well together. So if I do a run, yeah, it works. [31:26.320 --> 31:33.200] If I do a breakpoint on main, yeah, it works. Next, next, next. Here it is. You can see [31:33.200 --> 31:38.280] the Rust code. You can evaluate the variables, it's working super well. So I won't open the code and look [31:38.280 --> 31:41.640] for the function. I already did that for you. So I already know the function where to put [31:41.640 --> 31:49.480] the breakpoint. So it's version_cmp. Our code is pretty well written. So I [31:49.480 --> 31:54.160] guess you'll understand what version_cmp is doing. It's comparing versions. So now I continue [31:54.160 --> 32:00.760] the execution and then I'm in the function doing it. So if I look at A, I see that it [32:00.760 --> 32:07.680] is that string. So the first one. If I look at B, here it is. So I have the two strings. [32:07.680 --> 32:13.920] So I will move in quick mode. There is the execution. So we have the version compare. [32:13.920 --> 32:18.280] And here is a function that I care about. So I will scroll. But you see in that function, [32:18.280 --> 32:23.240] it's probably what I care about. Because, if you don't know about Rust, it is going [32:23.240 --> 32:28.320] to trim the zero at the beginning.
So obviously in that case, it is what I'm looking for. [32:28.320 --> 32:46.320] So I want to remove the zero trimming first. So let's fix that code now. I learned from one [32:46.320 --> 33:03.680] of my colleagues at NANO. So I'm following the example. Yeah, exactly. It's the best, [33:03.680 --> 33:25.520] right? So here it is. I removed the trim. Of course, it's going to break something else [33:25.520 --> 33:34.160] after. I'm confident that it works. So let's rebuild that stuff. So here it is. It's rebuilding. [33:34.160 --> 33:39.720] The file that I touched is in uucore. It's one of our basic libraries [33:39.720 --> 33:43.920] to do file management and so on. So it's normal that it is rebuilding the various dependencies. [33:43.920 --> 34:00.360] Not GDB. I want it without GDB. Please. That one. And did it work? Yeah, it worked. Cool. [34:00.360 --> 34:08.200] I fixed the bug. So very proud of myself. I love how geeks love version comparison. [34:08.200 --> 34:17.240] So now the funny story is, of course, because it would be too easy. Now I'm not going to [34:17.240 --> 34:21.440] run the full test suite even if it takes less than a minute, but I'm just going to run that [34:21.440 --> 34:26.560] test because I know that that one fails. So ls has the same function, but ls has a different [34:26.560 --> 34:30.840] expectation in terms of version sorting. So, of course, that test is going to fail because [34:30.840 --> 34:38.240] ls likes the zero before instead of the other one. So, of course, that test fails. So one [34:38.240 --> 34:43.480] of the things that we could do is, in the version compare, we could have: do we want zero to [34:43.480 --> 34:52.600] be first or do we want zero to be after, using a boolean? And I say this because I don't [34:52.600 --> 34:56.480] like that the version comparison function is in our tree. So I think it's done. Someone [34:56.480 --> 35:00.040] else did it. Of course, there is a crate doing it.
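Editor's illustration: the "boolean" idea described above, where one comparison function serves both sort and ls, can be sketched as follows. This is a minimal sketch, not the project's code: the names `ZeroTieBreak`, `cmp_component` and `version_cmp` are hypothetical, components are assumed to be ASCII digits separated by dots, and which policy each tool actually expects is not taken from the talk.

```rust
use std::cmp::Ordering;

/// Hypothetical policy for two numeric components that have the same value
/// but a different number of leading zeros (for example "04" and "4").
#[derive(Clone, Copy)]
pub enum ZeroTieBreak {
    /// The component with more leading zeros sorts first.
    PaddedFirst,
    /// The component with more leading zeros sorts last.
    PaddedLast,
}

/// Compare two all-digit components by numeric value, then apply the tie-break.
fn cmp_component(a: &str, b: &str, tie: ZeroTieBreak) -> Ordering {
    let (ta, tb) = (a.trim_start_matches('0'), b.trim_start_matches('0'));
    // Once leading zeros are gone, a longer digit string is a larger number,
    // and strings of equal length compare correctly byte by byte.
    let by_value = ta.len().cmp(&tb.len()).then_with(|| ta.cmp(tb));
    // Same numeric value: the original lengths tell us which one was padded.
    by_value.then_with(|| match tie {
        ZeroTieBreak::PaddedFirst => b.len().cmp(&a.len()),
        ZeroTieBreak::PaddedLast => a.len().cmp(&b.len()),
    })
}

/// Compare dot-separated numeric versions, component by component.
/// A version that is a strict prefix of the other sorts first.
pub fn version_cmp(a: &str, b: &str, tie: ZeroTieBreak) -> Ordering {
    let mut xs = a.split('.');
    let mut ys = b.split('.');
    loop {
        match (xs.next(), ys.next()) {
            (None, None) => return Ordering::Equal,
            (None, Some(_)) => return Ordering::Less,
            (Some(_), None) => return Ordering::Greater,
            (Some(x), Some(y)) => {
                let ord = cmp_component(x, y, tie);
                if ord != Ordering::Equal {
                    return ord;
                }
            }
        }
    }
}
```

With this shape, each caller picks its own policy, for instance `version_cmp(a, b, ZeroTieBreak::PaddedFirst)`, so fixing one tool's ordering does not silently change the other's.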
So I reached out to the developer saying, [35:00.040 --> 35:03.720] can you add an option to change the sort when it is a zero? So the upstream made fun of [35:03.720 --> 35:08.280] me, but they have a PR ready, so they will probably land it. So we are going to remove [35:08.280 --> 35:12.400] hundreds of lines of code and use that crate because it is what we want to do. We don't want to [35:12.400 --> 35:20.160] maintain a comparison of versions because who cares, right? So let's come back to the [35:20.160 --> 35:25.840] presentation. So performance. Benchmarking performance is hard. I know that in the room [35:25.840 --> 35:31.120] we have some experts in performance and they are not going to contradict me when [35:31.120 --> 35:37.320] I say that benchmarking is hard. But we are using hyperfine. We can see that, for example, [35:37.320 --> 35:43.240] sort is almost five times faster than the GNU implementation because some people [35:43.240 --> 35:48.840] spent a lot of time improving it in our code, but also in the crate. So that one is basically [35:48.840 --> 35:54.960] taking all of Shakespeare's books, the text file, making it random and sorting it. So we are significantly [35:54.960 --> 36:04.480] faster than the GNU implementation. Similarly, if you do a recursive ls or a recursive cp, [36:04.480 --> 36:14.000] we are 1.5 times faster than GNU, probably because the code generated by rustc is much better than [36:14.000 --> 36:19.440] the one GNU has had for a long time. But I don't want to pretend that we are always [36:19.440 --> 36:26.840] doing better. For example, with the factor function, we are significantly slower than [36:26.840 --> 36:31.520] the GNU implementation, five times. I don't know who uses the command factor. I learned [36:31.520 --> 36:35.840] about that command when I started contributing, but still, we want to be a good replacement [36:35.840 --> 36:42.880] and that one is very slow.
So I'm going to do another demo using, I'm going to do some [36:42.880 --> 36:50.160] product placement for a project called samply. He is the author of that project. He is in [36:50.160 --> 36:56.440] the room, so if you need anything, don't blame me, blame him. I will do a quick demo [36:56.440 --> 37:03.600] about how you can do proper benchmarking of performance. I will try to find how to [37:03.600 --> 37:25.680] do the primary perf kernel change. That one. I love the SH. So I just did a recursive ls [37:25.680 --> 37:31.200] with a command called samply. So basically, it is going to instrument it and it's going [37:31.200 --> 37:37.840] to upload into the Firefox profiler the analysis of that program. So I will zoom in [37:37.840 --> 37:45.240] a second. So here it is. I think Mozilla did four presentations of the profiler today, [37:45.240 --> 37:50.440] so if you haven't seen any, I'm going to do a quick demo. Anyway, one of the magic [37:50.440 --> 37:55.680] things of Rust is that we can easily do that stuff. You saw how long it took. I will show [37:55.680 --> 38:01.440] you the command again that I ran. This is that one. So samply record, the binary and [38:01.440 --> 38:07.400] the arguments. So it's very easy. If I tried to do that with another language, it would [38:07.400 --> 38:13.600] take forever, but Rust and samply make that so easy. So here I have the flame graph. [38:13.600 --> 38:18.880] So I can see that for a recursive ls, most of the time that we are spending is in display [38:18.880 --> 38:23.360] grid. So most of the time that ls is going to spend is not reading the file or reading [38:23.360 --> 38:27.200] the metadata. It's just doing computation about how you want to show the result [38:27.200 --> 38:34.200] in the terminal. And one of the amazing things is that I can look at [38:34.200 --> 38:39.760] the source directly and see the counters and I can easily benchmark that stuff.
So if you [38:39.760 --> 38:47.040] are into performance in Rust, you can use samply and the Firefox profiler to do it. [38:47.040 --> 38:52.160] And again, I'm not here as Mozilla, but it's very valuable for any project. It's not Mozilla [38:52.160 --> 38:59.360] related and it makes that stuff super easy. So we have also some fancy documentation. [38:59.360 --> 39:05.240] Terts, in the room, did most of the work. So one of the things that he did that I love [39:05.240 --> 39:10.920] is that, one of the things that I really struggled with at first when I became a UNIX developer like [39:10.920 --> 39:16.960] 20 years ago is that I never knew how to find an example. So Terts linked [39:16.960 --> 39:25.480] to tldr.sh, which is providing examples for every command. So for example, here it's [39:25.480 --> 39:30.800] base64. You have examples, but I'm going to use a command that I didn't know about, shred. [39:30.800 --> 39:34.840] You have examples for shred. So you don't have to Google on Stack Overflow how to do that [39:34.840 --> 39:46.440] stuff. You have that out of the box in our documentation. So we have development documentation. [39:46.440 --> 39:50.760] I won't go through it for lack of time. But one of the things, we are taking the liberty [39:50.760 --> 39:56.480] also to extend the GNU coreutils. So we don't want to break compatibility, but we are doing [39:56.480 --> 40:07.120] some fancy things like progress bars, because who doesn't love progress bars? So for example, [40:07.120 --> 40:18.240] of course, it is that one. So we have a fancy progress bar now. You don't have that upstream. [40:18.240 --> 40:24.280] So you can thank the guy over there. I haven't done anything. But before the talk, [40:24.280 --> 40:29.560] he came to me and said, I looked at the implementation, at the patch that wasn't merged in GNU, and that [40:29.560 --> 40:34.640] patch was very hard to understand.
If you look at the diff on the Rust side, even if [40:34.640 --> 40:38.520] you don't know about Rust, you will understand it because we are relying on a crate. I think [40:38.520 --> 40:45.000] the upstream developer of the crate is at FOSDEM also. So thank you for your hard work. [40:45.000 --> 40:49.400] So here is what kind of stuff we can do. We can do it for mv. And mv is interesting because [40:49.400 --> 40:55.520] if you are moving from one file system to another, it's not always super fast. [40:55.520 --> 40:59.760] We are also implementing some tools that GNU doesn't have. So b3sum, for example, because [40:59.760 --> 41:06.440] it's like b2sum, so it's a hash algorithm. And cut -w, someone contributed that patch [41:06.440 --> 41:15.120] recently. It's one of the options from BSD that GNU doesn't have. So what is next? We [41:15.120 --> 41:20.280] want to implement the missing options for the various binaries. It's not hard. It's fun. We [41:20.280 --> 41:26.000] try to be nice. I'm not always nice, but I try to. We want to have full compatibility [41:26.000 --> 41:32.320] with GNU. So the list that I showed you, I'd like that to be fully green at some point. It's [41:32.320 --> 41:37.080] going to take years, but it's fun. Improve the performance of some key programs, like [41:37.080 --> 41:40.520] for example, for factor. I don't know if we want to spend too much time, but there are [41:40.520 --> 41:47.960] other things where we could improve the performance. And if you're interested, that link is the [41:47.960 --> 41:55.280] most interesting one, to know where to start. It's well documented. If you are a student, [41:55.280 --> 42:01.080] we are probably going to apply for the Google Summer of Code.
And if someone is interested [42:01.080 --> 42:06.160] in sponsoring that project, if your company or your foundation is interested, I'd like [42:06.160 --> 42:12.840] to buy some credits on GitHub Actions to build faster, because running the GNU test suite [42:12.840 --> 42:19.760] takes about an hour. And I think we are using a lot of resources, so it would be nice to [42:19.760 --> 42:29.760] have a faster CI. And now I'd like to predict the future. So the Linux kernel, they landed [42:29.760 --> 42:36.960] the first support of Rust inside the tree. So if you do some stats on the Linux Git repo, [42:36.960 --> 42:42.920] you see that they are first. So there are only 37 files upstream. It's mostly glue. So how [42:42.920 --> 42:50.000] do you manage memory and compatibility with the system and support for the build system? [42:50.000 --> 42:56.360] We hope that they are going to land something in mainline soon, a feature, but is it going [42:56.360 --> 43:00.440] to happen this year or not? We don't know. And some people are still challenging Rust [43:00.440 --> 43:08.400] in the Linux kernel. The main argument that I heard is mainly about legacy systems. But my [43:08.400 --> 43:15.120] prediction is that next year, we will start to see more and more distro vendors or cloud [43:15.120 --> 43:21.040] companies shipping more and more Rust code. My prediction is that some Linux [43:21.040 --> 43:27.880] distros are probably going to ship our implementation in the cloud in the next few years. So thanks [43:27.880 --> 43:31.600] for your time. I'd like to remind you that it is not a Mozilla project, so please don't say [43:31.600 --> 43:36.120] Mozilla works on that. It will make my life easier. And I think we have a few minutes [43:36.120 --> 43:39.120] for questions. [43:39.120 --> 44:07.120] Here, I think we need to, in the back. We David. [44:07.120 --> 44:22.480] It's an issue for every Rust project currently.
The fact that cargo is amazing at updating [44:22.480 --> 44:28.880] dependencies means that many Rust projects are subject to supply chain attacks lately. And [44:28.880 --> 44:35.440] we are no exception. So I can do product placement: Mozilla has developed a tool called [44:35.440 --> 44:42.040] cargo vet, which mitigates that stuff, but it's too complex to deploy for a hobby project [44:42.040 --> 44:50.560] like this one. So yeah, you can do a supply chain attack on our project. But we don't merge [44:50.560 --> 44:57.640] the Dependabot pull requests immediately. We are trying to mitigate that by taking our [44:57.640 --> 45:13.160] time. Any other question? Yeah, two over there. Yeah, please. [45:13.160 --> 45:18.680] The code size compared to BusyBox, I don't know. With GNU, it's similar. So you have [45:18.680 --> 45:22.920] different ways to compile the project. So the trick that I'm using in Debian for the package [45:22.920 --> 45:28.200] is that I'm building a single binary and I'm doing symlinks. So if you do a symlink from [45:28.200 --> 45:34.920] coreutils to cp or to ls, the binary is going to understand that you are calling cp. [45:34.920 --> 45:40.640] So the size is comparable to the GNU implementation in that case. But because of the nature of [45:40.640 --> 45:52.760] Rust, the binaries are bigger. Another question? Yeah, it's a question that [45:52.760 --> 46:04.960] comes back so often. Do you compute the code coverage only on the passing tests or all [46:04.960 --> 46:10.720] of them? We run code coverage on everything, including the GNU test suite. So it takes [46:10.720 --> 46:24.040] an hour and a half. So the tests that are failing? Yeah, yeah. We test everything. [46:24.040 --> 46:30.120] Did you find any bugs in the original GNU implementation when doing the re-implementation? [46:30.120 --> 46:36.320] Yeah.
We have a contributor who found two or three bugs in the upstream implementation [46:36.320 --> 46:49.560] and they have been fixed upstream. Reported and fixed upstream. Have you had any need to use unsafe [46:49.560 --> 46:54.880] and if yes, can you give a few examples in which kind of areas? We have some unsafe when [46:54.880 --> 46:59.520] we are calling the libc. Some functions of the libc we are calling directly. So we have [46:59.520 --> 47:17.560] some unsafe. Yeah. I'm going to repeat what he said. He said that it's for the libc. It's [47:17.560 --> 47:28.160] mostly native calls. Have you looked at code complexity metrics like [47:28.160 --> 47:32.760] cyclomatic complexity? Because I guess with the error handling, the memory management [47:32.760 --> 47:38.440] and cleanup in C is often really messy and that could be a huge boon in maintainability. [47:38.440 --> 47:44.360] Yeah, yeah. We have some code that clearly needs to be refactored. So for example, in ls [47:44.360 --> 47:49.640] we have to work around the limitations of clap and the code here is getting more [47:49.640 --> 47:54.400] and more complex. Yeah. Some pieces of our implementation need to [47:54.400 --> 47:59.280] be refactored to decrease it. But the code is usually pretty easy to understand. Thanks. [47:59.280 --> 48:09.160] Okay. I think we're out of time. Thank you all. Don't hesitate to contribute. I can find [48:09.160 --> 48:25.200] an email if you need.
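Editor's illustration: the single-binary trick mentioned in the code-size answer, where one executable is reached through symlinks and works out which utility was requested from the name it was called under, can be sketched as below. This is a sketch under assumptions, not the project's implementation: `invoked_name` and `dispatch` are hypothetical names, the multicall binary is assumed to be called `coreutils`, only three utilities are listed, and the function returns a description of what would run instead of running anything.

```rust
use std::path::Path;

/// Return the name the program was invoked under: the last path component
/// of argv[0], e.g. "/usr/bin/cp" gives "cp".
pub fn invoked_name(argv0: &str) -> Option<&str> {
    Path::new(argv0).file_name().and_then(|name| name.to_str())
}

/// Decide which utility to run from the full argument vector.
/// Returns a description of the call on success (hypothetical stand-in for
/// actually running the utility), or an error message.
pub fn dispatch(args: &[&str]) -> Result<String, String> {
    let argv0 = args.first().copied().ok_or("missing argv[0]")?;
    let name = invoked_name(argv0).ok_or("argv[0] has no file name")?;

    // Called through a symlink: the link name is the utility.
    // Called as the multicall binary itself: the utility is the next argument.
    let (util, rest) = if name == "coreutils" {
        match args.get(1) {
            Some(u) => (*u, &args[2..]),
            None => return Err("usage: coreutils <utility> [args...]".to_string()),
        }
    } else {
        (name, &args[1..])
    };

    match util {
        "cp" | "ls" | "sort" => Ok(format!("{util} {}", rest.join(" "))),
        other => Err(format!("unknown utility: {other}")),
    }
}
```

A real entry point would collect `std::env::args()` and hand the result to something like `dispatch`. The design point is that every utility shares one copy of the common code, which is why the packaged size stays comparable even though each standalone Rust binary is larger.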