[00:00.000 --> 00:15.960]  Hi. Hello. I hope you had good beers yesterday. Thank you for coming this morning. I'm going
[00:15.960 --> 00:21.120]  to talk to you about some work that a few of us started a while back, which is implementing
[00:21.120 --> 00:26.480]  one of the pieces of software that we all have on our computers: the coreutils. So
[00:26.480 --> 00:31.800]  I will go through the history of that project, explain what we are trying to do, the why,
[00:31.800 --> 00:38.720]  and maybe do a demo, let's see what happens. So, who am I? I'm doing a lot of things, way
[00:38.720 --> 00:43.920]  too many things according to my partner. I have been a Debian developer for like 15 years, LLVM
[00:43.920 --> 00:51.000]  for 10 years. Also, my actual job is director at Mozilla. I'm doing 2,000 things
[00:51.000 --> 00:55.760]  every day. But that work is clearly unrelated to what we are doing at Mozilla. Don't tweet
[00:55.760 --> 00:59.200]  saying Mozilla is working on that stuff. I will get in trouble. I don't want to get into
[00:59.200 --> 01:04.560]  trouble. It's not a Mozilla project. But I have been working with Rust developers
[01:04.560 --> 01:11.040]  for a long time. I manage some key people in the Rust project. And in Paris, we had the
[01:11.040 --> 01:16.080]  chance to have a bunch of people who worked on Rust for 10 years. So I have been in
[01:16.080 --> 01:23.400]  touch with those developers for a long time. I also uploaded the initial version of rustc
[01:23.400 --> 01:30.880]  in Debian a long time ago. And if you don't know about packaging, you can package software
[01:30.880 --> 01:34.920]  even when you are not an expert in the language it has been written in. I know that sounds
[01:34.920 --> 01:39.400]  crazy, but I'm not a C++ compiler developer, and I have been maintaining Clang for like 10 years
[01:39.400 --> 01:45.800]  or so. And I have also been maintaining some of the most common Rust packages in the Debian archive
[01:45.800 --> 01:52.160]  for a long time and therefore Ubuntu. But yeah, let's talk about what happened. If you
[01:52.160 --> 01:57.600]  remember, something weird happened three years ago now. Most of the planet went on lockdown
[01:57.600 --> 02:02.440]  in our country, sorry, in France. And I think it was the same in Belgium and Italy and some
[02:02.440 --> 02:07.360]  other countries. So they decided to close everything. So I don't know what you have done on your
[02:07.360 --> 02:13.560]  side, but myself, I asked myself, what can I do with that free time that I had? So
[02:13.560 --> 02:20.280]  some people made bread. So I stole a picture from Julien Donju. He made some fancy bread.
[02:20.280 --> 02:28.720]  Who did bread here? A few. Cool. Some others did some woodworking. So I stole a picture
[02:28.720 --> 02:33.400]  from someone who used to work at Red Hat, but now he's working at Mozilla. He did some
[02:33.400 --> 02:37.680]  woodworking. The picture is ugly. It's not my fault, but some people did that stuff.
[02:37.680 --> 02:44.720]  Some people did gardening. And myself, what I've done: it's my son on the top right.
[02:44.720 --> 02:49.440]  He loves Lego, but he's destroying everything. Everything was put in a single bucket and
[02:49.440 --> 02:55.400]  we decided to rebuild the Lego. So it kept us busy for like three or four days. And then
[02:55.400 --> 03:01.880]  we still had to work, but my partner, we are 40-something, and she decided to rewatch
[03:01.880 --> 03:07.680]  Buffy the Vampire Slayer. I have a good memory for TV shows, so I was like, yeah, I don't
[03:07.680 --> 03:15.120]  want to watch it again. And then she also did that one. So you can do the math. It's like
[03:15.120 --> 03:21.800]  200 and four hours. And basically, I used that time to work on Rust, because she
[03:21.800 --> 03:27.400]  was watching that stuff and I already saw those episodes when I was younger. So yeah, I wanted
[03:27.400 --> 03:34.400]  to learn Rust. So what I had done before: 10 years ago, I worked on that fancy project
[03:34.400 --> 03:40.400]  which was at the beginning of Clang when it was just starting to support some basic C++.
[03:40.400 --> 03:45.280]  I packaged it into Debian and then I rebuilt the Debian archive: instead of GCC, I used
[03:45.280 --> 03:52.080]  Clang. It got me a job at Mozilla in the end, and a lot of fun. I'm still doing that stuff
[03:52.080 --> 03:59.480]  even if I should stop at some point. So the idea of that project was how can you replace
[03:59.480 --> 04:03.520]  a compiler with another one? So I was like, yeah, I'd like to do the same with Rust. I'm
[04:03.520 --> 04:07.240]  not a student anymore, so I don't want to do projects that are useless. So I want to
[04:07.240 --> 04:12.040]  work on something interesting. So I started thinking about what can I do? So the first
[04:12.040 --> 04:17.680]  one is, do I want to rewrite glibc in Rust? Maybe not. It's probably too hard. So Clang,
[04:17.680 --> 04:23.360]  LLVM, it's crazy. There is no way I can do that. And nobody is going to be interested
[04:23.360 --> 04:28.160]  in those projects. So I was like, what about the coreutils? So one of the things with the
[04:28.160 --> 04:31.560]  coreutils is that initially, I didn't know anything about that stuff. So I was like, oh, it's
[04:31.560 --> 04:36.240]  probably full of assembly. I don't want to learn assembly. I don't want to read assembly.
[04:36.240 --> 04:42.640]  But in the end, there is no assembly in coreutils. There is just one file, and I don't
[04:42.640 --> 04:48.680]  care about that one anyway. I'm good. But yeah, some people are going to say, why are
[04:48.680 --> 04:54.040]  you doing that? It's pointless. So yeah, it makes sense. You can think about that stuff.
[04:54.040 --> 04:58.760]  But the first one is why not? Like, we all had crazy ideas in our careers and this one
[04:58.760 --> 05:04.840]  is one of them. Rust is amazing. So you can bring some value and you will hear during
[05:04.840 --> 05:12.440]  that talk, me repeating that stuff many times. I will repeat that stuff
[05:12.440 --> 05:17.960]  a few times. But the GNU implementation is fantastic. People doing that work are amazing.
[05:17.960 --> 05:24.160]  They are giants. And we always hear that Rust is amazing at security. At Mozilla, we keep
[05:24.160 --> 05:30.360]  repeating that stuff. But against the GNU implementation, it's not an argument. There are only 17 CVEs
[05:30.360 --> 05:36.600]  for the last 20 years. So the re-implementation is not about security. And it's not about
[05:36.600 --> 05:41.720]  the license. I know that some companies care a lot about licenses. Myself, as soon as I
[05:41.720 --> 05:48.080]  can upload it into Debian, I'm fine. But some people love debating how the GPL or MIT are
[05:48.080 --> 05:53.720]  amazing or how they suck, depending on who you talk to. I am not interested in having that debate.
[05:53.720 --> 05:58.880]  I leave that debate to Reddit or the other one. And last but not least, it is super
[05:58.880 --> 06:02.680]  interesting. I hope during that presentation, I will be able to convince you to contribute
[06:02.680 --> 06:11.320]  and write patches. It's not that hard. I will even do a demo of fixing a bug live, hopefully.
[06:11.320 --> 06:15.840]  So I keep talking about that stuff. But I think you all have a basic understanding at
[06:15.840 --> 06:21.880]  least of what the coreutils are. So we'll start with a quiz. So who was born before
[06:21.880 --> 06:29.480]  2000 in that room? I see a lot of gray hair. Oh, yeah, a bunch of people. After 90, who
[06:29.480 --> 06:41.000]  was born after 90? After 80? After 71? Yeah. So congrats, you are younger than the initial
[06:41.000 --> 06:50.160]  implementation of the coreutils. So the first version was published by Ken Thompson in 70.
[06:50.160 --> 06:57.440]  So thanks to Software Heritage, the archive done by Inria and a lot of other actors, you can see
[06:57.440 --> 07:04.520]  the sources of the initial implementation. As you can see, the .s means assembly. I won't
[07:04.520 --> 07:08.120]  share any assembly code today, don't worry. But you can look at the source and it's pretty
[07:08.120 --> 07:17.320]  amazing to see that that code was written 53 years ago. So Ken Thompson and Dennis Ritchie,
[07:17.320 --> 07:21.560]  they worked on that stuff a long time ago. And what we are doing right now, and you can
[07:21.560 --> 07:26.160]  generalize to most of the things in tech, is that we are building stuff on the shoulders
[07:26.160 --> 07:30.320]  of those giants. So those two folks invented things that we are still using on a daily
[07:30.320 --> 07:35.600]  basis, like we all use cp, mv and all that stuff. Even if you don't know about it, your system
[07:35.600 --> 07:40.480]  is probably going to use it behind the scenes. I will also mention this podcast, which is very good,
[07:40.480 --> 07:46.120]  from Adam Gordon Bell, who is interviewing Brian Kernighan, talking about the history of
[07:46.120 --> 07:51.880]  the Unix operating system. So those folks wrote a new implementation
[07:51.880 --> 08:00.520]  of those commands in 72. So you can see that, for example, cp and the if command were written
[08:00.520 --> 08:08.360]  in C. The code is surprisingly easy to read. So this one is a source from, again, 50 years
[08:08.360 --> 08:13.480]  ago of chmod. I'm sure that if you know a bit of C, you can read it. And I find it
[08:13.480 --> 08:18.520]  fascinating to see that the code those folks wrote 50 years ago, I'm showing it to you for the first
[08:18.520 --> 08:26.040]  time in 2023, and it's still relevant, and people can relate to that code. And you can
[08:26.040 --> 08:30.960]  ask yourself, is the code you are writing today still going to be valuable in 50 years?
[08:30.960 --> 08:34.960]  Probably not. But those folks, they made it. And it's probably going to stay for a long
[08:34.960 --> 08:41.000]  time. And this one is the actual implementation of the chmod function, written
[08:41.000 --> 08:47.560]  in 72. So it's not crazy code. It's full of bugs, probably. But it worked, and it is
[08:47.560 --> 08:54.120]  what started Unix a long time ago. And what I found surprising listening to that podcast
[08:54.120 --> 09:01.840]  is also how amazing a programming language the coreutils and those commands are. We take
[09:01.840 --> 09:08.160]  that for granted. But when you think about it, when you use sort, uniq, cat, and all
[09:08.160 --> 09:12.880]  that stuff, it allows you to do some crazy things very quickly. So let's take a few seconds
[09:12.880 --> 09:18.600]  and think: I give you a text file, and you have to tell me what are
[09:18.600 --> 09:25.800]  the five most common words in Shakespeare's books, longer than six characters. We can all do
[09:25.800 --> 09:32.480]  that stuff. It's probably six, seven lines of Python, same in Rust and same in other
[09:32.480 --> 09:39.480]  languages. But if you do that with the coreutils and bash, it is that command. And when you
[09:39.480 --> 09:44.760]  think about it, it's very impressive. The pipe and the redirection are the key things
[09:44.760 --> 09:49.480]  that we are doing, and how easy it is to program on a daily basis. All your system is running
[09:49.480 --> 09:53.640]  that kind of stuff on a daily basis, and it makes it super easy. I'm not saying that it
[09:53.640 --> 09:58.840]  is great. It's not fault-tolerant. If you have a single error in that command, everything
[09:58.840 --> 10:02.640]  is going to break, and you are not going to get what you want. But still, you can do that
[10:02.640 --> 10:07.760]  kind of thing very quickly and very easily. So by the way, the results are those ones.
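As a point of comparison with the shell pipeline, here is a minimal Rust sketch of the same word count using only the standard library. It is not code from the talk or from the project: the function name `top_words`, its parameters and the sample sentence are all made up for illustration, and the length threshold is a parameter because the talk mentions both "six" and "five" letters.

```rust
use std::collections::HashMap;

/// Return the `n` most frequent words that have more than `min_len` letters.
/// Hypothetical helper, written only to illustrate the exercise from the talk.
fn top_words(text: &str, min_len: usize, n: usize) -> Vec<(String, usize)> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    // Split on anything that is not a letter and count case-insensitively.
    for word in text.split(|c: char| !c.is_alphabetic()) {
        if word.chars().count() > min_len {
            *counts.entry(word.to_lowercase()).or_insert(0) += 1;
        }
    }
    let mut ranked: Vec<(String, usize)> = counts.into_iter().collect();
    // Highest count first; ties are broken alphabetically so the result is deterministic.
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked.truncate(n);
    ranked
}

fn main() {
    // Made-up sample text, not the Shakespeare corpus used in the talk.
    let text = "Exeunt all. Enter the master. Exeunt master and servant. Exeunt.";
    for (word, count) in top_words(text, 5, 2) {
        println!("{} {}", count, word);
    }
}
```

Even in this small form, the program is noticeably longer than a one-line pipeline, which is the point being made about pipes and redirection.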
[10:07.760 --> 10:13.800]  So in Shakespeare, with more than five letters, it's should, father, exeunt. I don't know how
[10:13.800 --> 10:17.520]  to pronounce that one. So I had to Google what it means. It means that you leave the
[10:17.520 --> 10:21.600]  scene, or that is what it means in English for Shakespeare. Then before and master. And it's
[10:21.600 --> 10:25.320]  pretty funny to do that stuff. I did it up to eight characters, and it's quite interesting
[10:25.320 --> 10:31.600]  to see what Shakespeare used in terms of words. So now, let's talk about today. My brother
[10:31.600 --> 10:38.320]  is a history teacher. I'm not. So I will talk about what we have now. So we have 105 commands
[10:38.320 --> 10:43.800]  in the GNU implementation, and we are trying to reach that level. You are
[10:43.800 --> 10:48.600]  very familiar with many of those. Some of them you have probably never heard of. And I'd
[10:48.600 --> 10:55.280]  like also to remind you that what is in the coreutils can be weird. So sometimes you don't have
[10:55.280 --> 10:59.920]  find; in the coreutils you don't have find, less, top and all those commands,
[10:59.920 --> 11:07.480]  but you have some other things. And most of these commands come with arguments
[11:07.480 --> 11:10.920]  which sometimes are conflicting with each other, sometimes are completely changing the
[11:10.920 --> 11:19.040]  behavior of the command depending on what you enter. So second quiz. So who
[11:19.040 --> 11:25.400]  knows about those commands? So ls, cp, mv. Everybody, sure. And then this one, probably
[11:25.400 --> 11:30.520]  too. Now we are starting with the hard stuff. numfmt. So it's a command that I, yeah,
[11:30.520 --> 11:35.560]  there is one of the maintainers of the project, with myself. So of course he knows. But pretty
[11:35.560 --> 11:38.240]  much he's the only one. So it's the kind of stuff that we have to deal with because we
[11:38.240 --> 11:43.640]  want to be a drop-in replacement for the GNU project, but we have those kinds of things.
[11:43.640 --> 11:52.600]  And who knows about pr? Yeah, one guy. So, someone else? Convert text files for printing. So
[11:52.600 --> 11:56.960]  it is one of those commands and it has a huge number of arguments which are probably conflicting
[11:56.960 --> 12:02.880]  with each other. So csplit, who knows about csplit? Daniel knows. Yeah, just a few people
[12:02.880 --> 12:08.880]  in that room. So it is to split a file into sections determined by context lines. Yeah,
[12:08.880 --> 12:14.960]  it's scripting, right? Yeah, it's weird. And we have plenty of others. So we have factor
[12:14.960 --> 12:19.040]  to do math. We have pinky. I don't remember what it's doing. tsort is doing some kind
[12:19.040 --> 12:26.480]  of sort. shred is to delete, to really remove the data of a file on the drive. I think it's
[12:26.480 --> 12:34.800]  more common, but still I rarely see that one in scripts. And so we have a bunch of implementations
[12:34.800 --> 12:39.520]  of the coreutils available on the market. So the most common, that everybody knows, is the GNU
[12:39.520 --> 12:45.080]  implementation. There is BSD, which is used on Mac, for example. BusyBox is the one
[12:45.080 --> 12:51.480]  you want to use on embedded devices or when you want to recover a system. Toybox: one of
[12:51.480 --> 12:55.840]  the core developers of BusyBox decided to write Toybox because he was sick of license
[12:55.840 --> 13:00.880]  discussions. I learned recently that there is a V lang implementation. Don't ask me what
[13:00.880 --> 13:06.240]  V lang is. I don't know. And if you are aware of other implementations, please let me know.
[13:06.240 --> 13:13.640]  I will tell you why I want to know. So let's talk about our implementation. So it was started
[13:13.640 --> 13:23.480]  by Jordi, I will butcher his name, Boggiano, in 2013. Before version 1.0 of Rust. I sent an email
[13:23.480 --> 13:29.840]  to Jordi, because he has a .be email address, saying I'm going to present the work that you started
[13:29.840 --> 13:35.440]  10 years ago and he said, cool, glad to hear that this project is still alive. Myself,
[13:35.440 --> 13:41.040]  I found it in early 2020, before COVID, and then I started contributing in April. Remember
[13:41.040 --> 13:47.360]  that COVID started in Europe in March. It's not a coincidence. Now, here is the size of
[13:47.360 --> 13:56.560]  the project. So we have 13,000 stars and we have 350 contributors. The second contributor
[13:56.560 --> 14:00.720]  to that project, you see his picture with a white background, is over there. Well done,
[14:00.720 --> 14:06.360]  Terts, for your amazing work. He's the one reviewing our PRs. I'm doing the easy ones. I'm
[14:06.360 --> 14:12.280]  not a very good developer in general. Now, it's packaged in most of the distros. Obviously,
[14:12.280 --> 14:18.440]  it's not a coincidence, in Debian and Ubuntu, but also Fedora, Gentoo, and most of them. It
[14:18.440 --> 14:26.600]  is shipping in Apertis, which is a Linux distro for cars, which, between you and me,
[14:26.600 --> 14:33.760]  I find scary, that they are using our work in production. But I always have impostor
[14:33.760 --> 14:39.440]  syndrome in terms of development. And it's used by a social network through the Yocto
[14:39.440 --> 14:44.280]  project. So it's not Facebook, the famous social network, it's another one, and they are
[14:44.280 --> 14:49.040]  making glasses and so on. So I think you can guess who they are. But they are using that
[14:49.040 --> 14:56.320]  to take pictures in the glasses. And this one is for license reasons. So now I'm doing
[14:56.320 --> 15:02.520]  some product placement for one of the Mozilla achievements, Rust. So why do we want to do
[15:02.520 --> 15:08.680]  it in Rust? You don't have to worry about security issues. At Mozilla, on Firefox
[15:08.680 --> 15:16.800]  in particular, we see security issues caused by C and C++, not on a daily basis, but almost.
[15:16.800 --> 15:23.600]  And you should not do C and C++ anymore if you care about security. It's very portable.
[15:23.600 --> 15:28.080]  One of the things that I learned with that project is we are supporting a lot of configurations,
[15:28.080 --> 15:34.560]  a lot of operating systems. And Rust is really amazing for that. It was one of my big discoveries.
[15:34.560 --> 15:40.400]  Obvious, probably, for some of the Rust developers in the room. But for me, it was a surprise.
[15:40.400 --> 15:46.200]  And I really don't like to reinvent the wheel. So we can leverage a lot of crates which have
[15:46.200 --> 15:51.200]  been developed by very talented people over the years. So lscolors, for example, is used
[15:51.200 --> 15:56.800]  by lsd or exa, probably, to provide the same colors as ls. And we are using that stuff in
[15:56.800 --> 16:03.840]  the Rust coreutils. walkdir, to do a recursive operation on a directory: we are using that
[16:03.840 --> 16:09.760]  crate so we don't have to worry about that one. tempfile, we use it for mktemp, for example.
[16:09.760 --> 16:15.200]  So if you look at the sources of the GNU coreutils, they have to implement everything
[16:15.200 --> 16:19.720]  by themselves. Sometimes they use some libraries, but they have to rewrite a
[16:19.720 --> 16:26.560]  lot of things. While for us, we can reuse what others have been doing. And last but
[16:26.560 --> 16:31.360]  not least, we have amazing performance. I will do a demo later of some of the performance
[16:31.360 --> 16:36.840]  that we have. But surprisingly, in some cases, we are significantly faster than
[16:36.840 --> 16:46.200]  the GNU implementation. And it's a very popular language. No need to explain why, but we have
[16:46.200 --> 16:51.280]  a lot of contributors. And sometimes we are struggling to keep up with the number of requests
[16:51.280 --> 16:56.840]  just because everybody wants to learn Rust. And you should if you don't. But it's very
[16:56.840 --> 17:06.360]  popular. So what is the goal of this specific implementation? So when I took over that project
[17:06.360 --> 17:12.280]  a few years ago, I had exactly the same idea in mind as Chris Lattner and Apple did back
[17:12.280 --> 17:18.480]  then with Clang, which is to be a drop-in replacement. So if you are not aware of that story, when
[17:18.480 --> 17:24.200]  Apple decided to work on Clang, one of their goals was to be a pure drop-in replacement
[17:24.200 --> 17:29.680]  for GCC in general, for most of the options. And it has been one of its successes: you just
[17:29.680 --> 17:35.800]  had to override the CC or CXX variable and you could use Clang directly. And if it was
[17:35.800 --> 17:41.720]  not working, it was a bug, in most cases. So it works surprisingly well. Now
[17:41.720 --> 17:47.160]  Clang is a de facto standard for compiling most of the very complex applications like
[17:47.160 --> 17:55.400]  Chrome or Firefox. What can we do to replicate that? So the priority is, we focused on that
[17:55.400 --> 18:04.320]  one: to be a drop-in replacement. We want that stuff to be cross-platform. It was decided
[18:04.320 --> 18:08.360]  before my time as the leader of the project, but I love the idea. So we support these operating
[18:08.360 --> 18:15.440]  systems. We are struggling with the FreeBSD CI because it sucks on GitHub. But besides
[18:15.440 --> 18:21.040]  that, it's working pretty well. Also, except for Fuchsia, we have CI for every one of them.
[18:21.040 --> 18:30.760]  So for every PR, we run a lot of tests on those. It's very easy to test. So on my laptop,
[18:30.760 --> 18:36.240]  running the full test suite takes less than a minute. And it's covering a lot of part
[18:36.240 --> 18:42.320]  of the code. I will share some of the code coverage information. I don't care about it.
[18:42.320 --> 18:46.840]  Some people do. For some people, it's a strength. For some people, it's a weakness. But it's
[18:46.840 --> 18:54.320]  an MIT license so that social network can reuse our code to save money. Anyway, I'm
[18:54.320 --> 19:00.440]  not interested in having that debate. And in my opinion, and I haven't seen anyone in
[19:00.440 --> 19:06.920]  the community using that argument, it's not a fight against the GNU project or the FSF.
[19:06.920 --> 19:10.840]  The GNU project has been doing a lot of good things for us in the open source world for
[19:10.840 --> 19:19.200]  20 or 30 years. We are standing on their shoulders. It's not a fight. I know that some GNU
[19:19.200 --> 19:24.160]  coreutils developers have been monitoring that project for a long time and have been very friendly and
[19:24.160 --> 19:30.240]  so on. And I met Jim Meyering 10 years or 20 years ago. And I was very impressed by that
[19:30.240 --> 19:37.480]  person. So, yeah, it's not about fighting. It's about collaboration. So when I started
[19:37.480 --> 19:42.000]  that project two or three years ago, my initial goal was to be able to boot a Debian
[19:42.000 --> 19:48.840]  on it. So my laptop here is running the Rust coreutils now, for example. So I'm not lying,
[19:48.840 --> 19:56.360]  it's working well now. Then it was to install the top 1,000 packages in Debian. The reason I
[19:56.360 --> 20:01.680]  wanted to do that is that Debian has a lot of scripts to configure the packages, the
[20:01.680 --> 20:10.960]  postinst scripts, and they are usually done in bash, using sort and cp and mv and install.
[20:10.960 --> 20:17.800]  It's exercising a lot of features of the GNU coreutils. And one of the goals was also to
[20:17.800 --> 20:25.520]  build three of the big projects I care about. So Linux, LLVM and Firefox, obviously. They
[20:25.520 --> 20:31.640]  don't use that much scripting, so bash or coreutils, but I still found some bugs building
[20:31.640 --> 20:38.440]  Linux, in some corner cases. And of course, package it into Debian and Ubuntu.
[20:38.440 --> 20:45.520]  I published some blog posts about it. They have been shared in a bunch of places. I've had
[20:45.520 --> 20:53.120]  some interesting comments, some of them not very interesting. Anyway, so to achieve those
[20:53.120 --> 20:58.400]  goals, we had to deploy a CI, add code coverage support, improve the code coverage. So it's
[20:58.400 --> 21:02.760]  one of the things to get familiar with a project. I wrote a lot of unit tests. So now the code
[21:02.760 --> 21:09.600]  coverage of our implementation is 80%. If you don't know much about code coverage, 80%
[21:09.600 --> 21:13.680]  is usually what we are trying to achieve in a project. It means that the code coverage
[21:13.680 --> 21:20.120]  is very good. It's very hard to reach 100% and sometimes it's a waste of time, but 80
[21:20.120 --> 21:24.960]  is usually considered as being a very good code coverage. I think on Firefox, we are
[21:24.960 --> 21:32.680]  at 65 or 70, something like that. And we plugged in a bunch of tools. I mean, I love static
[21:32.680 --> 21:38.160]  analysis, linting, so of course I had to do that for that project. And we also documented
[21:38.160 --> 21:43.800]  a lot of those processes. Everybody loves docs, so we wrote plenty of docs. And
[21:43.800 --> 21:48.600]  it took about a year to reach that state. So now the current status. So what we have
[21:48.600 --> 21:56.080]  here is CI that is running for every PR. And we run our implementation against the GNU
[21:56.080 --> 22:01.840]  test suite. This is the latest graph so you can see that we have been working a lot on
[22:01.840 --> 22:08.160]  improving the compatibility. We are not there yet. I will fix one of them with you later
[22:08.160 --> 22:12.840]  during that presentation. But there is no silver bullet. For many of those, you have
[22:12.840 --> 22:19.840]  to spend a few hours to fix one. And you improve one by one. Before you ask the question, the
[22:19.840 --> 22:26.280]  skipped ones are mostly SELinux. So cp, mv and some commands like chcon, they are using
[22:26.280 --> 22:32.640]  SELinux, and GitHub Actions uses Ubuntu and Ubuntu doesn't have it by default, so it's
[22:32.640 --> 22:36.600]  tricky to test that stuff in the CI. If you know how to do it, please reach out to
[22:36.600 --> 22:44.040]  us, we'd like to fix that stuff. So how do we work? As I said many times, we want that stuff
[22:44.040 --> 22:50.440]  to be a drop-in replacement. So we wrote a mini wrapper to make it super easy. So that
[22:50.440 --> 22:59.480]  command that I share is going to run the not-owner test for the touch command from GNU. So it's
[22:59.480 --> 23:07.000]  going to use the GNU test and run it against our implementation. So it's super easy to
[23:07.000 --> 23:13.960]  test. And we wrote some script to make our life easier. So we have a Python script which
[23:13.960 --> 23:18.160]  is going to give us the list. If you do it, it's several pages because we still have
[23:18.160 --> 23:27.880]  a lot of tests to fix. And we wrote also a fancy page. I can show you what it looks like.
[23:27.880 --> 23:34.560]  So here it is. We have the list of all the tests for each command. And of course, the
[23:34.560 --> 23:41.920]  big one is misc, where we have a mix. So sometimes it's just a one-line change in the code. Sometimes
[23:41.920 --> 23:45.800]  it's a big refactor. It depends. It's part of the fun. You never know what you are going
[23:45.800 --> 23:57.080]  to get. So let's use an example, for example, for mkdir. So this is the GNU output.
[23:57.080 --> 24:03.920]  So you do -p. Who knows what -p is in that one? Okay, cool. So it's create
[24:03.920 --> 24:08.360]  recursively, and -v is verbose. Everybody knows that. So in the GNU implementation, they
[24:08.360 --> 24:13.600]  decided to do it directory by directory. You can argue that maybe it's not a good use. It's
[24:13.600 --> 24:19.480]  not smart. Maybe it is. Who cares? It's legacy. We have to deal with legacy. So our implementation
[24:19.480 --> 24:26.480]  was that one. So you can argue that maybe ours was better or worse. Who cares? But we
[24:26.480 --> 24:31.800]  updated that code. So we match exactly what GNU is doing. So if you look, I will share
[24:31.800 --> 24:37.800]  the slide after. You can look at the change. It's pretty easy to understand. This one is
[24:37.800 --> 24:44.640]  one of my favorites. So in Debian, when you install LibreOffice, it is using AppArmor.
[24:44.640 --> 24:49.520]  And someone decided that instead of doing a touch, you do install /dev/null, which creates
[24:49.520 --> 24:56.520]  an empty file. It's legacy also. Do we want to fix legacy everywhere? Probably not. So it
[24:56.520 --> 25:00.720]  was one of my favorite bugs. So I started investigating. So here is some Rust code
[25:00.720 --> 25:07.240]  which reproduces that issue. So if you do a copy of /dev/null into a text file, it is failing
[25:07.240 --> 25:14.120]  with: the source path is not an existing regular file. So I opened a bug upstream. The thread
[25:14.120 --> 25:18.240]  is quite interesting. Like, people in the Rust community are very passionate about that
[25:18.240 --> 25:23.720]  kind of thing. Too long, didn't read: it hasn't been fixed. So here is a workaround. It's
[25:23.720 --> 25:30.800]  ugly. But unless you know a better way to do it, besides fixing Rust and dealing with the fallout
[25:30.800 --> 25:36.920]  of the fix, this is the best that we have. So we are looking if the input is /dev/null.
[25:36.920 --> 25:43.040]  If it is /dev/null, we are creating an empty file. This is what it takes to deal with legacy.
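The workaround described above can be sketched roughly as follows. This is not the project's actual code, which is only shown on the slide: the function name `copy_file` and the demo file name are hypothetical, and only `std::fs::copy` and `std::fs::File::create` from the standard library are used. The idea is simply to special-case the source path before handing anything to the standard copy routine.

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;

/// Copy `from` to `to`, treating `/dev/null` as "create an empty file".
/// Hypothetical helper illustrating the idea, not the project's real function.
fn copy_file(from: &Path, to: &Path) -> io::Result<u64> {
    if from == Path::new("/dev/null") {
        // Special case: do not call fs::copy on the device node at all,
        // just create (or truncate) the destination so it ends up empty.
        File::create(to)?;
        return Ok(0);
    }
    // Everything else goes through the regular standard-library copy.
    fs::copy(from, to)
}

fn main() -> io::Result<()> {
    // Made-up destination in the system temp directory.
    let dest = std::env::temp_dir().join("devnull_copy_demo.txt");
    let copied = copy_file(Path::new("/dev/null"), &dest)?;
    println!("copied {} bytes, size on disk: {}", copied, fs::metadata(&dest)?.len());
    fs::remove_file(&dest)
}
```

A plain path comparison like this only catches the literal `/dev/null` spelling; handling symlinks or other special files would need more work, which is part of why the speaker calls the workaround ugly.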
[25:43.040 --> 25:50.040]  So let's do a demo together. So please bear with me. It's going to be fun with the mic.
[25:50.040 --> 26:17.040]  Hello. So, ah, uppercase. Sure. Ah crap. I just switched
[26:17.040 --> 26:24.040]  to Alacritty and I never remember the shortcut to increase the font. Ha ha. This one. Better.
[26:24.040 --> 26:37.040]  Cool. So this is not the GNU version that I'm running. You can see I'm running our implementation.
[26:37.040 --> 26:43.720]  So we did the release 0.0.17 weeks ago, something like that, one week ago. So
[26:43.720 --> 26:50.720]  we are using that implementation of it. So when I took the train to come here, I looked
[26:50.720 --> 26:56.960]  at the test suite and I tried to find a cool bug for the demo. So I found one and I want
[26:56.960 --> 27:06.960]  to show you what it is. Because I don't always remember the commands, and because I'm in front of
[27:06.960 --> 27:19.960]  a crowd, I don't want to look stupid. So I did a Post-it, in French. So here it is building
[27:19.960 --> 27:28.960]  some stuff. So what I'm testing right now is a sort command. The sort command has some
[27:28.960 --> 27:34.960]  fancy flag. So this is a command from the GNU project. So I will show you what the test
[27:34.960 --> 27:51.960]  looks like on the GNU side. So here is the test. So of course we have the GPLv3 on top
[27:51.960 --> 28:01.560]  and then there is a list of files with version names. So on the top we have the input.
[28:01.560 --> 28:12.200]  So for in and below we have the expected one. So the command that they are testing is a
[28:12.200 --> 28:19.560]  stable sort with a sort by version. So sort, and same for ls: you can specify what kind
[28:19.560 --> 28:24.800]  of sort you want. And of course sorting by version is super complex and we can debate
[28:24.800 --> 28:30.440]  about that stuff for hours. I love it. So here is what they are testing. So of course
[28:30.440 --> 28:39.120]  as you can see it failed on our side. And why did it fail? So it seems that it is complaining
[28:39.120 --> 28:51.480]  that 5.4.0 it's not sorted the same way as 5.0.4. Super interesting, isn't it, right?
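To make the tie-breaking problem concrete, here is a small Rust sketch of a version-style comparison combined with a stable sort. It is an illustration only: `version_cmp` is a deliberately simplified comparator written for this example, not GNU's algorithm and not the project's code, and the file names are made up because the exact strings in the GNU test are not recoverable from the recording. The assumption shown is that digit runs are compared as numbers, so leading zeros do not matter and two different strings can compare as equal.

```rust
use std::cmp::Ordering;

/// Split a string into alternating runs of ASCII digits and non-digits.
fn chunks(s: &str) -> Vec<&str> {
    let bytes = s.as_bytes();
    let mut out = Vec::new();
    let mut start = 0;
    for i in 1..bytes.len() {
        // A new chunk starts wherever we switch between digit and non-digit.
        if bytes[i].is_ascii_digit() != bytes[i - 1].is_ascii_digit() {
            out.push(&s[start..i]);
            start = i;
        }
    }
    if !s.is_empty() {
        out.push(&s[start..]);
    }
    out
}

/// Simplified version comparison (hypothetical, for illustration only).
fn version_cmp(a: &str, b: &str) -> Ordering {
    let (ca, cb) = (chunks(a), chunks(b));
    for (x, y) in ca.iter().zip(cb.iter()) {
        let numeric = x.as_bytes()[0].is_ascii_digit() && y.as_bytes()[0].is_ascii_digit();
        let ord = if numeric {
            // Numeric compare without parsing: drop leading zeros,
            // then a longer digit run is larger, then compare digit by digit.
            let (x, y) = (x.trim_start_matches('0'), y.trim_start_matches('0'));
            x.len().cmp(&y.len()).then_with(|| x.cmp(y))
        } else {
            x.cmp(y)
        };
        if ord != Ordering::Equal {
            return ord;
        }
    }
    ca.len().cmp(&cb.len())
}

fn main() {
    // Made-up names: the first two are equal under version_cmp.
    let mut names = vec!["file-5.04.0", "file-5.4.0", "file-5.3.9"];
    // sort_by is a stable sort, so equal elements keep their input order.
    names.sort_by(|a, b| version_cmp(a, b));
    println!("{:?}", names);
}
```

With a stable sort, elements that compare as equal stay in input order; an implementation that adds its own tie-breaker, or uses an unstable sort, can legitimately produce a different order, which is the kind of mismatch a compatibility test will flag.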
[28:51.480 --> 28:55.960]  So basically it means that we are sorting that version differently. Of course it's so
[28:55.960 --> 29:00.520]  obvious that it is the same version and it should be equal but when you are sorting in
[29:00.520 --> 29:05.360]  those cases equal doesn't mean anything so you need to make a decision. And of course
[29:05.360 --> 29:10.640]  we decided to do otherwise because I think the person who did the implementation didn't
[29:10.640 --> 29:17.160]  realize that GNU was doing something different. So let's try to fix that together. So what
[29:17.160 --> 29:23.400]  I like to do, I don't know how other contributors are doing is that I like to have test cases.
[29:23.400 --> 29:31.480]  So what I'm doing is I'm creating some basic command to be able to reproduce that easily
[29:31.480 --> 29:35.400]  so that I don't have to run the test. So I spare you the details. So here is the test
[29:35.400 --> 30:01.000]  case. Let me do that right now. So I've got my test file. So what I'm doing is I'm forcing
[30:01.000 --> 30:05.720]  the full path to use the GNU implementation. I used the two arguments that we mentioned
[30:05.720 --> 30:19.760]  earlier and now I have the input and the output. In theory if I do a diff it should be empty.
[30:19.760 --> 30:26.800]  And it is. I love when a demo works well. It's just the beginning so I will probably mess up
[30:26.800 --> 30:39.960]  at some point. So now I want to test the GNU implementation. So here it is. So now
[30:39.960 --> 30:43.640]  I have a simple test case that I can work on. So I don't have to run the test suite.
[30:43.640 --> 30:47.480]  I don't have to do anything else. It's one of the things that I love with that project
[30:47.480 --> 30:51.320]  and that's why I'm not doing Rust code at Mozilla because I'm not a very good developer
[30:51.320 --> 30:56.080]  and Firefox is super complex and that stuff is pretty easy to do. You can do that on the
[30:56.080 --> 31:00.000]  train to come here. For example, what I need is taking this one. It's probably going to
[31:00.000 --> 31:05.600]  take 20 minutes to fix. Well, there is a weird thing at the end but I'm not going to spoil
[31:05.600 --> 31:11.200]  the surprise yet. Anyway, this is the code that I have. So let's dive in now into the actual
[31:11.200 --> 31:26.320]  code. So let's use GDB. So GDB and Rust work very well together. So if I do a run, yeah, it works.
[31:26.320 --> 31:33.200]  If I do a breakpoint on main, yeah, it works. Next, next, next. Here it is. You can see
[31:33.200 --> 31:38.280]  the Rust code. You can evaluate the variables; it's working super well. So I won't open and look
[31:38.280 --> 31:41.640]  for the function. I already did that for you. So I already know the function where to put
[31:41.640 --> 31:49.480]  the breakpoint. So it's version_cmp. Our code is pretty well written. So I
[31:49.480 --> 31:54.160]  guess you'll understand what version_cmp is doing. It's comparing versions. So now I continue
[31:54.160 --> 32:00.760]  the execution and then I'm in the function doing it. So if I look at A, I see that it
[32:00.760 --> 32:07.680]  is that string. So the first one, if I look at B, here it is. So I have the two strings.
[32:07.680 --> 32:13.920]  So I will move in quick mode. There is the execution. So we have the version compare.
[32:13.920 --> 32:18.280]  And here is a function that I care about. So I will scroll. But you see in that function,
[32:18.280 --> 32:23.240]  it's probably what I care about. Because, if you don't know about Rust, it is going
[32:23.240 --> 32:28.320]  to trim the zero at the beginning. So obviously in that case, it is what I'm looking for.
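The effect of trimming leading zeros can be sketched in a few lines. This is not the uutils `version_cmp` code: the function name `cmp_digits` and the flag `leading_zeros_break_ties` are hypothetical, chosen only to show why "04" and "4" compare as equal once the zeros are trimmed, and how a tie-break rule changes the result.

```rust
use std::cmp::Ordering;

/// Compare two runs of ASCII digits by numeric value without parsing them,
/// so arbitrarily long runs cannot overflow.
/// Hypothetical helper for illustration, not the uutils implementation.
fn cmp_digits(a: &str, b: &str, leading_zeros_break_ties: bool) -> Ordering {
    let (ta, tb) = (a.trim_start_matches('0'), b.trim_start_matches('0'));
    // After trimming, a longer run is a larger number; runs of equal
    // length compare digit by digit.
    let by_value = ta.len().cmp(&tb.len()).then_with(|| ta.cmp(tb));
    if by_value != Ordering::Equal || !leading_zeros_break_ties {
        return by_value;
    }
    // Same numeric value: optionally let the run with more leading zeros
    // sort after the other one.
    a.len().cmp(&b.len())
}

fn main() {
    // With trimming only, "04" and "4" are the same number.
    assert_eq!(cmp_digits("04", "4", false), Ordering::Equal);
    // With the tie-break enabled, the two strings are ordered.
    assert_eq!(cmp_digits("04", "4", true), Ordering::Greater);
    // Value still wins over length of the raw text.
    assert_eq!(cmp_digits("10", "9", false), Ordering::Greater);
    println!("ok");
}
```

When two keys compare as equal, a stable sort keeps them in input order, so the choice of tie-break rule decides whether the output matches what another implementation expects.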
[32:28.320 --> 32:46.320]  So I want to remove the zero at first. So let's fix that code now. I learned from one
[32:46.320 --> 33:03.680]  of my colleagues at NANO. So I'm following the example. Yeah, exactly. It's the best,
[33:03.680 --> 33:25.520]  right? So here it is. I removed the trim. Of course, it's going to break something else
[33:25.520 --> 33:34.160]  after. I'm confident that it works. So let's rebuild that stuff. So here it is. It's rebuilding.
[33:34.160 --> 33:39.720]  The file that I touched is in uucore. So it's one of our basic libraries
[33:39.720 --> 33:43.920]  to do file management and so on. So it's normal that it is rebuilding the various dependencies.
[33:43.920 --> 34:00.360]  Not GDB. I want without GDB. Please. That one. And did it work? Yeah, it worked. Cool.
[34:00.360 --> 34:08.200]  I fixed the bug. So very proud of myself. I love how geeks love versioning comparison.
[34:08.200 --> 34:17.240]  So now the funny story is, of course, because it would be too easy. Now I'm not going to
[34:17.240 --> 34:21.440]  run the full test suite even if it takes less than a minute, but I'm just going to run that
[34:21.440 --> 34:26.560]  test because I know that that one fails. So LS has the same function, but LS has a different
[34:26.560 --> 34:30.840]  expectation in terms of version sorting. So, of course, that test is going to fail because
[34:30.840 --> 34:38.240]  LS likes the zero before instead of the other one. So, of course, that test fails. So one
[34:38.240 --> 34:43.480]  of the things that we could do is in the version compare, we could have a boolean: do we want zero to
[34:43.480 --> 34:52.600]  be first or do we want zero to be after? And I say this because I don't
[34:52.600 --> 34:56.480]  like having the version comparison function in our tree. I think someone
[34:56.480 --> 35:00.040]  else has already done it. Of course, there is a crate doing it. So I reached out to the developer saying,
[35:00.040 --> 35:03.720]  can you add an option to change the sort when it is a zero? So the upstream made fun of
[35:03.720 --> 35:08.280]  me, but they have a PR ready, so they will probably land it. So we are going to remove
[35:08.280 --> 35:12.400]  hundreds of lines of code and use that crate because it is what we want to do. We don't want to
[35:12.400 --> 35:20.160]  maintain a comparison of versions because who cares, right? So let's come back to the
[35:20.160 --> 35:25.840]  presentation. So performance. Benchmarking performance is hard. I know that in the room
[35:25.840 --> 35:31.120]  we have some experts in performance and they are not going to contradict me when
[35:31.120 --> 35:37.320]  I say that benchmarking is hard. But we are using hyperfine. We can see that, for example,
[35:37.320 --> 35:43.240]  sort is almost five times faster than the GNU implementation because some people
[35:43.240 --> 35:48.840]  spend a lot of time improving it in our code, but also in the crate. So that one is basically
[35:48.840 --> 35:54.960]  taking all of Shakespeare's books as a text file, shuffling it and sorting it. So we are significantly
[35:54.960 --> 36:04.480]  faster than the GNU implementation. Similarly, if you do a recursive LS or a recursive CP,
[36:04.480 --> 36:14.000]  we are 1.5 times faster than GNU, probably because the code generated by rustc is much better than
[36:14.000 --> 36:19.440]  the one and returned by GNU for a long time. But I don't want to pretend that we are always
[36:19.440 --> 36:26.840]  doing better. For example, with the factor function, we are significantly slower than
[36:26.840 --> 36:31.520]  the GNU implementation, five times slower. I don't know who uses the factor command. I learned
[36:31.520 --> 36:35.840]  about that command when I started contributing, but still, we want to be a good replacement
[36:35.840 --> 36:42.880]  and that one is very slow. So I'm going to do another demo using, I'm going to do some
[36:42.880 --> 36:50.160]  product placement for a project called samply. He is the author of that project. He is in
[36:50.160 --> 36:56.440]  the room, so if you need anything, don't blame me, blame him. I will do a quick demo
[36:56.440 --> 37:03.600]  about how can you do a proper benchmarking of performances. I will try to find how to
[37:03.600 --> 37:25.680]  do the primary perf kernel change. That one. I love the SH. So I just did a recursive LS
[37:25.680 --> 37:31.200]  with a command called samply. So basically, it is going to instrument it and it's going
[37:31.200 --> 37:37.840]  to upload into the Firefox profiler, the analysis of that program. So I will zoom in
[37:37.840 --> 37:45.240]  a second. So here it is. I think Mozilla did four presentations of the profiler today,
[37:45.240 --> 37:50.440]  so if you haven't seen any, I'm going to do a quick demo. Anyway, it is one of the magic
[37:50.440 --> 37:55.680]  things of Rust is that we can easily do that stuff. You saw how long it took. I will show
[37:55.680 --> 38:01.440]  you the command again that I ran. This is that one. So samply record, the binary and
[38:01.440 --> 38:07.400]  the arguments. So it's very easy. If I tried to do that with another language, it's going
[38:07.400 --> 38:13.600]  to take forever, but Rust and samply make that so easy. So here I have the flame graph.
[38:13.600 --> 38:18.880]  So I can see that for a recursive LS, most of the time that we are spending is in display
[38:18.880 --> 38:23.360]  grid. So most of the time that LS is going to spend is not reading the file or reading
[38:23.360 --> 38:27.200]  the metadata. It's just doing computation about how do you want to show the result
[38:27.200 --> 38:34.200]  into the terminal. And one of the amazing things is that it's going to, I can look at
[38:34.200 --> 38:39.760]  the source directly and see the counters and I can easily benchmark that stuff. So if you
[38:39.760 --> 38:47.040]  are into performance in Rust, you can use samply and the Firefox profiler to do it.
[38:47.040 --> 38:52.160]  And again, I'm not here as Mozilla, but it's very valuable for any project. It's not Mozilla
[38:52.160 --> 38:59.360]  related and it makes that stuff super easy. So we have also some fancy documentation.
[38:59.360 --> 39:05.240]  Terts in the room did most of the work. So one of the things that he did that I love
[39:05.240 --> 39:10.920]  is that one of the things that I really struggled at first when I became a UNIX developer like
[39:10.920 --> 39:16.960]  20 years ago is that I never knew how to do an example and find the example. So Terts linked
[39:16.960 --> 39:25.480]  to tldr.sh, which provides examples for every command. So for example, here it's
[39:25.480 --> 39:30.800]  base64. You have examples, but I'm going to use a command that I didn't know about, shred.
[39:30.800 --> 39:34.840]  You have examples for shred. So you don't have to Google on Stack Overflow how to do that
[39:34.840 --> 39:46.440]  stuff. You have that out of the box in our documentation. So we have development documentation.
[39:46.440 --> 39:50.760]  I won't go through it for a matter of time. But one of the things, we are taking the liberty
[39:50.760 --> 39:56.480]  also to extend the GNU coreutils. So we don't want to break compatibility, but we are doing
[39:56.480 --> 40:07.120]  some fancy things like progress bars because who doesn't love progress bars? So for example,
[40:07.120 --> 40:18.240]  of course, it is that one. So we have a fancy progress bar now. You don't have that upstream.
[40:18.240 --> 40:24.280]  So you can thank the guy over there. I haven't done anything. But before the talk,
[40:24.280 --> 40:29.560]  he came to me and said, I looked at the implementation, at the patch that wasn't merged in GNU, and that
[40:29.560 --> 40:34.640]  patch was very hard to understand. If you look at the diff on the Rust side, even if
[40:34.640 --> 40:38.520]  you don't know about Rust, you will understand it because we are relying on a crate. I think
[40:38.520 --> 40:45.000]  the upstream developer of the crate is at FOSDEM also. So thank you for your hard work.
[40:45.000 --> 40:49.400]  So here is what kind of stuff we can do. We can do it for MV. And MV is interesting because
[40:49.400 --> 40:55.520]  if you are moving between one file system to the other, it's not always super fast.
[40:55.520 --> 40:59.760]  We are also implementing some tools that GNU doesn't have. So b3sum, for example, because
[40:59.760 --> 41:06.440]  it's like b2sum, so it's a hash algorithm. And cut -w, someone contributed that patch
[41:06.440 --> 41:15.120]  recently. It's one of the options from BSD that GNU doesn't have. So what is next? We
[41:15.120 --> 41:20.280]  want to implement the missing options for the various binaries. It's not hard. It's fun. We
[41:20.280 --> 41:26.000]  try to be nice. I'm not always nice, but I try to. We want to have a full compatibility
[41:26.000 --> 41:32.320]  with GNU. So the list that I showed you, I'd like that to be fully green at some point. It's
[41:32.320 --> 41:37.080]  going to take years, but it's fun. Improve the performance on some key programs, like
[41:37.080 --> 41:40.520]  for example, for factor. I don't know if we want to spend too much time, but there are
[41:40.520 --> 41:47.960]  other things where we could improve the performances. And if you're interested, that link is the
[41:47.960 --> 41:55.280]  most interesting one, to know where to start. It's well documented. We, if you are a student,
[41:55.280 --> 42:01.080]  we are probably going to apply for the Google Summer of Code. And if someone is interested
[42:01.080 --> 42:06.160]  in sponsoring that project, if your company or your foundation is interested, I'd like
[42:06.160 --> 42:12.840]  to buy some credits on GitHub Actions to build faster, because running the GNU test suite
[42:12.840 --> 42:19.760]  takes about an hour. And I think we are using a lot of resources, so it would be nice to
[42:19.760 --> 42:29.760]  have a faster CI. And now I'd like to predict the future. So the Linux kernel, they landed
[42:29.760 --> 42:36.960]  the first support of Rust inside the tree. So if you do some stat on the Linux Git repo,
[42:36.960 --> 42:42.920]  you see that they are first. So there is only 37 files upstream. It's mostly glue. So how
[42:42.920 --> 42:50.000]  do you manage memory and compatibility with the system and support for the build system?
[42:50.000 --> 42:56.360]  We hope that they are going to land something in mainline soon, a feature, but is it going
[42:56.360 --> 43:00.440]  to happen this year or not? We don't know. And some people are still challenging Rust
[43:00.440 --> 43:08.400]  in the Linux kernel. The main argument that I heard is mainly for legacy system. But my
[43:08.400 --> 43:15.120]  prediction is that next year, we will start to see more and more distro vendors or cloud
[43:15.120 --> 43:21.040]  companies shipping more and more Rust code. My prediction is that some Linux
[43:21.040 --> 43:27.880]  distros are going to ship our implementation in the cloud in the next few years. So thanks
[43:27.880 --> 43:31.600]  for your time. I'd like to remind that it is not a Mozilla project, so please don't say
[43:31.600 --> 43:36.120]  Mozilla works on that. It will make my life easier. And I think we have a few minutes
[43:36.120 --> 43:39.120]  for questions.
[43:39.120 --> 44:07.120]  Here, I think we need to, in the back. We David.
[44:07.120 --> 44:22.480]  It's an issue for every Rust project currently. Because cargo is amazing at updating
[44:22.480 --> 44:28.880]  dependencies, many Rust projects are subject to supply chain attacks lately. And
[44:28.880 --> 44:35.440]  we are no exception. So I can do product placement: Mozilla has developed a tool called
[44:35.440 --> 44:42.040]  cargo vet, which mitigates that stuff, but it's too complex to deploy for a hobby project
[44:42.040 --> 44:50.560]  like this one. So yeah, you can do a supply chain attack on our project. But we don't merge
[44:50.560 --> 44:57.640]  the Dependabot pull requests immediately. We are trying to mitigate that by taking our
[44:57.640 --> 45:13.160]  time. Any other question? Yeah, two over there. Yeah, please.
[45:13.160 --> 45:18.680]  The code size compared to BusyBox, I don't know. With GNU, it's similar. So you have
[45:18.680 --> 45:22.920]  different ways to compile the project. So the trick that I'm using in Debian for the package
[45:22.920 --> 45:28.200]  is that I'm building a single binary and I'm doing symlinks. So if you do a symlink from
[45:28.200 --> 45:34.920]  coreutils to cp or to ls, the binary is going to understand that you are calling cp.
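The single-binary trick described here usually works by looking at the name the program was invoked under. The sketch below is a minimal illustration of that idea, not the uutils dispatcher: `utility_name` and `dispatch` are hypothetical names, and the returned strings stand in for actually running a utility.

```rust
use std::path::Path;

/// Extract the utility name from argv[0], e.g. "/usr/bin/ls" -> "ls".
/// Hypothetical helper for illustration only.
fn utility_name(argv0: &str) -> Option<&str> {
    Path::new(argv0).file_name()?.to_str()
}

/// Decide which utility to run based on the name used to call the binary.
fn dispatch(argv0: &str) -> String {
    match utility_name(argv0) {
        Some("cp") => "would run cp".to_string(),
        Some("ls") => "would run ls".to_string(),
        Some(other) => format!("unknown utility: {other}"),
        None => "no utility name".to_string(),
    }
}

fn main() {
    // When the binary is reached through a symlink named `ls`, argv[0]
    // ends in "ls", so the same executable behaves as `ls`.
    let argv0 = std::env::args().next().unwrap_or_default();
    println!("{}", dispatch(&argv0));
}
```

One executable plus a set of symlinks means the shared code is stored once, which is why the installed size stays close to that of separate small binaries.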
[45:34.920 --> 45:40.640]  So the size is comparable to the GNU implementation in that case. But because of the nature of
[45:40.640 --> 45:52.760]  Rust, the binaries are bigger than the rest. Another question? Yeah, it's a question that
[45:52.760 --> 46:04.960]  comes back so often. Do you compute the code coverage only on the passing tests or all
[46:04.960 --> 46:10.720]  of them? We run code coverage on everything, including the GNU test suite. So it takes
[46:10.720 --> 46:24.040]  an hour and a half. So the tests that are failing? Yeah, yeah. We test everything.
[46:24.040 --> 46:30.120]  Did you find any bugs in the original GNU implementation when doing the re-implementation?
[46:30.120 --> 46:36.320]  Yeah. We have a contributor who found two or three bugs in the upstream implementation
[46:36.320 --> 46:49.560]  and they have been fixed upstream. They were reported and fixed upstream. Have you had any need to use unsafe
[46:49.560 --> 46:54.880]  and if yes, can you give a few examples in which kind of areas? We have some unsafe when
[46:54.880 --> 46:59.520]  we are calling libc; some libc functions we are calling directly. So we have
[46:59.520 --> 47:17.560]  some unsafe. Yeah. I'm going to repeat what he said. He said that it's for the libc. It's
[47:17.560 --> 47:28.160]  mostly native calls. Have you looked at code complexity metrics like
[47:28.160 --> 47:32.760]  cyclomatic complexity? Because I guess with the error handling, the memory management
[47:32.760 --> 47:38.440]  and cleanup in C is often really messy and that could be a huge boon in maintainability.
[47:38.440 --> 47:44.360]  Yeah, yeah. We have some code that clearly needs refactoring. So for example, in LS
[47:44.360 --> 47:49.640]  we have to work around the limitations of clap and the code here is getting more
[47:49.640 --> 47:54.400]  and more complex. Yeah. Some parts of our implementation need to
[47:54.400 --> 47:59.280]  be refactored to decrease it. But the code is usually pretty easy to understand. Thanks.
[47:59.280 --> 48:09.160]  Okay. I think we're out of time. Thank you all. Don't hesitate to contribute. I can find
[48:09.160 --> 48:25.200]  an email if you need.